
# Reporting

The `iohmm-evac report` subcommand renders diagnostic plots from a saved
simulation bundle. Each plot has a fixed shape signature for a healthy
baseline run; comparing your run against those shapes is the project's
canonical smoke check.

## The bundle

`report` consumes the four files written by `simulate`:

| File | Loaded as | Purpose |
| --- | --- | --- |
| `<stem>.parquet` | `observations` (long DataFrame) | one row per (household, t) |
| `<stem>.population.parquet` | `population` | static covariates per household |
| `<stem>.timeline.parquet` | `timeline` | hourly forecast and warning orders |
| `<stem>.config.toml` | `config` | the exact config that produced the run |
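
The naming convention in the table can be sketched with `pathlib`. This is a minimal illustration, not the loader's actual code: the helper name `bundle_paths` is hypothetical, and it assumes the three sibling files sit in the same directory as the observations file.

```python
from pathlib import Path

# Suffixes copied from the table above; the helper itself is hypothetical.
BUNDLE_SUFFIXES = {
    "observations": ".parquet",
    "population": ".population.parquet",
    "timeline": ".timeline.parquet",
    "config": ".config.toml",
}


def bundle_paths(observations_path):
    """Map each bundle member to its expected path next to the observations file."""
    obs = Path(observations_path)
    stem = obs.stem  # "baseline" for "output/baseline.parquet"
    return {name: obs.with_name(stem + suffix) for name, suffix in BUNDLE_SUFFIXES.items()}


paths = bundle_paths("output/baseline.parquet")
# List any bundle members that are absent before trying to render a report.
missing = [name for name, path in paths.items() if not path.exists()]
```

A check like `missing` is a cheap way to fail early with a readable message when only part of a bundle was copied between machines.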

Use the public Python API directly if you want to compose your own
figures:

```python
from iohmm_evac.report import load_bundle, plot_state_occupancy

bundle = load_bundle("output/baseline.parquet")
ax = plot_state_occupancy(bundle)
ax.figure.savefig("occupancy.png", dpi=150, bbox_inches="tight")
```

Every plot function takes a `SimulationBundle` and an optional
`matplotlib.axes.Axes` (or sequence, for the multi-panel trajectory plot)
and returns the axes it drew on. None of them call `plt.show()` or
`fig.savefig()` — that is the caller's job.
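
The axes-in, axes-out convention described above can be sketched with a stand-in plotting function. `plot_share` is hypothetical and is not part of `iohmm_evac`; it only shows the shape of the contract (accept an optional axes, draw, return the axes, leave saving to the caller).

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt


def plot_share(shares, ax=None):
    """Hypothetical stand-in: draw on ax (or a new one) and return it."""
    if ax is None:
        _, ax = plt.subplots()
    ax.plot(range(len(shares)), shares)
    ax.set_xlabel("t (hours)")
    ax.set_ylabel("share")
    return ax  # no plt.show() and no savefig() here


# The caller owns the figure, so two plots can share one canvas.
fig, (left, right) = plt.subplots(1, 2, figsize=(8, 3))
returned = plot_share([0.0, 0.1, 0.4, 0.7], ax=left)
plot_share([1.0, 0.8, 0.5, 0.2], ax=right)

# Saving is the caller's job; an in-memory buffer stands in for a file here.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=150, bbox_inches="tight")
```

Because the function returns the axes it drew on, the caller can keep adjusting titles, limits, or annotations before deciding where the figure is written.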

## Color and label conventions

State labels and colors are defined once in
`iohmm_evac.report.constants` and shared across every plot:

| State | Meaning | Color |
| --- | --- | --- |
| UA | unaware | grey |
| AW | aware | amber |
| PR | preparing | orange-red |
| ER | en route | deep red |
| SH | sheltered | green |

## The plots

### `occupancy` — Fig. 3

A stacked area chart of state shares over time. Vertical dashed lines mark
the voluntary and mandatory order hours (read from the timeline, not
hardcoded); a solid vertical line marks landfall (max `t`).

**Healthy baseline.** UA depletes monotonically. AW and PR bulge in
sequence: AW peaks first, then drains into PR, which drains into ER. ER
itself bulges and drains as households reach destinations and transition
to SH. By landfall, SH absorbs most of the population.
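
To check the shape numerically rather than by eye, the shares behind the stacked areas can be recomputed from per-household state paths. The sketch below uses invented toy paths and plain Python; the function name and data layout are assumptions, not the report module's internals.

```python
from collections import Counter

STATES = ["UA", "AW", "PR", "ER", "SH"]

# Invented latent paths for four households over six hours.
paths = {
    0: ["UA", "AW", "PR", "ER", "SH", "SH"],
    1: ["UA", "UA", "AW", "PR", "ER", "SH"],
    2: ["UA", "AW", "AW", "PR", "SH", "SH"],
    3: ["UA", "UA", "UA", "AW", "AW", "PR"],
}


def state_shares(paths):
    """Return one {state: share} dict per hour; shares sum to 1 at each hour."""
    n = len(paths)
    horizon = len(next(iter(paths.values())))
    shares = []
    for t in range(horizon):
        counts = Counter(path[t] for path in paths.values())
        shares.append({state: counts.get(state, 0) / n for state in STATES})
    return shares


shares = state_shares(paths)
ua = [row["UA"] for row in shares]
# "UA depletes monotonically" read as: the UA share never rises.
ua_never_rises = all(earlier >= later for earlier, later in zip(ua, ua[1:]))
sh_at_landfall = shares[-1]["SH"]
```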

### `departures` — Fig. 4 (single-scenario)

The cumulative share of households that have ever departed, vs time. Same
overlay lines as `occupancy`.

**Healthy baseline.** A roughly S-shaped curve, near zero before the
voluntary order, with a clear inflection near the mandatory order, and an
asymptote below 1 (some households shelter in place without ever
departing). The slope is steepest in the window between the warning
orders.
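
As a rough sketch of how such a curve can be built, take a household's departure hour to be the first `t` at which its latent state is ER and accumulate over time. The toy paths and helper names below are invented for illustration.

```python
# Invented latent paths; households 2 and 3 never enter ER.
paths = {
    0: ["UA", "AW", "PR", "ER", "SH", "SH"],
    1: ["UA", "UA", "AW", "PR", "ER", "SH"],
    2: ["UA", "AW", "AW", "PR", "SH", "SH"],
    3: ["UA", "UA", "UA", "AW", "AW", "PR"],
}


def first_er_hour(path):
    """First hour at which the state is ER, or None if the household never departs."""
    for t, state in enumerate(path):
        if state == "ER":
            return t
    return None


def cumulative_departed(paths):
    """Share of households that have entered ER at or before each hour."""
    n = len(paths)
    horizon = len(next(iter(paths.values())))
    hours = [first_er_hour(path) for path in paths.values()]
    return [
        sum(1 for hour in hours if hour is not None and hour <= t) / n
        for t in range(horizon)
    ]


curve = cumulative_departed(paths)
```

In this toy data the curve levels off at 0.5 because two of the four households never enter ER, which is the same mechanism that keeps the real asymptote below 1.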

### `trajectories` — Fig. 2

A multi-panel plot for 2–3 households showing the forecast intensity, the
inferred state path (as a step plot over `UA → AW → PR → ER → SH`), a
single departure marker, and a normalized displacement track. Pass
`--household-ids 0,42,1337` to pick specific households; otherwise the CLI
defaults to `[0, 1, 2]` (a deterministic choice that works for any N ≥ 3).

To keep panels readable, the function rejects more than six household IDs
with a `ValueError`. If `ax` is supplied it must be a sequence of axes one
per id.
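
The argument rules above can be sketched as a small validator. Both helper names are hypothetical, and raising `ValueError` for a mismatched `ax` is an assumption: the text only specifies that exception type for the six-household limit.

```python
MAX_HOUSEHOLDS = 6  # limit stated above


def parse_household_ids(raw):
    """Parse a comma-separated string such as "0,42,1337" into a list of ints."""
    return [int(part) for part in raw.split(",") if part.strip()]


def check_trajectory_args(household_ids, ax=None):
    """Raise ValueError if the ids or the supplied axes break the rules above."""
    if len(household_ids) > MAX_HOUSEHOLDS:
        raise ValueError(
            f"at most {MAX_HOUSEHOLDS} household ids are supported, got {len(household_ids)}"
        )
    # Assumption: a mismatched axes sequence is also reported as ValueError.
    if ax is not None and len(ax) != len(household_ids):
        raise ValueError("ax must contain exactly one axes per household id")


ids = parse_household_ids("0,42,1337")
check_trajectory_args(ids)  # passes silently

try:
    check_trajectory_args(list(range(7)))
    too_many_rejected = False
except ValueError:
    too_many_rejected = True
```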

> **Why these plots ignore the `departure` emission column.** Both the
> cumulative-departures curve and the trajectory X marks are derived from
> the *latent state path*: a household's "departure hour" is the first `t`
> at which its state is ER. The `departure` column in the observations
> Parquet is *not* used by either plot. That column carries Bernoulli noise
> (~3% per hour even from non-evacuating households) — a feature, not a
> bug, since it is exactly the kind of noisy emission an IO-HMM is meant
> to denoise. For Build 2's IO-HMM fitting the noise is essential. For
> visual sanity-checking the *underlying* behavioral dynamics it is just
> visual mush, so the report module reaches for the latent state instead.

### `summary` — diagnostic metrics

`iohmm-evac report summary --input PATH` prints a small two-column table of
sanity-check metrics derived from the loaded bundle. The table mirrors the
in-memory `SimulationResult.summary()` method on the simulator, so the same
metric set is available to scripts that already hold a `SimulationResult`.

| Metric | Meaning |
| --- | --- |
| `share_sheltered_at_t48` | Population share in SH at t=48. Should be small for a baseline run; if households are sheltering before any warning order, this is the first place it'll show up. |
| `share_sheltered_at_landfall` | Population share in SH at landfall (max t). Healthy baseline lands in 0.3–0.7. |
| `share_failed_evacuation` | Share still in ER at landfall — i.e., en route but not yet sheltered. |
| `share_evacuated_away` | Share in SH whose `evac_path` is `away` (sheltered away from home). |
| `share_sheltered_in_place` | Share in SH whose `evac_path` is `home` (PR → SH transition). |
| `peak_enroute_share` | Maximum population share observed in ER over time. |
| `peak_enroute_hour` | Hour at which `peak_enroute_share` is reached. |
| `median_departure_hour` | Median hour at which households first transition into ER, across households that ever entered ER. |

These are informational: no thresholds are enforced inside `report summary`.
A separate test (`tests/test_baseline_shape.py`) asserts loose bounds on
the same metrics under the baseline scenario at `N=2000, T=120, seed=0`,
to catch the kind of "everyone evacuates before the warning fires"
regression seen in Build 1.5.
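
For readers who want to see how a few of these metrics follow from the latent state paths, here is a minimal sketch on invented data. It is not the `SimulationResult.summary()` implementation; in particular, breaking ties for the peak hour by taking the earliest hour is an assumption.

```python
from statistics import median

# Invented latent paths; household 3 is still en route at landfall.
paths = {
    0: ["UA", "AW", "PR", "ER", "SH", "SH"],
    1: ["UA", "UA", "AW", "PR", "ER", "SH"],
    2: ["UA", "AW", "AW", "PR", "SH", "SH"],
    3: ["UA", "UA", "AW", "PR", "PR", "ER"],
}


def summarize(paths):
    """Compute a subset of the summary metrics from per-household state paths."""
    n = len(paths)
    horizon = len(next(iter(paths.values())))
    er_share = [sum(p[t] == "ER" for p in paths.values()) / n for t in range(horizon)]
    # max() keeps the first maximal element, so ties go to the earliest hour.
    peak_hour = max(range(horizon), key=lambda t: er_share[t])
    # Departure hour = first t in ER, only for households that ever enter ER.
    departures = [p.index("ER") for p in paths.values() if "ER" in p]
    return {
        "share_sheltered_at_landfall": sum(p[-1] == "SH" for p in paths.values()) / n,
        "share_failed_evacuation": er_share[-1],
        "peak_enroute_share": er_share[peak_hour],
        "peak_enroute_hour": peak_hour,
        "median_departure_hour": median(departures) if departures else None,
    }


metrics = summarize(paths)
```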

> **Note on `median_departure_hour`.** "Departure" here means the latent
> transition into ER (the household has actually left), not the noisy
> `departure` emission flag. The flag also fires under non-ER states with
> probability `p_departure_other`, so a median computed from it would be
> dominated by measurement noise rather than evacuation timing.

### `emissions` — sanity check

A grouped bar chart of per-state means of `departure`, `displacement`, and
`comm_count`. The chapter outline does not assign a figure number to this
view; we use it as an emission-parameter recovery sanity check before
fitting the IO-HMM in later builds.

**Healthy baseline.** Mean departure rate spikes in ER (≈0.95) and is
near zero in UA. Displacement is highest in ER, smaller in SH, and near
zero in UA/AW. Communication counts increase from UA through PR, then
fall once households transition to ER/SH.
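
The bar heights are per-state means of the three emission columns. A plain-Python sketch of that aggregation follows; the rows are invented and far too few to reproduce the baseline values quoted above, and the function name is hypothetical.

```python
from collections import defaultdict

COLUMNS = ("departure", "displacement", "comm_count")

# Invented (state, departure, displacement, comm_count) observation rows.
rows = [
    ("UA", 0, 0.0, 1),
    ("UA", 0, 0.0, 0),
    ("PR", 0, 0.1, 4),
    ("PR", 1, 0.1, 6),
    ("ER", 1, 0.8, 2),
    ("ER", 1, 1.0, 2),
]


def per_state_means(rows):
    """Return {state: {column: mean}} for the three emission columns."""
    sums = defaultdict(lambda: [0.0] * len(COLUMNS))
    counts = defaultdict(int)
    for state, *values in rows:
        counts[state] += 1
        for i, value in enumerate(values):
            sums[state][i] += value
    return {
        state: dict(zip(COLUMNS, (total / counts[state] for total in sums[state])))
        for state in counts
    }


means = per_state_means(rows)
```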

## CLI reference

See [`cli.md`](cli.md) for the full `report` subcommand grammar. The
canonical "first run" pair is:

```bash
uv run iohmm-evac simulate --scenario baseline --seed 0 \
    --output ./output/baseline.parquet
uv run iohmm-evac report all --input ./output/baseline.parquet \
    --output-dir ./output/figures/
```

The `all` subcommand writes a predictable filename set into the output
directory: `occupancy.png`, `departures.png`, `trajectories.png`,
`emissions.png`. Single-plot subcommands default to writing
`<input-stem>.<plot>.png` next to the input parquet when neither
`--output` nor `--show` is specified.
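
The default naming rule for single-plot subcommands can be sketched as follows. The helper name is hypothetical and the CLI's own implementation may differ.

```python
from pathlib import Path


def default_plot_path(input_path, plot):
    """Return <input-stem>.<plot>.png in the same directory as the input parquet."""
    source = Path(input_path)
    return source.with_name(f"{source.stem}.{plot}.png")


occupancy_path = default_plot_path("output/baseline.parquet", "occupancy")
```

With the input above, `occupancy_path` is `output/baseline.occupancy.png`.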

## Headless rendering

Plots are written via the matplotlib backend selected by your environment.
On a headless host, leave `MPLBACKEND` unset (matplotlib will pick a
non-interactive default) or export `MPLBACKEND=Agg`. The `--show` flag
assumes an interactive backend is available and is the only path that
calls `plt.show()`; tests pin `Agg` in `tests/conftest.py`.

## What's not here

No mplstyle file. Defaults are good enough for smoke checks, and
chapter-quality styling lands in Build 3.