1fanwang opened a new pull request, #66914:
URL: https://github.com/apache/airflow/pull/66914
Picking up where #66552 left off. That PR hid the drifting `Next Run`
timestamp in three UI surfaces; #66907 (filed off the back of it) showed the
same drift is still served verbatim by the REST API and is being recomputed
every parse cycle on the scheduler side. This PR closes both surfaces by
stopping the drift at the source.
## What changes
`DagModel.calculate_dagrun_date_fields` now short-circuits when
`self.is_paused` is `True`. The scheduler still calls it every parse cycle for
every Dag (no caller-side change needed), but on a paused Dag it returns
immediately without touching any field. The values therefore stay frozen at
whatever they were when the Dag was paused.
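A minimal sketch of the short-circuit, using a hypothetical stand-in class rather than the real `DagModel`. The field names mirror the API fields; `ToyDagModel`, the simplified `now` parameter, and the one-day schedule arithmetic are assumptions made purely for illustration:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class ToyDagModel:
    """Hypothetical stand-in for DagModel; not Airflow's actual class."""

    is_paused: bool = False
    next_dagrun_logical_date: Optional[datetime] = None
    next_dagrun_run_after: Optional[datetime] = None

    def calculate_dagrun_date_fields(self, now: datetime) -> None:
        # The short-circuit: a paused Dag returns immediately, so its
        # fields keep whatever values they had before the pause.
        if self.is_paused:
            return
        # Assumed toy schedule (not Airflow's timetable logic): the next
        # run covers the day starting at `now`.
        self.next_dagrun_logical_date = now
        self.next_dagrun_run_after = now + timedelta(days=1)


dag = ToyDagModel()
t0 = datetime(2026, 1, 2, 22, tzinfo=timezone.utc)
dag.calculate_dagrun_date_fields(t0)  # unpaused: fields are populated
snapshot = (dag.next_dagrun_logical_date, dag.next_dagrun_run_after)

dag.is_paused = True
dag.calculate_dagrun_date_fields(t0 + timedelta(days=29))  # paused: no-op
unchanged = (dag.next_dagrun_logical_date, dag.next_dagrun_run_after) == snapshot
```

The caller still invokes the method on every cycle; only the method body decides whether anything moves, which is why no caller-side change is needed.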
The previous "fire the missed interval immediately on unpause" semantics
relied on the recompute running every cycle: unpausing flipped
`is_paused=False`, and the next parse cycle already had a fresh value. With
the drift gone, the unpause path needs an explicit nudge. The new helper
`DagModel.recompute_next_dagrun_fields_after_unpause(session=...)` does one
fresh recompute: it looks up the latest `SerializedDagModel` and the most
recent non-manual `DagRun`, then delegates back to
`calculate_dagrun_date_fields`.
It is wired into the three unpause sites:
- `PATCH /api/v2/dags/{dag_id}` — single-Dag unpause path
- `PATCH /api/v2/dags` — bulk-unpause path (per-row, only the rows that
actually transitioned)
- `airflow dags unpause` CLI — `_update_is_paused` helper
The helper is a no-op if the Dag is still paused (defensive) and a no-op if
no serialized Dag exists yet (the next parse cycle will populate it).
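The helper's control flow, including both no-op cases, can be sketched roughly as follows. This is not the PR's code: `ToyDag`, the dictionary lookups standing in for the `SerializedDagModel` and `DagRun` queries, and the interval arithmetic are all hypothetical simplifications.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


@dataclass
class ToyDag:
    """Hypothetical stand-in for DagModel; not Airflow's actual class."""

    dag_id: str
    is_paused: bool
    next_dagrun_run_after: Optional[datetime] = None


def recompute_after_unpause(
    dag: ToyDag,
    serialized_intervals: Dict[str, timedelta],  # toy "serialized Dag" lookup
    last_automated_runs: Dict[str, datetime],    # toy "latest non-manual run" lookup
    now: datetime,
) -> bool:
    """Recompute once after an unpause; return False when it is a no-op."""
    if dag.is_paused:
        return False  # defensive: the Dag never actually transitioned
    interval = serialized_intervals.get(dag.dag_id)
    if interval is None:
        return False  # nothing serialized yet; a later parse cycle fills it in
    # Assumed arithmetic: schedule one interval after the last automated
    # run, or after `now` when the Dag has never run.
    base = last_automated_runs.get(dag.dag_id, now)
    dag.next_dagrun_run_after = base + interval
    return True


now = datetime(2026, 1, 31, 22, tzinfo=timezone.utc)
last_run = datetime(2026, 1, 30, 1, tzinfo=timezone.utc)
intervals = {"daily": timedelta(days=1)}
runs = {"daily": last_run}

still_paused = ToyDag("daily", is_paused=True)
unparsed = ToyDag("not_serialized", is_paused=False)
unpaused = ToyDag("daily", is_paused=False)

results = [
    recompute_after_unpause(d, intervals, runs, now)
    for d in (still_paused, unparsed, unpaused)
]
```

Returning a boolean here is only a convenience for the sketch; the description above does not say what the real helper returns.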
## E2E evidence
The repro lives at `/tmp/66907_api_drift_repro.py` (also embedded in
#66907's body). It drives the real `airflow.api_fastapi.app.create_app()` via
`fastapi.testclient.TestClient` against a real SQLite metadata DB,
time-machining five parse cycles for the same paused Dag with the same
`last_automated_run`.
### Before (on `main`)
```
DAG: paused_drift_repro_66907 schedule: 0 1 * * * catchup=False
is_paused=True (throughout)
--- parse cycle @ 2026-01-02T22:00:00+00:00 ---
"next_dagrun_logical_date": "2026-01-02T01:00:00Z"
"next_dagrun_run_after": "2026-01-03T01:00:00Z"
--- parse cycle @ 2026-01-04T22:00:00+00:00 ---
"next_dagrun_logical_date": "2026-01-03T01:00:00Z"
"next_dagrun_run_after": "2026-01-04T01:00:00Z"
--- parse cycle @ 2026-01-08T22:00:00+00:00 ---
"next_dagrun_logical_date": "2026-01-07T01:00:00Z"
"next_dagrun_run_after": "2026-01-08T01:00:00Z"
--- parse cycle @ 2026-01-15T22:00:00+00:00 ---
"next_dagrun_logical_date": "2026-01-14T01:00:00Z"
"next_dagrun_run_after": "2026-01-15T01:00:00Z"
--- parse cycle @ 2026-01-31T22:00:00+00:00 ---
"next_dagrun_logical_date": "2026-01-30T01:00:00Z"
"next_dagrun_run_after": "2026-01-31T01:00:00Z"
```
Same Dag, paused throughout, same `last_automated_run`. Only the wall clock
moves; the API response moves with it. This is what every external REST
consumer sees today.
### After (this PR)
```
--- parse cycle @ 2026-01-02T22:00:00+00:00 ---
"next_dagrun_logical_date": null
"next_dagrun_run_after": null
--- parse cycle @ 2026-01-04T22:00:00+00:00 ---
"next_dagrun_logical_date": null
"next_dagrun_run_after": null
--- parse cycle @ 2026-01-08T22:00:00+00:00 ---
"next_dagrun_logical_date": null
"next_dagrun_run_after": null
--- parse cycle @ 2026-01-15T22:00:00+00:00 ---
"next_dagrun_logical_date": null
"next_dagrun_run_after": null
--- parse cycle @ 2026-01-31T22:00:00+00:00 ---
"next_dagrun_logical_date": null
"next_dagrun_run_after": null
```
The fields are `null` in this run only because the test scenario creates the
Dag already-paused — there was never an unpaused parse cycle to populate them.
In a realistic flow (unpaused → run fires → fields populated → user pauses →
drift stops) the fields stay frozen at whatever they were *at pause time*, not
`null`.
## Tests
Three new unit tests in `airflow-core/tests/unit/models/test_dag.py`:
- `test_calculate_dagrun_date_fields_short_circuits_when_paused` —
establishes a baseline while unpaused, flips `is_paused=True`, time-machines
forward several years, asserts the fields didn't move.
- `test_recompute_next_dagrun_fields_after_unpause` — sets
`next_dagrun_*=None` and `is_paused=True`, flips to unpaused, calls the helper,
asserts the fields are populated.
- `test_recompute_next_dagrun_fields_after_unpause_noop_when_still_paused` —
calls the helper on a still-paused Dag, asserts no fields are touched.
The existing parametrized `test_calculate_dagrun_date_fields` continues to
pass — `is_paused` defaults to `False` so the new short-circuit doesn't engage
on the unpaused path.
```
3 passed, 183 deselected, 1 warning in 2.91s
```
## Risk
Backwards-incompatible for any external consumer that today reads
`next_dagrun_logical_date` / `next_dagrun_run_after` on a paused Dag and relies
on it advancing each parse cycle. That advancing value is the drift this PR
targets: anyone treating it as a prediction of a real future run is already
misled (the Dag is paused; nothing will fire). The snapshot frozen at pause
time is the more honest contract: it describes the run that *would* have fired
next if the Dag hadn't been paused.
The scheduler-side run-creation query already filters by `is_paused=False`,
so no run will be materialized off a stale frozen value either way.
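To illustrate why a frozen value is harmless on the scheduler side, here is a toy model of that filter. The real run-creation path is a database query; `ToyDagRow`, `dags_needing_runs`, and the `run_after <= now` condition are assumptions for this sketch only.

```python
from datetime import datetime, timezone
from typing import List, NamedTuple, Optional


class ToyDagRow(NamedTuple):
    """Hypothetical row shape; not Airflow's actual table."""

    dag_id: str
    is_paused: bool
    next_dagrun_run_after: Optional[datetime]


def dags_needing_runs(rows: List[ToyDagRow], now: datetime) -> List[str]:
    # Paused rows are excluded before their timestamps are even compared,
    # so a stale frozen value can never trigger a run.
    return [
        row.dag_id
        for row in rows
        if not row.is_paused
        and row.next_dagrun_run_after is not None
        and row.next_dagrun_run_after <= now
    ]


now = datetime(2026, 1, 31, 22, tzinfo=timezone.utc)
rows = [
    ToyDagRow("active_dag", False, datetime(2026, 1, 31, 1, tzinfo=timezone.utc)),
    # Frozen weeks ago and long overdue, but paused.
    ToyDagRow("paused_dag", True, datetime(2026, 1, 3, 1, tzinfo=timezone.utc)),
]
due = dags_needing_runs(rows, now)
```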
Closes #66907.