1fanwang commented on code in PR #66773:
URL: https://github.com/apache/airflow/pull/66773#discussion_r3269906184
##########
airflow-core/src/airflow/jobs/scheduler_job_runner.py:
##########
@@ -2938,7 +2944,10 @@ def _find_task_instances_without_heartbeats(self, *,
session: Session) -> list[T
.join(DM, TI.dag_id == DM.dag_id)
.where(
TI.state.in_((TaskInstanceState.RUNNING,
TaskInstanceState.RESTARTING)),
- TI.last_heartbeat_at < limit_dttm,
+ or_(
+ TI.last_heartbeat_at < limit_dttm,
Review Comment:
Circling back — you're right, @ephraimbuddy. Closing this; the live repro
shows this isn't the right fix for #58307.
Ran the repro on a freshly-migrated A3 sqlite deployment — `airflow db
migrate`, a real DAG, `airflow standalone`. Triggered, two TIs reached running
through `/run`. Stopped the scheduler, set `last_heartbeat_at = NULL` on both
rows to simulate the post-migration state, restarted with DEBUG logging.
```
19:38:30.413671 | info | Adopting or resetting orphaned tasks for active dag
runs (scheduler_job_runner.py:2815)
19:38:30.445094 | info | Reset the following 2 orphaned TaskInstances
(scheduler_job_runner.py:2894)
```
`adopt_or_reset_orphaned_tasks` rotates both into `task_instance_history`
within 32 ms of startup; fresh rows come up with `last_heartbeat_at` populated
by `/run` in 4 s. The cleanup query only ever sees the fresh rows, so the `OR
last_heartbeat_at IS NULL` predicate never fires.
The repro did surface something different in #58307. vgl-grin is on
`heartbeat_sec=0`, `heartbeat_timeout=300`, containers respawning between runs.
Their `last_heartbeat_at` isn't NULL; `/run` sets it. But the adopt path on
each respawn looks like it can land a fresh `last_heartbeat_at`, which would
restart the timeout clock indefinitely. That fits the asymmetry — `timeout=1`
or `2` terminate, `300/4/5/6/10` don't. Different code path than this PR —
leaving #58307 open.
Thanks for the pushback.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]