Both options make sense to me. Using the Log table allows retrospectively investigating the scheduler’s behaviour, but that is arguably not valuable since you can already do that with logging. Most of the time it just takes up needless disk space. So yeah, I feel adding a field to dagrun is reasonable.
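To make the fallback in the proposal quoted below concrete, here is a rough sketch of how I read it. This is not Airflow code: `DagRunRow`, `deadline_reference` and `deadline` are hypothetical names for illustration, and only the `queued_at` and `last_queued_at` columns come from the proposal.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class DagRunRow:
    """Hypothetical stand-in for a dagrun row, not Airflow's actual model."""

    queued_at: Optional[datetime]
    # Proposed column; the migration leaves it null for existing rows.
    last_queued_at: Optional[datetime] = None


def deadline_reference(run) -> Optional[datetime]:
    """Prefer the most recent queue time, else fall back to the first one."""
    # getattr covers the "missing" case as well as the "null" case.
    last = getattr(run, "last_queued_at", None)
    return last if last is not None else run.queued_at


def deadline(run, interval: timedelta) -> Optional[datetime]:
    """Return the moment the alert would fire, or None if never queued."""
    ref = deadline_reference(run)
    return None if ref is None else ref + interval


first = datetime(2026, 1, 14, 10, 0, tzinfo=timezone.utc)
requeued = first + timedelta(minutes=45)  # run cleared and queued again

# With the new column the 30 minute window restarts at the requeue time.
print(deadline(DagRunRow(first, requeued), timedelta(minutes=30)))
# A pre-migration row (null last_queued_at) keeps the current behaviour.
print(deadline(DagRunRow(first), timedelta(minutes=30)))
```

In the example, a run first queued at 10:00 and requeued at 10:45 gets its 30 minute deadline at 11:15 instead of 10:30, while a row that only has `queued_at` behaves as it does today.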
TP

> On 14 Jan 2026, at 04:36, Ferruzzi, Dennis <[email protected]> wrote:
>
> Proposal: I plan to implement a new timestamp column named `last_queued_at`
> to the `dagrun` table which is updated any time the run is queued, including
> when it is cleared. DeadlineAlert code will be modified to use this new
> column for any calculations which currently use `dagrun.queued_at` and will
> fall back on `queued_at` if it is `null` or missing. This will require a
> small migration which sets the new column to `null` for existing rows.
>
> To summarize the discussion [1] regarding the `dagrun.queued_at` field: it
> currently tracks the initial queue time and is never updated, which breaks
> expected behavior of DeadlineAlerts (and maybe other areas?) if a run is
> cleared or re-triggered. Which means the `queued_at` column essentially
> represents the first time this run was queued, not the most recent time it
> was attempted. For example, if you expect an email if the run takes more
> than 30 minutes from when it was queued and it gets cleared and restarted,
> you get that email 30 minutes from the first time it was queued regardless of
> how long it actually took to run.
>
> There was a good discussion there and on Slack about expectations and a few
> ideas were proposed. I think these are the two primary options:
>
> Option 1: We leave `dagrun.queued_at` alone to represent the first time it
> was attempted and add a new field to the `dagrun` table which is updated each
> time it is queued, representing the most recent attempt.
>
> Option 2: Add rows to the `Log` table to store when a run was queued/requeued
> (as suggested by Standish) and use that as the source of truth for when a
> specific run was last attempted.
>
>
> While I like Option 2, it's a bigger project and feels like overkill for
> this, especially considering the recent discussion [2] about the Log table
> getting out of hand on some environments. I think maybe Option 1 is the
> right answer. It maintains backward compatibility and solves the immediate
> issue well.
>
> If there are no objections, I'll consider this accepted on Friday, 16 Jan at
> 21:00 UTC.
>
>
> [1] Email thread "DagRun queued_at timestamp discussion":
> https://lists.apache.org/thread/n5y2khy8l9472spoclmql3nj2bskqksj
> [2] Email thread "Managing airflow database size and retention":
> https://lists.apache.org/thread/88odp590r1syklo5rok4tq3kxpkhv922
>
>
> - ferruzzi
