spider-man-tm opened a new pull request, #69086:
URL: https://github.com/apache/airflow/pull/69086

   Add a new scheduler.dagruns.not_started gauge metric that tracks scheduled 
DagRuns that have not started within a configurable threshold. This covers two 
cases: DagRuns not yet created (scheduler backlog) and DagRuns stuck in the 
queued state (executor backlog). Platform teams can now alert on scheduling 
latency directly via StatsD/OpenTelemetry without writing custom DB queries.
   
   The threshold is configurable via [scheduler] dagrun_late_threshold (default 
15.0 seconds), which mirrors the default used by Prefect's MarkLateRuns service 
(PREFECT_API_SERVICES_LATE_RUNS_AFTER_SECONDS). The metric is emitted as a 
gauge, not a counter, so it reflects the current number of late slots and 
resets to zero once all pending DagRuns have started.
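   
   To illustrate the gauge semantics described above, here is a minimal sketch, not the code in this PR. `PendingRun` and `count_not_started` are hypothetical names, and treating a run as late once its scheduled time is at or before `now - threshold` is an assumption about the boundary condition.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional


@dataclass
class PendingRun:
    """Hypothetical stand-in for a scheduled run; not an Airflow model."""

    scheduled_for: datetime
    started_at: Optional[datetime] = None


def count_not_started(
    runs: Iterable[PendingRun],
    now: datetime,
    late_threshold_seconds: float = 15.0,  # mirrors the stated default
) -> int:
    """Return the number of runs that are past the threshold and not started.

    The result is a point-in-time value (a gauge): it is recomputed from
    scratch on every call and drops back to zero once every run has started.
    """
    cutoff = now - timedelta(seconds=late_threshold_seconds)
    return sum(
        1
        for run in runs
        if run.started_at is None and run.scheduled_for <= cutoff
    )


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
runs = [
    PendingRun(scheduled_for=now - timedelta(seconds=60)),  # late
    PendingRun(scheduled_for=now - timedelta(seconds=5)),   # within threshold
]
print(count_not_started(runs, now))  # 1

runs[0].started_at = now  # the late run finally starts
print(count_not_started(runs, now))  # 0
```

   A counter would keep growing with every late run ever observed; recomputing the value each cycle is what lets an alert clear on its own once the backlog drains.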
   
   DB impact: two COUNT(*) queries are added to the existing 
dagrun_metrics_interval (default 30 s) polling cycle. Both leverage existing 
indexes: idx_next_dagrun_create_after on the dag table and 
idx_dag_run_queued_dags (partial index on PostgreSQL/SQLite) on the dag_run 
table. In healthy deployments the queued count is near zero; in degraded 
deployments the query cost is proportional to the problem being diagnosed.
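   
   The shape of the two COUNT(*) queries can be sketched against an in-memory SQLite database. This is an illustration under assumptions, not the SQL in this PR: the schema is reduced to a few columns, timestamps are stored as epoch seconds, the `queued_at` filter is a guess at how "stuck in queued" is measured, and `make_demo_db` and `count_late` are hypothetical helpers. Only the index and table names come from the description above.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory stand-in for the dag and dag_run tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dag (
            dag_id TEXT PRIMARY KEY,
            next_dagrun_create_after REAL  -- epoch seconds (simplified)
        );
        CREATE TABLE dag_run (
            id INTEGER PRIMARY KEY,
            dag_id TEXT,
            state TEXT,
            queued_at REAL  -- epoch seconds (simplified)
        );
        CREATE INDEX idx_next_dagrun_create_after
            ON dag (next_dagrun_create_after);
        CREATE INDEX idx_dag_run_queued_dags
            ON dag_run (dag_id) WHERE state = 'queued';
        """
    )
    return conn


def count_late(
    conn: sqlite3.Connection,
    now: datetime,
    threshold_seconds: float = 15.0,
) -> tuple[int, int]:
    """Return (not_yet_created, stuck_in_queued) counts past the threshold."""
    cutoff = (now - timedelta(seconds=threshold_seconds)).timestamp()
    # Case 1: the scheduler should already have created a run for this DAG.
    not_created = conn.execute(
        "SELECT COUNT(*) FROM dag WHERE next_dagrun_create_after <= ?",
        (cutoff,),
    ).fetchone()[0]
    # Case 2: a run exists but has been sitting in the queued state.
    stuck_queued = conn.execute(
        "SELECT COUNT(*) FROM dag_run WHERE state = 'queued' AND queued_at <= ?",
        (cutoff,),
    ).fetchone()[0]
    return not_created, stuck_queued
```

   Both statements return a single row, so the per-cycle cost is dominated by how many rows match the filter, which is the sense in which the cost scales with the backlog being diagnosed.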
   
   ---
   
   ##### Was generative AI tooling used to co-author this PR?
   
   - [x] Yes (please specify the tool below)
   
   Claude Code
   

