spider-man-tm opened a new pull request, #69086: URL: https://github.com/apache/airflow/pull/69086
Add a new `scheduler.dagruns.not_started` gauge metric that tracks scheduled DagRuns that have not started within a configurable threshold. This covers two cases: DagRuns not yet created (scheduler backlog) and DagRuns stuck in the queued state (executor backlog). Platform teams can now alert on scheduling latency directly via StatsD/OpenTelemetry without writing custom DB queries.

The threshold is configurable via `[scheduler] dagrun_late_threshold` (default 15.0 seconds), which mirrors the default used by Prefect's MarkLateRuns service (`PREFECT_API_SERVICES_LATE_RUNS_AFTER_SECONDS`). The metric is emitted as a gauge, not a counter, so it reflects the current number of late slots and resets to zero once all pending DagRuns have started.

DB impact: two `COUNT(*)` queries are added to the existing `dagrun_metrics_interval` (default 30 s) polling cycle. Both leverage existing indexes: `idx_next_dagrun_create_after` on the `dag` table and `idx_dag_run_queued_dags` (partial index on PostgreSQL/SQLite) on the `dag_run` table. In healthy deployments the queued count is near zero; in degraded deployments the query cost is proportional to the problem being diagnosed.

---

##### Was generative AI tooling used to co-author this PR?

- [x] Yes (please specify the tool below)

Claude Code

---

* Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#pull-request-guidelines)** for more information. Note: commit author/co-author name and email in commits become permanently public when merged.
* For fundamental code changes, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals)) is needed.
* When adding dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
* For significant user-facing changes create newsfragment: `{pr_number}.significant.rst`, in [airflow-core/newsfragments](https://github.com/apache/airflow/tree/main/airflow-core/newsfragments). You can add this file in a follow-up commit after the PR is created so you know the PR number.
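
For readers who want to see the shape of the two `COUNT(*)` queries described in the PR summary, here is a minimal sketch. It is not the code from this PR. It assumes a simplified, hypothetical schema (a `dag` table with `next_dagrun_create_after` and a `dag_run` table with `state` and `queued_at`, timestamps stored as epoch seconds) in an in-memory SQLite database, and the helper names `make_demo_db`, `seed_demo_rows`, and `count_not_started` are invented for illustration. Filters a real scheduler would likely need (paused or inactive DAGs, for example) are left out.

```python
import sqlite3

# Assumed to match the default quoted for [scheduler] dagrun_late_threshold.
LATE_THRESHOLD_SECONDS = 15.0


def make_demo_db():
    """Create an in-memory database with a simplified, hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dag (dag_id TEXT PRIMARY KEY, next_dagrun_create_after REAL);
        CREATE TABLE dag_run (run_id TEXT PRIMARY KEY, state TEXT, queued_at REAL);
        """
    )
    return conn


def seed_demo_rows(conn):
    """Insert sample rows; timestamps are epoch seconds relative to now=1000.0."""
    conn.executemany(
        "INSERT INTO dag VALUES (?, ?)",
        [("on_time", 995.0), ("overdue", 900.0), ("no_schedule", None)],
    )
    conn.executemany(
        "INSERT INTO dag_run VALUES (?, ?, ?)",
        [("r1", "queued", 990.0), ("r2", "queued", 950.0), ("r3", "running", 100.0)],
    )


def count_not_started(conn, now, threshold_seconds=LATE_THRESHOLD_SECONDS):
    """Return the gauge value: runs overdue for creation plus runs queued too long."""
    cutoff = now - threshold_seconds
    # Case 1 (scheduler backlog): a run was due before the cutoff but is not created yet.
    (not_created,) = conn.execute(
        "SELECT COUNT(*) FROM dag WHERE next_dagrun_create_after <= ?", (cutoff,)
    ).fetchone()
    # Case 2 (executor backlog): a run has been in the queued state since before the cutoff.
    (stuck_queued,) = conn.execute(
        "SELECT COUNT(*) FROM dag_run WHERE state = 'queued' AND queued_at <= ?",
        (cutoff,),
    ).fetchone()
    return not_created + stuck_queued


if __name__ == "__main__":
    conn = make_demo_db()
    seed_demo_rows(conn)
    print(count_not_started(conn, now=1000.0))  # prints 2
```

Because the value is recomputed from the current rows on each polling cycle, it behaves like a gauge: once the overdue run is created and the queued run starts, the same call returns 0.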
