1fanwang opened a new issue, #66800: URL: https://github.com/apache/airflow/issues/66800
### Description

`scheduler_job_runner.py` emits gauges for pool slot states (`pool.open_slots`, `pool.queued_slots`, `pool.running_slots`, `pool.starving_tasks`). On most backends, gauges are last-write-wins: a spike in pool pressure between two scheduler loop iterations shows up as a single value, and the distribution between scrapes is lost.

### Use case / motivation

Backend operators sizing pools want p50/p95/p99 of pool utilization, not just point-in-time gauge samples. Today there's no way to see the spread.

### Proposal

Alongside each existing pool slot gauge emission, also emit a histogram with the same value. Four `Stats.histogram(...)` additions in `scheduler_job_runner.py`, same call sites as the existing gauges. Nothing removed; gauges stay for backwards-compatible scrapers.

### Are you willing to submit a PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's Code of Conduct
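
### Illustrative sketch

The proposal (a histogram sample emitted next to each existing pool slot gauge) could look roughly like the sketch below. It is a minimal, self-contained illustration and not the scheduler's actual code: `InMemoryStats` and `emit_pool_slot_metrics` are hypothetical names, the per-pool metric suffix and the sample values are assumptions, and whether the real `Stats` client exposes `histogram` on every backend is not confirmed here.

```python
from collections import defaultdict


class InMemoryStats:
    """Hypothetical stand-in for a metrics client; not Airflow's Stats class."""

    def __init__(self):
        self.gauges = {}
        self.histograms = defaultdict(list)

    def gauge(self, name, value):
        # Last write wins: any earlier value for this name is overwritten.
        self.gauges[name] = value

    def histogram(self, name, value):
        # Every observation is kept, so the distribution can be recovered.
        self.histograms[name].append(value)


def emit_pool_slot_metrics(stats, pool_name, slot_stats):
    """Emit each pool slot value as a gauge and as a histogram sample."""
    for state in ("open_slots", "queued_slots", "running_slots", "starving_tasks"):
        # Assumed naming scheme: metric name suffixed with the pool name.
        name = f"pool.{state}.{pool_name}"
        value = slot_stats[state]
        stats.gauge(name, value)      # existing behaviour, kept as-is
        stats.histogram(name, value)  # proposed addition, same call site


stats = InMemoryStats()
for running in (2, 3, 50, 4):  # made-up samples, one per scheduler loop
    emit_pool_slot_metrics(
        stats,
        "default_pool",
        {
            "open_slots": 128 - running,
            "queued_slots": 0,
            "running_slots": running,
            "starving_tasks": 0,
        },
    )

name = "pool.running_slots.default_pool"
print(stats.gauges[name])              # only the final value remains
print(sorted(stats.histograms[name]))  # all observations are retained
```

In this toy run the spike to 50 running slots is invisible in the gauge once the next loop iteration overwrites it, while the histogram still holds the sample, which is what makes percentile queries possible on the backend side.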
