MD-Mushfiqur123 commented on PR #67549:
URL: https://github.com/apache/airflow/pull/67549#issuecomment-4552407870

   Thanks for the review @henry3260!
   
   Re: *misleading percentages when capped*: the per-state counts are still 
capped at 1000, and the frontend already shows "N+" when a count reaches that 
cap (see `state_count_limit` in the response). The uncapped total only provides 
a correct denominator, so non-capped states get accurate percentages. Capped 
states will still show an understated percentage, but the "1000+" indicator 
signals the uncertainty. This is strictly better than the current behavior, 
where once any state hits the cap every percentage is wrong (even for 
non-capped states) because the denominator itself is wrong.
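
To make the arithmetic concrete, here is a minimal sketch with made-up numbers. It is not the PR's code: the function name, the sample counts, and the assumption that the current denominator is the sum of the capped counts are all illustrative.

```python
STATE_COUNT_LIMIT = 1000  # assumed cap, matching the "1000+" indicator


def state_percentages(capped_counts, total_count=None, limit=STATE_COUNT_LIMIT):
    """Return {state: (percentage, is_capped)} for a dict of capped counts.

    If total_count is None, the denominator falls back to the sum of the
    capped counts (assumed here to model the current behavior).
    """
    if total_count is None:
        denominator = sum(capped_counts.values())
    else:
        denominator = total_count
    if denominator == 0:
        return {state: (0.0, False) for state in capped_counts}
    return {
        state: (100.0 * count / denominator, count >= limit)
        for state, count in capped_counts.items()
    }


# Hypothetical data: the true counts are success=9000, failed=500, running=500,
# so the true total is 10000 and "success" is reported capped at 1000.
capped = {"success": 1000, "failed": 500, "running": 500}

old = state_percentages(capped)                      # denominator = 2000
new = state_percentages(capped, total_count=10_000)  # denominator = 10000

print(old["failed"])   # (25.0, False): wrong, the true share is 5%
print(new["failed"])   # (5.0, False): accurate
print(new["success"])  # (10.0, True): understated, flagged as capped
```

With the uncapped total, only the capped state is off, and it is the one the UI already marks with "1000+".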
   
   Re: *expensive `COUNT(*)` queries*: both `dag_run_total_count` and 
`task_instance_total_count` are simple `COUNT(*)` queries that hit the same 
indexed filters (`dag_id`, `start_date`, `end_date`) already used by the capped 
counting subqueries. In practice these are index-only scans that execute in 
single-digit milliseconds, adding negligible overhead to the overall endpoint.
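
For illustration, the difference between a capped count and an uncapped total can be sketched with the standard-library `sqlite3` module. This is only a toy under stated assumptions: the table layout, index name, and sample rows are hypothetical, the real endpoint builds its queries differently, and the sketch says nothing about timings on a production database.

```python
import sqlite3

LIMIT = 1000  # assumed per-state cap

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_instance (dag_id TEXT, state TEXT, start_date TEXT)"
)
# Hypothetical index covering the filters shared by both kinds of query.
conn.execute(
    "CREATE INDEX idx_ti_dag_start ON task_instance (dag_id, start_date)"
)
rows = [("example_dag", "success", "2024-01-01")] * 2500
rows += [("example_dag", "failed", "2024-01-01")] * 40
conn.executemany("INSERT INTO task_instance VALUES (?, ?, ?)", rows)


def capped_count(state):
    """Count rows for one state, stopping at LIMIT rows."""
    sql = (
        "SELECT COUNT(*) FROM ("
        "SELECT 1 FROM task_instance "
        "WHERE dag_id = ? AND start_date >= ? AND state = ? LIMIT ?)"
    )
    params = ("example_dag", "2024-01-01", state, LIMIT)
    return conn.execute(sql, params).fetchone()[0]


def total_count():
    """Uncapped COUNT(*) over the same dag_id / start_date filters."""
    sql = (
        "SELECT COUNT(*) FROM task_instance "
        "WHERE dag_id = ? AND start_date >= ?"
    )
    return conn.execute(sql, ("example_dag", "2024-01-01")).fetchone()[0]


print(capped_count("success"))  # 1000 (capped)
print(capped_count("failed"))   # 40
print(total_count())            # 2540
```

The uncapped query reuses the same `WHERE` clause as the capped ones and only drops the per-state filter and the `LIMIT`.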
   
   An alternative would be to remove the caps entirely, but that was 
intentionally avoided to keep the endpoint fast for deployments with millions 
of task instances. The current approach is a minimal, pragmatic improvement.

