sonalprsd opened a new issue, #27782:
URL: https://github.com/apache/airflow/issues/27782

   ### Description
   
   Airflow/MWAA does not seem to have a scalable API for returning the status 
of a dagRun; the states-for-dag-run and list-runs APIs do not scale well. To 
fetch the dagRun status, every team seems to build a custom solution, either 
using sns_notification or writing the status to an external data store via 
Airflow callbacks.
   
   The ask is to expose an API that returns the dagRun status as quickly as 
possible, backed by an internal query operation rather than a scan.
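
To illustrate the difference between a query and a scan, here is a minimal 
sketch using SQLite. The `run_status` table, its columns, and the `get_state` 
and `plan` helpers are hypothetical and are not Airflow's actual schema or 
code; the point is only that a lookup keyed on (dag_id, run_id) can use an 
index, while a filter on an unindexed column has to walk the table.

```python
import sqlite3

# Hypothetical status table keyed by (dag_id, run_id); not Airflow's schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE run_status ("
    " dag_id TEXT NOT NULL,"
    " run_id TEXT NOT NULL,"
    " state TEXT NOT NULL,"
    " PRIMARY KEY (dag_id, run_id))"
)
conn.executemany(
    "INSERT INTO run_status VALUES (?, ?, ?)",
    [("etl", f"run_{i}", "running" if i % 2 else "success") for i in range(1000)],
)

KEYED_SQL = "SELECT state FROM run_status WHERE dag_id = ? AND run_id = ?"
UNKEYED_SQL = "SELECT * FROM run_status WHERE state = ?"


def get_state(dag_id, run_id):
    """Return the state of one run via the primary-key index, or None."""
    row = conn.execute(KEYED_SQL, (dag_id, run_id)).fetchone()
    return row[0] if row else None


def plan(sql, params):
    """Return SQLite's query-plan description for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " ".join(row[-1] for row in rows)


keyed_plan = plan(KEYED_SQL, ("etl", "run_3"))      # index search
unkeyed_plan = plan(UNKEYED_SQL, ("running",))      # full table scan
```

With a keyed lookup, the cost of fetching one run's status does not depend 
on how many other runs exist, which is the behaviour being requested.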
   
   Discussion: https://github.com/apache/airflow/discussions/27765
   
   ### Use case/motivation
   
   My use case is to fetch the status of all active DAG runs and update the 
status tables in our system. A poller does this, with a timeout of 150s 
configured based on our SLA. The states-for-dag-run API seems to perform a scan 
internally: as the number of DAG runs in the system increases, the time to get 
the status of a dagRun increases as well. Initially, fetching the status of 100 
runs took 2.5 minutes. After the number of dagRuns in the system grew by 50, 
fetching the status of 100 dagRuns takes more than 5 minutes.
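
For illustration, the poller described above can be sketched roughly as 
follows. This is a simplified sketch, not our production code: 
`fetch_statuses` and `fetch_state` are hypothetical names, and `fetch_state` 
stands in for whatever call returns the state of a single run (for example, a 
wrapper around states-for-dag-run). The clock is injectable only so the 
deadline logic can be exercised without waiting.

```python
import time
from typing import Callable, Dict, Iterable, List, Tuple

RunKey = Tuple[str, str]  # (dag_id, run_id)


def fetch_statuses(
    run_keys: Iterable[RunKey],
    fetch_state: Callable[[str, str], str],
    timeout_s: float = 150.0,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[Dict[RunKey, str], List[RunKey]]:
    """Fetch the state of each run until the time budget is used up.

    Returns the states that were fetched and the runs that were skipped
    because the deadline had already passed.
    """
    deadline = clock() + timeout_s
    statuses: Dict[RunKey, str] = {}
    skipped: List[RunKey] = []
    for key in run_keys:
        if clock() >= deadline:
            # Out of budget: record the run so the caller can retry it later.
            skipped.append(key)
            continue
        dag_id, run_id = key
        statuses[key] = fetch_state(dag_id, run_id)
    return statuses, skipped


# Usage with a stubbed status call and a fake clock (hypothetical values).
_ticks = iter([0.0, 1.0, 2.0, 200.0])
demo_statuses, demo_skipped = fetch_statuses(
    [("etl", "r1"), ("etl", "r2"), ("report", "r1")],
    fetch_state=lambda dag_id, run_id: "running",
    timeout_s=150.0,
    clock=lambda: next(_ticks),
)
```

At 2.5 minutes for 100 runs, each status call averages about 1.5s, so a 150s 
budget is already fully consumed; once each call slows down further, part of 
every sweep ends up in the skipped list.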
   
   ### Related issues
   
   NA
   
   ### Are you willing to submit a PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
