pierrejeambrun opened a new pull request, #48771: URL: https://github.com/apache/airflow/pull/48771
Improve the Grid endpoint response time. There are plenty of other improvements that can be made, but this PR focuses on the endpoint's main bottleneck, `fill_task_instance_summaries`.

Deserializing the serialized dag from the DB and computing `get_task_group_map` for it are both expensive. More importantly, both are done on the serdag for each TI. The serdag stays the same as long as the version does not change, so caching both results helps.

On the dag `nested_groups` with more than 20 dag runs, the endpoint response time went from 1.5-2 s to 0.5-0.7 s. More importantly, with caching this cost will not scale linearly with the number of tasks.
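
The caching idea described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `TaskInstanceRow`, `expensive_deserialize`, `expensive_task_group_map`, and `summarize` are hypothetical stand-ins for the real serialized dag lookup, `get_task_group_map`, and `fill_task_instance_summaries`. It assumes each TI carries a dag version identifier that can serve as the cache key.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for illustration only; these are not Airflow's API.
CALLS = {"deserialize": 0}


@dataclass(frozen=True)
class TaskInstanceRow:
    task_id: str
    dag_version_id: str


def expensive_deserialize(dag_version_id: str) -> dict:
    """Pretend to load and deserialize a serialized dag; counts invocations."""
    CALLS["deserialize"] += 1
    return {"version": dag_version_id, "groups": {"group_a": ["t1", "t2"]}}


def expensive_task_group_map(dag: dict) -> dict:
    """Pretend to compute a task_id -> task group mapping for the dag."""
    return {
        task_id: group
        for group, task_ids in dag["groups"].items()
        for task_id in task_ids
    }


def summarize(task_instances):
    """Build (task_id, group) pairs, deserializing once per dag version."""
    cache = {}  # dag_version_id -> (dag, task_group_map)
    summaries = []
    for ti in task_instances:
        if ti.dag_version_id not in cache:
            dag = expensive_deserialize(ti.dag_version_id)
            cache[ti.dag_version_id] = (dag, expensive_task_group_map(dag))
        _dag, group_map = cache[ti.dag_version_id]
        summaries.append((ti.task_id, group_map.get(ti.task_id)))
    return summaries
```

In this sketch the expensive calls run once per distinct dag version rather than once per TI, which is the effect the description above attributes to the cache.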
