[ 
https://issues.apache.org/jira/browse/AIRFLOW-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Bregula updated AIRFLOW-6965:
-----------------------------------
    Summary: The get_task_instances method is performed three times during one 
creation of the DAGRun file.  (was: The method is performed playthree times 
during one creation of the DAGRun file.)

> The get_task_instances method is performed three times during one creation of 
> the DAGRun file.
> ----------------------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-6965
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6965
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: scheduler
>    Affects Versions: 1.10.9
>            Reporter: Kamil Bregula
>            Priority: Major
>
> Hello,
> The get_task_instances query is executed three times while a single DagRun 
> is created and processed. This is redundant. Reducing the number of these 
> queries should improve scheduler performance.
> First query:
> perform_file: 
> [https://github.com/apache/airflow/blob/cc562dd/airflow/jobs/scheduler_job.py#L792]
> process_dags: 
> [https://github.com/apache/airflow/blob/cc562dd/airflow/jobs/scheduler_job.py#L853]
> create_dag_run: 
> [https://github.com/apache/airflow/blob/cc562ddfc7a53932d89c92ee1fb8f780c1fb38e3/airflow/jobs/scheduler_job.py#L726]
> create_dagrun: 
> [https://github.com/apache/airflow/blob/cc562ddfc7a53932d89c92ee1fb8f780c1fb38e3/airflow/jobs/scheduler_job.py#L638]
> verify_integrity: 
> [https://github.com/apache/airflow/blob/cc562ddfc7a53932d89c92ee1fb8f780c1fb38e3/airflow/models/dag.py#L1454]
> get_task_instances: 
> [https://github.com/apache/airflow/blob/cc562ddfc7a53932d89c92ee1fb8f780c1fb38e3/airflow/models/dagrun.py#L436]
> Second query:
> perform_file: 
> [https://github.com/apache/airflow/blob/cc562dd/airflow/jobs/scheduler_job.py#L792]
> process_dags: 
> [https://github.com/apache/airflow/blob/cc562dd/airflow/jobs/scheduler_job.py#L853]
> _process_task_instances: 
> [https://github.com/apache/airflow/blob/cc562dd/airflow/jobs/scheduler_job.py#L738]
> update_state: 
> [https://github.com/apache/airflow/blob/cc562ddfc7a53932d89c92ee1fb8f780c1fb38e3/airflow/jobs/scheduler_job.py#L685]
> get_task_instances: 
> [https://github.com/apache/airflow/blob/cc562ddfc7a53932d89c92ee1fb8f780c1fb38e3/airflow/models/dagrun.py#L292]
> Third query:
> perform_file: 
> [https://github.com/apache/airflow/blob/cc562dd/airflow/jobs/scheduler_job.py#L792]
> process_dags: 
> [https://github.com/apache/airflow/blob/cc562dd/airflow/jobs/scheduler_job.py#L853]
> _process_task_instances: 
> [https://github.com/apache/airflow/blob/cc562dd/airflow/jobs/scheduler_job.py#L738]
> verify_integrity: 
> [https://github.com/apache/airflow/blob/cc562ddfc7a53932d89c92ee1fb8f780c1fb38e3/airflow/jobs/scheduler_job.py#L684]
> get_task_instances: 
> [https://github.com/apache/airflow/blob/cc562ddfc7a53932d89c92ee1fb8f780c1fb38e3/airflow/models/dagrun.py#L436]
>  
>  
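The redundancy described in the issue can be illustrated with a minimal sketch. It is not Airflow code: ToyDagRun, process_naive, process_shared, and the optional task_instances parameter are hypothetical names made up for this illustration, and the counter merely stands in for a database query. The sketch assumes the task instances do not change between the three calls; in a real scheduler, verify_integrity may create rows, so a shared result could go stale and would need refreshing.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyDagRun:
    """Hypothetical stand-in for a DAG run; this is not Airflow's DagRun."""

    task_ids: List[str]
    query_count: int = 0

    def get_task_instances(self) -> List[str]:
        # Each call stands for one database query.
        self.query_count += 1
        return list(self.task_ids)

    def verify_integrity(self, task_instances: Optional[List[str]] = None) -> None:
        # Fall back to a fresh query only when the caller supplies nothing.
        if task_instances is None:
            task_instances = self.get_task_instances()
        missing = set(self.task_ids) - set(task_instances)
        if missing:
            raise ValueError(f"missing task instances: {sorted(missing)}")

    def update_state(self, task_instances: Optional[List[str]] = None) -> str:
        if task_instances is None:
            task_instances = self.get_task_instances()
        return "running" if task_instances else "success"


def process_naive(run: ToyDagRun) -> str:
    """Mirrors the three call chains in the issue: every step queries again."""
    run.verify_integrity()        # first query (on DagRun creation)
    state = run.update_state()    # second query
    run.verify_integrity()        # third query
    return state


def process_shared(run: ToyDagRun) -> str:
    """Queries once and hands the same result to every step."""
    task_instances = run.get_task_instances()  # the only query
    run.verify_integrity(task_instances)
    state = run.update_state(task_instances)
    run.verify_integrity(task_instances)
    return state
```

Calling process_naive on a fresh ToyDagRun leaves query_count at 3, while process_shared leaves it at 1. Whether the real call chains can share one result this way depends on details of the scheduler that this sketch does not model.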



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
