[ https://issues.apache.org/jira/browse/AIRFLOW-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16429355#comment-16429355 ]
Chris Bandy commented on AIRFLOW-2128:
--------------------------------------

[~szmate1618] what is your {{scheduler.min_file_process_interval}} (or the {{AIRFLOW__SCHEDULER__MIN_FILE_PROCESS_INTERVAL}} environment variable) set to?

> 'Tall' DAGs scale worse than 'wide' DAGs
> ----------------------------------------
>
>                 Key: AIRFLOW-2128
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2128
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: DAG, DagRun, scheduler
>   Affects Versions: 1.9.0
>            Reporter: Máté Szabó
>            Priority: Major
>              Labels: performance, usability
>         Attachments: tall_dag.py, wide_dag.py
>
>
> Tall DAG = a DAG with long chains of dependencies, e.g.: 0 -> 1 -> 2 -> ...
> -> 998 -> 999
> Wide DAG = a DAG with many short, parallel dependencies, e.g.: 0 -> 1; 0 -> 2;
> ... 0 -> 999
> Take a super simple case where both graphs have 1000 tasks, and all the
> tasks are just "sleep 0.03" bash commands (see the attached files).
> With the default SequentialExecutor (without parallelism), I would expect my
> 2 example DAGs to take (approximately) the same time to run, but apparently
> this is not the case.
> The wide DAG completed about 80 tasks successfully in 10 minutes; the tall
> one completed 0.
> This anomaly also seems to affect the web UI. Opening the graph view or the
> tree view for the wide DAG takes about 6 seconds on my machine, but for the
> tall one it takes significantly longer; in fact, it currently does not load
> at all.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
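
For reference, the two shapes described in the quoted issue can be sketched as plain edge lists. This is only an illustration in standard-library Python, not the contents of the attached tall_dag.py or wide_dag.py; the helper names tall_edges, wide_edges and longest_chain are hypothetical and do not come from Airflow. It shows that both graphs have the same number of tasks and dependencies and differ only in the length of the longest dependency chain.

```python
# Illustrative sketch only: plain-Python edge lists, NOT the attached
# tall_dag.py / wide_dag.py and NOT Airflow API calls.

def tall_edges(n):
    """Chain of n tasks: 0 -> 1 -> 2 -> ... -> n-1."""
    return [(i, i + 1) for i in range(n - 1)]


def wide_edges(n):
    """Fan-out of n tasks: 0 -> 1; 0 -> 2; ...; 0 -> n-1."""
    return [(0, i) for i in range(1, n)]


def longest_chain(edges, n):
    """Number of edges on the longest dependency chain.

    Assumes every edge goes from a lower task id to a higher one, so
    visiting edges in sorted order handles each upstream task first.
    """
    longest = [0] * n
    for upstream, downstream in sorted(edges):
        longest[downstream] = max(longest[downstream], longest[upstream] + 1)
    return max(longest)


if __name__ == "__main__":
    n = 1000  # task count taken from the issue description
    for name, edges in (("tall", tall_edges(n)), ("wide", wide_edges(n))):
        print(name, "edges:", len(edges), "longest chain:", longest_chain(edges, n))
```

With 1000 tasks, both shapes have 999 dependencies; the tall graph's longest chain is 999 edges, the wide graph's is 1.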