[ https://issues.apache.org/jira/browse/AIRFLOW-2683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rolf Schroeder updated AIRFLOW-2683: ------------------------------------ Attachment: airflow_priorities_purge_in_the_middle.png > priority_weights honored by LocalExecutor but not by CeleryExecutor > ------------------------------------------------------------------- > > Key: AIRFLOW-2683 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2683 > Project: Apache Airflow > Issue Type: Bug > Components: celery, scheduler > Affects Versions: Airflow 1.7.1.3 > Environment: Linux > Reporter: Rolf Schroeder > Priority: Major > Attachments: airflow_priorities_celery.png, > airflow_priorities_dag.png, airflow_priorities_localexecutor.png, > airflow_priorities_purge_in_the_middle.png, test_priority.py > > > Hi, > I am running Airflow 1.7.1.3 with Celery 4.0.2 (and rabbitmq). I have come > across the following issue: It seems to me that task priorities are not > honored as expected when scheduling tasks via Celery. However, when using the > LocalExecutor, the priorities seem to work fine. This issue arises when low > prio tasks get scheduled/queued before high prio tasks. Celery will finish > all previously scheduled/queued low prio tasks before tackling the high prio > ones. In contrast, the LocalExecutor does not further process remaining low > prio tasks and starts the high prio tasks. I fee like once Airflow has sent > the tasks to Celery, it "looses" control of the execution order. Details > below. I searched in Jira and possibly the following issues are linked > (although I am not convinced) > (AIRFLOW-1510)https://issues.apache.org/jira/browse/AIRFLOW-1510 > (AIRFLOW-584)https://issues.apache.org/jira/browse/AIRFLOW-584 > I have the following test setup: Two independent initial tasks, each with 3 > downstream tasks. One of the initial tasks takes longer but has high prio > downstream tasks. The other initial task has a short duration and low prio > downstream tasks. Both initial tasks start at the same time. My expectation > is that some of the low prio tasks get executed first (since their initial > upstream task is faster) but the moment the slower initial taks is done, its > high prio downstream tasks should get executed before the low prio downstream > tasks. > Here is a picture of the setup (forget about init0, this is just to make the > DAG look correct): > !airflow_priorities_dag.png! > I've been trying this with a LocalExecutor (AIRFLOW__CORE__PARALLELISM=2) and > two celery workers. When running the LocalExecutor, the high prio tasks get > indeed executed once their init task is done (as expected, the 'prio100' > tasks finish before the remaining 'prio10' tasks): > !airflow_priorities_localexecutor.png! > However, when using Celery, it seems that that the priorities are not honored > anymore. Once can see clearly that the low prio tasks (prio10) are all > executed before the high prio ones (prio100) > !airflow_priorities_celery.png! > Here is the corresponding code > [^test_priority.py] > I do not know whether this expected behavior or a bug? Is there any way to > "fix" this? -- This message was sent by Atlassian JIRA (v7.6.3#76005)