[ 
https://issues.apache.org/jira/browse/AIRFLOW-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcin Szymanski updated AIRFLOW-3065:
--------------------------------------
    Description: 
In a DAG with a concurrency limit of 4 and about 150 tasks, when the limit of 
active tasks is reached, the scheduler starts to fail queued tasks. They are 
later retried, but if they have downstream tasks, these remain in 
upstream_failed status.

A few additional details:
 * celery executor
 * environment upgraded from 1.9 (no issues back then)
 * all configuration in airflow.cfg updated to the latest set of options
 * issue happens both with the PyPI 1.10 release and a build from branch 
v1-10-test (c36ef06)

 

 
{noformat}
[2018-09-14 13:51:23,560] {models.py:1336} INFO - Dependencies all met for 
<TaskInstance: consolidated_db.item 2018-09-14T12:42:55.379761+00:00 [queued]>
[2018-09-14 13:51:23,850] {models.py:1330} INFO - Dependencies not met for 
<TaskInstance: consolidated_db.item 2018-09-14T12:42:55.379761+00:00 [queued]>, 
dependency 'Task Instance Slots Available' FAILED: The maximum number of 
running tasks (4) for this task's DAG 'consolidated_db' has been reached.
[2018-09-14 13:51:23,852] {models.py:1531} WARNING - 
--------------------------------------------------------------------------------
FIXME: Rescheduling due to concurrency limits reached at task runtime. Attempt 
1 of 1. State set to NONE.
--------------------------------------------------------------------------------

[2018-09-14 13:51:23,853] {models.py:1534} INFO - Queuing into pool None

[2018-09-14 13:52:49,939] {models.py:1336} INFO - Dependencies all met for 
<TaskInstance: consolidated_db.item 2018-09-14T12:42:55.379761+00:00 [queued]>
[2018-09-14 13:52:50,142] {models.py:1336} INFO - Dependencies all met for 
<TaskInstance: consolidated_db.item 2018-09-14T12:42:55.379761+00:00 [queued]>
[2018-09-14 13:52:50,235] {models.py:1548} INFO - 
--------------------------------------------------------------------------------
Starting attempt 1 of 1
--------------------------------------------------------------------------------

[2018-09-14 13:52:50,646] {models.py:1570} INFO - Executing 
<Task(PostgresDumpOperator): item> on 2018-09-14T12:42:55.379761+00:00
{noformat}
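
To check how often this happens across a larger task log, the excerpt above can be scanned for the "Rescheduling due to concurrency limits" warning. The following is only a minimal sketch, not part of Airflow: the function name reschedule_events, the two regular expressions, and the SAMPLE text are assumptions based on the log layout shown in this report. Events are de-duplicated by timestamp and task, because a pasted excerpt may contain the same record twice.

```python
import re
from collections import Counter

# Assumed record header layout, taken from the excerpt in this report:
# [YYYY-MM-DD HH:MM:SS,mmm] {file.py:line} LEVEL - message
RECORD = re.compile(r"^\[(?P<ts>\d{4}-\d{2}-\d{2} [\d:,]+)\] \{[^}]+\} [A-Z]+ -")
# Task instances are printed as <TaskInstance: dag_id.task_id execution_date [state]>
TASK = re.compile(r"<TaskInstance: (?P<task>\S+) ")
MARKER = "Rescheduling due to concurrency limits"

# Shortened sample copied from the report (hypothetical input for the sketch).
SAMPLE = """\
[2018-09-14 13:51:23,850] {models.py:1330} INFO - Dependencies not met for
<TaskInstance: consolidated_db.item 2018-09-14T12:42:55.379761+00:00 [queued]>,
dependency 'Task Instance Slots Available' FAILED: The maximum number of
running tasks (4) for this task's DAG 'consolidated_db' has been reached.
[2018-09-14 13:51:23,852] {models.py:1531} WARNING -
--------------------------------------------------------------------------------
FIXME: Rescheduling due to concurrency limits reached at task runtime. Attempt
1 of 1. State set to NONE.
--------------------------------------------------------------------------------
"""


def reschedule_events(log_text):
    """Count distinct concurrency-limit reschedules per task instance."""
    events = set()
    last_ts = None    # timestamp of the most recent record header
    last_task = None  # most recent task instance mentioned in the log
    for line in log_text.splitlines():
        record = RECORD.match(line)
        if record:
            last_ts = record.group("ts")
        task = TASK.search(line)
        if task:
            last_task = task.group("task")
        if MARKER in line:
            # A set drops records that were pasted more than once.
            events.add((last_ts, last_task))
    return Counter(task for _, task in events)


if __name__ == "__main__":
    print(reschedule_events(SAMPLE))
```

The sketch attributes each warning to the most recently mentioned task instance, which fits a per-task log file like the one quoted here; a log that interleaves several tasks would need a stricter association.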
 

  was:
In a DAG with concurrency limit of 4, with about 150 task inside, when the 
limit limit of active tasks is reached, the scheduler starts to fail queued 
tasks. They later are retried, but if they have downstream tasks, these remain 
in upstream_failed status.

A few additional details:
 * celery executor
 * environment upgraded from 1.9 (no issues back then)
 * all configuration in airflow.cfg updated to the latest set of options
 * issue happens both with PyPi 1.10 and a build from branch v1-10-test 
(c36ef06)

 

 
{noformat}
[2018-09-14 13:51:23,560] {models.py:1336} INFO - Dependencies all met for 
<TaskInstance: consolidated_db.item 2018-09-14T12:42:55.379761+00:00 [queued]>
[2018-09-14 13:51:23,850] {models.py:1330} INFO - Dependencies not met for 
<TaskInstance: consolidated_db.item 2018-09-14T12:42:55.379761+00:00 [queued]>, 
dependency 'Task Instance Slots Available' FAILED: The maximum number of 
running tasks (4) for this task's DAG 'consolidated_db' has been reached.
[2018-09-14 13:51:23,852] {models.py:1531} WARNING - 
--------------------------------------------------------------------------------
FIXME: Rescheduling due to concurrency limits reached at task runtime. Attempt 
1 of 1. State set to NONE.
--------------------------------------------------------------------------------

[2018-09-14 13:51:23,853] {models.py:1534} INFO - Queuing into pool None

[2018-09-14 13:51:23,560] {models.py:1336} INFO - Dependencies all met for 
<TaskInstance: consolidated_db.item 2018-09-14T12:42:55.379761+00:00 [queued]>
[2018-09-14 13:51:23,850] {models.py:1330} INFO - Dependencies not met for 
<TaskInstance: consolidated_db.item 2018-09-14T12:42:55.379761+00:00 [queued]>, 
dependency 'Task Instance Slots Available' FAILED: The maximum number of 
running tasks (4) for this task's DAG 'consolidated_db' has been reached.
[2018-09-14 13:51:23,852] {models.py:1531} WARNING - 
--------------------------------------------------------------------------------
FIXME: Rescheduling due to concurrency limits reached at task runtime. Attempt 
1 of 1. State set to NONE.
--------------------------------------------------------------------------------

[2018-09-14 13:51:23,853] {models.py:1534} INFO - Queuing into pool None
[2018-09-14 13:52:49,939] {models.py:1336} INFO - Dependencies all met for 
<TaskInstance: consolidated_db.item 2018-09-14T12:42:55.379761+00:00 [queued]>
[2018-09-14 13:52:50,142] {models.py:1336} INFO - Dependencies all met for 
<TaskInstance: consolidated_db.item 2018-09-14T12:42:55.379761+00:00 [queued]>
[2018-09-14 13:52:50,235] {models.py:1548} INFO - 
--------------------------------------------------------------------------------
Starting attempt 1 of 1
--------------------------------------------------------------------------------

[2018-09-14 13:52:50,646] {models.py:1570} INFO - Executing 
<Task(PostgresDumpOperator): item> on 2018-09-14T12:42:55.379761+00:00
{noformat}
 


> Scheduler failing tasks when DAG concurrency limit reached
> ----------------------------------------------------------
>
>                 Key: AIRFLOW-3065
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3065
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: 1.10.0
>            Reporter: Marcin Szymanski
>            Priority: Critical
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
