[ 
https://issues.apache.org/jira/browse/AIRFLOW-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-5881:
------------------------------------

    Assignee: Sajid Sajid

> Dag gets stuck in "Scheduled" State when scheduling a large number of tasks
> ---------------------------------------------------------------------------
>
>                 Key: AIRFLOW-5881
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5881
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: 1.10.6
>            Reporter: David Hartig
>            Assignee: Sajid Sajid
>            Priority: Critical
>         Attachments: 2 (1).log, airflow.cnf
>
>
> Running with the KubernetesExecutor in and AKS cluster, when we upgraded to 
> version 1.10.6 we noticed that the all the Dags stop making progress but 
> start running and immediate exiting with the following message:
> "Instance State' FAILED: Task is in the 'scheduled' state which is not a 
> valid state for execution. The task must be cleared in order to be run."
> See attached log file for the worker. Nothing seems out of the ordinary in 
> the Scheduler log. 
> Reverting to 1.10.5 clears the problem.
> Note that at the time of the failure maybe 100 or so tasks are in this state, 
> with 70 coming from one highly parallelized dag. Clearing the scheduled tasks 
> just makes them reappear shortly thereafter. Marking them "up_for_retry" 
> results in one being executed but then the system is stuck in the original 
> zombie state. 
> Attached is the also a redacted airflow config flag. 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to