[ 
https://issues.apache.org/jira/browse/AIRFLOW-6264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nidhi updated AIRFLOW-6264:
---------------------------
    Description: 
I am trying to run my tasks using Airflow with the Celery Executor. I have 
created one DAG that contains approximately 60000 tasks. When I trigger the 
DAG, it starts running but does not schedule my tasks. It stays in the running 
state for 2 days without scheduling any tasks.

The main bug is this: if I schedule 5000 tasks, it takes less than 1 minute to 
schedule them, but if I try to schedule more than that, it does not schedule 
my tasks.

I have tried several settings to solve this. The first attempt was:
 * *{{PARALLELISM=1000}}*
 * NON_POOLED_TASK_SLOT_COUNT=1000
 * DAG_CONCURRENCY=10000

*But this does not work, as the new version of Airflow does not support 
non_pooled_task_slot_count.*

*I have also tried CELERYD_COUNT, which does not work in my case either. I have 
tried almost every change that could be useful, but none of them works for 
me.*

*Can someone let me know how I can schedule this many tasks?*

  was:
I am trying to run my tasks using Airflow with the Celery Executor. I have 
created one DAG that contains approximately 60000 tasks. When I trigger the 
DAG, it starts running but does not schedule my tasks. It stays in the running 
state for 2 days without scheduling any tasks.

The main bug is this: if I schedule 5000 tasks, it takes less than 1 minute to 
schedule them, but if I try to schedule more than that, it does not schedule 
my tasks.

I have tried several settings to solve this. The first attempt was:

 *{{- PARALLELISM=10000}}*

*- NON_POOLED_TASK_SLOT_COUNT=10000* 

 *- DAG_CONCURRENCY=10000*

*But this does not work, as the new version of Airflow does not support 
non_pooled_task_slot_count.*

*I have also tried CELERYD_COUNT, which does not work in my case either. I have 
tried almost every change that could be useful, but none of them works for 
me.*

*Can someone let me know how I can schedule this many tasks?*


> Airflow not scheduling tasks after staying in Running state
> -----------------------------------------------------------
>
>                 Key: AIRFLOW-6264
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6264
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: celery, DAG, operators
>    Affects Versions: 1.10.6
>            Reporter: Nidhi
>            Priority: Major
>
> I am trying to run my tasks using Airflow with the Celery Executor. I have 
> created one DAG that contains approximately 60000 tasks. When I trigger the 
> DAG, it starts running but does not schedule my tasks. It stays in the 
> running state for 2 days without scheduling any tasks.
> The main bug is this: if I schedule 5000 tasks, it takes less than 1 minute 
> to schedule them, but if I try to schedule more than that, it does not 
> schedule my tasks.
> I have tried several settings to solve this. The first attempt was:
>  * *{{PARALLELISM=1000}}*
>  * NON_POOLED_TASK_SLOT_COUNT=1000
>  * DAG_CONCURRENCY=10000
> *But this does not work, as the new version of Airflow does not support 
> non_pooled_task_slot_count.*
> *I have also tried CELERYD_COUNT, which does not work in my case either. I 
> have tried almost every change that could be useful, but none of them works 
> for me.*
> *Can someone let me know how I can schedule this many tasks?*
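
A minimal sketch of how the settings named in the description could be expressed as environment-variable overrides. It assumes Airflow's AIRFLOW__{SECTION}__{KEY} override convention and that parallelism and dag_concurrency sit in the [core] section, as in 1.10.x. The helper names (airflow_env_var, as_environment) are hypothetical and only illustrate the mapping. The values are the ones quoted in the report, not recommendations. NON_POOLED_TASK_SLOT_COUNT is left out because the report says it is no longer supported, and CELERYD_COUNT is left out because the report does not say which option it refers to.

```python
# Settings named in the report, keyed by (section, option) as they would
# appear in airflow.cfg. ASSUMPTION: both options live under [core].
# The values are copied from the report and are not tuning advice.
REPORTED_SETTINGS = {
    ("core", "parallelism"): "1000",
    ("core", "dag_concurrency"): "10000",
}


def airflow_env_var(section: str, option: str) -> str:
    """Build the AIRFLOW__{SECTION}__{OPTION} environment variable name."""
    return f"AIRFLOW__{section.upper()}__{option.upper()}"


def as_environment(settings: dict) -> dict:
    """Map (section, option) -> value pairs to environment variable overrides."""
    return {airflow_env_var(s, o): v for (s, o), v in settings.items()}


if __name__ == "__main__":
    # Print shell export lines for the scheduler and worker environments.
    for name, value in as_environment(REPORTED_SETTINGS).items():
        print(f"export {name}={value}")
```

Printing the overrides this way makes it easier to confirm that the scheduler and the Celery workers are started with the same values.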



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
