Hi There!

TLDR: While working on the fix PR https://github.com/apache/airflow/pull/61769 we noticed that Airflow Core currently counts the "Deferred" state inconsistently. I propose to consistently include "Deferred" in the counts of "Running".

Details:

 * In Pools it has been possible for quite some time (since PR
   https://github.com/apache/airflow/pull/32709) to decide whether
   tasks in deferred state are counted into pool allocation or not.
 * Before that, deferred tasks were not counted, so tasks in deferred
   state could overwhelm backends, which defeated the purpose of
   pools.
 * Recently we also saw that the other limits we usually have on Dags,
   listed below, do not consistently count deferred tasks:
     o max_active_tasks - `The number of task instances allowed to run
       concurrently`
     o max_active_tis_per_dag - `When set, a task will be able to limit
       the concurrent runs across logical_dates.`
     o max_active_tis_per_dagrun - `When set, a task will be able to
       limit the concurrent task instances per Dag run.`
 * This means that at the moment a task defined as async/deferred
   escapes these limits while it is deferred.

Code references:

 * Counting tasks in Scheduler on main:
   https://github.com/apache/airflow/blob/main/airflow-core/src/airflow/jobs/scheduler_job_runner.py#L190
 * EXECUTION_STATES used for counting:
   https://github.com/apache/airflow/blob/main/airflow-core/src/airflow/ti_deps/dependencies_states.py#L21
     o Here "Deferred" is missing!
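
To illustrate the effect, here is a minimal self-contained sketch. It is not the actual scheduler code: the `State` enum and the helpers `count_active` and `has_free_slot` are hypothetical names made up for this example, and it assumes that EXECUTION_STATES contains only running and queued, as described above.

```python
from enum import Enum


class State(str, Enum):
    # Hypothetical stand-in for Airflow's task instance states;
    # only the states needed for this illustration.
    QUEUED = "queued"
    RUNNING = "running"
    DEFERRED = "deferred"


# Assumption: mirrors the referenced set, i.e. without "deferred".
EXECUTION_STATES = frozenset({State.RUNNING, State.QUEUED})
# Proposed variant: deferred is counted like running.
EXECUTION_STATES_WITH_DEFERRED = EXECUTION_STATES | {State.DEFERRED}


def count_active(ti_states, counted_states):
    """Count the task instances whose state is in counted_states."""
    return sum(1 for state in ti_states if state in counted_states)


def has_free_slot(ti_states, limit, counted_states):
    """Return True if another task instance may start under the limit."""
    return count_active(ti_states, counted_states) < limit


# Three instances of the same task: one running, two waiting on a trigger.
ti_states = [State.RUNNING, State.DEFERRED, State.DEFERRED]
limit = 2  # e.g. max_active_tis_per_dag=2

# Today: deferred is ignored, only 1 instance is counted, so a 4th may start.
current = has_free_slot(ti_states, limit, EXECUTION_STATES)
# Proposed: all 3 instances are counted, so the limit is already exceeded.
proposed = has_free_slot(ti_states, limit, EXECUTION_STATES_WITH_DEFERRED)
```

With a limit of 2, the current counting still reports a free slot although three instances are in flight; counting deferred would block further scheduling until instances finish.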

Alternatives that I see:

 * Fix it in the Scheduler so that limits are consistently applied with
   Deferred always counted in.
 * There might be a historic reason why Deferred is not counted - then
   proper documentation would be needed - but I'd assume this is
   unlikely.
 * There are different opinions - then the behavior might need to be
   configurable. (But personally I cannot see a reason for letting
   deferred tasks escape the defined limits.)

Jens
