I think I agree with this:

> I feel it should be applied at the dag *run* scope
> and not across all dag runs.

Just a thought: If someone *did* want to run multiple DAG runs at the same
time and limit the max active tasks per DAG, they could create a pool for
that DAG and pass the pool in default_args.
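
To make that concrete, here is a rough toy model (plain Python, not Airflow
code) of how the limits would interact. The function name, the greedy
run-by-run ordering, and the numbers are all assumptions for illustration;
the real scheduler picks tasks differently. In an actual DAG the pool part
would just be `default_args={"pool": "<pool name>"}`, with the pool created
separately.

```python
def admitted_tasks(runnable_per_run, per_run_limit=None,
                   dag_wide_limit=None, pool_slots=None):
    """Toy model: how many tasks of each concurrent DAG run may be active.

    runnable_per_run: number of runnable tasks in each concurrent run.
    Limits are applied greedily in list order (a simplification).
    """
    dag_left = dag_wide_limit   # budget shared by all runs of the DAG
    pool_left = pool_slots      # slots in a pool shared by all runs
    admitted = []
    for runnable in runnable_per_run:
        caps = [runnable]
        if per_run_limit is not None:
            caps.append(per_run_limit)
        if dag_left is not None:
            caps.append(dag_left)
        if pool_left is not None:
            caps.append(pool_left)
        n = min(caps)
        # Only the shared budgets are consumed across runs.
        if dag_left is not None:
            dag_left -= n
        if pool_left is not None:
            pool_left -= n
        admitted.append(n)
    return admitted


runs = [10, 10, 10]  # three concurrent runs, ten runnable tasks each

# Limit applied across all runs of the DAG.
across_runs = admitted_tasks(runs, dag_wide_limit=4)
# Proposed behaviour: limit applied per DAG run.
per_run = admitted_tasks(runs, per_run_limit=4)
# Per-run limit plus a 6-slot pool shared by every task in the DAG.
per_run_with_pool = admitted_tasks(runs, per_run_limit=4, pool_slots=6)

print(across_runs, per_run, per_run_with_pool)
```

So with per-run scoping each run gets its own budget, and the pool is what
puts a ceiling back on the DAG as a whole.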

On Fri, Oct 4, 2024 at 1:51 PM Daniel Standish
<daniel.stand...@astronomer.io.invalid> wrote:

> Ok, sorry, these concurrency settings are confusing.
>
> Let me clarify.
>
> `max_active_tasks_per_dag` is a core airflow setting and it provides the
> default for DAG.max_active_tasks.
>
> DAG.max_active_tasks I think is a reasonable config to have but the problem
> in my view is the scope.  I feel it should be applied at the dag *run* scope
> and not across all dag runs.  That just gets into confusing and footgunish
> territory if you allow many concurrent dag runs but limit the number of
> concurrent tasks.  Then you might have many many dags running but all
> limping along.
>
> So I guess let me change my proposal.  I would propose that we have
> DAG.max_active_tasks be applied at the dag *run* scope.  Not limiting
> concurrency across all dag runs.
>
> I think in practice this is essentially what it already is, because I would
> expect that the vast majority of dag runs are the only dag run running for
> a given dag at a given time.  It's only when you have many dag runs of the
> same dag running that this parameter ends up meaning something different.
>
> So, I propose, DAG.max_active_tasks should be evaluated per-dag-run.  And
> we can change the name accordingly if folks are on board.
>
> Now whether a mapped task is a task or not, I leave that for another day :)
>
> On Fri, Oct 4, 2024 at 10:28 AM Daniel Standish <
> daniel.stand...@astronomer.io> wrote:
>
> > The setting max_active_tasks_per_dag seems mostly useless to me, and
> > footgunish.
> >
> > Why?
> >
> > Because you already have a setting for max active dag runs.  If you don't
> > want to run more tasks, don't create the extra dag runs.
> >
> > We also already have a mechanism (param on base operator) for limiting
> > individual tasks across all dag runs where that may be needed.  But just a
> > general "i don't want more than 16 tasks running across all dag runs of all
> > types and for all tasks" seems just, imprecise and not useful.
> >
> > I actually think it makes sense to remove this param entirely.  But at
> > least we should remove the default.
> >
> > WDYT
> >
>
