I see your points. I also had an offline discussion on Slack with Ash,
who had a similar opinion. He pointed out that each new algorithm is
effectively a new scheduler, which leads to an unwanted maintenance burden.

I’m not going to pursue this any further. Thank you for your replies!

Christos

On Sat, Sep 13, 2025 at 10:37 Jens Scheffler <[email protected]> wrote:

> I see a bit of a risk, as the scheduler code is quite complex. Like
> Jarek, I assume that if somebody sees this and plugs in their own logic,
> in most cases it will make things worse. It also locks us into a plugin
> API and removes flexibility if we need to change or refactor something.
>
> On the other hand, I also fear a bit that the scheduler is very complex
> and that adding multiple parallel strategies adds redundant code paths,
> which make it hard to maintain: load tests etc. must validate that
> neither degrades, and features need to be added to both.
>
> So I'd favor keeping it to a single (maybe configurable) logic.
>
> Unfortunately, I have had no mental capacity to drill into the discussion
> and details so far; the beast of SQL code that was shared frightened me a bit.
>
> Jens
>
> On 13.09.25 07:06, Jarek Potiuk wrote:
> > I think, even if we do it - this should only be something internal. I
> > don't see why we should make it customizable. If we want to choose between
> > different algorithms we should explicitly tell users why they should
> > choose different algorithms and make sure we have data backing it up.
> > There is absolutely no way we can make it available for users to override
> > and use their own implementation - because we will have to support
> > whatever someone implemented.
> >
> > On Thu, Sep 4, 2025 at 3:08 PM Christos Bisias <[email protected]>
> > wrote:
> >
> >> I’d appreciate any feedback on this.
> >>
> >> On Mon, Sep 1, 2025 at 18:35 Christos Bisias <[email protected]>
> >> wrote:
> >>
> >>> Hello,
> >>>
> >>> A while back I started a discussion on the mailing list regarding
> >>> making some changes to the task selection query in order to improve
> >>> the scheduler's throughput.
> >>>
> >>> https://github.com/apache/airflow/pull/54103
> >>>
> >>> Another topic came up during that discussion related to task starvation
> >>> due to the current selection algorithm. There are two open PRs with
> >>> different fixes for that issue.
> >>>
> >>> https://github.com/apache/airflow/pull/54284
> >>>
> >>> https://github.com/apache/airflow/pull/53492
> >>>
> >>> Everyone has his own needs and it's probable that a good number of
> >>> users won't experience the starvation issue.
> >>>
> >>> Each approach has its own advantages and disadvantages and for that
> >>> reason it doesn't feel like there is a right or wrong approach here
> >>> or a single solution for all.
> >>>
> >>> There have been papers on task selection algorithms like this one
> >>>
> >>> https://ieeexplore.ieee.org/document/9799199
> >>>
> >>> I would like to suggest refactoring the scheduler so that the task
> >>> selection algorithm can be pluggable. The current implementation will
> >>> be the default. Everyone will be able to configure the path to his own
> >>> class. That will be the most beneficial to the majority of users.
> >>>
> >>> In the future, anyone could create a PR with his implementation and if
> >>> enough people like it, it could be added to the repo.
> >>>
> >>> This has already been done for the priority weights algorithm, so why
> >>> not in this case as well?
> >>>
> >>> https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/priority-weight.html#custom-weight-rule
> >>>
> >>> If there is positive feedback on this idea, I would like to implement
> >>> it.
> >>>
> >>> Please let me know what you think. Thank you!
> >>>
> >>> Regards,
> >>> Christos
> >>>
