I get your points. I also had an offline discussion on Slack with Ash, who had a similar opinion. He pointed out that each new algorithm is effectively a new scheduler, leading to an unwanted maintenance burden.

I’m not going to pursue this any further. Thank you for your replies!

Christos

On Sat, Sep 13, 2025 at 10:37 Jens Scheffler <[email protected]> wrote:

> I see a bit of a risk, as the scheduler code is quite complex...
> (similar like Jarek) if somebody sees this and plugs in, I assume in
> most cases this make it worse. Also locks us in a plugin API and removes
> flexibility if we need to change/refactor something.
>
> On the other side I fear also a bit that the Scheduler is very complex
> and adding multiple parallel strategies adds redundant code path's which
> make it hard to maintain as load tests etc. must validate both not to
> degrade and features need to be added to both.
>
> So I'd favor to keep it to a (maybe configurable) single logic.
>
> Unfortunately I had no mental capacity in drilling into the discussion
> and details so far, the beast of SQL code shared was frightening me a bit.
>
> Jens
>
> On 13.09.25 07:06, Jarek Potiuk wrote:
> > I think, even if we do it - this should only be something internal. I
> > don't see why we should make it customizable. If we want to choose
> > between different algorithms we should explicitly tell users why they
> > should choose different algorithms and make sure we have data backing
> > it up. There is absolutely no way we can make it available for users
> > to override and use their own implementation - because we will have to
> > support whatever someone implemented.
> >
> > On Thu, Sep 4, 2025 at 3:08 PM Christos Bisias <[email protected]>
> > wrote:
> >
> >> I’d appreciate any feedback on this.
> >>
> >> On Mon, Sep 1, 2025 at 18:35 Christos Bisias <[email protected]>
> >> wrote:
> >>
> >>> Hello,
> >>>
> >>> A while back I started a discussion on the mailing list regarding
> >>> making some changes to the task selection query in order to improve
> >>> the scheduler's throughput.
> >>>
> >>> https://github.com/apache/airflow/pull/54103
> >>>
> >>> Another topic came up during that discussion related to task
> >>> starvation due to the current selection algorithm. There are two
> >>> open PRs with different fixes for that issue.
> >>>
> >>> https://github.com/apache/airflow/pull/54284
> >>>
> >>> https://github.com/apache/airflow/pull/53492
> >>>
> >>> Everyone has his own needs and it's probable that a good number of
> >>> users won't experience the starvation issue.
> >>>
> >>> Each approach has its own advantages and disadvantages and for that
> >>> reason it doesn't feel like there is a right or wrong approach here
> >>> or a single solution for all.
> >>>
> >>> There have been papers on task selection algorithms like this one
> >>>
> >>> https://ieeexplore.ieee.org/document/9799199
> >>>
> >>> I would like to suggest refactoring the scheduler so that the task
> >>> selection algorithm can be pluggable. The current implementation
> >>> will be the default. Everyone will be able to configure the path to
> >>> his own class. That will be the most beneficial to the majority of
> >>> users.
> >>>
> >>> In the future, anyone could create a PR with his implementation and
> >>> if enough people like it, it could be added to the repo.
> >>>
> >>> This has already been done for the priority weights algorithm, so
> >>> why not in this case as well?
> >>>
> >>> https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/priority-weight.html#custom-weight-rule
> >>>
> >>> If there is positive feedback on this idea, I would like to
> >>> implement it.
> >>>
> >>> Please let me know what you think. Thank you!
> >>>
> >>> Regards,
> >>> Christos
> >>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
