I don't think that "conflict" is strictly a problem: the second (or
n-th) scheduler that tries to work on the locked DAG will simply move on
to the next one.
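
As a rough illustration of that "move on to the next one" behaviour,
here is a minimal in-process sketch. It is only an analogy: the
threading locks stand in for per-DAG row locks in the metadata DB, and
the names (dag_locks, pick_unlocked_dags, release_dags) are hypothetical,
not anything from Airflow or the AIP. The database-level equivalent would
presumably be a non-blocking row lock, but that is an assumption here.

```python
import threading

# Hypothetical stand-in for per-DAG row locks held in the metadata DB.
dag_locks = {dag_id: threading.Lock() for dag_id in ("dag_a", "dag_b", "dag_c")}


def pick_unlocked_dags(dag_ids, locks):
    """Return the DAG ids this scheduler managed to lock, skipping busy ones."""
    acquired = []
    for dag_id in dag_ids:
        # Non-blocking acquire: if another scheduler holds the lock we do
        # not wait, we just carry on with the next DAG.
        if locks[dag_id].acquire(blocking=False):
            acquired.append(dag_id)
    return acquired


def release_dags(dag_ids, locks):
    """Release the locks once this scheduler is done with the DAGs."""
    for dag_id in dag_ids:
        locks[dag_id].release()


# Scheduler 1 is already working on dag_a.
dag_locks["dag_a"].acquire()

# Scheduler 2 tries all three and skips the locked one without blocking.
second_scheduler_dags = pick_unlocked_dags(["dag_a", "dag_b", "dag_c"], dag_locks)
```

The point of the sketch is that a locked DAG costs the second scheduler
one failed lock attempt, not a wait.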

Looking at "fast-follow"/moving some of the scheduling decisions to the
workers is already on my todo list, but the other thing to consider
there is that we _might_ want to remove direct database access from
workers in the future so that we can limit what Connections a DAG/task
has access to. But we can address that when we come to it; achieving
both of these should still be possible.

No complaints, I'll start the vote.

-ash

On Mon Mar 16, 2020 at 6:27 PM, Dan Davydov wrote:
> Haven't checked the math in the AIP but I believe with the given
> formula, with 5 schedulers and 100 DAGs there is already a 9% chance of
> conflict, and the larger users of Airflow have many more DAGs than that.
>
> 
> I'm a bit concerned about putting more load on the DB, which is
> already a scalability bottleneck. I agree with the sentiment in the AIP
> about using a more long-term solution like leader election (or
> consistent hashing with hash(dag_id) -> scheduler instance, etc), and
> the even more radical change would be pushing the scheduling logic to
> the workers themselves so scheduling becomes push-based instead of
> pull-based. The proposed change is probably better than doing nothing
> in the short term though, and I think it is one that shouldn't be too
> hard to reverse/change if done properly, so I'm neutral overall.
>
> 
> On Mon, Mar 16, 2020 at 6:12 PM Deng Xiaodong <xd.den...@gmail.com>
> wrote:
>
> 
> > Would be happy to give +1 for this AIP later!
> >
> >
> > XD
> >
> > On Mon, Mar 16, 2020 at 11:08 PM Ash Berlin-Taylor <a...@apache.org> wrote:
> >
> > > Does anyone have any other opinions about this? If not I'd like to
> > > call a vote (and start working on the code!)
> > >
> > > -ash
> > > On Mar 3 2020, at 12:34 pm, Kaxil Naik <kaxiln...@gmail.com> wrote:
> > > > The goal would be to support both MySQL and PostgreSQL for
> > > > production, as we know many Airflow users use MySQL as the
> > > > metadata DB.
> > > >
> > > > On Tue, Mar 3, 2020 at 12:25 PM Ash Berlin-Taylor wrote:
> > > > > It _shouldn't_, and we will test extensively with mysql.
> > > > >
> > > > > Worst case is we'll have to fall back to managing the lock
> > > > > ourselves with a column rather than relying on db/row level
> > > > > locks. This might be a case where we have different/specialised
> > > > > behaviour for different dbs, or even db versions, if say
> > > > > mysql 8 behaves okay but 5.7/5.6 doesn't.
> > > > >
> > > > > Ash
> > > > >
> > > > > On 3 March 2020 07:01:15 GMT-05:00, "Kamil Breguła"
> > > > > <kamil.breg...@polidea.com> wrote:
> > > > > > Hello,
> > > > > >
> > > > > > Will reliance on the database cause problems with MySQL? A
> > > > > > lot of my users use this database. I am afraid that the lock
> > > > > > mechanism in MySQL is much less stable and predictable than
> > > > > > PostgreSQL and this can cause various stability problems. I
> > > > > > know that Astronomer uses PostgreSQL, but Airflow supports
> > > > > > RDBMS in a production environment and both must work
> > > > > > properly in this AIP.
> > > > > >
> > > > > > Best regards,
> > > > > > Kamil
> > > > > >
> > > > > > On Tue, Mar 3, 2020 at 12:50 PM Kaxil Naik wrote:
> > > > > > > Good work on the Proposal Ash & Vikram.
> > > > > > >
> > > > > > > On Fri, Feb 28, 2020 at 10:39 PM Vikram Koka wrote:
> > > > > > > > Team,
> > > > > > > >
> > > > > > > > We just updated 'AIP-15 Support Multiple-Schedulers for
> > > > > > > > HA & Better Scheduling Performance' on Confluence and
> > > > > > > > would very much appreciate feedback and suggestions from
> > > > > > > > the community.
> > > > > > > >
> > > > > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103092651
> > > > > > > >
> > > > > > > > The original AIP was filed by Xiaodong Deng on March 2nd,
> > > > > > > > 2019 and has stalled after a while, so with his blessing,
> > > > > > > > we are taking the baton on this AIP. We at Astronomer
> > > > > > > > have heard several enterprises ask for both High
> > > > > > > > Availability as well as greater scalability, specifically
> > > > > > > > around starting hundreds and thousands of tasks in a very
> > > > > > > > short time window.
> > > > > > > >
> > > > > > > > We would like to attempt this based on our experience
> > > > > > > > running Airflow as a Service and deploying Airflow at
> > > > > > > > enterprises around the globe. We believe that this will
> > > > > > > > benefit Airflow and fuel greater adoption of Airflow for
> > > > > > > > production pipelines within enterprises.
> > > > > > > >
> > > > > > > > Building on the original AIP, we have proposed an
> > > > > > > > active/active model, where we can scale schedulers, but
> > > > > > > > are staying away from the quorum approach. Xiaodong Deng
> > > > > > > > had put in some really good thinking about the problem
> > > > > > > > including approaches towards reducing contention between
> > > > > > > > multiple schedulers and we have included some of those
> > > > > > > > concepts here. Additional commenters had discussed the
> > > > > > > > possibilities of leader selection and those challenges,
> > > > > > > > and we have incorporated their thinking as well.
> > > > > > > >
> > > > > > > > Any feedback, suggestions, and comments would be greatly
> > > > > > > > appreciated.
> > > > > > > >
> > > > > > > > Best Regards,
> > > > > > > >
> > > > > > > > Ash Berlin-Taylor and Vikram Koka
