Re: Multiple Schedulers - "scheduler_lock"

2019-03-29 Thread Kevin Yang
Sorry I'm late again :) I agree with most of Bas's ideas, especially that if we decide to do HA, maybe we shouldn't build it ourselves (Apache Helix? I haven't looked deeply, just an idea). Peter, I'm very curious about your setup and how it took 2-3 hours to process 5000+ DAG files (I assume you meant DAG files not

Re: Multiple Schedulers - "scheduler_lock"

2019-03-18 Thread Ash Berlin-Taylor
Does anything change about your proposal if you don't assume that workers have “quick access” to the DAG files - i.e. what if we are on kube executors and the task spin-up time plus git sync time is 30-60s? (Perhaps this is an extreme case, but we are talking about extreme cases) > On 18 Mar 201

Re: Multiple Schedulers - "scheduler_lock"

2019-03-18 Thread Bas Harenslak
Peter, The numbers you mention seem to come out of the blue. I think you’re oversimplifying it and cannot simply state 180/36 = 5 minutes. Throwing in numbers without explanation creates confusion. I have some questions when reading your AIP. I have to make lots of assumptions and think explai

Re: Multiple Schedulers - "scheduler_lock"

2019-03-17 Thread Peter van t Hof
Hi, My proposal focuses mainly on scalability and indeed not so much on HA. This is mainly because scalability is also the main issue raised by the original author. Having a form of HA on this MainScheduler would still be nice. The problem is that having a fixed number of schedulers does not scale

Re: Multiple Schedulers - "scheduler_lock"

2019-03-17 Thread Maxime Beauchemin
The proposal reads "Looking at the original AIP-15 the author proposes to use locking to enable the use of multiple schedulers, this might introduce unnecessary complexity" To me, introducing multiple roles (master scheduler + scheduler minions) may actually be more complex than just having "share

Re: Multiple Schedulers - "scheduler_lock"

2019-03-17 Thread Peter van t Hof
Hi all, I think that scheduler locking is maybe not the best way to solve this issue. Still, I’m in support of taking a good look at the scheduler because it has some real scaling issues. I wrote an alternative proposal to solve the scalability of the scheduler: https://cwiki.apache.org/c

Re: Multiple Schedulers - "scheduler_lock"

2019-03-02 Thread Deng Xiaodong
Thanks Max. I have documented all the discussions around this topic & useful inputs into AIP-15 (Support Multiple-Schedulers for HA & Better Scheduling Performance) https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103092651

Re: Multiple Schedulers - "scheduler_lock"

2019-03-02 Thread Maxime Beauchemin
Personally I'd vote against the idea of having certain schedulers handle a subset of the DAGs; that's just not HA. Also, if you are in an env where you have a small number of large DAGs, the odds of having wasted work and double-firing get pretty high. With the lock in place, it's just a matter o

Re: Multiple Schedulers - "scheduler_lock"

2019-03-02 Thread Deng Xiaodong
I get your point and agree. And your last suggestion, to randomly sort the DAGs, is a great idea to address it. Thanks! XD > On 2 Mar 2019, at 10:41 PM, Jarek Potiuk wrote: > > I think that the probability calculation holds only if there is no > correlation between different schedulers. I thin
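A minimal sketch of the random-ordering idea from this message, assuming each scheduler holds its DAG ids in a plain list. The function name `shuffled_dag_order` and the `seed` parameter are hypothetical and are not part of Airflow; the point is only that independent shuffles make it less likely that two schedulers walk the DAGs in the same order.

```python
import random
from typing import List, Optional, Sequence


def shuffled_dag_order(dag_ids: Sequence[str], seed: Optional[int] = None) -> List[str]:
    """Return a shuffled copy of dag_ids (hypothetical helper, not an Airflow API).

    Each scheduler would call this with its own random state, so that
    schedulers started at the same moment do not process DAGs in lockstep.
    """
    rng = random.Random(seed)  # seed is only useful for reproducible tests
    order = list(dag_ids)      # copy, so the caller's sequence is left untouched
    rng.shuffle(order)
    return order
```

In practice each scheduler would leave `seed` unset so that the orderings are uncorrelated.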

Re: Multiple Schedulers - "scheduler_lock"

2019-03-02 Thread Jarek Potiuk
I think that the probability calculation holds only if there is no correlation between different schedulers. I think however there might be an accidental correlation if you think about typical deployments. Some details why I think accidental correlation is possible and even likely. Assume that:

Re: Multiple Schedulers - "scheduler_lock"

2019-03-02 Thread Deng Xiaodong
I’m thinking about which architecture would be ideal. # Option-1: The master-slave architecture would be one option. But leader selection will be essential to consider, otherwise we have an issue in terms of HA again. # Option-2: Another option we may consider is to simply start multiple schedu

Re: Multiple Schedulers - "scheduler_lock"

2019-03-01 Thread Tao Feng
Does the proposal use master-slave architecture (leader scheduler vs slave scheduler)? On Fri, Mar 1, 2019 at 5:32 PM Kevin Yang wrote: > Preventing double-triggering by separating DAG files different schedulers > parse sounds easier and more intuitive. I actually removed one of the > double-trig

Re: Multiple Schedulers - "scheduler_lock"

2019-03-01 Thread Kevin Yang
Preventing double-triggering by separating DAG files different schedulers parse sounds easier and more intuitive. I actually removed one of the double-triggering prevention logic here (expensive) and was re

Re: Multiple Schedulers - "scheduler_lock"

2019-03-01 Thread Maxime Beauchemin
Forgot to mention: the intention was to use the lock, but I never personally got to do the second phase which would consist of skipping the DAG if the lock is on, and expire the lock eventually based on a config setting. Max On Fri, Mar 1, 2019 at 1:57 PM Maxime Beauchemin wrote: > My original
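The "second phase" described in this message (skip the DAG if the lock is on, expire the lock based on a config setting) can be sketched as follows. This is an in-memory illustration under stated assumptions, not Airflow's implementation: the class `DagLockTable`, the function `schedule_pass`, and the constant `LOCK_EXPIRY_SECONDS` are hypothetical names, and a real version would need an atomic conditional update in the metadata database rather than a Python dict.

```python
import time
from typing import Dict, Iterable, List, Optional

# Hypothetical setting name; this is not an actual Airflow config option.
LOCK_EXPIRY_SECONDS = 300.0


class DagLockTable:
    """In-memory stand-in for a per-DAG lock stored in the metadata database."""

    def __init__(self, expiry_seconds: float = LOCK_EXPIRY_SECONDS) -> None:
        self.expiry_seconds = expiry_seconds
        self._locked_at: Dict[str, float] = {}

    def try_acquire(self, dag_id: str, now: Optional[float] = None) -> bool:
        """Take the lock unless another holder has it and it has not expired."""
        now = time.monotonic() if now is None else now
        locked_at = self._locked_at.get(dag_id)
        if locked_at is not None and now - locked_at < self.expiry_seconds:
            return False  # lock is on and still fresh: the caller should skip
        self._locked_at[dag_id] = now  # free or expired: (re)take the lock
        return True

    def release(self, dag_id: str) -> None:
        self._locked_at.pop(dag_id, None)


def schedule_pass(dag_ids: Iterable[str], locks: DagLockTable,
                  now: Optional[float] = None) -> List[str]:
    """Return the DAGs this scheduler may process; locked DAGs are skipped."""
    return [dag_id for dag_id in dag_ids if locks.try_acquire(dag_id, now)]
```

The expiry is what keeps a crashed scheduler from holding a DAG forever: once `expiry_seconds` have passed, the next scheduler to look at that DAG takes the lock over.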

Re: Multiple Schedulers - "scheduler_lock"

2019-03-01 Thread Maxime Beauchemin
My original intention with the lock was preventing "double-triggering" of tasks (triggering refers to the scheduler putting the message in the queue). Airflow now has good "double-firing-prevention" of tasks (firing happens when the worker receives the message and starts the task), even if the sched

Re: Multiple Schedulers - "scheduler_lock"

2019-03-01 Thread Deng Xiaodong
It’s exactly what my team is doing & what I shared here earlier last year (https://lists.apache.org/thread.html/0e21230e08f07ef6f8e3c59887e9005447d6932639d3ce16a103078f@%3Cdev.airflow.apache.org%3E

Re: Multiple Schedulers - "scheduler_lock"

2019-03-01 Thread Deng Xiaodong
Thanks @Bolke for the inputs! In that case, possibly we can change part of the AIP scope to “thoroughly test if running multiple schedulers causes issues”. And a few thoughts about your inputs: - “Database locking is mostly in place (DagRuns and TaskInstances)”: let’s say we already have DagRun x

Re: Multiple Schedulers - "scheduler_lock"

2019-03-01 Thread Mario Urquizo
We have been running multiple schedulers for about 3 months. We created multiple services to run airflow schedulers. The only difference is that we have each of the schedulers pointed to a directory one level deeper than the DAG home directory that the workers and webapp use. We have seen much be
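The layout described in this message, with each scheduler pointed at a directory one level below the DAG home, amounts to partitioning DAG files by their first-level subfolder. A small sketch of that grouping follows; the function name `group_by_subfolder` and the example paths are hypothetical, and the choice to ignore files placed directly in the DAG home is an assumption drawn from the description, not something the message states.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, List


def group_by_subfolder(dag_home: str, dag_files: List[str]) -> Dict[str, List[str]]:
    """Group DAG file paths by the first directory below dag_home.

    Each key stands for one scheduler's DAG folder (hypothetical helper).
    """
    home = PurePosixPath(dag_home)
    groups: Dict[str, List[str]] = defaultdict(list)
    for dag_file in dag_files:
        rel = PurePosixPath(dag_file).relative_to(home)
        if len(rel.parts) < 2:
            # Assumption: a file directly in the DAG home sits outside every
            # scheduler's subfolder, so no scheduler would parse it.
            continue
        groups[rel.parts[0]].append(dag_file)
    return dict(groups)
```

For example, `/dags/team_a/x.py` and `/dags/team_a/sub/z.py` would both fall to the scheduler watching `/dags/team_a`, while the workers and webapp keep using `/dags` as a whole.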

Re: Multiple Schedulers - "scheduler_lock"

2019-03-01 Thread Bolke de Bruin
I have done quite a bit of work on making it possible to run multiple schedulers at the same time. At the moment I don’t think there are any real blockers to doing so. We just don’t actively test it. Database locking is mostly in place (DagRuns and TaskInstances). And I think the worst that can

Multiple Schedulers - "scheduler_lock"

2019-03-01 Thread Deng Xiaodong
Hi Max, Following https://lists.apache.org/thread.html/0e21230e08f07ef6f8e3c59887e9005447d6932639d3ce16a103078f@%3Cdev.airflow.apache.org%3E , I’m trying to prepare an AI