Thanks @Bolke for the inputs! In that case, perhaps we can change part of the AIP scope to “thoroughly test whether running multiple schedulers causes issues”.

And a few thoughts about your inputs:

- “Database locking is mostly in place (DagRuns and TaskInstances)”: let’s say DagRun xxx has already been created, and another scheduler tries to create it again. The second creation will certainly fail because of the DB locking. But does it fail gracefully?
- “the worst that can happen is that a task is scheduled twice... Everyone is having idempotent tasks right so no harm done?”: Firstly, it may mean that we’re “wasting” some scheduler resources (it’s like driving around to find a parking lot while competing with other drivers. It would be much more efficient if someone told me the exact location of an available lot). Secondly, some tasks are not idempotent, such as inserting a few records into a database.

XD

> On 1 Mar 2019, at 11:43 PM, Bolke de Bruin <bdbr...@gmail.com> wrote:
>
> I have done quite some work on making it possible to run multiple
> schedulers at the same time. At the moment I don’t think there are real
> blockers actually to do so. We just don’t actively test it.
>
> Database locking is mostly in place (DagRuns and TaskInstances). And I
> think the worst that can happen is that a task is scheduled twice. The
> task will detect this most of the time and kill one off if they run
> concurrently; if they run sequentially, it will run again on some
> occasions. Everyone is having idempotent tasks right so no harm done? ;-)
>
> Have you encountered issues? Maybe work those out?
>
> Cheers
> Bolke.
>
> Sent from my iPad
>
>> On 1 Mar 2019, at 16:25, Deng Xiaodong <xd.den...@gmail.com> wrote:
>>
>> Hi Max,
>>
>> Following
>> https://lists.apache.org/thread.html/0e21230e08f07ef6f8e3c59887e9005447d6932639d3ce16a103078f@%3Cdev.airflow.apache.org%3E,
>> I’m trying to prepare an AIP for supporting multiple schedulers in
>> Airflow (mainly for HA and higher scheduling performance).
>>
>> While checking the code, I found that DagModel has an attribute,
>> “scheduler_lock”. It’s not used at all in the current implementation,
>> but it was introduced a long time ago (2015) to allow multiple
>> schedulers to work together
>> (https://github.com/apache/airflow/commit/2070bfc50b5aa038301519ef7c630f2fcb569620).
>>
>> Since you were the original author of it, it would be very helpful if
>> you could kindly share why the multiple-schedulers implementation was
>> eventually removed, and what challenges/complexity there were.
>> (You already shared a few valuable inputs in the earlier discussion
>> https://lists.apache.org/thread.html/d37befd6f04dbdbfd2a2d41722352603bc2e2f97fb47bdc5ba454d0c@%3Cdev.airflow.apache.org%3E,
>> mainly relating to hiccups around concurrency, cross-DAG prioritisation
>> and load on the DB. Other than these, is there anything else you would
>> like to advise?)
>>
>> I will also dive further into the git history to understand it better.
>>
>> Thanks.
>>
>> XD
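
On the “fail gracefully” question in the reply at the top: below is a minimal sketch of what I mean, not Airflow code. It assumes a unique constraint on (dag_id, execution_date) and uses Python's built-in sqlite3 with an in-memory database. The table layout and the names make_db, create_dag_run and created_by are hypothetical, chosen only for illustration. The point is that the second scheduler treats the constraint violation as “someone else already created it” and moves on, instead of crashing.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical dag_run table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dag_run ("
        " dag_id TEXT NOT NULL,"
        " execution_date TEXT NOT NULL,"
        " created_by TEXT NOT NULL,"
        # Assumption: the database rejects a second row for the same run.
        " UNIQUE (dag_id, execution_date))"
    )
    return conn


def create_dag_run(conn, dag_id, execution_date, scheduler_name):
    """Try to create a run; return True if created, False if it already existed."""
    try:
        # "with conn" commits on success and rolls back on an exception.
        with conn:
            conn.execute(
                "INSERT INTO dag_run (dag_id, execution_date, created_by)"
                " VALUES (?, ?, ?)",
                (dag_id, execution_date, scheduler_name),
            )
        return True
    except sqlite3.IntegrityError:
        # Another scheduler won the race; skip instead of failing the loop.
        return False


if __name__ == "__main__":
    conn = make_db()
    for name in ("scheduler_a", "scheduler_b"):
        created = create_dag_run(conn, "example_dag", "2019-03-01T00:00:00", name)
        print(name, "created the run" if created else "found it already created")
```

This only covers the “create it twice” case. It does not address the wasted scheduler work or non-idempotent tasks mentioned above.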