Thanks Kevin! What you shared about the work you've done at Airbnb is a great reference; it gives a concrete sense of how scalable Airflow can be in a real-world use case. Greatly helpful.
Best regards,
XD

On Sat, Sep 8, 2018 at 6:52 AM Ruiqin Yang <yrql...@gmail.com> wrote:

> Thank you Xiaodong for bringing this up, and pardon me for being late on this thread. Sharing the setup within Airbnb and some ideas/progress, which should benefit people who are interested in this topic.
>
> *- Setting-up*:
> One-time install on 1.8 with cherry-picks; planning to move to containerization after releasing 1.10 internally.
>
> *- Executors*:
> CeleryExecutor.
>
> *- Scale*:
> 400 worker nodes, with celery concurrency on each node varying from 2 to 200 (depending on the queue it is serving).
>
> *- Queues*:
> We have 9 queues. 2 of them serve tasks with special env dependencies (need GPU, need special packages, etc.), and the other 7 serve tasks with different resource consumption, which leads to worker nodes with different celery concurrency and cgroup sizes.
>
> *- SLA*:
> 5 mins. 3 mins is possible (our current scheduling delay stays in the 0.5-3 min range), but we wanted some more headroom. (Our SLA is evaluated by a monitoring task running on the cluster at a 5-minute interval. It compares the current timestamp against the expected scheduling timestamp (execution_date + 5m) and sends the time diff in minutes as one data point.)
>
> *- # of DAGs/Tasks*:
> We're maintaining ~9,500 active DAGs produced by ~1,600 DAG files (the number of DAG files is actually the biggest scheduling bottleneck now). During peak hours we have ~15,000 tasks running at the same time.
>
> Xiaodong, you had a very good point about the scheduler being the performance bottleneck and needing HA. Looking forward to your contribution on the scheduler HA topic!
>
> About scheduler performance/scaling, I previously sent a proposal titled "[Proposal] Scale Airflow" to the dev mailing list. Currently I have one open PR <https://github.com/apache/incubator-airflow/pull/3830> improving the performance of celery executor querying, one WIP PR <https://github.com/yrqls21/incubator-airflow/pull/3> (fixing the CI, ETA this weekend) improving scheduler performance, and one PR in my backlog (have an internal PR, need to open source it) improving the performance of enqueuing in both the scheduler and the celery executor. From our internal stress test results, with all three PRs together, our cluster can handle *4,000 DAG files and 30,000 peak concurrent running tasks (even when they all need to be scheduled at the same time) within our 5 min SLA* (calculated using our real DAG file parsing time, which can be as long as 90 seconds for one DAG file). I think we will have enough headroom after the changes have been merged, so some longer-term improvements (separate DAG parsing component/service, scheduler sharding, distributing scheduler responsibility to workers, etc.) can wait a bit. But of course I'm open to hearing other opinions on how to better scale Airflow.
>
> [image: Screen Shot 2018-08-31 at 3.26.11 PM.png]
>
> Also, I do want to mention that with faster scheduling and DAG parsing, the DB might become the performance bottleneck. With our stress test setup we can handle the DB load with an AWS RDS r3.4xlarge instance (only with the improvement PRs). And the webserver is not scaling very well, as it parses all DAG files in a single-process fashion; that is my planned next item to work on.
>
> Cheers,
> Kevin Y
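For readers who want to see what the SLA monitoring approach Kevin describes looks like in code, here is a minimal sketch of such a scheduling-delay monitor. This is not Airbnb's actual DAG: the dag_id, the plain print in place of a real metrics emit, and the assumption of a naive UTC execution_date (the pre-1.10 default) are all illustrative.

# A minimal sketch (not Airbnb's code) of the scheduling-delay monitor described
# above: a DAG scheduled every 5 minutes whose only task reports how far behind
# its expected start time it actually ran.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python_operator import PythonOperator


def report_scheduling_delay(execution_date, **_):
    # The task is expected to start one schedule interval (5 minutes) after its
    # execution_date; anything beyond that is scheduling delay.
    expected_start = execution_date + timedelta(minutes=5)
    delay_min = (datetime.utcnow() - expected_start).total_seconds() / 60.0
    print("scheduling delay: %.2f min" % delay_min)  # swap for a StatsD/metrics emit
    return delay_min


dag = DAG(
    dag_id="scheduling_sla_monitor",
    start_date=datetime(2018, 9, 1),
    schedule_interval=timedelta(minutes=5),
    catchup=False,
)

monitor = PythonOperator(
    task_id="report_scheduling_delay",
    python_callable=report_scheduling_delay,
    provide_context=True,  # Airflow 1.x: pass execution_date and friends into the callable
    dag=dag,
)

In practice the delay would be emitted as a time series (for example via StatsD) so the 5-minute SLA can be tracked and alerted on, rather than printed.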
>
> On Thu, Sep 6, 2018 at 11:14 PM ramandu...@gmail.com <ramandu...@gmail.com> wrote:
>
>> Yeah, we are seeing the scheduler become a bottleneck as the number of DAG files increases, since the scheduler can only scale vertically, not horizontally.
>> We are trying multiple independent Airflow setups and distributing the load between them. But managing that many Airflow clusters is becoming a challenge.
>>
>> Thanks,
>> Raman Gupta
>>
>> On 2018/09/06 14:55:26, Deng Xiaodong <xd.den...@gmail.com> wrote:
>> > Thanks for sharing, Raman.
>> >
>> > Based on what you shared, I think there are two points that may be worth further discussion/thought.
>> >
>> > *Scaling up (given thousands of DAGs):*
>> > If you have thousands of DAGs, you may encounter longer scheduling latency (actual start time minus planned start time).
>> > For workers, we can scale horizontally by adding more worker nodes, which is relatively straightforward.
>> > But the *Scheduler* may become another bottleneck. The scheduler can only run on one node (please correct me if I'm wrong). Even if we can use multiple threads for it, it has its limit. HA is another concern. This is also what our team is looking into at this moment, since the scheduler is the biggest "bottleneck" we have identified so far (does anyone have experience tuning scheduler performance?).
>> >
>> > *Broker for Celery Executor*:
>> > You may want to try RabbitMQ rather than Redis/SQL as the broker. Actually the Celery community had a proposal to deprecate Redis as a broker (of course this proposal was rejected eventually) [https://github.com/celery/celery/issues/3274].
>> >
>> > Regards,
>> > XD
>> >
>> > On Thu, Sep 6, 2018 at 6:10 PM ramandu...@gmail.com <ramandu...@gmail.com> wrote:
>> >
>> > > Hi,
>> > > We have a requirement to scale to run 1,000(s) of concurrent DAGs. With the celery executor we observed that the Airflow worker sometimes gets stuck if the connection to redis/mysql breaks
>> > > (https://github.com/celery/celery/issues/3932, https://github.com/celery/celery/issues/4457).
>> > > Currently we are using Airflow 1.9 with LocalExecutor, but we are planning to switch to Airflow 1.10 with the K8s Executor.
>> > >
>> > > Thanks,
>> > > Raman Gupta
>> > >
>> > > On 2018/09/05 12:56:38, Deng Xiaodong <xd.den...@gmail.com> wrote:
>> > > > Hi folks,
>> > > >
>> > > > Could you kindly share how your organization sets up and uses Airflow, especially in terms of architecture? For example:
>> > > >
>> > > > - *Setting-Up*: Do you install Airflow in a "one-time" fashion, or in a containerized fashion?
>> > > > - *Executor*: Which executor are you using (*LocalExecutor*, *CeleryExecutor*, etc.)? I believe most production environments are using *CeleryExecutor*?
>> > > > - *Scale*: If using Celery, normally how many worker nodes do you add? (For sure this depends on your workloads and the performance of your worker nodes.)
>> > > > - *Queue*: Is the Queue feature <https://airflow.apache.org/concepts.html#queues> used in your architecture? For what advantage? (For example, explicitly assigning network-bound tasks to a worker node whose parallelism can be much higher than its # of cores.)
>> > > > - *SLA*: Do you have any SLA for your scheduling? (This is inspired by @yrqls21's PR 3830 <https://github.com/apache/incubator-airflow/pull/3830>.)
>> > > > - etc.
>> > > >
>> > > > Airflow's setup can be quite flexible, but I believe there is some sort of best practice, especially in organisations where scalability is essential.
>> > > >
>> > > > Thanks for sharing in advance!
>> > > >
>> > > > Best regards,
>> > > > XD
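To make the queue discussion above concrete (Kevin's 9 queues, and XD's question about pinning network-bound tasks to high-concurrency workers), here is a minimal sketch of routing tasks to dedicated Celery queues. The queue names, dag_id and commands are made up for illustration.

# A minimal sketch of routing tasks to dedicated Celery queues so that
# differently sized worker pools serve them; names and commands are illustrative.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

dag = DAG(
    dag_id="queue_routing_example",
    start_date=datetime(2018, 9, 1),
    schedule_interval="@daily",
)

# Heavy task pinned to a queue served only by workers with a GPU / special packages.
train = BashOperator(
    task_id="train_model",
    bash_command="python train.py",
    queue="gpu",
    dag=dag,
)

# Cheap, network-bound task routed to a queue whose workers run with a celery
# concurrency far above their core count.
poll = BashOperator(
    task_id="poll_external_api",
    bash_command="curl -sf https://example.com/health",
    queue="io_light",
    dag=dag,
)

A worker only picks up tasks from the queues it subscribes to (for example "airflow worker -q gpu" in the Airflow 1.x CLI), which is what lets each pool of workers be sized, in celery concurrency and cgroup limits, for the tasks its queue carries.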
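On XD's suggestion of RabbitMQ over Redis as the Celery broker: the CeleryExecutor takes its broker from configuration (the broker_url and result_backend settings in the [celery] section of airflow.cfg), so switching brokers is a URL change rather than a code change. The sketch below shows the equivalent wiring in plain Celery terms; hostnames and credentials are placeholders.

# Illustrative only (hostnames and credentials are placeholders): the broker and
# result-backend URLs a Celery app pointed at RabbitMQ would use; in Airflow
# these URLs live in airflow.cfg rather than in code.
from celery import Celery

app = Celery(
    "airflow_tasks",
    broker="amqp://airflow:airflow@rabbitmq.internal:5672/airflow",    # RabbitMQ via AMQP
    backend="db+mysql://airflow:airflow@mysql.internal:3306/airflow",  # results kept in MySQL
)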