Thanks Kevin,
I am specifically interested in scheduler settings such as
scheduler_zombie_task_threshold and max_tis_per_query.
We are expecting a load on the order of 1000s of concurrent DAGs, so any
Airflow setting that might help us reach that target would be useful.
There will be an increase of 1000s of local DAG files, each with the
schedule set to @once.
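
For reference, this is the kind of airflow.cfg tuning we are considering;
the keys are the ones mentioned above plus the usual concurrency knobs, and
the values are illustrative guesses for this load, not recommendations:

    [core]
    # hard cap on concurrently running task instances across the cluster
    parallelism = 2048
    # max running task instances per DAG
    dag_concurrency = 16
    # @once DAGs only ever have a single run
    max_active_runs_per_dag = 1

    [scheduler]
    # seconds without a heartbeat before a task is treated as a zombie
    scheduler_zombie_task_threshold = 300
    # cap on task instances handled per scheduling query (0 = no limit)
    max_tis_per_query = 512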



On 2018/06/08 05:13:39, Ruiqin Yang <yrql...@gmail.com> wrote: 
> Not sure about 1.9, but parallelism = 0 seems to be supported on master
> <https://github.com/apache/incubator-airflow/blob/272952a9dce932cb2c648f82c9f9f2cafd572ff1/airflow/executors/base_executor.py#L113>.
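> Roughly what the linked base_executor.py check does (paraphrased here as a
> standalone sketch, not a verbatim copy):
>
>     # open worker slots as computed by the executor heartbeat, roughly:
>     def open_slots(parallelism, queued_tasks, running):
>         if parallelism == 0:          # 0 behaves as "no cap"
>             return len(queued_tasks)
>         return parallelism - len(running)
>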
> We are using 1.8 with some bug-fixing cherry-picks. The machines are just
> out-of-the-box AWS EC2 instances. We've been using I3 for the scheduler and
> R3 for the workers, but I urge you to check out the newer generations,
> which are more powerful and cheaper. As always, you may pick the best
> series by profiling your machine usage (I/O, RAM, CPU, etc.). I don't think
> we've tuned the default Airflow settings much, and the best settings for
> you will likely differ from the ones best for us (that said, I can provide
> more details when I'm back in the office if you are curious about
> particular settings).
> 
> Cheers,
> Kevin Y
> 
> On Thu, Jun 7, 2018 at 9:02 PM ramandu...@gmail.com <ramandu...@gmail.com>
> wrote:
> 
> > We have a similar use case, where we need to support multiple teams and
> > the expected load is 1000s of active TIs. We are exploring setting up a
> > separate Airflow cluster for each team and scaling each cluster
> > horizontally through the Celery executor.
> > @Ruiqin, could you please share some details on your Airflow setup, like
> > the Airflow version, machine configuration, airflow.cfg settings, etc.?
> > How can we configure infinity (0) for the cluster-wide setting? (We are
> > using Airflow v1.9, and it seems that parallelism = 0 in airflow.cfg is
> > not supported in v1.9.)
> >
> > On 2018/06/07 22:27:20, Ruiqin Yang <yrql...@gmail.com> wrote:
> > > Here to provide a data point from Airbnb--all users share the same
> > > cluster (~8k active DAGs and ~15k running tasks at peak).
> > >
> > > For the cluster-wide concurrency setting, we put infinity (0) there and
> > > scale up the number of workers when we need more worker slots.
> > >
> > > For the scheduler & Airflow UI coupling: I believe the Airflow UI is
> > > not coupled with the scheduler. Actually, at Airbnb we run the airflow
> > > worker and the airflow webserver together on the same EC2 instance--but
> > > you can always have a set of instances hosting only webservers.
> > >
> > > If you have some critical users who don't want their DAGs affected by
> > > changes from other users (ad-hoc new DAGs/tasks), you can probably set
> > > up a dedicated Celery queue for them (assuming you are using the Celery
> > > executor; the local executor is in theory not for production), or you
> > > can enforce DAG-level concurrency (maybe in CI or through policy
> > > <https://github.com/apache/incubator-airflow/blob/master/airflow/settings.py#L109>,
> > > which I'm not sure is a good practice since policy is more for
> > > task-level attributes).
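> > >
> > > A minimal sketch of the policy-hook approach (the queue name and dag_id
> > > prefix are hypothetical; this would live in airflow_local_settings.py):
> > >
> > >     def policy(task):
> > >         # send the critical team's tasks to their own Celery queue
> > >         if task.dag_id.startswith('critical_team_'):
> > >             task.queue = 'critical_team_queue'
> > >
> > > Dedicated workers can then subscribe to that queue with
> > > airflow worker -q critical_team_queue, so ad-hoc DAGs from other users
> > > never land on those machines.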
> > >
> > > With the awesome RBAC change in place, I think it makes sense to share
> > > the same cluster: easier maintenance, less user confusion, etc.
> > >
> > > Cheers,
> > > Kevin Y
> > >
> > > On Thu, Jun 7, 2018 at 1:59 PM Ananth Durai <vanant...@gmail.com> wrote:
> > >
> > > > At Slack, we follow a similar pattern of deploying multiple Airflow
> > > > instances. Since the Airflow UI and the scheduler are coupled, this
> > > > introduces friction, as users need to know the underlying deployment
> > > > strategy (like which Airflow URL to visit to see their DAGs, multiple
> > > > teams collaborating on the same DAG, pipeline operations, etc.).
> > > >
> > > > In one of the forum questions, Max mentioned renaming the scheduler
> > > > to "supervisor", as the scheduler does more than just scheduling.
> > > > It would be super cool if we could make multiple supervisors share
> > > > the same Airflow metadata storage and the Airflow UI (maybe by
> > > > introducing a unique config param `supervisor.id` for each instance).
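> > > >
> > > > To illustrate the idea (purely hypothetical -- this is not an
> > > > existing Airflow option), each instance might carry something like:
> > > >
> > > >     [scheduler]
> > > >     # hypothetical: unique id so multiple supervisors could share one
> > > >     # metadata DB without stepping on each other
> > > >     supervisor_id = team_a_supervisor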
> > > >
> > > > This approach would help us scale the Airflow scheduler horizontally
> > > > while keeping things simple from the user's perspective.
> > > >
> > > >
> > > > Regards,
> > > > Ananth.P,
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On 7 June 2018 at 04:08, Arturo Michel <arturo.mic...@starlizard.com>
> > > > wrote:
> > > >
> > > > > We have had up to 50 DAGs with multiple tasks each. Many of them
> > > > > run in parallel. We've had some issues with compute, as it was
> > > > > meant to be a temporary deployment but is somehow now the permanent
> > > > > production one, and resources are not great.
> > > > > Organisationally it is very similar to what Gerard described: more
> > > > > than one group working with different engineering practices and
> > > > > standards, which is probably one of the sources of problems.
> > > > >
> > > > > -----Original Message-----
> > > > > From: Gerard Toonstra <gtoons...@gmail.com>
> > > > > Sent: Wednesday, June 6, 2018 5:02 PM
> > > > > To: dev@airflow.incubator.apache.org
> > > > > Subject: Re: Single Airflow Instance Vs Multiple Airflow Instance
> > > > >
> > > > > We are using two cluster instances. One cluster is for the
> > > > > engineering teams in the "tech" wing, which rigorously follow tech
> > > > > principles; the other instance is for business analysts and more
> > > > > ad-hoc, experimental work by people who do not necessarily follow
> > > > > those principles. We have a nomad engineer helping out with the
> > > > > ad-hoc cluster: setting it up, connecting it to all systems, and
> > > > > resolving programming questions. Both clusters are fully
> > > > > puppetized, so we reuse configs and the way things are configured,
> > > > > plus we have a common "platform code" package that is reused across
> > > > > both clusters.
> > > > >
> > > > > G>
> > > > >
> > > > >
> > > > > On Wed, Jun 6, 2018 at 5:50 PM, James Meickle <
> > jmeic...@quantopian.com>
> > > > > wrote:
> > > > >
> > > > > > An important consideration here is that several settings are
> > > > > > cluster-wide. In particular, cluster-wide concurrency settings
> > > > > > could result in Team B's DAG refusing to schedule because of an
> > > > > > error in Team A's DAG.
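> > > > > >
> > > > > > One way to soften that risk is to cap each DAG explicitly so no
> > > > > > single DAG can consume the cluster-wide slots. A sketch (names
> > > > > > and numbers illustrative):
> > > > > >
> > > > > >     from datetime import datetime
> > > > > >     from airflow import DAG
> > > > > >
> > > > > >     dag = DAG(
> > > > > >         dag_id='team_b_example',
> > > > > >         concurrency=16,        # max running tasks for this DAG
> > > > > >         max_active_runs=2,     # max simultaneous DAG runs
> > > > > >         schedule_interval='@daily',
> > > > > >         start_date=datetime(2018, 1, 1),
> > > > > >     )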
> > > > > >
> > > > > > Do your teams follow similar practices in how eagerly they ship
> > > > > > code, or have similar SLAs for resolving issues? If so, you are
> > > > > > probably fine using co-tenancy. If not, you should probably talk
> > > > > > about it first to make sure the teams are okay with co-tenancy.
> > > > > >
> > > > > > On Wed, Jun 6, 2018 at 11:24 AM, gauthiermarti...@gmail.com <
> > > > > > gauthiermarti...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi Everyone,
> > > > > > >
> > > > > > > We have been experimenting with Airflow for about 6 months now,
> > > > > > > and we are planning to have multiple departments use it. Since
> > > > > > > we don't have any internal experience with Airflow, we are
> > > > > > > wondering whether a single instance per department is more
> > > > > > > suitable than a single instance with multi-tenancy. We are
> > > > > > > aware of the upcoming release of Airflow 1.10 and the RBAC
> > > > > > > changes that will make it better suited for multi-tenancy.
> > > > > > >
> > > > > > > Any advice on this? Any tips would be helpful to us.
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> 
