Jarek,

It has been our intention (as Astronomer) to release the Scheduler HA work
directly to open source as part of Airflow 2.0.

We recognized early on that the community highlighted Scheduler reliability
and performance as the key issues in last year's survey results. We
therefore crafted the Scheduler HA AIP to cover those areas explicitly,
along with scalability. We believe that these are key for Airflow to
continue to grow, and we have therefore been committed from the start to
making them available as open source as quickly as possible for the
community.

Ash shared his preliminary benchmark numbers as part of the Airflow 2.0
summit, but had to step away after that because of paternity leave. Now
that he is back, we have begun merging all his changes back into master. We
intend to do everything humanly possible to get it into the 2.0 release
within the published timeline.

Best regards,
Vikram


On Fri, Sep 11, 2020 at 7:15 AM Jarek Potiuk <jarek.pot...@polidea.com>
wrote:

> I am personally even super happy if Astronomer provides it to the customers
> with commercial obligations - until it is merged in 2.1 for example.
> Including the support - while we are discussing it and merging and
> releasing it in 2.1 (and likely later supporting migration to the community
> one internally).
>
> I believe there is nothing to prevent that from the ASF rules (and
> community) point of view :). It just has to be transparently communicated,
> that's all :).
>
> J.
>
>
> On Fri, Sep 11, 2020 at 2:18 PM Ash Berlin-Taylor <a...@apache.org> wrote:
>
> > Hi Jarek, and all.
> >
> > You aren't the only one to have this thought -- it's been on my mind too.
> >
> > Sadly I wasn't able to get the code in a PR-able state before heading
> > off on paternity leave. I have started separating out and submitting the
> > "easy"/preparatory PRs to try to lessen the size of the "main" PR:
> >
> > https://github.com/apache/airflow/pull/10729
> > https://github.com/apache/airflow/pull/10710
> > https://github.com/apache/airflow/pull/10706
> >
> > But yes, at some point it needs a large "big-bang" PR. That is coming.
> >
> > We will have a discussion internally and see what course we think is
> > best for us to take. One possible path is that we submit the PR to open
> > discussion, and concurrently make that change available via our
> > Astronomer images of 2.0 (which are available under the Apache 2 License
> > without commercial obligations, so usable by anyone in the community).
> >
> > Thanks for bringing this up.
> >
> > -ash
> >
> > On Sep 11 2020, at 12:56 pm, Jarek Potiuk <jarek.pot...@polidea.com>
> > wrote:
> >
> > > > I started to feel that we need to clarify statements about the HA
> > > > Scheduler for our community. Not that I am losing sleep regularly over
> > > > this, but it did keep me awake last night when I started to think about
> > > > it :).
> > >
> > > > I have a feeling that while we already defined some - rather
> > > > aggressive - timelines for 2.0, the subject of HA Scheduler was not
> > > > touched in the previous Airflow 2.0 meetings. We are not very far from
> > > > the release but the HA scheduler is implemented inside Astronomer and we
> > > > have not seen any code for it yet in the community. I understand that a
> > > > lot of work (not only development but especially testing) has been put
> > > > into it from the Astronomer team internally.
> > >
> > > > I am actually quite OK with that. I think Astronomer is a super-valuable
> > > > member of the community and I have no doubt Ash, Kaxil, Daniel, and
> > > > others will do an awesome job with it. I am simply afraid that when we
> > > > see it, some of the cases that we see as needed by the community will
> > > > require more work. This will either delay the 2.0 release or we will
> > > > have to drop it from the 2.0 release. Looking at the number of
> > > > discussions we had with - much simpler IMHO - Smart Sensors, I have the
> > > > feeling that HA scheduler will spark even more discussions. The AIP-15
> > > > <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-15+Scalable+Scheduler>
> > > > was not very rich in detail and was last updated in March 2019 (!), and
> > > > I have no doubt a large number of design decisions, observations, and
> > > > learnings have happened in Astronomer since.
> > >
> > > > And to be perfectly honest - I am OK with both of the scenarios I can see:
> > >
> > > 1) We release HA Scheduler in 2.0
> > >
> > > > For that, I think we should have started looking at the code and
> > > > discussing it quite some time ago IMHO. It might be too late if we want
> > > > to fit the aggressive timeline we have - especially since there are
> > > > other things the most active people are committing to for 2.0 and they
> > > > might simply not have enough time for quality review rounds and
> > > > discussions. I think we need to see it first to even be able to assess
> > > > whether we can make it within the timeline.
> > >
> > > > 2) We agree to release the HA Scheduler in 2.1 (or 2.2) and Astronomer
> > > > will use the HA Scheduler in their own service as a "commercial" add-on
> > > > or "advantage" of their offering.
> > >
> > > > In the meantime - between 2.0 and 2.1 - Astronomer could donate the code
> > > > and we could make sure it is reviewed and merged in a way that answers
> > > > the needs of different community members. This also has numerous
> > > > advantages for the community - similar to the case of Smart Sensors,
> > > > Astronomer can test it in production and solve all the teething
> > > > problems of such a service.
> > >
> > > > I cannot speak for the business models of Astronomer of course :), but
> > > > it seems to me like a nice advantage to have for a while, from the
> > > > business point of view. And as a community, we also benefit from having
> > > > such a strong member of the community with a sustainable and good
> > > > business model. Without Astronomer's generous support, and Ash, Kaxil,
> > > > and Daniel especially (but also others), Airflow would not be where it
> > > > is today. And I would be 100% happy with such an approach as a PMC
> > > > member and member of the community, and I would fully support it if
> > > > Astronomer chooses this path.
> > >
> > > > I think, however, it's high time that we decide and clearly communicate
> > > > it to the users as a community. At least I have a feeling that without
> > > > the community members, committers, and some heavy users being involved
> > > > in the open, and having time for quality review and discussion,
> > > > releasing HA in 2.0 might not be possible. And to just reiterate - this
> > > > has nothing to do with the expected quality of the code and testing, but
> > > > more with potential differences in expectations, assumptions,
> > > > understanding, performance limitations, and anything else that might
> > > > (and usually does) come up.
> > >
> > > I think - since we already started to publish the schedule, this is the
> > > right time that we make a decision on that and align expectations.
> > >
> > > > Ry, Vikram - I'd love to hear what Astronomer's intentions as a company
> > > > are for the HA Scheduler. I know that as a group of committers we said a
> > > > number of times that HA Scheduler will be in 2.0, so we built the
> > > > expectations among our users as a community. But maybe you really think
> > > > that pursuing scenario 2) (or maybe another scenario I have not thought
> > > > about) is the way to go for Astronomer?
> > >
> > > > As I wrote above - I am personally perfectly fine with either of the
> > > > scenarios, and I think they are both beneficial for the community, but I
> > > > think we should discuss it, align expectations, and clearly communicate
> > > > as the Apache Airflow community.
> > >
> > > J.
> > >
> > > --
> > > Jarek Potiuk
> > > Polidea | Principal Software Engineer
> > >
> > > M: +48 660 796 129
> > >
> >
>
>
> --
>
> Jarek Potiuk
> Polidea <https://www.polidea.com/> | Principal Software Engineer
>
> M: +48 660 796 129 <+48660796129>
>
