Hi all,

As promised, we are pleased to share our proposal for Airflow 3
<https://docs.google.com/document/d/1MTr53101EISZaYidCUKcR6mRKshXGzW6DZFXGzetG3E/edit?usp=sharing>.


We met several community members in the last few days to get feedback on
this proposal, and we are glad to say that most of the things in the doc
resonated with them. It was also very pleasing to see that everyone we
talked to asked how they could help contribute towards Airflow 3 and making
it successful.

We would like to now open it up for feedback from the entire community.
Please add comments to this Google doc. This proposal is purposefully
high-level to get feedback on the general direction and we will have
several AIPs for the big pieces mentioned in the doc.

If there aren't any strong objections on why we (as the Airflow community)
should work on Airflow 3, we propose a dedicated fortnightly recurring
call starting the first week of June. This will give enough time to get
feedback on our proposal, incorporate any feedback and then focus our
discussion on the What & How of Airflow 3 as opposed to Why.

As we hit the 10-year mark at this year’s Airflow Summit, we have a unique
marketing opportunity to officially announce Airflow 3. We can either use
this milestone to just look back at a decade of growth or make it more
exciting to not only showcase growth but innovation and, more importantly,
get the Airflow users excited about the next chapter in Airflow’s story.
We’ll have the perfect platform to gather immediate feedback, engage the
community in shaping future features, and share our vision for what lies
ahead. We can then focus Airflow Summit 2025 on discussing how Airflow 3
features are used in the wild by companies around the world, showcase the
migration tools and utilities that will have matured by then, and gather
more community insights to continue improving Airflow. This will give our
users more confidence in Airflow 3 and help them feel comfortable upgrading
to it. This is, of course, just our proposal, and we would love to hear
what others think. It’s certainly ambitious, but nothing we haven’t done in
the past, and having a goal-post will help us all.

Hope you all have a great weekend.

Regards,
Kaxil, Vikram & Constance

On Mon, 29 Apr 2024 at 05:11, Amogh Desai <amoghdesai....@gmail.com> wrote:

> Thanks for starting the discussion, Jarek!
>
> I too agree that with the new upcoming features and AIPs, it might just be
> the right time
> to discuss the possibility of having Airflow 3. I agree with most reasons
> pointed out by others and
> I would love to see it happen, and also be a part of it.
>
> Since this is a major step for the future of Airflow, we need to carefully
> consider the user experience for
> users coming from versions of Airflow 2 and would not want this migration
> to be a pain.
>
> Btw, I concur with Jens and I too am not very clear when we say that "Gen
> AI is going to be the new trigger for Airflow".
> Would be obliged if someone could explain that portion to me :)
>
>
> Thanks & Regards,
> Amogh Desai
>
>
> On Mon, Apr 22, 2024 at 1:52 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>
> > Just one comment here - while maybe "shocking" for some cases - yes, this
> > one has been clearly coming. Actually, it took a lot of my brain cycles
> > recently to think "what's next". Too much, to the point that I started
> the
> > thread.
> > I thought it might be quite a valuable opening from someone who always
> said
> > "well, we have to have **really** good reason to do Airflow 3" and "maybe
> > there will not be Airflow 3".
> >
> > And I quite agree with Kaxil - that trying to organise our thoughts
> > around what to do and what our approach for Airflow 3 should be, based
> > on just this thread, is a bit too early.
> > I do not think this one thread here will lead to us deciding what to do -
> > if we try to do it now in a discussion thread or even a confluence doc,
> we
> > might fail to achieve the goal.
> >
> > My main point here was to really get the feel and open thoughts of those
> > who are actively involved in Airflow - on what we should do next. And to
> > see if this is the right time to start thinking in "two" modes: Airflow 2
> > and Airflow 3 (even if we do not know yet what Airflow 3 will be).
> >
> > I'd rather let a free stream of thoughts of what people think should
> happen
> > here continue. Merely opening our minds to the possibility of Airflow 3.
> > And I would love to keep it flowing for others - without the goal of
> > organizing it or achieving consensus.
> >
> > And I think all that Kaxil writes about - starting a series of calls,
> > organizing our discussions, getting "product manager(s)" working on
> > organizing those discussions is the **right** thing to do.
> > How exactly to do that, how to make sure everyone is involved, while we
> are
> > not tied up in endless discussions and bike-shedding, should materialize
> > from our discussion.
> >
> > But I would propose (and encourage) others' thoughts here as well - just
> a
> > free stream of those - then it might provide valuable feedback to what's
> > next.
> >
> > J.
> >
> >
> > On Mon, Apr 22, 2024 at 4:14 AM Kaxil Naik <kaxiln...@gmail.com> wrote:
> >
> > > Hello all,
> > >
> > > I didn't anticipate reading an Airflow 3 email from a sunny beach in
> > Nice,
> > > France <https://en.wikipedia.org/wiki/Nice> -- I had a great time
> there
> > > over the weekend, highly recommended :D
> > >
> > > I say that because, as Vikram pointed out, some of us at Astronomer
> have
> > > been polishing up the doc to propose Airflow 3 to the community in the
> > > coming week. Such is the beauty of the open-source project that
> multiple
> > > people (in the form of developers, committers, PMC members and various
> > > Stakeholders) think the same. From the Astronomer front, Constance had
> > been
> > > championing a doc with Vikram & myself, with inputs from various other
> > > committers & users, to have a good blend of different perspectives --
> > > Product, PMC member & Industry leaders that cover several areas from
> > > User-facing pain points, Industry trends in the Orchestration space,
> > > Innovation in AI & ML space & opportunities as well as maintainability
> of
> > > the current codebase. We would love to share it this week; it goes into
> > > some of the details of our perspective on Why Airflow 3.0, Why now, etc.
> > >
> > > I would like to reiterate my statement from last year's panel session
> at
> > > the Airflow Summit with Marc, Jarek & Pierre
> > > <https://airflowsummit.org/sessions/2023/panels/panel-faces-airflow/>:
> > > "We,
> > > as the Airflow project, have maintained a great balance of Innovation &
> > > Stability", and I truly believe in that; it is clearly visible in the
> > > number of downloads and Airflow's popularity as the leader in the
> > Workflow
> > > Orchestration space. Our industry is rapidly evolving, especially in
> the
> > > Data, AI & ML space. The role and expectations of the Data Orchestrator
> > > (more specialized than a generic Workflow Orchestrator) are also
> evolving
> > > as it is more and more utilized for Business critical applications than
> > > just powering dashboards. So, IMO, we must continue catering to those
> > > use-cases and innovate to aid the new use cases. Some of us from
> > Astronomer
> > > would like to create AIPs around these new use-cases LLM/Gen-AI, Data
> > > Awareness, and some of the other things discussed above in the coming
> > weeks
> > > to receive feedback from everyone.
> > >
> > > Apart from the new use cases around Data, AI & ML, balancing them with
> > > resolving user pain points with things like DAG Versioning, a more
> modern
> > > and extensible UI, lack of permissions with Airflow CLI, simplifying
> the
> > > first-user Learning Curve etc. while cleaning up Tech-debt (for example,
> > > we have 100+ deprecations in our code-base), a more performant scheduler
> > > like the async SqlAlchemy & Scheduler discussions on the mailing list,
> > > dropping dependency on FAB, rethinking provider/core separation -- both
> > for
> > > users & developers -- will make for a powerful 3.0 release that users
> > will
> > > want to update and will provide a cleaner code-base for the
> contributors
> > to
> > > build new foundational pieces.
> > >
> > > Needless to say, similar to Airflow 2.0, we will have to provide our
> > users
> > > with utilities like the Airflow Upgrade check
> > > <https://airflow.apache.org/docs/apache-airflow/stable/howto/upgrading-from-1-10/upgrade-check.html>
> > > script & other tools and docs to ease the migration.
> > >
> > > Regarding the proposal to move the discussion to Confluence: In my
> > opinion,
> > > Confluence is a good place once things become more concrete and
> defined,
> > > and we are looking for feedback. For a thing as big as Airflow 3, I
> would
> > > humbly suggest the same route as what we did for Airflow 2 -- to have
> > > a few recurring
> > > Dev calls
> > > <https://cwiki.apache.org/confluence/display/AIRFLOW/Meeting+Notes> to
> > > gather areas of interest and various upcoming AIPs from different
> > > stakeholders and get aligned on Why Airflow 3? and Defining the
> > > high-level scope
> > > <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158869992>.
> > > This way, we can iterate much faster, and since all of the calls will
> be
> > > recorded and summary notes will be added on Confluence (example
> > > <https://cwiki.apache.org/confluence/display/AIRFLOW/Meeting+Notes>)
> and
> > > posted on the mailing list
> > > <https://lists.apache.org/list?dev@airflow.apache.org:2020-9:dev%20call>,
> > > we will have a written record of the important things. For any decision
> > > points, we will bring things to the mailing list for Vote or Lazy
> > > consensus. We could also re-use the Town Hall calls if needed. Once we
> > have
> > > good enough alignment and scope,  utilizing Confluence and mailing list
> > on
> > > the individual items & AIPs would be more valuable, in my opinion.
> > >
> > > PS: It makes me very excited that we are discussing a new major phase
> of
> > > Airflow. Looking forward to it.
> > >
> > > Regards,
> > > Kaxil
> > >
> > > On Sun, 21 Apr 2024 at 17:21, Scheffler Jens (XC-AS/EAE-ADA-T)
> > > <jens.scheff...@de.bosch.com.invalid> wrote:
> > >
> > > > Hi Developers,
> > > >
> > > > TLDR Summary: I propose to move the discussion from an Email
> > > > Reply-to-all chain to a discussion collection in
> > > > https://cwiki.apache.org/confluence/x/hQv9EQ
> > > >
> > > > When I first saw this email from Jarek I was a bit surprised and
> > actually
> > > > the email was pulling me out of a kind of comfort zone, knowing what
> > are
> > > > the next steps. Naturally I was shocked a bit. So I decided to sleep
> > > > on it. (Might be a bit shocking because Jarek dropped it 😃
> > especially
> > > I
> > > > heard his position about a 3.0 before and that always sounded to me
> > like
> > > a
> > > > strong position... haha)
> > > > After sleeping on the post I think it is valid to raise the
> > > > discussion. Especially as we are going to a 10th feature-release
> which
> > > was
> > > > also a cut-over from 1.x to 2.x. At some point every software product
> > > needs
> > > > a re-factoring and cleanup. Structures are never perfect. But a lot
> of
> > > > emotions and work are included with such a step. And a risk to fail
> and
> > > to
> > > > lose a lot of users and force them to migrate (or have them
> run-away).
> > So
> > > > my current outcome is: We should carefully consider. But we need to
> > > > consider.
> > > >
> > > > I believe the discussion will take a moment and focus - and a
> > > Reply-to-all
> > > > chain will not be a good path as we will lose a lot of detail and
> focus
> > > and
> > > > emails will create a lot of noise which is hard to follow. In a
> perfect
> > > > non-distributed world I'd call you to a half-day visioning workshop
> in
> > a
> > > > room and focus on the whiteboard. Not possible with this level of
> > > > distribution. Next option would be a (large) ~4h conference call
> which
> > is
> > > > hard to make in a time-zone matching the sleep cycle for all. Perfect
> > > would
> > > > be if Summit would be close-by and plan a 1/2 day or full-day
> breakout
> > > for
> > > > contributors on Day4 or so. But September is far far away.
> > > >
> > > > Therefore - to reduce amount of emails - I propose to start points,
> > > ideas,
> > > > pain points etc. first on a Confluence page. Therefore I tried to
> start
> > > one
> > > > page as starting points (contrary ideas welcome!) to have a place to
> > > > collaborate and sketch. A virtual whiteboard would also be OK but I
> had
> > > > none at my hands to share... (like Miro, Mural etc.). If we collect
> > > ideas,
> > > > points etc. on this page we can have a rather short (2h) call with
> > > > contributors in the next time to pitch and discuss the points and
> > define
> > > > follow-up steps to a plan, vote and conclusion.
> > > >
> > > > Proposed Confluence discussion page:
> > > > https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+3.0+Discussion+and+Planning
> > > >
> > > > As a starting point I tried to import both of the emails I saw in the
> > thread
> > > > into the page as starter. As it is a call to collaborate, please
> start
> > > > editing and drop your points as well.
> > > >
> > > > Towards Jarek's mentioned trigger points:
> > > > Actually the dropped AIP-68 and AIP-69 are something that in my view
> do
> > > > NOT require Airflow to get to 3.0. I would see them either "Tactical"
> > or
> > > > "just functional enhancements". AIP-68 is "just" a bit of sugar to UI
> > and
> > > > extensions to Plugin interface in my view. AIP-69 is basically
> building
> > > > something on-top, based on the concept of Hybrid Executors. As long
> as
> > we
> > > > would assume AIP-69 does not need drastic changes, maybe only small
> > > > adjustments in the core (but concept not elaborated yet). I see this
> > > mainly
> > > > as "just another Executor" that should not need breaking changes. I
> did
> > > not
> > > > want to drop these two AIP's to start a fundamental discussion but
> > rather
> > > > to bring-in a new feature each.
> > > > The points as factors that are hard to achieve in Airflow 2.x world
> are
> > > > rather the "Multi Tenancy/Team" and "Dag Versioning" which in my eyes
> > > might
> > > > be able to move faster with a 3.0.
> > > >
> > > > P.S.: I do not (yet?) get the point of why GenAI is a trigger point
> > > > that forces structural breaking changes.
> > > >
> > > > Best regards
> > > >
> > > > Jens Scheffler
> > > >
> > > > Alliance: Enabler - Tech Lead (XC-AS/EAE-ADA-T)
> > > > Robert Bosch GmbH | Hessbruehlstraße 21 | 70565 Stuttgart-Vaihingen |
> > > > GERMANY | www.bosch.com
> > > > Tel. +49 711 811-91508 | Mobil +49 160 90417410 |
> > > > jens.scheff...@de.bosch.com
> > > >
> > > > Registered office: Stuttgart, Registration court: Amtsgericht Stuttgart, HRB 14000;
> > > > Chairman of the Supervisory Board: Prof. Dr. Stefan Asenkerschbaumer;
> > > > Management: Dr. Stefan Hartung, Dr. Christian Fischer, Dr. Markus
> > > > Forschner, Stefan Grosch, Dr. Markus Heyn, Dr. Frank Meyer, Dr. Tanja Rückert
> > > >
> > > > -----Original Message-----
> > > > From: Vikram Koka <vik...@astronomer.io.INVALID>
> > > > Sent: Saturday, April 20, 2024 6:23 PM
> > > > To: dev@airflow.apache.org
> > > > Subject: Re: [HUGE DISCUSSION] Airflow3 and tactical (Airflow 2) vs
> > > > strategic (Airflow 3) approach
> > > >
> > > > A wonderful and exciting Saturday morning discussion!
> > > > Thank you Jarek for bringing the offline conversations into the
> mailing
> > > > list.
> > > >
> > > > I completely agree on the necessity of Airflow 3.
> > > > I also agree that Gen AI is the trigger i.e. the answer to "Why now"?
> > > >
> > > > Having been thinking about this for a while from a strategic
> > perspective,
> > > > as opposed to the tactical perspective of the bi-weekly and monthly
> > > > releases, I believe that our thinking as you articulated should have
> a
> > > > clear understanding of strategic vs. tactical, but I don't believe
> our
> > > > execution needs to necessarily be either or, but can actually be
> > blended.
> > > >
> > > > With that said,  I believe that there are the following four buckets
> > that
> > > > we should use as a framework for Airflow 3.
> > > >
> > > > 1. Gen AI / LLM support
> > > > 2. Airflow User Improvements
> > > > 3. Easy adoption of Airflow by new users
> > > > 4. Integration improvements / Provider maintainability
> > > >
> > > > Describing them in more detail below:
> > > > 1. Gen AI / LLM support
> > > > Reiterating the fact that this needs more work, I do believe this can
> > be
> > > > incremental to Airflow. As Astronomer, we have worked on the LLM
> > > Providers
> > > > which we contributed to Airflow late last year. But clearly, there is so
> > > > much more to do, both from building awareness of the patterns / templates
> to
> > > > use, as well as patterns to support in Airflow to make these easier
> to
> > > use
> > > > and adopt.
> > > >
> > > > 2. Airflow User Improvements
> > > > Clearly features and improvements desired by the Community are
> > important
> > > > to continue to work on to make Airflow more approachable. The top two
> > > > features which leap to mind for me here are:
> > > > 2.1 DAG Versioning - the most requested feature in the Airflow User
> > > Survey,
> > > > 2.2 Modern UI - also comes up a lot
> > > > 2.3 Different DAG distribution processes
> > > > 2.4 Different execution mechanisms
> > > > I know there are many more which I don't currently recall.
> > > >
> > > > 3. Airflow adoption
> > > > We have discussed this many times, but we absolutely need to make the
> > > > individual first-time adoption of Airflow better.
> > > > I think the most common term I recall here is the notion of "Airflow
> > > > Standalone", but whatever the term may be, an ultra quick, simple
> > install
> > > > of Airflow and the getting started experience is something we owe our
> > > > community.
> > > >
> > > > 4. Integration / Providers
> > > > The changes we made as part of Airflow 2.0 to split the Core Airflow
> > > > releases from the Provider releases was clearly a good choice and
> made
> > a
> > > > huge impact. However, the integration maintainability balanced with
> > > growth
> > > > still seems like it could use a significant set of improvements. Elad
> > > and I
> > > > spoke about this a couple of days ago as well and I don't have a
> clear
> > > set
> > > > of next steps here, but definitely worth exploring.
> > > >
> > > > Some of us at Astronomer have been discussing this quite a bit and
> > > > planning on bringing a more polished draft to the community, but an
> > > initial
> > > > discussion on a Saturday is fun as well :). We will definitely share
> > our
> > > > Airflow 3 proposal as a document with the community within the next
> > week,
> > > > as a request for comment.
> > > >
> > > >
> > > >
> > > > On Sat, Apr 20, 2024 at 1:50 AM Jarek Potiuk <ja...@potiuk.com>
> wrote:
> > > >
> > > > > Hello here,
> > > > >
> > > > > I have been thinking a lot recently and discussing with some people
> > > > > and I am more and more convinced it's about the time we - as a
> > > > > community - should start doing changes considering "Airflow 2"
> > current
> > > > and "Airflow 3" future.
> > > > >
> > > > >
> > > > > *TL;DR: I think we should seriously start work on Airflow 3 and
> > decide
> > > > > what it means for our AIPs  - to treat some of them as more
> > "tactical"
> > > > > - things that should go into Airflow 2 and some "strategic" ones -
> > > > > being foundational for Airflow 3 - with different goals and
> > criteria.*
> > > > >
> > > > > A lot of us already think that way and a lot of us have already
> > talked
> > > > > about it for quite some time, so you should treat my mail mostly
> as a
> > > > > little trigger "let's start publicly discussing what it might mean
> > for
> > > > > us and our community and let's make it clear about the target of
> the
> > > > > initiatives we do".
> > > > >
> > > > > Some might be surprised it comes from me as I've been often saying
> > "no
> > > > > Airflow 3 without a good reason" or "possibly we will have no
> Airflow
> > > > > 3", but I think (and a number of people I spoke to have similar
> > > > > opinion) we have plenty of reasons to make some bold moves now.
> > > > >
> > > > > Over the last 4 years since Airflow 2 was out, a lot has changed
> and
> > > > > we have a number of different needs that current Airflow 2 cannot
> > > > > **really** do well
> > > > >
> > > > > - LLM/Gen-AI mainly as the important trigger
> > > > > - Cloud Native is the "way to go"
> > > > > - need to submit DAGs in other ways than dropping them to a shared
> > DAG
> > > > > folder.
> > > > > - local testing and fast iteration on developing pipelines.
> > > > > - ability to run tasks with workflow with "affinity" so that they
> can
> > > > > share inputs/outputs in shared CPU/GPU memory
> > > > > - ability to integrate seamlessly with other workflow engines -
> > making
> > > > > Airflow a "workflow of workflows"
> > > > > - probably way more
> > > > > - all that while keeping a lot of the strengths of Airflow 2 - such
> > as
> > > > > continuing to have the option of using the many thousands of
> > operators
> > > > > with
> > > > > 90+ providers.
> > > > >
> > > > > All those above - we could implement better if we get rid of a
> number
> > > > > of the implicit or explicit luggage we have in Airflow 2. I think
> the
> > > > > last two proposals from Jens: AIP-68 and AIP-69 reflect very much
> > that
> > > > > - both would have been much easier and more straightforward if we got
> > > > > Airflow 3 re-designed basically at a drawing board, boldly
> > > > > dropping some Airflow 2 assumptions.
> > > > > And if we implemented core airflow 3 - taking the best part of what
> > we
> > > > > have now in Airflow 2, but generally dropping the luggage  in a new
> > > > framework.
> > > > >
> > > > > And it won't be possible without breaking some fundamental
> > assumptions
> > > > > and making Airflow 3 quite heavily incompatible with Airflow 2
> > > > >
> > > > > From "my" camp - dropping the need of having the 700+ dependencies
> > for
> > > > > Airflow + all providers in a single Python interpreter, dropping
> > > > > dependency on Flask/Plugins/FAB would be a huge win on its own. Not
> > > > > mentioning being able to split provider's development and
> > contribution
> > > > > from airflow core (while keeping the development of providers as
> well
> > > > > and
> > > > > contributions) - this has been highly requested.
> > > > >
> > > > > And I think we have a lot of people in our community who would be
> > able
> > > > > (and would love) to do it - I think a number of us (including
> myself)
> > > > > are a bit burned out and tired of just maintaining things in
> Airflow
> > > > > in a backwards-compatible way and would jump on the opportunity to
> > > > > rebuild Airflow.
> > > > >
> > > > > But - we of course cannot forget about Airflow 2 users. We do not
> > want
> > > > > to "stop the world" for them. We want to keep fixing things and
> > adding
> > > > > incremental changes - and those things do not necessarily need to be
> > > > > super "future-proof". They should help to "keep the lights on" for a
> while
> > > > > - which means that in a number of cases it could be "band-aid".
> > AIP-44
> > > > > (internal-API), AIP-67 (multi-team) are more of those.
> > > > >
> > > > > So - what I think we might want to do as a community:
> > > > >
> > > > > * start working on Airflow 3 foundations (and decide what it means
> > for
> > > > > our users and developer community). Decide what to keep, what to
> > drop,
> > > > > what to redesign, assumptions to recreate.
> > > > >
> > > > > * explicitly split the initiatives/AIPs we have to target Airflow 2
> > > > > and Airflow 3 and treat them a bit differently in terms of
> > > > > future-proofness
> > > > >
> > > > > I would love to hear your thoughts on that (bracing for the storm
> of
> > > > > those).
> > > > >
> > > > > J.
> > > > >
> > > >
> > > >
> > >
> >
>
