Just one comment here - while maybe "shocking" for some cases - yes, this
one has been clearly coming. Actually, it took a lot of my brain cycles
recently to think "what's next". Too much, to the point that I started the
thread.
I thought it might be quite a valuable opening from someone who always said
"well, we have to have **really** good reason to do Airflow 3" and "maybe
there will not be Airflow 3".

And I quite agree with Kaxil - that trying to organise our thoughts around
what to do and how to approach Airflow 3 based on just this thread is
a bit too early.
I do not think this one thread here will lead to us deciding what to do -
if we try to do it now in a discussion thread or even a confluence doc, we
might fail to achieve the goal.

My main point here was to really get the feel and open thoughts of those
who are actively involved in Airflow - on what we should do next. And to
see if this is the right time to start thinking in "two" modes: Airflow 2
and Airflow 3 (even if we do not know yet what Airflow 3 will be).

I'd rather let a free stream of thoughts of what people think should happen
here continue. Merely opening our minds to the possibility of Airflow 3.
And I would love to keep it flowing for others - without the goal of
organizing it or achieving consensus.

And I think all that Kaxil writes about - starting a series of calls,
organizing our discussions, getting "product manager(s)" working on
organizing those discussions is the **right** thing to do.
How exactly to do that, how to make sure everyone is involved, while we are
not tied up in endless discussions and bike-shedding, should materialize
from our discussion.

But I would propose (and encourage) others' thoughts here as well - just a
free stream of those - then it might provide valuable feedback to what's
next.

J.


On Mon, Apr 22, 2024 at 4:14 AM Kaxil Naik <kaxiln...@gmail.com> wrote:

> Hello all,
>
> I didn't anticipate reading an Airflow 3 email from a sunny beach in Nice,
> France <https://en.wikipedia.org/wiki/Nice> -- I had a great time there
> over the weekend, highly recommended :D
>
> I say that because, as Vikram pointed out, some of us at Astronomer have
> been polishing up the doc to propose Airflow 3 to the community in the
> coming week. Such is the beauty of the open-source project that multiple
> people (in the form of developers, committers, PMC members and various
> Stakeholders) think the same. From the Astronomer front, Constance had been
> championing a doc with Vikram & myself, with inputs from various other
> committers & users, to have a good blend of different perspectives --
> Product, PMC member & Industry leaders that cover several areas from
> User-facing pain points, Industry trends in the Orchestration space,
> Innovation in AI & ML space & opportunities as well as maintainability of
> the current codebase. We would love to share it this week; it goes into
> some of the details of our perspective on Why Airflow 3.0, Why now
> etc.
>
> I would like to reiterate my statement from last year's panel session at
> the Airflow Summit with Marc, Jarek & Pierre
> <https://airflowsummit.org/sessions/2023/panels/panel-faces-airflow/>:
> "We, as the Airflow project, have maintained a great balance of Innovation &
> Stability". I truly believe that, and it is clearly visible in the
> number of downloads and Airflow's popularity as the leader in the Workflow
> Orchestration space. Our industry is rapidly evolving, especially in the
> Data, AI & ML space. The role and expectations of the Data Orchestrator
> (more specialized than a generic Workflow Orchestrator) are also evolving
> as it is more and more utilized for Business critical applications than
> just powering dashboards. So, IMO, we must continue catering to those
> use-cases and innovate to aid the new use cases. Some of us from Astronomer
> would like to create AIPs around these new use-cases LLM/Gen-AI, Data
> Awareness, and some of the other things discussed above in the coming weeks
> to receive feedback from everyone.
>
> Balancing the new use cases around Data, AI & ML with resolving user pain
> points (DAG Versioning, a more modern and extensible UI, lack of
> permissions with the Airflow CLI, simplifying the first-user Learning
> Curve etc.) while cleaning up Tech-debt (for example, we have 100+
> deprecations in our code-base), building a more performant scheduler (see
> the async SqlAlchemy & Scheduler discussions on the mailing list),
> dropping the dependency on FAB, and rethinking provider/core separation --
> both for users & developers -- will make for a powerful 3.0 release that
> users will want to update to and will provide a cleaner code-base for the
> contributors to build new foundational pieces.
>
> Needless to say, similar to Airflow 2.0, we will have to provide our users
> with utilities like the Airflow Upgrade check script
> <https://airflow.apache.org/docs/apache-airflow/stable/howto/upgrading-from-1-10/upgrade-check.html>
> & other tools and docs to ease the migration.
>
> Regarding the proposal to move the discussion to Confluence: In my opinion,
> Confluence is a good place once things become more concrete and defined,
> and we are looking for feedback. For a thing as big as Airflow 3, I would
> humbly suggest the same route as what we did for Airflow 2 -- to have
> a few recurring Dev calls
> <https://cwiki.apache.org/confluence/display/AIRFLOW/Meeting+Notes> to
> gather areas of interest and various upcoming AIPs from different
> stakeholders and get aligned on Why Airflow 3? and Defining the high-level
> scope
> <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158869992>.
> This way, we can iterate much faster, and since all of the calls will be
> recorded and summary notes will be added on Confluence (example
> <https://cwiki.apache.org/confluence/display/AIRFLOW/Meeting+Notes>) and
> posted on the mailing list
> <https://lists.apache.org/list?dev@airflow.apache.org:2020-9:dev%20call>,
> we will have a written record of the important things. For any decision
> points, we will bring things to the mailing list for Vote or Lazy
> consensus. We could also re-use the Town Hall calls if needed. Once we have
> good enough alignment and scope, utilizing Confluence and mailing list on
> the individual items & AIPs would be more valuable, in my opinion.
>
> PS: It makes me very excited that we are discussing a new major phase of
> Airflow. Looking forward to it.
>
> Regards,
> Kaxil
>
> On Sun, 21 Apr 2024 at 17:21, Scheffler Jens (XC-AS/EAE-ADA-T)
> <jens.scheff...@de.bosch.com.invalid> wrote:
>
> > Hi Developers,
> >
> > TLDR Summary: I propose to move the discussion from an email Reply-to-all
> > chain to a discussion collection in
> > https://cwiki.apache.org/confluence/x/hQv9EQ
> >
> > When I first saw this email from Jarek I was a bit surprised; the email
> > actually pulled me out of a kind of comfort zone of knowing what the
> > next steps are. Naturally I was a bit shocked, so I decided to sleep
> > on it. (It might be a bit shocking because Jarek dropped it 😃 especially
> > as I had heard his position about a 3.0 before, and that always sounded
> > to me like a strong position... haha)
> > After sleeping on the post I think it is valid to raise the
> > discussion, especially as we are approaching a 10th feature release, which
> > was also the cut-over point from 1.x to 2.x. At some point every software
> > product needs re-factoring and cleanup. Structures are never perfect. But
> > a lot of emotion and work is involved in such a step, along with the risk
> > of failing, of losing a lot of users, and of forcing them to migrate (or
> > having them run away). So my current conclusion is: We should carefully
> > consider. But we need to consider.
> >
> > I believe the discussion will take time and focus - and a Reply-to-all
> > chain will not be a good path, as we will lose a lot of detail and focus,
> > and emails will create a lot of noise which is hard to follow. In a perfect
> > non-distributed world I'd call you to a half-day visioning workshop in a
> > room and focus on the whiteboard. Not possible with this level of
> > distribution. The next option would be a (large) ~4h conference call, which
> > is hard to schedule in a time zone matching the sleep cycle for all. It
> > would be perfect if the Summit were close and we could plan a 1/2 day or
> > full-day breakout for contributors on Day4 or so. But September is far
> > far away.
> >
> > Therefore - to reduce the amount of emails - I propose to collect points,
> > ideas, pain points etc. first on a Confluence page. I tried to start one
> > page as a starting point (contrary ideas welcome!) to have a place to
> > collaborate and sketch. A virtual whiteboard would also be OK but I had
> > none at hand to share... (like Miro, Mural etc.). If we collect ideas,
> > points etc. on this page we can have a rather short (2h) call with
> > contributors in the near future to pitch and discuss the points and define
> > follow-up steps towards a plan, vote and conclusion.
> >
> > Proposed Confluence discussion page:
> > https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+3.0+Discussion+and+Planning
> >
> > As a starting point I tried to import both emails I saw in the thread
> > into the page. As it is a call to collaborate, please start
> > editing and drop your points as well.
> >
> > Regarding the trigger points Jarek mentioned:
> > The AIP-68 and AIP-69 I dropped are something that in my view do
> > NOT require Airflow to get to 3.0. I would see them as either "Tactical" or
> > "just functional enhancements". AIP-68 is "just" a bit of sugar for the UI
> > and extensions to the Plugin interface in my view. AIP-69 is basically
> > building something on top, based on the concept of Hybrid Executors. As
> > long as we assume AIP-69 does not need drastic changes, maybe only small
> > adjustments in the core (the concept is not elaborated yet), I see this
> > mainly as "just another Executor" that should not need breaking changes. I
> > did not want to drop these two AIPs to start a fundamental discussion but
> > rather to bring in one new feature each.
> > The points that are hard to achieve in the Airflow 2.x world are
> > rather "Multi Tenancy/Team" and "Dag Versioning", which in my eyes might
> > be able to move faster with a 3.0.
> >
> > P.S.: I do not get the point (yet?): why is GenAI a trigger point that
> > forces structural breaking changes?
> >
> > Best regards
> >
> > Jens Scheffler
> >
> > Alliance: Enabler - Tech Lead (XC-AS/EAE-ADA-T)
> > Robert Bosch GmbH | Hessbruehlstraße 21 | 70565 Stuttgart-Vaihingen |
> > GERMANY | www.bosch.com
> > Tel. +49 711 811-91508 | Mobil +49 160 90417410 |
> > jens.scheff...@de.bosch.com
> >
> > Registered office: Stuttgart; Registration court: Amtsgericht Stuttgart,
> > HRB 14000;
> > Chairman of the Supervisory Board: Prof. Dr. Stefan Asenkerschbaumer;
> > Management: Dr. Stefan Hartung, Dr. Christian Fischer, Dr. Markus
> > Forschner, Stefan Grosch, Dr. Markus Heyn, Dr. Frank Meyer,
> > Dr. Tanja Rückert
> >
> > -----Original Message-----
> > From: Vikram Koka <vik...@astronomer.io.INVALID>
> > Sent: Saturday, April 20, 2024 6:23 PM
> > To: dev@airflow.apache.org
> > Subject: Re: [HUGE DISCUSSION] Airflow3 and tactical (Airflow 2) vs
> > strategic (Airflow 3) approach
> >
> > A wonderful and exciting Saturday morning discussion!
> > Thank you Jarek for bringing the offline conversations into the mailing
> > list.
> >
> > I completely agree on the necessity of Airflow 3.
> > I also agree that Gen AI is the trigger i.e. the answer to "Why now"?
> >
> > Having thought about this for a while from a strategic perspective,
> > as opposed to the tactical perspective of the bi-weekly and monthly
> > releases, I believe that our thinking, as you articulated, should clearly
> > distinguish strategic from tactical. However, I don't believe our
> > execution necessarily needs to be either-or; it can actually be blended.
> >
> > With that said, I believe that there are the following four buckets that
> > we should use as a framework for Airflow 3.
> >
> > 1. Gen AI / LLM support
> > 2. Airflow User Improvements
> > 3. Easy adoption of Airflow by new users
> > 4. Integration improvements / Provider maintainability
> >
> > Describing them in more detail below:
> > 1. Gen AI / LLM support
> > Reiterating the fact that this needs more work, I do believe this can be
> > incremental to Airflow. At Astronomer, we have worked on the LLM Providers
> > which we contributed to Airflow late last year. But clearly, there is so
> > much more to do, both in building awareness of the patterns / templates to
> > use and in supporting patterns in Airflow to make these easier to use
> > and adopt.
> >
> > 2. Airflow User Improvements
> > Clearly features and improvements desired by the Community are important
> > to continue to work on to make Airflow more approachable. The top
> > features which leap to mind for me here are:
> > 2.1 DAG Versioning - the most requested feature in the Airflow User Survey
> > 2.2 Modern UI - also comes up a lot
> > 2.3 Different DAG distribution processes
> > 2.4 Different execution mechanisms
> > I know there are many more which I don't currently recall.
> >
> > 3. Airflow adoption
> > We have discussed this many times, but we absolutely need to make the
> > individual first-time adoption of Airflow better.
> > I think the most common term I recall here is the notion of "Airflow
> > Standalone", but whatever the term may be, an ultra quick, simple install
> > of Airflow and the getting started experience is something we owe our
> > community.
> >
> > 4. Integration / Providers
> > The changes we made as part of Airflow 2.0 to split the Core Airflow
> > releases from the Provider releases were clearly a good choice and made a
> > huge impact. However, the integration maintainability balanced with growth
> > still seems like it could use a significant set of improvements. Elad and I
> > spoke about this a couple of days ago as well and I don't have a clear set
> > of next steps here, but it is definitely worth exploring.
> >
> > Some of us at Astronomer have been discussing this quite a bit and are
> > planning to bring a more polished draft to the community, but an initial
> > discussion on a Saturday is fun as well :). We will definitely share our
> > Airflow 3 proposal as a document with the community within the next week,
> > as a request for comment.
> >
> >
> >
> > On Sat, Apr 20, 2024 at 1:50 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> >
> > > Hello here,
> > >
> > > I have been thinking a lot recently and discussing with some people
> > > and I am more and more convinced it's about time we - as a
> > > community - should start making changes with the current "Airflow 2"
> > > and the future "Airflow 3" in mind.
> > >
> > >
> > > *TL;DR: I think we should seriously start work on Airflow 3 and decide
> > > what it means for our AIPs - to treat some of them as more "tactical"
> > > - things that should go into Airflow 2 and some "strategic" ones -
> > > being foundational for Airflow 3 - with different goals and criteria.*
> > >
> > > A lot of us already think that way and a lot of us have already talked
> > > about it for quite some time, so you should treat my mail mostly as a
> > > little trigger "let's start publicly discussing what it might mean for
> > > us and our community and let's make it clear about the target of the
> > > initiatives we do".
> > >
> > > Some might be surprised it comes from me as I've been often saying "no
> > > Airflow 3 without a good reason" or "possibly we will have no Airflow
> > > 3", but I think (and a number of people I spoke to have similar
> > > opinion) we have plenty of reasons to make some bold moves now.
> > >
> > > Over the last 4 years since Airflow 2 was out, a lot has changed and
> > > we have a number of different needs that the current Airflow 2 cannot
> > > **really** serve well:
> > >
> > > - LLM/Gen-AI mainly as the important trigger
> > > - Cloud Native is the "way to go"
> > > - need to submit DAGs in other ways than dropping them to a shared DAG
> > > folder.
> > > - local testing and fast iteration on developing pipelines.
> > > - ability to run tasks within a workflow with "affinity" so that they can
> > > share inputs/outputs in shared CPU/GPU memory
> > > - ability to integrate seamlessly with other workflow engines - making
> > > Airflow a "workflow of workflows"
> > > - probably way more
> > > - all that while keeping a lot of the strengths of Airflow 2 - such as
> > > continuing to have the option of using the many thousands of operators
> > > with 90+ providers.
> > >
> > > All those above - we could implement better if we get rid of some
> > > of the implicit or explicit luggage we have in Airflow 2. I think the
> > > last two proposals from Jens: AIP-68 and AIP-69 reflect very much that
> > > - both would have been much easier and more straightforward if we got
> > > Airflow 3 re-designed basically at a drawing board, boldly
> > > dropping some Airflow 2 assumptions,
> > > and if we implemented core Airflow 3 - taking the best part of what we
> > > have now in Airflow 2, but generally dropping the luggage in a new
> > > framework.
> > >
> > > And it won't be possible without breaking some fundamental assumptions
> > > and making Airflow 3 quite heavily incompatible with Airflow 2
> > >
> > > From "my" camp - dropping the need of having the 700+ dependencies for
> > > Airflow + all providers in a single Python interpreter, dropping the
> > > dependency on Flask/Plugins/FAB would be a huge win on its own. Not to
> > > mention being able to split providers' development and contribution
> > > from airflow core (while keeping the development of and contributions
> > > to providers going) - this has been highly requested.
> > >
> > > And I think we have a lot of people in our community who would be able
> > > (and would love) to do it - I think a number of us (including myself)
> > > are a bit burned out and tired of just maintaining things in Airflow
> > > in a backwards-compatible way and would jump at the opportunity to
> > > rebuild Airflow.
> > >
> > > But - we of course cannot forget about Airflow 2 users. We do not want
> > > to "stop the world" for them. We want to keep fixing things and adding
> > > incremental changes - and those things do not necessarily have to be super
> > > "future-proof". They should help to "keep the lights on" for a while
> > > - which means that in a number of cases it could be a "band-aid". AIP-44
> > > (internal-API), AIP-67 (multi-team) are more of those.
> > >
> > > So - what I think we might want to do as a community:
> > >
> > > * start working on Airflow 3 foundations (and decide what it means for
> > > our users and developer community). Decide what to keep, what to drop,
> > > what to redesign, assumptions to recreate.
> > >
> > > * explicitly split the initiatives/AIPs we have to target Airflow 2
> > > and Airflow 3 and treat them a bit differently in terms of
> > > future-proofness
> > >
> > > I would love to hear your thoughts on that (bracing for the storm of
> > > those).
> > >
> > > J.
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > For additional commands, e-mail: dev-h...@airflow.apache.org
> >
>
