Hi all,

I've been following Airflow development fairly actively for over a year. In
that time, the company I work at (Quantopian) has gone all-in on Airflow.
It's a core part of our business and required for daily operations.

However, I've had some concerns over the future of the project. Some of
these concerns stem from how difficult it is to contribute to Airflow:

- There are a lot of users of Airflow, but their use cases and feature
usage are not well described. Something that seems trivial or unnecessary
to one user turns out to be what someone else's entire workflow depends on.

- The Airflow JIRA feels completely unmaintained. Most of the issues I've
reported have never even been acknowledged, and it's hard to know what
versions an issue applies to. This makes it hard to know what to work on or
what would be most impactful to other users.

- Hacking on Airflow is challenging, especially if you need to run a real
workload to examine your changes. (I saw the work for an improved local dev
process - great stuff!)

- Keeping track of what's on master vs. what's in a release is challenging,
particularly since so many commits are for operators we'll never use. (I
know there's some discussion about breaking operators into their own repos,
and I hope that goes through.)

- The PMC members are too busy to guarantee timely reviews, and rebasing is
extremely costly with how much code reorganization is happening. This
strongly discourages putting in time to develop anything other than
relatively isolated changes, which are usually new features.

A lot of the problems that Quantopian experiences with Airflow can't be
tackled without either "hacks" on top of Airflow or deep reworkings of
Airflow components. But that kind of rework is very challenging to
implement with the current Airflow contribution process.

I'm glad that we've recently adopted AIPs, but the way we're using them
seems better suited to planning isolated features. The Airflow project does
not have a well-maintained roadmap, nor any mechanism to produce one by
weighing AIPs based on synergy vs. developer interest vs. user interest.

I think that this lack of long-term planning makes it even more challenging
to propose larger reworks that might require multiple AIPs to implement,
each of which individually might yield little benefit. I worry that we may
approve a series of "promising" AIPs that, taken together, don't amount to
anything greater than a "pile of new features", instead of balancing
feature improvements with platform improvements that will unlock more
fundamental changes to how Airflow can work.

I'd like to see some discussion of what it would look like to set long-term
goals for Airflow. What is Airflow 2 going to look like? How much backwards
compatibility will it break? When should we expect Airflow 3? Are they going
to be "business as usual" releases, or will they embrace new concepts or
idioms? Will there be a true container-native or cloud-native version of
Airflow? Will we work to be better for current users, or to embrace new
classes of users?

I have some thoughts of my own, of course, but I'd like to hear what other
people have to say on this topic first!
