Hey here,

I have a first PR (very much a draft, still requiring a number of changes) for
the final step of the big refactoring of our projects to use a workspace. This
is to let you know about the changes that are coming (so please take a look at
the consequences, so that you are not surprised).

This is the most *scary* one -> moving all airflow code to "airflow-core".
I have a draft version of it in
https://github.com/apache/airflow/pull/47798

And it's not for the faint of heart :)

Note! It's not yet complete, and unless you have some general comments, it's
likely not worth pointing at individual changes (yet) - it's more to take a
look at how things will look eventually. I will work over the next two
days to get it to a reviewable state, and will keep it rebased and running
until the middle of next week. I would like to have it ready (including the
release process) for the fourth (and final?) beta.

Some resulting packaging changes:

*FOR DEVELOPMENT:*

* the pyproject.toml in the "root" of Airflow is still the "apache-airflow"
package - but this will be an empty "meta" package that installs
"apache-airflow-core" and "apache-airflow-task-sdk" together, and optionally
providers (via extras)

* airflow-core is a new "apache-airflow-core" distribution, where only
airflow dependencies and airflow "core" extras are configured (smtp, otel,
pandas, rabbitmq etc.) - I will likely clean up some of those as well, since
some of them are not needed. The nice thing is that this package has all
its dependencies static (no hatch_build.py - everything is in pyproject.toml),
which is pretty cool and allows us to make better use of dependabot for
security upgrades and notifications

The airflow-core structure is pretty standard:

airflow-core              # <- folder containing the airflow-core distribution
    |- src
    |    \- airflow       # <- the "airflow" package
    |         |- api
    |         |- api_fastapi
    |         |- assets
    |         ...
    |- tests
    |    |- always
    |    |- api
    |    ...
    |- docs
    |- pyproject.toml
    \- README.md


* for development - I will describe the `pypi` way later, but with `uv`
things get simpler and we have a few new options (Dennis - this is a
continuation of the discussion on the uv sync commands, so it's worth
a close look):

There are a number of ways you will (eventually) be able to interact with
the venv. After you check out Airflow, you can change the working directory
and work on different packages, and depending on which directory you run
`uv sync` in, uv (using the workspace feature) will sync the **expected**
dependencies.

It's best to get used to the fact that instead of one airflow project we
will have ~100 pretty independent projects. While you can continue
working with all of them as a single huge "workspace", it is generally way
more convenient to change directory to the "distribution" you are currently
working on and do everything there, with an isolated set of dependencies
required only for that "distribution". "airflow-core", "task-sdk",
"providers/amazon", "providers/mongo" - those are all separate
distributions, and more and more we will be able to treat them as
independent projects. We will still conveniently keep the option to develop
and run tests in a joined "workspace" environment at the top of the project,
where we can install and test everything together - that's a bit of `uv
workspace` magic in play.
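
To get a feel for how the checkout breaks down into distributions, here is a
rough Python sketch. It is an illustration only: it assumes that every
distribution is simply a directory with its own pyproject.toml, and the
`find_distributions` helper is hypothetical (it is not part of the PR or of uv).

```python
from pathlib import Path


def find_distributions(root: str) -> list[str]:
    """Return relative paths of directories under `root` holding a pyproject.toml.

    Assumption for this sketch: one pyproject.toml means one distribution.
    Hidden directories (for example .venv or .git) are skipped.
    """
    root_path = Path(root)
    found = []
    for pyproject in root_path.rglob("pyproject.toml"):
        rel = pyproject.parent.relative_to(root_path)
        # Skip anything located inside a hidden directory.
        if any(part.startswith(".") for part in rel.parts):
            continue
        found.append(rel.as_posix())
    return sorted(found)


if __name__ == "__main__":
    # Run from the top of the checkout to list candidate distributions.
    for distribution in find_distributions("."):
        print(distribution)
```

Each printed path is a directory you could `cd` into before running `uv sync`.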

Here are typical patterns:

1) Installing all development dependencies for everything (i.e. a complete
environment like in breeze) - allows you to run all tests for all of airflow
and all providers

cd .
uv sync --all-packages

2) Installing just airflow core with required dependencies (ready for most
core tests)

cd airflow-core
uv sync

3) Installing airflow core with optional dependencies (should allow you to
run all core tests - including those for optional core features such as otel
etc.)

cd airflow-core
uv sync --all-extras

4) Installing individual provider dependencies (say amazon) - this allows
you to run all tests of the provider you are working on, including installing
all dependencies coming from cross-provider dependencies (i.e. if you have
google tests in the amazon provider, it will also install the necessary google
dependencies).

cd providers/amazon
uv sync
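
The four patterns above differ only in the working directory and in one
optional flag. A minimal Python sketch of that idea follows. The helper names
`uv_sync_command` and `sync_distribution` are hypothetical; the only things
taken from the patterns above are the `uv sync` command and its
`--all-packages` and `--all-extras` flags.

```python
import subprocess
from pathlib import Path


def uv_sync_command(all_packages: bool = False, all_extras: bool = False) -> list[str]:
    """Build the `uv sync` argument list for one of the patterns above."""
    cmd = ["uv", "sync"]
    if all_packages:
        cmd.append("--all-packages")  # pattern 1: everything in the workspace
    if all_extras:
        cmd.append("--all-extras")  # pattern 3: include optional extras
    return cmd


def sync_distribution(repo_root: str, distribution: str = ".", **flags: bool):
    """Run `uv sync` inside the given distribution directory (requires uv)."""
    cwd = Path(repo_root) / distribution
    return subprocess.run(uv_sync_command(**flags), cwd=cwd, check=True)
```

For example, `sync_distribution(".", "providers/amazon")` would correspond to
pattern 4, and `sync_distribution(".", all_packages=True)` to pattern 1.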

Generally speaking - "airflow-core" will (eventually) become a truly
airflow-only distribution. It will have a few dependencies on the "standard"
and "fab" providers - but I hope we will be able to get rid of those during
the resulting cleanup.

The IDE (IntelliJ) setup will just require "airflow-core/src" and
"airflow-core/tests" to be marked as source/test roots, as usual for other
distributions.

I will update the docs after I complete the PR. There are some small
variations on when to install which extras, and I will play with them a bit to
get to the best developer experience with the fewest surprises.

*FOR USERS*

For "installable" airflow (i.e. the user's experience) - the changes will be
pretty much 100% transparent. When a user installs "apache-airflow" or
"apache-airflow[google]", things will work as they did before - only
instead of one "apache-airflow" distribution, they will have
"apache-airflow", "apache-airflow-core" and "apache-airflow-task-sdk"
installed.
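
One way to see this split in an installed environment is to query the
distribution metadata. The sketch below uses only the standard library
(`importlib.metadata`); the `installed_versions` helper is hypothetical, and the
three distribution names are the ones listed above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Distribution names as described above.
AIRFLOW_DISTRIBUTIONS = (
    "apache-airflow",
    "apache-airflow-core",
    "apache-airflow-task-sdk",
)


def installed_versions(names=AIRFLOW_DISTRIBUTIONS) -> dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    result: dict[str, Optional[str]] = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    for name, found in installed_versions().items():
        print(f"{name}: {found or 'not installed'}")
```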

Regarding version numbers etc., I will start a separate discussion later
next week, after we see how those packages interact ("apache-airflow"
will only contain extras, but for compatibility reasons we likely want to
pin "apache-airflow" and "apache-airflow-core" to each other, so that
users will be able to upgrade "core" by upgrading "apache-airflow" - we
likely do not want to change those habits).

The "apache-airflow-task-sdk" will be versioned separately.

Please take a look, also at the PR, and see if you have any big
issues/questions/doubts. Let's start the discussion here - I am happy to
answer all general questions and adapt the PR in response to
questions/suggestions.

In the meantime I will be working on making the PR green and adding missing
bits and pieces for the release process.

J.
