Hello everyone,

I iterated quite a bit on the PR and I think it's ready for an even
more serious review:  https://github.com/apache/airflow/pull/36537 . I
solved all of the TODOs and teething problems and while it likely
still has some tests to fix, all the build and packaging pieces, local
installation and even developer/contributor documentation should be
already in the state that is ready for serious scrutiny. Thanks to
Jens and TP for the reviews so far - I addressed all of the comments
already - and there are just 2 conversations remaining.

See this comment for a status summary:
https://github.com/apache/airflow/pull/36537#issuecomment-1880193452

BTW, I found it really useful to follow the "unresolved conversation"
routine - it's really nice to see such things as a summary (see
attachment) and be able to see that there are still 2 conversations to
resolve.
That's the in-progress experiment with conversations which I
personally like a lot so far. It already saved me from merging a PR
that still had things to resolve.

J.

On Thu, Jan 4, 2024 at 8:04 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>
> I slept on it for a few nights and stepped away from it, and I have an
> idea to simplify it quite a bit - i.e. cut the number of extras by half
> and have virtually zero impact on the current editable installation, so
> you might want to hold off a bit on that (unless you want to see it
> changing :) ). The whole concept won't change, I just realized that I do
> not need to add new `editable_` extras to achieve the same effect.
>
> I will also attempt to split it a bit to make it easier to review.
>
> Hold tight :) - but also feel free to look and comment even now :)
>
> And yes. Exciting. It kept me awake a night or two where I could not
> get to sleep until I finally got it working :D
>
> J
>
> On Thu, Jan 4, 2024 at 6:52 PM Pierre Jeambrun <pierrejb...@gmail.com> wrote:
> >
> > I personally think that this is a great idea. I have been following the
> > hatch project for a while and I am convinced it has a lot to offer for
> > airflow. The two big pros for me are its ease of use (backend and front
> > end) as well as the security aspects it covers (reproducible builds to
> > name one).
> >
> > I will take a look at the PR later this week, but it definitely sounds
> > exciting.
> >
> >
> >
> > On Tue 2 Jan 2024 at 20:26, Jarek Potiuk <ja...@potiuk.com> wrote:
> >
> > > Hello everyone.
> > >
> > > TL;DR: I have a proposal to adopt Hatchling as a build backend (and
> > > recommend, but not require, Hatch as a frontend) for Airflow as our way
> > > of switching to the PEP-standard-compliant pyproject.toml way of
> > > installing Airflow (including local venvs) and building the Airflow
> > > package.
> > >
> > > I have a working implementation that needs polishing (taking a few
> > > less important decisions and resolving rather simple TODOs). Here is
> > > the draft PR:
> > > https://github.com/apache/airflow/pull/36537
> > >
> > > I've spent the better part of the Xmas/New Year's break on implementing
> > > it - something that we've been discussing for - literally - years.
> > > Several people (including myself) made several unsuccessful attempts
> > > in the past at standardising the Python packaging/build
> > > process for Airflow to use modern standard-driven tooling.
> > >
> > > I think I succeeded. Finally.
> > >
> > > In short, what it means:
> > >
> > > When this change is merged, Airflow will have a nice, slick and
> > > modern, standard-compliant contributor experience - with editable
> > > installation that will **just work**, that will work with multiple
> > > build front-ends and it will make it very easy to install and manage
> > > local virtualenv(s) to contribute to Airflow. The extras structure and
> > > airflow configuration will be in one place (pyproject.toml) and it
> > > will be much easier to reason about our extras and dependencies. As a
> > > bonus point - with tools like Hatch, contributors will get the
> > > canonical way of managing local virtualenvs for Airflow development
> > > and a very easy recommended way to manage both Python and Venvs (but
> > > without forcing a single frontend).
> > >
> > > From the user perspective, Airflow packages will be more standardised,
> > > with just user extras defined. For maintainers and PMC members, we
> > > will get reproducible builds (similar to what we have now for Providers)
> > > - which means that it will be easier and more robust to verify the
> > > provenance of the packages (security!)
> > >
> > > Why can we do it now when we could not do it before?
> > >
> > > This is mostly thanks to the Herculean efforts of the Python Packaging
> > > team (hats off to TP for being part of the team and leading a lot of
> > > standardisation efforts there). After a few years of relentless
> > > introduction and implementation of many PEPs and releasing new tooling
> > > (particularly Hatch, but also Flit that we already use for providers),
> > > it seems Airflow can finally move away from a very complex, completely
> > > custom setup.py and from setuptools being abused by us in ways that
> > > its authors and the Packaging team did not originally anticipate.
> > >
> > > What problems does the change solve?
> > >
> > > My PR solves all the difficult requirements of our custom solution,
> > > but also (mostly thanks to standardisation efforts by the packaging
> > > team), it improves on a lot of problems we could not solve.
> > >
> > > Happy to have a detailed discussion here, and a more detailed one in
> > > the PR (I added a lot more context and documentation showing how this
> > > will work when we merge it), but here is the list of things such a move
> > > provides:
> > >
> > > * We are using the hatchling build backend, which follows the
> > > appropriate PEP standards and works with any "frontend" you choose to
> > > install and manage your local installation. (You can use modern Hatch,
> > > which is the counterpart to hatchling - highly recommended - but it will
> > > also work with just pip, poetry, flit, and any other standard-compliant
> > > tool in the future.) No habits of the contributors need to be changed,
> > > it will **just** work
> > >
> > > * our editable installation has been broken for some time (mostly
> > > because we were abusing setuptools and setup.py A LOT). See
> > > https://github.com/apache/airflow/issues/30764 . This change puts the
> > > shine back on being able to make editable install of airflow work as
> > > expected and getting a first-class experience for contributors with
> > > local virtualenvs
> > >
> > > * all Airflow package configuration is now merged into a single
> > > appropriate PEP-compliant pyproject.toml - no more setup.py,
> > > setup.cfg, MANIFEST.in.
> > >
> > > * the extras are refactored and organized into logical groups and
> > > start to make sense. I introduced new "editable" extras to allow you
> > > to easily install provider dependencies locally and reorganized devel
> > > extras to make it easy to understand what you should install in your
> > > editable environment to run tests. More importantly those "devel"
> > > extras - while present in pyproject.toml are stripped off (thanks to
> > > custom hooks) from the final package - so final package has just
> > > things that are important to our users
> > >
> > > * we use pre-commit to automatically take provider.yaml dependencies
> > > and merge them into pyproject.toml - thanks to that, provider.yaml will
> > > remain the single source of truth for provider configuration, while it
> > > also allows "one local installation to develop them all together" - and
> > > in a very seamless way.
> > >
> > > * no more INSTALL_PROVIDERS_FROM_SOURCES hack when you install airflow
> > > for local development. I figured out a nice way to avoid installing
> > > pre-installed providers, and to make it super-easy to install
> > > dependencies of providers in an editable installation (hint: `pip
> > > install -e .[editable_google]`). This is thanks to custom build hooks
> > > that the PEP standardized.
> > >
> > > * I also recommend Hatch as a Python/Venv management tool and used it
> > > for testing - it's a great tool for managing both Python
> > > installations and virtualenvs. For many people, providing
> > > such a canonical way (while following the standards and not forcing
> > > Hatch) will be really great to simplify their local environment
> > > installation.
> > >
> > > * Hatchling supports reproducible builds out-of-the-box, which is
> > > great for security - and it will make our package generation much
> > > safer and easier to verify (as we do with our providers now).
> > >
> > > There are many more details and thoughts (and also some possible
> > > future developments) that I am aware of, but this mail is already
> > > too long, and we can discuss them in the thread/PR or future threads.
> > >
> > > Happy to take any questions, critique, proposals and feedback - I got
> > > quite deep into how modern package building works so I likely made
> > > some mistakes / bad assumptions or things can be improved or maybe we
> > > can take other directions.  It will take some time to merge and
> > > discuss details, and if this one gets approved it's likely going to be
> > > targeted for Airflow 2.9.
> > >
> > > J.
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > > For additional commands, e-mail: dev-h...@airflow.apache.org
> > >
> > >
