TL;DR: Yes. If you use our reference Docker images, you are already following the "How to reproducibly install Airflow" recommendation, because that is exactly what we do when preparing a release, and because those images are generally "frozen" in time once released.
*A slightly longer explanation, if you care:*

The Dockerfiles are convenience packages (also called reference images) that we publish for our users' convenience. They are not an "official source artifact"; they simply package the wheel packages and the system packages that were available at the time of release, on top of a Python debian-bullseye base image (as of Airflow 2.8.0, assuming lazy consensus is reached tomorrow morning, a Python debian-bookworm base image, following our policies; see this LAZY CONSENSUS thread on the devlist: https://lists.apache.org/thread/gcy143nqodf8dqbjxo2xt5gq4npv334p).

They are just that: a conveniently packaged installation consisting of a Python Debian base image, the necessary system packages, and, yes, what you refer to: Airflow packages with a pre-selected provider list, installed via `pip` using constraints. So the way we build and publish those images already strictly follows the reproducible installation recommendation I repeated.

In fact, it is even a bit more "reproducible" than a plain reproducible `pip` installation, because once we publish the images we generally (barring truly exceptional situations) do not update the "released" reference images. They are basically frozen in time. If someone wants to update them (for example, to upgrade packages that received security fixes), they should take our reference image and upgrade whatever they want themselves, because our reference Docker images do not change once released. (Of course, the best way to get the latest compliant packages and security fixes is to upgrade to the latest released image when it comes out.) This is also nicely described in the Docker image documentation: https://airflow.apache.org/docs/docker-stack/index.html#fixing-images-at-release-time
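For illustration, upgrading selected dependencies on top of a released reference image could look like the following sketch. The image tag (2.7.3) and the package names are examples only, not a recommendation for what to upgrade:

```shell
# Write a minimal Dockerfile extending a released reference image.
# The tag and the upgraded packages below are placeholders - adjust to your setup.
cat > Dockerfile <<'EOF'
FROM apache/airflow:2.7.3
# Upgrade selected dependencies to pick up fixes released after the image was frozen
RUN pip install --no-cache-dir --upgrade cryptography requests
EOF

# Build your own image from it (commented out here; requires a Docker daemon):
# docker build -t my-company/airflow:2.7.3-patched .
```

The "Building the image" section of the docker-stack documentation describes the supported ways to customize and extend the reference images in more detail.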
Quoting the documentation here for convenience:

-------------------------------------

*Fixing images at release time*

The released "versioned" reference images are mostly fixed when we release an Airflow version, and we only update them in exceptional circumstances, for example when we find out that there are dependency errors that might prevent important Airflow or embedded providers' functionality from working. In normal circumstances the images are not going to change after release, even if new versions of Airflow dependencies are released, not even when those versions contain critical security fixes. The process of Airflow releases is designed around upgrading dependencies automatically where applicable, but only when we release a new version of Airflow, not for already released versions.

If you want to make sure that the Airflow dependencies in the image you use are upgraded to the latest released versions containing the latest security fixes, you should implement your own process to upgrade them yourself when you build a custom image based on the Airflow reference one. Airflow usually does not upper-bound the versions of its dependencies via requirements, so you should be able to upgrade them to the latest versions, usually without any problems, and you can follow the process described in "Building the image" to do it (even in an automated way).

Obviously, since we have no control over what gets released in new versions of the dependencies, we cannot give any guarantees that the tests and functionality of those dependencies will be compatible with Airflow after you upgrade them. Testing whether Airflow still works with them is in your hands, and in case of any problems you should raise an issue with the authors of the problematic dependencies.
You can also, in such cases, look at the Airflow Issues, Airflow Pull Requests and Airflow Discussions, searching for similar problems, to see if there are any fixes or workarounds found in the main version of Airflow, and apply them to your custom image.

The easiest way to keep up with the latest released dependencies is, however, to upgrade to the latest released Airflow version, by switching to the newly released images as the base for your images when a new version of Airflow is released. Whenever we release a new version of Airflow, we upgrade all dependencies to the latest applicable versions and test them together, so staying up to date with the latest version of Airflow is the easiest way to keep up with those tests as well.

-------------------------------------

J,

On Sun, Nov 5, 2023 at 6:41 PM Herve Ballans <herve.ball...@ias.u-psud.fr> wrote:

> Dear Jarek,
>
> Thank you for this really useful recommendation!
>
> But I would just like to be sure of something: when you say that `pip`
> is the only way to install Airflow in a reproducible way, do you mean
> compared to installation from sources?
>
> I guess installation from Docker images is also recommended, right?
> (Unless I'm wrong, an official Airflow Docker image uses `pip` for
> installing Airflow, including the constraints file for the
> dedicated version.)
>
> Here we've been using the Docker Compose installation for years without
> encountering any major problems, even in the case of upgrades...
>
> Best,
> Hervé
>
> On 04/11/2023 12:57, Jarek Potiuk wrote:
> >
> > If you want to make sure released Airflow installs in a reproducible
> > way from scratch, now and in the future, the only way to
> > achieve that is described here:
> >
> > https://airflow.apache.org/docs/apache-airflow/stable/installation/installing-from-pypi.html
> >
> > It involves using constraints. It only works with `pip`.
> > There are no other ways, and this cannot be achieved easily with
> > other tools, so we strongly recommend you use `pip` when installing
> > Airflow.
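For reference, the constraints-based installation the quoted message points to follows the pattern from the official "Installation from PyPI" docs. A sketch (the Airflow version here is an example; the actual `pip install` is commented out because it downloads from PyPI):

```shell
# Reproducible Airflow install via pip with constraints (version is an example).
AIRFLOW_VERSION=2.7.3
# Constraints files are published per Python minor version.
PYTHON_VERSION="$(python3 -c 'import sys; print(f"{sys.version_info.major}.{sys.version_info.minor}")')"
CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"

# The actual install step (run it yourself; it fetches packages from PyPI):
# pip install "apache-airflow==${AIRFLOW_VERSION}" --constraint "${CONSTRAINT_URL}"
echo "${CONSTRAINT_URL}"
```

Because the constraints file pins every transitive dependency to the exact versions tested at release time, running this today or a year from now yields the same set of packages.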