potiuk commented on code in PR #35086: URL: https://github.com/apache/airflow/pull/35086#discussion_r1367733237
##########
dev/breeze/src/airflow_breeze/utils/cdxgen.py:
##########
@@ -217,28 +231,41 @@ def get_requirements_for_provider(
     )
 
 
-def build_all_airflow_versions_base_image(python_version: str):
+def build_all_airflow_versions_base_image(
+    python_version: str,
+    confirm: bool = True,
+):
     """
     Build an image with all airflow versions pre-installed in separate virtualenvs.
     """
     image_name = f"apache/airflow-dev/all_airflow_versions/python{python_version}"
     image_name = get_all_airflow_versions_image_name(python_version=python_version)
     dockerfile = f"""
 FROM ghcr.io/apache/airflow/main/ci/python{python_version}
-RUN pip install --upgrade pip
+RUN pip install --upgrade pip --no-cache-dir
 # Prevent setting sources in PYTHONPATH to not interfere with virtualenvs
 ENV USE_AIRFLOW_VERSION=none
+ENV START_AIRFLOW=none
 """
-    for airflow_version in get_active_airflow_versions():
+    compatible_airflow_versions = [
+        airflow_version
+        for airflow_version, python_versions in AIRFLOW_PYTHON_COMPATIBILITY_MATRIX.items()
+        if python_version in python_versions
+    ]
+
+    for airflow_version in compatible_airflow_versions:
         dockerfile += f"""
 # Create the virtualenv and install the proper airflow version in it
 RUN python -m venv /opt/airflow/airflow-{airflow_version} && \
-/opt/airflow/airflow-{airflow_version}/bin/pip install --upgrade pip && \
+/opt/airflow/airflow-{airflow_version}/bin/pip install --no-cache-dir --upgrade pip && \
 /opt/airflow/airflow-{airflow_version}/bin/pip install apache-airflow=={airflow_version} \
 --constraint https://raw.githubusercontent.com/apache/airflow/\
 constraints-{airflow_version}/constraints-{python_version}.txt
 """
-    run_command(["docker", "build", "--tag", image_name, "-"], input=dockerfile, text=True, check=True)
+    build_command = run_command(

Review Comment:
Here are some comments on how to make the images "shareable" in ghcr.io and usable for cache builds (in case you would like to pursue it @pierrejeambrun):

1. The name should be similar to the names we currently use for CI images - see for example https://github.com/apache/airflow/pkgs/container/airflow%2Fmain%2Fci%2Fpython3.8

   For example: `ghcr.io/apache/airflow/airflow-dev/all-airflow/python3.8:latest`

2. The image should have the right label set to indicate that it is part of the apache/airflow project: `org.opencontainers.image.source="https://github.com/apache/airflow"`

   See https://github.com/apache/airflow/blob/dc2e8522bfb377daf6e2e9cf19b265f13dfe41c1/Dockerfile.ci#L1546

   This will make the image "belong" to the apache/airflow project in GitHub.

3. Once the images are built with the right name and label, they should be pushed from the local machine to ghcr.io, where they should then appear. This requires a "write package"-capable token, generated via GitHub developer settings, and running `docker login ghcr.io` with it.

   `docker push ghcr.io/apache/airflow/airflow-dev/all-airflow/python3.8:latest`

   Those are generally just 6 images, so we can push them individually.

4. The person who pushed the image (the owner) should go to the package settings (bottom right of the package details page, found via https://github.com/orgs/apache/packages?repo_name=airflow) and:

   a) change visibility to public if it is private
   b) mark it as `Inherit access from source repository (recommended)` in case it is not set

   ![image](https://github.com/apache/airflow/assets/595491/74c0ef4a-8555-4ce0-a751-1524c5c0c382)

5. Finally, the `docker build` command below should be changed to `docker buildx build --cache-from {image-name}`. This will reuse the layers of the remote image in ghcr.io: rather than building them locally (which takes a lot of time and network), it will pull the layers instead (which only uses the network).

I think once we do it, any of us will be able to get the "all-airflow" images quite a bit faster (provided the network is fast).

Let me know @pierrejeambrun if you would like to follow it up this way :) (happy to do it myself too).
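
To make steps 1, 2, 3 and 5 a bit more concrete, here is a rough sketch of how the commands could be composed in Python. The helper names (`shareable_image_name`, `build_command`, `push_command`) and the constants are hypothetical, made up for illustration and not part of breeze. The sketch only builds the argument lists and does not run Docker, and it assumes `docker login ghcr.io` has already been done with a suitable token.

```python
from __future__ import annotations

# Hypothetical constants, following the naming and label suggested in the comment above.
REGISTRY_PREFIX = "ghcr.io/apache/airflow/airflow-dev/all-airflow"
SOURCE_LABEL = "org.opencontainers.image.source=https://github.com/apache/airflow"


def shareable_image_name(python_version: str) -> str:
    """Return the ghcr.io image name for a given Python version (step 1)."""
    return f"{REGISTRY_PREFIX}/python{python_version}:latest"


def build_command(python_version: str) -> list[str]:
    """Compose a buildx command that labels the image and reuses remote layers (steps 2 and 5).

    The trailing "-" means the Dockerfile is read from stdin, as in the existing code.
    """
    image_name = shareable_image_name(python_version)
    return [
        "docker", "buildx", "build",
        "--cache-from", image_name,  # try to reuse layers of the image already in ghcr.io
        "--label", SOURCE_LABEL,     # ties the package to the apache/airflow repository
        "--tag", image_name,
        "-",
    ]


def push_command(python_version: str) -> list[str]:
    """Compose the push command (step 3); assumes `docker login ghcr.io` was done before."""
    return ["docker", "push", shareable_image_name(python_version)]


if __name__ == "__main__":
    print(" ".join(build_command("3.8")))
    print(" ".join(push_command("3.8")))
```

These lists could be passed to the existing `run_command(..., input=dockerfile, text=True, check=True)` call in place of the current `docker build` invocation. Depending on the buildx builder in use, the pushed image may also need inline cache metadata (for example via `--cache-to type=inline`) before `--cache-from` can actually reuse its layers, and `--load` may be needed to get the result into the local image store. Both points are assumptions to verify rather than something tested here.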