ashb commented on a change in pull request #4938: [AIRFLOW-4117] Multi-staging Image - Travis CI tests [Step 3/3]
URL: https://github.com/apache/airflow/pull/4938#discussion_r303054843
##########
File path: Dockerfile
##########
@@ -278,42 +273,81 @@ RUN echo "Pip version: ${PIP_VERSION}"
 RUN pip install --upgrade pip==${PIP_VERSION}
-# We are copying everything with airflow:airflow user:group even if we use root to run the scripts
+ARG AIRFLOW_REPO=apache/airflow
+ENV AIRFLOW_REPO=${AIRFLOW_REPO}
+
+ARG AIRFLOW_BRANCH=master
+ENV AIRFLOW_BRANCH=${AIRFLOW_BRANCH}
+
+ENV AIRFLOW_GITHUB_DOWNLOAD=https://raw.githubusercontent.com/${AIRFLOW_REPO}/${AIRFLOW_BRANCH}
+
+# Airflow Extras installed
+ARG AIRFLOW_EXTRAS="all"
+ENV AIRFLOW_EXTRAS=${AIRFLOW_EXTRAS}
+
+RUN echo "Installing with extras: ${AIRFLOW_EXTRAS}."
+
+ARG AIRFLOW_CONTAINER_CI_OPTIMISED_BUILD="false"
+ENV AIRFLOW_CONTAINER_CI_OPTIMISED_BUILD=${AIRFLOW_CONTAINER_CI_OPTIMISED_BUILD}
+
+# By changing the CI build epoch we can force reinstalling Arflow from the current master -
+# in case of CI optimized builds (next step). Our build scripts will change the EPOCH every month normally
+# But it can also be overwritten manually by setting the AIRFLOW_CI_BUILD_EPOCH environment variable.
+ARG AIRFLOW_CI_BUILD_EPOCH=""
+ENV AIRFLOW_CI_BUILD_EPOCH=${AIRFLOW_CI_BUILD_EPOCH}
+
+# In case of CI-optimised builds we want to pre-install master version of airflow dependencies so that
+# We do not have to always reinstall it from the scratch.
+# This can be reinstalled from latest master by increasing PIP_DEPENDENCIES_EPOCH_NUMBER.
+# And is automatically reinstalled from the scratch every month
+RUN \
+    if [[ "${AIRFLOW_CONTAINER_CI_OPTIMISED_BUILD}" == "true" ]]; then \
+        pip install --no-use-pep517 \
+        "https://github.com/apache/airflow/archive/master.tar.gz#egg=apache-airflow[${AIRFLOW_EXTRAS}]" \
+        && pip uninstall --yes apache-airflow; \
+    fi
+
+# Note! We are copying everything with airflow:airflow user:group even if we use root to run the scripts
 # This is fine as root user will be able to use those dirs anyway.
 # Airflow sources change frequently but dependency configuration won't change that often
 # We copy setup.py and other files needed to perform setup of dependencies
-# This way cache here will only be invalidated if any of the
-# version/setup configuration change but not when airflow sources change
+# So in case setup.py changes we can install latest dependencies required.
 COPY --chown=airflow:airflow setup.py ${AIRFLOW_SOURCES}/setup.py
 COPY --chown=airflow:airflow setup.cfg ${AIRFLOW_SOURCES}/setup.cfg
 COPY --chown=airflow:airflow airflow/version.py ${AIRFLOW_SOURCES}/airflow/version.py
 COPY --chown=airflow:airflow airflow/__init__.py ${AIRFLOW_SOURCES}/airflow/__init__.py
 COPY --chown=airflow:airflow airflow/bin/airflow ${AIRFLOW_SOURCES}/airflow/bin/airflow
-# Airflow Extras installed
-ARG AIRFLOW_EXTRAS="all"
-ENV AIRFLOW_EXTRAS=${AIRFLOW_EXTRAS}
-RUN echo "Installing with extras: ${AIRFLOW_EXTRAS}."
-
-# First install only dependencies but no Apache Airflow itself
-# This way regular changes in sources of Airflow will not trigger reinstallation of all dependencies
-# And this Docker layer will be reused between builds.
-RUN pip install --no-use-pep517 -e ".[${AIRFLOW_EXTRAS}]"
+# The goal of this line is to install the dependencies from the most current setup.py from sources
+# This will be usually incremental small set of packages so it will be very fast
+RUN \
+    if [[ "${AIRFLOW_CONTAINER_CI_OPTIMISED_BUILD}" == "true" ]]; then \
+        pip install --no-use-pep517 -e ".[${AIRFLOW_EXTRAS}]"; \

Review comment:
I think we could do this even when not "ci-optimized" - that way it would install python deps before copying the rest of the airflow sources in. Otherwise the deps aren't installed until after we've done `COPY --chown=airflow:airflow . ${AIRFLOW_SOURCES}/`
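
For illustration, the ordering the review comment describes could look roughly like the fragment below. This is a sketch, not the PR's actual Dockerfile. It assumes that the base image, `AIRFLOW_SOURCES`, and `AIRFLOW_EXTRAS` are defined earlier in the file, and that the working directory is `${AIRFLOW_SOURCES}` so that `.` resolves to the copied setup files.

```dockerfile
# Sketch only. Assumes AIRFLOW_SOURCES and AIRFLOW_EXTRAS are defined above
# and that WORKDIR is ${AIRFLOW_SOURCES} (an assumption, not shown in the diff).

# Copy only the files needed to resolve dependencies.
COPY --chown=airflow:airflow setup.py ${AIRFLOW_SOURCES}/setup.py
COPY --chown=airflow:airflow setup.cfg ${AIRFLOW_SOURCES}/setup.cfg
COPY --chown=airflow:airflow airflow/version.py ${AIRFLOW_SOURCES}/airflow/version.py
COPY --chown=airflow:airflow airflow/__init__.py ${AIRFLOW_SOURCES}/airflow/__init__.py
COPY --chown=airflow:airflow airflow/bin/airflow ${AIRFLOW_SOURCES}/airflow/bin/airflow

# Install dependencies unconditionally, rather than only when
# AIRFLOW_CONTAINER_CI_OPTIMISED_BUILD is "true".
RUN pip install --no-use-pep517 -e ".[${AIRFLOW_EXTRAS}]"

# Copy the remaining sources last. A change that touches only Airflow sources
# then invalidates the cache from this layer onward and leaves the
# dependency layer above reusable.
COPY --chown=airflow:airflow . ${AIRFLOW_SOURCES}/
```

With this ordering the dependency layer is rebuilt only when one of the five copied files changes, which is the behaviour the removed comment ("this Docker layer will be reused between builds") described.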