ashb commented on a change in pull request #4938: [AIRFLOW-4117] Multi-staging Image - Travis CI tests [Step 3/3]
URL: https://github.com/apache/airflow/pull/4938#discussion_r303054843
 
 

 ##########
 File path: Dockerfile
 ##########
 @@ -278,42 +273,81 @@ RUN echo "Pip version: ${PIP_VERSION}"
 
 RUN pip install --upgrade pip==${PIP_VERSION}
 
-# We are copying everything with airflow:airflow user:group even if we use root to run the scripts
+ARG AIRFLOW_REPO=apache/airflow
+ENV AIRFLOW_REPO=${AIRFLOW_REPO}
+
+ARG AIRFLOW_BRANCH=master
+ENV AIRFLOW_BRANCH=${AIRFLOW_BRANCH}
+
+ENV AIRFLOW_GITHUB_DOWNLOAD=https://raw.githubusercontent.com/${AIRFLOW_REPO}/${AIRFLOW_BRANCH}
+
+# Airflow Extras installed
+ARG AIRFLOW_EXTRAS="all"
+ENV AIRFLOW_EXTRAS=${AIRFLOW_EXTRAS}
+
+RUN echo "Installing with extras: ${AIRFLOW_EXTRAS}."
+
+ARG AIRFLOW_CONTAINER_CI_OPTIMISED_BUILD="false"
+ENV AIRFLOW_CONTAINER_CI_OPTIMISED_BUILD=${AIRFLOW_CONTAINER_CI_OPTIMISED_BUILD}
+
+# By changing the CI build epoch we can force reinstalling Airflow from the current master -
+# in case of CI optimized builds (next step). Our build scripts will change the EPOCH every month normally
+# But it can also be overwritten manually by setting the AIRFLOW_CI_BUILD_EPOCH environment variable.
+ARG AIRFLOW_CI_BUILD_EPOCH=""
+ENV AIRFLOW_CI_BUILD_EPOCH=${AIRFLOW_CI_BUILD_EPOCH}
+
+# In case of CI-optimised builds we want to pre-install master version of airflow dependencies so that
+# We do not have to always reinstall it from the scratch.
+# This can be reinstalled from latest master by increasing PIP_DEPENDENCIES_EPOCH_NUMBER.
+# And is automatically reinstalled from the scratch every month
+RUN \
+    if [[ "${AIRFLOW_CONTAINER_CI_OPTIMISED_BUILD}" == "true" ]]; then \
+        pip install --no-use-pep517 \
+        "https://github.com/apache/airflow/archive/master.tar.gz#egg=apache-airflow[${AIRFLOW_EXTRAS}]" \
+        && pip uninstall --yes apache-airflow; \
+    fi
+
+# Note! We are copying everything with airflow:airflow user:group even if we use root to run the scripts
 # This is fine as root user will be able to use those dirs anyway.
 
 # Airflow sources change frequently but dependency configuration won't change that often
 # We copy setup.py and other files needed to perform setup of dependencies
-# This way cache here will only be invalidated if any of the
-# version/setup configuration change but not when airflow sources change
+# So in case setup.py changes we can install latest dependencies required.
 COPY --chown=airflow:airflow setup.py ${AIRFLOW_SOURCES}/setup.py
 COPY --chown=airflow:airflow setup.cfg ${AIRFLOW_SOURCES}/setup.cfg
 
 COPY --chown=airflow:airflow airflow/version.py ${AIRFLOW_SOURCES}/airflow/version.py
 COPY --chown=airflow:airflow airflow/__init__.py ${AIRFLOW_SOURCES}/airflow/__init__.py
 COPY --chown=airflow:airflow airflow/bin/airflow ${AIRFLOW_SOURCES}/airflow/bin/airflow
 
-# Airflow Extras installed
-ARG AIRFLOW_EXTRAS="all"
-ENV AIRFLOW_EXTRAS=${AIRFLOW_EXTRAS}
-RUN echo "Installing with extras: ${AIRFLOW_EXTRAS}."
-
-# First install only dependencies but no Apache Airflow itself
-# This way regular changes in sources of Airflow will not trigger reinstallation of all dependencies
-# And this Docker layer will be reused between builds.
-RUN pip install --no-use-pep517 -e ".[${AIRFLOW_EXTRAS}]"
+# The goal of this line is to install the dependencies from the most current setup.py from sources
+# This will be usually incremental small set of packages so it will be very fast
+RUN \
+    if [[ "${AIRFLOW_CONTAINER_CI_OPTIMISED_BUILD}" == "true" ]]; then \
+        pip install --no-use-pep517 -e ".[${AIRFLOW_EXTRAS}]"; \
 
 Review comment:
   I think we could do this even when not "ci-optimized". That way it would
   install the Python deps before copying the rest of the Airflow sources in.

   Otherwise the deps aren't installed until after we've done
   `COPY --chown=airflow:airflow . ${AIRFLOW_SOURCES}/`
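
   A rough sketch of that ordering, for illustration only (not the PR's final
   code). It reuses the `COPY` and `pip install` lines from the hunk above and
   assumes the working directory is already `${AIRFLOW_SOURCES}`, which this
   hunk does not show:

   ```dockerfile
   # Copy only the files pip needs to resolve dependencies.
   COPY --chown=airflow:airflow setup.py ${AIRFLOW_SOURCES}/setup.py
   COPY --chown=airflow:airflow setup.cfg ${AIRFLOW_SOURCES}/setup.cfg
   COPY --chown=airflow:airflow airflow/version.py ${AIRFLOW_SOURCES}/airflow/version.py
   COPY --chown=airflow:airflow airflow/__init__.py ${AIRFLOW_SOURCES}/airflow/__init__.py
   COPY --chown=airflow:airflow airflow/bin/airflow ${AIRFLOW_SOURCES}/airflow/bin/airflow

   # Install dependencies unconditionally, not only for CI-optimised builds.
   # Assumption: WORKDIR is ${AIRFLOW_SOURCES}, so "." resolves to the copied setup.py.
   RUN pip install --no-use-pep517 -e ".[${AIRFLOW_EXTRAS}]"

   # Copy the remaining sources last, after the dependency layer exists.
   COPY --chown=airflow:airflow . ${AIRFLOW_SOURCES}/
   ```

   With this ordering Docker can reuse the cached dependency layer as long as
   the files copied in the first step are unchanged. Only the final `COPY` and
   later steps are re-run when other sources change.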

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
