xinbinhuang commented on a change in pull request #7832: [WIP] Add production image support
URL: https://github.com/apache/airflow/pull/7832#discussion_r396970218
 
 

 ##########
 File path: Dockerfile
 ##########
 @@ -0,0 +1,325 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# THIS DOCKERFILE IS INTENDED FOR PRODUCTION USE AND DEPLOYMENT.
+# NOTE! IT IS ALPHA-QUALITY FOR NOW - WE ARE IN THE PROCESS OF TESTING IT
+#
+#
+# This is a multi-segmented image. It actually contains two images:
+#
+# airflow-build-image  - all Airflow dependencies are installed here (and
+#                        built, for those dependencies that require
+#                        build essentials). Airflow is installed with the
+#                        --user switch so that all the dependencies are
+#                        installed to ${HOME}/.local
+#
+# main                 - this is the actual production image. It is much
+#                        smaller because it does not contain the build
+#                        essentials. Instead, the ${HOME}/.local folder
+#                        is copied from the build image, so it holds
+#                        only the result of the installation.
+#
+ARG PYTHON_BASE_IMAGE="python:3.6-slim-buster"
+
+ARG AIRFLOW_VERSION="2.0.0.dev0"
+ARG AIRFLOW_ORG="apache"
+ARG AIRFLOW_REPO="airflow"
+ARG AIRFLOW_GIT_REFERENCE="master"
+ARG AIRFLOW_EXTRAS="async,azure_blob_storage,azure_cosmos,azure_container_instances,celery,crypto,elasticsearch,gcp,kubernetes,mysql,postgres,s3,emr,redis,slack,ssh,statsd,virtualenv"
+
+ARG AIRFLOW_HOME=/opt/airflow
+ARG AIRFLOW_USER="airflow"
+ARG AIRFLOW_GROUP="airflow"
+ARG AIRFLOW_UID="50000"
+ARG AIRFLOW_GID="50000"
+
+ARG PIP_VERSION="19.0.2"
+ARG CASS_DRIVER_BUILD_CONCURRENCY="8"
+
+##############################################################################################
+# This is the build image where we build all dependencies
+##############################################################################################
+FROM ${PYTHON_BASE_IMAGE} as airflow-build-image
+SHELL ["/bin/bash", "-o", "pipefail", "-e", "-u", "-x", "-c"]
+
+LABEL org.apache.airflow.docker=true
+LABEL org.apache.airflow.distro="debian"
+LABEL org.apache.airflow.distro.version="buster"
+LABEL org.apache.airflow.module="airflow"
+LABEL org.apache.airflow.component="airflow"
+LABEL org.apache.airflow.image="airflow-build-image"
+LABEL org.apache.airflow.uid="${AIRFLOW_UID}"
+
+ARG AIRFLOW_VERSION
+ARG AIRFLOW_ORG
+ARG AIRFLOW_REPO
+ARG AIRFLOW_GIT_REFERENCE
+ARG AIRFLOW_EXTRAS
+
+ARG PIP_VERSION
+ARG CASS_DRIVER_BUILD_CONCURRENCY
+
+ENV PYTHON_BASE_IMAGE=${PYTHON_BASE_IMAGE}
+ENV AIRFLOW_VERSION=${AIRFLOW_VERSION}
+ENV AIRFLOW_ORG=${AIRFLOW_ORG}
+ENV AIRFLOW_REPO=${AIRFLOW_REPO}
+ENV AIRFLOW_GIT_REFERENCE=${AIRFLOW_GIT_REFERENCE}
+
+ENV AIRFLOW_EXTRAS=${AIRFLOW_EXTRAS}
+
+ENV AIRFLOW_REPO_URL="https://github.com/${AIRFLOW_ORG}/${AIRFLOW_REPO}"
+ENV AIRFLOW_RAW_CONTENT_URL="https://raw.githubusercontent.com/${AIRFLOW_ORG}/${AIRFLOW_REPO}"
+
+ENV PIP_VERSION=${PIP_VERSION}
+ENV CASS_DRIVER_BUILD_CONCURRENCY=${CASS_DRIVER_BUILD_CONCURRENCY}
+
+# Print versions
+RUN echo "Building airflow-build-image stage"; \
+    echo "Base image: ${PYTHON_BASE_IMAGE}"; \
+    echo "Airflow version: ${AIRFLOW_VERSION}"; \
+    echo "Airflow git reference: ${AIRFLOW_GIT_REFERENCE}"; \
+    echo "Airflow org: ${AIRFLOW_ORG}"; \
+    echo "Airflow repo: ${AIRFLOW_REPO}"; \
+    echo "Airflow repo url: ${AIRFLOW_REPO_URL}"; \
+    echo "Airflow extras: ${AIRFLOW_EXTRAS}" ;\
+    echo "PIP version: ${PIP_VERSION}" ;\
+    echo "Cassandra concurrency: ${CASS_DRIVER_BUILD_CONCURRENCY}" ;\
+    echo
+
+# Make sure noninteractive debian install is used and language variables set
+ENV DEBIAN_FRONTEND=noninteractive LANGUAGE=C.UTF-8 LANG=C.UTF-8 LC_ALL=C.UTF-8 \
+    LC_CTYPE=C.UTF-8 LC_MESSAGES=C.UTF-8
+
+# Note missing man directories on debian-buster
+# https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=863199
+# Install basic apt dependencies
+RUN mkdir -pv /usr/share/man/man1 \
+    && mkdir -pv /usr/share/man/man7 \
+    && apt-get update \
+    && apt-get install -y --no-install-recommends \
+           apt-transport-https \
+           apt-utils \
+           build-essential \
+           ca-certificates \
+           curl \
+           gnupg \
+           dirmngr \
+           freetds-bin \
+           freetds-dev \
+           gosu \
+           krb5-user \
+           ldap-utils \
+           libffi-dev \
+           libkrb5-dev \
+           libpq-dev \
+           libsasl2-2 \
+           libsasl2-dev \
+           libsasl2-modules \
+           libssl-dev \
+           locales  \
+           lsb-release \
+           openssh-client \
+           postgresql-client \
+           python-selinux \
+           sasl2-bin \
+           software-properties-common \
+           sqlite3 \
+           sudo \
+           unixodbc \
+           unixodbc-dev \
+    && apt-get autoremove -yqq --purge \
+    && apt-get clean \
+    && rm -rf /var/lib/apt/lists/*
+
+# Install MySQL client from Oracle repositories (Debian installs mariadb)
+RUN KEY="A4A9406876FCBD3C456770C88C718D3B5072E1F5" \
+    && GNUPGHOME="$(mktemp -d)" \
+    && export GNUPGHOME \
+    && for KEYSERVER in $(shuf -e \
+            ha.pool.sks-keyservers.net \
+            hkp://p80.pool.sks-keyservers.net:80 \
+            keyserver.ubuntu.com \
+            hkp://keyserver.ubuntu.com:80 \
+            pgp.mit.edu) ; do \
+          gpg --keyserver "${KEYSERVER}" --recv-keys "${KEY}" && break || true ; \
+       done \
+    && gpg --export "${KEY}" | apt-key add - \
+    && gpgconf --kill all \
+    && rm -rf "${GNUPGHOME}" \
+    && apt-key list > /dev/null \
+    && echo "deb http://repo.mysql.com/apt/debian/ stretch mysql-5.7" | tee -a /etc/apt/sources.list.d/mysql.list \
+    && apt-get update \
+    && apt-get install --no-install-recommends -y \
+        libmysqlclient-dev \
+        mysql-client \
+    && apt-get autoremove -yqq --purge \
+    && apt-get clean && rm -rf /var/lib/apt/lists/*
+
+# disable bytecode generation
+ENV PYTHONDONTWRITEBYTECODE=1
+
+RUN pip install --upgrade pip==${PIP_VERSION}
+
+RUN pip install --user \
+    "${AIRFLOW_REPO_URL}/archive/${AIRFLOW_GIT_REFERENCE}.tar.gz#egg=apache-airflow[${AIRFLOW_EXTRAS}]" \
+    --constraint "${AIRFLOW_RAW_CONTENT_URL}/${AIRFLOW_GIT_REFERENCE}/requirements.txt"
+
+##############################################################################################
+# This is the actual Airflow image - much smaller than the build one. We copy
+# installed Airflow and all its dependencies from the build image to make it smaller.
+##############################################################################################
+FROM ${PYTHON_BASE_IMAGE} as main
+SHELL ["/bin/bash", "-o", "pipefail", "-e", "-u", "-x", "-c"]
+
+LABEL org.apache.airflow.docker=true
+LABEL org.apache.airflow.distro="debian"
+LABEL org.apache.airflow.distro.version="buster"
+LABEL org.apache.airflow.module="airflow"
+LABEL org.apache.airflow.component="airflow"
+LABEL org.apache.airflow.image="airflow"
+LABEL org.apache.airflow.uid="${AIRFLOW_UID}"
+
+ARG AIRFLOW_VERSION
+
+ARG AIRFLOW_HOME
+ARG AIRFLOW_USER
+ARG AIRFLOW_GROUP
+ARG AIRFLOW_UID
+ARG AIRFLOW_GID
+
+ARG PIP_VERSION
+ARG CASS_DRIVER_BUILD_CONCURRENCY
+
+ENV PYTHON_BASE_IMAGE=${PYTHON_BASE_IMAGE}
+ENV AIRFLOW_VERSION=${AIRFLOW_VERSION}
+
+ENV AIRFLOW_HOME=${AIRFLOW_HOME}
+ENV AIRFLOW_USER=${AIRFLOW_USER}
+ENV AIRFLOW_GROUP=${AIRFLOW_GROUP}
+ENV AIRFLOW_UID=${AIRFLOW_UID}
+ENV AIRFLOW_GID=${AIRFLOW_GID}
+
+ENV PIP_VERSION=${PIP_VERSION}
+
+# Print versions
+RUN echo "Building main airflow image"; \
+    echo "Base image: ${PYTHON_BASE_IMAGE}"; \
+    echo "Airflow version: ${AIRFLOW_VERSION}"; \
+    echo "Airflow home: ${AIRFLOW_HOME}"; \
+    echo "Airflow user: ${AIRFLOW_USER}"; \
+    echo "Airflow uid: ${AIRFLOW_UID}" ;\
+    echo "PIP version: ${PIP_VERSION}" ;\
+    echo
+
+# Make sure noninteractive debian install is used and language variables set
+ENV DEBIAN_FRONTEND=noninteractive LANGUAGE=C.UTF-8 LANG=C.UTF-8 LC_ALL=C.UTF-8 \
+    LC_CTYPE=C.UTF-8 LC_MESSAGES=C.UTF-8
+
+# Note missing man directories on debian-buster
+# https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=863199
+# Install basic apt dependencies
+RUN mkdir -pv /usr/share/man/man1 \
+    && mkdir -pv /usr/share/man/man7 \
+    && apt-get update \
+    && apt-get install -y --no-install-recommends \
+           apt-transport-https \
+           apt-utils \
+           ca-certificates \
+           curl \
+           dumb-init \
+           freetds-bin \
+           freetds-dev \
+           gnupg \
+           gosu \
+           krb5-user \
+           ldap-utils \
+           libffi-dev \
+           libkrb5-dev \
+           libpq-dev \
+           libsasl2-2 \
+           libsasl2-dev \
+           libsasl2-modules \
+           libssl-dev \
+           locales  \
+           lsb-release \
+           netcat \
+           openssh-client \
+           postgresql-client \
+           python-selinux \
+           sasl2-bin \
+           software-properties-common \
+           sqlite3 \
+           sudo \
+           unixodbc \
+           unixodbc-dev \
+    && apt-get autoremove -yqq --purge \
+    && apt-get clean \
+    && rm -rf /var/lib/apt/lists/*
+
+# Install MySQL client from Oracle repositories (Debian installs mariadb)
+RUN KEY="A4A9406876FCBD3C456770C88C718D3B5072E1F5" \
+    && GNUPGHOME="$(mktemp -d)" \
+    && export GNUPGHOME \
+    && for KEYSERVER in $(shuf -e \
+            ha.pool.sks-keyservers.net \
+            hkp://p80.pool.sks-keyservers.net:80 \
+            keyserver.ubuntu.com \
+            hkp://keyserver.ubuntu.com:80 \
+            pgp.mit.edu) ; do \
+          gpg --keyserver "${KEYSERVER}" --recv-keys "${KEY}" && break || true ; \
+       done \
+    && gpg --export "${KEY}" | apt-key add - \
+    && gpgconf --kill all \
+    && rm -rf "${GNUPGHOME}" \
+    && apt-key list > /dev/null \
+    && echo "deb http://repo.mysql.com/apt/debian/ stretch mysql-5.7" | tee -a /etc/apt/sources.list.d/mysql.list \
+    && apt-get update \
+    && apt-get install --no-install-recommends -y \
+        libmysqlclient-dev \
+        mysql-client \
+    && apt-get autoremove -yqq --purge \
+    && apt-get clean && rm -rf /var/lib/apt/lists/*
+
+RUN pip install --upgrade pip==${PIP_VERSION}
+
+RUN addgroup --gid "${AIRFLOW_GID}" "${AIRFLOW_GROUP}" && \
+    adduser --quiet "${AIRFLOW_USER}" --uid "${AIRFLOW_UID}" \
+        --ingroup "${AIRFLOW_GROUP}" \
+        --home /home/${AIRFLOW_USER}
+
+RUN mkdir -pv ${AIRFLOW_HOME}; \
+    mkdir -pv ${AIRFLOW_HOME}/dags; \
+    mkdir -pv ${AIRFLOW_HOME}/logs; \
+    chown -R "${AIRFLOW_USER}" ${AIRFLOW_HOME}
+
+# Note that we have to hard-code user id/group id as you cannot use
+# args nor variables in --chown :(. That's a limitation of
+# docker build.
+COPY --chown="50000:50000" --from=airflow-build-image \
+        /root/.local "/home/${AIRFLOW_USER}/.local"
 
 Review comment:
   @potiuk 
   I can build the image with
   ```
   COPY --chown=${AIRFLOW_GID}:${AIRFLOW_UID} --from=airflow-build-image \
           /root/.local "/home/${AIRFLOW_USER}/.local"
   ```
   
   Does this depend on the Docker Engine version? Mine is:
   
   ```
   Client: Docker Engine - Community
    Version:           19.03.6
    API version:       1.40
    Go version:        go1.12.16
    Git commit:        369ce74a3c
    Built:             Thu Feb 13 01:27:49 2020
    OS/Arch:           linux/amd64
    Experimental:      false
   
   Server: Docker Engine - Community
    Engine:
     Version:          19.03.6
     API version:      1.40 (minimum version 1.12)
     Go version:       go1.12.16
     Git commit:       369ce74a3c
     Built:            Thu Feb 13 01:26:21 2020
     OS/Arch:          linux/amd64
     Experimental:     false
    containerd:
     Version:          1.2.10
     GitCommit:        b34a5c8af56e510852c35414db4c1f4fa6172339
    runc:
     Version:          1.0.0-rc8+dev
     GitCommit:        3e425f80a8c931f88e6d94a8c831b9d5aa481657
    docker-init:
     Version:          0.18.0
     GitCommit:        fec3683
   ```
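
   For reference, a minimal two-stage sketch of the pattern in question, written under stated assumptions rather than taken from the PR. The stage name `build`, the `marker` file, and the fixed `/home/airflow` path are hypothetical. It assumes a builder that expands build args inside `--chown`, which appears to depend on the Docker version, so an older engine may reject it. Note that `--chown` takes `<user>:<group>`, so the UID goes first; with both values set to 50000 the order makes no practical difference.

   ```dockerfile
   # Hypothetical minimal example, not the Dockerfile from this pull request.
   # Global build args, declared before the first FROM.
   ARG AIRFLOW_UID="50000"
   ARG AIRFLOW_GID="50000"

   FROM python:3.6-slim-buster as build
   # Stand-in for "pip install --user ...": create something under /root/.local.
   RUN mkdir -p /root/.local && touch /root/.local/marker

   FROM python:3.6-slim-buster as main
   # Global args are not visible inside a stage until they are redeclared.
   ARG AIRFLOW_UID
   ARG AIRFLOW_GID
   # Assumption: this builder expands ${...} in --chown (order is user:group).
   COPY --chown=${AIRFLOW_UID}:${AIRFLOW_GID} --from=build \
           /root/.local /home/airflow/.local
   ```

   Building this with `docker build .` and then running `ls -ln /home/airflow/.local` in the resulting image is one way to check which owner the copied files actually received.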

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
