This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch v1-10-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit dbc9ab95ae733cced3f285b9472c28cbf5ef3fcf
Author: Jarek Potiuk <[email protected]>
AuthorDate: Sun Oct 11 06:19:57 2020 +0200

    Add capability of customising PyPI sources (#11385)
    
    * Add capability of customising PyPI sources
    
    This change adds the capability of customising the installation of PyPI
    modules via a custom .pypirc file. This makes it possible to install
    dependencies from an in-house, vetted PyPI registry.
    
    (cherry picked from commit 45d33dbd432fd010f6ff2b698c682c31ac436c24)
---
 Dockerfile                     |  4 ++++
 docs/production-deployment.rst | 40 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/Dockerfile b/Dockerfile
index f257606..7cc7f94 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -164,6 +164,8 @@ RUN mkdir -p /root/.local/bin
 ARG AIRFLOW_PRE_CACHED_PIP_PACKAGES="true"
 ENV AIRFLOW_PRE_CACHED_PIP_PACKAGES=${AIRFLOW_PRE_CACHED_PIP_PACKAGES}
 
+COPY .pypirc /root/.pypirc
+
 # In case of Production build image segment we want to pre-install master version of airflow
 # dependencies from github so that we do not have to always reinstall it from the scratch.
 RUN if [[ ${AIRFLOW_PRE_CACHED_PIP_PACKAGES} == "true" ]]; then \
@@ -385,6 +387,8 @@ RUN chmod a+x /entrypoint /clean-logs
 # See https://github.com/apache/airflow/issues/9248
 RUN chmod g=u /etc/passwd
 
+COPY .pypirc ${AIRFLOW_USER_HOME_DIR}/.pypirc
+
 ENV PATH="${AIRFLOW_USER_HOME_DIR}/.local/bin:${PATH}"
 ENV GUNICORN_CMD_ARGS="--worker-tmp-dir /dev/shm"
 
diff --git a/docs/production-deployment.rst b/docs/production-deployment.rst
index 5e6cad2..7c4bfab 100644
--- a/docs/production-deployment.rst
+++ b/docs/production-deployment.rst
@@ -262,6 +262,14 @@ You can combine both - customizing & extending the image. You can build the imag
 ``customize`` method (either with docker command or with ``breeze`` and then you can ``extend``
 the resulting image using ``FROM:`` any dependencies you want.
 
+Customizing PyPI installation
+.............................
+
+You can customize the PyPI sources used during image build by providing a custom .pypirc
+file placed in the root of the Airflow source directory. This .pypirc file is never committed
+to the repository and is not present in the final production image. It is added and used
+only in the build segment of the image, so it is never copied to the final image.
+
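+A minimal ``.pypirc`` sketch pointing at a hypothetical in-house index (the repository URL
+and credentials below are placeholders, not part of Airflow) could look like:
+
+.. code-block:: ini
+
+    [distutils]
+    index-servers = in-house
+
+    [in-house]
+    repository = https://pypi.example.com/
+    username = <your-username>
+    password = <your-password>
+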
 External sources for dependencies
 ---------------------------------
 
@@ -595,3 +603,35 @@ More details about the images
 
 You can read more details about the images - the context, their parameters and internal structure in the
 `IMAGES.rst <https://github.com/apache/airflow/blob/master/IMAGES.rst>`_ document.
+
+.. _production-deployment:kerberos:
+
+Kerberos-authenticated workers
+==============================
+
+Apache Airflow has a built-in mechanism for authenticating operations with a KDC (Key Distribution Center).
+Airflow has a separate command ``airflow kerberos`` that acts as a token refresher. It uses the pre-configured
+Kerberos Keytab to authenticate with the KDC and obtain a valid token, and then refreshes that token
+at regular intervals within the current token expiry window.
+
+Each refresh request uses a configured principal, and only a keytab valid for the specified principal
+is capable of retrieving the authentication token.
+
+The best practice to implement proper security in this case is to make sure that worker
+workloads have no access to the Keytab and only have access to the periodically refreshed, temporary
+authentication tokens. In a docker environment this can be achieved by running the ``airflow kerberos``
+command and the worker command in separate containers, where only the ``airflow kerberos`` container has
+access to the Keytab file (preferably configured as a secret resource). Those two containers should share
+a volume where the temporary token is written by ``airflow kerberos`` and read by the workers.
+
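+The two-container setup described above can be sketched with plain ``docker`` commands. The
+image tag, volume name and paths below are illustrative assumptions, not part of this change:
+
+.. code-block:: bash
+
+    # Shared volume for the temporary Kerberos token (illustrative name)
+    docker volume create kerberos-ccache
+
+    # Token refresher side-car: the only container that mounts the Keytab
+    docker run -d --name kerberos-refresher \
+        -v kerberos-ccache:/var/kerberos-ccache \
+        -v /secure/airflow.keytab:/etc/airflow.keytab:ro \
+        apache/airflow:1.10.12 kerberos
+
+    # Worker: sees only the refreshed token, never the Keytab
+    docker run -d --name airflow-worker \
+        -v kerberos-ccache:/var/kerberos-ccache \
+        apache/airflow:1.10.12 worker
+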
+In the Kubernetes environment, this can be realized with the side-car concept, where both the Kerberos
+token refresher and the worker are part of the same Pod. Only the Kerberos side-car has access to the
+Keytab secret, and both containers in the same Pod share a volume, where the temporary token is written by
+the side-car container and read by the worker container.
+
+This concept is implemented in the development version of the Helm Chart that is part of the Airflow source code.
+
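+A minimal Pod sketch of this side-car pattern is shown below. All names, image tags, paths and
+the secret name are illustrative assumptions; the Helm Chart mentioned above is the authoritative
+implementation:
+
+.. code-block:: yaml
+
+    apiVersion: v1
+    kind: Pod
+    metadata:
+      name: airflow-worker
+    spec:
+      volumes:
+        - name: kerberos-ccache        # shared temporary-token volume
+          emptyDir: {}
+        - name: kerberos-keytab        # mounted only by the side-car
+          secret:
+            secretName: airflow-kerberos-keytab
+      containers:
+        - name: worker
+          image: apache/airflow:1.10.12
+          args: ["worker"]
+          volumeMounts:
+            - name: kerberos-ccache
+              mountPath: /var/kerberos-ccache
+        - name: kerberos-sidecar
+          image: apache/airflow:1.10.12
+          args: ["kerberos"]
+          volumeMounts:
+            - name: kerberos-ccache
+              mountPath: /var/kerberos-ccache
+            - name: kerberos-keytab
+              mountPath: /etc/airflow.keytab
+              subPath: kerberos.keytab
+              readOnly: true
+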
+
+.. spelling::
+
+   pypirc
