potiuk commented on issue #4483: [AIRFLOW-3673] Add official dockerfile
URL: https://github.com/apache/airflow/pull/4483#issuecomment-453675025

I created a pull request from my private repo: https://github.com/ffinfo/incubator-airflow/pull/1 (https://github.com/potiuk/incubator-airflow/commit/f7e3e2646823122c05f0075e5b019b21426a90fa). This improves the original Dockerfile in the following ways:

* the image is built in several layers:
  * two apt-get layers for basic/complex dependencies
  * an apt-get upgrade layer for future upgrades of apt-get dependencies
  * a layer with the pip airflow dependencies installed
  * a layer with the airflow source changes
  * a layer with additional pip dependencies
* Cassandra driver install time is vastly decreased - no Cython compilation is done, which shortens the build time by 10 minutes or so. A multi-processor build (8 processors) is also enabled automatically. All of this can be changed by editing the default values, or overridden with the --build-arg flag of `docker build`.

This should work as follows (if caching is enabled in Dockerhub):

- if only airflow sources change, without setup.py dependencies (the most common case), only the last layer is rebuilt and all the other layers are taken from the cache. This should reduce not only the build time but also the amount of data downloaded by users/developers. This can also be forced by increasing the value of the FORCE_REINSTALL_AIRFLOW_SOURCES env in the Dockerfile
- if dependencies are changed in setup.py, then the last two layers are invalidated and rebuilt - all dependencies will be installed from scratch.
This can also be forced by increasing the value of FORCE_REINSTALL_ALL_PIP_DEPENDENCIES
- if we want to upgrade all apt-get dependencies, we can increase the value of the FORCE_UPGRADE_OF_APT_GET variable in the Dockerfile - this will invalidate the cache and force "apt-get upgrade"
- if we want to force reinstalling everything from scratch, we can increase the value of the FORCE_REINSTALL_APT_GET_DEPENDENCIES variable in the Dockerfile

As a result, if we just push code to the airflow repo, the image built in Dockerhub will be prepared in an optimal way and users will only download incremental updates to the base image they already have. We also have a way to force rebuilding parts of the image, or the whole image, if we choose to.
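
The caching behaviour described above can be sketched as a small Python model. This is only an illustration, not the Dockerfile from the pull request: the layer names are simplified (the two apt-get install layers are collapsed into one and the "additional pip dependencies" layer is left out), the mapping from each FORCE_* variable to a layer is an assumption based on the descriptions in the comment, and the helper functions `layers_to_rebuild` and `layers_rebuilt_by` are hypothetical. It relies on one property of the Docker build cache: once a layer misses the cache, every later layer is rebuilt as well.

```python
# Simplified layer order, earliest first. This collapses the two apt-get
# install layers into one and leaves out the "additional pip dependencies"
# layer, so it is a model of the caching behaviour, not the real Dockerfile.
LAYERS = [
    "apt-get install",
    "apt-get upgrade",
    "pip dependencies",
    "airflow sources",
]

# Assumed mapping from each cache-busting variable named in the comment
# to the first layer it invalidates.
FORCE_VARIABLES = {
    "FORCE_REINSTALL_APT_GET_DEPENDENCIES": "apt-get install",
    "FORCE_UPGRADE_OF_APT_GET": "apt-get upgrade",
    "FORCE_REINSTALL_ALL_PIP_DEPENDENCIES": "pip dependencies",
    "FORCE_REINSTALL_AIRFLOW_SOURCES": "airflow sources",
}


def layers_to_rebuild(first_changed):
    """Return the layers that are rebuilt once `first_changed` misses the cache.

    A cache miss on one layer invalidates that layer and every layer after it.
    """
    start = LAYERS.index(first_changed)
    return LAYERS[start:]


def layers_rebuilt_by(variable):
    """Return the layers rebuilt when the given FORCE_* value is increased."""
    return layers_to_rebuild(FORCE_VARIABLES[variable])


if __name__ == "__main__":
    for name in FORCE_VARIABLES:
        print(name, "->", layers_rebuilt_by(name))
```

In this model, bumping FORCE_REINSTALL_AIRFLOW_SOURCES rebuilds only the final layer, while bumping FORCE_REINSTALL_APT_GET_DEPENDENCIES rebuilds all of them, which matches the ordering of the cases listed in the comment.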
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
