BasPH commented on pull request #8621: URL: https://github.com/apache/airflow/pull/8621#issuecomment-621632259
I think we should decide if the docker-compose file will serve as a "hello world" example, or should it be something more complex, but closer to a production deployment?

My ideas:
- Users will likely want to use it as an example and change it for their specific deployment
- I think these should be as simple as possible to run, i.e. a one-liner in the readme
- I think the folder name "templates" is incorrect here, no templating is being done? Something like "examples/docker", or even "docker-compose" in the root of the project makes more sense IMO
- There are already plenty of example DAGs, let's not add yet another example

To address your points about the extension fields:

> Most production environments like ECS/Fargate and Docker swarm don't support them unlike docker env variables.

I think the goal should be a simple example docker-compose file. People will then extract bits and pieces from the example into their deployment. There's a ton of different ways to do deployment, so I would not aim for creating a docker-compose file with the goal of being close to one single deployment method. Instead, aim for something easy to comprehend.

> Users might want to use different images and volumes for different airflow services. (Airflow workers might need java, specific python packages etc.,)

Okay, why does this change the argument for using the extension fields?

> Much of code duplication is eliminated in form of .env master file already.

If we want the docker-compose file to serve as an example, I think it should be as self-explanatory as possible. Storing env vars in a separate file does not help that cause, I think using extension fields and having everything in the same file will be clearer.

> Readability and Adoptation of extension-fields is still lagging behind with majority of docker/docker-compose users.

How so?

> Container names should be unique.

You still have unique container names.

> Advanced users can always configure the config according to their needs.

Yes.

See the following docker-compose file using extension fields, I think it's pretty readable:

```yaml
version: '3'

# ========================== AIRFLOW ENVIRONMENT VARIABLES ===========================
x-environment: &airflow_environment
  - AIRFLOW__CORE__EXECUTOR=LocalExecutor
  - AIRFLOW__CORE__LOAD_DEFAULT_CONNECTIONS=False
  - AIRFLOW__CORE__LOAD_EXAMPLES=False
  - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql://airflow:airflow@postgres:5432/airflow
  - AIRFLOW__CORE__STORE_DAG_CODE=True
  - AIRFLOW__CORE__STORE_SERIALIZED_DAGS=True
  - AIRFLOW__WEBSERVER__RBAC=True
# ========================== /AIRFLOW ENVIRONMENT VARIABLES ==========================

services:
  postgres:
    image: postgres:12-alpine
    environment:
      - POSTGRES_USER=airflow
      - POSTGRES_PASSWORD=airflow
      - POSTGRES_DB=airflow
    ports:
      - "5432:5432"

  webserver:
    image: apache/airflow:1.10.10-python3.7
    ports:
      - "8080:8080"
    environment: *airflow_environment
    command: webserver

  scheduler:
    image: apache/airflow:1.10.10-python3.7
    environment: *airflow_environment
    command: scheduler
```

For the initialization I've used this thing in the past, worked quite okay. We could make it a bit more sophisticated, e.g. by checking some condition to be True once a second, instead of `sleep 5`:

```yaml
  initdb_adduser:
    image: apache/airflow:1.10.10-python3.7
    depends_on:
      - postgres
    environment: *airflow_environment
    entrypoint: /bin/bash
    # The webserver initializes permissions, so sleep for that to (approximately) be finished
    # No disaster if the webserver isn't finished by then, but create_user will start spitting out errors until the permissions exist
    command: -c 'airflow initdb && sleep 5 && airflow create_user --role Admin --username airflow --password airflow -e airf...@airflow.com -f airflow -l airflow'
```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org