potiuk commented on a change in pull request #20338: URL: https://github.com/apache/airflow/pull/20338#discussion_r775862078
########## File path: dev/breeze/src/airflow_breeze/ci/build_params.py ########## @@ -0,0 +1,121 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +import os +from dataclasses import dataclass +from datetime import datetime + +from airflow_breeze.breeze import get_airflow_sources_root +from airflow_breeze.utils import run_command + + +@dataclass +class BuildParams: + # To construct ci_image_name + pythonVersion: str = "3.7" + airflow_branch: str = "main" + build_id: int = 0 + # To construct docker cache ci directive + dockerCache: str = "pulled" + airflow_extras: str = "devel_ci" + additional_airflow_extras: str = "" + additional_python_deps: str = "" + # To construct ci_image_name + tag: str = "latest" + # To construct airflow_image_repository + githubRepository: str = "apache/airflow" + constraints_github_repository: str = "apache/airflow" + # Not sure if defaultConstraintsBranch and airflow_constraints_reference are different + defaultConstraintsBranch: str = "constraints-main" Review comment: Yes. Very good point and it's one of the most important pieces there and the reason why we convert the args in the first place. I will try to explain it in detail. I hope it will be clear, this is one of the "complex" parts of the image building. Those parameteres are different. The `airflow_branch` and `default_constraint_branch` should be separated out to a separate Python file with only those two values and imported from there. Those are "default" values that are used in calculating other valiues. And those values will be manually updated when we branch-off Airflow. We can name the file `branch_defaults.py`: ``` # Licence.... AIRFLOW_BRANCH="main" DEFAULT_AIRFLOW_CONSTRAINTS_BRANCH="constraints-main". ``` This file content will be different when we will have a different branch. For example in `v2-2-test` branch this file will look like that: ``` # Licence.... AIRFLOW_BRANCH="v2-2-test" DEFAULT_AIRFLOW_CONSTRAINTS_BRANCH="constraints-2-2". ``` And in BuildParams you will use it imported from there.: ``` airflow_branch = AIRFLOW_BRANCH ... default_constraits_branch = DEFAULT_CONSTRAINTS_BRANCH ``` The "airlfow_constraints_reference" and "airlfow_constraints_location" are "override" parameters. They should be `None` by default (so should be Optional[str]) and only when one of them is manually overwriten, they should be used - otherwise they should be passed as empty/Not set and they will be calculated from other args. By default (if we do not override those two) they will be automatically calculated during the build using other "build-args". See here: https://github.com/apache/airflow/blob/main/scripts/docker/common.sh#L46 This is to make sure that whenever we build image in any PR for a specific branch, the right constraints and cache sources will be used. It is particularly used by `install_airflow_dependencies_from_branch_tip.sh`. For example there is a step there (simplified): ``` pip install "https://github.com/${AIRFLOW_REPO}/archive/${AIRFLOW_BRANCH}.tar.gz#egg=apache-airflow[" \ --constraint "${AIRFLOW_CONSTRAINTS_LOCATION}" ``` 1) when no constraints-location, nior constraints-reference are specified and we are in `main` branch this should translate to: ``` pip install "https://github.com/apache/airflow/archive/main.tar.gz#egg=apache-airflow" \ --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-source-providers-3.6.txt"" ``` 2) same case when we are in "v2-2-test" branch: ``` pip install "https://github.com/apache/airflow/archive/v2-2-test.tar.gz#egg=apache-airflow" \ --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2-2/constraints-source-providers-3.6.txt"" ``` 4) Then you can override those automatically calculate values. Taking last example, if you manually set "airflow_constraints_reference" - to "constraints-xyz", only the "constraints-2-2" part wil be replaced - and the "airflow_constraints_location" will be calculated using this reference. The result will be: ``` pip install "https://github.com/apache/airflow/archive/v2-2-test.tar.gz#egg=apache-airflow" \ --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-xyz/constraints-source-providers-3.6.txt"" ``` 5) And finally if you manually set `airflow_constraint_location" - for example by specifying "file:///constraints.txt" - then no calculation happens and it will be directly used instead of the whole constraints location: ``` pip install "https://github.com/apache/airflow/archive/v2-2-test.tar.gz#egg=apache-airflow" \ --constraint "file:///constraints.txt" ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org