Hello everyone,

TL;DR: I have finally got to something we planned when we switched to uv. I
have a green PR that introduces automated management of "lower-bound"
dependencies in Airflow and all providers (thanks to uv's
--resolution lowest-direct mechanism).

The PR is here: https://github.com/apache/airflow/pull/39946 . It's ready
to review and merge (green).

Thanks to Maciek nagging me on Slack and helping with the initial checks, I
managed to complete it before going to Community Over Code this week.

At the end of this email I have summarized the dependency changes that were
needed to do it (basically all the missing lower bounds that the tests
helped to detect and fix).

Once it is merged, our CI will run a special
"LowestDirectDependencyResolution" test suite that will fail if the tests
run in a PR for Airflow or any provider use a feature that requires adding
a `>=` limit (lower bound) for some library. This means that we will
finally have "proper" lower bounds for both Airflow and providers, and
we should no longer see cases where Airflow or a provider fails because
someone has an old version of a library installed.

For example, if you are using Bedrock (Amazon provider): our Amazon
provider had botocore>=1.33.0, but the tests found that Bedrock is only
available from 1.34, so we had to bump it. Once we merge the PR, such cases
will be detected automatically and you will have to fix them before you
merge your PRs.

In the special test suite, dependencies are downgraded to the lowest direct
ones before our unit tests are run. This is done for Airflow tests and for
each provider separately, so we can detect missing lower bounds very
accurately, both for core Airflow and for each individual provider.
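
To make "lowest direct" concrete: for a direct dependency declared with a
`>=` bound, the lowest version that satisfies the specifier is normally the
bound itself, while a dependency with no lower bound leaves the resolver
free to pick a very old release. The sketch below only illustrates that
idea. It is not how uv or Breeze implement it: the function name
lowest_direct_pin is made up, the parsing handles only simple specifiers
like the ones listed at the end of this email, and it assumes that the
bound version actually exists on the package index.

```python
import re
from typing import Optional

# Package name with optional extras, followed by the version specifiers.
_NAME_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*(?:\[[^\]]*\])?)\s*(.*)$")
# A ">=" lower bound inside the specifier part.
_LOWER_RE = re.compile(r">=\s*([0-9][0-9A-Za-z.]*)")


def lowest_direct_pin(requirement: str) -> Optional[str]:
    """Return "name==lower_bound" for a requirement, or None if it has no ">=".

    Hypothetical helper for illustration only; real resolvers also take
    exclusions, upper bounds and available releases into account.
    """
    match = _NAME_RE.match(requirement)
    if match is None:
        raise ValueError(f"Cannot parse requirement: {requirement!r}")
    name, specifiers = match.groups()
    lower = _LOWER_RE.search(specifiers)
    if lower is None:
        # No lower bound: nothing tells the resolver which old version is safe.
        return None
    return f"{name}=={lower.group(1)}"


for req in ["botocore>=1.34.0", "connexion[flask]>=2.14.2,<3.0", "asgiref"]:
    print(req, "->", lowest_direct_pin(req))
```

A result of None marks exactly the kind of dependency that needed a new
lower bound in the list below.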

When the test fails in CI, it will be very easy to reproduce it locally
with Breeze. For example, if you work on the google provider and it fails,
you run this command:

breeze shell --force-lowest-dependencies --test-type "Providers[google]"

This will drop you into a Breeze shell with the google provider
dependencies downgraded to the lowest "direct" ones. There you can run
pytest tests and fix the problem by manually installing newer dependency
versions and re-running the tests.

You can iterate over the tests, manually downgrading and upgrading
dependencies as you see fit. Once you figure out the minimum bound, you
add it to provider.yaml, run pre-commit, and re-run the command above to
check the result.

I've added detailed instructions on how to approach fixing "lowest
dependencies" problems, and when the tests fail in CI, you will be directed
to those instructions. I even described how to use bisecting effectively
to find the actual version of a dependency that needs to be set in
such cases.
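
As a rough illustration of the bisecting idea (not a copy of the
instructions in the PR): if you have the released versions of a dependency
sorted from oldest to newest, and you assume that once the tests pass for
some version they also pass for every newer one, a binary search finds the
minimum working version in a handful of test runs. The names
find_minimum_working_version and fake_tests_pass and the version list are
made up for this sketch; in practice the check would be "install this
version in the Breeze shell and run the failing pytest tests".

```python
from typing import Callable, Optional, Sequence


def find_minimum_working_version(
    versions: Sequence[str],
    tests_pass: Callable[[str], bool],
) -> Optional[str]:
    """Return the oldest version for which tests_pass() is True, or None.

    `versions` must be sorted from oldest to newest, and the result is only
    meaningful if passing is monotonic (old versions fail, newer ones pass).
    """
    lo, hi = 0, len(versions)
    while lo < hi:
        mid = (lo + hi) // 2
        if tests_pass(versions[mid]):
            hi = mid  # this version works, look for an older one that also works
        else:
            lo = mid + 1  # this version fails, the answer must be newer
    return versions[lo] if lo < len(versions) else None


# Made-up candidate versions of a hypothetical dependency.
candidates = ["1.0.0", "1.1.0", "1.2.0", "1.3.0", "1.4.0", "1.5.0"]


def fake_tests_pass(version: str) -> bool:
    # Stand-in for a real test run: pretend the needed feature arrived in 1.3.0.
    return tuple(int(part) for part in version.split(".")) >= (1, 3, 0)


minimum = find_minimum_working_version(candidates, fake_tests_pass)
print(minimum)  # prints 1.3.0
```

With six candidates this needs three test runs instead of up to six, and
the saving grows quickly for libraries with many releases.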

-------------------------------

The list of dependency fixes:

Airflow:

-    "asgiref",
+    "asgiref>=2.3.0",
-    "connexion[flask]>=2.10.0,<3.0",
+    "connexion[flask]>=2.14.2,<3.0",
-    "cryptography>=39.0.0",
+    "cryptography>=41.0.0",
-    "flask-caching>=1.5.0",
+    "flask-caching>=2.0.0",
-    "flask-wtf>=0.15",
+    "flask-wtf>=1.1.0",
-    "flask>=2.2,<2.3",
+    "flask>=2.2.1,<2.3",
-    "httpx",
+    "httpx>=0.18.0",
-    "lazy-object-proxy",
+    "lazy-object-proxy>=1.2.0",
-    "opentelemetry-exporter-otlp",
-    "packaging>=14.0",
+    "opentelemetry-exporter-otlp>=1.15.0",
+    "packaging>=22.0",
-    "pluggy>=1.0",
-    "psutil>=4.2.0",
+    "pluggy>=1.5.0",
+    "psutil>=5.8.0",
-    "python-dateutil>=2.3",
+    "python-dateutil>=2.7.0",
+    "requests-toolbelt>=0.4.0",
-    "setproctitle>=1.1.8",
+    "setproctitle>=1.3.3",
-    "tenacity>=6.2.0,!=8.2.0",
+    "tenacity>=8.0.0,!=8.2.0",

Providers:

Amazon:

-  - boto3>=1.33.0
-  - botocore>=1.33.0
+  - boto3>=1.34.0
+  - botocore>=1.34.0
-  - watchtower>=2.0.1,<4
+  - watchtower>=3.0.0,<4
-  - asgiref
+  - asgiref>=2.3.0
-  - jmespath
+  - jmespath>=0.7.0

Amazon[aiobotocore]:
-      - aiobotocore[boto3]>=2.5.3
+      - aiobotocore[boto3]>=2.10.0

Apache Flink:
-  - cryptography>=2.0.0
+  - cryptography>=41.0.0

Apache Hive:
-  - thrift>=0.9.2
+  - thrift>=0.11.0
+  - jmespath>=0.7.0

Apache Kylin:
-  - kylinpy>=2.6
+  - kylinpy>=2.7.0

Apache Spark:
-  - pyspark
+  - pyspark>=3.0.0

CNCF Kubernetes:
-  - cryptography>=2.0.0
+  - cryptography>=41.0.0

FAB:
-  - jmespath
+  - jmespath>=0.7.0

Github:
-  - PyGithub!=1.58
+  - PyGithub>=2.1.1

Google:
+  - dill>=0.2.3
-  - google-analytics-admin
+  - google-analytics-admin>=0.9.0
-  - google-cloud-bigquery<3.21.0,>=3.0.1
+  - google-cloud-bigquery<3.21.0,>=3.4.0
-  - google-cloud-run>=0.9.0
+  - google-cloud-run>=0.10.0
-  - httpx
+  - httpx>=0.18.0
-  - looker-sdk>=22.2.0
-  - pandas-gbq
+  - looker-sdk>=22.4.0
+  - pandas-gbq>=0.7.0
-  - PyOpenSSL
-  - python-slugify>=5.0
+  - python-slugify>=7.0.0
+  - PyOpenSSL>=23.0.0
+  - tenacity>=8.1.0

Grpc:
-  - grpcio>=1.15.0
+  - grpcio>=1.38.0

Microsoft Azure:
-  - azure-mgmt-cosmosdb
+  - azure-mgmt-cosmosdb>=3.0.0
-  - azure-storage-file-share
+  - azure-storage-file-share>=12.7.0
-  - azure-synapse-spark
+  - azure-synapse-spark>=0.2.0

Mongo:

 devel-dependencies:
-  - mongomock
+  - mongomock>=3.12.0

MySQL:
-  - mysqlclient>=1.3.6
+  - mysqlclient>=1.4.0

Odbc:
-  - pyodbc
+  - pyodbc>=4.0.24

Pinecone:
-  - pinecone-client>=3.0.0
+  - pinecone-client>=3.1.0

SFTP:
-  - paramiko>=2.8.0
+  - paramiko>=2.9.0

SSH:
-  - paramiko>=2.6.0
+  - paramiko>=2.9.0

Tableau:
-  - tableauserverclient
+  - tableauserverclient>=0.25

Vertica:
-  - vertica-python>=0.5.1
+  - vertica-python>=0.6.0

J.
