> That sounds like a really nice improvement :)

Thanks! It's "quite" useful and nice indeed.
The actual impact of "bad" dependency versions on our users is quite low, especially since we have advocated constraints for years and a lot of people follow that advice. If they have problems, we always direct them to constraints, so it's not a "big" deal. That's why this was always a "nice to have" and never a "first" priority. From a "perfection" point of view it is a nice improvement indeed, because we will now prevent a number of users from struggling (and even from having to fall back to constraints).

But this one will have a much bigger effect on the "ecosystem", especially longer term. Airflow is a "dependency resolution hell". I have discussed it with Damian Shaw a number of times (including over a beer in NY :). He is using Airflow as a test-bed for some improvements he implements and proposes to pip. Also, `uv` recently merged a performance test case/benchmark based on Airflow dependency resolution: https://github.com/astral-sh/uv/pull/3643 .

Having no lower bounds in our dependencies caused multiple problems for resolvers: pip and uv both struggle and sometimes backtrack a lot when Airflow is being resolved. That's precisely because lower bounds are missing or far too low.

As a result, resolving an installation that involves Airflow and providers will become much more stable, faster and more predictable, especially once we also attempt the next step, which is to somewhat limit old provider versions on newer versions of Airflow (as suggested by Damian in https://github.com/apache/airflow/issues/39100 ). We **might** use the lower bounds from editable dependencies as a "base" for the limits proposed by Damian, but that's something we will look at after we have a few releases of Airflow with this PR merged and we see whether we need it at all (I have a feeling that just having lower bounds in Airflow core will help in a number of cases).

J.
On Sat, Jun 1, 2024 at 1:55 PM Pierre Jeambrun <pierrejb...@gmail.com> wrote:

> Great work! That sounds like a really nice improvement :)
>
> On Sat, Jun 1, 2024 at 10:48, Jarek Potiuk <ja...@potiuk.com> wrote:
>
> > Hello everyone,
> >
> > TL;DR: I have finally got to something we planned when we switched to
> > uv. I have a green PR that introduces automated management of
> > "lower-bounds" dependencies in Airflow and all providers (thanks to
> > uv's `--resolution lowest-direct` mechanism).
> >
> > The PR is here: https://github.com/apache/airflow/pull/39946 . It's
> > ready to review and merge (green).
> >
> > Thanks to Maciek nagging me on Slack and helping with initial checks, I
> > managed to complete it before going to Community Over Code this week.
> >
> > At the end of the email I summarized the changes in dependencies that
> > were needed to do it (so basically all the missing lower bounds that
> > the tests helped to detect and fix).
> >
> > Once it is merged, our CI will run a special
> > "LowestDirectDependencyResolution" test suite that will fail if the
> > tests run in a PR show that Airflow or any Provider uses a feature that
> > requires adding a `>=` limit for a library version (a lower bound).
> > This means that we will finally have "proper" lower bounds for both
> > Airflow and Providers, and there will be no more cases where Airflow or
> > any Provider fails because someone has an old version of a library
> > installed.
> >
> > For example, if you are using Bedrock (Amazon provider): our Amazon
> > provider had botocore>=1.33.0, but the tests found that Bedrock is only
> > available in 1.34, so we had to bump it. Once we merge the PR, those
> > cases will be detected automatically and you will have to fix them
> > before you merge your PRs.
> >
> > The way it works is that in the special test suite, dependencies are
> > downgraded to the lowest direct ones before our unit tests are run.
> > This is done for Airflow tests and for each provider separately, so we
> > are able to detect missing lower bounds very accurately: separately for
> > core Airflow and separately for each provider.
> >
> > When the test fails in CI, it will be very easy to reproduce it locally
> > with Breeze. For example, if you work on the google provider and it
> > fails, you run this command:
> >
> > breeze shell --force-lowest-dependencies --test-type "Providers[google]"
> >
> > This will drop you into the Breeze shell and downgrade the google
> > provider dependencies to the lowest "direct" ones. There you can run
> > pytest tests and fix the problem by manually installing newer
> > dependency versions and re-running the tests.
> >
> > Then you can iterate over the tests, manually downgrading and upgrading
> > dependencies as you see fit. When you eventually figure out the minimum
> > bound, you just add it to provider.yaml, run pre-commit, and restart
> > the command above to check again.
> >
> > I've added detailed instructions on how to approach fixing "lowest
> > dependencies" problems, and when the tests fail in CI, you will be
> > directed to those instructions. I even described how to use bisecting
> > effectively to find the actual version of a dependency that needs to be
> > set in such cases.
> >
> > -------------------------------
> >
> > The list of dependency fixes:
> >
> > Airflow:
> >
> > - "asgiref",
> > + "asgiref>=2.3.0",
> > - "connexion[flask]>=2.10.0,<3.0",
> > + "connexion[flask]>=2.14.2,<3.0",
> > - "cryptography>=39.0.0",
> > + "cryptography>=41.0.0",
> > - "flask-caching>=1.5.0",
> > + "flask-caching>=2.0.0",
> > - "flask-wtf>=0.15",
> > + "flask-wtf>=1.1.0",
> > - "flask>=2.2,<2.3",
> > + "flask>=2.2.1,<2.3",
> > - "httpx",
> > + "httpx>=0.18.0",
> > - "lazy-object-proxy",
> > + "lazy-object-proxy>=1.2.0",
> > - "opentelemetry-exporter-otlp",
> > - "packaging>=14.0",
> > + "opentelemetry-exporter-otlp>=1.15.0",
> > + "packaging>=22.0",
> > - "pluggy>=1.0",
> > - "psutil>=4.2.0",
> > + "pluggy>=1.5.0",
> > + "psutil>=5.8.0",
> > - "python-dateutil>=2.3",
> > + "python-dateutil>=2.7.0",
> > + "requests-toolbelt>=0.4.0",
> > - "setproctitle>=1.1.8",
> > + "setproctitle>=1.3.3",
> > - "tenacity>=6.2.0,!=8.2.0",
> > + "tenacity>=8.0.0,!=8.2.0",
> >
> > Providers:
> >
> > Amazon:
> >
> > - - boto3>=1.33.0
> > - - botocore>=1.33.0
> > + - boto3>=1.34.0
> > + - botocore>=1.34.0
> > - - watchtower>=2.0.1,<4
> > + - watchtower>=3.0.0,<4
> > - - asgiref
> > + - asgiref>=2.3.0
> > - - jmespath
> > + - jmespath>=0.7.0
> >
> > Amazon[aiobotocore]
> > - - aiobotocore[boto3]>=2.5.3
> > + - aiobotocore[boto3]>=2.10.0
> >
> > Apache Flink:
> > - - cryptography>=2.0.0
> > + - cryptography>=41.0.0
> >
> > Apache Hive:
> > - - thrift>=0.9.2
> > + - thrift>=0.11.0
> > + - jmespath>=0.7.0
> >
> > Apache Kylin:
> > - - kylinpy>=2.6
> > + - kylinpy>=2.7.0
> >
> > Apache Spark:
> > - - pyspark
> > + - pyspark>=3.0.0
> >
> > CNCF Kubernetes:
> > - - cryptography>=2.0.0
> > + - cryptography>=41.0.0
> >
> > FAB:
> > - - jmespath
> > + - jmespath>=0.7.0
> >
> > Github:
> > - - PyGithub!=1.58
> > + - PyGithub>=2.1.1
> >
> > Google:
> > + - dill>=0.2.3
> > - - google-analytics-admin
> > + - google-analytics-admin>=0.9.0
> > - - google-cloud-bigquery<3.21.0,>=3.0.1
> > + - google-cloud-bigquery<3.21.0,>=3.4.0
> > - - google-cloud-run>=0.9.0
> > + - google-cloud-run>=0.10.0
> > - - httpx
> > + - httpx>=0.18.0
> > - - looker-sdk>=22.2.0
> > - - pandas-gbq
> > + - looker-sdk>=22.4.0
> > + - pandas-gbq>=0.7.0
> > - - PyOpenSSL
> > - - python-slugify>=5.0
> > + - python-slugify>=7.0.0
> > + - PyOpenSSL>=23.0.0
> > + - tenacity>=8.1.0
> >
> > Grpc:
> > - - grpcio>=1.15.0
> > + - grpcio>=1.38.0
> >
> > Microsoft Azure:
> > - - azure-mgmt-cosmosdb
> > + - azure-mgmt-cosmosdb>=3.0.0
> > - - azure-storage-file-share
> > + - azure-storage-file-share>=12.7.0
> > - - azure-synapse-spark
> > + - azure-synapse-spark>=0.2.0
> >
> > Mongo:
> >
> > devel-dependencies:
> > - - mongomock
> > + - mongomock>=3.12.0
> >
> > MySQL:
> > - - mysqlclient>=1.3.6
> > + - mysqlclient>=1.4.0
> >
> > Odbc:
> > - - pyodbc
> > + - pyodbc>=4.0.24
> >
> > Pinecone:
> > - - pinecone-client>=3.0.0
> > + - pinecone-client>=3.1.0
> >
> > SFTP:
> > - - paramiko>=2.8.0
> > + - paramiko>=2.9.0
> >
> > SSH:
> > - - paramiko>=2.6.0
> > + - paramiko>=2.9.0
> >
> > Tableau:
> > - - tableauserverclient
> > + - tableauserverclient>=0.25
> >
> > Vertica:
> > - - vertica-python>=0.5.1
> > + - vertica-python>=0.6.0
> >
> > J.
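
The bisecting approach mentioned in the quoted announcement can be sketched roughly as follows. This is only a minimal illustration, not the procedure from the Airflow contributor instructions: the function name, the candidate version list, and the stand-in predicate are all hypothetical. In practice the predicate would install the candidate version inside the Breeze shell and run the relevant pytest tests. The sketch assumes the candidates are sorted oldest to newest and that once a version passes, every newer one passes too.

```python
from typing import Callable, Optional, Sequence


def lowest_passing_version(
    versions: Sequence[str],
    tests_pass: Callable[[str], bool],
) -> Optional[str]:
    """Return the oldest version for which tests_pass() is True.

    Assumes `versions` is sorted oldest to newest and that passing is
    monotonic: if one version passes, all newer versions pass as well.
    Returns None when no candidate passes.
    """
    lo, hi = 0, len(versions)
    while lo < hi:
        mid = (lo + hi) // 2
        if tests_pass(versions[mid]):
            hi = mid  # mid works, so the answer is mid or something older
        else:
            lo = mid + 1  # mid fails, so the answer must be newer
    return versions[lo] if lo < len(versions) else None


# Illustrative candidates only, not a real release list.
candidates = ["1.33.0", "1.33.13", "1.34.0", "1.34.50"]


def fake_tests_pass(version: str) -> bool:
    # Stand-in for "install this version and run the tests":
    # pretend the needed feature first appeared in 1.34.
    major, minor, _patch = (int(part) for part in version.split("."))
    return (major, minor) >= (1, 34)


print(lowest_passing_version(candidates, fake_tests_pass))  # 1.34.0
```

With N candidate versions this needs about log2(N) test runs instead of N, which matters when each run means reinstalling a dependency and re-running a provider's test suite.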