The problem with tightly specifying all the constraints (including the
providers) is that it means you can't do something like `pip install -U
apache-airflow-providers-google` but have the _core_ Airflow
constrained. (Pip may be better at upgrading less in cases like this
now?)
I have a proposal: two constraint files (for each python version) -- a
"core" and a "full".
The "full" is as you propose, with the providers, and their deps in the
file.
The "core" is _just_ the core requirements for Airflow without any
providers, or any transitive deps. This will include the deps of
non-provider extras though.
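
A rough sketch of how the "core" file could be derived from the "full"
one, assuming we already know the names of Airflow's direct
(non-provider) requirements. The helper names `parse_pins` and
`core_constraints`, the `direct_core_names` set and the sample pins are
all hypothetical; this is not existing Airflow tooling.

```python
def parse_pins(lines):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, sep, version = line.partition("==")
        if not sep:
            raise ValueError(f"not a pinned requirement: {raw!r}")
        pins[name.strip().lower()] = version.strip()
    return pins


def core_constraints(full_pins, direct_core_names):
    """Keep only the pins for direct core requirements (assumed input).

    Providers and transitive dependencies are dropped simply because
    their names are not in direct_core_names.
    """
    wanted = {name.lower() for name in direct_core_names}
    return {name: ver for name, ver in full_pins.items() if name in wanted}


# Illustrative "full" pins; the flask and sqlalchemy entries are made up
# for this example.
full = parse_pins([
    "apache-airflow-providers-google==1.0.0",
    "google-cloud-automl==1.9.0",
    "flask==1.1.2",
    "sqlalchemy==1.3.23",
])
core = core_constraints(full, {"flask", "sqlalchemy"})
```

With this split, the "core" file would leave provider versions and
their transitive deps unconstrained.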
How does that sound?
-ash
On Thu, 11 Feb, 2021 at 14:07, Kaxil Naik <[email protected]> wrote:
Yup, that is correct. That will allow us to make sure that whenever
Airflow was released, all the dependencies including the provider are
snapshotted in constraints. So even if someone tries to install the
same version a year later with constraints it should work fine
without having to worry about the latest version of a specific
provider breaking it.
And then users can of course install or upgrade providers after that
if they like.
Kaxil - did I understand it correctly ? If so - I think this is the
best we can do to keep two properties:
* repeatable installation of already released version
* capability (and easy way of) upgrading to latest providers
On Thu, Feb 11, 2021 at 12:42 PM Jarek Potiuk <[email protected]> wrote:
> Oh I misunderstood -- I thought you were suggesting putting the
transitive deps of apache-airflow-providers-google v2.0 into
constraint-2.0 files etc.
Well. That too. The transitive deps already are in the constraint
files and that will remain; I think this is the main reason why we
have the constraint files. The main reason why constraint files are
"snapshots of all dependencies" (currently they exclude providers)
is to have a repeatable installation. Let me reiterate then how I
understand Kaxil's proposal (which I think makes perfect sense).
I really see the "extras" and constraints as a convenient way for
users to install a released version of airflow with the set of
providers they choose and dependencies in versions that we know are
working. No more, no less. Then they are free to upgrade the
dependencies as they wish.
How I see the current proposal: the constraint files will only
differ from the current ones by adding
"apache-airflow-providers-google==1.0.0", for example. Literally
(compared to the current process) it means that we will just add the
versions of providers that were released at the time the airflow
X.Y.Z version was released (this is a one-line change in the
generation of constraints as we have now). So the final constraint
file for the 2.0.1 version will look like this:
apache-airflow-providers-google==1.0.0
google-cloud-automl==1.9.0
....
~500 other dependencies with ==
...
Those constraint files will contain all providers that were released
at the time of airflow X.Y.Z release and all their transitive
dependencies.
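
As a rough illustration of adding the released provider versions to the
generated constraints, the sketch below merges provider pins into an
existing set of dependency pins and renders them in constraint-file
form. `add_provider_pins`, `render_constraints` and the sample data are
hypothetical; the real generation process is not shown in this thread.

```python
def add_provider_pins(dependency_pins, provider_versions):
    """Return a new dict of pins that also contains the provider pins.

    Raises if a provider is already pinned to a different version, so a
    conflicting snapshot is never produced silently.
    """
    merged = dict(dependency_pins)
    for name, version in provider_versions.items():
        if merged.get(name, version) != version:
            raise ValueError(
                f"{name} already pinned to {merged[name]}, not {version}"
            )
        merged[name] = version
    return merged


def render_constraints(pins):
    """Render pins as sorted 'name==version' lines."""
    return "\n".join(
        f"{name}=={version}" for name, version in sorted(pins.items())
    )


# Sample data only: the provider versions released at the time of an
# Airflow X.Y.Z release would go here.
dependency_pins = {"google-cloud-automl": "1.9.0"}
provider_versions = {
    "apache-airflow-providers-google": "1.0.0",
    "apache-airflow-providers-amazon": "1.0.0",
}
rendered = render_constraints(
    add_provider_pins(dependency_pins, provider_versions)
)
```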
This way if you run `pip install
"apache-airflow[google,amazon]==2.0.1" --constraint
...../2.0.1/python3.6.txt` you will always get the
google provider 1.0.0 installed and amazon 1.0.0 as well.
And that preserves the only capability that constraint files +
extras give: an easy installation path when you want to install an
older version of airflow for the first time, with pretty much a
guarantee that it will always work (this is the only problem
constraint files were introduced for). This will now be extended to
this semantic: "install airflow x.y.z with all the providers and
dependencies that we found were ok at the time when x.y.z was
released".
Then, the users will still be free to do `pip install --upgrade
apache-airflow-providers-google` and upgrade a specific airflow
provider to the latest version. Or if they are adventurous they
could upgrade all dependencies to latest with `pip install
apache-airflow[google] --upgrade --upgrade-strategy eager` (but
without a guarantee that it will work).
Or if there is a new airflow released they could run `pip install
"apache-airflow[google,amazon]==2.0.2" --constraint
...../2.0.2/python3.6.txt` and they will get the set of dependencies
and providers that were there at the time of the 2.0.2 release (but
they are still free to upgrade to the latest versions of providers at
will).
Kaxil - did I understand it correctly ? If so - I think this is the
best we can do to keep two properties:
* repeatable installation of already released version
* capability (and easy way of) upgrading to latest providers
J.
On Thu, Feb 11, 2021 at 1:06 PM Ash Berlin-Taylor <[email protected]> wrote:
Oh I misunderstood -- I thought you were suggesting putting the
transitive deps of apache-airflow-providers-google v2.0 in to
constraint-2.0 files etc.
Cool
On Thu, 11 Feb, 2021 at 12:11, Jarek Potiuk <[email protected]> wrote:
> This unfortunately means that people would be unable to install
v1 of the google provider anymore -- forcing them to upgrade.
Not really. If we specify just this: [google] ->
"apache-airflow-providers-google" - any provider version could be
installed.
On Thu, Feb 11, 2021 at 12:07 PM Ash Berlin-Taylor <[email protected]> wrote:
On Thu, 11 Feb, 2021 at 00:34, Jarek Potiuk <[email protected]> wrote:
*Solution proposal:*
Every time we release a new wave of providers, we
regenerate the constraints for all past released 2.* versions of
airflow, so that the new providers are taken into account and
they can install cleanly with `pip install
apache-airflow[provider]==2.0.N --constraint == ....
2.0.N/python ...`
Both problems can be solved rather easily. 1) requires a 2.0.2
release of Airflow, 2) can be implemented any time (happy to do
it).
Let me know what you think.
This unfortunately means that people would be unable to install
v1 of the google provider anymore -- forcing them to upgrade.
I'm not sure there's a _ready_ solution to this though.
-ash
--
+48 660 796 129