PR ready: https://github.com/apache/airflow/pull/14227
I have some docs-build-related issues (likely related to the self-hosted
cleanup), but other than that I think it's ready for review. I also
semi-automatically updated the 2.0.0 and 2.0.1 version constraints following
the same pattern, so that installing 2.0.0 and 2.0.1 will also be repeatable.

J.

On Thu, Feb 11, 2021 at 7:11 PM Jarek Potiuk <[email protected]> wrote:

> I will work on it during the weekend and will fix the 2.0.1 image
> accordingly. Good discussion :)
>
> On Thu, Feb 11, 2021 at 7:10 PM Jarek Potiuk <[email protected]> wrote:
>
>> Yep. Sounds cool.
>>
>> On Thu, Feb 11, 2021 at 4:20 PM Kaxil Naik <[email protected]> wrote:
>>
>>> Yeah that sounds good too.
>>>
>>> On Thu, Feb 11, 2021, 15:17 Ash Berlin-Taylor <[email protected]> wrote:
>>>
>>>> The problem with tightly specifying all the constraints (including the
>>>> providers) is that it means you can't do something like
>>>> `pip install -U apache-airflow-providers-google` but have the _core_
>>>> Airflow constrained.
>>>> (Pip may be better at upgrading less in cases like this now?)
>>>>
>>>> I have a proposal: two constraint files (for each python version) -- a
>>>> "core" and a "full".
>>>>
>>>> The "full" is as you propose, with the providers, and their deps in the
>>>> file.
>>>>
>>>> The "core" is _just_ the core requirements for Airflow without any
>>>> providers, or any transitive deps. This will include deps of
>>>> non-provider extras though.
>>>>
>>>> How does that sound?
>>>>
>>>> -ash
>>>>
>>>> On Thu, 11 Feb, 2021 at 14:07, Kaxil Naik <[email protected]> wrote:
>>>>
>>>> Yup, that is correct. That will allow us to make sure that whenever
>>>> Airflow is released, all the dependencies, including the providers, are
>>>> snapshotted in constraints. So even if someone tries to install the same
>>>> version a year later with constraints, it should work fine without having
>>>> to worry about the latest version of a specific provider breaking it.
>>>>
>>>> And then users can of course install or upgrade providers after that if
>>>> they like.
>>>>
>>>>> Kaxil - did I understand it correctly? If so - I think this is the
>>>>> best we can do to keep two properties:
>>>>>
>>>>> * repeatable installation of already released version
>>>>> * capability (and easy way of) upgrading to latest providers
>>>>
>>>>
>>>> On Thu, Feb 11, 2021 at 12:42 PM Jarek Potiuk <[email protected]> wrote:
>>>>
>>>>> > Oh I misunderstood -- I thought you were suggesting putting the
>>>>> > transitive deps of apache-airflow-providers-google v2.0 into
>>>>> > constraint-2.0 files etc.
>>>>>
>>>>> Well. That too. The transitive deps already are in the constraint
>>>>> files and that will remain; I think this is the main reason why we have
>>>>> the constraint files. The main reason why constraint files are
>>>>> "snapshots of all dependencies" (currently they exclude providers) is to
>>>>> have a repeatable installation. Let me reiterate then how I understand
>>>>> Kaxil's proposal (which I think makes perfect sense).
>>>>>
>>>>> I really see the "extras" and constraints as a convenient way for
>>>>> users to install a released version of airflow with the set of providers
>>>>> they choose and dependencies in versions that we know are working. No
>>>>> more, no less. Then they are free to upgrade the dependencies as they
>>>>> wish.
>>>>>
>>>>> How I see the current proposal - the constraint files will only differ
>>>>> from the current ones by adding `apache-airflow-providers-google==1.0.0`,
>>>>> for example. Literally (compared to the current process) it means that we
>>>>> will just add the versions of providers that were released at the time
>>>>> the airflow X.Y.Z version was released (this is a one-line change in the
>>>>> generation of constraints as we have it now). So the final constraint
>>>>> file for the 2.0.1 version will look like this:
>>>>>
>>>>> apache-airflow-providers-google==1.0.0
>>>>> google-cloud-automl==1.9.0
>>>>> ....
>>>>> ~500 other dependencies with ==
>>>>> ...
>>>>>
>>>>> Those constraint files will contain all providers that were released
>>>>> at the time of the airflow X.Y.Z release and all their transitive
>>>>> dependencies.
>>>>> This way if you run
>>>>> `pip install "apache-airflow[google,amazon]==2.0.1" --constraint ...../2.0.1/python3.6.txt`
>>>>> - you will always get the google-provider==1.0.0 installed and amazon
>>>>> 1.0.0 as well.
>>>>>
>>>>> And that preserves the only capability that constraint files + extras
>>>>> give - an easy installation path when you want to install an older
>>>>> version of airflow for the first time - with pretty much a guarantee that
>>>>> it will always work (this is the only problem constraint files were
>>>>> introduced to solve). This will now be extended to this semantic:
>>>>> "install airflow x.y.z with all the providers and dependencies that we
>>>>> found were ok at the time when x.y.z was released".
>>>>>
>>>>> Then, the users will still be free to do
>>>>> `pip install --upgrade apache-airflow-providers-google` and specifically
>>>>> upgrade that airflow provider to the latest version. Or if they are
>>>>> adventurous they could upgrade all dependencies to latest with
>>>>> `pip install "apache-airflow[google]" --upgrade --upgrade-strategy eager`
>>>>> (but without a guarantee it will work).
>>>>>
>>>>> Or if there is a new airflow released they could run:
>>>>> `pip install "apache-airflow[google,amazon]==2.0.2" --constraint ...../2.0.2/python3.6.txt`
>>>>> - and they will get the set of dependencies and providers that were
>>>>> there at the time of the 2.0.2 release (but they are still free to
>>>>> upgrade to the latest versions of providers at will).
>>>>>
>>>>> Kaxil - did I understand it correctly? If so - I think this is the
>>>>> best we can do to keep two properties:
>>>>>
>>>>> * repeatable installation of already released version
>>>>> * capability (and easy way of) upgrading to latest providers
>>>>>
>>>>>
>>>>> J.
>>>>>
>>>>> On Thu, Feb 11, 2021 at 1:06 PM Ash Berlin-Taylor <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Oh I misunderstood -- I thought you were suggesting putting the
>>>>>> transitive deps of apache-airflow-providers-google v2.0 into
>>>>>> constraint-2.0 files etc.
>>>>>>
>>>>>> Cool
>>>>>>
>>>>>> On Thu, 11 Feb, 2021 at 12:11, Jarek Potiuk <[email protected]> wrote:
>>>>>>
>>>>>> > This unfortunately means that people would be unable to install v1
>>>>>> > of the google provider anymore -- forcing them to upgrade.
>>>>>>
>>>>>> Not really. If we specify just this: [google] ->
>>>>>> "apache-airflow-providers-google" - any provider version could be
>>>>>> installed.
>>>>>>
>>>>>>
>>>>>> On Thu, Feb 11, 2021 at 12:07 PM Ash Berlin-Taylor <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> On Thu, 11 Feb, 2021 at 00:34, Jarek Potiuk <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> *Solution proposal:*
>>>>>>>
>>>>>>> Every time when we release a new wave of providers, we regenerate
>>>>>>> the constraints for all past released 2.* versions of airflow, so that
>>>>>>> the new providers are taken into account and they can install cleanly
>>>>>>> with `pip install apache-airflow[provider]==2.0.N --constraint == ....
>>>>>>> 2.0.N/python
>>>>>>> ...
>>>>>>>
>>>>>>> Both problems can be solved rather easily. 1) requires 2.0.2 release
>>>>>>> of Airflow, 2) can be implemented any time (happy to do it).
>>>>>>>
>>>>>>> Let me know what you think.
>>>>>>>
>>>>>>>
>>>>>>> This unfortunately means that people would be unable to install v1
>>>>>>> of the google provider anymore -- forcing them to upgrade.
>>>>>>>
>>>>>>> I'm not sure there's a _ready_ solution to this though.
>>>>>>>
>>>>>>> -ash
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> +48 660 796 129
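
For anyone skimming the thread, the install flow discussed above (a pinned Airflow version, a chosen set of extras, and a constraint file matching the Airflow and Python versions) could be sketched roughly as below. This is only an illustration, not how the project generates or consumes constraints: the helper name `build_install_command` and the `constraints-placeholder.txt` argument are made up for this example, and the real location of the constraint file is left to the caller because the thread elides it.

```python
import shlex
from typing import List, Sequence


def build_install_command(
    airflow_version: str,
    extras: Sequence[str],
    constraints_location: str,
) -> List[str]:
    """Build a `pip install` argument list for a constrained Airflow install.

    Hypothetical helper, for illustration only. `constraints_location` is
    whatever path or URL holds the constraint file for the given Airflow
    and Python versions; it is passed in rather than guessed here.
    """
    # Extras go in brackets without spaces, e.g. apache-airflow[google,amazon]
    extras_part = f"[{','.join(extras)}]" if extras else ""
    requirement = f"apache-airflow{extras_part}=={airflow_version}"
    # pip's long option is `--constraint` (short form `-c`)
    return ["pip", "install", requirement, "--constraint", constraints_location]


if __name__ == "__main__":
    cmd = build_install_command(
        "2.0.1", ["google", "amazon"], "constraints-placeholder.txt"
    )
    # shlex.join quotes the requirement so the brackets survive a shell
    print(shlex.join(cmd))
```

After such an install, a single provider can still be upgraded on its own with `pip install --upgrade apache-airflow-providers-google`, as described in the thread.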
