Good point, Kaxil. This is one of the reasons why we are not doing AIP-8
yet: the potential for dependency hell.

Currently our dependencies are pretty well synchronised, and the packages
install nicely on 1.10.9 without any dependency conflict. As part of the
implementation I am planning to test the packages against other versions as
well and see if any fixes are needed. In fact, the second PR will still be
updated so that Breeze becomes an easy test platform for all those older
versions as well. I have some ideas on how to make it really easy.

The only correction I had to make was for docutils (limiting the upper
bound). What actually helps here is that we have "open" dependencies in the
released packages.

As long as we do not increase the lower bounds of dependencies in airflow
master, I think we are good as-is.
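
To make the lower-bound point concrete, here is a rough sketch using the
made-up google-cloud-storage bounds from Kaxil's example. The helper names
are hypothetical and it only understands ">=X" and "<X" clauses; pip and the
real tooling handle far more than this.

```python
# Hypothetical helpers, standard library only. This is an illustration of
# why raising a lower bound can clash with an existing upper bound, not how
# pip actually resolves dependencies.

def parse_version(text):
    """Turn '0.11.0' into (0, 11, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def bounds_overlap(spec_a, spec_b):
    """Return True if at least one version could satisfy both specifiers.

    Only '>=X' (inclusive lower bound) and '<X' (exclusive upper bound)
    clauses are supported, separated by commas.
    """
    lower = None  # highest lower bound seen so far
    upper = None  # lowest upper bound seen so far
    for spec in (spec_a, spec_b):
        for clause in spec.split(","):
            clause = clause.strip()
            if not clause:
                continue  # an empty specifier means "any version"
            if clause.startswith(">="):
                version = parse_version(clause[2:])
                lower = version if lower is None else max(lower, version)
            elif clause.startswith("<") and not clause.startswith("<="):
                version = parse_version(clause[1:])
                upper = version if upper is None else min(upper, version)
            else:
                raise ValueError("unsupported clause: " + clause)
    # With a missing bound the range is open on that side, so it overlaps.
    return lower is None or upper is None or lower < upper


# The made-up conflict from the example: 1.10.9 pins below 0.10.0 while the
# backported package would need at least 0.11.0.
conflict_free = bounds_overlap("<0.10.0", ">=0.11.0")

# "Open" dependencies (lower bounds only) always leave room for a version.
open_ok = bounds_overlap(">=0.9.0", ">=0.11.0")
```

With open upper bounds, only a raised lower bound in master can produce the
first kind of conflict, which is why keeping lower bounds stable matters.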

For the future - if you look at the current setup.py - dependencies for
backported packages are reused from main airflow, but in case we need it, we
can always modify the dependencies specifically for the backported packages.
It's now nicely organized per provider, so it should be manageable even if
we change a lot in the future.
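
Roughly, the idea of reusing per-provider dependencies with an optional
backport-specific override could look like the sketch below. All names,
providers and version bounds here are invented for illustration; this is not
the actual content of setup.py.

```python
# Hypothetical data and helper; none of these names come from the real
# setup.py. Provider names and version bounds are made up.

# Dependencies as declared for main airflow, organized per provider.
MAIN_PROVIDER_DEPENDENCIES = {
    "alpha": ["alpha-sdk>=1.0"],
    "beta": ["beta-client>=2.0"],
}

# Optional overrides used only when building backported packages.
BACKPORT_OVERRIDES = {
    "beta": ["beta-client>=2.0,<3.0"],
}


def backport_requirements(provider):
    """Return the requirement list for one backported provider package.

    Falls back to the main airflow dependencies unless an override exists.
    """
    if provider in BACKPORT_OVERRIDES:
        return list(BACKPORT_OVERRIDES[provider])
    return list(MAIN_PROVIDER_DEPENDENCIES[provider])


alpha_reqs = backport_requirements("alpha")  # reused from main airflow
beta_reqs = backport_requirements("beta")    # backport-specific override
```

Keeping the override table empty by default means the backported packages
follow main airflow automatically until a specific conflict shows up.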

I also do not foresee that we will be doing this for very long. I think we
should release maybe a few backport packages, and at some point in time,
when 2.0 is released, we should stop doing it to encourage people to move
to 2.0.

J.


On Mon, Feb 10, 2020 at 8:52 AM Kaxil Naik <[email protected]> wrote:

> I think a single backported package makes sense for now.
>
> However I have one question: how would we deal with a situation where we
> have a dependency conflict between the operator in Airflow 1.10.* and the
> operator in the "backported" providers package?
>
> An example: let's say the GCS Operator in Airflow 1.10.9 requires
> google-cloud-storage<0.10.0 and the one we use in Master / in the
> backported package needs google-cloud-storage>=0.11.0.
>
> I am just making up this example; it is not the case with the GCS
> operators. And maybe we will not even have any compatibility issues at
> all :) but I am just thinking out loud.
>
> Regards,
> Kaxil
>
> On Mon, Feb 10, 2020, 04:05 Jarek Potiuk <[email protected]> wrote:
>
> > Hello everyone,
> >
> > TL;DR: I've been quite busy recently as I was working on backported
> > "providers" packages for Airflow 1.10.* and I have some pretty good news
> > on that front. I would love to have your comments and opinions on the
> > current state of it. This is more "information" on what is being
> > implemented now - I will send a separate thread about some future
> > decisions needed - mostly from the PMC side.
> >
> > I have two PRs that are relevant and I wanted to describe both here:
> >
> > 1) Preparing backportable packages for Airflow 1.10.*
> > https://github.com/apache/airflow/pull/7391
> >
> > This PR modifies setup.py to enable preparation of backportable packages
> > for Airflow 1.10.*. Using this version of setup.py we can prepare and
> > release PIP packages of the "providers" package that will be installable
> > for the Airflow 1.10.* series. I managed to have it working without
> > converting packages to implicit namespaces (separate discussion on the
> > devlist).
> >
> > I did it in a way that we can either prepare an "apache-airflow-providers"
> > package (with all "providers" code in a single package) or we can have
> > "apache-airflow-providers-XXXXX" packages - separately for each providers
> > package we have. The latter approach produces many more smaller (and
> > potentially inter-dependent) packages - something that in the future
> > might be the base for AIP-8
> > <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=100827303&focusedCommentId=103093048>
> > - but we do not need it for now. It also nicely keeps dependencies
> > separate - so each of the packages has only the minimum set of
> > dependencies needed for that package.
> >
> > I would like to leave it for now, but for the purpose of backporting I
> > think releasing a single "providers" package makes much more sense. But if
> > others think that we should release many more smaller "providers"
> > packages separately - I am also quite OK with it. It's just a matter of
> > testing/status of each package and some inter-dependencies (some packages
> > depend on each other) - especially for transfer operators.
> >
> > 2) System testing of backportable packages:
> > https://github.com/apache/airflow/pull/7389
> >
> > We need to have a way to test that the backported packages are working
> > for Airflow 1.10. We cannot run all unit tests for Airflow 1.10, but we
> > can run some system end-2-end tests. While we do not have consistent
> > system "end-2-end" tests for all operators, we have quite an extensive
> > set of system tests for GCP operators. Those tests run example dags from
> > Google Cloud Platform operators - the example dags are used both to
> > provide examples in the docs and (with an appropriate environment) to run
> > automatically against a real external system (GCP in this case). I
> > proposed this approach a long time ago in AIP-4
> > <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-4+Support+for+System+Tests+for+external+systems>
> > and while it has not been "universally" accepted yet, we followed it with
> > GCP operator implementations (and we have all GCP operators automatically
> > testable with system tests). With this PR I made the system test approach
> > nicely integrated with Pytest markers, Breeze and our test environment -
> > so it is now very easy to run system tests semi-automatically (and in the
> > future we can fully automate it when we switch to GitHub Actions).
> >
> > We are planning to run all the system tests for all GCP operators, but
> > once that is in place it is also rather easy to add tests for other
> > groups of operators, so I am planning to have a community-driven effort
> > to add more of those system tests (and make sure that backported packages
> > can be safely used in a 1.10.* environment).
> >
> >
> > J.
> >
> >
> >
> >
> >
> > --
> >
> > Jarek Potiuk
> > Polidea <https://www.polidea.com/> | Principal Software Engineer
> >
> > M: +48 660 796 129 <+48660796129>
> >
>


-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
