I think a single backported package makes sense for now. However, I have one question: how would we deal with a dependency conflict between an operator in Airflow 1.10.* and the same operator in the "backported" providers package?
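To make the question concrete, here is a minimal Python sketch of such a conflict. The version ranges and the candidate list are made up for illustration (they are not real google-cloud-storage requirements), the helper names are hypothetical, and the version parsing is deliberately naive compared to what pip actually does.

```python
import operator
import re

# Comparison operators supported by this simplified checker.
OPS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
}


def parse_version(text):
    """Turn '0.10.0' into (0, 10, 0) so versions compare numerically.

    Naive on purpose: only dotted integers with the same number of parts.
    """
    return tuple(int(part) for part in text.split("."))


def satisfies(version, requirement):
    """Check one version against one requirement such as '<0.10.0'."""
    match = re.fullmatch(r"(<=|>=|==|<|>)(\d+(?:\.\d+)*)", requirement)
    if match is None:
        raise ValueError("unsupported requirement: %r" % requirement)
    op, bound = match.groups()
    return OPS[op](parse_version(version), parse_version(bound))


def compatible_versions(candidates, requirements):
    """Return the candidates that satisfy every requirement at once."""
    return [v for v in candidates if all(satisfies(v, r) for r in requirements)]


# Made-up release list and made-up requirement ranges.
CANDIDATES = ["0.9.0", "0.10.0", "0.11.0", "0.12.0"]
AIRFLOW_1_10_REQ = "<0.10.0"   # hypothetical pin in Airflow 1.10.9
BACKPORT_REQ = ">=0.11.0"      # hypothetical pin in the backported package

if __name__ == "__main__":
    print(compatible_versions(CANDIDATES, [AIRFLOW_1_10_REQ]))
    print(compatible_versions(CANDIDATES, [BACKPORT_REQ]))
    print(compatible_versions(CANDIDATES, [AIRFLOW_1_10_REQ, BACKPORT_REQ]))
```

Each range is satisfiable on its own, but no candidate satisfies both, so the two packages could not share one environment if their pins really looked like this.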
For example, say the GCS operator in Airflow 1.10.9 requires google-cloud-storage<0.10.0, while the one we use in master / in the backported package needs google-cloud-storage>=0.11.0.

I am just making up this example; it is not the case with the GCS operators. And maybe we will not have any compatibility issues at all :) but I am just thinking out loud.

Regards,
Kaxil

On Mon, Feb 10, 2020, 04:05 Jarek Potiuk <[email protected]> wrote:

> Hello everyone,
>
> TL;DR; I've been quite busy recently as I was working on backported
> "providers" packages for Airflow 1.10.* and I have some pretty good news on
> that front. I would love to have your comments and opinions on the current
> state of it. This is more "information" on what is being implemented now -
> I will send a separate thread about some future decisions needed - mostly
> from the PMC side.
>
> I have two PRs that are relevant and I wanted to describe both here:
>
> 1) Preparing backportable packages for Airflow 1.10.*
> https://github.com/apache/airflow/pull/7391
>
> This PR modifies setup.py to enable preparation of backportable packages
> for Airflow 1.10.*. Using this version of setup.py we can prepare and
> release PIP packages of the "providers" package that will be installable
> for the Airflow 1.10.* series. I managed to have it working without
> converting packages to implicit namespaces (separate discussion on the
> devlist).
>
> I did it in a way that we can either prepare an "apache-airflow-providers"
> package (with all "providers" code in a single package) or we can have
> "apache-airflow-providers-XXXXX" packages - separately for each providers
> package we have. The latter approach produces many more smaller (and
> potentially inter-dependent) packages - something that in the future might
> be the base for AIP-8
> <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=100827303&focusedCommentId=103093048>
> - but we do not need it for now.
> It also nicely keeps dependencies separate - so each of the packages has
> only the minimum set of dependencies needed for that package.
>
> I would like to leave it for now, but for the purpose of backporting I
> think releasing a single "providers" package makes much more sense. But if
> others think that we should release many more smaller "providers" packages
> separately - I am also quite OK with it. It's just a matter of
> testing/status of each package and some inter-dependencies (some packages
> depend on each other) - especially for transfer operators.
>
> 2) System testing of backportable packages:
> https://github.com/apache/airflow/pull/7389
>
> We need to have a way to test that the backported packages are working for
> Airflow 1.10. We cannot run all unit tests for Airflow 1.10, but we can run
> some system end-2-end tests. While we do not have consistent system
> "end-2-end" tests for all operators, we have quite an extensive set of
> system tests for GCP operators. Those tests run example dags from google
> cloud platform operators - the example dags are used both to provide
> examples in the docs and to run (with an appropriate environment) the
> example dag automatically against a real external system (GCP in this
> case). I proposed this approach a long time ago in AIP-4
> <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-4+Support+for+System+Tests+for+external+systems>
> and while it has not been "universally" accepted yet, we followed it with
> GCP operator implementations (and we have all GCP operators automatically
> testable with system tests). With this PR I made the system test approach
> nicely integrated with Pytest markers, Breeze and our test environment - so
> it is now very easy to run system tests semi-automatically (and in the
> future we can fully automate it when we switch to GitHub actions).
>
> We are planning to run all the system tests for all GCP operators, but once
> that is in place it is also rather easy to add tests for other groups of
> operators, so I am planning to have a community-driven effort to add more
> of those system tests (and make sure that backported packages can be safely
> used in a 1.10.* environment).
>
> J.
>
> --
> Jarek Potiuk
> Polidea <https://www.polidea.com/> | Principal Software Engineer
> M: +48 660 796 129
