TL;DR: I would like to discuss the approach we are going to take for
backported provider packages. This is important for PMC members, who have
to decide on the release process for them, but I wanted to make it a public
discussion so that anyone else can chime in and we can discuss it as a
community.

*Context*

As explained in the other thread, we are close to having releasable, tested
backport packages for the Airflow 1.10.* series covering the "providers"
operators/hooks/packages. The main purpose of those backport packages is to
let users migrate to the new operators before they migrate to the 2.0.*
version of Airflow.

The 2.0 version is still some time in the future, and we have a number of
operators/hooks/sensors implemented that are not actively used/tested
because they exist only in master. A number of changes and fixes are
implemented only in master/2.0, so it would be great to be able to use them
in 1.10, both to get the new features and to test the master versions as
early as possible.

Another great property of the backport packages is that they can ease the
migration process: users can install the "apache-airflow-providers" package
and start using the new operators without migrating to a new Airflow. They
can incrementally move all their DAGs to the new "providers" package, and
only after all of that is migrated do they migrate Airflow to 2.0, when
they are ready. That gives those users a smooth migration path.
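
To illustrate the incremental migration, here is a minimal sketch of a helper
a DAG file could use to prefer a backported class and fall back to the old
one while both are around. The helper name first_importable is hypothetical,
and the two Airflow paths in the comment are only illustrative; please check
them against the packages you actually install.

```python
import importlib


def first_importable(dotted_paths):
    """Return the first object importable from a list of 'module.Name' paths."""
    for path in dotted_paths:
        module_name, _, attr = path.rpartition(".")
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # Package not installed (or not yet migrated): try the next one.
            continue
        if hasattr(module, attr):
            return getattr(module, attr)
    raise ImportError("none of the candidates could be imported: %r" % (dotted_paths,))


# Illustrative use in a DAG file (paths are assumptions, not verified here):
# BigQueryOp = first_importable([
#     "airflow.providers.google.cloud.operators.bigquery.BigQueryExecuteQueryOperator",
#     "airflow.contrib.operators.bigquery_operator.BigQueryOperator",
# ])
```

Once every DAG resolves to the "providers" path, the fallback entry can be
dropped and the Airflow upgrade itself becomes a separate, later step.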

*Testing*

The issue we have with those packages is that we are not 100% sure whether
the "providers" operators will work with every 1.10.* Airflow version.
There were no fundamental changes and they SHOULD work, but we will not
know until we test.

Some preliminary tests with a subset of GCP operators show that the
operators work out of the box. We have a big set of "system" tests for GCP
operators that we will run semi-automatically to make sure that all GCP
operators are working fine. This is already a great compatibility test (GCP
operators are about 1/3 of all operators for Airflow). The approach used in
the GCP system tests can also be applied to other operators.

I plan to keep a matrix of "compatibilities" in
https://cwiki.apache.org/confluence/display/AIRFLOW/Backported+providers+packages+for+Airflow+1.10.*+series
and ask the community to add/run tests with other packages as well. It
should be rather easy to add system tests for other systems, following the
way it is implemented for GCP.
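
As a rough sketch of what such a matrix could look like before it is pasted
into the wiki, the snippet below collects (provider, Airflow version, status)
reports and renders them as a plain-text table. The function names and the
"untested" default are my assumptions, and the sample entries are made-up
placeholders, not real test results.

```python
from collections import defaultdict

# Made-up placeholder reports: (provider, Airflow version, reported status).
SAMPLE_REPORTS = [
    ("google", "1.10.9", "ok"),
    ("google", "1.10.8", "ok"),
    ("amazon", "1.10.9", "ok"),
]


def build_matrix(reports):
    """Group the reports as provider -> {airflow_version: status}."""
    matrix = defaultdict(dict)
    for provider, airflow_version, status in reports:
        matrix[provider][airflow_version] = status
    return matrix


def render(matrix, versions):
    """Render one row per provider and one column per Airflow version."""
    lines = ["provider".ljust(12) + "".join(v.ljust(10) for v in versions)]
    for provider in sorted(matrix):
        cells = (matrix[provider].get(v, "untested").ljust(10) for v in versions)
        lines.append((provider.ljust(12) + "".join(cells)).rstrip())
    return "\n".join(lines)


print(render(build_matrix(SAMPLE_REPORTS), ["1.10.8", "1.10.9"]))
```

Any combination nobody has reported on shows up as "untested", which makes
the gaps visible to people who might want to run the tests.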

*Releases*

I think the most important decision is how we are going to release the
packages. This is where PMC members have to decide, I think, as we have the
legal responsibility for releasing official Apache Airflow software.

What we have now (after the PRs get merged): wheel and source packages are
built automatically in Travis CI and uploaded to file.io ephemeral storage.
The builds upload all the packages there: one big "providers" package and
separate packages for each "provider".

It would be great, however, if we could officially publish the backport
packages on PyPI, and this is where we have to agree on the
process/versioning/cadence.

We can follow the same process/keys etc. as for releasing the main Airflow
package, but I think it can be a bit more relaxed in terms of testing, and
we can release more often (as long as there are new changes in providers).
Those packages might be released on an "as-is" basis, without a guarantee
that they work for all operators/hooks/sensors and without a guarantee that
they will work for all 1.10.* versions. We can have the "compatibility"
statement/matrix in our wiki, where people who have tested some package can
simply state that it works for them. At Polidea we can assume stewardship
of the GCP packages and, for example, test them using our automated system
tests for every release. Maybe others can assume stewardship for other
providers.

For that, we will need some versioning/release policy. I would say a CalVer
<https://calver.org/> approach might work best (YYYY.MM.DD). And to keep it
simple we should release one "big" providers package with all providers in
it. We can have a roughly monthly cadence for it.
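
To make the YYYY.MM.DD scheme concrete, here is a small sketch that derives
a version string from a release date and parses it back. The function names
are hypothetical. One thing to be aware of: Python packaging tools treat
release segments as integers, so I would expect "2020.02.05" to be
normalised to "2020.2.5" once published.

```python
import datetime


def calver(release_date):
    """Format a release date as a YYYY.MM.DD version string."""
    return release_date.strftime("%Y.%m.%d")


def parse_calver(version):
    """Parse a YYYY.MM.DD (or normalised Y.M.D) version back into a date."""
    year, month, day = (int(part) for part in version.split("."))
    return datetime.date(year, month, day)


version = calver(datetime.date(2020, 2, 5))
print(version)
print(parse_calver(version))
```

Comparing the parsed dates rather than the raw strings keeps the ordering
correct regardless of whether the zero padding survived.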

But I am also open to any suggestions here.

Please let me know what you think.

J.

-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>