Given the frequency of releases, maybe we could consider "nightly builds" for providers? That way any contributed hook/operator would be pip-installable within 24 hours, so users could start using it - and thereby testing it. This could help us reduce the number of releases that ship non-working integrations.
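One way to make such nightlies unambiguous is to stamp each build with the date it was produced. A minimal sketch - the exact version scheme below is an assumption for illustration, not an agreed convention:

```python
from datetime import date

def nightly_version(build_date: date, dev_number: int = 0) -> str:
    """Build a PEP 440-style, date-stamped dev version, e.g. '2020.2.10.dev0'.

    A ".devN" suffix means pip only picks the build up when the user passes
    --pre, so regular users are not surprised by nightly builds.
    """
    return f"{build_date.year}.{build_date.month}.{build_date.day}.dev{dev_number}"

print(nightly_version(date(2020, 2, 10)))  # -> 2020.2.10.dev0
```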
Tomek

On Mon, Feb 10, 2020 at 12:11 AM Jarek Potiuk <[email protected]> wrote:

> TL;DR: I wanted to discuss the approach we are going to take for backported
> providers packages. This is important for the PMC to decide, since it shapes
> how we are going to run the release process for it, but I wanted to make it
> a public discussion so that anyone else can chime in and we can discuss it
> as a community.
>
> *Context*
>
> As explained in the other thread, we are close to having releasable/tested
> backport packages of the "providers" operators/hooks/sensors for the
> Airflow 1.10.* series. The main purpose of these backport packages is to
> let users migrate to the new operators before they migrate to the 2.0.*
> version of Airflow.
>
> The 2.0 version is still some time in the future, and we have a number of
> operators/hooks/sensors that are not actively used/tested because they
> exist only in master. There are a number of changes and fixes implemented
> only in master/2.0, so it would be great to use them on 1.10 - both to get
> the new features and to test the master versions as early as possible.
>
> Another great property of the backport packages is that they can ease the
> migration process: users can install the "apache-airflow-providers" package
> and start using the new operators without migrating to a new Airflow. They
> can incrementally move all their DAGs to the new "providers" package, and
> only once everything is migrated do they need to upgrade Airflow to 2.0,
> when they are ready. That gives those users a smooth migration path.
>
> *Testing*
>
> The issue we have with these packages is that we are not 100% sure that the
> "providers" operators will work with every 1.10.* Airflow version. There
> were no fundamental changes and they SHOULD work - but we never know until
> we test.
>
> Some preliminary tests with a subset of GCP operators show that the
> operators work out of the box.
> We have a big set of "system" tests for the GCP operators that we will run
> semi-automatically to make sure all GCP operators are working fine. This is
> already a great compatibility test (GCP operators are about 1/3 of all
> operators for Airflow), and the approach used in the GCP system tests can
> be applied to other operators as well.
>
> I plan to keep a matrix of "compatibilities" at
> https://cwiki.apache.org/confluence/display/AIRFLOW/Backported+providers+packages+for+Airflow+1.10.*+series
> and ask the community to add/run tests with other packages as well. It
> should be rather easy to add system tests for other systems, following the
> way it is implemented for GCP.
>
> *Releases*
>
> I think the most important decision is how we are going to release the
> packages. This is where the PMC has to decide, I think, as we have legal
> responsibility for releasing official Apache Airflow software.
>
> What we have now (after the PRs get merged): wheel and source packages
> built automatically in Travis CI and uploaded to file.io ephemeral storage.
> The builds upload all the packages there - one big "providers" package and
> separate packages for each "provider".
>
> It would be great, however, if we could officially publish the backport
> packages on PyPI, and this is where we have to agree on the
> process/versioning/cadence.
>
> We can follow the same process/keys etc. as for releasing the main Airflow
> package, but I think it can be a bit more relaxed in terms of testing, and
> we can release more often (as long as there are new changes in providers).
> These packages might be released on an "as-is" basis - without a guarantee
> that they work for all operators/hooks/sensors, and without a guarantee
> that they will work on all 1.10.* versions. We can keep the "compatibility"
> statement/matrix in our wiki, where people who tested some package can
> simply state that it works for them.
> At Polidea we can assume stewardship of the GCP packages and, for example,
> test them with our automated system tests for every release - maybe others
> can assume stewardship of other providers.
>
> For that, we will need some versioning/release policy. I would say a CalVer
> <https://calver.org/> approach might work best (YYYY.MM.DD). And to keep it
> simple, we should release one "big" providers package with all providers in
> it. We could have a roughly monthly cadence for it.
>
> But I am also open to any suggestions here.
>
> Please let me know what you think.
>
> J.
>
> --
>
> Jarek Potiuk
> Polidea <https://www.polidea.com/> | Principal Software Engineer
>
> M: +48 660 796 129

--
Tomasz Urbaszek
Polidea | Software Engineer
M: +48 505 628 493
E: [email protected]
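[Editor's note on the CalVer (YYYY.MM.DD) proposal quoted above: because the components are not zero-padded, such versions must be compared numerically rather than as plain strings. A small sketch of the caveat, with hypothetical version strings:]

```python
def calver_key(version: str) -> tuple:
    """Turn a CalVer string like '2020.2.10' into a numerically comparable tuple."""
    return tuple(int(part) for part in version.split("."))

releases = ["2020.10.1", "2020.2.10", "2019.12.31"]

# Plain string sorting gets the order wrong ('2020.10.1' < '2020.2.10' as text)...
print(sorted(releases))                  # ['2019.12.31', '2020.10.1', '2020.2.10']
# ...while numeric tuples give the true chronological order.
print(sorted(releases, key=calver_key))  # ['2019.12.31', '2020.2.10', '2020.10.1']
```

PEP 440 version parsers (as used by pip) compare components numerically as well, so unpadded CalVer versions remain safe for `pip install` resolution.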
