A few comments:

To Kamil's post:

When it comes to testing, I think it's worth thinking about which
> versions of Airflow we want to test on and how to be compatible with
> Airflow. For now, we are trying to make all the changes compatible
> with Airflow 2.0.0, but I see a few changes that may require a
> compatibility break between Airflow and providers packages. If we do
> not clearly specify when a given package does not need to be
> compatible, we may soon have a serious problem with maintaining these
> packages, as the effort needed to test and maintain compatibility
> between multiple packages and multiple Airflow versions will only
> increase.
>

I think we should not drop 2.0.0 as the "base compatibility" level unless
we "yank" that specific release. I would rather defer any such changes to
3.0, and speed up releasing 3.0 if we need that, than break compatibility.
SemVer has very nice properties for users (see the discussion about API
versioning), and I think trading "some evolution speed / development ease"
for "user peace of mind from following SemVer strictly" is a wise choice.
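To make the guarantee concrete, here is a toy sketch of what strict SemVer
promises a user (simplified on purpose - `is_safe_upgrade` is a made-up
helper, and real version handling should use a proper parsing library):

```python
def is_safe_upgrade(installed: str, candidate: str) -> bool:
    """Under strict SemVer, upgrading is safe if and only if the major
    version is unchanged: breaking changes are only allowed in a new
    major release. Versions are plain "X.Y.Z" strings here."""
    installed_parts = tuple(int(p) for p in installed.split("."))
    candidate_parts = tuple(int(p) for p in candidate.split("."))
    # Same major version, and actually an upgrade (not a downgrade).
    return (
        candidate_parts[0] == installed_parts[0]
        and candidate_parts >= installed_parts
    )
```

So a user on 2.0.0 can take any 2.x release automatically, while 3.0.0
is where we are allowed to break them.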

This will also limit the development of Airflow as it is now not
> possible to add any new API / modules / functions to Apache Airflow.
> We for providers packages can only rely on the API available on
> Airflow 2.0.0 or on other providers' packages.
>

I think we are limited only by the Airflow 2.0 API, not by other
providers. The providers have their own SemVer versioning, and we have
already released a couple of backwards-incompatible provider releases. We
can even easily add some minimal cross-provider dependencies (for
example, Google 3.0 might [optionally] depend on Beam 2.0+, or something
like that). This is easily manageable (we have no such feature yet, but
we can easily add such a cross-package dependency in any future release).
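For illustration, such an optional cross-provider dependency could be
declared as an extra in the provider's setup.py - package names and
version bounds below are hypothetical, not something we ship today:

```python
# Hypothetical sketch of an optional cross-provider dependency,
# expressed as a setuptools extra. The extra is only installed when
# the user asks for it explicitly, e.g.:
#   pip install "apache-airflow-providers-google[apache.beam]"
from setuptools import setup

setup(
    name="apache-airflow-providers-google",
    version="3.0.0",
    extras_require={
        # Illustrative bound: Google 3.0 optionally needs Beam 2.0+.
        "apache.beam": ["apache-airflow-providers-apache-beam>=2.0"],
    },
)
```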

For now, I found two PRs that may be problematic, but I'm sure this
> problem will only get worse over time.
>

I am not sure it will get worse - I think any feature we see as
introducing an incompatibility should be marked for 3.0, and when we
accumulate enough of those, we simply start working on the 3.0 release. I
do not think we should compromise here. This is what strict SemVer is all
about, and I think it has more pros than cons (from the user
perspective).


> My change - [AIRFLOW-6829] Auto-apply apply_default (
> https://github.com/apache/airflow/pull/15044 ) makes all package
> providers not backward compatible.  In order not to display a warning
> about the decommissioning of the apply defaults decorator, we need to
> clear references to that decorator in providers packages. This
> decorator is required in older versions.
>

Can we be smarter here and skip the warning when the import comes from
providers? I think it should be possible to check the call stack and not
display the warning when the import comes from any community-managed
provider. In 3.0 we can then drop the decorator completely. Is there any
problem with this approach? Would it make it possible to stay fully
backwards compatible for 2.0 and the released providers?


> @davilum change - Enable Connection creation from Vault parameters
> (https://github.com/apache/airflow/pull/15013/files) this may require
> extracting the common part between two backends into a new module i.e.
> create a new API.
>

I think that can (and should) still be done in a backwards-compatible
way. It is probably 'nicer' to break compatibility, but that breaks the
basic premise we have with SemVer, which is not nice. By adopting SemVer,
I think we agreed to bear the cost of maintaining backwards
compatibility.

To Tomek's post:

>
> > I believe we should ask contributors to confirm that they tested the
> > provider E2E (or at least some parts of it). But it should not be
> > required (resources are expensive, OSS is free). As long as there's no
> > E2E (aka system) test for every operator, running from time to time we
> > won't be able to assure that all works as expected. But that's ok.
> >
>

Wholeheartedly agree. We should ask the users to help, but it should not
be a hard requirement. The nice thing about providers vs. Airflow is that
we have a MUCH easier way to mitigate any problems - the user can simply
use the previous version of the provider until the ad-hoc fix is ready.
Downgrading Airflow is not easy, but downgrading a single provider is
super easy. Any problem in a provider has a much smaller overall effect -
it only impacts a small number of users, and with the "each provider can
be independently downgraded" mitigation, it should be an acceptable risk
that sometimes some provider has a feature temporarily broken. That
happens anyway with any software and any amount of testing you might
think of - but in this case we pretty much always have an easy mitigation
as a "given".
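As a concrete (illustrative) example of that mitigation: a user hit by a
bad provider release can pin the previous provider version in their
requirements while keeping Airflow core untouched (the version numbers
below are made up for the example):

```
# requirements.txt - roll back only the affected provider
apache-airflow==2.0.1
apache-airflow-providers-google==2.1.0
```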

There is an exception here, however - the "big providers" like Google or
Amazon, which have a bigger impact. But there we already have System
Tests for many of those (for Google, System Tests cover > 90% of the
provider). I think we should work with the Composer and MWAA teams to
automatically run those on their infrastructure - AKA AIP-4:
https://github.com/apache/airflow/issues/10835.
I think this might be the best contribution the Composer/MWAA teams can
make - provide us with the credits to run the tests there and automate
them. As soon as some of the stabilization/CI work is done, I am happy to
drive it together with Tobiasz, who would like to take on that task. It
is rather close to finished technically; we just need credits from both
Amazon and Google and some time to set it up.
