A few comments.

To Kamil's post:
> When it comes to testing, I think it's worth thinking about which
> versions of Airflow we want to test on and how to be compatible with
> Airflow. For now, we are trying to make all the changes compatible
> with Airflow 2.0.0, but I see a few changes that may require a
> compatibility break between Airflow and providers packages. If we do
> not clearly specify when a given package does not need to be
> compatible, we may soon have a serious problem with maintaining these
> packages, as the effort needed to test and maintain compatibility
> between multiple packages and multiple Airflow versions will only
> increase.

I think we should not drop 2.0.0 as the "base compatibility" level unless we "yank" that specific release. I would rather defer any such changes to 3.0 and speed up releasing 3.0 if we need to, rather than break compatibility. SemVer has very nice properties for users (see the discussion about API versioning), and I think trading some evolution speed and ease of development for the user's peace of mind about backwards compatibility, by following SemVer strictly, is a wise choice.

> This will also limit the development of Airflow, as it is now not
> possible to add any new API / modules / functions to Apache Airflow.
> For providers packages, we can only rely on the API available in
> Airflow 2.0.0 or on other providers' packages.

I think we are limited only by the Airflow 2.0 API, not by other providers. The providers have their own SemVer versioning, and we have already released a couple of backwards-incompatible provider releases. We can even easily add some minimal cross-provider dependencies (for example, Google 3.0 might [optionally] depend on Beam 2.0+, or something like that). This is easily manageable: we have no such feature yet, but we can easily add such a cross-package dependency with any future release.

> For now, I found two PRs that may be problematic, but I'm sure this
> problem will only get worse over time.
I am not sure it will get worse. I think any feature we see as an incompatibility we should mark as 3.0, and when we accumulate enough of those, we will simply start working on the 3.0 release. I do not think we should compromise here. This is what strict SemVer is all about, and I think it has more pros than cons (from the user's perspective).

> My change - [AIRFLOW-6829] Auto-apply apply_default
> (https://github.com/apache/airflow/pull/15044) - makes all provider
> packages not backwards compatible. In order not to display a warning
> about the decommissioning of the apply_defaults decorator, we need to
> clear references to that decorator in providers packages. This
> decorator is required in older versions.

Can we be smarter here and skip the warnings when we import from providers? I think it should be possible to check the call stack and not display the warning when the import comes from any community-managed provider. In 3.0 we can then drop it completely. Is there any problem with this approach? Would it make it possible to stay fully backwards-compatible for 2.0 and the already-released providers?

> @davilum's change - Enable Connection creation from Vault parameters
> (https://github.com/apache/airflow/pull/15013/files) - may require
> extracting the common part between the two backends into a new
> module, i.e. creating a new API.

I think that can (and should) still be done in a backwards-compatible way. It is probably "nicer" to break compatibility, but then we break the basic premise we have with SemVer, which is not nice. I think by adopting SemVer we agreed to bear the cost of maintaining backwards compatibility.

To Tomek's post:

> I believe we should ask contributors to confirm that they tested the
> provider E2E (or at least some parts of it). But it should not be
> required (resources are expensive, OSS is free).
> As long as there's no E2E (aka system) test for every operator,
> running from time to time, we won't be able to assure that all works
> as expected. But that's ok.

Wholeheartedly agree. We should ask the users to help, but it should not be a hard requirement. The nice thing about providers vs. Airflow is that we have a MUCH easier way to mitigate any problems: the user can simply use the previous version of the provider until the ad-hoc fix is ready. Downgrading Airflow is not easy, but downgrading a single provider is super-easy. Any problem in a provider has a much smaller overall effect - it only impacts a small number of users - and with the "each provider can be independently downgraded" mitigation, it should be an acceptable risk that sometimes some provider will have some feature temporarily broken. That happens anyway with any software and any amount of testing you might think of, but in this case we pretty much always have an easy mitigation as a given.

There is an exception here, however: the "big providers" like Google or Amazon, which have a bigger impact. But there we already have System Tests for many of them (for Google we have System Tests covering more than 90% of the provider). I think we should work with the Composer and MWAA teams to automatically run those on their infrastructure - AKA AIP-4, https://github.com/apache/airflow/issues/10835. I think this might be the best contribution the Composer/MWAA teams can make: provide us with the credits to run the tests there and automate them. As soon as we get some of the stabilization/CI work done, I am happy to drive it together with Tobiasz, who would like to take on that task. It is rather close to finished technically; we just need credits from both Amazon and Google and some time to set it up.
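PS: to make the cross-provider dependency point above more concrete, here is a minimal sketch of how an optional cross-provider pin could be expressed in a provider's package metadata. We have no such feature yet, so all package names and version numbers below are illustrative assumptions, not actual release constraints.

```python
# Hypothetical sketch: declaring an *optional* cross-provider dependency
# in a provider package's metadata (names and version pins are examples).

# Base requirement: the SemVer promise keeps Airflow 2.0.0 as the floor.
INSTALL_REQUIRES = ["apache-airflow>=2.0.0"]

# Optional extra: installing "apache-airflow-providers-google[apache.beam]"
# would additionally pull in a minimum Beam provider version, while a
# plain install of the Google provider stays unaffected.
EXTRAS_REQUIRE = {
    "apache.beam": ["apache-airflow-providers-apache-beam>=2.0.0"],
}

# These dicts would be passed to setuptools.setup(install_requires=...,
# extras_require=...) in the provider's setup.py.
```

Because the pin lives in an extra, it only constrains users who opt in to it, so plain provider upgrades and downgrades stay independent.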
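And a sketch of the call-stack check suggested earlier for the apply_defaults warning. This is only an assumption about how it could be done, not the actual Airflow implementation; in particular, recognising community providers by the `airflow.providers.` module-name prefix is a hypothetical choice.

```python
# Sketch (not the actual Airflow implementation): silence the
# apply_defaults deprecation warning when the decorator is applied
# inside a community-managed provider, by inspecting the call stack.
import inspect
import warnings


def apply_defaults(func):
    """No-op placeholder for the deprecated decorator."""
    # Any caller frame whose module lives under airflow.providers.*
    # means the decorator is used by a community provider, which still
    # needs it for Airflow 2.0 compatibility - so stay silent for them.
    called_from_provider = any(
        frame_info.frame.f_globals.get("__name__", "").startswith("airflow.providers.")
        for frame_info in inspect.stack()[1:]
    )
    if not called_from_provider:
        warnings.warn(
            "apply_defaults is applied automatically and will be removed in 3.0",
            DeprecationWarning,
            stacklevel=2,
        )
    return func
```

inspect.stack() is comparatively expensive, but the check runs at decoration time (once per operator class, at import), not per task execution, so the cost should be negligible.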