TL;DR: I would like to revive the discussion (hopefully a short one :) and
possibly call a vote on "AIP-4 - Support for System Tests for external systems".

https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-4+Support+for+System+Tests+for+external+systems

This is the very first AIP I created, almost 1.5 years ago. It took a long
time, but after many, many baby steps (and some bigger leaps) that we've
made in the meantime, I think we are very, very close to being able to
implement it.

*Let me quickly summarise the context:*

   - One of Airflow's biggest advantages is its integrations with external
   systems. We have, I think, several hundred hooks and operators working
   with those external systems.
   - We have an extensive set of tests, both unit and integration. They are
   sometimes really good and catch a lot of problems, but they can only go
   as far as mocking out access to the external systems. Unit/integration
   tests are great for testing the core of Airflow and its functionality,
   but the external services cannot be effectively tested this way.
   - The external services sometimes change. New versions of tools,
   services etc. are released every day, and sometimes, even if we mock them
   perfectly in unit tests, the hooks simply stop working at some point.
   - I think there is a need to regularly run some tests at the system
   level, communicating with "real" external systems and testing our
   operators. Let's call them System Tests. They do not necessarily need to
   run with every PR, but I think running them regularly makes perfect sense.


*Why now? Why does this seem to be a good time to do it?*

   - We switched to pytest and we already have the separation of
   unit/integration tests in place, so we can add support for system tests
   using the same mechanisms.
   - With AIP-21 we grouped the tests into "providers" packages, which
   makes it easy to define the boundaries of "systems": every provider is a
   "system" to test.
   - We have plenty of system tests implemented for GCP, which we are going
   to use to test the backported packages from AIP-21. We have been working
   on system test automation for the GCP operators for more than a year and
   we have it fully automated already.
   - In the latest PR - https://github.com/apache/airflow/pull/7389 - we even
   extracted the way we run system tests from the GCP-specific code in order
   to a) make it easy for everyone to write automated system tests and b)
   make it possible to run them automatically.
   - We have credits provided by Google to run our tests and we can use
   them for regular runs of the system tests.
   - We are close to switching over to GitHub Actions, which will make it
   easy to write manually triggered or regularly scheduled actions that have
   securely stored credentials to run the system tests, in a way that is
   controlled by committers and not abusable by contributors who prepare PRs.
   - I would like to start and lead a community-driven effort in which we
   split the work of writing the missing tests amongst community members, so
   that our new backport packages can be tested against the latest released
   version of 1.10.*. We will provide the GCP tests as examples and we will
   also set up the automation needed to run the tests regularly. The only
   thing we will ask of the community members is to write the missing tests.
   This way I hope we can get very high coverage of the backported packages.
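
To make the pytest point above a bit more concrete, here is a minimal sketch
of how system tests could be selected per provider with the same marker
mechanism that pytest already offers. It is only an illustration under stated
assumptions: the `system` marker name, the `--system` option and the
`skip_reason` helper are hypothetical and are not taken from the PR.

```python
# conftest.py - hypothetical sketch, NOT the code from the Airflow repository.
from typing import Iterable, Optional, Set


def skip_reason(marked_systems: Iterable[str],
                enabled_systems: Set[str]) -> Optional[str]:
    """Return why a test should be skipped, or None if it should run.

    marked_systems:  provider names taken from the test's `system` markers
    enabled_systems: provider names passed on the command line via --system
    """
    marked = set(marked_systems)
    if not marked:
        # Not a system test: it needs no external system, so it always runs.
        return None
    missing = marked - enabled_systems
    if missing:
        return "needs --system for: " + ", ".join(sorted(missing))
    return None


def pytest_addoption(parser):
    # Assumed option name; repeatable, e.g. --system google --system amazon
    parser.addoption("--system", action="append", default=[],
                     help="run system tests for the given provider")


def pytest_configure(config):
    # Register the (assumed) marker so pytest does not warn about it.
    config.addinivalue_line(
        "markers", "system(name): test talks to a real external system")


def pytest_collection_modifyitems(config, items):
    import pytest  # imported here so the helper above is usable without pytest

    enabled = set(config.getoption("--system"))
    for item in items:
        marked = [m.args[0] for m in item.iter_markers(name="system") if m.args]
        reason = skip_reason(marked, enabled)
        if reason:
            item.add_marker(pytest.mark.skip(reason=reason))
```

With such a conftest.py, a test decorated with `@pytest.mark.system("google")`
would be skipped in a normal run and executed only when pytest is started
with `--system google`, for example from a scheduled job that has the
credentials available.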

There are, of course, still a number of open questions, like how to store
credentials, how often to run the tests etc., but I think those are
implementation details that we can work out while we are implementing it.

What do you think about it? If I get a lot of "yes" answers quickly, I would
love to start the vote on AIP-4.

J.







-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>