+1 for introducing system tests. Lack of them is a big pain.

I would also like to suggest marking some of the existing tests (those
running DAGs, etc.) as system tests. Then we can simplify our unit tests
and probably speed up CI builds (not to mention reducing side effects).
The approach used for the GCP system tests, which run an example DAG,
makes creating such tests really easy (or we could generate them
automatically...).
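
To make the "marking" idea a bit more concrete, here is a rough pytest
sketch. It is only an illustration: the marker name `system`, the
`--run-system-tests` flag and the test itself are made up for this example
and are not existing Airflow options. The hooks would normally live in a
conftest.py; everything is shown in one block for brevity.

```python
# Hypothetical sketch: marker name, flag name and test are assumptions.
import pytest


def pytest_addoption(parser):
    # Opt-in flag, so system tests stay off in ordinary unit-test runs.
    parser.addoption(
        "--run-system-tests",
        action="store_true",
        default=False,
        help="run tests marked with @pytest.mark.system",
    )


def pytest_configure(config):
    # Register the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line(
        "markers", "system(name): test talks to a real external system"
    )


def pytest_collection_modifyitems(config, items):
    # Without the flag, every test carrying the marker is skipped.
    if config.getoption("--run-system-tests"):
        return
    skip_system = pytest.mark.skip(reason="needs --run-system-tests")
    for item in items:
        if item.get_closest_marker("system") is not None:
            item.add_marker(skip_system)


@pytest.mark.system("google")
def test_example_dag_runs():
    # Placeholder body: a real system test would run an example DAG here.
    assert True
```

With something like this, a plain `pytest` run would skip the marked tests,
and only a run that passes the flag would talk to the real services.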

Regarding the frequency of such tests, I think we should run all of
them daily on master, or run them whenever there is a change in specific
files (operators, hooks, etc.).
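
For the "run on change" variant, a minimal sketch of how the affected
providers could be picked from a list of changed files. The function name
and the assumed layout (airflow/providers/<provider>/.../operators or hooks)
are my assumptions for illustration, not an existing CI script.

```python
from pathlib import PurePosixPath

# Assumed layout: airflow/providers/<provider>/.../{operators,hooks}/<file>
WATCHED_DIRS = {"operators", "hooks"}


def providers_to_test(changed_files):
    """Return the sorted provider names whose operators or hooks changed."""
    providers = set()
    for name in changed_files:
        parts = PurePosixPath(name).parts
        if len(parts) > 3 and parts[:2] == ("airflow", "providers"):
            # parts[2] is the provider; check the directories below it.
            if WATCHED_DIRS.intersection(parts[3:-1]):
                providers.add(parts[2])
    return sorted(providers)
```

The CI job could feed it the file list of a commit and run only the system
tests of the returned providers, falling back to the full daily run.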

Tomek


On Sat, Feb 15, 2020 at 1:15 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>
> TL;DR: I would like to revive a discussion (hopefully short :) and possibly
> cast a vote on "AIP-4 - Support for System Tests for external systems".
>
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-4+Support+for+System+Tests+for+external+systems
>
> This is the very first AIP I created, almost 1.5 years ago, and it has taken
> a very long time to get to the point where I think we are very, very close
> to being able to implement it, after the many, many baby steps (and some
> bigger leaps) that we've made in the meantime.
>
> *Let me quickly summarise the context:*
>
>    - One of Airflow's biggest advantages is its integrations with external
>    systems. We have, I think, several hundred hooks and operators working
>    with those external systems.
>    - We have an extensive set of tests, both unit and integration, that
>    are sometimes really good at catching a lot of problems, but they can only
>    do as much as mocking out access to the external systems allows.
>    Unit/integration tests are great for testing the core of Airflow and its
>    functionality, but the external services cannot be effectively tested
>    this way.
>    - The external services sometimes change. New versions of tools,
>    services etc. are released every day, and sometimes, even if we mock them
>    perfectly in unit tests, the hooks simply stop working at some point.
>    - I think there is a need to regularly run some tests at the system
>    level, communicating with "real" external systems and testing our
>    operators. Let's call them System Tests. They do not necessarily need to
>    be run with every PR, but I think running them regularly makes perfect
>    sense.
>
>
> *Why now? Why does this seem to be a good time to do it?*
>
>    - We switched to pytest and we already have a separation of
>    unit/integration tests in place - we can add support for system tests
>    using the same mechanisms.
>    - With AIP-21 we grouped the tests into the "providers" package, and that
>    makes it easy to define the boundaries of "systems" - every provider is a
>    "system" to test.
>    - We have plenty of system tests implemented for GCP, which we are going
>    to use to run tests for the backported packages from AIP-21. We have been
>    doing system test automation for the GCP operators for more than a year
>    and we have it fully automated already.
>    - In the latest PR - https://github.com/apache/airflow/pull/7389 - we even
>    extracted the GCP-specific parts of the way we run system tests, in order
>    to a) make it easy for everyone to write automated system tests and b)
>    make it possible to automate them.
>    - We have credits provided by Google to run our tests and we can use
>    them for regular runs of the system tests.
>    - We are close to switching over to GitHub Actions, which will make it
>    easy to write manually triggered or regularly scheduled actions that have
>    securely stored credentials for running the system tests - in a way that
>    is controlled by committers and cannot be abused by contributors who
>    prepare PRs.
>    - I would like to start and lead a community-driven effort in which we
>    split the writing of missing tests amongst community members, so that our
>    new backport packages can be tested against the latest released version
>    of 1.10.*. We will provide the GCP tests as examples and we will also set
>    up the automation needed to run the tests regularly - the only thing we
>    will ask of the members of the community is to write the missing tests.
>    This way I hope we can get very high coverage of the backported packages.
>
> There are of course still a number of open questions - like how to store
> credentials, how often to run the tests etc. but I think those are
> implementation details that we can work out while we are implementing it.
>
> What do you think about it? If I get a lot of "yes" answers quickly, I
> would love to start voting on AIP-4.
>
> J.
>
> --
>
> Jarek Potiuk
> Polidea <https://www.polidea.com/> | Principal Software Engineer
>
> M: +48 660 796 129 <+48660796129>
