Hey folks,

I think both of these sticking points come down to a trade-off of simplicity vs 
consistency/reliability. And to be clear, I'm not arguing for things to be more 
complex just for the heck of it, I agree that simplicity is great! But there 
needs to be a balance, and we can't get caught over-indexing on one or the 
other. I think the combination of test environments being a free-for-all and 
tests being simply a set of guidelines backed by some static analysis is going 
to be brittle. The example Mateusz just described, about needing a watcher task 
to ensure tests end with the right result, is a great illustration of how the 
route of kludging the example dags themselves into being both the test and the 
test runner can get brittle and complicated. And again, I love the idea of the 
example dags being the code under test, I just think having them also conduct 
the test execution of themselves is going to be troublesome.
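
(For context, my rough understanding of the watcher pattern Mateusz described 
is sketched below. The decorator, trigger rule and names here are my own 
illustration of the idea, not something the AIP spells out:)

    # Illustrative sketch only: a "watcher" leaf task that fails the DAG run
    # whenever any other task in the example dag fails. Without it, a teardown
    # task with trigger_rule="all_done" can still succeed and leave the run
    # looking "green" even though the test itself failed.
    from airflow.decorators import task
    from airflow.exceptions import AirflowException
    from airflow.utils.trigger_rule import TriggerRule

    @task(trigger_rule=TriggerRule.ONE_FAILED, retries=0)
    def watcher():
        raise AirflowException("A task upstream of the watcher failed.")

    # ...and every example dag has to remember to wire all of its tasks into it:
    # list(dag.tasks) >> watcher()

It works, but it is exactly the kind of boilerplate every dag author has to 
remember to copy correctly.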


But as always, if I'm the only one worried about this, I'm happy to disagree 
and commit and see how it goes :)


Cheers,
Niko

________________________________
From: Jarek Potiuk <ja...@potiuk.com>
Sent: Sunday, February 6, 2022 8:52 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [DISCUSSION] AIP-47 New design of Airflow System Tests


I think Mateusz explained the points very well. I have just a few comments to 
some of the points.

> 3. In general the AIP reads as if it's solved this problem, but it's more 
> like it has absolved itself from solving this problem, which is much 
> different. I think this approach could possibly make things even worse as now 
> there is no contract or interface for how to plumb configuration and 
> credentials to the system test dags. The current set of methods and files to 
> plumb credentials through aren't great (and as of now are quite Google 
> specific) but I think this interface can be simplified and improved rather 
> than just exported wholesale for each provider to re-invent a new approach.

We've discussed it extensively with Mateusz (I was also of the opinion that we 
could do some automation here). For example, we could write a "terraform" 
script that creates the whole environment - sets up all the service accounts 
etc. But Mateusz convinced me it would be very hard to "mandate" a common way 
of doing it across multiple "services" or "groups" of services. My proposal is 
that we should be clear in the AIP/framework that we don't solve this in a 
"common way", but instead keep a "service-specific" way of preparing the 
environment. We might automate it - in a service-specific way - but having that 
as part of the system tests is, I think, out of scope. In a way, we already 
have this with our "regular" tests. To build the AMI that runs our self-hosted 
runners, we have a separate repo: https://github.com/apache/airflow-ci-infra/ - 
where we keep "packer" scripts which build our image. We even tried Terraform 
there, but well - packer is "good enough". Similarly, we can have separate 
"airflow-system-tests-aws" and "airflow-system-tests-gcp" repos, where we will 
separately document and possibly automate how to build such a "runner".

> 4.  A system that relies on good intentions like "be sure to remember to do X 
> otherwise bad thing Y will happen" certainly guarantees that bad thing Y will 
> happen, frequently. Humans are fallible and not great at sticking to a 
> contract or interface that isn't codified. And this AIP is littered with 
> statements like this. We need test infrastructure that's easier to use and 
> will also enforce these best practices/requirements which are needed for the 
> tests to run.

Here I wholeheartedly agree with Mateusz. It is a GREAT simplification to have 
one example file doing everything. The previous approach was extremely 
complex - you had scripts, pytest tests and example dags all passing (meta)data 
to each other, which made them not only hard to reason about but also hard to 
verify as "ok". The idea of making it just one file is great. And the number of 
"be sure" steps is not only small, but they can also be very easily enforced by 
pre-commits. We can make sure that the "example_dags" in a given provider 
contain the (few) lines that need to be copied among them - verifying a common 
"header" and "footer" on an example dag is a super-simple pre-commit to write. 
We also discussed some other approaches, and I think it is really powerful that 
the scripts can be run both as "pytest" tests and as "standalone" scripts with 
the SequentialDebugger. The level of control this gives for manual runs, plus 
the level of automation it provides by tapping into some of the great pytest 
features (parallel runs, status reporting, flaky-test handling, timeouts, and a 
plethora of other plugins), makes it great for running multiple tests in the CI 
environment. This way it is very easy to run a system test locally, even 
without employing pytest when you need to run it standalone, and to run pytest 
in CI or whenever you want to run multiple tests.
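
Just to illustrate what I mean by a common "header" and "footer" (this is only 
a sketch - the env variable, the DAG contents and the get_test_run helper are 
placeholder names, not something the AIP mandates):

    # Sketch of an "example dag" that is also the system test (names illustrative).
    import os
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    # --- common "header" a pre-commit could check for ---
    ENV_ID = os.environ.get("SYSTEM_TESTS_ENV_ID")
    DAG_ID = "example_some_service"

    with DAG(
        dag_id=DAG_ID,
        schedule_interval="@once",
        start_date=datetime(2022, 1, 1),
        catchup=False,
        tags=["example", "system-test"],
    ) as dag:
        create_resource = BashOperator(task_id="create_resource", bash_command="echo create")
        run_check = BashOperator(task_id="run_check", bash_command="echo check")
        delete_resource = BashOperator(
            task_id="delete_resource", bash_command="echo delete", trigger_rule="all_done"
        )
        create_resource >> run_check >> delete_resource

    # --- common "footer" a pre-commit could check for ---
    # get_test_run is a placeholder for a small shared helper that wraps the dag
    # so pytest can collect and execute it as a test.
    from tests.system.utils import get_test_run  # noqa: E402

    test_run = get_test_run(dag)

    if __name__ == "__main__":
        # "standalone" mode: run the dag directly, without pytest
        dag.clear()
        dag.run()

A pre-commit then only needs to assert that each example dag in the provider 
contains those header and footer lines verbatim.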
