Hey folks,
I think both of these sticking points come down to a trade-off of simplicity vs consistency/reliability. And to be clear, I'm not arguing for things to be more complex just for the heck of it - I agree that simplicity is great! But there needs to be a balance, and we can't get caught over-indexing on one or the other.

I think the combination of test environments being a free-for-all and tests being simply a set of guidelines with some static analysis will combine to be brittle. The example Mateusz just described, about needing a watcher task to ensure tests end with the right result, is a great illustration of how kludging the example DAGs themselves into being both the test and the test runner can be brittle and complicated. And again, I love the idea of the example DAGs being the code under test; I just think having them also conduct their own test execution is going to be troublesome. But as always, if I'm the only one worried about this, I'm happy to disagree and commit and see how it goes :)

Cheers,
Niko
________________________________
From: Jarek Potiuk <ja...@potiuk.com>
Sent: Sunday, February 6, 2022 8:52 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [DISCUSSION] AIP-47 New design of Airflow System Tests

I think Mateusz explained the points very well. I have just a few comments on some of the points.

> 3. In general the AIP reads as if it's solved this problem, but it's more
> like it has absolved itself from solving this problem, which is much
> different. I think this approach could possibly make things even worse as now
> there is no contract or interface for how to plumb configuration and
> credentials to the system test dags.
> The current set of methods and files to
> plumb credentials through aren't great (and as of now are quite Google
> specific) but I think this interface can be simplified and improved rather
> than just exported wholesale for each provider to re-invent a new approach.

We've discussed this extensively with Mateusz (I was also of the opinion that we could do some automation here). For example, we could write a "terraform" script that creates the whole environment - sets up all the service accounts, etc. But Mateusz convinced me that it will be very hard to "mandate" a common way of doing it for multiple "services" or "groups" of services.

My proposal is that we should be clear in the AIP/framework that we don't solve it in a "common way". Instead, we keep a "service-specific" way of preparing the environment. We might automate it - in a service-specific way - but having that as part of the system tests is, I think, out of scope. In a way, we already have this with our "regular" tests. To build the AMI that runs our self-hosted runners, we have a separate repo - https://github.com/apache/airflow-ci-infra/ - where we keep "packer" scripts which build our image. We even tried Terraform there, but well - packer is "good enough". And we can have separate "airflow-system-tests-aws" and "airflow-system-tests-gcp" repos, where we will separately document and possibly automate how to build such a "runner".

> 4. A system that relies on good intentions like "be sure to remember to do X
> otherwise bad thing Y will happen" certainly guarantees that bad thing Y will
> happen, frequently. Humans are fallible and not great at sticking to a
> contract or interface that isn't codified. And this AIP is littered with
> statements like this. We need test infrastructure that's easier to use and
> will also enforce these best practices/requirements which are needed for the
> tests to run.

Here - I wholeheartedly agree with Mateusz.
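To make the "service-specific environment" idea concrete, a minimal sketch of what a per-provider preflight check could look like - note that the variable names and provider keys below are assumptions made up for illustration, not anything mandated by the AIP:

```python
import os

# Hypothetical per-provider requirements; these names are illustrative
# assumptions, not part of AIP-47.
REQUIRED_ENV = {
    "gcp": ["SYSTEM_TESTS_GCP_PROJECT"],
    "aws": ["SYSTEM_TESTS_AWS_REGION"],
}


def preflight(provider: str) -> None:
    """Fail fast, with a readable message, when the service-specific
    environment for this provider's system tests is not prepared."""
    missing = [v for v in REQUIRED_ENV.get(provider, []) if not os.environ.get(v)]
    if missing:
        raise RuntimeError(
            f"{provider} system test environment is not ready; "
            f"set these variables: {missing}"
        )
```

Each hypothetical "airflow-system-tests-<provider>" repo would then document (and possibly automate) how to populate those variables, while the check itself stays a few lines per provider.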
This is a GREAT simplification - having one example file do everything. The previous approach was extremely complex: you had scripts, pytest tests, and example DAGs, all passing (meta)data to each other, and it was not only hard to reason about them but also hard to verify that they were "ok". The idea of making it just one file is great.

And the amount of "be sure" is not only small, it can also be very easily enforced by pre-commits. We could make sure that the "example_dags" in a given provider contain the (few) lines that need to be copied among them - having a common "header" and "footer" on an example DAG is a super-simple pre-commit to write.

We also discussed some other approaches, and I think it is really powerful that the scripts can be run both as "pytest" tests and as "standalone" scripts with the SequentialDebugger. The level of control it gives for manual runs, and the level of automation it provides by tapping into some of the great pytest features (parallel runs, status, flakiness, timeouts, and a plethora of other plugins) - all of that makes it great for running multiple tests in the CI environment. This way it is very easy to run a system test locally, without even employing pytest when you need to run it standalone, and to use pytest in CI or when you want to run multiple tests.
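The "common header and footer" pre-commit really is only a few lines of Python. A minimal sketch - the required snippet strings below are placeholders chosen for illustration, not the actual lines AIP-47 mandates:

```python
# Sketch of a pre-commit style check: verify that every example DAG file
# contains the common boilerplate lines. The snippets are assumed
# placeholders, not the real required footer.
REQUIRED_SNIPPETS = (
    "from tests.system.utils import get_test_run",  # assumed footer import
    "test_run = get_test_run(dag)",                 # assumed footer line
)


def missing_snippets(dag_source: str) -> list:
    """Return the required boilerplate snippets absent from a DAG's source."""
    return [s for s in REQUIRED_SNIPPETS if s not in dag_source]


# usage: a real hook would read each file under */example_dags/ and exit
# non-zero if any file reports missing snippets
good = (
    "dag = ...\n"
    "from tests.system.utils import get_test_run\n"
    "test_run = get_test_run(dag)\n"
)
print(missing_snippets(good))        # -> []
print(missing_snippets("bare dag"))  # -> both snippets reported missing
```

Wiring this into `.pre-commit-config.yaml` as a repo-local hook would then enforce the contract on every commit, rather than relying on contributors remembering it.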