Hey folks,
Very excited about this AIP! I work on the AWS OSS Airflow team. Getting the AWS system tests running has been a pet-project of mine for the past little while. I come from a test automation background, so this is dear to my heart :) Currently I have a branch that contains the implementation of the various methods (mostly around environment and credential configuration) for AWS system tests, but I was running into very obscure runtime issues. So I'm glad to see one of the goals of this AIP is to simplify this process, since it is convoluted and clunky. Here are some high level thoughts after my read through the AIP: 1. “CI integration can be built using GitHub CI or provider-related solution“ - I'm happy (and excited) to collaborate on this front! I already have a proof of concept running (internally to AWS) built with AWS Code Pipeline and Code Build, which runs various tests each time my personal fork is updated from upstream. But we could easily make something like this public and then connect it to the Airflow repo to verify commits, PRs, etc. 2. After reading through the AIP, I still don't think I truly grok the SYSTEM_TESTS_ENV_ID environment variable. I understand its use, particularly for uniqueness, but who is the owner for setting this? What precisely does it represent? Including an example of an actual value for that env variable in the AIP would be helpful! 3. The AIP reads: “Maintaining system tests requires knowledge about breeze, pytest and maintenance of special files like variables.env or credential files. Without those, we will simplify the architecture and improve management over tests.” The AIP talks a lot about removing this type of setup from the Airflow system test platform to simplify things and that "All needed permissions to external services for execution of DAGs (tests) should be provided to the Airflow instance in advance." and that "Each provider should create an instruction explaining how to prepare the environment to run related system tests so that users can do it on their own". In general the AIP reads as if it's solved this problem, but it's more like it has absolved itself from solving this problem, which is much different. I think this approach could possibly make things even worse as now there is no contract or interface for how to plumb configuration and credentials to the system test dags. The current set of methods and files to plumb credentials through aren't great (and as of now are quite Google specific) but I think this interface can be simplified and improved rather than just exported wholesale for each provider to re-invent a new approach. 4. Lastly, regarding the actual construction of the tests themselves, the proposed design is full of statements like: - "If a teardown task(s) has been defined, remember to add trigger_rule="all_done" parameter to the operator call. This will make sure that this task will always run even if the upstream fails" - “Make sure to include these parameters into DAG call: ...“ - “Change Airflow Executor to DebugExecutor by placing this line at the top of the file: ...” - “Try to keep tasks in the DAG body defined in an order of execution.” A system that relies on good intentions like "be sure to remember to do X otherwise bad thing Y will happen" certainly guaranties that bad thing Y will happen, frequently. Humans are fallible and not great at sticking to a contract or interface that isn't codified. And this AIP is littered with statements like this. We need test infrastructure that's easier to use and will also enforce these best practices/requirements which are needed for the tests to run. In general, it reads much more like a guideline on best practices rather than a new and improved system test engine. Thanks for taking the time to create this AIP I'm very eager to get system testing up and running in Airflow and I'd love to collaborate further on it! Cheers, Niko ________________________________ From: Jarek Potiuk <ja...@potiuk.com> Sent: Tuesday, January 25, 2022 8:57 AM To: dev@airflow.apache.org Subject: RE: [EXTERNAL] [DISCUSSION] AIP-47 New design of Airflow System Tests CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. Just let me add a bit more context as I see it. The AIP-47 design (following the AIP-4 idea) is targeted to handle those things: * streamline the execution of the System Tests (which are basically executable "example_dags" of ours which we already have in our documentation for providers) * make them integrated and automatically execute in a "cloud" environment - with real cloud services behind * make use of the 100s of System Tests already written in Google Provider (and many others which we have in in other providers - for example AWS should be close to follow) * test automation of those using Google Cloud Platform donated credits (from what I understand Google is willing to sponsor credits to run those tests regularly). * I am quite sure when it works also AWS will donate credits for our CI that we could use for that. * and we should be able to extend it to other providers as well once the "CI environment" is in place. When this one is implemented, we should be finally able not only to run "unit" tests with all PRs for a lot of provider code, but also we will be able to regularly (not with all PR but likely daily) run the example dags with the real services and make sure that at least big part of our Providers code (Hooks/Operators) are still "working as expected" - at least that they pass "smoke tests". Currently we rely mostly on the manual tests done when we release providers (and our community members are actually great here - as every month we have a great deal of help from within the community), but when we have system tests automation, the confidence of releasing new versions of providers will go up significantly. I am really looking forward to discussion on this one and implementation and deployment for a number of providers. j. On Tue, Jan 25, 2022 at 2:13 PM Mateusz Nojek <matno...@gmail.com<mailto:matno...@gmail.com>> wrote: Hello everyone, I created a new AIP draft titled AIP-47 New design of Airflow System Tests<https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-47+New+design+of+Airflow+System+Tests>. It is related to the new design of system tests that improves how they are written and run which should contribute to higher quality of tests and thus to lower quantity of bugs related to the operators. The AIP-47 is located under AIP-4<https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-4+Support+for+Automation+of+System+Tests+for+external+systems> as its direct successor (it more deeply describes some of the concepts). The basic target was to simplify tests execution and enable running them more regularly. I was working on it with Eugene Kostieiev and Bartlomiej Hirsz for some time and soon we are also going to publish a PR with changes based on the content of this AIP. For now, we would like to start a discussion with the developers and contributors of the community of Apache Airflow. The idea was already discussed with Jarek Potiuk, the original author of AIP-4, who cooperated with us on some of the design details. Please, tell me what you think about it and share your ideas, concerns and comments. Kind regards, Mateusz Nojek