That one looks good to me.
On Tue, Feb 15, 2022 at 10:51 AM Mateusz Nojek <matno...@gmail.com> wrote:
>
> Hello again,
>
> I agree with your (Niko, Giorgio) points that the environment needs to be
> set up as easily as possible, but I will leave that to the providers, as
> mentioned before. It is also in their interest to simplify it.
>
> Regarding the "watcher" task: it has been discussed in another topic on the
> devlist (https://lists.apache.org/thread/qkbwb40tqd3gpzny5xvsvsf93r3slm2z),
> and the conclusion is that we can make use of the "tasks" attribute of the
> DAG object, like this:
>
> list(dag.tasks) >> watcher
>
> No new features are needed, and by using the list conversion we protect it
> from further changes to the dag.tasks field.
> I think this is a very clever and elegant way to create a dependency
> between the watcher and all other tasks (thanks, Ash!).
>
> Also, I've just updated the AIP with the changes proposed during this
> discussion:
> - Added point 3 in "What change do you propose to make"
> - Added point 7 in "Design details"
> - Updated the "Complete test example" section
> - Added a new section, "Watcher task", right under "Tasks as setup and
>   teardown"
>
> Please familiarize yourselves with the changes, and if there are no other
> strong opinions against this AIP, I would like to start the voting. Is
> that OK?
>
> Kind regards,
> Mateusz Nojek
>
>
> On Thu, Feb 10, 2022 at 11:27 Jarek Potiuk <ja...@potiuk.com> wrote:
>>>
>>> I think both these sticking points are really a trade-off of simplicity
>>> vs. consistency/reliability. To be clear, I'm not arguing for things to
>>> be more complex just for the heck of it; I agree that simplicity is
>>> great! But there needs to be a balance, and we can't get caught
>>> over-indexing on one or the other. I think the combination of test
>>> environments being a free-for-all and tests being simply a set of
>>> guidelines with some static analysis will combine to be brittle.
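[Editor's note: the `list(dag.tasks) >> watcher` pattern quoted above can be sketched in plain Python. This is NOT the real Airflow API; the `Task` class below is a hypothetical stand-in that only mimics how `>>` wires every existing task upstream of the watcher.]

```python
# Toy model of Airflow's `>>` dependency operator, to illustrate why
# `list(dag.tasks) >> watcher` makes the watcher depend on every task.
# Hypothetical stand-in class -- not Airflow's actual implementation.

class Task:
    def __init__(self, task_id):
        self.task_id = task_id
        self.upstream = set()  # ids of tasks this task depends on

    def __rshift__(self, other):
        # `self >> other`: `other` runs after `self`.
        other.upstream.add(self.task_id)
        return other

    def __rrshift__(self, others):
        # `[t1, t2, ...] >> self`: plain lists have no __rshift__, so
        # Python falls back to this reflected operator on the task.
        for task in others:
            task >> self
        return self


tasks = [Task("create_resource"), Task("run_test"), Task("delete_resource")]
watcher = Task("watcher")

# The pattern from the thread: every task becomes upstream of the watcher.
list(tasks) >> watcher
# watcher.upstream is now {"create_resource", "run_test", "delete_resource"}
```

Because the list is converted with `list(...)` before wiring, later changes to the underlying collection cannot affect the already-created dependencies, which is the protection Mateusz mentions.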
>>> The example Mateusz just described, regarding needing a watcher task to
>>> ensure tests end with the right result, is a great example of how the
>>> route of kludging the example DAGs themselves to be both the test and the
>>> test runner can be brittle and complicated. Again, I love the idea of the
>>> example DAGs being the code under test; I just think having them also
>>> conduct the test execution of themselves is going to be troublesome.
>>
>>
>> I think we should try it and see. I do agree the "watcher" case is not
>> entirely straightforward, and partially that comes from a lack of
>> features in Airflow DAG processing. Maybe that also means we SHOULD add a
>> new feature where we can specify a task in a DAG that always runs when
>> the DAG ends and determines the status of all tasks?
>>
>> Currently the idea we have (my proposal) is that all such "overhead" code
>> in the example DAGs MIGHT be automatically added by pre-commit. So the
>> "example" DAGs might have an "auto-generated" part where such a watcher
>> (if present in the DAG) will automatically get the right dependencies on
>> all the tasks. But maybe we could actually add such a feature to Airflow.
>> It does not seem complex. We could even have a simple new operator on the
>> DAG to add such a "completion" task?
>>
>> I can imagine, for example, this:
>>
>> dag >>= watcher
>>
>> which could automatically add all tasks currently defined in the DAG as
>> dependencies of the watcher :)
>>
>> WDYT? That sounds like an easy task, and also usable in a number of cases
>> besides System Testing?
>>
>> J.
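[Editor's note: the `dag >>= watcher` feature Jarek proposes does not exist in Airflow; the sketch below shows one way Python's in-place `>>=` operator could support it. Both `Dag` and `Task` here are hypothetical toy classes, not Airflow's.]

```python
# Sketch of the proposed `dag >>= watcher` semantics using __irshift__.
# Hypothetical classes only -- illustrating the mechanism, not Airflow itself.

class Task:
    def __init__(self, task_id):
        self.task_id = task_id
        self.upstream = set()  # ids of tasks this task depends on


class Dag:
    def __init__(self):
        self.tasks = []

    def add(self, task):
        self.tasks.append(task)
        return task

    def __irshift__(self, watcher):
        # `dag >>= watcher`: make every task already in the DAG an upstream
        # dependency of the watcher, then register the watcher itself.
        for task in self.tasks:
            watcher.upstream.add(task.task_id)
        self.add(watcher)
        return self  # `>>=` rebinds the name, so return the same DAG


dag = Dag()
dag.add(Task("setup"))
dag.add(Task("run_test"))

watcher = Task("watcher")
dag >>= watcher
# watcher.upstream is now {"setup", "run_test"} and watcher is in dag.tasks
```

One design point this surfaces: `>>=` wires only the tasks defined *before* the statement runs, so it would have to be the last line of the DAG file, exactly as `list(dag.tasks) >> watcher` must be today.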