>
>
> I think both these sticking points are really a trade-off of simplicity vs
> consistency/reliability. And to be clear, I'm not arguing for things to be
> more complex just for the heck of it - I agree that simplicity is great! But
> there needs to be a balance, and we can't get caught over-indexing on one or
> the other. I think the combination of test environments being a free-for-all
> and tests being simply a set of guidelines with some static analysis will
> prove brittle. The example Mateusz just described, about needing a watcher
> task to ensure tests end with the right result, is a great example of how
> kludging the example DAGs to be both the test *and* the test runner can be
> brittle and complicated. And again, I love the idea of the example DAGs
> being the code under test; I just think having them also conduct the test
> execution of themselves is going to be troublesome.
>

I think we should try it and see. I do agree the "watcher" case is not
super-straightforward - and partially that comes from a lack of features in
Airflow DAG processing. Maybe that also means we SHOULD add a new feature
where we can specify a task in a DAG that always runs when the DAG ends and
determines the status of all the tasks?

Currently the idea we have (my proposal) is that all such "overhead" code
in the example DAGs MIGHT be automatically added by pre-commit. So the
example DAGs might have an "auto-generated" part where such a watcher (if
present in the DAG) automatically gets the right dependencies on all the
tasks. But maybe we could actually add such a feature to Airflow itself. It
does not seem complex. We could have a simple new operator on the DAG that
adds such a "completion" task?

I can imagine for example this:

dag >>= watcher

Which would automatically add all the tasks currently defined in the DAG as
dependencies of the watcher :)
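
Under the hood it could be something as simple as this (purely a
hypothetical sketch - this method does not exist on the DAG class today,
and the name and details are only there to show the idea):

# hypothetical DAG.__irshift__ implementing "dag >>= completion_task"
def __irshift__(self, completion_task):
    for existing_task in list(self.tasks):
        # wire every task already in the DAG upstream of the completion task
        if existing_task is not completion_task:
            existing_task.set_downstream(completion_task)
    return self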

WDYT? That sounds like an easy task, and it would also be usable in a number
of cases beyond just System Testing?

J.
