[GitHub] [airflow] yuqian90 commented on a change in pull request #4182: [AIRFLOW-3336] Add new TriggerRule that will consider skipped ancestors as success

2019-12-23 Thread GitBox
yuqian90 commented on a change in pull request #4182: [AIRFLOW-3336] Add new 
TriggerRule that will consider skipped ancestors as success 
URL: https://github.com/apache/airflow/pull/4182#discussion_r361069293
 
 

 ##
 File path: airflow/ti_deps/deps/trigger_rule_dep.py
 ##
 @@ -152,6 +152,11 @@ def _evaluate_trigger_rule(
 elif tr == TR.ONE_FAILED:
 if upstream_done and not (failed or upstream_failed):
 ti.set_state(State.SKIPPED, session)
+elif tr == TR.NONE_FAILED:
+if upstream_failed or failed:
+ti.set_state(State.UPSTREAM_FAILED, session)
+elif skipped == upstream:
 
 Review comment:
   Pls note that this can be a feature worth preserving, because if someone 
wants his task op2 to only run if at least some upstream tasks have succeeded 
(while others can be skipped), he can use "none_failed" to do that because 
that's exactly what this line does. 
   
   However, if he wants to have his task op2 succeed as long as none of the 
upstream tasks fail (either skipped for success), he can still use 
"none_failed" for it. The only workaround he needs to add is to introduce a 
DummyOperator that always succeeds directly upstream of op2.
   
   So we can keep this feature, but we should make it clear in the doc that if 
all the upstream tasks are skipped, a task marked with "none_failed" will be 
skipped. Pls let me know. i don't mind creating a PR to update the doc.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] yuqian90 commented on a change in pull request #4182: [AIRFLOW-3336] Add new TriggerRule that will consider skipped ancestors as success

2019-12-23 Thread GitBox
yuqian90 commented on a change in pull request #4182: [AIRFLOW-3336] Add new 
TriggerRule that will consider skipped ancestors as success 
URL: https://github.com/apache/airflow/pull/4182#discussion_r361067775
 
 

 ##
 File path: airflow/ti_deps/deps/trigger_rule_dep.py
 ##
 @@ -152,6 +152,11 @@ def _evaluate_trigger_rule(
 elif tr == TR.ONE_FAILED:
 if upstream_done and not (failed or upstream_failed):
 ti.set_state(State.SKIPPED, session)
+elif tr == TR.NONE_FAILED:
+if upstream_failed or failed:
+ti.set_state(State.UPSTREAM_FAILED, session)
+elif skipped == upstream:
 
 Review comment:
   @rmn36 @kaxil @ashb @Fokko 
   
   I'm trying to understand the rationale behind this line. E.g. for a dag that 
looks like this, with just op1 >> op2. When we run the DAG, both op1 and op2 
become skipped. The reason is op1 raises `AirflowSkipException`, so op1 is 
skipped. op2 is set to `none_failed`. But op2 is also skipped because this line 
checks `skipped == upstream` and marks op2 as skipped.
   
   However, the doc for "Trigger Rule" says:
   ```none_failed: all parents have not failed (failed or upstream_failed) i.e. 
all parents have succeeded or been skipped```
   
   So if I interpret this correctly, when all of op2's upstream tasks are 
skipped, none of them have failed. So op2 should be success rather than skip. 
   
   This may be a feature or a bug. If it's a feature, we need to update the 
doc. Or if it's a bug we should fix this line.
   
   What do you think?
   
   ```
   with DAG("skip_exception", schedule_interval=None, 
start_date=pendulum.parse("20190101")) as dag:
   def raise_skip():
   raise AirflowSkipException
   op1 = PythonOperator(task_id="op1", python_callable=raise_skip)
   op2 = DummyOperator(task_id="op2", trigger_rule="none_failed")
   op1 >> op2
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services