Attila Sasvari created OOZIE-2882:
-------------------------------------

             Summary: Rerun workflow fails Error: E0404
                 Key: OOZIE-2882
                 URL: https://issues.apache.org/jira/browse/OOZIE-2882
             Project: Oozie
          Issue Type: Improvement
            Reporter: Attila Sasvari


Only one of the properties are allowed [oozie.wf.rerun.skip.nodes OR 
oozie.wf.rerun.failnodes]

Reproduction:
1. Create a workflow with more than one node, e.g. a fork with three parallel 
shell actions. Make sure one of them fails.
2. Rerun with 'oozie.wf.rerun.failnodes' set.
3. Rerun again with 'oozie.wf.rerun.skip.nodes' and check 'Skip all successful 
nodes'.
You will get the following error.
Error: E0404 : E0404: Only one of the properties are allowed 
[oozie.wf.rerun.skip.nodes OR oozie.wf.rerun.failnodes]

When a user reruns a workflow job with oozie.wf.rerun.failnodes=true and the 
job fails in a subsequent step, there is no option to resubmit the 
workflow with oozie.wf.rerun.skip.nodes=action1,action2 to allow resubmission 
from predecessor steps.

Currently, once a workflow fails and one of the rerun options is used to rerun 
the job, that option is merged into the job configuration, and there is no way 
to override it as with regular Oozie configuration properties or variables.
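
The failure mode described above can be modelled in a few lines. This is only 
an illustrative sketch, not Oozie source code: the function names, the 
RerunError class and the dict-based configuration are assumptions made for the 
example, and the check is assumed to trigger whenever both properties are 
present.

```python
# Hypothetical model of the behaviour described in this issue; not Oozie code.
SKIP_NODES = "oozie.wf.rerun.skip.nodes"
FAIL_NODES = "oozie.wf.rerun.failnodes"


class RerunError(Exception):
    """Stands in for the E0404 error reported by Oozie."""


def merge_rerun_conf(stored_conf, rerun_conf):
    """Merge the rerun properties over the configuration kept from the last run."""
    merged = dict(stored_conf)
    merged.update(rerun_conf)
    return merged


def check_rerun_conf(conf):
    """Reject a configuration that carries both rerun properties (assumed rule)."""
    if SKIP_NODES in conf and FAIL_NODES in conf:
        raise RerunError(
            "E0404: Only one of the properties are allowed "
            "[%s OR %s]" % (SKIP_NODES, FAIL_NODES)
        )
    return conf


# First rerun: only failnodes is set, so the check passes.
after_first_rerun = check_rerun_conf(merge_rerun_conf({}, {FAIL_NODES: "true"}))

# Second rerun: the user sends only skip.nodes, but failnodes is still stored,
# so the merged configuration contains both and check_rerun_conf would raise.
second = merge_rerun_conf(after_first_rerun, {SKIP_NODES: "action1,action2"})
```

In this model the user never sends both properties in one request; the 
conflict comes only from the property carried over by the merge.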

We have a few options:
1. If failnodes and skip.nodes are both specified (or one of them was carried 
over from a previous workflow run), use the union of {skip.nodes generated by 
discovering the nodes that did not fail} and {the given skip.nodes}
2. Add a way to remove properties (this is also potentially helpful for 
other use cases)
3. The "newest" property (oozie.wf.rerun.skip.nodes or 
oozie.wf.rerun.failnodes) takes priority and the older one is ignored
4. Make oozie.wf.rerun.skip.nodes or oozie.wf.rerun.failnodes somehow not 
persist in the DB
Part of this JIRA would be to figure out which is the best option.
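
To make options 1 and 3 more concrete, here is a rough sketch of how each 
could resolve the conflict. It is not a proposed patch: the function names, 
the action_status mapping and the set of statuses treated as failures are 
assumptions made for illustration only.

```python
# Hypothetical sketch of options 1 and 3; not Oozie code.
SKIP_NODES = "oozie.wf.rerun.skip.nodes"
FAIL_NODES = "oozie.wf.rerun.failnodes"

# Assumption: statuses that count as "failed" for the purpose of this sketch.
FAILED_STATUSES = {"ERROR", "FAILED", "KILLED"}


def option1_union(conf, action_status):
    """Option 1: fold failnodes into skip.nodes.

    action_status maps node name -> status string from the previous run.
    The result contains only skip.nodes: the explicitly listed nodes plus,
    if failnodes is true, every node that did not fail.
    """
    explicit = [n.strip() for n in conf.get(SKIP_NODES, "").split(",") if n.strip()]
    not_failed = []
    if conf.get(FAIL_NODES, "false").lower() == "true":
        not_failed = [n for n, s in action_status.items() if s not in FAILED_STATUSES]
    resolved = {k: v for k, v in conf.items() if k != FAIL_NODES}
    resolved[SKIP_NODES] = ",".join(sorted(set(explicit) | set(not_failed)))
    return resolved


def option3_newest_wins(stored_conf, rerun_conf):
    """Option 3: the property sent with the current rerun replaces the stored one."""
    merged = dict(stored_conf)
    merged.update(rerun_conf)
    for newest, older in ((SKIP_NODES, FAIL_NODES), (FAIL_NODES, SKIP_NODES)):
        # Drop the carried-over property only when the new request did not
        # send it as well; both in one request is still a genuine conflict.
        if newest in rerun_conf and older not in rerun_conf:
            merged.pop(older, None)
    return merged
```

With option 3, a stored failnodes=true followed by a rerun that sends only 
skip.nodes=action1,action2 would leave just skip.nodes in the merged 
configuration, so the E0404 check would no longer trigger.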



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
