Robert Kanter created OOZIE-1205:
------------------------------------

             Summary: If the JobTracker is restarted during a Fork, Oozie 
doesn't fail all of the currently running actions
                 Key: OOZIE-1205
                 URL: https://issues.apache.org/jira/browse/OOZIE-1205
             Project: Oozie
          Issue Type: Bug
          Components: action
    Affects Versions: trunk
            Reporter: Robert Kanter
            Assignee: Robert Kanter
             Fix For: trunk


If you have a workflow with a fork and restart the JobTracker while it's 
executing the paths in the fork, the two running jobs will be lost (as 
expected).  Once the timeout occurs on the {{ActionCheckXCommand}}, it will 
check both actions sequentially.  While checking the first action, it sets 
that action's status to FAILED and also sets the workflow's status to FAILED.  
It then moves on to the other action that was running concurrently, but it 
cannot pass the precondition check because the workflow is already FAILED (the 
check requires that the workflow is RUNNING).  It will retry this every time 
the timeout hits (the default is 10 minutes) and print a WARN message to the 
log each time.  As a result, that action stays in the RUNNING state forever 
even though the underlying job isn't running and the workflow is FAILED.  
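
The sequence above can be sketched with a small stand-alone Java model. This is 
not Oozie's code: {{ForkCheckSketch}}, {{Workflow}}, {{Action}}, and 
{{checkAction}} are hypothetical names made up for illustration, and the 
precondition is reduced to the single "workflow must be RUNNING" test described 
above.

```java
public class ForkCheckSketch {
    enum Status { RUNNING, FAILED }

    // Hypothetical, simplified stand-in for a workflow job.
    static class Workflow {
        Status status = Status.RUNNING;
    }

    // Hypothetical, simplified stand-in for a workflow action.
    static class Action {
        final String name;
        Status status = Status.RUNNING;
        Action(String name) { this.name = name; }
    }

    /**
     * Models one check of one action. Returns false when the precondition
     * (the workflow must be RUNNING) rejects the check, in which case the
     * action is left untouched.
     */
    static boolean checkAction(Workflow wf, Action action, boolean jobLost) {
        if (wf.status != Status.RUNNING) {
            return false; // precondition fails; action status is not updated
        }
        if (jobLost) {
            action.status = Status.FAILED;
            wf.status = Status.FAILED; // failing the action also fails the workflow
        }
        return true;
    }

    public static void main(String[] args) {
        Workflow wf = new Workflow();
        Action first = new Action("fork-path-1");
        Action second = new Action("fork-path-2");

        // JobTracker restart: both underlying jobs are lost.
        boolean firstChecked = checkAction(wf, first, true);
        boolean secondChecked = checkAction(wf, second, true);

        System.out.println("first checked=" + firstChecked + " status=" + first.status);
        System.out.println("second checked=" + secondChecked + " status=" + second.status);
        System.out.println("workflow status=" + wf.status);
    }
}
```

In this model the second action is never updated, because the first check has 
already moved the workflow out of RUNNING; repeating the check on later 
timeouts gives the same result.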

