Robert Kanter created OOZIE-1205:
------------------------------------
Summary: If the JobTracker is restarted during a Fork, Oozie
doesn't fail all of the currently running actions
Key: OOZIE-1205
URL: https://issues.apache.org/jira/browse/OOZIE-1205
Project: Oozie
Issue Type: Bug
Components: action
Affects Versions: trunk
Reporter: Robert Kanter
Assignee: Robert Kanter
Fix For: trunk
If you have a workflow with a fork and restart the JobTracker while it's
executing the paths in the fork, the two jobs on those paths will be lost (as
expected).
Once the timeout occurs on the {{ActionCheckXCommand}}, it will check both
actions sequentially. While checking the first action, it sets that action's
status to FAILED and also sets the workflow's status to FAILED. It then moves on
to the other action that was running concurrently, but that action cannot pass
the precondition check because the workflow is already FAILED (the check
requires that the workflow is RUNNING). It will retry this every time the
timeout hits (the default is 10 minutes) and print a WARN message in the log
each time. As a result, the second action stays in the RUNNING state forever,
even though the underlying job isn't running and the workflow is FAILED.
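
The sequence above can be reproduced with a small stand-alone simulation. This is only a sketch of the described behavior, not Oozie code: every name in it ({{ForkCheckSketch}}, {{Workflow}}, {{checkAction}}, {{runCheckCycle}}, the action names {{pathA}} and {{pathB}}, and the log text) is hypothetical, and it assumes that a lost job is always marked FAILED when its check gets past the precondition.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation of the reported behavior; none of these types are Oozie's real classes.
class ForkCheckSketch {
    enum Status { RUNNING, FAILED }

    static final class Workflow {
        Status status = Status.RUNNING;
        final Map<String, Status> actions = new LinkedHashMap<>();
        final List<String> log = new ArrayList<>();
    }

    // Stand-in for a single action check: the precondition requires a RUNNING workflow.
    static void checkAction(Workflow wf, String action) {
        if (wf.status != Status.RUNNING) {
            // Precondition fails, so the action's own status is never updated.
            wf.log.add("WARN precondition failed for " + action);
            return;
        }
        // Assumption: the underlying job was lost in the JobTracker restart.
        wf.actions.put(action, Status.FAILED);
        wf.status = Status.FAILED;
    }

    // One timeout cycle: check every action that is still RUNNING, one after another.
    static Workflow runCheckCycle(Workflow wf) {
        for (String action : new ArrayList<>(wf.actions.keySet())) {
            if (wf.actions.get(action) == Status.RUNNING) {
                checkAction(wf, action);
            }
        }
        return wf;
    }

    // A workflow with two concurrently running fork paths.
    static Workflow newForkedWorkflow() {
        Workflow wf = new Workflow();
        wf.actions.put("pathA", Status.RUNNING);
        wf.actions.put("pathB", Status.RUNNING);
        return wf;
    }

    public static void main(String[] args) {
        Workflow wf = newForkedWorkflow();
        runCheckCycle(wf); // first timeout
        runCheckCycle(wf); // second timeout
        System.out.println("workflow=" + wf.status + " actions=" + wf.actions);
        wf.log.forEach(System.out::println);
    }
}
```

In this sketch, the first cycle fails {{pathA}} and the workflow, and every later cycle only adds another WARN entry for {{pathB}}, which never leaves RUNNING.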