Janos Makai created OOZIE-3584:
----------------------------------

             Summary: Fork-join action issue when action param cannot be 
resolved
                 Key: OOZIE-3584
                 URL: https://issues.apache.org/jira/browse/OOZIE-3584
             Project: Oozie
          Issue Type: Bug
          Components: core
    Affects Versions: 5.1.0
            Reporter: Janos Makai
         Attachments: forkjoin_actionparam_issue.log

_Current bug_
*=========*

There is a sub workflow run in independent mode that runs a fork action which 
contains two (or more) actions.

These actions inside the fork action run in parallel mode, and they have some 
seconds delay in between them.
If a parameter is passed to one of these actions, that cannot be resolved, then 
it changes its status to FAILED, and also the workflow’s state to FAILED. The 
other action’s state which are not started yet will stuck in PREP state 
forever. The correct behaviour would be to KILL the remaining actions as well 
as the workflow.
Note: this bug only occurs when it is run in independent mode. If it has a 
parent workflow, then the parent workflow will kill this workflow after 10 
minutes because of the callback process.

 

 

_Log_
*===*

2020-02-14 11:59:26,698 ERROR org.apache.oozie.command.wf.SignalXCommand: 
SERVER[quasar-nqrrjp-4.quasar-nqrrjp.root.hwx.site] USER[admin] GROUP[-] 
TOKEN[] APP[Sub flow fork join] JOB[0000005-200214101441478-oozie-oozi-W] 
ACTION[0000005-200214101441478-oozie-oozi-W@fork-4a1c] Error running forked 
jobs parallely
org.apache.oozie.command.CommandException: E0718: Workflow already completed
 at org.apache.oozie.command.wf.ActionXCommand.failJob(ActionXCommand.java:213)
 at org.apache.oozie.command.wf.ActionXCommand.failJob(ActionXCommand.java:185)
 at 
org.apache.oozie.command.wf.SignalXCommand.startForkedActions(SignalXCommand.java:498)
 at org.apache.oozie.command.wf.SignalXCommand.execute(SignalXCommand.java:462)
 at org.apache.oozie.command.wf.SignalXCommand.execute(SignalXCommand.java:82)
 at org.apache.oozie.command.XCommand.call(XCommand.java:291)
 at 
org.apache.oozie.command.wf.ActionEndXCommand.execute(ActionEndXCommand.java:283)
 at 
org.apache.oozie.command.wf.ActionEndXCommand.execute(ActionEndXCommand.java:62)
 at org.apache.oozie.command.XCommand.call(XCommand.java:291)
 at 
org.apache.oozie.command.wf.ActionStartXCommand.callActionEnd(ActionStartXCommand.java:352)
 at 
org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:338)
 at 
org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:68)
 at org.apache.oozie.command.XCommand.call(XCommand.java:291)
 at 
org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:363)
 at 
org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:292)
 at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
 at 
org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:210)
 at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
 at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
 at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.apache.oozie.workflow.WorkflowException: E0718: Workflow already 
completed
 at 
org.apache.oozie.workflow.lite.LiteWorkflowInstance.fail(LiteWorkflowInstance.java:337)
 at org.apache.oozie.command.wf.ActionXCommand.failJob(ActionXCommand.java:201)
 ... 19 more


Full log added as attachment.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to