[
https://issues.apache.org/jira/browse/OOZIE-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17458913#comment-17458913
]
Junfan Zhang edited comment on OOZIE-3646 at 12/14/21, 6:26 AM:
----------------------------------------------------------------
Thanks [~dionusos].
The test case is now attached ([link|https://github.com/apache/oozie/pull/65]),
and the stack trace of the stuck thread is attached to the ticket.
If you run the {{testPossibleDeadLock}} method, it fails.
But if you set
{{ConfigurationService.setBoolean(SignalXCommand.FORK_PARALLEL_JOBSUBMISSION,
false);}}, everything works.
The cause is the synchronous {{invokeAll}} call in {{SignalXCommand}}:
{code:java}
List<Future<ActionExecutorContext>> futures =
        Services.get().get(CallableQueueService.class)
                .invokeAll(tasks);
{code}
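For readers without the Oozie test setup, the effect of a synchronous nested call can be reproduced with a plain {{java.util.concurrent}} pool. The sketch below is not Oozie code. The class and method names are hypothetical, a fixed-size {{ExecutorService}} stands in for {{CallableQueueService}}, and a {{CountDownLatch}} forces the state where every outer task is already running before the nested {{invokeAll}} is made. A timeout is used so the program ends instead of hanging.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-alone demo; it does not use any Oozie classes.
class NestedInvokeAllDemo {

    /** Returns true if every outer task finished before the timeout expired. */
    static boolean runOuterTasks(int poolSize, int outerTasks, long timeoutMs)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch allStarted = new CountDownLatch(outerTasks);
        try {
            List<Callable<Integer>> outer = new ArrayList<>();
            for (int i = 0; i < outerTasks; i++) {
                outer.add(() -> {
                    // Wait until every outer task occupies a pool thread.
                    allStarted.countDown();
                    allStarted.await();
                    List<Callable<Integer>> inner = new ArrayList<>();
                    inner.add(() -> 1);
                    inner.add(() -> 2);
                    int sum = 0;
                    // Blocking call on the SAME pool: needs free threads to make progress.
                    for (Future<Integer> f : pool.invokeAll(inner)) {
                        sum += f.get();
                    }
                    return sum;
                });
            }
            // The timed invokeAll cancels whatever has not finished by the deadline.
            for (Future<Integer> f : pool.invokeAll(outer, timeoutMs, TimeUnit.MILLISECONDS)) {
                if (f.isCancelled()) {
                    return false;
                }
            }
            return true;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Pool fully occupied by outer tasks: the nested calls cannot run.
        System.out.println("pool=2, outer=2 completed: " + runOuterTasks(2, 2, 500));
        // Spare threads available: the nested calls finish.
        System.out.println("pool=4, outer=2 completed: " + runOuterTasks(4, 2, 5000));
    }
}
```

With as many outer tasks as pool threads, the inner tasks sit in the queue and nothing is left to run them, which is the same shape as the stuck state described for {{SignalXCommand}}.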
Please take a look and let me know what you think, [~dionusos].
was (Author: zuston):
Thanks [~dionusos].
The test case is now attached: https://github.com/apache/oozie/pull/65
The stack trace of the stuck thread is attached to the ticket.
If you run the {{testPossibleDeadLock}} method, it fails.
But if you set
{{ConfigurationService.setBoolean(SignalXCommand.FORK_PARALLEL_JOBSUBMISSION,
false);}}, everything works.
The cause is the synchronous {{invokeAll}} call in {{SignalXCommand}}:
{code:java}
List<Future<ActionExecutorContext>> futures =
        Services.get().get(CallableQueueService.class)
                .invokeAll(tasks);
{code}
Please take a look and let me know what you think, [~dionusos].
> Possible dead-lock in SignalXCommand
> ------------------------------------
>
> Key: OOZIE-3646
> URL: https://issues.apache.org/jira/browse/OOZIE-3646
> Project: Oozie
> Issue Type: Bug
> Reporter: Junfan Zhang
> Priority: Major
> Attachments: Screen Shot 2021-12-14 at 2.24.10 PM.png
>
>
> The limited thread execution mechanism aims to prevent the deadlock that occurs
> when all active threads are executing the invokeAll call in SignalXCommand.
> h2. When the deadlock happens
> Assume that the Oozie CallableQueue thread pool size is 120. When all threads
> are executing the {{SignalXCommand.startForkedActions}} method, a deadlock
> occurs.
> This is because, in {{SignalXCommand.startForkedActions}}, the call
> {{List<Future<ActionExecutorContext>> futures =
> Services.get().get(CallableQueueService.class)
> .invokeAll(tasks);}} executes synchronously: each caller blocks until its
> tasks finish, but all callableQueue threads are already busy, so no thread
> is left to run those tasks.
> h2. Solution
> 1. Avoid calling invokeAll directly when the number of remaining idle threads
> is less than the number of tasks.
> 2. To obtain the correct number of active threads in callableQueue, the
> SignalXCommand.class lock is needed.
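The two points of the proposed solution could look roughly like the sketch below. It is an illustration under assumptions, not the patch from the pull request: {{GuardedInvoker}} and {{invokeAllGuarded}} are hypothetical names, a plain {{ThreadPoolExecutor}} stands in for the callableQueue, a private lock object stands in for the {{SignalXCommand.class}} lock, and running the tasks in the calling thread when there are too few idle threads is an assumed fallback that the description does not specify.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; not part of Oozie.
class GuardedInvoker {
    // Stand-in for the SignalXCommand.class lock mentioned in the description.
    private static final Object LOCK = new Object();

    static <T> List<T> invokeAllGuarded(ThreadPoolExecutor pool, List<Callable<T>> tasks)
            throws Exception {
        List<Future<T>> futures = null;
        synchronized (LOCK) {
            // getActiveCount() is documented as approximate; checking and submitting
            // under one lock keeps concurrent callers from deciding on the same reading.
            int idle = pool.getMaximumPoolSize() - pool.getActiveCount();
            if (idle >= tasks.size()) {
                futures = new ArrayList<>();
                for (Callable<T> task : tasks) {
                    futures.add(pool.submit(task)); // non-blocking submit
                }
            }
        }
        List<T> results = new ArrayList<>();
        if (futures != null) {
            // Enough idle threads: wait for the pool to finish the tasks.
            for (Future<T> f : futures) {
                results.add(f.get());
            }
        } else {
            // Assumed fallback: run in the calling thread so no pool thread is awaited.
            for (Callable<T> task : tasks) {
                results.add(task.call());
            }
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        try {
            Callable<String> whoAmI = () -> Thread.currentThread().getName();
            // Two tasks fit into two idle threads, three tasks do not.
            System.out.println(invokeAllGuarded(pool, Arrays.asList(whoAmI, whoAmI)));
            System.out.println(invokeAllGuarded(pool, Arrays.asList(whoAmI, whoAmI, whoAmI)));
        } finally {
            pool.shutdown();
        }
    }
}
```

The lock is released before waiting on the futures, so a blocked caller does not hold up other callers that only need to read the thread count.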