[ https://issues.apache.org/jira/browse/MAPREDUCE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208601#comment-13208601 ]
Jason Lowe commented on MAPREDUCE-3862:
---------------------------------------
DeletionService has the following code, which implies we don't want to wait
too long for the shutdown to complete:
{code}
public void stop() {
  sched.shutdown();
  try {
    sched.awaitTermination(10, SECONDS);
  } catch (InterruptedException e) {
    sched.shutdownNow();
  }
  super.stop();
}
{code}
However, the code never checks the return value of {{awaitTermination()}}, so
if the wait times out we continue the shutdown process with the thread pool
still active.
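
For illustration only, here is a minimal sketch of a stop routine that does
check the result and falls back to {{shutdownNow()}} on timeout. The class and
method names (BoundedShutdown, stop) are hypothetical and are not the actual
DeletionService code or the eventual patch; it simply follows the usual
java.util.concurrent shutdown pattern.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration; not part of Hadoop.
final class BoundedShutdown {
    /**
     * Shuts the pool down and waits up to the given timeout for it to
     * terminate. If the wait times out, queued tasks are dropped and
     * workers are interrupted, then we wait once more.
     * Returns true only if the pool has actually terminated.
     */
    static boolean stop(ScheduledThreadPoolExecutor sched, long timeout, TimeUnit unit) {
        sched.shutdown();
        try {
            if (sched.awaitTermination(timeout, unit)) {
                return true; // everything finished within the timeout
            }
            // Timed out: cancel what is still queued and interrupt workers.
            sched.shutdownNow();
            return sched.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            sched.shutdownNow();
            Thread.currentThread().interrupt(); // preserve interrupt status
            return sched.isTerminated();
        }
    }
}
```

A caller could log a warning when this returns false instead of silently
proceeding with the rest of the shutdown.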
> Nodemanager can appear to hang on shutdown due to lingering DeletionService
> threads
> -----------------------------------------------------------------------------------
>
> Key: MAPREDUCE-3862
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3862
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2, nodemanager
> Affects Versions: 0.23.1
> Reporter: Jason Lowe
>
> When a nodemanager attempts to shut down cleanly, it can appear to hang
> due to lingering DeletionService threads. This can occur when
> yarn.nodemanager.delete.debug-delay-sec is set to a relatively large value
> and one or more containers execute on the node shortly before the shutdown.
> The DeletionService never calls
> {{setExecuteExistingDelayedTasksAfterShutdownPolicy()}} on its
> ScheduledThreadPoolExecutor, which by default waits for all scheduled
> tasks to complete before exiting.
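
The default policy described in the issue can be observed with a small
standalone sketch. The class and method names (DelayedTaskPolicyDemo,
queuedAfterShutdown) are hypothetical; only the ScheduledThreadPoolExecutor
calls are real JDK API. It schedules one task an hour out, calls
{{shutdown()}}, and reports how many delayed tasks are still queued.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class for illustration; not part of Hadoop.
final class DelayedTaskPolicyDemo {
    /** Returns how many delayed tasks remain queued right after shutdown(). */
    static int queuedAfterShutdown(boolean keepDelayedTasks) {
        ScheduledThreadPoolExecutor sched = new ScheduledThreadPoolExecutor(1);
        // The JDK default is true: delayed tasks still run after shutdown().
        sched.setExecuteExistingDelayedTasksAfterShutdownPolicy(keepDelayedTasks);
        sched.schedule(() -> {}, 1, TimeUnit.HOURS); // stands in for a delayed deletion
        sched.shutdown();
        int remaining = sched.getQueue().size();
        sched.shutdownNow(); // clean up so the demo itself does not linger
        return remaining;
    }
}
```

With the policy left at its default the delayed task stays queued and keeps
the pool alive; with the policy set to false, {{shutdown()}} discards it.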