[
https://issues.apache.org/jira/browse/MAPREDUCE-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101092#comment-13101092
]
Ravi Teja Ch N V commented on MAPREDUCE-2949:
---------------------------------------------
All the composite services are initialized (not started) during NodeManager
initialization. As part of this, the ResourceLocalizationService is
initialized, which starts the PublicLocalizer thread. This thread blocks,
waiting on the CompletionService queue.
After the services were initialized, the NodeStatusUpdaterImpl service failed
during service startup, so only the Deletion Service (which is started before
it) was stopped. Hence the PublicLocalizer thread is still running, which keeps
the NodeManager JVM alive even though no service is started.
{code:xml}
"Thread-11" prio=10 tid=0x70468800 nid=0x11ce waiting on condition [0x706fe000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0xa01f4a40> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
    at java.util.concurrent.ExecutorCompletionService.take(ExecutorCompletionService.java:164)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.run(ResourceLocalizationService.java:547)
{code}
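To illustrate the failure mode, here is a minimal, self-contained sketch. It is not the NodeManager code: LocalizerService, startAll and the thread name are hypothetical stand-ins. It assumes a service that starts a non-daemon worker thread during init, and shows one possible remedy, which is to stop that service as well when a later service fails to start. Whether the actual fix should take this shape is an open question.

```java
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class LocalizerShutdownSketch {

    /** Hypothetical stand-in for a service that starts a worker thread in init(). */
    static final class LocalizerService {
        private final ExecutorService pool = Executors.newSingleThreadExecutor();
        private final ExecutorCompletionService<String> completed =
            new ExecutorCompletionService<>(pool);
        private Thread worker;

        /** Starts a non-daemon thread that blocks in take(), as in the trace above. */
        void init() {
            worker = new Thread(() -> {
                try {
                    while (true) {
                        completed.take(); // parks until a submitted task completes
                    }
                } catch (InterruptedException e) {
                    // Interrupted by stop(): fall through and let the thread end.
                }
            }, "public-localizer-sketch");
            worker.start();
        }

        /** Interrupts the worker and shuts the pool down so the JVM can exit. */
        void stop() {
            if (worker == null) {
                return;
            }
            worker.interrupt();
            pool.shutdownNow();
            try {
                worker.join(5000); // wait up to 5 seconds for the worker to end
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        boolean workerAlive() {
            return worker != null && worker.isAlive();
        }
    }

    /**
     * Runs the start step of a later service. If it throws, the already
     * initialized localizer is stopped too, not only the started services.
     */
    static boolean startAll(LocalizerService localizer, Runnable laterServiceStart) {
        try {
            laterServiceStart.run();
            return true;
        } catch (RuntimeException e) {
            localizer.stop();
            return false;
        }
    }

    public static void main(String[] args) {
        LocalizerService localizer = new LocalizerService();
        localizer.init();
        boolean started = startAll(localizer, () -> {
            throw new IllegalStateException("simulated startup failure");
        });
        System.out.println("started=" + started
            + ", workerAlive=" + localizer.workerAlive());
    }
}
```

If the localizer.stop() call in startAll is removed, the worker stays parked in take() and, being a non-daemon thread, keeps the JVM from exiting after main returns, which matches the behaviour reported here.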
> NodeManager in a inconsistent state if a service startup fails.
> ---------------------------------------------------------------
>
> Key: MAPREDUCE-2949
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2949
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2, nodemanager
> Affects Versions: 0.24.0
> Reporter: Ravi Teja Ch N V
> Assignee: Ravi Teja Ch N V
>
> When a service startup fails at the NodeManager, the NodeManager JVM does not
> exit because the following threads are still running:
> Daemon Thread [Timer for 'NodeManager' metrics system] (Running)
> Thread [pool-1-thread-1] (Running)
> Thread [Thread-11] (Running)
> Thread [DestroyJavaVM] (Running).
> As a result, the NodeManager keeps running even though no services are
> started.