[ https://issues.apache.org/jira/browse/YARN-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181334#comment-15181334 ]
Sidharta Seethana commented on YARN-4762:
-----------------------------------------

Thanks for the review, [~vinodkv]. Yes, I admit that some of this layering is confusing (and is a WIP) - this is because of having to maintain backward compatibility with the behavior of the existing (cgroups) resource handler. I'll make the changes you suggested in the patch. The typos were introduced in a different patch, but I'll fix them nonetheless.

> NMs failing on DelegatingLinuxContainerRuntime init with LCE on
> ---------------------------------------------------------------
>
>                 Key: YARN-4762
>                 URL: https://issues.apache.org/jira/browse/YARN-4762
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Sidharta Seethana
>            Priority: Blocker
>         Attachments: YARN-4762.001.patch
>
>
> Seeing this exception and the NMs crash.
> {code}
> 2016-03-03 16:47:57,807 DEBUG org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService is started
> 2016-03-03 16:47:58,027 DEBUG org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: checkLinuxExecutorSetup: [/hadoop/hadoop-yarn-nodemanager/bin/container-executor, --checksetup]
> 2016-03-03 16:47:58,043 ERROR org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl: Mount point Based on mtab file: /proc/mounts. Controller mount point not writable for: cpu
> 2016-03-03 16:47:58,043 ERROR org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime: Unable to get cgroups handle.
> 2016-03-03 16:47:58,044 DEBUG org.apache.hadoop.service.AbstractService: noteFailure org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
> 2016-03-03 16:47:58,044 INFO org.apache.hadoop.service.AbstractService: Service NodeManager failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:240)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:539)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:587)
> Caused by: java.io.IOException: Failed to initialize linux container runtime(s)!
>     at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:207)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:238)
>     ... 3 more
> 2016-03-03 16:47:58,047 DEBUG org.apache.hadoop.service.AbstractService: Service: NodeManager entered state STOPPED
> 2016-03-03 16:47:58,047 DEBUG org.apache.hadoop.service.CompositeService: NodeManager: stopping services, size=0
> 2016-03-03 16:47:58,047 DEBUG org.apache.hadoop.service.AbstractService: Service: org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService entered state STOPPED
> 2016-03-03 16:47:58,047 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:240)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:539)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:587)
> Caused by: java.io.IOException: Failed to initialize linux container runtime(s)!
>     at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:207)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:238)
>     ... 3 more
> {code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)