[ https://issues.apache.org/jira/browse/MAPREDUCE-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132576#comment-13132576 ]
Devaraj K commented on MAPREDUCE-3070:
--------------------------------------

Before the MAPREDUCE-2775 patch, a node's resources were added to the scheduler as soon as the node (RMNodeImpl) was created, because the constructor itself dispatched the NodeAddedSchedulerEvent. The RMNode was being created before checking whether it was already present in this.rmContext.getRMNodes().

{code:title=RMNodeImpl.java|borderStyle=solid}
  public RMNodeImpl(NodeId nodeId, RMContext context, String hostName,
      int cmPort, int httpPort, Node node, Resource capability) {
    .......
    .......
    context.getDispatcher().getEventHandler().handle(
        new NodeAddedSchedulerEvent(this));
  }
{code}

As part of the MAPREDUCE-2775 patch, this event handling will be removed from RMNodeImpl. The same work will be done in AddNodeTransition, which runs when the STARTED event is triggered. That event is triggered only when the node is not already present in this.rmContext.getRMNodes().

{code:title=ResourceTrackerService.java|borderStyle=solid}
    if (this.rmContext.getRMNodes().putIfAbsent(nodeId, rmNode) != null) {
      throw new IOException("Duplicate registration from the node!");
    }
+
+    this.rmContext.getDispatcher().getEventHandler().handle(
+        new RMNodeEvent(nodeId, RMNodeEventType.STARTED));
{code}

> NM not able to register with RM after NM restart
> ------------------------------------------------
>
>                 Key: MAPREDUCE-3070
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3070
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, nodemanager
>    Affects Versions: 0.23.0
>            Reporter: Ravi Teja Ch N V
>            Assignee: Devaraj K
>            Priority: Blocker
>             Fix For: 0.23.0
>
>         Attachments: MAPREDUCE-3070.patch
>
>
> After stopping NM gracefully then starting NM, NM registration fails with RM
> with Duplicate registration from the node! error.
> {noformat}
> 2011-09-23 01:50:46,705 FATAL nodemanager.NodeManager (NodeManager.java:main(204)) - Error starting NodeManager
> org.apache.hadoop.yarn.YarnException: Failed to Start org.apache.hadoop.yarn.server.nodemanager.NodeManager
>         at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:78)
>         at org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeManager.java:153)
>         at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:202)
> Caused by: org.apache.avro.AvroRuntimeException: org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: Duplicate registration from the node!
>         at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:141)
>         at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
>         ... 2 more
> Caused by: org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: Duplicate registration from the node!
>         at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:142)
>         at $Proxy13.registerNodeManager(Unknown Source)
>         at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:59)
>         at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:175)
>         at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:137)
>         ... 3 more
> {noformat}
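
For reference, the registration guard discussed in the comment above can be sketched in isolation. This is only a minimal illustration, not the ResourceManager code: RegistrationSketch, its String-keyed map, and the events list are hypothetical stand-ins for rmContext.getRMNodes() and the dispatcher. It shows the one property the comment relies on, namely that putIfAbsent returns a non-null value for an already-registered id, so the STARTED event is sent only on the first registration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical stand-in for the RM-side node registry; not the YARN classes. */
class RegistrationSketch {
  // Plays the role of rmContext.getRMNodes(): node id -> node.
  private final ConcurrentMap<String, String> nodes = new ConcurrentHashMap<>();

  // Plays the role of the dispatcher: records the events that would be sent.
  final List<String> events = new ArrayList<>();

  void register(String nodeId, String node) throws IOException {
    // putIfAbsent returns the existing value, or null if the key was absent.
    if (nodes.putIfAbsent(nodeId, node) != null) {
      throw new IOException("Duplicate registration from the node!");
    }
    // Reached only when the node was not already present.
    events.add("STARTED:" + nodeId);
  }

  public static void main(String[] args) {
    RegistrationSketch rm = new RegistrationSketch();
    try {
      rm.register("host-1", "first instance");
      // Same id again, as after an NM restart while the old entry remains.
      rm.register("host-1", "restarted instance");
    } catch (IOException e) {
      System.out.println(e.getMessage());
    }
    System.out.println(rm.events);
  }
}
```

In this sketch the second register call with the same id throws and dispatches nothing further, which matches the "Duplicate registration from the node!" error in the trace above when an entry for the node is still present.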