[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132576#comment-13132576
 ] 

Devaraj K commented on MAPREDUCE-3070:
--------------------------------------

Before the patch MAPREDUCE-2775, it adds the resources in schedulers when we 
create a node(RMNodeImpl). RMNode was getting creating before checking whether 
it present or not in this.rmContext.getRMNodes().

{code:title=RMNodeImpl.java|borderStyle=solid}


public RMNodeImpl(NodeId nodeId, RMContext context, String hostName,
      int cmPort, int httpPort, Node node, Resource capability) {
    .......
    .......
    context.getDispatcher().getEventHandler().handle(
        new NodeAddedSchedulerEvent(this));
  }


As part of the MAPREDUCE-2775 patch, this event handling will be removed from 
RMNodeImpl. Same will be done as part of AddNodeTransition which occurs when 
the STARTED event triggers. This event is triggering only when the node isn't 
present in this.rmContext.getRMNodes().


{code:title=ResourceTrackerService.java|borderStyle=solid}
       if (this.rmContext.getRMNodes().putIfAbsent(nodeId, rmNode) != null) {
         throw new IOException("Duplicate registration from the node!");
       }
+      
+      this.rmContext.getDispatcher().getEventHandler().handle(
+          new RMNodeEvent(nodeId, RMNodeEventType.STARTED));
 
{code}
                
> NM not able to register with RM after NM restart
> ------------------------------------------------
>
>                 Key: MAPREDUCE-3070
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3070
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, nodemanager
>    Affects Versions: 0.23.0
>            Reporter: Ravi Teja Ch N V
>            Assignee: Devaraj K
>            Priority: Blocker
>             Fix For: 0.23.0
>
>         Attachments: MAPREDUCE-3070.patch
>
>
> After stopping NM gracefully then starting NM, NM registration fails with RM 
> with Duplicate registration from the node! error.
> {noformat} 
> 2011-09-23 01:50:46,705 FATAL nodemanager.NodeManager 
> (NodeManager.java:main(204)) - Error starting NodeManager
> org.apache.hadoop.yarn.YarnException: Failed to Start 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager
>       at 
> org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:78)
>       at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeManager.java:153)
>       at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:202)
> Caused by: org.apache.avro.AvroRuntimeException: 
> org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: 
> Duplicate registration from the node!
>       at 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:141)
>       at 
> org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
>       ... 2 more
> Caused by: 
> org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: 
> Duplicate registration from the node!
>       at 
> org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:142)
>       at $Proxy13.registerNodeManager(Unknown Source)
>       at 
> org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:59)
>       at 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:175)
>       at 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:137)
>       ... 3 more
> {noformat} 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to