[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209113#comment-13209113
 ] 

Jason Lowe commented on MAPREDUCE-3730:
---------------------------------------

Thanks Vinod for taking another look.  The race condition mentioned above isn't 
specific to the new node approach, as it's still present when modifying the 
original RMNode.  A node could rejoin just as the tracker service expires it, 
and we could mark the node as lost (new or original node object, doesn't 
matter) just as it starts to heartbeat in again.
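
To make the interleaving concrete, here is a minimal sketch in Java. It is a toy model under stated assumptions, not the ResourceManager code: the `Tracker` and `Node` classes, the `simulate` method, the node ID "host1:1234", and the 600000 ms expiry value are all hypothetical. It only illustrates that an expiry decision made before a re-registration, and applied after it, marks the node lost whether registration creates a new node object or resets the original one.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: these names are hypothetical, not the ResourceManager's classes.
class RejoinRaceSketch {
    enum State { RUNNING, LOST }

    static final class Node {
        State state = State.RUNNING;
        long lastSeenMs;
        Node(long nowMs) { lastSeenMs = nowMs; }
    }

    static final class Tracker {
        final Map<String, Node> nodes = new HashMap<>();
        final long expiryMs;
        final boolean useNewObject;

        Tracker(long expiryMs, boolean useNewObject) {
            this.expiryMs = expiryMs;
            this.useNewObject = useNewObject;
        }

        // Re-registration either swaps in a fresh object or resets the original.
        void register(String id, long nowMs) {
            Node existing = nodes.get(id);
            if (existing == null || useNewObject) {
                nodes.put(id, new Node(nowMs));
            } else {
                existing.state = State.RUNNING;
                existing.lastSeenMs = nowMs;
            }
        }

        // The expiry monitor's check: has the node been silent too long?
        boolean looksExpired(String id, long nowMs) {
            Node n = nodes.get(id);
            return n != null && nowMs - n.lastSeenMs >= expiryMs;
        }

        // Applies the monitor's decision by node ID, whatever object is current.
        void markLost(String id) {
            Node n = nodes.get(id);
            if (n != null) {
                n.state = State.LOST;
            }
        }
    }

    // Replays the problematic ordering deterministically on a single thread.
    static State simulate(boolean useNewObject) {
        Tracker tracker = new Tracker(600_000L, useNewObject);
        String id = "host1:1234";
        tracker.register(id, 0L);
        boolean expired = tracker.looksExpired(id, 600_000L); // monitor decides
        tracker.register(id, 600_001L);                       // node rejoins
        if (expired) {
            tracker.markLost(id);                             // stale decision lands
        }
        return tracker.nodes.get(id).state;
    }

    public static void main(String[] args) {
        System.out.println("new object:      " + simulate(true));
        System.out.println("original object: " + simulate(false));
    }
}
```

In this model both calls end with the rejoined node in the LOST state, because the stale decision is applied by node ID after the registration has already happened.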

I still think using a new node object is easier to implement, understand, and 
maintain since it more directly mirrors the existing behavior of a node that 
eventually expires and starts again.  However, I'll defer to your judgement and 
update the patch accordingly.

Note that using ephemeral ports can be problematic, as it opens the door to 
inadvertent duplicate NM registration and the problems described in 
MAPREDUCE-3363.  I believe all of our clusters are currently configured to not 
use them.  We should reconsider the ephemeral port default once this goes in, 
since it was only changed to work around the NM rejoin issue.
                
> Allow restarted NM to rejoin cluster before RM expires it
> ---------------------------------------------------------
>
>                 Key: MAPREDUCE-3730
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3730
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2, resourcemanager
>    Affects Versions: 0.23.1, 0.24.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Minor
>             Fix For: 0.23.2
>
>         Attachments: MAPREDUCE-3730.patch, MAPREDUCE-3730.patch
>
>
> When a node in the RUNNING state (healthy or unhealthy) is rebooted, the 
> resourcemanager rejects the nodemanager's registration request as a duplicate 
> because it is convinced that the nodemanager is already running on that node. 
>  It won't allow that node to rejoin the cluster until the node expiration 
> time elapses, which is 10min+ by default.  We should allow the NM to rejoin 
> the cluster if it re-registers within the expiration timeout.
> Note that this problem occurs with NMs that are configured to specific ports. 
>  If ephemeral ports are used then a NM reboot "works" because the RM thinks 
> the NM registration is for a new node.  See the discussions in MAPREDUCE-3070 
> and MAPREDUCE-3363.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
