[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179027#comment-13179027
 ] 

Jason Lowe commented on MAPREDUCE-3360:
---------------------------------------

Thanks for the updates.  A couple of things about the handling of UNHEALTHY 
nodes:

- They are not removed from the list of nodes being tracked in the context ( 
{{rmNode.context.getRMNodes()}} ), so I don't think we want to add them to the 
list of inactive nodes.  Otherwise the node would appear in both node lists 
simultaneously, which is probably not desirable.  Specifically, we'd want to 
remove this code insertion from the patch:

{code}
@@ -394,6 +411,8 @@ public class RMNodeImpl implements RMNode, EventHandler<RMNodeEvent> {
         // Inform the scheduler
         rmNode.context.getDispatcher().getEventHandler().handle(
             new NodeRemovedSchedulerEvent(rmNode));
+        rmNode.context.getInactiveRMNodes()
+            .put(rmNode.nodeId.getHost(), rmNode);
         ClusterMetrics.getMetrics().incrNumUnhealthyNMs();
         return RMNodeState.UNHEALTHY;
       }
{code}

- A node that's marked UNHEALTHY could still have a working nodemanager web 
page, so we don't want to remove the link to it on the status page.  Since 
UNHEALTHY nodes are tracked in the normal node list, the simplest fix is to 
remove the UNHEALTHY case from the switch statement in NodesPage.java.
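
To illustrate the two points above, here is a minimal, self-contained sketch. 
It is not the actual ResourceManager code: {{NodeTrackerSketch}}, its two maps, 
and its methods are hypothetical stand-ins for the active and inactive node 
lists, and the assumption that a LOST node moves from the active list to the 
inactive one is mine rather than something taken from the patch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the RM context; not the real Hadoop classes.
class NodeTrackerSketch {
    enum State { RUNNING, UNHEALTHY, LOST }

    // Plays the role of the tracked node list (cf. getRMNodes()).
    private final Map<String, State> activeNodes =
        new ConcurrentHashMap<String, State>();
    // Plays the role of the inactive node list (cf. getInactiveRMNodes()).
    private final Map<String, State> inactiveNodes =
        new ConcurrentHashMap<String, State>();

    void register(String host) {
        activeNodes.put(host, State.RUNNING);
    }

    // UNHEALTHY: the node stays in the active list; only its state changes.
    void markUnhealthy(String host) {
        if (activeNodes.containsKey(host)) {
            activeNodes.put(host, State.UNHEALTHY);
        }
    }

    // LOST (assumed behaviour): the node moves from active to inactive,
    // so it is never present in both lists at once.
    void markLost(String host) {
        if (activeNodes.remove(host) != null) {
            inactiveNodes.put(host, State.LOST);
        }
    }

    boolean isActive(String host) {
        return activeNodes.containsKey(host);
    }

    boolean isInactive(String host) {
        return inactiveNodes.containsKey(host);
    }

    // The status page keeps the nodemanager link for any node that is
    // still in the active list, which includes UNHEALTHY nodes.
    boolean showWebLink(String host) {
        return isActive(host);
    }
}
```

With this shape, an UNHEALTHY node keeps its web link because it never leaves 
the active list, and no node is ever in both lists at the same time.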


At some point unit tests need to be added or updated for this change, e.g. 
updating TestNodesPage.java to verify that nodes transitioning into the LOST 
state appear on the LOST page.
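
As a rough idea of what such a check could verify, the sketch below filters 
nodes by state the way a per-state page might. It does not use the real 
TestNodesPage or web-app classes; {{LostPageSketch}} and {{hostsForPage}} are 
hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical model of a per-state nodes page; not the real NodesPage API.
class LostPageSketch {
    enum State { RUNNING, UNHEALTHY, LOST }

    // Returns the hosts whose current state matches the page's filter.
    static List<String> hostsForPage(Map<String, State> nodes, State filter) {
        List<String> hosts = new ArrayList<String>();
        for (Map.Entry<String, State> entry : nodes.entrySet()) {
            if (entry.getValue() == filter) {
                hosts.add(entry.getKey());
            }
        }
        return hosts;
    }
}
```

A test along these lines would register a node, transition it to LOST, and 
then assert that it is listed on the LOST page and no longer on the RUNNING 
page.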

                
> Provide information about lost nodes in the UI.
> -----------------------------------------------
>
>                 Key: MAPREDUCE-3360
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3360
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2
>    Affects Versions: 0.23.0
>         Environment: NA
>            Reporter: Bhallamudi Venkata Siva Kamesh
>            Priority: Critical
>         Attachments: LostNodes.png, MAPREDUCE-3360-1.patch, 
> MAPREDUCE-3360-2.patch, MAPREDUCE-3360.patch, lostNodes.png
>
>
> Currently there is no information provided about *lost nodes*. Provide 
> information in the UI. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
