[jira] [Updated] (MAPREDUCE-3360) Provide information about lost nodes in the UI.

Bh V S Kamesh (Updated) (JIRA) Tue, 20 Dec 2011 10:42:58 -0800

     [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Bh V S Kamesh updated MAPREDUCE-3360:
-------------------------------------

    Attachment: MAPREDUCE-3360-1.patch

Hi Jason,
 Thanks for the review.

bq. 1. Should the "Total Nodes" column header on the nodes pages be changed to 
something like "Active Nodes"? Currently it sounds like it should be a count of 
all the nodes (active or not) associated with the cluster, but it's only 
counting the active nodes. Related to this, should there be a page listing all 
nodes regardless of state (i.e.: a real Total Nodes page)?

Yes, currently it shows only the active nodes in the cluster. I also *think* 
changing it to "Active Nodes" would be more appropriate, since that is what it 
counts.

bq. 3. Maintenance: ClusterMetrics.decr(RMNodeState) doesn't handle all the node 
states. It may be better to just remove this method and have the one place it's 
used do the switch.

I *think* there are only three events that cause the NM to be lost, and all 
three events have been handled. 
The RUNNING -> UNHEALTHY and UNHEALTHY -> RUNNING transitions have been handled 
in their corresponding transition hooks.
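
For illustration only, here is a minimal sketch of what "one place does the 
switch" could look like. The class, enum, and field names are hypothetical 
stand-ins and are not the actual ClusterMetrics or RMNodeState code from the 
patch.

```java
// Hypothetical stand-in for the RM's per-state node counters; the names
// below are illustrative and are not the actual ClusterMetrics API.
class NodeCounters {
    enum NodeState { NEW, RUNNING, UNHEALTHY, DECOMMISSIONED, LOST }

    int active, unhealthy, decommissioned, lost;

    // Single place that adjusts the counters when a node changes state.
    void transition(NodeState from, NodeState to) {
        adjust(from, -1);
        adjust(to, +1);
    }

    private void adjust(NodeState state, int delta) {
        switch (state) {
            case NEW:            break; // assumed not counted until RUNNING
            case RUNNING:        active += delta; break;
            case UNHEALTHY:      unhealthy += delta; break;
            case DECOMMISSIONED: decommissioned += delta; break;
            case LOST:           lost += delta; break;
            default:
                throw new IllegalArgumentException("Unhandled state: " + state);
        }
    }
}
```

Listing every state in the switch means a newly added state fails loudly 
instead of being silently ignored.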

bq. Nit: The "N/A" web address for inactive nodes shouldn't be a hyperlink, 
since it doesn't go anywhere useful.

For inactive nodes, the web address appears as an "N/A" hyperlink; however, 
clicking on it does not go anywhere.
Please refer to the attached screenshot.

bq. 2. There will be bookkeeping issues for the inactive nodes when a single 
host has been configured with multiple nodemanager instances. (A bit odd, but 
possible to set up.) Since the inactive nodes are tracked only by hostname, we 
will remove a node from the inactive list when a new nodemanager appears on a 
different port. Probably best to track that issue in a separate JIRA. There are 
other issues with that setup, e.g.: inability to detect redundant nodemanager 
launches, limit nodemanager instances, etc.

If there is only one NM running on a host, there won't be any problem. However, 
if multiple NMs are running on a single host, it will be a problem.
If the NMs running on a particular host are configured to use ephemeral ports, 
there is no mechanism to identify that an NM has come back.
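
To make the bookkeeping issue concrete, here is a small sketch that tracks 
lost nodes two ways: by hostname only, and by host:port. The class and method 
names are hypothetical and are not taken from the patch; the ports are made-up 
example values.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical tracker used only to illustrate the keying problem.
class InactiveNodes {
    private final Set<String> lostByHost = new HashSet<String>();
    private final Set<String> lostByHostPort = new HashSet<String>();

    void markLost(String host, int port) {
        lostByHost.add(host);
        lostByHostPort.add(host + ":" + port);
    }

    // Called when a nodemanager registers with the RM.
    void nodeRegistered(String host, int port) {
        lostByHost.remove(host);                  // drops ANY lost NM on this host
        lostByHostPort.remove(host + ":" + port); // drops only the matching NM
    }

    int lostCountByHost() { return lostByHost.size(); }
    int lostCountByHostPort() { return lostByHostPort.size(); }
}
```

If the NM on host1:8041 is lost and a different NM then registers on 
host1:8042, the hostname-keyed set reports no lost nodes while the host:port 
set still reports one. With ephemeral ports the host:port key does not help 
either, because a returning NM gets a new port and cannot be matched to its 
earlier entry.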

I will file a separate JIRA for this.
                
> Provide information about lost nodes in the UI.
> -----------------------------------------------
>
>                 Key: MAPREDUCE-3360
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3360
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2
>    Affects Versions: 0.23.0
>         Environment: NA
>            Reporter: Bh V S Kamesh
>         Attachments: LostNodes.png, MAPREDUCE-3360-1.patch, 
> MAPREDUCE-3360.patch, lostNodes.png
>
>
> Currently there is no information provided about *lost nodes*. Provide 
> information in the UI. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
