[ https://issues.apache.org/jira/browse/HDFS-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991287#comment-12991287 ]
Matthias Friedrich commented on HDFS-1125:
------------------------------------------

We also got complaints from our admins about this because it makes it really hard to set up professional monitoring. My company operates close to 100,000 machines (only a handful of Hadoop nodes, though), so it's a big concern that our infrastructure behaves well. Also, node decommissioning is one of the things QA departments typically test during product evaluation, so this could hamper Hadoop adoption in some organizations.

> Removing a datanode (failed or decommissioned) should not require a namenode restart
> ------------------------------------------------------------------------------------
>
>                 Key: HDFS-1125
>                 URL: https://issues.apache.org/jira/browse/HDFS-1125
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 0.20.2
>            Reporter: Alex Loddengaard
>            Priority: Critical
>
> I've heard of several Hadoop users using dfsadmin -report to monitor the number of dead nodes, and alert if that number is not 0. This mechanism tends to work pretty well, except when a node is decommissioned or fails, because then the namenode requires a restart for said node to be entirely removed from HDFS. More details here:
> http://markmail.org/search/?q=decommissioned%20node%20showing%20up%20ad%20dead%20node%20in%20web%20based%09interface%20to%20namenode#query:decommissioned%20node%20showing%20up%20ad%20dead%20node%20in%20web%20based%09interface%20to%20namenode+page:1+mid:7gwqwdkobgfuszb4+state:results
> Removal from the exclude file and a refresh should get rid of the dead node.

--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
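For context, the monitoring pattern the issue describes (alert whenever dfsadmin -report shows a nonzero dead-node count) might be sketched roughly as below. The sample report text and the parsing regex here are assumptions for illustration; the exact output format of dfsadmin -report varies across Hadoop versions, and in practice the text would come from actually running the command.

```python
import re

# Hypothetical excerpt of `hadoop dfsadmin -report` output; in a real check
# this would be captured from the command, e.g. via subprocess.
sample_report = """\
Configured Capacity: 1000000 (976.56 KB)
Datanodes available: 3 (4 total, 1 dead)
"""

def count_dead_nodes(report_text):
    """Extract the dead-node count from a dfsadmin-style report, or None."""
    m = re.search(r"\((\d+) total, (\d+) dead\)", report_text)
    return int(m.group(2)) if m else None

dead = count_dead_nodes(sample_report)
if dead:  # alert if the number of dead nodes is not 0
    print(f"ALERT: {dead} dead datanode(s)")
```

As the issue notes, the weakness of this scheme is that a decommissioned or removed node keeps showing up as dead until the namenode is restarted, so the alert never clears.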