Respect failed.volumes.tolerated on startup
-------------------------------------------

                 Key: HDFS-1849
                 URL: https://issues.apache.org/jira/browse/HDFS-1849
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: data-node
            Reporter: Eli Collins
             Fix For: 0.23.0


The current failed.volumes.tolerated behavior is not user friendly. Datanodes 
can be configured to tolerate N volume failures and still offer service, but if 
the cluster is restarted, none of the datanodes with failed volumes will start 
until the failed volumes have been removed from the HDFS configuration files 
on the respective hosts.

The failed.volumes.tolerated configuration option should be respected on 
startup. The datanode should refuse to start only if more than 
failed.volumes.tolerated volumes (HDFS-1161) have failed, or if a configured 
critical volume (HDFS-1848) has failed. The latter is probably not an issue in 
practice, since datanode startup probably fails anyway if, e.g., the root 
volume has gone read-only.
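
A minimal sketch of the proposed startup rule, to make the two refusal 
conditions concrete. This is not the datanode's actual code: the class name 
VolumeStartupCheck, the method canStart, and the example paths are all 
hypothetical, and the notion of a "critical" volume is assumed to be a plain 
set of paths as HDFS-1848 might provide.

```java
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Hadoop.
class VolumeStartupCheck {

    /**
     * Returns true if the datanode should be allowed to start.
     *
     * @param configured all volumes listed in the configuration
     * @param failed     volumes found to be unusable at startup
     * @param critical   volumes whose failure must always block startup
     * @param tolerated  value of failed.volumes.tolerated
     */
    static boolean canStart(List<String> configured, Set<String> failed,
                            Set<String> critical, int tolerated) {
        int failedCount = 0;
        for (String volume : configured) {
            if (!failed.contains(volume)) {
                continue;
            }
            // A failed critical volume blocks startup regardless of the count.
            if (critical.contains(volume)) {
                return false;
            }
            failedCount++;
        }
        // Refuse only if MORE than the tolerated number have failed.
        return failedCount <= tolerated;
    }

    public static void main(String[] args) {
        List<String> volumes = List.of("/data/1", "/data/2", "/data/3");
        Set<String> noCritical = Set.of();

        // One failed volume, one tolerated: the datanode may start.
        System.out.println(canStart(volumes, Set.of("/data/2"), noCritical, 1));
        // Two failed volumes, one tolerated: startup is refused.
        System.out.println(
            canStart(volumes, Set.of("/data/1", "/data/2"), noCritical, 1));
    }
}
```

With a rule like this, a restart no longer requires editing the configuration 
on each host as long as the number of failed volumes stays within the 
tolerated limit.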

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
