[ https://issues.apache.org/jira/browse/HDFS-811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eli Collins updated HDFS-811:
-----------------------------
    Affects Version/s:     (was: 0.21.0)
           Issue Type: New Feature  (was: Test)
         Hadoop Flags: [Reviewed]

> Add metrics, failure reporting and additional tests for HDFS-457
> ----------------------------------------------------------------
>
>                 Key: HDFS-811
>                 URL: https://issues.apache.org/jira/browse/HDFS-811
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: test
>    Affects Versions: 0.22.0
>            Reporter: Ravi Phulari
>            Assignee: Eli Collins
>            Priority: Minor
>             Fix For: 0.22.0
>
>         Attachments: hdfs-811-1.patch, hdfs-811-2.patch, hdfs-811-3.patch, hdfs-811-4.patch, hdfs-811-5.patch, hdfs-811-6.patch
>
>
> HDFS-457 introduced an improvement that allows a datanode to continue operating if a volume for replica storage fails. Previously a datanode shut down if any volume failed.
> Description of HDFS-457:
> {quote}
> Current implementation shuts DataNode down completely when one of the configured volumes of the storage fails.
> This is rather wasteful behavior because it decreases utilization (good storage becomes unavailable) and imposes extra load on the system (replication of the blocks from the good volumes). These problems will become even more prominent when we move to mixed (heterogeneous) clusters with many more volumes per Data Node.
> {quote}
> I suggest the following additional tests for this improvement:
> #1 Test successive volume failures (minimum 4 volumes).
> #2 Test that each volume failure reports a reduction in available DFS space and remaining space.
> #3 Test that failure of all volumes on a datanode leads to datanode failure.
> #4 Test that replacing a failed storage disk restores and increments available DFS space.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
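The capacity-accounting behavior the suggested tests exercise can be sketched with a minimal, self-contained model. This is a hedged illustration only: the class and method names below are hypothetical and do not reflect the real Hadoop `FSDataset`/`FSVolume` internals. It shows the invariants behind tests #2, #3, and #4: a failed volume stops counting toward available DFS space, the datanode fails only when every volume has failed, and repairing a volume restores its capacity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of per-volume capacity accounting on a datanode
// (not the actual Hadoop API): failed volumes are excluded from the
// reported available space, and the node fails when all volumes fail.
public class VolumeCapacityModel {
    static class Volume {
        final long capacityBytes;
        boolean failed = false;
        Volume(long capacityBytes) { this.capacityBytes = capacityBytes; }
    }

    private final List<Volume> volumes = new ArrayList<>();

    public Volume addVolume(long capacityBytes) {
        Volume v = new Volume(capacityBytes);
        volumes.add(v);
        return v;
    }

    // Available space counts only healthy volumes (test #2).
    public long availableBytes() {
        long total = 0;
        for (Volume v : volumes) {
            if (!v.failed) total += v.capacityBytes;
        }
        return total;
    }

    // A datanode with no healthy volumes left is itself failed (test #3).
    public boolean datanodeFailed() {
        for (Volume v : volumes) {
            if (!v.failed) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        VolumeCapacityModel dn = new VolumeCapacityModel();
        Volume v1 = dn.addVolume(100);
        Volume v2 = dn.addVolume(100);
        v1.failed = true;                        // simulate a disk failure
        System.out.println(dn.availableBytes()); // space drops to 100 (test #2)
        v1.failed = false;                       // simulate disk replacement
        System.out.println(dn.availableBytes()); // space restored to 200 (test #4)
        v1.failed = true;
        v2.failed = true;
        System.out.println(dn.datanodeFailed()); // true: all volumes down (test #3)
    }
}
```

In the actual patches, the equivalent checks would run against a multi-volume mini cluster, failing volumes one at a time and asserting the reported DFS capacity after each step.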