[ 
https://issues.apache.org/jira/browse/HDFS-811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876948#action_12876948
 ] 

Jakob Homan commented on HDFS-811:
----------------------------------

Before this goes in, there are still quite a few undescribed asserts and a fair 
amount of unnecessary whitespace changes that should be removed.  Also, 
TestDataNodeVolumeFailure2? Oy.  Is that the sequel?  Would a more descriptive 
name be reasonable?

> Add metrics, failure reporting and additional tests for HDFS-457
> ----------------------------------------------------------------
>
>                 Key: HDFS-811
>                 URL: https://issues.apache.org/jira/browse/HDFS-811
>             Project: Hadoop HDFS
>          Issue Type: Test
>          Components: test
>    Affects Versions: 0.21.0, 0.22.0
>            Reporter: Ravi Phulari
>            Assignee: Eli Collins
>            Priority: Minor
>             Fix For: 0.21.0, 0.22.0
>
>         Attachments: hdfs-811-1.patch, hdfs-811-2.patch, hdfs-811-3.patch, 
> hdfs-811-4.patch
>
>
> HDFS-457 introduced an improvement which allows a datanode to continue if a 
> volume for replica storage fails. Previously a datanode shut down if any 
> volume failed. 
> Description of HDFS-457:
> {quote}
> Current implementation shuts DataNode down completely when one of the 
> configured volumes of the storage fails.
> This is rather wasteful behavior because it decreases utilization (good 
> storage becomes unavailable) and imposes extra load on the system 
> (replication of the blocks from the good volumes). These problems will become 
> even more prominent when we move to mixed (heterogeneous) clusters with many 
> more volumes per Data Node.
> {quote}
> I suggest the following additional tests for this improvement:
> #1 Test successive volume failures (minimum 4 volumes).
> #2 Test if each volume failure reports a reduction in available DFS space and 
> remaining space.
> #3 Test if failure of all volumes on a data node leads to failure of the 
> data node.
> #4 Test if correcting a failed storage disk updates and increases the 
> available DFS space.
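
The four suggested checks can be illustrated with a toy model. This is only a sketch: ToyDataNode and its methods are hypothetical names invented for illustration, not Hadoop's DataNode or volume API, and it assumes that a node stays alive as long as at least one volume is healthy and that a repaired volume returns its full capacity.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical class, not Hadoop's DataNode or volume API.
class ToyDataNode {
    private final List<Long> capacities = new ArrayList<>();  // capacity of each volume
    private final List<Boolean> healthy = new ArrayList<>();  // health flag of each volume

    ToyDataNode(int volumes, long capacityPerVolume) {
        for (int i = 0; i < volumes; i++) {
            capacities.add(capacityPerVolume);
            healthy.add(true);
        }
    }

    void failVolume(int i)   { healthy.set(i, false); }
    void repairVolume(int i) { healthy.set(i, true); }

    // Sum of the capacities of the healthy volumes only.
    long availableCapacity() {
        long total = 0;
        for (int i = 0; i < capacities.size(); i++) {
            if (healthy.get(i)) {
                total += capacities.get(i);
            }
        }
        return total;
    }

    // Assumption: the node survives while at least one volume is healthy.
    boolean isAlive() { return healthy.contains(true); }

    public static void main(String[] args) {
        ToyDataNode dn = new ToyDataNode(4, 100L);
        // #1 and #2: fail volumes one after another and watch capacity shrink.
        // #3: once the last volume fails, the node is no longer alive.
        for (int i = 0; i < 4; i++) {
            dn.failVolume(i);
            System.out.println("failed volumes=" + (i + 1)
                    + " capacity=" + dn.availableCapacity()
                    + " alive=" + dn.isAlive());
        }
        // #4: repairing a volume restores its capacity.
        dn.repairVolume(0);
        System.out.println("after repair capacity=" + dn.availableCapacity()
                + " alive=" + dn.isAlive());
    }
}
```

A real test would drive an actual cluster and read the reported capacity from the namenode rather than from an in-memory model; the sketch only pins down what each of #1 to #4 is expected to observe.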

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
