[ https://issues.apache.org/jira/browse/HDFS-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031853#comment-13031853 ]

Tanping Wang commented on HDFS-1371:
------------------------------------

In the case where all replicas are corrupted, there is nothing the cluster can 
do to recover, so reporting this to the name node would not make a difference. 
One could argue that if an operator knows about the block corruption, he can 
choose to physically copy the file from somewhere else. However, in practice, 
based on Koji's experience, over the past couple of years there has never been 
a case on Yahoo's clusters where all block replicas were corrupted. Beyond this 
point, we do not want to rely too heavily on a client to report block 
corruption, and want to restrict the solution to just dealing with a 
handicapped client.
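To make that trade-off concrete, here is a minimal Java sketch of a 
name-node-side guard that treats a client's corruption report as "suspect" 
until the datanode independently re-verifies the replica. This is illustrative 
only, not the HDFS-1371 patch; all class and method names are hypothetical.

import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class SuspectReplicaTracker {

    // Replicas reported corrupt by a client, keyed by "blockId:datanode".
    private final Set<String> suspect = ConcurrentHashMap.newKeySet();

    // Replicas confirmed corrupt by the datanode's own re-verification.
    private final Set<String> confirmed = ConcurrentHashMap.newKeySet();

    // A client (possibly a handicapped one) claims this replica is corrupt.
    public void clientReportedCorrupt(long blockId, String datanode) {
        suspect.add(blockId + ":" + datanode);
        // Instead of marking the replica corrupt right away, a real
        // implementation would now ask the datanode to re-verify the block.
    }

    // The datanode re-read the replica and checked its checksums.
    public void datanodeVerificationResult(long blockId, String datanode,
                                           boolean checksumOk) {
        String key = blockId + ":" + datanode;
        if (checksumOk) {
            suspect.remove(key);      // the client's report was wrong; clear it
        } else if (suspect.remove(key)) {
            confirmed.add(key);       // two independent sources agree: corrupt
        }
    }

    // Only confirmed corruption should drive re-replication or fsck output.
    public boolean isCorrupt(long blockId, String datanode) {
        return confirmed.contains(blockId + ":" + datanode);
    }
}

Under this policy a single handicapped client can, at worst, trigger extra 
verification work; it can no longer single-handedly flag healthy replicas as 
corrupt, which is exactly the failure mode described in the bug report below.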

> One bad node can incorrectly flag many files as corrupt
> -------------------------------------------------------
>
>                 Key: HDFS-1371
>                 URL: https://issues.apache.org/jira/browse/HDFS-1371
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client, name-node
>    Affects Versions: 0.20.1
>         Environment: yahoo internal version 
> [knoguchi@gwgd4003 ~]$ hadoop version
> Hadoop 0.20.104.3.1007030707
>            Reporter: Koji Noguchi
>            Assignee: Tanping Wang
>         Attachments: HDFS-1371.04252011.patch, HDFS-1371.0503.patch
>
>
> On our cluster, 12 files were reported as corrupt by fsck even though the 
> replicas on the datanodes were healthy.
> It turns out that all the replicas (12 files x 3 replicas per file) had been 
> reported as corrupt by a single node.
> Surprisingly, these files were still readable/accessible from dfsclient 
> (-get/-cat) without any problems.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
