[ https://issues.apache.org/jira/browse/HADOOP-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509698 ]
dhruba borthakur commented on HADOOP-1557:
------------------------------------------
The bug is that when a setReplication() command is sent to the namenode,
neither the client nor the datanode reads any data blocks. A replica that is
corrupt on a datanode is therefore not known to the namenode at that time.
> Deletion of excess replicas should prefer to delete corrupted replicas before
> deleting valid replicas
> -----------------------------------------------------------------------------------------------------
>
> Key: HADOOP-1557
> URL: https://issues.apache.org/jira/browse/HADOOP-1557
> Project: Hadoop
> Issue Type: Bug
> Components: dfs
> Reporter: dhruba borthakur
>
> Suppose a block has three replicas and two of the replicas are corrupted. If
> the replication factor of the file is reduced to 2, the filesystem should
> preferably delete the two corrupted replicas; otherwise the file could end
> up corrupted.
> One option would be to make the datanode periodically validate all blocks
> with their corresponding CRCs. The other option would be to make the
> setReplication call validate existing replicas before deleting excess
> replicas.
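
The second option could be sketched roughly as follows. This is only an
illustration of the idea, not the HDFS implementation: ExcessReplicaChooser,
Replica, isCorrupt and chooseExcess are hypothetical names, and the sketch
assumes a single CRC32 per block that was recorded when the block was written.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical sketch; none of these names are actual HDFS classes.
final class ExcessReplicaChooser {

    // One replica of a block: the node holding it and the bytes stored there.
    static final class Replica {
        final String node;
        final byte[] data;

        Replica(String node, byte[] data) {
            this.node = node;
            this.data = data;
        }
    }

    // True when the replica's bytes no longer match the CRC recorded at write time.
    static boolean isCorrupt(Replica replica, long expectedCrc) {
        CRC32 crc = new CRC32();
        crc.update(replica.data, 0, replica.data.length);
        return crc.getValue() != expectedCrc;
    }

    // Picks the replicas to delete so that targetReplication copies remain,
    // choosing corrupted replicas before valid ones.
    static List<Replica> chooseExcess(List<Replica> replicas,
                                      long expectedCrc,
                                      int targetReplication) {
        int excess = Math.max(0, replicas.size() - targetReplication);
        List<Replica> ordered = new ArrayList<>(replicas);
        // Stable sort: corrupted replicas (key 0) move ahead of valid ones (key 1).
        ordered.sort(Comparator.comparingInt(
                (Replica r) -> isCorrupt(r, expectedCrc) ? 0 : 1));
        return new ArrayList<>(ordered.subList(0, excess));
    }
}
```

With three replicas of which one is valid, reducing the target to 2 makes
chooseExcess return a corrupted replica, so the valid copy is never the one
deleted. The sketch only orders the deletions; replacing any corrupted
replica that remains is outside its scope.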