[ https://issues.apache.org/jira/browse/HDFS-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14951322#comment-14951322 ]
Zhe Zhang commented on HDFS-9205:
---------------------------------

Thanks Nicholas.

bq. Those blocks have zero replicas so that it is impossible to replicate them. (Let's ignore read-only storage here since it is an incomplete feature.)

Right, those blocks only have corrupt replicas. Before trying to replicate a block replica, the DN validates it based on almost the same conditions as the NN's corrupt-replica logic, with the following exception:
{code}
// DataNode#transferBlock
} catch (EOFException e) {
  lengthTooShort = true;
{code}
Basically, the DN skips a replica only if it is too short, while the NN considers a replica corrupt whenever its size differs (larger or smaller) from the NN's copy.

The above is a very rare corner case, and I agree this is a good change to cut unnecessary NN=>DN traffic for tasks that would be filtered out later anyway.

> Do not schedule corrupt blocks for replication
> ----------------------------------------------
>
>                 Key: HDFS-9205
>                 URL: https://issues.apache.org/jira/browse/HDFS-9205
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Tsz Wo Nicholas Sze
>            Priority: Minor
>         Attachments: h9205_20151007.patch, h9205_20151007b.patch, h9205_20151008.patch, h9205_20151009.patch, h9205_20151009b.patch
>
>
> Corrupt blocks are, by definition, blocks that cannot be read. As a consequence, they cannot be replicated. In UnderReplicatedBlocks there is a queue for QUEUE_WITH_CORRUPT_BLOCKS, and chooseUnderReplicatedBlocks may choose blocks from it. Scheduling corrupt blocks for replication wastes resources and can potentially slow down replication of higher-priority blocks.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
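The DN-vs-NN validation mismatch discussed in the comment above can be sketched as follows. This is an illustrative simplification, not actual Hadoop code: the method names and the standalone class are hypothetical, and only the size-comparison rules are modeled (the real DN detects a short replica via an EOFException while reading, and the real NN check lives in its corrupt-replica bookkeeping).

```java
// Hedged sketch: contrasts the NN's corrupt-replica size rule with the
// DN's "length too short" skip rule. Names are illustrative only.
public class ReplicaValidationSketch {

    // NN-side rule: a replica is considered corrupt if its length differs
    // from the NN's recorded block length in EITHER direction.
    static boolean nnConsidersCorrupt(long nnLength, long replicaLength) {
        return replicaLength != nnLength;
    }

    // DN-side rule: the DN skips the transfer only when the replica is
    // SHORTER than expected (in real code, an EOFException while reading
    // sets lengthTooShort = true).
    static boolean dnSkipsTransfer(long expectedLength, long replicaLength) {
        boolean lengthTooShort = replicaLength < expectedLength;
        return lengthTooShort;
    }

    public static void main(String[] args) {
        // Replica longer than the NN's copy: NN marks it corrupt,
        // but the DN would still attempt the transfer -- the corner
        // case where scheduling the block wastes NN=>DN traffic.
        System.out.println(nnConsidersCorrupt(100, 120)); // true
        System.out.println(dnSkipsTransfer(100, 120));    // false

        // Replica shorter than the NN's copy: both sides reject it.
        System.out.println(nnConsidersCorrupt(100, 80));  // true
        System.out.println(dnSkipsTransfer(100, 80));     // true
    }
}
```

The longer-than-expected case is exactly the gap the comment calls "very rare": the NN already counts the replica as corrupt, yet the DN's own pre-transfer check would not filter it out, so skipping corrupt blocks at scheduling time avoids the wasted work.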