[ https://issues.apache.org/jira/browse/HDFS-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949752#comment-14949752 ]
Tsz Wo Nicholas Sze commented on HDFS-9205:
-------------------------------------------

Thanks Zhe for the comments.

> ... those blocks won't be re-replicated, even though
> chooseUnderReplicatedBlocks returns them? Or they are re-replicated in the
> current logic, but they should not be (IIUC that's the case)?

Those blocks have zero replicas, so it is impossible to replicate them. (Let's ignore read-only storage here since it is an incomplete feature.)

> ... But is there a use case for an admin to list corrupt blocks and reason
> about them by accessing the local blk_ (and metadata) files? ...

This patch does not prevent that.

> If we do want to save the replication work for corrupt blocks, should we get
> rid of QUEUE_WITH_CORRUPT_BLOCKS altogether?

The block priority could possibly be updated.

> Do not schedule corrupt blocks for replication
> ----------------------------------------------
>
>                 Key: HDFS-9205
>                 URL: https://issues.apache.org/jira/browse/HDFS-9205
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Tsz Wo Nicholas Sze
>            Priority: Minor
>         Attachments: h9205_20151007.patch, h9205_20151007b.patch,
> h9205_20151008.patch
>
>
> Corrupted blocks are, by definition, blocks that cannot be read. As a
> consequence, they cannot be replicated. In UnderReplicatedBlocks, there is a
> queue for QUEUE_WITH_CORRUPT_BLOCKS, and chooseUnderReplicatedBlocks may
> choose blocks from it. It seems that scheduling corrupted blocks for
> replication wastes resources and potentially slows down replication of the
> higher-priority blocks.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
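
For readers following the thread, the idea in the issue description can be sketched roughly as below. This is not the attached patch and not the real UnderReplicatedBlocks class. It is a simplified stand-in that assumes blocks are plain string IDs and that there are only three priority levels. Only QUEUE_WITH_CORRUPT_BLOCKS and chooseUnderReplicatedBlocks are names taken from the issue; the class name, the other constants, the flat return type, and corruptBlockCount are hypothetical and used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the namenode's replication priority queues.
// NOT the real Hadoop class; names other than QUEUE_WITH_CORRUPT_BLOCKS and
// chooseUnderReplicatedBlocks are assumptions made for this sketch.
class ReplicationQueuesSketch {
    // Lower number = more urgent. The numeric values are illustrative.
    static final int QUEUE_HIGHEST_PRIORITY = 0;
    static final int QUEUE_UNDER_REPLICATED = 1;
    static final int QUEUE_WITH_CORRUPT_BLOCKS = 2;
    static final int LEVELS = 3;

    private final List<List<String>> queues = new ArrayList<>();

    ReplicationQueuesSketch() {
        for (int i = 0; i < LEVELS; i++) {
            queues.add(new ArrayList<>());
        }
    }

    // Record a block (by ID) at the given priority level.
    void add(String blockId, int priority) {
        queues.get(priority).add(blockId);
    }

    // Pick up to maxBlocks blocks, most urgent level first, but never from
    // the corrupt queue: such blocks have no readable replica to copy from.
    List<String> chooseUnderReplicatedBlocks(int maxBlocks) {
        List<String> chosen = new ArrayList<>();
        for (int level = 0; level < LEVELS && chosen.size() < maxBlocks; level++) {
            if (level == QUEUE_WITH_CORRUPT_BLOCKS) {
                continue; // skip: scheduling these would only waste work
            }
            for (String blockId : queues.get(level)) {
                if (chosen.size() >= maxBlocks) {
                    break;
                }
                chosen.add(blockId);
            }
        }
        return chosen;
    }

    // The corrupt queue is still kept, so corrupt blocks can be counted
    // or listed for an admin even though they are never scheduled.
    int corruptBlockCount() {
        return queues.get(QUEUE_WITH_CORRUPT_BLOCKS).size();
    }
}
```

In this sketch the corrupt queue is skipped during selection rather than removed, which is one way to read the replies above: corrupt blocks stay visible for reporting, and a block can move to another level if its priority is later updated.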