[ https://issues.apache.org/jira/browse/HDFS-14852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933450#comment-16933450 ]
Kihwal Lee commented on HDFS-14852:
-----------------------------------

Deleted blocks are removed in {{scheduleReconstruction()}} when the queues are scanned. HDFS-9205 is likely the cause of the "phantom block" issue. That change added this logic to {{chooseLowRedundancyBlocks()}}, so missing blocks are never scanned and removed:
{code}
      if (priority == QUEUE_WITH_CORRUPT_BLOCKS) {
        // do not choose corrupted blocks.
        continue;
      }
{code}
We started seeing this in 2.8 and never saw it before 2.8, so this is the likely cause.

Normally clusters have a small number of missing blocks, if any, so the skip has no visible impact on replication activity. It becomes serious, however, when many datanodes die (network cut, DNS outage, etc.): the namenode will spend so much time looking at missing blocks that it may recover slowly or never recover. Thus, simply removing the skip may not be wise. We could do something like:
- Scan {{QUEUE_WITH_CORRUPT_BLOCKS}} every _n_ iterations. {{redundancyRecheckIntervalMs}} is 3 seconds by default, so doing it every 20 iterations would clear phantoms in about a minute.

*and/or*
- Add a safeguard such as "if more than 5% of blocks are missing, do not scan {{QUEUE_WITH_CORRUPT_BLOCKS}}".
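A minimal sketch of the first option, the "scan the corrupt queue only every _n_-th recheck" idea. This is not the Hadoop source; the class, method, and the {{CORRUPT_SCAN_PERIOD}} constant are hypothetical names for illustration, with the period set to the 20 iterations suggested above.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch (not actual Hadoop code) of gating the corrupt-blocks queue so it
 * is scanned only on every n-th redundancy recheck, clearing phantom
 * entries without scanning missing blocks on every pass.
 */
public class CorruptQueueScanSketch {
  // Lowest priority level, mirroring LowRedundancyBlocks; value assumed.
  static final int QUEUE_WITH_CORRUPT_BLOCKS = 4;
  static final int LEVEL = 5;
  // Assumed period: with a 3-second recheck interval, every 20th
  // iteration clears phantoms in about a minute.
  static final int CORRUPT_SCAN_PERIOD = 20;

  private int recheckCount = 0;

  /** Returns the priority levels to scan on this recheck iteration. */
  public List<Integer> levelsToScan() {
    recheckCount++;
    List<Integer> result = new ArrayList<>();
    for (int i = 0; i < LEVEL; i++) {
      if (i == QUEUE_WITH_CORRUPT_BLOCKS
          && recheckCount % CORRUPT_SCAN_PERIOD != 0) {
        continue; // skip corrupt blocks except on every n-th pass
      }
      result.add(i);
    }
    return result;
  }
}
```

The 5% safeguard from the second option could be layered on top of the same check, so a mass datanode outage never turns the periodic scan into a full walk of the missing blocks.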
> Remove of LowRedundancyBlocks do NOT remove the block from all queues
> ---------------------------------------------------------------------
>
>                 Key: HDFS-14852
>                 URL: https://issues.apache.org/jira/browse/HDFS-14852
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ec
>    Affects Versions: 3.2.0, 3.0.3, 3.1.2, 3.3.0
>            Reporter: Fei Hui
>            Assignee: Fei Hui
>            Priority: Major
>         Attachments: CorruptBlocksMismatch.png, HDFS-14852.001.patch, HDFS-14852.002.patch
>
>
> LowRedundancyBlocks.java
> {code:java}
>     // Some comments here
>     if(priLevel >= 0 && priLevel < LEVEL
>         && priorityQueues.get(priLevel).remove(block)) {
>       NameNode.blockStateChangeLog.debug(
>           "BLOCK* NameSystem.LowRedundancyBlock.remove: Removing block {}"
>               + " from priority queue {}",
>           block, priLevel);
>       decrementBlockStat(block, priLevel, oldExpectedReplicas);
>       return true;
>     } else {
>       // Try to remove the block from all queues if the block was
>       // not found in the queue for the given priority level.
>       for (int i = 0; i < LEVEL; i++) {
>         if (i != priLevel && priorityQueues.get(i).remove(block)) {
>           NameNode.blockStateChangeLog.debug(
>               "BLOCK* NameSystem.LowRedundancyBlock.remove: Removing block" +
>               " {} from priority queue {}", block, i);
>           decrementBlockStat(block, i, oldExpectedReplicas);
>           return true;
>         }
>       }
>     }
>     return false;
>   }
> {code}
> The source code is above; the comment reads:
> {quote}
> // Try to remove the block from all queues if the block was
> // not found in the queue for the given priority level.
> {quote}
> Despite the comment, the remove function does NOT remove the block from all queues: it returns as soon as it finds the block in one queue. The add function in LowRedundancyBlocks.java is called from several places, so one block may end up in two or more queues.
> We found that the corrupt-block count does not match the corrupt files shown on the NN web UI. Maybe it is related to this.
> Upload initial patch

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
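To illustrate the bug in the quoted description, here is a hedged sketch of a remove() that keeps scanning instead of returning on the first hit, so a block that was added to several queues is cleared from all of them. The class and the use of plain {{HashSet}} queues are illustrative only; they mimic, but are not, the LowRedundancyBlocks implementation.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/**
 * Sketch (not the Hadoop source) of removing a block from every priority
 * queue rather than returning after the first queue that contains it.
 */
public class RemoveFromAllQueuesSketch {
  static final int LEVEL = 5;
  private final List<Set<Long>> priorityQueues = new ArrayList<>();

  public RemoveFromAllQueuesSketch() {
    for (int i = 0; i < LEVEL; i++) {
      priorityQueues.add(new HashSet<>());
    }
  }

  public void add(int priLevel, long blockId) {
    priorityQueues.get(priLevel).add(blockId);
  }

  /** Remove blockId from every queue; true if it was found anywhere. */
  public boolean remove(long blockId) {
    boolean removed = false;
    for (int i = 0; i < LEVEL; i++) {
      if (priorityQueues.get(i).remove(blockId)) {
        removed = true; // keep scanning instead of returning early
      }
    }
    return removed;
  }
}
```

With the early return in the original code, a block sitting in two queues leaves a stale entry behind, which would explain the corrupt-block count disagreeing with the web UI.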