[ https://issues.apache.org/jira/browse/HDFS-17218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17773564#comment-17773564 ]
Haiyang Hu commented on HDFS-17218: ----------------------------------- {quote} 7. so the block of dn1 will always exist in excessRedundancyMap (until HA switch is performed). {quote} In BlockManager#processChosenExcessRedundancy will add the redundancy of the given block stored in the given datanode to the excess map. https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L4267-L4285 {code:java} private void processChosenExcessRedundancy( final Collection<DatanodeStorageInfo> nonExcess, final DatanodeStorageInfo chosen, BlockInfo storedBlock) { nonExcess.remove(chosen); excessRedundancyMap.add(chosen.getDatanodeDescriptor(), storedBlock); // // The 'excessblocks' tracks blocks until we get confirmation // that the datanode has deleted them; the only way we remove them // is when we get a "removeBlock" message. // // The 'invalidate' list is used to inform the datanode the block // should be deleted. Items are removed from the invalidate list // upon giving instructions to the datanodes. // final Block blockToInvalidate = getBlockOnStorage(storedBlock, chosen); addToInvalidates(blockToInvalidate, chosen.getDatanodeDescriptor()); blockLog.debug("BLOCK* chooseExcessRedundancies: ({}, {}) is added to invalidated blocks set", chosen, storedBlock); } {code} but because the dn side has not deleted the block, it will not call processIncrementalBlockReport, so the block of dn can not remove from excessRedundancyMap > NameNode should remove its excess blocks from the ExcessRedundancyMap When a > DN registers > ----------------------------------------------------------------------------------------- > > Key: HDFS-17218 > URL: https://issues.apache.org/jira/browse/HDFS-17218 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namanode > Reporter: Haiyang Hu > Assignee: Haiyang Hu > Priority: Major > > Currently found that DN will lose all pending DNA_INVALIDATE blocks if it > restarts. > *Root case* > Current DN enables asynchronously deletion, it have many pending deletion > blocks in memory. > when DN restarts, these cached blocks may be lost. it causes some blocks in > the excess map in the namenode to be leaked and this will result in many > blocks having more replicas then expected. > *solution* > Consider NameNode should remove its excess blocks from the > ExcessRedundancyMap When a DN registers, > this approach will ensure that when processing the DN's full block report, > the 'processExtraRedundancy' can be performed according to the actual of the > blocks. -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org