[ https://issues.apache.org/jira/browse/HDFS-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345026#comment-14345026 ]
Tsz Wo Nicholas Sze commented on HDFS-6945:
-------------------------------------------

removeFromExcessReplicateMap is quite expensive: it iterates over every storage in excessReplicateMap to find the given block. How about getting the BlockInfo from the blocksMap first? The storage information there can then be used to remove the block from excessReplicateMap (a sketch of this idea follows the quoted issue below).

> BlockManager should remove a block from excessReplicateMap and decrement
> ExcessBlocks metric when the block is removed
> ------------------------------------------------------------------------
>
>                 Key: HDFS-6945
>                 URL: https://issues.apache.org/jira/browse/HDFS-6945
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.5.0
>            Reporter: Akira AJISAKA
>            Assignee: Akira AJISAKA
>            Priority: Critical
>              Labels: metrics
>         Attachments: HDFS-6945-003.patch, HDFS-6945.2.patch, HDFS-6945.patch
>
> I'm seeing the ExcessBlocks metric grow to more than 300K in some clusters,
> even though there are no over-replicated blocks (confirmed by fsck).
> After further investigation, I noticed that when a block is deleted,
> BlockManager does not remove it from excessReplicateMap or decrement
> excessBlocksCount. Usually the metric is decremented while processing a
> block report; however, if the block has already been deleted, BlockManager
> does not remove it from excessReplicateMap or decrement the metric.
> As a result, the metric and excessReplicateMap can grow without bound
> (i.e. a memory leak can occur).
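Below is a minimal, self-contained sketch of the suggested change. It models BlockManager's state with plain JDK maps and sets: the names blocksMap, excessReplicateMap, and excessBlocksCount mirror the real fields, but the types and the datanode-UUID keying used here are simplified stand-ins, not the actual HDFS classes.

{code:java}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/**
 * Simplified model of the suggestion above. Field names mirror
 * BlockManager's internals, but these are plain JDK collections,
 * not the real HDFS types.
 */
public class ExcessReplicaSketch {

  // blocksMap: block id -> datanode UUIDs that hold a replica of the block.
  static final Map<Long, Set<String>> blocksMap = new HashMap<>();

  // excessReplicateMap: datanode UUID -> block ids that are excess there.
  static final Map<String, Set<Long>> excessReplicateMap = new HashMap<>();

  static long excessBlocksCount = 0;

  // Expensive version: scans every storage's excess set for the block.
  static void removeFromExcessReplicateMapSlow(long blockId) {
    for (Set<Long> excess : excessReplicateMap.values()) {
      if (excess.remove(blockId)) {
        excessBlocksCount--;
      }
    }
  }

  // Suggested version: look the block up in blocksMap first, then touch
  // only the storages that actually hold a replica of it.
  static void removeFromExcessReplicateMap(long blockId) {
    Set<String> storages = blocksMap.get(blockId);
    if (storages == null) {
      return; // block not in blocksMap; nothing to clean up
    }
    for (String datanodeUuid : storages) {
      Set<Long> excess = excessReplicateMap.get(datanodeUuid);
      if (excess != null && excess.remove(blockId)) {
        excessBlocksCount--; // keep the ExcessBlocks metric in sync
        if (excess.isEmpty()) {
          excessReplicateMap.remove(datanodeUuid); // drop empty sets
        }
      }
    }
  }

  public static void main(String[] args) {
    blocksMap.put(1L, new HashSet<>(Set.of("dn-1", "dn-2")));
    excessReplicateMap.computeIfAbsent("dn-2", k -> new HashSet<>()).add(1L);
    excessBlocksCount = 1;

    removeFromExcessReplicateMap(1L);
    System.out.println("ExcessBlocks = " + excessBlocksCount); // prints 0
  }
}
{code}

The point of the fast path is that it only visits the storages blocksMap says hold the block, instead of scanning every entry in excessReplicateMap; dropping emptied per-storage sets also keeps the map itself from leaking.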