[ https://issues.apache.org/jira/browse/HDFS-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16578073#comment-16578073 ]
Kitti Nanasi commented on HDFS-13770:
-------------------------------------

Thanks [~zvenczel] for the findings! I fixed them in patch v003.

> dfsadmin -report does not always decrease "missing blocks (with replication
> factor 1)" metrics when file is deleted
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-13770
>                 URL: https://issues.apache.org/jira/browse/HDFS-13770
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 2.7.7
>            Reporter: Kitti Nanasi
>            Assignee: Kitti Nanasi
>            Priority: Major
>         Attachments: HDFS-13770-branch-2.001.patch,
> HDFS-13770-branch-2.002.patch, HDFS-13770-branch-2.003.patch
>
>
> The "missing blocks (with replication factor 1)" metric is not always
> decreased when a file is deleted.
>
> When a file is deleted, the remove function of UnderReplicatedBlocks can be
> called with the wrong priority (UnderReplicatedBlocks.LEVEL). In that case
> the corruptReplOneBlocks metric is not decreased, even though the block is
> removed from the priority queue that contains it.
>
> The corresponding code:
> {code:java}
> /** Remove a block from an under replication queue. */
> synchronized boolean remove(BlockInfo block,
>     int oldReplicas,
>     int oldReadOnlyReplicas,
>     int decommissionedReplicas,
>     int oldExpectedReplicas) {
>   final int priLevel = getPriority(oldReplicas, oldReadOnlyReplicas,
>       decommissionedReplicas, oldExpectedReplicas);
>   boolean removedBlock = remove(block, priLevel);
>   if (priLevel == QUEUE_WITH_CORRUPT_BLOCKS &&
>       oldExpectedReplicas == 1 &&
>       removedBlock) {
>     corruptReplOneBlocks--;
>     assert corruptReplOneBlocks >= 0 :
>         "Number of corrupt blocks with replication factor 1 " +
>         "should be non-negative";
>   }
>   return removedBlock;
> }
>
> /**
>  * Remove a block from the under replication queues.
>  *
>  * The priLevel parameter is a hint of which queue to query
>  * first: if negative or >= {@link #LEVEL} this shortcutting
>  * is not attempted.
>  *
>  * If the block is not found in the nominated queue, an attempt is made to
>  * remove it from all queues.
>  *
>  * <i>Warning:</i> This is not a synchronized method.
>  * @param block block to remove
>  * @param priLevel expected priority level
>  * @return true if the block was found and removed from one of the priority
>  *         queues
>  */
> boolean remove(BlockInfo block, int priLevel) {
>   if (priLevel >= 0 && priLevel < LEVEL
>       && priorityQueues.get(priLevel).remove(block)) {
>     NameNode.blockStateChangeLog.debug(
>         "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block {}" +
>         " from priority queue {}", block, priLevel);
>     return true;
>   } else {
>     // Try to remove the block from all queues if the block was
>     // not found in the queue for the given priority level.
>     for (int i = 0; i < LEVEL; i++) {
>       if (i != priLevel && priorityQueues.get(i).remove(block)) {
>         NameNode.blockStateChangeLog.debug(
>             "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block" +
>             " {} from priority queue {}", block, i);
>         return true;
>       }
>     }
>   }
>   return false;
> }
> {code}
>
> This is already fixed on trunk by HDFS-10999, but that ticket introduces new
> metrics, which I think shouldn't be backported to branch-2.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
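The metric leak and its repair can be illustrated with a minimal, self-contained Java sketch. The class and method names below are simplified stand-ins, not the real UnderReplicatedBlocks code: following the idea of the trunk fix, the metric is decremented based on which queue the block was *actually* removed from, rather than on the caller's priority hint, so passing LEVEL as the hint no longer leaks the counter.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified model of the priority queues; QUEUE_WITH_CORRUPT_BLOCKS is last.
public class CorruptReplOneDemo {
  static final int LEVEL = 5;
  static final int QUEUE_WITH_CORRUPT_BLOCKS = LEVEL - 1;

  final List<Set<String>> priorityQueues = new ArrayList<>();
  int corruptReplOneBlocks = 0;

  CorruptReplOneDemo() {
    for (int i = 0; i < LEVEL; i++) {
      priorityQueues.add(new HashSet<>());
    }
  }

  // Register a corrupt block whose expected replication factor is 1.
  void addCorruptReplOne(String block) {
    priorityQueues.get(QUEUE_WITH_CORRUPT_BLOCKS).add(block);
    corruptReplOneBlocks++;
  }

  // Fixed remove: record the queue the block was actually found in and
  // decrement the metric from that, not from the hinted priLevel.
  boolean remove(String block, int priLevel, int oldExpectedReplicas) {
    int removedFrom = -1;
    if (priLevel >= 0 && priLevel < LEVEL
        && priorityQueues.get(priLevel).remove(block)) {
      removedFrom = priLevel;
    } else {
      // Hint missed (e.g. priLevel == LEVEL): scan all queues.
      for (int i = 0; i < LEVEL; i++) {
        if (i != priLevel && priorityQueues.get(i).remove(block)) {
          removedFrom = i;
          break;
        }
      }
    }
    if (removedFrom == QUEUE_WITH_CORRUPT_BLOCKS && oldExpectedReplicas == 1) {
      corruptReplOneBlocks--;
    }
    return removedFrom != -1;
  }

  public static void main(String[] args) {
    CorruptReplOneDemo q = new CorruptReplOneDemo();
    q.addCorruptReplOne("blk_1");
    // The problematic call path passes LEVEL as the priority hint; with the
    // per-queue accounting above, the metric still drops back to zero.
    q.remove("blk_1", LEVEL, 1);
    System.out.println(q.corruptReplOneBlocks); // prints 0
  }
}
```

With the original branch-2 code, the same call sequence would leave corruptReplOneBlocks at 1, because the decrement was gated on `priLevel == QUEUE_WITH_CORRUPT_BLOCKS`, which is false when the hint is LEVEL.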