[ 
https://issues.apache.org/jira/browse/HDFS-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16866188#comment-16866188
 ] 

Wei-Chiu Chuang commented on HDFS-13770:
----------------------------------------

+1. The patch still applies. I updated the timeout to 60 seconds and uploaded 
the patch to trigger precommit.

> dfsadmin -report does not always decrease "missing blocks (with replication 
> factor 1)" metrics when file is deleted
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-13770
>                 URL: https://issues.apache.org/jira/browse/HDFS-13770
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 2.7.7
>            Reporter: Kitti Nanasi
>            Assignee: Kitti Nanasi
>            Priority: Major
>         Attachments: HDFS-13770-branch-2.001.patch, 
> HDFS-13770-branch-2.002.patch, HDFS-13770-branch-2.003.patch, 
> HDFS-13770-branch-2.004.patch
>
>
> The "missing blocks (with replication factor 1)" metric is not always 
> decreased when a file is deleted.
> When a file is deleted, the remove function of UnderReplicatedBlocks can be 
> called with the wrong priority (UnderReplicatedBlocks.LEVEL). In that case 
> the corruptReplOneBlocks metric is not decreased, even though the block is 
> removed from the priority queue that contains it.
> The corresponding code:
> {code:java}
> /** Remove a block from an under-replication queue. */
> synchronized boolean remove(BlockInfo block,
>     int oldReplicas,
>     int oldReadOnlyReplicas,
>     int decommissionedReplicas,
>     int oldExpectedReplicas) {
>   final int priLevel = getPriority(oldReplicas, oldReadOnlyReplicas,
>       decommissionedReplicas, oldExpectedReplicas);
>   boolean removedBlock = remove(block, priLevel);
>   if (priLevel == QUEUE_WITH_CORRUPT_BLOCKS &&
>       oldExpectedReplicas == 1 &&
>       removedBlock) {
>     corruptReplOneBlocks--;
>     assert corruptReplOneBlocks >= 0 :
>         "Number of corrupt blocks with replication factor 1 " +
>         "should be non-negative";
>   }
>   return removedBlock;
> }
>
> /**
>  * Remove a block from the under replication queues.
>  *
>  * The priLevel parameter is a hint of which queue to query
>  * first: if negative or >= {@link #LEVEL} this shortcutting
>  * is not attempted.
>  *
>  * If the block is not found in the nominated queue, an attempt is made to
>  * remove it from all queues.
>  *
>  * <i>Warning:</i> This is not a synchronized method.
>  * @param block block to remove
>  * @param priLevel expected priority level
>  * @return true if the block was found and removed from one of the priority
>  *         queues
>  */
> boolean remove(BlockInfo block, int priLevel) {
>   if (priLevel >= 0 && priLevel < LEVEL
>       && priorityQueues.get(priLevel).remove(block)) {
>     NameNode.blockStateChangeLog.debug(
>         "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block {}" +
>         " from priority queue {}", block, priLevel);
>     return true;
>   } else {
>     // Try to remove the block from all queues if the block was
>     // not found in the queue for the given priority level.
>     for (int i = 0; i < LEVEL; i++) {
>       if (i != priLevel && priorityQueues.get(i).remove(block)) {
>         NameNode.blockStateChangeLog.debug(
>             "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block" +
>             " {} from priority queue {}", block, i);
>         return true;
>       }
>     }
>   }
>   return false;
> }
> {code}
> This is already fixed on trunk by HDFS-10999, but that ticket introduces 
> new metrics, which I think shouldn't be backported to branch-2.
>  
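
The failure mode in the quoted description can be reproduced with a small 
standalone model. The sketch below is not the Hadoop class and not 
necessarily what the attached patches do: CorruptReplOneDemo, 
removeAndGetLevel, removeBuggy and removeFixed are hypothetical names, blocks 
are plain strings, and the values of LEVEL and QUEUE_WITH_CORRUPT_BLOCKS are 
assumed for illustration. removeBuggy mirrors the quoted logic, where the 
counter depends on the hinted priority; removeFixed shows one possible 
alternative, where the counter depends on the queue the block was actually 
removed from.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the priority queues; a stand-in, not UnderReplicatedBlocks. */
class CorruptReplOneDemo {
  // Assumed values for illustration only.
  static final int LEVEL = 5;
  static final int QUEUE_WITH_CORRUPT_BLOCKS = 4;

  final List<Set<String>> priorityQueues = new ArrayList<>();
  int corruptReplOneBlocks = 0;

  CorruptReplOneDemo() {
    for (int i = 0; i < LEVEL; i++) {
      priorityQueues.add(new HashSet<>());
    }
  }

  /** Queue a corrupt block whose expected replication is 1. */
  void addCorruptReplOne(String block) {
    if (priorityQueues.get(QUEUE_WITH_CORRUPT_BLOCKS).add(block)) {
      corruptReplOneBlocks++;
    }
  }

  /** Returns the queue the block was removed from, or -1 if it was in none. */
  int removeAndGetLevel(String block, int priLevel) {
    if (priLevel >= 0 && priLevel < LEVEL
        && priorityQueues.get(priLevel).remove(block)) {
      return priLevel;
    }
    // Hint missed (or was out of range): fall back to scanning every queue.
    for (int i = 0; i < LEVEL; i++) {
      if (i != priLevel && priorityQueues.get(i).remove(block)) {
        return i;
      }
    }
    return -1;
  }

  /** Mirrors the quoted logic: the decrement is keyed on the hinted level. */
  boolean removeBuggy(String block, int priLevel, int oldExpectedReplicas) {
    boolean removed = removeAndGetLevel(block, priLevel) >= 0;
    if (priLevel == QUEUE_WITH_CORRUPT_BLOCKS
        && oldExpectedReplicas == 1 && removed) {
      corruptReplOneBlocks--;
    }
    return removed;
  }

  /** Hypothetical variant: the decrement is keyed on the actual queue. */
  boolean removeFixed(String block, int priLevel, int oldExpectedReplicas) {
    int actualLevel = removeAndGetLevel(block, priLevel);
    if (actualLevel == QUEUE_WITH_CORRUPT_BLOCKS && oldExpectedReplicas == 1) {
      corruptReplOneBlocks--;
    }
    return actualLevel >= 0;
  }

  public static void main(String[] args) {
    CorruptReplOneDemo buggy = new CorruptReplOneDemo();
    buggy.addCorruptReplOne("blk_1");
    buggy.removeBuggy("blk_1", LEVEL, 1);   // wrong hint, as on file delete
    System.out.println("buggy counter: " + buggy.corruptReplOneBlocks);

    CorruptReplOneDemo fixed = new CorruptReplOneDemo();
    fixed.addCorruptReplOne("blk_1");
    fixed.removeFixed("blk_1", LEVEL, 1);   // same wrong hint
    System.out.println("fixed counter: " + fixed.corruptReplOneBlocks);
  }
}
```

In this model both variants remove the block from its queue, but only the 
second brings the counter back to zero when the hint is LEVEL, which matches 
the stale "missing blocks (with replication factor 1)" value described above.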



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
