[ 
https://issues.apache.org/jira/browse/HDFS-14852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16934000#comment-16934000
 ] 

Fei Hui edited comment on HDFS-14852 at 9/20/19 3:17 AM:
---------------------------------------------------------

[~kihwal] Thanks for your comments.
I found the issue in the following scenario:
# 2 corrupt blocks appear on the Web UI.
# I delete the 2 corrupt files.
# The Web UI then reports "There are 2 missing blocks" but lists no corrupt 
blocks, as shown in the uploaded image.

I think the block should be removed from all queues, because when 
FSNamesystem calls delete, BlockManager.removeBlock is called, which does: 

{code}
neededReconstruction.remove(block, LowRedundancyBlocks.LEVEL);
{code}
The arguments passed to remove in BlockManager.java (a priLevel of 
LowRedundancyBlocks.LEVEL) mean that the block should be removed from all 
queues.

{quote}
  /**
   * Remove a block from the low redundancy queues.
   *
   * The priLevel parameter is a hint of which queue to query
   * first: if negative or >= {@link #LEVEL} this shortcutting
   * is not attmpted.
   *
   * If the block is not found in the nominated queue, an attempt is made to
   * remove it from all queues.
   *
   * <i>Warning:</i> This is not a synchronized method.
   * @param block block to remove
   * @param priLevel expected privilege level
   * @return true if the block was found and removed from one of the priority
   *         queues
   */
{quote}
The above is the javadoc of LowRedundancyBlocks.remove.
The function is meant to remove the block from all queues when the block is 
not found in the nominated queue, but the implementation of 
LowRedundancyBlocks.remove does not do that: it returns after removing the 
block from the first queue that contains it.
So I changed the implementation of LowRedundancyBlocks.remove so that it works 
as expected.
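
To make the difference concrete, here is a minimal sketch of the two behaviors described above. It is a simplified model, not the Hadoop class and not the attached patch: the class QueueRemovalSketch, its String block ids, and the method names removeFirstMatch and removeFromAll are all hypothetical, and logging and the decrementBlockStat bookkeeping are omitted.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified stand-in for the low redundancy priority queues. */
final class QueueRemovalSketch {
  private final int levels;
  private final List<Set<String>> priorityQueues = new ArrayList<>();

  QueueRemovalSketch(int levels) {
    this.levels = levels;
    for (int i = 0; i < levels; i++) {
      priorityQueues.add(new HashSet<>());
    }
  }

  void add(String block, int priLevel) {
    priorityQueues.get(priLevel).add(block);
  }

  /** Behavior described in the issue: return after the first queue that held the block. */
  boolean removeFirstMatch(String block, int priLevel) {
    if (priLevel >= 0 && priLevel < levels
        && priorityQueues.get(priLevel).remove(block)) {
      return true;
    }
    for (int i = 0; i < levels; i++) {
      if (i != priLevel && priorityQueues.get(i).remove(block)) {
        return true; // any copy in a later queue is left behind
      }
    }
    return false;
  }

  /** Assumed intended behavior: if the hinted queue misses, scan every queue. */
  boolean removeFromAll(String block, int priLevel) {
    if (priLevel >= 0 && priLevel < levels
        && priorityQueues.get(priLevel).remove(block)) {
      return true;
    }
    boolean removed = false;
    for (int i = 0; i < levels; i++) {
      if (i != priLevel && priorityQueues.get(i).remove(block)) {
        removed = true; // keep scanning instead of returning
      }
    }
    return removed;
  }

  boolean contains(String block) {
    for (Set<String> queue : priorityQueues) {
      if (queue.contains(block)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    int level = 5; // plays the role of LEVEL: an out-of-range hint, so no shortcut

    QueueRemovalSketch a = new QueueRemovalSketch(level);
    a.add("blk_1", 1);
    a.add("blk_1", 4);
    a.removeFirstMatch("blk_1", level);
    System.out.println("first-match: still queued = " + a.contains("blk_1")); // true

    QueueRemovalSketch b = new QueueRemovalSketch(level);
    b.add("blk_1", 1);
    b.add("blk_1", 4);
    b.removeFromAll("blk_1", level);
    System.out.println("remove-all: still queued = " + b.contains("blk_1")); // false
  }
}
```

In this model, passing a priLevel equal to the number of levels skips the shortcut, as the javadoc describes, so the only difference between the two methods is whether the scan stops at the first hit.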



> Remove of LowRedundancyBlocks do NOT remove the block from all queues
> ---------------------------------------------------------------------
>
>                 Key: HDFS-14852
>                 URL: https://issues.apache.org/jira/browse/HDFS-14852
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ec
>    Affects Versions: 3.2.0, 3.0.3, 3.1.2, 3.3.0
>            Reporter: Fei Hui
>            Assignee: Fei Hui
>            Priority: Major
>         Attachments: CorruptBlocksMismatch.png, HDFS-14852.001.patch, 
> HDFS-14852.002.patch
>
>
> LowRedundancyBlocks.java
> {code:java}
> // Some comments here
>     if(priLevel >= 0 && priLevel < LEVEL
>         && priorityQueues.get(priLevel).remove(block)) {
>       NameNode.blockStateChangeLog.debug(
>           "BLOCK* NameSystem.LowRedundancyBlock.remove: Removing block {}"
>               + " from priority queue {}",
>           block, priLevel);
>       decrementBlockStat(block, priLevel, oldExpectedReplicas);
>       return true;
>     } else {
>       // Try to remove the block from all queues if the block was
>       // not found in the queue for the given priority level.
>       for (int i = 0; i < LEVEL; i++) {
>         if (i != priLevel && priorityQueues.get(i).remove(block)) {
>           NameNode.blockStateChangeLog.debug(
>               "BLOCK* NameSystem.LowRedundancyBlock.remove: Removing block" +
>                   " {} from priority queue {}", block, i);
>           decrementBlockStat(block, i, oldExpectedReplicas);
>           return true;
>         }
>       }
>     }
>     return false;
>   }
> {code}
> The source code is above; the comment in it reads as follows:
> {quote}
>       // Try to remove the block from all queues if the block was
>       // not found in the queue for the given priority level.
> {quote}
> The function "remove" does NOT remove the block from all queues.
> The function add in LowRedundancyBlocks.java is used in several places, so 
> one block may end up in two or more queues.
> We found that the corrupt blocks do not match the corrupt files on the NN 
> web UI. Maybe it is related to this.
> Initial patch uploaded.


