[jira] [Commented] (HDFS-16999) Fix wrong use of processFirstBlockReport()

ASF GitHub Bot (Jira) Thu, 04 May 2023 20:03:07 -0700


    [ 
https://issues.apache.org/jira/browse/HDFS-16999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17719610#comment-17719610
 ]


ASF GitHub Bot commented on HDFS-16999:
---------------------------------------

zhangshuyan0 opened a new pull request, #5622:
URL: https://github.com/apache/hadoop/pull/5622

   ### Description of PR
   `processFirstBlockReport()` is used to process first block report from 
datanode. It does not calculating `toRemove` list because it believes that 
there is no metadata about the datanode in the namenode. However, If a datanode 
is re registered after restarting, its `blockReportCount` will be updated to 0. 
   
https://github.com/apache/hadoop/blob/c7699d3dcd4f8feaf2c5ae5943b8a4cec738e95d/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java#L1007-L1012
   That is to say, the first block report after a datanode restarts will be 
processed by `processFirstBlockReport()`. 
   
https://github.com/apache/hadoop/blob/c7699d3dcd4f8feaf2c5ae5943b8a4cec738e95d/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L2916-L2925
   This is unreasonable because the metadata of the datanode already exists in 
namenode at this time, and if redundant replica metadata is not removed in 
time, the blocks with insufficient replicas cannot be reconstruct in time, 
which increases the risk of missing block. In summary, 
`processFirstBlockReport()` should only be used when the namenode restarts, not 
when the datanode restarts.
   ### How was this patch tested?
   Add a new unit test.
   




> Fix wrong use of processFirstBlockReport()
> ------------------------------------------
>
>                 Key: HDFS-16999
>                 URL: https://issues.apache.org/jira/browse/HDFS-16999
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Shuyan Zhang
>            Assignee: Shuyan Zhang
>            Priority: Major
>
> `processFirstBlockReport()` is used to process first block report from 
> datanode. It does not calculating `toRemove` list because it believes that 
> there is no metadata about the datanode in the namenode. However, If a 
> datanode is re registered after restarting, its `blockReportCount` will be 
> updated to 0. That is to say, the first block report after a datanode 
> restarts will be processed by `processFirstBlockReport()`.  This is 
> unreasonable because the metadata of the datanode already exists in namenode 
> at this time, and if redundant replica metadata is not removed in time, the 
> blocks with insufficient replicas cannot be reconstruct in time, which 
> increases the risk of missing block. In summary, `processFirstBlockReport()` 
> should only be used when the namenode restarts, not when the datanode 
> restarts. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (HDFS-16999) Fix wrong use of processFirstBlockReport()

Reply via email to