[ 
https://issues.apache.org/jira/browse/HDFS-16999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17720073#comment-17720073
 ] 

ASF GitHub Bot commented on HDFS-16999:
---------------------------------------

Hexiaoqiao commented on PR #5622:
URL: https://github.com/apache/hadoop/pull/5622#issuecomment-1537031643

   > If the first block report is processed using processFirstBlockReport when 
datanode restart, the incorrect metadata in Namenode memory will not be fixed 
until the next block report. In my opinion, the processing of the first block 
report after datanode restart should be the same as the processing of 
subsequent block reports.
   
   Make sense to me. +1 from my side. 
   
   cc @jojochuang @ayushtkn @goiri Would you mind to take another reviews? 
Thanks.




> Fix wrong use of processFirstBlockReport()
> ------------------------------------------
>
>                 Key: HDFS-16999
>                 URL: https://issues.apache.org/jira/browse/HDFS-16999
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Shuyan Zhang
>            Assignee: Shuyan Zhang
>            Priority: Major
>              Labels: pull-request-available
>
> `processFirstBlockReport()` is used to process first block report from 
> datanode. It does not calculating `toRemove` list because it believes that 
> there is no metadata about the datanode in the namenode. However, If a 
> datanode is re registered after restarting, its `blockReportCount` will be 
> updated to 0. That is to say, the first block report after a datanode 
> restarts will be processed by `processFirstBlockReport()`.  This is 
> unreasonable because the metadata of the datanode already exists in namenode 
> at this time, and if redundant replica metadata is not removed in time, the 
> blocks with insufficient replicas cannot be reconstruct in time, which 
> increases the risk of missing block. In summary, `processFirstBlockReport()` 
> should only be used when the namenode restarts, not when the datanode 
> restarts. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to