[ 
https://issues.apache.org/jira/browse/HDFS-17218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774503#comment-17774503
 ] 

ASF GitHub Bot commented on HDFS-17218:
---------------------------------------

haiyang1987 commented on code in PR #6176:
URL: https://github.com/apache/hadoop/pull/6176#discussion_r1356798872


##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java:
##########
@@ -1007,6 +1013,7 @@ public void updateRegInfo(DatanodeID nodeReg) {
     for(DatanodeStorageInfo storage : getStorageInfos()) {
       if (storage.getStorageType() != StorageType.PROVIDED) {
         storage.setBlockReportCount(0);
+        storage.setBlockContentsStale(true);

Review Comment:
   Thanks @zhangshuyan0 for you comment.
   
   The modifications here may not be directly related to the current problem.
   the reason for modifying this is that if the current dn is re-registered, 
block deletion or exception may occur on the dn during this period, because FBR 
has not yet been completed, and the NN side memory record is different from the 
actual dn block.
   
   If processExtraRedundancyBlock is executed at this time, block loss may 
occur. 
   such as a file has 2 replicas, but only 1 replica is expected.  when 
processExtraRedundancyBlock is executed, a live dn will be choose for deletion, 
which will cause a block miss, so the dn re-registering needs to be marked 
blockContentsStale will avoid this case.





> NameNode should remove its excess blocks from the ExcessRedundancyMap When a 
> DN registers
> -----------------------------------------------------------------------------------------
>
>                 Key: HDFS-17218
>                 URL: https://issues.apache.org/jira/browse/HDFS-17218
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namanode
>            Reporter: Haiyang Hu
>            Assignee: Haiyang Hu
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: image-2023-10-12-15-52-52-336.png
>
>
> Currently found that DN will lose all pending DNA_INVALIDATE blocks if it 
> restarts.
> *Root case*
> Current DN enables asynchronously deletion, it have many pending deletion 
> blocks in memory.
> when DN restarts, these cached blocks may be lost. it causes some blocks in 
> the excess map in the namenode to be leaked and this will result in many 
> blocks having more replicas then expected.
> *solution*
> Consider NameNode should remove its excess blocks from the 
> ExcessRedundancyMap When a DN registers,
> this approach will ensure that when processing the DN's full block report, 
> the 'processExtraRedundancy' can be performed according to the actual of the 
> blocks.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to