ZanderXu commented on PR #5583: URL: https://github.com/apache/hadoop/pull/5583#issuecomment-1535657249
@Hexiaoqiao @ayushtkn After more thought, maybe we can only fix this problem in `processAllPendingDNMessages`, because the namenode cannot tell whether a given report is consistent with the actual replica stored on the DataNode.

**Case 1: The report with the smaller GS is a postponed report and differs from the actual replica on the DataNode.** For example:
- The actual replica on the DN is: blk_1024_1002
- The postponed report is: blk_1024_1001

In this case, the namenode can ignore the postponed report and should not mark the replica as corrupted.

**Case 2: The report with the smaller GS is the newest report and matches the actual replica on the DataNode.** For example:
- The actual replica on the DN is: blk_1024_1001
- The report is: blk_1024_1001
- The storages of this block in the namenode already contain this DN

In this case, the namenode should not ignore the report; it should mark the replica as corrupted. Manually modifying block files on the DataNode can cause this situation.

At present, the namenode can only treat every report as the newest one and update the block's state in its memory accordingly, because the datanode reports its state to the NN only through block reports or `blockReceivedAndDeleted`.

If we modify the logic of `markBlockAsCorrupt`, the namenode will no longer be able to mark the replica as corrupted for case 2. If we instead modify the logic of `processAllPendingDNMessages`, the postponed message is only temporarily ignored for case 2, and the active namenode will mark the replica as corrupted on the next block report from the corresponding DN. A rough sketch of this idea follows.
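Below is a minimal, self-contained Java sketch of the proposed `processAllPendingDNMessages` behavior: a postponed report whose generation stamp (GS) is older than the one recorded by the namenode is skipped rather than marked corrupt, leaving case 2 to be corrected by the DataNode's next full block report. This is not the actual `BlockManager` code; all class, record, and method names here are hypothetical stand-ins for illustration.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch of the proposed stale-GS handling when draining
// postponed DN messages after failover. Names do not match real Hadoop classes.
public class PendingReportSketch {

    /** A queued (postponed) block report entry: block id + reported GS. */
    record ReportedBlock(long blockId, long generationStamp) {}

    /** NameNode-side view of the block: block id + expected GS. */
    record StoredBlock(long blockId, long generationStamp) {}

    private final Queue<ReportedBlock> pendingMessages = new ArrayDeque<>();

    /** Drain postponed messages, skipping reports with an older GS. */
    void processAllPendingMessages(StoredBlock stored) {
        while (!pendingMessages.isEmpty()) {
            ReportedBlock reported = pendingMessages.poll();
            if (reported.blockId() != stored.blockId()) {
                continue; // not the block being reconciled
            }
            if (reported.generationStamp() < stored.generationStamp()) {
                // Case 1: likely an outdated, postponed report -- ignore it here.
                // Case 2 (the on-disk replica really has the old GS) is caught
                // later, when the DataNode sends its next full block report.
                System.out.println("Skipping stale report blk_" + reported.blockId()
                    + "_" + reported.generationStamp());
                continue;
            }
            markReplicaCorruptIfMismatched(reported, stored);
        }
    }

    private void markReplicaCorruptIfMismatched(ReportedBlock reported, StoredBlock stored) {
        // Placeholder for the existing corrupt-replica handling.
        if (reported.generationStamp() != stored.generationStamp()) {
            System.out.println("Marking blk_" + reported.blockId() + " as corrupt");
        }
    }

    public static void main(String[] args) {
        PendingReportSketch sketch = new PendingReportSketch();
        sketch.pendingMessages.add(new ReportedBlock(1024, 1001)); // postponed, stale GS
        sketch.processAllPendingMessages(new StoredBlock(1024, 1002));
    }
}
```

The key point is that skipping a stale-GS entry here is safe: if the stale GS actually matches the on-disk replica (case 2), the subsequent full block report will surface it again and the normal corrupt-replica path will handle it.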