[jira] [Commented] (HDFS-17218) NameNode should remove its excess blocks from the ExcessRedundancyMap When a DN registers

ASF GitHub Bot (Jira) Thu, 19 Oct 2023 23:30:06 -0700


    [ 
https://issues.apache.org/jira/browse/HDFS-17218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17777584#comment-17777584
 ]


ASF GitHub Bot commented on HDFS-17218:
---------------------------------------

haiyang1987 commented on PR #6176:
URL: https://github.com/apache/hadoop/pull/6176#issuecomment-1772160261

   I tried to implement this based on the timeout mechanism. However, there 
is one case I am unsure about:
   
   - t1: Block1 on DN1 is chosen and added to ExcessRedundancyMap.
   - t2: DN1's heartbeat receives the invalidate command.
   - t3: Because of a serious backlog in DN1's async deletion queue, the 
replica might not be deleted for a prolonged period.
   
   The question is how the NN can define a reasonable timeframe for deciding 
that the entry for Block1 on DN1 in ExcessRedundancyMap has timed out. 
   So far, I haven't thought of a particularly good way to define this. 
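
   To make the timing problem concrete, here is a minimal sketch of a 
fixed-timeout check. `ExcessTimeoutTracker` and its methods are hypothetical 
names for illustration only; this is not the real ExcessRedundancyMap or 
BlockManager API. It only shows that any fixed timeout can expire while the 
replica is still waiting in the DN's async deletion queue.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; NOT the real ExcessRedundancyMap API. */
class ExcessTimeoutTracker {
    private final long timeoutMs;
    // Key "dnUuid/blockId" -> time (ms) at which the replica was marked excess (t1).
    private final Map<String, Long> markedAt = new HashMap<>();

    ExcessTimeoutTracker(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    /** t1: the NN chooses the replica on this DN as excess. */
    void markExcess(String dnUuid, long blockId, long nowMs) {
        markedAt.put(dnUuid + "/" + blockId, nowMs);
    }

    /** The DN reported the replica as deleted, so the entry is no longer pending. */
    void confirmDeleted(String dnUuid, long blockId) {
        markedAt.remove(dnUuid + "/" + blockId);
    }

    /** Entries that have been pending longer than the fixed timeout. */
    List<String> timedOut(long nowMs) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : markedAt.entrySet()) {
            if (nowMs - e.getValue() > timeoutMs) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

   With a timeout of, say, 1000 ms, an entry marked at t1 = 0 is reported as 
timed out at 1001 ms even if DN1 is merely slow and will still delete the 
replica later, which is exactly the ambiguity described above.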
   
   Hi @Hexiaoqiao @ZanderXu @zhangshuyan0, do you have any suggestions for 
this case?
   Looking forward to your feedback. Thanks~
   




> NameNode should remove its excess blocks from the ExcessRedundancyMap When a 
> DN registers
> -----------------------------------------------------------------------------------------
>
>                 Key: HDFS-17218
>                 URL: https://issues.apache.org/jira/browse/HDFS-17218
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Haiyang Hu
>            Assignee: Haiyang Hu
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: image-2023-10-12-15-52-52-336.png
>
>
> Currently, a DN loses all pending DNA_INVALIDATE blocks if it restarts.
> *Root cause*
> The DN currently has asynchronous deletion enabled, so it holds many blocks 
> pending deletion in memory.
> When the DN restarts, these cached blocks may be lost. This causes some 
> blocks in the NameNode's excess map to be leaked, which results in many 
> blocks having more replicas than expected.
> *Solution*
> The NameNode should remove the DN's excess blocks from the 
> ExcessRedundancyMap when the DN registers.
> This ensures that when the DN's full block report is processed, 
> 'processExtraRedundancy' can run according to the actual state of the 
> blocks.
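
The proposed solution can be sketched roughly as follows. This is only an 
illustration under stated assumptions: `ExcessReplicaIndex` and 
`onDatanodeRegister` are hypothetical names, not the actual NameNode code. 
The idea is that all excess entries recorded for a DN are dropped when that 
DN registers, so the following full block report is evaluated against the 
replicas the DN really holds.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; NOT the real ExcessRedundancyMap API. */
class ExcessReplicaIndex {
    // DN UUID -> block IDs whose replica on that DN was marked excess.
    private final Map<String, Set<Long>> excessByDn = new HashMap<>();

    void addExcess(String dnUuid, long blockId) {
        excessByDn.computeIfAbsent(dnUuid, k -> new HashSet<>()).add(blockId);
    }

    boolean isExcess(String dnUuid, long blockId) {
        Set<Long> blocks = excessByDn.get(dnUuid);
        return blocks != null && blocks.contains(blockId);
    }

    /**
     * Called when a DN registers (for example after a restart).
     * Drops every excess entry for that DN and returns how many were dropped.
     */
    int onDatanodeRegister(String dnUuid) {
        Set<Long> removed = excessByDn.remove(dnUuid);
        return removed == null ? 0 : removed.size();
    }
}
```

Entries belonging to other DNs are left untouched, so only the registering 
DN's possibly stale state is reset.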



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
