[ 
https://issues.apache.org/jira/browse/HDFS-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472529#comment-17472529
 ] 

Yongpan Liu edited comment on HDFS-16043 at 1/11/22, 7:56 AM:
--------------------------------------------------------------

     We hit a problem today that may be related to this patch, although it was 
not fatal. 
     While blocks were being deleted asynchronously, our SREs took several 
DataNodes offline, and some missing blocks then appeared; they were restored 
after an active/standby switchover. Has anyone seen this before?


was (Author: mofei):
     We hit a problem today that may be related to this patch, although it was 
not fatal. 
     While blocks were being deleted asynchronously, our SREs took several 
DataNodes offline, and some missing blocks then appeared; they were restored 
after an active/standby switchover. Has anyone seen this before?

!image-2022-01-11-15-51-44-481.png|width=939,height=133!

> Add markedDeleteBlockScrubberThread to delete blocks asynchronously
> -------------------------------------------------------------------
>
>                 Key: HDFS-16043
>                 URL: https://issues.apache.org/jira/browse/HDFS-16043
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs, namanode
>    Affects Versions: 3.4.0
>            Reporter: Xiangyi Zhu
>            Assignee: Xiangyi Zhu
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: 20210527-after.svg, 20210527-before.svg
>
>          Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> Deleting a large directory caused the NN to hold the lock for too long, 
> which caused our NameNode to be killed by ZKFC.
>  The flame graph shows that the time is mainly spent in 
> removeBlocks(toRemovedBlocks) and in the QuotaCount calculation during inode 
> deletion, with removeBlocks(toRemovedBlocks) taking the larger share.
> h3. Solution:
> 1. removeBlocks is processed asynchronously. A thread is started in the 
> BlockManager to process the deleted blocks and limit how long the lock is held.
>  2. The QuotaCount calculation is optimized, similar to the optimization 
> in HDFS-16000.
> h3. Comparison before and after optimization:
> Test: delete 10 million inodes and 10 million blocks.
>  *before:*
> remove inode elapsed time: 7691 ms
>  remove block elapsed time: 11107 ms
>  *after:*
>  remove inode elapsed time: 4149 ms
>  remove block elapsed time: 0 ms
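
The first item of the solution above (queue the blocks, then let a background 
thread remove them while bounding each lock hold) can be illustrated with a 
minimal sketch. This is not the code from the HDFS-16043 patch: the class 
MarkedDeleteScrubberSketch, its methods, the ReentrantReadWriteLock standing in 
for the namesystem lock, and the time-budget parameter are all assumptions made 
for illustration.

```java
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch of asynchronous block removal; not the actual HDFS code. */
class MarkedDeleteScrubberSketch {
    // Stand-in for the namesystem lock (assumption).
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Block IDs that a delete operation has marked but not yet removed.
    private final ConcurrentLinkedQueue<Long> markedDeleteQueue = new ConcurrentLinkedQueue<>();
    // Upper bound on how long one lock acquisition may last (assumed parameter).
    private final long maxLockHoldNanos;
    // Number of blocks removed so far; written only under the write lock.
    private long removedCount = 0;

    MarkedDeleteScrubberSketch(long maxLockHoldMillis) {
        this.maxLockHoldNanos = maxLockHoldMillis * 1_000_000L;
    }

    /** Delete path: only enqueues, so the caller does not pay for the removal. */
    void markForDeletion(List<Long> blockIds) {
        markedDeleteQueue.addAll(blockIds);
    }

    /** Drains the queue, releasing the lock whenever the time budget is used up. */
    int scrubOnce() {
        int acquisitions = 0;
        while (!markedDeleteQueue.isEmpty()) {
            lock.writeLock().lock();
            acquisitions++;
            long start = System.nanoTime();
            try {
                Long blockId;
                // At least one block is removed per acquisition, so progress is guaranteed.
                while ((blockId = markedDeleteQueue.poll()) != null) {
                    removeBlock(blockId);
                    if (System.nanoTime() - start >= maxLockHoldNanos) {
                        break; // give other lock waiters a chance
                    }
                }
            } finally {
                lock.writeLock().unlock();
            }
        }
        return acquisitions;
    }

    /** Placeholder for the real per-block bookkeeping; here it only counts. */
    private void removeBlock(long blockId) {
        removedCount++;
    }

    long removedCount() {
        lock.readLock().lock();
        try {
            return removedCount;
        } finally {
            lock.readLock().unlock();
        }
    }

    int pendingCount() {
        return markedDeleteQueue.size();
    }

    /** Starts a daemon thread that scrubs periodically until interrupted. */
    Thread startScrubber(long sleepMillis) {
        Thread t = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                scrubOnce();
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "markedDeleteBlockScrubberSketch");
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

In this sketch the delete call returns as soon as the block IDs are queued, 
which is one way a "remove block elapsed time" close to 0 ms could be observed 
on the delete path; the removal cost itself is moved to the background thread 
rather than eliminated.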



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
