[ https://issues.apache.org/jira/browse/HDFS-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472529#comment-17472529 ]
Yongpan Liu edited comment on HDFS-16043 at 1/11/22, 7:56 AM:
--------------------------------------------------------------

We hit a problem today that may be related to this patch, though it was not fatal. While blocks were being deleted asynchronously, SRE took several DataNodes offline, and some missing blocks then appeared; they were restored after the active/standby switchover. Have you seen this before?

was (Author: mofei):
We hit a problem today that may be related to this patch, though it was not fatal. While blocks were being deleted asynchronously, SRE took several DataNodes offline, and some missing blocks then appeared; they were restored after the active/standby switchover. Have you seen this before?

!image-2022-01-11-15-51-44-481.png|width=939,height=133!

> Add markedDeleteBlockScrubberThread to delete blocks asynchronously
> -------------------------------------------------------------------
>
>                 Key: HDFS-16043
>                 URL: https://issues.apache.org/jira/browse/HDFS-16043
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs, namanode
>    Affects Versions: 3.4.0
>            Reporter: Xiangyi Zhu
>            Assignee: Xiangyi Zhu
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: 20210527-after.svg, 20210527-before.svg
>
>          Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> Deleting a large directory caused the NN to hold the lock for too long,
> which caused our NameNode to be killed by ZKFC.
> The flame graph shows that the main costs are the QuotaCount calculation,
> incurred when removing blocks and deleting inodes, and
> removeBlocks(toRemovedBlocks) itself, which takes the larger share of the time.
> h3. Solution:
> 1. Process removeBlocks asynchronously. A thread is started in the
> BlockManager to process the deleted blocks and limit how long the lock is held.
> 2. Optimize the QuotaCount calculation, similar to the optimization in
> HDFS-16000.
> h3. Comparison before and after optimization:
> Test: delete 10 million (1000w) inodes and 10 million (1000w) blocks.
> *before:*
> remove inode elapsed time: 7691 ms
> remove block elapsed time: 11107 ms
> *after:*
> remove inode elapsed time: 4149 ms
> remove block elapsed time: 0 ms
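
For readers unfamiliar with the approach described in the issue (a background thread that removes marked blocks while bounding how long the lock is held), the idea can be sketched roughly as follows. This is an illustrative sketch only, not the code from the patch: the class AsyncBlockDeleteSketch, its methods, the blocksPerLockHold parameter, and the use of a plain set of block IDs in place of the real block map are all assumptions made for the example.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Illustrative sketch only; all names here are hypothetical, not Hadoop's. */
public class AsyncBlockDeleteSketch {
    // Stand-in for the namesystem lock that the delete path contends on.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Stand-in for the block map: just a set of block IDs.
    private final Set<Long> blocksMap = ConcurrentHashMap.newKeySet();
    // Block IDs marked for deletion, waiting for the scrubber.
    private final ConcurrentLinkedQueue<Long> markedDeleteQueue =
        new ConcurrentLinkedQueue<>();
    // Assumed tuning knob: maximum blocks removed per lock acquisition.
    private final int blocksPerLockHold;
    private volatile boolean running = true;

    public AsyncBlockDeleteSketch(int blocksPerLockHold) {
        this.blocksPerLockHold = blocksPerLockHold;
    }

    public void addBlock(long id) {
        blocksMap.add(id);
    }

    /** Called by the delete path: only enqueues, so it returns quickly. */
    public void markDeleted(List<Long> ids) {
        markedDeleteQueue.addAll(ids);
    }

    /** Removes at most blocksPerLockHold blocks under one lock hold. */
    public int scrubOnce() {
        int removed = 0;
        lock.writeLock().lock();
        try {
            Long id;
            while (removed < blocksPerLockHold
                    && (id = markedDeleteQueue.poll()) != null) {
                blocksMap.remove(id);
                removed++;
            }
        } finally {
            // Releasing between batches lets other operations take the lock.
            lock.writeLock().unlock();
        }
        return removed;
    }

    /** Starts the background scrubber; it sleeps briefly when idle. */
    public Thread startScrubber() {
        Thread t = new Thread(() -> {
            while (running) {
                if (scrubOnce() == 0) {
                    try {
                        Thread.sleep(10);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            }
        }, "markedDeleteBlockScrubber-sketch");
        t.setDaemon(true);
        t.start();
        return t;
    }

    public void stop() {
        running = false;
    }

    public int blockCount() {
        return blocksMap.size();
    }

    public int pendingCount() {
        return markedDeleteQueue.size();
    }
}
```

In this sketch the delete call itself only enqueues block IDs, which would be consistent with the "remove block elapsed time: 0 ms" figure reported after the change, while the actual removal cost moves to the background thread in bounded batches. It also means that blocks marked for deletion stay in the map for some time after the delete returns.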