[ https://issues.apache.org/jira/browse/HDFS-15994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340857#comment-17340857 ]
Qi Zhu edited comment on HDFS-15994 at 5/7/21, 2:08 PM:
--------------------------------------------------------

Thanks a lot [~hexiaoqiao] for the reply.

The deletion process:
1) Namespace delete: remove the file's metadata from the inode tree.
2) Remove blocks: remove the blocks from the BlockMap and add them to InvalidateBlocks.
3) Wait for the ReplicationMonitor to trigger the delete work, which is sent to the DNs via heartbeat for deletion.

Step 2 accounts for about 90% of the RPC handler time spent on a deletion. Step 3 is asynchronous and does not affect the RPC handler.

As for using `release lock - sleep - acquire lock` to keep the NameNode from hanging for a long time: I am not sure how else to keep the lock from being too busy, so I just offered this option to release it.

As for the concern that multi-threaded deletion may use too many handlers, we can make the deletion asynchronous to release the handler in a follow-up Jira.

For this Jira, we can discuss how to keep the lock from being too busy, other than the `release lock - sleep - acquire lock` option.

cc [~weichiu] [~sodonnell] [~ayushtkn] What are your opinions?

Thanks.
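To make the `release lock - sleep - acquire lock` option concrete for discussion, here is a minimal sketch in plain Java. It is not the attached patch and not NameNode code: the class `BatchedBlockRemoval`, its fields, and the parameters `batchSize`, `pendingThreshold` and `sleepMillis` are all hypothetical, and a fair `ReentrantReadWriteLock` stands in for the namesystem lock. It queues blocks in batches under the write lock, releases the lock between batches, and sleeps only when the pending-deletion queue is above a threshold.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch; none of these names come from HDFS. */
public class BatchedBlockRemoval {
    // Stand-in for the namesystem lock (fair mode is an assumption).
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
    // Stand-in for InvalidateBlocks: blocks waiting to be deleted on DNs.
    private final List<Long> invalidateBlocks = new ArrayList<>();

    /** Number of blocks currently pending deletion. */
    public int pendingDeletionCount() {
        lock.readLock().lock();
        try {
            return invalidateBlocks.size();
        } finally {
            lock.readLock().unlock();
        }
    }

    /**
     * Queues the blocks for deletion in batches. The write lock is released
     * after every batch so other waiters can run; if the pending queue is
     * larger than pendingThreshold, the caller also sleeps before
     * re-acquiring the lock. Returns how many times it slept.
     */
    public int removeBlocks(List<Long> blocks, int batchSize,
                            int pendingThreshold, long sleepMillis) {
        int sleeps = 0;
        int i = 0;
        while (i < blocks.size()) {
            lock.writeLock().lock();
            try {
                int end = Math.min(i + batchSize, blocks.size());
                for (; i < end; i++) {
                    invalidateBlocks.add(blocks.get(i));
                }
            } finally {
                lock.writeLock().unlock(); // release lock
            }
            // Back off only if work remains and the queue is too long.
            if (i < blocks.size() && pendingDeletionCount() > pendingThreshold) {
                try {
                    Thread.sleep(sleepMillis); // sleep
                    sleeps++;
                } catch (InterruptedException e) {
                    // Keep the interrupt flag and finish without sleeping.
                    Thread.currentThread().interrupt();
                }
            }
            // The next loop iteration re-acquires the lock.
        }
        return sleeps;
    }
}
```

In this sketch the handler thread is still occupied while it sleeps, which is the same handler-usage concern raised above; making the deletion asynchronous would be a separate change.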
> Deletion should sleep some time, when there are too many pending deletion blocks.
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-15994
>                 URL: https://issues.apache.org/jira/browse/HDFS-15994
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Qi Zhu
>            Assignee: Qi Zhu
>            Priority: Major
>         Attachments: HDFS-15994.001.patch
>
>
> HDFS-13831 realized that we can control how often other waiters get a chance to acquire the lock.
> But in our big cluster with heavy deletion, the problem still happens: the number of pending deletion blocks sometimes exceeds ten million, and it is regularly above 1 million in huge clusters.
> So I think we should sleep for some time when there are too many pending deletion blocks.