[ https://issues.apache.org/jira/browse/HDFS-17191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Íñigo Goiri updated HDFS-17191:
-------------------------------
    Summary: Delete operation adds a thread to collect blocks asynchronously  (was: HDFS: Delete operation adds a thread to collect blocks asynchronously)

> Delete operation adds a thread to collect blocks asynchronously
> ---------------------------------------------------------------
>
>                 Key: HDFS-17191
>                 URL: https://issues.apache.org/jira/browse/HDFS-17191
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>    Affects Versions: 3.4.0
>            Reporter: Xiangyi Zhu
>            Assignee: Xiangyi Zhu
>            Priority: Major
>
> When we delete a large directory, collecting the blocks in the deleted
> subtree is time-consuming. Currently, block collection is executed while
> holding the write lock, so deleting a large directory may block other RPCs
> for a period of time. Asynchronous deletion of the collected blocks has
> already been implemented; see
> https://issues.apache.org/jira/browse/HDFS-16043.
> In fact, collecting blocks does not require locking, because once the
> subtree has been deleted it will no longer be accessed by other RPCs. We can
> therefore collect the blocks of the deleted subtree asynchronously and
> without locking.
> But there may be some problems:
> 1. When a parent node of the subtree is configured with a quota, the quota
> update is no longer synchronous and there will be a small delay.
> 2. Because the root directory always has the DirectoryWithQuotaFeature
> attribute, we need to update the quotaUsage of the root directory anyway. In
> addition, the root directory does not have an upper limit for quota
> configuration. I think we can ignore the delayed update of quota for the root
> directory.
> To address these problems, we can check whether any parent directory of the
> subtree is configured with a quota. If none is, use asynchronous collection.
> We can also add a configuration option to let users decide whether to enable
> quota checking.
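
The decision rule proposed in the description can be sketched as follows. This is only an illustrative model under stated assumptions, not the HDFS implementation or the eventual patch: `AsyncDeleteSketch`, `Dir`, `canCollectAsync`, `collectBlocks`, and `delete` are hypothetical names, quota is reduced to a boolean flag, and the write lock is indicated only by a comment. The root directory is skipped in the quota check, following point 2 of the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

// Hypothetical, simplified model; none of these names are real HDFS classes.
class AsyncDeleteSketch {

    /** Minimal stand-in for a directory inode. */
    static final class Dir {
        final String name;
        final Dir parent;        // null for the root directory
        final boolean hasQuota;  // true if a quota is configured on this directory
        final List<Long> blockIds = new ArrayList<>();
        final List<Dir> children = new ArrayList<>();

        Dir(String name, Dir parent, boolean hasQuota) {
            this.name = name;
            this.parent = parent;
            this.hasQuota = hasQuota;
            if (parent != null) {
                parent.children.add(this);
            }
        }
    }

    /**
     * Returns true when no ancestor of the subtree, other than the root,
     * has a quota configured. The root is ignored on the assumption that a
     * delayed quota update there is acceptable.
     */
    static boolean canCollectAsync(Dir subtreeRoot) {
        for (Dir d = subtreeRoot.parent; d != null && d.parent != null; d = d.parent) {
            if (d.hasQuota) {
                return false;
            }
        }
        return true;
    }

    /** Recursively gathers every block ID in the subtree. */
    static List<Long> collectBlocks(Dir dir) {
        List<Long> out = new ArrayList<>(dir.blockIds);
        for (Dir child : dir.children) {
            out.addAll(collectBlocks(child));
        }
        return out;
    }

    /**
     * Detaches the subtree, then collects its blocks either on the given
     * executor (no quota on any non-root ancestor) or inline.
     */
    static CompletableFuture<List<Long>> delete(Dir subtreeRoot, ExecutorService pool) {
        // In a real namesystem this detach step would run under the write lock.
        if (subtreeRoot.parent != null) {
            subtreeRoot.parent.children.remove(subtreeRoot);
        }
        if (canCollectAsync(subtreeRoot)) {
            // The detached subtree is unreachable by other callers, so no lock is held here.
            return CompletableFuture.supplyAsync(() -> collectBlocks(subtreeRoot), pool);
        }
        // A quota is configured above the subtree: collect synchronously so usage stays exact.
        return CompletableFuture.completedFuture(collectBlocks(subtreeRoot));
    }
}
```

A caller would pass a dedicated executor (for example a single background thread) and consume the returned future to hand the block list to the existing asynchronous block-deletion path. The user-facing switch mentioned in the description could be modeled as an extra boolean that bypasses `canCollectAsync`; it is omitted here to keep the sketch short.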