[ 
https://issues.apache.org/jira/browse/HDFS-1143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867819#action_12867819
 ] 

Scott Chen commented on HDFS-1143:
----------------------------------

In this patch, we add an inner class called subtreeCleaner to FSNamesystem to 
perform the background deletion.
This class contains a single-thread executor and provides a method called 
asyncDelete(INode).
This method cleans up the detached subtree and its blocks in the background.

We also modify the delete method so that it detaches the subtree, deletes the 
file lease, and then calls asyncDelete().
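
To make the flow concrete, here is a rough, self-contained sketch of the idea. 
It is not the code in the attached patch: Node, MiniNamesystem, blockCount and 
shutdownAndWait are hypothetical stand-ins invented for illustration, and the 
real FSNamesystem locking, lease handling and blocks map are left out. Only the 
names SubtreeCleaner and asyncDelete follow the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an inode; NOT the real HDFS INode class.
class Node {
    final String name;
    final int blockCount;                 // blocks owned by this node (assumed field)
    final List<Node> children = new ArrayList<>();
    Node parent;

    Node(String name, int blockCount) {
        this.name = name;
        this.blockCount = blockCount;
    }

    Node add(Node child) {
        child.parent = this;
        children.add(child);
        return this;
    }
}

// Hypothetical miniature of the namesystem; the real one holds a lock in delete().
class MiniNamesystem {

    // Background cleaner backed by a single worker thread.
    class SubtreeCleaner {
        private final ExecutorService executor = Executors.newSingleThreadExecutor();

        // Queue a subtree that has already been detached from the namespace.
        void asyncDelete(Node detachedRoot) {
            executor.execute(() -> cleanup(detachedRoot));
        }

        // Post-order walk: clean the children first, then this node's blocks.
        private void cleanup(Node node) {
            for (Node child : node.children) {
                cleanup(child);
            }
            // Stand-in for removing the node's blocks from the blocks map.
            blocksInvalidated.addAndGet(node.blockCount);
            node.children.clear();
        }

        // Stop accepting work and wait for queued cleanups to finish.
        boolean shutdownAndWait(long timeoutMs) {
            executor.shutdown();
            try {
                return executor.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    final AtomicInteger blocksInvalidated = new AtomicInteger();
    final SubtreeCleaner cleaner = new SubtreeCleaner();

    // Fast foreground path: detach the subtree, then hand it to the cleaner.
    void delete(Node target) {
        if (target.parent != null) {
            target.parent.children.remove(target);
            target.parent = null;
        }
        // The file lease would be removed here; omitted in this sketch.
        cleaner.asyncDelete(target);
    }
}
```

With this shape the caller only pays for detaching one child reference; the 
walk over the subtree happens on the executor thread. A single worker thread 
also means cleanups run one at a time, in the order they were queued.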

On my PC, the original unit test takes about 1000ms to perform the large 
deletion.
The modified one takes about 20ms.

> Implement Background deletion
> -----------------------------
>
>                 Key: HDFS-1143
>                 URL: https://issues.apache.org/jira/browse/HDFS-1143
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 0.22.0
>            Reporter: Dmytro Molkov
>            Assignee: Scott Chen
>             Fix For: 0.22.0
>
>         Attachments: HDFS-1143.txt
>
>
> Right now if you try to delete a massive number of files from the namenode it 
> will freeze (sometimes for minutes). Most of the time is spent going through 
> the blocks map and invalidating all the blocks.
> This can probably be improved by having a background GC process. The deletion 
> will basically just remove the inode being deleted and then give the subtree 
> that was just deleted to the background thread running cleanup.
> This way the namenode becomes available for the clients soon after deletion, 
> and all the heavy operations are done in the background.
> Thoughts?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
