[
https://issues.apache.org/jira/browse/HADOOP-432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469617
]
Doug Cutting commented on HADOOP-432:
-------------------------------------
> expunge(), as implemented, purges the entire trash
No, it only removes things in checkpoint folders older than the interval. In
particular, it never removes the current trash, and it won't remove a checkpoint
until it's older than the interval.
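The age check described above can be sketched roughly as follows. This is a hypothetical illustration, not Hadoop's actual Trash code: the class and method names are invented, and checkpoints are represented as plain timestamps rather than directory names.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the expunge age check: only checkpoints older
// than the configured interval are selected for deletion; the current
// trash and any younger checkpoints are left alone.
public class ExpungeSketch {
    static List<Instant> selectExpired(List<Instant> checkpointTimes,
                                       Duration interval, Instant now) {
        List<Instant> expired = new ArrayList<>();
        for (Instant t : checkpointTimes) {
            if (t.plus(interval).isBefore(now)) {
                expired.add(t);   // older than the interval -> purge
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2007-01-31T00:00:00Z");
        Duration interval = Duration.ofHours(6);
        List<Instant> checkpoints = List.of(
            Instant.parse("2007-01-30T10:00:00Z"),   // 14h old -> expired
            Instant.parse("2007-01-30T22:00:00Z"));  // 2h old  -> kept
        System.out.println(selectExpired(checkpoints, interval, now).size()); // prints 1
    }
}
```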
> results in a large load on the namenode
Expunge lists checkpoints, then removes each entire checkpoint with a single
call to the namenode. So it could take a long time on the namenode if a
checkpoint contains many files, but it doesn't make many calls to the namenode:
all directory enumeration except for the top-level checkpoints is done
server-side, at the namenode, so the RPC load on the namenode is minimized.
> It would be nice if the code, when deleting a file, checked if the source
> file is already in the trash and would expunge it
Yes, that would be a useful feature. Calling moveToTrash() on any path that
begins with the trash's root should cause it to be immediately removed. +1
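The proposed check can be sketched as a simple prefix test on the path. This is a hypothetical illustration using `java.nio.file.Path` rather than Hadoop's own path type; the method name and trash-root location are assumptions for the example.

```java
import java.nio.file.Path;

// Hypothetical sketch of the proposed behavior: moving a path that is
// already inside the trash root deletes it immediately, instead of
// nesting it deeper inside the trash.
public class MoveToTrashSketch {
    static String moveToTrash(Path path, Path trashRoot) {
        if (path.normalize().startsWith(trashRoot.normalize())) {
            return "delete";   // already trashed: remove permanently
        }
        return "move";         // otherwise move it under the current trash
    }

    public static void main(String[] args) {
        System.out.println(moveToTrash(Path.of("/user/doug/.Trash/Current/f"),
                                       Path.of("/user/doug/.Trash")));  // prints delete
        System.out.println(moveToTrash(Path.of("/user/doug/f"),
                                       Path.of("/user/doug/.Trash")));  // prints move
    }
}
```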
> support undelete, snapshots, or other mechanism to recover lost files
> ---------------------------------------------------------------------
>
> Key: HADOOP-432
> URL: https://issues.apache.org/jira/browse/HADOOP-432
> Project: Hadoop
> Issue Type: Improvement
> Components: dfs
> Reporter: Yoram Arnon
> Assigned To: Doug Cutting
> Attachments: trash.patch, undelete12.patch, undelete16.patch,
> undelete17.patch
>
>
> Currently, once you delete a file it's gone forever.
> Most file systems allow some form of recovery of deleted files.
> A simple solution would be an 'undelete' command.
> A more comprehensive solution would include snapshots, manual and automatic,
> with scheduling options.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.