[ http://issues.apache.org/jira/browse/HADOOP-432?page=comments#action_12437000 ]
Yoram Arnon commented on HADOOP-432:
------------------------------------
removing data only when space is required will result in the filesystem always
being 100% full. That has several downsides:
* performance:
  - memory usage and image file size on the namenode stay high, since deleted
    files must still be tracked
  - difficulty finding space for new data - file systems are slower when full
  - to allocate space you must first delete something (right now, online),
    which slows down writes
* contention for space:
  - if a disk is shared between dfs and, say, map-reduce temporary storage
    or the OS, then dfs will take over everything
* when the FS is really full, undelete will fail, because no space is reserved
  for it. Better to declare the FS full earlier and keep the (configured)
  undelete space available - a sketch of that accounting follows this list.
Reserving space up front is also the common way of doing things.
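To make the reservation point concrete, here is a minimal sketch of the
accounting, assuming a configured undelete reserve; the class, field, and
config names are hypothetical and this is not existing namenode code:

{code}
/**
 * Hypothetical sketch: capacity accounting that holds back a configured
 * amount of space for undelete, so the FS declares itself full before the
 * disks are physically full. Names (e.g. an imagined "dfs.undelete.reserved"
 * setting feeding undeleteReserveBytes) are illustrative only.
 */
public class ReservedCapacity {
  private final long rawCapacityBytes;      // total raw disk capacity
  private final long undeleteReserveBytes;  // configured undelete reserve

  public ReservedCapacity(long rawCapacityBytes, long undeleteReserveBytes) {
    this.rawCapacityBytes = rawCapacityBytes;
    this.undeleteReserveBytes = undeleteReserveBytes;
  }

  /** Capacity advertised to clients: raw capacity minus the reserve. */
  public long effectiveCapacity() {
    return Math.max(0L, rawCapacityBytes - undeleteReserveBytes);
  }

  /** Space still writable by clients, given current usage. */
  public long remaining(long usedBytes) {
    return Math.max(0L, effectiveCapacity() - usedBytes);
  }

  /** The FS reports "full" once usage reaches the effective capacity. */
  public boolean isFull(long usedBytes) {
    return usedBytes >= effectiveCapacity();
  }
}
{code}

With something like this, clients see the FS as full while the reserved space
is still physically free for holding deleted files until they are purged or
undeleted.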
> support undelete, snapshots, or other mechanism to recover lost files
> ---------------------------------------------------------------------
>
> Key: HADOOP-432
> URL: http://issues.apache.org/jira/browse/HADOOP-432
> Project: Hadoop
> Issue Type: Improvement
> Components: dfs
> Reporter: Yoram Arnon
> Assigned To: Wendy Chien
>
> currently, once you delete a file it's gone forever.
> most file systems allow some form of recovery of deleted files.
> a simple solution would be an 'undelete' command.
> a more comprehensive solution would include snapshots, manual and automatic,
> with scheduling options.
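As a rough illustration of the 'undelete command' idea above (not the actual
dfs implementation; the trash location, class name, and helper names are all
assumptions), delete could become a rename into a per-user trash directory,
and undelete the reverse rename:

{code}
// Illustrative sketch only: "delete" moves the file under a per-user trash
// directory, "undelete" moves it back. Paths and names are hypothetical;
// error handling is minimal.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;

public class TrashSketch {
  private final FileSystem fs;
  private final Path trashRoot;   // assumed layout: /user/<name>/.Trash

  public TrashSketch(FileSystem fs, String userName) {
    this.fs = fs;
    this.trashRoot = new Path("/user/" + userName + "/.Trash");
  }

  /** "Delete": move the file under the trash root instead of removing it. */
  public boolean moveToTrash(Path file) throws IOException {
    if (!fs.exists(file)) {
      return false;
    }
    fs.mkdirs(trashRoot);
    return fs.rename(file, new Path(trashRoot, file.getName()));
  }

  /** "Undelete": move the file from the trash back to a target location. */
  public boolean restore(String fileName, Path originalLocation) throws IOException {
    Path inTrash = new Path(trashRoot, fileName);
    if (!fs.exists(inTrash)) {
      return false;
    }
    return fs.rename(inTrash, originalLocation);
  }

  public static void main(String[] args) throws IOException {
    FileSystem fs = FileSystem.get(new Configuration());
    TrashSketch trash = new TrashSketch(fs, System.getProperty("user.name"));
    trash.moveToTrash(new Path("/data/old-report.txt"));        // hypothetical file
    trash.restore("old-report.txt", new Path("/data/old-report.txt"));
  }
}
{code}

A periodic purge of the trash directory, bounded by the reserved space
discussed in the comment above, would keep such a mechanism from filling
the disks.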