[GitHub] couchdb-couch pull request: Extend rename_on_delete behaviour on c...

davisp Mon, 18 Apr 2016 14:51:07 -0700

Github user davisp commented on the pull request:

    https://github.com/apache/couchdb-couch/pull/161#issuecomment-211596768
  
    Thinking this through, I think we may be conflating two different 
features that, while similar, serve different purposes.
    
    Specifically, there's one feature where we rename things into the .delete 
directory so that we can get rid of them quickly as well as manage cleanup when 
the server crashes during deletions. This feature has been in CouchDB for 
years.
    
    The second feature was a small thing that Cloudant added to help with 
recovering from accidental deletions. It was implemented by changing the 
delete into a rename of the .couch file to a name that includes the deletion 
timestamp. This allowed operators to recover a "deleted" database for 
clients.
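    
    As a rough illustration of that second feature, here is a Python sketch 
(not the CouchDB code). The exact name layout, the timestamp format, and the 
helper name `deleted_name` are assumptions made for the example.

```python
import os
import time


def deleted_name(path: str, now: float) -> str:
    """Build a recoverable name for a "deleted" file (assumed naming scheme)."""
    base, ext = os.path.splitext(path)
    # UTC timestamp of the deletion, e.g. 20160418.215107
    stamp = time.strftime("%Y%m%d.%H%M%S", time.gmtime(now))
    return f"{base}.{stamp}.deleted{ext}"


def rename_on_delete(path: str) -> str:
    """Rename instead of unlinking, so an operator can restore the file later."""
    target = deleted_name(path, time.time())
    os.rename(path, target)
    return target
```

    Restoring the database is then just a matter of renaming the file back.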
    
    This PR appears to merge both of these features by renaming files into 
the .delete directory with useful names. That makes the actual deletion a bit 
subtler: the various renames are munged together, so it's not 100% obvious 
which feature we're handling in the various functions named 
delete/nuke_dir/delete_file/etc.
    
    As a follow-on, the renaming here also tries to move away from using a 
UUID in the .delete directory and instead uses a URL-encoded path of the 
deleted file (relative to the root database directory). At first glance this 
seemed like a good idea, but after researching filename length limits I think 
it would be very easy to break: a user only needs to create a database whose 
encoded path is longer than 255 characters on extX filesystems (and shorter on 
others).
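    
    To make the length concern concrete, here is a rough Python sketch (not 
CouchDB code). It assumes the .delete entry is named by percent-encoding the 
whole relative path and uses the 255 limit mentioned above; 
`encoded_delete_name` and `fits_in_filename` are hypothetical names.

```python
from urllib.parse import quote

NAME_MAX = 255  # per-filename limit on extX, as discussed above


def encoded_delete_name(rel_path: str) -> str:
    """Flatten a relative path into one filename by percent-encoding it."""
    # safe="" also encodes "/", so each separator grows to "%2F".
    return quote(rel_path, safe="")


def fits_in_filename(rel_path: str) -> bool:
    """True if the flattened name fits in a single directory entry."""
    return len(encoded_delete_name(rel_path).encode("utf-8")) <= NAME_MAX


# Every path component here is short enough on its own, but the
# flattened, encoded name is not.
rel_path = "/".join(["a" * 100] * 3) + ".couch"
too_long = not fits_in_filename(rel_path)
```

    Mirroring the directory hierarchy avoids this, since each component is 
then subject only to the limit it already satisfied in the original location.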
    
    What I think we should consider is having all of our deletions move the 
file into the .delete directory, mirroring the original filesystem hierarchy. 
Our existing "rename_on_delete" feature would then be relabeled as 
"delete_after_rename" with a default of true. This gives sysadmins a way to 
recover deleted databases and lets them institute their own policies for 
actual deletion.
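    
    A minimal Python sketch of that proposal, only to show the shape of the 
idea (the real change would be Erlang in couch_file). The function name 
`delete_db_file` and its arguments are hypothetical; only the .delete 
directory and the "delete_after_rename" option come from the discussion above.

```python
import os
import tempfile


def delete_db_file(root_dir: str, rel_path: str,
                   delete_after_rename: bool = True) -> str:
    """Move root_dir/rel_path to root_dir/.delete/rel_path, then maybe unlink.

    Returns the path inside .delete. With delete_after_rename=True that
    file is already gone when the function returns.
    """
    src = os.path.join(root_dir, rel_path)
    dst = os.path.join(root_dir, ".delete", rel_path)
    # Recreate the original directory hierarchy under .delete.
    os.makedirs(os.path.dirname(dst), exist_ok=True)
    # Rename within one filesystem; replaces any older copy at dst.
    os.replace(src, dst)
    if delete_after_rename:
        os.remove(dst)
    return dst


# Usage: keep the renamed file so an operator can restore it later.
with tempfile.TemporaryDirectory() as root:
    rel = os.path.join("shards", "00", "db.couch")
    os.makedirs(os.path.join(root, "shards", "00"))
    with open(os.path.join(root, rel), "w") as f:
        f.write("data")
    kept = delete_db_file(root, rel, delete_after_rename=False)
    recoverable = os.path.exists(kept)
    original_gone = not os.path.exists(os.path.join(root, rel))
```

    With delete_after_rename set to false, cleanup of .delete is left to 
whatever policy the sysadmin chooses, for example a periodic job.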
    
    As a last note, I'd also like to see all of the filesystem/deletion logic 
moved to couch_file. It's quite awkward to have it split between couch_server 
and couch_file. I'd structure the commits so that one moves the existing logic 
to couch_file and a subsequent commit does the filesystem hierarchy mirroring.



