Hey list,

I wonder if it is possible to deduplicate read-only snapshots.

Background:

I'm using a bash/rsync script[1] to back up my whole system every night to 
an attached USB3 drive. The script syncs into a scratch area and then takes 
a snapshot of that area. I'd like these snapshots to be immutable, so they 
should be read-only.
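
For illustration, the nightly run boils down to roughly the following. This 
is a simplified sketch, not the script from [1]: the paths in SCRATCH and 
SNAPDIR and the DRY_RUN switch are made up for this example, and by default 
it only prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch only: SCRATCH and SNAPDIR are hypothetical paths, not those of [1].
SCRATCH=${SCRATCH:-/mnt/backup/scratch}    # btrfs subvolume that rsync writes into
SNAPDIR=${SNAPDIR:-/mnt/backup/snapshots}  # directory that collects the snapshots
DRY_RUN=${DRY_RUN:-1}                      # 1 = print the commands, 0 = execute them

run() {
    # Print the command in dry-run mode, execute it otherwise.
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

nightly_backup() {
    local stamp=$1
    # Mirror the root filesystem into the scratch subvolume, then freeze
    # the result as a read-only snapshot (only if rsync succeeded).
    run rsync -aHAX --delete --inplace --one-file-system / "$SCRATCH/" &&
        run btrfs subvolume snapshot -r "$SCRATCH" "$SNAPDIR/$stamp"
}

nightly_backup "$(date +%F)"
```

Running it with DRY_RUN=0 as root would perform the actual sync and create 
the snapshot.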

Since rsync does not detect moved files and instead places a new copy of 
them in the backup, I run the wonderful bedup application[2] on my backup 
drive from time to time to deduplicate it, and it almost always gains back a 
good pile of gigabytes. The remaining storage space concerns are handled by 
rsync's --inplace option (although this won't cover files that were both 
moved and changed between backup runs) and by mounting with 
compress-force=zlib.
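
To show the kind of duplication I mean, here is a rough sketch that lists 
files with identical content under a directory tree, for example the same 
file sitting at its old path in one snapshot and at its new path in the 
next. It is only an illustration and not how bedup works internally; the 
function name list_duplicates is made up for this example, and it assumes 
GNU coreutils (sha256sum) and file names without newlines.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "checksum  path" for every file whose content
# occurs more than once below the directory given as $1.
list_duplicates() {
    find "$1" -type f -exec sha256sum {} + | sort | awk '
        { sum = $1 }
        # Same checksum as the previous line: print the first member of
        # the group once, then every further member.
        sum == prev { if (!shown) print prevline; print; shown = 1 }
        sum != prev { shown = 0 }
        { prev = sum; prevline = $0 }'
}
```

Calling, say, list_duplicates /mnt/backup/snapshots (a hypothetical path) 
prints the duplicated files grouped by checksum. It hashes every file, so it 
is slow on a large backup drive.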

Since bedup sets the immutable attribute on files while it processes them, I 
suspect it will no longer work once I make the snapshots read-only.

I've read about ongoing work to integrate offline (and even online) 
deduplication into the kernel so that the process can be made atomic (and 
even block-based instead of file-based). To my understanding, this would 
make the immutable attribute unnecessary. So, assuming read-only snapshots 
cannot currently be deduplicated this way, will these patches address the 
problem and allow read-only snapshots to be deduplicated? Or are read-only 
snapshots meant to be what the name suggests: immutable, even for 
deduplication?

Regards,
Kai

[1]: https://gist.github.com/kakra/5520370
[2]: https://github.com/g2p/bedup

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
