On Wed, May 25, 2016 at 04:00:00AM -0700, H. Peter Anvin wrote:
> On 05/25/16 02:29, Hugo Mills wrote:
> > On Wed, May 25, 2016 at 01:58:15AM -0700, H. Peter Anvin wrote:
> >> Hi,
> >>
> >> I'm looking at using a btrfs with snapshots to implement a generational
> >> backup capacity. However, doing it the naïve way would have the side
> >> effect that for a file that has been partially modified, after
> >> snapshotting the file would be written with *mostly* the same data. How
> >> does btrfs' COW algorithm deal with that? If necessary I might want to
> >> write some smarter user space utilities for this.
> >
> > Sounds like it might be a job for one of the dedup tools
> > (duperemove, bedup), or, if you're writing your own, the safe
> > deduplication ioctl which underlies those tools.
> >
>
> I guess I would prefer if data wasn't first duplicated and then
> deduplicated if possible. It sounds like I ought to write a "smart
> copy-overwrite" tool for this.
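
For illustration, a "smart copy-overwrite" along those lines could look roughly like the sketch below. It compares source and destination block by block and rewrites only the blocks that differ, so on a COW filesystem the untouched blocks should stay shared with the previous snapshot. The names smart_overwrite and BLOCK_SIZE are made up for this example, the 4 KiB block size is an assumption, and none of this has been tested on btrfs.

```python
import os

BLOCK_SIZE = 4096  # assumption: a common filesystem block size, tune as needed


def smart_overwrite(src_path, dst_path, block_size=BLOCK_SIZE):
    """Make dst identical to src, rewriting only the blocks that differ.

    Returns the number of blocks actually written (hypothetical helper).
    """
    written = 0
    # Create dst if it is missing, but never truncate an existing file:
    # truncating would throw away the data we want to keep shared.
    fd = os.open(dst_path, os.O_RDWR | os.O_CREAT, 0o644)
    with open(src_path, "rb") as src, os.fdopen(fd, "r+b") as dst:
        offset = 0
        while True:
            new = src.read(block_size)
            if not new:
                break
            dst.seek(offset)
            old = dst.read(len(new))
            if old != new:
                # Only this block changed (or dst is shorter here).
                dst.seek(offset)
                dst.write(new)
                written += 1
            offset += len(new)
        # Drop any tail left over if dst was longer than src.
        dst.truncate(offset)
    return written
```

Calling smart_overwrite("/data/file", "/backup/current/file") before taking each snapshot would be the intended use; both paths are placeholders. Comparing by reading both files costs extra read I/O, which is the trade-off for avoiding the duplicate-then-deduplicate cycle.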
I _think_ rsync --inplace may help here. IIRC, it'll only overwrite
the sections of files that have changed, rather than write and replace
the whole file. (I may be wrong about that, though. I haven't tested
it at that level.)

There are also the in-band dedup patches that have been on the
mailing list recently. That's probably going to need massive amounts
of RAM, though.

Hugo.

--
Hugo Mills             | Putting U back in Honor, Valor, and Trth.
hugo@... carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4          |