On Tue, Nov 24, 2015 at 10:36:26PM +0100, Christoph Anton Mitterer wrote:
> On Tue, 2015-11-24 at 21:27 +0000, Hugo Mills wrote:
> > -p only sends the file metadata for the changes from the reference
> > snapshot to the sent snapshot. -c sends all the file metadata, but
> > will preserve the reflinks between the sent snapshot and the (one or
> > more) reference snapshots.
> Let me see if I got that right:
> - -p sends just the differences, for both data and meta-data.
> - Plus, -c sends *all* the metadata, you said... but will it send all
> data (and simply ignore what's already there) or will it also just send
> the differences in terms of data?

Well, if you have a snapshot A, snap to A', and then send -p A A',
it'll send the same amount of data as send -c A A'. However, the effect
on the receiving system is slightly different in terms of the subvol
metadata -- with -p, it will preserve the information that A and A' are
snapshots of the same original. With -c, it won't preserve that.

This will probably have knock-on effects in terms of round-tripping the
snapshots (e.g. for restoring one to the hosed system and continuing
with the incremental backup scheme). I'd have to do some hard thinking
again with the send/receive algebra to work out what the effect would
be, but with the -c approach, you'd probably have difficulties. The
round-tripping feature hasn't been implemented yet, so the point is
currently moot, but it's certainly possible to do it (with a small send
stream change), and it probably will be done at some point.

> - So that means effectively I'll end up with the same... right?
>
> In other words, -p should be a tiny bit faster... but not that extremely much
> (unless I have tons[0] of metadata changes)

Yes.

> > You can only use one -p (because there's
> > only one difference you can compute at any one time), but you can use
> > as many -c as you like (because you can share extents with any number
> > of subvols).
> So that means, if it would work correctly, -p would be the right choice
> for me, as I never have multiple snapshots that I need to draw my
> relinks from, right?

Correct. The -c case is much less often needed. It's useful if you
have, say, several otherwise unrelated subvols that you need to transfer
efficiently from a filesystem that has had dedup run on it. (Other use
cases may apply as well).

> > In implementation terms, on the receiver, -p takes a (writable)
> > snapshot of the reference subvol, and modifies it according to the
> > stream data. -c makes a new empty subvol, and populates it from
> > scratch, using the reflink ioctl to use data which is known to exist
> > in the reference subvols.
> I see...
> I think the manpage needs more information like this... :)

[snip]

> [0] People may argue that one has XXbytes of metadata, and tons are a
> measurement of weight... but when I recently carried 4 of the 8TB HDDs
> in my back... I came to the conclusion that data correlates to gram ;-)

Yeah, I've met that particular equation too... :)

Hugo.

-- 
Hugo Mills             | Anyone who claims their cryptographic protocol is
hugo@... carfax.org.uk | secure is either a genius or a fool. Given the
http://carfax.org.uk/  | genius/fool ratio for our species, the odds aren't
PGP: E2AB1DE4          | good.                                Bruce Schneier
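
For reference, the two invocations compared in this message (send -p A A' versus send -c A A') might be written out roughly as follows. This is only a dry-run sketch: it prints the command lines instead of running them, so it needs neither root nor a btrfs filesystem. The paths /mnt/pool and /mnt/backup, the snapshot name A2 (standing in for A'), the extra subvols B and C, and the two helper functions are all hypothetical and not taken from the thread. It also assumes that the reference snapshots are read-only and already present on the receiving side.

```shell
#!/bin/sh
# Dry-run sketch: prints btrfs command lines instead of executing them.
# All paths and snapshot names are hypothetical; A2 stands in for A'.
SRC=/mnt/pool
DST=/mnt/backup

# -p: send only the changes from one parent snapshot to the target.
# Only a single -p can be given.
send_with_parent() {
    echo "btrfs send -p $SRC/$1 $SRC/$2 | btrfs receive $DST"
}

# -c: send the target, sharing extents with each listed clone source.
# Any number of -c options may be given.
send_with_clones() {
    target=$1
    shift
    clones=""
    for c in "$@"; do
        clones="$clones -c $SRC/$c"
    done
    echo "btrfs send$clones $SRC/$target | btrfs receive $DST"
}

send_with_parent A A2
send_with_clones A2 A
send_with_clones A2 A B C
```

The last call illustrates the multi-source case described in the message, where several otherwise unrelated subvols share extents with the one being sent.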