On Tue, Nov 24, 2015 at 10:36:26PM +0100, Christoph Anton Mitterer wrote:
> On Tue, 2015-11-24 at 21:27 +0000, Hugo Mills wrote:
> >    -p only sends the file metadata for the changes from the reference
> > snapshot to the sent snapshot. -c sends all the file metadata, but
> > will preserve the reflinks between the sent snapshot and the (one or
> > more) reference snapshots.
> Let me see if I got that right:
> - -p sends just the differences, for both data and meta-data.
> - Plus, -c sends *all* the metadata, you said... but will it send all
> data (and simply ignore what's already there) or will it also just send
> the differences in terms of data?

   Well, if you have a snapshot A, take a snapshot A' of it, and then
run send -p A A', it'll send the same amount of data as send -c A A'.

   However, the effect on the receiving system is slightly different
in terms of the subvol metadata -- with -p, it will preserve the
information that A and A' are snapshots of the same original. With -c,
it won't preserve that.

   This will probably have knock-on effects in terms of round-tripping
the snapshots (e.g. for restoring one to the hosed system and
continuing with the incremental backup scheme). I'd have to do some
hard thinking again with the send/receive algebra to work out what the
effect would be, but with the -c approach, you'd probably have
difficulties. The round-tripping feature hasn't been implemented yet,
so the point is currently moot, but it's certainly possible to do it
(with a small send stream change), and it probably will be done at
some point.

> - So that means effectively I'll end up with the same... right?
> 
> In other words, -p should be a tiny bit faster... but not by very much
> (unless I have tons[0] of metadata changes)

   Yes.

> >  You can only use one -p (because there's
> > only one difference you can compute at any one time), but you can use
> > as many -c as you like (because you can share extents with any number
> > of subvols).
> So that means, if it worked correctly, -p would be the right choice
> for me, as I never have multiple snapshots that I need to draw my
> reflinks from, right?

   Correct. The -c case is much less often needed. It's useful if you
have, say, several otherwise unrelated subvols that you need to
transfer efficiently from a filesystem that has had dedup run on it.
(Other use cases may apply as well).
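
   As a rough sketch of the "one -p, many -c" rule, here is a small
Python helper that only builds the command line and never runs it. The
function name and the /snaps/... paths are made up for illustration;
only the -p and -c options of btrfs send are taken from the discussion
above.

```python
from typing import List, Optional, Sequence


def btrfs_send_argv(snapshot: str,
                    parent: Optional[str] = None,
                    clone_sources: Sequence[str] = ()) -> List[str]:
    """Build (but do not run) a `btrfs send` command line.

    Hypothetical helper: the signature itself allows at most one
    parent but any number of clone sources.
    """
    argv = ["btrfs", "send"]
    if parent is not None:
        # Only one -p: one difference can be computed at a time.
        argv += ["-p", parent]
    for src in clone_sources:
        # Any number of -c: extents can be shared with many subvols.
        argv += ["-c", src]
    argv.append(snapshot)
    return argv


# Incremental send against a single reference snapshot.
print(" ".join(btrfs_send_argv("/snaps/A2", parent="/snaps/A")))

# Several otherwise unrelated subvols offered as reflink sources.
print(" ".join(btrfs_send_argv("/snaps/C",
                               clone_sources=["/snaps/A", "/snaps/B"])))
```

   The output of such a command would normally be piped into btrfs
receive on the target filesystem; that part is left out here.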

> >    In implementation terms, on the receiver, -p takes a (writable)
> > snapshot of the reference subvol, and modifies it according to the
> > stream data. -c makes a new empty subvol, and populates it from
> > scratch, using the reflink ioctl to use data which is known to exist
> > in the reference subvols.
> I see...
> I think the manpage needs more information like this... :)
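
   The receiver behaviour quoted above can be pictured with a toy
model. This is not how btrfs receive is written; Subvol, receive_p and
receive_c are hypothetical names, and a "file" is reduced to a path
mapped to an extent id. It only illustrates that both routes end up
with the same contents, while only the -p route records which subvol
the new one was snapshotted from.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Set, Tuple


@dataclass
class Subvol:
    """Toy subvolume: path -> extent id, plus an optional parent link."""
    files: Dict[str, str] = field(default_factory=dict)
    parent: Optional["Subvol"] = None


def receive_p(reference: Subvol,
              changes: Dict[str, Optional[str]]) -> Subvol:
    """-p style: snapshot the reference, then apply only the changes."""
    new = Subvol(files=dict(reference.files), parent=reference)
    for path, extent in changes.items():
        if extent is None:
            new.files.pop(path, None)   # deleted since the reference
        else:
            new.files[path] = extent    # added or modified
    return new


def receive_c(references: Iterable[Subvol],
              stream: Dict[str, str]) -> Tuple[Subvol, Set[str]]:
    """-c style: start empty, recreate every file, reflink known data."""
    known = {e for ref in references for e in ref.files.values()}
    new = Subvol()                      # no parent link is recorded
    reflinked = set()
    for path, extent in stream.items():
        new.files[path] = extent
        if extent in known:
            reflinked.add(path)         # data already on the receiver
    return new, reflinked


a = Subvol(files={"etc/fstab": "e1", "var/log": "e2"})
via_p = receive_p(a, {"var/log": "e3"})
via_c, shared = receive_c([a], {"etc/fstab": "e1", "var/log": "e3"})

print(via_p.files == via_c.files)       # contents compared
print(via_p.parent is a, via_c.parent)  # parent link compared
print(sorted(shared))                   # paths that reused known extents
```

   In the model, -p carries only the changed entry, while -c carries an
entry for every file but can still reuse the data that is already
present in the reference.
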
[snip]
> [0] People may argue that one has XXbytes of metadata, and tons are a
> measurement of weight... but when I recently carried 4 of the 8TB HDDs
> on my back... I came to the conclusion that data correlates to grams ;-)

   Yeah, I've met that particular equation too... :)

   Hugo.

-- 
Hugo Mills             | Anyone who claims their cryptographic protocol is
hugo@... carfax.org.uk | secure is either a genius or a fool. Given the
http://carfax.org.uk/  | genius/fool ratio for our species, the odds aren't
PGP: E2AB1DE4          | good.                                  Bruce Schneier
