I'm an active user of backups based on btrfs snapshots. Generally it
works, with some caveats.
You seem to have two tasks: (1) same-volume snapshots (I would not call
them backups) and (2) updating some backup volume (preferably on a
different box). By solving them separately you can avoid some
complexity, like the accidental removal of a snapshot that's still
needed for updating the backup volume.
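For task (1), a same-volume read-only snapshot is a one-liner; the
paths and naming scheme below are just placeholders, not your actual
layout:

    # assumed layout: live data in the subvolume /data,
    # snapshots collected under /data/.snapshots
    btrfs subvolume snapshot -r /data /data/.snapshots/$(date +%F_%H%M)

The -r (read-only) flag also makes the snapshot usable later as a
send-receive source.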
> To reconcile those conflicting goals, the only idea I have come up
> with so far is to use btrfs send-receive to perform incremental
> backups as described here:
> https://btrfs.wiki.kernel.org/index.php/Incremental_Backup .
As already said by Roman Mamedov, rsync is a viable alternative to
send-receive with much less hassle. According to some reports it can
even be faster.
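A minimal sketch of the rsync variant, assuming /data is the live
subvolume and /backup/current is a subvolume on the backup filesystem
(the flags preserve everything and update files in place, so unchanged
data keeps sharing extents with older snapshots):

    # update the backup subvolume in place, then freeze it as a snapshot
    rsync -aHAX --delete --inplace --no-whole-file /data/ /backup/current/
    btrfs subvolume snapshot -r /backup/current \
        /backup/snapshots/$(date +%F_%H%M)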
Given the hourly snapshots, incremental backups are the only practical
option. They take mere moments. Full backups could take an hour or
more, which won't work with hourly backups.
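For reference, one incremental cycle from the wiki page above looks
roughly like this (names are placeholders; pipe through ssh to reach a
different box):

    # once: full transfer of an initial read-only snapshot
    btrfs send /data/.snapshots/base | btrfs receive /backup/snapshots/
    # every hour: take a new snapshot and send only the delta
    btrfs subvolume snapshot -r /data /data/.snapshots/new
    btrfs send -p /data/.snapshots/base /data/.snapshots/new \
        | btrfs receive /backup/snapshots/
    # over the network: ... | ssh backupbox btrfs receive /backup/snapshots/
    # afterwards 'new' becomes the parent for the next cycle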
I don't see much sense in re-doing full backups to the same physical
device. If you care about backup integrity, it is probably more
important to invest in backup verification. (OTOH, while you didn't
reveal your data size, if a full backup takes just an hour on your
system then why not?)
> We will delete most snapshots on the live volume, but retain many (or
> all) snapshots on the backup block device. Is that a good strategy,
> given my goals?
Depending on the way you use it, retaining even a dozen snapshots on a
live volume might hurt performance (for high-performance databases) or
be completely transparent (for user folders). You may want to experiment
with this number.
In any case I'd not recommend retaining ALL snapshots on the backup
device, even if you have infinite space. Such a filesystem would be as
dangerous as the demon core: good only for adding more snapshots (not
even deleting them), and any little mistake would blow everything up.
Keep a few dozen, a hundred at most.
Unlike with other backup systems, you can fairly easily remove
snapshots from the middle of the sequence; use this opportunity. My
thin-out rule is: remove a snapshot if the resulting gap will be less
than some fraction (e.g. 1/4) of its age. One day I'll publish a
portable solution on GitHub.
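Until then, here is a rough sketch of that rule; the snapshot directory
and the timestamp-based naming are assumptions, and GNU date is
required for the parsing:

    #!/bin/sh
    # Walk snapshots oldest to newest and delete one whenever the gap
    # left behind would still be less than 1/4 of the snapshot's age.
    now=$(date +%s)
    set -- /backup/snapshots/*     # assumed: names sort chronologically
    prev_ts=0                      # timestamp of the last kept snapshot
    while [ $# -gt 1 ]; do         # never touch the newest snapshot
        snap=$1; shift
        ts=$(date -d "$(basename "$snap")" +%s) || continue
        next_ts=$(date -d "$(basename "$1")" +%s)
        age=$((now - ts))
        gap=$((next_ts - prev_ts)) # gap that deletion would leave
        if [ "$gap" -lt $((age / 4)) ]; then
            btrfs subvolume delete "$snap"
        else
            prev_ts=$ts            # kept: becomes the new left edge
        fi
    done

The effect is roughly exponential spacing: dense recent history, sparse
old history.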
> Given this minimal retention of snapshots on the live volume, should I
> defrag it (assuming there is at least 50% free space available on the
> device)? (BTW, is defrag OK on an NVMe drive? or an SSD?)
> In the above procedure, would I perform that defrag before or after
> taking the snapshot? Or should I use autodefrag?
I ended up using autodefrag and didn't try manual defragmentation. I
don't use SSDs as backup volumes.
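For completeness, autodefrag is just a mount option (the fstab line
below is illustrative):

    mount -o remount,autodefrag /data
    # or persistently in /etc/fstab:
    # UUID=...  /data  btrfs  defaults,autodefrag  0 0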
> Should I consider a dedup tool like one of these?
Certainly NOT for snapshot-based backups: they are already deduplicated
almost as much as possible; a dedup tool can only make them *less*
deduplicated.
> * Footnote: On the backup device, maybe we will never delete
> snapshots. In any event, that's not a concern now. We'll retain many,
> many snapshots on the backup device.
Again, DO NOT do this: btrfs in its current state does not support it.
A good rule of thumb for the time of some operations is data size
multiplied by the number of snapshots (raised to some power >= 1) and
divided by IO/CPU speed. By creating snapshots it is very easy to
create petabytes of data for the kernel to process, which it won't
finish in many years. (For instance, 1 TB of data times 1000 retained
snapshots is already a petabyte's worth of extents to walk.)
--
With Best Regards,
Marat Khalili