I've been experimenting with and studying rdiff-backup for about a week,
and for the most part it works well for our scenario: keeping a backup
mirror that is easy for anyone to access, with incrementals if necessary.
However, it has the rather unfortunate property that when an incremental
fails, the next run proceeds to do a regression on the entire mirror. I
understand that this is necessary to get the mirror back into a
consistent state, but it seems like it could be optimized. Logically,
if an incremental fails, 99.999% of the files will still be perfectly
fine because the failed incremental didn't touch them in the first
place. So why does a regression need to touch every file? Can't a
regression look at which files have incrementals that need to be deleted
and only regress those files? It seems to spend most of its time in the
following loop:

  1. Copy the file in question to a .tmp file
  2. Apply attributes/ACLs to the .tmp file
  3. Rename the .tmp file back to the original file.

When there are 400k files in a backup, this actually takes longer than a
full backup would. Surely I'm missing some scenario where this is
necessary? Couldn't this (extremely common) case be detected, and the
attributes/ACLs simply applied to the original file from the mirror
metadata? Why is the .tmp file necessary?
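
To make the idea concrete, here is a rough Python sketch of what I have in mind. It is not rdiff-backup's code. It assumes increment files are named <name>.<timestamp>.<type>[.gz] under rdiff-backup-data/increments, and the function names (paths_touched_by_session, apply_recorded_attrs) are made up for illustration. The first function collects only the paths that have an increment from a given session; the second sets attributes on the existing file instead of going through a .tmp copy.

```python
import os
import re
import stat

# Assumed increment naming: <name>.<timestamp>.<type>[.gz], for example
# "notes.txt.2008-03-05T12:00:00-05:00.diff.gz". This pattern is an
# assumption for illustration, not taken from the rdiff-backup source.
INCREMENT_RE = re.compile(
    r"^(?P<name>.+)\.(?P<time>\d{4}-\d{2}-\d{2}T[^.]+)"
    r"\.(?P<kind>diff|snapshot|missing|dir)(?:\.gz)?$"
)


def paths_touched_by_session(increments_root, session_time):
    """Return mirror-relative paths that have an increment stamped session_time."""
    touched = set()
    for dirpath, dirnames, filenames in os.walk(increments_root):
        rel_dir = os.path.relpath(dirpath, increments_root)
        for entry in filenames:
            match = INCREMENT_RE.match(entry)
            if match and match.group("time") == session_time:
                rel_path = os.path.join(rel_dir, match.group("name"))
                touched.add(os.path.normpath(rel_path))
    return touched


def apply_recorded_attrs(path, mode, mtime):
    """Set permissions and mtime on the existing file, with no .tmp copy."""
    os.chmod(path, stat.S_IMODE(mode))
    os.utime(path, (mtime, mtime))
```

The sketch leaves out ownership and ACLs, and it says nothing about files whose data (not just attributes) was changed by the failed run; those would still need a real regress.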
This brings up a related question: the attributes are stored in a
separate file in the rdiff-backup-data directory, so do they really need
to be applied to the mirror? I understand rdiff-backup is trying to make
the mirror match the original as closely as possible, but due to
filesystem differences the mirror attributes can't really be trusted
anyway. I would actually like to override the mirror's attributes and
make them read-only so the mirror can't be messed with, or simply tell
rdiff-backup not to bother setting attributes on the mirror's files
(particularly when regressing).
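
For the read-only part, something like the following Python sketch is what I am picturing as a step run outside rdiff-backup. The function name make_mirror_files_read_only is hypothetical, and I am assuming the rdiff-backup-data directory sits at the top of the mirror and should be left alone. It only clears the write bits on regular files, so directories stay writable. Whether rdiff-backup would tolerate this, or simply reset the modes on its next run, is exactly what I would like to find out.

```python
import os
import stat

WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def make_mirror_files_read_only(mirror_root, skip_dir="rdiff-backup-data"):
    """Clear write bits on regular files under mirror_root; return how many changed."""
    changed = 0
    for dirpath, dirnames, filenames in os.walk(mirror_root):
        if dirpath == mirror_root and skip_dir in dirnames:
            # Prune in place so os.walk never descends into rdiff-backup's data.
            dirnames.remove(skip_dir)
        for entry in filenames:
            path = os.path.join(dirpath, entry)
            if os.path.islink(path):
                continue  # leave symlinks untouched
            mode = stat.S_IMODE(os.lstat(path).st_mode)
            if mode & WRITE_BITS:
                os.chmod(path, mode & ~WRITE_BITS)
                changed += 1
    return changed
```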
I'm not afraid to go poking around in the source and trying to make some
changes, but I'd like to discuss any side effects or pitfalls first.
--Nathan
_______________________________________________
rdiff-backup-users mailing list at rdiff-backup-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/rdiff-backup-users
Wiki URL: http://rdiff-backup.solutionsfirst.com.au/index.php/RdiffBackupWiki