Les Mikesell wrote:
Johan Ehnberg wrote:
OK. I can see now why this is true. But it seems like one could
rewrite the BackupPC rsync protocol to check the pool for a file with the
same checksum before syncing. This could give some real speedup on
long files. This would be possible at least for the cpool where the
rsync checksums (and full file checksums) are stored at the end of
each file.
Now this would be quite the feature - and it fits perfectly with the idea
of smart pooling that BackupPC has. The effects are rather interesting:
- Different incremental levels won't be needed to preserve bandwidth
- Full backups will indirectly use earlier incrementals as reference
Definite wishlist item.
But you'll have to read through millions of files and the common case of
a growing logfile isn't going to find a match anyway. The only way this
could work is if the remote rsync could send a starting hash matching
the one used to construct the pool filenames - and then you still have
to deal with the odds of collisions.
I thought about this a little a year or so ago -- enough to try to
understand the rsync Perl modules (failed!).
I thought perhaps what would be best is a Berkeley DB/tied hash lookup
table/cache that would map rsync checksum + file size to a pool item.
The local rsync client would request the checksum of each remote file
before transfer, and if it was in the cache and in the pool, it could be
used as the local version, then let the rsync protocol take over to
verify all of the blocks.
I really like that BackupPC doesn't store its data in a database that
could get corrupted, and the Berkeley DB would just be a cache whose
integrity wouldn't be critical to the integrity of the backups. And
the cache isn't relied on 100%, but rather the actual pool file the
cache points to is used as the ultimate authority.
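
To make the idea concrete, here is a rough sketch of such a cache in Python
(not Perl, and not BackupPC code). Everything in it is an assumption for
illustration: the function names are hypothetical, MD5 stands in for whatever
full-file checksum the remote rsync would actually report, the standard
library's dbm module stands in for a Berkeley DB tied hash, and the pool is
assumed to be uncompressed so that the on-disk size equals the file size. The
point is only that a cache hit is never trusted by itself: the pool file is
checked, stale entries are dropped, and rsync's block comparison would still
run afterwards against the returned file.

```python
import dbm
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Full-file digest. MD5 is a stand-in for the checksum rsync would send."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def cache_key(digest, size):
    """Key is checksum plus file size, as proposed above."""
    return f"{digest}:{size}".encode("ascii")


def remember(cache, pool_path):
    """Record an existing pool file in the cache."""
    key = cache_key(file_digest(pool_path), os.path.getsize(pool_path))
    cache[key] = os.fsencode(pool_path)


def lookup(cache, digest, size):
    """Return a pool path usable as the local basis file, or None.

    The cache is only a hint: if the pool file is missing or its size no
    longer matches, the entry is deleted and the lookup is a miss.
    """
    key = cache_key(digest, size)
    raw = cache.get(key)
    if raw is None:
        return None
    pool_path = os.fsdecode(raw)
    try:
        still_valid = os.path.getsize(pool_path) == size
    except OSError:
        still_valid = False
    if not still_valid:
        del cache[key]
        return None
    return pool_path
```

Usage would be something like opening the cache with dbm.open(path, "c"),
calling remember() whenever a file lands in the pool, and calling lookup()
with the remote checksum and size before each transfer. If the cache file is
lost or corrupted, it can simply be deleted and rebuilt from the pool; no
backup depends on it.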
Rich
_______________________________________________
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki: http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/