-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Tino Schwarze wrote:
> Hi John,
>
> On Wed, Feb 18, 2009 at 10:58:14AM -0600, John Goerzen wrote:
>
>> I've been reading docs on BackupPC and I have a few questions about
>> how it works.
>>
>> First off, I gather that it keeps a hardlinked pool of data, so
>> whenever a file changes on any host, on the backup device, it will be
>> hardlinked to a file containing the same data, regardless of the host
>> it came from, right?
>
> Right.

Mostly right... Suppose you have a file with identical content stored
on two different hosts, or even two such files on the same host:

host1:/var/log/messages
host1:/var/log/kernel.log

Let's assume these two files receive exactly the same log data. Both
are backed up onto the server, so each file is transferred to the
server in full; there are (basically) no bandwidth savings.

The next day, both files have changed, but the two new files are again
identical to each other.
The first file is copied to a new file in the backup dir, and rsync
transfers only the changed data.
The second file is copied to a new file in the backup dir, and rsync
transfers only the changed data.

After the backup completes, BackupPC runs through all the new files and
creates a hardlink between the first file and the pool. When it sees
the second file, it deletes it from the backup dir and creates a
hardlink to the version in the pool instead.

The same applies if the two files are on different hosts. Pooling saves
disk space on the server, not bandwidth: if the host or path is
different, the changed data is transferred multiple times (or the
entire content, for new files). The worst case is when someone manages
to copy their photo library or something similar on a remote host...

>> So, given that, I don't really understand why there is a distinction
>> between a full and an incremental backup. Shouldn't either one take
>> up the same amount of space? That is, if you've got few changes on
>> the client, then on the server you're mostly just hardlinking things
>> anyway, right? So why is there a choice?
>
> The only difference between incremental and full (for rsync!) is that
> 1) all files are completely checksummed, so you detect pool corruption
> 2) you get the whole directory structure for the server (which is used
>    as the base for incremental backups) with all hardlinks to pool files
>
> For an incremental, you only get the directory structure and hardlinks
> to new/modified files to the pool.

Maybe not (1), since there is an option (RsyncCsumCacheVerifyProb, if I
remember the name correctly) which is set to 0.01 by default, so about
1% of the files are checksum-verified each time.

Basically, an incremental uses less disk IO, CPU, and memory on both
client and server, because it examines the files on the client in less
detail: just size, path, and modification date/time, rather than the
checksum as well.

Regards,
Adam

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkmcSeMACgkQGyoxogrTyiVwTwCfZS5vCvoyEgaiwQoW4hGipCgZ
0q0AnRVlccbJqXnXsPnbghDmMsj34jXC
=OvXr
-----END PGP SIGNATURE-----

_______________________________________________
BackupPC-users mailing list
[email protected]
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/
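
To make the full vs. incremental difference concrete, here is a rough
sketch of the two kinds of "has this file changed?" check. It is not
BackupPC's or rsync's actual code; the names FileRecord, snapshot,
changed_incremental and changed_full are made up for illustration, and
MD5 is just a stand-in digest.

```python
import hashlib
import os
from typing import NamedTuple, Optional


class FileRecord(NamedTuple):
    """What the server remembers about a file from the previous backup."""
    size: int
    mtime_ns: int
    digest: Optional[str] = None  # only recorded when the content was read


def file_digest(path: str) -> str:
    """Read the whole file and return a hex digest (the expensive part)."""
    h = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(path: str, with_digest: bool = False) -> FileRecord:
    """Record size and mtime, plus a digest if requested."""
    st = os.stat(path)
    digest = file_digest(path) if with_digest else None
    return FileRecord(st.st_size, st.st_mtime_ns, digest)


def changed_incremental(path: str, previous: FileRecord) -> bool:
    """Incremental-style check: metadata only, the content is never read."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns) != (previous.size, previous.mtime_ns)


def changed_full(path: str, previous: FileRecord) -> bool:
    """Full-style check: read and checksum the content every time."""
    return file_digest(path) != previous.digest
```

The metadata check costs one stat() per file, while the checksum check
has to read every byte. The price is that a file whose content changed
but whose size and mtime did not slips past the incremental-style
check and is only caught by the full-style one.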

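
The pooling step described in the reply above (link the first copy into
the pool, replace later identical copies with hardlinks) can be
sketched roughly as follows. This is only an illustration under
assumptions: it names pool files by a SHA-256 of the whole content,
which is not necessarily how BackupPC names its pool files, and
link_into_pool and content_key are hypothetical helper names.

```python
import hashlib
import os


def content_key(path: str) -> str:
    """Hex digest of the file content, used here as the pool file name."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def link_into_pool(backup_file: str, pool_dir: str) -> bool:
    """Pool one freshly backed-up file.

    Returns True if the content was new to the pool, False if an
    identical file was already pooled (so the copy was deduplicated).
    """
    os.makedirs(pool_dir, exist_ok=True)
    pool_file = os.path.join(pool_dir, content_key(backup_file))

    if not os.path.exists(pool_file):
        # First time this content is seen: the pool entry becomes a
        # second name for the same data on disk.
        os.link(backup_file, pool_file)
        return True

    if os.path.samefile(backup_file, pool_file):
        # Already a hardlink to the pool entry; nothing to do.
        return False

    # Identical content is already pooled: swap the duplicate for a
    # hardlink to the pool entry, which frees the duplicate's blocks.
    tmp = backup_file + ".pooltmp"
    os.link(pool_file, tmp)
    os.replace(tmp, backup_file)
    return False
```

Running it over the two identical log files from the example would
leave both backup paths and the pool entry pointing at a single copy of
the data. Note that all of this happens on the server after the
transfer, which is why it saves disk space but not bandwidth.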