Holger Parplies <[EMAIL PROTECTED]> wrote on 01/26/2007 02:48:29 PM:

 > You wrote on 26.01.2007 at 01:49:04 [[BackupPC-users] Long:  How
 > BackupPC handles pooling, and how transfer methods affect bandwidth usage]:
 >
 > > The advantage of this is that it puts near zero extra load on the host.
 > >   In the case of a full backup, reading 100% of the data on the server
 > > is unavoidable no matter what--that's what makes it a full backup.
 > > Beyond this, using tar or smb puts no additional load upon the server.
 >
 > I'm a bit confused by the terms 'host' and 'server'. I tend to think of the
 > BackupPC server as 'server' and the host to be backed up as the 'client',

You are exactly correct.  This is why I used the terms "server" for the 
BackupPC server, and "host" for the thing being backed up.  I tried to 
be *very* careful in staying consistent with this, but the above is a 
mistake.  The problem is, my BackupPC targets *are* servers themselves, 
so it's *very* easy to slip up and call them that, even though in this 
context they're "hosts"...

 > Les pointed to File::RsyncP (which I have not looked at). You've obviously
 > looked at Xfer::Rsync (where the transfer is started). Note that start()
 > passes an Xfer::RsyncFileIO object to File::RsyncP::new() which seems to
 > serve as a callback mechanism involved in the transfer. I've only had a
 > brief look at Xfer::RsyncFileIO (trying to find out what caching of
 > checksums is actually done), but my impression is that that's the glue
 > between rsync and BackupPC.

I will check it out.

 > > From what I
 > > understand, there would *never* be any files in the destination path:
 > > the destination path is always a newly created directory.
 >
 > See the rsync --compare-dest and --copy-dest options. I'd guess it's similar
 > to what they do: compare with one tree and create a second.

Ah!  I was not aware of those parameters!  However, that still means 
that we can only compare against a single previously-completed backup... 
That was my gut feeling, but it conflicts with the documentation.
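
To make the "compare with one tree and create a second" idea concrete, 
here is a minimal Python sketch.  It is only an illustration of the 
general technique, not rsync's or BackupPC's actual code:  the function 
name copy_changed and the byte-for-byte comparison are my own 
assumptions (stock rsync normally decides by size and modification time 
unless told otherwise).

```python
import filecmp
import shutil
from pathlib import Path


def copy_changed(source: Path, reference: Path, dest: Path) -> list[str]:
    """Copy into dest only the files from source that are missing from,
    or differ from, the same relative path under reference.

    Hypothetical helper for illustration; it compares against exactly one
    reference tree, which is the limitation discussed above.
    """
    copied = []
    for src_file in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src_file.relative_to(source)
        ref_file = reference / rel
        # Assumption: full content comparison instead of size/mtime.
        if ref_file.is_file() and filecmp.cmp(src_file, ref_file, shallow=False):
            continue  # unchanged relative to the reference tree, so skip it
        target = dest / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, target)
        copied.append(rel.as_posix())
    return copied
```

Note that a file which exists elsewhere (another host, another path) but 
not at the same relative path in the reference tree is still copied in 
full, which is exactly why this approach cannot save bandwidth for 
content that is only known to the pool.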

 > > the documentation says
 > > "As rsync runs, it checks each file in the backup to see if it is
 > > identical to an existing file from any previous backup of any PC."
 >
 > Apparently, it doesn't because it can't. You've probably read Craig's reply
 > in the other thread by now.

I don't remember:  it was the confusing, seemingly conflicting 
information in different threads (and the documentation) that caused me 
to do the research and write my e-mail.

It seems to me, then, that the documentation is *wrong*:  rsync does not 
compare against the pool, *ever*;  only against a previous backup (most 
likely the next-highest backup level, but I have not found confirmation 
of this yet).  The only bandwidth saved comes from rsync's usual 
compare-against-a-single-existing-tree technique.  Once rsync has 
transferred the files (including saving as "new" files that may 
already be in the pool), it's up to BackupPC_link to merge 
what was received with the pool, deleting the redundant files as it goes 
along.
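
The merge-after-transfer step can be sketched roughly as follows.  This 
is a simplified model under stated assumptions, not what BackupPC_link 
really does:  the function name link_into_pool is made up, and a SHA-256 
digest of the whole file stands in for BackupPC's own pool hashing and 
collision handling, which I have not reproduced here.

```python
import hashlib
import os
from pathlib import Path


def link_into_pool(new_file: Path, pool: Path) -> bool:
    """Merge an already-received file with the pool.

    Returns True if identical content was already pooled.  Hypothetical
    helper: the digest choice and flat pool layout are assumptions.
    """
    digest = hashlib.sha256(new_file.read_bytes()).hexdigest()
    pooled = pool / digest
    if pooled.exists():
        # Redundant content: delete the received copy and hard-link the
        # backup path to the existing pool file instead.
        new_file.unlink()
        os.link(pooled, new_file)
        return True
    # First time this content is seen: the received file becomes the
    # pool entry via a second hard link.
    os.link(new_file, pooled)
    return False
```

The point of the sketch is the ordering:  the file has to be on the 
BackupPC server's disk before it can be hashed and matched, so the 
de-duplication saves disk space but not transfer bandwidth.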

So there is no potential for using the pool to get bandwidth savings for 
a new host:  we must resort to tricks to prime the host in some way 
somewhat outside of BackupPC.

In the end, I guess this is fairly minor.  You just have to understand 
the nuances of the transfer method you are using, and your exact 
circumstances...

 > If I understand and judge the matter correctly, we'd need a database to map
 > rsync checksums to pool files. That would probably add some complexity to
 > BackupPC, but it could be optional.

It's probably not worth the hassle...  For me, the savings across the 
pool are pretty small compared to the total size of the pool.  I can 
live without it.

 > One final consideration: does it make much sense to backup the complete
 > operating system of multiple similar servers?

That was merely a fictional example.  However, my approach to backups is 
very, very simple:  back up everything, period.  Even if you *know* you 
won't use it.  Even if you *know* it's redundant.  Disk space is cheap; 
missing that *one* file you misfiled but now vitally need is not.

Thank you for the information.  I will keep digging.

_______________________________________________
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/backuppc-users
http://backuppc.sourceforge.net/