Hi Les,

I already thought about that, and I agree that handling large image files is 
problematic in general. I need to make images of the Windows-based virtual 
machines so that I can get them running again when a disaster happens. If I 
move away from BackupPC for transferring these images, I don't see any benefit 
(maybe just because I don't know of an imaging solution that solves my 
problems better).
Since I already use BackupPC to back up the data partitions (all Linux-based), 
I don't want my backups to become more complex than necessary.
I can live with the amount of hard disk space the compressed images will 
consume, and the I/O while merging the files is acceptable for me, too.
I can tell the imaging software (partimage) to cut the image into 2 GB volumes, 
but I doubt that this enables effective pooling, since the system volume I take 
the image from contains temporary files, profiles, databases and so on. If 
every volume file contains changes (even if only a few megabytes are altered), 
I expect the rsync algorithm to be less effective than when it compares large 
files, where a long unchanged stretch is more likely and is not interrupted by 
the artificial file boundaries introduced by the 2 GB volume splitting.
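
Just for illustration, the splitting could be done either by partimage itself 
or afterwards with split; the device names, paths and sizes below are only 
placeholders and the exact option spellings may need double-checking:

    # write an uncompressed image in 2048 MB volumes
    partimage -z0 -b -V2048 save /dev/vg0/win-snap /backup/win.img

    # or cut an existing uncompressed image into 2 GB pieces
    split -b 2G -d /backup/win.img /backup/win.img.part-

rsync could then quickly skip every piece that happens to stay byte-identical 
between two runs, but as said above I doubt that many pieces would.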

I hope I made my situation clear.
If anyone has experience with handling large image files that I might benefit 
from, please let me know!

Thank you very much,

Andreas Piening

On 12.05.2012 at 06:04, Les Mikesell wrote:

> On Fri, May 11, 2012 at 4:01 PM, Andreas Piening
> <andreas.pien...@gmail.com> wrote:
>> Hello Backuppc-users,
>> 
>> I'm stuck trying to identify suitable rsync parameters for handling large 
>> image file backups with BackupPC.
>> 
>> The scenario: I use partimage to create LVM-snapshot-based full images of 
>> the block devices of my virtual (Windows) machines running on KVM. I want 
>> to save these images from the virtualization server to my backup machine 
>> running BackupPC. The images are between 40 and 60 GB uncompressed each. 
>> The backup window has to stay outside working hours and is not large enough 
>> to transfer the images over the line every night. I read about rsync's 
>> ability to transfer only the changed parts of a file, using a clever 
>> checksum algorithm to minimize network traffic. That's what I want.
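>> 
>> (For reference, the snapshot and imaging step looks roughly like the 
>> following; volume group, LV and path names are just placeholders:
>> 
>>     lvcreate --snapshot --size 5G --name win-snap /dev/vg0/win-disk
>>     partimage -z0 -b save /dev/vg0/win-snap /backup/win.img
>>     lvremove -f /dev/vg0/win-snap
>> )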
>> 
>> I tested it by creating an initial backup of one image, then created a new 
>> image with only a few megabytes of changed data and triggered a new backup 
>> run. I noticed that the whole file was re-transferred. I waited until the 
>> end to be sure about that; admittedly it was not the smartest idea to check 
>> this with a compressed 18 GB image file, but it was my real working data 
>> image and I expected it to just work. Searching for the reason for the 
>> complete re-transmission, I ended up in a discussion thread about rsync 
>> backups of large compressed files. The explanation made sense to me: because 
>> of the back-references in the compressed data stream, the compression 
>> algorithm can produce a completely different archive file even if only a few 
>> megabytes at the beginning of the file have been altered.
>> So I decided to store my image uncompressed, which makes it about 46 GB now. 
>> I also found that I had to add the "-C" option to the ssh command, since 
>> compression over the wire is not enabled by default. Anyway: the whole file 
>> was re-created in the second backup run instead of only the changed parts 
>> being transferred, again.
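>> 
>> (To rule out BackupPC itself, I may repeat the test with plain rsync over 
>> ssh and look at its transfer statistics; the host name and paths here are 
>> only placeholders:
>> 
>>     rsync -av --stats -e ssh /backup/win.img root@backupserver:/backup/win.img
>> 
>> The "Literal data" and "Matched data" lines in the statistics show how much 
>> really went over the wire versus how much was reused from the existing 
>> destination file.)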
>> 
>> My BackupPC option "RsyncClientCmd" is set to "$sshPath -C -q -x -l root 
>> $host $rsyncPath $argList+", which is BackupPC's default apart from the 
>> added "-C".
>> 
>> Honestly, I don't understand the exact reason for this. There are several 
>> possible culprits:
>> 
>> -> partimage does not create a linear backup image file, even when it is 
>> uncompressed
>> -> there is another rsync parameter I missed that enables differential 
>> transfers of file changes
>> -> rsync examines the file but decides not to use differential updates for 
>> this one because of its size, or simply because its creation timestamp 
>> differs from that of the previous copy
>> 
>> Please give me a hint if you've successfully made differential backups of 
>> large image files.
> 
> I'm not sure there is a good way to handle very large files in
> BackupPC.  Even if rsync identifies and transfers only the changes,
> the server is going to copy and merge the unchanged parts from the
> previous file, which may take just as long anyway, and it will not be
> able to pool the copies.  Maybe you could split the target into many
> small files before the backup.  Then any chunk that is unchanged
> between runs would be skipped quickly and its contents could be
> pooled.
> 
> -- 
>  Les Mikesell
>    lesmikes...@gmail.com
