> Hi Eric,

Hi Sebastian,

I'm cc'ing the rdiff-backup-users list too; they may have some insight as well.

> on LVM snapshots and came across your blog and your articles in that regard:
>
> http://www.globallinuxsecurity.pro/blog.php?q=rdiff-backup-lvm-snapshot
>
> I'm very impressed both with your rdiff-backup patch and the block-fuse
> application.

I'm glad you will find it useful! Unfortunately, I have found that the sparse-destination patch for rdiff-backup is sometimes slow. I'm running without sparse files until I can figure out a faster way to detect blocks of 0-bytes. If you or someone on the list knows Python better than I do, please take a look!

> Since you mentioned that you use this combination to backup up images up
> to 350GB, I am interested to find out whether you have encountered
> problems with I/O-Wait.

I'm using blockfuse+rdiff-backup after business hours, so if a VM slows down, nobody (or very few people) will notice. The server runs 4x 1TB drives in RAID-10, and block-IO peaks at ~225MB/sec. That 350GB volume was recently extended to 600GB.

> There is a Linux Kernel bug that causes I/O-Wait to skyrocket when
> copying large files, especially when those files are larger than the
> available memory.
>
> https://bugzilla.kernel.org/show_bug.cgi?id=12309

Good to know; I was unaware of this bug. See comment #128: it looks like using ext4 works a little better for writing, possibly because of delayed allocation ("delalloc"). Since I'm using ext4 as my destination backup filesystem, this could be the reason I am not experiencing the same issue. I suppose it could also be my RAID controller (LSI 9240) buffering the IO overhead from the host CPU. What disk hardware are you using for source and destination?

> In our case, a quad-core server running rdiff-backup on a block-fuse
> directory, having 8GB ram, is basically made unavailable by the symptoms
> I described above. All the virtual machines on it become unreachable.

I have a feeling that this is due to backup-destination contention rather than backup-source contention. BlockFuse mmaps the source device, and I'm not certain whether mmap'ed IO is cached or not. To guarantee that reads from the source bypass the disk cache, you could patch blockfuse to use direct-IO (O_DIRECT), or back up from a "/dev/raw/rawX" device. (Bypassing the disk cache is important for backups, because backups tend to be read-once. Thrashing the cache with backup data therefore evicts the "good stuff" already in the cache.)

For large files, rdiff-backup may benefit from writing with the O_DIRECT flag (a hint from comment #128). Again, this would help bypass the disk cache.

I'm backing up local-to-local; the source is a RAID-10 array, and the destination is a slow 5400rpm 2TB single disk used as tertiary storage. Do you back up local-to-local, or over a network?

> If you have any experience with this in your backup scenarios, I would
> love to hear back from you.

So far it works great on my side. I'm deploying this to back up LVM snapshots of Windows VMs under KVM in about 2 weeks on different hardware. I might have better insight then if I run into new issues.

>
> Cheers,
> Sebastian

--
Eric Wheeler
President
eWheeler, Inc. dba Global Linux Security
www.GlobalLinuxSecurity.pro
503-330-4277
PO Box 14707
Portland, OR 97293

_______________________________________________
rdiff-backup-users mailing list at rdiff-backup-users@nongnu.org
http://lists.nongnu.org/mailman/listinfo/rdiff-backup-users
Wiki URL: http://rdiff-backup.solutionsfirst.com.au/index.php/RdiffBackupWiki
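
On the question above of detecting blocks of 0-bytes more quickly in Python: one possible approach, offered only as a rough sketch and not as what the sparse-destination patch actually does, is to compare each block against a preallocated all-zero bytes object, so the comparison runs in C rather than in a per-byte Python loop. The names BLOCK_SIZE, is_zero_block and copy_sparse are hypothetical, and the 64 KiB block size is an assumption that would need tuning.

```python
import io

BLOCK_SIZE = 64 * 1024          # hypothetical block size; an assumption, tune as needed
ZERO_BLOCK = bytes(BLOCK_SIZE)  # preallocated all-zero reference block


def is_zero_block(block, zeros=ZERO_BLOCK):
    """Return True if every byte in block is 0 (an empty block counts as zero)."""
    if len(block) > len(zeros):
        # Fallback for oversized blocks: build a reference of the right length.
        zeros = bytes(len(block))
    # bytes comparison is done in C, so no Python-level loop over the bytes.
    return block == zeros[:len(block)]


def copy_sparse(src, dst, block_size=BLOCK_SIZE):
    """Copy src to dst, seeking over all-zero blocks instead of writing them.

    Both arguments are binary file objects; dst must be seekable.
    Returns the number of bytes copied.
    """
    size = 0
    while True:
        block = src.read(block_size)
        if not block:
            break
        if is_zero_block(block):
            dst.seek(len(block), io.SEEK_CUR)  # skip ahead, leaving a hole
        else:
            dst.write(block)
        size += len(block)
    # If the data ended with a hole, extend the file to its full length.
    dst.truncate(size)
    return size
```

Whether the skipped regions really end up as holes on disk depends on the destination filesystem, and I have not measured how this compares with the current patch.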