Mark Coetser <[email protected]> wrote on 09/17/2012 03:08:49 AM:

> Hi
> 
> backuppc                                       3.1.0-9.1
> rsync                                          3.0.7-2
> 
> OK I have a fairly decent spec backup server with 2 gigabit e1000 nics 
> bonded together and running in bond mode 0, all working 100%. If I run 
> plain rsync between the backup server and a backup client, both connected 
> on gigabit lan, I can get sync speeds of +/- 300mbit/s, but using backuppc 
> and rsync the max speed I get is 20mbit and the backup is taking 
> forever. Currently I have a full backup that's been running for 3461:23 
> minutes, whereas the normal rsync would have taken a few hours to 
> complete.
> 
> The data is users maildirs and its about 2.6Tb and I am not using rsync 
> over ssh, I have the rsync daemon running on the client and have setup 
> the .pl as follows.

I have several very similar configurations.  Here's an example:

Atom D510 (1.66GHz x 2 Cores)
4GB RAM
CentOS 6 64-bit
4 x 2TB Seagate SATA drives in RAID-6 configuration
        I get almost 200 MB/s transfer rate from this array...
2 x Intel e1000 NICs in bonded mode.

In the past, the biggest server I backed up was around 1TB.  Personally, I 
prefer to keep each server image under 1TB if I can help it.  Everything 
is easier that way:  not just file-level backups with BackupPC but image-
level ones as well, and there's less downtime (or less time with noticeable 
slowdown if the server stays up) when having to take such images.

With servers <1TB, rsync-based BackupPC full backups are slow, but get 
done in a reasonable amount of time:  8-12 hours, and I can live with 
that.  The slowness is even somewhat beneficial:  if I start a backup in 
the middle of the day, it does not noticeably hammer the client I'm backing 
up.  (Lemons, lemonade...  :) )

However, I have recently inherited a server that is >3TB in size, and 97% 
full, too!  Backups of that system take nearly four *days* to complete.  I 
*can't* live with that.  I need better performance.

I was going to write a very similar e-mail to what you wrote as well!  So 
maybe we can work this together.

All of your configuration looks pretty straightforward to me (except the 
mounts:  I'm not sure why you have them if you're using rsyncd).  Mine are 
quite similar.

No matter the size of the system, I seem to top out at about 50GB/hour for 
full backups.  Here is a perfectly typical example:

Full Backup:  769.3 minutes for 675677.3MB of data.  That works out to 
878MB/min, or about 15MB/s--on a system with an array that can move 
200MB/s and a network that can move at least 70MB/s.

Now, let's look at the "big" server:

Full backup:  5502.8 minutes for 2434613.6MB of data.  That's even worse: 
442MB/min.  And 5502.8 minutes is nearly four *DAYS*.
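For anyone who wants to double-check these numbers, the conversion from 
BackupPC's "X minutes for Y MB" summary line to a transfer rate is simple 
arithmetic.  A quick sketch, using the figures from the two backups above:

```python
# Convert BackupPC's "X minutes for Y MB" summary into MB/min and MB/s.

def rates(total_mb, minutes):
    """Return (MB per minute, MB per second) for one backup run."""
    mb_per_min = total_mb / minutes
    return mb_per_min, mb_per_min / 60.0

# Typical server: 769.3 minutes for 675677.3 MB
mb_min, mb_s = rates(675677.3, 769.3)
print(f"{mb_min:.0f} MB/min, {mb_s:.1f} MB/s")   # 878 MB/min, 14.6 MB/s

# The "big" server: 5502.8 minutes for 2434613.6 MB
mb_min, mb_s = rates(2434613.6, 5502.8)
print(f"{mb_min:.0f} MB/min, {mb_s:.1f} MB/s")   # 442 MB/min, 7.4 MB/s
```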

First, a quick look at the client shows that we can eliminate it 
completely.  I have checked the performance of several clients while a 
backup is running; none is CPU-, I/O-, or memory-bound whatsoever.  Here 
is a typical example:  a Windows Server 2008 machine.  Task Manager shows 
minimal everything:  between 0% and 20% CPU usage (with most time below 
5%), and more than 1GB of 2GB RAM free (with 1300MB of cached memory). 
Network utilization is absolutely flatlined!  A quick sanity check of the 
server's physical drive lights shows drive activity in brief fits and 
starts.  This system is *clearly* not being taxed.  By the way, this 
contrasts with the beginning of the backup, when rsync is building the 
file list:  the rsync daemon's CPU usage bounces around with peaks over 
70%, and the drives blink constantly during that phase--so the server is 
perfectly capable of doing work when it's asked to!

The server side, though, shows something completely different.  Here are a 
few lines from dstat:

----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
 33   2  64   1   0   0|  22M   47k|   0     0 |   0     0 |1711   402
 43   3  49   6   0   0|  40M  188k|  35k 1504B|   0     0 |2253   632
 45   4  49   1   0   1|  50M   36k|  38k 1056B|   0     0 |2660   909
 46   4  50   0   0   0|  46M    0 |  55k 1754B|   0     0 |2540   622
 45   4  50   1   0   0|  45M   12k| 120B  314B|   0     0 |2494   708
 43   3  50   3   0   0|  42M    0 |  77k 1584B|   0     0 |2613   958
 41   4  47   8   0   0|  50M  268k| 449B  356B|   0     0 |2333   704
 46   3  50   1   0   0|  42M   36k|  26k 1122B|   0     0 |2583   771
 45   4  50   1   0   0|  40M    0 |  30k  726B|   0     0 |2499   681

It looks like everything is under-utilized.  For example, I'm getting a 
measly 40-50MB/s of read performance from my array of four drives, and 
*nothing* is going out over the network.  My physical drive and network 
lights echo this:  they are *not* busy.  My interrupts are certainly 
manageable and context switches are very low.  Even my CPU numbers look 
tremendous:  nearly no time in wait, and about 50% CPU idle!

Ah, but there's a problem with that.  This is a dual-core system.  Any 
time you see a dual-core system that is stuck at 50% CPU utilization, you 
can bet big that you have a single process that is using 100% of the CPU 
of a single core, and the other core is sitting there idle.  That's 
exactly what's happening here.
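One way to see through the aggregate number is to sample the per-CPU 
counters directly.  Here's a rough Linux-only sketch that reads /proc/stat 
twice and prints how busy each individual logical CPU is (field layout per 
the proc(5) man page); it makes the "one core pegged, the other idle" 
pattern obvious in a way the combined top line does not:

```python
# Sample /proc/stat twice and report per-CPU busy percentages, to catch
# the "one core pegged, others idle" pattern hidden by aggregate numbers.
import time

def cpu_times():
    """Return {cpu_name: (total_jiffies, idle_jiffies)} from /proc/stat."""
    times = {}
    with open("/proc/stat") as f:
        for line in f:
            # Per-CPU lines look like "cpu0 ..."; skip the aggregate "cpu " line.
            if line.startswith("cpu") and line[3].isdigit():
                name, *vals = line.split()
                vals = list(map(int, vals))
                idle = vals[3] + vals[4]          # idle + iowait
                times[name] = (sum(vals), idle)
    return times

def per_cpu_busy(interval=1.0):
    """Busy percentage for each logical CPU over the sampling interval."""
    a = cpu_times()
    time.sleep(interval)
    b = cpu_times()
    busy = {}
    for cpu in a:
        total = b[cpu][0] - a[cpu][0]
        idle = b[cpu][1] - a[cpu][1]
        busy[cpu] = 100.0 * (total - idle) / total if total else 0.0
    return busy

print(per_cpu_busy())   # e.g. cpu0 near 100% while cpu1 idles, in our case
```

(Pressing "1" inside top gives the same per-CPU view interactively.)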

Notice what top shows us:

top - 13:21:27 up 49 min,  1 user,  load average: 2.07, 1.85, 1.67
Tasks: 167 total,   2 running, 165 sleeping,   0 stopped,   0 zombie
Cpu(s): 43.7%us,  3.6%sy,  0.0%ni, 50.5%id,  2.1%wa,  0.0%hi,  0.1%si, 
0.0%st
Mem:   3924444k total,  3774644k used,   149800k free,     9640k buffers
Swap:        0k total,        0k used,        0k free,  3239600k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1731 backuppc  20   0  357m 209m 1192 R 95.1  5.5  35:58.08 BackupPC_dump
 1679 backuppc  20   0  360m 211m 1596 D 92.1  5.5  32:54.18 BackupPC_dump


My load average is 2, and you can see those two processes:  two instances 
of BackupPC_dump.  *Each* of them is using nearly 100% of the CPU given to 
it, but they're both running on the *same* CPU (core), which is why I have 
50% idle!

Mark Coetser, can you see what top shows for the CPU utilization for your 
system while doing a backup?  Don't just look at the single "idle" or 
"user" numbers:  look at each BackupPC process as well, and let us know 
what they are--and how many physical (and hyper-threaded) cores you have. 
Additional info can be found in /proc/cpuinfo if you don't know the 
answers.
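If it helps, here's a small Linux-only sketch that pulls both counts out 
of /proc/cpuinfo:  logical processors versus distinct physical cores.  On 
a hyper-threaded chip like the Atom D510, the first number will be larger 
than the second:

```python
# Count logical CPUs and distinct physical cores from /proc/cpuinfo.

def cpu_counts(path="/proc/cpuinfo"):
    """Return (logical_cpus, physical_cores)."""
    logical = 0
    cores = set()
    phys_id = None
    with open(path) as f:
        for line in f:
            key, _, value = line.partition(":")
            key = key.strip()
            if key == "processor":
                logical += 1
            elif key == "physical id":       # which socket/package
                phys_id = value.strip()
            elif key == "core id":           # which core within it
                cores.add((phys_id, value.strip()))
    # Some platforms omit "core id"; fall back to the logical count.
    return logical, len(cores) or logical

print(cpu_counts())   # e.g. (4, 2) on a 2-core hyper-threaded Atom D510
```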

To everyone:  is there a way to get Perl to run each of these processes on 
a *different* processor?  From my quick Google search, it seems the work 
would have to be split up using Perl modules designed for this purpose. 
At the moment, that is beyond my capability.  Am I missing an easier way 
to do this?

And one more request:  for those of you out there using rsync, can you 
give me some examples where you are getting faster numbers?  Let's say, 
full backups of 100GB hosts in roughly 30-35 minutes, or 500GB hosts in 
two or three hours?  That's about four times faster than what I'm seeing, 
and would work out to be 50-60MB/s, which seems like a much more realistic 
speed.  If you are seeing such speed, can you give us an idea of your 
hardware configuration, as well as an idea of the CPU utilization you're 
seeing during the backups?  Also, are you using compression or checksum 
caching?  If you need a hand collecting this info, I'd be happy to help.


To cover a couple of other frequently suggested items, here's what I've 
examined to improve this:

Yes, I have noatime.  From fstab:

  UUID=<snipped>  /data  ext4  defaults,noatime  1 2
Noatime only makes a difference when you are I/O bound--which ideally a 
BackupPC server would be.  In my case, it made very little difference. I'm 
not I/O bound.

I am using EXT4, and I have gotten very similar performance with EXT3.  I 
have not tried XFS or JFS, but would *really* prefer to keep my backups on 
the extremely well-known and well-supported EXT series.

I am using compression on this BackupPC server.  Obviously, this may 
contribute to the CPU consumption.  My old servers did not have 
compression, but had terrible VIA C3 single-core processors.  And their 
backup performance was quite similar.  I figured with the Atom D510 I'd be 
OK with compression.  But maybe not.  I'll try to see if I can do some 
testing with some smaller hosts without compression and see what happens.
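For anyone who wants to run the same comparison:  compression is 
controlled by the CompressLevel setting in BackupPC's config.pl, and it 
can be overridden per-host.  (This is from my reading of the 3.x config 
file, so double-check against your own config.pl.)

```perl
# In config.pl, or in pc/<hostname>/config.pl to override for one host:
$Conf{CompressLevel} = 0;    # 0 disables compression; 1-9 are zlib levels
```

As I understand it, changing this only affects newly added pool files; 
files already in the pool keep whatever compression they were stored with.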

As for checksum caching:  as I mentioned, I think the protection you get 
from leaving it off is very valuable.  But I look forward to seeing the 
performance others are getting with it, so we can compare and see what 
that protection is actually costing.

Thank you very much for your help!

Timothy J. Massey


 
Out of the Box Solutions, Inc. 
Creative IT Solutions Made Simple!
http://www.OutOfTheBoxSolutions.com
[email protected] 
 
22108 Harper Ave.
St. Clair Shores, MI 48080
Office: (800)750-4OBS (4627)
Cell: (586)945-8796 
_______________________________________________
BackupPC-users mailing list
[email protected]
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/
