sync performance falls off a cliff

2009-06-29 Thread Mike Connell
Hi,

I've got two identical servers. One is the primary; the other is a backup
receiving rsyncs from the primary. I'm backing up a file system to
disk; the files are small and there are lots of directories.

The overall problem seems to be the total number of files.
When I had ~375,000 files, the total rsync time was under a minute.
With ~425,000 files, the total rsync time is 10 minutes.

Last Friday when we were at 425,000 files, the rsync time was 10 minutes.
Today I was able to delete 50,000 unneeded files and the rsync time went
back down to under a minute.

So why the huge change in total rsync time for a somewhat small change
in the total number of files? I'm afraid that as the number of files keeps
increasing, the total rsync time is going to grow exponentially.

With the --progress flag on, the time is roughly evenly divided between
building the file list and looking through the file list. The files themselves
are really small (~16K) and I'm not seeing any problem with anything
other than how long it takes rsync to make a pass through all the files. I do use
the --delete option.

The servers are Dell 2950s with built-in RAID 10 disks and 4 GB of RAM.
The OS is CentOS 5.1. I'm running rsync 2.6.8, protocol version 29.

This smells to me like some sort of caching problem. Is there something
in the kernel or rsync itself that I can tweak?

Thanks,

Mike
-- 
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html

Re: sync performance falls off a cliff

2009-06-30 Thread Leen Besselink
Mike Connell wrote:
> Hi,
>  

Hi Mike,

I'm no expert, but I suggest using rsync 3.x (3.0.6 for example), it
doesn't keep as much information about the file list in memory.

It's probably swapping to disk, because of the large list and that
significantly slows down the performance of the whole machine(s).

Have a look at the output of the 'vmstat 2' command on both machines
while it's busy. Specifically, look at the heading that says 'swap':
it has 'si' and 'so' columns below it. 'si' is memory swapped in
from disk and 'so' is memory swapped out to disk.
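
As a minimal sketch of that check, the shell function below sums the 'si' and 'so' columns from vmstat output read on standard input. The name swap_activity is made up for illustration, and the parsing assumes the usual procps layout in which the second header line starts with "r b".

```shell
# Hypothetical helper: total the swap-in (si) and swap-out (so) columns
# of vmstat output supplied on standard input.
swap_activity() {
    awk '
        # The column-name header line starts with "r b"; locate si and so.
        $1 == "r" && $2 == "b" {
            for (i = 1; i <= NF; i++) {
                if ($i == "si") si = i
                if ($i == "so") so = i
            }
            next
        }
        # Data rows start with a number; add them up once columns are known.
        si && $1 ~ /^[0-9]+$/ {
            in_sum += $si
            out_sum += $so
        }
        END { printf "si=%d so=%d\n", in_sum, out_sum }
    '
}

# Usage while rsync is running (ten samples, two seconds apart):
#   vmstat 2 10 | swap_activity
```

Totals that stay at zero suggest the machine is not swapping. The first row vmstat prints is an average since boot, so a small nonzero total may come from that row alone.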

You can try it out fairly easily, especially if you don't use rsync
for anything else. If you can't find a package, building it yourself is
also an option:

cd /usr/src
wget http://rsync.samba.org/ftp/rsync/rsync-3.0.6.tar.gz
tar -zxvf rsync-3.0.6.tar.gz
cd rsync-3.0.6
nice ./configure && nice make

That should work (at least if you have gcc, make and possibly other
things already installed).

And instead of calling rsync, you call /usr/src/rsync-3.0.6/rsync if
you just want to test it first without installing.

You'll have to do it on both machines of course. If you are not sure
you want to make any changes with an unsupported binary, you can
use -n, which makes rsync not write changes to disk.
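
A hedged sketch of such a test run: the function below calls the freshly built binary with -n and uses --rsync-path to select the same build on the remote side, assuming the transfer goes over a remote shell such as ssh. The -av and --delete options, the function name dry_run_sync and the paths in the usage line are assumptions for illustration, not the command line actually used in production.

```shell
# Location of the test build on both machines (assumed to be identical).
NEW_RSYNC=${NEW_RSYNC:-/usr/src/rsync-3.0.6/rsync}

# Hypothetical helper: show what would be transferred or deleted,
# without writing anything to disk (-n).
dry_run_sync() {
    "$NEW_RSYNC" -n -av --delete --rsync-path="$NEW_RSYNC" "$1" "$2"
}

# Usage (source directory and destination are placeholders):
#   dry_run_sync /data/ backup:/data/
```

Once the dry run lists what you expect, dropping -n turns it into a real transfer.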

Hope these instructions help.

> Thanks,
>  
> Mike



Re: sync performance falls off a cliff

2009-06-30 Thread Leen Besselink
Mike Connell wrote:
> Hi,
>  

Hi again Mike,

> I don't see how to reply to your post so it shows up as a reply
> on the list. So I guess I'll just send email directly to you.
>  

You just e-mail rsync@lists.samba.org instead of me. :-)

> Today I've been watching the production 2.6.8 rsync off and on and no it
> isn't swapping. Used "vmstat" and "top" both on the source and
> the destination. Each shows 0 for si and so.
>  
> With iostat -xn 5, I do see that first disk utilization on the source
> hits 95% while the file list is being received. After rsync says done
> (with the file list), then the destination hits 95% disk utilization.
>  
> This is not good. As it takes more and more time, it will be pegging
> our servers.
>  

Maybe reading all the files from disk fills up the file cache
(the memory the kernel uses to avoid re-reading from disk).

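If I do a quick calculation, 375,000 * 16k is roughly 5.7GB and
425,000 * 16k is roughly 6.5GB, so the file contents alone never fit
in 4GB of memory. What may just fit at 375,000 files and no longer at
425,000 is the metadata (inodes and directory entries) that rsync has
to read for every file.

Once that no longer fits, it needs to go back to disk and read from there.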
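
The arithmetic can be checked with plain shell integer math. This is a rough sketch that assumes every file is exactly 16 KiB and ignores filesystem overhead; the function name working_set_mib is made up for illustration.

```shell
# Approximate size in MiB of N files with a given average size in KiB.
working_set_mib() {
    echo $(( $1 * $2 / 1024 ))
}

working_set_mib 375000 16   # file contents at the fast file count
working_set_mib 425000 16   # file contents at the slow file count
echo $(( 4 * 1024 ))        # 4 GB of RAM expressed in MiB
```

This prints 5859, 6640 and 4096. Both totals come out above 4096 MiB, so under these assumptions the file contents alone would not fit in 4GB of RAM at either file count.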

> So I used your good advice and downloaded and built rsync 3.0.6.
> (Couldn't find any packages available).
>  
> I now see that the new rsync says "receiving incremental file list".
> What does this

It just means it's supposed to use less memory, because it doesn't need
to keep the whole filelist in memory. Although I suspect other tradeoffs
might be made.

> do? Sounds good. Have only verified that the new rsync seems to work
> in a test capacity. Will move it into production soon to see how it does.
>  

If what you mentioned is how I think it is, then I doubt it will help
much or maybe just for a while.

I don't know what kernel you have (and which I/O scheduler you are using), but I
do know there is also an 'ionice' command (in Debian-based distributions
it's part of the util-linux package) which can be prepended to
the rsync command and which is meant to set priorities between processes
for reading and writing to/from disk.

It will possibly slow down rsync even more, but at least it wouldn't slow
down the other processes on the server.
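
For illustration, a sketch of prepending ionice to a command. The wrapper name low_io and the DRY_RUN switch are hypothetical. -c2 -n7 selects the best-effort class at its lowest priority. Whether that has any effect depends on the I/O scheduler in use (on kernels of this era, CFQ), so treat the benefit as something to measure rather than assume.

```shell
# Hypothetical wrapper: run a command at the lowest best-effort I/O priority.
# With DRY_RUN set, only print the command line that would be executed.
low_io() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "ionice -c2 -n7 $*"
    else
        ionice -c2 -n7 "$@"
    fi
}

# Usage (rsync options and paths are placeholders):
#   low_io rsync -a --delete /data/ backup:/data/
```

Only the local rsync process is affected. The rsync started on the other machine keeps its normal I/O priority unless it is wrapped there as well.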

It will still kill the file-cache though, so in that way it could still
slow down other processes.

If it's the file cache, plugging in extra memory would solve the problem
for a while. I don't know if you'd need to be running 64-bit for
that though (depends on the machine and CentOS).

If it's a failover setup and it turns out that there is no easy solution
with rsync, maybe something at the block device level would be more
appropriate, like:

http://www.drbd.org/
or
http://www.centos.org/docs/5/html/5.1/Global_Network_Block_Device/ch-gnbd.html

But it's not something I've used before.

> Thanks,
>  
> Mike
>  



Re: sync performance falls off a cliff

2009-07-04 Thread Carlos Carvalho
Leen Besselink (l...@consolejunky.net) wrote on 30 June 2009 09:05:
 >I'm no expert, but I suggest using rsync 3.x (3.0.6 for example), it
 >doesn't keep as much information about the file list in memory.

Yes. Or at least it starts transfers much faster, because it doesn't
wait for the full list to be completed.

 >It's probably swapping to disk, because of the large list and that
 >significantly slows down the performance of the whole machine(s).

He's probably running out of ram, not only because of rsync but
also everything else. Since inodes and files are not in ram, they have
to be fetched from the disk, which is *very* slow.

You can tell the kernel to increase the priority of inodes, which will
reduce the time to build the file list a lot. Just set
/proc/sys/vm/vfs_cache_pressure to a low value.
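
A minimal sketch of reading and changing that setting. The function names and the VFS_KNOB override are made up for illustration, and the value 50 in the usage lines is only an example, not a recommendation. The kernel default is 100, and lower values make the kernel more reluctant to drop cached inodes and directory entries.

```shell
# Path to the tunable; overridable so the functions can be tried on a scratch file.
VFS_KNOB=${VFS_KNOB:-/proc/sys/vm/vfs_cache_pressure}

# Print the current value.
show_cache_pressure() {
    cat "$VFS_KNOB"
}

# Write a new value; on the real /proc file this must be done as root.
set_cache_pressure() {
    echo "$1" > "$VFS_KNOB"
}

# Usage:
#   show_cache_pressure
#   set_cache_pressure 50
```

The same change can be made with `sysctl -w vm.vfs_cache_pressure=50`. It does not survive a reboot unless it is also added to /etc/sysctl.conf.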


Re: sync performance falls off a cliff

2009-07-05 Thread Leen Besselink
>  >
>  >I'm no expert, but I suggest using rsync 3.x (3.0.6 for example), it
>  >doesn't keep as much information about the file list in memory.
> 
> Yes. Or at least it starts transfers much faster, because it doesn't
> wait for the full list to be completed.
> 
>  >It's probably swapping to disk, because of the large list and that
>  >significantly slows down the performance of the whole machine(s).
> 
> He's probably running out of ram, not only because of rsync but
> also everything else. Since inodes and files are not in ram, they have
> to be fetched from the disk, which is *very* slow.
> 

not ram per se, because this is what he said in a different e-mail:

Today I've been watching the production 2.6.8 rsync off and on and no it
isn't swapping. Used "vmstat" and "top" both on the source and
the destination. Each shows 0 for si and so.

> You can tell the kernel to increase the priority of inodes, which will
> reduce the time to build the file list a lot. Just set
> /proc/sys/vm/vfs_cache_pressure to a low value.
> 

Oh, interesting, thank you.

I did already suggest ionice.

I guess we'll have to see how it goes, because he hasn't put it in
production yet.


Re: sync performance falls off a cliff

2009-07-05 Thread Carlos Carvalho
Leen Besselink (l...@consolejunky.net) wrote on 5 July 2009 10:17:
 >>  >
 >>  >I'm no expert, but I suggest using rsync 3.x (3.0.6 for example), it
 >>  >doesn't keep as much information about the file list in memory.
 >> 
 >> Yes. Or at least it starts transfers much faster, because it doesn't
 >> wait for the full list to be completed.
 >> 
 >>  >It's probably swapping to disk, because of the large list and that
 >>  >significantly slows down the performance of the whole machine(s).
 >> 
 >> He's probably running out of ram, not only because of rsync but
 >> also everything else. Since inodes and files are not in ram, they have
 >> to be fetched from the disk, which is *very* slow.
 >> 
 >
 >not ram per se, because this is what he said in a different e-mail:
 >
 >Today I've been watching the production 2.6.8 rsync off and on and no it
 >isn't swapping. Used "vmstat" and "top" both on the source and
 >the destination. Each shows 0 for si and so.

He *is* running out of ram for cache, so the machine has to get the
pages from disk, which is slow. Swap is not used because the pages are
not modified.
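
One way to watch this is to look at how much memory the kernel is using for caches rather than at swap. The sketch below picks three lines out of /proc/meminfo: Cached is the page cache holding file contents, and Slab includes the inode and directory entry caches. The function name cache_summary is made up for illustration.

```shell
# Hypothetical helper: summarise free memory, page cache and slab usage
# from /proc/meminfo-style input (values in kB) as whole MiB.
cache_summary() {
    awk '$1 == "MemFree:" || $1 == "Cached:" || $1 == "Slab:" {
        printf "%s %d MiB\n", $1, $2 / 1024
    }'
}

# Usage, before and after an rsync run:
#   cache_summary < /proc/meminfo
```

If Slab shrinks between runs while other work fills Cached, that would be consistent with inodes being evicted and re-read from disk on the next pass.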


Re: sync performance falls off a cliff

2009-07-15 Thread Mike Connell

Hi,

Here is an update. I haven't deployed a new version of rsync into production.
Instead I split my current rsync up into 10 independent sub directories of the
main directory. I run them serially one after the other.
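
A sketch of what such a split might look like, one rsync per immediate sub directory, run one after the other. The function name sync_subdirs, the -a and --delete options and the paths in the usage line are assumptions for illustration, not the script actually used.

```shell
# rsync binary to use; overridable for testing.
RSYNC=${RSYNC:-rsync}

# Hypothetical helper: sync each immediate sub directory of $1 to the
# matching path under $2, serially.
sync_subdirs() {
    src=$1
    dest=$2
    for dir in "$src"/*/; do
        [ -d "$dir" ] || continue          # skip if the glob matched nothing
        name=$(basename "$dir")
        "$RSYNC" -a --delete "$src/$name/" "$dest/$name/"
    done
}

# Usage (paths are placeholders):
#   sync_subdirs /data backup:/data
```

Written this way, files that sit directly in the main directory are not copied, and a sub directory that disappears on the primary is not removed on the backup, so both cases would need separate handling.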

I'm up to 404,000 files and the total sync time doesn't seem to be falling off
a cliff (yet).

In my case, only about 0.1% of my files change, so I'm sure it isn't an rsync
memory issue. But I strongly suspect, with the results I'm getting so far, that it
is a matter of how many directories and inodes can be kept cached in memory. The
largest of the 10 sub directory rsyncs is about 75,000 files. So this would seem
to put less pressure on this cache.

Thanks,

Mike 




Re: sync performance falls off a cliff

2009-07-16 Thread Paul Slootman
On Wed 15 Jul 2009, Mike Connell wrote:
>
> I'm up to 404,000 files and the total sync time doesn't seem to be
> falling off a cliff (yet).
>
> In my case, only about .1% of my files change, so I'm sure it isn't a
> rsync memory

How do you come to that conclusion?
Only the number of total files affects rsync memory usage, not the
number of changed files.
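
As a rough back-of-the-envelope sketch, the size of the file list can be estimated from the total file count alone. The figure of about 100 bytes per file is an assumption, taken from the commonly quoted rule of thumb for rsync versions before 3.0, and the function name filelist_mib is made up for illustration.

```shell
# Estimated file list size in MiB: number of files times bytes per file.
filelist_mib() {
    echo $(( $1 * $2 / 1048576 ))
}

filelist_mib 425000 100
```

This prints 40, so under that assumption the list for 425,000 files is on the order of tens of MiB per rsync process, whatever fraction of the files has changed.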


Paul