rsync very very slow with multiple instances at the same time.

2018-03-21 Thread Jayce Piel via rsync
I am starting a new thread because my issue is not quite the same, but I am 
quoting below the thread that prompted me to join the list.

My issue is not so much that rsync waits before it starts copying, but a general 
performance problem, especially when several rsync instances run at the same 
time.

Here is my situation:
I have multiple clients (around 20) with users, and I want to rsync their home 
directories to my server to keep a copy of their local files.
On the server, the files are hosted on an iSCSI volume (on a Thecus RAID) where I 
have never had any performance issue before.

When there is only one client, I have no real performance issues. Even with a 
very large number of files (some users have up to ), the sync finishes in a few 
minutes if there are not too many changed files.
But when 3 or more rsync instances run at the same time, they all become very 
slow and can take a few hours to complete.

Here are my options:

/usr/local/bin/rsync3 --rsync-path=/usr/local/bin/rsync3 -aHXxvE --stats \
  --numeric-ids --delete-excluded --delete-before --human-readable \
  --rsh="ssh -T -c aes128-ctr -o Compression=no -x" -z \
  --skip-compress=gz/bz2/jpg/jpeg/ogg/mp3/mp4/mov/avi/vmdk/vmem --inplace \
  --chmod=u+w --timeout=60 --exclude='Caches' --exclude='SyncService' \
  --exclude='.FileSync' --exclude='IMAP*' --exclude='.Trash' \
  --exclude='Saved Application State' --exclude='Autosave Information' \
  --exclude-from=/Users/pabittan/.UserSync/exclude-list --max-size=1000M \
  /Users/pabittan/ xserve.local.fftir:./


Here is the version I use (self-compiled):
$ /usr/local/bin/rsync3 --version
rsync  version 3.1.2-jsp  protocol version 31
Copyright (C) 1996-2015 by Andrew Tridgell, Wayne Davison, and others.
Web site: http://rsync.samba.org/
Capabilities:
64-bit files, 64-bit inums, 64-bit timestamps, 64-bit long ints,
socketpairs, hardlinks, symlinks, IPv6, batchfiles, inplace,
append, ACLs, xattrs, iconv, symtimes, no prealloc, file-flags

I had to set up a kind of queue that allows no more than 4 simultaneous rsync 
instances, to make sure each client gets backed up at least once a day. Even 
with the limit at 4, some clients wait hours before their backup starts.
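
For readers who want to try a similar limit, here is a minimal sketch of a slot-based limiter. It is not the mechanism used by the UserSync script; the names LOCK_ROOT, MAX_SLOTS, acquire_slot and release_slot are hypothetical, and it assumes all jobs can see the same directory (for example, it runs on the server side). It relies only on mkdir being atomic.

```shell
#!/bin/sh
# Hypothetical slot limiter: at most MAX_SLOTS jobs hold a slot at any time.
# mkdir either creates the directory or fails, so it acts as a simple lock.
LOCK_ROOT="${LOCK_ROOT:-/tmp/usersync-slots}"   # assumed location
MAX_SLOTS="${MAX_SLOTS:-4}"                     # limit mentioned in the thread
SLOT=""

# Try to take a free slot; returns 0 and sets SLOT on success, 1 if all are busy.
acquire_slot() {
    mkdir -p "$LOCK_ROOT" || return 1
    i=1
    while [ "$i" -le "$MAX_SLOTS" ]; do
        if mkdir "$LOCK_ROOT/slot.$i" 2>/dev/null; then
            SLOT="$LOCK_ROOT/slot.$i"
            return 0
        fi
        i=$((i + 1))
    done
    return 1
}

# Give the slot back so that another job can take it.
release_slot() {
    if [ -n "$SLOT" ]; then
        rmdir "$SLOT"
    fi
    SLOT=""
}
```

A caller could loop with "until acquire_slot; do sleep 60; done", set "trap release_slot EXIT", and then start rsync. A job that is killed without running its trap leaves a stale slot behind, so a real queue would also need some cleanup of old slot directories.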

I'm open to any help to improve performance. (I have put the whole script that 
calls rsync on GitHub: https://github.com/jpiel/UserSync)

PS: 
I checked: the CPU is not under pressure. Each rsync instance uses between 2 and 
5% CPU, and overall CPU usage is 30%.
I also checked the network, and it is not the issue either.
Disk usage doesn't seem to be under high load either (peaking at 300 IO/sec).


> On 20 March 2018 at 13:00, rsync-requ...@lists.samba.org wrote:
> 
> From: Kevin Korb <k...@sanitarium.net>
> Subject: Re: Very slow to start sync with millions of directories and files
> Date: 19 March 2018 at 15:33:31 UTC+1
> To: rsync@lists.samba.org
> 
> 
> The performance of rsync with a huge number of files is greatly
> determined by every option you are using.  So, what is your whole
> command line?
> 
> On 03/19/2018 09:05 AM, Bráulio Bhavamitra via rsync wrote:
>> Hi all,
>>  
>> I'm using rsync 3 to copy all files from one disk to another. The files
>> were written by Minio, an S3-compatible open-source backend.
>> 
>> There are tens of millions of files, almost each of them within
>> its own directory.
>> 
>> Rsync takes a long time, if not several hours, to even start syncing
>> files. I already see a few reasons:
>> - it first creates all the directories to put files in, which could be done
>> along with the sync
>> - it needs to generate the list of all files before starting, and cannot
>> start syncing while list generation continues in a different thread.
>> 
>> Cheers,
>> bráulio
>> 
>> 
> 
> -- 
> ~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,
>   Kevin Korb                  Phone:    (407) 252-6853
>   Systems Administrator       Internet:
>   FutureQuest, Inc.           ke...@futurequest.net (work)
>   Orlando, Florida            k...@sanitarium.net (personal)
>   Web page:                   http://www.sanitarium.net/
> 
>   PGP public key available on web site.
> ~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,

-- 
Jayce Piel   --  jayce.p...@gmail.com  --  0616762431
   Responsable Informatique F.F.Tir

-- 
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html

Re: rsync very very slow with multiple instances at the same time.

2018-03-23 Thread Jayce Piel via rsync
OK, so I ran some tests with:
find /path -type f -ls > /dev/null
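
A minimal sketch of how the concurrent runs can be launched, assuming bash (so that the time keyword can wrap a function). The function name run_concurrent_finds and its arguments are hypothetical. Note that this measures the wall time until all finds have finished, not one timing per find.

```shell
#!/bin/bash
# Start N copies of the stat-heavy find in the background, then wait for all.
# Usage: run_concurrent_finds TARGET_DIR N
run_concurrent_finds() {
    local target=$1
    local n=$2
    local i
    for ((i = 0; i < n; i++)); do
        # Same command as in the test above; the listing itself is discarded.
        find "$target" -type f -ls > /dev/null &
    done
    wait    # returns once every background find has exited
}
```

For example, "time run_concurrent_finds /path 10" runs 10 finds at once, and "time run_concurrent_finds /path 1" gives the single-find baseline.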


First on my local SSD (1.9 million files):
1 find:
real    2m16.743s
user    0m7.607s
sys     0m45.952s

10 concurrent finds (approximately the same results for each):
real    4m48.629s
user    0m11.013s
sys     2m0.288s

Roughly double the time, which seems logical.


Now the same test on my server, on the iSCSI disk, with no other activity 
(2.8 million files):
1 find:
real    38m54.964s
user    0m35.626s
sys     4m33.593s

10 concurrent finds:
real    76m34.781s
user    0m47.848s
sys     5m42.034s

The difference is not huge. But the find itself takes so much time!
I now see that I have a real issue on that server. Transfer time is not the 
problem, but access time seems to be terribly slow.
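
Since the two trees differ in size, a per-file rate makes the single-find results easier to compare. The sketch below only redoes the arithmetic on the figures above; the helper name files_per_sec is hypothetical, and the file counts are the rounded ones given in this message.

```shell
#!/bin/sh
# files_per_sec FILES MINUTES SECONDS
# Prints the number of files visited per second of wall-clock time, rounded.
files_per_sec() {
    awk -v f="$1" -v m="$2" -v s="$3" \
        'BEGIN { printf "%.0f\n", f / (m * 60 + s) }'
}

files_per_sec 1900000 2 16.743     # local SSD, 1 find:    prints 13895
files_per_sec 2800000 38 54.964    # iSCSI volume, 1 find: prints 1199
```

On these numbers the iSCSI volume handles roughly 1,200 files per second against roughly 13,900 on the SSD, so a single find is more than 11 times slower per file on the server.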

> On 21 March 2018 at 16:59, Jayce Piel wrote:
> 
> Thanks for the answer.
> I will test the stat() behaviour at a time when nothing else is running.
> 
> For the compression, I tried to find the lowest common denominator between the 
> clients and the server. The server is the older one for now.
> I used to use -c arcfour-128 until it was no longer an option.
> 
> Sadly, the 2 ciphers you mention are available on the clients but not on the 
> server.
> But I will keep this in mind for when I upgrade the server (or move the 
> backup destination).
> 
> 
>> On 21 March 2018 at 16:39, Kevin Korb via rsync <rsync@lists.samba.org> wrote:
>> 
>> When rsync has a lot of files to look through but not many to actually
>> transfer, most of the work will be gathering information from the stat()
>> function call.  You can simulate just the stat call with:
>>   find /path -type f -ls > /dev/null
>> You can run one, then a few of those, to see if your storage has issues
>> with lots of stats all at once.
>> 
>> Also, why -c aes128-ctr?  If your OpenSSH is current then the default
>> of chacha20-poly1...@openssh.com is much faster.  If your systems have
>> AES-NI in the CPU then aes128-...@openssh.com is much faster.  If your
>> OpenSSH is too old for chacha to be the default then aes128-ctr was the
>> default anyway.
>> 
>> On 03/21/2018 09:49 AM, Jayce Piel via rsync wrote:
>>> 
>>> Here are my options:
>>> 
>>> /usr/local/bin/rsync3 --rsync-path=/usr/local/bin/rsync3 -aHXxvE --stats \
>>>   --numeric-ids --delete-excluded --delete-before --human-readable \
>>>   --rsh="ssh -T -c aes128-ctr -o Compression=no -x" -z \
>>>   --skip-compress=gz/bz2/jpg/jpeg/ogg/mp3/mp4/mov/avi/vmdk/vmem --inplace \
>>>   --chmod=u+w --timeout=60 --exclude='Caches' --exclude='SyncService' \
>>>   --exclude='.FileSync' --exclude='IMAP*' --exclude='.Trash' \
>>>   --exclude='Saved Application State' --exclude='Autosave Information' \
>>>   --exclude-from=/Users/pabittan/.UserSync/exclude-list --max-size=1000M \
>>>   /Users/pabittan/ xserve.local.fftir:./
>>> 
> 

-- 
Jayce Piel   --  jayce.p...@gmail.com  --  0616762431
   Responsable Informatique F.F.Tir
