Rsync takes long time to finish

2012-04-12 Thread vijay patel

Hi Friends,

I am using rsync to copy data from our Production file server to our Disaster
Recovery file server. There is a 100Mbps link between the two servers. The
folder structure is very deep, with paths like
/reports/folder1/date/folder2/file.tx. There are 1600 directories at the
'folder1' level, a daily 'date' folder under each going back to last year, and
2 folders like 'folder2' under each date folder, which hold the files. The
files are not big; it is the folder structure that is complex. The structure
is created by the application and we can't change it at the moment. I run the
following command from cron:

rsync -avh --delete --exclude-from 'ex_file.txt' /reports/
10.10.10.100:/reports/ | tee /tmp/rsync_report.out
/tmp/rsync_report.out.$today

Initially we ran it every 5 minutes, then every 30 minutes because one run
could not finish in 5 minutes. Now we run it every 8 hours because of the
number of folders. Is there a way I can improve the performance of my rsync?
 
 
Regards,
Vijay
  -- 
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html

Re: Rsync takes long time to finish

2012-04-12 Thread Kevin Korb

Several suggestions...

Add a lockfile to your cron job so it doesn't run two instances at the
same time and you don't have to predict the run time.
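
A minimal sketch of the lockfile idea, assuming the util-linux flock(1)
command is installed. The function name run_locked, the lock path and the
exit code 75 are illustrative choices, not something given in this thread.

```shell
#!/bin/sh
# run_locked LOCKFILE COMMAND [ARGS...]
# Run COMMAND only if no other process currently holds LOCKFILE.
# Hypothetical helper; assumes flock(1) from util-linux is available.
run_locked() {
    lockfile=$1
    shift
    (
        # Take an exclusive, non-blocking lock on file descriptor 9.
        if ! flock -n 9; then
            echo "previous run still active, skipping" >&2
            exit 75
        fi
        # The lock is held until this subshell and its children exit.
        "$@"
    ) 9>"$lockfile"
}

# Example cron entry point (commented out so that sourcing this file
# does not start a transfer):
# run_locked /var/lock/rsync_reports.lock \
#     rsync -avh --delete --exclude-from 'ex_file.txt' /reports/ 10.10.10.100:/reports/
```

With a wrapper like this, cron can fire as often as wanted; a run that
starts while the previous one is still going simply exits.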

Make sure you are running rsync version 3+ on both systems.  It has
significant performance benefits over version 2.

Run a job manually and add --itemize-changes and --progress.  Try to
figure out where most of the time is spent: looking for something to
transfer, transferring new files, or updating changed files.
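
One way to read the result of such a manual run is to count the itemized
lines by type. The sketch below assumes the rsync output is piped in on
stdin; the function name summarize_itemize is hypothetical. It relies on the
itemize format described in the rsync man page, where a new file shows up as
"<f+++++++++" or ">f+++++++++" and a deletion as "*deleting".

```shell
#!/bin/sh
# summarize_itemize: classify `rsync --itemize-changes` output read from stdin.
# Hypothetical helper. Directory changes and the header/summary lines that
# -v prints are all counted under "other".
summarize_itemize() {
    awk '
        $1 ~ /^[<>]f\+/   { created++; next }   # file that did not exist on the receiver
        $1 ~ /^[<>]f/     { updated++; next }   # existing file being re-sent
        $1 == "*deleting" { deleted++; next }   # removed on the receiver by --delete
        NF                { other++ }           # any other non-blank line
        END {
            printf "new=%d updated=%d deleted=%d other=%d\n",
                   created, updated, deleted, other
        }
    '
}

# Example (commented out; -n makes it a dry run):
# rsync -avhn --itemize-changes --delete --exclude-from 'ex_file.txt' \
#     /reports/ 10.10.10.100:/reports/ | summarize_itemize
```

If the counts are small but the run still takes a long time, most of the
time is going into the scan rather than the transfer.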

If it is mostly looking for something to transfer then you need
filesystem optimizations, such as directory indexing.  You didn't
specify the OS or anything, but if you are on Linux this is where an
ext3-to-ext4 conversion would be helpful.

If it is mostly transferring new files then look at the network
transfer rate.  If it is low then try optimizing the ssh portion.  Try
using -e 'ssh -c arcfour' or try using the hpn version of openssh.  If
encryption isn't important you could also set up rsyncd.

If it is mostly updating existing files check the itemize output to
see if the files really need updating.  For instance if something is
screwing with your timestamps that will create a bunch of extra work
for rsync.  Also, --inplace might help performance but be sure to read
about it.

-- 
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~
Kevin Korb                      Phone:    (407) 252-6853
Systems Administrator           Internet:
FutureQuest, Inc.               ke...@futurequest.net  (work)
Orlando, Florida                k...@sanitarium.net (personal)
Web page:                       http://www.sanitarium.net/
PGP public key available on web site.
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~


RE: Rsync takes long time to finish

2012-04-12 Thread Stier, Matthew
And, although rsync does parallelize, nothing stops you from running multiple 
instances of rsync.

I had to transfer files from system A to system B and, being limited by the
processing power of a single rsync thread, I drilled down one level and ran a
separate rsync against each first-level file and subdirectory.  This put more
threads/cores/processors to work and made better use of the network bandwidth
to get the job done.

When all the rsync's finished, I ran a single root level rsync to catch the 
stragglers.

If you have the processing power, use it.
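
The approach described above can be sketched roughly as follows, assuming
GNU find and xargs. The function name parallel_sync and the RSYNC_CMD
override are hypothetical, and the exclude file from the original command is
left out of the per-directory runs for brevity.

```shell
#!/bin/sh
# parallel_sync SRC DEST [JOBS]
# Start one rsync per first-level subdirectory of SRC, JOBS at a time.
# Hypothetical helper; assumes GNU find/xargs (-print0, -0, -r, -P).
# RSYNC_CMD is an illustrative override: `export RSYNC_CMD=echo` prints
# the commands instead of running them.
parallel_sync() {
    src=$1
    dest=$2
    jobs=${3:-4}
    ( cd "$src" && find . -mindepth 1 -maxdepth 1 -type d -print0 ) |
        xargs -0 -r -n 1 -P "$jobs" sh -c '
            # $1 = source root, $2 = destination root, $3 = ./subdirectory
            name=${3#./}
            ${RSYNC_CMD:-rsync} -a --delete "$1/$name/" "$2/$name/"
        ' sh "$src" "$dest"
}

# Example (commented out): per-directory passes, then one root-level
# pass to catch first-level files and any stragglers.
# parallel_sync /reports 10.10.10.100:/reports 4
# rsync -avh --delete --exclude-from 'ex_file.txt' /reports/ 10.10.10.100:/reports/
```

Whether this helps depends on where the time goes: several scans running at
once put more load on the disks at both ends, so the job count is worth
tuning rather than assuming.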




RE: Rsync takes long time to finish

2012-04-12 Thread Stier, Matthew
The first clause should read "does not parallelize".



RE: Rsync takes long time to finish

2012-04-12 Thread vijay patel

Thanks friends.  We are using Red Hat Linux 5.8 on both the Production and
Disaster Recovery sides.  By drilling down we have found that most of the time
goes into checking what has changed, while the data transfer itself is very
fast.  As I mentioned, there is very little data in these folders (hardly
40GB), and a newly created file is 30KB at most.

Since the business requirement is to sync Production to DR every 10 minutes,
I have to schedule it via cron. The folder structure I am using is already
distributed. I already have another rsync job which runs every 5 minutes on
another folder structure, and it is running fine. Is there any rsync option I
can use to make this folder check faster?
 
Regards,
Vijay


 


Re: Rsync takes long time to finish

2012-04-12 Thread Kevin Korb

And make sure both systems are running rsync v3.  It indexes in
parallel to the copying.

On 04/12/12 16:59, Karl O. Pinc wrote:
> On 04/12/2012 03:28:18 PM, vijay patel wrote:
>
>> Is there any option i can use with rsync to make this folder
>> check fast?
>
> No.  Per the response below you need to look at your filesystems.
>
> Use tune2fs -l and see if the dir_index option is on.  If not,
> then turn it on using tune2fs.  This probably won't fix the
> existing directories.  If this is the problem you'll have to do a
> backup/restore, or a move of all the files into a new directory
> hierarchy and then replace the old hierarchy, or something else to
> fix all the existing directories. (I don't think e2fsck will help,
> but I've not looked.  As I say, there may also be some other
> approach.)
>
>>> If it is mostly looking for something to transfer then you
>>> need filesystem optimizations. Such as directory indexing. You
>>> didn't specify the OS or anything but if you are on Linux this
>>> is where an ext3 to ext4 conversion would be helpful.
>
> Karl k...@meme.com
> Free Software:  You don't pay back, you pay forward.
>                  -- Robert A. Heinlein
 



RE: Rsync takes long time to finish

2012-04-12 Thread vijay patel

I am getting the following from 'tune2fs -l':

Filesystem features:  has_journal resize_inode dir_index filetype
needs_recovery sparse_super large_file

Does this mean it is set?
One more thing: I am not using rsync as a daemon (because I am confused about
its usage at the moment). Will that make any difference?

 


Re: Rsync takes long time to finish

2012-04-12 Thread Kevin Korb

Yes, you have the feature in your filesystem.  Good.  If it is ext3
then converting it to ext4 would still help assuming your distro
supports it.
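
The same check can be scripted. This is a small sketch that reads
`tune2fs -l` output on stdin and tests for one feature name; the function
name has_fs_feature is hypothetical and /dev/sdXN is a placeholder for the
real device.

```shell
#!/bin/sh
# has_fs_feature FEATURE: succeed if FEATURE appears on the
# "Filesystem features:" line of `tune2fs -l` output read from stdin.
# Hypothetical helper name.
has_fs_feature() {
    awk -v want="$1" '
        /^Filesystem features:/ {
            # Fields 1 and 2 are the label; the feature names start at field 3.
            for (i = 3; i <= NF; i++) if ($i == want) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Example (commented out; needs root and a real device name):
# tune2fs -l /dev/sdXN | has_fs_feature dir_index && echo "dir_index is on"
```

Had the feature been missing, `tune2fs -O dir_index` would turn it on, and
according to the e2fsck man page, `e2fsck -fD` on an unmounted filesystem
reindexes the existing directories.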

You are using rsync over ssh.  This is my preference as well for
security reasons.  Using rsyncd would be faster because it would
remove the encryption overhead but that shouldn't be a big deal on
only 100Mbits.  It would make no difference in the indexing.

Have you checked your version yet?  Run rsync --version on both
systems.  If it isn't 3.0.something, upgrade.  That will make a big
difference.
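
For checking both ends from one place, a sketch like the following pulls the
major version out of the first line of `rsync --version`. The function name
rsync_major is hypothetical, and the parsing assumes the usual first line of
the form "rsync  version 3.0.6  protocol version 30".

```shell
#!/bin/sh
# rsync_major: print the major version number found in `rsync --version`
# output read from stdin. Hypothetical helper name.
rsync_major() {
    awk '
        $1 == "rsync" && $2 == "version" {
            sub(/^v/, "", $3)      # tolerate an optional leading "v"
            split($3, parts, ".")
            print parts[1]
            exit
        }
    '
}

# Example (commented out; the second line assumes ssh access to the DR host):
# rsync --version | rsync_major
# ssh 10.10.10.100 rsync --version | rsync_major
```

Both commands should print 3 or higher for the incremental file list
behaviour of rsync 3 to apply.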




RE: Rsync takes long time to finish

2012-04-12 Thread vijay patel

Yes, both servers are running rsync 3.0.6.
 

 Date: Thu, 12 Apr 2012 17:19:47 -0400
 From: k...@sanitarium.net
 To: rsync@lists.samba.org
 Subject: Re: Rsync takes long time to finish
 
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1
 
 Yes, you have the feature in your filesystem. Good. If it is ext3
 then converting it to ext4 would still help assuming your distro
 supports it.
 
 You are using rsync over ssh. This is my preference as well for
 security reasons. Using rsyncd would be faster because it would
 remove the encryption overhead but that shouldn't be a big deal on
 only 100Mbits. It would make no difference in the indexing.
 
 Have you checked your version yet? Run rsync --version on both
 systems. If it isn't 3.0.something upgrade. That will make a big
 difference.
 
 On 04/12/12 17:16, vijay patel wrote:
  I am getting following thing in 'tune2fs -l' :
  
  Filesystem features: has_journal resize_inode dir_index
  filetype needs_recovery sparse_super large_file
  
  Does this mean it is set? One more thing i am not using rsync as
  daemon (Because i am confused with its usage at the moment), will
  it make any difference?
  
  
  Date: Thu, 12 Apr 2012 17:05:06 -0400 From: k...@sanitarium.net 
  To: rsync@lists.samba.org Subject: Re: Rsync takes long time to
  finish
  
  And make sure both systems are running rsync v3. It indexes in 
  parallel to the copying.
  
  On 04/12/12 16:59, Karl O. Pinc wrote:
  On 04/12/2012 03:28:18 PM, vijay patel wrote:
  
  Thanks friends. We are using Redhat Linux 5.8 on Production
  and Disaster Recovery side. By drilling down we have found out
  it is taking lot of time to check what has changed while data 
  tranfer is very fast. As i mentioned data in these folders is 
  very less (hardly 40GB) and whenever new file is created, it
  is of max 30KB.
  
  Since we have to sync production environment to DR every 10
  mins as per Business requirement i have to schedule it via
  cron. This already distributed folder structure i am using. I
  already have another rsync job which runs every 5 mins on
  another folder structure. It is running fine. Is there any
  option i can use with rsync to make this folder check fast?
  
  No. Per the response below you need to look at your filesystems.
  
  Use tune2fs -l and see if the dir_index option is on. If not, 
  then turn it on using tune2fs. This probably won't fix the 
  existing directories. If this is the problem you'll have to do a 
  backup/restore, or a move of all the files into a new directory 
  hierarchy and then replace the old hierarchy, or something else
  to fix all the existing directories. (I don't think e2fsck will
  help, but I've not looked. As I say, there may also be some
  other approach.)
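
To answer the question above: a `dir_index` entry in the `Filesystem features:` line of `tune2fs -l` means the feature is enabled on that filesystem. The sketch below shows one way to script the check; `has_dir_index` is a made-up helper name and `/dev/sdXN` is a placeholder for the real device. Directories that existed before the feature was turned on can be re-indexed with `e2fsck -fD` on the unmounted filesystem.

```shell
#!/bin/sh
# Succeeds when `tune2fs -l` output (read from stdin) lists the dir_index feature.
has_dir_index() {
    grep '^Filesystem features:' | grep -qw 'dir_index'
}

# Usage (run as root; /dev/sdXN is a placeholder, not a device from this thread):
#   tune2fs -l /dev/sdXN | has_dir_index && echo "dir_index is on"
#   tune2fs -O dir_index /dev/sdXN   # enable the feature if it is missing
#   e2fsck -fD /dev/sdXN             # re-index existing directories (unmounted filesystem)
```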
  
  If it is mostly looking for something to transfer then you 
  need filesystem optimizations. Such as directory indexing.
  You didn't specify the OS or anything but if you are on Linux
  this is where an ext3-to-ext4 conversion would be helpful.
  
  
  Karl k...@meme.com Free Software: You don't pay back, you pay 
  forward. -- Robert A. Heinlein
  
  

Re: Rsync takes long time to finish

2012-04-12 Thread Dan Stromberg
I've heard lots of good suggestions already - another thing that I've not
seen mentioned is, upgrading your kernel may help.  Somewhere shortly
before kernel 3.0, pathname lookups got noticeably faster.

You could also try an alternative filesystem like xfs.  It's supposed to be
pretty good at large directories.


Re: Rsync takes long time to finish

2012-04-12 Thread Kevin Korb
There was also a serious performance regression in 2.6.39.


On 04/12/12 17:29, Dan Stromberg wrote:
 
 I've heard lots of good suggestions already - another thing that
 I've not seen mentioned is, upgrading your kernel may help.
 Somewhere shortly before kernel 3.0, pathname lookups got
 noticeably faster.
 
 You could also try an alternative filesystem like xfs.  It's
 supposed to be pretty good at large directories.
 



RE: Rsync takes long time to finish

2012-04-12 Thread vijay patel

We are running kernel 2.6.18-308.1.1.el5, which is the latest in RHEL 5.8, on
both servers. I think I might have to explore the option of using ext4.
 


Re: Rsync takes long time to finish

2012-04-12 Thread Kevin Korb
That is a bit old for ext4.  You need 2.6.28 as the bare minimum but
there were a few early issues.  I don't remember exactly when it
stabilized but I think it was in the low 2.6.30s.  Your 2.6.18 is from
2006.  (Yes, I know, RedHat has been patching it for years.  Doesn't
mean they have done any performance improvements.)

On 04/12/12 17:36, vijay patel wrote:
 We are running Kernel 2.6.18-308.1.1.el5 which is latest in RHEL
 5.8 on both the server. I think i might have to explore option of
 using ext4.
 



Re: Rsync takes long time to finish

2012-04-12 Thread Karl O. Pinc
On 04/12/2012 04:36:44 PM, vijay patel wrote:
 
 We are running Kernel 2.6.18-308.1.1.el5 which is latest in RHEL 5.8
 on both the server. I think i might have to explore option of using
 ext4.

Before you do anything you want to figure out why it
is slow so you can solve the real problem.  vmstat, iostat
and so forth are your friends.
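
One cheap way to check whether the directory scan itself is the bottleneck is to time a plain walk of the tree, with no rsync and no network involved. The sketch below does that; `walk_tree` is a made-up helper name. Since rsync also stats every file, the result is only a lower bound on its scan time. If the walk alone takes many minutes, running `vmstat 5` or `iostat -x 5` in another terminal while it runs should show whether the disks are the limit.

```shell
#!/bin/sh
# Walk a tree, reading directory entries only (no file data), and report
# how many entries were found and how long the walk took.
walk_tree() {
    dir=$1
    start=$(date +%s)
    count=$(find "$dir" | wc -l | tr -d ' ')
    end=$(date +%s)
    echo "$count entries in $((end - start))s"
}

# Usage: walk_tree /reports
```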


Karl k...@meme.com
Free Software:  You don't pay back, you pay forward.
 -- Robert A. Heinlein



Re: Rsync takes long time to finish

2012-04-12 Thread Brian K. White
You can try to switch to faster filesystems (reiserfs/ext4/btrfs/zfs) 
and enable metadata performance options and do other tuning steps 
(dir_index, noatime) and upgrade disks and ram etc, but mostly, with a 
frankly unrealistic business requirement like that, you have to either 
tell business that the requirement can't be promised, only strived for, 
or, develop your own system outside of rsync to detect the changes and 
then rsync those files specifically.


For instance, install incrond and make an incron job that watches those 
directories and fires off an rsync just for that file every time a file 
changes. You will still want to run a full regular rsync periodically 
from cron because incron is event based, not a spooler. Events can be 
missed for any number of reasons once in a while (incron is turned off 
because server is in the process of starting up or shutting down or 
upgrading software, your script failed for some events, incrond crashed 
or was killed, etc...) so you need a regular cron job that periodically 
does a full normal rsync to catch anything that might have been missed.


The end result is, barring missed events, all files are synced 
immediately when they are changed, not every 10 minutes.
That may not be good for you though. It depends what the application 
does. If the application is updating hundreds of files constantly, this 
won't work at all.
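
The incron approach described above could look roughly like the sketch below. The script path, the example incrontab entry and the `sync_one` name are all hypothetical; the source and destination come from the original post. As far as I know incron watches are not recursive, so each leaf directory would need its own entry, and with this many directories the inotify watch limit (`fs.inotify.max_user_watches`) would probably need raising.

```shell
#!/bin/sh
# Hypothetical helper script, e.g. /usr/local/bin/sync_one.sh, called from an
# incrontab entry such as (one entry per watched directory):
#   /reports/folder1/2012-04-12/folder2 IN_CLOSE_WRITE,IN_MOVED_TO /usr/local/bin/sync_one.sh $@ $#
# incron expands $@ to the watched directory and $# to the file name.

SRC_ROOT=${SRC_ROOT:-/reports}           # local tree (from the original post)
DEST=${DEST:-10.10.10.100:/reports/}     # remote target (from the original post)
RSYNC=${RSYNC:-rsync}

sync_one() {
    watched_dir=$1
    file_name=$2
    # Path of the changed file relative to SRC_ROOT; assumes the watched
    # directory lies below SRC_ROOT.
    rel=${watched_dir#"$SRC_ROOT"/}/$file_name
    # With --relative, "/./" marks where the path sent to the receiver begins,
    # so the file lands at the same relative location under DEST.
    "$RSYNC" -a --relative "$SRC_ROOT/./$rel" "$DEST"
}

# Run only when invoked with the two arguments incron supplies.
if [ "$#" -eq 2 ]; then
    sync_one "$1" "$2"
fi
```

The periodic full rsync from cron stays in place alongside this, to catch any events that were missed.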


You may want to investigate distributed filesystems instead of rsync jobs.

--
bkw

On 4/12/2012 4:28 PM, vijay patel wrote:

Thanks friends. We are using Redhat Linux 5.8 on Production and Disaster
Recovery side. By drilling down we have found out it is taking a lot of
time to check what has changed, while data transfer is very fast. As I
mentioned, there is very little data in these folders (hardly 40GB) and
whenever a new file is created, it is at most 30KB.

Since we have to sync production environment to DR every 10 mins as per
Business requirement i have to schedule it via cron. This already
distributed folder structure i am using. I already have another rsync
job which runs every 5 mins on another folder structure. It is running
fine. Is there any option i can use with rsync to make this folder check
fast?

Regards,
Vijay



  From: matthew.st...@us.fujitsu.com
  To: k...@sanitarium.net; rsync@lists.samba.org
  Subject: RE: Rsync takes long time to finish
  Date: Thu, 12 Apr 2012 19:29:03 +
 
  The first clause should read "does not parallelize".
 
 
  -Original Message-
  From: rsync-boun...@lists.samba.org
[mailto:rsync-boun...@lists.samba.org] On Behalf Of Stier, Matthew
  Sent: Thursday, April 12, 2012 3:07 PM
  To: Kevin Korb; rsync@lists.samba.org
  Subject: RE: Rsync takes long time to finish
 
  And, although rsync does parallelize, nothing stops you from running
multiple instances of rsync.
 
  I had to transfer files from system A to system B, and being limited
by the processing power of a single thread of rsync, I drilled down one
level and ran rsyncs against each first-level file and subdirectory.
This put more threads/cores/processors to work and made better use of
the network bandwidth to get the job done.
 
  When all the rsync's finished, I ran a single root level rsync to
catch the stragglers.
 
  If you have the processing power, use it.
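
A minimal sketch of the drill-down-one-level approach just described, assuming GNU or BSD `find` and `xargs` (for `-print0`, `-0` and `-P`). The source and destination are taken from the original post; `parallel_sync` and the job count of 4 are illustrative guesses. The `--exclude-from` option from the original command would need to be added to both passes if it is still wanted.

```shell
#!/bin/sh
# Defaults taken from the original post; override via the environment.
SRC=${SRC:-/reports}
DEST=${DEST:-10.10.10.100:/reports/}
JOBS=${JOBS:-4}        # number of concurrent rsync processes (a guess, tune it)
RSYNC=${RSYNC:-rsync}

parallel_sync() {
    # Pass 1: one rsync per first-level file or directory, $JOBS at a time.
    # No trailing slash on the source, so each entry is recreated under $DEST.
    find "$SRC" -mindepth 1 -maxdepth 1 -print0 |
        xargs -0 -P "$JOBS" -I{} "$RSYNC" -a {} "$DEST"
    # Pass 2: a single root-level run to catch stragglers and apply deletions.
    "$RSYNC" -a --delete "$SRC/" "$DEST"
}

# Usage: parallel_sync
```

Whether this helps depends on where the time goes: extra processes only pay off if the disks and CPUs on both ends have headroom.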
 
 
  -Original Message-
  From: rsync-boun...@lists.samba.org
[mailto:rsync-boun...@lists.samba.org] On Behalf Of Kevin Korb
  Sent: Thursday, April 12, 2012 2:44 PM
  To: rsync@lists.samba.org
  Subject: Re: Rsync takes long time to finish
 
  Several suggestions...
 
  Add a lockfile to your cron job so it doesn't run two instances at the
  same time and you don't have to predict the run time.
 
  Make sure you are running rsync version 3+ on both systems. It has
  significant performance benefits over version 2.
 
  Run a job manually and add --itemize-changes and --progress. Try to
  figure out where most of the time is spent. Looking for something to
  transfer, transferring new files, or updating changed files.
 
  If it is mostly looking for something to transfer then you need
  filesystem optimizations. Such as directory indexing. You didn't
  specify the OS or anything but if you are on Linux this is where an
  ext3-to-ext4 conversion would be helpful.
 
  If it is mostly transferring new files then look at the network
  transfer rate. If it is low then try optimizing the ssh portion. Try
  using -e 'ssh -c arcfour' or try using the hpn version of openssh. If
  encryption isn't important you could also setup rsyncd.
 
  If it is mostly updating existing files check the itemize output to
  see if the files really need updating. For instance if something is
  screwing with your timestamps that will create a bunch of extra work
  for rsync. Also, --inplace might help performance but be sure to read
  about it.
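
The lockfile suggestion at the top of this list can be sketched with `flock` from util-linux, assuming it is installed. The lock file path and the `run_locked` name are hypothetical. The lock is held on file descriptor 9 and is released automatically when the command exits, so a crashed run does not leave a stale lock behind.

```shell
#!/bin/sh
LOCK=${LOCK:-/tmp/rsync_reports.lock}   # hypothetical lock file path

# Run the given command only if no other run_locked call holds the lock.
run_locked() {
    (
        # -n: fail immediately instead of waiting for the lock
        flock -n 9 || { echo "previous run still active, skipping" >&2; exit 1; }
        "$@"
    ) 9>"$LOCK"
}

# Usage inside the cron script:
#   run_locked rsync -avh --delete --exclude-from 'ex_file.txt' /reports/ 10.10.10.100:/reports/
```

With this in place the cron interval can stay short; a run that starts while the previous one is still going simply exits.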
 

Re: Rsync takes long time to finish

2012-04-12 Thread Matthias Schniedermeyer
Your description, and the ones in the other mails, read like something
else is more appropriate: lsyncd
http://code.google.com/p/lsyncd/

It uses inotify to catch the events of files being
created/changed/... and then syncs those files/directories (using rsync).





Until then

-- 
Real Programmers consider "what you see is what you get" to be just as 
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a "you asked for it, you got it" text editor -- complicated, 
cryptic, powerful, unforgiving, dangerous.
