Re: Are there any current, in play plans, to multi-thread rsync?

2009-10-17 Thread Wayne Davison
On Fri, Oct 16, 2009 at 8:32 PM, Michael Brian - IL micha...@tusc.com wrote:

  I am looking for an enhancement to multi-thread or parallelize rsync.
 Are there any plans that would allow one to set a parameter, like
 'number_of_threads' and rsync will ship multiple files at the same time?


Not at present.  That would be a pretty big protocol change, and is probably
something that would be best left to a new-protocol rewrite.  It would be
quite interesting to have separate connections for the hierarchy traversal
code versus one or more file transfer connections that were controlled by
the traversal process, but that is not something that I'm working on, nor
have I heard about anyone else doing that.

..wayne..
-- 
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html

Re: retrieve files without local dir compare

2009-10-17 Thread Wayne Davison
On Wed, Oct 14, 2009 at 2:36 AM, Marc Mertes mer...@uni-bonn.de wrote:

 If the data is here, it is imported by a special software, after the
 import it will be deleted from that directory. The deleting can't be
 disabled.


If you know what file just got processed, append its name onto an exclude
file (one per line), and then use --exclude-from=EXCLUDE_FILE.
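
One way to maintain such an exclude file is sketched below. This is a minimal illustration, not anything shipped with rsync: the helper name append_exclude and the file name used in the usage note are hypothetical. Note that rsync treats each line of the file as an exclude pattern, so a name containing *, ? or [ would act as a wildcard, and a name without a leading slash can match at any depth of the transfer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append one processed file name to the exclude file, one name per line.
 * Returns 0 on success, -1 on error.  (append_exclude is a hypothetical
 * helper for illustration, not part of rsync.) */
int append_exclude(const char *exclude_path, const char *name)
{
    FILE *fp;
    int ok;

    assert(exclude_path != NULL && name != NULL);

    /* The file holds one entry per line, so an empty name or a name with
     * an embedded newline cannot be represented. */
    if (name[0] == '\0' || strchr(name, '\n') != NULL)
        return -1;

    fp = fopen(exclude_path, "a");  /* "a" appends, creating the file if needed */
    if (fp == NULL)
        return -1;

    ok = fprintf(fp, "%s\n", name) >= 0;
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

After the import software has consumed, say, data-001.dat, the caller would run append_exclude("processed.txt", "data-001.dat") and pass --exclude-from=processed.txt on the next rsync run.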

..wayne..

RE: Are there any current, in play plans, to multi-thread rsync?

2009-10-17 Thread Michael Brian - IL
Thanks for the update.

We are using rsync to sync up the Oracle archived transaction logs to
our standby databases.
We would like to control the number of files that could be transferred in
parallel.

Thanks for the update.


Brian

--
Brian P Michael
Technical Management Consultant
Rolta TUSC, Inc.
micha...@tusc.com
630-960-2909 x1181
http://www.tusc.com


Re: Disable checksumming to improve local performance?

2009-10-17 Thread Wayne Davison
On Fri, Sep 04, 2009 at 10:26:43AM -0700, Greg Siekas wrote:
 Is it possible to disable checksumming?  I'm using rsync with -W,
 whole file, so if a file has changed I don't want to just transfer
 the changes.

There is not currently a way to do that.  I whipped up the appended
patch that makes -WW disable full-file checksum computation and makes
that the default for a local transfer.  When comparing -W vs -WW in some
local-transfer testing, it didn't result in any perceivable difference
in transfer speed or runtime.  You can verify that it is enabled by
using --out-format='%i %C %n%L' and noting that all the checksums turn
into all-bits-on values (the checksum bytes are still transmitted so
that a read-error on the sending side can be indicated to the receiver).
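
The way the patch derives the final whole_file value can be summarized in a small stand-alone function. This is only a sketch of one reading of the patch: the function name effective_whole_file and its parameters are hypothetical, and the comparisons (a local default of 2, and a fallback to 1 for protocols older than 31) are assumptions taken from the hunks in compat.c, main.c and options.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper (not in rsync): returns the value whole_file would
 * end up with, given how many times -W appeared on the command line.
 * -1 means "not specified", 1 means whole-file transfer, and 2 or more
 * means whole-file transfer with the full-file checksum skipped. */
int effective_whole_file(int w_count, bool local_transfer,
                         bool write_batch, int protocol_version)
{
    int whole_file = -1;

    assert(w_count >= 0);

    /* Option parsing: the first -W resets -1 to 0, then each -W adds 1. */
    for (int i = 0; i < w_count; i++) {
        if (whole_file < 0)
            whole_file = 0;
        whole_file++;
    }

    /* A local copy with no explicit choice defaults to the -WW behavior,
     * unless a batch file is being written. */
    if (local_transfer && whole_file < 0 && !write_batch)
        whole_file = 2;

    /* A peer speaking a protocol older than 31 falls back to plain -W. */
    if (whole_file > 1 && protocol_version < 31)
        whole_file = 1;

    return whole_file;
}
```

Under these assumptions, effective_whole_file(2, false, false, 31) yields 2 (checksum skipped), while the same call with protocol 30 yields 1.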

..wayne..
index 6655acd..d8b228d 100644
--- a/compat.c
+++ b/compat.c
@@ -34,6 +34,7 @@ extern int use_qsort;
 extern int allow_inc_recurse;
 extern int append_mode;
 extern int fuzzy_basis;
+extern int whole_file;
 extern int read_batch;
 extern int delay_updates;
 extern int checksum_seed;
@@ -286,6 +287,8 @@ void setup_protocol(int f_out,int f_in)
 		receiver_symlink_times = 1;
 #endif
 	}
+	if (whole_file > 1 && protocol_version < 31)
+		whole_file = 1;
 
 	if (need_unsorted_flist && (!am_sender || inc_recurse))
 		unsort_ndx = ++file_extra_cnt;
index 407568d..72636eb 100644
--- a/main.c
+++ b/main.c
@@ -519,7 +519,7 @@ static pid_t do_cmd(char *cmd, char *machine, char *user, char **remote_argv, in
 		/* If the user didn't request --[no-]whole-file, force
 		 * it on, but only if we're not batch processing. */
 		if (whole_file < 0 && !write_batch)
-			whole_file = 1;
+			whole_file = 2;
 		set_allow_inc_recurse();
 		pid = local_child(argc, args, f_in_p, f_out_p, child_main);
 #ifdef ICONV_CONST
index 611035f..2c356e3 100644
--- a/match.c
+++ b/match.c
@@ -25,6 +25,7 @@
 extern int checksum_seed;
 extern int append_mode;
 extern int checksum_len;
+extern int whole_file;
 
 int updating_basis_file;
 char sender_file_sum[MAX_DIGEST_LEN];
@@ -124,9 +125,11 @@ static void matched(int f, struct sum_struct *s, struct map_struct *buf,
 		n += s->sums[i].len;
 	}
 
-	for (j = 0; j < n; j += CHUNK_SIZE) {
-		int32 n1 = MIN(CHUNK_SIZE, n - j);
-		sum_update(map_ptr(buf, last_match + j, n1), n1);
+	if (whole_file < 2) {
+		for (j = 0; j < n; j += CHUNK_SIZE) {
+			int32 n1 = MIN(CHUNK_SIZE, n - j);
+			sum_update(map_ptr(buf, last_match + j, n1), n1);
+		}
 	}
 
 	if (i >= 0)
@@ -336,7 +339,10 @@ void match_sums(int f, struct sum_struct *s, struct map_struct *buf, OFF_T len)
 	matches = 0;
 	data_transfer = 0;
 
-	sum_init(checksum_seed);
+	if (whole_file < 2)
+		sum_init(checksum_seed);
+	else
+		memset(sender_file_sum, 0xFF, checksum_len);
 
 	if (append_mode > 0) {
 		if (append_mode == 2) {
@@ -377,7 +383,7 @@ void match_sums(int f, struct sum_struct *s, struct map_struct *buf, OFF_T len)
 		matched(f, s, buf, len, -1);
 	}
 
-	if (sum_end(sender_file_sum) != checksum_len)
+	if (whole_file < 2 && sum_end(sender_file_sum) != checksum_len)
 		overflow_exit(checksum_len); /* Impossible... */
 
 	/* If we had a read error, send a bad checksum.  We use all bits
index 66820b5..f9b1939 100644
--- a/options.c
+++ b/options.c
@@ -34,14 +34,9 @@ extern filter_rule_list daemon_filter_list;
 
 int make_backups = 0;
 
-/**
- * If 1, send the whole file as literal data rather than trying to
- * create an incremental diff.
- *
- * If -1, then look at whether we're local or remote and go by that.
- *
- * @sa disable_deltas_p()
- **/
+/* If 1, send the whole file as literal data rather than trying to
+ * create an incremental diff.  If > 1, disable full-file checksumming.
+ * If -1, then look at whether we're local or remote and go by that. */
 int whole_file = -1;
 
 int append_mode = 0;
@@ -928,7 +923,7 @@ static struct poptOption long_options[] = {
   {"exclude-from",     0,  POPT_ARG_STRING, 0, OPT_EXCLUDE_FROM, 0, 0 },
   {"include-from",     0,  POPT_ARG_STRING, 0, OPT_INCLUDE_FROM, 0, 0 },
   {"cvs-exclude",     'C', POPT_ARG_NONE,   &cvs_exclude, 0, 0, 0 },
-  {"whole-file",      'W', POPT_ARG_VAL,    &whole_file, 1, 0, 0 },
+  {"whole-file",      'W', POPT_ARG_NONE,   0, 'W', 0, 0 },
   {"no-whole-file",    0,  POPT_ARG_VAL,    &whole_file, 0, 0, 0 },
   {"no-W",             0,  POPT_ARG_VAL,    &whole_file, 0, 0, 0 },
   {"checksum",        'c', POPT_ARG_VAL,    &always_checksum, 1, 0, 0 },
@@ -1499,6 +1494,12 @@ int parse_arguments(int *argc_p, const char ***argv_p)
 			}
 			break;
 
+		case 'W':
+			if (whole_file < 0)
+				whole_file = 0;
+			whole_file++;
+			break;
+
 		case 'P':
 			if (refused_partial || refused_progress) {
 				create_refuse_error(refused_partial
@@ -2288,11 +2289,14 @@ void server_options(char **args, int *argc_p)
 			argstr[x++] = 'k';
 	}
 
-	if (whole_file > 0)
-		argstr[x++] = 'W';
 	/* We don't need to send --no-whole-file, because it's the
 	 * default for remote transfers, and in any case old versions
 	 * of rsync will not understand it. */
+	if (whole_file > 0) 

Re: Enhanced authentication and authorization in rsyncd

2009-10-17 Thread Wayne Davison
On Sun, Aug 30, 2009 at 12:06:21PM +0300, Amir Rapson wrote:
 A slightly better patch file (removed some warnings).

Thanks!  Sorry for the slow reply, but your patch looks very useful.
I'm reviewing it for inclusion in 3.1.0.

..wayne..


Re: Problem with 3.0.6 and fileflags

2009-10-17 Thread Wayne Davison
On Wed, Jul 29, 2009 at 03:24:57PM -0400, Roger Bailey wrote:
 I am having trouble compiling rsync 3.0.6 on Red hat Linux EL 5.3.
 [...]
 I am puzzled by the last entry, no file-flags.

That file-flags support is for BSD, so configure determines that it
isn't available for Linux.  Linux has something similar (controlled by
chattr), but support for that hasn't been integrated into the patch.  I
took a look at it once, and ran into some complications, and didn't
finish it.

..wayne..


Re: How could be total transferred file size more than total file size ?

2009-10-17 Thread Wayne Davison
On Wed, Jul 29, 2009 at 11:17:34AM +0530, paresh masani wrote:
 Could you please tell me how would it be possible that total
 transferred file size is more than total file size ?

I believe that the only way this can happen is for a resend to occur
(when rsync notices that a file doesn't match the checksum and transfers
it again).

..wayne..


Re: Nice little performance improvement

2009-10-17 Thread Mike Connell


Hi,


 Interesting.  If you're not using incremental recursion (the default in
 rsync >= 3.0.0), I can see that the du would help by forcing the
 destination I/O to overlap the file-list building in time.  But with
 incremental recursion, the du shouldn't be necessary because rsync
 actually overlaps the checking of destination files with the file-list
 building on the source.


Ignoring incremental recursion for a moment. It seems to me that anything
that can warm up the file cache before it is needed would be beneficial?


Re: Nice little performance improvement

2009-10-17 Thread Darryl Dixon - Winterhouse Consulting
 Hi,

 In order to expeditiously move these new files offsite, we use a modified
 version of pyinotify to log all added/altered files across the entire
 filesystem(s) and then every five minutes feed the list to rsync with the
 --files-from option. This works very effectively and quickly.

 Interesting...

 How do you tell rsync to delete files that were deleted from the source,
 or is that not part of your use case?

For us, that is not a necessary part of our use-case. It would certainly
however be possible to capture the delete events and remove the files with
some other helper script, rather than use rsync directly (rsync doesn't
give any advantage in that scenario except to be able to re-use the
existing network transport mechanism).
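
The step that turns a five-minute batch of logged paths into a --files-from list could look roughly like the sketch below. It is an illustration only: the name write_files_from is hypothetical, and the actual setup described above uses a modified pyinotify rather than C. The list is sorted because the rsync manual notes that sorted --files-from input lets rsync avoid re-visiting shared path elements, and duplicates are dropped because a file altered several times within one interval would otherwise be logged repeatedly.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of C strings. */
static int cmp_paths(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Hypothetical helper: sort the logged paths in place, drop duplicates,
 * and write one path per line to `out` for use with --files-from.
 * Returns the number of lines written, or -1 on a write error. */
int write_files_from(FILE *out, const char **paths, size_t n)
{
    int written = 0;

    assert(out != NULL && (paths != NULL || n == 0));

    if (n > 0)
        qsort(paths, n, sizeof *paths, cmp_paths);

    for (size_t i = 0; i < n; i++) {
        if (i > 0 && strcmp(paths[i], paths[i - 1]) == 0)
            continue;  /* same file logged more than once */
        if (fprintf(out, "%s\n", paths[i]) < 0)
            return -1;
        written++;
    }
    return written;
}
```

The resulting file would be handed to rsync with --files-from=FILE. Paths that may contain newlines would need NUL-separated output together with rsync's --from0 option instead.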

regards,
Darryl Dixon
Winterhouse Consulting Ltd
http://www.winterhouseconsulting.com


Re: Nice little performance improvement

2009-10-17 Thread Mike Connell

 No, not if the file cache isn't large enough for the number of files.
 E.g. if you have 20 million files and only 256MB RAM, it's likely a bad idea.

Splitting down to the subsub (2-levels down) directory level allows a single
subsub rsync to fit for me. Warming the cache is beneficial here; I didn't
say it was in every situation.




Re: Nice little performance improvement

2009-10-17 Thread Jamie Lokier
Mike Connell wrote:
 
 Hi,
 
 Interesting.  If you're not using incremental recursion (the default in
 rsync >= 3.0.0), I can see that the du would help by forcing the
 destination I/O to overlap the file-list building in time.  But with
 incremental recursion, the du shouldn't be necessary because rsync
 actually overlaps the checking of destination files with the file-list
 building on the source.
 
 Ignoring incremental recursion for a moment. It seems to me that anything
 that can warm up the file cache before it is needed would be beneficial?

No, not if the file cache isn't large enough for the number of files.
E.g. if you have 20 million files and only 256MB RAM, it's likely a bad idea.

Personally I use a program that I wrote about 11 years ago, called
treescan, which pulls in the inodes to cache about twice as fast as
du by using inode number sorting.
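
The inode-sorting idea can be sketched for a single directory as follows. This is not treescan, whose source is not shown here; warm_dir and struct ent are hypothetical names, and the benefit assumes a filesystem where inode numbers roughly follow on-disk position (ext3, for example). The sketch reads all names first, sorts them by d_ino, and only then stats them, so the inode reads happen in ascending order.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

struct ent {
    ino_t ino;
    char *name;
};

/* qsort comparator: ascending inode number. */
static int cmp_ino(const void *a, const void *b)
{
    ino_t x = ((const struct ent *)a)->ino;
    ino_t y = ((const struct ent *)b)->ino;
    return (x > y) - (x < y);
}

/* Stat every entry of one directory in ascending inode-number order.
 * Returns the number of entries successfully stat'ed, or -1 if the
 * directory could not be opened.  (Hypothetical helper, not treescan.) */
long warm_dir(const char *dir)
{
    DIR *d;
    struct dirent *de;
    struct ent *ents = NULL;
    size_t n = 0, cap = 0;
    long done = 0;

    assert(dir != NULL);
    d = opendir(dir);
    if (d == NULL)
        return -1;

    /* Pass 1: collect names and inode numbers without calling stat. */
    while ((de = readdir(d)) != NULL) {
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        if (n == cap) {
            size_t ncap = cap ? cap * 2 : 64;
            struct ent *tmp = realloc(ents, ncap * sizeof *ents);
            if (tmp == NULL)
                break;  /* out of memory: warm what we have so far */
            ents = tmp;
            cap = ncap;
        }
        ents[n].ino = de->d_ino;
        ents[n].name = strdup(de->d_name);
        if (ents[n].name == NULL)
            break;
        n++;
    }

    /* Pass 2: stat in inode order so the inode reads are roughly sequential. */
    if (n > 0)
        qsort(ents, n, sizeof *ents, cmp_ino);
    for (size_t i = 0; i < n; i++) {
        struct stat st;
        if (fstatat(dirfd(d), ents[i].name, &st, AT_SYMLINK_NOFOLLOW) == 0)
            done++;
        free(ents[i].name);
    }
    free(ents);
    closedir(d);
    return done;
}
```

A full tree scan would apply the same two passes recursively to each subdirectory found in pass 2.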

-- Jamie


[SCM] The rsync repository. branch, master, updated. v3.0.3-237-g20caffd

2009-10-17 Thread Rsync CVS commit messages
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project The rsync repository..

The branch, master has been updated
   via  20caffd2b361bcad51692998411e4cc566c04b40 (commit)
  from  df6350a8b83a9e669f5e5c822bf2dc929526a128 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -
commit 20caffd2b361bcad51692998411e4cc566c04b40
Author: Wayne Davison way...@samba.org
Date:   Fri Oct 16 22:39:21 2009 -0700

A major overhaul of I/O routines, creating perform_io().

Files-from data is now sent as multiplexed I/O so that it can mingle
with any messages (such as debug output).  Requires protocol 31.

Protocol 31 no longer disables output verbosity in a couple instances
that used to cause protocol issues.

Got rid of MSG_* messages that have implied raw data that follows after
them.  We instead send a negative index value as a part of the raw data
stream, which is guaranteed to be output together with the following
data.  This only affects the (in-progress) protocol 31 and the (self-
contained) communication stream from the receiver to the generator.

Added --debug=IO and improved --debug=FLIST.  Some --debug=IO output
requires --msgs2stderr to be used to see it (i.e. sending a message
about sending a message would send another message, ad infinitum).

---

Summary of changes:
 flist.c                |   17 +-
 io.c                   | 1945 ++--
 log.c                  |    4 +-
 main.c                 |   66 +-
 options.c              |    1 +
 receiver.c             |    9 +-
 rsync.c                |   71 ++-
 rsync.h                |   15 +-
 sender.c               |    2 +-
 testsuite/itemize.test |    4 +-
 10 files changed, 1193 insertions(+), 941 deletions(-)


hooks/post-receive
--
The rsync repository.
___
rsync-cvs mailing list
rsync-cvs@lists.samba.org
https://lists.samba.org/mailman/listinfo/rsync-cvs


[SCM] The rsync repository. branch, master, updated. v3.0.3-239-g6f098b0

2009-10-17 Thread Rsync CVS commit messages
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project The rsync repository..

The branch, master has been updated
   via  6f098b0f8c1c3582013f20970bf575ab487f6bda (commit)
   via  1ec57e4ddcea665d04e780041fb0d1d8885bfed3 (commit)
  from  20caffd2b361bcad51692998411e4cc566c04b40 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -
commit 6f098b0f8c1c3582013f20970bf575ab487f6bda
Author: Wayne Davison way...@samba.org
Date:   Sat Oct 17 11:06:49 2009 -0700

Fix some man page problems Scott Kostyshak pointed out.

commit 1ec57e4ddcea665d04e780041fb0d1d8885bfed3
Author: Wayne Davison way...@samba.org
Date:   Sat Oct 17 09:09:27 2009 -0700

Fix check for an empty output buffer and limit to flist_eof.

---

Summary of changes:
 io.c     |    3 ++-
 rsync.yo |   30 +++---
 2 files changed, 17 insertions(+), 16 deletions(-)




[SCM] The rsync repository. branch, master, updated. v3.0.3-240-gd23cc15

2009-10-17 Thread Rsync CVS commit messages
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project The rsync repository..

The branch, master has been updated
   via  d23cc156aa36135a2970321873798d35626d477b (commit)
  from  6f098b0f8c1c3582013f20970bf575ab487f6bda (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -
commit d23cc156aa36135a2970321873798d35626d477b
Author: Wayne Davison way...@samba.org
Date:   Sat Oct 17 15:03:11 2009 -0700

Call seteuid() when calling setuid().

---

Summary of changes:
 clientserver.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)




[SCM] The rsync repository. branch, master, updated. v3.0.3-241-g0a9fbe1

2009-10-17 Thread Rsync CVS commit messages
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project The rsync repository..

The branch, master has been updated
   via  0a9fbe17de7d9d298ed64263a4b3cfb77b871199 (commit)
  from  d23cc156aa36135a2970321873798d35626d477b (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -
commit 0a9fbe17de7d9d298ed64263a4b3cfb77b871199
Author: Wayne Davison way...@samba.org
Date:   Sat Oct 17 15:53:25 2009 -0700

Allow %VAR% environment references in daemon-config parameter values.

---

Summary of changes:
 clientserver.c |   55 ---
 loadparm.c     |   65 ++-
 rsyncd.conf.yo |   29 -
 3 files changed, 124 insertions(+), 25 deletions(-)
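
For readers unfamiliar with the %VAR% syntax named in the commit above, the following sketch shows one way such references could be expanded. It is not the code from loadparm.c, which is not shown in this notification: the function name expand_env_refs is hypothetical, and the choices to accept only letters, digits and underscores in a name and to copy unset or malformed references through unchanged are assumptions of this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Expand %NAME% references in `in` using getenv().  References to unset
 * variables and stray '%' characters are copied through unchanged.
 * Returns a malloc'ed string the caller must free, or NULL if memory
 * runs out.  (Hypothetical helper, not rsync's implementation.) */
char *expand_env_refs(const char *in)
{
    size_t cap, len = 0;
    char *out;

    assert(in != NULL);
    cap = strlen(in) + 1;
    out = malloc(cap);
    if (out == NULL)
        return NULL;

    while (*in != '\0') {
        const char *piece = in;  /* default: copy one raw character */
        size_t piece_len = 1;
        size_t advance = 1;

        if (*in == '%') {
            const char *end = in + 1;
            while (isalnum((unsigned char)*end) || *end == '_')
                end++;
            if (*end == '%' && end > in + 1) {
                /* Well-formed %NAME%: look the name up. */
                size_t nlen = (size_t)(end - in - 1);
                char *name = malloc(nlen + 1);
                const char *val;
                if (name == NULL) {
                    free(out);
                    return NULL;
                }
                memcpy(name, in + 1, nlen);
                name[nlen] = '\0';
                val = getenv(name);
                free(name);
                if (val != NULL) {
                    piece = val;
                    piece_len = strlen(val);
                    advance = nlen + 2;  /* skip both '%' and the name */
                }
            }
        }

        /* Grow the output buffer if this piece plus the NUL will not fit. */
        if (len + piece_len + 1 > cap) {
            char *tmp;
            cap = (len + piece_len + 1) * 2;
            tmp = realloc(out, cap);
            if (tmp == NULL) {
                free(out);
                return NULL;
            }
            out = tmp;
        }
        memcpy(out + len, piece, piece_len);
        len += piece_len;
        in += advance;
    }
    out[len] = '\0';
    return out;
}
```

With HOME set to /home/backup, a value such as %HOME%/logs/rsyncd.log would expand to /home/backup/logs/rsyncd.log. Leaving unset references untouched avoids silently turning a path like %MISSING%/data into /data.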

