Re: Are there any current, in play plans, to multi-thread rsync?
On Fri, Oct 16, 2009 at 8:32 PM, Michael Brian - IL micha...@tusc.com wrote:
> I am looking for an enhancement to multi-thread or parallelize rsync. Are there any plans that would allow one to set a parameter, like 'number_of_threads' and rsync will ship multiple files at the same time?

Not at present. That would be a pretty big protocol change, and is probably something that would be best left to a new-protocol rewrite. It would be quite interesting to have separate connections for the hierarchy traversal code versus one or more file transfer connections that were controlled by the traversal process, but that is not something that I'm working on, nor have I heard about anyone else doing that.

..wayne..
--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
Re: retrieve files without local dir compare
On Wed, Oct 14, 2009 at 2:36 AM, Marc Mertes mer...@uni-bonn.de wrote:
> If the data is here, it is imported by a special software, after the import it will be deleted from that directory. The deleting can't be disabled.

If you know what file just got processed, append its name onto an exclude file (one per line), and then use --exclude-from=EXCLUDE_FILE.

..wayne..
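The exclude-file approach above could be scripted roughly as follows. This is only a sketch: the helper names, the `-a` option, and the source and destination arguments are illustrative assumptions, not part of the original advice. Note also that rsync treats each line of the exclude file as a pattern, so names containing wildcard characters would need extra care.

```python
from pathlib import Path
from typing import List


def record_processed(exclude_file: Path, name: str) -> None:
    """Append one processed file name to the exclude file, one per line."""
    with exclude_file.open("a", encoding="utf-8") as fh:
        fh.write(name + "\n")


def build_rsync_command(source: str, dest: str, exclude_file: Path) -> List[str]:
    """Build (but do not run) an rsync command that skips the recorded names.

    "-a" and the source/dest values are assumptions chosen for illustration.
    """
    return ["rsync", "-a", f"--exclude-from={exclude_file}", source, dest]
```

The import job would call `record_processed()` after each file it consumes, and the next pull would pass the resulting list to `subprocess.run()`.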
RE: Are there any current, in play plans, to multi-thread rsync?
Thanks for the update. We are using rsync to sync up the Oracle archived transaction logs to our standby databases. We would like to control the number of files that could be transferred in parallel.

Brian
--
Brian P Michael
Technical Management Consultant
Rolta TUSC, Inc.
micha...@tusc.com
630-960-2909 x1181
http://www.tusc.com

From: 4way...@gmail.com [mailto:4way...@gmail.com] On Behalf Of Wayne Davison
Sent: Saturday, October 17, 2009 2:30 AM
To: Michael Brian - IL
Cc: rsync@lists.samba.org
Subject: Re: Are there any current, in play plans, to multi-thread rsync?

On Fri, Oct 16, 2009 at 8:32 PM, Michael Brian - IL micha...@tusc.com wrote:
> I am looking for an enhancement to multi-thread or parallelize rsync. Are there any plans that would allow one to set a parameter, like 'number_of_threads' and rsync will ship multiple files at the same time?

Not at present. That would be a pretty big protocol change, and is probably something that would be best left to a new-protocol rewrite. It would be quite interesting to have separate connections for the hierarchy traversal code versus one or more file transfer connections that were controlled by the traversal process, but that is not something that I'm working on, nor have I heard about anyone else doing that.

..wayne..
Re: Disable checksumming to improve local performance?
On Fri, Sep 04, 2009 at 10:26:43AM -0700, Greg Siekas wrote:
> Is it possible to disable checksumming? I'm using rsync with -W, whole file, so if a file has changed I don't want to just transfer the changes.

There is not currently a way to do that. I whipped up the appended patch that makes -WW disable full-file checksum computation and makes that the default for a local transfer. When comparing -W vs -WW in some local-transfer testing, it didn't result in any perceivable difference in transfer speed or runtime. You can verify that it is enabled by using --out-format='%i %C %n%L' and noting that all the checksums turn into all-bits-on values (the checksum bytes are still transmitted so that a read-error on the sending side can be indicated to the receiver).

..wayne..

index 6655acd..d8b228d 100644
--- a/compat.c
+++ b/compat.c
@@ -34,6 +34,7 @@ extern int use_qsort;
 extern int allow_inc_recurse;
 extern int append_mode;
 extern int fuzzy_basis;
+extern int whole_file;
 extern int read_batch;
 extern int delay_updates;
 extern int checksum_seed;
@@ -286,6 +287,8 @@ void setup_protocol(int f_out,int f_in)
 		receiver_symlink_times = 1;
 #endif
 	}
+	if (whole_file > 1 && protocol_version < 31)
+		whole_file = 1;
 
 	if (need_unsorted_flist && (!am_sender || inc_recurse))
 		unsort_ndx = ++file_extra_cnt;
index 407568d..72636eb 100644
--- a/main.c
+++ b/main.c
@@ -519,7 +519,7 @@ static pid_t do_cmd(char *cmd, char *machine, char *user, char **remote_argv, in
 		/* If the user didn't request --[no-]whole-file, force
 		 * it on, but only if we're not batch processing. */
 		if (whole_file < 0 && !write_batch)
-			whole_file = 1;
+			whole_file = 2;
 		set_allow_inc_recurse();
 		pid = local_child(argc, args, f_in_p, f_out_p, child_main);
 #ifdef ICONV_CONST
index 611035f..2c356e3 100644
--- a/match.c
+++ b/match.c
@@ -25,6 +25,7 @@
 extern int checksum_seed;
 extern int append_mode;
 extern int checksum_len;
+extern int whole_file;
 
 int updating_basis_file;
 char sender_file_sum[MAX_DIGEST_LEN];
@@ -124,9 +125,11 @@ static void matched(int f, struct sum_struct *s, struct map_struct *buf,
 		n += s->sums[i].len;
 	}
 
-	for (j = 0; j < n; j += CHUNK_SIZE) {
-		int32 n1 = MIN(CHUNK_SIZE, n - j);
-		sum_update(map_ptr(buf, last_match + j, n1), n1);
+	if (whole_file < 2) {
+		for (j = 0; j < n; j += CHUNK_SIZE) {
+			int32 n1 = MIN(CHUNK_SIZE, n - j);
+			sum_update(map_ptr(buf, last_match + j, n1), n1);
+		}
 	}
 
 	if (i >= 0)
@@ -336,7 +339,10 @@ void match_sums(int f, struct sum_struct *s, struct map_struct *buf, OFF_T len)
 	matches = 0;
 	data_transfer = 0;
 
-	sum_init(checksum_seed);
+	if (whole_file < 2)
+		sum_init(checksum_seed);
+	else
+		memset(sender_file_sum, 0xFF, checksum_len);
 
 	if (append_mode > 0) {
 		if (append_mode == 2) {
@@ -377,7 +383,7 @@ void match_sums(int f, struct sum_struct *s, struct map_struct *buf, OFF_T len)
 		matched(f, s, buf, len, -1);
 	}
 
-	if (sum_end(sender_file_sum) != checksum_len)
+	if (whole_file < 2 && sum_end(sender_file_sum) != checksum_len)
 		overflow_exit(checksum_len); /* Impossible... */
 
 	/* If we had a read error, send a bad checksum.  We use all bits
index 66820b5..f9b1939 100644
--- a/options.c
+++ b/options.c
@@ -34,14 +34,9 @@ extern filter_rule_list daemon_filter_list;
 
 int make_backups = 0;
 
-/**
- * If 1, send the whole file as literal data rather than trying to
- * create an incremental diff.
- *
- * If -1, then look at whether we're local or remote and go by that.
- *
- * @sa disable_deltas_p()
- **/
+/* If 1, send the whole file as literal data rather than trying to
+ * create an incremental diff.  If > 1, disable full-file checksumming.
+ * If -1, then look at whether we're local or remote and go by that. */
 int whole_file = -1;
 int append_mode = 0;
@@ -928,7 +923,7 @@ static struct poptOption long_options[] = {
   {"exclude-from",     0,  POPT_ARG_STRING, 0, OPT_EXCLUDE_FROM, 0, 0 },
   {"include-from",     0,  POPT_ARG_STRING, 0, OPT_INCLUDE_FROM, 0, 0 },
   {"cvs-exclude",     'C', POPT_ARG_NONE,   &cvs_exclude, 0, 0, 0 },
-  {"whole-file",      'W', POPT_ARG_VAL,    &whole_file, 1, 0, 0 },
+  {"whole-file",      'W', POPT_ARG_NONE,   0, 'W', 0, 0 },
   {"no-whole-file",    0,  POPT_ARG_VAL,    &whole_file, 0, 0, 0 },
   {"no-W",             0,  POPT_ARG_VAL,    &whole_file, 0, 0, 0 },
   {"checksum",        'c', POPT_ARG_VAL,    &always_checksum, 1, 0, 0 },
@@ -1499,6 +1494,12 @@ int parse_arguments(int *argc_p, const char ***argv_p)
 			}
 			break;
 
+		case 'W':
+			if (whole_file < 0)
+				whole_file = 0;
+			whole_file++;
+			break;
+
 		case 'P':
 			if (refused_partial || refused_progress) {
 				create_refuse_error(refused_partial
@@ -2288,11 +2289,14 @@ void server_options(char **args, int *argc_p)
 		argstr[x++] = 'k';
 	}
 
-	if (whole_file > 0)
-		argstr[x++] = 'W';
 	/* We don't need to send --no-whole-file, because it's the
 	 * default for remote transfers, and in any case old versions
 	 * of rsync will not understand it. */
+	if (whole_file > 0)
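The option handling in the patch above can be modelled in a few lines of Python. This is a sketch of the logic only, not rsync code: the function names are invented for illustration, and it assumes the comparison operators that the archive dropped from the patch text are `< 0`, `> 1`, `< 31` and `< 2`.

```python
def apply_w_flags(count, whole_file=-1):
    """Model the 'W' case in parse_arguments(): each -W bumps the level.

    -1 means "not specified", 1 means whole-file, 2 means whole-file
    with full-file checksumming disabled.
    """
    for _ in range(count):
        if whole_file < 0:
            whole_file = 0
        whole_file += 1
    return whole_file


def negotiate(whole_file, protocol_version):
    """Model the compat.c hunk: older protocols fall back to plain -W."""
    if whole_file > 1 and protocol_version < 31:
        return 1
    return whole_file


def sender_file_sum(whole_file, checksum_len, compute):
    """Model match_sums(): skip the real checksum when the level is 2.

    'compute' is a hypothetical callable standing in for the
    sum_init/sum_update/sum_end sequence.
    """
    if whole_file < 2:
        return compute()
    return b"\xff" * checksum_len  # all-bits-on placeholder
```

Under these assumptions a single -W yields level 1, -WW yields level 2, and a level-2 sender emits all 0xFF bytes, which matches the all-bits-on values described for the `%C` output.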
Re: Enhanced authentication and authorization in rsyncd
On Sun, Aug 30, 2009 at 12:06:21PM +0300, Amir Rapson wrote:
> A slightly better patch file (removed some warnings).

Thanks! Sorry for the slow reply, but your patch looks very useful. I'm reviewing it for inclusion in 3.1.0.

..wayne..
Re: Problem with 3.0.6 and fileflags
On Wed, Jul 29, 2009 at 03:24:57PM -0400, Roger Bailey wrote:
> I am having trouble compiling rsync 3.0.6 on Red hat Linux EL 5.3.
> [...]
> I am puzzled by the last entry, no file-flags.

That file-flags support is for BSD, so configure determines that it isn't available for Linux. Linux has something similar (controlled by chattr), but support for that hasn't been integrated into the patch. I took a look at it once, and ran into some complications, and didn't finish it.

..wayne..
Re: How could be total transferred file size more than total file size ?
On Wed, Jul 29, 2009 at 11:17:34AM +0530, paresh masani wrote:
> Could you please tell me how would it be possible that total transferred file size is more than total file size?

I believe that the only way this can happen is for a resend to occur (when rsync notices that a file doesn't match the checksum and transfers it again).

..wayne..
Re: Nice little performance improvement
Hi,

> Interesting. If you're not using incremental recursion (the default in rsync >= 3.0.0), I can see that the du would help by forcing the destination I/O to overlap the file-list building in time. But with incremental recursion, the du shouldn't be necessary because rsync actually overlaps the checking of destination files with the file-list building on the source.

Ignoring incremental recursion for a moment. It seems to me that anything that can warm up the file cache before it is needed would be beneficial?
Re: Nice little performance improvement
Hi,

>> In order to expeditiously move these new files offsite, we use a modified version of pyinotify to log all added/altered files across the entire filesystem(s) and then every five minutes feed the list to rsync with the --files-from option. This works very effectively and quickly.
>
> Interesting... How do you tell rsync to delete files that were deleted from the source, or is that not part of your use case?

For us, that is not a necessary part of our use-case. It would certainly however be possible to capture the delete events and remove the files with some other helper script, rather than use rsync directly (rsync doesn't give any advantage in that scenario except to be able to re-use the existing network transport mechanism).

regards,
Darryl Dixon
Winterhouse Consulting Ltd
http://www.winterhouseconsulting.com
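The periodic "feed the list to rsync" step described above might look something like the sketch below. It is not the poster's script: the function names, the `-a` option, and the way changed paths arrive (a plain list standing in for whatever the pyinotify logger collected) are all assumptions made for illustration.

```python
import os
import subprocess
import tempfile


def relative_batch(changed_paths, source_root):
    """Deduplicate changed paths and make them relative to source_root."""
    seen = set()
    batch = []
    for path in changed_paths:
        rel = os.path.relpath(path, source_root)
        if rel == os.pardir or rel.startswith(os.pardir + os.sep):
            continue  # path lies outside the source tree
        if rel not in seen:
            seen.add(rel)
            batch.append(rel)
    return batch


def sync_batch(batch, source_root, dest, run=subprocess.run):
    """Write the batch to a temporary list and hand it to rsync --files-from.

    'run' is injectable so the command can be inspected without executing it.
    """
    if not batch:
        return None
    with tempfile.NamedTemporaryFile("w", suffix=".list", delete=False) as fh:
        fh.write("\n".join(batch) + "\n")
        list_path = fh.name
    try:
        cmd = ["rsync", "-a", f"--files-from={list_path}", source_root, dest]
        return run(cmd, check=True)
    finally:
        os.unlink(list_path)  # the list is only needed for one run
```

A scheduler (cron, or a loop sleeping five minutes) would call `sync_batch(relative_batch(paths, root), root, dest)` on each tick. Deleted files are not handled here, in line with the use case described.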
Re: Nice little performance improvement
> No, not if the file cache isn't large enough for the number of files. E.g. if you have 20 million files and only 256MB RAM, it's likely a bad idea.

Splitting down to the subsub (2-levels down) directory level allows a single subsub rsync to fit for me. Warming the cache is beneficial here, I didn't say it was in every situation.
Re: Nice little performance improvement
Mike Connell wrote:
> Hi,
>
>> Interesting. If you're not using incremental recursion (the default in rsync >= 3.0.0), I can see that the du would help by forcing the destination I/O to overlap the file-list building in time. But with incremental recursion, the du shouldn't be necessary because rsync actually overlaps the checking of destination files with the file-list building on the source.
>
> Ignoring incremental recursion for a moment. It seems to me that anything that can warm up the file cache before it is needed would be beneficial?

No, not if the file cache isn't large enough for the number of files. E.g. if you have 20 million files and only 256MB RAM, it's likely a bad idea.

Personally I use a program that I wrote about 11 years ago, called treescan, which pulls in the inodes to cache about twice as fast as du by using inode number sorting.

-- Jamie
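The treescan source is not shown in the thread, so the following is only a guess at what "inode number sorting" could mean in practice: read each directory's entries, sort them by inode number, and stat them in that order so the on-disk inode tables are read roughly sequentially. All names here are hypothetical.

```python
import os


def entries_by_inode(directory):
    """Return (inode, path, is_dir) tuples for one directory, sorted by inode."""
    with os.scandir(directory) as it:
        entries = [
            (entry.inode(), entry.path, entry.is_dir(follow_symlinks=False))
            for entry in it
        ]
    entries.sort()
    return entries


def warm_inode_cache(root):
    """lstat every entry below root in per-directory inode order.

    Returns the number of entries successfully stat'ed.
    """
    count = 0
    stack = [root]
    while stack:
        directory = stack.pop()
        try:
            entries = entries_by_inode(directory)
        except OSError:
            continue  # unreadable directory, skip it
        for _inode, path, is_dir in entries:
            try:
                os.lstat(path)  # pulls the inode into the kernel cache
            except OSError:
                continue
            count += 1
            if is_dir:
                stack.append(path)
    return count
```

Whether this beats du depends on the filesystem and on available memory; as noted above, warming the cache does not help when the cache is too small to hold the tree.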
[SCM] The rsync repository. branch, master, updated. v3.0.3-237-g20caffd
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The rsync repository.".

The branch, master has been updated
       via  20caffd2b361bcad51692998411e4cc566c04b40 (commit)
      from  df6350a8b83a9e669f5e5c822bf2dc929526a128 (commit)

Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below.

- Log -
commit 20caffd2b361bcad51692998411e4cc566c04b40
Author: Wayne Davison way...@samba.org
Date:   Fri Oct 16 22:39:21 2009 -0700

    A major overhaul of I/O routines, creating perform_io().

    Files-from data is now sent as multiplexed I/O so that it can mingle with any messages (such as debug output). Requires protocol 31.

    Protocol 31 no longer disables output verbosity in a couple instances that used to cause protocol issues.

    Got rid of MSG_* messages that have implied raw data that follows after them. We instead send a negative index value as a part of the raw data stream, which is guaranteed to be output together with the following data. This only affects the (in-progress) protocol 31 and the (self-contained) communication stream from the receiver to the generator.

    Added --debug=IO and improved --debug=FLIST. Some --debug=IO output requires --msgs2stderr to be used to see it (i.e. sending a message about sending a message would send another message, ad infinitum).
---

Summary of changes:
 flist.c                |   17 +-
 io.c                   | 1945 ++--
 log.c                  |    4 +-
 main.c                 |   66 +-
 options.c              |    1 +
 receiver.c             |    9 +-
 rsync.c                |   71 ++-
 rsync.h                |   15 +-
 sender.c               |    2 +-
 testsuite/itemize.test |    4 +-
 10 files changed, 1193 insertions(+), 941 deletions(-)

hooks/post-receive
--
The rsync repository.
___
rsync-cvs mailing list
rsync-cvs@lists.samba.org
https://lists.samba.org/mailman/listinfo/rsync-cvs
[SCM] The rsync repository. branch, master, updated. v3.0.3-239-g6f098b0
The branch, master has been updated
       via  6f098b0f8c1c3582013f20970bf575ab487f6bda (commit)
       via  1ec57e4ddcea665d04e780041fb0d1d8885bfed3 (commit)
      from  20caffd2b361bcad51692998411e4cc566c04b40 (commit)

- Log -
commit 6f098b0f8c1c3582013f20970bf575ab487f6bda
Author: Wayne Davison way...@samba.org
Date:   Sat Oct 17 11:06:49 2009 -0700

    Fix some man page problems Scott Kostyshak pointed out.

commit 1ec57e4ddcea665d04e780041fb0d1d8885bfed3
Author: Wayne Davison way...@samba.org
Date:   Sat Oct 17 09:09:27 2009 -0700

    Fix check for an empty output buffer and limit to flist_eof.
---

Summary of changes:
 io.c     |    3 ++-
 rsync.yo |   30 +++---
 2 files changed, 17 insertions(+), 16 deletions(-)
[SCM] The rsync repository. branch, master, updated. v3.0.3-240-gd23cc15
The branch, master has been updated
       via  d23cc156aa36135a2970321873798d35626d477b (commit)
      from  6f098b0f8c1c3582013f20970bf575ab487f6bda (commit)

- Log -
commit d23cc156aa36135a2970321873798d35626d477b
Author: Wayne Davison way...@samba.org
Date:   Sat Oct 17 15:03:11 2009 -0700

    Call seteuid() when calling setuid().
---

Summary of changes:
 clientserver.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
[SCM] The rsync repository. branch, master, updated. v3.0.3-241-g0a9fbe1
The branch, master has been updated
       via  0a9fbe17de7d9d298ed64263a4b3cfb77b871199 (commit)
      from  d23cc156aa36135a2970321873798d35626d477b (commit)

- Log -
commit 0a9fbe17de7d9d298ed64263a4b3cfb77b871199
Author: Wayne Davison way...@samba.org
Date:   Sat Oct 17 15:53:25 2009 -0700

    Allow %VAR% environment references in daemon-config parameter values.
---

Summary of changes:
 clientserver.c |   55 ---
 loadparm.c     |   65 ++-
 rsyncd.conf.yo |   29 -
 3 files changed, 124 insertions(+), 25 deletions(-)
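To illustrate what a %VAR% reference in a parameter value amounts to, here is a small Python model of the substitution. It is not the loadparm.c implementation: the set of characters accepted in a variable name and the choice to leave undefined or malformed references untouched are assumptions, and the rsyncd.conf manual should be consulted for the actual rules.

```python
import os
import re

# Assumed syntax: a name made of letters, digits and underscores between two
# percent signs. The real parser may accept a different set of characters.
_ENV_REF = re.compile(r"%([A-Za-z_][A-Za-z0-9_]*)%")


def expand_env_refs(value, environ=None):
    """Replace each %VAR% in 'value' with the variable's value.

    References to undefined variables are left as written (an assumption).
    """
    env = os.environ if environ is None else environ

    def substitute(match):
        return env.get(match.group(1), match.group(0))

    return _ENV_REF.sub(substitute, value)
```

For example, with `USER=alice` in the environment, `expand_env_refs("/home/%USER%/data")` gives `/home/alice/data`, while a lone percent sign is passed through unchanged.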