rsync Folks,
The following explanatory text is by me, and the patches are by Rowan McKenzie, for use by the Advanced Scientific Computing group at CSIRO. This patch builds upon the --link-dest patch by Bryant Hansen (Thanks heaps!).

1. The original patch provided an alternate behaviour for rsync when using the --link-dest option. When there are identical files in the source and link-dest areas, but a different file in the destination area, standard rsync will update the destination by copying the file from source to destination; the patch instead updates the destination by hard-linking from the link-dest area.

Why do we want this behaviour? In our backup system, which we have been running since 2007 using rsync, we recycle old backup destinations to be the target for new backups. This is an efficiency gain: we don't want to create a new area with, in several cases, millions of files, when we have an older area which is nearly up to date. (With mature backups of user areas, we typically find that our daily backups have a churn of 0.5% of files and 1% of data.) The original patch ensures we get the maximum amount of hard-linking available.

However, the original patch unconditionally outputs a message for every file hard-linked under the scenario outlined above. Our modified patch makes output of the diagnostic message controlled by the -v option.

2. In addition, the patch adds one more feature and option. In our backups some time ago, we wished to avoid repeated daily backups of large files that were being appended to each day (they were the outputs of computational models). We used the --max-size parameter to skip these files; however, we did not like the lack of warning about skipped files. This patch adds another parameter, --warn, to select the output of a warning message when files bigger than the selected --max-size are skipped. The message is of the form:

    big_file is over max-size

3. For information: our backups are controlled from the destination of the backups (pull rather than push, as Kevin Korb recently advised). We use the rsync daemon capability. The destinations of our backups are file systems subject to HSM (Hierarchical Storage Management), using SGI's Data Migration Facility (DMF).

A typical command we use is the following (but I have shortened the paths and addresses).

    rsync --password-file=not_for_your_eyes --numeric-ids -a --stats \
        --one-file-system --max-size=8.0GB --warn --whole-file \
        --link-dest=previous --delete \
        root@source_host::backups/source_dir current

    --password-file=not_for_your_eyes
        for the daemon
    --numeric-ids
        since the userids on the source are not always available on the destination
    -a
        archive mode
    --stats
        statistics
    --one-file-system
        stops rsync crossing into other mounted file systems, so that backing up / does not back up everything
    --max-size=8.0GB
        to skip large files
    --warn
        NEW parameter: warn of files skipped because of --max-size
    --whole-file
        essential when the destination is subject to HSM; otherwise, files will be recalled to use the rsync comparison algorithm
    --link-dest=previous
        pointer to the previous backup, to provide a source of files for hard-linking
    --delete
        essential when the destination is a recycled directory, to ensure superseded files are deleted
    root@source_host::backups/source_dir
        the source specification: username@source_host, module specification, and source directory
    current
        the destination directory

We use an extended Tower of Hanoi scheme to manage the keeping of backups. It is highly recommended for its ability to provide sensible keeping of backups matched to the likelihood of restores, and because it avoids messy management using dates and times.

Regards
Rob. Bell
e-mail: robert.b...@csiro.au

--
Dr Robert C. Bell, BSc (Hons) PhD
Technical Services Manager
Advanced Scientific Computing
CSIRO IM&T
Phone: +61 3 9669 8102 | Mobile: +61 428 108 333 | CSIRO 93 3810
robert.b...@csiro.au | http://www.csiro.au/ | http://www.hpsc.csiro.au/
Addresses:
Street: CSIRO ASC Level 11, 700 Collins Street, Docklands Vic 3008, Australia
Postal: CSIRO ASC Level 11, GPO Box 1289, Melbourne Vic 3001, Australia

# CSIRO-ASC patches for rsync.
#
# Note: name this so that it's called after rsync_link_dest_from_bryant
#
# Rowan McKenzie 27/7/2012

# Warnings for --max-size ignored files are displayed if -w/--warn is specified
diff -Naur orig/options.c new/options.c
--- orig/options.c	2012-07-26 16:36:59.201899312 +1000
+++ new/options.c	2012-07-26 16:49:40.065898039 +1000
@@ -173,6 +173,7 @@
 char *dest_option = NULL;
 
 int verbose = 0;
+int warn = 0;
 int quiet = 0;
 int output_motd = 1;
 int log_before_transfer = 0;
@@ -313,6 +314,7 @@
   rprintf(F,"\n");
   rprintf(F,"Options\n");
   rprintf(F," -v, --verbose increase verbosity\n");
+  rprintf(F," -w, --warn display warnings\n");
   rprintf(F," -q, --quiet suppress non-error messages\n");
   rprintf(F," --no-motd suppress daemon-mode MOTD (see manpage caveat)\n");
   rprintf(F," -c, --checksum skip based on checksum, not mod-time & size\n");
@@ -453,6 +455,7 @@
   {"help", 0, POPT_ARG_NONE, 0, OPT_HELP, 0, 0 },
   {"version", 0, POPT_ARG_NONE, 0, OPT_VERSION, 0, 0},
   {"verbose", 'v', POPT_ARG_NONE, 0, 'v', 0, 0 },
+  {"warn", 'w', POPT_ARG_NONE, 0, 'w', 0, 0 },
   {"no-verbose", 0, POPT_ARG_VAL, &verbose, 0, 0, 0 },
   {"no-v", 0, POPT_ARG_VAL, &verbose, 0, 0, 0 },
   {"quiet", 'q', POPT_ARG_NONE, 0, 'q', 0, 0 },
@@ -671,6 +674,7 @@
   rprintf(F," --log-file-format=FMT override the \"log format\" setting\n");
   rprintf(F," --sockopts=OPTIONS specify custom TCP options\n");
   rprintf(F," -v, --verbose increase verbosity\n");
+  rprintf(F," -w, --warn display warnings\n");
   rprintf(F," -4, --ipv4 prefer IPv4\n");
   rprintf(F," -6, --ipv6 prefer IPv6\n");
   rprintf(F," --help show this help screen\n");
@@ -698,6 +702,7 @@
   {"server", 0, POPT_ARG_NONE, &am_server, 0, 0, 0 },
   {"temp-dir", 'T', POPT_ARG_STRING, &tmpdir, 0, 0, 0 },
   {"verbose", 'v', POPT_ARG_NONE, 0, 'v', 0, 0 },
+  {"warn", 'w', POPT_ARG_NONE, 0, 'w', 0, 0 },
   {"no-verbose", 0, POPT_ARG_VAL, &verbose, 0, 0, 0 },
   {"no-v", 0, POPT_ARG_VAL, &verbose, 0, 0, 0 },
   {"help", 'h', POPT_ARG_NONE, 0, 'h', 0, 0 },
@@ -978,6 +983,10 @@
 			verbose++;
 			break;
 
+		case 'w':
+			warn++;
+			break;
+
 		default:
 			rprintf(FERROR,
 			    "rsync: %s: %s (in daemon mode)\n",
@@ -1095,6 +1104,10 @@
 			verbose++;
 			break;
 
+		case 'w':
+			warn++;
+			break;
+
 		case 'q':
 			quiet++;
 			break;

# Warnings for --max-size ignored files are displayed if -w/--warn is specified
diff -Naur orig/generator.c new/generator.c
--- orig/generator.c	2012-07-26 16:56:13.773898292 +1000
+++ new/generator.c	2012-07-26 16:44:19.597902412 +1000
@@ -24,6 +24,7 @@
 #include "same-inode.h"
 
 extern int verbose;
+extern int warn;
 extern int dry_run;
 extern int do_xfers;
 extern int stdout_format_has_i;
@@ -1796,7 +1796,7 @@
 	}
 
 	if (max_size > 0 && F_LENGTH(file) > max_size) {
-		if (verbose > 1) {
+		if (verbose > 1 || warn) {
 			if (solo_file)
 				fname = f_name(file, NULL);
 			rprintf(FINFO, "%s is over max-size\n", fname);

# Only output '=>' notifications when -v/--verbose specified (it's a patch to the rsync_link_dest_from_bryant patch)
diff -Naur orig/generator.c new/generator.c
--- orig/generator.c	2012-07-26 16:56:13.773898292 +1000
+++ new/generator.c	2012-07-26 16:44:19.597902412 +1000
@@ -1066,7 +1066,7 @@
 			 * doesn't exist... */
 			if (!statres || stat_err == ENOENT) {
 				if (!SAME_INODE(sxp->st, st_tmp)) {
-					if (!statres) {
+					/*if (!statres) {*/ if (!statres && verbose) {
 						rprintf(FCLIENT, "%s => %s\n", fname, cmpbuf);
 					}
 					if (!hard_link_one(file, fname, cmpbuf, 1)) {