Well, it ran to completion this way, in about 7.5h, but I'm not certain I believe it. I left --delete --force off (I have been horribly burned testing those on really big chunks before), so I would expect the destination to end up with at least as much as the source. big and big1 are subdirectories of the same volume, and are its only contents aside from a very small directory containing between 10 and 20 KB of scripts, so I don't see how the destination could end up 6-1/2 GB short. When my current operations complete, I'll try one with all the options turned on, and run the filesystem map generator from my project to see what differences it left.
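For anyone wanting to run a similar check, here is a rough stand-in for that kind of tree comparison. It is only a sketch in Python, not the map generator mentioned above: the names tree_map and compare_trees are made up for illustration, and it assumes both trees are mounted locally. It compares relative paths and byte sizes only, so it would not explain a df difference that comes from block allocation rather than from missing files.

```python
import os


def tree_map(root):
    """Map each relative path under root to its size in bytes (None for directories)."""
    entries = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            entries[rel] = None
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            # lstat so that a symlink is measured itself, not its target
            entries[rel] = os.lstat(full).st_size
    return entries


def compare_trees(src, dst):
    """Return (missing, extra, size_differs) as sorted lists of relative paths."""
    a, b = tree_map(src), tree_map(dst)
    missing = sorted(set(a) - set(b))   # in source, not in destination
    extra = sorted(set(b) - set(a))     # in destination, not in source
    size_differs = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return missing, extra, size_differs
```

Called as compare_trees("/wan/lon-tools1/big", "/wan/lon-tools2/big"), it would list what is missing, extra, or a different size. On 2.7 million files it holds both maps in memory, which is a deliberate simplification.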
I have an idea of a mod to make the hard links check more efficient, but I don't understand C well enough. What I was thinking of was to keep the st_nlink part of the stat, and if it's not a directory and nlink > 1, save the path and inode in a separate list, and leave them out of the main flist. That way, there's no processing of the items for which there's no possibility of a need to track hard links; then fix only one copy of each linked file, delete all the others, and link them back to it. I'm guessing that's a complete redo of the protocol, though.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Tools@lonnetsvr /users/Tools>cat doit
#!/bin/sh
/cadappl/encap/packages/rsync-cvs/bin/rsync --rsync-path=/cadappl/encap/packages/rsync-cvs/bin/rsync -WHav --stats --progress alta:/wan/lon-tools1/lon-tools1/big* /wan/lon-tools2/lon-tools2 >doit.log 2>&1 </dev/null &
Tools@lonnetsvr /users/Tools>grep ' ' doit.log
receiving file list ... done
big/tools/Tools/.microsoft/Favorites/Channels/Arcadia Bay Demo Channel/
big/tools/Tools/.microsoft/Favorites/Channels/The Microsoft Channel/
big1/cadappl1/hpux/iclibs/CMOS18/EXTERNALS/PcCMOS18sfliolib_nlm_ex/2.1/tools/adf/vital/sfliolib_nlm -> ../../vha/sfliolib_nlm
big1/cadappl1/hpux/iclibs/CMOS18/EXTERNALS/PcCMOS18shliolib_nlm_ex/2.1/tools/adf/vital/shliolib_nlm -> ../../vha/shliolib_nlm
big1/cadappl1/hpux/iclibs/CMOS18/PcCMOS18flviolib_spm/2.1.1/lib/flviolib_spm.src -> ../tools/vital/timing/flviolib_spm.src
big1/cadappl1/hpux/iclibs/CMOS18/PcCMOS18flviolib_spm/2.1.1/tools/adf/vital/flviolib_spm -> ../../vha/flviolib_spm
big1/cadappl1/hpux/latest -> /cadappl/perl/5.6.1
Number of files: 2727469
Number of files transferred: 0
Total file size: 114067347318 bytes
Total transferred file size: 0 bytes
Literal data: 0 bytes
Matched data: 0 bytes
File list size: 68790028
Total bytes written: 16
Total bytes read: 68790044

wrote 16 bytes  read 68790044 bytes  2531.79 bytes/sec
total size is 114067347318  speedup is 1658.20
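
To make the hard-link idea from the top of this message concrete, here is a minimal sketch in Python rather than C. It is not rsync's code and says nothing about the wire protocol; split_by_link_count and link_plan are hypothetical names. It only shows the partitioning step: non-directories with st_nlink > 1 go into a separate table keyed by device and inode, and everything else stays in the plain list.

```python
import os
import stat
from collections import defaultdict


def split_by_link_count(root):
    """Separate entries that might be hard-linked from those that cannot be.

    Returns (plain, linked): plain is a sorted list of relative paths with
    no possible hard-link partner; linked maps (st_dev, st_ino) to the
    sorted relative paths found for that inode.
    """
    plain = []
    linked = defaultdict(list)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)  # lstat: do not follow symlinks
            rel = os.path.relpath(full, root)
            if not stat.S_ISDIR(st.st_mode) and st.st_nlink > 1:
                linked[(st.st_dev, st.st_ino)].append(rel)
            else:
                plain.append(rel)
    return sorted(plain), {key: sorted(paths) for key, paths in linked.items()}


def link_plan(linked):
    """For each inode group, pick one path to keep and list the rest to re-link to it."""
    plan = []
    for paths in linked.values():
        keeper, others = paths[0], paths[1:]
        plan.append((keeper, others))
    return sorted(plan)
```

Only the entries in the linked table ever need inode comparison; a group with a single path just means the other links live outside the tree being walked.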
Tools@lonnetsvr /users/Tools>df -k /wan/lon-tools*/big/tools
Filesystem            kbytes      used     avail capacity  Mounted on
lon-tools1:big     150147795 121588653  28559142      81%  /wan/lon-tools1/big
lon-tools2:big     150147795 115027617  35120178      77%  /wan/lon-tools2/big
Tools@lonnetsvr /users/Tools>
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Tim Conway
[EMAIL PROTECTED]
303.682.4917
Philips Semiconductor - Longmont TC
1880 Industrial Circle, Suite D
Longmont, CO 80501
Available via SameTime Connect within Philips, n9hmg on AIM
perl -e 'print pack(nnnnnnnnnnnn,
19061,29556,8289,28271,29800,25970,8304,25970,27680,26721,25451,25970),
".\n" '
"There are some who call me.... Tim?"


Dave Dykstra <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
02/07/2002 10:28 AM

        To:     David Birnbaum <[EMAIL PROTECTED]>
        cc:     Tim Conway/LMT/SC/PHILIPS@AMEC
                Eric Whiting <[EMAIL PROTECTED]>
                [EMAIL PROTECTED]
        Subject:        Re: SIGUSR1 or SIGINT error
        Classification:

The fix that went into 2.5.0 was for timeouts that were happening even when
--timeout=0 (the default).  Can any of you say for sure that it makes a
difference with a new version when you go from --timeout=0 to a very large
timeout?  I want to see if Tim's experience with timeouts defaulting to 60
seconds is still happening, or if that was only something earlier.

Of course, it's also entirely possible that the "SIGUSR1 or SIGINT error"
message is being caused by a different problem.

- Dave Dykstra

On Thu, Feb 07, 2002 at 10:22:23AM -0500, David Birnbaum wrote:
> I'm running 2.5.2.  However, we had the same type of problem with 2.4.6,
> which is what we were running before.  If I had to guess, I would say
> that we're seeing this error a little more often in 2.5.2.
>
> David.
>
> -----
>
> On Thu, 7 Feb 2002 [EMAIL PROTECTED] wrote:
>
> > Currently 2.5.1pre3.  I haven't tested that problem lately, though.  I'll
> > get the newest up and try a full sync.  It's worth a try.
> > I'll feel really stupid, though, if i've put all this work into newsync
> > (perl driving find|diff|tar|lzop) and it's fixed in rsync.  I think our
> > case will always create problems, though, with the broken nfs unlink in
> > the nfs3 interface on the NAS, and the broken nfs2 client on the solaris
> > machines (mtime bug).  I won't let this influence my test, though ;-).
> >
> > Tim Conway
> >
> >
> > Dave Dykstra <[EMAIL PROTECTED]>
> > Sent by: [EMAIL PROTECTED]
> > 02/06/2002 03:41 PM
> >
> >         To:     Eric Whiting <[EMAIL PROTECTED]>
> >         cc:     Tim Conway/LMT/SC/PHILIPS@AMEC
> >                 David Birnbaum <[EMAIL PROTECTED]>
> >                 [EMAIL PROTECTED]
> >         Subject:        Re: SIGUSR1 or SIGINT error
> >         Classification:
> >
> > Looks like a fix for that went into 2.5.0.  See revision 1.87 at
> >     http://cvs.samba.org/cgi-bin/cvsweb/rsync/io.c
> >
> > Tim & David, what version are you running?
> >
> > 2.5.2 has some serious problems, Eric.  Try the latest development
> > snapshot at
> >     rsync://rsync.samba.org/ftp/unpacked/rsync/
> > or
> >     ftp://rsync.samba.org/pub/unpacked/rsync/
> >
> > - Dave Dykstra
> >
> >
> > On Wed, Feb 06, 2002 at 11:33:43AM -0700, Eric Whiting wrote:
> > > Make that 2 of us who need to specify a large timeout.
> > >
> > > I have found that I have to set the timeout to a large value (10000) to
> > > get the rsyncs to run successfully.  Leaving it at the default seemed to
> > > cause timeout/hang problems.  Of course I still running a 2.4.6dev
> > > version.  I had troubles with 2.5.[01].
> > > (solaris/linux mix of rsync clients/servers)
> > >
> > > I need to try 2.5.2 as soon as I get a chance.  Looks like some good
> > > fixes are happening in 2.5.2.
> > >
> > > eric
> > >
> > >
> > > On Wed, 2002-02-06 at 10:39, [EMAIL PROTECTED] wrote:
> > > > When i was getting these, I traced the process and its children
> > > > (solaris: truss -f).  I found that one of the spawned threads was
> > > > experiencing an io timeout while the filelist was building.  I had
> > > > set no timeout, but it did it at 60 seconds every time.  I found that
> > > > this corresponded to a SELECT_TIMEOUT parameter, which was set to 60
> > > > if IO_TIMEOUT was 0.  By setting my timeout to 86400 (1 day), i
> > > > stopped those.  Of course, then, it choked farther along, but that's
> > > > another story.
> > > > Try setting a timeout, even if you don't want one.  Make it the
> > > > longest the process should ever take.
> > > >
> > > > Tim Conway
> > > >
> > > >
> > > > Dave Dykstra <[EMAIL PROTECTED]>
> > > > Sent by: [EMAIL PROTECTED]
> > > > 02/06/2002 10:16 AM
> > > >
> > > >         To:     David Birnbaum <[EMAIL PROTECTED]>
> > > >         cc:     [EMAIL PROTECTED]
> > > >                 (bcc: Tim Conway/LMT/SC/PHILIPS)
> > > >         Subject:        Re: SIGUSR1 or SIGINT error
> > > >         Classification:
> > > >
> > > > On Tue, Feb 05, 2002 at 11:28:54AM -0500, David Birnbaum wrote:
> > > > > I suspected that might be the case...now...how to determine the
> > > > > "real" problem?  Does rsync log it somewhere?
> > > > > lsof shows that STDERR/STDOUT are going to /dev/null, so I hope
> > > > > it's not writing it there.  Nothing informative in syslog, just the
> > > > > message about the SIG:
> > > > >
> > > > >   Feb  5 09:49:41 hite rsyncd[9279]: [ID 702911 daemon.warning] rsync error: received SIGUSR1 or SIGINT (code 20) at rsync.c(229)
> > > > >
> > > > > Any clues?
> > > >
> > > > I'm sorry, but I don't have any more suggestions.
> > > >
> > > > - Dave Dykstra