On Tue, Jul 11, 2000 at 07:20:07AM +1000, Neil Schellenberger wrote:
> >>>>> "Ned" == Ned Bass <[EMAIL PROTECTED]> writes:
> 
>     Ned> I have been having an on-going problem completing rsync
>     Ned> transfers.  The transfers simply stop at seemingly random
>     Ned> points without printing an error message.  Rsync still
>     Ned> continues running however all network activity dies.  I can
>     Ned> verify that all communications stop using tcpdump.
> 
>     Ned> I experience this while uploading to an anonymous rsync
>     Ned> server as well as using ssh as the transport.  If I keep
>     Ned> restarting the transfer it will eventually complete.  I am
>     Ned> using rsync to back up entire Linux filesystems, so the
>     Ned> transfers are quite lengthy.
> 
> Hi Ned (and other rsync@samba denizens),
> 
> I'm also experiencing (and trying to track down) this problem on a
> sparc-sun-solaris2.5.1 platform using a 2.4.3 rsync-daemon for
> transport.  I am also still trying to track down the (possibly
> related?) "unexpected EOF in read_timeout" problem.
> 
> As another data point, this seems to happen mostly with large mirrors
> but nonetheless doesn't seem to be a memory issue (but I'm not
> absolutely positive about that).
> 
> I'm running the client/receiver sides from a nightly cron job.  I kick
> off about twelve or fourteen jobs (of varying sizes and varying
> "changing-ness").  Usually the smallish ones finish pretty quickly; no
> more than an hour or two at very most and that's more or less sensible
> relative to their size and the bandwidth available.  Two or so of the
> larger ones (and not always the same two, but often the same two)
> will take fourteen to eighteen hours or more and then, to add insult
> to injury, puke with unexpected EOF.  Running just one of these failed
> jobs manually (as practically the only thing running on a system with
> scads of memory, both physical and virtual) still usually yields an
> "unexpected EOF", but much quicker (twenty minutes or so).
> 
> The machines in question have their O/S stripped down to the bare
> minimum, so I can't easily snoop/tcpdump from them, but netstating
> suggests very, very low network traffic (the rsync jobs run in the
> middle of the night when there is precious little other traffic).


I have suspected for a while that rsync 2.4.3 can hang when used with an
rsync server, but nobody has confirmed it.  I'd like to see somebody run
some rigorous server mode tests with the new --blocking-io option that is
in rsync CVS.  It defaults to on when rsh is the transport but not for
server mode, because we had no evidence that server mode was affected.  If
hacking the source is easier for you than getting the code out of CVS, you
can instead add "set_blocking(STDOUT_FILENO)" in util.c right after the
existing call to set_blocking(STDIN_FILENO).
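
For anyone unsure what that one-line change does, here is a minimal sketch
of clearing O_NONBLOCK on a descriptor with fcntl().  The function name
sketch_set_blocking is made up for illustration; this is an assumption
about the general technique, not a copy of rsync's own set_blocking(),
which may differ in detail.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper, NOT rsync's actual code: put fd into blocking
 * mode by clearing O_NONBLOCK.  Returns 0 on success, -1 on failure. */
int sketch_set_blocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;              /* could not read the status flags */
    if (!(flags & O_NONBLOCK))
        return 0;               /* already blocking, nothing to do */
    if (fcntl(fd, F_SETFL, flags & ~O_NONBLOCK) == -1)
        return -1;              /* could not write the flags back */
    return 0;
}
```

With a helper like this, the hack described above amounts to calling it on
STDOUT_FILENO as well as STDIN_FILENO, so that writes to stdout block
until they can proceed rather than returning early.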

- Dave Dykstra
