On Fri, Oct 26, 2007 at 11:13:14AM -0500, Les Mikesell wrote:
> John Rouillard wrote:
> >>>
> >>> $Conf{ClientTimeout} = 72000;
> >>>
> >>>which is 20 hours and the sigpipe is occurring before then.
> >>You'd see sigalarm instead of sigpipe if you had a timeout.
> >
> >Something like this I assume:
> >
> [...]
> > create d 755 0/1 12288 src/fastforward-0.51
> > finish: removing in-process file .
> > Child is aborting
> > Done: 17 files, 283 bytes
> > Got fatal error during xfer (aborted by signal=ALRM)
> > Backup aborted by user signal
>
> Yes, that one is a timeout on the backuppc side.
>
> >Also I straced the rsync process on the remote system while it was hung
> >(I assume on whatever occurred after the src/fastforward-0.51)
> >directory and got:
> >
> > [EMAIL PROTECTED] ~]$ ps -ef | grep 6909
> > root 6909 6908 0 Oct25 ? 00:00:00 /usr/bin/rsync
> > --server --sender --numeric-ids --perms --owner --group -D --links
> > --hard-links --times --block-size=2048 --recursive --one-file-system
> > --checksum-seed=32761 --ignore-times . /usr/local/
> > rouilj 10603 10349 0 05:36 pts/0 00:00:00 grep 6909
> > [EMAIL PROTECTED] ~]$ strace -p 6909
> > attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted
> > [EMAIL PROTECTED] ~]$ sudo strace -p 6909
> > Process 6909 attached - interrupt to quit
> > select(1, [0], [], NULL, {42, 756000}) = 0 (Timeout)
> > select(1, [0], [], NULL, {60, 0}) = 0 (Timeout)
> > select(1, [0], [], NULL, {60, 0}) = 0 (Timeout)
> > select(1, [0], [], NULL, {60, 0} <unfinished ...>
> > Process 6909 detached
> >
> >And similar results on the server side process. Maybe a deadlock
> >somewhere? The ssh pipe appeared open. I set it up to forward traffic
> >and was able to pass traffic from the server to the client.
>
> Are these 2 different scenarios (the sigalarm and sigpipe)?
Yes, I just started a new thread on the hang/sigalarm problem.
> I don't
> think I've ever seen a real deadlock on a unix/linux rsync although I
> always got them on windows when trying to run rsync under sshd (and I'd
> appreciate knowing the right versions to use if that works now).
Well, it's not really rsync -> rsync, right? It's File::RsyncP -> rsync.
> The
> sigpipe scenario sounded like the remote rsync crashed or quit (perhaps
> not being able to handle files >2gigs). This looks like something
> different. Can you start the remote strace before the hang so you have
> a chance of seeing the file and activity in progress when the hang occurs?
I can try. As for the sigpipe issue, it looks like there is a missing
email in this thread. I was able to rsync the 22GB file that was the
active transfer at the time of the SIGPIPE without a problem. I'll
repost that missing email.
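
For starting the strace before the hang, one option is to have the
server invoke a small wrapper on the client instead of rsync itself.
This is only a rough sketch, not something I have run here: the
function name run_traced, the TRACER and TRACE_DIR variables, and the
log file name are all made up for illustration, and pointing BackupPC
at a wrapper script (for example through $Conf{RsyncClientPath}) is an
assumption about the setup.

```shell
# Hypothetical helper for a wrapper script on the client.
# TRACER and TRACE_DIR are illustrative names, not BackupPC settings.
run_traced() {
    tracer=${TRACER:-strace}
    # One log per wrapper invocation, named after the wrapper's PID.
    log="${TRACE_DIR:-/tmp}/rsync-strace.$$"
    # -f follows child processes, -tt prints timestamps with microseconds,
    # -o writes the trace to a file so it stays out of rsync's
    # stdin/stdout protocol stream.
    "$tracer" -f -tt -o "$log" "$@"
}

# In the wrapper script itself the last line would be something like:
#   run_traced /usr/bin/rsync "$@"
```

With that in place the trace file should show which file and which
system call were active when the select() timeouts begin.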
--
-- rouilj
John Rouillard
System Administrator
Renesys Corporation
603-643-9300 x 111
_______________________________________________
BackupPC-users mailing list
[email protected]
List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki: http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/