On Fri, Oct 26, 2007 at 11:13:14AM -0500, Les Mikesell wrote:
> John Rouillard wrote:
> >>>
> >>>  $Conf{ClientTimeout} = 72000;
> >>>
> >>>which is 20 hours and the sigpipe is occurring before then.
> >>You'd see sigalarm instead of sigpipe if you had a timeout.
> >
> >Something like this I assume:
> >
> [...]
> >    create d 755       0/1       12288 src/fastforward-0.51
> >  finish: removing in-process file .
> >  Child is aborting
> >  Done: 17 files, 283 bytes
> >  Got fatal error during xfer (aborted by signal=ALRM)
> >  Backup aborted by user signal
> 
> Yes, that one is a timeout on the backuppc side.
> 
> >Also I straced the rsync process on the remote system while it was hung
> >(I assume on whatever occurred after the src/fastforward-0.51)
> >directory and got:
> >
> >  [EMAIL PROTECTED] ~]$ ps -ef | grep 6909
> >  root      6909  6908  0 Oct25 ?        00:00:00 /usr/bin/rsync
> >  --server --sender --numeric-ids --perms --owner --group -D --links
> >  --hard-links --times --block-size=2048 --recursive --one-file-system
> >  --checksum-seed=32761 --ignore-times . /usr/local/
> >  rouilj   10603 10349  0 05:36 pts/0    00:00:00 grep 6909
> >  [EMAIL PROTECTED] ~]$ strace -p 6909
> >  attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted
> >  [EMAIL PROTECTED] ~]$ sudo strace -p 6909
> >  Process 6909 attached - interrupt to quit
> >  select(1, [0], [], NULL, {42, 756000})  = 0 (Timeout)
> >  select(1, [0], [], NULL, {60, 0})       = 0 (Timeout)
> >  select(1, [0], [], NULL, {60, 0})       = 0 (Timeout)
> >  select(1, [0], [], NULL, {60, 0} <unfinished ...>
> >  Process 6909 detached
> >
> >And similar results on the server side process. Maybe a deadlock
> >somewhere? The ssh pipe appeared open. I set it up to forward traffic
> >and was able to pass traffic from the server to the client.
> 
> Are these 2 different scenarios (the sigalarm and sigpipe)?

Yes, I just started a new thread on the hang/sigalarm problem.
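
To make the difference between the two scenarios concrete, here is a
minimal Python sketch. It is not BackupPC or File::RsyncP code, the
function names are made up for illustration, and it is Unix only. A
timeout arrives as SIGALRM in the waiting process, whereas writing to a
pipe whose reader has gone away fails with EPIPE/SIGPIPE.

```python
import os
import select
import signal


class AlarmTimeout(Exception):
    """Raised by the SIGALRM handler (hypothetical name)."""


def _on_alarm(signum, frame):
    raise AlarmTimeout()


def wait_with_alarm(timeout_s):
    """Wait on a pipe that never gets data; the alarm ends the wait."""
    r, w = os.pipe()
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, timeout_s)
    try:
        select.select([r], [], [])  # blocks: nothing is ever written to w
        return "data"
    except AlarmTimeout:
        return "ALRM"
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, old_handler)
        os.close(r)
        os.close(w)


def write_to_dead_peer():
    """Write to a pipe whose read end is already closed."""
    r, w = os.pipe()
    os.close(r)  # the reader "crashed or quit"
    try:
        os.write(w, b"x")
        return "ok"
    except BrokenPipeError:
        # Python ignores SIGPIPE by default, so the write fails with EPIPE
        # instead of killing the process.
        return "PIPE"
    finally:
        os.close(w)
```

Calling wait_with_alarm(0.05) models the hang that ends in ALRM, and
write_to_dead_peer() models the remote end disappearing mid-transfer.
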
> I don't 
> think I've ever seen a real deadlock on a unix/linux rsync although I 
> always got them on windows when trying to run rsync under sshd (and I'd 
> appreciate knowing the right versions to use if that works now).

Well, it's not really rsync -> rsync, right? It's File::RsyncP -> rsync.

> The 
> sigpipe scenario sounded like the remote rsync crashed or quit (perhaps 
> not being able to handle files >2gigs).  This looks like something 
> different.  Can you start the remote strace before the hang so you have 
> a chance of seeing the file and activity in progress when the hang occurs?

I can try. As for the sigpipe issue, it looks like there is a missing
email in this thread. I was able to run an rsync of the 22GB file that
was the active transfer at the time of the SIGPIPE without a
problem. I'll repost that missing email.
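
For what it's worth, the select() lines in the strace output quoted
above are what an idle process looks like: it waits for fd 0 to become
readable, gives up when the timeout expires (60 seconds there), and
loops. Below is a minimal Python sketch of the same pattern. It assumes
a local pipe in place of the ssh connection and a short timeout so it
finishes quickly; the names are illustrative and this is not rsync's
code.

```python
import os
import select


def poll_idle_fd(timeout_s, rounds):
    """Call select() repeatedly on a read end that never receives data."""
    r, w = os.pipe()
    results = []
    try:
        for _round in range(rounds):
            # Same shape as select(1, [0], [], NULL, {60, 0}) in the trace:
            # one fd in the read set, empty write set, finite timeout.
            readable, _writable, _errors = select.select([r], [], [], timeout_s)
            results.append("ready" if readable else "timeout")
    finally:
        os.close(r)
        os.close(w)
    return results
```

Every round reports a timeout as long as the other side stays silent,
which matches a sender that is waiting for its peer rather than one
that is busy reading files.
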

-- 
                                -- rouilj

John Rouillard
System Administrator
Renesys Corporation
603-643-9300 x 111

_______________________________________________
BackupPC-users mailing list
[email protected]
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/