Sean Kamath <kam...@moltingpenguin.com> writes:

> On Jan 30, 2013, at 3:46 PM, micah anderson <mi...@riseup.net> wrote:
>> Seems that only the above process was still around and no other dsync
>> processes. I have three machines that all have this happening it seems.
>>
>> I wonder if there is a ssh configuration option I could set to make
>> these die off.
>
> If the ssh process isn't sending anything, and just waiting for read()s, and
> keepalives are turned off, the SSH session might never know the remote side
> is long gone. . .

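For reference, a minimal sketch of the keepalive idea from the quote above, assuming the OpenSSH client options ServerAliveInterval and ServerAliveCountMax. The helper name, the values, and the host "backup-host" are illustrative assumptions, not taken from this thread. Note that these keepalives are answered by the remote sshd, so they only detect an unreachable peer; they would not notice a remote command that is still running but blocked.

```python
def ssh_keepalive_args(interval=60, count_max=3):
    """Return OpenSSH client options enabling server-alive probes.

    With these values ssh gives up after roughly interval * count_max
    seconds without a reply from the remote sshd.
    """
    return [
        "-o", f"ServerAliveInterval={interval}",
        "-o", f"ServerAliveCountMax={count_max}",
    ]


# Hypothetical command line; "backup-host" is a placeholder, not a real host.
cmd = ["ssh", *ssh_keepalive_args(), "backup-host",
       "/usr/bin/dsync", "-u", "foo", "server"]
```

The same two options can also be set per host in the ssh client configuration file instead of on the command line.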
This time I managed to capture a process that was stuck and look at it
from the server side, and the client side:

on the server:

2000 19470 0.0 0.0 7512 3816 ? Ss Feb05 0:01 /usr/bin/dsync dsync-server -E -u foo

# strace -s 1024 -F -p 19470
Process 19470 attached - interrupt to quit
write(2, "dsync-remote(foo): Error: mdbox /srv/maildirbackups/foo/daily.1/storage: Duplicate GUID 96860517f68aa94f8b51000097f19f0b in m.41:682501 and m.37:653225\n", 167

on the client:

root 19001 0.0 0.0 41308 1600 ? S Feb05 0:00 ssh -i /root/.ssh/backmaildir_id_rsa backmaildir@hoopoe-pn /usr/bin/dsync -u foo server

# strace -s 1024 -F -p 19001
Process 19001 attached - interrupt to quit
select(8, [4], [], NULL, NULL

interestingly, now that I've been watching this more, the same users keep
getting wedged. When I attempt to do a dsync of that user by hand, I get
this:

dsync-local(foo): Error: Unexpected reply from server: 13 d2a100118c45d24f760f000097f19f0b 3561 128 \Recent 1353980259

I tried one of the other users that was stuck, and it gave me:

dsync-remote(bar): Error: Corrupted dbox file /srv/maildirbackups/bar/daily.1/storage/m.130 (around offset=22532): msg header has bad magic value

This looks like there is something corrupted with the dbox for the user
on the client side, is there something I can do to repair those?

> If any data were transmitted, it would discover the remote side is turned off.

One thing I am doing is using a ssh controlmaster socket, and if I kill
the process on the client's side, the server side process also dies.

micah
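
Since killing the client-side process also takes down the server side, one possible stopgap is to run the backup command under a watchdog that kills it after a deadline. The following is only a sketch under that assumption: the function name and the timeout values are hypothetical, and it does nothing to repair the underlying dbox corruption.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return its exit status, or None if it was killed on timeout."""
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()  # reap the child so it does not linger as a zombie
        return None


if __name__ == "__main__":
    # Harmless stand-in for the real ssh/dsync command: sleeps for 5 seconds
    # and is killed after 1 second.
    status = run_with_timeout(
        [sys.executable, "-c", "import time; time.sleep(5)"], 1)
    print("timed out" if status is None else f"exit status {status}")
```

In a backup script the stand-in command would be replaced by the actual ssh invocation, with a deadline comfortably longer than a normal run.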