There are probably two separate issues here:
1. There's an error in the nsswitch<->nscd communication protocol that
causes nsswitch to write to a socket that nscd has already closed. This
is not trivial to investigate and will require analyzing the nscd and
client process logs side by side (and possibly adding some more logging).
2. The consequences of the aforementioned problem (the client being
killed by SIGPIPE) can probably be avoided by calling
_setsockopt(..., SO_NOSIGPIPE) in __open_cached_connection() in
nscachedcli.c
(http://www.freebsd.org/cgi/cvsweb.cgi/src/lib/libc/net/nscachedcli.c?rev=1.3).
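To illustrate what the second item would buy us, here is a minimal
standalone sketch. It is not the libc code and not a patch for
nscachedcli.c. The helper names (suppress_sigpipe, send_quietly,
write_to_closed_peer) are made up for the example, and it uses the
public setsockopt() rather than libc's internal _setsockopt(). On
systems without SO_NOSIGPIPE it falls back to MSG_NOSIGNAL on send(),
which is an assumption added only so the sketch builds elsewhere.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel not to raise SIGPIPE for writes
 * on this socket.  Returns 0 on success, -1 if the option is missing
 * or setsockopt() fails.
 */
static int
suppress_sigpipe(int fd)
{
#ifdef SO_NOSIGPIPE
	int on = 1;

	return (setsockopt(fd, SOL_SOCKET, SO_NOSIGPIPE, &on, sizeof(on)));
#else
	(void)fd;
	errno = ENOPROTOOPT;
	return (-1);
#endif
}

/*
 * Hypothetical helper: send without SIGPIPE where the per-call flag
 * exists; otherwise rely on the socket option set above.
 */
static ssize_t
send_quietly(int fd, const void *buf, size_t len)
{
#ifdef MSG_NOSIGNAL
	return (send(fd, buf, len, MSG_NOSIGNAL));
#else
	return (send(fd, buf, len, 0));
#endif
}

/*
 * Reproduce the failure mode in miniature: the peer closes its end,
 * then we write.  Returns the errno reported by the write, 0 if the
 * write unexpectedly succeeded, or -1 if the socket pair could not be
 * created.
 */
static int
write_to_closed_peer(void)
{
	int sv[2];
	int saved = 0;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);

	(void)suppress_sigpipe(sv[0]);	/* best effort, see fallback above */
	close(sv[1]);			/* the "nscd" side goes away */

	if (send_quietly(sv[0], "x", 1) == -1)
		saved = errno;

	close(sv[0]);
	return (saved);
}
```

With the signal suppressed, the write fails with EPIPE and the caller
can handle it as an ordinary error instead of the process being killed.
This only hides the symptom; the protocol problem in item 1 would still
need to be tracked down.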

I have no access to a FreeBSD desktop at the moment. Artem, it would be
great if you could try the second solution.

Cheers,
Michael

2011/10/5 Artem Belevich <a...@freebsd.org>:
> 2011/10/4 Dag-Erling Smørgrav <d...@des.no>:
>> Any chance of getting a backtrace from an unpatched nscd?  Ideally with
>> the change described here:
>>
>> http://www.freebsd.org/cgi/query-pr.cgi?pr=bin/136073#reply1
>>
>> To test, stop nscd, then run it from the command line like so:
>>
>> $ su -
>> # cd /tmp
>> # ulimit -c unlimited
>> # /usr/sbin/nscd -nst
>> (do something in another terminal that causes it to crash)
>> # echo backtrace | gdb -batch -x /dev/stdin /usr/sbin/nscd nscd.core
>>
>> and send me the output from both nscd and gdb once it crashes.
>
> In my case it's top that dies with SIGPIPE. nscd keeps running just
> fine. So, there's no backtrace from nscd.
>
> top receives SIGPIPE after it tries to write to the socket with nscd
> on the other end. Apparently nscd closes the connection on its end.
> Running ktrace on top I see that before the write to nscd socket,
> there's a read that returned 0 bytes.
>
> Here's top's backtrace. Alas I don't have libc with debug symbols handy:
>
> Program received signal SIGPIPE, Broken pipe.
> 0x0000000800abe8cc in write () from /lib/libc.so.7
> (gdb) where
> #0  0x0000000800abe8cc in write () from /lib/libc.so.7
> #1  0x0000000800aa3f44 in ftell () from /lib/libc.so.7
> #2  0x0000000800aa415f in ftell () from /lib/libc.so.7
> #3  0x0000000800aa2031 in __h_errno () from /lib/libc.so.7
> #4  0x0000000800a98311 in nsdispatch () from /lib/libc.so.7
> #5  0x0000000800a84d95 in getpwent_r () from /lib/libc.so.7
> #6  0x0000000800a84911 in acl_get_brand_np () from /lib/libc.so.7
> #7  0x0000000000404f7b in machine_init (statics=0x7fffffffe770,
> do_unames=1 '\001') at /usr/srcdir/src.git/usr.bin/top/machine.c:258
> #8  0x000000000040a9ab in main (argc=1, argv=0x7fffffffe8c8) at
> /usr/srcdir/src.git/usr.bin/top/../../contrib/top/top.c:464
>
> --Artem
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"