On 2021-10-15, Otto Moerbeek <o...@drijf.net> wrote:
> On Fri, Oct 15, 2021 at 07:47:22PM +0200, Mischa wrote:
>
>> 
>> 
>> On 2021-10-15 19:42, Otto Moerbeek wrote:
>> > On Fri, Oct 15, 2021 at 07:16:55PM +0200, Mischa wrote:
>> > 
>> > > On 2021-10-15 18:27, Otto Moerbeek wrote:
>> > > >
>> > > > The actual problem (SIGSEGV) happens in the child processes: ktrace the
>> > > > children as well: ktrace -di ...
>> > > >
>> > > >        -Otto
>> > > 
>> > > Thanx Otto.
>> > > Below is the kdump with ktrace -di.
>> > > It's quite a lot of data but I didn't want to remove something that
>> > > could potentially be useful.
>> > > 
>> > > Mischa
>> > > 
>> > 
>> > The pattern below happens multiple times:
>> > 
>> > A recvfrom of 101 bytes and after that a SIGSEGV.
>> > 
>> > Now we do not know for sure if those two lines are related.
>> > 
>> > I suspect that it is no coincidence that the 101 is one larger than
>> > 100...
>> > 
>> > No other clue yet.
>> 
>> Anything else I can collect?
>
> You might want to compile and install nsd with debug symbols:
>
>       cd /usr/src/usr.sbin/nsd
>       make -f Makefile.bsd-wrapper obj
>       make -f Makefile.bsd-wrapper clean
>       DEBUG=-g make -f Makefile.bsd-wrapper
>       make -f Makefile.bsd-wrapper install

Use "make DEBUG=-g -f Makefile.bsd-wrapper install" for the install step,
otherwise the installed binary is stripped.

> Then: collect a gdb trace from a running process: install gdb from ports,
> run
>       egdb --pid=pidofnsdchild /usr/sbin/nsd
>
> and wait for the crash.

Alternatively set kern.nosuidcoredump=3, mkdir /var/crash/nsd, and it should
save cores there. (Don't send the core file; run
"egdb /usr/sbin/nsd /var/crash/nsd/XXX.core" and send the output of
"bt full" instead.)

Or run nsd under the debugger: "egdb /usr/sbin/nsd", then "set args -d -v 3"
(in case we get anything useful from logs at the time), then "run".

> But I'm mostly unfamiliar with the nsd code and what has been changed
> recently.  I'd say make sure sthen@ and florian@ see this: move to
> bugs@ as I do not know if they read misc@.

The only change I spotted to code around reads was in dnstap (which we
don't build anyway); nothing stands out so far.

>> > >  91127 nsd      GIO   fd 7 read 101 bytes
>> > > "By\0\0\0\^A\0\0\0\0\0\^A\^A6\^A0\^A1\^A0\^A0\^A0\^A0\^A0\^A0\^A0\^A0\^A0\^A0\^A0\^A0\^A0\^A1\^A0\^A0\^A0\^A4\^A0\^A0\^A1\^A0\^A0\^A0\^A6\^A3\^A0\^Aa\^A2\^Cip6\^Darpa\0\0\f\0\
>> > >  \^A\0\0)\^E\M-,\0\0\M^@\0\0\0"
>> > >  91127 nsd      STRU  struct sockaddr { AF_INET,
>> > > 141.101.75.185:10029 }
>> > >  91127 nsd      RET   recvfrom 101/0x65
>> > >  91127 nsd      PSIG  SIGSEGV SIG_DFL code SEGV_MAPERR<1> addr=0x10
>> > > trapno=6
>> > >  36104 nsd      STRU  struct pollfd [2] { fd=16, events=0x1<POLLIN>,
>> > > revents=0<> } { fd=15, events=0x1<POLLIN>, revents=0<> }
>> > >  36104 nsd      PSIG  SIGCHLD caught handler=0xb27e47fa340 mask=0<>
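
For reference, the 101-byte datagram in the kdump above looks like an
ordinary query as far as I can tell: a PTR lookup for an ip6.arpa name
plus an EDNS0 OPT record. Below is a rough Python sketch that rebuilds the
same bytes from the fields visible in the dump. The helper names
(encode_name, build_query) are made up for illustration and have nothing
to do with the nsd code, and the field values are my own reading of the
kdump escapes, so treat them as assumptions.

```python
import struct

# Nibble labels transcribed from the kdump GIO line, in wire order.
# This transcription is an assumption; check it against the dump.
NIBBLES = "601000000000000010004001000630a2"


def encode_name(labels):
    """Encode DNS labels as length-prefixed bytes plus the root terminator."""
    out = b""
    for label in labels:
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query():
    """Rebuild the query as read from the dump (hypothetical helper)."""
    # Header: ID "By", flags 0, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=1.
    header = b"By" + struct.pack("!HHHHH", 0, 1, 0, 0, 1)
    # Question: 32 one-character labels, then "ip6" and "arpa".
    qname = encode_name(list(NIBBLES) + ["ip6", "arpa"])
    question = qname + struct.pack("!HH", 12, 1)  # QTYPE=PTR, QCLASS=IN
    # EDNS0 OPT RR: root name, TYPE=41, UDP payload size 0x05ac,
    # TTL field 0x00008000 (DO bit set), RDLENGTH=0.
    opt = b"\x00" + struct.pack("!HHIH", 41, 0x05AC, 0x00008000, 0)
    return header + question + opt


if __name__ == "__main__":
    pkt = build_query()
    # 12 (header) + 78 (question) + 11 (OPT) bytes
    print(len(pkt))
```

Run as a script this prints 101, matching the recvfrom return value. So
the size may simply be what a full ip6.arpa PTR query with an empty OPT
record comes to; that does not rule out a length-related bug in how the
query is handled.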
>
>


-- 
Please keep replies on the mailing list.
