Hi again,

We've been getting an intermittent server crash on our live system for a
while now, which hits us every week or so.

The nsd daemon crashes with the above error and daemontools restarts it.
Having done a search, I believe this is sometimes seen on server shutdown,
where it's pretty harmless... but in these cases the server is not
shutting down (at least not intentionally).

We've now managed to get a core dump of the most recent occurrence.
Is anyone able to give us any pointers on how we can try to narrow this
down?

It's NaviServer 4.99.7 (it was also happening in 4.99.6 and, I think,
4.99.5) on Debian 7.7 (Wheezy) with Tcl 8.5.11-2.


[-conn:tlc_erp:20-] Fatal: nsthreads: pthread_join failed in Ns_ThreadJoin: Invalid argument
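
For reference, "Invalid argument" is the text for EINVAL. As far as we
understand the Linux pthread_join(3) man page, that code means the target
thread is not joinable, or another thread is already waiting to join it.
The small helper below just spells out the documented return codes; the
name join_error_hint is made up for illustration and is not part of
NaviServer.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper: turn a pthread_join() return code into a hint,
 * following the ERRORS section of the Linux pthread_join(3) man page. */
const char *join_error_hint(int err)
{
    switch (err) {
    case 0:
        return "joined successfully";
    case EINVAL:
        return "thread is not joinable, or another thread is already "
               "waiting to join it";
    case ESRCH:
        return "no thread with that ID could be found";
    case EDEADLK:
        return "deadlock detected, or the thread tried to join itself";
    default:
        /* Anything else: fall back to the generic system message. */
        return strerror(err);
    }
}
```

Either of the two EINVAL cases would be consistent with the log line above;
we can't tell which one applies from the message alone.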

(gdb) bt
#0  0x00007f5858d39165 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f5858d3c3e0 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f5859ee9639 in Panic (fmt=<optimized out>) at log.c:707
#3  0x00007f58595d6e12 in Tcl_PanicVA () from /usr/lib/libtcl8.5.so.0
#4  0x00007f58595d6f8c in Tcl_Panic () from /usr/lib/libtcl8.5.so.0
#5  0x00007f585984e6f0 in NsThreadFatal (func=<optimized out>, osfunc=<optimized out>, err=<optimized out>) at error.c:62
#6  0x00007f5859850561 in Ns_ThreadJoin (thread=<optimized out>, argPtr=<optimized out>) at pthread.c:459
#7  0x00007f5859ef0ffc in JoinConnThread (threadPtr=0x7f585181dda0) at queue.c:1688
#8  NsConnThread (arg=0x2db7d70) at queue.c:1368
#9  0x00007f585984f8ac in NsThreadMain (arg=<optimized out>) at thread.c:227
#10 0x00007f5859850899 in ThreadMain (arg=<optimized out>) at pthread.c:809
#11 0x00007f58588edb50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#12 0x00007f5858de370d in clone () from /lib/x86_64-linux-gnu/libc.so.6
#13 0x0000000000000000 in ?? ()

(gdb) frame 8
#8  NsConnThread (arg=0x2db7d70) at queue.c:1368
1368            JoinConnThread(&joinThread);
(gdb) list
1363        }
1364
1365        joinThread = servPtr->pools.joinThread;
1366        Ns_ThreadSelf(&servPtr->pools.joinThread);
1367        if (joinThread != NULL) {
1368            JoinConnThread(&joinThread);
1369        }
1370
1371        Ns_Log(Notice, "exiting: %s", exitMsg);
1372
(gdb) frame 7
#7  0x00007f5859ef0ffc in JoinConnThread (threadPtr=0x7f585181dda0) at queue.c:1688
1688        Ns_ThreadJoin(threadPtr, &argArg);
(gdb) list
1683    {
1684        void *argArg;
1685
1686        assert(threadPtr != NULL);
1687
1688        Ns_ThreadJoin(threadPtr, &argArg);
1689        /*
1690         * There is no need to free ConnThreadArg here, since it is
1691         * allocated in the driver
1692         */
(gdb) frame 6
#6  0x00007f5859850561 in Ns_ThreadJoin (thread=<optimized out>, argPtr=<optimized out>) at pthread.c:459
459             NsThreadFatal("Ns_ThreadJoin", "pthread_join", err);
(gdb) list
454
455         assert(thread != NULL);
456
457         err = pthread_join(thr, argPtr);
458         if (err != 0) {
459             NsThreadFatal("Ns_ThreadJoin", "pthread_join", err);
460         }
461     }
462
463     ^L
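
For anyone not familiar with that part of queue.c: lines 1365-1368 look
like a chained join, where each exiting connection thread joins whichever
thread exited before it and leaves its own ID behind for the next one.
Below is a standalone sketch of that pattern with plain pthreads. It is
not NaviServer code: all names (pending, worker, run_chain) are made up,
and the mutex around the handoff is our assumption about how the swap
would need to be serialised. We have not confirmed whether the real code
holds a lock at that point.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define NWORKERS 8

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static pthread_t pending;     /* last thread that exited, not yet joined */
static int have_pending;      /* 1 if 'pending' holds a valid thread ID */
static int registered;        /* workers that have done the handoff */
static int joins_ok, joins_failed;

/* Join one thread and record the result instead of aborting on error. */
static void join_and_count(pthread_t t)
{
    int err = pthread_join(t, NULL);

    pthread_mutex_lock(&lock);
    if (err == 0) {
        joins_ok++;
    } else {
        joins_failed++;
        fprintf(stderr, "pthread_join failed: %s\n", strerror(err));
    }
    pthread_mutex_unlock(&lock);
}

static void *worker(void *arg)
{
    pthread_t prev;
    int had_prev;

    (void)arg;
    /* Take the previous ID and leave our own in one critical section,
     * so no two threads can pick up the same ID to join. */
    pthread_mutex_lock(&lock);
    prev = pending;
    had_prev = have_pending;
    pending = pthread_self();
    have_pending = 1;
    registered++;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);

    if (had_prev) {
        join_and_count(prev);
    }
    return NULL;
}

/* Start NWORKERS threads, let them join each other, then join the last
 * one from the caller. Returns the number of failed joins. */
int run_chain(void)
{
    pthread_t t, last;
    int i, rc;

    have_pending = registered = joins_ok = joins_failed = 0;
    for (i = 0; i < NWORKERS; i++) {
        rc = pthread_create(&t, NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }

    /* Wait until every worker has registered itself, then take the
     * final ID out of the slot. */
    pthread_mutex_lock(&lock);
    while (registered < NWORKERS) {
        pthread_cond_wait(&cond, &lock);
    }
    last = pending;
    have_pending = 0;
    pthread_mutex_unlock(&lock);

    join_and_count(last);
    return joins_failed;
}
```

Calling run_chain() from a main() (built with -pthread) should leave every
thread joined exactly once. If the read and the replace of the shared slot
were not done atomically, two exiting threads could end up joining the same
ID, which is one way we could imagine hitting EINVAL, but that is only a
guess on our part.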




Regards,

-- 
David Osborne
Qcode Software Limited
http://www.qcode.co.uk