Dear David,

The crash looks like a problem in the OpenSSL memory management.

In general, I would not believe that this is a problem in the NaviServer code itself, but rather one of the interplay between the various memory management options of OpenSSL, NaviServer and Tcl. We use these functions under heavy load on many servers, but we are careful to use the same malloc implementation everywhere (actually Google's TCMalloc).

OpenSSL:
======

In general, OpenSSL supports configuration of its memory management routines. However, the memory management interface of OpenSSL changed with the release of OpenSSL 1.1.0. As a consequence, when NaviServer is compiled with newer versions of OpenSSL, the native OpenSSL memory routines are used. The commit [1] says: "Registering our own functions does not seem necessary". So, if one compiles a version of NaviServer between 4.99.15 and 4.99.20 with a newer version of OpenSSL, a problem might arise when the native OpenSSL malloc implementation is not fully thread-safe, or when different malloc implementations get mixed.

NaviServer:
=======

When NaviServer is compiled with -DSYSTEM_MALLOC, ns_malloc() uses malloc() etc., otherwise it uses Tcl's ckalloc() and friends.
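
The compile-time switch described above can be pictured roughly as follows. This is only a minimal sketch and not the NaviServer source: demo_malloc(), demo_free(), demo_allocator_name(), tcl_like_alloc(), tcl_like_free() and the macro DEMO_SYSTEM_MALLOC are hypothetical names made up for illustration, and the "Tcl-like" functions are simple stand-ins so that the example compiles without tcl.h.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for Tcl's ckalloc()/ckfree(), so that the sketch
 * compiles without tcl.h. They only count calls and forward to libc; they
 * do not model how Tcl's own allocator works internally. */
unsigned long tcl_like_calls = 0;

void *tcl_like_alloc(size_t size)
{
    tcl_like_calls++;
    return malloc(size);
}

void tcl_like_free(void *ptr)
{
    free(ptr);
}

/* Hypothetical wrapper: build with -DDEMO_SYSTEM_MALLOC to route all
 * requests to the libc allocator, otherwise the Tcl-like one is used. */
void *demo_malloc(size_t size)
{
#ifdef DEMO_SYSTEM_MALLOC
    return malloc(size);
#else
    return tcl_like_alloc(size);
#endif
}

/* The release path must follow the same switch as the allocation path. */
void demo_free(void *ptr)
{
#ifdef DEMO_SYSTEM_MALLOC
    free(ptr);
#else
    tcl_like_free(ptr);
#endif
}

/* Reports which allocator the wrappers were compiled against. */
const char *demo_allocator_name(void)
{
#ifdef DEMO_SYSTEM_MALLOC
    return "system malloc";
#else
    return "tcl-like allocator";
#endif
}
```

The reason for funnelling everything through one pair of wrappers is that a block has to be released by the same allocator family that produced it. Handing a pointer from one allocator to the free() of another can corrupt the heap, and such corruption often shows up only later, in an unrelated malloc() call.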

Tcl:
===
There also exists a patch [2] that makes Tcl use the system malloc internally instead of Tcl's own multi-threaded version.

In October, there was also a small patch for NaviServer for cases where Tcl and NaviServer are compiled with different memory allocators [3].

My first attempt would be to compile NaviServer with SYSTEM_MALLOC and check whether you still experience the problem. The next recommendation would be to check which malloc implementations are used by which subsystems, and to align these if necessary.

I will look into reviving the configuration of OpenSSL, so that its malloc implementation can be configured as was possible before OpenSSL 1.1.0.

-gn

[1] https://bitbucket.org/naviserver/naviserver/commits/896a4e3765f91b048ccbf570e5afe21b1bb1a41f
[2] https://github.com/gustafn/install-ns
[3] https://bitbucket.org/naviserver/naviserver/commits/caab40365f0429a44740db1927e9f459d733db3f

On 14.12.20 18:07, David Osborne wrote:
Hi,

We're building some Naviserver instances (4.99.19) on Debian Buster (v10.7). One of the instances is a revproxy instance which uses connchans to speak to a back end.

We're seeing very frequent signal 11 crashes of NaviServer with this combination. (We also see this infrequently with 4.99.18 running on Debian Stretch (v9))

Because of the increased frequency I've managed to take a core dump and the issue appears to be when calling SSL_CTX_new after Ns_TLS_CtxClientCreate.

I realise I don't have gdb properly configured, but wondering if the backtrace as it is could shed any light on what's going on or is it still too opaque?

Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/naviserver/bin/nsd -u nsd -g nsd -b 0.0.0.0:80,0.0.0.0:443 -i -t /etc/'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7f4405ddf700 (LWP 13613))]
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f4407936535 in __GI_abort () at abort.c:79
#2  0x00007f440847cfe6 in Panic (fmt=<optimized out>) at log.c:928
#3  0x00007f44080fbc4a in Tcl_PanicVA () from /lib/x86_64-linux-gnu/libtcl8.6.so
#4  0x00007f44080fbdb9 in Tcl_Panic () from /lib/x86_64-linux-gnu/libtcl8.6.so
#5  0x00007f44084bbc74 in Abort (signal=<optimized out>) at unix.c:1115
#6  <signal handler called>
#7  malloc_consolidate (av=av@entry=0x7f43bc000020) at malloc.c:4486
#8  0x00007f4407996a58 in _int_malloc (av=av@entry=0x7f43bc000020, bytes=bytes@entry=1024) at malloc.c:3695
#9  0x00007f440799856a in __GI___libc_malloc (bytes=1024) at malloc.c:3057
#10 0x00007f4407c63559 in CRYPTO_zalloc () from /lib/x86_64-linux-gnu/libcrypto.so.1.1
#11 0x00007f4407df7699 in SSL_CTX_new () from /lib/x86_64-linux-gnu/libssl.so.1.1
#12 0x00007f44084b4d85 in Ns_TLS_CtxClientCreate (interp=interp@entry=0x7f43bc009ee0, cert=cert@entry=0x0, caFile=caFile@entry=0x0, caPath=caPath@entry=0x0,
    verify=verify@entry=false, ctxPtr=ctxPtr@entry=0x7f4405dde7c0) at tls.c:116
#13 0x00007f44084687a4 in ConnChanOpenObjCmd (clientData=<optimized out>, interp=0x7f43bc009ee0, objc=<optimized out>, objv=<optimized out>)
    at connchan.c:1010
#14 0x00007f44084a7eb8 in Ns_SubcmdObjv (subcmdSpec=subcmdSpec@entry=0x7f4405dde990, clientData=0x7f43bc047870, interp=0x7f43bc009ee0, objc=13,
    objv=0x7f43bc017ff8) at tclobjv.c:1849
#15 0x00007f4408469d45 in NsTclConnChanObjCmd (clientData=<optimized out>, interp=<optimized out>, objc=<optimized out>, objv=<optimized out>)
    at connchan.c:1761
#16 0x00007f440802ffb7 in TclNRRunCallbacks () from /lib/x86_64-linux-gnu/libtcl8.6.so
#17 0x00007f44080313af in ?? () from /lib/x86_64-linux-gnu/libtcl8.6.so
#18 0x00007f4408030d13 in Tcl_EvalEx () from /lib/x86_64-linux-gnu/libtcl8.6.so
#19 0x00007f44084a9164 in NsTclFilterProc (arg=0x55af6a3e9880, conn=0x55af6a502480, why=NS_FILTER_PRE_AUTH) at tclrequest.c:535
#20 0x00007f4408478370 in NsRunFilters (conn=conn@entry=0x55af6a502480, why=why@entry=NS_FILTER_PRE_AUTH) at filter.c:160
#21 0x00007f440848654d in ConnRun (connPtr=connPtr@entry=0x55af6a502480) at queue.c:2450
#22 0x00007f4408485b33 in NsConnThread (arg=0x55af6a4a0090) at queue.c:2157
#23 0x00007f44081b2bb1 in NsThreadMain (arg=0x55af6a354f50) at thread.c:230
#24 0x00007f44081b3af9 in ThreadMain (arg=<optimized out>) at pthread.c:836
#25 0x00007f44078f5fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#26 0x00007f4407a0d4cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

--
Regards,
David
_______________________________________________
naviserver-devel mailing list
naviserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/naviserver-devel
