Hi all!
We've started to see spurious crashes (SIGBUS according to the core
dumps) with httpd 2.4.25, mpm_event and ssl on Ubuntu 14.04 LTS. They
are not frequent, but they do happen nonetheless.
This might be related to processes being cleaned up after hitting
MaxSpareThreads or MaxConnectionsPerChild; both are tuned so that this
does not happen often. It's just a wild guess, but I suspect this
because the weird-looking stack traces point towards use-after-free
issues...
The latest one (cut down for readability):
Program terminated with signal SIGBUS, Bus error.
#0 0x00007f50819eeee0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
Thread 1 (Thread 0x7f507c440700 (LWP 693025)):
#0 0x00007f50819eeee0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#1 0x00007f507f027524 in ?? () from /lib/x86_64-linux-gnu/libssl.so.1.0.0
No symbol table info available.
#2 0x00007f4f00000010 in ?? ()
No symbol table info available.
#3 0x00007f4ffc02a6c8 in ?? ()
No symbol table info available.
#4 0x00007f4fc401da70 in ?? ()
No symbol table info available.
#5 0x0000000000400000 in ?? ()
No symbol table info available.
#6 0x0000000000000000 in ?? ()
No symbol table info available.
In this particular dump I also see other threads that look weird, for
example this snippet:
#14 0x00007f507f27b456 in ssl_filter_write (f=0x7f50580130f8, f=0x7f50580130f8,
len=<optimized out>, data=<optimized out>) at ssl_engine_io.c:793
filter_ctx = 0x7f50580a7ae8
outctx = 0x7f507014d988
res = <optimized out>
#15 ssl_io_filter_output (f=0x7f50580130f8, bb=0x7f507f027693) at
ssl_engine_io.c:1746
data = 0x7f50763be000 <error: Cannot access memory at address
0x7f50763be000>
If I remember correctly, "Cannot access memory" is a bad sign: gdb
could not read the memory that data points to.
An earlier occurrence (excerpt):
Program terminated with signal SIGBUS, Bus error.
#0 0x00007f50819eeee0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
Thread 1 (Thread 0x7f50774e9700 (LWP 20794)):
#0 0x00007f50819eeee0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#1 0x00007f507f027524 in ?? () from /lib/x86_64-linux-gnu/libssl.so.1.0.0
No symbol table info available.
#2 0x00007f507f027693 in ?? () from /lib/x86_64-linux-gnu/libssl.so.1.0.0
No symbol table info available.
#3 0x00007f507f27b456 in ssl_filter_write (f=0x7f507013cfe0, f=0x7f507013cfe0,
len=<optimized out>, data=<optimized out>) at ssl_engine_io.c:793
filter_ctx = 0x7f507013cf88
outctx = 0x7f507013d008
res = <optimized out>
#4 ssl_io_filter_output (f=0x7f507013cfe0, bb=0x7f4f840be168) at
ssl_engine_io.c:1746
data = 0x7f5075518000 <error: Cannot access memory at address
0x7f5075518000>
len = 4194304
bucket = 0x7f4f840b1ba8
status = <optimized out>
filter_ctx = 0x7f507013cf88
inctx = <optimized out>
outctx = 0x7f507013d008
rblock = APR_NONBLOCK_READ
#5 0x00007f507f27879a in ssl_io_filter_coalesce (f=0x7f507013cfb8,
bb=0x7f4f840be168) at ssl_engine_io.c:1663
e = <optimized out>
upto = <optimized out>
bytes = <optimized out>
ctx = <optimized out>
count = <optimized out>
...
Here the thread that caused the dump itself shows the ominous "Cannot
access memory".
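In case it helps with comparing dumps, here is a minimal sketch of a
script that lists every frame in which gdb reports "Cannot access
memory". The function name and the regular expressions are illustrative
only, and it assumes gdb output shaped like the excerpts above (a
"Thread N (...)" header, "#N" frame lines, then "name = value" locals).

```python
import re

# Patterns for the three kinds of lines we care about in the gdb output.
THREAD_RE = re.compile(r"^Thread (\d+) \(")
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?(\S+)")
LOCAL_RE = re.compile(r"^\s*(\w+) = ")


def find_unreadable_locals(bt_text):
    """Return (thread, frame, function, variable) tuples for every line
    where gdb printed 'Cannot access memory'."""
    thread = frame = func = None
    hits = []
    for line in bt_text.splitlines():
        m = THREAD_RE.match(line)
        if m:
            # New thread header: reset the frame context.
            thread = int(m.group(1))
            frame = func = None
            continue
        m = FRAME_RE.match(line)
        if m:
            # Frame line, with or without a leading program counter.
            frame, func = int(m.group(1)), m.group(2)
            continue
        if "Cannot access memory" in line:
            m = LOCAL_RE.match(line)
            name = m.group(1) if m else None
            hits.append((thread, frame, func, name))
    return hits
```

Reading a saved backtrace into a string and passing it to
find_unreadable_locals() gives one tuple per unreadable variable, which
makes it easier to see whether the same frames show up across dumps.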
Are we hitting a corner case of process cleanup that plays merry hell
with https/ssl, or are we just having bad luck? Ideas? Suggestions?
/Nikke
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se | ni...@acc.umu.se
---------------------------------------------------------------------------
"Oh, excuse me, but my Vogon space cruiser is here. Bye!"
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=