On 17.07.2019 at 10:03, Ruediger Pluem wrote:


On 07/16/2019 11:28 PM, Rainer Jung wrote:
cross-posted to APR+HTTPD

The crash happens here:

#2  0x00007faf4c154945 in raise () from /lib64/libc.so.6
#3  0x00007faf4c155f21 in abort () from /lib64/libc.so.6
#4  0x00007faf4c14d810 in __assert_fail () from /lib64/libc.so.6
#5  0x00007faf4c694219 in __pthread_tpp_change_priority () from /lib64/libpthread.so.0
#6  0x00007faf4c68cd76 in __pthread_mutex_lock_full () from /lib64/libpthread.so.0
#7  0x00007faf4cd07c29 in apr_thread_mutex_lock (mutex=0x2261fe0) at locks/unix/thread_mutex.c:108
#8  0x00007faf4cd08603 in apr_pool_walk_tree (pool=0x225a710, fn=0x7faf4cd07fc0 <pool_num_bytes>, data=0x7faf45777c90) at memory/unix/apr_pools.c:1515
#9  0x00007faf4cd08630 in apr_pool_walk_tree (pool=0x6a3ce0, fn=0x7faf4cd07fc0 <pool_num_bytes>, data=0x7faf45777c90) at memory/unix/apr_pools.c:1521
#10 0x00007faf4cd08630 in apr_pool_walk_tree (pool=0x6a3770, fn=0x7faf4cd07fc0 <pool_num_bytes>, data=0x7faf45777c90) at memory/unix/apr_pools.c:1521
#11 0x00007faf4cd08630 in apr_pool_walk_tree (pool=0x6a3110, fn=0x7faf4cd07fc0 <pool_num_bytes>, data=0x7faf45777c90) at memory/unix/apr_pools.c:1521
#12 0x00007faf4cd086df in apr_pool_num_bytes (pool=0x6d81, recurse=<value optimized out>) at memory/unix/apr_pools.c:2304
#13 0x00007faf4cd0898f in apr_pool_log_event (pool=0x225a710, event=0x7faf4cd16e74 "PCALLOC", file_line=0x7faf4cd16d78 "locks/unix/thread_mutex.c:50", deref=-1) at memory/unix/apr_pools.c:1543
#14 0x00007faf4cd098b8 in apr_pcalloc_debug (pool=0x225a710, size=64, file_line=0x7faf4cd16d78 "locks/unix/thread_mutex.c:50") at memory/unix/apr_pools.c:1814
#15 0x00007faf4cd07ce5 in apr_thread_mutex_create (mutex=0x225a798, flags=1, pool=0x225a710) at locks/unix/thread_mutex.c:50
#16 0x00007faf4cd0a164 in apr_pool_clear_debug (pool=0x225a710, file_line=0x488f09 "mpm_fdqueue.c:236") at memory/unix/apr_pools.c:1911
#17 0x000000000046c455 in ap_queue_info_push_pool (queue_info=0x22648b0, pool_to_recycle=0x225a710) at mpm_fdqueue.c:236
#18 0x00007faf4bf18821 in process_lingering_close (cs=0x78d670) at event.c:1457
#19 0x00007faf4bf196a8 in worker_thread (thd=0x6cae80, dummy=<value optimized out>) at event.c:2083
#20 0x00007faf4c68b5f0 in start_thread () from /lib64/libpthread.so.0
#21 0x00007faf4c1f684d in clone () from /lib64/libc.so.6

So it seems a mutex gets created, which allocates memory, which in turn triggers debug logging, which walks the pools and finally tries to lock a mutex that has not yet been (re)initialized.
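
To make the suspected pattern easier to see in isolation, here is a minimal, self-contained sketch. This is not APR code; all names in it (toy_pool, walk_pools, alloc_with_logging) are made up for illustration, and the walk only detects the poisoned lock instead of locking it, to keep the example well-defined:

/*
 * Sketch of the pattern described above: an allocation hook walks every
 * pool and takes its lock, but one pool's lock has just been destroyed
 * and its memory poisoned, while the pool still points at it.
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define POISON 'A'                /* same poison byte the pool debug code uses */

struct toy_pool {
    const char *tag;
    pthread_mutex_t *mutex;       /* may point at destroyed, poisoned memory */
};

/* stands in for apr_pool_walk_tree(): visit every pool and take its lock */
static void walk_pools(struct toy_pool **pools, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        unsigned char first_byte;
        memcpy(&first_byte, pools[i]->mutex, 1);
        if (first_byte == POISON) {
            printf("pool '%s': mutex memory is poisoned, locking it would abort\n",
                   pools[i]->tag);
            continue;
        }
        pthread_mutex_lock(pools[i]->mutex);
        pthread_mutex_unlock(pools[i]->mutex);
    }
}

/* stands in for apr_pcalloc_debug() with verbose logging enabled:
 * every allocation logs, and logging walks (and locks) all pools */
static void *alloc_with_logging(struct toy_pool **pools, size_t n, size_t size)
{
    walk_pools(pools, n);
    return calloc(1, size);
}

int main(void)
{
    pthread_mutex_t *old = malloc(sizeof(*old));
    pthread_mutex_init(old, NULL);

    struct toy_pool transaction = { "transaction", old };
    struct toy_pool *all_pools[] = { &transaction };

    /* "clear" the pool: the cleanup destroys its mutex and the debug code
     * poisons the memory, but transaction.mutex still points at it */
    pthread_mutex_destroy(old);
    memset(old, POISON, sizeof(*old));

    /* "recreate" the mutex: the allocation itself triggers the pool walk
     * while transaction.mutex is still the stale, poisoned pointer */
    transaction.mutex = alloc_with_logging(all_pools, 1, sizeof(pthread_mutex_t));
    pthread_mutex_init(transaction.mutex, NULL);

    pthread_mutex_destroy(transaction.mutex);
    free(transaction.mutex);
    free(old);
    return 0;
}

In the real backtrace the walk does call pthread_mutex_lock() on the poisoned memory, which is where the __pthread_tpp_change_priority assertion in glibc fires.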

Anyone aware of that? Any ideas how to fix?

This is strange. Before apr_thread_mutex_create is called by apr_pool_clear_debug, pool->mutex is set to NULL. So IMHO mutex in frame #7 should be NULL.
Which version of APR are you using?

1.7 with a few debug patches, which should really not make a difference here (but might offset the line numbers a bit). 1.7.0, 1.7.x, 1.6.5, and 1.6.x do not differ in apr_pools.c.

The code that jumps into the mutex creation is:

1877 APR_DECLARE(void) apr_pool_clear_debug(apr_pool_t *pool,
1878                                        const char *file_line)
1879 {
...
1902
1903     pool_clear_debug(pool, file_line);
1904
1905 #if APR_HAS_THREADS
1906     /* If we had our own mutex, it will have been destroyed by
1907      * the registered cleanups.  Recreate the mutex.  Unlock
1908      * the mutex we obtained above.
1909      */
1910     if (mutex != pool->mutex) {
1911         (void)apr_thread_mutex_create(&pool->mutex,
1912                                       APR_THREAD_MUTEX_NESTED, pool);
1913
1914         if (mutex != NULL)
1915             (void)apr_thread_mutex_unlock(mutex);
1916     }
1917 #endif /* APR_HAS_THREADS */
1918 }

and in pool_clear_debug I see the pool-allocated bytes being overwritten with 'A' to poison the data, but I don't see where pool->mutex is set to NULL.

Here is a gdb inspection of pool, pool->parent (pconf), pool->mutex, and pool->parent->mutex. IMHO one can see that pool->mutex has indeed been overwritten with the poison byte 'A'.

(gdb) print *pool
$1 = {parent = 0x6a3ce0, child = 0x0, sibling = 0x7729b0, ref = 0x6a3ce8, cleanups = 0x0, free_cleanups = 0x0, allocator = 0x784a20, subprocesses = 0x0, abort_fn = 0x437920 <abort_on_oom>, user_data = 0x0, tag = 0x7faf4bf1d62a "transaction", joined = 0x0, nodes = 0x9335b0, file_line = 0x7faf4bf1d61d "event.c:1829", creation_flags = 0, stat_alloc = 1, stat_total_alloc = 45, stat_clear = 1, owner = 140390560073488, mutex = 0x2261fe0, pre_cleanups = 0x0}

(gdb) print *pool->parent
$2 = {parent = 0x6a3770, child = 0x225a710, sibling = 0x0, ref = 0x6c2c70, cleanups = 0x0, free_cleanups = 0x0, allocator = 0x0, subprocesses = 0x0, abort_fn = 0x437920 <abort_on_oom>, user_data = 0x7911d0, tag = 0x47cb57 "pconf", joined = 0x0, nodes = 0x227eaf0, file_line = 0x47cb4c "main.c:296", creation_flags = 0, stat_alloc = 86248, stat_total_alloc = 172527, stat_clear = 1, owner = 140390893647616, mutex = 0x6a31b0, pre_cleanups = 0x0}

(gdb) print *pool->mutex
$3 = {pool = 0x71b8d0, mutex = {__data = {__lock = 1279770360, __count = 32687, __owner = 1094795585, __nusers = 1094795585, __kind = 1094795585, __spins = 1094795585, __list = {__prev = 0x4141414141414141, __next = 0x4141414141414141}}, __size = "\370\276GL\257\177\000\000", 'A' <repeats 32 times>, __align = 140390875774712},
  cond = 0x4141414141414141, locked = 1094795585, num_waiters = 1094795585}

Note that 'A' (the debug poison byte) is 0x41 in hex, and 1094795585 decimal is 0x41414141 in hex.
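
As a quick sanity check on those numbers (a standalone snippet, not part of the report): filling a buffer with the poison byte and reading it back at the widths gdb shows reproduces exactly the values above.

#include <stdio.h>
#include <string.h>
#include <inttypes.h>

int main(void)
{
    unsigned char buf[sizeof(uint64_t)];
    memset(buf, 'A', sizeof(buf));     /* poison fill, as in the pool debug code */

    uint32_t as_u32;
    uint64_t as_u64;
    memcpy(&as_u32, buf, sizeof(as_u32));
    memcpy(&as_u64, buf, sizeof(as_u64));

    /* prints: 'A' = 0x41, u32 = 1094795585 (0x41414141),
     * u64 = 0x4141414141414141 */
    printf("'A' = 0x%02x\n", (unsigned)'A');
    printf("u32 = %" PRIu32 " (0x%08" PRIx32 ")\n", as_u32, as_u32);
    printf("u64 = 0x%016" PRIx64 "\n", as_u64);
    return 0;
}

Every poisoned field in the dump decodes to one of those two values, e.g. __owner, __kind, and the __list pointers.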

(gdb) print *pool->parent->mutex
$4 = {pool = 0x6a3110, mutex = {__data = {__lock = 1, __count = 4, __owner = 28042, __nusers = 1, __kind = 1, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\001\000\000\000\004\000\000\000\212m\000\000\001\000\000\000\001", '\000' <repeats 22 times>, __align = 17179869185}, cond = 0x0, locked = 0, num_waiters = 0}

The original assertion text was something like "__pthread_tpp_change_priority assert ..." but I don't have the full assertion text any more. I could reproduce it if needed; it just takes a few minutes due to the slowness of pool debugging when running with full verbosity.

Regards,

Rainer
