Hi, the issue was discovered as part of tracing down a deadlock condition in an SVN test [1].
As far as I understand it, the problem lies in the C-standard not explicitly defining when a function registered with atexit() is called in the context of thread termination [2]. The case is reproducible with the following code inside a DLL and executing that from a calling process (a complete repro project is available on my issue tracker [1]): atexit(apr_terminate); apr_initialize(); apr_pool_t *pool; apr_pool_create_ex(&pool, NULL, NULL, NULL); apr_thread_pool_t *thread_pool; apr_thread_pool_create(&thread_pool, 0, 16, pool); apr_thread_pool_idle_wait_set(thread_pool, 1000 * 1000); for (int i = 0; i < 3; ++i) { apr_thread_pool_push(thread_pool, foo, nullptr, 0, NULL); } In this particular case we create a couple of threads and before the threads terminate the calling process terminates. At this point apr_terminate() is called which got registered via the atexit()-call (as suggested in the apr_terminate-documentation). When that call frees up the threadpool, it gets stuck in thread_pool_cleanup() waiting for the thd_cnt to become 0 (which it never will). I believe the attached patch (done against the apr-util 1.5 branch) should solve the underlying problem. It determines the number of failed thread joins (i.e. those which were either detached or did not exit correctly) and reduces the thd_cnt variable by that amount. The test case I used on Windows is fixed with that patch and I'll run the full apr-test suite later today (if I don't get back to this mail it means the test suite passed for me with that patch). I also reviewed the implementation of apr_thread_join() on other platforms and could not spot any immediate problems the patch would cause on these platforms. However, since I'm quite new to the APR development and also am not have only marginal knowledge with platforms other than Windows, I'd appreciate if someone knowing the details about APR's thread design and who is more familiar with other platforms could review the patch/approach before I commit the patch. Regards, Stefan [1] http://www.luke1410.de:8090/browse/MAXSVN-94 [2] https://stackoverflow.com/questions/39655868/what-does-the-posix-standard-say-about-thread-stacks-in-atexit-handlers-what
Index: misc/apr_thread_pool.c =================================================================== --- misc/apr_thread_pool.c (revision 1749154) +++ misc/apr_thread_pool.c (working copy) @@ -838,10 +838,13 @@ apr_thread_mutex_unlock(me->lock); n_dbg = 0; + apr_size_t numincompletejoins = 0; if (NULL != (head = elt)) { while (elt) { tail = elt; - apr_thread_join(&rv, elt->thd); + if (APR_SUCCESS != apr_thread_join(&rv, elt->thd)) { + ++numincompletejoins; + } elt = APR_RING_NEXT(elt, link); ++n_dbg; } @@ -848,6 +851,8 @@ apr_thread_mutex_lock(me->lock); APR_RING_SPLICE_TAIL(me->recycled_thds, head, tail, apr_thread_list_elt, link); + assert(me->thd_cnt >= numincompletejoins); + me->thd_cnt -= numincompletejoins; apr_thread_mutex_unlock(me->lock); } assert(cnt == n_dbg);
smime.p7s
Description: S/MIME Cryptographic Signature