Eric,

The global mutex only serializes concurrent calls to ap_proxy_initialize_worker(). The worker pool is also used when proxy_handler() is called from a thread kicked off by a _previous_ call to ap_proxy_initialize_worker(). Turning on the pool concurrency check shows this happening:

pool concurrency check: pool 0xa04348(proxy_worker_cp), thread cur 7f25a21fc700 in use by 7f2598bed700, state in use -> in use

172256 . Thawed libc:gsignal (+0x37) <--- abort
 1 libc:gsignal (+0x37)
 2 libc:abort (+0x143)
 3 libapr-1:\apr_pools\pool_concurrency_abort 768
 4 libapr-1:\apr_pools\apr_palloc 778 (+0xA)
 5 libapr-1:\thread_cond\apr_thread_cond_create 44
 6 libaprutil-1:\apr_reslist\apr_reslist_create 299 (+0x9)
 7 mod_proxy:\proxy_util\ap_proxy_initialize_worker 2013 (+0x2F)
 8 mod_proxy:\mod_proxy\proxy_handler 1176 (+0xE)
 9 httpd:UNKNOWN at 0x00000000004543A0

Here's the thread the diagnostic said had the pool 'in use' when apr_palloc() was called:

172271 Thawed libc:__GI_strncmp (+0x1680) <-- in use
 1 libc:__GI_strncmp (+0x1680)
 2 libc:getenv (+0xBD)
 3 libc:__nscd_getai (+0x3D3)
 4 libc:gaih_inet.constprop.8 (+0x15F2)
 5 libc:getaddrinfo (+0x11F)
 6 libapr-1:\sockaddr\call_resolver 397 (+0x10)
 7 mod_proxy:\proxy_util\ap_proxy_determine_connection 2506 (+0x15)
 8 mod_proxy_http:\mod_proxy_http\proxy_http_handler 1956 (+0x1D)
 9 mod_proxy:\mod_proxy\proxy_run_scheme_handler 3063 (+0x18)
10 mod_proxy:\mod_proxy\proxy_handler 1250 (+0x16)
11 httpd:UNKNOWN at 0x00000000004543A0

Another possible fix would be to use the global mutex everywhere it's needed and remove the call to create and use the worker thread lock altogether. That would mean changing the code that currently uses the worker proxy thread lock, though, and it seemed cleaner to change only proxy_util.c rather than two source modules.

The two functions in ap_proxy_acquire_connection() that I serialized were found from other runs. Both apr_reslist_acquire() and connection_constructor() end up using the same worker pool that is being used by ap_proxy_determine_connection(), e.g.

pool concurrency check: pool 0x1d93198(proxy_worker_cp), thread cur 7f019bfff700 in use by 7f019abfd700, state in use -> in use

aborting thread:
1 Thread 0x7f019bfff700 (LWP 106622) 0x00007f01a735c207 in raise () from /lib64/libc.so.6
 3 libapr-1:\apr_pools\pool_concurrency_abort 768
 4 libapr-1:\apr_pools\apr_palloc 778 (+0xA)
 5 libaprutil-1:\apr_reslist\get_container 100 (+0x8)
 6 libaprutil-1:\apr_reslist\apr_reslist_acquire 121 (+0x3)
 7 mod_proxy:\proxy_util\ap_proxy_acquire_connection 2327 (+0x3)
 8 mod_proxy_http:\mod_proxy_http\proxy_http_handler 1933 (+0x19)
 9 mod_proxy:\mod_proxy\proxy_run_scheme_handler 3063 (+0x18)
10 mod_proxy:\mod_proxy\proxy_handler 1250 (+0x16)
11 httpd:UNKNOWN at 0x00000000004543A0

And the "in-use" thread:
5 Thread 0x7f019abfd700 (LWP 106624) 0x00007f01a7424bed in connect () from /lib64/libc.so.6
 8 mod_proxy:\proxy_util\ap_proxy_determine_connection 2537 (+0x15)
 9 mod_proxy_http:\mod_proxy_http\proxy_http_handler 1956 (+0x1D)
10 mod_proxy:\mod_proxy\proxy_run_scheme_handler 3063 (+0x18)
11 mod_proxy:\mod_proxy\proxy_handler 1250 (+0x16)

Once the workers are up and running, they use thread-specific storage, but all the code that starts them up uses a pool whose use needs to be serialized. worker->cp, as used in ap_proxy_acquire_connection(), is currently not serialized when running with the worker MPM. It can run at the same time as ap_proxy_initialize_worker(), with the same pool used in both. The problem doesn't show up when running the same test with the prefork or event MPMs.

-- 
Don Poitras - Host R&D    SAS Institute Inc.          SAS Campus Drive
mailto:sas...@sas.com     (919)531-5637 Fax:677-4444  Cary, NC 27513