Oh, actually the stack trace shows OpenSSL cleaning up after itself on exit(), i.e. via an atexit() callback or the like (which is preserved across fork() too). To avoid this, we may want to pass OPENSSL_INIT_NO_ATEXIT at OPENSSL_init_crypto()/OPENSSL_init_ssl() time and call OPENSSL_cleanup() explicitly when needed.
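To illustrate the mechanism (this is not a patch for httpd), here is a minimal C sketch. It assumes a POSIX system, and lib_init(), lib_cleanup() and cleanup_runs_in_child() are hypothetical stand-ins invented for this example; they are not OpenSSL or httpd functions. The no_atexit argument only mimics what OPENSSL_INIT_NO_ATEXIT is meant to do.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for library state; NOT part of OpenSSL. */
static int report_fd = -1; /* pipe write end used to report back */

/* Stand-in for OPENSSL_cleanup(): reports that it ran in this process. */
static void lib_cleanup(void)
{
    if (report_fd >= 0 && write(report_fd, "C", 1) != 1) {
        /* nothing useful to do about a failed write at exit time */
    }
}

/* Stand-in for library init; no_atexit mimics OPENSSL_INIT_NO_ATEXIT. */
static void lib_init(int no_atexit)
{
    if (!no_atexit)
        atexit(lib_cleanup);
}

/*
 * Spawn a "server" process that initializes the library and then forks a
 * worker which leaves via exit(), as clean_child_exit() does.
 * Returns 1 if the cleanup handler ran in the worker, 0 if not, -1 on error.
 */
int cleanup_runs_in_child(int no_atexit)
{
    int fds[2], status = 0;
    char byte;
    ssize_t n;
    pid_t server;

    if (pipe(fds) != 0)
        return -1;
    server = fork();
    if (server < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (server == 0) {
        pid_t worker;

        close(fds[0]);
        report_fd = fds[1];
        lib_init(no_atexit);      /* registered BEFORE the fork below */
        worker = fork();
        if (worker == 0)
            exit(0);              /* runs any inherited atexit handlers */
        if (worker > 0)
            waitpid(worker, NULL, 0);
        _exit(worker > 0 ? 0 : 1); /* skip handlers: only the worker reports */
    }

    close(fds[1]);
    n = read(fds[0], &byte, 1);   /* 0 means EOF: nobody wrote */
    close(fds[0]);
    if (waitpid(server, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return n == 1 ? 1 : (n == 0 ? 0 : -1);
}
```

With the handler registered, cleanup_runs_in_child(0) should return 1: the forked worker inherits the atexit() registration and runs the cleanup on exit(). With cleanup_runs_in_child(1) it should return 0, and the owning process would then call the cleanup itself at a point where no other thread is still using the library.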
On Wed, Sep 25, 2019 at 3:07 PM Stefan Eissing <stefan.eiss...@greenbytes.de> wrote:
>
> Hmm, far less likely, but still:
>
> Crashed Thread:        0  Dispatch queue: com.apple.main-thread
>
> Exception Type:        EXC_CRASH (SIGSEGV)
> Exception Codes:       0x0000000000000000, 0x0000000000000000
> Exception Note:        EXC_CORPSE_NOTIFY
>
> Termination Signal:    Segmentation fault: 11
> Termination Reason:    Namespace SIGNAL, Code 0xb
> Terminating Process:   httpd [6106]
>
> Application Specific Information:
> crashed on child side of fork pre-exec
>
> Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
> 0   libsystem_malloc.dylib   0x00007fff6b2e1c8c free_tiny + 243
> 1   libcrypto.1.1.dylib      0x0000000101363484 OPENSSL_LH_delete + 228
> 2   libcrypto.1.1.dylib      0x00000001013730f9 OBJ_NAME_remove + 105
> 3   libcrypto.1.1.dylib      0x000000010136368a OPENSSL_LH_doall + 74
> 4   libcrypto.1.1.dylib      0x0000000101373338 OBJ_NAME_cleanup + 72
> 5   libcrypto.1.1.dylib      0x00000001013577de evp_cleanup_int + 14
> 6   libcrypto.1.1.dylib      0x0000000101360ccf OPENSSL_cleanup + 335
> 7   libsystem_c.dylib        0x00007fff6b1da3d6 __cxa_finalize_ranges + 326
> 8   libsystem_c.dylib        0x00007fff6b1da6b3 exit + 55
> 9   mod_mpm_event.so         0x0000000101531546 clean_child_exit + 54 (event.c:768)
> 10  mod_mpm_event.so         0x0000000101531382 child_main + 1698 (event.c:2551)
> 11  mod_mpm_event.so         0x0000000101530c84 make_child + 436
> 12  mod_mpm_event.so         0x000000010152f7e5 event_run + 1093 (event.c:3256)
> 13  httpd                    0x0000000100f0366b ap_run_mpm + 75 (mpm_common.c:101)
> 14  httpd                    0x0000000100ef4679 main + 2233 (main.c:848)
> 15  libdyld.dylib            0x00007fff6b1343d5 start + 1
>
> Thread 1:
> 0   libsystem_pthread.dylib  0x00007fff6b327e02 pthread_rwlock_wrlock + 0
> 1   libcrypto.1.1.dylib      0x00000001013be209 CRYPTO_THREAD_write_lock + 9
> 2   libcrypto.1.1.dylib      0x000000010138f3a6 RAND_get_rand_method + 54
> 3   libcrypto.1.1.dylib      0x000000010138f675 RAND_priv_bytes + 21
> 4   libcrypto.1.1.dylib      0x00000001012b4bb2 bnrand + 178
> 5   libcrypto.1.1.dylib      0x00000001012b2f49 BN_generate_prime_ex + 665
> 6   libcrypto.1.1.dylib      0x00000001013979b7 RSA_generate_multi_prime_key + 1783
> 7   libcrypto.1.1.dylib      0x000000010139c2d0 pkey_rsa_keygen + 160
> 8   libcrypto.1.1.dylib      0x000000010135bc5b EVP_PKEY_keygen + 91
> 9   mod_md.so                0x0000000101552bb4 gen_rsa + 132 (md_crypt.c:464)
> 10  mod_md.so                0x000000010154a797 md_acme_acct_register + 887 (md_acme_acct.c:591)
> 11  mod_md.so                0x000000010154c83c md_acme_drive_set_acct + 1020 (md_acme_drive.c:158)
> 12  mod_md.so                0x000000010154fd61 md_acmev2_drive_renew + 81 (md_acmev2_drive.c:102)
> 13  mod_md.so                0x000000010154da40 acme_driver_renew + 1520
> 14  mod_md.so                0x000000010155f196 run_renew + 262 (md_reg.c:1066)
> 15  mod_md.so                0x0000000101564179 md_util_pool_vdo + 185 (md_util.c:54)
> 16  mod_md.so                0x000000010155f08a md_reg_renew + 42 (md_reg.c:1075)
> 17  mod_md.so                0x00000001015423dc run_watchdog + 668 (mod_md_drive.c:127)
> 18  mod_watchdog.so          0x00000001014fd4fc wd_worker + 636 (mod_watchdog.c:202)
> 19  libsystem_pthread.dylib  0x00007fff6b3282eb _pthread_body + 126
> 20  libsystem_pthread.dylib  0x00007fff6b32b249 _pthread_start + 66
> 21  libsystem_pthread.dylib  0x00007fff6b32740d thread_start + 13
>
>
> > On 25.09.2019 at 14:46, Stefan Eissing
> > <stefan.eiss...@greenbytes.de> wrote:
> >
> > The patch looks nice. Running it in my test suite over and over without any
> > crash showing up!
> >
> > Great work!
> >
> >> On 25.09.2019 at 13:24, Yann Ylavic <ylavic....@gmail.com> wrote:
> >>
> >> On Wed, Sep 25, 2019 at 11:54 AM Stefan Eissing
> >> <stefan.eiss...@greenbytes.de> wrote:
> >>>
> >>> Looking for help on an issue in mod_watchdog use and child exits.
> >>>
> >>> Occasionally a httpd child crashes due to races between the child pool
> >>> being destroyed while watchdog threads are still running. The crash
> >>> manifests most likely when OPENSSL_cleanup runs while another thread is
> >>> generating a private key (since that takes relatively long).
> >>>
> >>> Besides the crash report, nothing else really fails since the child is
> >>> terminating anyway and no requests are ongoing. But still, it's not a
> >>> nice thing.
> >>>
> >>> Do you see an easy way to avoid this?
> >>
> >> AFAICT, clean_child_exit() destroys pchild only, and pconf is still
> >> alive at exit() time.
> >> Couldn't we use pconf only in child_init hooks of mod_ssl and mod_watchdog?
> >> Does something like the attached patch fix the crash?
> >> <some_pchild_to_pconf.diff>
> >
>