> On 25.09.2019 at 15:30, Yann Ylavic <ylavic....@gmail.com> wrote:
>
> Oh, actually the stacktrace shows openssl, which cleans itself up on
> exit(), i.e. via an atexit() callback or the like (which is preserved
> across fork() too).
> To avoid this, we may want to use OPENSSL_INIT_NO_ATEXIT at
> OPENSSL_init_crypto() time and call OPENSSL_cleanup() explicitly when
> needed.
I am not sure this will address the issue. If the watchdog thread is still
running, any teardown of OpenSSL comes too early.
Does mod_watchdog perhaps need a child pool cleanup that waits for its workers
to shut down?
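
The ordering such a cleanup would need can be sketched with plain pthreads.
This is only an illustration under stated assumptions: every demo_* name is
made up, it uses neither APR nor mod_watchdog's real API, and the "library
teardown" is a stand-in flag rather than an actual OPENSSL_cleanup() call.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a watchdog worker and the library it uses. */
typedef struct {
    pthread_t   thread;
    atomic_bool stop;                /* set by the cleanup to request shutdown */
    atomic_bool lib_alive;           /* stands in for "OpenSSL still usable" */
    atomic_int  used_after_teardown; /* counts uses of the library after teardown */
} demo_watchdog;

/* Worker loop: keeps "using the library" until asked to stop.
 * A real worker would block or sleep between iterations instead of spinning. */
static void *demo_worker(void *arg)
{
    demo_watchdog *wd = arg;
    while (!atomic_load(&wd->stop)) {
        if (!atomic_load(&wd->lib_alive))
            atomic_fetch_add(&wd->used_after_teardown, 1);
    }
    return NULL;
}

/* Initialize the flags and start the worker thread. Returns 0 on success. */
static int demo_watchdog_start(demo_watchdog *wd)
{
    atomic_init(&wd->stop, false);
    atomic_init(&wd->lib_alive, true);
    atomic_init(&wd->used_after_teardown, 0);
    return pthread_create(&wd->thread, NULL, demo_worker, wd);
}

/* Cleanup: request stop, wait for the worker, and only then tear down.
 * Returns 0 on success, or the pthread_join() error code. */
static int demo_watchdog_cleanup(demo_watchdog *wd)
{
    int rc;

    atomic_store(&wd->stop, true);
    rc = pthread_join(wd->thread, NULL);
    if (rc != 0)
        return rc;
    /* Assumption: an explicit library teardown would go here, after the join. */
    atomic_store(&wd->lib_alive, false);
    return 0;
}
```

Calling demo_watchdog_start() and then demo_watchdog_cleanup() leaves
used_after_teardown at zero, because the join completes before the teardown
step runs. Running the teardown before the join is the race in the trace below.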
> On Wed, Sep 25, 2019 at 3:07 PM Stefan Eissing
> <stefan.eiss...@greenbytes.de> wrote:
>>
>> Hmm, far less likely, but still:
>>
>> Crashed Thread: 0 Dispatch queue: com.apple.main-thread
>>
>> Exception Type: EXC_CRASH (SIGSEGV)
>> Exception Codes: 0x0000000000000000, 0x0000000000000000
>> Exception Note: EXC_CORPSE_NOTIFY
>>
>> Termination Signal: Segmentation fault: 11
>> Termination Reason: Namespace SIGNAL, Code 0xb
>> Terminating Process: httpd [6106]
>>
>> Application Specific Information:
>> crashed on child side of fork pre-exec
>>
>> Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
>> 0 libsystem_malloc.dylib 0x00007fff6b2e1c8c free_tiny + 243
>> 1 libcrypto.1.1.dylib 0x0000000101363484 OPENSSL_LH_delete + 228
>> 2 libcrypto.1.1.dylib 0x00000001013730f9 OBJ_NAME_remove + 105
>> 3 libcrypto.1.1.dylib 0x000000010136368a OPENSSL_LH_doall + 74
>> 4 libcrypto.1.1.dylib 0x0000000101373338 OBJ_NAME_cleanup + 72
>> 5 libcrypto.1.1.dylib 0x00000001013577de evp_cleanup_int + 14
>> 6 libcrypto.1.1.dylib 0x0000000101360ccf OPENSSL_cleanup + 335
>> 7 libsystem_c.dylib 0x00007fff6b1da3d6 __cxa_finalize_ranges + 326
>> 8 libsystem_c.dylib 0x00007fff6b1da6b3 exit + 55
>> 9 mod_mpm_event.so 0x0000000101531546 clean_child_exit + 54 (event.c:768)
>> 10 mod_mpm_event.so 0x0000000101531382 child_main + 1698 (event.c:2551)
>> 11 mod_mpm_event.so 0x0000000101530c84 make_child + 436
>> 12 mod_mpm_event.so 0x000000010152f7e5 event_run + 1093 (event.c:3256)
>> 13 httpd 0x0000000100f0366b ap_run_mpm + 75 (mpm_common.c:101)
>> 14 httpd 0x0000000100ef4679 main + 2233 (main.c:848)
>> 15 libdyld.dylib 0x00007fff6b1343d5 start + 1
>>
>> Thread 1:
>> 0 libsystem_pthread.dylib 0x00007fff6b327e02 pthread_rwlock_wrlock + 0
>> 1 libcrypto.1.1.dylib 0x00000001013be209 CRYPTO_THREAD_write_lock + 9
>> 2 libcrypto.1.1.dylib 0x000000010138f3a6 RAND_get_rand_method + 54
>> 3 libcrypto.1.1.dylib 0x000000010138f675 RAND_priv_bytes + 21
>> 4 libcrypto.1.1.dylib 0x00000001012b4bb2 bnrand + 178
>> 5 libcrypto.1.1.dylib 0x00000001012b2f49 BN_generate_prime_ex + 665
>> 6 libcrypto.1.1.dylib 0x00000001013979b7 RSA_generate_multi_prime_key + 1783
>> 7 libcrypto.1.1.dylib 0x000000010139c2d0 pkey_rsa_keygen + 160
>> 8 libcrypto.1.1.dylib 0x000000010135bc5b EVP_PKEY_keygen + 91
>> 9 mod_md.so 0x0000000101552bb4 gen_rsa + 132 (md_crypt.c:464)
>> 10 mod_md.so 0x000000010154a797 md_acme_acct_register + 887 (md_acme_acct.c:591)
>> 11 mod_md.so 0x000000010154c83c md_acme_drive_set_acct + 1020 (md_acme_drive.c:158)
>> 12 mod_md.so 0x000000010154fd61 md_acmev2_drive_renew + 81 (md_acmev2_drive.c:102)
>> 13 mod_md.so 0x000000010154da40 acme_driver_renew + 1520
>> 14 mod_md.so 0x000000010155f196 run_renew + 262 (md_reg.c:1066)
>> 15 mod_md.so 0x0000000101564179 md_util_pool_vdo + 185 (md_util.c:54)
>> 16 mod_md.so 0x000000010155f08a md_reg_renew + 42 (md_reg.c:1075)
>> 17 mod_md.so 0x00000001015423dc run_watchdog + 668 (mod_md_drive.c:127)
>> 18 mod_watchdog.so 0x00000001014fd4fc wd_worker + 636 (mod_watchdog.c:202)
>> 19 libsystem_pthread.dylib 0x00007fff6b3282eb _pthread_body + 126
>> 20 libsystem_pthread.dylib 0x00007fff6b32b249 _pthread_start + 66
>> 21 libsystem_pthread.dylib 0x00007fff6b32740d thread_start + 13
>>
>>
>>> On 25.09.2019 at 14:46, Stefan Eissing
>>> <stefan.eiss...@greenbytes.de> wrote:
>>>
>>> The patch looks nice. Running it in my test suite over and over without any
>>> crash showing up!
>>>
>>> Great work!
>>>
>>>> On 25.09.2019 at 13:24, Yann Ylavic <ylavic....@gmail.com> wrote:
>>>>
>>>> On Wed, Sep 25, 2019 at 11:54 AM Stefan Eissing
>>>> <stefan.eiss...@greenbytes.de> wrote:
>>>>>
>>>>> Looking for help on an issue in mod_watchdog use and child exits.
>>>>>
>>>>> Occasionally an httpd child crashes due to a race: the child pool is
>>>>> destroyed while watchdog threads are still running. The crash most
>>>>> likely manifests when OPENSSL_cleanup runs while another thread is
>>>>> generating a private key (since that takes relatively long).
>>>>>
>>>>> Besides the crash report, nothing else really fails, since the child is
>>>>> terminating anyway and no requests are ongoing. But still, it's not a
>>>>> nice thing.
>>>>>
>>>>> Do you see an easy way to avoid this?
>>>>
>>>> AFAICT, clean_child_exit() destroys pchild only, and pconf is still
>>>> alive at exit() time.
>>>> Couldn't we use pconf only in child_init hooks of mod_ssl and mod_watchdog?
>>>> Does something like the attached patch fix the crash?
>>>> <some_pchild_to_pconf.diff>
>>>
>>