Update: disregard the man behind the curtain - for now. I am seeing the strangest effects on my main machine under native macOS 10.12.6, which do not happen on my Parallels Ubuntu image or on my laptop with macOS 10.13.0. I tested with hyperthreading disabled to exclude the Intel *lake bugs, but it made no difference.
I will upgrade the machine overnight to macOS 10.13.0 and check again tomorrow whether that had any effect.

Regardless of all this, I managed to find an edge-case assertion failure and made a fix on trunk. If we go for 2.4.29, I'll propose that for inclusion. It is not a regression from 2.4.27, so no veto from me on 2.4.28.

Cheers,

Stefan

> On 28.09.2017 at 09:54, Stefan Eissing <stefan.eiss...@greenbytes.de> wrote:
>
> Analyzing...
>
> My checkout of branch/2.4.x does not show this.
> My build of 2.4.28 with -deps as provided does, after a while of letting
>
> while true; do h2load -t 8 -c 100 -n 10000 -m 5 -d gen/data-10k http://test.example.org:12345/; done
>
> run (data-10k is a 10k text file). The whole server setup is according to the mod_http2 test cases
> in https://svn.apache.org/repos/asf/httpd/test/mod_h2/trunk
>
> In 2.4.28 I have apr 1.6.2, and in my branch/2.4.x there is a 1.5.x (which shows 1.5.3 in its header).
>
> Will check if that is connected or not. Will also check if this happens for me on my Ubuntu.
>
> Sporadically (less frequently), I can also see (not both 2.4.x and 2.4.28):
> Thread 3 Crashed:
> 0   libsystem_kernel.dylib   0x00007fffd3c34d42 __pthread_kill + 10
> 1   libsystem_pthread.dylib  0x00007fffd3d22457 pthread_kill + 90
> 2   libsystem_c.dylib        0x00007fffd3b9a4bb __abort + 140
> 3   libsystem_c.dylib        0x00007fffd3b9a42f abort + 144
> 4   libsystem_pthread.dylib  0x00007fffd3d23bc7 __pthread_abort + 49
> 5   libsystem_pthread.dylib  0x00007fffd3d23c7b __pthread_abort_reason + 180
> 6   libsystem_pthread.dylib  0x00007fffd3d1fd93 _pthread_mutex_unlock_drop + 167
> 7   mod_http2.so             0x000000010cb42b2b h2_beam_send + 1819 (h2_bucket_beam.c:965)
> 8   mod_http2.so             0x000000010cb59d31 send_out + 257 (h2_task.c:100)
> 9   mod_http2.so             0x000000010cb59146 h2_filter_slave_output + 294 (h2_task.c:176)
> 10  httpd                    0x000000010c598829 ap_process_request_after_handler + 89 (http_request.c:366)
> 11  httpd                    0x000000010c598ab6 ap_process_request + 22 (http_request.c:473)
> 12  mod_http2.so             0x000000010cb5874e h2_task_process_conn + 398 (h2_task.c:682)
> 13  httpd                    0x000000010c5705f7 ap_run_process_connection + 55 (connection.c:42)
> 14  mod_http2.so             0x000000010cb599db h2_task_do + 539 (h2_task.c:640)
> 15  mod_http2.so             0x000000010cb5df24 slot_run + 260 (h2_workers.c:233)
> 16  libsystem_pthread.dylib  0x00007fffd3d1f93b _pthread_body + 180
> 17  libsystem_pthread.dylib  0x00007fffd3d1f887 _pthread_start + 286
> 18  libsystem_pthread.dylib  0x00007fffd3d1f08d thread_start + 13
>
> which is a call to apr_thread_mutex_unlock(). Without having checked into the pthread version, the closest
> similar report I could find on the net is
> https://sourceware.org/bugzilla/show_bug.cgi?id=17514
>
> Will report when I know more.
>
> -Stefan
>
>> On 27.09.2017 at 18:04, Luca Toscano <toscano.l...@gmail.com> wrote:
>>
>> Hi Stefan,
>>
>> 2017-09-27 17:32 GMT+02:00 Stefan Eissing <stefan.eiss...@greenbytes.de>:
>> On my h2 load tests, the server sometimes crashes in an assertion failure (built in maintainer mode):
>>
>> How did you make the tests? It would be good for people to attempt to reproduce.
>>
>>
>> macOS crash reporter:
>> Thread 54 Crashed:
>> 0   libsystem_kernel.dylib   0x00007fffd3c34d42 __pthread_kill + 10
>> 1   libsystem_pthread.dylib  0x00007fffd3d22457 pthread_kill + 90
>> 2   libsystem_c.dylib        0x00007fffd3b9a4bb __abort + 140
>> 3   libsystem_c.dylib        0x00007fffd3b9a42f abort + 144
>> 4   httpd                    0x000000010be99282 ap_log_assert + 130 (log.c:1696)
>> 5   mod_mpm_event.so         0x000000010c50b51c ap_queue_push + 188
>> 6   mod_mpm_event.so         0x000000010c509be2 listener_thread + 2226 (event.c:1749)
>> 7   libsystem_pthread.dylib  0x00007fffd3d1f93b _pthread_body + 180
>> 8   libsystem_pthread.dylib  0x00007fffd3d1f887 _pthread_start + 286
>> 9   libsystem_pthread.dylib  0x00007fffd3d1f08d thread_start + 13
>>
>> error_log:
>> [Wed Sep 27 17:27:09.941040 2017] [mpm_event:notice] [pid 23404:tid 140736895005632] AH00489: Apache/2.4.28 (Unix) OpenSSL/1.1.0e configured -- resuming normal operations
>> [Wed Sep 27 17:27:09.941055 2017] [core:notice] [pid 23404:tid 140736895005632] AH00094: Command line: '/opt/apache-2.4.x/bin/httpd -d /Users/sei/projects/httpd/test/mod_h2/2.4.x/gen/apache'
>> [Wed Sep 27 17:29:43.207923 2017] [mpm_event:alert] [pid 30381:tid 123145361334272] (35)Resource temporarily unavailable: AH03104: apr_thread_create: unable to create worker thread
>> [Wed Sep 27 17:29:43.208904 2017] [mpm_event:crit] [pid 30381:tid 123145369382912] (9)Bad file descriptor: apr_pollset_poll failed. Attempting to shutdown process gracefully
>> [Wed Sep 27 17:29:44.000147 2017] [core:notice] [pid 23404:tid 140736895005632] AH00052: child pid 30381 exit signal Segmentation fault (11)
>> [Wed Sep 27 17:29:44.208035 2017] [mpm_event:alert] [pid 30382:tid 123145438388224] (35)Resource temporarily unavailable: AH03104: apr_thread_create: unable to create worker thread
>> [Wed Sep 27 17:29:44.208925 2017] [mpm_event:crit] [pid 30382:tid 123145446436864] (9)Bad file descriptor: apr_pollset_poll failed. Attempting to shutdown process gracefully
>> [Wed Sep 27 17:29:45.001702 2017] [core:notice] [pid 23404:tid 140736895005632] AH00052: child pid 30382 exit signal Segmentation fault (11)
>> [Wed Sep 27 17:29:48.881247 2017] [core:crit] [pid 32841:tid 123145544712192] AH00102: [Wed Sep 27 17:29:48 2017] file fdqueue.c, line 390, assertion "!((queue)->nelts == (queue)->bounds)" failed
>> [Wed Sep 27 17:29:50.004871 2017] [core:notice] [pid 23404:tid 140736895005632] AH00052: child pid 32841 exit signal Abort trap (6)
>> [Wed Sep 27 17:29:52.366672 2017] [core:crit] [pid 35505:tid 123145420349440] AH00102: [Wed Sep 27 17:29:52 2017] file fdqueue.c, line 390, assertion "!((queue)->nelts == (queue)->bounds)" failed
>> [Wed Sep 27 17:29:52.368902 2017] [core:crit] [pid 35488:tid 123145392611328] AH00102: [Wed Sep 27 17:29:52 2017] file fdqueue.c, line 390, assertion "!((queue)->nelts == (queue)->bounds)" failed
>> [Wed Sep 27 17:29:53.007664 2017] [core:notice] [pid 23404:tid 140736895005632] AH00052: child pid 35505 exit signal Abort trap (6)
>> [Wed Sep 27 17:29:53.007741 2017] [core:notice] [pid 23404:tid 140736895005632] AH00052: child pid 35488 exit signal Abort trap (6)
>>
>>
>> Any relevant stacktrace or lead about where the problem might reside?
>>
>> Luca
>