Update: disregard the man behind the curtain - for now.

I see the strangest effects on my main machine under native macOS 10.12.6,
which do not happen on my Parallels Ubuntu image or on my laptop with macOS
10.13.0. I tested with hyperthreading disabled to exclude the Intel *lake
bugs, but it made no difference.

I will upgrade the machine overnight to macOS 10.13.0 and check again
tomorrow whether that had any effect.

Regardless of all this, I managed to find an edge-case assertion failure and
made a fix on trunk. If we go for 2.4.29, I'll propose that for inclusion. It
is not a regression relative to 2.4.27, so no veto from me on 2.4.28.

Cheers,

Stefan


> On 28.09.2017 at 09:54, Stefan Eissing <stefan.eiss...@greenbytes.de> wrote:
> 
> Analyzing...
> 
> My checkout of branch/2.4.x does not show this.
> My build of 2.4.28, with -deps as provided, does after a while of letting
> 
> while true; do h2load -t 8 -c 100 -n 10000 -m 5 -d gen/data-10k http://test.example.org:12345/; done
> 
> run (data-10k is a 10k text file). The whole server setup is according to the
> mod_http2 test cases in
> https://svn.apache.org/repos/asf/httpd/test/mod_h2/trunk
> 
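
For anyone trying to reproduce this, here is a minimal sketch of a bounded variant of the loop quoted above. It stops as soon as the error log gains a new "exit signal" line. The helper name run_until_crash, its arguments, and the log path in the usage comment are assumptions for illustration; they are not part of the mod_h2 test setup.

```shell
# run_until_crash LOG MAX_RUNS COMMAND [ARGS...]
# Hypothetical helper: repeats COMMAND until LOG gains a new "exit signal"
# line (as in the AH00052 notices) or MAX_RUNS runs have completed.
# Returns 0 if a new crash line appeared, 1 otherwise.
run_until_crash() {
    ruc_log=$1
    ruc_max=$2
    shift 2
    # Count crash lines already present so only new ones stop the loop.
    ruc_before=$(grep -c 'exit signal' "$ruc_log" 2>/dev/null || true)
    ruc_before=${ruc_before:-0}
    ruc_i=0
    while [ "$ruc_i" -lt "$ruc_max" ]; do
        "$@" >/dev/null 2>&1 || true    # the load run itself may fail
        ruc_now=$(grep -c 'exit signal' "$ruc_log" 2>/dev/null || true)
        if [ "${ruc_now:-0}" -gt "$ruc_before" ]; then
            return 0
        fi
        ruc_i=$((ruc_i + 1))
    done
    return 1
}

# Usage (the log path is an assumption, adjust to your ServerRoot):
#   run_until_crash /path/to/logs/error_log 1000 \
#       h2load -t 8 -c 100 -n 10000 -m 5 -d gen/data-10k \
#       http://test.example.org:12345/
```

Bounding the number of runs keeps an unattended test from spinning forever on a machine where the crash never shows up.
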
> With 2.4.28 I have apr 1.6.2, while my branch/2.4.x uses a 1.5.x (which
> shows 1.5.3 in its header).
> 
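
To compare the APR versions of two builds the same way (by looking at the header), something like the following sketch may help. It assumes the header defines APR_MAJOR_VERSION, APR_MINOR_VERSION and APR_PATCH_VERSION with plain "#define NAME value" lines, as apr_version.h does; the helper name apr_header_version and the paths in the usage comment are hypothetical.

```shell
# apr_header_version FILE
# Hypothetical helper: prints MAJOR.MINOR.PATCH as declared in an
# apr_version.h file, assuming plain "#define NAME value" lines.
apr_header_version() {
    awk '
        $1 == "#define" && $2 == "APR_MAJOR_VERSION" { major = $3 }
        $1 == "#define" && $2 == "APR_MINOR_VERSION" { minor = $3 }
        $1 == "#define" && $2 == "APR_PATCH_VERSION" { patch = $3 }
        END { print major "." minor "." patch }
    ' "$1"
}

# Usage (paths are assumptions, adjust to the two installs being compared):
#   apr_header_version /path/to/build-a/include/apr_version.h
#   apr_header_version /path/to/build-b/include/apr_version.h
```
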
> Will check if that is connected or not. Will also check if this happens for 
> me on my Ubuntu.
> 
> Sporadically (less frequently), I can also see (not both 2.4.x and 2.4.28):
> Thread 3 Crashed:
> 0   libsystem_kernel.dylib            0x00007fffd3c34d42 __pthread_kill + 10
> 1   libsystem_pthread.dylib           0x00007fffd3d22457 pthread_kill + 90
> 2   libsystem_c.dylib                 0x00007fffd3b9a4bb __abort + 140
> 3   libsystem_c.dylib                 0x00007fffd3b9a42f abort + 144
> 4   libsystem_pthread.dylib           0x00007fffd3d23bc7 __pthread_abort + 49
> 5   libsystem_pthread.dylib           0x00007fffd3d23c7b __pthread_abort_reason + 180
> 6   libsystem_pthread.dylib           0x00007fffd3d1fd93 _pthread_mutex_unlock_drop + 167
> 7   mod_http2.so                      0x000000010cb42b2b h2_beam_send + 1819 (h2_bucket_beam.c:965)
> 8   mod_http2.so                      0x000000010cb59d31 send_out + 257 (h2_task.c:100)
> 9   mod_http2.so                      0x000000010cb59146 h2_filter_slave_output + 294 (h2_task.c:176)
> 10  httpd                             0x000000010c598829 ap_process_request_after_handler + 89 (http_request.c:366)
> 11  httpd                             0x000000010c598ab6 ap_process_request + 22 (http_request.c:473)
> 12  mod_http2.so                      0x000000010cb5874e h2_task_process_conn + 398 (h2_task.c:682)
> 13  httpd                             0x000000010c5705f7 ap_run_process_connection + 55 (connection.c:42)
> 14  mod_http2.so                      0x000000010cb599db h2_task_do + 539 (h2_task.c:640)
> 15  mod_http2.so                      0x000000010cb5df24 slot_run + 260 (h2_workers.c:233)
> 16  libsystem_pthread.dylib           0x00007fffd3d1f93b _pthread_body + 180
> 17  libsystem_pthread.dylib           0x00007fffd3d1f887 _pthread_start + 286
> 18  libsystem_pthread.dylib           0x00007fffd3d1f08d thread_start + 13
> 
> which is a call to apr_thread_mutex_unlock(). Without having checked into
> the pthread version, the closest similar report I could find on the net is
> https://sourceware.org/bugzilla/show_bug.cgi?id=17514
> 
> Will report when I know more.
> 
> -Stefan
> 
>> On 27.09.2017 at 18:04, Luca Toscano <toscano.l...@gmail.com> wrote:
>> 
>> Hi Stefan,
>> 
>> 2017-09-27 17:32 GMT+02:00 Stefan Eissing <stefan.eiss...@greenbytes.de>:
>> On my h2 load tests, the server sometimes crashes in an assertion failure
>> (built in maintainer mode):
>> 
>> How did you run the tests? It would be good for people to attempt to
>> reproduce.
>> 
>> 
>> macOS crash reporter:
>> Thread 54 Crashed:
>> 0   libsystem_kernel.dylib              0x00007fffd3c34d42 __pthread_kill + 10
>> 1   libsystem_pthread.dylib             0x00007fffd3d22457 pthread_kill + 90
>> 2   libsystem_c.dylib                   0x00007fffd3b9a4bb __abort + 140
>> 3   libsystem_c.dylib                   0x00007fffd3b9a42f abort + 144
>> 4   httpd                               0x000000010be99282 ap_log_assert + 130 (log.c:1696)
>> 5   mod_mpm_event.so                    0x000000010c50b51c ap_queue_push + 188
>> 6   mod_mpm_event.so                    0x000000010c509be2 listener_thread + 2226 (event.c:1749)
>> 7   libsystem_pthread.dylib             0x00007fffd3d1f93b _pthread_body + 180
>> 8   libsystem_pthread.dylib             0x00007fffd3d1f887 _pthread_start + 286
>> 9   libsystem_pthread.dylib             0x00007fffd3d1f08d thread_start + 13
>> 
>> error_log:
>> [Wed Sep 27 17:27:09.941040 2017] [mpm_event:notice] [pid 23404:tid 140736895005632] AH00489: Apache/2.4.28 (Unix) OpenSSL/1.1.0e configured -- resuming normal operations
>> [Wed Sep 27 17:27:09.941055 2017] [core:notice] [pid 23404:tid 140736895005632] AH00094: Command line: '/opt/apache-2.4.x/bin/httpd -d /Users/sei/projects/httpd/test/mod_h2/2.4.x/gen/apache'
>> [Wed Sep 27 17:29:43.207923 2017] [mpm_event:alert] [pid 30381:tid 123145361334272] (35)Resource temporarily unavailable: AH03104: apr_thread_create: unable to create worker thread
>> [Wed Sep 27 17:29:43.208904 2017] [mpm_event:crit] [pid 30381:tid 123145369382912] (9)Bad file descriptor: apr_pollset_poll failed.  Attempting to shutdown process gracefully
>> [Wed Sep 27 17:29:44.000147 2017] [core:notice] [pid 23404:tid 140736895005632] AH00052: child pid 30381 exit signal Segmentation fault (11)
>> [Wed Sep 27 17:29:44.208035 2017] [mpm_event:alert] [pid 30382:tid 123145438388224] (35)Resource temporarily unavailable: AH03104: apr_thread_create: unable to create worker thread
>> [Wed Sep 27 17:29:44.208925 2017] [mpm_event:crit] [pid 30382:tid 123145446436864] (9)Bad file descriptor: apr_pollset_poll failed.  Attempting to shutdown process gracefully
>> [Wed Sep 27 17:29:45.001702 2017] [core:notice] [pid 23404:tid 140736895005632] AH00052: child pid 30382 exit signal Segmentation fault (11)
>> [Wed Sep 27 17:29:48.881247 2017] [core:crit] [pid 32841:tid 123145544712192] AH00102: [Wed Sep 27 17:29:48 2017] file fdqueue.c, line 390, assertion "!((queue)->nelts == (queue)->bounds)" failed
>> [Wed Sep 27 17:29:50.004871 2017] [core:notice] [pid 23404:tid 140736895005632] AH00052: child pid 32841 exit signal Abort trap (6)
>> [Wed Sep 27 17:29:52.366672 2017] [core:crit] [pid 35505:tid 123145420349440] AH00102: [Wed Sep 27 17:29:52 2017] file fdqueue.c, line 390, assertion "!((queue)->nelts == (queue)->bounds)" failed
>> [Wed Sep 27 17:29:52.368902 2017] [core:crit] [pid 35488:tid 123145392611328] AH00102: [Wed Sep 27 17:29:52 2017] file fdqueue.c, line 390, assertion "!((queue)->nelts == (queue)->bounds)" failed
>> [Wed Sep 27 17:29:53.007664 2017] [core:notice] [pid 23404:tid 140736895005632] AH00052: child pid 35505 exit signal Abort trap (6)
>> [Wed Sep 27 17:29:53.007741 2017] [core:notice] [pid 23404:tid 140736895005632] AH00052: child pid 35488 exit signal Abort trap (6)
>> 
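
When comparing runs, it can help to tally how the children died. Below is a minimal sketch that counts AH00052 notices per signal name, assuming each log entry sits on a single line in the actual error_log (the excerpt above is only wrapped by the mail client). The helper name tally_exit_signals is hypothetical.

```shell
# tally_exit_signals LOG
# Hypothetical helper: prints "<count><TAB><signal name>" for every
# AH00052 "child pid N exit signal NAME (NUM)" notice in LOG,
# most frequent signal first.
tally_exit_signals() {
    sed -n 's/.*AH00052: child pid [0-9]* exit signal \(.*\) ([0-9]*).*/\1/p' "$1" |
        awk '{ count[$0]++ } END { for (s in count) print count[s] "\t" s }' |
        sort -rn
}

# Usage (the log path is an assumption):
#   tally_exit_signals /path/to/logs/error_log
```

A shift in the ratio between "Segmentation fault" and "Abort trap" exits across builds would hint at whether the assertion or something else is the dominant failure.
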
>> 
>> Any relevant stacktrace or lead about where the problem might reside?
>> 
>> Luca 
> 
