Doesn't the socket_bucket_read() call (frame #3) enter
apr_socket_timeout_set(p, 0), and hence sononblock() which puts the
socket in O_NONBLOCK?

Maybe "strace" would help here too, since we are in the system now.

On Thu, Feb 4, 2016 at 9:17 AM, Joachim Achtzehnter <joac...@kraut.ca> wrote:
> Here is the backtrace:
>
>> #0  0xb7fddc40 in __kernel_vsyscall ()
>> #1  0xb7e05f03 in __read_nocancel () at
>> ../sysdeps/unix/syscall-template.S:81
>> #2  0xb7f39589 in apr_socket_recv (sock=sock@entry=0x8156d18,
>>     buf=buf@entry=0x8168750 "html, body {\n   font-family: arial,
>> helvetica, sans-serif;\n   font-size: 12px;\n   height: 100%;\n
>> margin:0;\n   padding:0;\n   height:100%;\n}\nbody a, a:hover {\n
>> text-decoration: underline;\n   border:"..., len=len@entry=0xbffff5e4) at
>> network_io/unix/sendrecv.c:81
>> #3  0xb7f744d5 in socket_bucket_read (a=0x8160fc8, str=0xbffff5e0,
>>     len=0xbffff5e4, block=APR_NONBLOCK_READ)
>>     at buckets/apr_buckets_socket.c:36
>> #4  0x08086a98 in ap_core_input_filter (f=0x8165770, b=0x8165730,
>>     mode=AP_MODE_READBYTES, block=APR_NONBLOCK_READ, readbytes=5)
>>     at core_filters.c:235
>> #5  0xb7ce732a in bio_filter_in_read (bio=0x81666d8, in=0x816bc83 "",
>> inlen=5)
>>     at ssl_engine_io.c:487
>> #6  0xb7b1bc66 in BIO_read ()
>>    from /home/joachima/httpd-2.4.18/lib/libcrypto.so.1.0.0
>> #7  0xb7c8e040 in ssl3_read_n ()
>>    from /home/joachima/httpd-2.4.18/lib/libssl.so.1.0.0
>> #8  0xb7c8fa75 in ssl3_read_bytes ()
>>    from /home/joachima/httpd-2.4.18/lib/libssl.so.1.0.0
>> #9  0xb7c8c211 in ssl3_read ()
>>    from /home/joachima/httpd-2.4.18/lib/libssl.so.1.0.0
>> #10 0xb7ca8679 in SSL_read ()
>>    from /home/joachima/httpd-2.4.18/lib/libssl.so.1.0.0
>> #11 0xb7ce80d7 in ssl_io_input_read (inctx=inctx@entry=0x81636e8,
>>     buf=buf@entry=0x8163710 "GET /css/subpage.css HTTP/1.1\r\nUser-Agent:
>> curl/7.40.0\r\nHost: na171\r\nAccept: */*\r\n\r\n",
>> len=len@entry=0xbffff8f8)
>>     at ssl_engine_io.c:611
>> #12 0xb7cea36c in ssl_io_filter_input (f=0x8165718, bb=0x81660d8,
>>     mode=AP_MODE_SPECULATIVE, block=APR_NONBLOCK_READ, readbytes=1)
>>     at ssl_engine_io.c:1407
>> #13 0x080a013f in check_pipeline (bb=0x81660d8, c=0x8156eb0)
>>     at http_request.c:244
>> #14 ap_process_request_after_handler (r=0x817e2d8) at http_request.c:367
>> #15 0x080a1229 in ap_process_async_request (r=r@entry=0x817e2d8)
>>     at http_request.c:448
>> #16 0x080a1575 in ap_process_request (r=r@entry=0x817e2d8)
>>     at http_request.c:458
>> #17 0x0809d547 in ap_process_http_sync_connection (c=0x8156eb0)
>>     at http_core.c:210
>> #18 ap_process_http_connection (c=0x8156eb0) at http_core.c:251
>> #19 0x0809549d in ap_run_process_connection (c=0x8156eb0) at
>> connection.c:41
>> #20 0x0809589d in ap_process_connection (c=c@entry=0x8156eb0,
>> csd=0x8156d18)
>>     at connection.c:213
>> #21 0x080a7849 in child_main (child_num_arg=child_num_arg@entry=0,
>>     child_bucket=child_bucket@entry=0) at prefork.c:723
>> #22 0x080a7a9f in make_child (s=0x80f9f38, slot=slot@entry=0,
>>     bucket=bucket@entry=0) at prefork.c:767
>> #23 0x080a9091 in prefork_run (_pconf=0x80d50a8, plog=0x80fd9b8,
>> s=0x80f9f38)
>>     at prefork.c:979
>> #24 0x08070959 in ap_run_mpm (pconf=pconf@entry=0x80d50a8, plog=0x80fd9b8,
>>     s=0x80f9f38) at mpm_common.c:94
>> #25 0x08069f23 in main (argc=4, argv=0xbffffd54) at main.c:777
>
>
>
> On 2016-02-03 11:50 PM, Plüm, Rüdiger, Vodafone Group wrote:
>
>> Can you also post the backtrace when the server is hanging in the read?
>>
>> Regards
>>
>> Rüdiger
>>
>>> -----Original Message-----
>>> From: Plüm, Rüdiger, Vodafone Group
>>> Sent: Donnerstag, 4. Februar 2016 08:39
>>> To: dev@httpd.apache.org
>>> Subject: RE: HTTPS connections lock-up with 2.4.18
>>>
>>> Thanks. Which version of APR / APR-UTIL are you using?
>>>
>>> Regards
>>>
>>> Rüdiger
>>>
>>>> -----Original Message-----
>>>> From: Joachim Achtzehnter [mailto:joac...@kraut.ca]
>>>> Sent: Donnerstag, 4. Februar 2016 03:56
>>>> To: dev@httpd.apache.org
>>>> Subject: Re: HTTPS connections lock-up with 2.4.18
>>>>
>>>> Perhaps, there is an important clue this time. When I retrieve a file
>>>> that works we see the following log messages near the time when the
>>>> reply is being written:
>>>>
>>>>> [Wed Feb 03 18:13:51.402289 2016] [ssl:notice] [pid 1676] [client
>>>>
>>>> 205.159.216.185:55978] bio[9dabd50] out: pass 271 bytes
>>>>>
>>>>> [Wed Feb 03 18:13:51.402463 2016] [ssl:notice] [pid 1676] [client
>>>>
>>>> 205.159.216.185:55978] bio[9dabd50] out: pass 4020 bytes
>>>>>
>>>>> [Wed Feb 03 18:13:51.402515 2016] [core:notice] [pid 1676] [client
>>>>
>>>> 205.159.216.185:55978] core_output_filter: sent 4291 on 4291 bytes
>>>>>
>>>>> [Wed Feb 03 18:13:51.402527 2016] [ssl:notice] [pid 1676] [client
>>>>
>>>> 205.159.216.185:55978] bio[9dabd50] out: METADATA EOS
>>>>>
>>>>> [Wed Feb 03 18:13:51.402581 2016] [ssl:notice] [pid 1676] [client
>>>>
>>>> 205.159.216.185:55978] bio[9dabd50] out: METADATA EOR
>>>>>
>>>>> [Wed Feb 03 18:13:51.406356 2016] [ssl:notice] [pid 1676] [client
>>>>
>>>> 205.159.216.185:55978] bio[9db96d8] in: read 5 bytes
>>>>>
>>>>> [Wed Feb 03 18:13:51.406375 2016] [ssl:notice] [pid 1676] [client
>>>>
>>>> 205.159.216.185:55978] bio[9db96d8] in: read 26 bytes
>>>>>
>>>>> [Wed Feb 03 18:13:51.406392 2016] [http:notice] [pid 1676] (70014)End
>>>
>>> of
>>>>
>>>> file found: [client 205.159.216.185:55978] check_pipeline: no data
>>>>>
>>>>> [Wed Feb 03 18:13:51.406476 2016] [http:notice] [pid 1676] (70014)End
>>>
>>> of
>>>>
>>>> file found: [client 205.159.216.185:55978] check_pipeline:
>>>> data_in_input_filters = 0
>>>>>
>>>>> [Wed Feb 03 18:13:51.406492 2016] [ssl:notice] [pid 1676] [client
>>>>
>>>> 205.159.216.185:55978] bio[9dabd50] out: METADATA FLUSH
>>>>>
>>>>> [Wed Feb 03 18:13:51.406591 2016] [ssl:notice] [pid 1676] [client
>>>>
>>>> 205.159.216.185:55978] bio[9dabd50] out: METADATA FLUSH
>>>>>
>>>>> [Wed Feb 03 18:13:51.406613 2016] [ssl:notice] [pid 1676] [client
>>>>
>>>> 205.159.216.185:55978] bio[9dabd50] out: METADATA EOC
>>>>>
>>>>> [Wed Feb 03 18:13:51.406624 2016] [ssl:notice] [pid 1676] [client
>>>>
>>>> 205.159.216.185:55978] bio[9dabd50] out: pass 31 bytes
>>>>>
>>>>> [Wed Feb 03 18:13:51.406673 2016] [ssl:notice] [pid 1676] [client
>>>>
>>>> 205.159.216.185:55978] bio[9dabd50] out: FLUSH
>>>>>
>>>>> [Wed Feb 03 18:13:51.406723 2016] [core:notice] [pid 1676] [client
>>>>
>>>> 205.159.216.185:55978] core_output_filter: sent 31 on 31 bytes
>>>>>
>>>>> [Wed Feb 03 18:13:51.406735 2016] [ssl:debug] [pid 1676]
>>>>
>>>> ssl_engine_io.c(1029): [client 205.159.216.185:55978] AH02001:
>>>
>>> Connection
>>>>
>>>> closed to child 0 with standard shutdown (server localhost:443)
>>>>
>>>> Notice the call to check_pipeline returns quickly. Now the same part of
>>>> the log for the failing file:
>>>>
>>>>> [Wed Feb 03 17:56:55.424124 2016] [ssl:notice] [pid 1663] [client
>>>>
>>>> 205.159.216.185:55962] bio[8158d50] out: pass 271 bytes
>>>>>
>>>>> [Wed Feb 03 17:56:55.424192 2016] [ssl:notice] [pid 1663] [client
>>>>
>>>> 205.159.216.185:55962] bio[8158d50] out: pass 1727 bytes
>>>>>
>>>>> [Wed Feb 03 17:56:55.424223 2016] [ssl:notice] [pid 1663] [client
>>>>
>>>> 205.159.216.185:55962] bio[8158d50] out: METADATA EOS
>>>>>
>>>>> [Wed Feb 03 17:56:55.424235 2016] [ssl:notice] [pid 1663] [client
>>>>
>>>> 205.159.216.185:55962] bio[8158d50] out: METADATA EOR
>>>>>
>>>>> [Wed Feb 03 18:09:56.652288 2016] [http:notice] [pid 1663]
>>>
>>> (11)Resource
>>>>
>>>> temporarily unavailable: [client 205.159.216.185:55962] check_pipeline:
>>>
>>> no
>>>>
>>>> data
>>>>>
>>>>> [Wed Feb 03 18:09:56.652376 2016] [http:notice] [pid 1663]
>>>
>>> (11)Resource
>>>>
>>>> temporarily unavailable: [client 205.159.216.185:55962] check_pipeline:
>>>> data_in_input_filters = 0
>>>>>
>>>>> [Wed Feb 03 18:09:56.652398 2016] [ssl:notice] [pid 1663] [client
>>>>
>>>> 205.159.216.185:55962] bio[8158d50] out: METADATA FLUSH
>>>>>
>>>>> [Wed Feb 03 18:09:56.652444 2016] [core:notice] [pid 1663] [client
>>>>
>>>> 205.159.216.185:55962] core_output_filter: sent 1998 on 1998 bytes
>>>>>
>>>>> [Wed Feb 03 18:09:56.652510 2016] [ssl:info] [pid 1663] (70014)End of
>>>>
>>>> file found: [client 205.159.216.185:55962] AH01991: SSL input filter
>>>
>>> read
>>>>
>>>> failed.
>>>>>
>>>>> [Wed Feb 03 18:09:56.652519 2016] [ssl:notice] [pid 1663] [client
>>>>
>>>> 205.159.216.185:55962] bio[8158d50] out: METADATA FLUSH
>>>>>
>>>>> [Wed Feb 03 18:09:56.652525 2016] [ssl:notice] [pid 1663] [client
>>>>
>>>> 205.159.216.185:55962] bio[8158d50] out: METADATA EOC
>>>>>
>>>>> [Wed Feb 03 18:09:56.652545 2016] [ssl:notice] [pid 1663] [client
>>>>
>>>> 205.159.216.185:55962] bio[8158d50] out: pass 31 bytes
>>>>>
>>>>> [Wed Feb 03 18:09:56.652553 2016] [ssl:notice] [pid 1663] [client
>>>>
>>>> 205.159.216.185:55962] bio[8158d50] out: FLUSH
>>>>>
>>>>> [Wed Feb 03 18:09:56.652568 2016] [core:notice] [pid 1663] [client
>>>>
>>>> 205.159.216.185:55962] core_output_filter: sent 31 on 31 bytes
>>>>>
>>>>> [Wed Feb 03 18:09:56.652576 2016] [ssl:debug] [pid 1663]
>>>>
>>>> ssl_engine_io.c(1029): [client 205.159.216.185:55962] AH02001:
>>>
>>> Connection
>>>>
>>>> closed to child 0 with standard shutdown (server localhost:443)
>>>>
>>>> The check_pipeline call only returns when I kill the client. I first
>>>> noticed this in the debugger where httpd was sitting in a blocking read.
>>>> Somehow, in this case a blocking read is done instead of a non-blocking
>>>> read.
>>>>
>>>> To be specific, it ended up in the do/while loop that calls read in
>>>> function apr_socket_recv (sendrecv.c), called from socket_bucket_read in
>>>> apr_buckets_socket.c, called from ap_core_input_filter in
>>>> core_filters.c. The block parameter was APR_NONBLOCK_READ, but this
>>>> didn't stop the lower-level functions from blocking, somehow. The
>>>> apr_socket_recv function seems to do the select version of reading based
>>>> on the APR_INCOMPLETE_READ flag in sock->options.
>>>>
>>>> By the way, to answer an earlier question, this is a prefork server with
>>>> separate processes, not threaded.
>>>>
>>>> Thanks,
>>>>
>>>> Joachim
>>>>
>>>>
>>>> On 2016-02-03 8:06, Yann Ylavic wrote:
>>>>>
>>>>> On Wed, Feb 3, 2016 at 4:53 PM, Yann Ylavic <ylavic....@gmail.com>
>>>>
>>>> wrote:
>>>>>>
>>>>>>
>>>>>> I suggest you use the attached patch instead (replacing any previous
>>>>>> one), so that we log the output path taken when mod_ssl is in the
>>>>>> place.
>>>>>
>>>>>
>>>>> Forgot about the casting issue (SSL_get_app_data) with your
>>>
>>> compiler...
>>>>>
>>>>> This v5 (attached) should compile fine.
>>>>>
>>>>>>
>>>>>>> And maybe take network traces on the server side?
>>>>>>
>>>>>>
>>>>>> That would be really nice too.
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Yann.
>>>>
>>>>
>>>> --
>>>> joac...@kraut.ca http://www.kraut.ca
>
>
>
> --
> joac...@kraut.ca   www.kraut.ca

Reply via email to