On 2013-11-25 23:25, Jeff Trawick wrote:
> On Mon, Nov 25, 2013 at 5:03 PM, Jeff Trawick <traw...@gmail.com> wrote:
> 
>> On Mon, Nov 25, 2013 at 4:28 PM, olli hauer <oha...@gmx.de> wrote:
>>
>>> On 2013-11-25 22:14, Jeff Trawick wrote:
>>>> On Sun, Nov 24, 2013 at 8:39 PM, Jeff Trawick <traw...@gmail.com> wrote:
>>>>
>>>>> Let's move this to dev@httpd and omit dev@apr (after this e-mail)...
>>>>>
>>>>>
>>>>> On Sun, Nov 24, 2013 at 8:28 AM, olli hauer <oha...@gmx.de> wrote:
>>>>>
>>>>>> On 2013-11-22 00:08, Jeff Trawick wrote:
>>>>>>> On Thu, Nov 21, 2013 at 5:48 PM, olli hauer <oha...@gmx.de> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> sorry for late response to apr-1.5.0 ...
>>>>>>>>
>>>>>>>> I've done some tests with apr-1.5.0 on FreeBSD 10 (amd64)
>>>>>>>> and it seems there is an issue that breaks apache24.
>>>>>>>>
>>>>>>>> With apr-1.5.0 apache22 works but apache24 is broken.
>>>>>>>> apache starts fine, nothing special in the logs or during
>>>>>>>> start with -X but no response is coming back.
>>>>>>>>
>>>>>>>> apr/apr-util test:
>>>>>>>> ========================================
>>>>>>>> apr-1.5.0:      all tests passed [1]
>>>>>>>> apr-util-1.5.3: all tests passed
>>>>>>>>
>>>>>>>>
>>>>>>>> working configurations (FreeBSD beta3 [1])
>>>>>>>> =========================================
>>>>>>>> apache22-2.2.26 apr-1.4.8 apr-util-1.5.3
>>>>>>>> apache22-2.2.26 apr-1.5.0 apr-util-1.5.3
>>>>>>>> apache24-2.4.6  apr-1.4.8 apr-util-1.5.2
>>>>>>>> apache24-2.4.7  apr-1.4.8 apr-util-1.5.2
>>>>>>>> apache24-2.4.6  apr-1.4.8 apr-util-1.5.3
>>>>>>>> apache24-2.4.7  apr-1.4.8 apr-util-1.5.3
>>>>>>>>
>>>>>>>> broken combinations:
>>>>>>>> =========================================
>>>>>>>> apache24-2.4.6  apr-1.5.0 apr-util-1.5.3
>>>>>>>> apache24-2.4.7  apr-1.5.0 apr-util-1.5.3
>>>>>>>>
>>>>>>>> All tests were done with MPM worker.
>>>>>>>>
>>>>>>>>
>>>>>>>> FreeBSD 8.4 (amd64) seems OK in all combinations
>>>>>>>> FreeBSD 9.2 (amd64) seems OK in all combinations
>>>>>>>>
>>>>>>>> [1] FreeBSD 10 beta3 with iconv UTF7 patch r258316
>>>>>>>> (head/lib/libiconv_modules/UTF7/citrus_utf7.c)
>>>>>>>>
>>>>>>>> Any hints where to start?
>>>>>>>>
>>>>>>>
>>>>>>> Set LogLevel trace8 and compare good vs. bad.
>>>>>>> Start with -X then attach with dtruss and compare good vs. bad.
>>>>>>> Get open fds displayed by lsof and compare good vs. bad.
>>>>>>> Is connection to client held open?  Get backtraces.
>>>>>>>
>>>>>>> I just compared 1.4.8 vs. 1.5.0 and didn't see anything that looked
>>>>>>> remotely likely.
>>>>>>>
>>>>>>
>>>>>> Comparing trace8 outputs showed the request is processed,
>>>>>> but the condition in the following code snippet in
>>>>>> server/core_filters.c never evaluates to TRUE unless the
>>>>>> client cancels the request.
>>>>>>
>>>>>> To get some better log entries I've used server/core_filters.c
>>>>>> r1510295 from trunk.
>>>>>>
>>>>>>
>>>>>> @@server/core_filters.c (line 510)
>>>>>> if (APR_BUCKET_IS_FLUSH(bucket)
>>>>>>     || non_file_bytes_in_brigade >= THRESHOLD_MAX_BUFFER
>>>>>>     || morphing_bucket_in_brigade
>>>>>>     || eor_buckets_in_brigade > MAX_REQUESTS_IN_PIPELINE) {
>>>>>> ...
>>>>>> }
>>>>>>
>>>>>> [http:trace3] http_filters.c(974):[client x.x.x.x:x] Response sent with status 200, headers:
>>>>>> [http:trace5] http_filters.c(983):[client x.x.x.x:x]   Date: Sun, 24 Nov 2013 10:28:37 GMT
>>>>>> [http:trace5] http_filters.c(986):[client x.x.x.x:x]   Server:
>>>>>> Apache/2.4.7 (FreeBSD)
>>>>>> [http:trace4] http_filters.c(804):[client x.x.x.x:x]   Last-Modified:
>>>>>> Sat, 23 Nov 2013 16:51:58 GMT
>>>>>> [http:trace4] http_filters.c(804):[client x.x.x.x:x]   ETag: "be-4ebdaf2ef2780"
>>>>>> [http:trace4] http_filters.c(804):[client x.x.x.x:x]   Accept-Ranges:
>>>>>> bytes
>>>>>> [http:trace4] http_filters.c(804):[client x.x.x.x:x]   Content-Length: 190
>>>>>> [http:trace4] http_filters.c(804):[client x.x.x.x:x]   Keep-Alive:
>>>>>> timeout=5, max=100
>>>>>> [http:trace4] http_filters.c(804):[client x.x.x.x:x]   Connection:
>>>>>> Keep-Alive
>>>>>> [http:trace4] http_filters.c(804):[client x.x.x.x:x]   Content-Type:
>>>>>> text/html
>>>>>> [core:trace8] core_filters.c(576):[client x.x.x.x:x] brigade contains:
>>>>>> bytes: 284, non-file bytes: 284, eor buckets: 0, morphing buckets: 0
>>>>>> [core:trace8] core_filters.c(576):[client x.x.x.x:x] brigade contains:
>>>>>> bytes: 474, non-file bytes: 284, eor buckets: 0, morphing buckets: 0
>>>>>> [core:trace8] core_filters.c(576):[client x.x.x.x:x] brigade contains:
>>>>>> bytes: 474, non-file bytes: 284, eor buckets: 1, morphing buckets: 0
>>>>>>
>>>>>>
>>>>>> The following lines are only seen if
>>>>>>   apr-1.5.0 was built without IPv6 support
>>>>>>   or apache24 was built with v4-mapping enabled
>>>>>>   or "Listen $IP:$port" is given in httpd.conf
>>>>>>
>>>>>> [core:trace6] core_filters.c(526):[client x.x.x.x:x] will flush because of FLUSH bucket
>>>>>> [core:trace8] core_filters.c(528):[client x.x.x.x:x] seen in brigade so far: bytes: 474, non-file bytes: 284, eor buckets: 1, morphing buckets: 0
>>>>>> [core:trace8] core_filters.c(555):[client x.x.x.x:x] flushing now
>>>>>> [core:trace8] core_filters.c(568):[client x.x.x.x:x] total bytes written: 474
>>>>>> [core:trace8] core_filters.c(576):[client x.x.x.x:x] brigade contains:
>>>>>> bytes: 0, non-file bytes: 0, eor buckets: 0, morphing buckets: 0
>>>>>>
>>>>>> However a flush is triggered if the client cancels the request, but no
>>>>>> data is sent over the wire ...
>>>>>>
>>>>>>
>>>>>> I searched to see whether others have seen a similar issue and
>>>>>> instead found the following interesting article from Nov. 2002 ;)
>>>>>> http://people.apache.org/~trawick/v4mapped.html
>>>>>>
>>>>>>
>>>>> I haven't analyzed the trace messages you showed above.  Here's what I
>>>>> did on FreeBSD 9:
>>>>>
>>>>> Apply this patch to fix the version check for disabling v4mapped
>>>>> addresses:
>>>>>
>>>>> Index: configure.in
>>>>> ===================================================================
>>>>> --- configure.in (revision 1545127)
>>>>> +++ configure.in (working copy)
>>>>> @@ -774,7 +774,10 @@
>>>>>  ],
>>>>>  [
>>>>>      case $host in
>>>>> -    *freebsd5*|*netbsd*|*openbsd*)
>>>>> +    *freebsd[1234].*)
>>>>> +        v4mapped=yes
>>>>> +        ;;
>>>>> +    *freebsd*|*netbsd*|*openbsd*)
>>>>>          v4mapped=no
>>>>>          ;;
>>>>>      *)
>>>>>
>>>>> That gives me
>>>>>
>>>>> $ bin/apachectl -V
>>>>> AH00558: httpd: Could not reliably determine the server's fully qualified
>>>>> domain name, using 127.0.0.1. Set the 'ServerName' directive globally to
>>>>> suppress this message
>>>>> Server version: Apache/2.4.8-dev (Unix)
>>>>> Server built:   Nov 24 2013 20:21:30
>>>>> Server's Module Magic Number: 20120211:27
>>>>> Server loaded:  APR 1.5.1-dev, APR-UTIL 1.5.4-dev
>>>>> Compiled using: APR 1.5.1-dev, APR-UTIL 1.5.4-dev
>>>>> Architecture:   64-bit
>>>>> Server MPM:     worker
>>>>>   threaded:     yes (fixed thread count)
>>>>>     forked:     yes (variable process count)
>>>>> Server compiled with....
>>>>>  -D APR_HAS_SENDFILE
>>>>>  -D APR_HAS_MMAP
>>>>>  -D APR_HAVE_IPV6 (IPv4-mapped addresses disabled)
>>>>>
>>>>> The last line shown indicates that mapped addresses are disabled in
>>>>> this build.
>>>>>
>>>>> The only Listen I have is "Listen 8080".  procstat says I have separate
>>>>> sockets, as expected:
>>>>>
>>>>> 60734 httpd               3 s - rw---n---   9       0 TCP ::.8080 ::.0
>>>>> 60734 httpd               4 s - rw---n---   9       0 TCP 0.0.0.0:8080
>>>>> 0.0.0.0:0
>>>>>
>>>>> From netstat:
>>>>>
>>>>> tcp4       0      0 *.8080                 *.*                    LISTEN
>>>>> tcp6       0      0 *.8080                 *.*                    LISTEN
>>>>>
>>>>> (You had shown sockstat before; this is the same stuff.)
>>>>>
>>>>> --/--
>>>>>
>>>>> Are you able to check for the issue on FreeBSD 9?
>>>>>
>>>>> When it hangs, are you connecting to the IPv4 listener or the IPv6
>>>>> listener?  Does it matter whether you use loopback or the address of a
>>>>> network interface?
>>>>>
>>>>> You're using a regular web browser in the failing case, right?  Have
>>>>> you tried something as simple as netcat for the failing address/port?
>>>>> Example:
>>>>>
>>>>> $ echo "GET /" | nc 127.0.0.1 8080
>>>>> <html><body><h1>It works!</h1></body></html>
>>>>>
>>>>> --
>>>>> Born in Roswell... married an alien...
>>>>> http://emptyhammock.com/
>>>>>
>>>>
>>>> I see the hang on FreeBSD 10 Beta 3 using the same sources, though
>>>> clang is installed instead of gcc.
>>>>
>>>> FWIW, the simple "GET /<EOF>" request works but an HTTP/1.1 request from
>>>> telnet or from Firefox hangs.  I am able to try loopback over IPv4 and
>>>> IPv6 and they both hang.
>>>>
>>>> Now move the gcc binaries from FreeBSD 9 to FreeBSD 10 (along with
>>>> libpcre.0): I don't see the hang with the simple testcases that
>>>> triggered it with the clang builds.
>>>>
>>>> I wouldn't be surprised if your apr-1.5-as-trigger observation reflects
>>>> "accidents" like code moving around and/or changing in size.  I didn't
>>>> see any interesting changes in the source that would affect the path
>>>> where the hang occurs.  But who really knows :)
>>>>
>>>
>>> After playing with the code (damage included) and moving
>>> 'flush_upto = next;', the page loads. I still have trouble understanding
>>> how the complete code works (or doesn't).
>>>
>>> What is not clear to me is why 'if (APR_BUCKET_IS_FLUSH(bucket)' is not
>>> triggered.
>>>
>>
>> It is for me at least during lingering close, and I go into the "if
>> (flush_upto != NULL) {" block as expected.
>>
>> From the start of a response to the simple index.html I see a HEAP bucket
>> (hdrs?), FILE bucket, EOS bucket, EOR bucket, and maybe something else I
>> glossed over, but no flush bucket.
>>
>> I think the issue may be that after generating the response for one
>> request, httpd thinks there is a pipelined request and instead of flushing
>> the first response it goes to read the next request but none is available.
>>
>> Yep, hung here:
>>
>> Breakpoint 1, check_pipeline (c=0x8026282b8) at http_request.c:221
>> 221    if (c->keepalive != AP_CONN_CLOSE) {
>> Current language:  auto; currently minimal
>> (gdb) p c->keepalive
>> $1 = AP_CONN_KEEPALIVE
>> (gdb) n
>> 223        apr_bucket_brigade *bb = apr_brigade_create(c->pool, c->bucket_alloc);
>> (gdb)
>> 225        rv = ap_get_brigade(c->input_filters, bb, AP_MODE_SPECULATIVE,
>> (gdb) list
>> 220 {
>> 221    if (c->keepalive != AP_CONN_CLOSE) {
>> 222        apr_status_t rv;
>> 223        apr_bucket_brigade *bb = apr_brigade_create(c->pool, c->bucket_alloc);
>> 224
>> 225        rv = ap_get_brigade(c->input_filters, bb, AP_MODE_SPECULATIVE,
>> 226                            APR_NONBLOCK_READ, 1);
>> 227        if (rv != APR_SUCCESS || APR_BRIGADE_EMPTY(bb)) {
>> 228            /*
>> 229             * Error or empty brigade: There is no data present in the input
>> (gdb) n
>>
>> (HANGING in call at line 225)
>>
> 
> Ha!
> 
> See if this brings any happiness:
> 
> Index: network_io/unix/sockets.c
> ===================================================================
> --- network_io/unix/sockets.c (revision 1545394)
> +++ network_io/unix/sockets.c (working copy)
> @@ -273,7 +273,7 @@
>  #endif /* TCP_NODELAY_INHERITED */
>  #if APR_O_NONBLOCK_INHERITED
>      if (apr_is_option_set(sock, APR_SO_NONBLOCK) == 1) {
> -        apr_set_option(*new, APR_SO_NONBLOCK, 1);
> +        /* apr_set_option(*new, APR_SO_NONBLOCK, 1); */
>      }
>  #endif /* APR_O_NONBLOCK_INHERITED */


I can confirm that after removing that line in apr-1.5.0, apache24 no longer
hangs (tested with apache-2.4.6 / 2.4.7).

I see now why this is triggered: comparing apr.h after `./configure' shows
that APR_O_NONBLOCK_INHERITED changed from 0 in apr-1.4.8 to 1 in apr-1.5.0.

> grep APR_O_NONBLOCK_INHERITED work/apr-1.4.8/include/apr.h
./work/apr-1.4.8/include/apr.h:#define APR_O_NONBLOCK_INHERITED 0

> grep APR_O_NONBLOCK_INHERITED work/apr-1.5.0/include/apr.h
work/apr-1.5.0/include/apr.h:#define APR_O_NONBLOCK_INHERITED 1


> There are some APR 1.5 autoconf changes to consider at
> http://svn.apache.org/viewvc/apr/apr/branches/1.5.x/build/apr_network.m4?view=log

OK, backing out the changes from r1502805 seems to do the trick on
FreeBSD 10 beta3:
http://svn.apache.org/viewvc?view=revision&revision=1502805

> grep APR_O_NONBLOCK_INHERITED work/apr-1.5.0/include/apr.h
work/apr-1.5.0/include/apr.h:#define APR_O_NONBLOCK_INHERITED 0


> Also, be careful which apr and apr-util are getting loaded.  httpd rpath
> has /usr/local/lib in front of the location of my apr and apr-util, and
> /usr/local/lib has alternate versions.  (apachectl -V should show the same
> compiled-with/running-with for apr and apr-util.)

I always build the test packages in a clean environment with tinderbox or
poudriere, so I'm sure that no old libs are lying around.

-- 
Thanks
olli
