On Mon, Nov 25, 2013 at 5:03 PM, Jeff Trawick <traw...@gmail.com> wrote:
> On Mon, Nov 25, 2013 at 4:28 PM, olli hauer <oha...@gmx.de> wrote: > >> On 2013-11-25 22:14, Jeff Trawick wrote: >> > On Sun, Nov 24, 2013 at 8:39 PM, Jeff Trawick <traw...@gmail.com> >> wrote: >> > >> >> Let's move this to dev@httpd and omit dev@apr (after this e-mail)... >> >> >> >> >> >> On Sun, Nov 24, 2013 at 8:28 AM, olli hauer <oha...@gmx.de> wrote: >> >> >> >>> On 2013-11-22 00:08, Jeff Trawick wrote: >> >>>> On Thu, Nov 21, 2013 at 5:48 PM, olli hauer <oha...@gmx.de> wrote: >> >>>> >> >>>>> Hi, >> >>>>> >> >>>>> sorry for late response to apr-1.5.0 ... >> >>>>> >> >>>>> I've done some tests with apr-1.5.0 on FreeBSD 10 (amd64) >> >>>>> and it seems there is an issue that breaks apache24. >> >>>>> >> >>>>> With apr-1.5.0 apache22 works but apache24 is broken. >> >>>>> apache starts fine, nothing special in the logs or during >> >>>>> start with -X but no response is coming back. >> >>>>> >> >>>>> apr/apr-util test: >> >>>>> ======================================== >> >>>>> apr-1.5.0: all tests passed [1] >> >>>>> apr-util-1.5.3: all tests passed >> >>>>> >> >>>>> >> >>>>> working configurations (FreeBSD beta3 [1] >> >>>>> ========================================= >> >>>>> apache22-2.2.26 apr-1.4.8 apr-util-1.5.3 >> >>>>> apache22-2.2.26 apr-1.5.0 apr-util-1.5.3 >> >>>>> apache24-2.4.6 apr-1.4.8 apr-util-1.5.2 >> >>>>> apache24-2.4.7 apr-1.4.8 apr-util-1.5.2 >> >>>>> apache24-2.4.6 apr-1.4.8 apr-util-1.5.3 >> >>>>> apache24-2.4.7 apr-1.4.8 apr-util-1.5.3 >> >>>>> >> >>>>> broken combinations: >> >>>>> ========================================= >> >>>>> apache24-2.4.6 apr-1.5.0 apr-util-1.5.3 >> >>>>> apache24-2.4.7 apr-1.5.0 apr-util-1.5.3 >> >>>>> >> >>>>> All tests where done with MPM worker. >> >>>>> >> >>>>> >> >>>>> FreeBSD 8.4 (amd64) seems OK in all combinations >> >>>>> FreeBSD 9.2 (amd64) seems OK in all combinations >> >>>>> >> >>>>> [1] FreeBSD 10 beta3 with iconv UTF7 patch r258316 >> >>>>> (head/lib/libiconv_modules/UTF7/citrus_utf7.c) >> >>>>> >> >>>>> Any hints where to start? >> >>>>> >> >>>> >> >>>> Set LogLevel trace8 and compare good vs. bad. >> >>>> Start with -X then attach with dtruss and compare good vs. bad. >> >>>> Get open fds displayed by lsof and compare good vs. bad. >> >>>> Is connection to client held open? Get backtraces. >> >>>> >> >>>> I just compared 1.4.8 vs. 1.5.0 and didn't see anything that looked >> >>>> remotely likely. >> >>>> >> >>> >> >>> Comparing trace8 outputs showed the request is processed >> >>> but the following code snippet in server/core_filters.c >> >>> never get TRUE except the client cancels the request. >> >>> >> >>> To get some better log entries I've used server/core_filters.c >> >>> r1510295 from trunk. >> >>> >> >>> >> >>> @@server/core_filters.c (line 510) >> >>> if (APR_BUCKET_IS_FLUSH(bucket) >> >>> || non_file_bytes_in_brigade >= THRESHOLD_MAX_BUFFER >> >>> || morphing_bucket_in_brigade >> >>> || eor_buckets_in_brigade > MAX_REQUESTS_IN_PIPELINE) { >> >>> ... >> >>> } >> >>> >> >>> [http:trace3] http_filters.c(974):[client x.x.x.x:x] Response sent >> with >> >>> status 200, headers: >> >>> [http:trace5] http_filters.c(983):[client x.x.x.x:x] Date: Sun, 24 >> Nov >> >>> 2013 10:28:37 GMT >> >>> [http:trace5] http_filters.c(986):[client x.x.x.x:x] Server: >> >>> Apache/2.4.7 (FreeBSD) >> >>> [http:trace4] http_filters.c(804):[client x.x.x.x:x] Last-Modified: >> >>> Sat, 23 Nov 2013 16:51:58 GMT >> >>> [http:trace4] http_filters.c(804):[client x.x.x.x:x] ETag: >> >>> \\"be-4ebdaf2ef2780\\" >> >>> [http:trace4] http_filters.c(804):[client x.x.x.x:x] Accept-Ranges: >> >>> bytes >> >>> [http:trace4] http_filters.c(804):[client x.x.x.x:x] >> Content-Length: 190 >> >>> [http:trace4] http_filters.c(804):[client x.x.x.x:x] Keep-Alive: >> >>> timeout=5, max=100 >> >>> [http:trace4] http_filters.c(804):[client x.x.x.x:x] Connection: >> >>> Keep-Alive >> >>> [http:trace4] http_filters.c(804):[client x.x.x.x:x] Content-Type: >> >>> text/html >> >>> [core:trace8] core_filters.c(576):[client x.x.x.x:x] brigade contains: >> >>> bytes: 284, non-file bytes: 284, eor buckets: 0, morphing buckets: 0 >> >>> [core:trace8] core_filters.c(576):[client x.x.x.x:x] brigade contains: >> >>> bytes: 474, non-file bytes: 284, eor buckets: 0, morphing buckets: 0 >> >>> [core:trace8] core_filters.c(576):[client x.x.x.x:x] brigade contains: >> >>> bytes: 474, non-file bytes: 284, eor buckets: 1, morphing buckets: 0 >> >>> >> >>> >> >>> This following lines are only seen if >> >>> apr-1-5.0 was build without IPv6 support >> >>> or apache24 was build with v4-mapping enabled >> >>> or "Listen $IP:$port" is given in httpd.conf >> >>> >> >>> [core:trace6] core_filters.c(526):[client x.x.x.x:x] will flush >> because >> >>> of FLUSH bucket >> >>> [core:trace8] core_filters.c(528):[client x.x.x.x:x] seen in brigade >> so >> >>> far: bytes: 474, non-file bytes: 284, eor buckets: 1, morphing >> buckets: 0 >> >>> [core:trace8] core_filters.c(555):[client x.x.x.x:x] flushing now >> >>> [core:trace8] core_filters.c(568):[client x.x.x.x:x] total bytes >> written: >> >>> 474 >> >>> [core:trace8] core_filters.c(576):[client x.x.x.x:x] brigade contains: >> >>> bytes: 0, non-file bytes: 0, eor buckets: 0, morphing buckets: 0 >> >>> >> >>> However a flush is triggered if the client cancels the request, but no >> >>> data is sent over the wire ... >> >>> >> >>> >> >>> I've searched if other also have seen a similar issue and found >> >>> instead the following interesting article from Nov. 2002 ;) >> >>> http://people.apache.org/~trawick/v4mapped.html >> >>> >> >>> >> >> I haven't analyzed the trace messages you showed above. Here's what I >> did >> >> on FreeBSD 9: >> >> >> >> Apply this patch to fix the version check for disabling v4mapped >> addresses: >> >> >> >> Index: configure.in >> >> =================================================================== >> >> --- configure.in (revision 1545127) >> >> +++ configure.in (working copy) >> >> @@ -774,7 +774,10 @@ >> >> ], >> >> [ >> >> case $host in >> >> - *freebsd5*|*netbsd*|*openbsd*) >> >> + *freebsd[1234].*) >> >> + v4mapped=yes >> >> + ;; >> >> + *freebsd*|*netbsd*|*openbsd*) >> >> v4mapped=no >> >> ;; >> >> *) >> >> >> >> That gives me >> >> >> >> $ bin/apachectl -V >> >> AH00558: httpd: Could not reliably determine the server's fully >> qualified >> >> domain name, using 127.0.0.1. Set the 'ServerName' directive globally >> to >> >> suppress this message >> >> Server version: Apache/2.4.8-dev (Unix) >> >> Server built: Nov 24 2013 20:21:30 >> >> Server's Module Magic Number: 20120211:27 >> >> Server loaded: APR 1.5.1-dev, APR-UTIL 1.5.4-dev >> >> Compiled using: APR 1.5.1-dev, APR-UTIL 1.5.4-dev >> >> Architecture: 64-bit >> >> Server MPM: worker >> >> threaded: yes (fixed thread count) >> >> forked: yes (variable process count) >> >> Server compiled with.... >> >> -D APR_HAS_SENDFILE >> >> -D APR_HAS_MMAP >> >> -D APR_HAVE_IPV6 (IPv4-mapped addresses disabled) >> >> >> >> The last line shown indicates that mapped addresses are disabled in >> this >> >> build. >> >> >> >> The only Listen I have is "Listen 8080". procstat says I have separate >> >> sockets, as expected: >> >> >> >> 60734 httpd 3 s - rw---n--- 9 0 TCP ::.8080 ::.0 >> >> 60734 httpd 4 s - rw---n--- 9 0 TCP 0.0.0.0:8080 >> >> 0.0.0.0:0 >> >> >> >> From netstat: >> >> >> >> tcp4 0 0 *.8080 *.* >> LISTEN >> >> tcp6 0 0 *.8080 *.* >> LISTEN >> >> >> >> (You had shown sockstat before; this is the same stuff.) >> >> >> >> --/-- >> >> >> >> Are you able to check for the issue on FreeBSD 9? >> >> >> >> When it hangs, are you connecting to the IPv4 listener or the IPv6 >> >> listener? Does it matter whether you use loopback or the address of a >> >> network interface? >> >> >> >> You're using a regular web browser in the failing case, right? Have >> you >> >> tried something as simple as netcat for the failing address/port? >> Example: >> >> >> >> $ echo "GET /" | nc 127.0.0.1 8080 >> >> <html><body><h1>It works!</h1></body></html> >> >> >> >> -- >> >> Born in Roswell... married an alien... >> >> http://emptyhammock.com/ >> >> >> > >> > I see the hang on FreeBSD 10 Beta 3 using the same sources, though >> clang is >> > installed instead of gcc. >> > >> > FWIW, the simple "GET /<EOF>" request works but an HTTP/1.1 request from >> > telnet or from Firefox hangs. I am able to try loopback over IPv4 and >> IPv6 >> > and they both hang. >> > >> > Now move the gcc binaries from FreeBSD 9 to FreeBSD 10 (along with >> > libpcre.0): I don't see the hang with the simple testcases that >> triggered >> > it with the clang builds. >> > >> > I wouldn't be surprised if your apr-1.5-as-trigger observation reflects >> > "accidents" like code moving around and/or changing in size. I didn't >> see >> > any interesting changes in the source that would affect the path where >> the >> > hang occurs. But who really knows :) >> > >> >> Playing with the code (damage included), moving 'flush_upto = next;' >> and the page loads. I still have some problems to understand how complete >> code is working (or even not). >> >> What is not clear to me why 'if (APR_BUCKET_IS_FLUSH(bucket)' is not >> triggered? >> > > It is for me at least during lingering close, and I go into the "if > (flush_upto != NULL) {" block as expected. > > From the start of a response to the simple index.html I see a HEAP bucket > (hdrs?), FILE bucket, EOS bucket, EOR bucket, and maybe something else I > glossed over, but no flush bucket. > > I think the issue may be that after generating the response for one > request, httpd thinks there is a pipelined request and instead of flushing > the first response it goes to read the next request but none is available. > > Yep, hung here: > > Breakpoint 1, check_pipeline (c=0x8026282b8) at http_request.c:221 > 221 if (c->keepalive != AP_CONN_CLOSE) { > Current language: auto; currently minimal > (gdb) p c->keepalive > $1 = AP_CONN_KEEPALIVE > (gdb) n > 223 apr_bucket_brigade *bb = apr_brigade_create(c->pool, > c->bucket_alloc); > (gdb) > 225 rv = ap_get_brigade(c->input_filters, bb, AP_MODE_SPECULATIVE, > (gdb) list > 220 { > 221 if (c->keepalive != AP_CONN_CLOSE) { > 222 apr_status_t rv; > 223 apr_bucket_brigade *bb = apr_brigade_create(c->pool, > c->bucket_alloc); > 224 > 225 rv = ap_get_brigade(c->input_filters, bb, AP_MODE_SPECULATIVE, > 226 APR_NONBLOCK_READ, 1); > 227 if (rv != APR_SUCCESS || APR_BRIGADE_EMPTY(bb)) { > 228 /* > 229 * Error or empty brigade: There is no data present in the > input > (gdb) n > > (HANGING in call at line 225) > Ha! See if this brings any happiness: Index: network_io/unix/sockets.c =================================================================== --- network_io/unix/sockets.c (revision 1545394) +++ network_io/unix/sockets.c (working copy) @@ -273,7 +273,7 @@ #endif /* TCP_NODELAY_INHERITED */ #if APR_O_NONBLOCK_INHERITED if (apr_is_option_set(sock, APR_SO_NONBLOCK) == 1) { - apr_set_option(*new, APR_SO_NONBLOCK, 1); + /* apr_set_option(*new, APR_SO_NONBLOCK, 1); */ } #endif /* APR_O_NONBLOCK_INHERITED */ There are some APR 1.5 autoconf changes to consider at http://svn.apache.org/viewvc/apr/apr/branches/1.5.x/build/apr_network.m4?view=log Also, be careful which apr and apr-util are getting loaded. httpd rpath has /usr/local/lib in front of the location of my apr and apr-util, and /usr/local/lib has alternate versions. (apachectl -V should show the same compiled-with/running-with for apr and apr-util.) > >> >> --- core_filters.c.orig >> +++ core_filters.c >> @@ -527,13 +527,14 @@ >> /* >> * Defer the actual blocking write to avoid doing many >> writes. >> */ >> - flush_upto = next; >> +// flush_upto = next; >> >> bytes_in_brigade = 0; >> non_file_bytes_in_brigade = 0; >> eor_buckets_in_brigade = 0; >> morphing_bucket_in_brigade = 0; >> } >> + flush_upto = next; >> } >> >> if (flush_upto != NULL) { >> >> -- >> olli >> > > > > -- > Born in Roswell... married an alien... > http://emptyhammock.com/ > -- Born in Roswell... married an alien... http://emptyhammock.com/