Bug#697682: libapache2-mod-perl2: Intermittent FTBFS: t/apache/read3.t failure
On 5 September 2014 20:06, Niko Tyni nt...@debian.org wrote: On Wed, Aug 13, 2014 at 11:47:58AM +0300, Niko Tyni wrote: On Wed, Aug 13, 2014 at 09:12:11AM +0100, Steve Hay wrote: Thanks for the patch. I haven't seen this failure myself on Windows, but the patch certainly doesn't seem to break anything. I'm a little uneasy about it, though. It seems odd to structure read3.pm differently to read2.pm and read4.pm, which still have a plan emitted before the first read(). Why does read3.pm deadlock but read2.pm and read4.pm don't? If it's just luck then perhaps read2.pm and read4.pm should be changed similarly? It only happens with read3 because the POSTed string in that test is very much larger than the other ones. LWP::Protocol::http sends the whole POST request in one write() unless it's over eight kilobytes long (see lib/LWP/Protocol/http.pm in libwww-perl_6.08 line 276.) I've updated the attached patch to change read2 too. read4 is quite different though, as it reads just a few bytes at a time, so I think it's OK to leave it as is. Maybe a comment in the code would be good too, since the structure of most test files is to get the plan out early. Otherwise there is scope for re-introducing the bug in the future since it isn't obvious that the placement of the plan is important. Done as well. Thanks, applied in revision 1624218. -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Bug#697682: libapache2-mod-perl2: Intermittent FTBFS: t/apache/read3.t failure
On Wed, Aug 13, 2014 at 11:47:58AM +0300, Niko Tyni wrote: On Wed, Aug 13, 2014 at 09:12:11AM +0100, Steve Hay wrote: Thanks for the patch. I haven't seen this failure myself on Windows, but the patch certainly doesn't seem to break anything. I'm a little uneasy about it, though. It seems odd to structure read3.pm differently to read2.pm and read4.pm, which still have a plan emitted before the first read(). Why does read3.pm deadlock but read2.pm and read4.pm don't? If it's just luck then perhaps read2.pm and read4.pm should be changed similarly? It only happens with read3 because the POSTed string in that test is very much larger than the other ones. LWP::Protocol::http sends the whole POST request in one write() unless it's over eight kilobytes long (see lib/LWP/Protocol/http.pm in libwww-perl_6.08 line 276.) I've updated the attached patch to change read2 too. read4 is quite different though, as it reads just a few bytes at a time, so I think it's OK to leave it as is. Maybe a comment in the code would be good too, since the structure of most test files is to get the plan out early. Otherwise there is scope for re-introducing the bug in the future since it isn't obvious that the placement of the plan is important. Done as well. Thanks again, -- Niko Tyni nt...@debian.org From 2eb936dd9c55e8e3500ff9c0277f9c93dcf357c6 Mon Sep 17 00:00:00 2001 From: Niko Tyni nt...@debian.org Date: Fri, 8 Aug 2014 17:22:33 + Subject: Don't answer anything to the client before it's done sending the POST data LWP::Protocol::http sends long POST requests with several write() calls. If it gets a response with a full HTTP header (up to the two newlines) before it's done with the later write() calls sending the actual POST data, it will stop writing. This leads to a deadlock and a test failure after timeout. The problem only shows up with large POSTED strings like the one in read3.pm, but update read2.pm too for correctness. Thread at http://mail-archives.apache.org/mod_mbox/perl-dev/201408.mbox/%3C20140809104131.GA3670@estella.local.invalid%3E Bug-Debian: https://bugs.debian.org/697682 --- t/response/TestApache/read2.pm | 6 -- t/response/TestApache/read3.pm | 6 -- 2 files changed, 8 insertions(+), 4 deletions(-) diff --git a/t/response/TestApache/read2.pm b/t/response/TestApache/read2.pm index 4307181..4cf7348 100644 --- a/t/response/TestApache/read2.pm +++ b/t/response/TestApache/read2.pm @@ -20,8 +20,6 @@ my $expected = foobar; sub handler { my $r = shift; -plan $r, tests = 1; - # test the case where the buffer to be filled has set magic # attached. which is the case when one passes an non-existing hash # entry value. it's not autovivified when passed to the function @@ -30,6 +28,10 @@ sub handler { my $data; my $len = $r-read($data-{buffer}, $r-headers_in-{'Content-Length'}); +# only print the plan out after reading to avoid chances of a deadlock +# see http://mail-archives.apache.org/mod_mbox/perl-dev/201408.mbox/%3C20140809104131.GA3670@estella.local.invalid%3E +plan $r, tests = 1; + ok t_cmp($data-{buffer}, $expected, reading into an autovivified hash entry); diff --git a/t/response/TestApache/read3.pm b/t/response/TestApache/read3.pm index 8ad9f26..b1522d5 100644 --- a/t/response/TestApache/read3.pm +++ b/t/response/TestApache/read3.pm @@ -20,8 +20,6 @@ my $expected = foobarx2000; sub handler { my $r = shift; -plan $r, tests = 1; - # test to read data up to end of file is signaled my $data = ''; my $where = 0; @@ -31,6 +29,10 @@ sub handler { $where += $len; } while ($len 0); +# only print the plan out after reading to avoid chances of a deadlock +# see http://mail-archives.apache.org/mod_mbox/perl-dev/201408.mbox/%3C20140809104131.GA3670@estella.local.invalid%3E +plan $r, tests = 1; + ok t_cmp($data, $expected, reading up to end of file); Apache2::Const::OK;
Bug#697682: libapache2-mod-perl2: Intermittent FTBFS: t/apache/read3.t failure
On 9 August 2014 11:41, Niko Tyni nt...@debian.org wrote: tag 697682 = patch thanks On Thu, Aug 07, 2014 at 04:32:08PM +0300, Niko Tyni wrote: On Sat, Mar 30, 2013 at 03:26:55PM +, Dominic Hargreaves wrote: This doesn't seem to be limited to s390; I've seen it in on i386 during perl 5.16 rebuilds. As a data point, t/apache/read3.t fails for me consistently on the kfreebsd-{i386,amd64} porter boxes (fischer and falla) but interestingly not on the kfreebsd buildds. The test is getting an internal server error, with this in the error log: [Thu Aug 07 13:30:28.768239 2014] [perl:error] [pid 81995:tid 34593212160] [client 127.0.0.1:12832] Apache2::RequestIO::read: (70007) The timeout specified has expired at /home/ntyni/libapache2-mod-perl2-2.0.8+httpd24-r1449661/t/response/TestApache/read3.pm line 30 As this was fully reproducible on the Debian/kFreeBSD boxes, I spent some time debugging it. [...] I'm unsure if there's a fault with the client (it gets an FD that's ready for writing from the first select() call but ignores it, calls select() again and blocks because the server has already called poll() on the same descriptor), but the deadlock can be fixed/worked around by making the server not send a response before the full POST request has been read. See the attached patch, which fixes this completely for me. I can also reproduce the issue on amd64 by running the test in a loop and putting some load on the host. The patch makes it go away there too. Cc'ing the modperl dev list. Please consider applying the patch. Thanks for the patch. I haven't seen this failure myself on Windows, but the patch certainly doesn't seem to break anything. I'm a little uneasy about it, though. It seems odd to structure read3.pm differently to read2.pm and read4.pm, which still have a plan emitted before the first read(). Why does read3.pm deadlock but read2.pm and read4.pm don't? If it's just luck then perhaps read2.pm and read4.pm should be changed similarly? Maybe a comment in the code would be good too, since the structure of most test files is to get the plan out early. Otherwise there is scope for re-introducing the bug in the future since it isn't obvious that the placement of the plan is important. -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Bug#697682: libapache2-mod-perl2: Intermittent FTBFS: t/apache/read3.t failure
Thanks for looking at the patches! On Wed, Aug 13, 2014 at 09:12:11AM +0100, Steve Hay wrote: Thanks for the patch. I haven't seen this failure myself on Windows, but the patch certainly doesn't seem to break anything. I'm a little uneasy about it, though. It seems odd to structure read3.pm differently to read2.pm and read4.pm, which still have a plan emitted before the first read(). Why does read3.pm deadlock but read2.pm and read4.pm don't? If it's just luck then perhaps read2.pm and read4.pm should be changed similarly? It only happens with read3 because the POSTed string in that test is very much larger than the other ones. LWP::Protocol::http sends the whole POST request in one write() unless it's over eight kilobytes long (see lib/LWP/Protocol/http.pm in libwww-perl_6.08 line 276.) From a correctness POV, I suppose the others should indeed do the same in the (IMO unlikely) case LWP::Protocol::http ever changes to do two writes even for such short requests. Maybe a comment in the code would be good too, since the structure of most test files is to get the plan out early. Otherwise there is scope for re-introducing the bug in the future since it isn't obvious that the placement of the plan is important. Yeah, I agree that would be good. I can send a revised patch later later. -- Niko Tyni nt...@debian.org -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Bug#697682: libapache2-mod-perl2: Intermittent FTBFS: t/apache/read3.t failure
tag 697682 = patch thanks On Thu, Aug 07, 2014 at 04:32:08PM +0300, Niko Tyni wrote: On Sat, Mar 30, 2013 at 03:26:55PM +, Dominic Hargreaves wrote: This doesn't seem to be limited to s390; I've seen it in on i386 during perl 5.16 rebuilds. As a data point, t/apache/read3.t fails for me consistently on the kfreebsd-{i386,amd64} porter boxes (fischer and falla) but interestingly not on the kfreebsd buildds. The test is getting an internal server error, with this in the error log: [Thu Aug 07 13:30:28.768239 2014] [perl:error] [pid 81995:tid 34593212160] [client 127.0.0.1:12832] Apache2::RequestIO::read: (70007) The timeout specified has expired at /home/ntyni/libapache2-mod-perl2-2.0.8+httpd24-r1449661/t/response/TestApache/read3.pm line 30 As this was fully reproducible on the Debian/kFreeBSD boxes, I spent some time debugging it. The test is about a client sending a 12000 byte POST request to the server, which reads it in a loop 100 bytes at a time and checks that it all came through. Starting the server with t/TEST -httpd_conf $(pwd)/debian/apache2.conf -one-process -start-httpd attaching ktrace to it ktrace -p $(pgrep apache2) -f ktrace.apache2.2000 and then running the client part under ktrace as well: ktrace -f ktrace.client.2000 -- perl -Iblib/lib -Iblib/arch t/apache/read3.t gives nice traces to inspect with kdump. The failure only happens when the POSTed string is large enough that it doesn't fit in the same write() call with the HTTP headers. So foobarx1300 doesn't exhibit the problem but foobarx2000 does. There seems to be a deadlock here: client writes the HTTP headers of the POST request but not the data yet server reads HTTP headers server writes HTTP OK + part of the body client calls select(), gets 2 so there are two descriptors ready (read+write) client reads HTTP OK + part of the body server tries to read the POST data but gets EAGAIN (no data yet) server calls poll() and blocks until there's something to read client calls select() again and blocks The client is using LWP::Protocol::http under the hood, and I see it indeed stops writing to the server when it gets a full HTTP header (up to two newlines.) This is around line 386 of lib/LWP/Protocol/http.pm in libwww-perl_6.08-1. Also, I note that it sends the whole POST request in one write() unless it's over eight kilobytes long (see line 276.) I'm unsure if there's a fault with the client (it gets an FD that's ready for writing from the first select() call but ignores it, calls select() again and blocks because the server has already called poll() on the same descriptor), but the deadlock can be fixed/worked around by making the server not send a response before the full POST request has been read. See the attached patch, which fixes this completely for me. I can also reproduce the issue on amd64 by running the test in a loop and putting some load on the host. The patch makes it go away there too. Cc'ing the modperl dev list. Please consider applying the patch. -- Niko Tyni nt...@debian.org From 2be468161de0b09df1eb10940abe531439d311e5 Mon Sep 17 00:00:00 2001 From: Niko Tyni nt...@debian.org Date: Fri, 8 Aug 2014 17:22:33 + Subject: [PATCH] Don't answer anything to the client before it's done sending the POST data LWP::Protocol::http sends long POST requests with several write() calls. If it gets a response with a full HTTP header (up to the two newlines) before it's done with the later write() calls sending the actual POST data, it will stop writing. This leads to a deadlock and a test failure after timeout. Bug-Debian: https://bugs.debian.org/697682 --- t/response/TestApache/read3.pm | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/t/response/TestApache/read3.pm b/t/response/TestApache/read3.pm index 8ad9f26..2994093 100644 --- a/t/response/TestApache/read3.pm +++ b/t/response/TestApache/read3.pm @@ -20,8 +20,6 @@ my $expected = foobarx2000; sub handler { my $r = shift; -plan $r, tests = 1; - # test to read data up to end of file is signaled my $data = ''; my $where = 0; @@ -31,6 +29,8 @@ sub handler { $where += $len; } while ($len 0); +plan $r, tests = 1; + ok t_cmp($data, $expected, reading up to end of file); Apache2::Const::OK; -- 2.1.0.rc1
Bug#697682: libapache2-mod-perl2: Intermittent FTBFS: t/apache/read3.t failure
On Sat, Mar 30, 2013 at 03:26:55PM +, Dominic Hargreaves wrote: Control: retitle -1 libapache2-mod-perl2: Intermittent FTBFS: t/apache/read3.t failure This doesn't seem to be limited to s390; I've seen it in on i386 during perl 5.16 rebuilds. As a data point, t/apache/read3.t fails for me consistently on the kfreebsd-{i386,amd64} porter boxes (fischer and falla) but interestingly not on the kfreebsd buildds. The test is getting an internal server error, with this in the error log: [Thu Aug 07 13:30:28.768239 2014] [perl:error] [pid 81995:tid 34593212160] [client 127.0.0.1:12832] Apache2::RequestIO::read: (70007) The timeout specified has expired at /home/ntyni/libapache2-mod-perl2-2.0.8+httpd24-r1449661/t/response/TestApache/read3.pm line 30 -- Niko Tyni nt...@debian.org -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org