On 06/14/2008 10:42 PM, William A. Rowe, Jr. wrote:
Guys, if anyone is looking at this, I'll hold off from tagging a bit longer,
as I'd rather have apr-1.3.1 address all the platform quirks we identified
in preparing 2.2.9 for release. But if I hear nothing, I'll have to
just move ahead :)
Bill
Paul Querna wrote:
On aurora.apache.org, shortly after installing the new version, we hit
a problem with apr_pollset_poll:
[Thu Jun 12 05:36:51 2008] [error] (70007)The timeout specified has
expired: apr_pollset_poll: (listen)
[Thu Jun 12 05:36:52 2008] [notice] caught SIGTERM, shutting down
If you look in worker.c, around line 687, you can see that we do a
graceful shutdown if we get an unexpected error from apr_pollset_poll.
This appears to be a regression caused by r641661:
https://svn.apache.org/viewvc?view=rev&revision=641661
Which was a fix for PR 42580:
https://issues.apache.org/bugzilla/show_bug.cgi?id=42580
This appears to be a relative edge case on Solaris 10 -- it hasn't
happened again, and although it is a regression in APR, it is relatively
small, so I am still +1 for httpd-2.2.9 shipping.
Is this really a regression in APR or were we just as lucky before as we
were after?
Code from httpd:

    rv = apr_pollset_poll(pollset, -1, &numdesc, &pdesc);
    if (rv != APR_SUCCESS) {
        if (APR_STATUS_IS_EINTR(rv)) {
            continue;
        }

        /* apr_pollset_poll() will only return errors in catastrophic
         * circumstances. Let's try exiting gracefully, for now. */
        ap_log_error(APLOG_MARK, APLOG_ERR, rv,
                     (const server_rec *) ap_server_conf,
                     "apr_pollset_poll: (listen)");
So we get the error message logged if apr_pollset_poll returns anything
other than APR_SUCCESS or APR_EINTR.

So let's have a look at r641661:
--- apr/apr/trunk/poll/unix/port.c 2008/03/27 00:31:21 641660
+++ apr/apr/trunk/poll/unix/port.c 2008/03/27 00:46:05 641661
@@ -295,12 +295,7 @@
     if (ret == -1) {
         (*num) = 0;
-        if (errno == ETIME || errno == EINTR) {
-            rv = APR_TIMEUP;
-        }
-        else {
-            rv = APR_EGENERAL;
-        }
+        rv = apr_get_netos_error();
     }
     else if (nget == 0) {
         rv = APR_TIMEUP;
So the code before said that if port_getn returns -1 (== fails) we return
APR_TIMEUP if the error is ETIME or EINTR, and APR_EGENERAL otherwise.
So IMHO the same error message would have been shown with the old code in
this case.
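
To make that comparison concrete, here is a minimal sketch of the two
mappings next to the listener check from worker.c. It does not use APR:
the SK_* constants and all function names are made up for illustration,
and it assumes that on Unix apr_get_netos_error() hands back errno
unchanged and that APR_STATUS_IS_EINTR(rv) amounts to rv == EINTR.

```c
#include <assert.h>
#include <errno.h>

#ifndef ETIME
#define ETIME 62 /* fallback for platforms without ETIME; value is a placeholder */
#endif

/* Stand-in status codes for this sketch only (not taken from apr_errno.h),
 * except that 70007 matches the code shown in the log line. */
enum { SK_SUCCESS = 0, SK_EGENERAL = 20014, SK_TIMEUP = 70007 };

/* Mapping before r641661, following the removed lines of the diff. */
int old_port_error_to_status(int err)
{
    if (err == ETIME || err == EINTR) {
        return SK_TIMEUP;
    }
    return SK_EGENERAL;
}

/* Mapping after r641661. Assumption: apr_get_netos_error() is plain errno. */
int new_port_error_to_status(int err)
{
    return err;
}

/* Mirrors the worker.c check: nonzero means the error would be logged
 * (and the graceful shutdown started) instead of just retrying. */
int listener_logs_error(int status)
{
    return status != SK_SUCCESS && status != EINTR;
}

/* Walks through the cases discussed in the mail. */
void check_old_and_new(void)
{
    /* A timeout errno reaches the error path with both versions. */
    assert(listener_logs_error(old_port_error_to_status(ETIME)));
    assert(listener_logs_error(new_port_error_to_status(ETIME)));

    /* EINTR was folded into a timeout before, so it was logged too;
     * afterwards it is passed through and the listener retries. */
    assert(listener_logs_error(old_port_error_to_status(EINTR)));
    assert(!listener_logs_error(new_port_error_to_status(EINTR)));
}
```

Under these assumptions a timeout coming back from port_getn ends up in
the error path with either version, which is why the revision alone does
not obviously explain the log entry.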
What is stranger to me is that we get a timeout error ((70007)The timeout
specified has expired: apr_pollset_poll:) even though we called
apr_pollset_poll with -1 as the timeout, which means wait indefinitely
(no timeout). The implementation of apr_pollset_poll seems to be correct,
as it ensures that we pass NULL to port_getn in this case. But OTOH the
man page for port_get / port_getn documents the timeout behaviour only for
port_get (setting the timeout parameter to NULL means no timeout), not for
port_getn. So couldn't this be a Solaris bug?
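
For reference, the timeout handling described above can be sketched
roughly as follows. This is an illustration, not the code from port.c:
the function name is made up, and it assumes the timeout is given in
microseconds (as apr_interval_time_t is), with a negative value meaning
"wait forever".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: turns a microsecond timeout into the pointer that
 * would be handed to port_getn. A negative timeout yields NULL, i.e. no
 * timeout at all; otherwise *storage is filled in and returned. */
struct timespec *timeout_to_timespec(int64_t timeout_us,
                                     struct timespec *storage)
{
    assert(storage != NULL);

    if (timeout_us < 0) {
        return NULL; /* block indefinitely */
    }

    storage->tv_sec = (time_t)(timeout_us / 1000000);
    storage->tv_nsec = (long)(timeout_us % 1000000) * 1000L;
    return storage;
}
```

With -1 as input this returns NULL, so if port_getn still reports ETIME
in that situation, the timeout did not come from the caller.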
Regards
Rüdiger