Re: Strange continous SEVERE error after upgrading to APR 1.3, Critical poller failure, Timer expired

2008-08-26 Thread Jimi Hullegård
Hi,

I just noticed that version 1.3.3 is out, but I can't see this fix in the 
release notes. Is the fix for this bug included in 1.3.3?

Regards
/Jimi

mogul | jimi hullegård | system developer | hudiksvallsgatan 4, 113 30 
stockholm sweden | +46 8 506 66 172 | +46 765 27 19 55 | [EMAIL PROTECTED] | 
www.mogul.com


 -----Original Message-----
 From: Nick Kew [mailto:[EMAIL PROTECTED]
 Sent: den 26 juni 2008 12:50
 To: dev@apr.apache.org
 Subject: Re: Strange continous SEVERE error after
 upgrading to APR 1.3, Critical poller failure, Timer expired
 - Bayesian Filter detected spam

 On Thu, 26 Jun 2008 10:10:48 +0200
 Jimi Hullegård [EMAIL PROTECTED] wrote:

  By the way, does it look like this patch (or some variant, as
  mentioned in other postings in this thread) will be included in a
  future release (like 1.3.3)?

 Yes.  Your confirmation of the fix was what we were waiting for.
 Now we just have to decide which of the two variants to go with!

 --
 Nick Kew



Re: Strange continous SEVERE error after upgrading to APR 1.3, Critical poller failure, Timer expired

2008-06-27 Thread Jim Jagielski


On Jun 26, 2008, at 7:15 AM, Nick Kew wrote:

 On Wed, 25 Jun 2008 21:43:47 +0200
 Ruediger Pluem [EMAIL PROTECTED] wrote:

  Looks good and makes sense.
  apr_get_netos_error() in case of ETIME = 720062
  APR_TIMEUP = 70007

  So they are different.

 which is, in itself, a little disturbing.

  But I would do

  rv = apr_get_netos_error();

  instead of

  rv = APR_EGENERAL;

 Works for me in terms of fixing this bug.  But looking at the
 history of this bug (we had a much smaller bug, had a fix,
 then introduced this bug by switching to apr_get_netos_error
 because it felt better), I'd be wary of repeating the same
 thought process without more thorough analysis and testing.

 How about APR_EGENERAL for 1.3/release, and apr_get_netos_error
 for trunk, where it can hopefully get some exposure?

+1



RE: Strange continous SEVERE error after upgrading to APR 1.3, Critical poller failure, Timer expired

2008-06-26 Thread Jimi Hullegård
Nick Kew wrote:

 OK, I've just tested APR variants with Event MPM.

 It is indeed that fix that breaks it.  When I revert
 the fix, the problem goes away.

 The original patch posted to fix the same bug works fine.
 Can I suggest you apply the following patch to 1.3.x
 (should work on 1.3.0 or 1.3.2 too) and let us know if
 it fixes the problem you're seeing?

 Index: poll/unix/port.c
 ===================================================================
 --- poll/unix/port.c (revision 662044)
 +++ poll/unix/port.c (working copy)
 @@ -315,7 +315,15 @@

      if (ret == -1) {
          (*num) = 0;
 -        rv = apr_get_netos_error();
 +        if (errno == EINTR) {
 +            rv = APR_EINTR;
 +        }
 +        else if (errno == ETIME) {
 +            rv = APR_TIMEUP;
 +        }
 +        else {
 +            rv = APR_EGENERAL;
 +        }
      }
      else if (nget == 0) {
          rv = APR_TIMEUP;


We have now applied this patch to 1.3.2 and it solved the problem, i.e. there
are no more errors about Critical poller failure in the log file. Thanks!

By the way, does it look like this patch (or some variant, as mentioned in 
other postings in this thread) will be included in a future release (like 
1.3.3)?

Regards
/Jimi


Re: Strange continous SEVERE error after upgrading to APR 1.3, Critical poller failure, Timer expired

2008-06-26 Thread Nick Kew
On Thu, 26 Jun 2008 10:10:48 +0200
Jimi Hullegård [EMAIL PROTECTED] wrote:

 By the way, does it look like this patch (or some variant, as
 mentioned in other postings in this thread) will be included in a
 future release (like 1.3.3)?

Yes.  Your confirmation of the fix was what we were waiting for.
Now we just have to decide which of the two variants to go with!

-- 
Nick Kew


Re: Strange continous SEVERE error after upgrading to APR 1.3, Critical poller failure, Timer expired

2008-06-26 Thread Nick Kew
On Wed, 25 Jun 2008 21:43:47 +0200
Ruediger Pluem [EMAIL PROTECTED] wrote:


 Looks good and makes sense.
 apr_get_netos_error() in case of ETIME = 720062
 APR_TIMEUP = 70007
 
 So they are different.

which is, in itself, a little disturbing.

 But I would do
 
 rv = apr_get_netos_error();
 
 instead of
 
 rv = APR_EGENERAL;

Works for me in terms of fixing this bug.  But looking at the
history of this bug (we had a much smaller bug, had a fix,
then introduced this bug by switching to apr_get_netos_error
because it felt better), I'd be wary of repeating the same
thought process without more thorough analysis and testing.

How about APR_EGENERAL for 1.3/release, and apr_get_netos_error
for trunk, where it can hopefully get some exposure?

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/


Re: Strange continous SEVERE error after upgrading to APR 1.3, Critical poller failure, Timer expired

2008-06-26 Thread Ruediger Pluem



On 06/26/2008 01:15 PM, Nick Kew wrote:

 Works for me in terms of fixing this bug.  But looking at the
 history of this bug (we had a much smaller bug, had a fix,
 then introduced this bug by switching to apr_get_netos_error
 because it felt better), I'd be wary of repeating the same
 thought process without more thorough analysis and testing.

 How about APR_EGENERAL for 1.3/release, and apr_get_netos_error
 for trunk, where it can hopefully get some exposure?


+1. Do you commit or should I?

Regards

Rüdiger






Re: Strange continous SEVERE error after upgrading to APR 1.3, Critical poller failure, Timer expired

2008-06-25 Thread Nick Kew
On Wed, 25 Jun 2008 10:32:58 +0200
Jimi Hullegård [EMAIL PROTECTED] wrote:

 Hi,
 
 After we upgraded APR from 1.2.8 to 1.3.0 we get a strange error in
 tomcat's catalina.out that we have never seen before:
 
 Jun 24, 2008 11:42:09 AM
 org.apache.tomcat.util.net.AprEndpoint$Poller run SEVERE: Critical
 poller failure (restarting poller): [62] Timer expired

Could this be related to
http://marc.info/?l=apr-dev&m=121347872322492&w=2

I had something similar with httpd+Event MPM on solaris.
Current workaround is to use Worker MPM.

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/


RE: Strange continous SEVERE error after upgrading to APR 1.3, Critical poller failure, Timer expired

2008-06-25 Thread Jimi Hullegård
 -----Original Message-----
 From: Nick Kew [mailto:[EMAIL PROTECTED]
 Sent: den 25 juni 2008 13:44
 To: dev@apr.apache.org
 Subject: Re: Strange continous SEVERE error after upgrading
 to APR 1.3, Critical poller failure, Timer expired

 On Wed, 25 Jun 2008 10:32:58 +0200
 Jimi Hullegård [EMAIL PROTECTED] wrote:

  Hi,
 
  After we upgraded APR from 1.2.8 to 1.3.0 we get a strange error in
  tomcat's catalina.out that we have never seen before:
 
  Jun 24, 2008 11:42:09 AM
  org.apache.tomcat.util.net.AprEndpoint$Poller run SEVERE: Critical
  poller failure (restarting poller): [62] Timer expired

 Could this be related to
 http://marc.info/?l=apr-dev&m=121347872322492&w=2

 I had something similar with httpd+Event MPM on solaris.
 Current workaround is to use Worker MPM.

I'm not quite sure I follow you now... Worker MPM is for Apache Httpd, right? 
We don't use httpd. Do you know of any workaround for Tomcat?

/Jimi


Re: Strange continous SEVERE error after upgrading to APR 1.3, Critical poller failure, Timer expired

2008-06-25 Thread Nick Kew

On Wed, 2008-06-25 at 12:43, Nick Kew wrote:
 On Wed, 25 Jun 2008 10:32:58 +0200
 Jimi Hullegård [EMAIL PROTECTED] wrote:
 
  Hi,
  
  After we upgraded APR from 1.2.8 to 1.3.0 we get a strange error in
  tomcat's catalina.out that we have never seen before:
  
  Jun 24, 2008 11:42:09 AM
  org.apache.tomcat.util.net.AprEndpoint$Poller run SEVERE: Critical
  poller failure (restarting poller): [62] Timer expired
 
 Could this be related to
 http://marc.info/?l=apr-dev&m=121347872322492&w=2
 
 I had something similar with httpd+Event MPM on solaris.
 Current workaround is to use Worker MPM.

OK, I've just tested APR variants with Event MPM.

It is indeed that fix that breaks it.  When I revert
the fix, the problem goes away.

The original patch posted to fix the same bug works fine.
Can I suggest you apply the following patch to 1.3.x
(should work on 1.3.0 or 1.3.2 too) and let us know if
it fixes the problem you're seeing?

Index: poll/unix/port.c
===================================================================
--- poll/unix/port.c (revision 662044)
+++ poll/unix/port.c (working copy)
@@ -315,7 +315,15 @@

     if (ret == -1) {
         (*num) = 0;
-        rv = apr_get_netos_error();
+        if (errno == EINTR) {
+            rv = APR_EINTR;
+        }
+        else if (errno == ETIME) {
+            rv = APR_TIMEUP;
+        }
+        else {
+            rv = APR_EGENERAL;
+        }
     }
     else if (nget == 0) {
         rv = APR_TIMEUP;


-- 
Nick Kew
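
For readers following along, the branch that the patch above adds can be
exercised outside APR. This is only a minimal sketch, not the APR source:
the SKETCH_* constants and the function name are hypothetical stand-ins
(only the 70007 figure for APR_TIMEUP is quoted in this thread), and real
code would use apr_status_t and the constants from apr_errno.h instead.

```c
/* Standalone sketch of the errno mapping introduced by the patch.
 * All SKETCH_* names are stand-ins invented for this illustration. */
#include <errno.h>

#ifndef ETIME
#define ETIME 62   /* assumption: fallback value for platforms lacking ETIME */
#endif

enum {
    SKETCH_EINTR    = 1,      /* stand-in for APR_EINTR */
    SKETCH_TIMEUP   = 70007,  /* value quoted for APR_TIMEUP in this thread */
    SKETCH_EGENERAL = 2       /* stand-in for APR_EGENERAL */
};

/* Translate the errno left behind by a failed poll call into a
 * portable status code, mirroring the three branches of the patch. */
int sketch_map_poll_errno(int err)
{
    if (err == EINTR) {
        return SKETCH_EINTR;      /* interrupted: caller may simply retry */
    }
    else if (err == ETIME) {
        return SKETCH_TIMEUP;     /* timer expired: an ordinary timeout */
    }
    return SKETCH_EGENERAL;       /* anything else: generic failure */
}
```

With the explicit ETIME branch, an expired timer is reported with the same
portable code as the `nget == 0` case, so a caller needs only one timeout
check.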



RE: Strange continous SEVERE error after upgrading to APR 1.3, Critical poller failure, Timer expired

2008-06-25 Thread Jimi Hullegård
Nick Kew wrote:

 On Wed, 2008-06-25 at 12:43, Nick Kew wrote:
 
  Could this be related to
  http://marc.info/?l=apr-dev&m=121347872322492&w=2
 
  I had something similar with httpd+Event MPM on solaris.
  Current workaround is to use Worker MPM.

 OK, I've just tested APR variants with Event MPM.

 It is indeed that fix that breaks it.  When I revert
 the fix, the problem goes away.

 The original patch posted to fix the same bug works fine.
 Can I suggest you apply the following patch to 1.3.x
 (should work on 1.3.0 or 1.3.2 too) and let us know if
 it fixes the problem you're seeing?

 [patch code snipped]

OK, I have forwarded this to some people who have access to the system right
now (I am currently at a different office with no remote access). I hope they
have time to test this soon.

/Jimi


Re: Strange continous SEVERE error after upgrading to APR 1.3, Critical poller failure, Timer expired

2008-06-25 Thread Ruediger Pluem



On 06/25/2008 02:55 PM, Nick Kew wrote:

 On Wed, 2008-06-25 at 12:43, Nick Kew wrote:
  On Wed, 25 Jun 2008 10:32:58 +0200
  Jimi Hullegård [EMAIL PROTECTED] wrote:

   Hi,

   After we upgraded APR from 1.2.8 to 1.3.0 we get a strange error in
   tomcat's catalina.out that we have never seen before:

   Jun 24, 2008 11:42:09 AM
   org.apache.tomcat.util.net.AprEndpoint$Poller run SEVERE: Critical
   poller failure (restarting poller): [62] Timer expired

  Could this be related to
  http://marc.info/?l=apr-dev&m=121347872322492&w=2

  I had something similar with httpd+Event MPM on solaris.
  Current workaround is to use Worker MPM.

 OK, I've just tested APR variants with Event MPM.

 It is indeed that fix that breaks it.  When I revert
 the fix, the problem goes away.

 The original patch posted to fix the same bug works fine.
 Can I suggest you apply the following patch to 1.3.x
 (should work on 1.3.0 or 1.3.2 too) and let us know if
 it fixes the problem you're seeing?

 Index: poll/unix/port.c
 ===================================================================
 --- poll/unix/port.c (revision 662044)
 +++ poll/unix/port.c (working copy)
 @@ -315,7 +315,15 @@

      if (ret == -1) {
          (*num) = 0;
 -        rv = apr_get_netos_error();
 +        if (errno == EINTR) {
 +            rv = APR_EINTR;
 +        }
 +        else if (errno == ETIME) {
 +            rv = APR_TIMEUP;
 +        }
 +        else {
 +            rv = APR_EGENERAL;
 +        }
      }
      else if (nget == 0) {
          rv = APR_TIMEUP;




Looks good and makes sense.
apr_get_netos_error() in case of ETIME = 720062
APR_TIMEUP = 70007

So they are different.

But I would do

rv = apr_get_netos_error();

instead of

rv = APR_EGENERAL;

Regards

Rüdiger
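
The mismatch between the two figures quoted above can be illustrated with a
small sketch. It assumes that 720062 is a system-error base of 720000 plus
the OS value 62 seen in the Tomcat log ("[62] Timer expired"); that
decomposition and all DEMO_* and demo_* names are assumptions made for
illustration, not something stated in the thread.

```c
/* Sketch of why a raw OS error code fails a portable timeout check.
 * The constants mirror the two figures quoted in the thread; the
 * 720000 + 62 split is an assumption of this sketch. */
enum {
    DEMO_TIMEUP      = 70007,   /* APR_TIMEUP as quoted in the thread */
    DEMO_SYSERR_BASE = 720000,  /* assumed offset for system error codes */
    DEMO_OS_ETIME    = 62       /* OS-level "Timer expired" from the log */
};

/* A caller that accepts only the portable timeout code as benign. */
int demo_is_benign_timeout(int rv)
{
    return rv == DEMO_TIMEUP;
}

/* The code an unmapped timer expiry would carry under the assumption above. */
int demo_raw_etime(void)
{
    return DEMO_SYSERR_BASE + DEMO_OS_ETIME;
}
```

A caller written this way sees the unmapped value as an unexpected error
rather than a timeout, which would fit the "Critical poller failure"
symptom reported at the start of the thread.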





Re: Strange continous SEVERE error after upgrading to APR 1.3, Critical poller failure, Timer expired

2008-06-25 Thread William A. Rowe, Jr.

Ruediger Pluem wrote:


 On 06/25/2008 02:55 PM, Nick Kew wrote:

  OK, I've just tested APR variants with Event MPM.

  It is indeed that fix that breaks it.  When I revert
  the fix, the problem goes away.

  The original patch posted to fix the same bug works fine.
  Can I suggest you apply the following patch to 1.3.x
  (should work on 1.3.0 or 1.3.2 too) and let us know if
  it fixes the problem you're seeing?

  Index: poll/unix/port.c
  ===================================================================
  --- poll/unix/port.c (revision 662044)
  +++ poll/unix/port.c (working copy)
  @@ -315,7 +315,15 @@

       if (ret == -1) {
           (*num) = 0;
  -        rv = apr_get_netos_error();
  +        if (errno == EINTR) {
  +            rv = APR_EINTR;
  +        }
  +        else if (errno == ETIME) {
  +            rv = APR_TIMEUP;
  +        }
  +        else {
  +            rv = APR_EGENERAL;
  +        }
       }
       else if (nget == 0) {
           rv = APR_TIMEUP;

 Looks good and makes sense.
 apr_get_netos_error() in case of ETIME = 720062
 APR_TIMEUP = 70007

 So they are different.

 But I would do

 rv = apr_get_netos_error();

 instead of

 rv = APR_EGENERAL;



Good observations.  Folks, you want to fix these regressions before
we TR 1.3.3?

Bill