DO NOT REPLY [Bug 10266] - apache hangs after some hours of running

2002-07-09 Thread bugzilla
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=10266

apache hangs after some hours of running





--- Additional Comments From [EMAIL PROTECTED]  2002-07-09 21:57 ---
I too am seeing this error. We are running the latest apache (2.0.39) on a
SUN running Solaris 8.  APACHE runs fine for about 24 to 48 hours and then
seems to hang. No load is visible (i.e., there is not a high load situation)
and everything appears normal. Connections to port 80 are accepted, but any
request itself simply hangs (e.g., telnet to port 80 and issuing a GET /
simply hangs).

Killing and restarting apache solves the problem for about 24 to 48 hours.
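
The symptom above (the connection is accepted, but the request never gets an
answer) can be told apart from a server that is simply down with a small
probe. The following is a minimal Python sketch under stated assumptions, not
the tooling used by anyone in this report: the function name probe_http, the
returned status strings, and the 5 second timeout are all made up for
illustration.

```python
import socket


def probe_http(host, port=80, timeout=5.0):
    """Classify a server as 'ok', 'hung', 'closed' or 'down'.

    'hung' means the TCP connection was accepted but no reply arrived
    within the timeout, which is the behaviour described in this report.
    All names and status strings here are hypothetical.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "down"  # refused, unreachable, or the connect itself timed out
    try:
        sock.settimeout(timeout)
        sock.sendall(b"GET / HTTP/1.0\r\n\r\n")
        data = sock.recv(1024)  # blocks until data, EOF, or the timeout
        return "ok" if data else "closed"
    except socket.timeout:
        return "hung"  # connected, request sent, nothing came back
    except OSError:
        return "down"
    finally:
        sock.close()
```

A check such as probe_http("www.example.com") (a placeholder host name) would
be expected to report "hung" for a server in the state described above, and
"down" only if the connection itself fails.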

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



--- Additional Comments From [EMAIL PROTECTED]  2002-07-09 22:06 ---
We've had some other vague reports of this kind of behavior, but none of us 
have seen it that I know of.  Can you please take one of the misbehaving child 
processes and attach to it with gdb and give us a backtrace?  An strace might 
be helpful as well.  See http://httpd.apache.org/dev/debugging.html 
 
Thanks!

--- Additional Comments From [EMAIL PROTECTED]  2002-07-09 22:40 ---
You just sent me email asking for a gdb backtrace of a misbehaving child.  I
will attempt to do that the next time we see one hang (probably 48 hours from
now or so)... though it may be difficult.  Here are a couple of other
observations that might help:

We launch 3 apaches (two dedicated to particular clients, and one dedicated
to many clients).  Of the 3, we ONLY see ONE of them consistently hanging...
that is, the one dedicated to many clients.  The other interesting thing
about that particular apache is that it also handles our secure server - and
since the rest never hang, it may be related to that. In fact, when the one
hangs, the other two are fine and keep responding (which is to be expected).

However, I will attempt to do a dump of a misbehaving child next time I see
the problem.

--- Additional Comments From [EMAIL PROTECTED]  2002-07-13 17:51 ---
For those of you who are seeing this: what other non-standard configuration
directives do you have? Anything with Proxy?

After looking at the trace of a hung server, I see it's making a call to
apr_connect() from inside idle_server_maintenance. The server does
this to knock a child process out of a call to poll() so that it'll notice that
it'd been signaled to die. I bet the frequency at which this occurs will most
likely be tied directly to that maintenance period.

--- Additional Comments From [EMAIL PROTECTED]  2002-07-14 02:55 ---
1 more question for you guys..
are you using IP based virtual hosting or name based virtual hosting?

also .. can you try increasing the MaxRequestsPerChild.
this should probably be in the range of 100-1000 or 0 for prefork.

MaxRequestsPerChild says 'serve X requests' and then die. serving only 5
requests and then killing itself is really a bad thing IMHO

--ian
(we have this set to zero with the worker MPM on production on a heavily loaded
site running an older version of apache 2 with none of these problems)

--- Additional Comments From [EMAIL PROTECTED]  2002-07-16 20:29 ---
I have sent my config file, concerning this bug, to a number of people who
directly responded to me from apache.org. However, I will also post the
following comments to keep this bug site up to date:

1) I was wrong about it happening every 24 to 48 hours. The other day, to be
preemptive since I was leaving the office for a few hours (and didn't want it
to hang), I killed and restarted the servers. Within about 5 minutes or so,
one of the three apaches hung (I submitted a ptrace to apache.org with the
results).  So this is not a 24 to 48 hour thing; instead, it appears that it
can happen at any time.

2) As per the comment about MaxRequestsPerChild... our config is, and always
has been, set at 1000 for this. So I don't think that is involved.

3) We have seen situations where it hangs, and then after several hundred
seconds, recovers on its own.

4) It is hanging, not on the socket open, but on the fetch.  For example,
when the server is in a hung state, I can telnet to port 80 on the server and
it states that I am connected and allows me to type. However, entering
"GET /" followed by a return simply hangs. So this appears to be a hang on
the read.

5) Of the 3 apaches we run (on a single server) this only happens to one
where we have a lot of virtual hosts, and also have SSL running (don't know
if there's a connection). The other two have never hung.

6) For all those suggesting config file changes... please note that between
the last apache version and this, we have made no changes to the config
(except for the SSL logging directives which we merely commented out in the
new config).  Thus, I feel that this relates to a change in the apache code
layer as opposed to something wrong in the config itself (not saying that a
config change won't fix it, but something *broke* between the last apache and
this apache since our config didn't change).

Aloha!

--- Additional Comments From [EMAIL PROTECTED]  2002-07-17 16:21 ---
What is your ulimit -a when apache is started, and approx. how many vhosts
do you have?

FWIW, Ian was referring to the original reporter's configuration, where
MaxRequestsPerChild was set to "5", which means that a child process
will be shut down and a new one spawned after only 5 requests to that child.
(5 is way too low a value for this setting; something on the order of 1000
is recommended, or set it to 0 for unlimited.)
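
To put rough numbers on why 5 is so low: if every child exits after serving
MaxRequestsPerChild requests, then on average children exit (and have to be
replaced) at the request rate divided by that limit. The sketch below is only
back-of-the-envelope arithmetic; the function name is hypothetical and the
figure of 50 requests per second is an assumed load, not a number from this
report.

```python
def child_exits_per_second(requests_per_second, max_requests_per_child):
    """Average rate at which children exit because they hit the request limit."""
    if max_requests_per_child == 0:
        return 0.0  # 0 means children are never recycled on request count
    return requests_per_second / max_requests_per_child


# Assumed load of 50 requests/second, purely for illustration.
for limit in (5, 1000, 0):
    print(limit, child_exits_per_second(50, limit))
```

Under that assumed load, a limit of 5 means about 10 child exits per second,
while a limit of 1000 means one exit roughly every 20 seconds.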

--- Additional Comments From [EMAIL PROTECTED]  2002-07-17 16:28 ---
Additional Comments From aaron ---
What is your ulimit -a when apache is started, and approx. how many vhosts
do you have?

Aaron... my ulimit is: ulimit -n 1024
There are 105 virtual host records.

--- Additional Comments From [EMAIL PROTECTED]  2002-07-22 20:42 ---
**UPDATE**

This last weekend (Saturday morning) I changed our apache configuration on
the web server that is hanging.  We removed the SSL serving and put it up on
its own httpd (so now we are running a total of 4 httpd's on this machine).

This morning we had another hang... it was in the same httpd as before - so
we now know that this is NOT related to SSL - since it was separated from the
pack and did not hang when the other hung. So now, on a single hardware
server, with four httpd's running (independent), we are still seeing the one
with the most virtual hosts hanging; the others with either SSL or minimal
virtual hosts have never hung.

I may, over the next week, attempt to split the hanging httpd.conf down the
middle (dividing in half what each is handling) and see if we can further
isolate it.

Wish you guys could reproduce this so we could fix it ;))

--- Additional Comments From [EMAIL PROTECTED]  2002-07-24 02:27 ---
I'll mention this, in case it is significant...

We have a program that remotely monitors to see if our servers are up. The
program also monitors apache.  It senses the apache hanging, but no matter
what I do to the code, setting a SIGNAL and an ALARM in all of my socket code
(open, read/write and close all have alarms around them), it MOSTLY will not
catch it.  In other words, the call to read from the socket blocks and an
ALARM does not break it out of the read.

Additionally, in support of this, I also noted that a HUP to the parent
process (apache) does not free it up.  Additionally, if I TELNET to port 80
of the hung process IP address, I get a connection from apache, but issuing a
"GET /" causes a hang (as I previously reported)... HOWEVER, trying to break
that hang with a CONTROL-C or other signal, OTHER than a kill, also fails to
free the hang.

So, it seems to me that other than a kill signal, when these are hanging
other signals are also refused (e.g., alarm, control-c (int)).

--- Additional Comments From [EMAIL PROTECTED]  2002-07-30 15:14 ---
Ok...  The good news? We've fixed the problem on our end. The bad news? We
have no idea what the problem actually is.

My last report to you stated that we separated SSL out of our config and put
it in its own separate apache config (that meant we were running 4 total
apaches).

Last Thursday we took the apache config that was still hanging and split it
right down the middle, putting half the clients into a new (identical)
configuration.  We also noted that a couple of the configurations were
sharing the same error/hit log files - which probably isn't the coolest
thing - so we fixed that by making sure each config was writing to different
log files.

So this means we have 5 apaches: 2 that are specific to two clients, a 3rd
which is SSL only, and the remaining two which split our remaining
hundred-and-so virtuals down the middle.

Since Thursday we have seen no hanging situations whatsoever.  Since we were
seeing a hang ALWAYS within 48 hours, we suspect the problem is solved (for
us at least).

The solution thus EITHER has to do with the fact that multiple configs were
using some of the same log files (unlikely) or that the one config had so
many virtual hosts in it (likely).  Keeping in mind that we USED to run with
just one config with all of them in it, under a previous version of Apache,
this seems to me to be something that changed in the code base, which we
sidestepped when I split the remaining config down the middle (both config
files are IDENTICAL in all respects except the name of the log files and the
actual virtual hosts involved).

FYI, the old config (prior to the split down the middle) had 119 virtual host
statements in it. The new split configs have 55 in one and 64 in the other.
With 119, we had the hang... with 55 / 64 we do not experience the hang.

The ball is now in your court ;)

Aloha

--- Additional Comments From [EMAIL PROTECTED]  2002-08-12 15:40 ---
Ok... now, we have been running the final split of apache that I discussed in
my last posting for 2 weeks, no hang. However, two days ago it DID again hang
(just one of the 5 apaches we are running).  After restarting, they ran fine
for 3 days and then another hang.

So... so far we are still experiencing it... and it still appears to be
highly random as to when/why it occurs.

--- Additional Comments From [EMAIL PROTECTED]  2002-08-15 15:13 ---
Another clue to the problem:

After I split the system into 5 apaches running, we had no hangs for 2
weeks. Then, Saturday (as I previously reported) we had a single hang of one
of the servers.  No problems the rest of the weekend, then this last Monday
morning all he** broke loose.

Monday morning saw the system hang, and then go nuts basically. We would kill
and restart, and it would hang within 15 seconds, 15 minutes or about 1 hour
- depending on the restart (i.e., throughout the day it would hang; the
longest it went without hanging was 1 hour).  Not only would the apache we
had suspected of hanging hang, but also the other that we split, that had not
hung for 2 weeks.

By Monday night (10:30 PM Hawaii time, so pretty late) it was back to not
hanging.

So what happened Monday?  I posted a URL to a slashdot.org column and we got
a lot of hits because of it. I GREATLY suspect that the number of hits
contributed to the hang. The INTERESTING thing is that of the two apaches
that were hanging, one had the domain being slashdotted in it, the other
didn't (btw, the server was able to serve up the pages with no problem... it
was just a lot of hits but no major load or anything).

So... this leads me to believe that the problem is related to traffic.  It is
possible that it is related to the restarting of a child after a maximum
number of hits.

I also discovered that my earlier reports were untrue... in this regard:

1)  I reported earlier that HUPing the hung server did no good. This is not
true. HUPing it appears to work. It takes up to a minute to free up - and
sometimes requires a second HUP before it frees up. ONLY OCCASIONALLY after
two HUP attempts would it not free up and we had to kill/restart.

2) I reported earlier that my remote alarms that try to sense it would also
hang on the open and not recover. While this is true, it was due to me using
SIGNAL() instead of SIGACTION() (signal() automatically sets the restart flag
to tell the socket commands to retry the operation after the signal... thus
it *appeared* to be always stuck).  So... I am able to sense it (we've since
written a program to sense it on the server and automatically rehup or
restart the server depending on what it sees).
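
The signal() versus sigaction() point in item 2 has a close analogue in
Python, used here purely for illustration (the language of the monitor
program is not stated in this report). Since Python 3.5, a blocking call that
is interrupted by a signal is retried automatically if the handler returns
normally, so an alarm only breaks a stuck read when the handler raises an
exception. The names read_with_alarm and ReadTimeout below are hypothetical,
and the sketch assumes a Unix system and that it runs in the main thread.

```python
import signal
import socket


class ReadTimeout(Exception):
    """Raised by the alarm handler to abort a blocked read (hypothetical name)."""


def _on_alarm(signum, frame):
    # Raising is what breaks the blocked recv(). If this handler just
    # returned, the interrupted call would be restarted and the read would
    # appear to stay stuck, much like the restart-flag behaviour above.
    raise ReadTimeout()


def read_with_alarm(sock, seconds, nbytes=1024):
    """Return data from sock, or None if nothing arrives within `seconds`."""
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    try:
        signal.setitimer(signal.ITIMER_REAL, seconds)
        return sock.recv(nbytes)
    except ReadTimeout:
        return None
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, old_handler)
```

In most new code a plain socket timeout (sock.settimeout) is simpler than an
alarm; the alarm form is shown only because it mirrors the approach described
above.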


So... all of this leads me to believe it's the rollover (restart) of the
children.  Note that if one hangs, all other virtuals on the same server also
hang (i.e., no virtual assigned to the stuck server will respond until we
rehup it).

The only other possibility, I think, is some type of exploit that hangs
apache in this way... but I think that is remote.

One last thing...  when we were having the problems on Monday, I tried to
roll back apache to version httpd-2.0.36  -- but the same problem occurred,
so I brought the version back to httpd-2.0.40.  (So this problem appears to
be in all versions SINCE and INCLUDING 2.0.36).

--- Additional Comments From [EMAIL PROTECTED]  2002-10-17 02:34 ---
[This is a mass bug update.]
This bug reports a problem in an older version of Apache 2.
Could you please update to the most recent version and see
if you can reproduce this problem.  If the bug still exists,
please update the bug with the latest version number.  If 
the bug no longer exists, please close the bug report.

Sorry for this impersonal response, but we get many more bug
reports than our volunteers can keep up with.
Thanks for using Apache!


[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID



--- Additional Comments From [EMAIL PROTECTED]  2002-11-02 20:18 ---
[This is a mass bug update.] [Resolve-20021102]
No response from submitter; assuming issue is resolved.
If the problem still exists in the latest version,
please reopen this report and update appropriately.


[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|INVALID                     |
            Version|2.0.39                      |2.0.43



--- Additional Comments From [EMAIL PROTECTED]  2002-11-24 18:14 ---
I am reopening this bug report. It was closed because I could not confirm it on 
the latest Apache 2.0.43 because we have been unable to do the upgrade.

This week we installed a brand spanking new SUN V100 Server. We placed 2.0.43 
apache on it and we turned on two virtuals. Pretty much same configuration file 
as used on our other servers.

We began seeing the hang problem within the first 24 hours of running. Note 
that there were practically no hits to the system.

Since this bug report was unresolved and was causing our systems to hang... we 
wrote a monitor program a few months ago that senses for the hang and sends a 
HUP to the process.  If the process does not recover it will shut down ALL the 
apache processes and do a restart.  This monitor pings apache about once every 
15 seconds and so can detect and fix the problem almost immediately.

We have noted interesting things - specifically, the watchdog has never had to 
kill apache - the HUP always works.  The problem might not occur for days, and 
then might occur many times in a day. Very unpredictable.

We have also noted that apache can apparently also *fix* the problem. That is, 
the process hangs for maybe 30 minutes or so, and then unhangs and everything 
is ok (we know this because we were not watching one apache process, thinking it 
was immune, and then discovered it was doing this).  This MIGHT be one reason 
this is hard to track down, because apache does apparently resolve it 
eventually.

Just as an interest, here is the restart info from our watchdog program - note 
the times, etc:

11/23/2002 18:59:12 -> Trouble #1 for 192.207.247.2
   192.207.247.2 not responding, HUPing with:
   kill -USR1 `cat /cookware/web/apache/logs/httpd.pid`
11/23/2002 22:20:30 -> Trouble #1 for 192.207.247.2
   192.207.247.2 not responding, HUPing with:
   kill -USR1 `cat /cookware/web/apache/logs/httpd.pid`
11/23/2002 23:34:14 -> Trouble #1 for 192.207.247.2
   192.207.247.2 not responding, HUPing with:
   kill -USR1 `cat /cookware/web/apache/logs/httpd.pid`
11/23/2002 23:40:00 -> Trouble #1 for 192.207.247.2
   192.207.247.2 not responding, HUPing with:
   kill -USR1 `cat /cookware/web/apache/logs/httpd.pid`
11/23/2002 23:40:50 -> Trouble #1 for 192.207.247.2
   192.207.247.2 not responding, trying again:
   kill -USR1 `cat /cookware/web/apache/logs/httpd.pid`
11/23/2002 23:55:45 -> Trouble #1 for 192.207.247.2
   192.207.247.2 not responding, HUPing with:
   kill -USR1 `cat /cookware/web/apache/logs/httpd.pid`
11/24/2002 01:13:32 -> Trouble #1 for 192.207.247.2
   192.207.247.2 not responding, HUPing with:
   kill -USR1 `cat /cookware/web/apache/logs/httpd.pid`
11/24/2002 05:24:50 -> Trouble #1 for 192.207.247.2

etc  

So apparently it can appear in clusters, and then not appear for hours or days.
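
The clustering is easier to see when the gaps between watchdog entries are
computed. The sketch below uses the eight timestamps from the log excerpt
above; the function and variable names are hypothetical, and this is not part
of the watchdog program itself.

```python
from datetime import datetime

# Timestamps copied from the watchdog log excerpt above.
WATCHDOG_TIMES = [
    "11/23/2002 18:59:12",
    "11/23/2002 22:20:30",
    "11/23/2002 23:34:14",
    "11/23/2002 23:40:00",
    "11/23/2002 23:40:50",
    "11/23/2002 23:55:45",
    "11/24/2002 01:13:32",
    "11/24/2002 05:24:50",
]


def gaps_in_seconds(stamps):
    """Return the number of seconds between each pair of consecutive entries."""
    times = [datetime.strptime(s, "%m/%d/%Y %H:%M:%S") for s in stamps]
    return [int((later - earlier).total_seconds())
            for earlier, later in zip(times, times[1:])]


for gap in gaps_in_seconds(WATCHDOG_TIMES):
    print(f"{gap:6d} s  ({gap / 60:.1f} min)")
```

For this excerpt the gaps range from 50 seconds to 15078 seconds (a little
over four hours), with three entries falling within about 20 minutes of each
other late on 11/23.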

I also feel your bug #12598 is also related to this problem - probably 
describing the same situation.

--- Additional Comments From [EMAIL PROTECTED]  2002-11-24 19:50 ---
Here is some info that would be helpful.  I realize that you have sent
some of this to particular individuals in the past, but I don't see it
in the PR.

At the time of the hang:

  how many child processes and what are they doing? (run pstack against
  each one to see what is going on; presumably most of the children will
  have the same backtrace, so no sense sending in duplicate backtraces)

  what is the parent doing?  run truss against it for a while (15 seconds)
  if it doesn't appear to be doing anything, run pstack against it to
  get a backtrace

  is it possible that all available children are handling connections?
  look at netstat output for ports served by Apache to see if there are
  enough hung connections to represent each available child

wild and crazy idea:

  try a different accept mutex mechanism (e.g., "AcceptMutex fcntl" in
  httpd.conf)


[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|                            |WORKSFORME



--- Additional Comments From [EMAIL PROTECTED]  2003-02-24 13:28 ---
no response in 3 months on request for doc on what has hung

feel free to reopen when you can gather the requested materials


[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|WORKSFORME                  |
            Version|2.0.43                      |2.0.44



--- Additional Comments From [EMAIL PROTECTED]  2003-03-31 23:58 ---
Under 2.0.44 we are still experiencing this problem.  We have noted that of
the three servers running, it occurs on all three... and it occurs with more
frequency on the busier servers.

We currently have special software running to detect the hung process and
REHUP (kill -USR1) it. REHUPPING apache fixes it in 99.9% of the cases (after
3 rehups fail in sequence our software automatically kills and restarts
apache).
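
The escalation rule described above (rehup first, kill and restart only after
3 rehups in a row have failed) can be sketched as a small decision function.
This is a guess at the logic based only on the description here, not the
actual monitoring software; watchdog_step and the action names are
hypothetical.

```python
def watchdog_step(is_responding, failed_rehups, max_rehups=3):
    """Decide what to do after one probe of the server.

    Returns (action, new_failed_rehups), where action is 'none',
    'rehup' (kill -USR1 to the parent) or 'restart' (kill and start again).
    failed_rehups counts rehups already sent without a recovery.
    """
    if is_responding:
        return "none", 0  # healthy: reset the failure counter
    if failed_rehups >= max_rehups:
        return "restart", 0  # rehups did not help: escalate
    return "rehup", failed_rehups + 1


# Example: a server that stays hung through four consecutive probes.
failed = 0
for probe_ok in (False, False, False, False):
    action, failed = watchdog_step(probe_ok, failed)
    print(action)
```

With four failed probes in a row, the first three trigger a rehup and the
fourth triggers the full restart.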

We have also noted that in most cases apache itself will catch the hung child
eventually (after many minutes - which is why we wrote the software to detect
it in 15 seconds or less).

As a new clue to help you trace this down, we have noticed the following
error messages in the error logs AFTER a HUP (kill -USR1) is done:

[Mon Mar 31 18:49:28 2003] [notice] Graceful restart requested, doing restart
[Mon Mar 31 18:49:33 2003] [notice] Apache/2.0.44 (Unix) mod_ssl/2.0.44 OpenSSL/0.9.6g configured -- resuming normal operations
[Mon Mar 31 18:49:33 2003] [warn] long lost child came home! (pid 11039)
[Mon Mar 31 18:49:33 2003] [warn] long lost child came home! (pid 11040)

(that was in the log immediately following our REHUP).
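
For anyone collecting the same evidence, the "long lost child" warnings can
be pulled out of an error log with a short script. This is a minimal sketch
that assumes the message format shown in the excerpt above; the function name
and the SAMPLE variable are hypothetical.

```python
import re

# Pattern based only on the warning lines quoted in this report.
LOST_CHILD = re.compile(r"\[warn\] long lost child came home! \(pid (\d+)\)")

SAMPLE = """\
[Mon Mar 31 18:49:28 2003] [notice] Graceful restart requested, doing restart
[Mon Mar 31 18:49:33 2003] [warn] long lost child came home! (pid 11039)
[Mon Mar 31 18:49:33 2003] [warn] long lost child came home! (pid 11040)
"""


def lost_child_pids(log_text):
    """Return the pids named in 'long lost child came home!' warnings."""
    return [int(pid) for pid in LOST_CHILD.findall(log_text)]


print(lost_child_pids(SAMPLE))
```

Run against the sample lines, this yields the two pids from the excerpt,
11039 and 11040.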
