Hi George,
Thanks for the reply...
My problem is that the mod_perl httpd is sometimes crashing overnight. In
the last three days, it has mysteriously crashed twice. When I restart it
with "apachectl_modperl start" (apachectl_modperl is just apachectl but
with the config file path set differently), it comes up with no problem,
but I suppose it might crash again in the future.
How long had it been running OK before you started having problems?
Did something change just before the problem started occurring?
Previously, I only had one mod_perl httpd running on the system. I split
it into a non-mod_perl httpd and a mod_perl httpd because the system was
running out of memory.
This change happened 4 days ago. Before that, I did not have this crashing
problem.
What ports are you using for your httpd that does and does not have the
problem?
Both httpds listen on port 80. The mod_perl enabled one listens on
216.74.79.145:80 and 216.74.79.194:80, while the non-mod_perl enabled one
listens on port 80 of all other IP addresses on the machine.
Are there any indications in the access_log at about the time of the crash?
203.177.3.11 - - [19/Jan/2001:05:05:21 -0500] "GET /anime/seraphimcall/
HTTP/1.1" 200 3256 "http://www.animelyrics.com/anime/_S" "Mozilla/4.0
(compatible; MSIE 5.0; Windows 98; DigExt)"
207.35.188.14 - - [19/Jan/2001:08:40:20 -0500] "GET / HTTP/1.0" 200 6205
"-" "Mozilla/4.7 [en]C-CCK-MCD (WinNT; U)"
Those are the two log entries for animelyrics.com before and after the
crash; I don't see anything unusual. I also looked at slayers.aaanime.net:
24.67.224.12 - - [19/Jan/2001:05:03:24 -0500] "GET
/~linazel/fanfics/fanfic.asp?fanfic=nobilitypart=10 HTTP/1.0" 200 15547
"http://slayers.aaanime.net/~linazel/fanfics/fanfic.asp?fanfic=nobilitypart=9"
"Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt; AtHome0107;
sureseeker.com; Hotbar 2.0)"
172.133.91.48 - - [19/Jan/2001:08:35:32 -0500] "GET
/~linazel/banners/ybanner.gif HTTP/1.1" 200 20544 "-" "Opera/5.02 (Windows
98; U) [en]"
I don't see anything out of the ordinary here, either.
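Rather than eyeballing the logs for the entries on either side of the
crash, the quiet period could be located with a short script. This is only
a sketch: it assumes the access_log is in Common or Combined Log Format with
one entry per line (not wrapped as in this mail), and the function names
parse_timestamps and largest_gap are made up for illustration.

```python
import re
from datetime import datetime

# Matches the bracketed timestamp of a Common/Combined Log Format entry,
# e.g. [19/Jan/2001:05:05:21 -0500]
TIMESTAMP_RE = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")


def parse_timestamps(lines):
    """Return the timestamps found in the given log lines, in file order."""
    stamps = []
    for line in lines:
        match = TIMESTAMP_RE.search(line)
        if match:
            stamps.append(
                datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
            )
    return stamps


def largest_gap(stamps):
    """Return (gap, before, after) for the longest pause between
    consecutive entries, or None if there are fewer than two entries."""
    if len(stamps) < 2:
        return None
    pairs = zip(stamps, stamps[1:])
    before, after = max(pairs, key=lambda pair: pair[1] - pair[0])
    return after - before, before, after
```

Fed the two animelyrics.com entries above, largest_gap reports a gap of
3:34:59, from 05:05:21 to 08:40:20. On a busy site a gap that long is a
strong hint of an outage; on a quiet one it may just be a slow night.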
Perhaps you could run a cron job to scan the processes in order to narrow
down the exact time of the problem.
What would I be looking for?
Is there any indication of a burst load (or a similar pattern) just before
the crash?
Is there a back-end database involved?
It took about 4 hours before the httpd process was restarted.
It would be nice to know how long after the last request the httpd root
process crashed. If a cron job ran once a minute to scan for the httpd
root process and report when it disappears, it might give a clue as to the
nature of the problem. In the report you might want to include information
about the last 10 minutes (the last 10 scans, kept in temp files 1 through
10) of all the httpd processes running, as listed by `ps -gaux|grep httpd`.
It would be interesting to know how many httpd processes were running and
also what `vmstat` had to say before and after the crash.
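A rough sketch of such a scan, to be called once a minute from cron, might
look like the following. Several things here are assumptions rather than
anything from your setup: the scan directory /var/tmp/httpd-scan, the file
names scan.1 through scan.10, the function names, the use of
`ps -eo pid,ppid,comm` instead of `ps -gaux` (so the parent PID is easy to
parse), and the idea that the server shows up in ps under the name "httpd".

```python
import os
import subprocess
import time


def root_httpd_pids(ps_output, name="httpd"):
    """Return PIDs of `name` processes whose parent is not itself a
    `name` process. Expects `ps -eo pid,ppid,comm` output with header."""
    procs = []
    for line in ps_output.splitlines()[1:]:
        fields = line.split(None, 2)
        if len(fields) == 3 and fields[0].isdigit() and fields[1].isdigit():
            procs.append((int(fields[0]), int(fields[1]), fields[2].strip()))
    httpd_pids = {pid for pid, _, comm in procs if comm == name}
    return sorted(
        pid for pid, ppid, comm in procs
        if comm == name and ppid not in httpd_pids
    )


def rotate_snapshots(directory, text, keep=10):
    """Store `text` as scan.1, shifting older scans up by one and
    discarding the oldest once `keep` scans exist."""
    os.makedirs(directory, exist_ok=True)
    for n in range(keep, 0, -1):
        src = os.path.join(directory, "scan.%d" % n)
        if os.path.exists(src):
            if n == keep:
                os.remove(src)
            else:
                os.replace(src, os.path.join(directory, "scan.%d" % (n + 1)))
    with open(os.path.join(directory, "scan.1"), "w") as handle:
        handle.write(text)


def run(cmd):
    """Run a command and return its standard output as text."""
    return subprocess.run(cmd, capture_output=True, text=True).stdout


def scan(directory="/var/tmp/httpd-scan", expected_roots=1):
    """One cron tick: snapshot ps and vmstat, and write a report if
    fewer root httpd processes are running than expected."""
    ps_output = run(["ps", "-eo", "pid,ppid,comm"])
    snapshot = "%s\n%s\n%s" % (time.ctime(), ps_output, run(["vmstat"]))
    rotate_snapshots(directory, snapshot)
    if len(root_httpd_pids(ps_output)) < expected_roots:
        report = os.path.join(directory, "crash-%d.txt" % int(time.time()))
        with open(report, "w") as out:
            for n in range(10, 0, -1):  # oldest scan first
                path = os.path.join(directory, "scan.%d" % n)
                if os.path.exists(path):
                    with open(path) as handle:
                        out.write(handle.read() + "\n")
```

With both the mod_perl and the plain httpd on the same machine, scan would
need expected_roots=2 to notice one of them going away. As written it
writes a new report on every tick while a server is down, which also
captures the "after" state; a marker file could limit it to the first one.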
Often when trying to solve an intermittent problem, it helps if you can
duplicate the problem at will. The information obtained about the problem
should help you to achieve this.
For example, the access_log indicates that the last browser accesses before
the problem were both from:
"Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt;".
However, it is difficult to tell if this is just a coincidence or not until
a pattern can be established.
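One way to judge coincidence is to compare how often a browser shows up
just before a crash with how often it shows up in the log overall. The
sketch below is only an illustration: it assumes Combined Log Format with
one entry per line, and the notion of a User-Agent "family" (the product
token plus the first four fields in the parentheses) is my own invention,
chosen so that the two strings above compare equal.

```python
import re
from collections import Counter

# In Combined Log Format the User-Agent is the last double-quoted field.
AGENT_RE = re.compile(r'"([^"]*)"\s*$')


def agent_family(line, fields=4):
    """Return the product token plus the first `fields` comment fields
    of the User-Agent on this log line, or None if there is none."""
    match = AGENT_RE.search(line)
    if not match:
        return None
    product, _, comment = match.group(1).partition(" (")
    comment = comment.split(")", 1)[0]  # drop anything after the comment
    parts = [part.strip() for part in comment.split(";") if part.strip()]
    if not parts:
        return product
    return "%s (%s)" % (product, "; ".join(parts[:fields]))


def family_counts(lines):
    """Count how often each User-Agent family appears in the lines."""
    return Counter(f for f in map(agent_family, lines) if f)


def family_share(lines, family):
    """Fraction of the given lines that come from `family`."""
    counts = family_counts(lines)
    total = sum(counts.values())
    return counts[family] / total if total else 0.0
```

If family_share over the whole access_log shows that MSIE 5.0 on Windows 98
already accounts for a large part of the traffic, two such hits before a
crash say very little; if that family is rare overall, it is worth a
closer look.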