>Hi George,
>
>Thanks for the reply...
>
>> >My problem is that the mod_perl httpd is sometimes crashing overnight. In
>> >the last three days, it has mysteriously crashed twice. When I restart it
>> >with "apachectl_modperl start" (apachectl_modperl is just apachectl but
>> >with the config file path set differently), it comes up with no problem,
>> >but I suppose it might crash again in the future.
>> >
>> How long had it been running OK before you started having problems?
>> Did something change just before the problem started occurring?
>
>Previously, I only had one mod_perl httpd running on the system. I split
>it into a non-mod_perl httpd and a mod_perl httpd because the system was
>running out of memory.
>
>This change happened 4 days ago. Before that, I did not have this crashing 
>problem.
>
>> What ports are you using for your httpd that does and does not have the
>> problem?
>
>Both httpds listen on port 80. The mod_perl enabled one listens on
>216.74.79.145:80 and 216.74.79.194:80, while the non-mod_perl enabled one
>listens on port 80 of all other IP addresses on the machine.
>
>> Are there any indications in the access_log at about the time of the crash?
>
>203.177.3.11 - - [19/Jan/2001:05:05:21 -0500] "GET /anime/seraphimcall/
>HTTP/1.1" 200 3256 "http://www.animelyrics.com/anime/_S" "Mozilla/4.0
>(compatible; MSIE 5.0; Windows 98; DigExt)"
>207.35.188.14 - - [19/Jan/2001:08:40:20 -0500] "GET / HTTP/1.0" 200 6205
>"-" "Mozilla/4.7 [en]C-CCK-MCD  (WinNT; U)"
>
>Those are the two log entries for animelyrics.com before and after the
>crash; I don't see anything unusual. I also looked at slayers.aaanime.net:
>
>24.67.224.12 - - [19/Jan/2001:05:03:24 -0500] "GET
>/~linazel/fanfics/fanfic.asp?fanfic=nobility&part=10 HTTP/1.0" 200 15547
>"http://slayers.aaanime.net/~linazel/fanfics/fanfic.asp?fanfic=nobilitypart=9"
>"Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt; AtHome0107;
>sureseeker.com; Hotbar 2.0)"
>172.133.91.48 - - [19/Jan/2001:08:35:32 -0500] "GET
>/~linazel/banners/ybanner.gif HTTP/1.1" 200 20544 "-" "Opera/5.02 (Windows 98;
>U)  [en]"
>
>I don't see anything out of the ordinary here, either.
>
>> Perhaps you could run a cron job to scan the processes in order to narrow
>> down the exact time of the problem.
>
>What would I be looking for?
Is there any indication of a burst load (or a similar pattern) just before
the crash?
Is there a back-end database involved?

It took about 4 hours before the httpd process was restarted.
It would be useful to know how long after the last request the httpd root
process crashed.  If a cron job ran once a minute to scan for the httpd
root process and report when it disappears, that might give a clue as to
the nature of the problem.  In the report you might want to include the
last 10 minutes of scans (the last 10 scans, kept in temp files 1 through
10) of all the httpd processes running, taken from `ps -gaux|grep httpd`.
It would also be interesting to know how many httpd processes were running
and what `vmstat` had to say before and after the crash.
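
Something along these lines could serve as the cron job. It is only a
rough sketch in Python, not a tested tool: the scan directory, the file
names scan.1 through scan.10, and the use of `ps aux` (rather than
`ps -gaux`) are my assumptions, and it counts every process whose command
is named httpd, so it does not tell your two servers apart.

```python
#!/usr/bin/env python3
"""Once-a-minute httpd scan for cron: keeps the last 10 scans on disk."""
import subprocess
import tempfile
import time
from pathlib import Path

# Hypothetical location for the scan files; change to suit.
SNAP_DIR = Path(tempfile.gettempdir()) / "httpd_watch"
KEEP = 10  # number of scans to keep (about 10 minutes at one per minute)


def httpd_lines(ps_output):
    """Return the `ps aux` lines whose command (11th column) is httpd."""
    hits = []
    for line in ps_output.splitlines():
        fields = line.split(None, 10)
        if len(fields) == 11:
            program = fields[10].split()[0].rsplit("/", 1)[-1]
            if program == "httpd":
                hits.append(line)
    return hits


def rotate(snap_dir, keep=KEEP):
    """Shift scan.1 -> scan.2 and so on; the oldest scan is overwritten."""
    for i in range(keep - 1, 0, -1):
        src = snap_dir / f"scan.{i}"
        if src.exists():
            src.replace(snap_dir / f"scan.{i + 1}")


def record_scan(snap_dir, ps_output, vmstat_output, now=None):
    """Save one scan as scan.1 and return the number of httpd processes."""
    snap_dir.mkdir(parents=True, exist_ok=True)
    rotate(snap_dir)
    procs = httpd_lines(ps_output)
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    body = [f"scan at {stamp}", f"httpd processes: {len(procs)}",
            *procs, "--- vmstat ---", vmstat_output]
    (snap_dir / "scan.1").write_text("\n".join(body) + "\n")
    return len(procs)


def report(snap_dir, keep=KEEP):
    """Concatenate the saved scans, oldest first."""
    parts = []
    for i in range(keep, 0, -1):
        f = snap_dir / f"scan.{i}"
        if f.exists():
            parts.append(f.read_text())
    return "\n".join(parts)


def run(cmd):
    """Run a command and return its stdout ('' if it cannot be started)."""
    try:
        return subprocess.run(cmd, capture_output=True, text=True).stdout
    except OSError:
        return ""


if __name__ == "__main__":
    ps_out, vm_out = run(["ps", "aux"]), run(["vmstat"])
    marker = SNAP_DIR / "reported"  # prevents a report every minute
    if httpd_lines(ps_out):
        record_scan(SNAP_DIR, ps_out, vm_out)
        if marker.exists():
            marker.unlink()
    elif not marker.exists():
        SNAP_DIR.mkdir(parents=True, exist_ok=True)
        # cron mails whatever the job prints to the crontab owner
        print("httpd has disappeared; the last scans follow\n")
        print(report(SNAP_DIR))
        print("--- vmstat after httpd disappeared ---")
        print(vm_out)
        marker.touch()
```

A crontab entry such as `* * * * * /path/to/this/script` (the path is a
placeholder) would run it once a minute.  The scans stop rotating once
httpd is gone, so the files still hold the minutes leading up to the crash
when you come to look at them.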

Often when trying to solve an intermittent problem, it helps if you can
duplicate the problem at will.  The information gathered about the problem
should help you to achieve this.

For example, the access_log indicates that the last browser accesses before
the problem were both from:
"Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt;".
However, it is difficult to tell whether this is just a coincidence until
a pattern can be established.
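
To look for such a pattern across several crashes, a small script could
list the last request before each long silence in the access_log, along
with how many requests arrived in the minutes leading up to it.  The
following is a sketch only: it assumes the log is in Apache's "combined"
format with one request per line, and the one-hour gap and five-minute
window are arbitrary values of mine.  The two sample lines are the
animelyrics.com entries quoted above, rejoined onto single lines.

```python
import re
from datetime import datetime, timedelta

# Apache "combined" log format, one request per line (assumed).
LOG_RE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+) "([^"]*)" "([^"]*)"')


def parse(line):
    """Turn one access_log line into a dict, or None if it does not match."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    host, stamp, request, _status, _size, _referer, agent = m.groups()
    when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
    return {"host": host, "time": when, "request": request, "agent": agent}


def find_gaps(entries, min_gap=timedelta(hours=1),
              window=timedelta(minutes=5)):
    """Return (last_before, first_after, burst) for each silence >= min_gap.

    burst is the number of requests in the `window` ending at the last
    request before the silence, that last request included.
    """
    gaps = []
    for prev, cur in zip(entries, entries[1:]):
        if cur["time"] - prev["time"] >= min_gap:
            burst = sum(1 for e in entries
                        if timedelta(0) <= prev["time"] - e["time"] <= window)
            gaps.append((prev, cur, burst))
    return gaps


SAMPLE = [
    '203.177.3.11 - - [19/Jan/2001:05:05:21 -0500] '
    '"GET /anime/seraphimcall/ HTTP/1.1" 200 3256 '
    '"http://www.animelyrics.com/anime/_S" '
    '"Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"',
    '207.35.188.14 - - [19/Jan/2001:08:40:20 -0500] '
    '"GET / HTTP/1.0" 200 6205 "-" "Mozilla/4.7 [en]C-CCK-MCD  (WinNT; U)"',
]

if __name__ == "__main__":
    entries = [e for e in map(parse, SAMPLE) if e is not None]
    for last, first, burst in find_gaps(entries):
        print("silence of", first["time"] - last["time"])
        print("  last request:", last["host"], last["request"])
        print("  user agent:  ", last["agent"])
        print("  requests in the final window:", burst)
```

Run against the full access_log of each virtual host after every crash,
this would show whether the same browser, URL or load pattern keeps
turning up just before the server goes quiet.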
