Thanks to Tim Lucia, I found the solution to my problem. In case anyone else
runs into it, here's the answer.


Problem Briefly: Is there any way to figure out what Tomcat is doing, or
trying to do, when it hangs and does not respond to any http or https
request?

Problem Details: I am running Tomcat 5.1.12 on Red Hat 9 on a 4-processor
server. I get frequent but random Tomcat hangs. It has not happened on a
1-processor system, with either Linux or Windows. I can force the hang to
happen fairly reliably if I run tests that bombard the server with http
requests (several per second). According to the logs, it happens after the
end of processing one request and before the beginning of the next. It is
apparently not within application code, unless it's a finalizer. I have run
a higher-priority daemon thread in the same JVM that just writes the time to
a log file, and it hangs at the same time, so it could be the JVM that's
hanging, or whatever does the real threading. Mostly, but not always, 'top'
shows the 'java' process using 99.9% of CPU, and 2 of the 4 processors at
about 40%. I can kill the java process with 'kill -9', but I can't figure
out what it was stuck doing.

Any suggestions?

Answer:
The Linux command 'kill -QUIT <pid>' sends SIGQUIT to the JVM, which makes
it dump its state (a stack trace for every thread) to catalina.out. That
shows, for example, where you are in your code if it is stuck in an infinite
loop or a wait-deadlock. kill -QUIT does not actually stop Tomcat.

(You can find the pid of Tomcat to use in 'kill -QUIT <pid>' with the
command 'ps -ef | grep java', which gives output like this:

root     30625     1  0 Jan22 ?        00:10:00 /pgm/java/bin/java
    -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
    -Djava.util.logging.config.file=/data/tomcat/conf/logging.properties
    -Djava.endorsed.dirs=/pgm/tomcat/common/endorsed
    -classpath :/pgm/tomcat/bin/bootstrap.jar:/pgm/tomcat/bin/commons-logging-api.jar
    -Dcatalina.base=/data/tomcat -Dcatalina.home=/pgm/tomcat
    -Djava.io.tmpdir=/data/tomcat/temp
    org.apache.catalina.startup.Bootstrap start
root     11354 11056  0 08:30 pts/1    00:00:00 grep java

The pid is 30625 in this case - so the command is 'kill -QUIT 30625'
)
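
For anyone who wants to script the pid lookup, here is a rough sketch. The
function names tomcat_pids and dump_threads are made up for this example,
and it assumes the 'ps -ef' column layout shown above (pid in column 2,
command in column 8) and that Tomcat was started through the Bootstrap
class. Treat it as a starting point, not a tested tool.

```shell
# List the pids of Tomcat JVMs found in `ps -ef` output read from stdin.
# Matching on the Bootstrap class name (instead of just "java") skips
# unrelated Java processes; the $8 check skips a grep for that class name.
tomcat_pids() {
    awk '/org\.apache\.catalina\.startup\.Bootstrap/ && $8 != "grep" { print $2 }'
}

# Ask the JVM with the given pid for a thread dump.
# SIGQUIT does not stop the JVM; the dump goes to its stdout,
# which Tomcat's startup script redirects to catalina.out.
dump_threads() {
    kill -QUIT "$1"
}

# Typical use (not run here):
#   for pid in $(ps -ef | tomcat_pids); do dump_threads "$pid"; done
```

The functions only do something when called, so sourcing the file is safe on
a production box.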


If kill -QUIT writes nothing to catalina.out, the JVM itself is hung. This
was my problem, and the cause was a kernel SMP threading bug. I switched
from Red Hat 9 (2.4.20 kernel) to Fedora Core 4 (2.6.11-1.1369_FC4smp
kernel) and have now run for 48 hours without a hang.
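
A rough way to automate the "did the dump show up?" check is sketched
below. The function names count_dumps and jvm_responds are made up for this
example, the log path in the usage comment is a guess based on the
catalina.base shown above, and the check assumes a HotSpot JVM, which
starts each dump with the text "Full thread dump". The 5-second wait is
arbitrary.

```shell
# Count the thread dumps in a log file. HotSpot JVMs start each dump with
# a line beginning "Full thread dump". grep -c exits non-zero when the
# count is 0, so "|| true" keeps this usable under `set -e`.
count_dumps() {
    grep -c 'Full thread dump' "$1" || true
}

# Send SIGQUIT to pid $1 and report whether a new dump reached log $2.
# $3 is how many seconds to wait for the dump (default 5).
# Returns 0 if a new dump appeared, 1 if not (JVM probably hung),
# 2 if the signal could not be sent. The log file must already exist.
jvm_responds() {
    before=$(count_dumps "$2")
    kill -QUIT "$1" || return 2
    sleep "${3:-5}"
    after=$(count_dumps "$2")
    [ "$after" -gt "$before" ]
}

# Typical use (not run here; the path is an assumption):
#   jvm_responds 30625 /data/tomcat/logs/catalina.out || echo "no dump: JVM hung?"
```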

Changing LD_ASSUME_KERNEL also made a difference. The Tomcat release notes
say:
#GLIBC 2.2 / Linux 2.4 users should define an environment variable:
#export LD_ASSUME_KERNEL=2.2.5
#
#Redhat Linux 9.0 users should use the following setting to avoid
#stability problems:
#export LD_ASSUME_KERNEL=2.4.1

On Red Hat 9 running on the 4-way SMP server, LD_ASSUME_KERNEL=2.2.5, or no
setting at all, seemed to be more stable than the recommended
LD_ASSUME_KERNEL=2.4.1.

I am currently running Fedora Core 4 with LD_ASSUME_KERNEL=2.2.5 and it
seems to be stable.

Dave




