Carl wrote:
Chris,

I find it hard to believe two brand new machines with different processors, etc. would have a hardware problem that showed itself in exactly the same way. Further, I have run memTest86 for 30 hours on one of the servers and it showed nothing (although, as Chuck pointed out, the test may not have handled the cores correctly or may not have changed the temperature sufficiently to cause the problem we are seeing.) I have not found a mem test specifically for 64 bit processors.

Right.
After rescanning your posts (and feel free to correct any discrepancies), here is a summary:

1) you never saw this issue under a previous JVM 1.5 and Tomcat version 5.5.x

2) the problem happens on two separate servers, which seems to rule out a common server hardware issue

3) it happens under different versions of Linux, which seems to rule out a problem with one particular Linux distribution

4) it seems to be a SegFault in the JVM, leaving a core dump but no traces in the logs. (In my experience, SegFaults usually happen when something tries to execute what is not valid executable code for the platform at hand.) Anyway, it does not seem to be due to running out of some resource, nor to a hidden call to System.exit().

5) I am not quite sure of this anymore, but it also seems to happen on different JVMs, which would tend to rule out a problem with a particular JVM port.

6) it does not happen immediately, and it is not related in any obvious way to what is being processed, except that it seems to happen more readily under load

7) it is obviously not a common problem with either JVM or Tomcat, or we would have had laments from others by now

8) I don't know how a Java/Tomcat webapp could trigger a SegFault on its own, other than by having the JVM participate in it. And apparently your apps are working fine up to the moment of the sudden death, so for once they do not appear to be among the usual suspects.

9) This, in one of your earlier posts, triggered my curiosity:
quote
This Tomcat is straight out of the box except for some modifications to JAVA_OPTS in tomcat/bin/catalina.sh (Editor's note: canonically, a better place would be setenv.sh) and opening up ports and turning on SSL in tomcat/conf/server.xml.
unquote
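
For what it's worth, here is a minimal sketch of what such a setenv.sh could look like. I don't know which options you actually added to JAVA_OPTS, so the -Xmx value below is purely a placeholder:

```shell
# bin/setenv.sh: catalina.sh sources this file at startup if it exists,
# so local settings survive a Tomcat upgrade that replaces catalina.sh.
# The heap size is a made-up placeholder, not the poster's actual setting.
JAVA_OPTS="${JAVA_OPTS:-} -Xmx512m"
```

That way catalina.sh itself stays exactly as shipped, which also makes it easier to compare against a pristine download.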

So, maybe two suggestions, taking into account that I am just making wild guesses here (but that's pretty much what everyone is doing by now, so I don't feel too bad):

- have you tried running Tomcat from the command line, with STDOUT/STDERR going to the console? Maybe something shows up there which doesn't show up anywhere else.

- what about this SSL? It seems to me a likely candidate: something that is maybe not used all the time, probably calls stuff which should be native code, and is usually provided separately from Tomcat.
Can you turn it off and still be operational?
Also, if it is provided separately, it should probably be relatively "grouped" in some directory, making it easier to check if everything is as it should be.
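
To make the first suggestion concrete, something along these lines should do. This is only a sketch: the /opt/tomcat path, the log file location and the run_in_foreground name are my assumptions, so adjust them to your installation:

```shell
# Hypothetical install location and log file; change both to match your setup.
CATALINA_HOME="${CATALINA_HOME:-/opt/tomcat}"
LOG="${LOG:-/tmp/tomcat-console.log}"

run_in_foreground() {
    # "run" starts Tomcat attached to the current terminal instead of
    # detaching it, so anything the JVM prints on its way down stays visible.
    # 2>&1 merges STDERR into STDOUT; tee shows it and keeps a copy in $LOG.
    "$CATALINA_HOME/bin/catalina.sh" run 2>&1 | tee "$LOG"
}
```

Call run_in_foreground from a shell that stays open (a screen session, for instance), then look at the last lines of the log file after the next crash.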

Note also that, apart from a direct hardware similarity between the servers on which it happens, another common element seems to be the place at which it happens, namely the server room. This is a long shot, but a power supply issue may also provoke hardware failures. Or is your server room on top of a mountain, or near a particle accelerator?
(re relativistic gamma rays, dark energy and all that stuff).
;-)


