Downgrading to 1.6.0_16 did not help. I'm replacing the apr connector
with http now.

On Wed, 2010-02-24 at 14:52 +0100, Carl wrote:
> Taylan,
> 
> > > The failures we've seen occur anywhere between 8 hours and a week into
> > > runtime.
> 
> The timing of the failures seems similar.
> 
> > We have also had failures with hotspot error files (hs_err) present, and
> > the cause specified was indeed SIGSEGV indicating a page fault.
> 
> I have never seen any hs_* files but have seen core files where strace 
> showed the jvm stopped on a seg fault.
> 
> > We also use jdk 1.6.0_18, I'm downgrading the machines to 1.6.0_16 when
> > the situation allows (during regular updates of the application, or a
> > crash) to see if that helps.
> 
> I have used jdk 1.6.0_17 and 1.6.0_18 with the same results... have not 
> tried 1.6.0_16.  Please post your results of this trial.
> 
> > > Running tomcat in the
> > > foreground might show something, but then again I could be waiting for a
> > > month for it to happen.
> 
> Yes, this has been part of my problem as anytime we change something, we 
> have to wait a week for the server to fail.
> 
> In one sense, I am fortunate that I have a little more flexibility than you. 
> I have two servers (different hardware) but only need one in service at a 
> time.  Therefore, I always have one server I can test ideas on although I 
> have never been able to develop a meaningful stress test, i.e., the only way 
> I can test a change is to put it in production.
> 
> Thanks,
> 
> Carl
> 
> ----- Original Message ----- 
> From: "Taylan Develioglu" <tdevelio...@ebuddy.com>
> To: "Tomcat Users List" <users@tomcat.apache.org>
> Sent: Wednesday, February 24, 2010 8:31 AM
> Subject: Re: jvm exits without trace
> 
> 
> > Hello Carl,
> >
> > The failures we've seen occur anywhere between 8 hours and a week into
> > runtime. Most of the machines have (still) been running for almost a month
> > without failure. There are ~100 machines.
> >
> > Off the top of my head, I think we've had more than 10 failures now.
> >
> > We have also had failures with hotspot error files (hs_err) present, and
> > the cause specified was indeed SIGSEGV indicating a page fault. But I
> > don't know if the two are related.
> >
> > We also use jdk 1.6.0_18, I'm downgrading the machines to 1.6.0_16 when
> > the situation allows (during regular updates of the application, or a
> > crash) to see if that helps.
> >
> > It might be useful to note that the failures happen with tomcat 6.0.20
> > as well as 6.0.24.
> >
> > As far as load is concerned, I haven't had a failure on an idle machine.
> > The machines are well loaded, but only at a fraction of their limit in
> > terms of load and CPU utilization.
> > Most memory is committed to tomcat: a 24G machine would have 18G
> > allocated to heap, 128M to permgen, and some unspecified amount would get
> > used by JNI for APR. About 4G remains free after taking into account
> > the JVM itself.
> > A 16G machine would have 12G allocated to the heap.
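
The memory figures above can be checked with a little arithmetic. This is a minimal sketch, assuming a POSIX shell and taking 18G as 18432M; the variable names are illustrative only and do not come from the poster's scripts.

```shell
#!/bin/sh
# Figures from the message, all in megabytes.
total_mb=$((24 * 1024))   # 24G machine
heap_mb=18432             # 18G heap
perm_mb=128               # 128M permgen

# Whatever is left must cover thread stacks, native (JNI/APR) allocations,
# the JVM's own overhead, and the operating system.
outside_heap_mb=$((total_mb - heap_mb - perm_mb))

echo "Memory outside heap and permgen: ${outside_heap_mb}M"
```

If about 4G of that remainder stays free, as reported, then roughly 2G would be in use outside the heap and permgen. That is an inference from the quoted numbers, not a measurement.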
> >
> > Besides the fact that our apps heavily use nio and mina I wouldn't say
> > there's anything else noteworthy. There can be anywhere up to 10000
> > concurrents on one machine.
> >
> > I had searched for coredumps, but no luck. Running tomcat in the
> > foreground might show something, but then again I could be waiting for a
> > month for it to happen.
> >
> > On Wed, 2010-02-24 at 12:42 +0100, Carl wrote:
> >> Taylan,
> >>
> >> I am the person who started the "Tomcat dies suddenly" thread, which I
> >> still haven't resolved.  I am curious about the pattern of failures you
> >> are experiencing because it may provide some clues to my problem.  In my
> >> case, the system will run for 15 minutes to 10 days before failing (most
> >> of the time it is several days to a week).  It appears to die from a seg
> >> fault in the JVM (I am using Sun 1.6.0_18 but have tried previous
> >> versions)... you may be able to see the cause of the failure from the
> >> core file (the core files on my systems were in several directories, so
> >> you may have to do a 'find' to locate them).  Load may be a factor, but
> >> the failures generally come after the load has been heavy for a while.
> >> I am running a couple of applications and it seems the failures are more
> >> frequent when people are hitting the additional apps (the primary app is
> >> always used, the remaining apps are used sporadically).
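
The 'find' suggested above could look something like the following. This is a minimal sketch, assuming a POSIX shell; `find_core_files` is a hypothetical helper name, and the `core*` file name pattern is an assumption, since the actual name depends on the kernel's core pattern setting.

```shell
#!/bin/sh
# List regular files whose names start with "core" under a directory.
# Hypothetical helper; the "core*" pattern is an assumption and may not
# match the naming scheme configured on a given system.
find_core_files() {
    find "$1" -type f -name 'core*' 2>/dev/null
}

# Core dumps are only written when the core size limit is non-zero,
# so it is worth checking the limit of the shell that starts the JVM.
echo "core size limit: $(ulimit -c)"
```

Calling `find_core_files /` would search the whole filesystem, which can take a while on a loaded machine. Narrowing it to the directory Tomcat was started from is usually a reasonable first try.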
> >>
> >> How does this compare to what you are experiencing?
> >>
> >> Thanks,
> >>
> >> Carl
> >>
> >> ----- Original Message ----- 
> >> From: "Taylan Develioglu" <tdevelio...@ebuddy.com>
> >> To: "Tomcat Users List" <users@tomcat.apache.org>; <p...@pidster.com>
> >> Sent: Wednesday, February 24, 2010 5:09 AM
> >> Subject: Re: jvm exits without trace
> >>
> >>
> >> > The GC log shows plenty of heap space left in all the spaces.
> >> >
> >> > I purposely didn't bother replacing the variables because I figured
> >> > they would not be relevant.
> >> >
> >> > But if you think they might provide clues they're as follows:
> >> >
> >> > JAVA_HEAP_SIZE=18432M
> >> > JAVA_EDEN_SIZE=$(($(echo $JAVA_HEAP_SIZE|sed 's/M$\|G$//')/6))M
> >> > JAVA_PERM_SIZE=128M
> >> > JAVA_STCK_SIZE=128K
> >> >
> >> > EDEN_SIZE is 1/6th of total heap.
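
The eden-size expression above can be checked on its own. This is a minimal sketch, assuming a POSIX shell. It uses parameter expansion in place of the `sed` call from the original, which is a substitution for illustration and not the poster's script; `heap_number` is an illustrative name.

```shell
#!/bin/sh
# Heap size as given in the thread.
JAVA_HEAP_SIZE=18432M

# Strip a trailing M or G unit, divide by 6, and re-append M.
# Assumption: the value is expressed in megabytes. A G value would be
# divided as gigabytes and then mislabelled as megabytes.
heap_number=${JAVA_HEAP_SIZE%[MG]}
JAVA_EDEN_SIZE="$((heap_number / 6))M"

echo "NewSize/MaxNewSize would be set to $JAVA_EDEN_SIZE"
```

With the 18432M heap this gives 3072M. The value is passed to `-XX:NewSize` and `-XX:MaxNewSize`, so it sizes the whole young generation rather than eden alone.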
> >> >
> >> > And I said there was nothing in the system logs.
> >> > But you get a couple of points for trying.
> >> >
> >> > On Wed, 2010-02-24 at 10:44 +0100, Pid wrote:
> >> >> On 24/02/2010 09:36, Taylan Develioglu wrote:
> >> >> > I thought I'd add the connector definitions too:
> >> >> >
> >> >> >     <Connector port="80"
> >> >> >                 protocol="org.apache.coyote.http11.Http11AprProtocol"
> >> >> >                 compression="1024" keepAliveTimeout="60000"
> >> >> >                 maxKeepAliveRequests="-1"
> >> >> >                 enableLookups="false" redirectPort="443"
> >> >> >                 maxThreads="150"
> >> >> >                 pollerSize="32768"
> >> >> >                 pollerThreadCount="4"/>
> >> >> >
> >> >> >     <Connector port="443"
> >> >> >                 protocol="org.apache.coyote.http11.Http11AprProtocol"
> >> >> >                 SSLEnabled="true"
> >> >> >                 enableLookups="false" maxThreads="10" scheme="https"
> >> >> >                 secure="true"
> >> >> >                 SSLCertificateFile="/etc/ssl/private/something.crt"
> >> >> >                 SSLCertificateKeyFile="/etc/ssl/private/something.key"
> >> >> >                 SSLCACertificateFile="/etc/ssl/certs/ca.crt"/>
> >> >> >
> >> >> >
> >> >> > On Wed, 2010-02-24 at 10:23 +0100, Taylan Develioglu wrote:
> >> >> >> Hi,
> >> >> >>
> >> >> >> I have JVMs, running tomcat and our application, exiting mysteriously,
> >> >> >> and was wondering if anyone could give me some advice on how to debug
> >> >> >> this thing.
> >> >> >>
> >> >> >> There is nothing in catalina.out, nor our application logs, and no
> >> >> >> hotspot error file. GC log looks normal. No trace in system logs.
> >> >> >>
> >> >> >> I am left completely clueless :( Has anyone dealt with a problem like
> >> >> >> this before?
> >> >> >>
> >> >> >> Any help appreciated.
> >> >> >>
> >> >> >> - Tomcat 6.0.24
> >> >> >> - TC native 1.1.18
> >> >> >> - APR 1.3.9
> >> >> >> - Sun JDK 6u18
> >> >> >> - Debian Lenny, 2.6.31.10-amd64
> >> >> >>
> >> >> >> 2 servlets, one as ROOT. 2 HTTP connectors that use TCNative/APR.
> >> >> >>
> >> >> >> JAVA_OPTS ( ):
> >> >> >>
> >> >> >>      -verbose:gc
> >> >> >>      -Djava.awt.headless=true
> >> >> >>      -Dsun.net.inetaddr.ttl=60
> >> >> >>      -Dfile.encoding=UTF-8
> >> >> >>      -Djava.io.tmpdir=$TMP_DIR
> >> >> >>      -Djava.library.path=/usr/local/lib
> >> >> >>      -Djava.endorsed.dirs=$CATALINA_BASE/endorsed
> >> >> >>      -Dcatalina.base=$CATALINA_BASE
> >> >> >>      -Dcatalina.home=$CATALINA_HOME
> >> >> >>      -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
> >> >> >>      -Djava.util.logging.config.file="$CATALINA_BASE/conf/logging.properties"
> >> >> >>      -XX:+PrintGCDetails
> >> >> >>      -Xloggc:$CATALINA_BASE/logs/gc.log
> >> >> >>      -XX:+UseConcMarkSweepGC
> >> >> >>      -XX:CMSInitiatingOccupancyFraction=70
> >> >> >>      -Xms$JAVA_HEAP_SIZE
> >> >> >>      -Xmx$JAVA_HEAP_SIZE
> >> >> >>      -XX:NewSize=$JAVA_EDEN_SIZE
> >> >> >>      -XX:MaxNewSize=$JAVA_EDEN_SIZE
> >> >> >>      -XX:PermSize=$JAVA_PERM_SIZE
> >> >> >>      -XX:MaxPermSize=$JAVA_PERM_SIZE
> >> >> >>      -Xss$JAVA_STCK_SIZE
> >> >> >>      -XX:+UseLargePages
> >> >>
> >> >> There are no actual heap size settings in the above.  But you get a
> >> >> couple of points for trying.
> >> >>
> >> >> Google "Linux Out Of Memory killer" or "OOM Killer" and then check the
> >> >> server logs carefully.  (e.g. /var/log/messages)
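
The log check suggested above could be scripted roughly as follows. This is a minimal sketch, assuming a POSIX shell with `grep -E`; `find_oom_events` is a hypothetical helper name, and the search patterns are assumptions based on common kernel message wording that may need adjusting for a given kernel version.

```shell
#!/bin/sh
# Print log lines that suggest the kernel's OOM killer acted.
# Hypothetical helper; the patterns are assumptions and are matched
# case-insensitively to allow for wording differences between kernels.
find_oom_events() {
    grep -iE 'out of memory|oom-killer|killed process' "$1"
}
```

A call such as `find_oom_events /var/log/messages` prints any matching lines and returns a non-zero status when there are none. On Debian the kernel messages may also end up in /var/log/kern.log or /var/log/syslog, so those are worth checking as well.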
> >> >>
> >> >>
> >> >> p
> >> >>
> >> >> > ---------------------------------------------------------------------
> >> >> > To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> >> >> > For additional commands, e-mail: users-h...@tomcat.apache.org