A different kernel did not help either.

On Thu, 2010-03-11 at 11:37 +0100, Taylan Develioglu wrote:
> Changing to JIO didn't help, the silent crashes continue.
> 
> I'm changing kernel versions now.
> 
> On Fri, 2010-03-05 at 10:45 +0100, Taylan Develioglu wrote:
> > It's performing rather poorly compared to the apr connector. The number
> > of threads required to handle the requests has gone up significantly
> > across the board.
> >
> > Stability-wise, I don't have complaints yet.
> >
> > I'm keeping my fingers crossed.
> >
> > On Fri, 2010-03-05 at 10:09 +0100, Pid wrote:
> > > On 05/03/2010 08:41, Taylan Develioglu wrote:
> > > > Pid, that would assume we had a working < 1.6.10 version before that we
> > > > replaced.
> > >
> > > That it would.
> > >
> > > > We've run 1.6.10 upwards successfully for a very long time. So I don't
> > > > see the point in doing this.
> > >
> > > I must have missed that.
> > >
> > > How is the HTTP connector performing?
> > >
> > >
> > > p
> > >
> > > > On Wed, 2010-03-03 at 12:00 +0100, Pid wrote:
> > > >> On 03/03/2010 09:11, Taylan Develioglu wrote:
> > > >>> Downgrading to 1.6.0_16 did not help. I'm replacing the apr connector
> > > >>> with http now.
> > > >>
> > > >> As Chuck mentioned in the other thread, significant changes occurred at
> > > >> 1.6.10, so trying the release before (1.6.7) might be necessary to
> > > >> establish a better determination.
> > > >>
> > > >>
> > > >> p
> > > >>
> > > >>> On Wed, 2010-02-24 at 14:52 +0100, Carl wrote:
> > > >>>> Taylan,
> > > >>>>
> > > > >>>>> The failures we've seen occur anywhere between 8 hours and a week of
> > > > >>>>> runtime.
> > > >>>>
> > > >>>> The timing of the failures seems similar.
> > > >>>>
> > > > >>>>> We have also had failures with hotspot error files (hs_err) present,
> > > > >>>>> and the cause specified was indeed SIGSEGV indicating a page fault.
> > > >>>>
> > > > >>>> I have never seen any hs_* files but have seen core files where strace
> > > > >>>> showed the jvm stopped on a seg fault.
> > > >>>>
> > > > >>>>> We also use jdk 1.6.0_18; I'm downgrading the machines to 1.6.0_16
> > > > >>>>> when the situation allows (during regular updates of the application,
> > > > >>>>> or a crash) to see if that helps.
> > > >>>>
> > > > >>>> I have used jdk 1.6.0_17 and 1.6.0_18 with the same results... have not
> > > > >>>> tried 1.6.0_16.  Please post your results of this trial.
> > > >>>>
> > > > >>>>> Running tomcat in the foreground might show something, but then again
> > > > >>>>> I could be waiting for a month for it to happen.
> > > >>>>
> > > > >>>> Yes, this has been part of my problem as anytime we change something,
> > > > >>>> we have to wait a week for the server to fail.
> > > > >>>>
> > > > >>>> In one sense, I am fortunate that I have a little more flexibility than
> > > > >>>> you.  I have two servers (different hardware) but only need one in
> > > > >>>> service at a time.  Therefore, I always have one server I can test
> > > > >>>> ideas on although I have never been able to develop a meaningful stress
> > > > >>>> test, i.e., the only way I can test a change is to put it in
> > > > >>>> production.
> > > >>>>
> > > >>>> Thanks,
> > > >>>>
> > > >>>> Carl
> > > >>>>
> > > >>>> ----- Original Message -----
> > > >>>> From: "Taylan Develioglu"<tdevelio...@ebuddy.com>
> > > >>>> To: "Tomcat Users List"<users@tomcat.apache.org>
> > > >>>> Sent: Wednesday, February 24, 2010 8:31 AM
> > > >>>> Subject: Re: jvm exits without trace
> > > >>>>
> > > >>>>
> > > >>>>> Hello Carl,
> > > >>>>>
> > > > >>>>> The failures we've seen occur anywhere between 8 hours and a week of
> > > > >>>>> runtime. Most of the machines have (still) been running for almost a
> > > > >>>>> month without failure. There are ~100 machines.
> > > >>>>>
> > > > >>>>> Off the top of my head, I think we've had about 10+ failures now.
> > > >>>>>
> > > > >>>>> We have also had failures with hotspot error files (hs_err) present,
> > > > >>>>> and the cause specified was indeed SIGSEGV indicating a page fault.
> > > > >>>>> But I don't know if the two are related.
> > > >>>>>
> > > > >>>>> We also use jdk 1.6.0_18; I'm downgrading the machines to 1.6.0_16
> > > > >>>>> when the situation allows (during regular updates of the application,
> > > > >>>>> or a crash) to see if that helps.
> > > >>>>>
> > > > >>>>> It might be useful to note that the failures happen with tomcat 6.0.20
> > > > >>>>> as well as 6.0.24.
> > > >>>>>
> > > > >>>>> As far as load is concerned, I haven't had a failure on an idle
> > > > >>>>> machine. The machines are well loaded, but only at a fraction of
> > > > >>>>> their limit with regard to load and cpu utilization.
> > > > >>>>> Most memory is committed to tomcat: a 24G machine would have 18G
> > > > >>>>> allocated to heap, 128M to permgen, and some unspecified amount used
> > > > >>>>> by jni for apr. About 4G remains free after taking the jvm itself
> > > > >>>>> into account.
> > > > >>>>> A 16G machine would have 12G allocated to the heap.
> > > >>>>>
> > > > >>>>> Besides the fact that our apps heavily use nio and mina I wouldn't say
> > > > >>>>> there's anything else noteworthy. There can be anywhere up to 10000
> > > > >>>>> concurrents on one machine.
> > > >>>>>
> > > > >>>>> I had searched for coredumps, but no luck. Running tomcat in the
> > > > >>>>> foreground might show something, but then again I could be waiting
> > > > >>>>> for a month for it to happen.
> > > >>>>>
> > > >>>>> On Wed, 2010-02-24 at 12:42 +0100, Carl wrote:
> > > >>>>>> Taylan,
> > > >>>>>>
> > > > >>>>>> I am the person who started the "Tomcat dies suddenly" thread, which
> > > > >>>>>> I still haven't resolved.  I am curious about the pattern of failures
> > > > >>>>>> you are experiencing because it may provide some clues to my problem.
> > > > >>>>>> In my case, the system will run for 15 minutes to 10 days before
> > > > >>>>>> failing (most of the time it is several days to a week.)  It appears
> > > > >>>>>> to die from a seg fault in the JVM (I am using Sun 1.6.0_18 but have
> > > > >>>>>> tried previous versions)... you may be able to see the cause of the
> > > > >>>>>> failure from the core file (the core files on my systems were in
> > > > >>>>>> several directories so you may have to do a 'find' to locate them.)
> > > > >>>>>> Load may be a factor but the failures generally come after the load
> > > > >>>>>> has been heavy for a while.  I am running a couple of applications
> > > > >>>>>> and it seems the failures are more frequent when people are hitting
> > > > >>>>>> the additional apps (the primary app is always used, the remaining
> > > > >>>>>> apps are used sporadically.)
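The 'find' mentioned above could look roughly like the sketch below. The helper name `find_core_files` and the file-name patterns `core` and `core.*` are assumptions for illustration; the actual core file name depends on the kernel's core_pattern setting, and a core file is only written at all when the core size limit (`ulimit -c`) is non-zero.

```shell
# Hypothetical helper: list files that look like core dumps under a directory.
# The patterns "core" and "core.*" are assumptions; adjust them to match
# the core_pattern configured on the machine.
find_core_files() {
    # -xdev keeps the search on one filesystem; permission errors are hidden.
    find "$1" -xdev -type f \( -name core -o -name 'core.*' \) 2>/dev/null
}

# Usage (searching from / can take a while on a large machine):
# find_core_files /
```

Passing a narrower starting directory, such as the working directory the JVM was started from, is usually faster than searching from the root.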
> > > >>>>>>
> > > >>>>>> How does this compare to what you are experiencing?
> > > >>>>>>
> > > >>>>>> Thanks,
> > > >>>>>>
> > > >>>>>> Carl
> > > >>>>>>
> > > >>>>>> ----- Original Message -----
> > > >>>>>> From: "Taylan Develioglu"<tdevelio...@ebuddy.com>
> > > >>>>>> To: "Tomcat Users List"<users@tomcat.apache.org>;<p...@pidster.com>
> > > >>>>>> Sent: Wednesday, February 24, 2010 5:09 AM
> > > >>>>>> Subject: Re: jvm exits without trace
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>> The GC log shows plenty of heap space left in all the spaces.
> > > >>>>>>>
> > > > >>>>>>> I purposely didn't bother replacing the variables because I figured
> > > > >>>>>>> they would not be relevant.
> > > >>>>>>>
> > > >>>>>>> But if you think they might provide clues they're as follows:
> > > >>>>>>>
> > > >>>>>>> JAVA_HEAP_SIZE=18432M
> > > >>>>>>> JAVA_EDEN_SIZE=$(($(echo $JAVA_HEAP_SIZE|sed 's/M$\|G$//')/6))M
> > > >>>>>>> JAVA_PERM_SIZE=128M
> > > >>>>>>> JAVA_STCK_SIZE=128K
> > > >>>>>>>
> > > >>>>>>> EDEN_SIZE is 1/6th of total heap.
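As a worked check of the arithmetic above: 18432M / 6 = 3072M. One thing worth noting about the quoted expression is that it strips either an M or a G suffix but always appends M, so a heap given as 18G would yield 3M rather than 3G. Below is a minimal sketch of a variant that keeps the original unit; the function name `eden_size` is made up for illustration and is not part of the startup script in this thread.

```shell
# Hypothetical helper: one sixth of a heap size such as 18432M or 18G,
# keeping whatever unit suffix the input used.
eden_size() {
    heap_num=${1%[MG]}            # numeric part, trailing M or G removed
    heap_unit=${1#"$heap_num"}    # the removed suffix, if any
    echo "$((heap_num / 6))$heap_unit"   # integer division, as in the original
}

JAVA_HEAP_SIZE=18432M
JAVA_EDEN_SIZE=$(eden_size "$JAVA_HEAP_SIZE")
echo "heap=$JAVA_HEAP_SIZE eden=$JAVA_EDEN_SIZE"
```

This uses plain parameter expansion instead of sed, so it does not depend on the GNU-specific `\|` alternation in the original pattern.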
> > > >>>>>>>
> > > >>>>>>> And I said there was nothing in the system logs.
> > > >>>>>>> But you get a couple of points for trying.
> > > >>>>>>>
> > > >>>>>>> On Wed, 2010-02-24 at 10:44 +0100, Pid wrote:
> > > >>>>>>>> On 24/02/2010 09:36, Taylan Develioglu wrote:
> > > > >>>>>>>>> I thought I'd add the connector definitions too:
> > > >>>>>>>>>
> > > >>>>>>>>>       <Connector port="80"
> > > >>>>>>>>> protocol="org.apache.coyote.http11.Http11AprProtocol"
> > > >>>>>>>>>                   compression="1024" keepAliveTimeout="60000"
> > > >>>>>>>>> maxKeepAliveRequests="-1"
> > > >>>>>>>>>                   enableLookups="false" redirectPort="443"
> > > >>>>>>>>> maxThreads="150"
> > > >>>>>>>>> pollerSize="32768"
> > > >>>>>>>>>                   pollerThreadCount="4"/>
> > > >>>>>>>>>
> > > >>>>>>>>>        <Connector port="443"
> > > >>>>>>>>> protocol="org.apache.coyote.http11.Http11AprProtocol"
> > > >>>>>>>>> SSLEnabled="true"
> > > >>>>>>>>>                   enableLookups="false" maxThreads="10" 
> > > >>>>>>>>> scheme="https"
> > > >>>>>>>>> secure="true"
> > > >>>>>>>>>                   
> > > >>>>>>>>> SSLCertificateFile="/etc/ssl/private/something.crt"
> > > >>>>>>>>>
> > > >>>>>>>>> SSLCertificateKeyFile="/etc/ssl/private/something.key"
> > > >>>>>>>>>                   SSLCACertificateFile="/etc/ssl/certs/ca.crt"/>
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> On Wed, 2010-02-24 at 10:23 +0100, Taylan Develioglu wrote:
> > > >>>>>>>>>> Hi,
> > > >>>>>>>>>>
> > > > >>>>>>>>>> I have JVMs, running tomcat and our application, exiting
> > > > >>>>>>>>>> mysteriously, and was wondering if anyone could give me some
> > > > >>>>>>>>>> advice on how to debug this thing.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> There is nothing in catalina.out, nor our application logs, and
> > > > >>>>>>>>>> no hotspot error file. GC log looks normal. No trace in system
> > > > >>>>>>>>>> logs.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> I am left completely clueless :( Has anyone dealt with a problem
> > > > >>>>>>>>>> like this before?
> > > >>>>>>>>>>
> > > >>>>>>>>>> Any help appreciated.
> > > >>>>>>>>>>
> > > >>>>>>>>>> - Tomcat 6.0.24
> > > >>>>>>>>>> - TC native 1.1.18
> > > >>>>>>>>>> - APR 1.3.9
> > > >>>>>>>>>> - Sun JDK 6u18
> > > >>>>>>>>>> - Debian Lenny, 2.6.31.10-amd64
> > > >>>>>>>>>>
> > > > >>>>>>>>>> 2 servlets, one as ROOT. 2 HTTP connectors that use TCNative/APR.
> > > >>>>>>>>>>
> > > >>>>>>>>>> JAVA_OPTS ( ):
> > > >>>>>>>>>>
> > > >>>>>>>>>>        -verbose:gc
> > > >>>>>>>>>>        -Djava.awt.headless=true
> > > >>>>>>>>>>        -Dsun.net.inetaddr.ttl=60
> > > >>>>>>>>>>        -Dfile.encoding=UTF-8
> > > >>>>>>>>>>        -Djava.io.tmpdir=$TMP_DIR
> > > >>>>>>>>>>        -Djava.library.path=/usr/local/lib
> > > >>>>>>>>>>        -Djava.endorsed.dirs=$CATALINA_BASE/endorsed
> > > >>>>>>>>>>        -Dcatalina.base=$CATALINA_BASE
> > > >>>>>>>>>>        -Dcatalina.home=$CATALINA_HOME
> > > >>>>>>>>
> > > >>>>>>>>>>      
> > > >>>>>>>>>> -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
> > > >>>>>>>>>> -Djava.util.logging.config.file="$CATALINA_BASE/conf/logging.properties"
> > > >>>>>>>>>>        -XX:+PrintGCDetails
> > > >>>>>>>>>>        -Xloggc:$CATALINA_BASE/logs/gc.log
> > > >>>>>>>>>>        -XX:+UseConcMarkSweepGC
> > > >>>>>>>>>>        -XX:CMSInitiatingOccupancyFraction=70
> > > >>>>>>>>>>        -Xms$JAVA_HEAP_SIZE
> > > >>>>>>>>>>        -Xmx$JAVA_HEAP_SIZE
> > > >>>>>>>>>>        -XX:NewSize=$JAVA_EDEN_SIZE
> > > >>>>>>>>>>        -XX:MaxNewSize=$JAVA_EDEN_SIZE
> > > >>>>>>>>>>        -XX:PermSize=$JAVA_PERM_SIZE
> > > >>>>>>>>>>        -XX:MaxPermSize=$JAVA_PERM_SIZE
> > > >>>>>>>>>>        -Xss$JAVA_STCK_SIZE
> > > >>>>>>>>>>        -XX:+UseLargePages
> > > >>>>>>>>
> > > > >>>>>>>> There are no actual heap size settings in the above.  But you get a
> > > > >>>>>>>> couple of points for trying.
> > > > >>>>>>>>
> > > > >>>>>>>> Google "Linux Out Of Memory killer" or "OOM Killer" and then check
> > > > >>>>>>>> the server logs carefully.  (e.g. /var/log/messages)
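A rough sketch of the log check suggested above. The helper name `find_oom_events` is hypothetical, and the search strings are assumptions based on typical kernel OOM-killer messages; the exact wording varies between kernel versions, and the log file location varies between distributions.

```shell
# Hypothetical helper: print log lines that look like OOM-killer activity.
# The patterns are assumptions about typical kernel messages, not an
# exhaustive list.
find_oom_events() {
    grep -i -E 'oom-killer|out of memory|killed process' "$@"
}

# Usage (log paths are assumptions; use whichever files the system keeps):
# find_oom_events /var/log/messages /var/log/kern.log
```

An empty result only rules out the OOM killer for the period the files cover, so rotated logs may need checking as well.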
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> p
> > > >>>>>>>>
> > > >>>>>>>>> ---------------------------------------------------------------------
> > > >>>>>>>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> > > >>>>>>>>> For additional commands, e-mail: users-h...@tomcat.apache.org
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> ---------------------------------------------------------------------
> > > >>>>>>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> > > >>>>>>>> For additional commands, e-mail: users-h...@tomcat.apache.org
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> ---------------------------------------------------------------------
> > > >>>>>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> > > >>>>>>> For additional commands, e-mail: users-h...@tomcat.apache.org
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> ---------------------------------------------------------------------
> > > >>>>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> > > >>>>>> For additional commands, e-mail: users-h...@tomcat.apache.org
> > > >>>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>> ---------------------------------------------------------------------
> > > >>>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> > > >>>>> For additional commands, e-mail: users-h...@tomcat.apache.org
> > > >>>>>
> > > >>>>>
> > > >>>>
> > > >>>>
> > > >>>> ---------------------------------------------------------------------
> > > >>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> > > >>>> For additional commands, e-mail: users-h...@tomcat.apache.org
> > > >>>>
> > > >>>
> > > >>>
> > > >>>
> > > >>> ---------------------------------------------------------------------
> > > >>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> > > >>> For additional commands, e-mail: users-h...@tomcat.apache.org
> > > >>>
> > > >>
> > > >>
> > > >> ---------------------------------------------------------------------
> > > >> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> > > >> For additional commands, e-mail: users-h...@tomcat.apache.org
> > > >>
> > > >
> > > >
> > >
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> > For additional commands, e-mail: users-h...@tomcat.apache.org
> >
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: users-h...@tomcat.apache.org
> 



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org

Reply via email to