RE: Thread dump analysis
> From: Pieter Temmerman [mailto:ptemmerman@sadiel.es]
> Subject: RE: Thread dump analysis
>
> And what about Tomcat 5 with JDK6?

I don't run Tomcat 5 much anymore, but what little I have done with it seems to be fine on JDK 6.

 - Chuck

THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY MATERIAL and is thus for use only by the intended recipient. If you received this in error, please contact the sender and delete the e-mail and its attachments from all computers.

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
RE: Thread dump analysis
Charles, List,

> Hmmm... could be a JVM bug, or might be a conflict with classes in your
> webapp. Do you happen to have some XML-related jars in your webapp or
> otherwise visible to that branch of the classloader tree? (I guess you
> answered that below.)

Well, actually, the developers also included an Apache Xerces library (note that the other Xerces library is Sun's implementation). Apart from that, as you mentioned below, there are XFire-related libraries.

> No, JDK5 works fine with Tomcat 6. The newer JVM is a bit faster, in many
> cases.

Good to know. And what about Tomcat 5 with JDK6?

> It looks like XFire has been superseded by CXF; have you tried that?
> http://xfire.codehaus.org/
> http://cxf.apache.org/

Yes, I also noticed that CXF was the way to go. Unfortunately, this application was not developed by me and is closed source, so the only option is to use it as it is, with XFire. I mailed both the XFire and Sun mailing lists a few days ago, but unfortunately I'm still waiting for a reply. I guess I'll just have to deal with the fact that it doesn't work with JDK6, although I don't know why.

Anyway, many thanks to all of you for helping me out. It's greatly appreciated.

Regards,

Pieter
RE: Thread dump analysis
> From: Pieter Temmerman [mailto:ptemmerman@sadiel.es]
> Subject: Re: Thread dump analysis
>
> Now, I don't know why (since I'm a noob programmer), but it seems that
> the implementation of Xerces that is included in JDK6 is causing the
> application to hang.

Hmmm... could be a JVM bug, or might be a conflict with classes in your webapp. Do you happen to have some XML-related jars in your webapp or otherwise visible to that branch of the classloader tree? (I guess you answered that below.)

> Off-topic: is it discouraged to use TC6 with JDK5?

No, JDK5 works fine with Tomcat 6. The newer JVM is a bit faster, in many cases.

> The problem might reside in the Xerces implementation, or rather in an
> incompatibility between JDK6 and XFire (I haven't found any
> indications of that).

It looks like XFire has been superseded by CXF; have you tried that?

http://xfire.codehaus.org/
http://cxf.apache.org/

 - Chuck
Re: Thread dump analysis
Hi.

For all of you who were so kind to respond to my problem a few days ago, and for others who'd like to know: I think I have found the problem. As mentioned in earlier mails, the application I monitor hangs after a random amount of time. Using JConsole and VisualVM I managed to detect the threads that were causing the hangup; they showed the following in their stack traces:

> com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:930)
> com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
> com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
> com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:548)

Now, I don't know why (since I'm a noob programmer), but it seems that the implementation of Xerces that is included in JDK6 is causing the application to hang. Using JDK5 I don't see this behavior. (Off-topic: is it discouraged to use TC6 with JDK5?)

The problem might reside in the Xerces implementation, or rather in an incompatibility between JDK6 and XFire (I haven't found any indications of that). The thing is, using JDK5, I no longer see any Xerces threads that are continuously running.

Cheers,

Pieter

On Wed, 2009-01-28 at 18:13 +0100, Pieter Temmerman wrote:
> Thanks Chris,
>
> I just bumped into a very nice plugin for JConsole called "TopThread".
> It's similar to Linux's top and shows the CPU usage of all threads.
> This only confirms what I figured out before: there are two threads
> taking up all the CPU.
> See below for their stack trace:
>
> com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:930)
> com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
> com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
> com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:548)
> org.codehaus.xfire.soap.handler.ReadHeadersHandler.invoke(ReadHeadersHandler.java:44)
> org.codehaus.xfire.handler.HandlerPipeline.invoke(HandlerPipeline.java:131)
> org.codehaus.xfire.transport.DefaultEndpoint.onReceive(DefaultEndpoint.java:64)
> org.codehaus.xfire.transport.AbstractChannel.receive(AbstractChannel.java:38)
> org.codehaus.xfire.transport.http.XFireServletController.invoke(XFireServletController.java:304)
> org.codehaus.xfire.transport.http.XFireServletController.doService(XFireServletController.java:129)
> org.codehaus.xfire.transport.http.XFireServlet.doPost(XFireServlet.java:116)
> javax.servlet.http.HttpServlet.service(HttpServlet.java:710)
> javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
> org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
> java.lang.Thread.run(Thread.java:619)
>
> Also, I noticed something else:
> after a while (15-20 mins) the webapp is accessible again (without
> restarting Tomcat), although CPU usage keeps hitting 100%. Could it be
> that Tomcat ran out of connectors (maxThreads was hit) and after a while
> they get closed by the timeout?
> This could be a plausible explanation, although the weird thing still
> is why this coincides with the CPU topping 100%.
>
> Still puzzled :)

-- 
Pieter Temmerman
email: ptemmerman@sadiel.es
skype: ptemmerman.sadiel

SADIEL TECNOLOGÍAS DE LA INFORMACIÓN, S.A.
http://www.sadiel.es
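Pieter found the hot threads with JConsole and VisualVM; the same snapshot can also be taken from inside the JVM itself. The following is my own illustrative sketch (not code from this thread) using `Thread.getAllStackTraces()` to print the stack of every RUNNABLE thread, in roughly the format a thread dump uses:

```java
import java.util.Map;

public class DumpRunnable {
    // Snapshot of every live thread's stack, as the JVM itself sees it.
    static Map<Thread, StackTraceElement[]> all() {
        return Thread.getAllStackTraces();
    }

    public static void main(String[] args) {
        for (Map.Entry<Thread, StackTraceElement[]> e : all().entrySet()) {
            if (e.getKey().getState() != Thread.State.RUNNABLE) continue;
            System.out.println("\"" + e.getKey().getName() + "\" " + e.getKey().getState());
            for (StackTraceElement frame : e.getValue()) {
                System.out.println("    at " + frame);
            }
        }
    }
}
```

Running something like this from a servlet (or via `kill -3` on the Tomcat process, which dumps to stdout) gives the same per-thread picture without attaching an external tool.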
RE: Thread dump analysis
Hi Chuck,

> > Could it be that Tomcat ran out of connectors (maxThreads
> > was hit) and after a while they get closed by the timeout?
>
> Depends on how your DB connector pool is configured.

I was actually referring to the limit of HTTP connections, set by maxThreads.

> > This could be a plausible explanation, although, the weird thing still
> > is why this coincides with the CPU topping 100%.
>
> Could the infinite loop that these threads sometimes get into include
> acquisition of a DB connection at some point inside the loop? If your DBC
> pool is set to discard unreturned connections after some period of time, it
> will acquire more and those will become available for new requests.

Yes, the webservice connects at one point to a DB, although from the thread it seems that it's looping while reading an XML document. Instead of the Tomcat DB pool, the developers implemented c3p0; they said that was better. I don't know whether it will change anything, but it's definitely worth a try.

I'll keep digging... Thanks!
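Since the developers swapped in c3p0 (written "CP30" above; the library's actual name is c3p0), it is worth noting that c3p0 has its own leak-detection knobs, analogous to the DBCP abandoned-connection settings discussed elsewhere in this thread. A hedged sketch of a `c3p0.properties` fragment; the timeout value is illustrative, not a recommendation:

```properties
# Reclaim connections not returned to the pool within 5 minutes.
# This is a debugging aid for finding leaks, not a production fix.
c3p0.unreturnedConnectionTimeout=300

# Log the stack trace of the code that checked each unreturned connection out.
c3p0.debugUnreturnedConnectionStackTraces=true
```

With both set, c3p0 logs exactly which call site borrowed a connection that was never closed.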
RE: Thread dump analysis
> From: Pieter Temmerman [mailto:ptemmerman@sadiel.es]
> Subject: Re: Thread dump analysis
>
> Could it be that Tomcat ran out of connectors (maxThreads
> was hit) and after a while they get closed by the timeout?

Depends on how your DB connector pool is configured.

> This could be a plausible explanation, although, the weird thing still
> is why this coincides with the CPU topping 100%.

Could the infinite loop that these threads sometimes get into include acquisition of a DB connection at some point inside the loop? If your DBC pool is set to discard unreturned connections after some period of time, it will acquire more, and those will become available for new requests.

 - Chuck
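Chuck's scenario (a code path that borrows a connection and never returns it, slowly draining the pool) can be illustrated with a toy stand-in. This is my own sketch; a `Semaphore` plays the role of the pool, and nothing here is DBCP or c3p0 code:

```java
import java.util.concurrent.Semaphore;

// Toy stand-in for a connection pool: N semaphore permits play the role of
// N pooled connections. Purely illustrative.
class MiniPool {
    private final Semaphore permits;
    MiniPool(int size) { permits = new Semaphore(size); }
    boolean tryBorrow() { return permits.tryAcquire(); }  // like getConnection()
    void release()      { permits.release(); }            // like connection.close()
    int available()     { return permits.availablePermits(); }
}

public class LeakDemo {
    public static void main(String[] args) {
        // Leaky pattern: borrow and never release. Two requests drain a
        // 2-slot pool; every later caller is refused (a real pool blocks,
        // which is exactly the Object.wait() seen in the thread dump).
        MiniPool leaky = new MiniPool(2);
        leaky.tryBorrow();
        leaky.tryBorrow();
        System.out.println(leaky.tryBorrow()); // prints: false

        // Safe pattern: release in finally, so the slot always comes back,
        // even if the "request" throws.
        MiniPool safe = new MiniPool(2);
        boolean got = safe.tryBorrow();
        try {
            // ... use the "connection" ...
        } finally {
            if (got) safe.release();
        }
        System.out.println(safe.available()); // prints: 2
    }
}
```

The two patterns differ only in the `finally` block, which is why a single leak-prone code path can take down an otherwise healthy pool.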
Re: Thread dump analysis
> Do I understand you correctly that after you killed the first tomcat
> (in my understanding the one which fires soap requests, tomcat 5),
> all the RUNNABLE threads in the second tomcat (the one that answers
> soap, tomcat 6 with xfire) went away and it (tomcat) was idle?

No, I did not restart Tomcat 5 (the one that fires the SOAP requests). I restarted Tomcat 6 (which showed 100% CPU usage). The application on Tomcat 5 then worked fine.
Re: Thread dump analysis
Thanks Chris,

I just bumped into a very nice plugin for JConsole called "TopThread". It's similar to Linux's top and shows the CPU usage of all threads. This only confirms what I figured out before: there are two threads taking up all the CPU. See below for their stack trace:

com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:930)
com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:548)
org.codehaus.xfire.soap.handler.ReadHeadersHandler.invoke(ReadHeadersHandler.java:44)
org.codehaus.xfire.handler.HandlerPipeline.invoke(HandlerPipeline.java:131)
org.codehaus.xfire.transport.DefaultEndpoint.onReceive(DefaultEndpoint.java:64)
org.codehaus.xfire.transport.AbstractChannel.receive(AbstractChannel.java:38)
org.codehaus.xfire.transport.http.XFireServletController.invoke(XFireServletController.java:304)
org.codehaus.xfire.transport.http.XFireServletController.doService(XFireServletController.java:129)
org.codehaus.xfire.transport.http.XFireServlet.doPost(XFireServlet.java:116)
javax.servlet.http.HttpServlet.service(HttpServlet.java:710)
javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
java.lang.Thread.run(Thread.java:619)

Also, I noticed something else: after a while (15-20 mins) the webapp is accessible again (without restarting Tomcat), although CPU usage keeps hitting 100%. Could it be that Tomcat ran out of connectors (maxThreads was hit) and after a while they get closed by the timeout? This could be a plausible explanation, although the weird thing still is why this coincides with the CPU topping 100%.

Still puzzled :)
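The per-thread CPU view that the "TopThread" plugin provides can also be approximated in plain code via `ThreadMXBean`. A rough sketch of mine (not the plugin's code); it assumes the JVM supports thread CPU time measurement, and `getThreadCpuTime` returns -1 where it doesn't, which simply sorts those threads last:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Arrays;
import java.util.Comparator;

public class TopThreads {
    // Names of the n threads that have consumed the most CPU time so far.
    static String[] busiest(int n) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        return Arrays.stream(mx.getAllThreadIds())
                .boxed()
                .sorted(Comparator.comparingLong((Long id) -> mx.getThreadCpuTime(id)).reversed())
                .limit(n)
                .map(id -> {
                    ThreadInfo info = mx.getThreadInfo(id);
                    return info == null ? "(terminated)" : info.getThreadName();
                })
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        for (String name : busiest(3)) {
            System.out.println(name);
        }
    }
}
```

Sampling this twice and diffing the CPU times would show which threads are currently burning CPU, which is essentially what the plugin displays.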
Re: Thread dump analysis
I omit the whole quoting to save traffic and for clarity.

Do I understand you correctly that after you killed the first tomcat (in my understanding, the one which fires the soap requests, tomcat 5), all the RUNNABLE threads in the second tomcat (the one that answers soap, tomcat 6 with xfire) went away and it (tomcat) was idle?

Leon
Re: Thread dump analysis
Pieter,

Pieter Temmerman wrote:
> Memory usage looks healthy, but CPU usage goes sky high (mainly caused
> by the Java Tomcat process).

That's interesting. Looking at your thread dump snippet, it looks like your database connection pool is exhausted. If that were the only problem, you'd have very low CPU usage, because everything would be blocked waiting. Something else must be going on.

> My first question is, what is a TP-Processor exactly? Is each client
> connection to Tomcat being assigned a TP-Processor or am I wrong?

Each HTTP connection is assigned a TP-Processor for the duration of the request.

> But for some kind of reason, there is never a free
> one available, and the application just won't work until Tomcat is
> restarted.

You are probably leaking them somewhere. The fact that your server stays up for an arbitrary amount of time leads me to believe that you have a leak in one place, but not everywhere. So, if someone is using that leak-prone code a lot, you exhaust your connection pool quickly, but if not, the server stays up for a long time.

Are you using Tomcat's DBCP? If so, enable all the debugging features (logAbandoned, removeAbandoned, etc.), which should lead you to the code that is leaking connections.

> Note the line "locked <0x2aabadaabff8>" and "waiting on
> <0x2aabadaabff8>" later on. So first it's locking that "thing" and
> then it's waiting on that "thing". This same number is coming back in
> each TP-Processor that is in waiting state. That seems rather weird to
> me.

That's just the way the thread dump looks.

> So I was wondering:
> a. Is that normal behavior?

Yes, assuming that resource exhaustion is considered "normal".

> b. Is there any way to know what the 0x2aabadaabff8 means?

Yes.

> My scientific calculator says it's rather an insane number when trying
> to convert it to decimal.

It's just a pointer, so knowing the decimal value is pretty useless.

> Maybe it's just as easy as it reads: waiting on <0x2aabadaabff8>, which
> is a GenericObjectPool.

Right: the JVM tells you what object 0xabababa... is.

-chris
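The DBCP debugging features Chris mentions are switched on in the pool's `<Resource>` definition. A hedged example of what that could look like in a webapp's context.xml; the JNDI name, credentials, URL, and timeouts are illustrative placeholders, not values from this thread:

```xml
<Context>
  <Resource name="jdbc/MyDB"
            auth="Container"
            type="javax.sql.DataSource"
            maxActive="20"
            maxWait="10000"
            removeAbandoned="true"
            removeAbandonedTimeout="60"
            logAbandoned="true"
            username="dbuser"
            password="dbpass"
            driverClassName="oracle.jdbc.OracleDriver"
            url="jdbc:oracle:thin:@dbhost:1521:ORCL" />
</Context>
```

With logAbandoned enabled, the pool prints the stack trace of the code that borrowed each abandoned connection, which points directly at the leaking call site.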
Re: Thread dump analysis
> If the maxThreads of the connector were too low, wouldn't the
> "freeze" be over once there are free connections? And also, how can a
> small maxThreads make a thread hang? For example, the one that is
> trying to read an XML file.

As a follow-up to my own question, this is what the docs say:

At server startup time, this Connector will create a number of request processing threads (based on the value configured for the minSpareThreads attribute). Each incoming request requires a thread for the duration of that request. If more simultaneous requests are received than can be handled by the currently available request processing threads, additional threads will be created up to the configured maximum (the value of the maxThreads attribute). If still more simultaneous requests are received, they are stacked up inside the server socket created by the Connector, up to the configured maximum (the value of the acceptCount attribute). Any further simultaneous requests will receive "connection refused" errors, until resources are available to process them.
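To make the quoted behavior concrete, here is a hedged sketch of the relevant Connector attributes in server.xml. The numbers are illustrative (close to common defaults), not tuning advice for this setup:

```xml
<!-- Up to maxThreads requests are processed concurrently; up to acceptCount
     more queue in the server socket backlog; beyond that, clients see
     "connection refused" until threads free up. -->
<Connector port="8081" protocol="HTTP/1.1"
           minSpareThreads="25"
           maxThreads="150"
           acceptCount="100"
           connectionTimeout="20000" />
```

So a too-small maxThreads queues or refuses requests rather than hanging an individual thread, which matches the follow-up question above.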
Re: Thread dump analysis
I really appreciate your input, Leon.

On Wed, 2009-01-28 at 11:07 +0100, Leon Rosenberg wrote:
> > "RMI TCP Connection(42)-173.x.x.x" - Thread t...@112
> > java.lang.Thread.State: RUNNABLE
>
> reading from socket, usually not a problem.

I thought so. Thanks.

> > "http-8081-35" - Thread t...@71
> > java.lang.Thread.State: RUNNABLE
> >    at java.lang.String.equals(String.java:1018)
> >    at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:554)
> >    at org.codehaus.xfire.soap.handler.ReadHeadersHandler.invoke(ReadHeadersHandler.java:44)
>
> This one actually looks like a problem. Was the server under load as
> you created the dump, or did you remove the load first?

No, the server was not under load at all. And I took various thread dumps at random intervals, and the thread was still there (same thread number). Maybe there is an infinite loop, but I don't know why it works for a random amount of time and then freezes.

> If there were no external requests to the server, you most probably
> have an infinite loop somewhere (how stable is xfire anyway?) or you
> have a really really really large xml file to parse.

That's a good question. I have no experience with XFire. Would there be a way to know which XML file it's parsing?
> > "http-8081-31" - Thread t...@67
> > java.lang.Thread.State: RUNNABLE
> >    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:930)
> >    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
> >    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
> >    at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:548)
> >    at org.codehaus.xfire.soap.handler.ReadHeadersHandler.invoke(ReadHeadersHandler.java:44)
> >    at org.codehaus.xfire.handler.HandlerPipeline.invoke(HandlerPipeline.java:131)
> >    at org.codehaus.xfire.transport.DefaultEndpoint.onReceive(DefaultEndpoint.java:64)
> >    at org.codehaus.xfire.transport.AbstractChannel.receive(AbstractChannel.java:38)
> >    at org.codehaus.xfire.transport.http.XFireServletController.invoke(XFireServletController.java:304)
> >    at org.codehaus.xfire.transport.http.XFireServletController.doService(XFireServletController.java:129)
> >    at org.codehaus.xfire.transport.http.XFireServlet.doPost(XFireServlet.java:116)
> >
> > note that the XFire path is almost identical.
>
> Personally I never managed to catch healthy code in something like
> String.equals(), but it may be possible, even if imho unlikely.
> The dump looks healthy, but is it the dump from the tomcat 5.5 or tomcat 6?
> Because if tomcat 6 is the problem and the dump is from tomcat 5, it's
> useless :-)

This was the thread dump from Tomcat 6.

> you need more research :-) For now it looks like your first tomcat is
> waiting for the second tomcat, which just can't handle the requests as
> fast as expected. You may want to check configuration options of
> both (threads in the connector etc).

If the maxThreads in the connector were too low, wouldn't the "freeze" be over once there are free connections? And also, how can a small maxThreads make a thread hang? For example, the one that is trying to read an XML file.

> Also the thread dump of the second tomcat would be useful (if this one
> is from the first).
> And a thread dump shortly after the traffic has gone (for example kill
> the first tomcat, wait 1 minute, make a thread dump).

Did this; see the output (I removed the RMI threads):

"TP-Monitor" - Thread t...@28
java.lang.Thread.State: TIMED_WAITING on org.apache.tomcat.util.threads.threadpool$monitorrunna...@498364
   at java.lang.Object.wait(Native Method)
   at org.apache.tomcat.util.threads.ThreadPool$MonitorRunnable.run(ThreadPool.java:565)
   at java.lang.Thread.run(Thread.java:619)

   Locked ownable synchronizers:
   - None

"TP-Processor4" - Thread t...@27
java.lang.Thread.State: RUNNABLE
   at java.net.PlainSocketImpl.socketAccept(Native Method)
   at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:384)
   - locked java.net.sockssocketi...@134af1
   at java.net.ServerSocket.implAccept(ServerSocket.java:453)
   at java.net.ServerSocket.accept(ServerSocket.java:421)
   at org.apache.jk.common.ChannelSocket.accept(ChannelSocket.java:306)
   at org.apache.jk.common.ChannelSocket.acceptConnections(ChannelSocket.java:660)
   at org.apache.jk.common.ChannelSocket$SocketAcceptor.runIt(ChannelSocket.java:870)
   at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:690)
   at java.lang.Thread.run(Thread.java:619)

   Locked ownable synchronizers:
   - None

"TP-Processor3" - Thread t...@2
Re: Thread dump analysis
On Wed, Jan 28, 2009 at 10:42 AM, Pieter Temmerman wrote:
> > Have you found any java.lang.Thread.State: RUNNABLE threads? They are
> > usually more interesting if it comes to a high cpu :-)
>
> These are the RUNNABLE threads on Tomcat 6:
>
> "RMI TCP Connection(42)-173.x.x.x" - Thread t...@112
> java.lang.Thread.State: RUNNABLE

reading from socket, usually not a problem.

> "http-8081-35" - Thread t...@71
> java.lang.Thread.State: RUNNABLE
>    at java.lang.String.equals(String.java:1018)
>    at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:554)
>    at org.codehaus.xfire.soap.handler.ReadHeadersHandler.invoke(ReadHeadersHandler.java:44)
>    at org.codehaus.xfire.handler.HandlerPipeline.invoke(HandlerPipeline.java:131)
>    at org.codehaus.xfire.transport.DefaultEndpoint.onReceive(DefaultEndpoint.java:64)
>    at org.codehaus.xfire.transport.AbstractChannel.receive(AbstractChannel.java:38)
>    at org.codehaus.xfire.transport.http.XFireServletController.invoke(XFireServletController.java:304)
>    at org.codehaus.xfire.transport.http.XFireServletController.doService(XFireServletController.java:129)
>    at org.codehaus.xfire.transport.http.XFireServlet.doPost(XFireServlet.java:116)
>    at javax.servlet.http.HttpServlet.service(HttpServlet.java:710)
>    at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
>    at

This one actually looks like a problem. Was the server under load as you created the dump, or did you remove the load first? If there were no external requests to the server, you most probably have an infinite loop somewhere (how stable is xfire anyway?) or you have a really really really large xml file to parse. If there were requests to the server as you took the dump, you might just have happened to catch this part of the execution.
In that case you should remove the traffic first and create a dump afterwards (of course, only if the load remains after the traffic is cut off).

> "http-8081-31" - Thread t...@67
> java.lang.Thread.State: RUNNABLE
>    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:930)
>    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
>    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
>    at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:548)
>    at org.codehaus.xfire.soap.handler.ReadHeadersHandler.invoke(ReadHeadersHandler.java:44)
>    at org.codehaus.xfire.handler.HandlerPipeline.invoke(HandlerPipeline.java:131)
>    at org.codehaus.xfire.transport.DefaultEndpoint.onReceive(DefaultEndpoint.java:64)
>    at org.codehaus.xfire.transport.AbstractChannel.receive(AbstractChannel.java:38)
>    at org.codehaus.xfire.transport.http.XFireServletController.invoke(XFireServletController.java:304)
>    at org.codehaus.xfire.transport.http.XFireServletController.doService(XFireServletController.java:129)
>    at org.codehaus.xfire.transport.http.XFireServlet.doPost(XFireServlet.java:116)
>
> note that the XFire path is almost identical.

Personally I never managed to catch healthy code in something like String.equals(), but it may be possible, even if imho unlikely.

> The Tomcat 5.5 webapp which connects to the webservice is not
> responding. I saw the following thread on Tomcat 5.5 which may be the
> connection to the webservice:
>
> The fact that it's reading from a socket, which I think is the
> connection to the webservice, may explain why the application is not
> doing anything.

Yep.

> > Also, as David posted, what is the HEAP usage? it's usually at the end
> > of the dump.
>
> vdebian2:/usr/local/jdk1.6.0_07/bin# ./jmap 8100
> Attaching to process ID 8100, please wait...
> Debugger attached successfully.
> Server compiler detected.
> JVM version is 10.0-b23
>
> using thread-local object allocation.
> Parallel GC with 2 thread(s)
>
> Heap Configuration:
>    MinHeapFreeRatio = 40
>    MaxHeapFreeRatio = 70
>    MaxHeapSize      = 1073741824 (1024.0MB)
>    NewSize          = 1048576 (1.0MB)
>    MaxNewSize       = 4294901760 (4095.9375MB)
>    OldSize          = 4194304 (4.0MB)
>    NewRatio         = 8
>    SurvivorRatio    = 8
>    PermSize         = 16777216 (16.0MB)
>    MaxPermSize      = 268435456 (256.0MB)
>
> Heap Usage:
> PS Young Generation
> Eden Space:
>    capacity = 58851328 (56.125MB)
>    used     = 18383664 (17.532028198242188MB)
>    free     = 40467664 (38.59297180175781MB)
>    31.237466722925948% used
> From Space:
>    capacity = 327680 (0.3125MB)
>    used     = 131072 (0.125MB)
>    free     = 196608 (0.1875MB)
>    40.0% used
> To Space:
>    capacity = 393216 (0.375MB)
>    used     = 0 (0.0MB)
>    free     = 393216 (0.375MB)
>    0.0% used
> PS Old Generation
>
RE: Thread dump analysis
Thanks to all for your helpful input.

While trying to reproduce the problem on pre-production servers, I used JMeter to generate some load. Let me give some more specific information about the setup. I have two Tomcat servers:

- Tomcat 5.5.7 (JDK 1.5.0_09). (I know, it's old. But this is the version that is used in production. I don't administer the actual production server, thus I'm not in a position to upgrade it.)
- Tomcat 6.0.16 (JDK 1.6.0_07)

On Tomcat 6.0 I'm running a webservice which is implemented using XFire and connects to an Oracle 9i database. On Tomcat 5.5.7 I'm running web application X, which uses the webservice on Tomcat 6.

Normally, what we see on the production server is that Tomcat 5.5 is hanging (CPU goes to 100%). However, this time, when trying to reproduce on the pre-production servers, it's Tomcat 6 which goes to 100%, which in turn makes the webapp on Tomcat 5.5 unresponsive.

OK, so I thought that the webservice on TC 6.0 had crashed or something. So I used SOAPUI to send a request to the webservice, just to be sure whether the webservice was down. Strangely, the webservice successfully answered my request.

> Have you found any java.lang.Thread.State: RUNNABLE threads? They are
> usually more interesting if it comes to a high cpu :-)

These are the RUNNABLE threads on Tomcat 6:

"RMI TCP Connection(42)-173.x.x.x" - Thread t...@112
java.lang.Thread.State: RUNNABLE
   at java.net.SocketInputStream.socketRead0(Native Method)
   at java.net.SocketInputStream.read(SocketInputStream.java:129)
   at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
   at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
   - locked java.io.bufferedinputstr...@32c1b3
   at java.io.FilterInputStream.read(FilterInputStream.java:66)
   at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:517)
   at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
   at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
   at java.lang.Thread.run(Thread.java:619)

   Locked ownable synchronizers:
   - locked java.util.concurrent.locks.reentrantlock$nonfairs...@4adf48

NOTE!: There are 6 more of those RMI threads.

"http-8081-35" - Thread t...@71
java.lang.Thread.State: RUNNABLE
   at java.lang.String.equals(String.java:1018)
   at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:554)
   at org.codehaus.xfire.soap.handler.ReadHeadersHandler.invoke(ReadHeadersHandler.java:44)
   at org.codehaus.xfire.handler.HandlerPipeline.invoke(HandlerPipeline.java:131)
   at org.codehaus.xfire.transport.DefaultEndpoint.onReceive(DefaultEndpoint.java:64)
   at org.codehaus.xfire.transport.AbstractChannel.receive(AbstractChannel.java:38)
   at org.codehaus.xfire.transport.http.XFireServletController.invoke(XFireServletController.java:304)
   at org.codehaus.xfire.transport.http.XFireServletController.doService(XFireServletController.java:129)
   at org.codehaus.xfire.transport.http.XFireServlet.doPost(XFireServlet.java:116)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:710)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
   at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
   at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
   at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
   at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
   at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
   at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
   at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
   at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
   at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
   at java.lang.Thread.run(Thread.java:619)

   Locked ownable synchronizers:
   - None

"http-8081-34" - Thread t...@70
java.lang.Thread.State: WAITING on org.apache.tomcat.util.net.jioendpoint$wor...@186f141
   at java.lang.Object.wait(Native Method)
   at java.lang.Object.wait(Object.java:485)
   at org.apache.tomcat.util.net.JIoEndpoint$Worker.await(JIoEndpoint.java:416)
RE: Thread dump analysis
> From: Pieter Temmerman [mailto:ptemmerman@sadiel.es]
> Subject: Thread dump analysis
>
> Memory usage looks healthy, but CPU usage goes sky high (mainly caused
> by the Java Tomcat process).

If you're truly out of memory, the GC thread(s) may be running almost
continuously. However, with modern JVMs (you didn't bother to tell us what
JVM you're using, nor what platform you're on), that's a pretty rare
occurrence; it's more likely an application thread that's looping. As Leon
suggested, look for threads in the runnable state and check the state of
the heap.

> org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:96)

This shows that the thread is indeed waiting for a DB connection. Note
that the connection pool here is managed by the webapp, not Tomcat; the
class names are different for Tomcat-managed pools.

> My first question is, what is a TP-Processor exactly? Is each client
> connection to Tomcat being assigned a TP-Processor or am I wrong?

You're not wrong; each request (not client) is handled by a TP-Processor
thread. When processing for that request completes, the thread goes back
to the thread pool to wait for another request from any client.

> So my guess is that the freeze is being caused by a database
> connection pool that is out of open connections

You seem to have two symptoms: high CPU usage and running out of DB
connections. It is quite likely the two are related; perhaps the webapp
gets into an infinite loop and therefore never releases DB connections
back to the pool.

> Note the line "locked <0x2aabadaabff8>" and "waiting on
> <0x2aabadaabff8>" later on. So first it's locking that
> "thing" and then it's waiting on that "thing".

That "thing" is just a plain java.lang.Object, which is serving here as a
synchronization point. Java synchronization semantics require that an
object be locked before it can be waited on; the wait() releases the lock
until some other thread calls notify() on that object.

> Is that normal behavior?

Yes.

> Is there any way to know what the 0x2aabadaabff8 means?

It's the virtual address of the object being waited on. It's not terribly
useful other than as an identifier for the object at a given point in
time; the object can move during a garbage collection.

You really should move up to something a bit newer than 5.5.7; it's over
four years old and lots of bug fixes have gone into the 5.5 line since
then. 5.5.27 is the current level for that leg.

 - Chuck
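Chuck's point about the lock-then-wait pattern can be seen in a few lines of standalone Java. This is a sketch only, using a plain Object as a stand-in for the GenericObjectPool in the dump (nothing here is from the actual webapp): a thread must hold an object's monitor to call wait(), and wait() releases that monitor so another thread can lock the same object and call notify().

```java
public class WaitNotifyDemo {
    // Stand-in for the GenericObjectPool object at <0x2aabadaabff8>.
    private static final Object pool = new Object();

    public static void main(String[] args) throws InterruptedException {
        Thread borrower = new Thread(() -> {
            synchronized (pool) {            // shows up as "locked <0x...>" in a dump
                try {
                    System.out.println("borrower: no connection free, waiting");
                    pool.wait();             // releases the monitor: "waiting on <0x...>"
                    System.out.println("borrower: notified, monitor re-acquired");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        borrower.start();
        Thread.sleep(100);                   // give the borrower time to reach wait()
        synchronized (pool) {                // only possible because wait() released the lock
            System.out.println("returner: returning a connection");
            pool.notify();                   // wakes the borrower
        }
        borrower.join();
    }
}
```

A thread dump taken while the borrower is parked in pool.wait() would show exactly the pair Pieter found: the same object both "locked" (in the frame that entered the synchronized block) and "waiting on" (in the wait() frame).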
Re: Thread dump analysis
Have you found any java.lang.Thread.State: RUNNABLE threads? They are
usually more interesting when it comes to high CPU :-) Also, as David
posted, what is the heap usage? It's usually at the end of the dump.

regards
Leon

On Mon, Jan 26, 2009 at 5:37 PM, Pieter Temmerman wrote:
> Hi all.
>
> I've been investigating why one of our applications (running in Tomcat
> 5.5.7) suddenly freezes after a variable amount of time (sometimes
> 10 minutes, sometimes 2 hours).
>
> Disclaimer: I'm not the developer of the application, nor do I know the
> exact details of how stuff is implemented. I know... it sucks.
>
> Memory usage looks healthy, but CPU usage goes sky high (mainly caused
> by the Java Tomcat process).
> So I made a thread dump, and the first thing I noticed was the large
> number of TP-ProcessorXX threads, most of them in WAITING state.
>
> A small snippet of the thread dump (it's very, very big):
>
> "TP-Processor290" daemon prio=1 tid=0x2aaab47acd30 nid=0x6486 in
> Object.wait() [0x56ae4000..0x56ae6e10]
>    at java.lang.Object.wait(Native Method)
>    - waiting on <0x2aabadaabff8> (a
> org.apache.commons.pool.impl.GenericObjectPool)
>    at java.lang.Object.wait(Object.java:474)
>    at
> org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:810)
>    - locked <0x2aabadaabff8> (a
> org.apache.commons.pool.impl.GenericObjectPool)
>    at
> org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:96)
>    at
> org.apache.commons.dbcp.BasicDataSource.getConnection(BasicDataSource.java:880)
>    at
> org.springframework.orm.hibernate3.LocalDataSourceConnectionProvider.getConnection(LocalDataSourceConnectionProvider.java:81)
>    at
> org.hibernate.jdbc.ConnectionManager.openConnection(ConnectionManager.java:423)
>    at
> org.hibernate.jdbc.ConnectionManager.getConnection(ConnectionManager.java:144)
>    at
> org.hibernate.jdbc.AbstractBatcher.prepareQueryStatement(AbstractBatcher.java:139)
>    at org.hibernate.loader.Loader.prepareQueryStatement(Loader.java:1547)
>    at org.hibernate.loader.Loader.doQuery(Loader.java:673)
>
> My first question is: what is a TP-Processor exactly? Is each client
> connection to Tomcat assigned a TP-Processor, or am I wrong?
>
> Anyway, back to the thread dump itself.
> There are a lot (read: 100+) of those TP-Processor threads in waiting
> state which mention org.hibernate.blablabla. So my guess is that the
> freeze is being caused by a database connection pool that is out of
> open connections, and thus those threads are waiting until a free one
> becomes available. But for some reason there never is a free one, and
> the application just won't work until Tomcat is restarted.
>
> Assuming that this is the reason for the application to hang, my
> "thread dump decoding knowledge" is too limited to be sure what is
> causing the situation. Is the connection pool just too small, or is
> the application not closing its connections and thus running out of
> them? I have no idea.
>
> Therefore, my second question.
>
> "TP-Processor290" daemon prio=1 tid=0x2aaab47acd30 nid=0x6486 in
> Object.wait() [0x56ae4000..0x56ae6e10]
>    at java.lang.Object.wait(Native Method)
>    - waiting on <0x2aabadaabff8> (a
> org.apache.commons.pool.impl.GenericObjectPool)
>    at java.lang.Object.wait(Object.java:474)
>    at
> org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:810)
>    - locked <0x2aabadaabff8> (a
> org.apache.commons.pool.impl.GenericObjectPool)
>
> Note the line "locked <0x2aabadaabff8>" and "waiting on
> <0x2aabadaabff8>" later on. So first it's locking that "thing" and
> then it's waiting on that "thing". The same number comes back in each
> TP-Processor that is in waiting state. That seems rather weird to me.
> So I was wondering:
> a. Is that normal behavior?
> b. Is there any way to know what 0x2aabadaabff8 means?
>
> My scientific calculator says it's a rather insane number when trying
> to convert it to decimal.
> Maybe it's just as easy as it reads: waiting on 0x0002aabad, which is
> a GenericObjectPool.
>
> Any help to confirm my supposition will be kindly appreciated.
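Pieter's leak theory is easy to reproduce in miniature. The sketch below uses a java.util.concurrent.Semaphore as a stand-in for the DBCP pool (none of this is the real application's code): once every permit has been borrowed and never returned, the next borrower blocks, which is the state the WAITING TP-Processor threads are parked in inside GenericObjectPool.borrowObject().

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class PoolExhaustionDemo {
    public static void main(String[] args) throws InterruptedException {
        Semaphore pool = new Semaphore(2);   // a tiny "connection pool" with 2 slots

        pool.acquire();                      // connection borrowed, never returned (leak)
        pool.acquire();                      // connection borrowed, never returned (leak)

        // Third borrower: with no timeout this would block forever, just like
        // the TP-Processor threads waiting in the thread dump. Here we use a
        // timeout so the demo terminates.
        boolean got = pool.tryAcquire(200, TimeUnit.MILLISECONDS);
        System.out.println("third borrow succeeded: " + got);
    }
}
```

In the real pool the remedies are analogous: make the webapp return connections (close them in a finally block), or configure the pool's wait timeout (maxWait in commons-dbcp) so borrowers fail fast with an exception instead of hanging indefinitely.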
Re: Thread dump analysis
We spent weeks looking at similarly bizarre thread stack dumps.
Eventually it turned out to be a GC problem. The JVM will all of a sudden
stop large numbers of threads from running (or perhaps it stops one, but
that thread happens to be holding a heavily contended lock; the database
connection pool and log4j are common candidates).

Anyway, take a detailed look at your GC stats. I bet you will find that
the hangups coincide with full GC cycles.

In my experience, the thread dumps you get during one of these episodes
are not believable: they show wacky things like threads waiting on a lock
they already hold, multiple threads holding the same lock, and threads
waiting on mysterious "monitors" that appear nowhere else in the dump
(these, I concluded, are internal JVM monitors used by GC to block
threads).
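Both suggestions in this thread (check the heap, check the GC stats) can be acted on with -verbose:gc on the Tomcat JVM, or from inside the JVM via the standard java.lang.management API. This is a generic sketch, unrelated to the application being discussed:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapStats {
    public static void main(String[] args) {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = mem.getHeapMemoryUsage();
        // getMax() may return -1 if the maximum is undefined.
        System.out.printf("heap used=%dMB committed=%dMB max=%dMB%n",
                heap.getUsed() >> 20, heap.getCommitted() >> 20, heap.getMax() >> 20);

        // Per-collector totals; a large collection time on the full (old-gen)
        // collector is the signature of the GC pauses described above.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Correlating timestamps from this output (or from the -verbose:gc log) with the moments the application freezes would confirm or rule out the full-GC theory.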