Hi Leon,
Thanks for your tip about the NPTL threads. I tried upgrading to Linux
2.6.8 and glibc 2.3.5, but the problem remains :(
I've studied a number of stack traces and the locking seems fine - there
is no evidence of a cyclical deadlock. The problem is that a whole lot of
threads (including the ones holding the locks) are listed as 'waiting
for monitor' but I can't see why. I saw Charles mention in another
thread something about the monitor disappearing in native code and not
returning, and thought this might have something to do with it.
I've included a couple of stack traces of the threads holding the
problematic locks (on the PersistenceManagerFactoryImpl).
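[Editor's note: a minimal, self-contained sketch, not part of the original mails, of how the JVM's own ThreadMXBean (available since JDK 1.5, also reachable remotely over JMX) can confirm or rule out a cyclical monitor deadlock instead of reading dumps by hand. The two threads and lock objects here are illustrative only:]

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class DeadlockProbe {
    public static void main(String[] args) throws Exception {
        final Object a = new Object();
        final Object b = new Object();
        // Two threads take the same pair of monitors in opposite order -
        // the classic cyclical deadlock the thread dumps are being checked for.
        Thread t1 = new Thread(new Runnable() {
            public void run() {
                synchronized (a) {
                    pause(100);
                    synchronized (b) { }
                }
            }
        });
        Thread t2 = new Thread(new Runnable() {
            public void run() {
                synchronized (b) {
                    pause(100);
                    synchronized (a) { }
                }
            }
        });
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();
        Thread.sleep(500); // give both threads time to block on each other
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Returns null when no thread is part of a monitor cycle,
        // otherwise the ids of every deadlocked thread.
        long[] ids = mx.findMonitorDeadlockedThreads();
        System.out.println(ids == null ? 0 : ids.length);
    }

    static void pause(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException e) { }
    }
}
```

If this returns null on the hung server, the 'waiting for monitor' threads are blocked but not in a cycle, which matches the "locking seems fine" reading above.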
Thanks again,
Roger
"TP-Processor21" daemon prio=1 tid=0x08534250 nid=0x56b8 waiting for monitor entry [0xbabfe000..0xbabff7d4]
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:620)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:124)
    at org.apache.catalina.loader.WebappClassLoader.findClassInternal(WebappClassLoader.java:1812)
    - locked <0x5b031c00> (a org.apache.catalina.loader.WebappClassLoader)
    at org.apache.catalina.loader.WebappClassLoader.findClass(WebappClassLoader.java:866)
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1319)
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1198)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
    - locked <0x5b031c00> (a org.apache.catalina.loader.WebappClassLoader)
    at org.postgresql.jdbc2.AbstractJdbc2Connection.executeTransactionCommand(AbstractJdbc2Connection.java:625)
    at org.postgresql.jdbc2.AbstractJdbc2Connection.commit(AbstractJdbc2Connection.java:645)
    at org.apache.commons.dbcp.DelegatingConnection.commit(DelegatingConnection.java:238)
    at org.apache.commons.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.commit(PoolingDataSource.java:199)
    at org.jpox.store.rdbms.RDBMSManager$MgmtTransaction.execute(RDBMSManager.java:1872)
    at org.jpox.store.rdbms.RDBMSManager.addClasses(RDBMSManager.java:553)
    at org.jpox.store.StoreManager.initialiseAutoStart(StoreManager.java:400)
    at org.jpox.store.rdbms.RDBMSManager.initialiseSchema(RDBMSManager.java:472)
    at org.jpox.store.rdbms.RDBMSManager.<init>(RDBMSManager.java:237)
    at org.jpox.store.rdbms.RDBMSManagerFactory.getStoreManager(RDBMSManagerFactory.java:59)
    - locked <0x6c16a598> (a java.lang.Class)
    at org.jpox.AbstractPersistenceManager.<init>(AbstractPersistenceManager.java:221)
    at org.jpox.PersistenceManagerImpl.<init>(PersistenceManagerImpl.java:34)
    at org.jpox.PersistenceManagerFactoryImpl.getPersistenceManager(PersistenceManagerFactoryImpl.java:887)
    - locked <0x5b0e1900> (a org.jpox.PersistenceManagerFactoryImpl)
    at org.jpox.PersistenceManagerFactoryImpl.getPersistenceManager(PersistenceManagerFactoryImpl.java:862)
    - locked <0x5b0e1900> (a org.jpox.PersistenceManagerFactoryImpl)
    at net.ninthave.multisites.core.DAO.<init>(DAO.java:82)
    at net.ninthave.multisites.filters.RequestFilter.doFilter(RequestFilter.java:134)
"TP-Processor24" daemon prio=1 tid=0x08117d10 nid=0x3380 waiting for monitor entry [0xbb9fe000..0xbb9ff854]
    at java.lang.String.intern(Native Method)
    at com.sun.org.apache.xerces.internal.util.SymbolTable$Entry.<init>(SymbolTable.java:318)
    at com.sun.org.apache.xerces.internal.util.SymbolTable.addSymbol(SymbolTable.java:152)
    at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.getAttributeTypeName(XMLDTDValidator.java:1818)
    at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.addDTDDefaultAttrsAndValidate(XMLDTDValidator.java:1198)
    at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.handleStartElement(XMLDTDValidator.java:1971)
    at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.emptyElement(XMLDTDValidator.java:816)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(XMLDocumentFragmentScannerImpl.java:872)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(XMLDocumentFragmentScannerImpl.java:1693)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:368)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:834)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:764)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:148)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1242)
    at javax.xml.parsers.SAXParser.parse(SAXParser.java:375)
    at javax.xml.parsers.SAXParser.parse(SAXParser.java:176)
    at org.jpox.metadata.MetaDataParser.parseMetaData(MetaDataParser.java:152)
    at org.jpox.metadata.MetaDataParser.parseMetaData(MetaDataParser.java:106)
    at org.jpox.metadata.MetaDataManager.parseFile(MetaDataManager.java:428)
    at org.jpox.metadata.MetaDataManager.getMetaDataForClassOrInterface(MetaDataManager.java:351)
    at org.jpox.metadata.MetaDataManager.getMetaDataForClass(MetaDataManager.java:278)
    at org.jpox.metadata.MetaDataManager.getMetaDataForClass(MetaDataManager.java:263)
    at org.jpox.store.rdbms.RDBMSManager$ClassAdder.getReferencedClasses(RDBMSManager.java:2314)
    at org.jpox.store.rdbms.RDBMSManager$ClassAdder.addClassTables(RDBMSManager.java:2096)
    at org.jpox.store.rdbms.RDBMSManager$ClassAdder.addClassTablesAndValidate(RDBMSManager.java:2378)
    at org.jpox.store.rdbms.RDBMSManager$ClassAdder.execute(RDBMSManager.java:2044)
    - locked <0x5cead8f8> (a org.jpox.store.rdbms.RDBMSManager)
    at org.jpox.store.rdbms.RDBMSManager$MgmtTransaction.execute(RDBMSManager.java:1914)
    at org.jpox.store.rdbms.RDBMSManager.addClasses(RDBMSManager.java:492)
    at org.jpox.store.rdbms.RDBMSManager.addClass(RDBMSManager.java:504)
    at org.jpox.store.StoreManager.initialiseAutoStart(StoreManager.java:354)
    at org.jpox.store.rdbms.RDBMSManager.initialiseSchema(RDBMSManager.java:402)
    at org.jpox.store.rdbms.RDBMSManager.<init>(RDBMSManager.java:247)
    at org.jpox.store.rdbms.RDBMSManagerFactory.getStoreManager(RDBMSManagerFactory.java:59)
    - locked <0x6d467600> (a java.lang.Class)
    at org.jpox.AbstractPersistenceManager.<init>(AbstractPersistenceManager.java:192)
    at org.jpox.PersistenceManagerImpl.<init>(PersistenceManagerImpl.java:34)
    at org.jpox.PersistenceManagerFactoryImpl.getPersistenceManager(PersistenceManagerFactoryImpl.java:675)
    - locked <0x5b983f98> (a org.jpox.PersistenceManagerFactoryImpl)
    at org.jpox.PersistenceManagerFactoryImpl.getPersistenceManager(PersistenceManagerFactoryImpl.java:650)
    - locked <0x5b983f98> (a org.jpox.PersistenceManagerFactoryImpl)
    at net.ninthave.multisites.core.DAO.<init>(DAO.java:82)
    at net.ninthave.multisites.filters.RequestFilter.doFilter(RequestFilter.java:137)
Leon Rosenberg wrote:
This is called thread hi-jacking :-)
However, without knowing anything about your jdo implementation I can't say if

    at org.jpox.PersistenceManagerFactoryImpl.getPersistenceManager(PersistenceManagerFactoryImpl.java:645)
    - waiting to lock <0x5b775de0> (a org.jpox.PersistenceManagerFactoryImpl)

and

"TP-Processor17" daemon prio=1 tid=0x082df898 nid=0x3379 waiting for monitor entry [0xbc7fd000..0xbc7ff4d4]
    at org.jpox.store.expression.TableExprAsJoins.referenceColumn(TableExprAsJoins.java:65)

is normal behaviour or an indication of a deadlock. What I can say
is that the 2.4.x kernel could be a problem if you have more than 800
threads.
Do you have NPTL or the standard linux threading? I don't know your
linux distro, but most 2.4.x kernels came out without NPTL support; is
updating to 2.6.x an option for you?
regards
leon
On 3/17/06, Roger Keays <[EMAIL PROTECTED]> wrote:
Hi Leon,
Recently I've noticed similar deadlocks on my tomcat server. In my case,
the server just hangs with no warning even though there is plenty of
memory available. I tried the following to locate the problem, but it
has all been in vain:
* upgrade jdk from 1.5.0.1 to 1.5.0.6
* upgrade tomcat from 5.5.7 to 5.5.16
* probing with JMX
* rebooting
* running the server as root
* increasing the AJP thread maximums
* increasing the max postgresql connections
* removing the HTTP connector and only using AJP
* removing the AJP connector and only using HTTP
When the deadlock occurs, I cannot even probe the JVM with JMX. It is
fairly repeatable though - all I have to do is load about a dozen pages
at once (presumably this is about 80 - 100 GET requests).
I also noticed by running 'top' that a single java process sits on top
of the list, while in normal operation there are a few java processes
running at the top of the list. When I saw your post, I tried sending
SIGQUIT and got a bunch of stack traces like those attached below. Unlike
your case though, I get practically no warning in the catalina log of
any potential problems.
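[Editor's note: the programmatic equivalent of the SIGQUIT (kill -3) thread dump is Thread.getAllStackTraces(), available since JDK 1.5. A minimal sketch, not part of the original mails, that prints each live thread's name, state and frames in roughly the dump format quoted below:]

```java
import java.util.Map;

public class StackDumper {
    public static void main(String[] args) {
        // Snapshot every live thread, as a SIGQUIT dump would, and print
        // a "name / state / frames" listing for each one.
        for (Map.Entry<Thread, StackTraceElement[]> e :
                Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            System.out.println("\"" + t.getName() + "\" daemon=" + t.isDaemon()
                    + " state=" + t.getState());
            for (StackTraceElement frame : e.getValue()) {
                System.out.println("    at " + frame);
            }
        }
    }
}
```

This could be wired to an admin servlet so a dump can be taken even when JMX itself is unreachable, as described above.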
My environment is
* Linux version 2.4.27-2-686
* jdk 1.5.0.6
* tomcat 5.5.16
* jvm args -Xmx256m
* glibc 2.3.2
Any suggestions on this one would be helpful.
Thanks!
Roger
Begin stacktrace snippets:
"TP-Processor14" daemon prio=1 tid=0x08402b38 nid=0x3376 waiting for monitor entry [0xbcdff000..0xbcdff754]
    at org.jpox.PersistenceManagerFactoryImpl.getPersistenceManager(PersistenceManagerFactoryImpl.java:645)
    - waiting to lock <0x5b775de0> (a org.jpox.PersistenceManagerFactoryImpl)
    at net.ninthave.multisites.core.DAO.<init>(DAO.java:82)
    at net.ninthave.multisites.filters.RequestFilter.doFilter(RequestFilter.java:137)
"TP-Processor17" daemon prio=1 tid=0x082df898 nid=0x3379 waiting for monitor entry [0xbc7fd000..0xbc7ff4d4]
    at org.jpox.store.expression.TableExprAsJoins.referenceColumn(TableExprAsJoins.java:65)
    at org.jpox.store.query.QueryStatement$QueryExpression.toString(QueryStatement.java:1147)
    at org.jpox.store.query.QueryStatement.selectQueryExpression(QueryStatement.java:357)
    - locked <0x5db8db60> (a org.jpox.store.query.QueryStatement)
    at org.jpox.store.query.QueryStatement.select(QueryStatement.java:261)
    - locked <0x5db8db60> (a org.jpox.store.query.QueryStatement)
    at org.jpox.store.query.QueryStatement.select(QueryStatement.java:274)
    - locked <0x5db8db60> (a org.jpox.store.query.QueryStatement)
    at org.jpox.store.query.QueryStatement.select(QueryStatement.java:287)
    - locked <0x5db8db60> (a org.jpox.store.query.QueryStatement)
    at org.jpox.store.mapping.Mappings.selectMapping(Mappings.java:50)
    at org.jpox.store.rdbms.extent.ClassTableExtent.newResultObjectFactory(ClassTableExtent.java:254)
    at org.jpox.store.query.JDOQLQuery.compile(JDOQLQuery.java:615)
    at org.jpox.store.query.JDOQLQuery.performExecute(JDOQLQuery.java:654)
    at org.jpox.store.query.Query.executeWithMap(Query.java:868)
    at org.jpox.store.query.Query.executeWithArray(Query.java:848)
    at org.jpox.store.query.Query.execute(Query.java:793)
    at cashflowclub.Helper.getNextEvent(Helper.java:44)
"TP-Processor18" daemon prio=1 tid=0x084e9a90 nid=0x337a runnable [0xbc5ff000..0xbc5ff554]
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
    - locked <0x5bca9a08> (a java.io.BufferedInputStream)
    at org.apache.jk.common.ChannelSocket.read(ChannelSocket.java:607)
    at org.apache.jk.common.ChannelSocket.receive(ChannelSocket.java:545)
    at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:672)
    at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:876)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
    at java.lang.Thread.run(Thread.java:595)
Leon Rosenberg wrote:
Hi,
I analyzed my thread dump and log files from yesterday's crash a bit.
Here is the flow of events:
First, the deadlock probably occurred. The following entry in the log
files is an indication of it:
Mar 14, 2006 8:53:03 PM org.apache.tomcat.util.threads.ThreadPool logFull
SEVERE: All threads (750) are currently busy, waiting. Increase
maxThreads (750) or check the servlet status
After that, a series of "can't happen / shouldn't happen" exceptions fills the log files:
ApplicationDispatcher[]: Servlet.service() for servlet jsp threw exception
javax.servlet.jsp.JspException: No bean found under attribute key result
StandardWrapperValve[controller]: Servlet.service() for servlet
controller threw exception
javax.servlet.jsp.JspException: No bean found under attribute key result
javax.servlet.jsp.JspException: No bean found under attribute key result
[EMAIL PROTECTED]: Exception
Processing ErrorPage[errorCode=500, location=/down/500.html]
ClientAbortException: java.net.SocketException: Connection reset
ApplicationDispatcher[]: Servlet.service() for servlet jsp threw exception
javax.servlet.jsp.JspException: No bean found under attribute key result
[EMAIL PROTECTED]: Exception
Processing ErrorPage[errorCode=500, location=/down/500.html]
ClientAbortException: java.net.SocketException: Connection reset
ApplicationDispatcher[]: Servlet.service() for servlet jsp threw exception
[EMAIL PROTECTED]: Exception
Processing ErrorPage[errorCode=500, location=/down/500.html]
ClientAbortException: java.net.SocketException: Connection reset
[EMAIL PROTECTED]: Exception
Processing ErrorPage[errorCode=500, location=/down/500.html]
ClientAbortException: java.net.SocketException: Connection reset
ApplicationDispatcher[]: Servlet.service() for servlet jsp threw exception
[EMAIL PROTECTED]: Exception
Processing ErrorPage[errorCode=500, location=/down/500.html]
ClientAbortException: java.net.SocketException: Connection reset
[EMAIL PROTECTED]: Exception
Processing ErrorPage[errorCode=500, location=/down/500.html]
ClientAbortException: java.net.SocketException: Broken pipe
[EMAIL PROTECTED]: Exception
Processing ErrorPage[errorCode=500, location=/down/500.html]
ClientAbortException: java.net.SocketException: Connection reset
[EMAIL PROTECTED]: Exception
Processing ErrorPage[errorCode=500, location=/down/500.html]
ClientAbortException: java.net.SocketException: Broken pipe
[EMAIL PROTECTED]: Exception
Processing ErrorPage[errorCode=500, location=/down/500.html]
ClientAbortException: java.net.SocketException: Connection reset
ApplicationDispatcher[]: Servlet.service() for servlet jsp threw exception
javax.servlet.jsp.JspException: No bean found under attribute key userRights
StandardWrapperValve[controller]: Servlet.service() for servlet
controller threw exception
javax.servlet.jsp.JspException: No bean found under attribute key userRights
javax.servlet.jsp.JspException: No bean found under attribute key userRights
[EMAIL PROTECTED]: Exception
Processing ErrorPage[errorCode=500, location=/down/500.html]
ClientAbortException: java.net.SocketException: Connection reset
ApplicationDispatcher[]: Servlet.service() for servlet jsp threw exception
java.lang.IllegalStateException: getAttribute: Session already invalidated
ApplicationDispatcher[]: Servlet.service() for servlet controller
threw exception
java.lang.IllegalStateException: getAttribute: Session already invalidated
StandardWrapperValve[index]: Servlet.service() for servlet index threw exception
java.lang.IllegalStateException: getAttribute: Session already invalidated
java.lang.IllegalStateException: getAttribute: Session already invalidated
[EMAIL PROTECTED]: Exception
Processing ErrorPage[errorCode=500, location=/down/500.html]
ClientAbortException: java.net.SocketException: Connection reset
Note: all three beans "result", "user" and "userRights" can't be
missing in the jsp, because they are added to the request in the
preprocessing phase or immediately before the forward to the jsp
happens. We also never see any of the above exceptions in catalina.out
for a normally running server.
Some time later:
ApplicationDispatcher[]: Servlet.service() for servlet jsp threw exception
java.lang.OutOfMemoryError: Java heap space
StandardWrapperValve[controller]: Servlet.service() for servlet
controller threw exception
java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
[EMAIL PROTECTED]: Exception
Processing ErrorPage[errorCode=404, location=/down/404.html]
ClientAbortException: java.net.SocketException: Connection reset
The out of memory error is repeated 41 times.
After that the cpu load went down to zero and the server stopped
responding (no wonder, without free threads or memory).
The thread dump (can be submitted if someone is interested) contains
678 threads waiting on 9 JspServletWrapper objects.
The nine look like:
"http-8580-Processor16777" daemon prio=1 tid=0xc40b5890 nid=0x236f waiting for monitor entry [0xc13fe000..0xc13ff840]
    at org.apache.naming.resources.ProxyDirContext.cacheLoad(ProxyDirContext.java:1552)
    - waiting to lock <0x61655498> (a org.apache.naming.resources.ResourceCache)
    at org.apache.naming.resources.ProxyDirContext.cacheLookup(ProxyDirContext.java:1398)
    at org.apache.naming.resources.ProxyDirContext.lookup(ProxyDirContext.java:279)
    at org.apache.naming.resources.DirContextURLConnection.getInputStream(DirContextURLConnection.java:301)
    at org.apache.jasper.compiler.Compiler.isOutDated(Compiler.java:569)
    at org.apache.jasper.compiler.Compiler.isOutDated(Compiler.java:492)
    at org.apache.jasper.JspCompilationContext.compile(JspCompilationContext.java:509)
    at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:274)
    - locked <0x61df86c0> (a org.apache.jasper.servlet.JspServletWrapper)
    at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:292)
    at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:236)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:810)
The 678 look like:
"http-8580-Processor17757" daemon prio=1 tid=0x0a448538 nid=0x2779 waiting for monitor entry [0xd30fe000..0xd30ff740]
    at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:270)
    - waiting to lock <0x61df86c0> (a org.apache.jasper.servlet.JspServletWrapper)
    at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:292)
    at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:236)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:810)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:237)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:157)
    at org.apache.catalina.core.ApplicationDispatcher.invoke(ApplicationDispatcher.java:703)
    at org.apache.catalina.core.ApplicationDispatcher.doInclude(ApplicationDispatcher.java:589)
    at org.apache.catalina.core.ApplicationDispatcher.include(ApplicationDispatcher.java:499)
    at org.apache.jasper.runtime.JspRuntimeLibrary.include(JspRuntimeLibrary.java:966)
    at org.apache.jasper.runtime.PageContextImpl.include(PageContextImpl.java:581)
The 678 threads aren't all waiting for a single lock; they are
separated into groups, the largest of 200 threads, the smallest of 23.
I can't find a "locked <0x61655498>" entry anywhere.
But I should be able to see it to recognize the thread causing the
deadlock, shouldn't I?
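[Editor's note: not necessarily - a text dump can show a thread "waiting to lock" a monitor without any matching "- locked" line, e.g. when the owner acquired it inside native code. The ThreadMXBean can still name the owner. A minimal sketch, not from the original mails, with hypothetical thread names "holder" and "blocked":]

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class LockOwnerProbe {
    public static void main(String[] args) throws Exception {
        final Object lock = new Object();
        // "holder" grabs the monitor and keeps it for a while.
        Thread holder = new Thread(new Runnable() {
            public void run() {
                synchronized (lock) {
                    try { Thread.sleep(5000); } catch (InterruptedException e) { }
                }
            }
        }, "holder");
        holder.setDaemon(true);
        holder.start();
        Thread.sleep(100); // let "holder" acquire the monitor first

        // "blocked" then tries to take the same monitor and blocks.
        Thread blocked = new Thread(new Runnable() {
            public void run() { synchronized (lock) { } }
        }, "blocked");
        blocked.setDaemon(true);
        blocked.start();
        Thread.sleep(200); // let "blocked" reach the BLOCKED state

        // Even when the text dump omits the "- locked" line, ThreadInfo
        // reports which thread owns the monitor this thread is blocked on.
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        ThreadInfo info = mx.getThreadInfo(blocked.getId());
        System.out.println(info.getThreadState()
                + " owner=" + info.getLockOwnerName());
    }
}
```

Run against the hung server, getThreadInfo on one of the 678 waiters would name the owner of <0x61655498> directly, instead of searching the dump for a "locked" line that may never have been printed.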
Any ideas, anyone?
regards
Leon
--
----------------------------------------
Ninth Avenue Software
p: +61 7 3870 8494 (UTC +10)
f: +61 7 3870 8491
w: http://www.ninthavenue.com.au
e: [EMAIL PROTECTED]
----------------------------------------
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]