Bugs item #617574, was opened at 2002-10-02 16:20
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=376685&aid=617574&group_id=22866

Category: JBossMX
Group: v3.2
Status: Open
Resolution: None
Priority: 7
Submitted By: Michael Bartmann (bartmann)
Assigned to: Scott M Stark (starksm)
Summary: Classloader deadlock

Initial Comment:
We have, for the third time in several weeks, experienced a partial lockup of the JBoss server (some services responsive, others not). The bug is not deterministically reproducible for us, but this time we luckily had a debugger attached and drilled down to what seems to be a classloader deadlock.

This was under NT 4.0 (it happened before under W2000 as well). We used Branch_3_2 (checked out 12 hours before it went beta) under JDK1.4.0-b92.

It happened when many separate ear-scoped mbeans and dependent MDBs got deployed in a short time. Many of the mbeans are JMSProviders and the MDBs receive external messages almost immediately after startup, so they all try to load classes simultaneously.

Most of the threads were waiting for a lock at line 84 in loadClass() of HeirarchicalLoaderRepository2; only one thread was blocked in loadClass() of java.lang.ClassLoader.

The two threads which seem to have caused the deadlock were "Thread-47" (java.util.TimerThread) and "Thread Pool Worker-0" (EDU.oswego.blablaWorker), both children of the ThreadGroup "ASF Session Pool Threads".

===================================
"Thread-47" has the following trace:

loadClass() at line 84 of org.jboss.mx.loading.HeirarchicalLoaderRepository2, this=org.jboss.mx.loading.HeirarchicalLoaderRepository2@129c
...
loadClass() at line 262 of java.lang.ClassLoader, this=org.jboss.mx.loading.UnifiedClassLoader@1299
...
===================================
"Thread Pool Worker-0" has the following trace:

loadClass() at line 295 of java.lang.ClassLoader, this=org.jboss.mx.loading.UnifiedClassLoader@1299
...
loadClass() at line 88 of org.jboss.mx.loading.HeirarchicalLoaderRepository2, this=org.jboss.mx.loading.HeirarchicalLoaderRepository2@129c
...
===================================

...so the deadlock seems evident.

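For illustration, the pattern the initial comment describes (two threads taking the same two monitors in opposite order) can be sketched as follows. This is a minimal sketch and not the JBoss code: the class name LockOrderInversionDemo, the two plain lock objects standing in for the repository and the class loader instance, and the thread names are all hypothetical. It also assumes a much later JVM than the JDK 1.4.0 in the report, since it uses lambdas and ThreadMXBean.findMonitorDeadlockedThreads().

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class; it only mirrors the lock roles named in the report.
class LockOrderInversionDemo {

    // Starts a daemon thread that takes `first`, waits until its peer holds the
    // peer's own first lock, and then blocks while trying to take `second`.
    static Thread startLocker(String name, Object first, Object second,
                              CountDownLatch holdingFirst, CountDownLatch peerHoldingFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                holdingFirst.countDown();
                try {
                    peerHoldingFirst.await();
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // Not reached: the peer holds `second` and waits for `first`.
                }
            }
        }, name);
        t.setDaemon(true); // lets the JVM exit although the threads stay stuck
        t.start();
        return t;
    }

    // Returns true once the JVM reports both demo threads as monitor-deadlocked.
    static boolean deadlocks() {
        Object repositoryLock = new Object(); // stand-in for the repository monitor
        Object loaderLock = new Object();     // stand-in for the class loader monitor
        CountDownLatch a = new CountDownLatch(1);
        CountDownLatch b = new CountDownLatch(1);
        Thread t1 = startLocker("repository-then-loader", repositoryLock, loaderLock, a, b);
        Thread t2 = startLocker("loader-then-repository", loaderLock, repositoryLock, b, a);

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        try {
            for (int i = 0; i < 100; i++) { // poll for roughly five seconds
                long[] ids = mx.findMonitorDeadlockedThreads();
                if (ids != null && contains(ids, t1.getId()) && contains(ids, t2.getId())) {
                    return true;
                }
                Thread.sleep(50);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return false;
    }

    static boolean contains(long[] ids, long id) {
        for (long candidate : ids) {
            if (candidate == id) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("deadlock detected: " + deadlocks());
    }
}
```

The sketch only shows that opposite acquisition order is sufficient for a deadlock when both threads really block on each other's monitor; whether the two traces above are in that state is exactly what a full thread dump would have to confirm.
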
----------------------------------------------------------------------

>Comment By: Michael Bartmann (bartmann)
Date: 2002-10-02 21:42

Message:
Logged In: YES
user_id=69300

Please forgive me. I have been writing highly parallel code for years and still don't know how to generate a VM thread dump without the debugger. I'll have to find out how.

What we did instead: we walked through the JBuilder debug thread list and saved a stack trace of every single thread that had a loadClass anywhere in its stack trace (as bmp, I can see you rolling on the floor...).

But there is one thing about your argument that I don't understand: at least in what JBuilder showed as the source of java.lang.ClassLoader, there is a lock (synchronized...) in java.lang.ClassLoader too. So I saw two synchronized sections locking on two different ClassLoader instances, which are taken in opposite ("crossed over") order in the two threads. Shouldn't this deadlock?

----------------------------------------------------------------------

Comment By: Scott M Stark (starksm)
Date: 2002-10-02 20:37

Message:
Logged In: YES
user_id=175228

Can't you just provide the VM thread dump rather than having to run the server in a debugger? The thread dumps shown by the two images do not indicate to me that the threads are deadlocked. The Thread Pool Worker-0 thread is in ClassLoader.loadClass with the HeirarchicalLoaderRepository2 lock held, which will stop Thread-47 from entering the HeirarchicalLoaderRepository2, but loadClass will proceed. We need the full VM thread dump in general to look at deadlock issues.

The likely problem is inconsistent locking at the HeirarchicalLoaderRepository2 level, which would allow recursive calls into a HeirarchicalLoaderRepository2 by two different threads. Another change made in the 3.2 beta release that will affect this startup issue is that the TimedInstancePoolFeeder is no longer used to initialize the pool, because its start could not be synchronized well with the complete deployment start state.

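On the thread dump request: Sun JVMs of this era normally print a full dump to the console when Ctrl+Break is pressed in the server's console window on Windows, or when the process receives SIGQUIT (kill -QUIT) on Unix. On later JVMs (Java 5 and up, so not the JDK 1.4.0 used in this report) a similar listing can also be produced from code. A minimal sketch, with the hypothetical class name ThreadDumpSketch:

```java
import java.util.Map;

// Hypothetical helper; requires Java 5 or later for Thread.getAllStackTraces().
class ThreadDumpSketch {

    // Formats the name, state and stack frames of every live thread as text.
    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            sb.append('"').append(thread.getName()).append('"')
              .append(" state=").append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

Unlike the JVM's own dump, this listing does not say which monitors each thread holds or waits for, so the native dump remains the better input for deadlock analysis.
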
An interim workaround might be to simply remove all of the feeder config settings like the following:

   <feeder-policy>org.jboss.ejb.plugins.TimedInstancePoolFeeder</feeder-policy>
   <feeder-policy-conf>
      <increment>10</increment>
      <period>500</period>
   </feeder-policy-conf>

----------------------------------------------------------------------

Comment By: Michael Bartmann (bartmann)
Date: 2002-10-02 18:10

Message:
Logged In: YES
user_id=69300

Oops, duplicates... SourceForge didn't like my first post, I lost my text and retried. So simply ignore one of my previous two comments.

----------------------------------------------------------------------

Comment By: Michael Bartmann (bartmann)
Date: 2002-10-02 18:05

Message:
Logged In: YES
user_id=69300

Hi Adrian,

you are right, only one of the threads goes through UCL2. The other one (originating in java.util.TimerThread) gets there right through the guts of some Sun classes.

I'll append stack traces in the (sorry for that) format of zipped screenshots of the JBuilder debugger GUI; there was no way to save them as text :-(

Regards,
Michael

----------------------------------------------------------------------

Comment By: Michael Bartmann (bartmann)
Date: 2002-10-02 17:58

Message:
Logged In: YES
user_id=69300

Hi Adrian,

you are right, only one of the threads (the one from the MBean container) goes through a UCL2. The other one is an offspring of java.util.TimerThread and gets there right through the guts of some Sun classes.

I have detailed stack traces as JBuilder screenshot bitmaps, which I will append here (one at a time; I don't know how to append more than one file through the SourceForge bug tracker).

Enjoy,
Michael

----------------------------------------------------------------------

Comment By: Adrian Brock (ejort)
Date: 2002-10-02 17:40

Message:
Logged In: YES
user_id=9459

Hi Michael,

Does this appear in your stacktrace?

package org.jboss.jms.asf;
...
public class ServerSessionPoolLoader
...
   protected void startService() throws Exception
   {
      XidFactoryMBean xidFactoryObj = (XidFactoryMBean) getServer().getAttribute(xidFactory, "Instance");
      Class cls = Class.forName(poolFactoryClass);

I've seen stack traces where Class.forName goes straight through loadClassInternal() with the known synchronization bug. Shouldn't this use the context cl, a UCL2?

Regards,
Adrian

----------------------------------------------------------------------

_______________________________________________
Jboss-development mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/jboss-development
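
Adrian's question about using the context class loader instead of the one-argument Class.forName can be sketched as follows. This is a minimal sketch under assumptions: the class ContextClassLoading and the method loadWithContextLoader are hypothetical names, not part of JBoss, and the thread does not establish whether this change avoids the synchronization problem being discussed. The only API relied on is the standard three-argument Class.forName(String, boolean, ClassLoader) and Thread.getContextClassLoader().

```java
// Hypothetical helper; not taken from the JBoss sources.
class ContextClassLoading {

    // Loads and initializes `className` through the calling thread's context
    // class loader, falling back to this class's own loader if none is set.
    static Class<?> loadWithContextLoader(String className) throws ClassNotFoundException {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ContextClassLoading.class.getClassLoader();
        }
        return Class.forName(className, true, loader);
    }

    public static void main(String[] args) throws ClassNotFoundException {
        Class<?> cls = loadWithContextLoader("java.util.ArrayList");
        System.out.println(cls.getName());
    }
}
```

In the startService() excerpt this would correspond to passing poolFactoryClass to such a helper, so that the lookup starts from the deployment's loader rather than from the loader of the calling class.
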