Bugs item #617574, was opened at 2002-10-02 07:20
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=376685&aid=617574&group_id=22866
Category: JBossMX
Group: v3.2
>Status: Pending
>Resolution: Fixed
Priority: 7
Submitted By: Michael Bartmann (bartmann)
Assigned to: Scott M Stark (starksm)
Summary: Classloader deadlock
Initial Comment:
For the third time in several weeks we have experienced a partial
lockup of the JBoss server (some services responsive, others not).
The bug is not deterministically reproducible for us.
But this time we luckily had a debugger attached and drilled down
to what seems to be a classloader deadlock.
This was under NT4.0 (it also happened before under Windows 2000).
We used Branch_3_2 (checked out 12 hours before it went beta)
under JDK1.4.0-b92.
It happened when many separate ear-scoped mbeans
and dependent MDBs got deployed in a short time.
Many of the mbeans are JMSProviders and the MDBs
receive external messages almost immediately after
startup, so they all try to load classes simultaneously.
Most of the threads were waiting for a lock at line 84
in loadClass() of HeirarchicalLoaderRepository2; only one thread
was blocked in loadClass() of java.lang.ClassLoader.
The two threads which seem to have caused the deadlock were
"Thread-47" (java.util.TimerThread) and
"Thread Pool Worker-0" (EDU.oswego.blablaWorker),
both children of the ThreadGroup "ASF Session Pool Threads".
===================================
"Thread-47" has the following trace:
loadClass() at line 84 of
org.jboss.mx.loading.HeirarchicalLoaderRepository2,
this=org.jboss.mx.loading.HeirarchicalLoaderRepository2@129c
...
loadClass() at line 262 of java.lang.ClassLoader,
this=org.jboss.mx.loading.UnifiedClassLoader@1299
...
===================================
"Thread Pool Worker-0" has the following trace:
loadClass() at line 295 of java.lang.ClassLoader,
this=org.jboss.mx.loading.UnifiedClassLoader@1299
...
loadClass() at line 88 of
org.jboss.mx.loading.HeirarchicalLoaderRepository2,
this=org.jboss.mx.loading.HeirarchicalLoaderRepository2@129c
...
===================================
...so the deadlock seems evident.
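The two traces above describe a lock-order inversion: one thread holds the UnifiedClassLoader monitor and waits for the repository, the other holds the repository and waits for the class loader. As a minimal sketch of that pattern only, and not the JBoss code itself, the following uses two plain Object monitors as hypothetical stand-ins for the two instances. The class and thread names are invented for illustration, and the detection call relies on java.lang.management, which is only available from Java 5 onward (so not on the JDK 1.4.0 used in this report).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class LockOrderInversionSketch {

    // Takes 'first', waits until the other thread also holds its first lock,
    // then blocks trying to take 'second'.
    private static void lockInOrder(Object first, Object second, CountDownLatch bothHoldFirst) {
        synchronized (first) {
            bothHoldFirst.countDown();
            try {
                bothHoldFirst.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) {
                // Not reached once both threads hold their first lock.
            }
        }
    }

    // Starts two daemon threads that take the two monitors in opposite order
    // and returns how many of them the JVM reports as monitor-deadlocked.
    public static int provokeAndCount() {
        final Object repositoryLock = new Object(); // stand-in for the repository instance
        final Object loaderLock = new Object();     // stand-in for the class loader instance
        final CountDownLatch bothHoldFirst = new CountDownLatch(2);

        // Same order as "Thread-47": class loader first, then repository.
        Thread timer = new Thread(
            () -> lockInOrder(loaderLock, repositoryLock, bothHoldFirst), "sketch-timer");
        // Same order as "Thread Pool Worker-0": repository first, then class loader.
        Thread worker = new Thread(
            () -> lockInOrder(repositoryLock, loaderLock, bothHoldFirst), "sketch-worker");
        timer.setDaemon(true);   // daemon threads, so the JVM can still exit
        worker.setDaemon(true);
        timer.start();
        worker.start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        try {
            for (int attempt = 0; attempt < 100; attempt++) {
                long[] ids = mx.findMonitorDeadlockedThreads(); // null when no deadlock is found
                if (ids != null) {
                    int count = 0;
                    for (long id : ids) {
                        if (id == timer.getId() || id == worker.getId()) {
                            count++;
                        }
                    }
                    if (count == 2) {
                        return count;
                    }
                }
                Thread.sleep(50);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked sketch threads: " + provokeAndCount());
    }
}
```

Taking both monitors in the same order in every thread removes the cycle in this sketch; whether that maps cleanly onto the repository and class loader locking is a separate question.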
----------------------------------------------------------------------
>Comment By: Scott M Stark (starksm)
Date: 2002-10-03 13:59
Message:
Logged In: YES
user_id=175228
See the changes added today to address this issue. I have
not been able to come up with a testcase that reproduces
the original deadlock, so completion of the fix is waiting
on that.
----------------------------------------------------------------------
Comment By: Scott M Stark (starksm)
Date: 2002-10-02 12:58
Message:
Logged In: YES
user_id=175228
Check out the VM docs for your platform on how to generate
a thread dump. It's Ctrl-Break on Windows, and Ctrl-\ or SIGQUIT
on most Unix-like systems for the Sun-based VMs.
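Where a console key combination or signal is not practical, a rough thread dump can also be produced from inside the VM. The following is only a sketch under a stated assumption: it uses Thread.getAllStackTraces(), which exists from Java 5 onward and is therefore not available on the JDK 1.4.0 discussed in this report. The class name is hypothetical, and the output lacks the lock ownership details that a real VM thread dump includes.

```java
import java.util.Map;

public class ThreadDumpSketch {

    // Builds a text dump with the name, state and stack frames of every live thread.
    public static String dump() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            out.append('"').append(thread.getName()).append('"')
               .append(" state=").append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```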
OK, I see what you are referring to regarding the loadClass
calls to UCL@1299. The call from Thread-47 does not have
a top-level call through a UnifiedClassLoader2 at any point,
and this violates the use condition for the
UnifiedClassLoader entering loadClass. I'll look into that.
----------------------------------------------------------------------
Comment By: Michael Bartmann (bartmann)
Date: 2002-10-02 12:42
Message:
Logged In: YES
user_id=69300
Please forgive me. I have written highly parallel code for years and
still don't know how to generate a VM thread dump without the
debugger. I'll have to find out how.
What we did instead is:
we walked through the JBuilder debug thread list and
saved a stacktrace of every single thread that had a loadClass
anywhere in its stacktrace (as bmp, I can see you roll on
the floor...).
But there is one thing with your argument that I don't
understand:
At least in what JBuilder showed as the source of the
java.lang.ClassLoader we have a lock (synchronized..) in
the java.lang.ClassLoader too.
So I saw two synchronized sections locking on two different
ClassLoader instances, which are "overcross" in the two threads.
Shouldn't this deadlock?
----------------------------------------------------------------------
Comment By: Scott M Stark (starksm)
Date: 2002-10-02 11:37
Message:
Logged In: YES
user_id=175228
Can't you just provide the VM thread dump rather than having
to run the server in a debugger? The thread dumps shown by
the two images do not indicate to me that the threads are
deadlocked. The Thread Pool Worker-0 thread is in
ClassLoader.loadClass with the
HeirarchicalLoaderRepository2 lock held which will stop the
Thread-47 from entering the HeirarchicalLoaderRepository2,
but loadClass will proceed. We need the full VM thread dump
in general to look at deadlock issues. The likely problem is
inconsistent locking at the HeirarchicalLoaderRepository2
level which would allow for recursive calls into a
HeirarchicalLoaderRepository2 by two different threads.
Another change made in the 3.2 beta release that will affect
this startup issue is that the TimedInstancePoolFeeder is no
longer used to initialize the pool, because its start could not
be synchronized well with the complete deployment start state.
An interim workaround might be to simply remove all of the
feeder config settings like the following:
<feeder-policy>org.jboss.ejb.plugins.TimedInstancePoolFeeder</feeder-policy>
<feeder-policy-conf>
  <increment>10</increment>
  <period>500</period>
</feeder-policy-conf>
----------------------------------------------------------------------
Comment By: Michael Bartmann (bartmann)
Date: 2002-10-02 09:10
Message:
Logged In: YES
user_id=69300
Oops, duplicates...
SourceForge didn't like my first post; I lost my text and
retried, so simply ignore one of my previous two comments.
----------------------------------------------------------------------
Comment By: Michael Bartmann (bartmann)
Date: 2002-10-02 09:05
Message:
Logged In: YES
user_id=69300
Hi Adrian,
you are right, only one of the threads goes through UCL2.
The other one (initiating in java.util.TimerThread) goes there
right through the guts of some Sun classes.
I'll append stacktraces in the (sorry for that) format of
zipped screenshots of the jbuilder debugger gui; there was
no way to save them as text :-(
Regards,
Michael
----------------------------------------------------------------------
Comment By: Michael Bartmann (bartmann)
Date: 2002-10-02 08:58
Message:
Logged In: YES
user_id=69300
Hi Adrian,
you are right, only one of the threads (the one from
the MBean container) goes through a UCL2. The other one
is an offspring of java.util.TimerThread and gets there right
through the guts of some sun classes. I have detailed
stacktraces as JBuilder-screenshot-bitmaps, which I will
append here (one at a time, I don't know how to append more
than one file through the sourceforge bug tracker)
Enjoy,
Michael
----------------------------------------------------------------------
Comment By: Adrian Brock (ejort)
Date: 2002-10-02 08:40
Message:
Logged In: YES
user_id=9459
Hi Michael,
Does this appear in your stacktrace?
package org.jboss.jms.asf;
...
public class ServerSessionPoolLoader
...
   protected void startService() throws Exception
   {
      XidFactoryMBean xidFactoryObj = (XidFactoryMBean)
         getServer().getAttribute(xidFactory, "Instance");
      Class cls = Class.forName(poolFactoryClass);
I've seen stack traces where Class.forName goes
straight through loadClassInternal() with the known
synchronization bug.
Shouldn't this use the context cl, a UCL2?
Regards,
Adrian
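The alternative raised in the question above, loading through the thread context class loader rather than the single-argument Class.forName, could look roughly like the sketch below. This is an illustration only and not the change that was committed: the class and method names are hypothetical, and the fallback to the system class loader is an assumption for the case where no context loader is set. Note that loadClass() does not initialize the class, unlike Class.forName(String).

```java
public class ContextLoaderSketch {

    // Loads a class by name through the calling thread's context class loader,
    // so the request enters via that loader's public loadClass() method.
    public static Class<?> loadWithContextLoader(String className) throws ClassNotFoundException {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            // Assumption: fall back to the system class loader when none is set.
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.loadClass(className);
    }

    public static void main(String[] args) throws ClassNotFoundException {
        Class<?> cls = loadWithContextLoader("java.util.ArrayList");
        System.out.println(cls.getName());
    }
}
```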
----------------------------------------------------------------------
_______________________________________________
Jboss-development mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/jboss-development