Mandy Chung said the following on 11/18/09 11:36:
> It's a known bug:
>
> 6501158: Thread state is incorrect during class initialization
> procedure
>
> I recalled the discussion for this bug but don't remember if we
> discussed enhancing the java.lang.management spec to cover "waiting"
> on VM internal actions.
>
> David will probably have more information about this.

I have nothing really to add save what is stated in the CR, but as my main comment was not public I've moved it to being public (and dropped myself as RE) and reproduce it below.

Quite simply the code that does the "wait" is low-level in the VM and does not come through the normal Object.wait() path that would set the Thread.State. It can be "fixed" but there are a couple of additional issues that also need to be addressed due to the fact that the monitor used is not associated with a Java-level object. (The JLS was updated in this regard.)
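
A minimal Java sketch of how the situation can be provoked, for anyone wanting to look at the reported state themselves. All names here (ClassInitWaitDemo, Slow, sampleWaiterState) are hypothetical, and the 200 ms pause is only a heuristic for "the second thread has reached the class initialization wait". Which Thread.State gets reported depends on the VM build, so the sketch returns the value rather than assuming one.

```java
import java.util.concurrent.CountDownLatch;

final class ClassInitWaitDemo {
    // Counted down once Slow's static initializer has started running.
    static final CountDownLatch STARTED = new CountDownLatch(1);
    // Counted down by the sampling code to let the static initializer finish.
    static final CountDownLatch RELEASE = new CountDownLatch(1);

    static final class Slow {
        static {
            STARTED.countDown();
            try {
                RELEASE.await();          // keep initialization "in progress"
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        static void touch() { }
    }

    // One-shot: a class is initialized only once, so only the first call
    // observes a thread waiting for initialization to complete.
    static Thread.State sampleWaiterState() {
        // Lambdas (not method references) so that Slow is first touched
        // inside the new threads, not in the calling thread.
        Thread initializer = new Thread(() -> Slow.touch(), "Initializer");
        Thread waiter = new Thread(() -> Slow.touch(), "Waiter");
        try {
            initializer.start();
            STARTED.await();              // initializer is now inside <clinit>
            waiter.start();
            Thread.sleep(200);            // heuristic: let the waiter block
            Thread.State observed = waiter.getState();
            RELEASE.countDown();          // let initialization complete
            initializer.join();
            waiter.join();
            return observed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while sampling", e);
        }
    }
}
```

Printing the result of ClassInitWaitDemo.sampleWaiterState() from a main method shows what the VM in use reports for the "Waiter" thread while it waits for the class to finish initializing.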

The meta-discussion was whether we should introduce a new Thread.State to cover this special case (waiting for class initialization), and that discussion seemed to lean towards doing this (I suggested it and Mandy agreed it seemed like a good idea :) ). But things did not progress from there.

Cheers,
David
-----

From 6501158:

The submitter quotes the JLS with regard to the class initialization procedure and how synchronization is employed. In fact hotspot does not synchronize using the Class object monitor during class initialization - this is to avoid denial-of-service style attacks just by explicitly locking a Class object. The JLS is in the process of being updated to say that a "unique initialization lock" is used for class initialization, not necessarily the Class object's lock. This brings the spec into line with the hotspot implementation.
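
As a rough illustration of the point about the Class object monitor, the following Java sketch holds the monitor of a Class object while another thread triggers initialization of that class. The names (InitLockDemo, Target, initializesWhileClassLocked) are hypothetical. On a VM that uses a separate initialization lock, as described above for hotspot, the method is expected to return true; on a VM that synchronized on the Class object it would time out and return false.

```java
final class InitLockDemo {
    // Set by Target's static initializer.
    static volatile boolean initialized = false;

    static final class Target {
        static { initialized = true; }
        static void touch() { }
    }

    // Returns true if Target was initialized by another thread while the
    // calling thread held the monitor of Target.class.
    static boolean initializesWhileClassLocked(long timeoutMillis) {
        Thread t = new Thread(() -> Target.touch(), "Initializer");
        t.setDaemon(true);                // do not keep the VM alive on timeout
        // A class literal does not trigger initialization of Target.
        synchronized (Target.class) {
            t.start();
            try {
                t.join(timeoutMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return initialized && !t.isAlive();
        }
    }
}
```

A call such as InitLockDemo.initializesWhileClassLocked(5000) is enough to try it; the timeout only bounds how long the caller waits in the failing case.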

The reason I mention this is that the monitor that hotspot uses is associated with the klassOop for the class. The monitor code sets current_waiting_monitor() or current_pending_monitor() as appropriate during wait() or monitor entry. The management code, via the ThreadService::ThreadSnapshot, gets hold of the object associated with the monitor for a blocked thread and assumes that the object is in fact the oop for a java.lang.Object. When the klassOop is treated as an instance oop and queried for its own class etc., we end up crashing the VM.

The suggested fix correctly sets the thread state to "WAITING":

Full thread dump Java HotSpot(TM) Tiered VM (1.7.0-internal-dh198349-fastdebug mixed mode):

"Runner" prio=3 tid=0x08171800 nid=0xb in Object.wait() [0xcb99d000..0xcb99dbb0]
   java.lang.Thread.State: WAITING (on object monitor)

but additional changes are needed in ThreadSnapshot to discard the non-instance oop. (It seems JvmtiEnvBase::get_current_contended_monitor would need a similar modification). This seems to work and getThreadInfo simply reports e.g.:

Current thread info: "Runner" Id=8 WAITING

which seems okay. And getLockInfo() returns null.
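
For callers of the management API, the practical consequence is that a null LockInfo has to be tolerated even for a thread that is not RUNNABLE. A small Java sketch along those lines, using the standard java.lang.management classes; the class and method names (ThreadInfoReport, describe, describeThread) are hypothetical, and the output format only loosely imitates the line shown above.

```java
import java.lang.management.LockInfo;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

final class ThreadInfoReport {
    // Formats name, id and state, and appends the lock only when one is reported.
    static String describe(ThreadInfo info) {
        if (info == null) {
            // getThreadInfo returns null for a thread that is not alive or does not exist.
            return "thread not found";
        }
        StringBuilder sb = new StringBuilder();
        sb.append('"').append(info.getThreadName()).append('"')
          .append(" Id=").append(info.getThreadId())
          .append(' ').append(info.getThreadState());
        LockInfo lock = info.getLockInfo();   // may be null, as in the case above
        if (lock != null) {
            sb.append(" on ").append(lock.getClassName())
              .append('@').append(Integer.toHexString(lock.getIdentityHashCode()));
        }
        return sb.toString();
    }

    // Looks the thread up through the platform ThreadMXBean; threadId must be positive.
    static String describeThread(long threadId) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        return describe(mx.getThreadInfo(threadId));
    }
}
```

For example, ThreadInfoReport.describeThread(Thread.currentThread().getId()) describes the calling thread.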

It is unclear however whether reporting this information actually violates the specification for these management APIs. A thread is only WAITING when performing Object.wait(), in which case there must be an Object being waited upon and so LockInfo must return non-null information. Yet that is not the case here.

It seems to me that while we can report the information above, it might be better to see whether the management specification can be enhanced to cover "waiting" on VM internal actions and to then report this circumstance as one of those.

Note also that the existing hotspot code could already be susceptible to a crash due to the use of the klassOop monitor for class initialization. If the timing were just right, a call to getThreadInfo could see a thread blocked trying to acquire this monitor (not wait upon it) and that would be captured by the ThreadSnapshot and eventually cause a crash. The fact that the snapshot requires a safepoint makes it less likely that you would encounter the target thread while blocked on the monitor, as the monitor is only held for a short period during class initialization.

I will await discussion with the management/monitoring folk before deciding how best to proceed with this CR.
