Hiroshi,
I found a couple of good candidates:
6357005 4/3 'SingleStep' event fails assertion in
'src/share/vm/prims/jvmtiEventController.cpp, 590'
6648438 4/4 src/share/vm/prims/jvmtiEnv.cpp:457 assert(phase
== JVMTI_PHASE_LIVE,"sanity check")
I suspect that I'll have much more luck making a reproducible test
case from the test case in 6357005.
Dan
Daniel D. Daugherty wrote:
Hiroshi,
I'm leaning toward making the two fields volatile and adding the
appropriate OrderAccess::XXX calls. I vaguely remember an existing
bug for events posting late. I'm going to see if I can find it.
Dan
Hiroshi Yamauchi wrote:
Hi Dan,
On Wed, Feb 11, 2009 at 12:43 PM, Daniel D. Daugherty
<[email protected] <mailto:[email protected]>> wrote:
I'll chime in on parts of this thread below.
Tim Bell wrote:
I don't know the precise answer to this question, so I am
forwarding it to the Serviceability list to get a wider
audience.
-------- Original Message --------
Subject: Data visibility between threads in Hotspot
Date: Tue, 10 Feb 2009 11:14:24 -0800
Hi Tim,
I have a Hotspot question. Chuck Rasbold pointed me to you.
Feel free to
forward this message to other experts at Sun, if needed.
If one Hotspot engineer wants to guarantee the immediate
visibility of a
write to a static variable to reads in other threads (assuming
the reads
and the writes are properly synchronized via a mutex), what
does s/he do?
Does the use of MutexLocker guarantee the visibility? It
probably does not.
I believe that MutexLocker does guarantee the visibility:
src/share/vm/runtime/mutexLocker.hpp:
133 // See orderAccess.hpp. We assume throughout the VM that
MutexLocker's
134 // and friends constructors do a fence, a lock and an
acquire *in that
135 // order*. And that their destructors do a release and
unlock, in *that*
136 // order. If their implementations change such that these
assumptions
137 // are violated, a whole lot of code will break.
OK. That's handy.
See src/share/vm/runtime/orderAccess.hpp for the gory details.
There are several volatile variables in Hotpot, but it may not
really
work because C++ compilers do not guarantee the visibility or
limit the
instruction reordering.
See the "C++ Volatility" section in
src/share/vm/runtime/orderAccess.hpp.
It's an interesting header file to read.
For example, there is the static variable JvmtiEnvBase:_phase
which
indicates the JVMTI phase in which the JVM is currently in.
But AFAICT,
there is no synchronization used by the callers of
get_phase() and
set_phase() and apparently they can be called by multiple
threads. Also,
the JvmtiEventEnabled::_enabled_bits which is a jlong variable
that
indicates which events are enabled for a single jvmtiEnv. The
writes and
reads to it are not synchronized, either. These are race
conditions,
which implies that some JVMTI event can go off in the dead
phase when no
events are supposed to go off. I'm actually looking into a bug
in a
JVMTI agent that I suspect is due to this.
It certainly looks like JvmtiEnvBase::_phase should minimally be a
volatile; it may also require some sort of synchronization to force
the updated values to be visible...
JvmtiEventEnabled::_enabled_bits also looks problematic. I'm going
to have to mull on both of these and talk to some other folks.
The bug that I ran across was a memory corruption crash because a
CompiledMethodLoad event happened to be sent (due to this race
condition) after Agent_OnUnload call where I had already deallocated
the agent's data objects. This happens very rarely (once in every
more than 100k runs of a more-or-less javac invocation) but often
enough to fix.
I have a patch attached, which I used to suppress the bug by avoiding
the events to be sent during the dead phase by adding
synchronization. Basically I make the _phase variable volatile and I
use a mutex to guard event callbacks in the all the post_xxx
functions and the phase transition to the dead phase in
post_vm_death(). Probably we don't need the volatile as long as the
mutex guarantees the visibility. The other phase transitions
(onload->primodial->start->live) may not need the same degree of
synchronization because events that can be sent in a earlier phase
can be sent in the next phase as well.
The _enabled_bits and the callback function table will probably need
a similar synchronization as well. But the attached patch does not
implement it and only ensures that no events are sent during the dead
phase. I'd like to get the fix into openjdk in one form or another.
Thanks,
Hiroshi
Dan
Thanks,
Hiroshi