They look related, but it's hard for me to tell because I cannot see the test cases :)
On Fri, Feb 13, 2009 at 2:13 PM, Daniel D. Daugherty
<[email protected]> wrote:
> Hiroshi,
>
> I found a couple of good candidates:
>
> 6357005 4/3 'SingleStep' event fails assertion in
>     'src/share/vm/prims/jvmtiEventController.cpp, 590'
>
> 6648438 4/4 src/share/vm/prims/jvmtiEnv.cpp:457 assert(phase
>     == JVMTI_PHASE_LIVE,"sanity check")
>
> I suspect that I'll have much more luck making a reproducible test
> case from the test case in 6357005.
>
> Dan
>
> Daniel D. Daugherty wrote:
>> Hiroshi,
>>
>> I'm leaning toward making the two fields volatile and adding the
>> appropriate OrderAccess::XXX calls. I vaguely remember an existing
>> bug for events posting late. I'm going to see if I can find it.
>>
>> Dan
>>
>> Hiroshi Yamauchi wrote:
>>> Hi Dan,
>>>
>>> On Wed, Feb 11, 2009 at 12:43 PM, Daniel D. Daugherty
>>> <[email protected] <mailto:[email protected]>> wrote:
>>>
>>>     I'll chime in on parts of this thread below.
>>>
>>>     Tim Bell wrote:
>>>
>>>         I don't know the precise answer to this question, so I am
>>>         forwarding it to the Serviceability list to get a wider
>>>         audience.
>>>
>>>         -------- Original Message --------
>>>         Subject: Data visibility between threads in Hotspot
>>>         Date: Tue, 10 Feb 2009 11:14:24 -0800
>>>
>>>         Hi Tim,
>>>
>>>         I have a Hotspot question. Chuck Rasbold pointed me to you.
>>>         Feel free to forward this message to other experts at Sun,
>>>         if needed.
>>>
>>>         If a Hotspot engineer wants to guarantee that a write to a
>>>         static variable is immediately visible to reads in other
>>>         threads (assuming the reads and the writes are properly
>>>         synchronized via a mutex), what does s/he do?
>>>
>>>         Does the use of MutexLocker guarantee the visibility? It
>>>         probably does not.
>>>
>>>     I believe that MutexLocker does guarantee the visibility:
>>>
>>>     src/share/vm/runtime/mutexLocker.hpp:
>>>
>>>     133 // See orderAccess.hpp. We assume throughout the VM that MutexLocker's
>>>     134 // and friends constructors do a fence, a lock and an acquire *in that
>>>     135 // order*. And that their destructors do a release and unlock, in *that*
>>>     136 // order. If their implementations change such that these assumptions
>>>     137 // are violated, a whole lot of code will break.
>>>
>>> OK. That's handy.
>>>
>>>     See src/share/vm/runtime/orderAccess.hpp for the gory details.
>>>
>>>         There are several volatile variables in Hotspot, but volatile
>>>         may not really work, because C++ compilers neither guarantee
>>>         visibility nor limit instruction reordering.
>>>
>>>     See the "C++ Volatility" section in
>>>     src/share/vm/runtime/orderAccess.hpp.
>>>
>>> It's an interesting header file to read.
>>>
>>>         For example, there is the static variable JvmtiEnvBase::_phase,
>>>         which indicates the JVMTI phase the JVM is currently in. But
>>>         AFAICT, there is no synchronization used by the callers of
>>>         get_phase() and set_phase(), and apparently they can be called
>>>         by multiple threads. There is also
>>>         JvmtiEventEnabled::_enabled_bits, a jlong variable that
>>>         indicates which events are enabled for a single jvmtiEnv. The
>>>         writes and reads to it are not synchronized, either. These are
>>>         race conditions, which implies that a JVMTI event can fire in
>>>         the dead phase, when no events are supposed to fire. I'm
>>>         actually looking into a bug in a JVMTI agent that I suspect is
>>>         due to this.
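The pattern the mutexLocker.hpp comment above describes can be sketched
minimally in HotSpot style. This is an illustration, not code from the
thread: Phase_lock is a hypothetical Mutex, and the accessors stand in
for JvmtiEnvBase::get_phase()/set_phase(), which the question notes are
currently called without synchronization.

    // Hypothetical sketch; Phase_lock is an assumed HotSpot Mutex.
    static jvmtiPhase _phase;

    void set_phase(jvmtiPhase phase) {
      MutexLocker ml(Phase_lock);   // constructor: fence, lock, acquire
      _phase = phase;
    }                               // destructor: release, unlock

    jvmtiPhase get_phase() {
      MutexLocker ml(Phase_lock);   // acquire pairs with the writer's release
      return _phase;
    }

Because the constructor fences and acquires and the destructor releases,
a thread that takes Phase_lock in get_phase() is guaranteed to observe a
value stored by an earlier set_phase(), which is exactly the visibility
guarantee being asked about.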
>>>     It certainly looks like JvmtiEnvBase::_phase should minimally be a
>>>     volatile; it may also require some sort of synchronization to force
>>>     the updated values to be visible...
>>>
>>>     JvmtiEventEnabled::_enabled_bits also looks problematic. I'm going
>>>     to have to mull on both of these and talk to some other folks.
>>>
>>> The bug that I ran across was a memory corruption crash: a
>>> CompiledMethodLoad event happened to be sent (due to this race
>>> condition) after the Agent_OnUnload call, by which time I had already
>>> deallocated the agent's data objects. This happens very rarely (once
>>> in more than 100k runs of a more-or-less javac invocation) but often
>>> enough to fix.
>>>
>>> I have a patch attached, which I used to suppress the bug by adding
>>> synchronization that prevents events from being sent during the dead
>>> phase. Basically, I make the _phase variable volatile, and I use a
>>> mutex to guard the event callbacks in all the post_xxx functions and
>>> the transition to the dead phase in post_vm_death(). We probably don't
>>> need the volatile as long as the mutex guarantees the visibility. The
>>> other phase transitions (onload -> primordial -> start -> live) may
>>> not need the same degree of synchronization, because events that can
>>> be sent in an earlier phase can be sent in the next phase as well.
>>>
>>> The _enabled_bits and the callback function table will probably need
>>> similar synchronization as well. But the attached patch does not
>>> implement that; it only ensures that no events are sent during the
>>> dead phase. I'd like to get the fix into openjdk in one form or
>>> another.
>>>
>>> Thanks,
>>> Hiroshi
>>>
>>>     Dan
>>>
>>>         Thanks,
>>>         Hiroshi
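A minimal sketch of the approach the patch describes, not the patch
itself: JvmtiEventPost_lock is a hypothetical mutex name,
post_compiled_method_load stands in for the whole post_xxx family, and
event construction and delivery are elided.

    // Hypothetical sketch of the dead-phase guard described above.
    volatile jvmtiPhase JvmtiEnvBase::_phase;

    void JvmtiExport::post_compiled_method_load(/* ... */) {
      MutexLocker ml(JvmtiEventPost_lock);   // assumed lock name
      if (JvmtiEnvBase::get_phase() == JVMTI_PHASE_DEAD) {
        return;  // Agent_OnUnload may have run; don't call into the agent
      }
      // ... invoke the agent's CompiledMethodLoad callback ...
    }

    void JvmtiExport::post_vm_death() {
      // ... deliver the VMDeath event ...
      MutexLocker ml(JvmtiEventPost_lock);
      JvmtiEnvBase::set_phase(JVMTI_PHASE_DEAD);
      // After the unlock, any post_xxx that acquires the lock observes
      // the dead phase and returns without touching agent state.
    }

With this shape, an in-flight post_xxx either completes its callback
before the transition (it holds the lock across the callback) or observes
JVMTI_PHASE_DEAD and bails out, closing the race at the cost of
serializing event posts.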
