Hi Coleen,
Thank you for the answers!
I'm Okay with the fix.
Not sure if you have another version to review.
Thanks,
Serguei
On 11/21/19 21:00, [email protected] wrote:
Hi
Coleen,
Looks good in general.
Serguei, Thank you for reviewing this.
Nice approach, thank you for working on this!
It is great to get rid of the nmethodLocker's in
JvmtiDeferredEvent class.
I have some questions/comments.
http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/src/hotspot/share/runtime/serviceThread.cpp.frames.html
49 JvmtiDeferredEvent* ServiceThread::_jvmti_event = NULL;
The ServiceThread processes only one JVMTI event at each
iteration.
So, having this static field is Okay.
Yes, the ServiceThread has locked these fields or is running so
doesn't modify the current jvmti_event until it's done.
http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/src/hotspot/share/runtime/mutexLocker.cpp.udiff.html
- def(JmethodIdCreation_lock , PaddedMutex , leaf, true, _safepoint_check_always); // used for creating jmethodIDs.
+ def(JmethodIdCreation_lock , PaddedMutex , leaf, true, _safepoint_check_never); // used for creating jmethodIDs.
This needs a good testing coverage to be safe (everything
that is transitively using InstanceKlass::get_jmethod_id).
It would be easier to just run all the JVMTI, JDI and
j.l.instrument tests.
The hs-tier5 should have most of them.
I ran these tests locally and mach5 1-3, but I'll run through
hs-tier5 tonight.
http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.udiff.html
-void JvmtiDeferredEventQueue::enqueue(const JvmtiDeferredEvent& event) {
+void JvmtiDeferredEventQueue::enqueue(JvmtiDeferredEvent event) {
What was the motivation for this change?
Why the copy semantics is needed here?
The event is copied to the QueueNode _event field when it is
enqueued. I took out too many consts and meant to put them back
in. The only one that couldn't be const is:
JvmtiDeferredEvent& event() { return _event; }
because we call oops_do and nmethods_do on it.
I'm rerunning tests and will send out another patch if it differs
too much from the original.
Thanks for reviewing.
Coleen
Thanks,
Serguei
Please review a new version of this change that keeps the
nmethod from being unloaded, after it is added to the deferred
event queue:
http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/index.html
Ran the test that failed 100 times without failure, tier1 on
Oracle supported platforms, and tier2-3 including jvmti and
jdi tests locally.
See bug for more details about the crash.
https://bugs.openjdk.java.net/browse/JDK-8173361
Thanks,
Coleen
Hi Serguei,
Sorry for not sending an update. I talked to Erik and am
working on a version that keeps the nmethod from being
unloaded while it's in the deferred event queue, with a
version that the GC people will like, and I like. I'm
testing it out now.
Thanks!
Coleen
Hi Coleen,
Sorry for the latency, I had to investigate it a little
bit.
I still have some doubt your fix is right thing to do.
On 11/16/19 04:55, [email protected]
wrote:
Hi Coleen,
Hi, I've been working on answers to these questions,
so I'll start with this one.
The nmethodLocker keeps the nmethod from being
reclaimed (made_zombie or memory released) by the
sweeper, but the nmethod could be unloaded.
Unloading the nmethod clears the Method* _method
field.
Yes, I see it is done in the nmethod::make_unloaded().
The post_compiled_method_load event needs the
_method field to look at things like inlining and
ScopeDesc fields. If the nmethod is unloaded, some
of the oops are dead. There are "holder" oops that
correspond to the metadata in the nmethod. If these
oops are dead, causing the nmethod to get unloaded,
then the metadata may not be valid.
So my change 02 looks for a NULL nmethod._method
field to tell whether we can post information about
the nmethod.
There's code in nmethod.cpp like:
jmethodID nmethod::get_and_cache_jmethod_id() {
if (_jmethod_id == NULL) {
// Cache the jmethod_id since it can no longer
be looked up once the
// method itself has been marked for unloading.
_jmethod_id = method()->jmethod_id();
}
return _jmethod_id;
}
Which was added when post_method_load and unload
were turned into deferred events.
Could we cache the jmethodID in the
JvmtiDeferredEvent::compiled_method_load_event
similarly as we do in the
JvmtiDeferredEvent::compiled_method_unload_event?
This would help to get rid of the dependency on the
nmethod::_method.
Do we depend on any other nmethod fields?
Yes, there are other nmethod metadata that we rely on to
print inline information, and this function
JvmtiCodeBlobEvents::build_jvmti_addr_location_map
because it uses the ScopeDesc data in the nmethod.
One possible approach is to prepare and cache all this
information
in the nmethod::post_compiled_method_load_event() before
the
JvmtiDeferredEvent::compiled_method_load_event() is
called.
The event parameters are:
typedef struct {
const void* start_address;
jlocation location;
} jvmtiAddrLocationMap;
CompiledMethodLoad(jvmtiEnv *jvmti_env,
jmethodID method,
jint code_size,
const void* code_addr,
jint map_length,
const jvmtiAddrLocationMap* map,
const void* compile_info)
Some of these addresses above could be not accessible when
an event is posted.
Not sure yet if it is Okay.
The question is if this kind of refactoring is worth and
right thing to do.
We do cache the jmethodID but that's not good enough.
See my last comment in the bug report. The jmethodID
can point to an unloaded method.
This looks like it is done a little bit late.
It'd better to do it before the event is deferred (see
above).
I tried a version of keeping the nmethod alive, but the
GC folks will hate it. And it doesn't work and I hate
it.
From serviceability point of view this is the best and
most consistent approach.
I seems to me, it was initially designed this way.
The downside is it adds some extra complexity to the GC.
My version 01 is the best, with the caveat that maybe it
should check for _method == NULL instead of
nmethod->is_alive(). I have to talk to Erik to see
if there's a race with concurrent class unloading.
Any application that depends on a compiled method
loading event on a class that could be unloaded is a
buggy application. Applications should not rely on when
the JIT compiler decides to compile a method! This
happens to us for a stress test. Most applications will
get most of their compiled method loading events as they
normally do.
It is not an application that relies on the compiled
method loading event.
It is about profiling tools to be able to get correct
information about what is going on with compilations.
My concern is that if we skip such compiled method load
events then profilers have no way
to find out there many unneeded compilations that are
thrown away without any real use.
Also, it is not clear what happens with the subsequent
compiled method unload events.
Are they going to be skipped as well or they can appear
and confuse profilers?
Thanks,
Serguei
Thanks,
Coleen
Thanks,
Serguei
I put more debugging in the bug to show this crash
was from an unloaded nmethod.
Coleen
Hi Coleen,
I have some questions.
Both the compiler method load and unload are
posted as deferred events.
Both events keep the nmethod alive until the
ServiceThread processes the event.
The implementation is:
JvmtiDeferredEvent
JvmtiDeferredEvent::compiled_method_load_event(nmethod*
nm) {
. . .
// Keep the nmethod alive until the
ServiceThread can process
// this deferred event.
nmethodLocker::lock_nmethod(nm);
return event;
}
JvmtiDeferredEvent
JvmtiDeferredEvent::compiled_method_unload_event(nmethod*
nm, jmethodID id, const void* code) {
. . .
// Keep the nmethod alive until the
ServiceThread can process
// this deferred event. This will keep the
memory for the
// generated code from being reused too early.
We pass
// zombie_ok == true here so that our nmethod
that was just
// made into a zombie can be locked.
nmethodLocker::lock_nmethod(nm, true /*
zombie_ok */);
return event;
}
void JvmtiDeferredEvent::post() {
assert(ServiceThread::is_service_thread(Thread::current()),
"Service thread must post enqueued
events");
switch(_type) {
case TYPE_COMPILED_METHOD_LOAD: {
nmethod* nm =
_event_data.compiled_method_load;
JvmtiExport::post_compiled_method_load(nm);
// done with the deferred event so unlock
the nmethod
nmethodLocker::unlock_nmethod(nm);
break;
}
case TYPE_COMPILED_METHOD_UNLOAD: {
nmethod* nm =
_event_data.compiled_method_unload.nm;
JvmtiExport::post_compiled_method_unload(
_event_data.compiled_method_unload.method_id,
_event_data.compiled_method_unload.code_begin);
// done with the deferred event so unlock
the nmethod
nmethodLocker::unlock_nmethod(nm);
break;
}
. . .
}
}
Then I wonder how is it possible for the nmethod
to be not alive here?:
2168 void JvmtiExport::post_compiled_method_load(nmethod *nm) {
. . .
2173 // It's not safe to look at metadata for unloaded methods.
2174 if (!nm->is_alive()) {
2175 return;
2176 }
At least, it lokks like something else is broken.
Do I miss something important here?
Thanks,
Serguei
Summary:
Don't post information which uses metadata from
unloaded nmethods
Tested tier1-3 and 100 times with test that
failed (reproduced failure without the fix).
open webrev at http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8173361
Thanks,
Coleen
|