Re: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load

[email protected] Fri, 22 Nov 2019 10:31:15 -0800

Hi Coleen,

Thank you for the answers!
I'm Okay with the fix.
Not sure if you have another version to review.

Thanks,
Serguei

On 11/21/19 21:00, [email protected] wrote:

On 11/21/19 8:33 PM, [email protected] wrote:

Hi Coleen,

Looks good in general.

Serguei, Thank you for reviewing this.
Nice approach, thank you for working on this!
It is great to get rid of the nmethodLocker's in JvmtiDeferredEvent class.

I have some questions/comments.

http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/src/hotspot/share/runtime/serviceThread.cpp.frames.html
  49 JvmtiDeferredEvent* ServiceThread::_jvmti_event = NULL;
The ServiceThread processes only one JVMTI event at each iteration.
So, having this static field is Okay.
Yes, the ServiceThread has locked these fields or is running so doesn't modify the current jvmti_event until it's done.
http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/src/hotspot/share/runtime/mutexLocker.cpp.udiff.html
-  def(JmethodIdCreation_lock       , PaddedMutex  , leaf,        true,  _safepoint_check_always); // used for creating jmethodIDs.
+  def(JmethodIdCreation_lock       , PaddedMutex  , leaf,        true,  _safepoint_check_never); // used for creating jmethodIDs.
This needs a good testing coverage to be safe (everything
that is transitively using InstanceKlass::get_jmethod_id).
It would be easier to just run all the JVMTI, JDI and j.l.instrument tests.
The hs-tier5 should have most of them.
I ran these tests locally and mach5 1-3, but I'll run through hs-tier5 tonight.
http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/src/hotspot/share/prims/jvmtiImpl.cpp.udiff.html
-void JvmtiDeferredEventQueue::enqueue(const JvmtiDeferredEvent& event) {
+void JvmtiDeferredEventQueue::enqueue(JvmtiDeferredEvent event) {
What was the motivation for this change?
Why the copy semantics is needed here?
The event is copied to the QueueNode _event field when it is enqueued. I took out too many consts and meant to put them back in. The only one that couldn't be const is:

JvmtiDeferredEvent& event() { return _event; }

because we call oops_do and nmethods_do on it.

I'm rerunning tests and will send out another patch if it differs too much from the original.

Thanks for reviewing.
Coleen
Thanks,
Serguei

On 11/21/19 6:12 AM, [email protected] wrote:
Please review a new version of this change that keeps the nmethod from being unloaded, after it is added to the deferred event queue:

http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/index.html

Ran the test that failed 100 times without failure, tier1 on Oracle supported platforms, and tier2-3 including jvmti and jdi tests locally.

See bug for more details about the crash.

https://bugs.openjdk.java.net/browse/JDK-8173361

Thanks,
Coleen

On 11/18/19 10:09 PM, [email protected] wrote:
Hi Serguei,

Sorry for not sending an update. I talked to Erik and am working on a version that keeps the nmethod from being unloaded while it's in the deferred event queue, with a version that the GC people will like, and I like. I'm testing it out now.

Thanks!
Coleen

On 11/18/19 10:03 PM, [email protected] wrote:
Hi Coleen,

Sorry for the latency, I had to investigate it a little bit.
I still have some doubt your fix is right thing to do.

On 11/16/19 04:55, [email protected] wrote:

On 11/15/19 11:17 PM, [email protected] wrote:

Hi Coleen,

On 11/15/19 2:12 PM, [email protected] wrote:

Hi, I've been working on answers to these questions, so I'll start with this one.

The nmethodLocker keeps the nmethod from being reclaimed (made_zombie or memory released) by the sweeper, but the nmethod could be unloaded. Unloading the nmethod clears the Method* _method field.

Yes, I see it is done in the nmethod::make_unloaded().

The post_compiled_method_load event needs the _method field to look at things like inlining and ScopeDesc fields.   If the nmethod is unloaded, some of the oops are dead. There are "holder" oops that correspond to the metadata in the nmethod. If these oops are dead, causing the nmethod to get unloaded, then the metadata may not be valid.

So my change 02 looks for a NULL nmethod._method field to tell whether we can post information about the nmethod.

There's code in nmethod.cpp like:

jmethodID nmethod::get_and_cache_jmethod_id() {
if (_jmethod_id == NULL) {
    // Cache the jmethod_id since it can no longer be looked up once the
    // method itself has been marked for unloading.
    _jmethod_id = method()->jmethod_id();
}
return _jmethod_id;
}

Which was added when post_method_load and unload were turned into deferred events.

Could we cache the jmethodID in the JvmtiDeferredEvent::compiled_method_load_event
similarly as we do in the JvmtiDeferredEvent::compiled_method_unload_event?
This would help to get rid of the dependency on the nmethod::_method.
Do we depend on any other nmethod fields?

Yes, there are other nmethod metadata that we rely on to print inline information, and this function JvmtiCodeBlobEvents::build_jvmti_addr_location_map because it uses the ScopeDesc data in the nmethod.

One possible approach is to prepare and cache all this information
in the nmethod::post_compiled_method_load_event() before the
JvmtiDeferredEvent::compiled_method_load_event() is called.
The event parameters are:
typedef struct {
    const void* start_address;
    jlocation location;
} jvmtiAddrLocationMap;
CompiledMethodLoad(jvmtiEnv *jvmti_env,
            jmethodID method,
            jint code_size,
            const void* code_addr,
            jint map_length,
            const jvmtiAddrLocationMap* map,
            const void* compile_info)
Some of these addresses above could be not accessible when an event is posted.
Not sure yet if it is Okay.
The question is if this kind of refactoring is worth and right thing to do.

We do cache the jmethodID but that's not good enough. See my last comment in the bug report. The jmethodID can point to an unloaded method.

This looks like it is done a little bit late.
It'd better to do it before the event is deferred (see above).

I tried a version of keeping the nmethod alive, but the GC folks will hate it. And it doesn't work and I hate it.

From serviceability point of view this is the best and most consistent approach.
I seems to me, it was initially designed this way.
The downside is it adds some extra complexity to the GC.

My version 01 is the best, with the caveat that maybe it should check for _method == NULL instead of nmethod->is_alive(). I have to talk to Erik to see if there's a race with concurrent class unloading.

Any application that depends on a compiled method loading event on a class that could be unloaded is a buggy application. Applications should not rely on when the JIT compiler decides to compile a method! This happens to us for a stress test. Most applications will get most of their compiled method loading events as they normally do.

It is not an application that relies on the compiled method loading event.
It is about profiling tools to be able to get correct information about what is going on with compilations.
My concern is that if we skip such compiled method load events then profilers have no way
to find out there many unneeded compilations that are thrown away without any real use.
Also, it is not clear what happens with the subsequent compiled method unload events.
Are they going to be skipped as well or they can appear and confuse profilers?

Thanks,
Serguei
Thanks,
Coleen
Thanks,
Serguei
I put more debugging in the bug to show this crash was from an unloaded nmethod.

Coleen

On 11/15/19 4:45 PM, [email protected] wrote:
Hi Coleen,

I have some questions.

Both the compiler method load and unload are posted as deferred events.
Both events keep the nmethod alive until the ServiceThread processes the event.

The implementation is:

JvmtiDeferredEvent JvmtiDeferredEvent::compiled_method_load_event(nmethod* nm) {
. . .
// Keep the nmethod alive until the ServiceThread can process
// this deferred event.
nmethodLocker::lock_nmethod(nm);
return event;
}

JvmtiDeferredEvent JvmtiDeferredEvent::compiled_method_unload_event(nmethod* nm, jmethodID id, const void* code) {
. . .
// Keep the nmethod alive until the ServiceThread can process
// this deferred event. This will keep the memory for the
// generated code from being reused too early. We pass
// zombie_ok == true here so that our nmethod that was just
// made into a zombie can be locked.
nmethodLocker::lock_nmethod(nm, true /* zombie_ok */);
return event;
}

void JvmtiDeferredEvent::post() {
assert(ServiceThread::is_service_thread(Thread::current()),
         "Service thread must post enqueued events");
switch(_type) {
    case TYPE_COMPILED_METHOD_LOAD: {
      nmethod* nm = _event_data.compiled_method_load;
      JvmtiExport::post_compiled_method_load(nm);
      // done with the deferred event so unlock the nmethod
      nmethodLocker::unlock_nmethod(nm);
      break;
    }
    case TYPE_COMPILED_METHOD_UNLOAD: {
      nmethod* nm = _event_data.compiled_method_unload.nm;
      JvmtiExport::post_compiled_method_unload(
        _event_data.compiled_method_unload.method_id,
        _event_data.compiled_method_unload.code_begin);
      // done with the deferred event so unlock the nmethod
      nmethodLocker::unlock_nmethod(nm);
      break;
    }
    . . .
}
}

Then I wonder how is it possible for the nmethod to be not alive here?:
2168 void JvmtiExport::post_compiled_method_load(nmethod *nm) {
. . .
2173   // It's not safe to look at metadata for unloaded methods.
2174   if (!nm->is_alive()) {
2175     return;
2176   }
At least, it lokks like something else is broken.
Do I miss something important here?

Thanks,
Serguei

On 11/14/19 5:15 PM, [email protected] wrote:

Summary: Don't post information which uses metadata from unloaded nmethods

Tested tier1-3 and 100 times with test that failed (reproduced failure without the fix).

open webrev at http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8173361

Thanks,
Coleen

Re: RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load

Reply via email to