Hi Dan, Serguei,

I'm going to combine my response to you both into one as it's the same discussion ...

On 15/11/2018 9:28 am, Daniel D. Daugherty wrote:
On 11/14/18 6:01 PM, David Holmes wrote:
On 15/11/2018 6:13 am, Daniel D. Daugherty wrote:
I do not have a better idea at the moment.

The ResourceExhausted event is one of many JVM/TI events. So we put in a
special case for this event because an agent out in the world tries to do
something that is not expressly forbidden by the spec, but it is a bad
thing to do with HotSpot. Okay. What about the next event? What about
the next agent? Once you've added a special case, where do you stop?

I had proposed more broadly not posting any events in threads that can not run Java but that is potentially more damaging to existing agents that just happen to work in the compiler thread today.

But it goes beyond just running Java. Can these other threads make JNI calls, or other JVM TI calls? If they can run Java they can do all those things. If they can't run Java they may not be able to.

There is no perfect solution here because JVM TI allows too much freedom. It doesn't put any constraints on the threads that can execute callbacks, and has very few constraints on what callbacks can be executed when.

On the other hand an implementation has complete freedom as to when exactly resource exhaustion events get posted. Does it make sense for an internal VM thread to post this ResourceExhausted event? Is it just an accident of implementation that the code path decided to throw OOME and so also posts the event? You can easily imagine this code not doing that - and as Thomas points out the compiler thread swallows the exception anyway!

As there is no way to know what an agent may do ahead of time we are faced with either:
- breaking some agents and crashing the VM; or
- masking an event

I vote for no breakage.

I would have no problem suppressing the event from a "hidden" thread
because those threads intentionally try to hide from JVM/TI.

Yes but only for certain things:
- not visible in thread stack dumps
- not able to suspend/resume
- not visible in thread lists/groups
- no thread lifecycle events
- not roots for Heap walking

But as far as I can see "hidden" threads still generate all other JVM TI events, so excluding them here seems to be too broad and marks a change in behaviour that seems unjustified.

That would
cover the case for the C1 and C2 CompilerThreads, but I don't know
about these newer JVM/CI Compiler threads that actually run Java code...

The JVMCI threads are not hidden. As Serguei points out in his response the hidden-ness of a compiler thread is based on its ability to run Java. So although it comes down to the same thing for compiler threads I still thing the correct test is for can_call_Java().

[Aside: I think JVMCI needs to re-think how it expects internal compiler threads to interact with things like JVM TI - we're already seeing issues with tests 'accidentally' suspending compiler threads and then getting into deadlocks! ]

JC Beyler made a suggestion in Thomas's original query thread: perhaps we should defer these events for the ServiceThread to execute? That would still have to be gated on either the thread being hidden or (preferably to me) not being able to run Java, but at least it should avoid the problem whilst still causing the event to appear.

That said, ResourceExhausted events have no context information and to me they appear to be expected to be synchronous events thrown in the thread that encountered the resource exhaustion. So to have them post in the ServiceThread might still be somewhat surprising to an agent.

So my preferred point solution is still to skip posting ResourceExhausted if the thread can not run Java code.

Cheers,
David
-----


Dan



Cheers,
David

Dan


On 11/14/18 2:59 PM, Thomas Stüfe wrote:
Hi Dan,

On Wed, Nov 14, 2018 at 8:32 PM Daniel D. Daugherty
<daniel.daughe...@oracle.com> wrote:
I have a philosophical problem with a fix like this.

The fix is making the assumption that the handler for this event will want to run Java code and if the event is posted from a Java thread that cannot
run Java code, then we skip posting the event.

If I happen to have a more conservative agent that does not run Java code
in its handlers, then my agent won't receive this event even though it
should not cause my agent any problems. That might be an unexpected change
in behavior on the part of my agent.

Dan

I was thinking about that too. In fact, the JVMTI agent I am trying to
add this fix for exists in two variants. The base form,
airlift/jvmkill, just does a plain kill(2). So not problem there. Only
the forked version by cloudfoundry/jvmkill do this fancy stuff. (Note
that both projects are on github, if you care to look them up. They
are not by me. I am just using them).

But the problem is, the JVMTI spec says nothing about what you can or
cannot do in reaction to a ResourceExhausted event. So what
cloudfoundry/jvmkill does is valid and not forbidden.

Therefore I think suppressing ResourceExhausted in this case is the
only choice we have. One might fine-tune the conditions under which we
suppress sending ResourceExhausted: maybe suppress for
CompilerThreads, or only  for CompilerThreads getting MetaspaceOOMs...

But I think we should do something here. By neglecting to add
restrictions to the JVMTI spec, we encouraged JVMTI agents to do these
kind of things. The least we can do is minimize the damage.

And then, usually this will not be the last opportunity for
ResourceExhausted to be posted, no? Chances are high that there will
be more OOMs following, in real java threads.

Finally, I find swallowing the ResourceExhausted for compiler threads
is symmetric to the way the compiler thread swallows the OOME itself.
They do not report the OOME but ignore and clear it.

But as you can see, I see your point. Do you have a better proposal?

..Thomas


On 11/14/18 10:06 AM, Daniel D. Daugherty wrote:
Adding serviceability-dev@...

Since the proposed solution to the bug is to not post an event, I think the Serviceability Team (which owns JVM/TI) needs to be involved directly
in the review.

Dan


On 11/14/18 9:28 AM, Thomas Stüfe wrote:
Dear all,

may I please have reviews on this small patch. Note that this is
borderline serviceability. I try to avoid crossposting, but if you
think this should be looked at by serviceability feel free to forward
it there.

Issue: https://bugs.openjdk.java.net/browse/JDK-8213834

CR:
http://cr.openjdk.java.net/~stuefe/webrevs/8213834-jvmti-reourceexhausted-shall-not-be-posted-from-compiler-thread/webrev.00/webrev/

Short description: we may post ResourceExhausted from Compiler
threads. Handlers of this event may call back into the JVM, which may
cause problems since we run in the compiler thread. The proposed
solution is not to post the event in that case.

See previous discussion here:
http://mail.openjdk.java.net/pipermail/serviceability-dev/2018-November/025898.html


Thanks, Thomas


Reply via email to