On Fri, 30 Oct 2020 16:02:41 GMT, Erik Österlund <eosterl...@openjdk.org> wrote:
> > Hi Erik, > > is it possible for GC to mistake a primitive value for a reference when > > posting the exit event? > > My understanding is: we are at a random bci of a method that is forced to > > return early. The expression stack is emptied and the return value is > > pushed on the expression stack then we call into the interpreter runtime to > > post the JVMTI method exit event during which we come to a safepoint for > > GC. The oop map for the bci does not cover this forced early return and if > > the return value is an object then the reference pushed on the expression > > stack before is not updated by GC. With your fix the value is updated if it > > is a reference. > > If this is correct then to me it appears as if GC can also crash because > > the oop map for the random bci tells there has to be a reference at the > > stack position of the return value if it actually is a primitive value. > > I think what you are saying is true. Note though that the return value of > ForceEarlyReturn is installed with a handshake. The handshake polls of the > interpreter are emitted in loop backedges and returns. At loop backedges, the > expression stack is empty (required by OSR), and at returns the types match > correctly. However, if an arbitrary bytecode performs a runtime call with > call_VM() while the bottom of the expression stack is an oop, then I think > there is an issue. At that call_VM, the early return value could get > installed, and when the C++ function returns, we check for early returns, > further dispatching to an unwind routine that posts the MethodExit > notification. If we GC during this MethodExit notification, then I think you > can crash the GC. The GC code generates an oop map for the frame, checking > what the types in the expression stack should be. The early return int is > pushed on the slot intersecting with the bottom entry in the expression > stack. That bottom entry could be an oop, and the early return value could be an int. Then the early return int will be passed to the oop closure, which should result in a crash. > > So I suspect that almost always, the handshake installing the > ForceEarlyReturn value is installed with a handshake in a bytecode backedge > or at a return, where the interpreter safepoint polls for the fast path code. > Then you won't notice the issue. But in the rare scenario that the > ForceEarlyReturn value is installed in a slow path call from a random > bytecode... I can't see how that would work correctly. Looking more closely, I gotta say I have no idea why there is a call to clear the expression stack at all in TemplateInterpreterGenerator::generate_earlyret_entry_for(). It is unclear to me what problem if any that solves. It does however seem like it introduces this problem of having a forced int return value intersect with an oop in the expression stack for a BCI performing slow-path calls into the VM. Simply removing the code that clears the expression stack, removes the issue, and I can't see that it introduces any other issue. Anyway, I think this is a separate bug. Do you mind if I push the fix for that bug, as a different RFR? It will likely involve poking around at more platform dependent code. ------------- PR: https://git.openjdk.java.net/jdk/pull/930