On Fri, 2016-04-29 at 12:33 -0700, serguei.spit...@oracle.com wrote:
> On 4/29/16 01:56, Severin Gehwolf wrote:
> > Hi Serguei,
> > 
> > On Fri, 2016-04-29 at 01:34 -0700, serguei.spit...@oracle.com wrote:
> > > Hi Severin,
> > > 
> > > The fix looks good in general.
> > > I'm testing both fixes together at the moment.
> > That is JDK-8154529 and JDK-8153711? Yes, that's what I've done too.
> > 
> > > A couple of questions...
> > > 
> > > It seems, there are more places where an invokerLock critical section is 
> > > missed.
> > Right.
> > 
> > > The following functions:
> > >   - invoker_enableInvokeRequests
> > This should be fixed with the patch for JDK-8154529
> > 
> > >   - invokeConstructor
> > >   - invokeStatic
> > >   - invokeNonvirtual
> > >   - invokeVirtual
> > >   - saveGlobalRef
> > Correct. The intent would be to fix the callsites of saveGlobalRef. If
> > we fix invoke* then saveGlobalRef should not be an issue. I didn't want
> > to include this in this fix since it's pretty hairy and would like to
> > fix this in incremental steps.
> > 
> > > The first function is easy to fix.
> > > The last 5 functions are called from the invoker_doInvoke() that we 
> > > already had a problem with.
> > > I'm puzzled with the question: How to synchronize and avoid deadlocks at 
> > > the same time?
> > I'm not sure yet. Perhaps locking only while saveGlobalRef is being
> > called in invoke* functions will help.
> > 
> > > I'm inclined to let your fix go (if the testing is Ok) and file one more 
> > > bug on the remaining sync issues.
> > Please keep me in the loop about your test results.
>  
> Both the JTreg com/sun/jdi and the co-located nsk.jdi tests are all passed.
> 
> I also ran the 4 previously failed tests in big loops of 1000 runs:
>     com/sun/jdi/InvokeTest.java
>     com/sun/jdi/InvokeHangTest.java
>     com/sun/jdi/InterfaceMethodsTest.java
>     com/sun/jdi/OomDebugTest.java  (new test introduced in the webrev)

I suggest to run InvokeTest, InvokeHangTest and InterfaceMethodsTest in
a loop. Those never failed for me in such a scenario.

> The OomDebugTest.java failed with a timeout (most likely, a deadlock).
> Please, find the OomDebugTest.jtr file in attachments.

Correct. This is what I was seeing. See the last comment of the bug:
https://bugs.openjdk.java.net/browse/JDK-8153711

It has the jstack output of the hung OomDebugTestTarg JVM. I'm not
convinced this is the same failure we were seeing in JDK-4858370 since
the stack suggests it's doing a GC upon a newInstance of a primitive
array. It also does not seem like the same issue as the deadlocks
exposed by locking during an invoke, because it was reproducible fairly
consistently with InvokeHangTest.

It looks to me like a new issue. Probably one which was there before,
but is only exposed by the new test. The new test stress-tests the GC
with a debugger attached. Of course, this is going to be hard to prove
since it would just run out of memory prior the patch. Thoughts?

Cheers,
Severin



> Thanks,
> Serguei 
> 
> 
> 
> > 
> > Thanks for your help!
> > 
> > Cheers,
> > Severin
> > 
> > > Thanks,
> > > Serguei
> > > 
> > > 
> > > On 4/28/16 02:00, Severin Gehwolf wrote:
> > > > On Tue, 2016-04-19 at 19:32 -0700, serguei.spit...@oracle.com wrote:
> > > > > > Hi Severin,
> > > > > > 
> > > > > > I postpone a push for this fix.
> > > > > > 
> > > > > > There are two nsk.jdi test failures (they look like deadlocks):
> > > > > >   nsk/jdi/ObjectReference/invokeMethod/invokemethod012 FAIL(TIMEOUT)
> > > > > >   nsk/jdi/Scenarios/invokeMethod/popframes001 FAIL(TIMEOUT)
> > > > > > 
> > > > > > and one jtreg sun/com/jdi failure (it looks like a deadlock too):
> > > > > >   com/sun/jdi/InvokeHangTes.java
> > > > > > 
> > > > > > 
> > > > > > I'll double check if these failures are regressions caused by your 
> > > > > > fix 
> > > > > > or not.
> > > > > I confirm, the failures above are new regressions introduced by the 
> > > > > fix.
> > > > > The tests fail consistently with the fix and do not fail without the 
> > > > > fix.
> > > > OK this was caused by the locking done in invoker_doInvoke(). Note that
> > > > holding either of them invoker lock or event handler lock causes this.
> > > > 
> > > > Here is a new webrev with those hunks removed. It's sufficient to grab
> > > > the locks again in invoke_completeInvokeRequest() when clearing the
> > > > global references in order to not get those failures we've seen when
> > > > the fix for JDK-4858370 was in place.
> > > > 
> > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8153711/webrev.02/
> > > > 
> > > > Testing done:
> > > >  - com/sun/jdi/InvokeTest.java com/sun/jdi/InvokeHangTest.java and
> > > >    sun/jdi/InterfaceMethodsTest.java does not fail in 1500 runs.
> > > >  - regular com/sun/jdi test set: no regressions
> > > > 
> > > > Note that there are some rare cases where OomDebugTest times out which
> > > > seems to be caused by the GC, though. See the bug for details. Since
> > > > this problem is rare, it's still worthwhile having that test included.
> > > > If it turns out a problem in practice we could consider disabling the
> > > > test.
> > > > 
> > > > Thoughts?
> > > > 
> > > > Cheers,
> > > > Severin
> > >  
>  

Reply via email to