Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-24 Thread Kim Barrett
> On Aug 24, 2016, at 9:26 PM, David Holmes  wrote:
> 
> Hi Kim,
> 
> Sorry this fell off the radar …

That’s okay.  There was a long gap while I was working on other things.

> On 24/08/2016 7:09 AM, Kim Barrett wrote:
>> Mandy asked for some additional comments to make life easier for
>> future readers.  I've also stopped pretending the description for
>> waitForReferenceProcessing is javadoc, since this function is private
>> and only accessible via SharedSecrets or reflection that avoids access
>> control.
>> 
>> Changes only in the jdk tree:
>> 
>> New webrevs:
>> full: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.05/
>> incr: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.05.inc/
> 
> Just to be clear, in waitForReferenceProcessing() you are okay with returning 
> true on spurious wakeup?

Yes.

>> Unchanged hotspot webrevs:
>> full: http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04/
>> incr: http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04.inc/
> 
> No further comments from me.
> 
> Thanks,
> David




Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-24 Thread David Holmes

Hi Kim,

Sorry this fell off the radar ...

On 24/08/2016 7:09 AM, Kim Barrett wrote:

Mandy asked for some additional comments to make life easier for
future readers.  I've also stopped pretending the description for
waitForReferenceProcessing is javadoc, since this function is private
and only accessible via SharedSecrets or reflection that avoids access
control.

Changes only in the jdk tree:

New webrevs:
full: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.05/
incr: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.05.inc/


Just to be clear, in waitForReferenceProcessing() you are okay with 
returning true on spurious wakeup?



Unchanged hotspot webrevs:
full: http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04/
incr: http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04.inc/


No further comments from me.

Thanks,
David


Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-24 Thread Kim Barrett
> On Aug 24, 2016, at 3:50 AM, Per Liden  wrote:
> 
> Still looks good.
> 
> cheers,
> Per

Thanks.

> 
> On 2016-08-23 23:09, Kim Barrett wrote:
>> Mandy asked for some additional comments to make life easier for
>> future readers.  I've also stopped pretending the description for
>> waitForReferenceProcessing is javadoc, since this function is private
>> and only accessible via SharedSecrets or reflection that avoids access
>> control.
>> 
>> Changes only in the jdk tree:
>> 
>> New webrevs:
>> full: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.05/
>> incr: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.05.inc/
>> 
>> Unchanged hotspot webrevs:
>> full: http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04/
>> incr: http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04.inc/




Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-24 Thread Per Liden

Still looks good.

cheers,
Per

On 2016-08-23 23:09, Kim Barrett wrote:

Mandy asked for some additional comments to make life easier for
future readers.  I've also stopped pretending the description for
waitForReferenceProcessing is javadoc, since this function is private
and only accessible via SharedSecrets or reflection that avoids access
control.

Changes only in the jdk tree:

New webrevs:
full: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.05/
incr: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.05.inc/

Unchanged hotspot webrevs:
full: http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04/
incr: http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04.inc/




Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-23 Thread Kim Barrett
> On Aug 23, 2016, at 5:31 PM, Mandy Chung  wrote:
> 
> 
>> On Aug 23, 2016, at 2:09 PM, Kim Barrett  wrote:
>> 
>> New webrevs:
>> full: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.05/
>> incr: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.05.inc/
>> 
>> Unchanged hotspot webrevs:
>> full: http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04/
>> incr: http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04.inc/
> 
> +1
> 
> I agree with Kim offline to keep the current proposed methods as the 
> interface can be revised in the future when needed.
> 
> FYI.  I updated the description of JDK-8156500 to:
>  Move Reference pending list into VM to prevent deadlocks
> 
> This will make it clear what this changeset is about.
> 
> Mandy

Thanks.



Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-23 Thread Mandy Chung

> On Aug 23, 2016, at 2:09 PM, Kim Barrett  wrote:
> 
> New webrevs:
> full: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.05/
> incr: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.05.inc/
> 
> Unchanged hotspot webrevs:
> full: http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04/
> incr: http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04.inc/

+1

I agree with Kim offline to keep the current proposed methods as the interface 
can be revised in the future when needed.

FYI.  I updated the description of JDK-8156500 to:
  Move Reference pending list into VM to prevent deadlocks

This will make it clear what this changeset is about.

Mandy

Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-23 Thread Kim Barrett
Mandy asked for some additional comments to make life easier for
future readers.  I've also stopped pretending the description for
waitForReferenceProcessing is javadoc, since this function is private
and only accessible via SharedSecrets or reflection that avoids access
control.

Changes only in the jdk tree:

New webrevs:
full: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.05/
incr: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.05.inc/

Unchanged hotspot webrevs:
full: http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04/
incr: http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04.inc/




Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-19 Thread Kim Barrett
> On Aug 19, 2016, at 7:54 PM, Mandy Chung  wrote:
> 
> 
>> On Aug 19, 2016, at 4:09 PM, Kim Barrett  wrote:
>> 
>>> On Aug 13, 2016, at 4:07 PM, Mandy Chung  wrote:
>>> 
>>> 
 On Aug 13, 2016, at 1:20 AM, Peter Levart  wrote:
 
 Hi Mandy,
 
 On 08/13/2016 01:55 AM, Mandy Chung wrote:
>> On Aug 8, 2016, at 6:25 PM, Kim Barrett 
>> wrote:
>> 
>> full: 
>> http://cr.openjdk.java.net/~kbarrett/8156500/jdk.04/
>> 
>> 
>> http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04/
> This looks very good.  
> 
> Have you considered having JVM_WaitForReferencePendingList method to 
> return the pending list?  i.e. make it a blocking version of 
> getAndClearReferencePendingList rather than two operations.
> 
> waitForReferenceProcessing is really wait for pending reference(s) 
> enqueued (additionally cleaner gets invoked).  What do you think to 
> rename this method to “waitForPendingReferencesEnqueued”?
> 
 
 I think the split is intentional. It's a clever way to avoid an otherwise 
 inevitable race that could otherwise cause DBB allocating thread to fail 
 with OOME prematurely.
 
>>> 
>>> Ah I see.  I initially read that pendingList doesn’t need to be set within 
>>> the synchronized block with processPendingActive flag but such race could 
>>> cause DBB to fail with OOME, as you clarified.
>>> 
>>> Then I suggest waitForReferencePendingList (and its JVM function) be 
>>> renamed to:
>>> boolean awaitReferencePendingList();
>>> 
>>> a blocking version of "boolean hasReferencePendingList()”.  Minor 
>>> difference but I think it’s little clearer.
>> 
>> Regarding the name, e.g. "waitFor" vs "await", the meaning is the
>> same.  If there were a documented or defacto preference for "await" I
>> would feel obliged to rename, but the JDK seems to use both forms in
>> roughly comparable numbers.
>> 
>> Regarding the signature and description, recall that
>> waitForReferencePendingList() does not return a value, nor does it
>> need to.  So documenting it as a blocking version of
>> hasReferencePendingList() isn't correct.
> 
> It’s not a suggestion to write “a blocking version of …” in the javadoc (just 
> my brief way to make my comment that clears my reading of the interfaces)
> 
> If there were multiple reference handling threads, each would want to wait 
> until there are pending references but only one thread could grab the pending 
> list.   Having it to return a boolean would be useful to me and up to the 
> callers to decide whether it wants to check the return value (it doesn’t need 
> to get the pending list if it returns false)

But we don't have multiple reference processing [handling] threads,
nor does the proposed design want such.  waitForReferencePendingList()
is an implementation detail that exists for the exclusive use of the
single reference processing thread, to allow it to wait outside the
synchronization block where it gets the pending list.  The notion of a
plurality of callers makes no sense for this function.

Maybe I'm not understanding what you are suggesting? It seems like you
are talking about some unknown to me design that is different from
what I've proposed, and suggesting names and signatures based on that
alternative design.

> If waitForReferencePendingList returns a boolean, awaitReferencePendingList 
> reads consistent with “hasReferencePendingList” (one word verb).
> 
>> 
>>> waitForReferenceProcessing - I’d like the javadoc to include more details 
>>> or @implNote for example: 
>>> This method returns true to indicate that some pending references have been 
>>> enqueued.   If there are cleaners, cleaning actions are queued to be 
>>> invoked e.g. DirectByteBuffer may be deallocated.
>>> 
>>> This would help the readers to understand what this method is intended to 
>>> do.
>> 
>> I was intentionally a little vague in the description; I don't want to
>> be promising too much, and having callers depend on behavior that we
>> might want to change. In particular, it was intentional that there's
>> no mention of Cleaner or any special behavior for them.
>> 
>> For example, we might later decide we don't want the reference
>> processing thread to incur the cost of notifying possible waiters and
>> the associated synchronization, and instead let "waiters" return
>> immediately when there is reference processing in progress, on the
>> assumption that the reference processing thread makes rapid progress.
>> In some respects that's probably more similar in effect to what the
>> old code did.
>> 
>> Also recall that this function is presently private, accessed by
>> nio.Bits code via SharedSecrets (and via reflection by one test).  It
>> might be appropriate to change the access to this function or provide
>> some similar non-private function in the future, at 

Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-19 Thread Mandy Chung

> On Aug 19, 2016, at 4:09 PM, Kim Barrett  wrote:
> 
>> On Aug 13, 2016, at 4:07 PM, Mandy Chung  wrote:
>> 
>> 
>>> On Aug 13, 2016, at 1:20 AM, Peter Levart  wrote:
>>> 
>>> Hi Mandy,
>>> 
>>> On 08/13/2016 01:55 AM, Mandy Chung wrote:
> On Aug 8, 2016, at 6:25 PM, Kim Barrett 
> wrote:
> 
> full: 
> http://cr.openjdk.java.net/~kbarrett/8156500/jdk.04/
> 
> 
> http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04/
 This looks very good.  
 
 Have you considered having JVM_WaitForReferencePendingList method to 
 return the pending list?  i.e. make it a blocking version of 
 getAndClearReferencePendingList rather than two operations.
 
 waitForReferenceProcessing is really wait for pending reference(s) 
 enqueued (additionally cleaner gets invoked).  What do you think to rename 
 this method to “waitForPendingReferencesEnqueued”?
 
>>> 
>>> I think the split is intentional. It's a clever way to avoid an otherwise 
>>> inevitable race that could otherwise cause DBB allocating thread to fail 
>>> with OOME prematurely.
>>> 
>> 
>> Ah I see.  I initially read that pendingList doesn’t need to be set within 
>> the synchronized block with processPendingActive flag but such race could 
>> cause DBB to fail with OOME, as you clarified.
>> 
>> Then I suggest waitForReferencePendingList (and its JVM function) be renamed 
>> to:
>>  boolean awaitReferencePendingList();
>> 
>> a blocking version of "boolean hasReferencePendingList()”.  Minor difference 
>> but I think it’s little clearer.
> 
> Regarding the name, e.g. "waitFor" vs "await", the meaning is the
> same.  If there were a documented or defacto preference for "await" I
> would feel obliged to rename, but the JDK seems to use both forms in
> roughly comparable numbers.
> 
> Regarding the signature and description, recall that
> waitForReferencePendingList() does not return a value, nor does it
> need to.  So documenting it as a blocking version of
> hasReferencePendingList() isn't correct.

It’s not a suggestion to write “a blocking version of …” in the javadoc (just 
my brief way to make my comment that clears my reading of the interfaces)

If there were multiple reference handling threads, each would want to wait 
until there are pending references but only one thread could grab the pending 
list.   Having it to return a boolean would be useful to me and up to the 
callers to decide whether it wants to check the return value (it doesn’t need 
to get the pending list if it returns false)

If waitForReferencePendingList returns a boolean, awaitReferencePendingList 
reads consistent with “hasReferencePendingList” (one word verb).

> 
>> waitForReferenceProcessing - I’d like the javadoc to include more details or 
>> @implNote for example: 
>> This method returns true to indicate that some pending references have been 
>> enqueued.   If there are cleaners, cleaning actions are queued to be invoked 
>> e.g. DirectByteBuffer may be deallocated.
>> 
>> This would help the readers to understand what this method is intended to do.
> 
> I was intentionally a little vague in the description; I don't want to
> be promising too much, and having callers depend on behavior that we
> might want to change. In particular, it was intentional that there's
> no mention of Cleaner or any special behavior for them.
> 
> For example, we might later decide we don't want the reference
> processing thread to incur the cost of notifying possible waiters and
> the associated synchronization, and instead let "waiters" return
> immediately when there is reference processing in progress, on the
> assumption that the reference processing thread makes rapid progress.
> In some respects that's probably more similar in effect to what the
> old code did.
> 
> Also recall that this function is presently private, accessed by
> nio.Bits code via SharedSecrets (and via reflection by one test).  It
> might be appropriate to change the access to this function or provide
> some similar non-private function in the future, at which point the
> javadoc and possibly the semantics and implementation will need to be
> carefully examined.

This is private API that is subject to change or remove any time.  I won’t 
worry the javadoc here will give a false impression of its supportedness.

I think this deserves reasonable level of detailed implementation notes.  It 
will help the readers to understand what this does rather than reading the code 
(both this class and the callsites) carefully with multiple mutators in mind.

Mandy



Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-19 Thread Kim Barrett
> On Aug 13, 2016, at 4:07 PM, Mandy Chung  wrote:
> 
> 
>> On Aug 13, 2016, at 1:20 AM, Peter Levart  wrote:
>> 
>> Hi Mandy,
>> 
>> On 08/13/2016 01:55 AM, Mandy Chung wrote:
 On Aug 8, 2016, at 6:25 PM, Kim Barrett 
 wrote:
 
 full: 
 http://cr.openjdk.java.net/~kbarrett/8156500/jdk.04/
 
 
 http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04/
>>> This looks very good.  
>>> 
>>> Have you considered having JVM_WaitForReferencePendingList method to return 
>>> the pending list?  i.e. make it a blocking version of 
>>> getAndClearReferencePendingList rather than two operations.
>>> 
>>> waitForReferenceProcessing is really wait for pending reference(s) enqueued 
>>> (additionally cleaner gets invoked).  What do you think to rename this 
>>> method to “waitForPendingReferencesEnqueued”?
>>> 
>> 
>> I think the split is intentional. It's a clever way to avoid an otherwise 
>> inevitable race that could otherwise cause DBB allocating thread to fail 
>> with OOME prematurely.
>> 
> 
> Ah I see.  I initially read that pendingList doesn’t need to be set within 
> the synchronized block with processPendingActive flag but such race could 
> cause DBB to fail with OOME, as you clarified.
> 
> Then I suggest waitForReferencePendingList (and its JVM function) be renamed 
> to:
>   boolean awaitReferencePendingList();
> 
> a blocking version of "boolean hasReferencePendingList()”.  Minor difference 
> but I think it’s little clearer.

Regarding the name, e.g. "waitFor" vs "await", the meaning is the
same.  If there were a documented or defacto preference for "await" I
would feel obliged to rename, but the JDK seems to use both forms in
roughly comparable numbers.

Regarding the signature and description, recall that
waitForReferencePendingList() does not return a value, nor does it
need to.  So documenting it as a blocking version of
hasReferencePendingList() isn't correct.

> waitForReferenceProcessing - I’d like the javadoc to include more details or 
> @implNote for example: 
>  This method returns true to indicate that some pending references have been 
> enqueued.   If there are cleaners, cleaning actions are queued to be invoked 
> e.g. DirectByteBuffer may be deallocated.
> 
> This would help the readers to understand what this method is intended to do.

I was intentionally a little vague in the description; I don't want to
be promising too much, and having callers depend on behavior that we
might want to change. In particular, it was intentional that there's
no mention of Cleaner or any special behavior for them.

For example, we might later decide we don't want the reference
processing thread to incur the cost of notifying possible waiters and
the associated synchronization, and instead let "waiters" return
immediately when there is reference processing in progress, on the
assumption that the reference processing thread makes rapid progress.
In some respects that's probably more similar in effect to what the
old code did.

Also recall that this function is presently private, accessed by
nio.Bits code via SharedSecrets (and via reflection by one test).  It
might be appropriate to change the access to this function or provide
some similar non-private function in the future, at which point the
javadoc and possibly the semantics and implementation will need to be
carefully examined.




Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-17 Thread Peter Levart

Hi Kim,


On 08/15/2016 05:15 AM, Kim Barrett wrote:

>I have a feeling that these pauses are now unnecessary. Will try to check with 
some experiments…

I found that the DirectBufferAllocTest will sometimes fail if the pauses are 
taken out.
I think what’s going on is that the multiple threads are competing for 
resources, and
some threads in that test lose out if all of them are waiting and wake up at 
the same
time.  The exponentially increasing back-off scatters the threads enough for 
that to
become very unlikely, though with sufficiently bad luck… But I think the current
implementation could also fail that test with similarly bad luck.  It just 
requires*very*
bad luck, so we’re not seeing it as a problem.  And that test is a pretty 
extreme stress
test.



I get the same results when experimenting with this. Another option 
would be to enclose the whole logic of (retrying reservation while 
waiting for Reference processing progress followed by System.gc() and 
another round of reservation retries while waiting for Reference 
processing) into a synchronized block so that multiple reservation 
threads could not barge for reservations. This approach works and might 
even be triggering less System.gc() rounds, but it is not much better 
than current approach.


Regards, Peter



Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-14 Thread Kim Barrett
> On Aug 13, 2016, at 4:20 AM, Peter Levart  wrote:
> 
> Hi Mandy,
> 
> On 08/13/2016 01:55 AM, Mandy Chung wrote:
>>> On Aug 8, 2016, at 6:25 PM, Kim Barrett 
>>>  wrote:
>>> 
>>> full: 
>>> http://cr.openjdk.java.net/~kbarrett/8156500/jdk.04/
>>> 
>>>  
>>> http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04/
>> This looks very good.  
>> 
>> Have you considered having JVM_WaitForReferencePendingList method to return 
>> the pending list?  i.e. make it a blocking version of 
>> getAndClearReferencePendingList rather than two operations.
>> 
>> waitForReferenceProcessing is really wait for pending reference(s) enqueued 
>> (additionally cleaner gets invoked).  What do you think to rename this 
>> method to “waitForPendingReferencesEnqueued”?
>> 
> 
> I think the split is intentional. It's a clever way to avoid an otherwise 
> inevitable race that could otherwise cause DBB allocating thread to fail with 
> OOME prematurely.
> 
> waitForReferencePendingList() is invoked out of synchronized block so that it 
> does not prevent the progress of thread(s) invoking 
> waitForReferenceProcessing(), while getAndClearReferencePendingList is 
> non-blocking and is invoked inside the synchronized block that also sets a 
> boolean flag. Code in waitForReferenceProcessing() checks both the 
> non-emptiness of reference pending list and the boolean flag while holding 
> the lock.
> 
> The following sequence of actions (as used in NIO Bits):
> 
> System.gc();
> Reference.waitForReferenceProcessing();
> 
> ...therefore either:
> 
> - completes immediately after System.gc() returns without discovering any 
> pending Reference, returning false; or
> - completes after System.gc() discovers at least one pending Reference and 
> ReferenceHandler thread processes at least one Cleaner, returning true; or 
> enqueues all pending Reference(s), returning false.
> 
> When waitForReferenceProcessing() returns false consecutively even after a 
> series of exponentially increasing pauses, the DBB allocating thread(s) can 
> be sure they have exhausted all options to allocate the direct buffer and 
> must fail with OOME.

Thanks for answering Mandy’s questions for me.  That all looks right.

> I have a feeling that these pauses are now unnecessary. Will try to check 
> with some experiments…

I found that the DirectBufferAllocTest will sometimes fail if the pauses are 
taken out.
I think what’s going on is that the multiple threads are competing for 
resources, and
some threads in that test lose out if all of them are waiting and wake up at 
the same
time.  The exponentially increasing back-off scatters the threads enough for 
that to
become very unlikely, though with sufficiently bad luck… But I think the current
implementation could also fail that test with similarly bad luck.  It just 
requires *very*
bad luck, so we’re not seeing it as a problem.  And that test is a pretty 
extreme stress
test.

> 
> Regards, Peter
> 
>> Grammar question:
>> 
>> "there are no more references”
>> "If there aren't any pending {@link Reference}s"
>> 
>> - This is the case with zero pending reference, shouldn’t it use “is” such 
>> that:
>> 
>> "there us no more reference”
>> "If there isn't any pending {@link Reference}”
>> 
>> Please update the synposis of JDK-8156500 to make it clear what this fix is 
>> (it’s nothing related to JDI).  Maybe something like “Move the pending 
>> reference list to VM”
>> 
>> Mandy




Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-13 Thread Mandy Chung

> On Aug 13, 2016, at 1:20 AM, Peter Levart  wrote:
> 
> Hi Mandy,
> 
> On 08/13/2016 01:55 AM, Mandy Chung wrote:
>>> On Aug 8, 2016, at 6:25 PM, Kim Barrett 
>>>  wrote:
>>> 
>>> full: 
>>> http://cr.openjdk.java.net/~kbarrett/8156500/jdk.04/
>>> 
>>>  
>>> http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04/
>> This looks very good.  
>> 
>> Have you considered having JVM_WaitForReferencePendingList method to return 
>> the pending list?  i.e. make it a blocking version of 
>> getAndClearReferencePendingList rather than two operations.
>> 
>> waitForReferenceProcessing is really wait for pending reference(s) enqueued 
>> (additionally cleaner gets invoked).  What do you think to rename this 
>> method to “waitForPendingReferencesEnqueued”?
>> 
> 
> I think the split is intentional. It's a clever way to avoid an otherwise 
> inevitable race that could otherwise cause DBB allocating thread to fail with 
> OOME prematurely.
> 

Ah I see.  I initially read that pendingList doesn’t need to be set within the 
synchronized block with processPendingActive flag but such race could cause DBB 
to fail with OOME, as you clarified.

Then I suggest waitForReferencePendingList (and its JVM function) be renamed to:
   boolean awaitReferencePendingList();

a blocking version of "boolean hasReferencePendingList()”.  Minor difference 
but I think it’s little clearer.

waitForReferenceProcessing - I’d like the javadoc to include more details or 
@implNote for example: 
  This method returns true to indicate that some pending references have been 
enqueued.   If there are cleaners, cleaning actions are queued to be invoked 
e.g. DirectByteBuffer may be deallocated.

This would help the readers to understand what this method is intended to do.

Mandy

> waitForReferencePendingList() is invoked out of synchronized block so that it 
> does not prevent the progress of thread(s) invoking 
> waitForReferenceProcessing(), while getAndClearReferencePendingList is 
> non-blocking and is invoked inside the synchronized block that also sets a 
> boolean flag. Code in waitForReferenceProcessing() checks both the 
> non-emptiness of reference pending list and the boolean flag while holding 
> the lock.
> 
> The following sequence of actions (as used in NIO Bits):
> 
> System.gc();
> Reference.waitForReferenceProcessing();
> 
> ...therefore either:
> 
> - completes immediately after System.gc() returns without discovering any 
> pending Reference, returning false; or
> - completes after System.gc() discovers at least one pending Reference and 
> ReferenceHandler thread processes at least one Cleaner, returning true; or 
> enqueues all pending Reference(s), returning false.
> 
> When waitForReferenceProcessing() returns false consecutively even after a 
> series of exponentially increasing pauses, the DBB allocating thread(s) can 
> be sure they have exhausted all options to allocate the direct buffer and 
> must fail with OOME.
> 
> I have a feeling that these pauses are now unnecessary. Will try to check 
> with some experiments...
> 
> Regards, Peter
> 
>> Grammar question:
>> 
>> "there are no more references”
>> "If there aren't any pending {@link Reference}s"
>> 
>> - This is the case with zero pending reference, shouldn’t it use “is” such 
>> that:
>> 
>> "there us no more reference”
>> "If there isn't any pending {@link Reference}”
>> 
>> Please update the synposis of JDK-8156500 to make it clear what this fix is 
>> (it’s nothing related to JDI).  Maybe something like “Move the pending 
>> reference list to VM”
>> 
>> Mandy
>> 
> 



Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-13 Thread Peter Levart

Hi Mandy,


On 08/13/2016 01:55 AM, Mandy Chung wrote:

On Aug 8, 2016, at 6:25 PM, Kim Barrett  wrote:

full: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.04/
  http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04/

This looks very good.

Have you considered having JVM_WaitForReferencePendingList method to return the 
pending list?  i.e. make it a blocking version of 
getAndClearReferencePendingList rather than two operations.

waitForReferenceProcessing is really wait for pending reference(s) enqueued 
(additionally cleaner gets invoked).  What do you think to rename this method 
to “waitForPendingReferencesEnqueued”?


I think the split is intentional. It's a clever way to avoid an 
otherwise inevitable race that could otherwise cause DBB allocating 
thread to fail with OOME prematurely.


waitForReferencePendingList() is invoked out of synchronized block so 
that it does not prevent the progress of thread(s) invoking 
waitForReferenceProcessing(), while getAndClearReferencePendingList is 
non-blocking and is invoked inside the synchronized block that also sets 
a boolean flag. Code in waitForReferenceProcessing() checks both the 
non-emptiness of reference pending list and the boolean flag while 
holding the lock.


The following sequence of actions (as used in NIO Bits):

System.gc();
Reference.waitForReferenceProcessing();

...therefore either:

- completes immediately after System.gc() returns without discovering 
any pending Reference, returning false; or
- completes after System.gc() discovers at least one pending Reference 
and ReferenceHandler thread processes at least one Cleaner, returning 
true; or enqueues all pending Reference(s), returning false.


When waitForReferenceProcessing() returns false consecutively even after 
a series of exponentially increasing pauses, the DBB allocating 
thread(s) can be sure they have exhausted all options to allocate the 
direct buffer and must fail with OOME.


I have a feeling that these pauses are now unnecessary. Will try to 
check with some experiments...


Regards, Peter


Grammar question:

"there are no more references”
"If there aren't any pending {@link Reference}s"

- This is the case with zero pending reference, shouldn’t it use “is” such that:

"there us no more reference”
"If there isn't any pending {@link Reference}”

Please update the synposis of JDK-8156500 to make it clear what this fix is 
(it’s nothing related to JDI).  Maybe something like “Move the pending 
reference list to VM”

Mandy




Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-12 Thread Mandy Chung

> On Aug 8, 2016, at 6:25 PM, Kim Barrett  wrote:
> 
> full: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.04/
>  http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04/

This looks very good.  

Have you considered having JVM_WaitForReferencePendingList method to return the 
pending list?  i.e. make it a blocking version of 
getAndClearReferencePendingList rather than two operations.

waitForReferenceProcessing is really wait for pending reference(s) enqueued 
(additionally cleaner gets invoked).  What do you think to rename this method 
to “waitForPendingReferencesEnqueued”?

Grammar question:

"there are no more references”
"If there aren't any pending {@link Reference}s"

- This is the case with zero pending reference, shouldn’t it use “is” such that:

"there us no more reference”
"If there isn't any pending {@link Reference}”

Please update the synposis of JDK-8156500 to make it clear what this fix is 
(it’s nothing related to JDI).  Maybe something like “Move the pending 
reference list to VM”

Mandy

Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-09 Thread Kim Barrett
> On Aug 9, 2016, at 3:52 AM, Per Liden  wrote:
> 
> Hi Kim,
> 
> On 2016-08-09 03:25, Kim Barrett wrote:
>>> On Aug 8, 2016, at 8:16 AM, Per Liden  wrote:
>>> I have one suggestion though, regarding CheckReferencePendingList(). While 
>>> reviewing I found that I had to check several times what its return value 
>>> actually meant, the "check" part of the name doesn't quite reveal that.
>>> Further, it seems to me that the waiting path of this function has fairly 
>>> little in common with the non-waiting path, e.g. it always returns true. 
>>> So, to make both the naming and implementation more clear I'd like to 
>>> suggest that we split this into two separate functions, 
>>> HasReferencePendingList() and WaitForReferencePendingList(), like this:
>> 
>> I was thinking about splitting things way, and ended up not doing so
>> for no good reason I can think of. And it does seem clearer that way,
>> so...
>> 
>> New webrevs:
>> full: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.04/
>>  http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04/
>> incr: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.04.inc/
>>  http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04.inc/
> 
> Thanks for fixing, looks good.

Thanks!

> 
> cheers,
> Per
> 
>> 
>>> Other than this I think the patch looks good.



Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-09 Thread Per Liden

Hi Kim,

On 2016-08-09 03:25, Kim Barrett wrote:

On Aug 8, 2016, at 8:16 AM, Per Liden  wrote:

Hi Kim,

On 2016-08-01 20:47, Kim Barrett wrote:

This updated webrev addresses concerns about the performance of the VM
API used by the reference processor thread in the original webrev.

New webrevs:
full: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.03/
 http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.03/
incr: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.03.incr/
 http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.03.incr/


Overall I think you managed to hit a good balance here, keeping the VM API 
straight forward without loosing flexibility. Having the ReferenceHandler doing 
the actual work seems sound and avoids some complexity, which I like.


Thanks for looking at this.


I have one suggestion though, regarding CheckReferencePendingList(). While reviewing I 
found that I had to check several times what its return value actually meant, the 
"check" part of the name doesn't quite reveal that.
Further, it seems to me that the waiting path of this function has fairly 
little in common with the non-waiting path, e.g. it always returns true. So, to 
make both the naming and implementation more clear I'd like to suggest that we 
split this into two separate functions, HasReferencePendingList() and 
WaitForReferencePendingList(), like this:


I was thinking about splitting things way, and ended up not doing so
for no good reason I can think of. And it does seem clearer that way,
so...

New webrevs:
full: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.04/
  http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04/
incr: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.04.inc/
  http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04.inc/


Thanks for fixing, looks good.

cheers,
Per




Other than this I think the patch looks good.



The originally offered approach was an attempt to minimize changes.  I
was trying to be as similar to the existing code as possible, while
moving part of it into the VM, where we have the necessary control
over suspension and the like.  We already know we want to make changes
in this area, in order to eliminate the use of
jdk.internal.ref.Cleaner (JDK-8149925).  But we've also agreed that
JDK-8149925 wasn't urgent and to defer it to JDK 10.  I don't think we
should revisit that.


Is JDK-8149925 the correct bug id? Looks like that bug has already been fixed 
in 9.


That CR was originally for removing all the uses.  During review the
nio.Bits / direct buffer use was separated, and is the last one.
There was a bunch of discussion about removing that final one in that
review (especially later parts)
http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-February/038663.html
http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-March/039315.html
http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-April/039862.html

and also here:
http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-March/039561.html.

However, I don't see a followup CR for removing the nio.Bits use.
Maybe that didn't get created when we split the issue, or maybe I'm
just failing to find it.  Peter?

Note that the consensus back in March/April was to defer it, along
with the dependent actual removal of the internal Cleaner, to JDK 10.




Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-08 Thread Kim Barrett
> On Aug 8, 2016, at 8:16 AM, Per Liden  wrote:
> 
> Hi Kim,
> 
> On 2016-08-01 20:47, Kim Barrett wrote:
>> This updated webrev addresses concerns about the performance of the VM
>> API used by the reference processor thread in the original webrev.
>> 
>> New webrevs:
>> full: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.03/
>>  http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.03/
>> incr: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.03.incr/
>>  http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.03.incr/
> 
> Overall I think you managed to hit a good balance here, keeping the VM API 
> straight forward without loosing flexibility. Having the ReferenceHandler 
> doing the actual work seems sound and avoids some complexity, which I like.

Thanks for looking at this.

> I have one suggestion though, regarding CheckReferencePendingList(). While 
> reviewing I found that I had to check several times what its return value 
> actually meant, the "check" part of the name doesn't quite reveal that.
> Further, it seems to me that the waiting path of this function has fairly 
> little in common with the non-waiting path, e.g. it always returns true. So, 
> to make both the naming and implementation more clear I'd like to suggest 
> that we split this into two separate functions, HasReferencePendingList() and 
> WaitForReferencePendingList(), like this:

I was thinking about splitting things way, and ended up not doing so
for no good reason I can think of. And it does seem clearer that way,
so...

New webrevs:
full: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.04/
  http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04/
incr: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.04.inc/
  http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.04.inc/

> Other than this I think the patch looks good.
> 
>> 
>> The originally offered approach was an attempt to minimize changes.  I
>> was trying to be as similar to the existing code as possible, while
>> moving part of it into the VM, where we have the necessary control
>> over suspension and the like.  We already know we want to make changes
>> in this area, in order to eliminate the use of
>> jdk.internal.ref.Cleaner (JDK-8149925).  But we've also agreed that
>> JDK-8149925 wasn't urgent and to defer it to JDK 10.  I don't think we
>> should revisit that.
> 
> Is JDK-8149925 the correct bug id? Looks like that bug has already been fixed 
> in 9.

That CR was originally for removing all the uses.  During review the
nio.Bits / direct buffer use was separated, and is the last one.
There was a bunch of discussion about removing that final one in that
review (especially later parts)
http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-February/038663.html
http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-March/039315.html
http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-April/039862.html

and also here:
http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-March/039561.html.

However, I don't see a followup CR for removing the nio.Bits use.
Maybe that didn't get created when we split the issue, or maybe I'm
just failing to find it.  Peter?

Note that the consensus back in March/April was to defer it, along
with the dependent actual removal of the internal Cleaner, to JDK 10.




Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-08 Thread Per Liden

Hi Kim,

On 2016-08-01 20:47, Kim Barrett wrote:

This updated webrev addresses concerns about the performance of the VM
API used by the reference processor thread in the original webrev.

New webrevs:
full: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.03/
  http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.03/
incr: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.03.incr/
  http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.03.incr/


Overall I think you managed to hit a good balance here, keeping the VM 
API straight forward without loosing flexibility. Having the 
ReferenceHandler doing the actual work seems sound and avoids some 
complexity, which I like.


I have one suggestion though, regarding CheckReferencePendingList(). 
While reviewing I found that I had to check several times what its 
return value actually meant, the "check" part of the name doesn't quite 
reveal that. Further, it seems to me that the waiting path of this 
function has fairly little in common with the non-waiting path, e.g. it 
always returns true. So, to make both the naming and implementation more 
clear I'd like to suggest that we split this into two separate 
functions, HasReferencePendingList() and WaitForReferencePendingList(), 
like this:



JVM_ENTRY(jboolean, JVM_HasReferencePendingList(JNIEnv* env))
  JVMWrapper("JVM_HasReferencePendingList");
  MonitorLockerEx ml(Heap_lock);
  return Universe::has_reference_pending_list();
JVM_END

JVM_ENTRY(void, JVM_WaitForReferencePendingList(JNIEnv* env))
  JVMWrapper("JVM_WaitForReferencePendingList");
  MonitorLockerEx ml(Heap_lock);
  while (!Universe::has_reference_pending_list()) {
ml.wait();
  }
JVM_END


And of course, this would then also be reflected in j.l.r.Reference, 
which would have:


private static native boolean hasReferencePendingList();
private static native void waitForReferencePendingList();


Other than this I think the patch looks good.



The originally offered approach was an attempt to minimize changes.  I
was trying to be as similar to the existing code as possible, while
moving part of it into the VM, where we have the necessary control
over suspension and the like.  We already know we want to make changes
in this area, in order to eliminate the use of
jdk.internal.ref.Cleaner (JDK-8149925).  But we've also agreed that
JDK-8149925 wasn't urgent and to defer it to JDK 10.  I don't think we
should revisit that.


Is JDK-8149925 the correct bug id? Looks like that bug has already been 
fixed in 9.


cheers,
Per



As Peter pointed out, my original change introduces a small
performance regression on a microbenchmark for reference processing
throughput.  It also showed a small performance benefit on a different
microbenchmark (allocation rate for a stress test of direct byte
buffers), as noted in the original RFR.  I can reproduce both of
those.

I think reference processing thread throughput is the right measure to
use, so long as others don't become horrible.  Focusing on that, it's
better to just let the reference processing thread do the processing,
rather than slowing it down to allow for the rare case when there's
another thread that could possibly help.  This is especially true now
that Cleaner has such limited usage.

This leads to a different API for other threads; rather than
tryHandlePending that other threads can call to help and to examine
progress, we now have waitForReferenceProcessing.  The VM API is also
different; rather than popReferencePendingList to get or wait, we now
have getAndClearReferencePendingList and checkReferencePendingList,
the latter with an optional wait.

The splitting of the VM API allows us to avoid a narrow race condition
discussed by Peter in his prototypes.  Peter suggested this race was
ok because java.nio.Bits makes several retries.  However, those
retries are only done before throwing OOME.  There are no retries
before invoking GC, so this race could lead to unnecessary successive
GCs.

Doing all the processing on the reference processing thread eliminates
execution of Cleaners on application threads, though that's not nearly
as much an issue now that the use of Cleaner is so restricted.

We've also eliminated a pre-existing issue where a helper could report
no progress even though the reference processing thread (and other
helpers) had removed a pending reference, but not yet processed it.
This could result in the non-progressing helper invoking GC or
reporting failure, even though it might have succeeded if it had
waited for the other thread(s) to complete processing the reference(s)
being worked on.

I think the new waitForReferenceProcessing could be used to fix
JDK-6306116, though that isn't part of this change, and was not really
a motivating factor.

I don't know if the new waitForReferenceProcessing will be used by
whatever we eventually do for JDK-8149925, but I think the new VM API
will suffice for that.  That is, I think JDK-8149925 might require
changes to the core-libs API used 

Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-03 Thread Kim Barrett
> On Aug 3, 2016, at 3:41 AM, David Holmes  wrote:
> 
> On 2/08/2016 4:46 AM, Kim Barrett wrote:
>> This updated webrev addresses the concerns around initialization order
>> for Throwable and OutOfMemoryError.  It doesn't address the VM API
>> used by the reference processor thread; that will be in a later
>> webrev.
>> 
>> New webrevs:
>> full: http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.02/
>> incr: http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.02.incr/
>> 
>> The jdk tree is unchanged from webrev.01.
>> 
>> I've backed out the change to initialize_java_lang_classes.  (To
>> verify this, look at the runtime/thread.cpp in the full webrev.)
>> 
>> Instead, Universe::gen_out_of_memory_error now also requires Throwable
>> to be initialized before it will attempt to fill in the stack trace.
>> This means that VM generated OOMEs occurring early in initialization
>> won't have a stack trace.
> 
> This change looks good to me.

Thanks!



Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-03 Thread David Holmes

On 2/08/2016 4:46 AM, Kim Barrett wrote:

This updated webrev addresses the concerns around initialization order
for Throwable and OutOfMemoryError.  It doesn't address the VM API
used by the reference processor thread; that will be in a later
webrev.

New webrevs:
full: http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.02/
incr: http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.02.incr/

The jdk tree is unchanged from webrev.01.

I've backed out the change to initialize_java_lang_classes.  (To
verify this, look at the runtime/thread.cpp in the full webrev.)

Instead, Universe::gen_out_of_memory_error now also requires Throwable
to be initialized before it will attempt to fill in the stack trace.
This means that VM generated OOMEs occurring early in initialization
won't have a stack trace.


This change looks good to me.

Thanks,
David


An alternative that I considered was to remove the assert at the end
of fill_in_stack_trace_of_preallocated_backtrace that requires
java_lang_Throwable;:unassigned_stacktrace() to return non-NULL.  I
did some testing that seemed to work and provide stacktraces, relying
on code in the Throwable class that special cases this "out of
protocol" state.  However, having read some of the history around that
(JDK-6998871, JDK-7045138, JDK-7046490), that special casing seems to
have been intended to be temporary, and expected to be removed.
(Though I didn't find an RFE for doing so, and I'm not doing so here.)
Even if we decided it wasn't going to be removed, this approach seems
somewhat fragile.

[Note to Coleen: This is the reverse of what you and I talked about a
few days ago; I hadn't reviewed the history carefully then.]



Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-02 Thread Kim Barrett
> On Aug 1, 2016, at 2:47 PM, Kim Barrett  wrote:
> 
> This updated webrev addresses concerns about the performance of the VM
> API used by the reference processor thread in the original webrev.
> 
> New webrevs:
> full: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.03/
>  http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.03/
> incr: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.03.incr/
>  http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.03.incr/

Sorry, typo in the incremental webrev URLs — “incr” => “inc”.



Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-02 Thread Kim Barrett
> On Aug 2, 2016, at 8:55 AM, Peter Levart  wrote:
> 
> Hi Kim,
> 
> This looks very good. I like the way you dealt with race between the 
> ReferenceHandler thread and threads waiting for it to do some cleanup 
> progress. I think the VM API is suitable for possible further development on 
> JDK-8149925. It's also nice that the whole pending list is obtained in one 
> native call, so further optimizations on Java side are possible.
> 
> Regards, Peter

Thanks!

> 
> On 08/01/2016 08:47 PM, Kim Barrett wrote:
>> This updated webrev addresses concerns about the performance of the VM
>> API used by the reference processor thread in the original webrev.
>> 
>> New webrevs:
>> full: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.03/
>>   http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.03/
>> incr: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.03.incr/
>>   http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.03.incr/
[snip]



Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-02 Thread Peter Levart

Hi Kim,

This looks very good. I like the way you dealt with race between the 
ReferenceHandler thread and threads waiting for it to do some cleanup 
progress. I think the VM API is suitable for possible further 
development on JDK-8149925. It's also nice that the whole pending list 
is obtained in one native call, so further optimizations on Java side 
are possible.


Regards, Peter

On 08/01/2016 08:47 PM, Kim Barrett wrote:

This updated webrev addresses concerns about the performance of the VM
API used by the reference processor thread in the original webrev.

New webrevs:
full: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.03/
   http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.03/
incr: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.03.incr/
   http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.03.incr/

The originally offered approach was an attempt to minimize changes.  I
was trying to be as similar to the existing code as possible, while
moving part of it into the VM, where we have the necessary control
over suspension and the like.  We already know we want to make changes
in this area, in order to eliminate the use of
jdk.internal.ref.Cleaner (JDK-8149925).  But we've also agreed that
JDK-8149925 wasn't urgent and to defer it to JDK 10.  I don't think we
should revisit that.

As Peter pointed out, my original change introduces a small
performance regression on a microbenchmark for reference processing
throughput.  It also showed a small performance benefit on a different
microbenchmark (allocation rate for a stress test of direct byte
buffers), as noted in the original RFR.  I can reproduce both of
those.

I think reference processing thread throughput is the right measure to
use, so long as others don't become horrible.  Focusing on that, it's
better to just let the reference processing thread do the processing,
rather than slowing it down to allow for the rare case when there's
another thread that could possibly help.  This is especially true now
that Cleaner has such limited usage.

This leads to a different API for other threads; rather than
tryHandlePending that other threads can call to help and to examine
progress, we now have waitForReferenceProcessing.  The VM API is also
different; rather than popReferencePendingList to get or wait, we now
have getAndClearReferencePendingList and checkReferencePendingList,
the latter with an optional wait.

The splitting of the VM API allows us to avoid a narrow race condition
discussed by Peter in his prototypes.  Peter suggested this race was
ok because java.nio.Bits makes several retries.  However, those
retries are only done before throwing OOME.  There are no retries
before invoking GC, so this race could lead to unnecessary successive
GCs.

Doing all the processing on the reference processing thread eliminates
execution of Cleaners on application threads, though that's not nearly
as much an issue now that the use of Cleaner is so restricted.

We've also eliminated a pre-existing issue where a helper could report
no progress even though the reference processing thread (and other
helpers) had removed a pending reference, but not yet processed it.
This could result in the non-progressing helper invoking GC or
reporting failure, even though it might have succeeded if it had
waited for the other thread(s) to complete processing the reference(s)
being worked on.

I think the new waitForReferenceProcessing could be used to fix
JDK-6306116, though that isn't part of this change, and was not really
a motivating factor.

I don't know if the new waitForReferenceProcessing will be used by
whatever we eventually do for JDK-8149925, but I think the new VM API
will suffice for that.  That is, I think JDK-8149925 might require
changes to the core-libs API used by nio.Bits, and won't require
further VM API changes.







Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-01 Thread Kim Barrett
This updated webrev addresses concerns about the performance of the VM
API used by the reference processor thread in the original webrev.

New webrevs:
full: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.03/
  http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.03/
incr: http://cr.openjdk.java.net/~kbarrett/8156500/jdk.03.incr/
  http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.03.incr/

The originally offered approach was an attempt to minimize changes.  I
was trying to be as similar to the existing code as possible, while
moving part of it into the VM, where we have the necessary control
over suspension and the like.  We already know we want to make changes
in this area, in order to eliminate the use of
jdk.internal.ref.Cleaner (JDK-8149925).  But we've also agreed that
JDK-8149925 wasn't urgent and to defer it to JDK 10.  I don't think we
should revisit that.

As Peter pointed out, my original change introduces a small
performance regression on a microbenchmark for reference processing
throughput.  It also showed a small performance benefit on a different
microbenchmark (allocation rate for a stress test of direct byte
buffers), as noted in the original RFR.  I can reproduce both of
those.

I think reference processing thread throughput is the right measure to
use, so long as others don't become horrible.  Focusing on that, it's
better to just let the reference processing thread do the processing,
rather than slowing it down to allow for the rare case when there's
another thread that could possibly help.  This is especially true now
that Cleaner has such limited usage.

This leads to a different API for other threads; rather than
tryHandlePending that other threads can call to help and to examine
progress, we now have waitForReferenceProcessing.  The VM API is also
different; rather than popReferencePendingList to get or wait, we now
have getAndClearReferencePendingList and checkReferencePendingList,
the latter with an optional wait.

The splitting of the VM API allows us to avoid a narrow race condition
discussed by Peter in his prototypes.  Peter suggested this race was
ok because java.nio.Bits makes several retries.  However, those
retries are only done before throwing OOME.  There are no retries
before invoking GC, so this race could lead to unnecessary successive
GCs.

Doing all the processing on the reference processing thread eliminates
execution of Cleaners on application threads, though that's not nearly
as much an issue now that the use of Cleaner is so restricted.

We've also eliminated a pre-existing issue where a helper could report
no progress even though the reference processing thread (and other
helpers) had removed a pending reference, but not yet processed it.
This could result in the non-progressing helper invoking GC or
reporting failure, even though it might have succeeded if it had
waited for the other thread(s) to complete processing the reference(s)
being worked on.

I think the new waitForReferenceProcessing could be used to fix
JDK-6306116, though that isn't part of this change, and was not really
a motivating factor.

I don't know if the new waitForReferenceProcessing will be used by
whatever we eventually do for JDK-8149925, but I think the new VM API
will suffice for that.  That is, I think JDK-8149925 might require
changes to the core-libs API used by nio.Bits, and won't require
further VM API changes.





Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-08-01 Thread Kim Barrett
This updated webrev addresses the concerns around initialization order
for Throwable and OutOfMemoryError.  It doesn't address the VM API
used by the reference processor thread; that will be in a later
webrev.

New webrevs:
full: http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.02/
incr: http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.02.incr/

The jdk tree is unchanged from webrev.01.

I've backed out the change to initialize_java_lang_classes.  (To
verify this, look at the runtime/thread.cpp in the full webrev.)

Instead, Universe::gen_out_of_memory_error now also requires Throwable
to be initialized before it will attempt to fill in the stack trace.
This means that VM generated OOMEs occurring early in initialization
won't have a stack trace.

An alternative that I considered was to remove the assert at the end
of fill_in_stack_trace_of_preallocated_backtrace that requires
java_lang_Throwable;:unassigned_stacktrace() to return non-NULL.  I
did some testing that seemed to work and provide stacktraces, relying
on code in the Throwable class that special cases this "out of
protocol" state.  However, having read some of the history around that
(JDK-6998871, JDK-7045138, JDK-7046490), that special casing seems to
have been intended to be temporary, and expected to be removed.
(Though I didn't find an RFE for doing so, and I'm not doing so here.)
Even if we decided it wasn't going to be removed, this approach seems
somewhat fragile.

[Note to Coleen: This is the reverse of what you and I talked about a
few days ago; I hadn't reviewed the history carefully then.]



Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-06-29 Thread Mandy Chung

> On Jun 29, 2016, at 6:42 PM, David Holmes  wrote:
> 
> The earlier you initialize Throwable the earlier you can try to create an 
> exception that has a backtrace. But there will always be a region of code 
> where we can't throw an OOME with a backtrace because of the missing 
> initialization of Throwable. So we can narrow that window by moving the 
> initialization of Throwable (which in turn requires a whole bunch of 
> collection classes - so the window has a fixed minimum size [ unless we do 
> some creative restructuring of Throwable's static initialization]). My 
> argument, which I think I've now convinced Kim of, is that we shouldn't be 
> trying to throw an OOME with a stacktrace if Throwable has not been 
> initialized - and that is where the change to gen_out_of_memory_error comes 
> in. It is actually passed the pre-allocated OOME that has no backtrace and 
> will never have a backtrace, and that is what should be thrown when Throwable 
> has not been initialized.

Thanks for the clarification.  It makes sense and the pre-allocated OOME with 
no backtrace should be used for this early OOM scenario.

Mandy

Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-06-29 Thread Mandy Chung

> On Jun 28, 2016, at 10:19 PM, David Holmes  wrote:
> 
> So here is what I see has happened.
> 
> Looking back at 9-b01, before we forced the initialization of 
> InterruptedException and thus Throwable we find:
> 
> 58 Initializing 'java/lang/Throwable' (0x00082990)
> 
> So Kim is right this was working by accident. It just seems that back then 
> there was less memory required by the initialization of all the collection 
> and other classes and so we didn't run into this problem.
> 
> Post the InterruptedException change the initialization order made it 
> unlikely an OOME could be encountered before Throwable was initialized (and 
> after we have reached a point where we can throw without the VM crashing or 
> instead doing a vm_exit_during_initialization).
> 
> Post modules those collection classes, and others, are now done earlier again 
> and before Throwable. And presumably their memory demands have increased.
> 

This is the new primordial class added by modules that causes additional 
classes to be loaded early before initPhase1.

  // The VM creates objects of this class.
  initialize_class(vmSymbols::java_lang_reflect_Module(), CHECK);

> Although we preallocate the OutOfMemoryError instances, and avoid executing 
> any java code to do so, we can't necessarily** "throw" them until after 
> Throwable is initialized. We now have a lot more initialization occurring 
> before we init Throwable and so OOME is more likely and so it will fail as 
> Kim observed.


Would initializing java.lang.Throwable after java.lang.reflect.Module address 
this issue?  I don’t think I fully follow the problem Kim observed and below.

> 
> ** I say necessarily because I still believe it is the fact we attempt to 
> fill in the stacktrace that leads to the dependency on Throwable being 
> initialized, and we should be able to avoid that if we check the VM 
> initialization state in gen_out_of_memory_error().

Mandy

Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-06-29 Thread David Holmes

On 30/06/2016 5:12 AM, Kim Barrett wrote:

On Jun 29, 2016, at 2:54 PM, Kim Barrett  wrote:

I think the problem may have existed long before modules, but simply
wasn't noticed.  That is, I suspect trying the test case mentioned
below with a pre-June-2014 fastdebug build (when Reference started
initializing InterruptedException) might trip the same assert.  I'm
presently trying to obtain a sufficiently old fastdebug build to run
the experiment.


Nope, works fine with JDK 9 b01.


See my response to Mandy above.

David
-


./jdk1.9.0/fastdebug/bin/java -XX:AutoBoxCacheMax=2147483647 -version
Error occurred during initialization of VM
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
at java.lang.Integer$IntegerCache.(Integer.java:799)
at java.lang.Integer.valueOf(Integer.java:827)
at sun.misc.Signal.handle(Signal.java:169)
at java.lang.Terminator.setup(Terminator.java:60)
at java.lang.System.initializeSystemClass(System.java:1197)



Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-06-29 Thread Kim Barrett
> On Jun 29, 2016, at 2:54 PM, Kim Barrett  wrote:
> 
> I think the problem may have existed long before modules, but simply
> wasn't noticed.  That is, I suspect trying the test case mentioned
> below with a pre-June-2014 fastdebug build (when Reference started
> initializing InterruptedException) might trip the same assert.  I'm
> presently trying to obtain a sufficiently old fastdebug build to run
> the experiment.

Nope, works fine with JDK 9 b01.

./jdk1.9.0/fastdebug/bin/java -XX:AutoBoxCacheMax=2147483647 -version
Error occurred during initialization of VM
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
at java.lang.Integer$IntegerCache.(Integer.java:799)
at java.lang.Integer.valueOf(Integer.java:827)
at sun.misc.Signal.handle(Signal.java:169)
at java.lang.Terminator.setup(Terminator.java:60)
at java.lang.System.initializeSystemClass(System.java:1197)



Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-06-29 Thread Kim Barrett
> On Jun 28, 2016, at 7:23 PM, David Holmes  wrote:
> 
> On 29/06/2016 8:43 AM, Kim Barrett wrote:
>> Updated webrevs:
>> 
>> Full:
>> http://cr.openjdk.java.net/~kbarrett/8156500/jdk.01/
>> http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.01/
>> 
>> Incremental:
>> http://cr.openjdk.java.net/~kbarrett/8156500/jdk.01.inc/
> 
> Did Reference not work? Just curious. I used to be a qualified Java 
> programmer back in the Java 5 era, but wildcards were always a bit iffy :)

It did not, or at least not without more contortions than I wanted to
make.  The old code used , so I decided to fall back to that.

Specifically, the type parameter in the type of "ref" needs to be
compatible with the type parameter in the type of "q", and that seems
to be messy to make happen when wildcards are involved.  Having popXXX
return Reference and splitting most of the body of
tryHandlePending into a new parameterized helper

   boolean tryHandlePendingAux(Reference ref, boolean should_wait)

did seem to do the trick, but also seemed like it was trying too hard...

>> http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.01.inc/
>> 
>> Still investigating the initialization order for core exceptions.
> 
> I suspect it is as Coleen indicated that the module changes introduced the 
> new failure path that you hit. I did a quick check of the initialization 
> order in an old b50:
> 
> 33 Initializing 'java/lang/Throwable' (0x001780002990)
> ...
> 46 Initializing 'java/lang/Exception'(no method) (0x001780003158)
> 47 Initializing 'java/lang/InterruptedException'(no method) 
> (0x0017800178a8)
> 
> Compare that with a current build:
> 
> 60 Initializing 'java/lang/ref/Reference$ReferenceHandler' 
> (0x00080001e3f0)
> 61 Initializing 'java/lang/Throwable' (0x000829f8)
> 62 Initializing 'java/lang/Exception'(no method) (0x000831a8)
> 63 Initializing 'java/lang/InterruptedException'(no method) 
> (0x00080001e6e0)
> 64 Initializing 'java/lang/ref/PhantomReference'(no method) 
> (0x00086440)
> 
> Initialization of Throwable is much, much later (large parts of java.lang and 
> java.util are now initialized first!) and is obviously a direct consequence 
> of preinitializing InterruptedException.
> 
> So I would say that the module change did break this and that initialization 
> of Throwable (only) should be restored to a much higher place in the 
> initialization sequence.

I think the problem may have existed long before modules, but simply
wasn't noticed.  That is, I suspect trying the test case mentioned
below with a pre-June-2014 fastdebug build (when Reference started
initializing InterruptedException) might trip the same assert.  I'm
presently trying to obtain a sufficiently old fastdebug build to run
the experiment.

The thing that has changed is that we now have a test (ol' reliable
tickler of bugs TestOptionsWithRanges) that tries all kinds of wacky
configurations that had never been seen before.

It tripped the init order problem by trying -XX:AutoBoxCacheMax with a
value of jint_max, which led to an attempt to allocate an array whose
size exceeds the maximum allowed, which is reported as OOME with a
specialized message, which led to the assertion failure in
java_lang_Throwable::fill_in_stack_trace_of_preallocated_backtrace
because java_lang_Throwable::unassigned_stacktrace() == NULL.

Note that with a smaller -XX:AutoBoxCacheMax value of 1G we instead
get a VM shutdown during initialization, because we are trying to GC
before that's allowed, rather than trying to make an array with an
impossible size.

Aside: Reporting an attempt to allocate a mere 2G element objarray
using an OOME seems a bit quaint these days. :-)



Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-06-29 Thread Mandy Chung

> On Jun 28, 2016, at 3:43 PM, Kim Barrett  wrote:
> 
> Updated webrevs:
> 
> Full:
> http://cr.openjdk.java.net/~kbarrett/8156500/jdk.01/
> http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.01/
> 

The VM change does simplify the implementation a lot.  Does/should the 
reference processing and enqueue them in the pending list at a safepoint? Or 
does the enqueuing happen while other Java threads are running except that it 
holds Heap_lock?

My main concern is that the ReferenceHandler thread now has to make a VM upcall 
to dequeue a reference object for every iteration.  I believe this is what the 
original implementation tried to avoid.  Would it be feasible for 
popReferencePendingList to be intrinsified?  If it is infeasible, we should 
think through other alternatives for example like Peter’s suggestion  to get 
the entire pending list at a time.

Comments on Reference.java

 143 /* Atomically pop the next pending reference.  If should_wait is
 144  * true and there are no pending references, block until the
 145  * pending reference list becomes non-empty.  The GC adds
 146  * references to the list, and notifies waiting threads when
 147  * references are added.
 148  */
 149 private static native Reference 
popReferencePendingList(boolean should_wait);

The parameter name “should_wait” - I would avoid using the word “should” and 
`_`.  What about “await”?

The spec needs to specify:

What happens if the thread calling this method is interrupted?  What does it 
return?  Does it throw InterruptedException?  What happens to the thread’s 
interrupted status?

This method is significant and can you add @param, @return, @throws to make it 
a proper javadoc.   

This change is small but the implication is significant.  JDK-7103238 would be 
more appropriate for this change than JDK-8156400 and its synopsis that does 
not highlight what the changeset is about.

 
> Still investigating the initialization order for core exceptions.
> 

This would be good to understand what’s going on there.

Mandy

Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-06-29 Thread Peter Levart

Hi again,

On 06/29/2016 03:30 PM, Peter Levart wrote:
Just to show what I mean, here's a simple optimization that doesn't 
use a private pendingList of references shared among callers of 
tryHandlePanding(true/false), but instead uses course-grained 
synchronization and waiting for tryHandlePanding(false) callers while 
ReferenceHandler is privately processing the whole list of pending 
references:


http://cr.openjdk.java.net/~plevart/misc/PendingReferenceHandling/webrev.02/ 




...and now, on top of above, for some more involved optimization that is 
most effective when a chunk of pending references is discovered where 
majority are destined for the same ReferenceQueue (as in our benchmark):


http://cr.openjdk.java.net/~plevart/misc/PendingReferenceHandling/webrev.03/

The results of benchmark:

Original JDK 9:

Benchmark(refCount)  Mode  Cnt 
Score  Error  Units
ReferenceEnqueueBench.dequeueReferences  10ss  100 38410.515 
± 1011.769  us/op


Patched (by Kim):

Benchmark(refCount)  Mode  Cnt 
Score  Error  Units
ReferenceEnqueueBench.dequeueReferences  10ss  100 42197.522 
± 1161.451  us/op


Proposed (by Peter, webrev):

Benchmark(refCount)  Mode  Cnt 
Score  Error  Units
ReferenceEnqueueBench.dequeueReferences  10ss  100 34134.977 
± 1274.753  us/op


Proposed (by Peter, webrev.02):

Benchmark(refCount)  Mode  Cnt 
Score  Error  Units
ReferenceEnqueueBench.dequeueReferences  10ss  100 27935.929 
± 1128.678  us/op


Proposed (by Peter, webrev.03):

Benchmark(refCount)  Mode  Cnt Score
Error  Units
ReferenceEnqueueBench.dequeueReferences  10ss  100 6603.009 
± 59.051  us/op



That's already 6x faster than baseline!

I'm not suggesting that any of these optimizations is applied. I just 
wanted to show that a JNI call that allows retrieving the whole pending 
list has potential to allow such optimizations.


Regards, Peter



Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-06-29 Thread Peter Levart

Hi Kim,


On 06/29/2016 01:22 PM, Peter Levart wrote:
Transfering the whole list in one JNI invocation has the potential for 
further optimizations on the Java side (like handling the whole popped 
list privately without additional synchronization - if we ever find a 
way for java.nio.Bits to wait for it reliably - or even enqueue-ing a 
chunk of consecutive references destined for the same queue using a 
single synchronized action on the queue, etc...) 


Just to show what I mean, here's a simple optimization that doesn't use 
a private pendingList of references shared among callers of 
tryHandlePanding(true/false), but instead uses course-grained 
synchronization and waiting for tryHandlePanding(false) callers while 
ReferenceHandler is privately processing the whole list of pending 
references:


http://cr.openjdk.java.net/~plevart/misc/PendingReferenceHandling/webrev.02/

This further improves benchmark results and it still passes the 
DirectBufferAllocTest:


Original JDK 9:

Benchmark(refCount)  Mode  Cnt 
Score  Error  Units
ReferenceEnqueueBench.dequeueReferences  10ss  100 38410.515 
± 1011.769  us/op


Patched (by Kim):

Benchmark(refCount)  Mode  Cnt 
Score  Error  Units
ReferenceEnqueueBench.dequeueReferences  10ss  100 42197.522 
± 1161.451  us/op


Proposed (by Peter, webrev):

Benchmark(refCount)  Mode  Cnt 
Score  Error  Units
ReferenceEnqueueBench.dequeueReferences  10ss  100 34134.977 
± 1274.753  us/op


Proposed (by Peter, webrev.02):

Benchmark(refCount)  Mode  Cnt 
Score  Error  Units
ReferenceEnqueueBench.dequeueReferences  10ss  100 27935.929 
± 1128.678  us/op



Regards, Peter



Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-06-29 Thread Peter Levart

Hi Kim,

Let me chime-in although it's a bit late...

I think this is a good change to finally get rid of OOME problems in 
this area.


On 06/28/2016 07:45 PM, Kim Barrett wrote:

On Jun 28, 2016, at 5:33 AM, Per Liden  wrote:
Patch looks good. The only thing I don't feel qualified to review is the 
initialization order change in thread.cpp, so I'll let others comments on that.

Thanks.  I’ll be following up on that area.


I like the pop-one-reference-at-a-time semantics, which simplifies things a lot 
and keeps the interface nice and clean. I was previously afraid that it might 
cause a noticeable performance degradation compared to lifting the whole list 
into Java in one go, but your testing seem to prove that's not the case.

I was concerned about that too, and had tried a different approach that also still 
supported the existing "some callers wait and others don’t" API, but it was a 
bit messy.  Coleen convinced me to try this (since it was easy) and do the measurement, 
and it worked out well.



The repeated JNI invocations do not present a big overhead, but it is 
measurable. For example, the following benchmark measures the time it 
takes after a GC cycle for a bunch of references to be transfered to 
Java side, enqueued into ReferenceQueue and dequeued from it:


http://cr.openjdk.java.net/~plevart/misc/PendingReferenceHandling/ReferenceEnqueueBench.java

The results on my i7-4771 PC:

Original JDK 9:

Benchmark(refCount)  Mode  Cnt 
Score  Error  Units
ReferenceEnqueueBench.dequeueReferences  10ss  100 38410.515 
± 1011.769  us/op


Patched (by Kim):

Benchmark(refCount)  Mode  Cnt 
Score  Error  Units
ReferenceEnqueueBench.dequeueReferences  10ss  100 42197.522 
± 1161.451  us/op



So, about 10% worse for this benchmark.

Transfering the whole list in one JNI invocation has the potential for 
further optimizations on the Java side (like handling the whole popped 
list privately without additional synchronization - if we ever find a 
way for java.nio.Bits to wait for it reliably - or even enqueue-ing a 
chunk of consecutive references destined for the same queue using a 
single synchronized action on the queue, etc...)


If the JNI API was something like the following:

/* Atomically pop the pending reference list if wholeList is true,
 * or just next pending reference if wholeList is false.
 * If wait is true and the pending reference list is empty, blocks 
until

 * it becomes non-empty, or returns null if wait is false.
 */
private static native Reference popReferencePendingList(boolean 
wholeList, boolean wait);



Then not too much complication is needed in Reference.java (diff to 
current JDK 9 sources) to consume this API and already have some benefit 
from it:


http://cr.openjdk.java.net/~plevart/misc/PendingReferenceHandling/webrev/

There is a possible race here between ReferenceHandler calling 
tryHandlePending(true) and java.nio.Bits calling tryHandlePending(false) 
that can make the method return false to java.nio.Bits when 
ReferenceHandler has just popped the whole list and not yet installed it 
in the private pendingList field, but java.nio.Bits makes several 
retries in this case with exponentially increasing delays, so that does 
not currently present a problem. 
java/nio/Buffer/DirectBufferAllocTest.java test passes with this change.


And the above benchmark shows improvement instead of regression:

Proposed (my Peter):

Benchmark(refCount)  Mode  Cnt 
Score  Error  Units
ReferenceEnqueueBench.dequeueReferences  10ss  100 34134.977 
± 1274.753  us/op



So what do you think? Is it worth the little additional logic on the 
Java side?



Regards, Peter



Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-06-28 Thread David Holmes

Hi Mandy,

On 29/06/2016 12:28 PM, Mandy Chung wrote:



On Jun 28, 2016, at 4:23 PM, David Holmes  wrote:

On 29/06/2016 8:43 AM, Kim Barrett wrote:

Updated webrevs:

Full:
http://cr.openjdk.java.net/~kbarrett/8156500/jdk.01/
http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.01/

Incremental:
http://cr.openjdk.java.net/~kbarrett/8156500/jdk.01.inc/


Did Reference not work? Just curious. I used to be a qualified Java 
programmer back in the Java 5 era, but wildcards were always a bit iffy :)



A related post in compiler-dev w.r.t. Reference:
  http://mail.openjdk.java.net/pipermail/compiler-dev/2014-January/008457.html

Looks like no reply on that thread though.




http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.01.inc/

Still investigating the initialization order for core exceptions.


I suspect it is as Coleen indicated that the module changes introduced the new 
failure path that you hit. I did a quick check of the initialization order in 
an old b50:

33 Initializing 'java/lang/Throwable' (0x001780002990)
...
46 Initializing 'java/lang/Exception'(no method) (0x001780003158)
47 Initializing 'java/lang/InterruptedException'(no method) (0x0017800178a8)

Compare that with a current build:

60 Initializing 'java/lang/ref/Reference$ReferenceHandler' (0x00080001e3f0)
61 Initializing 'java/lang/Throwable' (0x000829f8)
62 Initializing 'java/lang/Exception'(no method) (0x000831a8)
63 Initializing 'java/lang/InterruptedException'(no method) (0x00080001e6e0)
64 Initializing 'java/lang/ref/PhantomReference'(no method) (0x00086440)



How do you get this list?


For current builds: java -Xlog:class+init -version (and I trimmed the 
output)


For slightly older builds: java -Xlog:classinit -version

For pre UL you need a fastdebug build and: java 
-XX:+TraceClassInitialization -version





Initialization of Throwable is much, much later (large parts of java.lang and 
java.util are now initialized first!) and is obviously a direct consequence of 
preinitializing InterruptedException.



Do you mind trying b110 before JEP 261 was integrated in jdk-9+111 and see any 
difference?


Here's fragment from b109:

31 Initializing 'java/lang/ref/Reference' (0x00085a08)
32 Initializing 'java/lang/ref/Reference$Lock'(no method) 
(0x0008000181c0)
33 Initializing 'java/lang/ref/Reference$ReferenceHandler' 
(0x0008000183b8)

34 Initializing 'java/lang/Throwable' (0x000828f0)


initPhase1 is expected to be called very early in the VM initialization after 
the primordial classes are loaded.

IIRC, the exception classes are preallocated to improve diagnosability in the 
case when we run out of memory VM can still throw a preallocated instance with 
an preallocated stack trace buffer.  I don’t think these exception classes are 
expected to be loaded initPhase1.


So here is what I see has happened.

Looking back at 9-b01, before we forced the initialization of 
InterruptedException and thus Throwable we find:


58 Initializing 'java/lang/Throwable' (0x00082990)

So Kim is right this was working by accident. It just seems that back 
then there was less memory required by the initialization of all the 
collection and other classes and so we didn't run into this problem.


Post the InterruptedException change the initialization order made it 
unlikely an OOME could be encountered before Throwable was initialized 
(and after we have reached a point where we can throw without the VM 
crashing or instead doing a vm_exit_during_initialization).


Post modules those collection classes, and others, are now done earlier 
again and before Throwable. And presumably their memory demands have 
increased.


Although we preallocate the OutOfMemoryError instances, and avoid 
executing any java code to do so, we can't necessarily** "throw" them 
until after Throwable is initialized. We now have a lot more 
initialization occurring before we init Throwable and so OOME is more 
likely and so it will fail as Kim observed.


** I say necessarily because I still believe it is the fact we attempt 
to fill in the stacktrace that leads to the dependency on Throwable 
being initialized, and we should be able to avoid that if we check the 
VM initialization state in gen_out_of_memory_error().


Thanks,
David


More to dig here.

Mandy


So I would say that the module change did break this and that initialization of 
Throwable (only) should be restored to a much higher place in the 
initialization sequence.

Thanks,
David





Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-06-28 Thread Mandy Chung

> On Jun 28, 2016, at 4:23 PM, David Holmes  wrote:
> 
> On 29/06/2016 8:43 AM, Kim Barrett wrote:
>> Updated webrevs:
>> 
>> Full:
>> http://cr.openjdk.java.net/~kbarrett/8156500/jdk.01/
>> http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.01/
>> 
>> Incremental:
>> http://cr.openjdk.java.net/~kbarrett/8156500/jdk.01.inc/
> 
> Did Reference not work? Just curious. I used to be a qualified Java 
> programmer back in the Java 5 era, but wildcards were always a bit iffy :)


A related post in compiler-dev w.r.t. Reference:
  http://mail.openjdk.java.net/pipermail/compiler-dev/2014-January/008457.html

Looks like no reply on that thread though.

> 
>> http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.01.inc/
>> 
>> Still investigating the initialization order for core exceptions.
> 
> I suspect it is as Coleen indicated that the module changes introduced the 
> new failure path that you hit. I did a quick check of the initialization 
> order in an old b50:
> 
> 33 Initializing 'java/lang/Throwable' (0x001780002990)
> ...
> 46 Initializing 'java/lang/Exception'(no method) (0x001780003158)
> 47 Initializing 'java/lang/InterruptedException'(no method) 
> (0x0017800178a8)
> 
> Compare that with a current build:
> 
> 60 Initializing 'java/lang/ref/Reference$ReferenceHandler' 
> (0x00080001e3f0)
> 61 Initializing 'java/lang/Throwable' (0x000829f8)
> 62 Initializing 'java/lang/Exception'(no method) (0x000831a8)
> 63 Initializing 'java/lang/InterruptedException'(no method) 
> (0x00080001e6e0)
> 64 Initializing 'java/lang/ref/PhantomReference'(no method) 
> (0x00086440)
> 

How do you get this list?

> Initialization of Throwable is much, much later (large parts of java.lang and 
> java.util are now initialized first!) and is obviously a direct consequence 
> of preinitializing InterruptedException.
> 

Do you mind trying b110 before JEP 261 was integrated in jdk-9+111 and see any 
difference?

initPhase1 is expected to be called very early in the VM initialization after 
the primordial classes are loaded.

IIRC, the exception classes are preallocated to improve diagnosability in the 
case when we run out of memory VM can still throw a preallocated instance with 
an preallocated stack trace buffer.  I don’t think these exception classes are 
expected to be loaded initPhase1.

More to dig here.

Mandy

> So I would say that the module change did break this and that initialization 
> of Throwable (only) should be restored to a much higher place in the 
> initialization sequence.
> 
> Thanks,
> David
> 



Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-06-28 Thread David Holmes

On 29/06/2016 8:43 AM, Kim Barrett wrote:

Updated webrevs:

Full:
http://cr.openjdk.java.net/~kbarrett/8156500/jdk.01/
http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.01/

Incremental:
http://cr.openjdk.java.net/~kbarrett/8156500/jdk.01.inc/


Did Reference not work? Just curious. I used to be a qualified Java 
programmer back in the Java 5 era, but wildcards were always a bit iffy :)



http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.01.inc/

Still investigating the initialization order for core exceptions.


I suspect it is as Coleen indicated that the module changes introduced 
the new failure path that you hit. I did a quick check of the 
initialization order in an old b50:


33 Initializing 'java/lang/Throwable' (0x001780002990)
...
46 Initializing 'java/lang/Exception'(no method) (0x001780003158)
47 Initializing 'java/lang/InterruptedException'(no method) 
(0x0017800178a8)


Compare that with a current build:

60 Initializing 'java/lang/ref/Reference$ReferenceHandler' 
(0x00080001e3f0)

61 Initializing 'java/lang/Throwable' (0x000829f8)
62 Initializing 'java/lang/Exception'(no method) (0x000831a8)
63 Initializing 'java/lang/InterruptedException'(no method) 
(0x00080001e6e0)
64 Initializing 'java/lang/ref/PhantomReference'(no method) 
(0x00086440)


Initialization of Throwable is much, much later (large parts of 
java.lang and java.util are now initialized first!) and is obviously a 
direct consequence of preinitializing InterruptedException.


So I would say that the module change did break this and that 
initialization of Throwable (only) should be restored to a much higher 
place in the initialization sequence.


Thanks,
David



Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-06-28 Thread Kim Barrett
> On Jun 27, 2016, at 9:34 PM, Daniel D. Daugherty 
>  wrote:
> Other than the possible missing notify_all() in vmGCOperations.cpp,
> this all looks great.

Thanks Dan.



Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-06-28 Thread Kim Barrett
Updated webrevs:

Full:
http://cr.openjdk.java.net/~kbarrett/8156500/jdk.01/
http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.01/

Incremental:
http://cr.openjdk.java.net/~kbarrett/8156500/jdk.01.inc/
http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.01.inc/

Still investigating the initialization order for core exceptions.



Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-06-28 Thread Kim Barrett
> On Jun 28, 2016, at 12:04 AM, David Holmes  wrote:
> 
> Hi Kim,
> 
> I like all the removal of the pending-list-locker related code, but can't 
> really comment on the details of how it was replaced and interacts with all 
> the GC code. The interface to the Reference class is fine, it is the GC code 
> that pushes into the reference queue that I can't really comment on.

Thanks David.

> I have a couple of queries following up from Coleen's and Dan's reviews.
> 
> First, the synchronization of the pending-reference-list leaves me somewhat 
> perplexed. As Coleen noted you require the heap_lock to be held but also use 
> an Atomic::xchg. You have a comment:
> 
> +   // owns the lock.  Swap is used by parallel non-concurrent reference
> +   // processing threads, where some higher level controller owns
> +   // Heap_lock, so requires the lock is locked, but not necessarily by
> +   // the current thread.
> 
> Can you clarify the higher-level protocol that is at play there. Asserting 
> that "someone" owns a lock seems only marginally useful if you can't be sure 
> who that owner is, or when they might release it. Some more direct detecting 
> of being within this higher-level protocol would be better, if it is possible 
> to detect.

Was Per’s explanation sufficient?

> As previously mentioned the change to the initialization order is a concern 
> to me - there are always unexpected consequences of making such changes, no 
> matter how straight-forward they seem at the time. Pre-initialization of 
> exception classes is not really essential as the attempt to throw any of them 
> from Java code will force initialization anyway - it's only for exceptions 
> created and thrown from the VM code that direct initialization is needed. The 
> problematic case is OOME because throwing the OOME may trigger a secondary 
> OOME during that class initialization etc. Hence we try to deal with that by 
> pre-allocating instances that do not have stacktraces etc, so they can be 
> thrown safely. Hence my earlier comment that I think the real bug in your 
> situation is that gen_out_of_memory_error() is not considering how far into 
> the VM init sequence we are, and that it should be returning the 
> pre-allocated instance with no backtrace.

I'm also wondering why the existing order must have worked before the
pre-initialization of InterruptedException by Reference, but now
doesn't when we take that out.  I'll see if I can figure out what's
going on there.  It might be some bug was introduced but was being
masked by the Reference initialization.

> ---
> 
> A query on the JDK side:
> 
> src/java.base/share/classes/java/lang/ref/Reference.java
> 
> private static native Reference 
> popReferencePendingList(boolean wait);
> 
> I'm intrigued by the use of the lower-bounded wildcard using Object. As 
> Object has no super-types this looks very strange and should be equivalent to 
> the more common upper-bounded wildcard using Object ie Reference Object> or more concisely Reference. I see this construct exists in the 
> current code for the ReferenceQueue, but I am quite intrigued as to why? 
> (Someone getting ready for Any-types? :) )

I botched that. I meant to have a real Java programmer (which I don't
qualify as) look at that part of the change before sending it out for
review, but seem to have forgotten to do so. And lulled into a false
sense of security, since neither the compiler nor the runtime
complained, unlike the reams of template instantiation error output or
segfaults at runtime that I'd expect with C++...



Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-06-28 Thread Kim Barrett
> On Jun 28, 2016, at 5:33 AM, Per Liden  wrote:
> Patch looks good. The only thing I don't feel qualified to review is the 
> initialization order change in thread.cpp, so I'll let others comments on 
> that.

Thanks.  I’ll be following up on that area.

> I like the pop-one-reference-at-a-time semantics, which simplifies things a 
> lot and keeps the interface nice and clean. I was previously afraid that it 
> might cause a noticeable performance degradation compared to lifting the 
> whole list into Java in one go, but your testing seem to prove that's not the 
> case.

I was concerned about that too, and had tried a different approach that also 
still supported the existing "some callers wait and others don’t" API, but it 
was a bit messy.  Coleen convinced me to try this (since it was easy) and do 
the measurement, and it worked out well.



Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-06-28 Thread Per Liden

Hi Kim,

On 2016-06-24 22:09, Kim Barrett wrote:

Please review this change which moves the Reference pending list and
locking from the java.lang.ref.Reference class into the VM.  This
includes elimination of the ReferencePendingListLocker mechanism in
the VM. This fixes various fragility issues arising from having the
management of this list split between Java and the VM (GC).

This change addresses JDK-8156500 by eliminating the possibility of
suspension while holding the lock. Because the locking is now done in
the VM, we have full control over where suspension may occur.

This change retains the existing interface between the reference
processor and the nio.Bits package, e.g. Reference.tryHandlePending
has the same signature and behavior; this change just pushes the
pending list manipulation down into the VM.  There are open bugs
reports, enhancement requests, and discussions related to that
interface; see below. This change does not attempt to address them.

This change additionally addresses or renders moot
https://bugs.openjdk.java.net/browse/JDK-8055232
(ref) Exceptions while processing Reference pending list

and
https://bugs.openjdk.java.net/browse/JDK-7103238
Ensure pending list lock is held on behalf of GC before enqueuing references on 
to the pending list

It is also relevant for followup discussion of
https://bugs.openjdk.java.net/browse/JDK-8022321
java/lang/ref/OOMEInReferenceHandler.java fails immediately

and
https://bugs.openjdk.java.net/browse/JDK-8149925
We don't need jdk.internal.ref.Cleaner any more - part 1

and has implications for:
https://bugs.openjdk.java.net/browse/JDK-8066859
java/lang/ref/OOMEInReferenceHandler.java failed with java.lang.Exception: 
Reference Handler thread died

In addition to the obviously relevant changes, there are a couple of
changes whose presence here might seem surprising and require further
explanation.

- Removal of a stale comment in create_vm, noticed because it was near
some code being removed as part of this change.  The comment was
rendered obsolete by JDK-8028275.

- Moved initialization of exception classes earlier in
initialize_java_lang_classes. The previous order only worked by
accident, at least for OOME.  During the bulk of the library
initialization, OOME may be thrown, perhaps due to poorly chosen
command line options.  That attempts to use one of the pre-allocated
OOME objects, and tries to fill in its stack trace.  If the Throwable
class is not yet initialized, this leads to a failed assert in the VM
because Throwable.UNASSIGNED_STACK still has a null value.  This
initialization order issue was being masked by the old pending list
implementation; the initialization of Reference ensured
InterruptedException was initialized (thereby initializing its
Throwable base class), and the initialization of Reference just
happened to occur early enough that Throwable was initialized by the
time it was needed when running certain tests.  The forced
initialization of InterruptedException is no longer needed by
Reference, but removal of it triggered test failures (and could
trigger corresponding product failures) because of this ordering
issue.

CR:
https://bugs.openjdk.java.net/browse/JDK-8156500


Patch looks good. The only thing I don't feel qualified to review is the 
initialization order change in thread.cpp, so I'll let others comments 
on that.


I like the pop-one-reference-at-a-time semantics, which simplifies 
things a lot and keeps the interface nice and clean. I was previously 
afraid that it might cause a noticeable performance degradation compared 
to lifting the whole list into Java in one go, but your testing seem to 
prove that's not the case.


cheers,
Per



Webrev:
http://cr.openjdk.java.net/~kbarrett/8156500/jdk.00/
http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.00/

Testing:
RBT ad hoc nightly runtime & gc tests.

With the proposed patch for JDK-8153711 applied, locally ran nearly
35K iterations of the OomDebugTest that is part of that patch, with no
failures / deadlocks.  Without this change it would typically only take
on the order of 100 iterations to hit a deadlock failure.

Locally ran direct byte buffer allocation test:
jdk/test/java/nio/Buffer/DirectBufferAllocTest.java
This change reported 3% faster allocation times, and 1/2 the standard
deviation for allocation times.

Locally ran failing test from JDK-8022321 and JDK-8066859 a few
thousand times.
jdk/test/java/lang/ref/OOMEInReferenceHandler.java
No errors, but failures were noted as hard to reproduce.



Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-06-27 Thread David Holmes

Hi Kim,

I like all the removal of the pending-list-locker related code, but 
can't really comment on the details of how it was replaced and interacts 
with all the GC code. The interface to the Reference class is fine, it 
is the GC code that pushes into the reference queue that I can't really 
comment on.


I have a couple of queries following up from Coleen's and Dan's reviews.

First, the synchronization of the pending-reference-list leaves me 
somewhat perplexed. As Coleen noted you require the heap_lock to be held 
but also use an Atomic::xchg. You have a comment:


+   // owns the lock.  Swap is used by parallel non-concurrent reference
+   // processing threads, where some higher level controller owns
+   // Heap_lock, so requires the lock is locked, but not necessarily by
+   // the current thread.

Can you clarify the higher-level protocol that is at play there. 
Asserting that "someone" owns a lock seems only marginally useful if you 
can't be sure who that owner is, or when they might release it. Some 
more direct detecting of being within this higher-level protocol would 
be better, if it is possible to detect.


---

As previously mentioned the change to the initialization order is a 
concern to me - there are always unexpected consequences of making such 
changes, no matter how straight-forward they seem at the time. 
Pre-initialization of exception classes is not really essential as the 
attempt to throw any of them from Java code will force initialization 
anyway - it's only for exceptions created and thrown from the VM code 
that direct initialization is needed. The problematic case is OOME 
because throwing the OOME may trigger a secondary OOME during that class 
initialization etc. Hence we try to deal with that by pre-allocating 
instances that do not have stacktraces etc, so they can be thrown 
safely. Hence my earlier comment that I think the real bug in your 
situation is that gen_out_of_memory_error() is not considering how far 
into the VM init sequence we are, and that it should be returning the 
pre-allocated instance with no backtrace.


---

A query on the JDK side:

src/java.base/share/classes/java/lang/ref/Reference.java

 private static native Reference 
popReferencePendingList(boolean wait);


I'm intrigued by the use of the lower-bounded wildcard using Object. As 
Object has no super-types this looks very strange and should be 
equivalent to the more common upper-bounded wildcard using Object ie 
Reference or more concisely Reference. I see this 
construct exists in the current code for the ReferenceQueue, but I am 
quite intrigued as to why? (Someone getting ready for Any-types? :) )


Thanks,
David
-

On 25/06/2016 6:09 AM, Kim Barrett wrote:

Please review this change which moves the Reference pending list and
locking from the java.lang.ref.Reference class into the VM.  This
includes elimination of the ReferencePendingListLocker mechanism in
the VM. This fixes various fragility issues arising from having the
management of this list split between Java and the VM (GC).

This change addresses JDK-8156500 by eliminating the possibility of
suspension while holding the lock. Because the locking is now done in
the VM, we have full control over where suspension may occur.

This change retains the existing interface between the reference
processor and the nio.Bits package, e.g. Reference.tryHandlePending
has the same signature and behavior; this change just pushes the
pending list manipulation down into the VM.  There are open bugs
reports, enhancement requests, and discussions related to that
interface; see below. This change does not attempt to address them.

This change additionally addresses or renders moot
https://bugs.openjdk.java.net/browse/JDK-8055232
(ref) Exceptions while processing Reference pending list

and
https://bugs.openjdk.java.net/browse/JDK-7103238
Ensure pending list lock is held on behalf of GC before enqueuing references on 
to the pending list

It is also relevant for followup discussion of
https://bugs.openjdk.java.net/browse/JDK-8022321
java/lang/ref/OOMEInReferenceHandler.java fails immediately

and
https://bugs.openjdk.java.net/browse/JDK-8149925
We don't need jdk.internal.ref.Cleaner any more - part 1

and has implications for:
https://bugs.openjdk.java.net/browse/JDK-8066859
java/lang/ref/OOMEInReferenceHandler.java failed with java.lang.Exception: 
Reference Handler thread died

In addition to the obviously relevant changes, there are a couple of
changes whose presence here might seem surprising and require further
explanation.

- Removal of a stale comment in create_vm, noticed because it was near
some code being removed as part of this change.  The comment was
rendered obsolete by JDK-8028275.

- Moved initialization of exception classes earlier in
initialize_java_lang_classes. The previous order only worked by
accident, at least for OOME.  During the bulk of the library
initialization, OOME may be thrown, perhaps due to poorly chosen

Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-06-27 Thread Daniel D. Daugherty

> Webrev:
> http://cr.openjdk.java.net/~kbarrett/8156500/jdk.00/

make/mapfiles/libjava/mapfile-vers
No comments.

src/java.base/share/classes/java/lang/ref/Reference.java
Much cleaner (pun might have been intended :-)).

src/java.base/share/native/include/jvm.h
No comments.

src/java.base/share/native/libjava/Reference.c
No comments.

> http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.00/

make/symbols/symbols-unix
No comments.

src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/Threads.java
No comments.

src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/soql/sa.js
No comments.

src/share/vm/ci/ciReplay.cpp
No comments.

src/share/vm/classfile/javaClasses.cpp
No comments.

src/share/vm/classfile/javaClasses.hpp
No comments.

src/share/vm/compiler/compileBroker.cpp
No comments.

src/share/vm/gc/cms/concurrentMarkSweepThread.cpp
So no more Surrogate Locker Thread (SLT)? Sound good to me.

src/share/vm/gc/cms/vmCMSOperations.cpp
No comments.

src/share/vm/gc/cms/vmCMSOperations.hpp
No comments.

src/share/vm/gc/g1/concurrentMarkThread.cpp
No comments.

src/share/vm/gc/g1/g1CollectedHeap.hpp
No comments.

src/share/vm/gc/g1/vm_operations_g1.cpp
No comments.

src/share/vm/gc/g1/vm_operations_g1.hpp
No comments.

src/share/vm/gc/shared/collectedHeap.hpp
No comments.

src/share/vm/gc/shared/genCollectedHeap.hpp
No comments.

src/share/vm/gc/shared/referenceProcessor.cpp
No comments.

src/share/vm/gc/shared/referenceProcessor.hpp
No comments.

src/share/vm/gc/shared/vmGCOperations.cpp
L103: Heap_lock->unlock();
You did not add a conditional "Heap_lock->notify_all()" before
the Heap_lock->unlock() like you've done in other places.
Since you deleted a release_and_notify_pending_list_lock()
call, I would think you need the:

  if (Universe::has_reference_pending_list()) {
Heap_lock->notify_all();
  }

src/share/vm/gc/shared/vmGCOperations.hpp
No comments.

src/share/vm/memory/universe.cpp
No comments.

src/share/vm/memory/universe.hpp
No comments.

src/share/vm/oops/method.cpp
No comments.

src/share/vm/prims/jvm.cpp
No comments.

src/share/vm/prims/jvm.h
No comments.

src/share/vm/runtime/thread.cpp
So this change moves call_initPhase1() after the initialize_class()
calls and that just sets off alarm bells in my head. Yes, you have
a really big comment below explaining this. Still worries me!

src/share/vm/runtime/vmStructs.cpp
No comments.

src/share/vm/gc/shared/referencePendingListLocker.cpp
src/share/vm/gc/shared/referencePendingListLocker.hpp
Both files are deleted and don't need review.


Other than the possible missing notify_all() in vmGCOperations.cpp,
this all looks great.

Dan


On 6/24/16 2:09 PM, Kim Barrett wrote:

Please review this change which moves the Reference pending list and
locking from the java.lang.ref.Reference class into the VM.  This
includes elimination of the ReferencePendingListLocker mechanism in
the VM. This fixes various fragility issues arising from having the
management of this list split between Java and the VM (GC).

This change addresses JDK-8156500 by eliminating the possibility of
suspension while holding the lock. Because the locking is now done in
the VM, we have full control over where suspension may occur.

This change retains the existing interface between the reference
processor and the nio.Bits package, e.g. Reference.tryHandlePending
has the same signature and behavior; this change just pushes the
pending list manipulation down into the VM.  There are open bugs
reports, enhancement requests, and discussions related to that
interface; see below. This change does not attempt to address them.

This change additionally addresses or renders moot
https://bugs.openjdk.java.net/browse/JDK-8055232
(ref) Exceptions while processing Reference pending list

and
https://bugs.openjdk.java.net/browse/JDK-7103238
Ensure pending list lock is held on behalf of GC before enqueuing references on 
to the pending list

It is also relevant for followup discussion of
https://bugs.openjdk.java.net/browse/JDK-8022321
java/lang/ref/OOMEInReferenceHandler.java fails immediately

and
https://bugs.openjdk.java.net/browse/JDK-8149925
We don't need jdk.internal.ref.Cleaner any more - part 1

and has implications for:
https://bugs.openjdk.java.net/browse/JDK-8066859
java/lang/ref/OOMEInReferenceHandler.java failed with java.lang.Exception: 
Reference Handler thread died

In addition to the obviously relevant changes, there are a couple of
changes whose presence here might seem surprising and require further
explanation.

- Removal of a stale comment in create_vm, noticed because it was near
some code being removed as part of this change.  The comment was
rendered obsolete by JDK-8028275.

- Moved initialization of exception classes earlier in
initialize_java_lang_classes. The 

Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-06-26 Thread David Holmes

One more follow up before I actually review the code ...

On 25/06/2016 6:09 AM, Kim Barrett wrote:

Please review this change which moves the Reference pending list and
locking from the java.lang.ref.Reference class into the VM.  This
includes elimination of the ReferencePendingListLocker mechanism in
the VM. This fixes various fragility issues arising from having the
management of this list split between Java and the VM (GC).

This change addresses JDK-8156500 by eliminating the possibility of
suspension while holding the lock. Because the locking is now done in
the VM, we have full control over where suspension may occur.

This change retains the existing interface between the reference
processor and the nio.Bits package, e.g. Reference.tryHandlePending
has the same signature and behavior; this change just pushes the
pending list manipulation down into the VM.  There are open bugs
reports, enhancement requests, and discussions related to that
interface; see below. This change does not attempt to address them.

This change additionally addresses or renders moot
https://bugs.openjdk.java.net/browse/JDK-8055232
(ref) Exceptions while processing Reference pending list

and
https://bugs.openjdk.java.net/browse/JDK-7103238
Ensure pending list lock is held on behalf of GC before enqueuing references on 
to the pending list

It is also relevant for followup discussion of
https://bugs.openjdk.java.net/browse/JDK-8022321
java/lang/ref/OOMEInReferenceHandler.java fails immediately

and
https://bugs.openjdk.java.net/browse/JDK-8149925
We don't need jdk.internal.ref.Cleaner any more - part 1

and has implications for:
https://bugs.openjdk.java.net/browse/JDK-8066859
java/lang/ref/OOMEInReferenceHandler.java failed with java.lang.Exception: 
Reference Handler thread died

In addition to the obviously relevant changes, there are a couple of
changes whose presence here might seem surprising and require further
explanation.

- Removal of a stale comment in create_vm, noticed because it was near
some code being removed as part of this change.  The comment was
rendered obsolete by JDK-8028275.

- Moved initialization of exception classes earlier in
initialize_java_lang_classes. The previous order only worked by
accident, at least for OOME.  During the bulk of the library
initialization, OOME may be thrown, perhaps due to poorly chosen
command line options.  That attempts to use one of the pre-allocated
OOME objects, and tries to fill in its stack trace.  If the Throwable
class is not yet initialized, this leads to a failed assert in the VM
because Throwable.UNASSIGNED_STACK still has a null value.  This
initialization order issue was being masked by the old pending list
implementation; the initialization of Reference ensured
InterruptedException was initialized (thereby initializing its
Throwable base class), and the initialization of Reference just
happened to occur early enough that Throwable was initialized by the
time it was needed when running certain tests.  The forced
initialization of InterruptedException is no longer needed by
Reference, but removal of it triggered test failures (and could
trigger corresponding product failures) because of this ordering
issue.


This is surprising because prior to forcing the load of 
InterruptedException we did not see any initialization issues with OOME 
throwing that I am aware of. To me this seems to be a bug in 
Universe::gen_out_of_memory_error() as it is not checking to see if the 
necessary classes have been initialized - else it should return a 
pre-generated OOME with no backtrace.


Any change to the VM initialization sequence has to be examined very 
carefully because there are always non obvious consequences to making 
changes.


David
-


CR:
https://bugs.openjdk.java.net/browse/JDK-8156500

Webrev:
http://cr.openjdk.java.net/~kbarrett/8156500/jdk.00/
http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.00/

Testing:
RBT ad hoc nightly runtime & gc tests.

With the proposed patch for JDK-8153711 applied, locally ran nearly
35K iterations of the OomDebugTest that is part of that patch, with no
failures / deadlocks.  Without this change it would typically only take
on the order of 100 iterations to hit a deadlock failure.

Locally ran direct byte buffer allocation test:
jdk/test/java/nio/Buffer/DirectBufferAllocTest.java
This change reported 3% faster allocation times, and 1/2 the standard
deviation for allocation times.

Locally ran failing test from JDK-8022321 and JDK-8066859 a few
thousand times.
jdk/test/java/lang/ref/OOMEInReferenceHandler.java
No errors, but failures were noted as hard to reproduce.



Re: RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-06-26 Thread David Holmes

Hi Kim,

On 25/06/2016 6:09 AM, Kim Barrett wrote:

Please review this change which moves the Reference pending list and
locking from the java.lang.ref.Reference class into the VM.  This
includes elimination of the ReferencePendingListLocker mechanism in
the VM. This fixes various fragility issues arising from having the
management of this list split between Java and the VM (GC).


This seems a particular example of a more general problem. If the target 
VM can be "suspended" (does that mean it is taken to a safepoint or ???) 
and an execution request is then sent, there must be numerous potential 
things that can fail because a "suspended" thread holds a resource 
needed by the execution request?


David
-


This change addresses JDK-8156500 by eliminating the possibility of
suspension while holding the lock. Because the locking is now done in
the VM, we have full control over where suspension may occur.

This change retains the existing interface between the reference
processor and the nio.Bits package, e.g. Reference.tryHandlePending
has the same signature and behavior; this change just pushes the
pending list manipulation down into the VM.  There are open bugs
reports, enhancement requests, and discussions related to that
interface; see below. This change does not attempt to address them.

This change additionally addresses or renders moot
https://bugs.openjdk.java.net/browse/JDK-8055232
(ref) Exceptions while processing Reference pending list

and
https://bugs.openjdk.java.net/browse/JDK-7103238
Ensure pending list lock is held on behalf of GC before enqueuing references on 
to the pending list

It is also relevant for followup discussion of
https://bugs.openjdk.java.net/browse/JDK-8022321
java/lang/ref/OOMEInReferenceHandler.java fails immediately

and
https://bugs.openjdk.java.net/browse/JDK-8149925
We don't need jdk.internal.ref.Cleaner any more - part 1

and has implications for:
https://bugs.openjdk.java.net/browse/JDK-8066859
java/lang/ref/OOMEInReferenceHandler.java failed with java.lang.Exception: 
Reference Handler thread died

In addition to the obviously relevant changes, there are a couple of
changes whose presence here might seem surprising and require further
explanation.

- Removal of a stale comment in create_vm, noticed because it was near
some code being removed as part of this change.  The comment was
rendered obsolete by JDK-8028275.

- Moved initialization of exception classes earlier in
initialize_java_lang_classes. The previous order only worked by
accident, at least for OOME.  During the bulk of the library
initialization, OOME may be thrown, perhaps due to poorly chosen
command line options.  That attempts to use one of the pre-allocated
OOME objects, and tries to fill in its stack trace.  If the Throwable
class is not yet initialized, this leads to a failed assert in the VM
because Throwable.UNASSIGNED_STACK still has a null value.  This
initialization order issue was being masked by the old pending list
implementation; the initialization of Reference ensured
InterruptedException was initialized (thereby initializing its
Throwable base class), and the initialization of Reference just
happened to occur early enough that Throwable was initialized by the
time it was needed when running certain tests.  The forced
initialization of InterruptedException is no longer needed by
Reference, but removal of it triggered test failures (and could
trigger corresponding product failures) because of this ordering
issue.

CR:
https://bugs.openjdk.java.net/browse/JDK-8156500

Webrev:
http://cr.openjdk.java.net/~kbarrett/8156500/jdk.00/
http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.00/

Testing:
RBT ad hoc nightly runtime & gc tests.

With the proposed patch for JDK-8153711 applied, locally ran nearly
35K iterations of the OomDebugTest that is part of that patch, with no
failures / deadlocks.  Without this change it would typically only take
on the order of 100 iterations to hit a deadlock failure.

Locally ran direct byte buffer allocation test:
jdk/test/java/nio/Buffer/DirectBufferAllocTest.java
This change reported 3% faster allocation times, and 1/2 the standard
deviation for allocation times.

Locally ran failing test from JDK-8022321 and JDK-8066859 a few
thousand times.
jdk/test/java/lang/ref/OOMEInReferenceHandler.java
No errors, but failures were noted as hard to reproduce.



RFR: 8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

2016-06-24 Thread Kim Barrett
Please review this change which moves the Reference pending list and
locking from the java.lang.ref.Reference class into the VM.  This
includes elimination of the ReferencePendingListLocker mechanism in
the VM. This fixes various fragility issues arising from having the
management of this list split between Java and the VM (GC).

This change addresses JDK-8156500 by eliminating the possibility of
suspension while holding the lock. Because the locking is now done in
the VM, we have full control over where suspension may occur.

This change retains the existing interface between the reference
processor and the nio.Bits package, e.g. Reference.tryHandlePending
has the same signature and behavior; this change just pushes the
pending list manipulation down into the VM.  There are open bugs
reports, enhancement requests, and discussions related to that
interface; see below. This change does not attempt to address them.

This change additionally addresses or renders moot
https://bugs.openjdk.java.net/browse/JDK-8055232
(ref) Exceptions while processing Reference pending list

and
https://bugs.openjdk.java.net/browse/JDK-7103238
Ensure pending list lock is held on behalf of GC before enqueuing references on 
to the pending list

It is also relevant for followup discussion of
https://bugs.openjdk.java.net/browse/JDK-8022321  
java/lang/ref/OOMEInReferenceHandler.java fails immediately

and
https://bugs.openjdk.java.net/browse/JDK-8149925
We don't need jdk.internal.ref.Cleaner any more - part 1

and has implications for:
https://bugs.openjdk.java.net/browse/JDK-8066859  
java/lang/ref/OOMEInReferenceHandler.java failed with java.lang.Exception: 
Reference Handler thread died 

In addition to the obviously relevant changes, there are a couple of
changes whose presence here might seem surprising and require further
explanation.

- Removal of a stale comment in create_vm, noticed because it was near
some code being removed as part of this change.  The comment was
rendered obsolete by JDK-8028275.

- Moved initialization of exception classes earlier in
initialize_java_lang_classes. The previous order only worked by
accident, at least for OOME.  During the bulk of the library
initialization, OOME may be thrown, perhaps due to poorly chosen
command line options.  That attempts to use one of the pre-allocated
OOME objects, and tries to fill in its stack trace.  If the Throwable
class is not yet initialized, this leads to a failed assert in the VM
because Throwable.UNASSIGNED_STACK still has a null value.  This
initialization order issue was being masked by the old pending list
implementation; the initialization of Reference ensured
InterruptedException was initialized (thereby initializing its
Throwable base class), and the initialization of Reference just
happened to occur early enough that Throwable was initialized by the
time it was needed when running certain tests.  The forced
initialization of InterruptedException is no longer needed by
Reference, but removal of it triggered test failures (and could
trigger corresponding product failures) because of this ordering
issue.

CR:
https://bugs.openjdk.java.net/browse/JDK-8156500

Webrev:
http://cr.openjdk.java.net/~kbarrett/8156500/jdk.00/
http://cr.openjdk.java.net/~kbarrett/8156500/hotspot.00/

Testing:
RBT ad hoc nightly runtime & gc tests.

With the proposed patch for JDK-8153711 applied, locally ran nearly
35K iterations of the OomDebugTest that is part of that patch, with no
failures / deadlocks.  Without this change it would typically only take
on the order of 100 iterations to hit a deadlock failure.

Locally ran direct byte buffer allocation test:
jdk/test/java/nio/Buffer/DirectBufferAllocTest.java
This change reported 3% faster allocation times, and 1/2 the standard
deviation for allocation times.

Locally ran failing test from JDK-8022321 and JDK-8066859 a few
thousand times.
jdk/test/java/lang/ref/OOMEInReferenceHandler.java
No errors, but failures were noted as hard to reproduce.