Integrated: 8284435: Add dedicated filler objects for known dead Java heap areas

2022-05-02 Thread Thomas Schatzl
On Fri, 8 Apr 2022 08:13:33 GMT, Thomas Schatzl  wrote:

> Hi all,
> 
>   can I have reviews for this change that adds dedicated filler objects to 
> the VM?
> 
> Currently, when formatting areas of dead objects all gcs use instances of 
> j.l.Object and int-arrays.
> 
> This has the drawback of not being easily able to discern whether a given 
> object is actually dead (and should never be referenced) or just a regular 
> j.l.Object/int array.
> 
> This also makes enhanced error detection (any reference to such an object is 
> an error - i.e. detecting references to such objects) and to skip potentially 
> already unloaded classes when scanning areas of the heap below TAMS, G1 uses 
> its prev bitmap.
> Other collectors do not have this extra information at the moment, so they 
> can't (and don't) do this kind of verification.
> 
> With [JDK-8210708](https://bugs.openjdk.java.net/browse/JDK-8210708) the prev 
> bitmap will effectively be removed in G1; G1 will format the dead areas with 
> these filler objects to avoid coming across unloaded classes. This is fine 
> wrt to normal operation, however, this looses the existing enhanced 
> verification mentioned above.
> 
> This change proposes to add dedicated VM-internal filler objects, i.e. 
> equivalents of j.l.Object and int-arrays.
> 
> This has the following benefits:
> 
> - keep this error detection (actually making it much simpler) and allowing 
> similar verification for other collectors. (This change does not add this)
> 
> - this also makes it "easy" to detect references to filler objects in 
> debugging tools - you only need to know the two klasses (or just get their 
> friendly name) to see whether that reference may actually be valid (or refers 
> to the inside such an object). References to these classes in the crash file 
> may also allow the issue to be more clear.
> 
> This causes some minor changes to external behavior:
> 
> - logs/heap dumps now contain instances of these objects - which seems fine 
> as previously they have just been reported as part of j.l.Object/int-arrays 
> statistics. The VM spec also does not guarantee whether a particular kind of 
> object should/should not show there anyway afaik.
> 
> - if the application ever gets to instantiate a reference to such an object 
> somehow, any enabled verification will crash the VM. That's bad luck for 
> messing with internal classes, but that's the purpose of these objects.
> 
> The change takes care that getting a reference will not be possible by normal 
> means (i.e. via Class.forName() etc) - which should be sufficient to avoid 
> the issue. Actually, existing mechanisms seem to be sufficient.
> 
> 
> Testing: tier1-8
> 
> There is one question I would like the reviewers to specially think about, 
> the name of the filler array klass. I just used 
> `Ljava/internal/vm/FillerArray;` for that, looking at other internal 
> symbols/klasses, but I'm not sure this adheres to naming guidelines.
> 
> Thanks go to @iklam for helping out with the change.
> 
> Thanks,
>   Thomas

This pull request has now been integrated.

Changeset: 70205956
Author:Thomas Schatzl 
URL:   
https://git.openjdk.java.net/jdk/commit/70205956313740e48e1fcb0c02c8f1488ab0d987
Stats: 86 lines in 8 files changed: 80 ins; 1 del; 5 mod

8284435: Add dedicated filler objects for known dead Java heap areas

Reviewed-by: iklam, iwalulya

-

PR: https://git.openjdk.java.net/jdk/pull/8156


Re: RFR: 8284435: Add dedicated filler objects for known dead Java heap areas [v3]

2022-05-02 Thread Thomas Schatzl
On Mon, 11 Apr 2022 14:55:32 GMT, Thomas Schatzl  wrote:

>> Hi all,
>> 
>>   can I have reviews for this change that adds dedicated filler objects to 
>> the VM?
>> 
>> Currently, when formatting areas of dead objects all gcs use instances of 
>> j.l.Object and int-arrays.
>> 
>> This has the drawback of not being easily able to discern whether a given 
>> object is actually dead (and should never be referenced) or just a regular 
>> j.l.Object/int array.
>> 
>> This also makes enhanced error detection (any reference to such an object is 
>> an error - i.e. detecting references to such objects) and to skip 
>> potentially already unloaded classes when scanning areas of the heap below 
>> TAMS, G1 uses its prev bitmap.
>> Other collectors do not have this extra information at the moment, so they 
>> can't (and don't) do this kind of verification.
>> 
>> With [JDK-8210708](https://bugs.openjdk.java.net/browse/JDK-8210708) the 
>> prev bitmap will effectively be removed in G1; G1 will format the dead areas 
>> with these filler objects to avoid coming across unloaded classes. This is 
>> fine wrt to normal operation, however, this looses the existing enhanced 
>> verification mentioned above.
>> 
>> This change proposes to add dedicated VM-internal filler objects, i.e. 
>> equivalents of j.l.Object and int-arrays.
>> 
>> This has the following benefits:
>> 
>> - keep this error detection (actually making it much simpler) and allowing 
>> similar verification for other collectors. (This change does not add this)
>> 
>> - this also makes it "easy" to detect references to filler objects in 
>> debugging tools - you only need to know the two klasses (or just get their 
>> friendly name) to see whether that reference may actually be valid (or 
>> refers to the inside such an object). References to these classes in the 
>> crash file may also allow the issue to be more clear.
>> 
>> This causes some minor changes to external behavior:
>> 
>> - logs/heap dumps now contain instances of these objects - which seems fine 
>> as previously they have just been reported as part of j.l.Object/int-arrays 
>> statistics. The VM spec also does not guarantee whether a particular kind of 
>> object should/should not show there anyway afaik.
>> 
>> - if the application ever gets to instantiate a reference to such an object 
>> somehow, any enabled verification will crash the VM. That's bad luck for 
>> messing with internal classes, but that's the purpose of these objects.
>> 
>> The change takes care that getting a reference will not be possible by 
>> normal means (i.e. via Class.forName() etc) - which should be sufficient to 
>> avoid the issue. Actually, existing mechanisms seem to be sufficient.
>> 
>> 
>> Testing: tier1-8
>> 
>> There is one question I would like the reviewers to specially think about, 
>> the name of the filler array klass. I just used 
>> `Ljava/internal/vm/FillerArray;` for that, looking at other internal 
>> symbols/klasses, but I'm not sure this adheres to naming guidelines.
>> 
>> Thanks go to @iklam for helping out with the change.
>> 
>> Thanks,
>>   Thomas
>
> Thomas Schatzl has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Fix test

Note that I did a tier1-5 run with the latest change with no issues before 
pushing.

-

PR: https://git.openjdk.java.net/jdk/pull/8156


Re: RFR: 8284435: Add dedicated filler objects for known dead Java heap areas [v3]

2022-05-02 Thread Thomas Schatzl
On Fri, 29 Apr 2022 16:45:01 GMT, Ioi Lam  wrote:

>> Thomas Schatzl has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   Fix test
>
> The latest version looks good to me.

Thanks @iklam @walulyai for your reviews

-

PR: https://git.openjdk.java.net/jdk/pull/8156


Re: RFR: 8284161: Implementation of Virtual Threads (Preview) [v7]

2022-04-27 Thread Thomas Schatzl
On Tue, 26 Apr 2022 17:27:35 GMT, Alan Bateman  wrote:

>> This is the implementation of JEP 425: Virtual Threads (Preview); TBD which 
>> JDK version to target.
>> 
>> We will refresh this PR periodically to pick up changes and fixes from the 
>> loom repo.
>> 
>> Most of the new mechanisms in the HotSpot VM are disabled by default and 
>> require running with `--enable-preview` to enable.
>> 
>> The patch has support for x64 and aarch64 on the usual operating systems 
>> (Linux, macOS, and Windows). There are stubs (calling Unimplemented) for 
>> zero and some of the other ports. Additional ports can be contributed via 
>> PRs against the fibers branch in the loom repo.
>> 
>> There are changes in many areas. To reduce notifications/mails, the labels 
>> have been trimmed down for now to hotspot, serviceability and core-libs. 
>> We'll add the complete set of labels when the PR is further along.
>> 
>> The changes include a refresh of java.util.concurrent and ForkJoinPool from 
>> Doug Lea's CVS. These changes will probably be proposed and integrated in 
>> advance of this PR.
>> 
>> The changes include some non-exposed and low-level infrastructure to support 
>> the (in draft) JEPs for Structured Concurrency and Scope Locals. This is to 
>> make life a bit easier and avoid having to separate VM changes and juggle 
>> branches at this time.
>
> Alan Bateman has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Refresh 8d8f0a2fd646e57fe6b4e8ab669f836dc46dda69

Reviewed changes in gc/, oops/, memory/, partially in runtime/.

-

Marked as reviewed by tschatzl (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/8166


Re: RFR: 8284435: Add dedicated filler objects for known dead Java heap areas [v3]

2022-04-19 Thread Thomas Schatzl
On Mon, 11 Apr 2022 21:51:32 GMT, David Holmes  wrote:

> Do you really need to define a real `FillerObject.java` class? Can't you use 
> an internal-only variant of a hidden class to represent this?

I am not sure I understand this request, but of course I am open to try 
different approaches. An existing example would be fine to get me on track; but 
maybe I am only not understanding your terminology: do you mean to have a 
special subclass of instanceKlass for `FillerObject`?
I had that, but that has been much much more effort (in terms of code, 
maintainability, ...) than having such an additional Java class file in the JDK 
and use the existing macros to use it everywhere.

-

PR: https://git.openjdk.java.net/jdk/pull/8156


Re: RFR: 8284435: Add dedicated filler objects for known dead Java heap areas [v3]

2022-04-11 Thread Thomas Schatzl
> Hi all,
> 
>   can I have reviews for this change that adds dedicated filler objects to 
> the VM?
> 
> Currently, when formatting areas of dead objects all gcs use instances of 
> j.l.Object and int-arrays.
> 
> This has the drawback of not being easily able to discern whether a given 
> object is actually dead (and should never be referenced) or just a regular 
> j.l.Object/int array.
> 
> This also makes enhanced error detection (any reference to such an object is 
> an error - i.e. detecting references to such objects) and to skip potentially 
> already unloaded classes when scanning areas of the heap below TAMS, G1 uses 
> its prev bitmap.
> Other collectors do not have this extra information at the moment, so they 
> can't (and don't) do this kind of verification.
> 
> With [JDK-8210708](https://bugs.openjdk.java.net/browse/JDK-8210708) the prev 
> bitmap will effectively be removed in G1; G1 will format the dead areas with 
> these filler objects to avoid coming across unloaded classes. This is fine 
> wrt to normal operation, however, this looses the existing enhanced 
> verification mentioned above.
> 
> This change proposes to add dedicated VM-internal filler objects, i.e. 
> equivalents of j.l.Object and int-arrays.
> 
> This has the following benefits:
> 
> - keep this error detection (actually making it much simpler) and allowing 
> similar verification for other collectors. (This change does not add this)
> 
> - this also makes it "easy" to detect references to filler objects in 
> debugging tools - you only need to know the two klasses (or just get their 
> friendly name) to see whether that reference may actually be valid (or refers 
> to the inside such an object). References to these classes in the crash file 
> may also allow the issue to be more clear.
> 
> This causes some minor changes to external behavior:
> 
> - logs/heap dumps now contain instances of these objects - which seems fine 
> as previously they have just been reported as part of j.l.Object/int-arrays 
> statistics. The VM spec also does not guarantee whether a particular kind of 
> object should/should not show there anyway afaik.
> 
> - if the application ever gets to instantiate a reference to such an object 
> somehow, any enabled verification will crash the VM. That's bad luck for 
> messing with internal classes, but that's the purpose of these objects.
> 
> The change takes care that getting a reference will not be possible by normal 
> means (i.e. via Class.forName() etc) - which should be sufficient to avoid 
> the issue. Actually, existing mechanisms seem to be sufficient.
> 
> 
> Testing: tier1-8
> 
> There is one question I would like the reviewers to specially think about, 
> the name of the filler array klass. I just used 
> `Ljava/internal/vm/FillerArray;` for that, looking at other internal 
> symbols/klasses, but I'm not sure this adheres to naming guidelines.
> 
> Thanks go to @iklam for helping out with the change.
> 
> Thanks,
>   Thomas

Thomas Schatzl has updated the pull request incrementally with one additional 
commit since the last revision:

  Fix test

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/8156/files
  - new: https://git.openjdk.java.net/jdk/pull/8156/files/b3e2c6c3..effb9cd5

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=8156&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=8156&range=01-02

  Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/8156.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/8156/head:pull/8156

PR: https://git.openjdk.java.net/jdk/pull/8156


Re: RFR: 8284435: Add dedicated filler objects for known dead Java heap areas [v2]

2022-04-11 Thread Thomas Schatzl
On Fri, 8 Apr 2022 16:45:54 GMT, Ioi Lam  wrote:

>> Thomas Schatzl has updated the pull request incrementally with two 
>> additional commits since the last revision:
>> 
>>  - iklam review
>>  - Test case
>
> src/hotspot/share/classfile/systemDictionaryShared.cpp line 1727:
> 
>> 1725: ArchivedMirrorPatcher::update_array_klasses(k);
>> 1726:   }
>> 1727:   
>> ArchivedMirrorPatcher::update_array_klasses(Universe::fillerArrayKlassObj());
> 
> I think this is not necessary. `Universe::fillerArrayKlassObj()` shares the 
> same mirror as `Universe::intArrayKlassObj()`, which has already been updated 
> in the loop above.
> 
> `ArchivedMirrorPatcher::update_array_klasses(k)` will essentially do 
> `k->mirror->pointer_back_to_klass += delta`, so it will incorrectly set the 
> pointer when delta is not zero.
> 
> I would suggest running with
> 
> 
> -XX:ArchiveRelocationMode=1 -Xlog:cds -Xlog:class+load=debug
> 
> 
> and step into the following code: 
> 
> 
> void java_lang_Class::update_archived_mirror_native_pointers(oop 
> archived_mirror) {
>   assert(MetaspaceShared::relocation_delta() != 0, "must be");
> 
>   Klass* k = ((Klass*)archived_mirror->metadata_field(_klass_offset));
>   archived_mirror->metadata_field_put(_klass_offset,
>   (Klass*)(address(k) + MetaspaceShared::relocation_delta())); <<<< HERE

Stepping into that code (well, I added some logging) indicated that the mirrors 
for these klasses (`_fillerArrayKlassObj` and `intArrayKlassObj`) are 
different, so the field is not updated multiple times. So this code seems 
required, also because there are lots of crashes when removing the 
`ArchivedMirrorPatcher::update_array_klasses` call.

(Say, if you print the `java_mirror()` after Universe initialization)

The problem with the compiler we had (which was resolved by initializing the 
filler array before the int-array) was that in the component mirror, there is a 
reference to the arrayklass that represents an array of that basic type.
So as the code earlier initialized the filler array klass after the int array 
klass, the compiler used the filler array klass for array instantiation which 
tests did not like.

-

PR: https://git.openjdk.java.net/jdk/pull/8156


Re: RFR: 8284435: Add dedicated filler objects for known dead Java heap areas [v2]

2022-04-11 Thread Thomas Schatzl
> Hi all,
> 
>   can I have reviews for this change that adds dedicated filler objects to 
> the VM?
> 
> Currently, when formatting areas of dead objects all gcs use instances of 
> j.l.Object and int-arrays.
> 
> This has the drawback of not being easily able to discern whether a given 
> object is actually dead (and should never be referenced) or just a regular 
> j.l.Object/int array.
> 
> This also makes enhanced error detection (any reference to such an object is 
> an error - i.e. detecting references to such objects) and to skip potentially 
> already unloaded classes when scanning areas of the heap below TAMS, G1 uses 
> its prev bitmap.
> Other collectors do not have this extra information at the moment, so they 
> can't (and don't) do this kind of verification.
> 
> With [JDK-8210708](https://bugs.openjdk.java.net/browse/JDK-8210708) the prev 
> bitmap will effectively be removed in G1; G1 will format the dead areas with 
> these filler objects to avoid coming across unloaded classes. This is fine 
> wrt to normal operation, however, this looses the existing enhanced 
> verification mentioned above.
> 
> This change proposes to add dedicated VM-internal filler objects, i.e. 
> equivalents of j.l.Object and int-arrays.
> 
> This has the following benefits:
> 
> - keep this error detection (actually making it much simpler) and allowing 
> similar verification for other collectors. (This change does not add this)
> 
> - this also makes it "easy" to detect references to filler objects in 
> debugging tools - you only need to know the two klasses (or just get their 
> friendly name) to see whether that reference may actually be valid (or refers 
> to the inside such an object). References to these classes in the crash file 
> may also allow the issue to be more clear.
> 
> This causes some minor changes to external behavior:
> 
> - logs/heap dumps now contain instances of these objects - which seems fine 
> as previously they have just been reported as part of j.l.Object/int-arrays 
> statistics. The VM spec also does not guarantee whether a particular kind of 
> object should/should not show there anyway afaik.
> 
> - if the application ever gets to instantiate a reference to such an object 
> somehow, any enabled verification will crash the VM. That's bad luck for 
> messing with internal classes, but that's the purpose of these objects.
> 
> The change takes care that getting a reference will not be possible by normal 
> means (i.e. via Class.forName() etc) - which should be sufficient to avoid 
> the issue. Actually, existing mechanisms seem to be sufficient.
> 
> 
> Testing: tier1-8
> 
> There is one question I would like the reviewers to specially think about, 
> the name of the filler array klass. I just used 
> `Ljava/internal/vm/FillerArray;` for that, looking at other internal 
> symbols/klasses, but I'm not sure this adheres to naming guidelines.
> 
> Thanks go to @iklam for helping out with the change.
> 
> Thanks,
>   Thomas

Thomas Schatzl has updated the pull request incrementally with two additional 
commits since the last revision:

 - iklam review
 - Test case

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/8156/files
  - new: https://git.openjdk.java.net/jdk/pull/8156/files/9cee3295..b3e2c6c3

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=8156&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=8156&range=00-01

  Stats: 54 lines in 2 files changed: 53 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/8156.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/8156/head:pull/8156

PR: https://git.openjdk.java.net/jdk/pull/8156


RFR: 8284435: Add dedicated filler objects for known dead Java heap areas

2022-04-08 Thread Thomas Schatzl
Hi all,

  can I have reviews for this change that adds dedicated filler objects to the 
VM?

Currently, when formatting areas of dead objects all gcs use instances of 
j.l.Object and int-arrays.

This has the drawback of not being easily able to discern whether a given 
object is actually dead (and should never be referenced) or just a regular 
j.l.Object/int array.

This also makes enhanced error detection (any reference to such an object is an 
error - i.e. detecting references to such objects) and to skip potentially 
already unloaded classes when scanning areas of the heap below TAMS, G1 uses 
its prev bitmap.
Other collectors do not have this extra information at the moment, so they 
can't (and don't) do this kind of verification.

With [JDK-8210708](https://bugs.openjdk.java.net/browse/JDK-8210708) the prev 
bitmap will effectively be removed in G1; G1 will format the dead areas with 
these filler objects to avoid coming across unloaded classes. This is fine wrt 
to normal operation, however, this looses the existing enhanced verification 
mentioned above.

This change proposes to add dedicated VM-internal filler objects, i.e. 
equivalents of j.l.Object and int-arrays.

This has the following benefits:

- keep this error detection (actually making it much simpler) and allowing 
similar verification for other collectors. (This change does not add this)

- this also makes it "easy" to detect references to filler objects in debugging 
tools - you only need to know the two klasses (or just get their friendly name) 
to see whether that reference may actually be valid (or refers to the inside 
such an object). References to these classes in the crash file may also allow 
the issue to be more clear.

This causes some minor changes to external behavior:

- logs/heap dumps now contain instances of these objects - which seems fine as 
previously they have just been reported as part of j.l.Object/int-arrays 
statistics. The VM spec also does not guarantee whether a particular kind of 
object should/should not show there anyway afaik.

- if the application ever gets to instantiate a reference to such an object 
somehow, any enabled verification will crash the VM. That's bad luck for 
messing with internal classes, but that's the purpose of these objects.

The change takes care that getting a reference will not be possible by normal 
means (i.e. via Class.forName() etc) - which should be sufficient to avoid the 
issue. Actually, existing mechanisms seem to be sufficient.


Testing: tier1-8

There is one question I would like the reviewers to specially think about, the 
name of the filler array klass. I just used `Ljava/internal/vm/FillerArray;` 
for that, looking at other internal symbols/klasses, but I'm not sure this 
adheres to naming guidelines.

Thanks go to @iklam for helping out with the change.

Thanks,
  Thomas

-

Commit messages:
 - First stab at new filler - sjohanss

Changes: https://git.openjdk.java.net/jdk/pull/8156/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=8156&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8284435
  Stats: 33 lines in 7 files changed: 27 ins; 1 del; 5 mod
  Patch: https://git.openjdk.java.net/jdk/pull/8156.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/8156/head:pull/8156

PR: https://git.openjdk.java.net/jdk/pull/8156


Re: RFR: 8277494: [BACKOUT] JDK-8276150 Quarantined jpackage apps are labeled as "damaged"

2021-11-19 Thread Thomas Schatzl
On Fri, 19 Nov 2021 20:57:46 GMT, Daniel D. Daugherty  
wrote:

> This reverts commit 936f7ff49ed86adb74bb1ff10d93cb3d7f7d70a0.
> 
> So far we've had 3 failed Tier2 job sets in a row. My Mach5 Tier2 of this 
> [BACKOUT] has 
> passed the macosx-aarch64 test task that was failing before.

Marked as reviewed by tschatzl (Reviewer).

-

PR: https://git.openjdk.java.net/jdk/pull/6483


Re: RFR: 8275262: [BACKOUT] AArch64: Implement string_compare intrinsic in SVE

2021-10-14 Thread Thomas Schatzl
On Thu, 14 Oct 2021 06:37:19 GMT, Nick Gasson  wrote:

> This reverts "8269559: AArch64: Implement string_compare intrinsic in SVE" 
> which caused some unknown failures in Oracle's CI.

Lgtm.

-

Marked as reviewed by tschatzl (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/5941


Re: RFR: 8273140: Replace usages of Enum.class.getEnumConstants() with Enum.values() where possible [v4]

2021-09-01 Thread Thomas Schatzl
On Tue, 31 Aug 2021 20:04:23 GMT, Сергей Цыпанов 
 wrote:

>> Just a very tiny clean-up.
>> 
>> There are some places in JDK code base where we call 
>> `Enum.class.getEnumConstants()` to get all the values of the referenced 
>> `enum`. This is excessive, less-readable and slower than just calling 
>> `Enum.values()` as in `getEnumConstants()` we have volatile access:
>> 
>> public T[] getEnumConstants() {
>> T[] values = getEnumConstantsShared();
>> return (values != null) ? values.clone() : null;
>> }
>> 
>> private transient volatile T[] enumConstants;
>> 
>> T[] getEnumConstantsShared() {
>> T[] constants = enumConstants;
>> if (constants == null) { /* ... */ }
>> return constants;
>> }
>> 
>> Calling values() method is slightly faster:
>> 
>> @BenchmarkMode(Mode.AverageTime)
>> @OutputTimeUnit(TimeUnit.NANOSECONDS)
>> @Fork(jvmArgsAppend = {"-Xms2g", "-Xmx2g"})
>> public class EnumBenchmark {
>> 
>>   @Benchmark
>>   public Enum[] values() {
>> return Enum.values();
>>   }
>> 
>>   @Benchmark
>>   public Enum[] getEnumConstants() {
>> return Enum.class.getEnumConstants();
>>   }
>> 
>>   private enum Enum {
>> A,
>> B
>>   }
>> }
>> 
>> 
>> BenchmarkMode  Cnt   
>>   Score Error   Units
>> EnumBenchmark.getEnumConstants   avgt   15   
>>   6,265 ±   0,051   ns/op
>> EnumBenchmark.getEnumConstants:·gc.alloc.rateavgt   15  
>> 2434,075 ±  19,568  MB/sec
>> EnumBenchmark.getEnumConstants:·gc.alloc.rate.norm   avgt   15   
>>  24,002 ±   0,001B/op
>> EnumBenchmark.getEnumConstants:·gc.churn.G1_Eden_Space   avgt   15  
>> 2433,709 ±  70,216  MB/sec
>> EnumBenchmark.getEnumConstants:·gc.churn.G1_Eden_Space.norm  avgt   15   
>>  23,998 ±   0,659B/op
>> EnumBenchmark.getEnumConstants:·gc.churn.G1_Survivor_Space   avgt   15   
>>   0,009 ±   0,003  MB/sec
>> EnumBenchmark.getEnumConstants:·gc.churn.G1_Survivor_Space.norm  avgt   15   
>>  ≈ 10⁻⁴  B/op
>> EnumBenchmark.getEnumConstants:·gc.count avgt   15   
>> 210,000counts
>> EnumBenchmark.getEnumConstants:·gc.time  avgt   15   
>> 119,000ms
>> 
>> EnumBenchmark.values avgt   15   
>>   4,164 ±   0,134   ns/op
>> EnumBenchmark.values:·gc.alloc.rate  avgt   15  
>> 3665,341 ± 120,721  MB/sec
>> EnumBenchmark.values:·gc.alloc.rate.norm avgt   15   
>>  24,002 ±   0,001B/op
>> EnumBenchmark.values:·gc.churn.G1_Eden_Space avgt   15  
>> 3660,512 ± 137,250  MB/sec
>> EnumBenchmark.values:·gc.churn.G1_Eden_Space.normavgt   15   
>>  23,972 ±   0,529B/op
>> EnumBenchmark.values:·gc.churn.G1_Survivor_Space avgt   15   
>>   0,017 ±   0,003  MB/sec
>> EnumBenchmark.values:·gc.churn.G1_Survivor_Space.normavgt   15   
>>  ≈ 10⁻⁴  B/op
>> EnumBenchmark.values:·gc.count   avgt   15   
>> 262,000counts
>> EnumBenchmark.values:·gc.timeavgt   15   
>> 155,000ms
>
> Сергей Цыпанов has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   8273140: Fix copyright year

Lgtm.

-

Marked as reviewed by tschatzl (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/5303


Re: RFR: 8271396: Spelling errors

2021-07-28 Thread Thomas Schatzl
On Wed, 3 Feb 2021 19:12:25 GMT, Emmanuel Bourg 
 wrote:

> This PR fixes the following spelling errors:
> 
>  choosen  -> chosen
>  commad   -> command
>  hiearchy -> hierarchy
>  leagacy  -> legacy
>  minium   -> minimum
>  subsytem -> subsystem
>  unamed   -> unnamed

Lgtm.

-

Marked as reviewed by tschatzl (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/2385


Re: [jdk17] RFR: JDK-8225667: Clarify the behavior of System::gc w.r.t. reference processing [v2]

2021-07-01 Thread Thomas Schatzl
On Wed, 30 Jun 2021 22:52:36 GMT, Mandy Chung  wrote:

>> This spec clarification is a follow-up to 
>> [JDK-8224760](https://bugs.openjdk.java.net/browse/JDK-8224760?focusedCommentId=14268320&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14268320)
>>  w.r.t. reference processing.  Since there is no guarantee for any memory 
>> reclamation by the invocation of `System::gc`, the spec should also clarify 
>> that there is no guarantee in determining the change of reachability of any 
>> objects or any particular number of `Reference` objects be enqueued and 
>> cleared.
>> 
>> CSR:
>>https://bugs.openjdk.java.net/browse/JDK-8269690
>
> Mandy Chung has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Kim's suggestion on the wording

Marked as reviewed by tschatzl (Reviewer).

-

PR: https://git.openjdk.java.net/jdk17/pull/183


Re: [jdk17] RFR: 8269534: Remove java/util/concurrent/locks/Lock/TimedAcquireLeak.java from ProblemList.txt

2021-06-28 Thread Thomas Schatzl
On Mon, 28 Jun 2021 17:05:49 GMT, Daniel D. Daugherty  
wrote:

> A trivial fix to remove java/util/concurrent/locks/Lock/TimedAcquireLeak.java 
> from ProblemList.txt

Lgtm and trivial.

-

Marked as reviewed by tschatzl (Reviewer).

PR: https://git.openjdk.java.net/jdk17/pull/164


Re: RFR: 8254598: StringDedupTable should use OopStorage [v2]

2021-05-12 Thread Thomas Schatzl
On Fri, 7 May 2021 08:28:43 GMT, Kim Barrett  wrote:

>> Please review this change to the String Deduplication facility.  It is being
>> changed to use OopStorage to hold weak references to relevant objects,
>> rather than bespoke data structures requiring dedicated processing phases by
>> supporting GCs.
>> 
>> (The Shenandoah update was contributed by Zhengyu Gu.)
>> 
>> This change significantly simplifies the interface between a given GC and
>> the String Deduplication facility, which should make it much easier for
>> other GCs to opt in.  However, this change does not alter the set of GCs
>> providing support; currently only G1 and Shenandoah support string
>> deduplication.  Adding support by other GCs will be in followup RFEs.
>> 
>> Reviewing via the diffs might not be all that useful for some parts, as
>> several files have been essentially completely replaced, and a number of
>> files have been added or eliminated.  The full webrev might be a better
>> place to look.
>> 
>> The major changes are in gc/shared/stringdedup/* and in the supporting
>> collectors, but there are also some smaller changes in other places, most
>> notably classfile/{stringTable,javaClasses}.
>> 
>> This change is additionally labeled for review by core-libs, although there
>> are no source changes there.  This change injects a byte field of bits into
>> java.lang.String, using one of the previously unused padding bytes in that
>> class.  (There were two unused bytes, now only one.)
>> 
>> Testing:
>> mach5 tier1-2 with and without -XX:+UseStringDeduplication
>> 
>> Locally (linux-x64) ran all of the existing tests that use string
>> deduplication with both G1 and Shenandoah.  Note that
>> TestStringDeduplicationInterned.java is disabled for shenandoah, as it
>> currently fails, for reasons I haven't figured out but suspect are test
>> related.
>> 
>> - gc/stringdedup/   -- these used to be in gc/g1
>> - runtime/cds/SharedStringsDedup.java
>> - runtime/cds/appcds/cacheObject/DifferentHeapSizes.java
>> - runtime/cds/appcds/sharedStrings/*
>> 
>> shenandoah-only:
>> - gc/shenandoah/jvmti/TestHeapDump.java
>> - gc/shenandoah/TestStringDedup.java
>> - gc/shenandoah/TestStringDedupStress.java
>> 
>> Performance tested baseline, baseline + stringdedup, new with stringdedup,
>> finding no significant differences.
>
> Kim Barrett has updated the pull request incrementally with three additional 
> commits since the last revision:
> 
>  - more comment improvements
>  - improve naming and comments around injected String flags
>  - fix some typos in comments

Lgtm.

src/hotspot/share/gc/shared/stringdedup/stringDedup.hpp line 82:

> 80: // Doing so would counteract C2 optimizations on string literals.  But an
> 81: // interned string might later become a deduplication candidate through 
> the
> 82: // normal GC discovery mechanism.  To prevent such modificatoins, the

s/modificatoins/modifications

-

Marked as reviewed by tschatzl (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3662


Re: RFR: 8254598: StringDedupTable should use OopStorage

2021-05-04 Thread Thomas Schatzl
On Fri, 23 Apr 2021 19:48:47 GMT, Kim Barrett  wrote:

> Please review this change to the String Deduplication facility.  It is being
> changed to use OopStorage to hold weak references to relevant objects,
> rather than bespoke data structures requiring dedicated processing phases by
> supporting GCs.
> 
> (The Shenandoah update was contributed by Zhengyu Gu.)
> 
> This change significantly simplifies the interface between a given GC and
> the String Deduplication facility, which should make it much easier for
> other GCs to opt in.  However, this change does not alter the set of GCs
> providing support; currently only G1 and Shenandoah support string
> deduplication.  Adding support by other GCs will be in followup RFEs.
> 
> Reviewing via the diffs might not be all that useful for some parts, as
> several files have been essentially completely replaced, and a number of
> files have been added or eliminated.  The full webrev might be a better
> place to look.
> 
> The major changes are in gc/shared/stringdedup/* and in the supporting
> collectors, but there are also some smaller changes in other places, most
> notably classfile/{stringTable,javaClasses}.
> 
> This change is additionally labeled for review by core-libs, although there
> are no source changes there.  This change injects a byte field of bits into
> java.lang.String, using one of the previously unused padding bytes in that
> class.  (There were two unused bytes, now only one.)
> 
> Testing:
> mach5 tier1-2 with and without -XX:+UseStringDeduplication
> 
> Locally (linux-x64) ran all of the existing tests that use string
> deduplication with both G1 and Shenandoah.  Note that
> TestStringDeduplicationInterned.java is disabled for shenandoah, as it
> currently fails, for reasons I haven't figured out but suspect are test
> related.
> 
> - gc/stringdedup/   -- these used to be in gc/g1
> - runtime/cds/SharedStringsDedup.java
> - runtime/cds/appcds/cacheObject/DifferentHeapSizes.java
> - runtime/cds/appcds/sharedStrings/*
> 
> shenandoah-only:
> - gc/shenandoah/jvmti/TestHeapDump.java
> - gc/shenandoah/TestStringDedup.java
> - gc/shenandoah/TestStringDedupStress.java
> 
> Performance tested baseline, baseline + stringdedup, new with stringdedup,
> finding no significant differences.

First pass, just comment suggestions for now.

src/hotspot/share/classfile/javaClasses.hpp line 119:

> 117:   static const uint8_t _deduplication_requested_mask = 1 << 1;
> 118: 
> 119:   static int flags_offset() { CHECK_INIT(_flags_offset); }

Maybe add some description about the injected `flags` field and its contents 
here.

src/hotspot/share/classfile/javaClasses.hpp line 152:

> 150:   static inline void set_value(oop string, typeArrayOop buffer);
> 151: 
> 152:   // Set the no_deduplication flag true.  This flag is sticky; once set 
> it

Initially I was a bit confused of the use of "flag" here and that the field is 
called "flags". Maybe use "deduplication bit" here, or rename the "flags" field 
but feel free to ignore.

src/hotspot/share/classfile/javaClasses.hpp line 154:

> 152:   // Set the no_deduplication flag true.  This flag is sticky; once set 
> it
> 153:   // never gets cleared.  This is set when a string is interned in the
> 154:   // StringTable, to prevent string deduplication from changing the 
> string's

I think this information about the contents of the injected "flags" should be 
collected somewhere else as piecing together what fields "flags" has from 
different method documentation seems wrong.

src/hotspot/share/classfile/javaClasses.hpp line 170:

> 168:   static inline bool hash_is_set(oop string);
> 169:   static inline bool is_latin1(oop java_string);
> 170:   static inline bool no_deduplication(oop java_string);

That identifier is missing a verb to read better, but I do not have a good 
idea. Maybe it would be easier to use the negation of "no_deduplication", 
something like "may_deduplicate"?
Feel free to ignore.

src/hotspot/share/classfile/javaClasses.hpp line 171:

> 169:   static inline bool is_latin1(oop java_string);
> 170:   static inline bool no_deduplication(oop java_string);
> 171:   static inline bool deduplication_requested(oop java_string);

Having a verb at the beginning like `is_deduplication_requested` sounds better.

src/hotspot/share/gc/shared/stringdedup/stringDedup.hpp line 31:

> 29: //
> 30: // String deduplication aims to reduce the heap live-set by deduplicating
> 31: // identical instances of String so that they share the same backing byte

... by deduplicating instances of (java.lang.)String with identical backing 
byte array (the String's value).

src/hotspot/share/gc/shared/stringdedup/stringDedup.hpp line 69:

> 67: // arrays.  This permits reclamation of arrays that become unused.  This 
> is
> 68: // separate from the request storage objects because dead count tracking 
> is
> 69: // used by the table implementation as part of resizing decisions and for

s/table/string table? I.e. which table is re

Re: RFR: 8257876: Avoid Reference.isEnqueued in tests [v2]

2020-12-10 Thread Thomas Schatzl
On Thu, 10 Dec 2020 08:56:25 GMT, Kim Barrett  wrote:

>> I understand that the test code previously just forwarded the 
>> `InterruptedException` if it happened in the `Thread.sleep()` call too. So 
>> this may only be an exiting issue and please ignore this comment.
>> Not catching `InterruptedException` here only seems to be a cause for 
>> unnecessary failure. Then again, it probably does not happen a lot.
>
> Nothing in the test calls Thread.interrupt(), so there isn't a risk of
> failure due to not handling that exception in some "interesting" way. But
> InterruptedException must be "handled" somehow, because it's a checked
> exception. That's already dealt with by the run() method declaring that it
> throws that type, and main declaring that it throws Exception.  The other
> tests modified in this change don't take that approach (just let it
> propagate out through main), instead wrapping the interruptable calls in
> try/catch, though again just to satisfy the requirement that a checked
> exception must be statically verified to be handled, even though there
> aren't going to be any thrown.

Okay.

-

PR: https://git.openjdk.java.net/jdk/pull/1691


Re: RFR: 8257876: Avoid Reference.isEnqueued in tests [v2]

2020-12-10 Thread Thomas Schatzl
On Thu, 10 Dec 2020 09:01:54 GMT, Kim Barrett  wrote:

>> Please review this change that eliminates the use of Reference.isEnqueued by
>> tests.  There were three tests using it:
>> 
>> vmTestbase/gc/gctests/ReferencesGC/ReferencesGC.java
>> vmTestbase/gc/gctests/WeakReferenceGC/WeakReferenceGC.java
>> jdk/java/lang/ref/ReferenceEnqueue.java
>> 
>> In each of them, some combination of using Reference.refersTo and
>> ReferenceQueue.remove with a timeout were used to eliminate the use of
>> Reference.isEnqueued.
>> 
>> I also cleaned up ReferencesGC.java in various respects.  It contained
>> several bits of dead code, and the failure checks were made stronger.
>> 
>> Testing:
>> mach5 tier1
>> Locally (linux-x64) ran all three tests with each GC (including Shenandoah).
>
> Kim Barrett has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   move REMOVE and RETAIN decls and init

Also good with deferring the changes to the comments and the move of the 
statics.

-

Marked as reviewed by tschatzl (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/1691


Re: RFR: 8257876: Avoid Reference.isEnqueued in tests

2020-12-09 Thread Thomas Schatzl
On Wed, 9 Dec 2020 13:23:47 GMT, Thomas Schatzl  wrote:

>> Please review this change that eliminates the use of Reference.isEnqueued by
>> tests.  There were three tests using it:
>> 
>> vmTestbase/gc/gctests/ReferencesGC/ReferencesGC.java
>> vmTestbase/gc/gctests/WeakReferenceGC/WeakReferenceGC.java
>> jdk/java/lang/ref/ReferenceEnqueue.java
>> 
>> In each of them, some combination of using Reference.refersTo and
>> ReferenceQueue.remove with a timeout were used to eliminate the use of
>> Reference.isEnqueued.
>> 
>> I also cleaned up ReferencesGC.java in various respects.  It contained
>> several bits of dead code, and the failure checks were made stronger.
>> 
>> Testing:
>> mach5 tier1
>> Locally (linux-x64) ran all three tests with each GC (including Shenandoah).
>
> test/jdk/java/lang/ref/ReferenceEnqueue.java line 58:
> 
>> 56: for (int i = 0; i < iterations; i++) {
>> 57: System.gc();
>> 58: enqueued = (queue.remove(100) == ref);
> 
> The code does not catch `InterruptedException` like it does in the other 
> files.

I understand that the test code previously just forwarded the 
`InterruptedException` if it happened in the `Thread.sleep()` call too. So this 
may only be an exiting issue and please ignore this comment.
Not catching `InterruptedException` here only seems to be a cause for 
unnecessary failure. Then again, it probably does not happen a lot.

-

PR: https://git.openjdk.java.net/jdk/pull/1691


Re: RFR: 8257876: Avoid Reference.isEnqueued in tests

2020-12-09 Thread Thomas Schatzl
On Tue, 8 Dec 2020 09:52:51 GMT, Kim Barrett  wrote:

> Please review this change that eliminates the use of Reference.isEnqueued by
> tests.  There were three tests using it:
> 
> vmTestbase/gc/gctests/ReferencesGC/ReferencesGC.java
> vmTestbase/gc/gctests/WeakReferenceGC/WeakReferenceGC.java
> jdk/java/lang/ref/ReferenceEnqueue.java
> 
> In each of them, some combination of using Reference.refersTo and
> ReferenceQueue.remove with a timeout were used to eliminate the use of
> Reference.isEnqueued.
> 
> I also cleaned up ReferencesGC.java in various respects.  It contained
> several bits of dead code, and the failure checks were made stronger.
> 
> Testing:
> mach5 tier1
> Locally (linux-x64) ran all three tests with each GC (including Shenandoah).

Changes requested by tschatzl (Reviewer).

test/jdk/java/lang/ref/ReferenceEnqueue.java line 58:

> 56: for (int i = 0; i < iterations; i++) {
> 57: System.gc();
> 58: enqueued = (queue.remove(100) == ref);

The code does not catch `InterruptedException` like it does in the other files.

test/hotspot/jtreg/vmTestbase/gc/gctests/ReferencesGC/ReferencesGC.java line 
129:

> 127: }
> 128: 
> 129: int REMOVE = (int) (RANGE * RATIO);

These two constants could be factored out as static finals to match the casing.

-

PR: https://git.openjdk.java.net/jdk/pull/1691


Re: RFR: 8257876: Avoid Reference.isEnqueued in tests

2020-12-09 Thread Thomas Schatzl
On Tue, 8 Dec 2020 17:30:11 GMT, Mandy Chung  wrote:

>> Please review this change that eliminates the use of Reference.isEnqueued by
>> tests.  There were three tests using it:
>> 
>> vmTestbase/gc/gctests/ReferencesGC/ReferencesGC.java
>> vmTestbase/gc/gctests/WeakReferenceGC/WeakReferenceGC.java
>> jdk/java/lang/ref/ReferenceEnqueue.java
>> 
>> In each of them, some combination of using Reference.refersTo and
>> ReferenceQueue.remove with a timeout were used to eliminate the use of
>> Reference.isEnqueued.
>> 
>> I also cleaned up ReferencesGC.java in various respects.  It contained
>> several bits of dead code, and the failure checks were made stronger.
>> 
>> Testing:
>> mach5 tier1
>> Locally (linux-x64) ran all three tests with each GC (including Shenandoah).
>
> Marked as reviewed by mchung (Reviewer).

I'm not able to put this in the appropriate place using the github UI:

[pre-existing] The topWeakReferenceGC.java description at the top describes 
that the test calls System.gc() explicitly to trigger garbage collections at 
the end. It does not. Maybe this could be weasel-worded around like in the 
other cases in that text.

-

PR: https://git.openjdk.java.net/jdk/pull/1691


Re: RFR: 8253583: java/util/StringJoiner tests failing on 32-bit VMs after JDK-8246697

2020-09-24 Thread Thomas Schatzl
On Thu, 24 Sep 2020 10:02:13 GMT, Aleksey Shipilev  wrote:

>> Lgtm.
>
> Trivial, right?

Trivial.

-

PR: https://git.openjdk.java.net/jdk/pull/331


Re: RFR: 8253583: java/util/StringJoiner tests failing on 32-bit VMs after JDK-8246697

2020-09-24 Thread Thomas Schatzl
On Thu, 24 Sep 2020 08:35:24 GMT, Aleksey Shipilev  wrote:

> JDK-8246697 added -Xmx4g to test configurations. 32-bit VMs cannot do it. 
> @requires tests the OS RAM size, but you can
> easily have the x86_32 host with more than 4G ram, yet JVM would fail to 
> acquire that heap size. Need to check for
> bitness explicitly.  Testing:
>   - [x] java/util/StringJoiner on x86_64 (still run)
>   - [x] java/util/StringJoiner on x86_32 (skipped now)

Lgtm.

-

Marked as reviewed by tschatzl (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/331


Re: RFR: JDK-8249258 java/util/StringJoiner/StringJoinerTest.java failed due to OOM (JDK 15)

2020-07-15 Thread Thomas Schatzl

Hi,

  I looked a bit at the allocations themselves, but first answering 
questions.


On 15.07.20 15:25, David Holmes wrote:
> On 15/07/2020 10:18 pm, Jim Laskey wrote:
>> Thomas explained: That large objects are never moved (outstanding
>> issue) So, it's possible to fragment the -Xmx4g such that a 2G object
>> can't be allocated,
>
> Naively one would expect that whatever memory was consumed by
>
> String maxString = "*".repeat(MAX_ARRAY_LENGTH);
>
> in OOM2, would be fully freed and available for use by the same
> statememt in OOM3. But without knowing the exact allocation patterns

This is true.

Augmenting OOM3 code with allocations/gcs:

Heap has 2.05g (1030 regions)
Allocation 1 for 1025 regions, 2g

  - concurrent mark start pause because of humongous allocation 
attempt; heap has 2.05g
  - not enough free space, so do a young collection, elevate to full 
collection -> heap shrunk to 2M

  - allocation goes through

1)  String maxString = "*".repeat(MAX_ARRAY_LENGTH);
try {

Allocation 2 for 2048 regions(!), 4g
  - concurrent start pause because of humongous allocation attempt
  - not enough free space, so do a young collection, elevate to full 
collection -> heap stays at 2.05g -> OOME


2)  new StringJoiner(maxString, "", maxString).toString();
fail("Should have thrown OutOfMemoryError");

Two observations:
- I ask myselves how that test could have ever failed (i.e. not throw an 
OOME). A 4g allocation on a 4g heap can in practice never succeed. This 
is very suspicious.


- It is also interesting why Allocation 2 internally has been truncated 
to a 2048 region allocation. It should be 2049 (4g + 16 bytes header?). 
Checking at lower layers, memory management get a request for 4294967296 
bytes which is exactly 2^32... this is too small for that object. 
Something is truncating that string. Printing it gives a length of 
2147483639 chars (i.e. 2^32-1-9). I assume that is correct to silently 
truncate the result string?


Thanks,
  Thomas


Re: RFR: JDK-8249258 java/util/StringJoiner/StringJoinerTest.java failed due to OOM (JDK 15)

2020-07-15 Thread Thomas Schatzl

Hi,

On 15.07.20 15:35, Jim Laskey wrote:

Try this:

- You have a 4G heap.
- You allocate some stuff, say 1 byte.
- You allocate a 2G object - so there is only 2G - 1, left. Not enough space 
for another 2G object.
- But you do allocate 1 byte.
- You have 1 byte, 2G and 1 byte.
- You free the original 2G object.
- But something allocates 1 byte in it's old space.
- Now there is no range that that can accommodate a 2G object. OOM.


  just for clarification: it's a bit more elaborate than that to get 
into this situation as G1 will try a full collection that will compact 
these 1 byte "stuff" into a contiguous range of memory.


G1 at this time will never move "large" (>= heap region size / 2; at 4g 
heap region size is 2M iirc) objects though. I.e. with a suitably placed 
single "large" object in that 4g heap (like covering the middle center 
regions of the heap, i.e. regions #1023 and #1024) you can make 
allocation of that 2g object impossible.


Thanks,
  Thomas


Re: RFR: 8188055: (ref) Add Reference.refersTo predicate

2020-04-08 Thread Thomas Schatzl

Hi,

On 08.04.20 02:25, Kim Barrett wrote:

[Note review on both core-libs and hotspot-gc-dev lists; try not to lose
either when replying.]

Please review a new function: java.lang.ref.Reference.refersTo.


[...]


CR:
https://bugs.openjdk.java.net/browse/JDK-8188055
https://bugs.openjdk.java.net/browse/JDK-8241029 (CSR)

Webrev:
https://cr.openjdk.java.net/~kbarrett/8188055/open.04/

Testing:
mach5 tier1

Locally (linux-x64) verified the new test passes with various garbage
collectors.



Lgtm.

Thomas


Re: RFR: Small cleanups in java.lang.ref

2018-05-18 Thread Thomas Schatzl
Hi Martin,

On Fri, 2018-05-18 at 07:47 -0700, Martin Buchholz wrote:
> 
> 
> On Fri, May 18, 2018 at 3:26 AM, Thomas Schatzl  e.com> wrote:
> > Hi,
> > 
> > On Wed, 2018-05-16 at 18:31 -0700, Martin Buchholz wrote:
> > > I've been confused by NULL vs null for years.
> > > 
> > > 8203327: Small cleanups in java.lang.ref
> > > http://cr.openjdk.java.net/~martin/webrevs/jdk/Reference-cleanup/
> > > https://bugs.openjdk.java.net/browse/JDK-8203327
> > 
> >   JDK-8203028 which is currently out for review
> > (http://cr.openjdk.java.net/~kbarrett/8203028/open.02) changes and
> > fixes the comments in that respect apart from other significant
> > changes to reference processing.
> > 
> > Pushing this cleanup would just cause merge conflicts for us (the
> > gc team). So this change does not seem to be required any more,
> > i.e. do you mind retracting this RFR and closing this change as
> > duplicate of the other?
> > 
> > (I apologize if my speech is too direct, I do not want to offend)
> 
> On the contrary, I'm very happy to see gc team actively maintaining
> java.lang.ref.

We are only fixing some performance issues for the current
implementation in G1 (and only incidentally for some of the other GCs)
(look for the "gc-reference-processor" label in the bug tracker) as
kind-of part of JEP 308 :)

In the future G1 may do reference processing completely concurrently
(as already mentioned elsewhere).

All these changes are and will most likely be completely unrelated to
java.lang.ref, apart from that one you just tripped over. I believe
there are no further changes in java.lang.ref needed, so gc-team
"maintenance" for that package can probably be considered "inactive"
again :P

> I've reverted changes to Reference.java
> 
> http://cr.openjdk.java.net/~martin/webrevs/jdk/Reference-cleanup-1/

  okay, I did not notice that we did not touch the ReferenceQueue,
currently travelling...

The change looks good to me now, but there are probably more
knowledgable people around :)

The copyright could be updated though.

Thanks,
  Thomas




Re: RFR: Small cleanups in java.lang.ref

2018-05-18 Thread Thomas Schatzl
Hi,

On Wed, 2018-05-16 at 18:31 -0700, Martin Buchholz wrote:
> I've been confused by NULL vs null for years.
> 
> 8203327: Small cleanups in java.lang.ref
> http://cr.openjdk.java.net/~martin/webrevs/jdk/Reference-cleanup/
> https://bugs.openjdk.java.net/browse/JDK-8203327

  JDK-8203028 which is currently out for review (http://cr.openjdk.java
.net/~kbarrett/8203028/open.02) changes and fixes the comments in that
respect apart from other significant changes to reference processing.

Pushing this cleanup would just cause merge conflicts for us (the gc
team). So this change does not seem to be required any more, i.e. do
you mind retracting this RFR and closing this change as duplicate of
the other?

(I apologize if my speech is too direct, I do not want to offend)

Thanks,
  Thomas





Re: GetPrimitiveArrayCritical vs GetByteArrayRegion: 140x slow-down using -Xcheck:jni and java.util.zip.DeflaterOutputStream

2018-03-19 Thread Thomas Schatzl
Hi,

On Fri, 2018-03-16 at 17:19 +, Ian Rogers wrote:
> Thanks Paul, very interesting.
> 
> On Fri, Mar 16, 2018 at 9:21 AM Paul Sandoz 
> wrote:
> > Hi Ian, Thomas,
> > 
> > [...]
> > (This is also something we need to consider if we modify buffers to
> > support capacities larger than Integer.MAX_VALUE. Also connects
> > with Project Panama.)
> > 
> > If Thomas has not done so or does not plan to i can log an issue
> > for you.
> > 
> 
> That'd be great. I wonder if identifying more TTSP issues should also
> be a bug. Its interesting to observe that overlooking TTSP in C2
> motivated the Unsafe.copyMemory change permitting a fresh TTSP issue.
> If TTSP is a 1st class issue then maybe we can deprecate JNI critical
> regions to support that effort :-)

Please log an issue. I am still a bit unsure what and how many issues
should be filed.

@Ian: at bugreports.oracle.com everyone may file bug reports without
the need for an account.
It will take some time until they show up in Jira due to vetting, but
if you have a good case, and can e.g. link to the mailing list, this
should be painless.

Thanks,
  Thomas



Re: GetPrimitiveArrayCritical vs GetByteArrayRegion: 140x slow-down using -Xcheck:jni and java.util.zip.DeflaterOutputStream

2018-03-15 Thread Thomas Schatzl
Hi,

On Thu, 2018-03-15 at 01:00 +, Ian Rogers wrote:
> An old data point on how large a critical region should be comes from
> java.nio.Bits. In JDK 9 the code migrated into unsafe, but in JDK 8
> the copies within a critical region were bound at most copying 1MB:
> http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/
> native/java/nio/Bits.c#l88 This is inconsistent with Deflater and
> ObjectOutputStream which both allow unlimited arrays and thereby
> critical region sizes.
> 
> In JDK 9 the copies starve the garbage collector in nio Bits too as
> there is no 1MB limit to the copy sizes:
> http://hg.openjdk.java.net/jdk/jdk/rev/f70e100d3195
> which came from:
> https://bugs.openjdk.java.net/browse/JDK-8149596
> 
> Perhaps this is a regression not demonstrated due to the testing
> challenge.
> [...]
> It doesn't seem unreasonable to have the loops for the copies occur
> in 1MB chunks but JDK-8149596 lost this and so I'm confused on what
> the HotSpot stand point is.

Please file a bug (seems to be a core-libs/java.nio regression?),
preferably with some kind of regression test. Also file enhancements (I
would guess) for the other cases allowing unlimited arrays.

Long TTSP is a performance bug as any other.

> In a way criticals are better than unsafe as they may
> pin the memory and not starve GC, which shenandoah does.

(Region based) Object pinning has its own share of problems:

 - only (relatively) easily implemented in region based collectors

 - may slow down pause a bit in presence of pinned regions/objects (for
non-concurrent copying collectors)

 - excessive use of pinning may cause OOME and VM exit probably earlier
than the gc locker. GC locker seems to provide a more gradual
degradation. E.g. pinning regions typically makes these regions
unavailable for allocation.
I.e. you still should not use it for many, very long living objects.
Of course this somewhat depends on the sophistication of the
implementation.

I think region based pinning would be a good addition to other
collectors than Shenandoah too. It has been on our minds for a long
time, but there are so many other more important issues :), so of
course we are eager to see contributions in this area. ;)

If you are interested on working on this, please ping us on hotspot-gc-
dev for implementation ideas to get you jump-started.

Thanks,
  Thomas



Re: RFR(xs): 8193909: Obsolete(remove) Co-operative Memory Management (CMM)

2018-02-15 Thread Thomas Schatzl
Hi,

On Wed, 2018-02-14 at 13:45 -0800, sangheon.kim wrote:
> Hi all,
> 
> Could I have some reviews for CMM removal?
> This is closed CR but some public codes also need small
> modifications. 
> This CR is for removing stuff related to an Oracle JDK
> module/package.
> Changes are just removing CMM from lists or in a test to skip the 
> testing logic.
> 
> CR: https://bugs.openjdk.java.net/browse/JDK-8193909
> Webrev: http://cr.openjdk.java.net/~sangheki/8193909/webrev.0
> Testing: hs-tier1~5, jdk1~3, open/test/jdk:jdk_core
> 

  looks good.

Thomas


Re: RFR(s): 8191859: SoftReference clock/timestamp should be declared volatile

2017-11-24 Thread Thomas Schatzl
Hi,

On Fri, 2017-11-24 at 14:13 +0100, Per Liden wrote:
> The clock and timestamp fields in SoftReference should be declared 
> volatile. These fields are read or updated by the VM, (possibly) 
> concurrently with the execution of Java threads.
> 
> To be more specific, the timestamp field is read by the GC during 
> reference discovery, e.g. during G1/CMS concurrent mark. The clock
> is updated by the GC, typically after reference processing has
> completed.
> 
> Webrev: http://cr.openjdk.java.net/~pliden/8191859/webrev.0/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8191859
> 

  looks good to me.

Thomas



Re: RFR: 8132306: java/lang/ref/ReferenceEnqueue.java fails with "RuntimeException: Error: poll() returned null; expected ref object"

2015-07-29 Thread Thomas Schatzl
Hi Kim,

On Wed, 2015-07-29 at 03:57 -0400, Kim Barrett wrote:
> Please review this fix of a race condition in
> j.l.r.Reference/ReferenceQueue.  See comments in the bug report for a
> description of the race.  The race is being fixed by reordering a pair
> of volatile assignments.
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8132306
> 
> Webrev:
> http://cr.openjdk.java.net/~kbarrett/8132306/webrev.00/
> 
> Testing:
> jprt with default and hotspot testsets
> 
> Locally tested race in original code by insertion of sleep in enqueue and
> verifying ReferenceEnqueue regression test fails consistently as
> described in the bug report.
> 
> Locally tested fixed code by insertion of sleeps at various points and
> verifying ReferenceEnqueue regression test still passes.

  looks good.

Thanks for taking care of this so quickly,
  Thomas




Re: Another take on Finalization

2015-06-08 Thread Thomas Schatzl
Hi,

On Sun, 2015-06-07 at 10:46 +0200, Peter Levart wrote:
> Hi,
> 
> On 06/05/2015 11:11 PM, Jonathan Payne wrote:
> > My problem was that finalization was not being run at all with the G1
>>collector. Not at all. That would have been fine with me because none
>>of the existing objects in the Finalizer queue actually needed the
>>service anymore: the files, sockets, streams, etc. had all been
>>properly closed by my application, otherwise the server would have
>>long since failed completely. However, those objects started to
>>accumulate in the VM and eventually (8 hours later) brought the server
>>down.
> 
> I made a quick test comparing default (PS MarkSweep / PS Scavenge) and 
> G1 collectors with -Xmx512m option. When finalizable objects are 
> released immediately after construction with a sustained rate of ~90 
> objects/ms, the results for default collector are:
[...]
> ...which suggests that G1 is no less eager to process finalizable 
> objects that are young garbage (see "in-flight" column). At least it 
> doesn't seem that they accumulate. So your problem might have been with 
> finalizable objects escaping to old-gen. Tuning the young-gen size and 
> survivor ratio might help in your case.

  that might have been an artifact of old G1 implementations that did
not do class unloading (and various other cleanups) during the
concurrent marking cycle.

That feature is in the official release since 8u40, so G1 should
definitely also try to finalize objects that escaped into old gen for
some time now, without full gc.

Thanks,
  Thomas




Re: [Fwd: Missing merge for 8014890: Reference queues may return more entries than expected]

2013-08-14 Thread Thomas Schatzl
Hi,

On Wed, 2013-08-14 at 10:43 +0100, Alan Bateman wrote:
> Mandy is just back from vacation and might not have got to your mail 
> yet. However it's not clear to me that there is an actual problem as it 
> looks like you've just ended up with two heads, maybe a hg fetch when 
> you had changes where you forgot to merge? As I understand, the hg 
> server is setup to limit the heads to 1 so it's not possible for any of 
> us to push changes that would result in a new head. Maybe start with a 
> fresh clone of jdk8/tl or jdk8/jdk8?

  I had the impression that I had a fresh checkout at that time, and did
not recheck right now.

A fresh checkout is okay.

Sorry for the noise,
  Thomas



[Fwd: Missing merge for 8014890: Reference queues may return more entries than expected]

2013-08-14 Thread Thomas Schatzl
Hi all,

  could somebody take care of the missing merge for the fix below?

Thanks,
Thomas

 Forwarded Message 
> From: Thomas Schatzl 
> To: mandy.ch...@oracle.com
> Subject: Missing merge for 8014890: Reference queues may return more
> entries than expected
> Date: Wed, 07 Aug 2013 09:47:30 +0200
> 
> Hi Mandy,
> 
>   while checking out latest sources, I saw that the commit for that
> patch has not been merged to the default branch in the repository. It
> represents its own head.
> 
> Could you fix this?
> 
> Here's the relevant hg output:
> 
> $ hg heads
> changeset:   7764:8ed8e2b4b90e
> tag: tip
> parent:  7763:0741b19835b0
> parent:  7692:b52a2ecdb803
> user:lana
> date:Tue Aug 06 10:10:39 2013 -0700
> summary: Merge
> 
> changeset:   7605:2c04ce2f782c
> user:mchung
> date:Mon Jul 08 14:05:59 2013 +0200
> summary: 8014890: (ref) Reference queues may return more entries
> than expected
> 




Re: RFR (XS): 8014890 : Reference queues may return more entries than expected

2013-07-12 Thread Thomas Schatzl
Hi,

On Fri, 2013-07-12 at 19:22 +0800, Mandy Chung wrote:
> Hi Thomas,
> 
> On 7/1/2013 7:51 PM, Thomas Schatzl wrote:
> > I changed this in
> > http://cr.openjdk.java.net/~tschatzl/8014890/webrev.refactor  to that
> > version now, using a temporary variable that stores r.queue before the
> > checks to avoid the double volatile reads.
> 
> This looks good.  I have pushed this for you.  Sorry for the delay as I 
> have been busy on other tasks.

Thanks a lot everyone.

Thomas



Re: RFR (XS): 8014890 : Reference queues may return more entries than expected

2013-07-01 Thread Thomas Schatzl
Hi all,

On Mon, 2013-07-01 at 15:44 +0400, Aleksey Shipilev wrote:
> On 07/01/2013 03:37 PM, David Holmes wrote:
> > On 1/07/2013 8:14 PM, Aleksey Shipilev wrote:
> >> The same "thou shalt not do multiple volatile reads" applies to 
> >> "(r.queue == NULL) || (r.queue == ENQUEUED)" now.
> > 
> > Doesn't that just reduce to "r.queue != this" ? (The assert suggests
> > so :) )
> 
> Thomas' original patch had this in form of "r.queue != this". I argue it
> is more future-proof to distinguish the concrete cases.

:)

I also thought it was more understandable if the cases were explicitly
enumerated, in addition to the assert.

I changed this in 
http://cr.openjdk.java.net/~tschatzl/8014890/webrev.refactor to that
version now, using a temporary variable that stores r.queue before the
checks to avoid the double volatile reads.

However for me either version is fine, just tell me what you favor.

I am not really happy about bitwise ORing the two comparison results as
it seems to reduce readability at no real gain.

Thanks,
Thomas




Re: RFR (XS): 8014890 : Reference queues may return more entries than expected

2013-07-01 Thread Thomas Schatzl
Hi David,

On Mon, 2013-07-01 at 17:51 +1000, David Holmes wrote:
> Well you can ignore what I wrote below - sorry. Somehow I got it in my 
> head that multiple enqueue's were intended/supported when of course they 
> are not. :(
> 
> So the proposed fix is okay - though I'd simplify the comment to just:
> 
> // Check that since getting the lock this reference hasn't already been
> // enqueued (and even then removed)

Done, see the new webrev at
http://cr.openjdk.java.net/~tschatzl/8014890/webrev.01/

I will send this version to Mandy for pushing if nobody objects in the next few 
days.

> The synchronization is problematic as I mention below but there is no 
> easy fix due to the lock-ordering problem, and any attempt at such a fix 
> would be much riskier. So this fix is fine - thanks.

Fyi, while waiting for your approval, I tried to clean up this a little
taking into account the comments from Peter and Aleksey (sorry if I
forgot somebody) into account.

A webrev for this is at
http://cr.openjdk.java.net/~tschatzl/8014890/webrev.refactor/ 

The changes are:
 - avoid the double read from the volatile head member in
ReferenceQueue.reallyPoll()
 - the fix for 8014890 you just reviewed
 - make Reference.queue volatile, and remove the synchronization on it
from Reference.isEnqueued() and ReferenceQueue.enqueue(). I do not see a
performance problem: a volatile read (in Reference.isEnqueued()) should
be at least as fast as the synchronization on that reference.

I think the change is okay as in functionally correct; I ran it through
jprt (running all the available test cases, including the one for the
original 8014890 patch, in the process), passing it.

Maybe it is useful to you for future reference.

Thanks everyone for your suggestions,
  Thomas

Standard information about the patch (the original one, not the
suggested cleanup):

JIRA:
https://jbs.oracle.com/bugs/browse/JDK-8014890

bugs.sun.com
http://bugs.sun.com/view_bug.do?bug_id=8014890

Testing:
jck, jprt, manual testing on most platforms




Re: RFR (XS): 8014890 : Reference queues may return more entries than expected

2013-06-19 Thread Thomas Schatzl
Hi,

On Wed, 2013-06-19 at 17:56 +0400, Aleksey Shipilev wrote:
> On 06/19/2013 04:05 PM, David Holmes wrote:
> > I think I'm going to need a lot more time to understand this issue 
> > completely. The synchronization being used seems complex and
> > somewhat ill-defined, and also inadequate. I'm not convinced that
> > adding code inside a synchronized block really fixes a race condition
> > caused by unsynchronized access - but it may well narrow the window
> > significantly.
> 
> +1. Having spent 30 minutes trying to figure out the synchronization
> protocol for R and RQ, I can say that is a haunted part of JDK.
> 

:) I had the same problem when trying to figure out the code.

> That said, I think the patch fixes the concrete issue. We are
> piggybacking on the mutual exclusion on RQ.lock to protect ourselves
> from the concurrent mutation of r.queue. The only two places where we
> mutate it is RQ.enqueue() and RQ.poll(), both secured with RQ.lock.
> Given the r.queue is not volatile, we should also acquire the same lock
> while reading r.queue, otherwise races obliterate the reasoning about
> the correctness.
>
> With that notion in mind, I start to wonder if we just need to move the
> check in enqueue() down into the synchronized(lock), and extend it to
> capture NULL:
> 
> --- a/src/share/classes/java/lang/ref/ReferenceQueue.java Mon Jun 17
> 16:28:22 2013 +0400
> +++ b/src/share/classes/java/lang/ref/ReferenceQueue.java Wed Jun 19
> 17:53:24 2013 +0400
> @@ -56,8 +56,9 @@
> 
>  boolean enqueue(Reference r) { /* Called only by
> Reference class */
>  synchronized (r) {
> -if (r.queue == ENQUEUED) return false;
>  synchronized (lock) {
> +if (r.queue == ENQUEUED || r.queue == NULL)
> +return false;
>  r.queue = ENQUEUED;
>  r.next = (head == null) ? r : head;
>  head = r;
> 
> Will it be equivalent to what Thomas is trying to accomplish?

Yes, this is equivalent. Instead of checking the separate ENQUEUED and
NULL values I simply used an "r.queue != this" check which would get
both cases. R.queue cannot have other values than ENQUEUED, NULL or the
queue's value set at initialization.

I wanted to minimize the changes for this particular CR.

Thanks,
Thomas




Re: RFR (XS): 8014890 : Reference queues may return more entries than expected

2013-06-19 Thread Thomas Schatzl
Hi,

On Wed, 2013-06-19 at 09:25 +0200, Thomas Schatzl wrote:
> Hi,
> 
> On Tue, 2013-06-18 at 15:53 -0700, Mandy Chung wrote:
> > Hi Thomas,
> > 
> > Thanks for the detailed analysis.
> > > http://cr.openjdk.java.net/~tschatzl/8014890/webrev/
> > 
> > The fix looks good.  I was able to reproduce the failure with the
> > test case in the bug report running with jdk8 b94 fastdebug build.
> > But the new regression test doesn't fail (I tried solaris-i586).
> > Did it fail on your system?
> > 
> 
> Yes, 100% on my developer machine. I will try to modify it to make it
> fail on more machines (increase number of runs).

With 1000 runs (from 100) I can reproduce on

product/win32, fastdebug+product/linux-x86_64, fastdebug+product/win64,
fastdebug+product/linux-x86, fastdebug+product/solaris-x86, fastdebug
+product/solaris-sparc...

Still a few platforms, i.e. some fastdebug platforms, do not reproduce
the failure. This 

With 100 runs I do not get that many platforms to fail. I will change
the number of iterations to 1000 by default.

When writing the test I was trying to make the test run as quickly as
possible while still showing the issue. I probably went too far in the
"being quick" direction.

I updated the webrev (just the change for the number of iterations from
100 to 1000 in the test program). Its runtime is still only a few
seconds.

Thomas




Re: RFR (XS): 8014890 : Reference queues may return more entries than expected

2013-06-19 Thread Thomas Schatzl
Hi Peter,

On Wed, 2013-06-19 at 11:05 +0200, Peter Levart wrote:
> On 06/19/2013 09:17 AM, Thomas Schatzl wrote:
> > Hi,
> >
> > On Wed, 2013-06-19 at 12:47 +1000, David Holmes wrote:
> >> Hi Thomas,
> >>
> >> On 19/06/2013 7:08 AM, Thomas Schatzl wrote:
> >>> Hi all,
> >>>
> >>> can I have reviews for the following change?
> >>>
> >>> It happens if multiple threads are enqueuing and dequeuing reference
> >>> objects into a reference queue, that Reference objects may be enqueued
> >>> at multiple times.
> >>>
> >>> This is because when java.lang.ref.ReferenceQueue.poll() returns and
> >>> inactivates a Reference object, another thread may just be during
> >>> enqueuing it again.
> >> Are we talking about different queues here? Otherwise both poll() and
> >> enqueue() are using synchronized(lock). I also note that enqueue
> >> synchronizes on the Reference itself, which suggests that poll (actually
> >> reallyPoll) should also be synchronizing on the reference (though we
> >> have a nested lock inversion problem that would need to be solved).
> > This does not help imo. Still there may be temporary storage containing the 
> > original ReferenceQueue reference, and .enqueue() gets called on it with 
> > the now inactive Reference.
> >
> > Enqueue() and (really-)poll() themselves already synchronize on the
> > "lock" lock to guard modification of the queue, this is safe.
> >
> > The synchronize(r) statement in ReferenceQueue.enqueue() is about 
> > synchronization with Reference.isEnqueued() I think. Actually I am not sure 
> > whether the synchronization between isEnqueued() and enqueue() makes a 
> > difference.
> 
> Hi Thomas,
> 
> It doesn't make a difference between Reference.isEnqueued() and 
> ReferenceQueue.poll(), since there isn't any synchronization between the 

I do not disagree with that, I was just guessing that the
synchronization on the reference was for synchronization between
isEnqueued() and enqueue(). Not poll() and isEnqueued() for the reasons
you mentioned.

> two. So isEnqueued() can still be returning true while another thread 
> has already de-queued the instance. I guess the real use of isEnqueued() 
> method is reliable detection of false -> true transition only.

Probably. 

> I can't see why the isEnqueued() method checks for both (next != null && 
> queue == ENQUEUED). Wouldn't the check for (queue == ENQUEUED) be 
> enough? In that case the Reference.queue field could be made volatile 
> and synchronization on Reference instance removed.

You are right that volatile would be needed in this case.

> While you're at it, the reallyPoll() could optimize the double volatile 
> read from head and only perform it once:

I would be fine with that change, depends on others' opinions. It might
be better to make a different CR for cleanups though.

> 
>  private Reference reallyPoll() {   /* Must hold 
> lock */
>  Reference r = head;
>  if (r != null) {
>  head = (r.next == r) ? null : r.next;
>  r.queue = NULL;
>  r.next = r;
>  queueLength--;
>  if (r instanceof FinalReference) {
>  sun.misc.VM.addFinalRefCount(-1);
>  }
>  return r;
>  }
>  return null;
>  }
> 
> 
> >
> > Another solution would be to synchronize enqueue() and poll() on the queue 
> > itself, and check whether the reference passed to enqueue() is inactive or 
> > not (i.e. this check is still needed).
> 
> ReferenceQueue.this and ReferenceQueue.lock are 1<->1. What difference 
> would that make?

None, as I mentioned, for the original issue. Except that it solves the
problem, that if you wanted (as I think has been suggested in the
earlier email) to also synchronize on the reference, which is a bad idea
because of the different order of the nested synchronization. (I am not
sure how also synchronizing on the reference in poll() would improve the
situation too)

It would also obviate the need for the "lock" lock in a next step as
they are 1<->1. But maybe for some other reasons using an explicit lock
object is the preferred way.

Thomas




Re: RFR (XS): 8014890 : Reference queues may return more entries than expected

2013-06-19 Thread Thomas Schatzl
Hi,

 one more note :)

On Wed, 2013-06-19 at 09:28 +0200, Thomas Schatzl wrote:
> Hi again,
> 
>   some correction...
> 
> On Wed, 2013-06-19 at 09:17 +0200, Thomas Schatzl wrote:
> > Hi,
> > 
> > On Wed, 2013-06-19 at 12:47 +1000, David Holmes wrote:
> > > Hi Thomas,
> > > 
> > > On 19/06/2013 7:08 AM, Thomas Schatzl wrote:
> > > > Hi all,
> > > >
> > > >can I have reviews for the following change?
> > > >
> > > > It happens if multiple threads are enqueuing and dequeuing reference
> > > > objects into a reference queue, that Reference objects may be enqueued
> > > > at multiple times.
> > > >
> > > > This is because when java.lang.ref.ReferenceQueue.poll() returns and
> > > > inactivates a Reference object, another thread may just be during
> > > > enqueuing it again.
> > > 
> > > Are we talking about different queues here? Otherwise both poll() and 
> > > enqueue() are using synchronized(lock). I also note that enqueue 
> > > synchronizes on the Reference itself, which suggests that poll (actually 
> > > reallyPoll) should also be synchronizing on the reference (though we 
> > > have a nested lock inversion problem that would need to be solved).
> > 
> > This does not help imo. Still there may be temporary storage containing the 
> > original ReferenceQueue reference, and .enqueue() gets called on it with 
> > the now inactive Reference.
> > 
> > Enqueue() and (really-)poll() themselves already synchronize on the
> > "lock" lock to guard modification of the queue, this is safe.
> > 
> > The synchronize(r) statement in ReferenceQueue.enqueue() is about 
> > synchronization with Reference.isEnqueued() I think. Actually I am not sure 
> > whether the synchronization between isEnqueued() and enqueue() makes a 
> > difference.
> 
> This also guards against multiple concurrent calls to enqueue on the
> same reference - so this statement is needed after all.

Actually, with that patch (e.g. the if (this != r.queue check) return
false; ) this situation would also be covered I think.

But that's unrelated, sorry.

Thomas




Re: RFR (XS): 8014890 : Reference queues may return more entries than expected

2013-06-19 Thread Thomas Schatzl
Hi again,

  some correction...

On Wed, 2013-06-19 at 09:17 +0200, Thomas Schatzl wrote:
> Hi,
> 
> On Wed, 2013-06-19 at 12:47 +1000, David Holmes wrote:
> > Hi Thomas,
> > 
> > On 19/06/2013 7:08 AM, Thomas Schatzl wrote:
> > > Hi all,
> > >
> > >can I have reviews for the following change?
> > >
> > > It happens if multiple threads are enqueuing and dequeuing reference
> > > objects into a reference queue, that Reference objects may be enqueued
> > > at multiple times.
> > >
> > > This is because when java.lang.ref.ReferenceQueue.poll() returns and
> > > inactivates a Reference object, another thread may just be during
> > > enqueuing it again.
> > 
> > Are we talking about different queues here? Otherwise both poll() and 
> > enqueue() are using synchronized(lock). I also note that enqueue 
> > synchronizes on the Reference itself, which suggests that poll (actually 
> > reallyPoll) should also be synchronizing on the reference (though we 
> > have a nested lock inversion problem that would need to be solved).
> 
> This does not help imo. Still there may be temporary storage containing the 
> original ReferenceQueue reference, and .enqueue() gets called on it with the 
> now inactive Reference.
> 
> Enqueue() and (really-)poll() themselves already synchronize on the
> "lock" lock to guard modification of the queue, this is safe.
> 
> The synchronize(r) statement in ReferenceQueue.enqueue() is about 
> synchronization with Reference.isEnqueued() I think. Actually I am not sure 
> whether the synchronization between isEnqueued() and enqueue() makes a 
> difference.

This also guards against multiple concurrent calls to enqueue on the
same reference - so this statement is needed after all.

Sorry,
  Thomas




Re: RFR (XS): 8014890 : Reference queues may return more entries than expected

2013-06-19 Thread Thomas Schatzl
Hi,

On Tue, 2013-06-18 at 15:53 -0700, Mandy Chung wrote:
> Hi Thomas,
> 
> Thanks for the detailed analysis.
> > http://cr.openjdk.java.net/~tschatzl/8014890/webrev/
> 
> The fix looks good.  I was able to reproduce the failure with the
> test case in the bug report running with jdk8 b94 fastdebug build.
> But the new regression test doesn't fail (I tried solaris-i586).
> Did it fail on your system?
> 

Yes, 100% on my developer machine. I will try to modify it to make it
fail on more machines (increase number of runs).

It is hard to reproduce such things consistently, but I think we nailed
down the issue and that's probably what counts in such cases.

Also consider that this issue has likely been present since quite some
time, and has only been noticed and investigated just now.

> I can sponsor your fix.

Thanks,
  Thomas




Re: RFR (XS): 8014890 : Reference queues may return more entries than expected

2013-06-19 Thread Thomas Schatzl

Hi,

On Wed, 2013-06-19 at 12:47 +1000, David Holmes wrote:
> Hi Thomas,
> 
> On 19/06/2013 7:08 AM, Thomas Schatzl wrote:
> > Hi all,
> >
> >can I have reviews for the following change?
> >
> > It happens if multiple threads are enqueuing and dequeuing reference
> > objects into a reference queue, that Reference objects may be enqueued
> > at multiple times.
> >
> > This is because when java.lang.ref.ReferenceQueue.poll() returns and
> > inactivates a Reference object, another thread may just be during
> > enqueuing it again.
> 
> Are we talking about different queues here? Otherwise both poll() and 
> enqueue() are using synchronized(lock). I also note that enqueue 
> synchronizes on the Reference itself, which suggests that poll (actually 
> reallyPoll) should also be synchronizing on the reference (though we 
> have a nested lock inversion problem that would need to be solved).

This does not help imo. Still there may be temporary storage containing the 
original ReferenceQueue reference, and .enqueue() gets called on it with the 
now inactive Reference.

Enqueue() and (really-)poll() themselves already synchronize on the
"lock" lock to guard modification of the queue, this is safe.

The synchronize(r) statement in ReferenceQueue.enqueue() is about 
synchronization with Reference.isEnqueued() I think. Actually I am not sure 
whether the synchronization between isEnqueued() and enqueue() makes a 
difference.

Another solution would be to synchronize enqueue() and poll() on the queue 
itself, and check whether the reference passed to enqueue() is inactive or not 
(i.e. this check is still needed).

> > ReferenceHandlerThread.run():
> >
> > 0: [...]
> > 1: ReferenceQueue q = r.queue; // r is the reference
> > 2: if (r != ReferenceQueue.NULL)
> > 3:   q.enqueue().

Between lines 2 and 3, q contains a reference to the old ReferenceQueue (which 
is not ReferenceQueue.NULL). So if the thread is switched there, i.e. the main 
thread does a poll() on q, the main thread makes r inactive. When switching 
back to the reference handler thread (or any other thread), enqueue() of that 
Reference r will still be called on the original q. The actual r.queue is 
already ReferenceQueue.NULL from the poll(). (i.e. the Reference is inactive). 
This change guards against that situation as this is (imo) an unexpected 
behavior (that you can enqueue a Reference multiple times; and an already 
inactive one too).

The only solution would be synchronizing on r in the ReferenceHandler code to 
avoid that (I think). However then everyone that uses Reference.enqueue() 
(which uses ReferenceQueue.enqueue()) would need to synchronize on the 
Reference to make the code thread safe. I haven't seen that in the 
documentation.

As mentioned this situation may occur independently of whether the 
ReferenceHandler thread is involved or not.

I.e. if you look at Reference.enqueue() which reads as

  this.queue.enqueue(this)

This is the same situation, if you consider that "this.queue" may be 
temporarily stored in e.g. a register during the thread switch.


> > Webrev with test case
> > http://cr.openjdk.java.net/~tschatzl/8014890/webrev/
> >
> > JIRA:
> > https://jbs.oracle.com/bugs/browse/JDK-8014890
> >
> > bugs.sun.com
> > http://bugs.sun.com/view_bug.do?bug_id=8014890
> >
> > Testing:
> > jck, jprt, manual testing
> >
> > Note that I also need a sponsor to push in case this change is approved.

Thanks,
  Thomas




RFR (XS): 8014890 : Reference queues may return more entries than expected

2013-06-18 Thread Thomas Schatzl
Hi all,

  can I have reviews for the following change?

It happens if multiple threads are enqueuing and dequeuing reference
objects into a reference queue, that Reference objects may be enqueued
at multiple times.

This is because when java.lang.ref.ReferenceQueue.poll() returns and
inactivates a Reference object, another thread may just be during
enqueuing it again.

In the test case provided, the two threads that conflict are the
reference handler thread and the program (main) thread.

Relevant code:

ReferenceHandlerThread.run():

0: [...]
1: ReferenceQueue q = r.queue; // r is the reference
2: if (r != ReferenceQueue.NULL)
3:   q.enqueue().

ReferenceQueue::poll()(reallyPoll()) code (I removed lots of code here)

1: synchronized(lock) {
2:   [...]
3:   r.queue = ReferenceQueue.NULL;
3:}

I.e. while ReferenceQueue.poll() sets the Reference's queue to the NULL
queue so that that reference will not be enqueued again (or at most into
the NULL queue which is a nop), it happens that if the task switch
occurs between lines 2 and 3 of the reference handler thread, q still
contains the old queue reference, and the reference handler thread will
enqueue the Reference into the original queue again.

You can achieve the same effect by simply calling
ReferenceQueue.enqueue() (i.e. without the reference handler thread, or
within the reference handler thread doing the != NULL check), it's just
that in such a case the "old" ReferenceQueue is stored in some register.

The guard for ReferenceQueue.NULL does not have any effect except for
possibly saving the virtual call. Simply calling r.enqueue() exhibits
the same problem.

The proposed solution is to filter out References within
ReferenceQueue.enqueue() again. At that point we can check whether the
given Reference is actually meant for this queue or not. Already removed
References are known to be "inactive" (as enqueue and poll are mutually
exclusive using a lock), in particular the References' queue is
different (actually the NULL queue) to the queue it is being enqueued.

This change should pose no change in semantics, as the ReferenceQueue of
the Reference can only be set in its constructor, and as soon as the
Reference is removed from a ReferenceQueue, its ReferenceQueue will be
the NULL queue. (I.e. before this change you could not enqueue such an
"inactive" Reference multiple times anyway)

(too many References and queues here :)

Webrev with test case
http://cr.openjdk.java.net/~tschatzl/8014890/webrev/

JIRA:
https://jbs.oracle.com/bugs/browse/JDK-8014890

bugs.sun.com
http://bugs.sun.com/view_bug.do?bug_id=8014890

Testing:
jck, jprt, manual testing

Note that I also need a sponsor to push in case this change is approved.

Thanks,
  Thomas



Re: RFC 7038914: VM could throw uncaught OOME in ReferenceHandler thread

2013-05-24 Thread Thomas Schatzl
Hi,

On Fri, 2013-05-24 at 09:09 +0100, Alan Bateman wrote:
> On 24/05/2013 08:20, Thomas Schatzl wrote:
> > Hi all,
> >
> > On Fri, 2013-05-24 at 14:30 +1000, David Holmes wrote:
> >> Thomas,
> >>
> >> Are we still waiting for a second core-libs reviewer on this?
> > Yes. Would some additional core-libs reviewer be so kind and look at
> > this issue?
> >
> > http://cr.openjdk.java.net/~tschatzl/7038914/webrev.3/
> >
> > Thomas
> >
> Only one reviewer is required but if you want a second then the proposed 
> change looks fine to me (and I'm happy the discussion got to the issue, 
> it wasn't initially clear). For the test then I guess I'd probably 
> rename it to ReferenceHandlerOOME but that's a minor comment.
> 
> -Alan

  thanks Alan, David and Peter and all others who contributed to this
patch.

I will move it further through the process then.

Thanks a lot,
  Thomas



Re: RFC 7038914: VM could throw uncaught OOME in ReferenceHandler thread

2013-05-24 Thread Thomas Schatzl
Hi all,

On Fri, 2013-05-24 at 14:30 +1000, David Holmes wrote:
> Thomas,
> 
> Are we still waiting for a second core-libs reviewer on this?

Yes. Would some additional core-libs reviewer be so kind and look at
this issue?

http://cr.openjdk.java.net/~tschatzl/7038914/webrev.3/

Thomas


> 
> David
> 
> On 17/05/2013 5:56 PM, Thomas Schatzl wrote:
> > On Fri, 2013-05-17 at 10:47 +1000, David Holmes wrote:
> >> On 16/05/2013 8:44 PM, Thomas Schatzl wrote:
> >>> On Mon, 2013-05-13 at 13:55 +0200, Thomas Schatzl wrote:
> >>>> I updated the test program and the patch in java.lang.ref.Reference
> >>>> accordingly.
> >>>>
> >>>> As for the problem of reproducibility, in my tests I had a 100%
> >>>> reproduction rate with the previous version of the test.
> >>>>
> >>>> However, now I also set -XX:-UseTLAB and -Xmx16M in the test program as
> >>>> suggested in some other emails.
> >>>>
> >>>> I will report back with a new webrev after some testing on more
> >>>> platforms as suggested by David.
> >>>
> >>> a new webrev for the patch is at
> >>>
> >>> http://cr.openjdk.java.net/~tschatzl/7038914/webrev.2/
> >>
> >> I think the comment is somewhat confusing, but then the details here are
> >> quite confusing. I guess the key part of this is that if OOME is thrown
> >> we don't want to try and load InterruptedException - though I'm unclear,
> >> based on normal exception processing semantics, when that might occur.
> >
> > I tried to clarify the comment a little; I also added a dummy
> > instantiation of InterruptedException at the start of the test program
> > to avoid OOME during class loading in this case as suggested by the
> > other email.
> >
> > The new webrev is at
> > http://cr.openjdk.java.net/~tschatzl/7038914/webrev.3/
> >
> > I only compiled and tested this webrev that nothing broke locally, as
> > the changes are minimal and mostly concerning comments. Pushing a jdk
> > through jprt also takes a long time.
> >
> > Before sending you a patchset after it has been reviewed, I will push it
> > through jprt again.
> >
> >> You can count me as a Reviewer and sponsor. I think only a second JDK/TL
> >> Reviewer is needed here as no impact on hotspot.
> >
> > Thanks,
> >Thomas
> >
> >




Re: RFC 7038914: VM could throw uncaught OOME in ReferenceHandler thread

2013-05-17 Thread Thomas Schatzl
On Fri, 2013-05-17 at 10:47 +1000, David Holmes wrote:
> On 16/05/2013 8:44 PM, Thomas Schatzl wrote:
> > On Mon, 2013-05-13 at 13:55 +0200, Thomas Schatzl wrote:
> >>I updated the test program and the patch in java.lang.ref.Reference
> >> accordingly.
> >>
> >> As for the problem of reproducibility, in my tests I had a 100%
> >> reproduction rate with the previous version of the test.
> >>
> >> However, now I also set -XX:-UseTLAB and -Xmx16M in the test program as
> >> suggested in some other emails.
> >>
> >> I will report back with a new webrev after some testing on more
> >> platforms as suggested by David.
> >
> >a new webrev for the patch is at
> >
> > http://cr.openjdk.java.net/~tschatzl/7038914/webrev.2/
> 
> I think the comment is somewhat confusing, but then the details here are 
> quite confusing. I guess the key part of this is that if OOME is thrown 
> we don't want to try and load InterruptedException - though I'm unclear, 
> based on normal exception processing semantics, when that might occur.

I tried to clarify the comment a little; I also added a dummy
instantiation of InterruptedException at the start of the test program
to avoid OOME during class loading in this case as suggested by the
other email.

The new webrev is at
http://cr.openjdk.java.net/~tschatzl/7038914/webrev.3/

I only compiled and tested this webrev that nothing broke locally, as
the changes are minimal and mostly concerning comments. Pushing a jdk
through jprt also takes a long time.

Before sending you a patchset after it has been reviewed, I will push it
through jprt again.

> You can count me as a Reviewer and sponsor. I think only a second JDK/TL 
> Reviewer is needed here as no impact on hotspot.

Thanks,
  Thomas




Re: RFC 7038914: VM could throw uncaught OOME in ReferenceHandler thread

2013-05-16 Thread Thomas Schatzl
Hi,

On Sat, 2013-05-11 at 09:45 +1000, David Holmes wrote:
> On 11/05/2013 6:53 AM, Dean Long wrote:
> > On 5/10/2013 1:22 PM, Peter Levart wrote:
> >>
> >> On 05/10/2013 10:08 PM, Peter Levart wrote:
> >>>
> >>> On 05/10/2013 08:10 PM, Dean Long wrote:
>  If you really want to bullet-proof ReferenceHandler (and other
>  system threads) against OOME caused by native allocations,
>  then don't forget about monitor inflation if there is contention for
>  "lock" :-)
>
> 
> Right - most C-heap allocation failures in the VM result in an abort but 
> some will convert to OOME (such as native thread creation related failures).
> 
> Any test that tries to force an OOME at a particular operation is going 
> to be fragile because you can't know what other allocations might be 
> needed under the covers. As Peter already discovered it might occur 
> during the "new" of InterruptedException, or it might happen during the 
> load/initialization of InterruptedException. So pre-initializing 
> InterruptedException is probably a wise thing to do.
> 

That means what? Should I file a new CR (against what component?) for
that (making InterruptedException a pre-initialized exception)?

Thanks,
Thomas




Re: RFC 7038914: VM could throw uncaught OOME in ReferenceHandler thread

2013-05-16 Thread Thomas Schatzl
Hi all,

On Mon, 2013-05-13 at 13:55 +0200, Thomas Schatzl wrote:
> Hi all,
>
>   I updated the test program and the patch in java.lang.ref.Reference
> accordingly.
> 
> As for the problem of reproducibility, in my tests I had a 100%
> reproduction rate with the previous version of the test.
> 
> However, now I also set -XX:-UseTLAB and -Xmx16M in the test program as
> suggested in some other emails.
> 
> I will report back with a new webrev after some testing on more
> platforms as suggested by David.

  a new webrev for the patch is at

http://cr.openjdk.java.net/~tschatzl/7038914/webrev.2/

Testing:
jprt; manual tests with the test program on a jdk with and without the
patch on linux 32/64 bit, windows 32/64 bit, and sparc 32/64 bit. The
test program showed 100% reproduction of the error without the patch,
and 100% fix success with a jdk containing the patch (each multiple
tries at that).
Note that jprt seems to run all jdk unit tests always anyway.

I also fixed the copyright date in java/lang/ref/Reference.java.

If this patch is considered acceptable, I need two reviewers as usual
(possibly from GC/hsx team and one from jdk), and one sponsor pushing
the change. (I am still intent on making Peter the author of the patch
as both the change and the test program came from him)

I am not even author for the jdk, but can act as a non-Reviewer reviewer
for the gc/hsx team.

Thanks a lot,
Thomas




Re: RFC 7038914: VM could throw uncaught OOME in ReferenceHandler thread

2013-05-13 Thread Thomas Schatzl
Hi all,

On Fri, 2013-05-10 at 12:52 +0200, Peter Levart wrote:
> Hi David, Thomas,
> 
> I think I understand you now, David. You want to minimize the 
> possibility of vacuously passing the test. So here's my attempt at that:
> 
> 
> public class OOMEInReferenceHandler {
>  static Object[] fillHeap() {
>  Object[] first = null, last = null;
>  int size = 1 << 20;
>  while (size > 0) {
>  try {
> [... new reproducer ...]


> [...]
> 
> To be sure the order of class referencing in handlers is not disturbed 
> by changes in compilers, I suggest the patch to be modified to this more 
> explicit variant:
> 
>  try {
>  try {
>  lock.wait();
>  } catch (InterruptedException  x) { }
>  } catch (OutOfMemoryError x) {}
> 
> 
> ..which also passes the test.

  I updated the test program and the patch in java.lang.ref.Reference
accordingly.

As for the problem of reproducibility, in my tests I had a 100%
reproduction rate with the previous version of the test.

However, now I also set -XX:-UseTLAB and -Xmx16M in the test program as
suggested in some other emails.

I will report back with a new webrev after some testing on more
platforms as suggested by David.

Hth,
Thomas




Re: RFC 7038914: VM could throw uncaught OOME in ReferenceHandler thread

2013-05-08 Thread Thomas Schatzl
Hi,

  please review the latest webrev for this patch that is available at

http://cr.openjdk.java.net/~tschatzl/7038914/webrev.2/

As mentioned, it incorporates the fix and reproducer from Peter Levart.

Bugs.sun.com:
http://bugs.sun.com/view_bug.do?bug_id=7038914

JIRA:
https://jbs.oracle.com/bugs/browse/JDK-7038914

Testing:
JPRT, with the new reproducer passing

In need of reviewers and a sponsor for this patch; as mentioned earlier
I will set Peter (plevart) as author for this patch, i.e. send a
patchset to the sponsor for this change with that author set.

Thanks,
Thomas



Re: RFC 7038914: VM could throw uncaught OOME in ReferenceHandler thread

2013-05-07 Thread Thomas Schatzl
Hi,

On Tue, 2013-05-07 at 16:10 +0200, Peter Levart wrote:
> On 05/07/2013 03:26 PM, Thomas Schatzl wrote:
> > Hi,
>
> > There is a message emitted by the VM reading "java.lang.OutOfMemoryError
> > thrown from the UncaughtExceptionHandler in thread "Reference Handler""
> > that is sufficient to detect the problem itself (at least if you enable
> > some flags).
> >
> > I will look at it again and report back if it can be used in some way.
> 
> On my computer the test always produced the same result. So it's pretty 
> reliable. The trick is in fillHeap() method that fills the heap so that 
> even "new Object[1]" throws OOME. Throwable object takes at least the 
> same space as Object[1];
> 

It is fine, thanks. Since you're openjdk author already (plevart?), I
will set you as author for the change - as you both provided the fix as
well as the reproducer.

I merely added boilerplate code.

I will provide a webrev as soon as a jprt run goes through.

Thanks everyone,
Thomas



Re: RFC 7038914: VM could throw uncaught OOME in ReferenceHandler thread

2013-05-07 Thread Thomas Schatzl
Hi,

On Tue, 2013-05-07 at 15:12 +0200, Peter Levart wrote:
> On 05/07/2013 09:51 AM, Thomas Schatzl wrote:
> > Hi all,
> >
> > On Tue, 2013-05-07 at 12:31 +1000, David Holmes wrote:
> >> Catching ThreadDeath is futile. If someone is invoking stop() then you
> >> can encounter the ThreadDeath anywhere and it is impossible to write
> >> completely robust code in the face of such an async exception. So please
> >> let's not even go there. stop() is long deprecated and should never be 
> >> used.
> >>
> >> Backing up I think the try/catch(IE|OOME) around wait() is the most
> >> reasonable solution here. Anyone messing with instrumentation or
> >> overriding etc can break things - so be it - don't do that.
> >> StackOverflowError can also completely break many things - again it is
> >> effectively an async exception and writing async-exception-safe Java
> >> code is impractical if not impossible.
> >I can understand this reasoning.
> >
> > I provided a new patch (this time for review)
> > http://cr.openjdk.java.net/~tschatzl/7038914/webrev.1/ which implements
> > this change as suggested.
> >
> > Regarding regression testing, I marked this bug as "noreg-other" with
> > the explanation that it is too hard to write a proper regression test,
> > and the note that any test would involve using methods that we don't
> > give any guarantees for (overriding package private jdk methods,
> > instrumentation).
> 
> Hi Thomas,
> 
> Does the bug reproducer I sent to the list not work for you? The test 
> can check the return value of refQueue.poll() and decide if it passes or 
> not (null return means the ReferenceHandler thread has died and the bug 
> is here, non-null return means thread still works and there is no bug).

I will check the code again, but unfortunately I think it does not help
a lot.

The problem of reproducing this issue is trying to get the
ReferenceHandler to die, i.e. have the OOME occur in the reference
handler thread.

The allocation of the InterruptedException is such a small allocation so
that in almost all of the cases of OOME, its allocation still succeeds
or is not the actual cause for the OOME. So the probability that the
java application threads get the OOME to handle is much higher,
especially in the stress tests.

There is a message emitted by the VM reading "java.lang.OutOfMemoryError
thrown from the UncaughtExceptionHandler in thread "Reference Handler""
that is sufficient to detect the problem itself (at least if you enable
some flags).

I will look at it again and report back if it can be used in some way.

Thanks,
  Thomas




Re: RFC 7038914: VM could throw uncaught OOME in ReferenceHandler thread

2013-05-07 Thread Thomas Schatzl
Hi all,

On Tue, 2013-05-07 at 12:31 +1000, David Holmes wrote:
> Catching ThreadDeath is futile. If someone is invoking stop() then you 
> can encounter the ThreadDeath anywhere and it is impossible to write 
> completely robust code in the face of such an async exception. So please 
> let's not even go there. stop() is long deprecated and should never be used.
> 
> Backing up I think the try/catch(IE|OOME) around wait() is the most 
> reasonable solution here. Anyone messing with instrumentation or 
> overriding etc can break things - so be it - don't do that. 
> StackOverflowError can also completely break many things - again it is 
> effectively an async exception and writing async-exception-safe Java 
> code is impractical if not impossible.

  I can understand this reasoning.

I provided a new patch (this time for review)
http://cr.openjdk.java.net/~tschatzl/7038914/webrev.1/ which implements
this change as suggested.

Regarding regression testing, I marked this bug as "noreg-other" with
the explanation that it is too hard to write a proper regression test,
and the note that any test would involve using methods that we don't
give any guarantees for (overriding package private jdk methods,
instrumentation).

I need reviewers and a sponsor pushing this as I don't have any role in
the jdk project, and this is a jdk only patch.

Thanks,
Thomas




Re: (Preliminary) RFC 7038914: VM could throw uncaught OOME in ReferenceHandler thread

2013-05-06 Thread Thomas Schatzl
Hi,

On Sat, 2013-05-04 at 21:23 +0200, Peter Levart wrote:
> 
> On 05/04/2013 07:38 PM, Vitaly Davidovich wrote:
> 
> > Oops, that was my mistake - I thought the lock here was a j.u.c.Lock
> > which of course doesn't even make sense given we're talking about
> > ObjectMonitor.  So disregard that bit.
> > 
> > Ignoring OOM and continuing just seems very fragile as it means you
> > somehow know that all state is still consistent.  Most java code
> > isn't async exception safe, so it's hard to reason about state after
> > OOM.  Maybe Reference Handler is OK in that regard though.
> > 
> 
> I think it is safe to ignore OOME here, since this is the only place

As my test program shows, this may not be the only place where heap
allocation can happen within that code.

Alan also mentioned something about instrumentation that can add memory
allocations basically anywhere.
As the reference handler code is plain java code, it will be affected as
other java code.

>  where heap allocation happens and it is known what provokes it.
> Otherwise ReferenceHandler just shuffles existing pointers in existing
> objects...

In the current compilers, yes. But what about other compilers/VMs that
may add more elaborate profiling information here and there? E.g. the
graal compiler? Not sure if it does, probably it does not, but it seems
shaky to rely on compiler code generation here.

So is it possible to handle all known issues (especially if the fix is
similar/known/understandable), not just the original one, here?

I mean, if we fix this issue as you suggested (I am not against that, it
looks reasonable), I would not know what to do with the test program
except file another bug against the very same component with the same
problem, with the same fix suggestion.

I.e. the code still seems to buggy even after applying that fix, and
with instrumentation I think I could create another reproducer. As
reference processing is something critical to the VM, I think it is
prudent to fix all known problems here and now if possible.

Maybe there is some argument against simply wrapping the entire loop
with a try-catch? Please elaborate if so (maybe I overlooked that?), or
maybe you could suggest another solution?

Additionally, just fixing only the exact issue here seems somewhat
wasteful in terms of time (imho) - because next time another developer
will likely spend lots of time trying to reproduce this again. (As
mentioned, the former assignee also seemed to be unable to reproduce
this for a long time).

Thanks,
Thomas




Re: (Preliminary) RFC 7038914: VM could throw uncaught OOME in ReferenceHandler thread

2013-05-03 Thread Thomas Schatzl
Hi,
> 
> Hi Tomas,
> 
> I don't know if this is the case here, but what if the
> ReferenceHandler thread is interrupted while wait()-ing and the
> construction of InterruptedException triggers OOME?

I am sure this is the case - previously I thought InterruptedException
is a preallocated exception like others.
ObjectMonitor::wait() may throw it, by creating new InterruptedException
instances.

Thanks!

Now that we've found the very likely cause, what to do about it?

The current state of silently crashing the reference handler thread is
unsatisfying imo as it leads to very hard to find problems.

The options I see all involve catching this (or any other OOME caused by
other means like the test program) and either recovering as much as
possible or exiting the VM (like in the sun.misc.Cleaner handling).

Any other suggestions?

Thanks,
Thomas



Re: (Preliminary) RFC 7038914: VM could throw uncaught OOME in ReferenceHandler thread

2013-05-02 Thread Thomas Schatzl
Hi,


On Tue, 2013-04-30 at 16:44 +0100, Alan Bateman wrote:
> On 30/04/2013 15:57, Thomas Schatzl wrote:
> > Hi all,
> >
> >the webrev at http://cr.openjdk.java.net/~tschatzl/7038914/webrev/
> > presents a first stab at the CR "7038914: VM could throw uncaught OOME
> > in ReferenceHandler thread".
> >
> > The problem is that under very heavy memory pressure, there is the
> > reference handler throws an exception with the message "Exception:
> > java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in
> > thread "Reference Handler".
> >
> > The change improves handling of out-of-memory conditions in the
> > ReferenceHandler thread. Instead of crashing the thread, and then
> > disabling reference processing, it catches this exception and continues.
>
> It's surprising to heard that the Reference Handler thread failed with 
> OOME. I wouldn't expect anything in this code path to throw OOME, except 
> maybe in fast-path for sun.misc.Cleaner but that will abort the VM be it 
> fails. The enqueue method that you override in the test to provoke this 
> is package-private so it's unlikely that the test or whatever that 
> resulted in this bug report is doing that.

The test is just that: a somewhat artificial way to reproduce the
problem always.

I tried some of the example programs listed below thousands of times,
sometimes without any issue. The developer previously working on that
also had severe problems reproducing it.

> So I'm again this proposed change, rather I'm just trying to understand 
> how it happened. Is there instrumentation involved by any chance? It the 
> OOME something other than "java heap" or do we know?

No instrumentation I can see of, but a whole set of weak reference
related nightly UTE tests fail at different times (I would suspect
nightly testing does not do any instrumentation). Here is a list with
exactly these errors:

gc/gctests/ReferencesGC
gc/gctests/WeakReference/weak003
gc/gctests/SoftReference/soft005
gc/gctests/LargeObjects/large002
nsk/jdi/ObjectReference/referringObjects/referringObjects002
gc/gctests/PhantomReference/PhantomReferenceTest
gc/gctests/OneeFinalizerTest
heapdump/JMapHeap
gc/gctests/FinalizeTest01
nsk/sysdict/vm/stress/btree/btree007

Apart from these failures, the more serious problem seems to be that the
reference handler thread silently dies. Which means that weak reference
processing is effectively disabled after such an error.

A VM abort like for the Cleaner processing would be lot preferable than
the current situation too.

Thanks,
Thomas




(Preliminary) RFC 7038914: VM could throw uncaught OOME in ReferenceHandler thread

2013-04-30 Thread Thomas Schatzl
Hi all,

  the webrev at http://cr.openjdk.java.net/~tschatzl/7038914/webrev/
presents a first stab at the CR "7038914: VM could throw uncaught OOME
in ReferenceHandler thread".

The problem is that under very heavy memory pressure, there is the
reference handler throws an exception with the message "Exception:
java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in
thread "Reference Handler".

The change improves handling of out-of-memory conditions in the
ReferenceHandler thread. Instead of crashing the thread, and then
disabling reference processing, it catches this exception and continues.

I'd like to discuss the change as I'm not really familiar with JDK
coding style, handling of such situations and have some questions about
it.

Bugs.sun
http://bugs.sun.com/view_bug.do?bug_id=7038914

JBS:
https://jbs.oracle.com/bugs/browse/JDK-7038914

Proposed webrev:
http://cr.openjdk.java.net/~tschatzl/7038914/webrev/

- first, I could not reliably reproduce the issue using the information
in the CR. Only via code review (and an idea from Bengt Rutisson -
thanks!) I implemented a nice way to reproduce an OOME in the reference
handler. This involves implementing a custom
java.lang.ref.ReferenceQueue and overriding the enqueue() method, and
doing some allocation that causes an OOME within that method.
My current theory is that synchronization/locking allocates some objects
on the java heap, which are very small, so an OOME in that thread can be
caused. I walked the locking code, but could not find a java heap
allocation there (ObjectMonitor seems to be a C heap object) - maybe I
overlooked it. Probably somebody else knows?
It cannot be the invocation of the Cleaner.clean() methods above the
enqueuing since it has it's own try-catch block already.
Anyway, since the reproducer I wrote shows the same symptoms as reported
in the CR, I hope that this test case is sufficient to be regarded as a
reproducer and the change as a fix.

- the actual change in java/lang/ref/Reference as mentioned involves
putting the entire main enqueuing procedure within a try-catch block.
It only catches OOME to decrease the possibility to catch anything that
should not be caught.
The problem is that this fix does not (and cannot) really fix bad
programming in anyone overriding java.lang.ref.ReferenceQueue.enqueue(),
i.e. if the OOME condition is before the actual execution of the
original enqueue() method, i.e. corruption of the queue may be still
possible.
On the other hand, since overriding ReferenceQueue.enqueue() requires
putting the custom ReferenceQueue into the boot class path, I assume
that people doing that are aware of possible issues.

- handling the OOME: in the catch block of the I put a block

// avoid crashing the reference handler thread,
// but provide for some diagnosability
assert false : e.toString();

to provide some diagnosability in the case of an exception (when
running with assertions). I copied that from other code that tries to
catch similar problems in the clean() method of the Cleaners. There are
other variants of managing this in the jdk, some involving calling
system.exit(). I thought that was too drastic, so I didn't do that, but
what is the appropriate way to handle this situation?

- if the use of locks or the synchronization keyword is indeed the
problem, I think it is possible to use nonblocking synchronization that
is known to not allocate any memory for managing the reference queues
instead. However I think to guard against misbehaving ReferenceQueue
implementations you'd still want to have a try-catch block here.

- is the location of the test correct? I.e. in the jdk
test/java/lang/ref directory? Or is the correct place for that the
hotspot test directories?

Since this is (seems to be) a JDK only change, and this is my first time
changing the JDK, I hope core-libs-dev is the right mailing list.
Otherwise please direct me to the the appropriate one.

Thanks,
Thomas