Re: RFR: 8329597: C2: Intrinsify Reference.clear [v10]

2024-10-16 Thread Aleksey Shipilev
On Tue, 15 Oct 2024 11:07:32 GMT, Aleksey Shipilev  wrote:

>> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
>> method for `Reference.clear`. The original patch skipped intrinsification of 
>> this method, because we thought `Reference.clear` is not on a performance 
>> sensitive path. However, it shows up prominently on simple benchmarks that 
>> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
>> `RRWL` benchmarks.
>> 
>> We need to know the actual oop strongness/weakness before we call into C2 
>> Access API, this work models this after existing code for `refersTo0` 
>> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 server fastdebug, `all`
>>  - [x] Linux AArch64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Export in JVMCI too

Thank you for testing! Here goes.

-

PR Comment: https://git.openjdk.org/jdk/pull/20139#issuecomment-2416954569


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v10]

2024-10-16 Thread Roberto Castañeda Lozano
On Tue, 15 Oct 2024 11:07:32 GMT, Aleksey Shipilev  wrote:

>> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
>> method for `Reference.clear`. The original patch skipped intrinsification of 
>> this method, because we thought `Reference.clear` is not on a performance 
>> sensitive path. However, it shows up prominently on simple benchmarks that 
>> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
>> `RRWL` benchmarks.
>> 
>> We need to know the actual oop strongness/weakness before we call into C2 
>> Access API, this work models this after existing code for `refersTo0` 
>> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 server fastdebug, `all`
>>  - [x] Linux AArch64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Export in JVMCI too

Marked as reviewed by rcastanedalo (Reviewer).

-

PR Review: https://git.openjdk.org/jdk/pull/20139#pullrequestreview-2371670897


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v9]

2024-10-16 Thread Roberto Castañeda Lozano
On Tue, 15 Oct 2024 09:24:30 GMT, Roberto Castañeda Lozano 
 wrote:

> I will run some internal CI testing and report back in one or two days.

The test results look good. I tested the changes (up to commit 9f7ad7ab) on top 
of jdk-24+19 running tier1-tier5 on all Oracle-supported platforms.

-

PR Comment: https://git.openjdk.org/jdk/pull/20139#issuecomment-2416038769


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v9]

2024-10-15 Thread Aleksey Shipilev
On Tue, 15 Oct 2024 10:02:15 GMT, Yudi Zheng  wrote:

>> Aleksey Shipilev has updated the pull request with a new target base due to 
>> a merge or a rebase. The pull request now contains 18 commits:
>> 
>>  - Simplify: just do keep alive checks
>>  - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
>>  - More precise bit-unmasks
>>  - Reconcile with late barrier expansion in G1
>>  - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
>>  - Review comments
>>  - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
>>  - Also dispatch to slow-path on other arches
>>  - Fix other arches
>>  - Tighten up comments in Reference javadoc
>>  - ... and 8 more: https://git.openjdk.org/jdk/compare/580eb62d...9f7ad7ab
>
> src/hotspot/share/gc/z/zBarrierSetRuntime.hpp line 43:
> 
>> 41:   static void store_barrier_on_oop_field_with_healing(oop* p);
>> 42:   static void store_barrier_on_oop_field_without_healing(oop* p);
>> 43:   static void 
>> no_keepalive_store_barrier_on_oop_field_without_healing(oop* p);
> 
> Could you please export this to JVMCI? I.e.,
> 
> diff --git a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp 
> b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp
> index 5452cca96b8..46aeb996c56 100644
> --- a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp
> +++ b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp
> @@ -847,6 +847,7 @@
>ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, 
> ZBarrierSetRuntime::load_barrier_on_oop_field_preloaded_store_good))  
>  \
>ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, 
> ZBarrierSetRuntime::no_keepalive_load_barrier_on_weak_oop_field_preloaded))   
>  \
>ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, 
> ZBarrierSetRuntime::no_keepalive_load_barrier_on_phantom_oop_field_preloaded))
>  \
> +  ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, 
> ZBarrierSetRuntime::no_keepalive_store_barrier_on_oop_field_without_healing)  
>  \
>ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, 
> ZBarrierSetRuntime::store_barrier_on_native_oop_field_without_healing))   
>  \
>ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, 
> ZBarrierSetRuntime::store_barrier_on_oop_field_with_healing)) 
>  \
>ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, 
> ZBarrierSetRuntime::store_barrier_on_oop_field_without_healing))  
>  \
> 
> Thanks!

Done!

-

PR Review Comment: https://git.openjdk.org/jdk/pull/20139#discussion_r1800937787


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v10]

2024-10-15 Thread Aleksey Shipilev
> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
> method for `Reference.clear`. The original patch skipped intrinsification of 
> this method, because we thought `Reference.clear` is not on a performance 
> sensitive path. However, it shows up prominently on simple benchmarks that 
> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
> `RRWL` benchmarks.
> 
> We need to know the actual oop strongness/weakness before we call into C2 
> Access API, this work models this after existing code for `refersTo0` 
> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
> 
> Additional testing:
>  - [x] Linux x86_64 server fastdebug, `all`
>  - [x] Linux AArch64 server fastdebug, `all`

Aleksey Shipilev has updated the pull request incrementally with one additional 
commit since the last revision:

  Export in JVMCI too

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/20139/files
  - new: https://git.openjdk.org/jdk/pull/20139/files/9f7ad7ab..479781df

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=20139&range=09
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20139&range=08-09

  Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/20139.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20139/head:pull/20139

PR: https://git.openjdk.org/jdk/pull/20139


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v9]

2024-10-15 Thread Yudi Zheng
On Wed, 9 Oct 2024 08:44:34 GMT, Aleksey Shipilev  wrote:

>> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
>> method for `Reference.clear`. The original patch skipped intrinsification of 
>> this method, because we thought `Reference.clear` is not on a performance 
>> sensitive path. However, it shows up prominently on simple benchmarks that 
>> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
>> `RRWL` benchmarks.
>> 
>> We need to know the actual oop strongness/weakness before we call into C2 
>> Access API, this work models this after existing code for `refersTo0` 
>> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 server fastdebug, `all`
>>  - [x] Linux AArch64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 18 commits:
> 
>  - Simplify: just do keep alive checks
>  - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
>  - More precise bit-unmasks
>  - Reconcile with late barrier expansion in G1
>  - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
>  - Review comments
>  - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
>  - Also dispatch to slow-path on other arches
>  - Fix other arches
>  - Tighten up comments in Reference javadoc
>  - ... and 8 more: https://git.openjdk.org/jdk/compare/580eb62d...9f7ad7ab

src/hotspot/share/gc/z/zBarrierSetRuntime.hpp line 43:

> 41:   static void store_barrier_on_oop_field_with_healing(oop* p);
> 42:   static void store_barrier_on_oop_field_without_healing(oop* p);
> 43:   static void 
> no_keepalive_store_barrier_on_oop_field_without_healing(oop* p);

Could you please export this to JVMCI? I.e.,

diff --git a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp 
b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp
index 5452cca96b8..46aeb996c56 100644
--- a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp
+++ b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp
@@ -847,6 +847,7 @@
   ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, 
ZBarrierSetRuntime::load_barrier_on_oop_field_preloaded_store_good))   \
   ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, 
ZBarrierSetRuntime::no_keepalive_load_barrier_on_weak_oop_field_preloaded))\
   ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, 
ZBarrierSetRuntime::no_keepalive_load_barrier_on_phantom_oop_field_preloaded)) \
+  ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, 
ZBarrierSetRuntime::no_keepalive_store_barrier_on_oop_field_without_healing)   \
   ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, 
ZBarrierSetRuntime::store_barrier_on_native_oop_field_without_healing))\
   ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, 
ZBarrierSetRuntime::store_barrier_on_oop_field_with_healing))  \
   ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, 
ZBarrierSetRuntime::store_barrier_on_oop_field_without_healing))   \

Thanks!

-

PR Review Comment: https://git.openjdk.org/jdk/pull/20139#discussion_r1800848130


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v9]

2024-10-15 Thread Roberto Castañeda Lozano
On Tue, 15 Oct 2024 09:18:05 GMT, Aleksey Shipilev  wrote:

> Thanks for review, folks. I am re-running testing locally here. Would 
> appreciate if you can give this patch a spin through your CIs as well.

I will run some internal CI testing and report back in one or two days.

-

PR Comment: https://git.openjdk.org/jdk/pull/20139#issuecomment-2413362811


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v9]

2024-10-15 Thread Aleksey Shipilev
On Wed, 9 Oct 2024 08:44:34 GMT, Aleksey Shipilev  wrote:

>> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
>> method for `Reference.clear`. The original patch skipped intrinsification of 
>> this method, because we thought `Reference.clear` is not on a performance 
>> sensitive path. However, it shows up prominently on simple benchmarks that 
>> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
>> `RRWL` benchmarks.
>> 
>> We need to know the actual oop strongness/weakness before we call into C2 
>> Access API, this work models this after existing code for `refersTo0` 
>> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 server fastdebug, `all`
>>  - [x] Linux AArch64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 18 commits:
> 
>  - Simplify: just do keep alive checks
>  - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
>  - More precise bit-unmasks
>  - Reconcile with late barrier expansion in G1
>  - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
>  - Review comments
>  - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
>  - Also dispatch to slow-path on other arches
>  - Fix other arches
>  - Tighten up comments in Reference javadoc
>  - ... and 8 more: https://git.openjdk.org/jdk/compare/580eb62d...9f7ad7ab

Thanks for review, folks. I am re-running testing locally here. Would 
appreciate if you can give this patch a spin through your CIs as well.

-

PR Comment: https://git.openjdk.org/jdk/pull/20139#issuecomment-2413348085


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v9]

2024-10-14 Thread Vladimir Kozlov
On Wed, 9 Oct 2024 08:44:34 GMT, Aleksey Shipilev  wrote:

>> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
>> method for `Reference.clear`. The original patch skipped intrinsification of 
>> this method, because we thought `Reference.clear` is not on a performance 
>> sensitive path. However, it shows up prominently on simple benchmarks that 
>> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
>> `RRWL` benchmarks.
>> 
>> We need to know the actual oop strongness/weakness before we call into C2 
>> Access API, this work models this after existing code for `refersTo0` 
>> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 server fastdebug, `all`
>>  - [x] Linux AArch64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 18 commits:
> 
>  - Simplify: just do keep alive checks
>  - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
>  - More precise bit-unmasks
>  - Reconcile with late barrier expansion in G1
>  - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
>  - Review comments
>  - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
>  - Also dispatch to slow-path on other arches
>  - Fix other arches
>  - Tighten up comments in Reference javadoc
>  - ... and 8 more: https://git.openjdk.org/jdk/compare/580eb62d...9f7ad7ab

C2 change (intrinsics code) is fine.

-

Marked as reviewed by kvn (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/20139#pullrequestreview-2367483219


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v9]

2024-10-14 Thread Aleksey Shipilev
On Wed, 9 Oct 2024 08:44:34 GMT, Aleksey Shipilev  wrote:

>> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
>> method for `Reference.clear`. The original patch skipped intrinsification of 
>> this method, because we thought `Reference.clear` is not on a performance 
>> sensitive path. However, it shows up prominently on simple benchmarks that 
>> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
>> `RRWL` benchmarks.
>> 
>> We need to know the actual oop strongness/weakness before we call into C2 
>> Access API, this work models this after existing code for `refersTo0` 
>> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 server fastdebug, `all`
>>  - [x] Linux AArch64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 18 commits:
> 
>  - Simplify: just do keep alive checks
>  - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
>  - More precise bit-unmasks
>  - Reconcile with late barrier expansion in G1
>  - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
>  - Review comments
>  - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
>  - Also dispatch to slow-path on other arches
>  - Fix other arches
>  - Tighten up comments in Reference javadoc
>  - ... and 8 more: https://git.openjdk.org/jdk/compare/580eb62d...9f7ad7ab

Thanks! I see Kim reviewed JDK parts, so we need another Reviewer for Hotspot 
parts.

-

PR Comment: https://git.openjdk.org/jdk/pull/20139#issuecomment-2410579109


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v9]

2024-10-13 Thread Erik Österlund
On Wed, 9 Oct 2024 08:44:34 GMT, Aleksey Shipilev  wrote:

>> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
>> method for `Reference.clear`. The original patch skipped intrinsification of 
>> this method, because we thought `Reference.clear` is not on a performance 
>> sensitive path. However, it shows up prominently on simple benchmarks that 
>> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
>> `RRWL` benchmarks.
>> 
>> We need to know the actual oop strongness/weakness before we call into C2 
>> Access API, this work models this after existing code for `refersTo0` 
>> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 server fastdebug, `all`
>>  - [x] Linux AArch64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 18 commits:
> 
>  - Simplify: just do keep alive checks
>  - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
>  - More precise bit-unmasks
>  - Reconcile with late barrier expansion in G1
>  - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
>  - Review comments
>  - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
>  - Also dispatch to slow-path on other arches
>  - Fix other arches
>  - Tighten up comments in Reference javadoc
>  - ... and 8 more: https://git.openjdk.org/jdk/compare/580eb62d...9f7ad7ab

Looks good.

-

Marked as reviewed by eosterlund (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/20139#pullrequestreview-2364968158


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v8]

2024-10-09 Thread Aleksey Shipilev
On Mon, 7 Oct 2024 08:15:21 GMT, Erik Österlund  wrote:

>> Aleksey Shipilev has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   More precise bit-unmasks
>
> src/hotspot/share/gc/g1/c2/g1BarrierSetC2.cpp line 342:
> 
>> 340:   }
>> 341:   if ((on_weak || on_phantom) && no_keepalive) {
>> 342: // Be extra paranoid around this path. Only accept null stores,
> 
> I think there might be some orthogonal stuff that is unnecessarily mixed up 
> here. When no_keepalive is manually specified, then we shouldn't do the 
> pre-write barrier, regardless of reference strength. Similarly, when the new 
> value is null, we don't need to perform the post write barrier, regardless of 
> reference strength. Roberto added some code in refine_barrier_by_new_val_type 
> that already *should* take care of the latter part. It allows types to flow 
> around a bit, and then checks if the type of the new value is provably null, 
> and then removes the post write barrier. The existing logic for that should 
> be strictly more powerful than the new check you added, I think.
> 
> Based on the above explanation, I think I'm proposing this block is replaced 
> with this simpler condition:
> 
> if (no_keepalive) {
>   access.set_barrier_data(access.barrier_data() & ~G1C2BarrierPre);
> }

Right. We also do not need this complexity in Shenandoah barriers. This check 
was dragged here from the load barriers that _want_ to check if we are reading 
the `Reference.referent` and feed it to SATB _unless_ there is a no-keep-alive. 
For store barriers it is unnecessary, and we can just do keep-alive checks 
straight up. Should be done in new commit, testing now.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/20139#discussion_r1793115683


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v9]

2024-10-09 Thread Aleksey Shipilev
> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
> method for `Reference.clear`. The original patch skipped intrinsification of 
> this method, because we thought `Reference.clear` is not on a performance 
> sensitive path. However, it shows up prominently on simple benchmarks that 
> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
> `RRWL` benchmarks.
> 
> We need to know the actual oop strongness/weakness before we call into C2 
> Access API, this work models this after existing code for `refersTo0` 
> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
> 
> Additional testing:
>  - [x] Linux x86_64 server fastdebug, `all`
>  - [x] Linux AArch64 server fastdebug, `all`

Aleksey Shipilev has updated the pull request with a new target base due to a 
merge or a rebase. The pull request now contains 18 commits:

 - Simplify: just do keep alive checks
 - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
 - More precise bit-unmasks
 - Reconcile with late barrier expansion in G1
 - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
 - Review comments
 - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
 - Also dispatch to slow-path on other arches
 - Fix other arches
 - Tighten up comments in Reference javadoc
 - ... and 8 more: https://git.openjdk.org/jdk/compare/580eb62d...9f7ad7ab

-

Changes: https://git.openjdk.org/jdk/pull/20139/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20139&range=08
  Stats: 361 lines in 25 files changed: 340 ins; 0 del; 21 mod
  Patch: https://git.openjdk.org/jdk/pull/20139.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20139/head:pull/20139

PR: https://git.openjdk.org/jdk/pull/20139


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v8]

2024-10-07 Thread Erik Österlund
On Sun, 6 Oct 2024 14:43:09 GMT, Aleksey Shipilev  wrote:

>> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
>> method for `Reference.clear`. The original patch skipped intrinsification of 
>> this method, because we thought `Reference.clear` is not on a performance 
>> sensitive path. However, it shows up prominently on simple benchmarks that 
>> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
>> `RRWL` benchmarks.
>> 
>> We need to know the actual oop strongness/weakness before we call into C2 
>> Access API, this work models this after existing code for `refersTo0` 
>> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 server fastdebug, `all`
>>  - [x] Linux AArch64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   More precise bit-unmasks

One last thing...

src/hotspot/share/gc/g1/c2/g1BarrierSetC2.cpp line 342:

> 340:   }
> 341:   if ((on_weak || on_phantom) && no_keepalive) {
> 342: // Be extra paranoid around this path. Only accept null stores,

I think there might be some orthogonal stuff that is unnecessarily mixed up 
here. When no_keepalive is manually specified, then we shouldn't do the 
pre-write barrier, regardless of reference strength. Similarly, when the new 
value is null, we don't need to perform the post write barrier, regardless of 
reference strength. Roberto added some code in refine_barrier_by_new_val_type 
that already *should* take care of the latter part. It allows types to flow 
around a bit, and then checks if the type of the new value is provably null, 
and then removes the post write barrier. The existing logic for that should be 
strictly more powerful than the new check you added, I think.

Based on the above explanation, I think I'm proposing this block is replaced 
with this simpler condition:

if (no_keepalive) {
  access.set_barrier_data(access.barrier_data() & ~G1C2BarrierPre);
}

-

Changes requested by eosterlund (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/20139#pullrequestreview-2351232319
PR Review Comment: https://git.openjdk.org/jdk/pull/20139#discussion_r1789742277


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v8]

2024-10-07 Thread Aleksey Shipilev
On Sun, 6 Oct 2024 14:43:09 GMT, Aleksey Shipilev  wrote:

>> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
>> method for `Reference.clear`. The original patch skipped intrinsification of 
>> this method, because we thought `Reference.clear` is not on a performance 
>> sensitive path. However, it shows up prominently on simple benchmarks that 
>> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
>> `RRWL` benchmarks.
>> 
>> We need to know the actual oop strongness/weakness before we call into C2 
>> Access API, this work models this after existing code for `refersTo0` 
>> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 server fastdebug, `all`
>>  - [x] Linux AArch64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   More precise bit-unmasks

Tests pass with the new change. I eyeballed G1 perfasm output on new benchmark, 
and there are no barriers in sight as well.

-

PR Comment: https://git.openjdk.org/jdk/pull/20139#issuecomment-2396092920


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v6]

2024-10-06 Thread Tobias Hartmann
On Thu, 3 Oct 2024 16:54:40 GMT, Aleksey Shipilev  wrote:

>> test/micro/org/openjdk/bench/java/lang/ref/ReferenceClear.java line 2:
>> 
>>> 1: //
>>> 2: // * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved.
>> 
>> Drive-by comment: The `// *` format looks weird.
>
> Actually, this constellation of single-line comments should be replaced with 
> a multi-line comment block. The fix is now in my (unpublished) patch queue, 
> thanks.

Yes, that's what I meant. Thanks.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/20139#discussion_r1789539477


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v8]

2024-10-06 Thread Aleksey Shipilev
> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
> method for `Reference.clear`. The original patch skipped intrinsification of 
> this method, because we thought `Reference.clear` is not on a performance 
> sensitive path. However, it shows up prominently on simple benchmarks that 
> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
> `RRWL` benchmarks.
> 
> We need to know the actual oop strongness/weakness before we call into C2 
> Access API, this work models this after existing code for `refersTo0` 
> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
> 
> Additional testing:
>  - [ ] Linux x86_64 server fastdebug, `all`
>  - [ ] Linux AArch64 server fastdebug, `all`

Aleksey Shipilev has updated the pull request incrementally with one additional 
commit since the last revision:

  More precise bit-unmasks

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/20139/files
  - new: https://git.openjdk.org/jdk/pull/20139/files/c3338302..cf808b9a

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=20139&range=07
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20139&range=06-07

  Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/20139.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20139/head:pull/20139

PR: https://git.openjdk.org/jdk/pull/20139


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v7]

2024-10-03 Thread Aleksey Shipilev
> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
> method for `Reference.clear`. The original patch skipped intrinsification of 
> this method, because we thought `Reference.clear` is not on a performance 
> sensitive path. However, it shows up prominently on simple benchmarks that 
> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
> `RRWL` benchmarks.
> 
> We need to know the actual oop strongness/weakness before we call into C2 
> Access API, this work models this after existing code for `refersTo0` 
> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
> 
> Additional testing:
>  - [ ] Linux x86_64 server fastdebug, `all`
>  - [ ] Linux AArch64 server fastdebug, `all`

Aleksey Shipilev has updated the pull request with a new target base due to a 
merge or a rebase. The pull request now contains 15 commits:

 - Reconcile with late barrier expansion in G1
 - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
 - Review comments
 - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
 - Also dispatch to slow-path on other arches
 - Fix other arches
 - Tighten up comments in Reference javadoc
 - Attempt at implementing ZGC AArch64 parts
 - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
 - Amend the test case for guaranteing it works under different compilation 
regimes
 - ... and 5 more: https://git.openjdk.org/jdk/compare/ebb4759c...c3338302

-

Changes: https://git.openjdk.org/jdk/pull/20139/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20139&range=06
  Stats: 372 lines in 25 files changed: 351 ins; 0 del; 21 mod
  Patch: https://git.openjdk.org/jdk/pull/20139.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20139/head:pull/20139

PR: https://git.openjdk.org/jdk/pull/20139


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v6]

2024-10-03 Thread Aleksey Shipilev
On Wed, 2 Oct 2024 10:47:09 GMT, Tobias Hartmann  wrote:

>> Aleksey Shipilev has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   Also dispatch to slow-path on other arches
>
> test/micro/org/openjdk/bench/java/lang/ref/ReferenceClear.java line 2:
> 
>> 1: //
>> 2: // * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved.
> 
> Drive-by comment: The `// *` format looks weird.

Actually, this constellation of single-line comments should be replaced with a 
multi-line comment block. The fix is now in my (unpublished) patch queue, 
thanks.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/20139#discussion_r1786548233


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v6]

2024-10-03 Thread Aleksey Shipilev
On Thu, 3 Oct 2024 12:14:08 GMT, Erik Österlund  wrote:

>> Aleksey Shipilev has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   Also dispatch to slow-path on other arches
>
> src/hotspot/share/opto/library_call.cpp line 7002:
> 
>> 7000:   // Add memory barrier to prevent commoning the accesses in this code,
>> 7001:   // since GC can change the value of referent at any time.
>> 7002:   insert_mem_bar(Op_MemBarCPUOrder);
> 
> I think this CPU memory barrier and comment above are confusing and obviously 
> just taken from the referent loading intrinsics. The commoning it is talking 
> about is to short circuit a second load with the result of a first load of 
> the referent field, since the compiler "knows" the first load would give the 
> "same" answer unless "something happened" (GC clearing it).
> In this case the mutator just cleared it, so what the compiler thinks is that 
> null is stored in that field. And that's completely accurate, and the GC is 
> not going to transition the field from null to some random other object.
> Let's remove this CPU memory barrier and the misleading comments.

Right. I removed it in my local patch queue. Now I have to reconcile this PR 
with late barrier expansion in G1, I'll push the update once that is done.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/20139#discussion_r1786549120


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v6]

2024-10-03 Thread Erik Österlund
On Mon, 30 Sep 2024 16:59:16 GMT, Aleksey Shipilev  wrote:

>> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
>> method for `Reference.clear`. The original patch skipped intrinsification of 
>> this method, because we thought `Reference.clear` is not on a performance 
>> sensitive path. However, it shows up prominently on simple benchmarks that 
>> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
>> `RRWL` benchmarks.
>> 
>> We need to know the actual oop strongness/weakness before we call into C2 
>> Access API, this work models this after existing code for `refersTo0` 
>> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
>> 
>> Additional testing:
>>  - [ ] Linux x86_64 server fastdebug, `all`
>>  - [ ] Linux AArch64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Also dispatch to slow-path on other arches

Changes requested by eosterlund (Reviewer).

src/hotspot/share/opto/library_call.cpp line 7002:

> 7000:   // Add memory barrier to prevent commoning the accesses in this code,
> 7001:   // since GC can change the value of referent at any time.
> 7002:   insert_mem_bar(Op_MemBarCPUOrder);

I think this CPU memory barrier and comment above are confusing and obviously 
just taken from the referent loading intrinsics. The commoning it is talking 
about is to short circuit a second load with the result of a first load of the 
referent field, since the compiler "knows" the first load would give the "same" 
answer unless "something happened" (GC clearing it).
In this case the mutator just cleared it, so what the compiler thinks is that 
null is stored in that field. And that's completely accurate, and the GC is not 
going to transition the field from null to some random other object.
Let's remove this CPU memory barrier and the misleading comments.

-

PR Review: https://git.openjdk.org/jdk/pull/20139#pullrequestreview-2345438249
PR Review Comment: https://git.openjdk.org/jdk/pull/20139#discussion_r1786116003


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v3]

2024-10-03 Thread Erik Österlund
On Mon, 30 Sep 2024 16:32:48 GMT, Aleksey Shipilev  wrote:

> > I think we need a new 
> > ZBarrierSetRuntime::no_keepalive_store_barrier_on_oop_field_without_healing(oop*
> >  p) and to make that the selected slow path function when 
> > ZBarrierNoKeepalive is set on a StorePNode. Its implementation would call 
> > ZBarrier::no_keep_alive_store_barrier_on_heap_oop_field. This should do the 
> > trick.
> 
> Thanks! See new commits: is that the shape you were thinking of?

Yeah that's perfect. Thanks!

-

PR Comment: https://git.openjdk.org/jdk/pull/20139#issuecomment-2391252177


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v6]

2024-10-02 Thread Tobias Hartmann
On Mon, 30 Sep 2024 16:59:16 GMT, Aleksey Shipilev  wrote:

>> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
>> method for `Reference.clear`. The original patch skipped intrinsification of 
>> this method, because we thought `Reference.clear` is not on a performance 
>> sensitive path. However, it shows up prominently on simple benchmarks that 
>> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
>> `RRWL` benchmarks.
>> 
>> We need to know the actual oop strongness/weakness before we call into C2 
>> Access API, this work models this after existing code for `refersTo0` 
>> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
>> 
>> Additional testing:
>>  - [ ] Linux x86_64 server fastdebug, `all`
>>  - [ ] Linux AArch64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Also dispatch to slow-path on other arches

test/micro/org/openjdk/bench/java/lang/ref/ReferenceClear.java line 2:

> 1: //
> 2: // * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved.

Drive-by comment: The `// *` format looks weird.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/20139#discussion_r1784292673


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v6]

2024-09-30 Thread Kim Barrett
On Mon, 30 Sep 2024 16:59:16 GMT, Aleksey Shipilev  wrote:

>> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
>> method for `Reference.clear`. The original patch skipped intrinsification of 
>> this method, because we thought `Reference.clear` is not on a performance 
>> sensitive path. However, it shows up prominently on simple benchmarks that 
>> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
>> `RRWL` benchmarks.
>> 
>> We need to know the actual oop strongness/weakness before we call into C2 
>> Access API, this work models this after existing code for `refersTo0` 
>> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
>> 
>> Additional testing:
>>  - [ ] Linux x86_64 server fastdebug, `all`
>>  - [ ] Linux AArch64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Also dispatch to slow-path on other arches

Removing my "Request changes" as request has been satisfied.

I've only really looked at the changes in java.base, which look fine.  I've 
skimmed some
of the compiler code, but don't feel qualified to properly review it.  So don't 
count or
wait for me as a reviewer.

-

PR Review: https://git.openjdk.org/jdk/pull/20139#pullrequestreview-2338962626


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v3]

2024-09-30 Thread Kim Barrett
On Mon, 30 Sep 2024 16:45:12 GMT, Aleksey Shipilev  wrote:

>> src/java.base/share/classes/java/lang/ref/Reference.java line 420:
>> 
>>> 418: /* Implementation of clear(), also used by enqueue().  A simple
>>> 419:  * assignment of the referent field won't do for some garbage
>>> 420:  * collectors.
>> 
>> Description of clear0 is rendered stale by this change.  The first sentence 
>> is no longer true, since it's now
>> clearImpl that has that role.  The second sentence probably ought to also be 
>> moved into the description of
>> clearImpl.
>
> Thanks! I tightened up comments a bit, take another look?

Yes, that's better.  Thanks.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/20139#discussion_r1782006243


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v6]

2024-09-30 Thread Aleksey Shipilev
> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
> method for `Reference.clear`. The original patch skipped intrinsification of 
> this method, because we thought `Reference.clear` is not on a performance 
> sensitive path. However, it shows up prominently on simple benchmarks that 
> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
> `RRWL` benchmarks.
> 
> We need to know the actual oop strongness/weakness before we call into C2 
> Access API, this work models this after existing code for `refersTo0` 
> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
> 
> Additional testing:
>  - [ ] Linux x86_64 server fastdebug, `all`
>  - [ ] Linux AArch64 server fastdebug, `all`

Aleksey Shipilev has updated the pull request incrementally with one additional 
commit since the last revision:

  Also dispatch to slow-path on other arches

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/20139/files
  - new: https://git.openjdk.org/jdk/pull/20139/files/8ba681a4..4fe4a911

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=20139&range=05
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20139&range=04-05

  Stats: 6 lines in 3 files changed: 6 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/20139.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20139/head:pull/20139

PR: https://git.openjdk.org/jdk/pull/20139


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v3]

2024-09-30 Thread Aleksey Shipilev
On Fri, 27 Sep 2024 23:51:13 GMT, Kim Barrett  wrote:

>> Aleksey Shipilev has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   Amend the test case for guaranteing it works under different compilation 
>> regimes
>
> src/java.base/share/classes/java/lang/ref/Reference.java line 420:
> 
>> 418: /* Implementation of clear(), also used by enqueue().  A simple
>> 419:  * assignment of the referent field won't do for some garbage
>> 420:  * collectors.
> 
> Description of clear0 is rendered stale by this change.  The first sentence 
> is no longer true, since it's now
> clearImpl that has that role.  The second sentence probably ought to also be 
> moved into the description of
> clearImpl.

Thanks! I tightened up comments a bit, take another look?

-

PR Review Comment: https://git.openjdk.org/jdk/pull/20139#discussion_r1781452602


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v5]

2024-09-30 Thread Aleksey Shipilev
> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
> method for `Reference.clear`. The original patch skipped intrinsification of 
> this method, because we thought `Reference.clear` is not on a performance 
> sensitive path. However, it shows up prominently on simple benchmarks that 
> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
> `RRWL` benchmarks.
> 
> We need to know the actual oop strongness/weakness before we call into C2 
> Access API, this work models this after existing code for `refersTo0` 
> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
> 
> Additional testing:
>  - [ ] Linux x86_64 server fastdebug, `all`
>  - [ ] Linux AArch64 server fastdebug, `all`

Aleksey Shipilev has updated the pull request incrementally with two additional 
commits since the last revision:

 - Fix other arches
 - Tighten up comments in Reference javadoc

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/20139/files
  - new: https://git.openjdk.org/jdk/pull/20139/files/cba0a8e9..8ba681a4

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=20139&range=04
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20139&range=03-04

  Stats: 16 lines in 4 files changed: 7 ins; 4 del; 5 mod
  Patch: https://git.openjdk.org/jdk/pull/20139.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20139/head:pull/20139

PR: https://git.openjdk.org/jdk/pull/20139


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v3]

2024-09-30 Thread Aleksey Shipilev
On Mon, 30 Sep 2024 15:08:53 GMT, Erik Österlund  wrote:

> I think we need a new 
> ZBarrierSetRuntime::no_keepalive_store_barrier_on_oop_field_without_healing(oop*
>  p) and to make that the selected slow path function when ZBarrierNoKeepalive 
> is set on a StorePNode. Its implementation would call 
> ZBarrier::no_keep_alive_store_barrier_on_heap_oop_field. This should do the 
> trick.

Thanks! See new commits: is that the shape you were thinking of? Once we get 
AArch64 parts right, I'll copy-paste that to other arches.

-

PR Comment: https://git.openjdk.org/jdk/pull/20139#issuecomment-2383669579


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v4]

2024-09-30 Thread Aleksey Shipilev
> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
> method for `Reference.clear`. The original patch skipped intrinsification of 
> this method, because we thought `Reference.clear` is not on a performance 
> sensitive path. However, it shows up prominently on simple benchmarks that 
> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
> `RRWL` benchmarks.
> 
> We need to know the actual oop strongness/weakness before we call into C2 
> Access API, this work models this after existing code for `refersTo0` 
> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
> 
> Additional testing:
>  - [x] Linux x86_64 server fastdebug, `all`
>  - [x] Linux AArch64 server fastdebug, `all`

Aleksey Shipilev has updated the pull request with a new target base due to a 
merge or a rebase. The incremental webrev excludes the unrelated changes 
brought in by the merge/rebase. The pull request contains eight additional 
commits since the last revision:

 - Attempt at implementing ZGC AArch64 parts
 - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
 - Amend the test case for guaranteing it works under different compilation 
regimes
 - More precise barriers
 - Tests work
 - More touchups
 - Fixing the conditions, fixing the tests
 - Crude prototype, still failing the tests

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/20139/files
  - new: https://git.openjdk.org/jdk/pull/20139/files/437f2329..cba0a8e9

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=20139&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20139&range=02-03

  Stats: 258786 lines in 3084 files changed: 211178 ins; 30411 del; 17197 mod
  Patch: https://git.openjdk.org/jdk/pull/20139.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20139/head:pull/20139

PR: https://git.openjdk.org/jdk/pull/20139


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v3]

2024-09-30 Thread Erik Österlund
On Fri, 27 Sep 2024 18:57:51 GMT, Aleksey Shipilev  wrote:

> > Is ZGC affected by this? I see only G1 and Shenandoah changes.
> 
> Good question.
> 
> ZGC expands the GC barriers late. This is why the IR test configuration that 
> tests ZGC shows the same result as with other collectors: no additional fluff 
> in IR. I would not expect we need anything else in late expansion for ZGC for 
> Reference.clear, but maybe I am tired and cannot see it. Can you confirm this 
> is fine, @fisk?

ZGC needs some changes. Without doing anything, we propagate the 
AS_NO_KEEPALIVE decorator to the corresponding ZBarrierNoKeepalive bit being 
set in the barrier data of the StorePNode. However, we don't really do anything 
special with that information and we will in practice end up keeping the 
referent alive when clearing it with generational ZGC. The point of introducing 
the native implementation in the first place, was to make sure our GCs don't 
keep the referent alive when clearing it, as the user intention is clearly to 
not keep it alive.

I think we need a new 
ZBarrierSetRuntime::no_keepalive_store_barrier_on_oop_field_without_healing(oop*
 p) and to make that the selected slow path function when ZBarrierNoKeepalive 
is set on a StorePNode. Its implementation would call 
ZBarrier::no_keep_alive_store_barrier_on_heap_oop_field. This should do the 
trick.

Please let me know if you need further assistance.

-

PR Comment: https://git.openjdk.org/jdk/pull/20139#issuecomment-2383479242


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v3]

2024-09-27 Thread Kim Barrett
On Fri, 19 Jul 2024 15:52:14 GMT, Aleksey Shipilev  wrote:

>> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
>> method for `Reference.clear`. The original patch skipped intrinsification of 
>> this method, because we thought `Reference.clear` is not on a performance 
>> sensitive path. However, it shows up prominently on simple benchmarks that 
>> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
>> `RRWL` benchmarks.
>> 
>> We need to know the actual oop strongness/weakness before we call into C2 
>> Access API, this work models this after existing code for `refersTo0` 
>> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 server fastdebug, `all`
>>  - [x] Linux AArch64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Amend the test case for guaranteing it works under different compilation 
> regimes

Changes requested by kbarrett (Reviewer).

src/java.base/share/classes/java/lang/ref/Reference.java line 420:

> 418: /* Implementation of clear(), also used by enqueue().  A simple
> 419:  * assignment of the referent field won't do for some garbage
> 420:  * collectors.

Description of clear0 is rendered stale by this change.  The first sentence is 
no longer true, since it's now
clearImpl that has that role.  The second sentence probably ought to also be 
moved into the description of
clearImpl.

-

PR Review: https://git.openjdk.org/jdk/pull/20139#pullrequestreview-2334816850
PR Review Comment: https://git.openjdk.org/jdk/pull/20139#discussion_r1779311136


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v3]

2024-09-27 Thread Vladimir Kozlov
On Fri, 19 Jul 2024 15:52:14 GMT, Aleksey Shipilev  wrote:

>> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
>> method for `Reference.clear`. The original patch skipped intrinsification of 
>> this method, because we thought `Reference.clear` is not on a performance 
>> sensitive path. However, it shows up prominently on simple benchmarks that 
>> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
>> `RRWL` benchmarks.
>> 
>> We need to know the actual oop strongness/weakness before we call into C2 
>> Access API, this work models this after existing code for `refersTo0` 
>> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 server fastdebug, `all`
>>  - [x] Linux AArch64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Amend the test case for guaranteing it works under different compilation 
> regimes

There is coming JEP for later G1 barriers expansion similar to ZGC. Will you 
still need this intrinsic after it? I assume Shenandoah will follow G1 later.

-

PR Comment: https://git.openjdk.org/jdk/pull/20139#issuecomment-2379910959


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v3]

2024-09-27 Thread Aleksey Shipilev
On Fri, 27 Sep 2024 17:44:38 GMT, Vladimir Kozlov  wrote:

> Is ZGC affected by this? I see only G1 and Shenandoah changes.

Good question.

ZGC expands the GC barriers late. This is why the IR test configuration that 
tests ZGC shows the same result as with other collectors: no additional fluff 
in IR. I would not expect we need anything else in late expansion for ZGC for 
Reference.clear, but maybe I am tired and cannot see it. Can you confirm this 
is fine, @fisk?

-

PR Comment: https://git.openjdk.org/jdk/pull/20139#issuecomment-2379881102


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v3]

2024-09-27 Thread Vladimir Kozlov
On Fri, 19 Jul 2024 15:52:14 GMT, Aleksey Shipilev  wrote:

>> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
>> method for `Reference.clear`. The original patch skipped intrinsification of 
>> this method, because we thought `Reference.clear` is not on a performance 
>> sensitive path. However, it shows up prominently on simple benchmarks that 
>> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
>> `RRWL` benchmarks.
>> 
>> We need to know the actual oop strongness/weakness before we call into C2 
>> Access API, this work models this after existing code for `refersTo0` 
>> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 server fastdebug, `all`
>>  - [x] Linux AArch64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Amend the test case for guaranteing it works under different compilation 
> regimes

Is ZGC affected by this? I see only G1 and Shenandoah changes.

-

PR Comment: https://git.openjdk.org/jdk/pull/20139#issuecomment-2379772205


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v3]

2024-09-27 Thread Aleksey Shipilev
On Fri, 19 Jul 2024 15:52:14 GMT, Aleksey Shipilev  wrote:

>> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
>> method for `Reference.clear`. The original patch skipped intrinsification of 
>> this method, because we thought `Reference.clear` is not on a performance 
>> sensitive path. However, it shows up prominently on simple benchmarks that 
>> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
>> `RRWL` benchmarks.
>> 
>> We need to know the actual oop strongness/weakness before we call into C2 
>> Access API, this work models this after existing code for `refersTo0` 
>> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 server fastdebug, `all`
>>  - [x] Linux AArch64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Amend the test case for guaranteing it works under different compilation 
> regimes

We keep seeing `Reference.clear` native call on hot paths in services in JDK 
17+. I would like to get this PR moving again. Please take a look :)

-

PR Comment: https://git.openjdk.org/jdk/pull/20139#issuecomment-2379346593


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v3]

2024-08-19 Thread Aleksey Shipilev
On Fri, 19 Jul 2024 15:52:14 GMT, Aleksey Shipilev  wrote:

>> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
>> method for `Reference.clear`. The original patch skipped intrinsification of 
>> this method, because we thought `Reference.clear` is not on a performance 
>> sensitive path. However, it shows up prominently on simple benchmarks that 
>> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
>> `RRWL` benchmarks.
>> 
>> We need to know the actual oop strongness/weakness before we call into C2 
>> Access API, this work models this after existing code for `refersTo0` 
>> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 server fastdebug, `all`
>>  - [x] Linux AArch64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Amend the test case for guaranteing it works under different compilation 
> regimes

Not now, bot. Still waiting for reviews.

-

PR Comment: https://git.openjdk.org/jdk/pull/20139#issuecomment-2295989638


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v3]

2024-07-19 Thread Aleksey Shipilev
> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
> method for `Reference.clear`. The original patch skipped intrinsification of 
> this method, because we thought `Reference.clear` is not on a performance 
> sensitive path. However, it shows up prominently on simple benchmarks that 
> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
> `RRWL` benchmarks.
> 
> We need to know the actual oop strongness/weakness before we call into C2 
> Access API, this work models this after existing code for `refersTo0` 
> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
> 
> Additional testing:
>  - [x] Linux x86_64 server fastdebug, `all`
>  - [x] Linux AArch64 server fastdebug, `all`

Aleksey Shipilev has updated the pull request incrementally with one additional 
commit since the last revision:

  Amend the test case for guaranteing it works under different compilation 
regimes

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/20139/files
  - new: https://git.openjdk.org/jdk/pull/20139/files/79ece901..437f2329

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=20139&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20139&range=01-02

  Stats: 36 lines in 1 file changed: 18 ins; 7 del; 11 mod
  Patch: https://git.openjdk.org/jdk/pull/20139.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20139/head:pull/20139

PR: https://git.openjdk.org/jdk/pull/20139


Re: RFR: 8329597: C2: Intrinsify Reference.clear

2024-07-17 Thread Aleksey Shipilev
On Thu, 11 Jul 2024 15:28:37 GMT, Aleksey Shipilev  wrote:

> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
> method for `Reference.clear`. The original patch skipped intrinsification of 
> this method, because we thought `Reference.clear` is not on a performance 
> sensitive path. However, it shows up prominently on simple benchmarks that 
> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
> `RRWL` benchmarks.
> 
> We need to know the actual oop strongness/weakness before we call into C2 
> Access API, this work models this after existing code for `refersTo0` 
> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
> 
> Additional testing:
>  - [x] Linux x86_64 server fastdebug, `all`
>  - [ ] Linux AArch64 server fastdebug, `all`

Split out the `refersTo` test to https://github.com/openjdk/jdk/pull/20215.

Yeah, so this version seems to work well on tests.

I am being extra paranoid about only accepting `null` stores, since 
`AS_NO_KEEPALIVE` means all other barriers like inter-generational 
post-barriers in G1 should still work. G1 barrier set delegates the stores to 
`CardTable/ModRefBarrierSet`, which: a) does not know which barriers can be 
bypassed by `AS_NO_KEEPALIVE`; b) calls back `G1BarrierSet` for prebarrier 
generation, but already loses the decorators. So the simplest way to deal with 
this is to handle this special case specially.

I think this is insanely sane, given how sharp-edged `AS_NO_KEEPALIVE` is.

-

PR Comment: https://git.openjdk.org/jdk/pull/20139#issuecomment-2233066721
PR Comment: https://git.openjdk.org/jdk/pull/20139#issuecomment-2234005550


Re: RFR: 8329597: C2: Intrinsify Reference.clear

2024-07-17 Thread Aleksey Shipilev
On Fri, 12 Jul 2024 13:19:31 GMT, Aleksey Shipilev  wrote:

>>> > The reason we did not do this before is that this is not a strong 
>>> > reference store. Strong reference stores with a SATB collector will keep 
>>> > the referent alive, which is typically the exact opposite of what a user 
>>> > wants when they clear a Reference.
>>> 
>>> You mean not doing this store just on the Java side? Yes, I agree, it would 
>>> be awkward. In intrinsic, we are storing with the same decorators that 
>>> `JVM_ReferenceClear` is using, which should be good with SATB collectors. 
>>> Perhaps I am misunderstanding the comment.
>> 
>> The runtime use of the Access API knows how to resolve an unknown oop ref 
>> strength using AccessBarrierSupport::resolve_unknown_oop_ref_strength. 
>> However, we do not have support for that in the C2 backend. In fact, it does 
>> not understand non-strong oop stores at all. Because there hasn't really 
>> been a use case for it, other than clearing a Reference. That's the precise 
>> reason why we do not have a clear intrinsic; it would have to add that 
>> infrastructure.
>
>> The runtime use of the Access API knows how to resolve an unknown oop ref 
>> strength using AccessBarrierSupport::resolve_unknown_oop_ref_strength. 
>> However, we do not have support for that in the C2 backend. In fact, it does 
>> not understand non-strong oop stores at all. 
> 
> Aw, nice usability landmine. I thought C2 barrier set would assert on me if 
> it cannot deliver. Apparently not, I see it just does pre-barriers when it is 
> not sure what strongness the store is. Hrmpf. OK, let me see what can be done 
> here. It might be just easier to further specialize `Reference.clear` in 
> subclasses and carry down the actual strongness, like we do with `refersTo0` 
> currently. This would still require C2 backend adjustments to handle 
> `AS_NO_KEEPALIVE` on stores, but at least we would not have to guess about 
> the strongness type in C2 intrinsic.

> I should have read what I was replying to more carefully, rather than 
> focusing on what was further up in the thread. Looks like you (@shipilev) 
> already spotted the refersTo stuff. But the enqueue => clear0 could have 
> easily been missed, so perhaps not an entirely unneeded suggestion.

Yeah, thanks. The `enqueue => clear0` was indeed easy to miss.

Pushed the crude prototype that follows `refersTo` example and drills some new 
`AS_NO_KEEPALIVE` holes in C2 Access API to cover this intrinsic case. Super 
untested. IR tests are still failing, I'll take more in-depth look there. 
(Perhaps it would not be possible to clearly match the absence of pre-barrier 
in IR tests, we'll see.)

-

PR Comment: https://git.openjdk.org/jdk/pull/20139#issuecomment-2231613218


Re: RFR: 8329597: C2: Intrinsify Reference.clear

2024-07-17 Thread Kim Barrett
On Mon, 15 Jul 2024 16:09:39 GMT, Kim Barrett  wrote:

> > Aw, nice usability landmine. I thought C2 barrier set would assert on me if 
> > it cannot deliver. Apparently not, [...]
> 
> Reference.refersTo has similar issues. See refersToImpl and refersTo0 in both 
> Reference and PhantomReference. I think you should be able to model on those 
> and the intrinsic implementation for refersTo to get what you want.
> 
> One additional complication is that Reference.enqueue intentionally calls 
> clear0. If implementing clear similarly to refersTo, then enqueue should be 
> changed to call clearImpl.

I should have read what I was replying to more carefully, rather than focusing 
on what was further up in the thread.
Looks like you (@shipilev) already spotted the refersTo stuff.  But the enqueue 
=> clear0 could have easily been missed,
so perhaps not an entirely unneeded suggestion.

-

PR Comment: https://git.openjdk.org/jdk/pull/20139#issuecomment-2231464762


Re: RFR: 8329597: C2: Intrinsify Reference.clear

2024-07-17 Thread Kim Barrett
On Fri, 12 Jul 2024 13:19:31 GMT, Aleksey Shipilev  wrote:

> > The runtime use of the Access API knows how to resolve an unknown oop ref 
> > strength using AccessBarrierSupport::resolve_unknown_oop_ref_strength. 
> > However, we do not have support for that in the C2 backend. In fact, it 
> > does not understand non-strong oop stores at all.
> 
> Aw, nice usability landmine. I thought C2 barrier set would assert on me if 
> it cannot deliver. Apparently not, [...]

Reference.refersTo has similar issues.  See refersToImpl and refersTo0 in both 
Reference and PhantomReference.
I think you should be able to model on those and the intrinsic implementation 
for refersTo to get what you want.

One additional complication is that Reference.enqueue intentionally calls 
clear0.  If implementing clear similarly
to refersTo, then enqueue should be changed to call clearImpl.

-

PR Comment: https://git.openjdk.org/jdk/pull/20139#issuecomment-2228872926


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v2]

2024-07-17 Thread Aleksey Shipilev
> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
> method for `Reference.clear`. The original patch skipped intrinsification of 
> this method, because we thought `Reference.clear` is not on a performance 
> sensitive path. However, it shows up prominently on simple benchmarks that 
> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
> `RRWL` benchmarks.
> 
> We need to know the actual oop strongness/weakness before we call into C2 
> Access API, this work models this after existing code for `refersTo0` 
> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
> 
> Additional testing:
>  - [x] Linux x86_64 server fastdebug, `all`
>  - [ ] Linux AArch64 server fastdebug, `all`

Aleksey Shipilev has updated the pull request with a new target base due to a 
merge or a rebase. The pull request now contains five commits:

 - More precise barriers
 - Tests work
 - More touchups
 - Fixing the conditions, fixing the tests
 - Crude prototype, still failing the tests

-

Changes: https://git.openjdk.org/jdk/pull/20139/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20139&range=01
  Stats: 329 lines in 13 files changed: 328 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/20139.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20139/head:pull/20139

PR: https://git.openjdk.org/jdk/pull/20139


Re: RFR: 8329597: C2: Intrinsify Reference.clear

2024-07-12 Thread Aleksey Shipilev
On Fri, 12 Jul 2024 11:57:56 GMT, Erik Österlund  wrote:

> The runtime use of the Access API knows how to resolve an unknown oop ref 
> strength using AccessBarrierSupport::resolve_unknown_oop_ref_strength. 
> However, we do not have support for that in the C2 backend. In fact, it does 
> not understand non-strong oop stores at all. 

Aw, nice usability landmine. I thought C2 barrier set would assert on me if it 
cannot deliver. Apparently not, I see it just does pre-barriers when it is not 
sure what strongness the store is. Hrmpf. OK, let me see what can be done here.

-

PR Comment: https://git.openjdk.org/jdk/pull/20139#issuecomment-2225577027


Re: RFR: 8329597: C2: Intrinsify Reference.clear

2024-07-12 Thread Erik Österlund
On Fri, 12 Jul 2024 10:22:42 GMT, Aleksey Shipilev  wrote:

> > The reason we did not do this before is that this is not a strong reference 
> > store. Strong reference stores with a SATB collector will keep the referent 
> > alive, which is typically the exact opposite of what a user wants when they 
> > clear a Reference.
> 
> You mean not doing this store just on the Java side? Yes, I agree, it would 
> be awkward. In intrinsic, we are storing with the same decorators that 
> `JVM_ReferenceClear` is using, which should be good with SATB collectors. 
> Perhaps I am misunderstanding the comment.

The runtime use of the Access API knows how to resolve an unknown oop ref 
strength using AccessBarrierSupport::resolve_unknown_oop_ref_strength. However, 
we do not have support for that in the C2 backend. In fact, it does not 
understand non-strong oop stores at all. Because there hasn't really been a use 
case for it, other than clearing a Reference. That's the precise reason why we 
do not have a clear intrinsic; it would have to add that infrastructure.

-

PR Comment: https://git.openjdk.org/jdk/pull/20139#issuecomment-2225430174


Re: RFR: 8329597: C2: Intrinsify Reference.clear

2024-07-12 Thread Aleksey Shipilev
On Fri, 12 Jul 2024 10:16:13 GMT, Erik Österlund  wrote:

> The reason we did not do this before is that this is not a strong reference 
> store. Strong reference stores with a SATB collector will keep the referent 
> alive, which is typically the exact opposite of what a user wants when they 
> clear a Reference.

You mean not doing this store just on the Java side? Yes, I agree, it would be 
awkward. In intrinsic, we are storing with the same decorators that 
`JVM_ReferenceClear` is using, which should be good with SATB collectors. 
Perhaps I am misunderstanding the comment.

-

PR Comment: https://git.openjdk.org/jdk/pull/20139#issuecomment-2225277261


Re: RFR: 8329597: C2: Intrinsify Reference.clear

2024-07-12 Thread Erik Österlund
On Thu, 11 Jul 2024 15:28:37 GMT, Aleksey Shipilev  wrote:

> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
> method for `Reference.clear`. The original patch skipped intrinsification of 
> this method, because we thought `Reference.clear` is not on a performance 
> sensitive path. However, it shows up prominently on simple benchmarks that 
> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
> `RRWL` benchmarks.
> 
> Additional testing:
>  - [x] Linux x86_64 server fastdebug, `all`
>  - [ ] Linux AArch64 server fastdebug, `all`

The reason we did not do this before is that this is not a strong reference 
store. Strong reference stores with a SATB collector will keep the referent 
alive, which is typically the exact opposite of what a user wants when they 
clear a Reference.

-

PR Comment: https://git.openjdk.org/jdk/pull/20139#issuecomment-2225266939


Re: RFR: 8329597: C2: Intrinsify Reference.clear

2024-07-11 Thread Aleksey Shipilev
On Thu, 11 Jul 2024 15:28:37 GMT, Aleksey Shipilev  wrote:

> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
> method for `Reference.clear`. The original patch skipped intrinsification of 
> this method, because we thought `Reference.clear` is not on a performance 
> sensitive path. However, it shows up prominently on simple benchmarks that 
> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
> `RRWL` benchmarks.
> 
> Additional testing:
>  - [x] Linux x86_64 server fastdebug, `all`
>  - [ ] Linux AArch64 server fastdebug, `all`

On Mac AArch64, which suffers from both native call and WX transition:


Benchmark   Mode  Cnt  Score   Error  Units

# Intrinsic OFF
ReferenceClear.phantom  avgt9  52,297 ? 0,294  ns/op
ReferenceClear.phantom_new  avgt9  57,075 ? 0,296  ns/op
ReferenceClear.soft avgt9  52,567 ? 0,393  ns/op
ReferenceClear.soft_new avgt9  57,640 ? 0,264  ns/op
ReferenceClear.weak avgt9  53,018 ? 1,285  ns/op
ReferenceClear.weak_new avgt9  57,227 ? 0,483  ns/op

# Intrinsic ON (default)
ReferenceClear.phantom  avgt9   0,780 ? 0,017  ns/op
ReferenceClear.soft avgt9   0,784 ? 0,022  ns/op
ReferenceClear.weak avgt9   0,793 ? 0,033  ns/op
ReferenceClear.phantom_new  avgt9   3,018 ? 0,015  ns/op
ReferenceClear.soft_new avgt9   3,268 ? 0,014  ns/op
ReferenceClear.weak_new avgt9   3,004 ? 0,057  ns/op


On x86_64 m7a.16xlarge, which only suffers from the native call:


Benchmark   Mode  Cnt  Score   Error  Units

# Intrinsic OFF
ReferenceClear.phantom  avgt9  14.643 ± 0.049  ns/op
ReferenceClear.soft avgt9  14.939 ± 0.438  ns/op
ReferenceClear.weak avgt9  14.648 ± 0.081  ns/op
ReferenceClear.phantom_new  avgt9  19.859 ± 2.405  ns/op
ReferenceClear.soft_new avgt9  20.208 ± 1.805  ns/op
ReferenceClear.weak_new avgt9  20.385 ± 2.570  ns/op

# Intrinsic ON (default)
ReferenceClear.phantom  avgt9   0.821 ± 0.010  ns/op
ReferenceClear.soft avgt9   0.817 ± 0.007  ns/op
ReferenceClear.weak avgt9   0.819 ± 0.010  ns/op
ReferenceClear.phantom_new  avgt9   4.195 ± 0.729  ns/op
ReferenceClear.soft_new avgt9   4.315 ± 0.599  ns/op
ReferenceClear.weak_new avgt9   3.986 ± 0.596  ns/op

-

PR Comment: https://git.openjdk.org/jdk/pull/20139#issuecomment-2223248114


RFR: 8329597: C2: Intrinsify Reference.clear

2024-07-11 Thread Aleksey Shipilev
[JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
method for `Reference.clear`. The original patch skipped intrinsification of 
this method, because we thought `Reference.clear` is not on a performance 
sensitive path. However, it shows up prominently on simple benchmarks that 
touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
`RRWL` benchmarks.

Additional testing:
 - [x] Linux x86_64 server fastdebug, `all`
 - [ ] Linux AArch64 server fastdebug, `all`

-

Commit messages:
 - Move the membar at the end
 - Revert C1 parts
 - Work

Changes: https://git.openjdk.org/jdk/pull/20139/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20139&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8329597
  Stats: 132 lines in 7 files changed: 132 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/20139.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20139/head:pull/20139

PR: https://git.openjdk.org/jdk/pull/20139