Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v7]

2024-04-11 Thread Scott Gibbons
On Thu, 11 Apr 2024 00:38:11 GMT, Sandhya Viswanathan 
 wrote:

>> Scott Gibbons has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Add movq to locate_operand
>
> src/hotspot/cpu/x86/macroAssembler_x86.cpp line 5988:
> 
>> 5986: movw(Address(to, 0), value);
>> 5987: addptr(to, 2);
>> 5988: subptr(count, 1<<(shift-1));
> 
> At line 5968 also we need the change from cmpl to cmpptr.
> cmpl(count, 2< src/hotspot/cpu/x86/macroAssembler_x86.cpp line 6050:
> 
>> 6048:   vpbroadcastd(xtmp, xtmp, Assembler::AVX_512bit);
>> 6049: 
>> 6050:   subptr(count, 16 << shift);
> 
> At line 6045 also the cmpl should change to cmpptr:
> cmpl(count, VM_Version::avx3_threshold());

Will do.

> src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2497:
> 
>> 2495: //
>> 2496: address StubGenerator::generate_unsafe_setmemory(const char *name,
>> 2497:  address 
>> byte_fill_entry) {
> 
> Need to add UnsafeSetMemoryMark on similar lines as UnsafeCopyMemoryMark to 
> handle page error.

Will do.

> src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2522:
> 
>> 2520: #define rScratch3 r8
>> 2521: #undef rScratch4
>> 2522: #define rScratch4 r11
> 
> We could do this setup using const Register declaration instead of using 
> #undef/#define pair.

We discussed this and the #define option was your preferred method since the 
registers are being re-used.  Do you want this changed back again?

> src/hotspot/share/opto/library_call.cpp line 4950:
> 
>> 4948: 
>> 4949: bool LibraryCallKit::inline_unsafe_setMemory() {
>> 4950:   if (callee()->is_static())  return false;  // caller must have the 
>> capability!
> 
> Also need to return false if StubRoutines::unsafe_setmemory() == nullptr.

Will do.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561128613
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561128862
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561129186
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561128151
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561130102


Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v7]

2024-04-10 Thread Sandhya Viswanathan
On Mon, 8 Apr 2024 19:11:19 GMT, Scott Gibbons  wrote:

>> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64.  See 
>> [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around 
>> this change.
>> 
>> Overall, making this an intrinsic improves overall performance of 
>> `Unsafe::setMemory` by up to 4x for all buffer sizes.
>> 
>> Tested with tier-1 (and full CI).  I've added a table of the before and 
>> after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`).
>> 
>> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt)
>
> Scott Gibbons has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Add movq to locate_operand

src/hotspot/cpu/x86/macroAssembler_x86.cpp line 5988:

> 5986: movw(Address(to, 0), value);
> 5987: addptr(to, 2);
> 5988: subptr(count, 1<<(shift-1));

At line 5968 also we need the change from cmpl to cmpptr.
cmpl(count, 2< 6048:   vpbroadcastd(xtmp, xtmp, Assembler::AVX_512bit);
> 6049: 
> 6050:   subptr(count, 16 << shift);

At line 6045 also the cmpl should change to cmpptr:
cmpl(count, VM_Version::avx3_threshold());

-

PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1560205702
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1560213265


Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v7]

2024-04-10 Thread Sandhya Viswanathan
On Mon, 8 Apr 2024 19:11:19 GMT, Scott Gibbons  wrote:

>> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64.  See 
>> [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around 
>> this change.
>> 
>> Overall, making this an intrinsic improves overall performance of 
>> `Unsafe::setMemory` by up to 4x for all buffer sizes.
>> 
>> Tested with tier-1 (and full CI).  I've added a table of the before and 
>> after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`).
>> 
>> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt)
>
> Scott Gibbons has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Add movq to locate_operand

src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2497:

> 2495: //
> 2496: address StubGenerator::generate_unsafe_setmemory(const char *name,
> 2497:  address 
> byte_fill_entry) {

Need to add UnsafeSetMemoryMark on similar lines as UnsafeCopyMemoryMark to 
handle page error.

src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2522:

> 2520: #define rScratch3 r8
> 2521: #undef rScratch4
> 2522: #define rScratch4 r11

We could do this setup using const Register declaration instead of using 
#undef/#define pair.

src/hotspot/share/opto/library_call.cpp line 4950:

> 4948: 
> 4949: bool LibraryCallKit::inline_unsafe_setMemory() {
> 4950:   if (callee()->is_static())  return false;  // caller must have the 
> capability!

Also need to return false if StubRoutines::unsafe_setmemory() == nullptr.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1560201153
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1560202218
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1560194011


Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v7]

2024-04-10 Thread Sandhya Viswanathan
On Mon, 8 Apr 2024 19:11:19 GMT, Scott Gibbons  wrote:

>> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64.  See 
>> [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around 
>> this change.
>> 
>> Overall, making this an intrinsic improves overall performance of 
>> `Unsafe::setMemory` by up to 4x for all buffer sizes.
>> 
>> Tested with tier-1 (and full CI).  I've added a table of the before and 
>> after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`).
>> 
>> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt)
>
> Scott Gibbons has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Add movq to locate_operand

To me doing the compiler intrinsic for Unsafe setMemory looked to be the right 
first step as in this PR. There is precedence with Unsafe copyMemory intrinsic 
so we are in sync with what is done before. We could request Fei Yang/Hamlin Li 
for RISC-V intrinsic and Bhavana Kilambi/Nick Gasson for AARCH64 intrinsic as 
follow up PRs.

-

PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2048164233


Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v7]

2024-04-08 Thread Dean Long
On Mon, 8 Apr 2024 19:11:19 GMT, Scott Gibbons  wrote:

>> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64.  See 
>> [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around 
>> this change.
>> 
>> Overall, making this an intrinsic improves overall performance of 
>> `Unsafe::setMemory` by up to 4x for all buffer sizes.
>> 
>> Tested with tier-1 (and full CI).  I've added a table of the before and 
>> after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`).
>> 
>> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt)
>
> Scott Gibbons has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Add movq to locate_operand

Thanks, I see that my ideas have pretty much already been discussed in 
https://github.com/openjdk/jdk/pull/16760.  I might have missed it, but has the 
possibility of always setting the aligned interior region with 8 byte stores 
been discussed?  A literal reading of the javadoc seems to disallow it, but it 
seems like it should be allowed based on memory coherence.  Only the unaligned 
head and tail would need special treatment.

-

PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2043732829


Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v7]

2024-04-08 Thread Scott Gibbons
> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64.  See 
> [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around 
> this change.
> 
> Overall, making this an intrinsic improves overall performance of 
> `Unsafe::setMemory` by up to 4x for all buffer sizes.
> 
> Tested with tier-1 (and full CI).  I've added a table of the before and after 
> numbers for the JMH I ran (`MemorySegmentZeroUnsafe`).
> 
> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt)

Scott Gibbons has updated the pull request incrementally with one additional 
commit since the last revision:

  Add movq to locate_operand

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/18555/files
  - new: https://git.openjdk.org/jdk/pull/18555/files/fd6f04f7..f81aaa9f

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=18555=06
 - incr: https://webrevs.openjdk.org/?repo=jdk=18555=05-06

  Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/18555.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555

PR: https://git.openjdk.org/jdk/pull/18555