Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]

2024-04-19 Thread Jorn Vernee
On Fri, 19 Apr 2024 19:18:13 GMT, Scott Gibbons  wrote:

>> src/hotspot/share/opto/runtime.cpp line 786:
>> 
>>> 784:   fields[argp++] = TypePtr::NOTNULL;// dest
>>> 785:   fields[argp++] = TypeLong::LONG;  // size
>>> 786:   fields[argp++] = Type::HALF;  // size
>> 
>> Since the size is a `size_t`, I don't think this is correct on 32-bit 
>> platforms. I think here we want `TypeX_X`, and then add the extra `HALF` 
>> only on 64-bit platforms. Similar to what we do in `make_arraycopy_Type`: 
>> https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/runtime.cpp#L799-L802
>> 
>> (Note that you will also have to adjust `argcnt` for this)
>
> I don't understand this well enough to be confident in the change.  Can you 
> please verify that I've changed it properly?

Your latest version looks good to me.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572909435


Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]

2024-04-19 Thread Scott Gibbons
On Fri, 19 Apr 2024 18:16:33 GMT, Vladimir Kozlov  wrote:

>> Scott Gibbons has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Add enter() and leave(); remove Windows-specific register stuff
>
> src/hotspot/share/utilities/copy.hpp line 303:
> 
>> 301:   inline static void shared_disjoint_words_atomic(const HeapWord* from,
>> 302:   HeapWord* to, size_t 
>> count) {
>> 303: switch (count) {
> 
> I don't think this justify to change the file.

Reverted.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572824249


Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]

2024-04-19 Thread Scott Gibbons
On Fri, 19 Apr 2024 15:50:05 GMT, Jorn Vernee  wrote:

>> Scott Gibbons has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Add enter() and leave(); remove Windows-specific register stuff
>
> src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2603:
> 
>> 2601: const Register wide_value = rax;
>> 2602: const Register rScratch1 = r10;
>> 2603: 
> 
> Maybe put an `assert_different_registers` here for the above registers, just 
> to be sure. (I see you are avoiding the existing `rscratch1` already, because 
> of a conflict with `c_rarg2`)

Done.

> src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2674:
> 
>> 2672: // Parameter order is (ptr, byteVal, size)
>> 2673: __ xchgq(c_rarg1, c_rarg2);
>> 2674: __ pop(rbp);// Clear effect of enter()
> 
> Why not just use `leave()` here?

No special reason.  I've changed it since it seems to provide more clarity.

> src/hotspot/share/opto/library_call.cpp line 4959:
> 
>> 4957:   if (stopped())  return true;
>> 4958: 
>> 4959:   if (StubRoutines::unsafe_setmemory() == nullptr) return false;
> 
> I don't see why this check is needed here, since we already check whether the 
> stub is there in `is_intrinsic_supported`.
> 
> Note that `inline_unsafe_copyMemory` also doesn't have this check. I think it 
> would be good to keep consistency between the two.

Removed.

> src/hotspot/share/opto/runtime.cpp line 780:
> 
>> 778: const TypeFunc* OptoRuntime::make_setmemory_Type() {
>> 779:   // create input type (domain)
>> 780:   int num_args  = 4;
> 
> This variable seems redundant.

It is.  It is there due to copy/paste from the other 10 places that also have 
the same redundant variable declaration.  I've removed it from here, but I 
think I'll be asked to submit a separate PR if I remove it from the other 
locations.

Note that it's also redundant in `make_arraycopy_Type()`.

> src/hotspot/share/opto/runtime.cpp line 786:
> 
>> 784:   fields[argp++] = TypePtr::NOTNULL;// dest
>> 785:   fields[argp++] = TypeLong::LONG;  // size
>> 786:   fields[argp++] = Type::HALF;  // size
> 
> Since the size is a `size_t`, I don't think this is correct on 32-bit 
> platforms. I think here we want `TypeX_X`, and then add the extra `HALF` only 
> on 64-bit platforms. Similar to what we do in `make_arraycopy_Type`: 
> https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/runtime.cpp#L799-L802
> 
> (Note that you will also have to adjust `argcnt` for this)

I don't understand this well enough to be confident in the change.  Can you 
please verify that I've changed it properly?

-

PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572797332
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572800059
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572804660
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572815040
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572823468


Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]

2024-04-19 Thread Scott Gibbons
On Fri, 19 Apr 2024 18:14:05 GMT, Vladimir Kozlov  wrote:

>> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 4013:
>> 
>>> 4011:   // Initialize table for unsafe copy memeory check.
>>> 4012:   if (UnsafeMemoryAccess::_table == nullptr) {
>>> 4013: UnsafeMemoryAccess::create_table(26);
>> 
>> How did you arrive at a table size of 26?
>
> This needs comment

I added 10 to the table size because I knew I was going to add 7 places where a 
mark was required for setMemory.  I left 3 for safety.  The algorithm changed 
so only 4 are needed.

The algorithm has since changed, so I changed this to:

`UnsafeCopyMemory::create_table(16 + 4); // 16 for copyMemory; 4 for setMemory`

I did a similar change to all other table creation numbers.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572841521


Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]

2024-04-19 Thread Vladimir Kozlov
On Fri, 19 Apr 2024 15:43:17 GMT, Jorn Vernee  wrote:

>> Scott Gibbons has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Add enter() and leave(); remove Windows-specific register stuff
>
> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 4013:
> 
>> 4011:   // Initialize table for unsafe copy memeory check.
>> 4012:   if (UnsafeMemoryAccess::_table == nullptr) {
>> 4013: UnsafeMemoryAccess::create_table(26);
> 
> How did you arrive at a table size of 26?

This needs comment

-

PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572744077


Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]

2024-04-19 Thread Jorn Vernee
On Tue, 16 Apr 2024 00:04:15 GMT, Scott Gibbons  wrote:

>> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64.  See 
>> [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around 
>> this change.
>> 
>> Overall, making this an intrinsic improves overall performance of 
>> `Unsafe::setMemory` by up to 4x for all buffer sizes.
>> 
>> Tested with tier-1 (and full CI).  I've added a table of the before and 
>> after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`).
>> 
>> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt)
>
> Scott Gibbons has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Add enter() and leave(); remove Windows-specific register stuff

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 4013:

> 4011:   // Initialize table for unsafe copy memeory check.
> 4012:   if (UnsafeMemoryAccess::_table == nullptr) {
> 4013: UnsafeMemoryAccess::create_table(26);

How did you arrive at a table size of 26?

src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2603:

> 2601: const Register wide_value = rax;
> 2602: const Register rScratch1 = r10;
> 2603: 

Maybe put an `assert_different_registers` here for the above registers, just to 
be sure. (I see you are avoiding the existing `rscratch1` already, because of a 
conflict with `c_rarg2`)

src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2674:

> 2672: // Parameter order is (ptr, byteVal, size)
> 2673: __ xchgq(c_rarg1, c_rarg2);
> 2674: __ pop(rbp);// Clear effect of enter()

Why not just use `leave()` here?

src/hotspot/share/opto/library_call.cpp line 4959:

> 4957:   if (stopped())  return true;
> 4958: 
> 4959:   if (StubRoutines::unsafe_setmemory() == nullptr) return false;

I don't see why this check is needed here, since we already check whether the 
stub is there in `is_intrinsic_supported`.

Note that `inline_unsafe_copyMemory` also doesn't have this check. I think it 
would be good to keep consistency between the two.

src/hotspot/share/opto/runtime.cpp line 780:

> 778: const TypeFunc* OptoRuntime::make_setmemory_Type() {
> 779:   // create input type (domain)
> 780:   int num_args  = 4;

This variable seems redundant.

src/hotspot/share/opto/runtime.cpp line 786:

> 784:   fields[argp++] = TypePtr::NOTNULL;// dest
> 785:   fields[argp++] = TypeLong::LONG;  // size
> 786:   fields[argp++] = Type::HALF;  // size

Since the size is a `size_t`, I don't think this is correct on 32-bit 
platforms. I think here we want `TypeX_X`, and then add the extra `HALF` only 
on 64-bit platforms. Similar to what we do in `make_arraycopy_Type`: 
https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/runtime.cpp#L799-L802

(Note that you will also have to adjust `argcnt` for this)

-

PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572570842
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572578437
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572593795
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572556648
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572564382
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572562058


Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]

2024-04-19 Thread Scott Gibbons
On Fri, 19 Apr 2024 13:25:33 GMT, Jatin Bhateja  wrote:

>> Scott Gibbons has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Add enter() and leave(); remove Windows-specific register stuff
>
> src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2611:
> 
>> 2609: // Propagate byte to full Register
>> 2610: __ movzbl(rScratch1, byteVal);
>> 2611: __ mov64(wide_value, 0x0101010101010101);
> 
> Long constant should be suffixed by ULL.

Fixed.

> test/micro/org/openjdk/bench/java/lang/foreign/MemorySegmentZeroUnsafe.java 
> line 1:
> 
>> 1: package org.openjdk.bench.java.lang.foreign;
> 
> Copyright header missing.

Added

-

PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572502532
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572502204


Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]

2024-04-19 Thread Jatin Bhateja
On Tue, 16 Apr 2024 00:04:15 GMT, Scott Gibbons  wrote:

>> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64.  See 
>> [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around 
>> this change.
>> 
>> Overall, making this an intrinsic improves overall performance of 
>> `Unsafe::setMemory` by up to 4x for all buffer sizes.
>> 
>> Tested with tier-1 (and full CI).  I've added a table of the before and 
>> after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`).
>> 
>> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt)
>
> Scott Gibbons has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Add enter() and leave(); remove Windows-specific register stuff

Hi @asgibbons Please add a new test / extend an existing test for SIGBUS 
violation testing test/hotspot/jtreg/runtime/Unsafe/InternalErrorTest.java

src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2611:

> 2609: // Propagate byte to full Register
> 2610: __ movzbl(rScratch1, byteVal);
> 2611: __ mov64(wide_value, 0x0101010101010101);

Long constant should be suffixed by ULL.

test/micro/org/openjdk/bench/java/lang/foreign/MemorySegmentZeroUnsafe.java 
line 1:

> 1: package org.openjdk.bench.java.lang.foreign;

Copyright header missing.

-

PR Review: https://git.openjdk.org/jdk/pull/18555#pullrequestreview-2011247585
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572370327
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572267154


Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]

2024-04-18 Thread Sandhya Viswanathan
On Tue, 16 Apr 2024 00:04:15 GMT, Scott Gibbons  wrote:

>> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64.  See 
>> [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around 
>> this change.
>> 
>> Overall, making this an intrinsic improves overall performance of 
>> `Unsafe::setMemory` by up to 4x for all buffer sizes.
>> 
>> Tested with tier-1 (and full CI).  I've added a table of the before and 
>> after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`).
>> 
>> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt)
>
> Scott Gibbons has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Add enter() and leave(); remove Windows-specific register stuff

@vnkozlov Could you please review this PR from @asgibbons? Looking forward to 
your inputs.

-

PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2064852401


Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]

2024-04-16 Thread Scott Gibbons
On Tue, 16 Apr 2024 00:04:15 GMT, Scott Gibbons  wrote:

>> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64.  See 
>> [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around 
>> this change.
>> 
>> Overall, making this an intrinsic improves overall performance of 
>> `Unsafe::setMemory` by up to 4x for all buffer sizes.
>> 
>> Tested with tier-1 (and full CI).  I've added a table of the before and 
>> after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`).
>> 
>> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt)
>
> Scott Gibbons has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Add enter() and leave(); remove Windows-specific register stuff

I ran the benchmark again using a branch from openjdk today against this 
modification.  Performance increased an average of 4.98x (max - 7.52x (unsafe 
64-byte aligned), min - 3.02x (panama 255-byte aligned), stddev - 0.94x).

I'll integrate after a second review has been done.  Thanks!

-

PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2059637042


Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]

2024-04-15 Thread Sandhya Viswanathan
On Tue, 16 Apr 2024 00:04:15 GMT, Scott Gibbons  wrote:

>> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64.  See 
>> [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around 
>> this change.
>> 
>> Overall, making this an intrinsic improves overall performance of 
>> `Unsafe::setMemory` by up to 4x for all buffer sizes.
>> 
>> Tested with tier-1 (and full CI).  I've added a table of the before and 
>> after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`).
>> 
>> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt)
>
> Scott Gibbons has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Add enter() and leave(); remove Windows-specific register stuff

PR looks good to me now. Thanks a lot for considering all the inputs.

-

Marked as reviewed by sviswanathan (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/18555#pullrequestreview-2002553946


Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]

2024-04-15 Thread Scott Gibbons
> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64.  See 
> [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around 
> this change.
> 
> Overall, making this an intrinsic improves overall performance of 
> `Unsafe::setMemory` by up to 4x for all buffer sizes.
> 
> Tested with tier-1 (and full CI).  I've added a table of the before and after 
> numbers for the JMH I ran (`MemorySegmentZeroUnsafe`).
> 
> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt)

Scott Gibbons has updated the pull request incrementally with one additional 
commit since the last revision:

  Add enter() and leave(); remove Windows-specific register stuff

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/18555/files
  - new: https://git.openjdk.org/jdk/pull/18555/files/113aa90f..7a1d67e5

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=18555=20
 - incr: https://webrevs.openjdk.org/?repo=jdk=18555=19-20

  Stats: 38 lines in 1 file changed: 1 ins; 20 del; 17 mod
  Patch: https://git.openjdk.org/jdk/pull/18555.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555

PR: https://git.openjdk.org/jdk/pull/18555