Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v4]

2024-04-09 Thread Maurizio Cimadamore
On Fri, 5 Apr 2024 02:40:16 GMT, Dean Long wrote:

> That way C2 can do all its usual optimizations, like unrolling, 
> vectorization, and redundant store elimination (if it is an on-heap primitive 
> array that was just allocated, then there is no need to zero the parts that 
> are being "set").

I second that. It is something that came up quite frequently in the discussions 
around the FFM API.

-

PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2044409509


Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v4]

2024-04-04 Thread Dean Long
On Wed, 3 Apr 2024 15:15:24 GMT, Scott Gibbons wrote:

>> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64.  See 
>> [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around 
>> this change.
>> 
>> Making this an intrinsic improves the performance of 
>> `Unsafe::setMemory` by up to 4x for all buffer sizes.
>> 
>> Tested with tier-1 (and full CI).  I've added a table of the before and 
>> after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`).
>> 
>> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt)
>
> Scott Gibbons has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Fix Windows

As an experiment, couldn't you have the C2 intrinsic redirect to a Java helper 
that calls putByte() in a loop?
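
A minimal sketch of what such a helper could look like, assuming a plain `byte[]` as the target. The class name `SetMemoryLoopSketch` and the method signature are hypothetical and not taken from the PR; a real helper inside the JDK would issue an `Unsafe` `putByte` store where this sketch uses an array store.

```java
import java.util.Objects;

// Hypothetical helper, not code from the PR: a byte-at-a-time fill loop
// of the kind a compiler intrinsic could redirect to.
final class SetMemoryLoopSketch {
    private SetMemoryLoopSketch() {}

    /**
     * Stores value into `bytes` consecutive elements of base,
     * starting at index offset.
     */
    static void setMemory(byte[] base, int offset, int bytes, byte value) {
        // Reject ranges that fall outside the array before writing anything.
        Objects.checkFromIndexSize(offset, bytes, base.length);
        for (int i = 0; i < bytes; i++) {
            // Assumption: inside the JDK this would be an Unsafe putByte call;
            // a plain array store stands in for it here.
            base[offset + i] = value;
        }
    }
}
```

Because the loop is ordinary Java, the JIT compiler sees every store and is free to apply its usual loop optimizations to it.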

-

PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2038994043


Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v4]

2024-04-04 Thread Dean Long
On Wed, 3 Apr 2024 15:15:24 GMT, Scott Gibbons wrote:

>> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64.  See 
>> [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around 
>> this change.
>> 
>> Making this an intrinsic improves the performance of 
>> `Unsafe::setMemory` by up to 4x for all buffer sizes.
>> 
>> Tested with tier-1 (and full CI).  I've added a table of the before and 
>> after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`).
>> 
>> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt)
>
> Scott Gibbons has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Fix Windows

I think the right approach is to turn it into a loop in the IR, which I think 
is what Doug was implying.  That way C2 can do all its usual optimizations, 
like unrolling, vectorization, and redundant store elimination (if it is an 
on-heap primitive array that was just allocated, then there is no need to zero 
the parts that are being "set").
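
The allocate-then-set pattern described above can be sketched as follows. The names `AllocateThenFillSketch` and `newPartiallyFilled` are hypothetical, and whether C2 actually elides the redundant zeroing for a given shape of code is not something this sketch demonstrates; it only shows the source-level pattern in which the opportunity arises.

```java
import java.util.Objects;

// Hypothetical illustration, not code from the PR.
final class AllocateThenFillSketch {
    private AllocateThenFillSketch() {}

    /** Returns a new array whose elements in [from, to) hold value and are zero elsewhere. */
    static byte[] newPartiallyFilled(int length, int from, int to, byte value) {
        Objects.checkFromToIndex(from, to, length);
        // Java semantics require a freshly allocated array to read as all zeros.
        byte[] a = new byte[length];
        // Every element in [from, to) is overwritten before anyone can observe it,
        // so zeroing that range first is redundant work a compiler may skip
        // when it can see both the allocation and the loop.
        for (int i = from; i < to; i++) {
            a[i] = value;
        }
        return a;
    }
}
```

Only the elements outside `[from, to)` need the initial zeroing, which is the saving the comment above refers to.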

-

Changes requested by dlong (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/18555#pullrequestreview-1981533209


Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v4]

2024-04-03 Thread Scott Gibbons
> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64.  See 
> [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around 
> this change.
> 
> Making this an intrinsic improves the performance of 
> `Unsafe::setMemory` by up to 4x for all buffer sizes.
> 
> Tested with tier-1 (and full CI).  I've added a table of the before and after 
> numbers for the JMH I ran (`MemorySegmentZeroUnsafe`).
> 
> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt)

Scott Gibbons has updated the pull request incrementally with one additional 
commit since the last revision:

  Fix Windows

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/18555/files
  - new: https://git.openjdk.org/jdk/pull/18555/files/3aa60a48..8bed1561

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=02-03

  Stats: 8 lines in 1 file changed: 6 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/18555.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555

PR: https://git.openjdk.org/jdk/pull/18555