Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v7]
On Thu, 11 Apr 2024 00:38:11 GMT, Sandhya Viswanathan wrote: >> Scott Gibbons has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Add movq to locate_operand > > src/hotspot/cpu/x86/macroAssembler_x86.cpp line 5988: > >> 5986: movw(Address(to, 0), value); >> 5987: addptr(to, 2); >> 5988: subptr(count, 1<<(shift-1)); > > At line 5968 also we need the change from cmpl to cmpptr. > cmpl(count, 2< src/hotspot/cpu/x86/macroAssembler_x86.cpp line 6050: > >> 6048: vpbroadcastd(xtmp, xtmp, Assembler::AVX_512bit); >> 6049: >> 6050: subptr(count, 16 << shift); > > At line 6045 also the cmpl should change to cmpptr: > cmpl(count, VM_Version::avx3_threshold()); Will do. > src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2497: > >> 2495: // >> 2496: address StubGenerator::generate_unsafe_setmemory(const char *name, >> 2497: address >> byte_fill_entry) { > > Need to add UnsafeSetMemoryMark on similar lines as UnsafeCopyMemoryMark to > handle page error. Will do. > src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2522: > >> 2520: #define rScratch3 r8 >> 2521: #undef rScratch4 >> 2522: #define rScratch4 r11 > > We could do this setup using const Register declaration instead of using > #undef/#define pair. We discussed this and the #define option was your preferred method since the registers are being re-used. Do you want this changed back again? > src/hotspot/share/opto/library_call.cpp line 4950: > >> 4948: >> 4949: bool LibraryCallKit::inline_unsafe_setMemory() { >> 4950: if (callee()->is_static()) return false; // caller must have the >> capability! > > Also need to return false if StubRoutines::unsafe_setmemory() == nullptr. Will do. - PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561128613 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561128862 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561129186 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561128151 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561130102
Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v7]
On Mon, 8 Apr 2024 19:11:19 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See >> [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around >> this change. >> >> Overall, making this an intrinsic improves overall performance of >> `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and >> after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional > commit since the last revision: > > Add movq to locate_operand src/hotspot/cpu/x86/macroAssembler_x86.cpp line 5988: > 5986: movw(Address(to, 0), value); > 5987: addptr(to, 2); > 5988: subptr(count, 1<<(shift-1)); At line 5968 also we need the change from cmpl to cmpptr. cmpl(count, 2< 6048: vpbroadcastd(xtmp, xtmp, Assembler::AVX_512bit); > 6049: > 6050: subptr(count, 16 << shift); At line 6045 also the cmpl should change to cmpptr: cmpl(count, VM_Version::avx3_threshold()); - PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1560205702 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1560213265
Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v7]
On Mon, 8 Apr 2024 19:11:19 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See >> [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around >> this change. >> >> Overall, making this an intrinsic improves overall performance of >> `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and >> after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional > commit since the last revision: > > Add movq to locate_operand src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2497: > 2495: // > 2496: address StubGenerator::generate_unsafe_setmemory(const char *name, > 2497: address > byte_fill_entry) { Need to add UnsafeSetMemoryMark on similar lines as UnsafeCopyMemoryMark to handle page error. src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2522: > 2520: #define rScratch3 r8 > 2521: #undef rScratch4 > 2522: #define rScratch4 r11 We could do this setup using const Register declaration instead of using #undef/#define pair. src/hotspot/share/opto/library_call.cpp line 4950: > 4948: > 4949: bool LibraryCallKit::inline_unsafe_setMemory() { > 4950: if (callee()->is_static()) return false; // caller must have the > capability! Also need to return false if StubRoutines::unsafe_setmemory() == nullptr. - PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1560201153 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1560202218 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1560194011
Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v7]
On Mon, 8 Apr 2024 19:11:19 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See >> [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around >> this change. >> >> Overall, making this an intrinsic improves overall performance of >> `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and >> after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional > commit since the last revision: > > Add movq to locate_operand To me doing the compiler intrinsic for Unsafe setMemory looked to be the right first step as in this PR. There is precedence with Unsafe copyMemory intrinsic so we are in sync with what is done before. We could request Fei Yang/Hamlin Li for RISC-V intrinsic and Bhavana Kilambi/Nick Gasson for AARCH64 intrinsic as follow up PRs. - PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2048164233
Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v7]
On Mon, 8 Apr 2024 19:11:19 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See >> [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around >> this change. >> >> Overall, making this an intrinsic improves overall performance of >> `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and >> after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional > commit since the last revision: > > Add movq to locate_operand Thanks, I see that my ideas have pretty much already been discussed in https://github.com/openjdk/jdk/pull/16760. I might have missed it, but has the possibility of always setting the aligned interior region with 8 byte stores been discussed? A literal reading of the javadoc seems to disallow it, but it seems like it should be allowed based on memory coherence. Only the unaligned head and tail would need special treatment. - PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2043732829
Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v7]
> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See > [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around > this change. > > Overall, making this an intrinsic improves overall performance of > `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after > numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). > > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Add movq to locate_operand - Changes: - all: https://git.openjdk.org/jdk/pull/18555/files - new: https://git.openjdk.org/jdk/pull/18555/files/fd6f04f7..f81aaa9f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=18555=06 - incr: https://webrevs.openjdk.org/?repo=jdk=18555=05-06 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555