Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]
On Fri, 19 Apr 2024 19:18:13 GMT, Scott Gibbons wrote: >> src/hotspot/share/opto/runtime.cpp line 786: >> >>> 784: fields[argp++] = TypePtr::NOTNULL;// dest >>> 785: fields[argp++] = TypeLong::LONG; // size >>> 786: fields[argp++] = Type::HALF; // size >> >> Since the size is a `size_t`, I don't think this is correct on 32-bit >> platforms. I think here we want `TypeX_X`, and then add the extra `HALF` >> only on 64-bit platforms. Similar to what we do in `make_arraycopy_Type`: >> https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/runtime.cpp#L799-L802 >> >> (Note that you will also have to adjust `argcnt` for this) > > I don't understand this well enough to be confident in the change. Can you > please verify that I've changed it properly? Your latest version looks good to me. - PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572909435
Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]
On Fri, 19 Apr 2024 18:16:33 GMT, Vladimir Kozlov wrote: >> Scott Gibbons has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Add enter() and leave(); remove Windows-specific register stuff > > src/hotspot/share/utilities/copy.hpp line 303: > >> 301: inline static void shared_disjoint_words_atomic(const HeapWord* from, >> 302: HeapWord* to, size_t >> count) { >> 303: switch (count) { > > I don't think this justify to change the file. Reverted. - PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572824249
Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]
On Fri, 19 Apr 2024 15:50:05 GMT, Jorn Vernee wrote: >> Scott Gibbons has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Add enter() and leave(); remove Windows-specific register stuff > > src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2603: > >> 2601: const Register wide_value = rax; >> 2602: const Register rScratch1 = r10; >> 2603: > > Maybe put an `assert_different_registers` here for the above registers, just > to be sure. (I see you are avoiding the existing `rscratch1` already, because > of a conflict with `c_rarg2`) Done. > src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2674: > >> 2672: // Parameter order is (ptr, byteVal, size) >> 2673: __ xchgq(c_rarg1, c_rarg2); >> 2674: __ pop(rbp);// Clear effect of enter() > > Why not just use `leave()` here? No special reason. I've changed it since it seems to provide more clarity. > src/hotspot/share/opto/library_call.cpp line 4959: > >> 4957: if (stopped()) return true; >> 4958: >> 4959: if (StubRoutines::unsafe_setmemory() == nullptr) return false; > > I don't see why this check is needed here, since we already check whether the > stub is there in `is_intrinsic_supported`. > > Note that `inline_unsafe_copyMemory` also doesn't have this check. I think it > would be good to keep consistency between the two. Removed. > src/hotspot/share/opto/runtime.cpp line 780: > >> 778: const TypeFunc* OptoRuntime::make_setmemory_Type() { >> 779: // create input type (domain) >> 780: int num_args = 4; > > This variable seems redundant. It is. It is there due to copy/paste from the other 10 places that also have the same redundant variable declaration. I've removed it from here, but I think I'll be asked to submit a separate PR if I remove it from the other locations. Note that it's also redundant in `make_arraycopy_Type()`. > src/hotspot/share/opto/runtime.cpp line 786: > >> 784: fields[argp++] = TypePtr::NOTNULL;// dest >> 785: fields[argp++] = TypeLong::LONG; // size >> 786: fields[argp++] = Type::HALF; // size > > Since the size is a `size_t`, I don't think this is correct on 32-bit > platforms. I think here we want `TypeX_X`, and then add the extra `HALF` only > on 64-bit platforms. Similar to what we do in `make_arraycopy_Type`: > https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/runtime.cpp#L799-L802 > > (Note that you will also have to adjust `argcnt` for this) I don't understand this well enough to be confident in the change. Can you please verify that I've changed it properly? - PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572797332 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572800059 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572804660 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572815040 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572823468
Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]
On Fri, 19 Apr 2024 18:14:05 GMT, Vladimir Kozlov wrote: >> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 4013: >> >>> 4011: // Initialize table for unsafe copy memeory check. >>> 4012: if (UnsafeMemoryAccess::_table == nullptr) { >>> 4013: UnsafeMemoryAccess::create_table(26); >> >> How did you arrive at a table size of 26? > > This needs comment I added 10 to the table size because I knew I was going to add 7 places where a mark was required for setMemory. I left 3 for safety. The algorithm changed so only 4 are needed. The algorithm has since changed, so I changed this to: `UnsafeCopyMemory::create_table(16 + 4); // 16 for copyMemory; 4 for setMemory` I did a similar change to all other table creation numbers. - PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572841521
Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]
On Fri, 19 Apr 2024 15:43:17 GMT, Jorn Vernee wrote: >> Scott Gibbons has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Add enter() and leave(); remove Windows-specific register stuff > > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 4013: > >> 4011: // Initialize table for unsafe copy memeory check. >> 4012: if (UnsafeMemoryAccess::_table == nullptr) { >> 4013: UnsafeMemoryAccess::create_table(26); > > How did you arrive at a table size of 26? This needs comment - PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572744077
Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]
On Tue, 16 Apr 2024 00:04:15 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See >> [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around >> this change. >> >> Overall, making this an intrinsic improves overall performance of >> `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and >> after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional > commit since the last revision: > > Add enter() and leave(); remove Windows-specific register stuff src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 4013: > 4011: // Initialize table for unsafe copy memeory check. > 4012: if (UnsafeMemoryAccess::_table == nullptr) { > 4013: UnsafeMemoryAccess::create_table(26); How did you arrive at a table size of 26? src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2603: > 2601: const Register wide_value = rax; > 2602: const Register rScratch1 = r10; > 2603: Maybe put an `assert_different_registers` here for the above registers, just to be sure. (I see you are avoiding the existing `rscratch1` already, because of a conflict with `c_rarg2`) src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2674: > 2672: // Parameter order is (ptr, byteVal, size) > 2673: __ xchgq(c_rarg1, c_rarg2); > 2674: __ pop(rbp);// Clear effect of enter() Why not just use `leave()` here? src/hotspot/share/opto/library_call.cpp line 4959: > 4957: if (stopped()) return true; > 4958: > 4959: if (StubRoutines::unsafe_setmemory() == nullptr) return false; I don't see why this check is needed here, since we already check whether the stub is there in `is_intrinsic_supported`. Note that `inline_unsafe_copyMemory` also doesn't have this check. I think it would be good to keep consistency between the two. src/hotspot/share/opto/runtime.cpp line 780: > 778: const TypeFunc* OptoRuntime::make_setmemory_Type() { > 779: // create input type (domain) > 780: int num_args = 4; This variable seems redundant. src/hotspot/share/opto/runtime.cpp line 786: > 784: fields[argp++] = TypePtr::NOTNULL;// dest > 785: fields[argp++] = TypeLong::LONG; // size > 786: fields[argp++] = Type::HALF; // size Since the size is a `size_t`, I don't think this is correct on 32-bit platforms. I think here we want `TypeX_X`, and then add the extra `HALF` only on 64-bit platforms. Similar to what we do in `make_arraycopy_Type`: https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/runtime.cpp#L799-L802 (Note that you will also have to adjust `argcnt` for this) - PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572570842 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572578437 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572593795 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572556648 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572564382 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572562058
Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]
On Fri, 19 Apr 2024 13:25:33 GMT, Jatin Bhateja wrote: >> Scott Gibbons has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Add enter() and leave(); remove Windows-specific register stuff > > src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2611: > >> 2609: // Propagate byte to full Register >> 2610: __ movzbl(rScratch1, byteVal); >> 2611: __ mov64(wide_value, 0x0101010101010101); > > Long constant should be suffixed by ULL. Fixed. > test/micro/org/openjdk/bench/java/lang/foreign/MemorySegmentZeroUnsafe.java > line 1: > >> 1: package org.openjdk.bench.java.lang.foreign; > > Copyright header missing. Added - PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572502532 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572502204
Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]
On Tue, 16 Apr 2024 00:04:15 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See >> [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around >> this change. >> >> Overall, making this an intrinsic improves overall performance of >> `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and >> after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional > commit since the last revision: > > Add enter() and leave(); remove Windows-specific register stuff Hi @asgibbons Please add a new test / extend an existing test for SIGBUS violation testing test/hotspot/jtreg/runtime/Unsafe/InternalErrorTest.java src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2611: > 2609: // Propagate byte to full Register > 2610: __ movzbl(rScratch1, byteVal); > 2611: __ mov64(wide_value, 0x0101010101010101); Long constant should be suffixed by ULL. test/micro/org/openjdk/bench/java/lang/foreign/MemorySegmentZeroUnsafe.java line 1: > 1: package org.openjdk.bench.java.lang.foreign; Copyright header missing. - PR Review: https://git.openjdk.org/jdk/pull/18555#pullrequestreview-2011247585 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572370327 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572267154
Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]
On Tue, 16 Apr 2024 00:04:15 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See >> [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around >> this change. >> >> Overall, making this an intrinsic improves overall performance of >> `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and >> after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional > commit since the last revision: > > Add enter() and leave(); remove Windows-specific register stuff @vnkozlov Could you please review this PR from @asgibbons? Looking forward to your inputs. - PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2064852401
Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]
On Tue, 16 Apr 2024 00:04:15 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See >> [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around >> this change. >> >> Overall, making this an intrinsic improves overall performance of >> `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and >> after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional > commit since the last revision: > > Add enter() and leave(); remove Windows-specific register stuff I ran the benchmark again using a branch from openjdk today against this modification. Performance increased an average of 4.98x (max - 7.52x (unsafe 64-byte aligned), min - 3.02x (panama 255-byte aligned), stddev - 0.94x). I'll integrate after a second review has been done. Thanks! - PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2059637042
Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]
On Tue, 16 Apr 2024 00:04:15 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See >> [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around >> this change. >> >> Overall, making this an intrinsic improves overall performance of >> `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and >> after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional > commit since the last revision: > > Add enter() and leave(); remove Windows-specific register stuff PR looks good to me now. Thanks a lot for considering all the inputs. - Marked as reviewed by sviswanathan (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18555#pullrequestreview-2002553946
Re: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]
> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See > [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around > this change. > > Overall, making this an intrinsic improves overall performance of > `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after > numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). > > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Add enter() and leave(); remove Windows-specific register stuff - Changes: - all: https://git.openjdk.org/jdk/pull/18555/files - new: https://git.openjdk.org/jdk/pull/18555/files/113aa90f..7a1d67e5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=18555=20 - incr: https://webrevs.openjdk.org/?repo=jdk=18555=19-20 Stats: 38 lines in 1 file changed: 1 ins; 20 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555