Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v3]
On Tue, 26 Mar 2024 15:59:33 GMT, Damon Fenacci wrote: >> Yudi Zheng has updated the pull request incrementally with one additional >> commit since the last revision: >> >> address comment. > > `multiply_to_len` seems to be used by `generate_squareToLen` as well for > aarch64 and riscv but `zlen` is still passed in a register. > > https://github.com/openjdk/jdk/blob/870a6127cf54264c691f7322d775b202705c3bfa/src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp#L4710 > https://github.com/openjdk/jdk/blob/870a6127cf54264c691f7322d775b202705c3bfa/src/hotspot/cpu/riscv/stubGenerator_riscv.cpp#L2881 > > I think it might work anyway but it might be better to adapt them if only for > completeness. > @dafedafe @dean-long please take a look and let me know if there are further > issues, thanks! Thanks @mur47x111! I noticed that you found even a few more `zlen` usages Did you test the change against all affected platforms? - PR Comment: https://git.openjdk.org/jdk/pull/18226#issuecomment-2061301421
Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v3]
On Tue, 19 Mar 2024 21:09:31 GMT, Yudi Zheng wrote: >> Moving array construction within BigInteger.implMultiplyToLen intrinsic >> candidate to its caller simplifies the intrinsic implementation in JIT >> compiler. > > Yudi Zheng has updated the pull request incrementally with one additional > commit since the last revision: > > address comment. `multiply_to_len` seems to be used by `generate_squareToLen` as well for aarch64 and riscv but `zlen` is still passed in a register. https://github.com/openjdk/jdk/blob/870a6127cf54264c691f7322d775b202705c3bfa/src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp#L4710 https://github.com/openjdk/jdk/blob/870a6127cf54264c691f7322d775b202705c3bfa/src/hotspot/cpu/riscv/stubGenerator_riscv.cpp#L2881 I think it might work anyway but it might be better to adapt them if only for completeness. - PR Review: https://git.openjdk.org/jdk/pull/18226#pullrequestreview-1960906919
Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic
On Tue, 12 Mar 2024 10:44:54 GMT, Yudi Zheng wrote: > Moving array construction within BigInteger.implMultiplyToLen intrinsic > candidate to its caller simplifies the intrinsic implementation in JIT > compiler. Quite a simplification! Have you checked if there are any performance differences? - PR Review: https://git.openjdk.org/jdk/pull/18226#pullrequestreview-1943670073
Re: RFR: 8311906: Improve robustness of String constructors with mutable array inputs [v14]
On Thu, 30 Nov 2023 15:51:46 GMT, Roger Riggs wrote: >> Strings, after construction, are immutable but may be constructed from >> mutable arrays of bytes, characters, or integers. >> The string constructors should guard against the effects of mutating the >> arrays during construction that might invalidate internal invariants for the >> correct behavior of operations on the resulting strings. In particular, a >> number of operations have optimizations for operations on pairs of latin1 >> strings and pairs of non-latin1 strings, while operations between latin1 and >> non-latin1 strings use a more general implementation. >> >> The changes include: >> >> - Adding a warning to each constructor with an array as an argument to >> indicate that the results are indeterminate >> if the input array is modified before the constructor returns. >> The resulting string may contain any combination of characters sampled >> from the input array. >> >> - Ensure that strings that are represented as non-latin1 contain at least >> one non-latin1 character. >> For latin1 inputs, whether the arrays contain ASCII, ISO-8859-1, UTF8, or >> another encoding decoded to latin1 the scanning and compression is unchanged. >> If a non-latin1 character is found, the string is represented as >> non-latin1 with the added verification that a non-latin1 character is >> present at the same index. >> If that character is found to be latin1, then the input array has been >> modified and the result of the scan may be incorrect. >> Though a ConcurrentModificationException could be thrown, the risk to an >> existing application of an unexpected exception should be avoided. >> Instead, the non-latin1 copy of the input is re-scanned and compressed; >> that scan determines whether the latin1 or the non-latin1 representation is >> returned. >> >> - The methods that scan for non-latin1 characters and their intrinsic >> implementations are updated to return the index of the non-latin1 character. >> >> - String construction from StringBuilder and CharSequence must also be >> guarded as their contents may be modified during construction. > > Roger Riggs has updated the pull request incrementally with one additional > commit since the last revision: > > Correct jcc/jccb branches Intrinsics look OK to me. - Marked as reviewed by dfenacci (Committer). PR Review: https://git.openjdk.org/jdk/pull/16425#pullrequestreview-1761763069
Re: RFR: 8311906: Improve robustness of String constructors with mutable array inputs [v13]
On Mon, 27 Nov 2023 19:09:40 GMT, Roger Riggs wrote: >> Strings, after construction, are immutable but may be constructed from >> mutable arrays of bytes, characters, or integers. >> The string constructors should guard against the effects of mutating the >> arrays during construction that might invalidate internal invariants for the >> correct behavior of operations on the resulting strings. In particular, a >> number of operations have optimizations for operations on pairs of latin1 >> strings and pairs of non-latin1 strings, while operations between latin1 and >> non-latin1 strings use a more general implementation. >> >> The changes include: >> >> - Adding a warning to each constructor with an array as an argument to >> indicate that the results are indeterminate >> if the input array is modified before the constructor returns. >> The resulting string may contain any combination of characters sampled >> from the input array. >> >> - Ensure that strings that are represented as non-latin1 contain at least >> one non-latin1 character. >> For latin1 inputs, whether the arrays contain ASCII, ISO-8859-1, UTF8, or >> another encoding decoded to latin1 the scanning and compression is unchanged. >> If a non-latin1 character is found, the string is represented as >> non-latin1 with the added verification that a non-latin1 character is >> present at the same index. >> If that character is found to be latin1, then the input array has been >> modified and the result of the scan may be incorrect. >> Though a ConcurrentModificationException could be thrown, the risk to an >> existing application of an unexpected exception should be avoided. >> Instead, the non-latin1 copy of the input is re-scanned and compressed; >> that scan determines whether the latin1 or the non-latin1 representation is >> returned. >> >> - The methods that scan for non-latin1 characters and their intrinsic >> implementations are updated to return the index of the non-latin1 character. >> >> - String construction from StringBuilder and CharSequence must also be >> guarded as their contents may be modified during construction. > > Roger Riggs has updated the pull request incrementally with one additional > commit since the last revision: > > Use byte off branches in char_array_compress > Verified by manual tests with "-XX:AVX3Threshold=0" > And test in the PR > test/hotspot/jtreg/compiler/intrinsics/string/TestStringConstructionIntrinsics.java src/hotspot/cpu/x86/macroAssembler_x86.cpp line 8547: > 8545: // bail out when there is nothing to be done > 8546: testl(tmp5, 0x); > 8547: jcc(Assembler::zero, post_alignment); @RogerRiggs I think you changed the wrong line Suggestion: jccb(Assembler::zero, post_alignment); src/hotspot/cpu/x86/macroAssembler_x86.cpp line 8559: > 8557: evpcmpw(mask1, mask2, tmp1Reg, tmp2Reg, Assembler::le, /*signed*/ > false, Assembler::AVX_512bit); > 8558: ktestd(mask1, mask2); > 8559: jccb(Assembler::carryClear, copy_tail); Suggestion: jcc(Assembler::carryClear, copy_tail); (in this case the jump can potentially be long) - PR Review Comment: https://git.openjdk.org/jdk/pull/16425#discussion_r1410283240 PR Review Comment: https://git.openjdk.org/jdk/pull/16425#discussion_r1410284158
Re: RFR: 8311906: Improve robustness of String constructors with mutable array inputs [v12]
On Wed, 22 Nov 2023 05:03:41 GMT, Roger Riggs wrote: >> Strings, after construction, are immutable but may be constructed from >> mutable arrays of bytes, characters, or integers. >> The string constructors should guard against the effects of mutating the >> arrays during construction that might invalidate internal invariants for the >> correct behavior of operations on the resulting strings. In particular, a >> number of operations have optimizations for operations on pairs of latin1 >> strings and pairs of non-latin1 strings, while operations between latin1 and >> non-latin1 strings use a more general implementation. >> >> The changes include: >> >> - Adding a warning to each constructor with an array as an argument to >> indicate that the results are indeterminate >> if the input array is modified before the constructor returns. >> The resulting string may contain any combination of characters sampled >> from the input array. >> >> - Ensure that strings that are represented as non-latin1 contain at least >> one non-latin1 character. >> For latin1 inputs, whether the arrays contain ASCII, ISO-8859-1, UTF8, or >> another encoding decoded to latin1 the scanning and compression is unchanged. >> If a non-latin1 character is found, the string is represented as >> non-latin1 with the added verification that a non-latin1 character is >> present at the same index. >> If that character is found to be latin1, then the input array has been >> modified and the result of the scan may be incorrect. >> Though a ConcurrentModificationException could be thrown, the risk to an >> existing application of an unexpected exception should be avoided. >> Instead, the non-latin1 copy of the input is re-scanned and compressed; >> that scan determines whether the latin1 or the non-latin1 representation is >> returned. >> >> - The methods that scan for non-latin1 characters and their intrinsic >> implementations are updated to return the index of the non-latin1 character. >> >> - String construction from StringBuilder and CharSequence must also be >> guarded as their contents may be modified during construction. > > Roger Riggs has updated the pull request incrementally with one additional > commit since the last revision: > > Apply StringUTF16.coderFromArrayLen src/hotspot/cpu/x86/macroAssembler_x86.cpp line 8538: > 8536: > 8537: testl(len, -64); > 8538: jcc(Assembler::zero, post_alignment); @cl4es since some of the `jcc` instructions have been changed into `jccb` in the rest of the intrinsic, I was wondering if it would make sense do the same for the rest of them (where the jump is short): Suggestion: jccb(Assembler::zero, post_alignment); src/hotspot/cpu/x86/macroAssembler_x86.cpp line 8547: > 8545: // bail out when there is nothing to be done > 8546: testl(tmp5, 0x); > 8547: jcc(Assembler::zero, post_alignment); Suggestion: jccb(Assembler::zero, post_alignment); and here: https://github.com/openjdk/jdk/blob/d201344b631bf2cc9fb1990874fc7d42d523eeab/src/hotspot/cpu/x86/macroAssembler_x86.cpp#L8574 src/hotspot/cpu/x86/macroAssembler_x86.cpp line 8584: > 8582: evpcmpuw(mask1, tmp1Reg, tmp2Reg, Assembler::le, > Assembler::AVX_512bit); > 8583: kortestdl(mask1, mask1); > 8584: jcc(Assembler::carryClear, reset_for_copy_tail); Suggestion: jccb(Assembler::carryClear, reset_for_copy_tail); and here: https://github.com/openjdk/jdk/blob/d201344b631bf2cc9fb1990874fc7d42d523eeab/src/hotspot/cpu/x86/macroAssembler_x86.cpp#L8590 - PR Review Comment: https://git.openjdk.org/jdk/pull/16425#discussion_r1406147105 PR Review Comment: https://git.openjdk.org/jdk/pull/16425#discussion_r1406147739 PR Review Comment: https://git.openjdk.org/jdk/pull/16425#discussion_r1406178794
Re: RFR: 8311906: Improve robustness of String constructors with mutable array inputs
On Mon, 30 Oct 2023 18:34:44 GMT, Roger Riggs wrote: > Strings, after construction, are immutable but may be constructed from > mutable arrays of bytes, characters, or integers. > The string constructors should guard against the effects of mutating the > arrays during construction that might invalidate internal invariants for the > correct behavior of operations on the resulting strings. In particular, a > number of operations have optimizations for operations on pairs of latin1 > strings and pairs of non-latin1 strings, while operations between latin1 and > non-latin1 strings use a more general implementation. > > The changes include: > > - Adding a warning to each constructor with an array as an argument to > indicate that the results are indeterminate > if the input array is modified before the constructor returns. > The resulting string may contain any combination of characters sampled from > the input array. > > - Ensure that strings that are represented as non-latin1 contain at least one > non-latin1 character. > For latin1 inputs, whether the arrays contain ASCII, ISO-8859-1, UTF8, or > another encoding decoded to latin1 the scanning and compression is unchanged. > If a non-latin1 character is found, the string is represented as non-latin1 > with the added verification that a non-latin1 character is present at the > same index. > If that character is found to be latin1, then the input array has been > modified and the result of the scan may be incorrect. > Though a ConcurrentModificationException could be thrown, the risk to an > existing application of an unexpected exception should be avoided. > Instead, the non-latin1 copy of the input is re-scanned and compressed; > that scan determines whether the latin1 or the non-latin1 representation is > returned. > > - The methods that scan for non-latin1 characters and their intrinsic > implementations are updated to return the index of the non-latin1 character. > > - String construction from StringBuilder and CharSequence must also be > guarded as their contents may be modified during construction. It would also be good to have the **aarch64** and **x64** intrinsics properly reviewed. @theRealAph could I ask you to have a look at **aarch64** and @TobiHartmann at **x64** please? (you seem to be the last ones that made major changes in the intrinsics) - PR Comment: https://git.openjdk.org/jdk/pull/16425#issuecomment-1802251889
Re: RFR: 8311906: Improve robustness of String constructors with mutable array inputs
On Mon, 30 Oct 2023 18:34:44 GMT, Roger Riggs wrote: > Strings, after construction, are immutable but may be constructed from > mutable arrays of bytes, characters, or integers. > The string constructors should guard against the effects of mutating the > arrays during construction that might invalidate internal invariants for the > correct behavior of operations on the resulting strings. In particular, a > number of operations have optimizations for operations on pairs of latin1 > strings and pairs of non-latin1 strings, while operations between latin1 and > non-latin1 strings use a more general implementation. > > The changes include: > > - Adding a warning to each constructor with an array as an argument to > indicate that the results are indeterminate > if the input array is modified before the constructor returns. > The resulting string may contain any combination of characters sampled from > the input array. > > - Ensure that strings that are represented as non-latin1 contain at least one > non-latin1 character. > For latin1 inputs, whether the arrays contain ASCII, ISO-8859-1, UTF8, or > another encoding decoded to latin1 the scanning and compression is unchanged. > If a non-latin1 character is found, the string is represented as non-latin1 > with the added verification that a non-latin1 character is present at the > same index. > If that character is found to be latin1, then the input array has been > modified and the result of the scan may be incorrect. > Though a ConcurrentModificationException could be thrown, the risk to an > existing application of an unexpected exception should be avoided. > Instead, the non-latin1 copy of the input is re-scanned and compressed; > that scan determines whether the latin1 or the non-latin1 representation is > returned. > > - The methods that scan for non-latin1 characters and their intrinsic > implementations are updated to return the index of the non-latin1 character. > > - String construction from StringBuilder and CharSequence must also be > guarded as their contents may be modified during construction. This PR includes changes to the `C2_MacroAssembler::char_array_compress_v` intrinsic for **RISCV** that we couldn't test. @RealFYang could you please test and review it? Thanks a lot! - PR Comment: https://git.openjdk.org/jdk/pull/16425#issuecomment-1799144689