Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v3]

2024-04-17 Thread Damon Fenacci
On Tue, 26 Mar 2024 15:59:33 GMT, Damon Fenacci  wrote:

>> Yudi Zheng has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   address comment.
>
> `multiply_to_len` seems to be used by `generate_squareToLen` as well for 
> aarch64 and riscv but `zlen` is still passed in a register.
> 
> https://github.com/openjdk/jdk/blob/870a6127cf54264c691f7322d775b202705c3bfa/src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp#L4710
> https://github.com/openjdk/jdk/blob/870a6127cf54264c691f7322d775b202705c3bfa/src/hotspot/cpu/riscv/stubGenerator_riscv.cpp#L2881
> 
> I think it might work anyway but it might be better to adapt them if only for 
> completeness.

> @dafedafe @dean-long please take a look and let me know if there are further 
> issues, thanks!

Thanks @mur47x111! I noticed that you found even a few more `zlen` usages  

Did you test the change against all affected platforms?

-

PR Comment: https://git.openjdk.org/jdk/pull/18226#issuecomment-2061301421


Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v3]

2024-03-26 Thread Damon Fenacci
On Tue, 19 Mar 2024 21:09:31 GMT, Yudi Zheng  wrote:

>> Moving array construction within BigInteger.implMultiplyToLen intrinsic 
>> candidate to its caller simplifies the intrinsic implementation in JIT 
>> compiler.
>
> Yudi Zheng has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   address comment.

`multiply_to_len` seems to be used by `generate_squareToLen` as well for 
aarch64 and riscv but `zlen` is still passed in a register.

https://github.com/openjdk/jdk/blob/870a6127cf54264c691f7322d775b202705c3bfa/src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp#L4710
https://github.com/openjdk/jdk/blob/870a6127cf54264c691f7322d775b202705c3bfa/src/hotspot/cpu/riscv/stubGenerator_riscv.cpp#L2881

I think it might work anyway but it might be better to adapt them if only for 
completeness.

-

PR Review: https://git.openjdk.org/jdk/pull/18226#pullrequestreview-1960906919


Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic

2024-03-18 Thread Damon Fenacci
On Tue, 12 Mar 2024 10:44:54 GMT, Yudi Zheng  wrote:

> Moving array construction within BigInteger.implMultiplyToLen intrinsic 
> candidate to its caller simplifies the intrinsic implementation in JIT 
> compiler.

Quite a simplification!
Have you checked if there are any performance differences?

-

PR Review: https://git.openjdk.org/jdk/pull/18226#pullrequestreview-1943670073


Re: RFR: 8311906: Improve robustness of String constructors with mutable array inputs [v14]

2023-12-03 Thread Damon Fenacci
On Thu, 30 Nov 2023 15:51:46 GMT, Roger Riggs  wrote:

>> Strings, after construction, are immutable but may be constructed from 
>> mutable arrays of bytes, characters, or integers.
>> The string constructors should guard against the effects of mutating the 
>> arrays during construction that might invalidate internal invariants for the 
>> correct behavior of operations on the resulting strings. In particular, a 
>> number of operations have optimizations for operations on pairs of latin1 
>> strings and pairs of non-latin1 strings, while operations between latin1 and 
>> non-latin1 strings use a more general implementation. 
>> 
>> The changes include:
>> 
>> - Adding a warning to each constructor with an array as an argument to 
>> indicate that the results are indeterminate 
>>   if the input array is modified before the constructor returns. 
>>   The resulting string may contain any combination of characters sampled 
>> from the input array.
>> 
>> - Ensure that strings that are represented as non-latin1 contain at least 
>> one non-latin1 character.
>>   For latin1 inputs, whether the arrays contain ASCII, ISO-8859-1, UTF8, or 
>> another encoding decoded to latin1 the scanning and compression is unchanged.
>>   If a non-latin1 character is found, the string is represented as 
>> non-latin1 with the added verification that a non-latin1 character is 
>> present at the same index.
>>   If that character is found to be latin1, then the input array has been 
>> modified and the result of the scan may be incorrect.
>>   Though a ConcurrentModificationException could be thrown, the risk to an 
>> existing application of an unexpected exception should be avoided.
>>   Instead, the non-latin1 copy of the input is re-scanned and compressed; 
>> that scan determines whether the latin1 or the non-latin1 representation is 
>> returned.
>> 
>> - The methods that scan for non-latin1 characters and their intrinsic 
>> implementations are updated to return the index of the non-latin1 character.
>> 
>> - String construction from StringBuilder and CharSequence must also be 
>> guarded as their contents may be modified during construction.
>
> Roger Riggs has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Correct jcc/jccb branches

Intrinsics look OK to me.

-

Marked as reviewed by dfenacci (Committer).

PR Review: https://git.openjdk.org/jdk/pull/16425#pullrequestreview-1761763069


Re: RFR: 8311906: Improve robustness of String constructors with mutable array inputs [v13]

2023-11-30 Thread Damon Fenacci
On Mon, 27 Nov 2023 19:09:40 GMT, Roger Riggs  wrote:

>> Strings, after construction, are immutable but may be constructed from 
>> mutable arrays of bytes, characters, or integers.
>> The string constructors should guard against the effects of mutating the 
>> arrays during construction that might invalidate internal invariants for the 
>> correct behavior of operations on the resulting strings. In particular, a 
>> number of operations have optimizations for operations on pairs of latin1 
>> strings and pairs of non-latin1 strings, while operations between latin1 and 
>> non-latin1 strings use a more general implementation. 
>> 
>> The changes include:
>> 
>> - Adding a warning to each constructor with an array as an argument to 
>> indicate that the results are indeterminate 
>>   if the input array is modified before the constructor returns. 
>>   The resulting string may contain any combination of characters sampled 
>> from the input array.
>> 
>> - Ensure that strings that are represented as non-latin1 contain at least 
>> one non-latin1 character.
>>   For latin1 inputs, whether the arrays contain ASCII, ISO-8859-1, UTF8, or 
>> another encoding decoded to latin1 the scanning and compression is unchanged.
>>   If a non-latin1 character is found, the string is represented as 
>> non-latin1 with the added verification that a non-latin1 character is 
>> present at the same index.
>>   If that character is found to be latin1, then the input array has been 
>> modified and the result of the scan may be incorrect.
>>   Though a ConcurrentModificationException could be thrown, the risk to an 
>> existing application of an unexpected exception should be avoided.
>>   Instead, the non-latin1 copy of the input is re-scanned and compressed; 
>> that scan determines whether the latin1 or the non-latin1 representation is 
>> returned.
>> 
>> - The methods that scan for non-latin1 characters and their intrinsic 
>> implementations are updated to return the index of the non-latin1 character.
>> 
>> - String construction from StringBuilder and CharSequence must also be 
>> guarded as their contents may be modified during construction.
>
> Roger Riggs has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Use byte off branches in char_array_compress
>   Verified by manual tests with "-XX:AVX3Threshold=0"
>   And test in the PR 
> test/hotspot/jtreg/compiler/intrinsics/string/TestStringConstructionIntrinsics.java

src/hotspot/cpu/x86/macroAssembler_x86.cpp line 8547:

> 8545: // bail out when there is nothing to be done
> 8546: testl(tmp5, 0x);
> 8547: jcc(Assembler::zero, post_alignment);

@RogerRiggs I think you changed the wrong line  
Suggestion:

jccb(Assembler::zero, post_alignment);

src/hotspot/cpu/x86/macroAssembler_x86.cpp line 8559:

> 8557: evpcmpw(mask1, mask2, tmp1Reg, tmp2Reg, Assembler::le, /*signed*/ 
> false, Assembler::AVX_512bit);
> 8558: ktestd(mask1, mask2);
> 8559: jccb(Assembler::carryClear, copy_tail);

Suggestion:

jcc(Assembler::carryClear, copy_tail);

(in this case the jump can potentially be long)

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16425#discussion_r1410283240
PR Review Comment: https://git.openjdk.org/jdk/pull/16425#discussion_r1410284158


Re: RFR: 8311906: Improve robustness of String constructors with mutable array inputs [v12]

2023-11-27 Thread Damon Fenacci
On Wed, 22 Nov 2023 05:03:41 GMT, Roger Riggs  wrote:

>> Strings, after construction, are immutable but may be constructed from 
>> mutable arrays of bytes, characters, or integers.
>> The string constructors should guard against the effects of mutating the 
>> arrays during construction that might invalidate internal invariants for the 
>> correct behavior of operations on the resulting strings. In particular, a 
>> number of operations have optimizations for operations on pairs of latin1 
>> strings and pairs of non-latin1 strings, while operations between latin1 and 
>> non-latin1 strings use a more general implementation. 
>> 
>> The changes include:
>> 
>> - Adding a warning to each constructor with an array as an argument to 
>> indicate that the results are indeterminate 
>>   if the input array is modified before the constructor returns. 
>>   The resulting string may contain any combination of characters sampled 
>> from the input array.
>> 
>> - Ensure that strings that are represented as non-latin1 contain at least 
>> one non-latin1 character.
>>   For latin1 inputs, whether the arrays contain ASCII, ISO-8859-1, UTF8, or 
>> another encoding decoded to latin1 the scanning and compression is unchanged.
>>   If a non-latin1 character is found, the string is represented as 
>> non-latin1 with the added verification that a non-latin1 character is 
>> present at the same index.
>>   If that character is found to be latin1, then the input array has been 
>> modified and the result of the scan may be incorrect.
>>   Though a ConcurrentModificationException could be thrown, the risk to an 
>> existing application of an unexpected exception should be avoided.
>>   Instead, the non-latin1 copy of the input is re-scanned and compressed; 
>> that scan determines whether the latin1 or the non-latin1 representation is 
>> returned.
>> 
>> - The methods that scan for non-latin1 characters and their intrinsic 
>> implementations are updated to return the index of the non-latin1 character.
>> 
>> - String construction from StringBuilder and CharSequence must also be 
>> guarded as their contents may be modified during construction.
>
> Roger Riggs has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Apply StringUTF16.coderFromArrayLen

src/hotspot/cpu/x86/macroAssembler_x86.cpp line 8538:

> 8536: 
> 8537: testl(len, -64);
> 8538: jcc(Assembler::zero, post_alignment);

@cl4es since some of the `jcc` instructions have been changed into `jccb` in 
the rest of the intrinsic, I was wondering if it would make sense do the same 
for the rest of them (where the jump is short):

Suggestion:

jccb(Assembler::zero, post_alignment);

src/hotspot/cpu/x86/macroAssembler_x86.cpp line 8547:

> 8545: // bail out when there is nothing to be done
> 8546: testl(tmp5, 0x);
> 8547: jcc(Assembler::zero, post_alignment);

Suggestion:

jccb(Assembler::zero, post_alignment);


and here:
https://github.com/openjdk/jdk/blob/d201344b631bf2cc9fb1990874fc7d42d523eeab/src/hotspot/cpu/x86/macroAssembler_x86.cpp#L8574

src/hotspot/cpu/x86/macroAssembler_x86.cpp line 8584:

> 8582: evpcmpuw(mask1, tmp1Reg, tmp2Reg, Assembler::le, 
> Assembler::AVX_512bit);
> 8583: kortestdl(mask1, mask1);
> 8584: jcc(Assembler::carryClear, reset_for_copy_tail);

Suggestion:

jccb(Assembler::carryClear, reset_for_copy_tail);


and here:

https://github.com/openjdk/jdk/blob/d201344b631bf2cc9fb1990874fc7d42d523eeab/src/hotspot/cpu/x86/macroAssembler_x86.cpp#L8590

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16425#discussion_r1406147105
PR Review Comment: https://git.openjdk.org/jdk/pull/16425#discussion_r1406147739
PR Review Comment: https://git.openjdk.org/jdk/pull/16425#discussion_r1406178794


Re: RFR: 8311906: Improve robustness of String constructors with mutable array inputs

2023-11-08 Thread Damon Fenacci
On Mon, 30 Oct 2023 18:34:44 GMT, Roger Riggs  wrote:

> Strings, after construction, are immutable but may be constructed from 
> mutable arrays of bytes, characters, or integers.
> The string constructors should guard against the effects of mutating the 
> arrays during construction that might invalidate internal invariants for the 
> correct behavior of operations on the resulting strings. In particular, a 
> number of operations have optimizations for operations on pairs of latin1 
> strings and pairs of non-latin1 strings, while operations between latin1 and 
> non-latin1 strings use a more general implementation. 
> 
> The changes include:
> 
> - Adding a warning to each constructor with an array as an argument to 
> indicate that the results are indeterminate 
>   if the input array is modified before the constructor returns. 
>   The resulting string may contain any combination of characters sampled from 
> the input array.
> 
> - Ensure that strings that are represented as non-latin1 contain at least one 
> non-latin1 character.
>   For latin1 inputs, whether the arrays contain ASCII, ISO-8859-1, UTF8, or 
> another encoding decoded to latin1 the scanning and compression is unchanged.
>   If a non-latin1 character is found, the string is represented as non-latin1 
> with the added verification that a non-latin1 character is present at the 
> same index.
>   If that character is found to be latin1, then the input array has been 
> modified and the result of the scan may be incorrect.
>   Though a ConcurrentModificationException could be thrown, the risk to an 
> existing application of an unexpected exception should be avoided.
>   Instead, the non-latin1 copy of the input is re-scanned and compressed; 
> that scan determines whether the latin1 or the non-latin1 representation is 
> returned.
> 
> - The methods that scan for non-latin1 characters and their intrinsic 
> implementations are updated to return the index of the non-latin1 character.
> 
> - String construction from StringBuilder and CharSequence must also be 
> guarded as their contents may be modified during construction.

It would also be good to have the **aarch64** and **x64** intrinsics properly 
reviewed. @theRealAph could I ask you to have a look at **aarch64** and 
@TobiHartmann at **x64** please? (you seem to be the last ones that made major 
changes in the intrinsics)

-

PR Comment: https://git.openjdk.org/jdk/pull/16425#issuecomment-1802251889


Re: RFR: 8311906: Improve robustness of String constructors with mutable array inputs

2023-11-07 Thread Damon Fenacci
On Mon, 30 Oct 2023 18:34:44 GMT, Roger Riggs  wrote:

> Strings, after construction, are immutable but may be constructed from 
> mutable arrays of bytes, characters, or integers.
> The string constructors should guard against the effects of mutating the 
> arrays during construction that might invalidate internal invariants for the 
> correct behavior of operations on the resulting strings. In particular, a 
> number of operations have optimizations for operations on pairs of latin1 
> strings and pairs of non-latin1 strings, while operations between latin1 and 
> non-latin1 strings use a more general implementation. 
> 
> The changes include:
> 
> - Adding a warning to each constructor with an array as an argument to 
> indicate that the results are indeterminate 
>   if the input array is modified before the constructor returns. 
>   The resulting string may contain any combination of characters sampled from 
> the input array.
> 
> - Ensure that strings that are represented as non-latin1 contain at least one 
> non-latin1 character.
>   For latin1 inputs, whether the arrays contain ASCII, ISO-8859-1, UTF8, or 
> another encoding decoded to latin1 the scanning and compression is unchanged.
>   If a non-latin1 character is found, the string is represented as non-latin1 
> with the added verification that a non-latin1 character is present at the 
> same index.
>   If that character is found to be latin1, then the input array has been 
> modified and the result of the scan may be incorrect.
>   Though a ConcurrentModificationException could be thrown, the risk to an 
> existing application of an unexpected exception should be avoided.
>   Instead, the non-latin1 copy of the input is re-scanned and compressed; 
> that scan determines whether the latin1 or the non-latin1 representation is 
> returned.
> 
> - The methods that scan for non-latin1 characters and their intrinsic 
> implementations are updated to return the index of the non-latin1 character.
> 
> - String construction from StringBuilder and CharSequence must also be 
> guarded as their contents may be modified during construction.

This PR includes changes to the `C2_MacroAssembler::char_array_compress_v` 
intrinsic for **RISCV** that we couldn't test. @RealFYang could you please test 
and review it? Thanks a lot!

-

PR Comment: https://git.openjdk.org/jdk/pull/16425#issuecomment-1799144689