On Tue, 28 May 2024 12:48:19 GMT, Sandhya Viswanathan 
<sviswanat...@openjdk.org> wrote:

>> Scott Gibbons has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Fix tests
>
> src/hotspot/cpu/x86/c2_stubGenerator_x86_64_string.cpp line 239:
> 
>> 237:   //  the needle size is less than 32 bytes, we default to a
>> 238:   //  byte-by-byte comparison (this will be rare).
>> 239:   //
> 
> Is this still true?

Yes.  For UL, the code within `L_compareFull` effectively does byte-by-byte.

> src/hotspot/cpu/x86/c2_stubGenerator_x86_64_string.cpp line 278:
> 
>> 276:   __ bind(L_nextCheck);
>> 277:   __ testq(haystack_len_p, haystack_len_p);
>> 278:   __ je(L_zeroCheckFailed);
> 
> This check could be removed as the next check covers this one.

No.  This is checking for a zero length haystack.  The following compare checks 
for needle length longer than haystack, regardless of the value in each.  The 
comparison is signed, so a haystack length of 0 with a needle length of -1 will 
pass the following test and assume validity.

> src/hotspot/cpu/x86/c2_stubGenerator_x86_64_string.cpp line 360:
> 
>> 358:   __ push(rcx);
>> 359:   __ push(r8);
>> 360:   __ push(r9);
> 
> No need to save/restore rcx/r8/r9 on windows platform as well.

OK.

> src/hotspot/cpu/x86/c2_stubGenerator_x86_64_string.cpp line 379:
> 
>> 377: 
>> 378:   // Assume failure
>> 379:   __ movq(rbp, -1);
> 
> We are no more using rbp at return point so this is not needed now?

Removed.

> src/hotspot/cpu/x86/c2_stubGenerator_x86_64_string.cpp line 488:
> 
>> 486:   __ cmpq(r11, nMinusK);
>> 487:   __ ja_b(L_return);
>> 488:   __ movq(rax, r11);
> 
> At places where we know that return value in r11 is correct, we dont need to 
> checkRange so this could have its own label.

I don't want to change this because its reason for existence is to ensure we 
don't return a value that's beyond the end of the haystack.  We don't yet have 
a good enough test to validate whether we're reading past the end of the 
haystack, so I like this as insurance.

> src/hotspot/cpu/x86/c2_stubGenerator_x86_64_string.cpp line 566:
> 
>> 564:     //  rbp: -1
>> 565:     //  XMM_BYTE_0 - first element of needle broadcast
>> 566:     //  XMM_BYTE_K - last element of needle broadcast
> 
> The only registers that are used as input in the switch case are:
> r14 = needle
> rbx = haystack
> rsi = haystack length (n)
> r12 = needle length (k)
> r10 = n - k (where k is needle length)
> XMM_BYTE_0 = first element of needle, broadcast
> XMM_BYTE_K = last element of needle, broadcast
> So we could only list these, making it easier to comprehend.

I listed these registers to make it clear which registers had no expected value 
and could be used for temps, etc.

> src/hotspot/cpu/x86/c2_stubGenerator_x86_64_string.cpp line 578:
> 
>> 576:     // helper jumps to L_checkRangeAndReturn with a (-1) return value.
>> 577:     big_case_loop_helper(false, 0, L_checkRangeAndReturn, L_loopTop, 
>> mask, hsPtrRet, needleLen,
>> 578:                          needle, haystack, hsLength, tmp1, tmp2, tmp3, 
>> rScratch, ae, _masm);
> 
> If we run out of haystack instead of jumping to L_checkRangeAndReturn, we 
> could directly jump to  L_retrunError.

Again, I think we ought to leave this in.  Although it executes ~3 instructions 
that may not be necessary in some cases I think it's best to perform the check. 
 Once we have a good enough test to check reading past the end of the haystack 
we can change it.

> src/hotspot/cpu/x86/c2_stubGenerator_x86_64_string.cpp line 597:
> 
>> 595: 
>> 596:     // Need a lot of registers here to preserve state across 
>> arrays_equals call
>> 597: 
> 
> This comment is no longer valid, could be removed.

OK

> src/hotspot/cpu/x86/c2_stubGenerator_x86_64_string.cpp line 621:
> 
>> 619:     __ addq(hsPtrRet, index);
>> 620:     __ movq(r11, hsPtrRet);
>> 621:     __ jmp(L_checkRangeAndReturn);
> 
> Why do we have to checkRange here, would it not be always correct? It so we 
> could return r11 directly (by moving into rax).

There are cases where r11 could have a value that, when added to (k - 1) would 
go past the end of the haystack.  I did all in my power to ensure that it 
doesn't but there's no test I know of to ensure that condition.  I would 
recommend leaving this in for now.

> src/hotspot/cpu/x86/c2_stubGenerator_x86_64_string.cpp line 660:
> 
>> 658:   //  Haystack always copied to stack, so 32-byte reads OK
>> 659:   //  Haystack length < 32
>> 660:   //  10 < needle length < 32
> 
> Haystack length <= 32
> 10 < needle length <= 32

Changed.

> src/hotspot/cpu/x86/c2_stubGenerator_x86_64_string.cpp line 721:
> 
>> 719:                        false /* char */, knoreg);
>> 720:     __ testl(rTmp3, rTmp3);
>> 721:     __ jne(L_checkRangeAndReturn);
> 
> Why do we have to checkRange here, would it not be always correct? It so we 
> could return r11 directly (by moving into rax).

OK

> src/hotspot/cpu/x86/c2_stubGenerator_x86_64_string.cpp line 1333:
> 
>> 1331: 
>> 1332:   __ cmpq(nMinusK, 32);
>> 1333:   __ jae_b(L_greaterThan32);
> 
> Should this check be (n-k+1) >= 32? And so accordingly (n-k) >= 31
>  __ cmpq(nMinusK, 31);
>  __ jae_b(L_greaterThan32);

No. For (n-k)==32 we can do full reads.  I'll clarify by changing the label 
name.

> src/hotspot/cpu/x86/c2_stubGenerator_x86_64_string.cpp line 1382:
> 
>> 1380: 
>> 1381:   __ testl(eq_mask, eq_mask);
>> 1382:   __ je(noMatch);
> 
> We are mixing operation width l and q here.

Fixed.

> src/hotspot/cpu/x86/c2_stubGenerator_x86_64_string.cpp line 1750:
> 
>> 1748:     //  r15 = unused
>> 1749:     //  XMM_BYTE_0 - first element of needle, broadcast
>> 1750:     //  XMM_BYTE_K - last element of needle, broadcast
> 
> This comment is duplicated for both small haystack case and big haystack 
> case, could be made a common comment. 
> Also the only registers that are used as input in the switch case are:
> r14 = needle
> rbx = haystack
> rsi = haystack length (n)
> r12 = needle length (k)
> r10 = n - k (where k is needle length)
> XMM_BYTE_0 = first element of needle, broadcast
> XMM_BYTE_K = last element of needle, broadcast
> So we could only list these, making it easier to comprehend.

I listed all registers for clarity.  This ensures that we know what can be used 
as values or as scratch registers with no ambiguity.

> src/hotspot/cpu/x86/c2_stubGenerator_x86_64_string.cpp line 1758:
> 
>> 1756:     //
>> 1757:     // If a match is found, jump to L_checkRange, which ensures the
>> 1758:     // matched needle is not past the end of the haystack.
> 
> Another comment here would be useful:
> // The found index is returned in set_bit (r11).

Added.

> src/hotspot/cpu/x86/c2_stubGenerator_x86_64_string.cpp line 1810:
> 
>> 1808:     //  XMM_BYTE_K - last element of needle, broadcast
>> 1809:     //
>> 1810:     //  The haystack is > 32 bytes
> 
> Good to mention some info about the return found index value in comment about 
> how it is a combination of set_bit (r8), hs_ptr, and haystack.

Done.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1617663227
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1617667775
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1617669103
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1617671612
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1617673870
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1617680570
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1617699879
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1617700813
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1617704836
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1617705505
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1617711973
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1617713299
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1617714825
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1617716598
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1617717873
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1617726261

Reply via email to