Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v18]
On Wed, 15 May 2024 19:41:58 GMT, Volodymyr Paprotski wrote:

>> Scott Gibbons has updated the pull request with a new target base due to a
>> merge or a rebase. The pull request now contains 50 commits:
>>
>>  - Merge remote-tracking branch 'origin/master' into indexof
>>  - Move arrays_equals back to c2_MacroAssembler
>>  - Merge branch 'openjdk:master' into indexof
>>  - Remove infinite loop (used for debugging)
>>  - Merge branch 'openjdk:master' into indexof
>>  - Cleaned up, ready for review
>>  - Pre-cleanup code
>>  - Add JMH. Add 16-byte compares to arrays_equals
>>  - Better method for mask creation
>>  - Merge branch 'openjdk:master' into indexof
>>  - ... and 40 more: https://git.openjdk.org/jdk/compare/b20fa7b4...f52d281d
>
> test/jdk/java/lang/StringBuffer/IndexOf.java line 81:
>
>> 79:     String shs = (new String((hs_charset == StandardCharsets.UTF_16) ? haystack_16 : haystack)).substring(0, haystackSize);
>> 80:
>> 81:     shs = "$&),,18+-!'8)+";
>
> Should really keep the original test unmodified and add new tests as needed

The test functionality was not changed. I just added printing of information when a failure occurs.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1613914184
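
For readers following the exchange above, here is a minimal sketch of what "printing of information when a failure occurs" can look like in an indexOf test. It is not the code in test/jdk/java/lang/StringBuffer/IndexOf.java; the class name IndexOfDiagnostics, the helpers referenceIndexOf and check, the random-input loop, and the message format are all hypothetical, chosen only for illustration.

```java
import java.util.Random;

// Hypothetical illustration only; not taken from the JDK test sources.
final class IndexOfDiagnostics {

    // Naive reference search used as the oracle; returns -1 if needle is absent.
    static int referenceIndexOf(String haystack, String needle) {
        int limit = haystack.length() - needle.length();
        for (int i = 0; i <= limit; i++) {
            if (haystack.regionMatches(i, needle, 0, needle.length())) {
                return i;
            }
        }
        return -1;
    }

    // Returns null when String.indexOf agrees with the oracle, otherwise a
    // message carrying everything needed to reproduce the mismatch.
    static String check(String haystack, String needle) {
        int expected = referenceIndexOf(haystack, needle);
        int actual = haystack.indexOf(needle);
        if (expected == actual) {
            return null;
        }
        return String.format(
            "indexOf mismatch: haystack=\"%s\" (len %d), needle=\"%s\" (len %d), expected %d, got %d",
            haystack, haystack.length(), needle, needle.length(), expected, actual);
    }

    // Small alphabet so that needles actually occur in haystacks reasonably often.
    static String randomString(Random rnd, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + rnd.nextInt(3)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Random rnd = new Random(42); // fixed seed so a failure can be replayed
        int failures = 0;
        for (int iter = 0; iter < 10_000; iter++) {
            String haystack = randomString(rnd, rnd.nextInt(64));
            String needle = randomString(rnd, rnd.nextInt(5));
            String failure = check(haystack, needle);
            if (failure != null) {
                System.err.println(failure);
                failures++;
            }
        }
        System.out.println("failures: " + failures);
    }
}
```

Returning the message from check rather than throwing immediately lets a run report every failing haystack/needle pair, which is usually what one wants when chasing an intrinsic bug that only shows up for particular lengths or alignments.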
Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v18]
> Re-write the IndexOf code without the use of the pcmpestri instruction, only
> using AVX2 instructions. This change accelerates String.IndexOf on average
> 1.3x for AVX2. The benchmark numbers:
>
> Benchmark                                Score       Latest      Ratio
> StringIndexOf.advancedWithMediumSub      343.573     317.934     0.925375393x
> StringIndexOf.advancedWithShortSub1      1039.081    1053.96     1.014319384x
> StringIndexOf.advancedWithShortSub2      55.828      110.541     1.980027943x
> StringIndexOf.constantPattern            9.361       11.906      1.271872663x
> StringIndexOf.searchCharLongSuccess      4.216       4.218       1.000474383x
> StringIndexOf.searchCharMediumSuccess    3.133       3.216       1.02649218x
> StringIndexOf.searchCharShortSuccess     3.76        3.761       1.000265957x
> StringIndexOf.success                    9.186       9.713       1.057369911x
> StringIndexOf.successBig                 14.341      46.343      3.231504079x
> StringIndexOfChar.latin1_AVX2_String     6220.918    12154.52    1.953814533x
> StringIndexOfChar.latin1_AVX2_char       5503.556    5540.044    1.006629895x
> StringIndexOfChar.latin1_SSE4_String     6978.854    6818.689    0.977049957x
> StringIndexOfChar.latin1_SSE4_char       5657.499    5474.624    0.967675646x
> StringIndexOfChar.latin1_Short_String    7132.541    6863.359    0.962260014x
> StringIndexOfChar.latin1_Short_char      16013.389   16162.437   1.009307711x
> StringIndexOfChar.latin1_mixed_String    7386.123    14771.622   1.15517x
> StringIndexOfChar.latin1_mixed_char      9901.671    9782.245    0.987938803

Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 50 commits:

 - Merge remote-tracking branch 'origin/master' into indexof
 - Move arrays_equals back to c2_MacroAssembler
 - Merge branch 'openjdk:master' into indexof
 - Remove infinite loop (used for debugging)
 - Merge branch 'openjdk:master' into indexof
 - Cleaned up, ready for review
 - Pre-cleanup code
 - Add JMH. Add 16-byte compares to arrays_equals
 - Better method for mask creation
 - Merge branch 'openjdk:master' into indexof
 - ... and 40 more: https://git.openjdk.org/jdk/compare/b20fa7b4...f52d281d

-------------

Changes: https://git.openjdk.org/jdk/pull/16753/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16753&range=17
  Stats: 4345 lines in 17 files changed: 4183 ins; 26 del; 136 mod
  Patch: https://git.openjdk.org/jdk/pull/16753.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16753/head:pull/16753

PR: https://git.openjdk.org/jdk/pull/16753
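
As a reading aid for the benchmark numbers quoted in this message: the trailing figure on each row appears to be the Latest value divided by the Score value. That interpretation is an inference from the numbers, not something the message states, and the message does not say what units the scores are in. A small sketch that recomputes the figure for three rows copied from the table (the class name BenchmarkRatio and the method ratio are hypothetical):

```java
// Hypothetical illustration; assumes the last column is Latest / Score.
final class BenchmarkRatio {

    // Ratio of the "Latest" measurement to the "Score" (baseline) measurement.
    static double ratio(double score, double latest) {
        return latest / score;
    }

    public static void main(String[] args) {
        // Three rows copied from the quoted table.
        String[] names = {
            "StringIndexOf.advancedWithMediumSub",
            "StringIndexOf.constantPattern",
            "StringIndexOf.success",
        };
        double[] score  = {343.573,  9.361, 9.186};
        double[] latest = {317.934, 11.906, 9.713};

        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-40s %.9fx%n", names[i], ratio(score[i], latest[i]));
        }
    }
}
```

Under that assumption, a figure above 1 means Latest is numerically larger than Score; whether that is an improvement depends on the benchmark mode (throughput or time per operation), which the message leaves unstated.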