On Sat, 4 May 2024 19:35:21 GMT, Scott Gibbons <sgibb...@openjdk.org> wrote:
>> Re-write the IndexOf code without the use of the pcmpestri instruction, only >> using AVX2 instructions. This change accelerates String.IndexOf on average >> 1.3x for AVX2. The benchmark numbers: >> >> >> Benchmark Score >> Latest >> StringIndexOf.advancedWithMediumSub 343.573 317.934 >> 0.925375393x >> StringIndexOf.advancedWithShortSub1 1039.081 1053.96 >> 1.014319384x >> StringIndexOf.advancedWithShortSub2 55.828 110.541 >> 1.980027943x >> StringIndexOf.constantPattern 9.361 11.906 >> 1.271872663x >> StringIndexOf.searchCharLongSuccess 4.216 4.218 >> 1.000474383x >> StringIndexOf.searchCharMediumSuccess 3.133 3.216 >> 1.02649218x >> StringIndexOf.searchCharShortSuccess 3.76 3.761 >> 1.000265957x >> StringIndexOf.success 9.186 >> 9.713 1.057369911x >> StringIndexOf.successBig 14.341 46.343 >> 3.231504079x >> StringIndexOfChar.latin1_AVX2_String 6220.918 12154.52 >> 1.953814533x >> StringIndexOfChar.latin1_AVX2_char 5503.556 5540.044 >> 1.006629895x >> StringIndexOfChar.latin1_SSE4_String 6978.854 6818.689 >> 0.977049957x >> StringIndexOfChar.latin1_SSE4_char 5657.499 5474.624 >> 0.967675646x >> StringIndexOfChar.latin1_Short_String 7132.541 >> 6863.359 0.962260014x >> StringIndexOfChar.latin1_Short_char 16013.389 16162.437 >> 1.009307711x >> StringIndexOfChar.latin1_mixed_String 7386.123 14771.622 >> 1.999915517x >> StringIndexOfChar.latin1_mixed_char 9901.671 9782.245 >> 0.987938803 > > Scott Gibbons has updated the pull request incrementally with one additional > commit since the last revision: > > Rearrange; add lambdas for clarity First pass at StringIndexOfHuge.java and IndexOf.java test/jdk/java/lang/StringBuffer/IndexOf.java line 40: > 38: private static boolean failure = false; > 39: public static void main(String[] args) throws Exception { > 40: String testName = "IndexOf"; intentation test/jdk/java/lang/StringBuffer/IndexOf.java line 47: > 45: char[] haystack_16 = new char[128]; > 46: > 47: for (int i = 0; i < 128; i++) { you can use `char` instead of `int` as iterator test/jdk/java/lang/StringBuffer/IndexOf.java line 54: > 52: // for (int i = 1; i < 128; i++) { > 53: // haystack_16[i] = (char) (i); > 54: // } dead code test/jdk/java/lang/StringBuffer/IndexOf.java line 64: > 62: Charset hs_charset = StandardCharsets.UTF_16; > 63: Charset needleCharset = StandardCharsets.ISO_8859_1; > 64: // Charset needleCharset = StandardCharsets.UTF_16; Move from main() into a function that takes `needleCharset` as a parameter, then call that function twice. test/jdk/java/lang/StringBuffer/IndexOf.java line 81: > 79: sourceBuffer = new StringBuffer(sourceString); > 80: targetString = generateTestString(10, 11); > 81: } while (sourceString.indexOf(targetString) != -1); Should really keep the original test unmodified and add new tests as needed test/jdk/java/lang/StringBuffer/IndexOf.java line 83: > 81: shs = "$&),,18+-!'8)+"; > 82: endNeedle = "8)-"; > 83: l_offset = 9; dead code test/jdk/java/lang/StringBuffer/IndexOf.java line 89: > 87: StringBuffer bshs = new StringBuffer(shs); > 88: > 89: // printStringBytes(shs.getBytes(hs_charset)); dead code (and next two comments) test/jdk/java/lang/StringBuffer/IndexOf.java line 90: > 88: > 89: // printStringBytes(shs.getBytes(hs_charset)); > 90: for (int i = 0; i < 200000; i++) { This wont be a deterministic way to reach the intrinsic. I would suggest copying the idea from test/jdk/com/sun/crypto/provider/Cipher/ChaCha20/unittest/Poly1305UnitTestDriver.java i.e. Have two `@run main` invocations at the top of this file, one with default parameters, one with `-Xcomp -XX:-TieredCompilation`. You dont need a 'driver' program, that was to handle something else. /* * @test * @modules java.base/com.sun.crypto.provider * @run main java.base/com.sun.crypto.provider.Poly1305KAT * @summary Unit test for com.sun.crypto.provider.Poly1305. */ /* * @test * @modules java.base/com.sun.crypto.provider * @summary Unit test for IntrinsicCandidate in com.sun.crypto.provider.Poly1305. * @run main/othervm -Xcomp -XX:-TieredCompilation -XX:+UnlockDiagnosticVMOptions -XX:+ForceUnreachable java.base/com.sun.crypto.provider.Poly1305KAT */ test/jdk/java/lang/StringBuffer/IndexOf.java line 126: > 124: int aNewLength = getRandomIndex(min, max); > 125: for (int y = 0; y < aNewLength; y++) { > 126: int achar = generator.nextInt(30) + 30; This will only ever generate LL cases, i.e. chars from [30,60]. Could be parametrized to also produce utf16 if instead of 30, offset was in the unicode range test/jdk/java/lang/StringBuffer/IndexOf.java line 199: > 197: > System.out.println("Source="+sourceString.substring(hsBegin, hsBegin + > haystackLen)); > 198: > System.out.println("Target="+targetString.substring(nBegin, nBegin + > needleLen)); > 199: System.out.println("haystackLen="+haystackLen+" > neeldeLen="+needleLen+" hsBegin="+hsBegin+" nBegin="+nBegin+ This looks like 'development scaffolding' (i.e. printf debugging) that was meant to be removed test/jdk/java/lang/StringBuffer/IndexOf.java line 237: > 235: + sourceBuffer.toString() + " len Buffer = " + > sourceBuffer.toString().length()); > 236: System.err.println(" naive = " + > naiveFind(sourceBuffer.toString(), targetString, 0) + ", IndexOf = " > 237: + sourceBuffer.indexOf(targetString)); More tracing left behind here and rest of this function (original just recorded failure and moved along) test/jdk/java/lang/StringBuffer/IndexOf.java line 284: > 282: > 283: // Note: it is possible although highly improbable that failCount will > 284: // be > 0 even if everthing is working ok This sounds like either a bug or a testcase bug? Same as line 301, `extremely remote possibility of > 1 match`? test/jdk/java/lang/StringBuffer/IndexOf.java line 295: > 293: sourceString = generateTestString(99, 100); > 294: sourceBuffer = new StringBuffer(sourceString); > 295: targetString = generateTestString(10, 11); Generate a random int [0,1,2] for LL, UU, UL, pass that as parameter to generateTestString() to test the other paths. Same for other tests in this file using this pattern. This test is specific to haystacklen=100, needlelen=10.. what about other haystack/needle sizes to exercise all the paths in the intrinsic assembler (i.e. haystack >=, <=32, needlelen ={1,2,3,4,5..32..}). Elsewhere already? test/jdk/java/lang/StringBuffer/IndexOf.java line 360: > 358: System.err.println(" sAnswer = " + sAnswer + ", sbAnswer = " + > sbAnswer); > 359: System.err.println(" testString = '" + testString + "'"); > 360: System.err.println(" testBuffer = '" + testBuffer + "'"); tracing left here and further down test/micro/org/openjdk/bench/java/lang/StringIndexOfHuge.java line 2: > 1: /* > 2: * Copyright (c) 2014, 2024, Oracle and/or its affiliates. All rights > reserved. New file, just 2024 test/micro/org/openjdk/bench/java/lang/StringIndexOfHuge.java line 81: > 79: lateMatchString16 = > dataStringHuge16.substring(dataStringHuge16.length() - 31); > 80: > 81: searchString = "oscar"; Would had liked to see a few more small needles (i.e. to test/verify individual switch statement cases) test/micro/org/openjdk/bench/java/lang/StringIndexOfHuge.java line 94: > 92: > 93: > 94: /** IndexOf Micros */ Would really had preferred @Param{"LL", "UU", "UL"}; would be easier to spot if there are any copy/paste errors.. test/micro/org/openjdk/bench/java/lang/StringIndexOfHuge.java line 132: > 130: @Benchmark > 131: public int searchHugeLargeSubstring() { > 132: return dataStringHuge.indexOf("B".repeat(30) + "X" + > "A".repeat(30), 74); .repeat() call and string concatenation shouldn't be part of the benchmark (here and several other @Benchmark functions in this file) since it will detract from the measurement. (String concatenation gets converted (by javac) into StringBuilder().append().append()....append().toString()) test/micro/org/openjdk/bench/java/lang/StringIndexOfHuge.java line 242: > 240: @Benchmark > 241: public int search16HugeLargeSubstring16() { > 242: return dataStringHuge16.indexOf("B".repeat(30) + "X" + > "A".repeat(30), 74); `search16HugeLargeSubstring16` implies UU, but with `"B".repeat(30) + "X" + "A".repeat(30)` is UL ------------- PR Review: https://git.openjdk.org/jdk/pull/16753#pullrequestreview-2058681000 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602136400 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602140456 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602137044 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602158011 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602160330 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602144091 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602147967 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602153043 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602181943 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602162587 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602167728 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602184697 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602198158 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602171418 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602200123 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602133525 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602130679 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602047091 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602115797