On Wed, 6 Dec 2023 14:20:14 GMT, Claes Redestad <redes...@openjdk.org> wrote:
> https://bugs.openjdk.org/browse/JDK-8215017 removed the only use of > `StringUTF16::equals`. At the time I did some performance verification > focused on x86 showing that simplifying and only using `StringLatin1::equals` > was either neutral or a win. > > I repeated this experiment recently, adding some focused tests on aarch64 > where the code generation actually tries to take advantage and generate > slightly more efficient code for `StringUTF16::equals`: > https://github.com/openjdk/jdk/pull/16933#discussion_r1414118658 > > The indication here is that disabling use of `StringUTF16::equals` was the > right choice: any effect from the low-level optimization (one less branch at > the tail end) was offset by the `isLatin1()` branch and added code generation > (that all gets inlined). > > In a `-XX:-CompactStrings` configuration the slightly improved code > generation in `StringUTF16::equals` might help, since the `isLatin1()` test > and subsequent call to `StringLatin1::equals` would be DCEd. To get the best > of both worlds the code in `String::equals` _could_ be sharpened so that we > statically pick the best implementation based on `CompactStrings` mode (see > comment below). This shows a tiny win (up to -0.2ns/op per `String::equals` > on M1; netural on x86). But is all this complexity worth it for a gain that > will get lost in the noise on anything realistic? > > This PR instead proposes removing `StringUTF16::equals` and simplifying the > mechanisms to support the `StringLatin1/UTF16::equals` pair of intrinsics in > hotspot. For reference these are the microbenchmarks used in the JDK-8215017 verification experiment: diff --git a/test/micro/org/openjdk/bench/java/lang/StringEquals.java b/test/micro/org/openjdk/bench/java/lang/StringEquals.java index b0db6a7037e..effe855c228 100644 --- a/test/micro/org/openjdk/bench/java/lang/StringEquals.java +++ b/test/micro/org/openjdk/bench/java/lang/StringEquals.java @@ -43,6 +43,9 @@ public class StringEquals { public String test5 = new String(test4); // equal to test4, but not same public String test6 = new String("0123456780"); public String test7 = new String("0123\u01FE"); + public String test8 = new String("12\u01FE"); + public String test9 = new String("12\u01FF"); + public String test10 = new String("12\u01FE"); @Benchmark public boolean different() { @@ -73,5 +76,15 @@ public boolean differentCoders() { public boolean equalsUTF16() { return test5.equals(test4); } + + @Benchmark + public boolean equalsUTF16_3_NE() { + return test8.equals(test9); + } + + @Benchmark + public boolean equalsUTF16_3_EQ() { + return test8.equals(test10); + } } And this is the change to `String` to get it back to pre-JDK-8215017 state: diff --git a/src/java.base/share/classes/java/lang/String.java b/src/java.base/share/classes/java/lang/String.java index 5869e086191..18ad0e85d33 100644 --- a/src/java.base/share/classes/java/lang/String.java +++ b/src/java.base/share/classes/java/lang/String.java @@ -1900,9 +1900,13 @@ public boolean equals(Object anObject) { if (this == anObject) { return true; } - return (anObject instanceof String aString) - && (!COMPACT_STRINGS || this.coder == aString.coder) - && StringLatin1.equals(value, aString.value); + if (anObject instanceof String aString) { + if (coder() == aString.coder()) { + return isLatin1() ? StringLatin1.equals(value, aString.value) + : StringUTF16.equals(value, aString.value); + } + } + return false; } /** M1 experiment with `-XX:-CompactStrings` and a patch to choose the `StringUTF16::equals` if `COMPACT_STRINGS` is false[1]: Name Cnt Base Error Test Error Unit Change StringEquals.equalsUTF16_3_EQ 5 1,799 ± 0,008 1,585 ± 0,009 ns/op 1,14x (p = 0,000*) StringEquals.equalsUTF16_3_NE 5 1,622 ± 0,007 1,541 ± 0,011 ns/op 1,05x (p = 0,000*) * = significant [1] diff --git a/src/java.base/share/classes/java/lang/String.java b/src/java.base/share/classes/java/lang/String.java index 5869e086191..10451bda83f 100644 --- a/src/java.base/share/classes/java/lang/String.java +++ b/src/java.base/share/classes/java/lang/String.java @@ -1900,9 +1900,14 @@ public boolean equals(Object anObject) { if (this == anObject) { return true; } - return (anObject instanceof String aString) - && (!COMPACT_STRINGS || this.coder == aString.coder) - && StringLatin1.equals(value, aString.value); + if (anObject instanceof String aString) { + if (COMPACT_STRINGS) { + return this.coder == aString.coder && StringLatin1.equals(value, aString.value); + } else { + return StringUTF16.equals(value, aString.value); + } + } + return false; } /** As expected this shows a tiny but measurable win on non-x86 platforms for `-CompactString` when retaining `StringUTF16::equals` and selecting it statically. For `+CompactStrings` this is performance neutral with the baseline. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16995#issuecomment-1845096784 PR Comment: https://git.openjdk.org/jdk/pull/16995#issuecomment-1845110919