On Wed, 6 Dec 2023 14:20:14 GMT, Claes Redestad <redes...@openjdk.org> wrote:
> https://bugs.openjdk.org/browse/JDK-8215017 removed the only use of
> `StringUTF16::equals`. At the time I did some performance verification
> focused on x86 showing that simplifying and only using `StringLatin1::equals`
> was either neutral or a win.
>
> I repeated this experiment recently, adding some focused tests on aarch64
> where the code generation actually tries to take advantage and generate
> slightly more efficient code for `StringUTF16::equals`:
> https://github.com/openjdk/jdk/pull/16933#discussion_r1414118658
>
> The indication here is that disabling use of `StringUTF16::equals` was the
> right choice: any effect from the low-level optimization (one less branch at
> the tail end) was offset by the `isLatin1()` branch and added code generation
> (that all gets inlined).
>
> In a `-XX:-CompactStrings` configuration the slightly improved code
> generation in `StringUTF16::equals` might help, since the `isLatin1()` test
> and subsequent call to `StringLatin1::equals` would be DCEd. To get the best
> of both worlds the code in `String::equals` _could_ be sharpened so that we
> statically pick the best implementation based on `CompactStrings` mode (see
> comment below). This shows a tiny win when testing with `-XX:-CompactStrings`
> on M1 (up to -0.2ns/op per `String::equals`; neutral on x86). But is all this
> complexity worth it for a gain that will get lost in the noise on anything
> realistic?
>
> This PR instead proposes removing `StringUTF16::equals` and simplifying the
> mechanisms to support the `StringLatin1/UTF16::equals` pair of intrinsics in
> hotspot.

The x86 `string_equals` instruction pre-dates and was updated for the Compact Strings JEP. No specialized 2-byte variants:
https://github.com/openjdk/jdk/commit/7af927f9c10923b61f746eb6e566bcda853dd95a#diff-89791d4b051172965f7ba8f0cb7afbeb7e141f6de924dc07167c5ceefdce6bbe
A bit strange since distinct 1-byte `array_equalsB` and 2-byte `array_equalsC` were introduced at this time, sharing code.
So either an unintended omission, or the benefit of having both variants didn't manifest in benchmarks.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16995#issuecomment-1924924403