On Fri, 4 Mar 2022 15:54:14 GMT, Ludovic Henry <luhe...@openjdk.org> wrote:
> Despite the hash value being cached for Strings, computing the hash still > represents a significant CPU usage for applications handling lots of text. > > Even though it would be generally better to do it through an enhancement to > the autovectorizer, the complexity of doing it by hand is trivial and the > gain is sizable (2x speedup) even without the Vector API. The algorithm has > been proposed by Richard Startin and Paul Sandoz [1]. > > Speedup are as follows on a `Intel(R) Xeon(R) E-2276G CPU @ 3.80GHz` > > > Benchmark (size) Mode Cnt Score > Error Units > StringHashCode.Algorithm.scalar 0 avgt 1.799 > ns/op > StringHashCode.Algorithm.scalar 1 avgt 3.233 > ns/op > StringHashCode.Algorithm.scalar 10 avgt 6.617 > ns/op > StringHashCode.Algorithm.scalar 100 avgt 61.481 > ns/op > StringHashCode.Algorithm.scalar 1000 avgt 631.690 > ns/op > StringHashCode.Algorithm.scalar 10000 avgt 6293.611 > ns/op > StringHashCode.Algorithm.scalarUnrolled8 0 avgt 1.890 > ns/op > StringHashCode.Algorithm.scalarUnrolled8 1 avgt 3.494 > ns/op > StringHashCode.Algorithm.scalarUnrolled8 10 avgt 9.050 > ns/op > StringHashCode.Algorithm.scalarUnrolled8 100 avgt 31.725 > ns/op > StringHashCode.Algorithm.scalarUnrolled8 1000 avgt 321.031 > ns/op > StringHashCode.Algorithm.scalarUnrolled8 10000 avgt 3203.838 > ns/op > StringHashCode.Algorithm.scalarUnrolled16 0 avgt 1.953 > ns/op > StringHashCode.Algorithm.scalarUnrolled16 1 avgt 3.485 > ns/op > StringHashCode.Algorithm.scalarUnrolled16 10 avgt 7.041 > ns/op > StringHashCode.Algorithm.scalarUnrolled16 100 avgt 30.975 > ns/op > StringHashCode.Algorithm.scalarUnrolled16 1000 avgt 316.616 > ns/op > StringHashCode.Algorithm.scalarUnrolled16 10000 avgt 3208.658 > ns/op > > > At Datadog, we handle a great amount of text (through logs management for > example), and hashing String represents a large part of our CPU usage. It's > very unlikely that we are the only one as String.hashCode is such a core > feature of the JVM-based languages with its use in HashMap for example. > Having even only a 2x speedup would allow us to save thousands of CPU cores > per month and improve correspondingly the energy/carbon impact. > > [1] > https://static.rainfocus.com/oracle/oow18/sess/1525822677955001tLqU/PF/codeone18-vector-API-DEV5081_1540354883936001Q3Sv.pdf My only comment is that I'd like to see some benchmark verifying also the UTF-16 code path. It should see a similar speed-up, but adding plenty of calls might mess up compilation and inlining. ------------- PR: https://git.openjdk.java.net/jdk/pull/7700