Re: RFR: 8282664: Unroll by hand StringUTF16 and StringLatin1 polynomial hash loops [v2]

Ludovic Henry Fri, 04 Mar 2022 09:45:45 -0800

> Despite the hash value being cached for Strings, computing the hash still 
> represents a significant CPU usage for applications handling lots of text.
> 
> Even though it would be generally better to do it through an enhancement to 
> the autovectorizer, the complexity of doing it by hand is trivial and the 
> gain is sizable (2x speedup) even without the Vector API. The algorithm has 
> been proposed by Richard Startin and Paul Sandoz [1].
> 
> Speedup are as follows on a `Intel(R) Xeon(R) E-2276G CPU @ 3.80GHz`
> 
> 
> Benchmark                                  (size)  Mode  Cnt     Score   
> Error  Units
> StringHashCode.Algorithm.scalar                 0  avgt          1.799        
>   ns/op
> StringHashCode.Algorithm.scalar                 1  avgt          3.233        
>   ns/op
> StringHashCode.Algorithm.scalar                10  avgt          6.617        
>   ns/op
> StringHashCode.Algorithm.scalar               100  avgt         61.481        
>   ns/op
> StringHashCode.Algorithm.scalar              1000  avgt        631.690        
>   ns/op
> StringHashCode.Algorithm.scalar             10000  avgt       6293.611        
>   ns/op
> StringHashCode.Algorithm.scalarUnrolled8        0  avgt          1.890        
>   ns/op
> StringHashCode.Algorithm.scalarUnrolled8        1  avgt          3.494        
>   ns/op
> StringHashCode.Algorithm.scalarUnrolled8       10  avgt          9.050        
>   ns/op
> StringHashCode.Algorithm.scalarUnrolled8      100  avgt         31.725        
>   ns/op
> StringHashCode.Algorithm.scalarUnrolled8     1000  avgt        321.031        
>   ns/op
> StringHashCode.Algorithm.scalarUnrolled8    10000  avgt       3203.838        
>   ns/op
> StringHashCode.Algorithm.scalarUnrolled16       0  avgt          1.953        
>   ns/op
> StringHashCode.Algorithm.scalarUnrolled16       1  avgt          3.485        
>   ns/op
> StringHashCode.Algorithm.scalarUnrolled16      10  avgt          7.041        
>   ns/op
> StringHashCode.Algorithm.scalarUnrolled16     100  avgt         30.975        
>   ns/op
> StringHashCode.Algorithm.scalarUnrolled16    1000  avgt        316.616        
>   ns/op
> StringHashCode.Algorithm.scalarUnrolled16   10000  avgt       3208.658        
>   ns/op
> 
> 
> At Datadog, we handle a great amount of text (through logs management for 
> example), and hashing String represents a large part of our CPU usage. It's 
> very unlikely that we are the only one as String.hashCode is such a core 
> feature of the JVM-based languages with its use in HashMap for example. 
> Having even only a 2x speedup would allow us to save thousands of CPU cores 
> per month and improve correspondingly the energy/carbon impact.
> 
> [1] 
> https://static.rainfocus.com/oracle/oow18/sess/1525822677955001tLqU/PF/codeone18-vector-API-DEV5081_1540354883936001Q3Sv.pdf


Ludovic Henry has updated the pull request incrementally with one additional 
commit since the last revision:

  Add UTF-16 benchmarks

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7700/files
  - new: https://git.openjdk.java.net/jdk/pull/7700/files/f7dda1d9..8e2e59bb

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7700&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7700&range=00-01

  Stats: 119 lines in 1 file changed: 70 ins; 7 del; 42 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7700.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7700/head:pull/7700

PR: https://git.openjdk.java.net/jdk/pull/7700

Re: RFR: 8282664: Unroll by hand StringUTF16 and StringLatin1 polynomial hash loops [v2]

Reply via email to