On Fri, 8 Sep 2023 08:47:04 GMT, Raffaello Giulietti <rgiulie...@openjdk.org> wrote:
>> This PR seeks to improve formatting of hex digits using
>> `java.util.HexFormat` somewhat.
>>
>> This is achieved by getting rid of a couple of lookup tables, caching the
>> result of `HexFormat.of().withUpperCase()`, and removing a tiny allocation
>> that happens in the `formatHex(A, byte)` method. Improvements range from
>> 20-40% on throughput, and some operations allocate less:
>>
>> Name                                Cnt   Base    Error    Test    Error  Unit    Diff%
>> HexFormatBench.appenderLower         15   1,330 ± 0,021    1,065 ± 0,067  us/op    19,9% (p = 0,000*)
>>   :gc.alloc.rate                     15  11,481 ± 0,185    0,007 ± 0,000  MB/sec  -99,9% (p = 0,000*)
>>   :gc.alloc.rate.norm                15  16,009 ± 0,000    0,007 ± 0,000  B/op   -100,0% (p = 0,000*)
>>   :gc.count                          15   3,000            0,000          counts
>>   :gc.time                            3   2,000                           ms
>> HexFormatBench.appenderLowerCached   15   1,317 ± 0,013    1,065 ± 0,054  us/op    19,1% (p = 0,000*)
>>   :gc.alloc.rate                     15  11,590 ± 0,111    0,007 ± 0,000  MB/sec  -99,9% (p = 0,000*)
>>   :gc.alloc.rate.norm                15  16,009 ± 0,000    0,007 ± 0,000  B/op   -100,0% (p = 0,000*)
>>   :gc.count                          15   3,000            0,000          counts
>>   :gc.time                            3   2,000                           ms
>> HexFormatBench.appenderUpper         15   1,330 ± 0,022    1,065 ± 0,036  us/op    19,9% (p = 0,000*)
>>   :gc.alloc.rate                     15  34,416 ± 0,559    0,007 ± 0,000  MB/sec -100,0% (p = 0,000*)
>>   :gc.alloc.rate.norm                15  48,009 ± 0,000    0,007 ± 0,000  B/op   -100,0% (p = 0,000*)
>>   :gc.count                          15   0,000            0,000          counts
>> HexFormatBench.appenderUpperCached   15   1,353 ± 0,009    1,033 ± 0,014  us/op    23,6% (p = 0,000*)
>>   :gc.alloc.rate                     15  11,284 ± 0,074    0,007 ± 0,000  MB/sec  -99,9% (p = 0,000*)
>>   :gc.alloc.rate.norm                15  16,009 ± 0,000    0,007 ± 0,000  B/op   -100,0% (p = 0,000*)
>>   :gc.count                          15   3,000            0,000          counts
>>   :gc.time                            3   2,000                           ms
>> HexFormatBench.toHexLower            15   0,198 ± 0,001    0,119 ± 0,008  us/op    40,1% (p = 0,000*)
>>   :gc.alloc.rate                     15   0,007 ± 0,000    0,007 ± 0,000  MB/sec   -0,0% (p = 0,816 )
>>   :gc.alloc.rate.norm                15   0,001 ± 0,000    0,001 ± 0,000  B/op    -40,1% (p = 0,000*)
>>   :gc....
> I'm not sure that micro-benchmarks are very indicative on whether a lookup
> table performs better than short and straightforward code.
> The reason is that, once in the CPU caches, a lookup table in
> micro-benchmarks stays there, whereas in more realistic situations, where
> access is more spread out in time, it is often evicted to make room for
> other, more lively data.
> A micro-benchmark using a lookup table shows good results because of a high
> rate of cache hits, whereas in other real-world workloads a lookup table
> might result in many cache misses on access.

As @rgiulietti says, lookup-table algorithms may outperform in microbenchmarks but lose out in real-world scenarios, so we should steer clear of them unless there's a major benefit. And as it turns out, the relative benefit seems to come mainly from the use of `ByteArrayLittleEndian`, not from using the lookup table in `HexDigits.DIGITS`:

```diff
     public String toHexDigits(byte value) {
         byte[] rep = new byte[2];
-        rep[0] = (byte)toHighHexDigit(value);
-        rep[1] = (byte)toLowHexDigit(value);
+        ByteArrayLittleEndian.setChar(rep, 0, (char)(toLowHexDigit(value) << 8 | toHighHexDigit(value)));
         try {
             return jla.newStringNoRepl(rep, StandardCharsets.ISO_8859_1);
         } catch (CharacterCodingException cce) {
```

This gets the same speed-up (4%) as calling `HexDigits.DIGITS`:

```
Name                        Cnt  Base    Error   Test    Error  Unit  Diff%
HexFormatBench.toHexDigits   15  1,943 ± 0,014   1,860 ± 0,014  us/op  4,2% (p = 0,000*)
  * = significant
```

-------------

PR Comment: https://git.openjdk.org/jdk/pull/15591#issuecomment-1711370758
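For readers outside the JDK source tree, the single two-byte store discussed above can be sketched with the public `VarHandle` API instead of the internal `ByteArrayLittleEndian` class. The class and helper names below are illustrative, not the PR's actual code; the point is only the byte-ordering trick: with a little-endian `char` store, the char's low-order byte lands at index 0, so the *high* hex digit must occupy bits 0-7.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Illustrative stand-in for the internal ByteArrayLittleEndian.setChar:
// a little-endian char view over a byte[], built from public API.
public class HexPackSketch {
    private static final VarHandle CHAR_LE =
            MethodHandles.byteArrayViewVarHandle(char[].class, ByteOrder.LITTLE_ENDIAN);

    private static final String HEX = "0123456789abcdef";

    static int toHighHexDigit(byte b) { return HEX.charAt((b >> 4) & 0xF); }
    static int toLowHexDigit(byte b)  { return HEX.charAt(b & 0xF); }

    public static String toHexDigits(byte value) {
        byte[] rep = new byte[2];
        // One 16-bit little-endian store writes both digit bytes at once:
        // the high digit sits in bits 0-7 (written to rep[0]) and the low
        // digit in bits 8-15 (written to rep[1]).
        CHAR_LE.set(rep, 0, (char) (toLowHexDigit(value) << 8 | toHighHexDigit(value)));
        return new String(rep, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(toHexDigits((byte) 0x5A)); // prints "5a"
    }
}
```

Note that the JDK's internal helper exists precisely so hot paths can avoid repeating this `VarHandle` boilerplate; outside the JDK, a plain two-element store is usually clear enough unless profiling says otherwise.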