On Tue, 18 Jul 2023 01:49:17 GMT, 温绍锦 <d...@openjdk.org> wrote:

>> Optimization for:
>> Integer.toString
>> Long.toString
>> StringBuilder#append(int)
>> 
>> # Benchmark Result
>> 
>> 
>> sh make/devkit/createJMHBundle.sh
>> bash configure --with-jmh=build/jmh/jars
>> make test TEST="micro:java.lang.Integers.toString*" 
>> make test TEST="micro:java.lang.Longs.toString*" 
>> make test TEST="micro:java.lang.StringBuilders.toStringCharWithInt8"
>> 
>> 
>> ## 1. 
>> [aliyun_ecs_c8i.xlarge](https://help.aliyun.com/document_detail/25378.html#c8i)
>> * cpu : intel xeon sapphire rapids (x64)
>> 
>> ``` diff
>> -Benchmark               (size)  Mode  Cnt  Score   Error  Units (baseline)
>> -Integers.toStringBig       500  avgt   15  6.825 ± 0.023  us/op
>> -Integers.toStringSmall     500  avgt   15  4.823 ± 0.023  us/op
>> -Integers.toStringTiny      500  avgt   15  3.878 ± 0.101  us/op
>> 
>> +Benchmark               (size)  Mode  Cnt  Score   Error  Units (PR Update 04 f4aa1989)
>> +Integers.toStringBig       500  avgt   15  6.002 ± 0.054  us/op (+13.71%)
>> +Integers.toStringSmall     500  avgt   15  4.025 ± 0.020  us/op (+19.82%)
>> +Integers.toStringTiny      500  avgt   15  3.874 ± 0.067  us/op (+0.10%)
>> 
>> -Benchmark            (size)  Mode  Cnt  Score   Error  Units (baseline)
>> -Longs.toStringBig       500  avgt   15  9.224 ± 0.021  us/op
>> -Longs.toStringSmall     500  avgt   15  4.621 ± 0.087  us/op
>> 
>> +Benchmark            (size)  Mode  Cnt  Score   Error  Units (PR Update 04 f4aa1989)
>> +Longs.toStringBig       500  avgt   15  7.483 ± 0.018  us/op (+23.26%)
>> +Longs.toStringSmall     500  avgt   15  4.020 ± 0.016  us/op (+14.95%)
>> 
>> -Benchmark                           Mode  Cnt     Score    Error  Units (baseline)
>> -StringBuilders.toStringCharWithInt8 avgt   15    89.327 ±  0.733  ns/op
>> 
>> +Benchmark                            Mode  Cnt   Score   Error  Units (PR Update 04 f4aa1989)
>> +StringBuilders.toStringCharWithInt8  avgt   15  36.639 ± 0.422  ns/op (+143.80%)
>> 
>> ```
>> 
>> ## 2. 
>> [aliyun_ecs_c8a.xlarge](https://help.aliyun.com/document_detail/25378.html#c8a)
>> * cpu : amd epyc genoa (x64)
>> 
>> ``` diff
>> -Benchmark               (size)  Mode  Cnt  Score   Error  Units (baseline)
>> -Integers.toStringBig       500  avgt   15  6.753 ± 0.007  us/op
>> -Integers.toStringSmall     500  avgt   15  4.470 ± 0.005  us/op
>> -Integers.toStringTiny      500  avgt   15  2.764 ± 0.020  us/op
>> 
>> +Benchmark               (size)  Mode  Cnt  Score   Error  Units (PR Update 04 f4aa1989)
>> +Integers.toStringBig       500  avgt   15  5.036 ± 0.005  us/op (+...
>
> 温绍锦 has updated the pull request incrementally with one additional commit since the last revision:
> 
>   assert bounds check

While I share the wariness about lookup tables (and the microbenchmarks that 
lend them too much credence), I think the approach in this PR is mostly an 
incremental improvement on an existing design and should be evaluated as such. 
By packing two lookup tables into one, we reduce the opportunity for cache 
misses. The new lookup table is no larger than the two existing ones combined, 
so we're not wasting memory or sacrificing density. Having a single table also 
makes it less likely that the data will end up in different locations on the 
heap or in compiled code. I see mostly positives here.
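
To make the packing idea concrete, here is a minimal, self-contained sketch of 
what a merged two-digit table can look like; the names (`PackedDigitsSketch`, 
`writeTwoDigits`) and the byte order are illustrative assumptions, not the 
exact layout used in this PR:

```java
// Illustrative sketch only: one short[] replacing separate tens/ones digit
// tables, so each two-digit value needs a single table load.
final class PackedDigitsSketch {
    // One entry per value 0..99: high byte = tens digit, low byte = ones digit
    // (this byte order is an assumption for the sketch, not the PR's layout).
    static final short[] PACKED_DIGITS = new short[100];
    static {
        for (int i = 0; i < 100; i++) {
            int tens = '0' + i / 10;
            int ones = '0' + i % 10;
            PACKED_DIGITS[i] = (short) ((tens << 8) | ones);
        }
    }

    // Write the two ASCII digits of a value in [0, 99] into buf at pos.
    static void writeTwoDigits(byte[] buf, int pos, int value) {
        int packed = PACKED_DIGITS[value] & 0xFFFF;
        buf[pos]     = (byte) (packed >> 8); // tens digit
        buf[pos + 1] = (byte) packed;        // ones digit
    }
}
```

The actual change stores the packed value with a single unaligned `Unsafe` 
write rather than two array stores, which is where the `Unsafe` concern 
mentioned below comes in.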

The main drawback is the proliferation of `Unsafe` usage.

Switching out the lookup-table-based algorithm for something clever without a 
lookup table is laudable, but that comes with a new set of challenges and 
should be done as a follow-up. Since these tables can be aggressively 
constant-folded into the code section by our JITs, it might even turn out to 
be a wash. 

If @RogerRiggs is happy with asserts in lieu of explicit bounds checking then I 
move to approve this.

src/java.base/share/classes/java/lang/StringUTF16.java line 1585:

> 1583:                     buf,
> 1584:                     Unsafe.ARRAY_BYTE_BASE_OFFSET + (charPos << 1),
> 1585:                     PACKED_DIGITS_UTF16[r]);

What performance would you get if you used the same lookup table as the other 
implementations, but inflated the value to UTF-16 on the fly?


int packed = (int) Integer.PACKED_DIGITS[r];
// spread each ASCII digit byte into its own 16-bit lane (UTF-16 code unit)
int inflated = ((packed & 0xFF00) << 8) | (packed & 0xFF);
UNSAFE.putIntUnaligned(
                    buf,
                    Unsafe.ARRAY_BYTE_BASE_OFFSET + (charPos << 1),
                    inflated);


This would avoid juggling more lookup table data around than before, 
alleviating some of the concerns voiced in this PR comment thread.
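
As a standalone illustration of the bit-twiddling above (assuming, as in the 
earlier sketch, that the packed short carries the two ASCII digit bytes and 
that the target is a little-endian UTF-16 byte array), a plain-Java version 
without `Unsafe` might look like this:

```java
public class InflateDigitsSketch {
    public static void main(String[] args) {
        // Two packed ASCII digits for the value 42: '4' in the low byte,
        // '2' in the high byte (byte order is an assumption for this sketch).
        int packed = ('2' << 8) | '4';          // 0x3234

        // Spread each byte into its own 16-bit lane so the pair can be stored
        // as two little-endian UTF-16 code units with one 32-bit write.
        int inflated = ((packed & 0xFF00) << 8) | (packed & 0xFF);

        // 0x00320034: low char = '4' (0x0034), high char = '2' (0x0032).
        System.out.printf("packed=0x%04X inflated=0x%08X%n", packed, inflated);
    }
}
```

Written out with a single unaligned 32-bit store on a little-endian machine, 
those four bytes land as the UTF-16LE encodings of '4' and '2' at consecutive 
char positions.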

-------------

PR Review: https://git.openjdk.org/jdk/pull/14699#pullrequestreview-1605608821
PR Review Comment: https://git.openjdk.org/jdk/pull/14699#discussion_r1312156127
