Re: RFR 8199843 : Optimize Integer/Long.highestOneBit()

Ivan Gerasimov Tue, 20 Mar 2018 10:25:40 -0700

Hi Claes!


On 3/20/18 2:46 AM, Claes Redestad wrote:

Hi,

On 2018-03-20 09:58, Ivan Gerasimov wrote:
Hello!
The hightestOneBit function doesn't have an intrinsic and iscurrently implemented with a dozen of instructions.Alternatively, it could be implemented as MIN_VALUE >>>numberOfLeadingZeros(i), which works for all integers but zero.The former function gets intrisified by hotspot, which results in+27% of throughput (see the jmh results below).
Would you please help review this simple fix?

BUGURL: https://bugs.openjdk.java.net/browse/JDK-8199843
WEBREV: http://cr.openjdk.java.net/~igerasim/8199843/00/webrev/
nice optimization!
Benchmark:http://cr.openjdk.java.net/~igerasim/8199843/00/MyBenchmark.java
Benchmark results:

Benchmark                        (arg)   Mode  Cnt Score Error Units
MyBenchmark.int_testMethod_new 0 thrpt 35 323430664.593 ±7492044.171 ops/sMyBenchmark.int_testMethod_new 42 thrpt 35 298526237.078 ±5978291.689 ops/sMyBenchmark.int_testMethod_new -42 thrpt 35 302903562.073 ±7984723.721 ops/sMyBenchmark.int_testMethod_org 0 thrpt 35 236245042.891 ±3635990.596 ops/sMyBenchmark.int_testMethod_org 42 thrpt 35 237903410.753 ±3437684.390 ops/sMyBenchmark.int_testMethod_org -42 thrpt 35 238472580.618 ±2654886.010 ops/sMyBenchmark.long_testMethod_new 0 thrpt 35 282646114.501 ±48028366.305 ops/sMyBenchmark.long_testMethod_new 42 thrpt 35 282382228.405 ±5781529.307 ops/sMyBenchmark.long_testMethod_new -42 thrpt 35 276724858.286 ±6529561.227 ops/sMyBenchmark.long_testMethod_org 0 thrpt 35 198500211.972 ±15096862.367 ops/sMyBenchmark.long_testMethod_org 42 thrpt 35 215854630.194 ±3112930.563 ops/sMyBenchmark.long_testMethod_org -42 thrpt 35 217992805.521 ±2622877.082 ops/s
To nitpick a bit:
Please run with some appropriate time unit, e.g., "-tu us" to makeresults more human readable.
And where are the baseline results? :-)
It'd also be nice to verify we don't regress too much in case there'sno intrinsic, i.e., test with the
intrinsic disabled.

Good point!

Here are results for Integer.highestOneBit with the intrinsic ofnumberOfLeadingZeros being disabled:

Benchmark                           (arg)   Mode  Cnt    Score Error   Units

MyBenchmark.int_testMethod_00_base 0 thrpt 35 324.369 ± 15.437ops/usMyBenchmark.int_testMethod_00_base 42 thrpt 35 307.741 ± 29.623ops/usMyBenchmark.int_testMethod_00_base -42 thrpt 35 324.563 ± 25.039ops/usMyBenchmark.int_testMethod_01_org 0 thrpt 35 231.276 ± 8.392ops/usMyBenchmark.int_testMethod_01_org 42 thrpt 35 230.466 ± 10.557ops/usMyBenchmark.int_testMethod_01_org -42 thrpt 35 238.579 ± 8.257ops/usMyBenchmark.int_testMethod_02_new 0 thrpt 35 326.752 ± 18.400ops/usMyBenchmark.int_testMethod_02_new 42 thrpt 35 200.604 ± 8.139ops/usMyBenchmark.int_testMethod_02_new -42 thrpt 35 212.313 ± 21.284ops/us

Base case just returns the argument, thus shows the maximum possibleupper bound of the throughput.


With non-zero values the new function performs 11-13% worse.
I guess it's acceptable?

With kind regards,
Ivan

Thanks!

/Claes


--
With kind regards,
Ivan Gerasimov

Re: RFR 8199843 : Optimize Integer/Long.highestOneBit()

Reply via email to