Re: RFR 8209171 : Simplify Java implementation of Integer/Long.numberOfTrailingZeros()

Ivan Gerasimov Sun, 12 Aug 2018 07:23:07 -0700

Hi Martin!


On 8/11/18 5:54 PM, Martin Buchholz wrote:

Hi Ivan,

Oh the allure of bit-twiddling!

Yes :)

I'm skeptical that ntz should ever delegate to nlz, and not onlybecause of the overhead of a wrapper, but because small numbers aremore common, and we can take that into account when implementing ntz.

I was under impression that the more frequently a function is called thefaster it gets compiled, so all the callers of this function will benefit.For example, if numberOfTrailingZeros is reduced to numberOfLeadingZerosthen when the later is compiled while the former is still not, it willstill be running faster than the variant with independent implementations.

  At least add "1" to the set of numbers to benchmark.

In the last proposed patch, all odd numbers will be processed via a fastpath (because for any odd i, ~i & (i - 1) == 0).So, I added 1, 16 and 42 -- small numbers with different number oftrailing zeros.


Here's the updated benchmark:
http://cr.openjdk.java.net/~igerasim/8209171/02/bench/int/MyBenchmark.java
(I only executed four implementations to keep the picture clear.)

  Here's my own entry in this race:

    static int ntz(int i) {
        if (i == 0) return 32;
        int n = 0;
        if ((i << 16)  == 0) { n += 16; i >>>= 16; }
        if ((i & 0xFF) == 0) { n +=  8; i >>>=  8; }
        if ((i & 0xF)  == 0) { n +=  4; i >>>=  4; }
        if ((i & 0x3)  == 0) { n +=  2; i >>>=  2; }
        return n + (~i & 1);
    }

Interesting!

You might also avoid inversion at the end, if n is initialized with 1,and then the last line may be written as return n - (i & 1).


Still this variant appears to be slower in most tried cases.
Here's the graph of the latest benchmark:
http://cr.openjdk.java.net/~igerasim/8209171/02/bench/int/bench-int-02-client.png
http://cr.openjdk.java.net/~igerasim/8209171/02/bench/int/bench-int-02-server.png

The variant from the test01 is the fastest in most cases, but I wouldprefer to proceed with the variant from test05:It's only slightly slower than 01, but contains less bytecode and helpsto warm up numberOfLeadingZeros().

Whatever happens, we ought to check in the microbenchmarks somewhere.It looks like the new jmh-jdk-microbenchmarks is the place.I also suspect that tests for these methods could be improved (butthere are existing hotspot tests).

To make sure the new code is correct I ran an exhaustive loop fromInteger.MIN_VALUE to MAX_VALUE inclusive and checked all the testedvariants of implementation.

nlz seems harder than ntz in the sense that for nlz "small ints" andrandom bits may have different optimal implementations.

On Fri, Aug 10, 2018 at 7:03 PM, Ivan Gerasimov<[email protected] <mailto:[email protected]>> wrote:


    Thanks Martin!


    On 8/9/18 5:42 PM, Martin Buchholz wrote:



    On Thu, Aug 9, 2018 at 5:27 PM, Ivan Gerasimov
    <[email protected] <mailto:[email protected]>> wrote:

        I did not use the intrinsified variants of
        numberOfLeadingZeros in the benchmark.


    Oops! Should have looked more closely!
    Did you know about
    http://www.hackersdelight.org/hdcodetxt/ntz.c.txt
    <http://www.hackersdelight.org/hdcodetxt/ntz.c.txt>

Ah, right, ntz1() is even better because it has less branches.How could I miss that?


    Here's the updated webrev and benchmarks:

    http://cr.openjdk.java.net/~igerasim/8209171/01/webrev/
    <http://cr.openjdk.java.net/%7Eigerasim/8209171/01/webrev/>
    http://cr.openjdk.java.net/~igerasim/8209171/01/bench/int/MyBenchmark.java
    
<http://cr.openjdk.java.net/%7Eigerasim/8209171/01/bench/int/MyBenchmark.java>
    http://cr.openjdk.java.net/~igerasim/8209171/01/bench/long/MyBenchmark.java
    
<http://cr.openjdk.java.net/%7Eigerasim/8209171/01/bench/long/MyBenchmark.java>

--With kind regards,

    Ivan Gerasimov


--
With kind regards,
Ivan Gerasimov

Re: RFR 8209171 : Simplify Java implementation of Integer/Long.numberOfTrailingZeros()

Reply via email to