On 05/17/2018 07:07 AM, Claes Redestad wrote: > Shouldn't this be called "Faster rounding up to nearest power of two"? > > Patch looks OK to me, but I'd like to see numbers with the > numberOfLeadingZeros intrinsic > disabled so that we ensure we're not incurring an unreasonable penalty > on platforms who don't > have an intrinsic for this. > > Running your benchmark with the intrinsic disabled[1] on my machine I > see a 25-30% penalty > with testNew relative to testOld, which is perhaps a bit toomuch for > comfort..
This was the original reason for using the existing code -- hardware support was uncommon. I like your idea of tweaking non-intrinsic numberOfLeadingZeros to reduce penalty on the now-uncommon platforms without support. -Doug > > So I took a look at profiles for numberOfLeadingZeros with the intrinsic > disabled and realized > it might be possible to improve: > > diff -r de35abdfe5da src/java.base/share/classes/java/lang/Integer.java > --- a/src/java.base/share/classes/java/lang/Integer.java Mon May > 14 16:21:08 2018 +0200 > +++ b/src/java.base/share/classes/java/lang/Integer.java Thu May > 17 12:44:53 2018 +0200 > @@ -1621,12 +1621,12 @@ > // HD, Figure 5-6 > if (i <= 0) > return i == 0 ? 32 : 0; > - int n = 1; > - if (i >>> 16 == 0) { n += 16; i <<= 16; } > - if (i >>> 24 == 0) { n += 8; i <<= 8; } > - if (i >>> 28 == 0) { n += 4; i <<= 4; } > - if (i >>> 30 == 0) { n += 2; i <<= 2; } > - n -= i >>> 31; > + int n = 0; > + if ( i < 1 << 16) { n += 16; i <<= 16; } > + if (i >= 0 && i < 1 << 24) { n += 8; i <<= 8; } > + if (i >= 0 && i < 1 << 28) { n += 4; i <<= 4; } > + if (i >= 0 && i < 1 << 30) { n += 2; i <<= 2; } > + if (i >= 0) n++; > return n; > } > > Adding a case that uses this version to your benchmark[2] and the new > version is only about > 10% slower than the baseline, with the added benefit that other uses of > numberOfLeadingZeros > might see a speed-upif there's no intrinsic (runs with intrinsic > disabled[1]): > > Benchmark (arg) Mode Cnt Score Error Units > TableFor.testNew 1 avgt 6 28.343 ± 0.534 ns/op > TableFor.testNew 42 avgt 6 26.458 ± 0.064 ns/op > TableFor.testNew2 1 avgt 6 23.969 ± 0.201 ns/op > TableFor.testNew2 42 avgt 6 23.934 ± 0.107 ns/op > TableFor.testOld 1 avgt 6 21.615 ± 0.803 ns/op > TableFor.testOld 42 avgt 6 21.418 ± 0.106 ns/op > > So I think with the above patch to Integer.numberOfLeadingZeros we can > get the benefit for > our supported platforms while staying roughly performance neutral on > platforms without > an intrinsic. Not strictly necessary to fold it into this RFE, of course. > > Thanks! > > /Claes > > [1] -XX:+UnlockDiagnosticVMOptions > -XX:DisableIntrinsic=_numberOfLeadingZeros_i > [2] http://cr.openjdk.java.net/~redestad/scratch/TableFor.java > > On 2018-05-17 05:48, David Holmes wrote: >>> Do you think it's good to go? >> >> I think I'd rather defer to a more performance oriented reviewer - >> paging Claes! ;-) >> >> David >> ----- >