On Fri, 2 Jul 2021 13:47:40 GMT, Andrew Haley <a...@openjdk.org> wrote:

>> You can also do that branchlessly which might prove better
>>
>> ```
>> long result = Math.multiplyHigh(x, y);
>> result += (y & (x >> 63));
>> result += (x & (y >> 63));
>> return result;
>> ```
>
> I doubt very much that it would be better, because these days branch
> prediction is excellent, and we also have conditional select instructions.
> Exposing the condition helps C2 to eliminate it if the range of args is
> known. The `if` code is easier to understand.
>
> Benchmark results, with one of the operands changing signs every iteration,
> 1000 iterations:
>
> Benchmark                     Mode  Cnt     Score    Error  Units
> MulHiTest.mulHiTest1 (aph)    avgt    3  1570.587 ± 16.602  ns/op
> MulHiTest.mulHiTest2 (adinn)  avgt    3  2237.637 ±  4.740  ns/op
>
> In any case, note that with this optimization the unsigned mulHi is in the
> nanosecond range, so Good Enough. IMO.

But weirdly, it's the other way around on AArch64, but there's little in it:

Benchmark             Mode  Cnt     Score   Error  Units
MulHiTest.mulHiTest1  avgt    3  1492.108 ± 0.301  ns/op
MulHiTest.mulHiTest2  avgt    3  1219.521 ± 1.516  ns/op

but this is only in the case where we have unpredictable branches. Go with
simple and easy to understand; it doesn't much matter.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4644
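
For reference, the two variants under discussion can be sketched side by side. This is only a sketch, not the benchmark source: the class name `UnsignedMulHi` and the method names `branchy` and `branchless` are hypothetical, and the `if` version is an assumed reconstruction, since only the branchless body is quoted in the thread. Both correct the signed high word from `Math.multiplyHigh` to get the high 64 bits of the unsigned 128-bit product.

```java
public final class UnsignedMulHi {
    private UnsignedMulHi() {}

    // Assumed shape of the `if` version: start from the signed high word,
    // then add the other operand back for each operand that is negative
    // when read as signed (i.e. has its top bit set when read as unsigned).
    static long branchy(long x, long y) {
        long result = Math.multiplyHigh(x, y);
        if (x < 0) result += y;
        if (y < 0) result += x;
        return result;
    }

    // Branchless version as quoted in the thread: (x >> 63) is an arithmetic
    // shift, giving all ones when x is negative and zero otherwise, so the
    // AND selects either the other operand or 0.
    static long branchless(long x, long y) {
        long result = Math.multiplyHigh(x, y);
        result += (y & (x >> 63));
        result += (x & (y >> 63));
        return result;
    }
}
```

Both methods should return the same value for every input; the thread's question is only which form C2 compiles to faster code on a given CPU.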