Thank you, Andrew, for looking into this!
On 3/24/18 4:13 AM, Andrew Haley wrote:
> On 03/20/2018 05:20 PM, Ivan Gerasimov wrote:
>> I tried to run it, but the numbers are indistinguishable for non-zero
>> arguments.
>> And my variant performs slightly better with a zero argument.
>> So, I think it's reasonable to keep the variant with the ternary operator.
>
> I am suspicious of this argument. Did you look at the generated code?
> I get
>
>     cbnz w10, 0x000003ffa8202384
>     mov w0, wzr
>
> for the zero test and
>
>     cbz w10, 0x000003ffa81d228c
>     clz w11, w10
>     orr w10, wzr, #0x80000000
>     lsr w0, w10, w11 ;*iushr
>
> for 42.
>
> The branch at the start of both versions goes to a deoptimize trap.
> We don't want deoptimize traps if we can avoid them, so the branchless
> version is better IMO.

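For readers without the earlier messages: the Java source of the two variants is not quoted in this mail. Judging only from the assembly above (clz, the constant 0x80000000, and a logical shift right), they plausibly look like the sketch below. The class and method names are invented for illustration, and the method bodies are an assumption rather than the code under review.

```java
public class HighestBitSketch {
    // Assumed shape of the variant with the ternary operator:
    // an explicit zero check, then shift the top bit down to the
    // position of the highest set bit.
    static int ternary(int i) {
        return i == 0 ? 0 : Integer.MIN_VALUE >>> Integer.numberOfLeadingZeros(i);
    }

    // Assumed shape of the branchless variant. For i == 0,
    // numberOfLeadingZeros returns 32; an int shift uses only the low
    // five bits of the distance, so the shift is by 0 and the mask is
    // MIN_VALUE. ANDing with i (which is 0) still yields 0.
    static int branchless(int i) {
        return i & (Integer.MIN_VALUE >>> Integer.numberOfLeadingZeros(i));
    }

    public static void main(String[] args) {
        System.out.println(ternary(42));    // 32
        System.out.println(branchless(42)); // 32
        System.out.println(branchless(0));  // 0
    }
}
```

If the code has this shape, the two variants return the same value for every input, and the difference is only whether the compiled code needs a zero check whose untaken side becomes a deoptimize trap.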
This looks persuasive, so let's go ahead with the branchless variant!
With kind regards,
Ivan
> I think your benchmark is of questionable validity because it always
> uses the same value. This is unlikely in real code. I think the
> versions should be benchmarked with a *varying* argument.
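A rough sketch of what benchmarking with a varying argument could look like is given below. Everything in it is hypothetical: the class, the helper names, the one-in-four ratio of zeros, and the two lambda bodies (which assume the variants have the shape suggested by the assembly). A hand-rolled timing loop like this is crude, and a proper harness such as JMH would be needed for numbers worth trusting.

```java
import java.util.Random;
import java.util.function.IntUnaryOperator;

public class VaryingInputSketch {
    // Hypothetical helper: a fixed-seed mix of zero and non-zero inputs,
    // so that neither side of a zero check looks "never taken" to the JIT.
    static int[] varyingInputs(int n, long seed) {
        Random rnd = new Random(seed);
        int[] a = new int[n];
        for (int k = 0; k < n; k++) {
            // Roughly one input in four is zero; the ratio is arbitrary.
            a[k] = (rnd.nextInt(4) == 0) ? 0 : rnd.nextInt();
        }
        return a;
    }

    // Apply f to every input and fold the results into a checksum,
    // so the calls cannot be discarded as dead code.
    static long checksum(int[] inputs, IntUnaryOperator f) {
        long sum = 0;
        for (int x : inputs) {
            sum += f.applyAsInt(x);
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] inputs = varyingInputs(1 << 16, 42L);
        IntUnaryOperator ternary =
            i -> i == 0 ? 0 : Integer.MIN_VALUE >>> Integer.numberOfLeadingZeros(i);
        IntUnaryOperator branchless =
            i -> i & (Integer.MIN_VALUE >>> Integer.numberOfLeadingZeros(i));

        // Early rounds act as warm-up; only later rounds are of interest.
        for (int round = 0; round < 20; round++) {
            long t0 = System.nanoTime();
            long a = checksum(inputs, ternary);
            long t1 = System.nanoTime();
            long b = checksum(inputs, branchless);
            long t2 = System.nanoTime();
            System.out.printf("ternary %d ns, branchless %d ns, same=%b%n",
                              t1 - t0, t2 - t1, a == b);
        }
    }
}
```

The point of the mixed input array is that the profile seen by the compiler then contains both zero and non-zero arguments, which is closer to the "real code" situation described above than a benchmark that always passes the same value.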
--
With kind regards,
Ivan Gerasimov