Hi David,

On Sun, Mar 8, 2020 at 11:34 AM David Fetter <da...@fetter.org> wrote:
>
> On Mon, Mar 02, 2020 at 12:45:21PM -0800, Jesse Zhang wrote:
> > Hi David,
>
> Per discussion on IRC with Andrew (RhodiumToad) Gierth:
>
> The runtime detection means there's always an indirect call overhead
> and no way to inline.  This is counter to what using compiler
> intrinsics is supposed to do.
>
> It's better to rely on the compiler, because:
> (a) The compiler often knows whether the value can or can't be 0 and
>     can therefore skip a conditional jump.

Yes, the compiler would know to eliminate the branch if the inlined
function is called with a literal argument, or it infers an invariant
from the context (like nesting inside a conditional block, or a previous
conditional "noreturn" path).

> (b) If you're targeting a recent microarchitecture, the compiler can
>     just use the right instruction.

I might be more conservative than you are on (b). The thought of
building a binary that cannot run "somewhere" where the compiler
supports by default still mortifies me.

> (c) Even if the conditional branch is left in, it's not a big overhead.
>

I 100% agree with (c), see benchmarking results upthread.

Cheers,
Jesse