https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123330

            Bug ID: 123330
           Summary: Optimization fails with standard branchless idiom
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tobi at gcc dot gnu.org
  Target Milestone: ---

The code GCC generates for this on x86-64 is surprisingly bad:

int bla(int x)
{
    return (x != 0) * __builtin_clz(x) + (x == 0) * 64;
}

See here:
https://godbolt.org/z/v5fc4ddqr

It actually evaluates both the multiplication on the right and the addition,
whereas the whole expression is really just a conditional move.
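
For comparison, here is a sketch of the select form that the idiom is meant to
express. The name `bla_clz_select` is made up for illustration and is not part
of the test case; I have not checked that GCC emits a conditional move for it.

```c
/* Hypothetical reference version, not part of the reported test case.
   The ternary only evaluates __builtin_clz when x is nonzero, so the
   undefined zero-argument case is never reached. */
int bla_clz_select(int x)
{
    return x != 0 ? __builtin_clz(x) : 64;
}
```

The results for nonzero inputs assume a 32-bit `unsigned int`, as on x86-64.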

BTW, it doesn't matter that the first branch uses `__builtin_clz`, which is
undefined for a zero argument (if not building with `-mlzcnt`); the intent was
to avoid that undefined case in a clean way.  E.g.

int bla(int x)
{
    return (x != 0) * (x + 1) + (x == 0) * 64;
}
shows the same behavior: https://godbolt.org/z/ETe8Wcaer
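
To make the intended equivalence explicit, here is a sketch that puts the
arithmetic form next to a plain select. Both function names are hypothetical,
chosen only to keep the two variants apart. The two should agree for every `x`
where `x + 1` does not overflow.

```c
/* Branchless form from the second example, under a hypothetical name. */
int bla_arith(int x)
{
    return (x != 0) * (x + 1) + (x == 0) * 64;
}

/* Hypothetical select form: exactly one of the two conditions is 1,
   so the sum above always reduces to one of these two operands. */
int bla_arith_select(int x)
{
    return x != 0 ? x + 1 : 64;
}
```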

(I selected tree-optimization as category because there is no general
optimization category, and the same behavior occurs on ARM if I interpret the
assembly correctly).
