https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123505

            Bug ID: 123505
           Summary: Bit-trick for absolute value fails to optimize
           Product: gcc
           Version: 15.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tobi at gcc dot gnu.org
  Target Milestone: ---

Here's another branch-free pattern that gcc optimizes poorly.  Of course the
ternary optimizes well in this instance.
https://godbolt.org/z/4vsW54o1G

All the functions in the godbolt do the same thing:
int ternary_ge(char *buffer, int dec_exp) {
  uint16_t e_sign = dec_exp >= 0 ? ('-' << 8 | 'e') : ('+' << 8 | 'e');
  memcpy(buffer, &e_sign, 2);
  dec_exp = dec_exp >= 0 ? dec_exp : -dec_exp;
  return  dec_exp;
}

They copy two bytes which depend on whether dec_exp is positive or negative and
then return the absolute value of dec_exp.

I'm trying various patterns for obtaining the absolute value which all depend
on masking based on the sign.

This optimizes as well as the ternary, yielding identical code:
  int mask = dec_exp >> 31;
  dec_exp = ((dec_exp + mask) ^ mask);

This optimizes worse, but is still jump-free:
  int mask = (dec_exp >= 0) - 1;
  dec_exp = ((dec_exp + mask) ^ mask);

And this introduces a jump:
  int mask = -(dec_exp < 0);
  dec_exp = ((dec_exp + mask) ^ mask);

Without the first two lines of the function (the e_sign computation and the
memcpy), the last two versions still optimize worse than the ternary and the
shift, but no jump is introduced: https://godbolt.org/z/MPPh5b5js
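
For reference, here is a self-contained sketch of the four variants as complete
functions, in case someone wants to reproduce this without the godbolt links.
Only ternary_ge is named above; mask_shift, mask_ge, mask_lt and variants_agree
are names made up for this sketch, and the exact functions in the godbolt may
differ.  The masking identity is the same in all three: mask is 0 for
non-negative input, which leaves dec_exp unchanged, and -1 for negative input,
where (dec_exp - 1) ^ -1 == ~(dec_exp - 1) == -dec_exp.  The sketch assumes a
32-bit int and an arithmetic right shift on negative values (which GCC
provides), and like the ternary it is undefined for INT_MIN.

```c
#include <stdint.h>
#include <string.h>

/* Reference version, as given in the report. */
int ternary_ge(char *buffer, int dec_exp) {
  uint16_t e_sign = dec_exp >= 0 ? ('-' << 8 | 'e') : ('+' << 8 | 'e');
  memcpy(buffer, &e_sign, 2);
  dec_exp = dec_exp >= 0 ? dec_exp : -dec_exp;
  return dec_exp;
}

/* Hypothetical name.  Mask from the sign bit; assumes 32-bit int and a
   sign-extending >> (implementation-defined in ISO C). */
int mask_shift(char *buffer, int dec_exp) {
  uint16_t e_sign = dec_exp >= 0 ? ('-' << 8 | 'e') : ('+' << 8 | 'e');
  memcpy(buffer, &e_sign, 2);
  int mask = dec_exp >> 31;           /* 0 or -1 */
  dec_exp = ((dec_exp + mask) ^ mask);
  return dec_exp;
}

/* Hypothetical name.  Mask from a >= comparison. */
int mask_ge(char *buffer, int dec_exp) {
  uint16_t e_sign = dec_exp >= 0 ? ('-' << 8 | 'e') : ('+' << 8 | 'e');
  memcpy(buffer, &e_sign, 2);
  int mask = (dec_exp >= 0) - 1;      /* 0 or -1 */
  dec_exp = ((dec_exp + mask) ^ mask);
  return dec_exp;
}

/* Hypothetical name.  Mask from a negated < comparison. */
int mask_lt(char *buffer, int dec_exp) {
  uint16_t e_sign = dec_exp >= 0 ? ('-' << 8 | 'e') : ('+' << 8 | 'e');
  memcpy(buffer, &e_sign, 2);
  int mask = -(dec_exp < 0);          /* 0 or -1 */
  dec_exp = ((dec_exp + mask) ^ mask);
  return dec_exp;
}

/* Hypothetical helper: returns 1 if all variants return the same value and
   store the same two bytes as ternary_ge, 0 otherwise.
   Do not call with INT_MIN (signed overflow in every variant). */
int variants_agree(int dec_exp) {
  char ref[2], out[2];
  int expected = ternary_ge(ref, dec_exp);
  if (mask_shift(out, dec_exp) != expected || memcmp(ref, out, 2) != 0)
    return 0;
  if (mask_ge(out, dec_exp) != expected || memcmp(ref, out, 2) != 0)
    return 0;
  if (mask_lt(out, dec_exp) != expected || memcmp(ref, out, 2) != 0)
    return 0;
  return 1;
}
```

Since the four functions are semantically identical on every input other than
INT_MIN, any difference in the generated code is purely a matter of which form
the optimizer recognizes as an absolute value.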

Ideally, all cases would yield the same code with a conditional move.  This is
part of a larger code base, and switching it to the ternary speeds it up by
approximately 10%.
