https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123505
Bug ID: 123505
Summary: Bit-trick for absolute value fails to optimize
Product: gcc
Version: 15.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: tobi at gcc dot gnu.org
Target Milestone: ---
Here's another branch-free pattern that gcc optimizes poorly. Of course the
ternary works well in this instance.
https://godbolt.org/z/4vsW54o1G
All the functions in the godbolt do the same thing:
int ternary_ge(char *buffer, int dec_exp) {
    uint16_t e_sign = dec_exp >= 0 ? ('-' << 8 | 'e') : ('+' << 8 | 'e');
    memcpy(buffer, &e_sign, 2);
    dec_exp = dec_exp >= 0 ? dec_exp : -dec_exp;
    return dec_exp;
}
They store two bytes that depend on the sign of dec_exp and then return the
absolute value of dec_exp.
I tried various patterns for obtaining the absolute value, all of which build
a mask from the sign.
This optimizes as well as the ternary, yielding identical code:
int mask = dec_exp >> 31;
dec_exp = ((dec_exp + mask) ^ mask);
This optimizes worse, but is still jump-free:
int mask = (dec_exp >= 0) - 1;
dec_exp = ((dec_exp + mask) ^ mask);
And this introduces a jump:
int mask = -(dec_exp < 0);
dec_exp = ((dec_exp + mask) ^ mask);
Without the first two lines of the function (the e_sign computation and the
memcpy), the last two versions still optimize worse than the ternary and the
shift, but no jump is introduced: https://godbolt.org/z/MPPh5b5js
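For reference, a minimal standalone sketch of the four absolute-value variants without the store. The function names are mine and are not taken from the godbolt links. It assumes a 32-bit int and an arithmetic right shift for negative values (implementation-defined in C, but what gcc does), and it excludes INT_MIN, for which both -x and x + mask overflow.

```c
/* Reference: the ternary form. */
int abs_ternary(int x) {
    return x >= 0 ? x : -x;
}

/* Mask from an arithmetic shift: -1 if x is negative, 0 otherwise.
   Assumes 32-bit int and arithmetic >> on negative values. */
int abs_shift(int x) {
    int mask = x >> 31;
    return (x + mask) ^ mask;
}

/* Mask from a comparison: (x >= 0) is 1 or 0, so mask is 0 or -1. */
int abs_ge_minus_one(int x) {
    int mask = (x >= 0) - 1;
    return (x + mask) ^ mask;
}

/* Mask from a negated comparison: (x < 0) is 1 or 0, so mask is -1 or 0. */
int abs_neg_lt(int x) {
    int mask = -(x < 0);
    return (x + mask) ^ mask;
}

/* Returns 1 if all four variants agree on x (x must not be INT_MIN). */
int all_agree(int x) {
    int r = abs_ternary(x);
    return abs_shift(x) == r && abs_ge_minus_one(x) == r && abs_neg_lt(x) == r;
}
```

With mask == -1 the expression (x + mask) ^ mask is ~(x - 1), which equals -x in two's complement. With mask == 0 it is just x. So all three masks compute the same function and differ only in how the mask is formed.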
Ideally, all cases would yield the same code with a conditional move. This is
part of a larger piece of code, and switching to the ternary speeds it up by
approximately 10%.