https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122837
Bug ID: 122837
Summary: Suboptimal codegen for std::midpoint
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: ktkachov at gcc dot gnu.org
Target Milestone: ---
The following code:
#include <cstdint>
#include <numeric>
typedef uint32_t u32;
typedef uint64_t u64;
u32 midpoint_u32(u32 a, u32 b) { return std::midpoint(a, b); }
u64 midpoint_u64(u64 a, u64 b) { return std::midpoint(a, b); }
On aarch64, GCC at -O2 -std=c++23 generates:
midpoint_u32(unsigned int, unsigned int):
        cmp     w0, w1
        bhi     .L3
        mov     w2, w1
        mov     w1, w0
        sub     w1, w2, w1
        mov     w3, 1
        lsr     w1, w1, 1
        madd    w0, w1, w3, w0
        ret
.L3:
        mov     w2, w0
        mov     w3, -1
        sub     w1, w2, w1
        lsr     w1, w1, 1
        madd    w0, w1, w3, w0
        ret
midpoint_u64(unsigned long, unsigned long):
        cmp     x0, x1
        bhi     .L7
        mov     x2, x1
        mov     x1, x0
        sub     x1, x2, x1
        mov     x3, 1
        lsr     x1, x1, 1
        madd    x0, x1, x3, x0
        ret
.L7:
        mov     x2, x0
        mov     x3, -1
        sub     x1, x2, x1
        lsr     x1, x1, 1
        madd    x0, x1, x3, x0
        ret
But Clang generates the much shorter:
_Z12midpoint_u32jj:
        subs    w8, w0, w1
        cneg    w8, w8, ls
        lsr     w8, w8, #1
        cneg    w8, w8, hi
        add     w0, w8, w0
        ret
_Z12midpoint_u64mm:
        subs    x8, x0, x1
        cneg    x8, x8, ls
        lsr     x8, x8, #1
        cneg    x8, x8, hi
        add     x0, x8, x0
        ret
I haven't checked whether this needs a source-level change in libstdc++ or
whether GCC is missing an optimisation. At -O0 both compilers seem to generate
similar code, though.
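
For reference, here is a minimal sketch of the two shapes in plain C++, read off the assembly above rather than copied from libstdc++ or LLVM. The helper names midpoint_mul_sign and midpoint_cneg are hypothetical. The first multiplies the halved difference by +1 or -1, which is what the madd in the GCC output appears to do. The second follows the Clang sequence step by step (subs, cneg, lsr, cneg, add). Both should agree with std::midpoint for unsigned 32-bit inputs, assuming uint32_t is unsigned int as on aarch64.

```cpp
#include <cstdint>

// Hypothetical helper: the "multiply by +1/-1" shape suggested by GCC's madd.
constexpr std::uint32_t midpoint_mul_sign(std::uint32_t a, std::uint32_t b) {
    std::uint32_t sign = 1;
    std::uint32_t lo = a;
    std::uint32_t hi = b;
    if (a > b) {
        sign = static_cast<std::uint32_t>(-1);  // -1 modulo 2^32
        lo = b;
        hi = a;
    }
    // a + sign * (|a - b| / 2), with unsigned wrap-around arithmetic.
    return a + sign * ((hi - lo) / 2);
}

// Hypothetical helper: the conditional-negate shape of the Clang output.
constexpr std::uint32_t midpoint_cneg(std::uint32_t a, std::uint32_t b) {
    std::uint32_t diff = a - b;        // subs w8, w0, w1
    if (a <= b) diff = 0u - diff;      // cneg w8, w8, ls  -> |a - b|
    std::uint32_t half = diff >> 1;    // lsr  w8, w8, #1
    if (a > b) half = 0u - half;       // cneg w8, w8, hi
    return a + half;                   // add  w0, w8, w0
}
```

For example, both helpers return 2 for (1, 4) and 3 for (4, 1), matching std::midpoint, which rounds towards its first argument. Whether writing the library code in the second form would be enough to get the shorter sequence out of GCC is untested here.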