https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122791
Bug ID: 122791
Summary: Missed optimization with a loop that multiplies
counter by 2 until overflow
Product: gcc
Version: 15.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: other
Assignee: unassigned at gcc dot gnu.org
Reporter: Explorer09 at gmail dot com
Target Milestone: ---
(I first reported this issue against Clang
(https://github.com/llvm/llvm-project/issues/168580),
but GCC appears to miss the optimization as well.)
```c
#include <limits.h>

extern void subroutine(unsigned long x);

void func1a(void) {
    unsigned long x = 1;
    while (1) {
        __asm__ ("" : "+r"(x));
        subroutine(x);
        if (x > ULONG_MAX / 2)
            break;
        x *= 2;
    }
}

void func1b(void) {
    unsigned long x = 1;
    while (1) {
        subroutine(x);
        if (__builtin_add_overflow(x, x, &x))
            break;
    }
}

#if 0
void func1c(void) {
    unsigned long x = 1;
    while (1) {
        subroutine(x);
        if (__builtin_mul_overflow(x, 2UL, &x))
            break;
    }
}

void func1a_orig(void) {
    unsigned long x = 1;
    while (1) {
        subroutine(x);
        if (x > ULONG_MAX / 2)
            break;
        x *= 2;
    }
}
#endif
```
x86-64 GCC 15.2 with the `-Os` option produces
(https://godbolt.org/z/nxbEMcz4Y):
```assembly
func1a:
        pushq   %rbx
        movl    $1, %ebx
.L3:
        movq    %rbx, %rdi
        call    subroutine
        testq   %rbx, %rbx
        js      .L1
        addq    %rbx, %rbx
        jmp     .L3
.L1:
        popq    %rbx
        ret
func1b:
        pushq   %rbx
        movl    $1, %ebx
.L9:
        movq    %rbx, %rdi
        call    subroutine
        addq    %rbx, %rbx
        jnc     .L9
        popq    %rbx
        ret
```
GCC does convert the conditional `(x > ULONG_MAX / 2)` into a "test if the
sign bit is set" check, but it misses that x is multiplied by 2 right
afterward. The code could be smaller if it checked the carry flag after the
addition instead, as the output for func1b() does.
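To make the reasoning above concrete, here is a minimal sketch of the three per-value conditions involved. The helper names (`exceeds_half`, `top_bit_set`, `doubling_carries`, `predicates_agree`) are hypothetical and exist only for illustration, and the `TOP_BIT` macro assumes `unsigned long` has no padding bits.

```c
#include <limits.h>
#include <stdbool.h>

/* Most significant bit of unsigned long (assumes no padding bits). */
#define TOP_BIT (1UL << (sizeof(unsigned long) * CHAR_BIT - 1))

/* The condition as written in func1a(). */
bool exceeds_half(unsigned long x) {
    return x > ULONG_MAX / 2;
}

/* What the testq/js pair checks: is the top ("sign") bit set? */
bool top_bit_set(unsigned long x) {
    return (x & TOP_BIT) != 0;
}

/* What the addq/jnc pair checks: does x + x carry out of the type? */
bool doubling_carries(unsigned long x) {
    unsigned long unused;
    return __builtin_add_overflow(x, x, &unused);
}

/* Returns true when all three conditions give the same answer for x. */
bool predicates_agree(unsigned long x) {
    return exceeds_half(x) == top_bit_set(x)
        && top_bit_set(x) == doubling_carries(x);
}
```

Since the three conditions agree for every x, the carry produced by the doubling already encodes the result of the comparison, which is why the separate test looks redundant.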
func1a(), func1b() and func1c() in the example are all equivalent, so I
expect them to compile to the same code.
I tested this on both x86-64 and ARM64 targets.
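As a sanity check on the equivalence claim, here is a minimal sketch that runs the three loop shapes and compares the values each would pass to subroutine(). The `record` helper is a hypothetical stand-in for subroutine(), and the `trace_*` and `traces_match` names are introduced only for this illustration. It assumes `unsigned long` has no padding bits and relies on the GCC/Clang overflow builtins already used in the test case.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* One slot per bit of unsigned long (assumes no padding bits). */
#define MAX_STEPS (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for subroutine(): appends x to a trace buffer. */
static void record(unsigned long *trace, size_t *n, unsigned long x) {
    assert(*n < MAX_STEPS);
    trace[(*n)++] = x;
}

/* Loop shape of func1a_orig(): compare against ULONG_MAX / 2, then double. */
size_t trace_compare(unsigned long *trace) {
    size_t n = 0;
    unsigned long x = 1;
    while (1) {
        record(trace, &n, x);
        if (x > ULONG_MAX / 2)
            break;
        x *= 2;
    }
    return n;
}

/* Loop shape of func1b(): double via x + x and stop on overflow. */
size_t trace_add_overflow(unsigned long *trace) {
    size_t n = 0;
    unsigned long x = 1;
    while (1) {
        record(trace, &n, x);
        if (__builtin_add_overflow(x, x, &x))
            break;
    }
    return n;
}

/* Loop shape of func1c(): double via x * 2 and stop on overflow. */
size_t trace_mul_overflow(unsigned long *trace) {
    size_t n = 0;
    unsigned long x = 1;
    while (1) {
        record(trace, &n, x);
        if (__builtin_mul_overflow(x, 2UL, &x))
            break;
    }
    return n;
}

/* Returns 1 if all three loops record the same sequence of values. */
int traces_match(void) {
    unsigned long a[MAX_STEPS], b[MAX_STEPS], c[MAX_STEPS];
    size_t na = trace_compare(a);
    size_t nb = trace_add_overflow(b);
    size_t nc = trace_mul_overflow(c);
    return na == nb && nb == nc
        && memcmp(a, b, na * sizeof a[0]) == 0
        && memcmp(b, c, nb * sizeof b[0]) == 0;
}
```

Calling `traces_match()` from a small `main` should return 1 on any target where the assumptions hold. This only checks observable behavior and says nothing about the generated code, which is what the report is about.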