https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111011

            Bug ID: 111011
           Summary: gcc-13 incorrectly decrements by 2. It's twice as fast
                    as gcc-12 and clang!
           Product: gcc
           Version: 13.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: adam.warner.nz at gmail dot com
  Target Milestone: ---

(Please fix my guess at the correct component for this bug report)

I'm amused by a ghost in the GCC virtual machine. I'm running this code on a
Debian Linux x86-64 desktop with these software versions:

gcc-12 (Debian 12.3.0-7) 12.3.0
gcc-13 (Debian 13.2.0-2) 13.2.0
gcc (Debian 20230718-1) 14.0.0 20230718 (experimental) [master
r14-2597-g6bab2772dbc]
Debian clang version 17.0.0 (++20230128060150+75153adeda1a-1~exp1)

My CPU is locked at 2.7GHz. It should take a nice round 10 seconds to decrement
2.7x10^10 to zero if each decrement takes one clock cycle.

And indeed it used to:

$ cat countdown.c
#include <stdint.h>

int main() {
  int64_t count=27000000000;
  while (count>0) {
    __asm__ __volatile__("" : : : "memory");
    --count;
  }
  return 0;
}
$ gcc-12 -O3 countdown.c && time ./a.out 

real    0m10.029s
user    0m10.024s
sys     0m0.004s
$ clang-17 -O3 countdown.c && time ./a.out 

real    0m10.032s
user    0m10.030s
sys     0m0.000s


But now it only takes 5 seconds:
$ gcc-13 -O3 countdown.c && time ./a.out 

real    0m5.022s
user    0m5.021s
sys     0m0.001s
$ gcc-snapshot.sh -O3 countdown.c && time ./a.out 

real    0m5.023s
user    0m5.022s
sys     0m0.000s

Disassembling the machine code shows why: the loop body is a single
"sub rax,0x2", so the counter drops by 2 on every iteration:
$ gcc-13 -O3 countdown.c && objdump -d -m i386:x86-64:intel a.out
...
0000000000001040 <main>:
    1040:       48 b8 00 4e 53 49 06    movabs rax,0x649534e00
    1047:       00 00 00 
    104a:       66 0f 1f 44 00 00       nop    WORD PTR [rax+rax*1+0x0]
    1050:       48 83 e8 02             sub    rax,0x2
    1054:       75 fa                   jne    1050 <main+0x10>
    1056:       31 c0                   xor    eax,eax
    1058:       c3                      ret
    1059:       0f 1f 80 00 00 00 00    nop    DWORD PTR [rax+0x0]
...