https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87543

            Bug ID: 87543
           Summary: Missed opportunity to compute constant return value at
                    compile time
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: eyalroz at technion dot ac.il
  Target Milestone: ---

Brief illustration on GodBolt: https://godbolt.org/z/sQyNGA
A related question on StackOverflow:
https://stackoverflow.com/q/52677512/1593077


Consider the following two functions:

    #include <numeric> 

    int f1()
    {
        int arr[] = {1, 2, 3, 4, 5};
        auto n = sizeof(arr)/sizeof(arr[0]);
        return std::accumulate(arr,  arr + n, 0);
    }

    int f2()
    {
        int arr[] = {1, 2, 3, 4, 5};
        auto n = sizeof(arr)/sizeof(arr[0]);
        int sum = 0;
        for(int i = 0; i < n; i++) {
            sum += arr[i];
        }
        return sum;
    }

Both functions always return 15, and although they are not marked constexpr,
the compiler can clearly work this out. In fact it does, if we compile with
-O3 (using GCC 8.2). With -O2, however, we get the following result:

    f1():
            movabs  rax, 8589934593
            lea     rdx, [rsp-40]
            mov     ecx, 1
            mov     DWORD PTR [rsp-24], 5
            mov     QWORD PTR [rsp-40], rax
            lea     rsi, [rdx+20]
            movabs  rax, 17179869187
            mov     QWORD PTR [rsp-32], rax
            xor     eax, eax
            jmp     .L3
    .L5:
            mov     ecx, DWORD PTR [rdx]
    .L3:
            add     rdx, 4
            add     eax, ecx
            cmp     rdx, rsi
            jne     .L5
            ret
    f2():
            mov     eax, 15
            ret
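
For reference, the claim that the result is a compile-time constant can be
checked directly. The following is only a sketch, assuming C++14 or later; the
names `sum_first_five` and `f3` are hypothetical and not part of the original
test case. It uses a hand-written loop rather than `std::accumulate`, since the
latter is only constexpr from C++20 onwards.

```cpp
// Hypothetical helper (not in the original report): the same summation as
// f2(), but marked constexpr so the compiler must be able to evaluate it
// at compile time when asked to.
constexpr int sum_first_five()
{
    int arr[] = {1, 2, 3, 4, 5};
    int sum = 0;
    for (int x : arr) {
        sum += x;
    }
    return sum;
}

// Fails to compile unless the sum is a constant expression equal to 15.
static_assert(sum_first_five() == 15, "sum should be 15 at compile time");

// Hypothetical counterpart to f1()/f2() that returns the constant.
int f3()
{
    return sum_first_five();
}
```

This only shows that the value is computable at compile time; it says nothing
about which optimization passes are responsible for the differences between
-O2 and -O3 reported here.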


I don't think `std::accumulate` contains anything that should make -O2 fail
to notice the optimization opportunity in `f1()`. But even if that assertion is
debatable, surely adding -march=skylake to -O3 can only result in
stronger optimization, right? Instead, it results in _both_ functions, rather
than just `f1()`, failing to be fully optimized.


I asked about part of this issue on StackOverflow, and a reply (by Florian
Weimer) suggested this might be a regression relative to GCC 6.3. Indeed,
if we switch the GCC version to 6.3, both functions are not fully optimized at
-O2 and are fully optimized at -O3:
https://godbolt.org/z/JOqCoC

If I try GCC 7.3, things get weird in yet another way: -O2 optimizes both
functions fully, and -O3 optimizes just the _first_ one.
