https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107099

            Bug ID: 107099
           Summary: uncprop a bit
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: amonakov at gcc dot gnu.org
  Target Milestone: ---

For the following testcase

#include <immintrin.h>

__attribute__((target("avx")))
int f(__m128i a[], long n)
{
    for (long i = 0; i < n; i++)
        if (!_mm_testz_si128(a[i], a[i]))
            return 0;
    return 1;
}

gcc -O2 generates

f:
        test    rsi, rsi
        jle     .L4
        xor     eax, eax
        jmp     .L3
.L10:
        add     rax, 1
        cmp     rsi, rax
        je      .L4
.L3:
        mov     rdx, rax
        sal     rdx, 4
        vmovdqa xmm0, XMMWORD PTR [rdi+rdx]
        xor     edx, edx
        vptest  xmm0, xmm0
        sete    dl
        je      .L10
        mov     eax, edx
        ret
.L4:
        mov     edx, 1
        mov     eax, edx
        ret

Note the redundant assignments to edx inside the loop (xor edx, edx followed by
sete dl), and compare with the output of gcc -O2 -fdisable-tree-uncprop1.

Also note that uncprop generally adds a data dependency where only a control
dependency existed, which hurts speculative execution (so it seems more
appropriate for -Os than -O2).
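
To make the dependency point concrete, here is a rough source-level analogy of
the transformation, written by hand in portable C rather than taken from a GCC
dump. The function names all_zero_const and all_zero_uncprop are hypothetical,
and a plain integer compare stands in for _mm_testz_si128. The first function
returns constants, so the result depends only on which branch was taken. The
second returns the variable that is known to hold the same value on each exit
edge, which is roughly what the assembly above does with edx.

```c
#include <stddef.h>

/* Constants on every exit: the return value depends only on control flow. */
int all_zero_const(const int *a, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int is_zero = (a[i] == 0);
        if (!is_zero)
            return 0;   /* constant, no dependency on is_zero */
    }
    return 1;           /* constant */
}

/* Hand-written analogy of the uncprop'd form: on the early exit is_zero is
   known to be 0, and on the fall-through exit it is known to be 1, so the
   constants are replaced by the variable. The return value now carries a
   data dependency on the last compare. */
int all_zero_uncprop(const int *a, size_t n)
{
    int is_zero = 1;    /* corresponds to "mov edx, 1" at .L4 */
    for (size_t i = 0; i < n; i++) {
        is_zero = (a[i] == 0);  /* corresponds to "sete dl" */
        if (!is_zero)
            break;
    }
    return is_zero;     /* corresponds to "mov eax, edx" */
}
```

Both functions compute the same result. The difference is only in whether the
returned value must wait for the compare or can be materialized as soon as the
branch is predicted.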
