https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115974

            Bug ID: 115974
           Summary: sat_add vector patterns not done for aarch64
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64-linux-gnu

Take:
```
void f0(unsigned *__restrict__  a, unsigned * __restrict__ b)
{
        for(int i = 0;i < 1024;i ++)
        {
          unsigned tt;
          if (__builtin_add_overflow (a[i], b[i], &tt))
            tt = -1u;
          a[i] = tt;
        }
}
```

This should be vectorizable, as it already is on riscv with GCC and on aarch64 with clang.

LLVM's output:
```
.LBB1_1:                                // =>This Inner Loop Header: Depth=1
        ldp     q0, q3, [x10, #-16]
        subs    x8, x8, #8
        ldp     q1, q2, [x9, #-16]
        add     x10, x10, #32
        uqadd   v0.4s, v1.4s, v0.4s
        uqadd   v1.4s, v2.4s, v3.4s
        stp     q0, q1, [x9, #-16]
        add     x9, x9, #32
        b.ne    .LBB1_1
```
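For reference, here is a minimal sketch of the per-lane operation that `uqadd` performs, written as portable scalar C. The names `sat_add_u32` and `f0_branchless` and the extra `n` parameter are hypothetical and not part of the testcase above. The sketch only illustrates that the overflow-check form in `f0` is equivalent to a branchless unsigned saturating add; it makes no claim about which form GCC's pattern matcher currently recognizes.

```c
#include <limits.h>

/* Hypothetical helper (not from the report): branchless unsigned
   saturating add, i.e. the operation uqadd applies to each lane. */
static inline unsigned sat_add_u32(unsigned x, unsigned y)
{
    unsigned sum = x + y;               /* wraps modulo UINT_MAX + 1 on overflow */
    return sum | -(unsigned)(sum < x);  /* OR with an all-ones mask if it wrapped */
}

/* Same loop body as f0, expressed through the helper.
   The trip count n is an assumption added to keep the sketch testable. */
void f0_branchless(unsigned *__restrict__ a, unsigned *__restrict__ b, int n)
{
    for (int i = 0; i < n; i++)
        a[i] = sat_add_u32(a[i], b[i]);
}
```

With `n` set to 1024 this computes the same result as `f0`: each element is either the exact sum or `UINT_MAX` when the sum does not fit.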
