https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98453

            Bug ID: 98453
           Summary: aarch64: Missed opportunity for STP for vec_duplicate
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ktkachov at gcc dot gnu.org
  Target Milestone: ---

typedef long long v2di __attribute__((vector_size (16)));
typedef int v2si __attribute__((vector_size (8)));

void
foo (v2di *x, long long a)
{
  v2di tmp = {a, a};
  *x = tmp;
}

void
foo2 (v2si *x, int a)
{
  v2si tmp = {a, a};
  *x = tmp;
}

At -O2 on aarch64, GCC generates:
foo:
        dup     v0.2d, x1
        str     q0, [x0]
        ret

foo2:
        dup     v0.2s, w1
        str     d0, [x0]
        ret

Each of these could be a single STP with no DUP: stp x1, x1, [x0] for foo
and stp w1, w1, [x0] for foo2.
Combine already tries, and fails, to match:
(set (mem:V2DI (reg:DI 97) [1 *x_4(D)+0 S16 A128])
    (vec_duplicate:V2DI (reg:DI 98)))
and
(set (mem:V2SI (reg:DI 97) [2 *x_4(D)+0 S8 A64])
    (vec_duplicate:V2SI (reg:SI 98)))

So this can be fixed by adding some new patterns to aarch64-simd.md.
We should make sure to handle the other 32-bit and 64-bit modes as well.
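
As a sanity check on the equivalence the proposed patterns rely on, here is a
minimal sketch that is not part of the original report. It compares the
vec_duplicate stores against two adjacent scalar stores of the same value,
which is the memory effect an STP of one register twice would have. The names
foo_ref, foo2_ref, foo_df, foo_sf, same_bytes_di, same_bytes_si and self_check
are hypothetical, and reading "the other 32-bit and 64-bit modes" as including
float and double elements is an assumption.

```c
#include <assert.h>
#include <string.h>

typedef long long v2di __attribute__((vector_size (16)));
typedef int v2si __attribute__((vector_size (8)));
/* Assumed extra cases: floating-point elements of the same widths.  */
typedef double v2df __attribute__((vector_size (16)));
typedef float v2sf __attribute__((vector_size (8)));

/* The two functions from the report.  */
void
foo (v2di *x, long long a)
{
  v2di tmp = {a, a};
  *x = tmp;
}

void
foo2 (v2si *x, int a)
{
  v2si tmp = {a, a};
  *x = tmp;
}

/* Hypothetical floating-point variants of the same store.  */
void
foo_df (v2df *x, double a)
{
  v2df tmp = {a, a};
  *x = tmp;
}

void
foo_sf (v2sf *x, float a)
{
  v2sf tmp = {a, a};
  *x = tmp;
}

/* Hypothetical scalar references: write the same value to two adjacent
   slots.  memcpy is used to avoid any type-punning questions.  */
void
foo_ref (void *x, long long a)
{
  long long pair[2] = {a, a};
  memcpy (x, pair, sizeof pair);
}

void
foo2_ref (void *x, int a)
{
  int pair[2] = {a, a};
  memcpy (x, pair, sizeof pair);
}

/* Return 1 if the vector store and the scalar pair write identical bytes.  */
int
same_bytes_di (long long a)
{
  v2di v;
  long long s[2];
  foo (&v, a);
  foo_ref (s, a);
  return memcmp (&v, s, sizeof s) == 0;
}

int
same_bytes_si (int a)
{
  v2si v;
  int s[2];
  foo2 (&v, a);
  foo2_ref (s, a);
  return memcmp (&v, s, sizeof s) == 0;
}

/* Run all checks, including the assumed floating-point cases.  */
void
self_check (void)
{
  v2df d;
  v2sf f;
  assert (same_bytes_di (0x0123456789abcdefLL));
  assert (same_bytes_si (-7));
  foo_df (&d, 1.5);
  assert (d[0] == 1.5 && d[1] == 1.5);
  foo_sf (&f, 2.5f);
  assert (f[0] == 2.5f && f[1] == 2.5f);
}
```

This only checks the stored values on whatever host it is built for. Whether
a given compiler emits STP for any of these functions still has to be checked
by inspecting the -O2 assembly on aarch64.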
