https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105617

--- Comment #9 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #8)
> (In reply to Hongtao.liu from comment #7)
> > Hmm, we have specific code to add the scalar->vector (vmovq) cost to vector
> > construction, but it doesn't seem to work here. I guess it's because of &r0:
> > it is treated as a load, not as a scalar?
> Yes, gimple_assign_load_p is true for it:
> 
> 
> (gdb) p debug_gimple_stmt (def)
> # VUSE <.MEM_46>
> r0.0_20 = r0;
It's a load from the stack, and it is eventually eliminated in RTL dse1, but the
vectorizer doesn't know that at this point.

And SLP will not vectorize when there's an extra scalar->vector cost, as in:

typedef long long uint64_t;
void add4i(uint64_t r0, uint64_t r1, uint64_t r2, uint64_t r3, uint64_t *dst)
{

  dst[0] = r0;
  dst[1] = r1;
  dst[2] = r2;
  dst[3] = r3;
}

add4i:
        mov     QWORD PTR [r8], rdi
        mov     QWORD PTR [r8+8], rsi
        mov     QWORD PTR [r8+16], rdx
        mov     QWORD PTR [r8+24], rcx
        ret
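
For comparison, here is a rough sketch of what a vectorized form of the
stores could look like, written by hand with the GCC generic vector
extension. The names v2di and add4i_vec_sketch are made up for illustration
and do not come from the bug report; this is not what the vectorizer emits,
only a way to see where the scalar->vector construction would happen.

```c
#include <string.h>

/* Hypothetical type name: a 16-byte vector of two 64-bit integers,
   declared with the GCC generic vector extension.  */
typedef long long v2di __attribute__ ((vector_size (16)));

/* Hypothetical hand-written counterpart of add4i.  */
void
add4i_vec_sketch (long long r0, long long r1, long long r2, long long r3,
                  long long *dst)
{
  /* Each initializer builds one vector from two values that arrive in
     general-purpose registers.  This is the scalar->vector construction
     whose cost the discussion above is about.  */
  v2di lo = { r0, r1 };
  v2di hi = { r2, r3 };

  /* memcpy keeps the stores free of any alignment assumption on dst.  */
  memcpy (dst, &lo, sizeof lo);
  memcpy (dst + 2, &hi, sizeof hi);
}
```

Calling add4i_vec_sketch (1, 2, 3, 4, out) on a four-element array leaves the
same contents as add4i would. Whether this form is actually cheaper than four
scalar stores depends on the target's costs for moving GPR values into vector
registers.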
