Kyrill Tkachov wrote:

> That patch would look like the attached. Is this preferable?
> For the above example it generates the desired:
> foo_v4sf:
>       ldr     s0, [x0]
>       ldr     s1, [x1, 8]
>       ins     v0.s[1], v1.s[0]
>       ld1     {v0.s}[2], [x2]
>       ld1     {v0.s}[3], [x3]
>       ret

Yes, that's what I expect. Also, with only non-zero offsets we emit:

foo_v2di:
        ldr     d0, [x0, 8]
        ldr     d1, [x1, 16]
        ins     v0.d[1], v1.d[0]
        ret

foo_v4sf:
        ldr     s0, [x0, 4]
        ldr     s3, [x1, 20]
        ldr     s2, [x2, 32]
        ldr     s1, [x3, 80]
        ins     v0.s[1], v3.s[0]
        ins     v0.s[2], v2.s[0]
        ins     v0.s[3], v1.s[0]
        ret
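
For reference, C source along the following lines would be expected to
produce the loads shown above. This is a sketch, not the testcase from
the patch: the typedefs, parameter names and element indices are
assumptions, with the indices inferred from the byte offsets in the
listings (8 and 16 bytes for 8-byte elements; 4, 20, 32 and 80 bytes
for 4-byte elements).

```c
/* Generic vector types (GCC/Clang vector extension).  The type names
   mirror the function suffixes in the listings and are assumed.  */
typedef long long v2di __attribute__ ((vector_size (16)));
typedef float v4sf __attribute__ ((vector_size (16)));

/* Build a 2 x 64-bit vector from two scalar loads at non-zero offsets:
   a[1] is at byte offset 8, b[2] at byte offset 16.  */
v2di
foo_v2di (long long *a, long long *b)
{
  v2di res = { a[1], b[2] };
  return res;
}

/* Build a 4 x 32-bit float vector from four scalar loads at non-zero
   offsets: byte offsets 4, 20, 32 and 80 respectively.  */
v4sf
foo_v4sf (float *a, float *b, float *c, float *d)
{
  v4sf res = { a[1], b[5], c[8], d[20] };
  return res;
}
```

Since none of the addresses is a plain base register, each element is
loaded with ldr using an immediate offset and then inserted with ins,
which matches the sequences above.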

The patch looks good now: lots of patterns are removed, yet we generate better code!

Wilco
