On Fri, Apr 21, 2017 at 09:39:29AM +0100, Kyrill Tkachov wrote:
> Hi all,
>
> Consider the code:
>
> typedef long long v2di __attribute__ ((vector_size (16)));
> void
> store_laned (v2di x, long long *y)
> {
>   y[0] = x[1];
>   y[3] = x[0];
> }
>
> AArch64 GCC will generate:
>
> store_laned:
>         umov    x1, v0.d[0]
>         st1     {v0.d}[1], [x0]
>         str     x1, [x0, 24]
>         ret
>
> It moves the zero lane into a core register and does a scalar store when
> instead it could have used a scalar FP store that supports the required
> addressing mode:
>
> store_laned:
>         st1     {v0.d}[1], [x0]
>         str     d0, [x0, 24]
>         ret
>
> Combine already tries to match this pattern:
>
> Trying 10 -> 11:
> Failed to match this instruction:
> (set (mem:DI (plus:DI (reg/v/f:DI 76 [ y ])
>             (const_int 24 [0x18])) [1 MEM[(long long int *)y_4(D) + 24B]+0 S8 A64])
>     (vec_select:DI (reg/v:V2DI 75 [ x ])
>         (parallel [
>                 (const_int 0 [0])
>             ])))
>
> but we don't match it in the backend. It's not hard to add it, so this
> patch does that for all the relevant vector modes.
> With this patch we generate the second sequence above, and in SPEC2006 we
> eliminate some address computation instructions because we use the more
> expressive STR instead of ST1, or we eliminate moves to the integer
> registers because we can store the D-reg directly.
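(For context: the new define_insn in aarch64-simd.md presumably looks something like the sketch below. The iterator and attribute names — VALL_F16, <VEL>, <Vetype>, and the ENDIAN_LANE_N check that restricts the pattern to the lane occupying the low bits of the register on both endiannesses — are assumptions based on the surrounding conventions in that file, not a quote from the patch itself.)

```
(define_insn "aarch64_store_lane0<mode>"
  [(set (match_operand:<VEL> 0 "memory_operand" "=m")
        (vec_select:<VEL>
          (match_operand:VALL_F16 1 "register_operand" "w")
          (parallel [(match_operand 2 "const_int_operand" "n")])))]
  ;; Only the lane that lives in the low bits of the vector register
  ;; can be stored with a plain scalar STR.
  "TARGET_SIMD
   && ENDIAN_LANE_N (<MODE>mode, INTVAL (operands[2])) == 0"
  "str\\t%<Vetype>1, %0"
  [(set_attr "type" "neon_store1_1reg<q>")]
)
```

This shape matches the vec_select/mem RTL that combine produces in the dump quoted above, which is why no expander or splitter is needed.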
Good spot!

> Bootstrapped and tested on aarch64-none-linux-gnu.
>
> Ok for trunk?

OK.

Thanks,
James

> 2017-04-21  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>
>     * config/aarch64/aarch64-simd.md (aarch64_store_lane0<mode>):
>     New pattern.
>
> 2017-04-21  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>
>     * gcc.target/aarch64/store_lane0_str_1.c: New test.