On Tue, Jun 07, 2016 at 05:56:51PM +0100, Kyrill Tkachov wrote:
> Hi all,
>
> This is the second part of James's patch from:
> https://gcc.gnu.org/ml/gcc-patches/2013-09/msg01068.html
> separated out. It reimplements the vcopyq_lane* intrinsics in C and
> adds implementations of the other missing vcopy<q>_lane_<q> intrinsics.
>
> The differences from that patch are in the use of __aarch64_vset_lane_any and
> __aarch64_vget_lane_any rather than the typed variants of these that were
> used back in 2013 (and don't exist anymore).
>
> The testcase is also adjusted for the ABI change in GCC 5 where integer x1
> vectors are now passed and returned in SIMD registers.
>
> The vcopy_laneq_f64 test in the testcase is currently XFAILed because it
> currently doesn't generate the optimal DUP instruction but instead emits a
> UMOV to a scalar register and then an FMOV. This is a GCC 7 regression
> tracked by PR 71307 and I think unrelated to this patch.
>
> Bootstrapped and tested on aarch64-none-linux-gnu. Also tested on
> aarch64_be-none-elf.
>
> Ok for trunk?

Again, this looks OK to me, but as it is based on my code I can't approve it
within the spirit of the write access policies. Please wait for Marcus or
Richard to take a look.

Thanks,
James

>
> Thanks,
> Kyrill
>
> 2016-06-07  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>             James Greenhalgh  <james.greenha...@arm.com>
>
>     * config/aarch64/arm_neon.h (vcopyq_lane_f32, vcopyq_lane_f64,
>     vcopyq_lane_p8, vcopyq_lane_p16, vcopyq_lane_s8, vcopyq_lane_s16,
>     vcopyq_lane_s32, vcopyq_lane_s64, vcopyq_lane_u8, vcopyq_lane_u16,
>     vcopyq_lane_u32, vcopyq_lane_u64): Reimplement in C.
>     (vcopy_lane_f32, vcopy_lane_f64, vcopy_lane_p8, vcopy_lane_p16,
>     vcopy_lane_s8, vcopy_lane_s16, vcopy_lane_s32, vcopy_lane_s64,
>     vcopy_lane_u8, vcopy_lane_u16, vcopy_lane_u32, vcopy_lane_u64,
>     vcopy_laneq_f32, vcopy_laneq_f64, vcopy_laneq_p8, vcopy_laneq_p16,
>     vcopy_laneq_s8, vcopy_laneq_s16, vcopy_laneq_s32, vcopy_laneq_s64,
>     vcopy_laneq_u8, vcopy_laneq_u16, vcopy_laneq_u32, vcopy_laneq_u64,
>     vcopyq_laneq_f32, vcopyq_laneq_f64, vcopyq_laneq_p8, vcopyq_laneq_p16,
>     vcopyq_laneq_s8, vcopyq_laneq_s16, vcopyq_laneq_s32, vcopyq_laneq_s64,
>     vcopyq_laneq_u8, vcopyq_laneq_u16, vcopyq_laneq_u32, vcopyq_laneq_u64):
>     New intrinsics.
>
> 2016-06-07  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>             James Greenhalgh  <james.greenha...@arm.com>
>
>     * gcc.target/aarch64/vect_copy_lane_1.c: New test.
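
For readers unfamiliar with the intrinsics named in the ChangeLog, the
lane-copy semantics can be sketched in portable C. This is an illustrative
model only, not the arm_neon.h code from the patch: the model_* types and
functions are hypothetical names standing in for int32x2_t, int32x4_t and the
s32 variants, and plain array indexing stands in for the
__aarch64_vget_lane_any and __aarch64_vset_lane_any helpers. The asserts only
approximate the lane-range requirement; in the real intrinsics the lane
indices must be compile-time constants.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the 64-bit and 128-bit vector types
   int32x2_t and int32x4_t, so the sketch builds on any target.  */
typedef struct { int32_t lane[2]; } model_int32x2;
typedef struct { int32_t lane[4]; } model_int32x4;

/* Model of vcopy_lane_s32: 64-bit destination, 64-bit source.  */
model_int32x2
model_vcopy_lane_s32 (model_int32x2 a, int lane1, model_int32x2 b, int lane2)
{
  assert (lane1 >= 0 && lane1 < 2);
  assert (lane2 >= 0 && lane2 < 2);
  /* Read one lane of b, write it into one lane of a.  */
  a.lane[lane1] = b.lane[lane2];
  return a;
}

/* Model of vcopyq_lane_s32: 128-bit destination, 64-bit source.  */
model_int32x4
model_vcopyq_lane_s32 (model_int32x4 a, int lane1, model_int32x2 b, int lane2)
{
  assert (lane1 >= 0 && lane1 < 4);
  assert (lane2 >= 0 && lane2 < 2);
  a.lane[lane1] = b.lane[lane2];
  return a;
}

/* Model of vcopy_laneq_s32: 64-bit destination, 128-bit source.  */
model_int32x2
model_vcopy_laneq_s32 (model_int32x2 a, int lane1, model_int32x4 b, int lane2)
{
  assert (lane1 >= 0 && lane1 < 2);
  assert (lane2 >= 0 && lane2 < 4);
  a.lane[lane1] = b.lane[lane2];
  return a;
}

/* Model of vcopyq_laneq_s32: 128-bit destination, 128-bit source.  */
model_int32x4
model_vcopyq_laneq_s32 (model_int32x4 a, int lane1, model_int32x4 b, int lane2)
{
  assert (lane1 >= 0 && lane1 < 4);
  assert (lane2 >= 0 && lane2 < 4);
  a.lane[lane1] = b.lane[lane2];
  return a;
}
```

In this model the first q in a name marks a 128-bit destination and result,
and the second q marks a 128-bit source. All lanes of the result other than
lane1 are taken unchanged from a.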