Thanks for working on this!
I'd been fiddling around with a patch with some similar elements to this, but
many trials with union types, subregs, etc., all worsened the register
allocation and led to more unnecessary shuffling / moves. The only real thing I
tried which you don't do here, was to introduce a set_dreg expander to clean up
some of those macro definitions in arm_neon.h. That could easily follow in a
separate patch if desired!
So your patch looks good to me.
A couple of style nits:
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -128,7 +128,9 @@ enum aarch64_type_qualifiers
/* Polynomial types. */
qualifier_poly = 0x100,
/* Lane indices - must be in range, and flipped for bigendian. */
- qualifier_lane_index = 0x200
+ qualifier_lane_index = 0x200,
+ /* Lane indices for single lane structure loads and stores */
+ qualifier_struct_load_store_lane_index = 0x400
};
should be ...'loads and stores. */'
also the dg-error messages in the testsuite, do not need to be on the same line
as the statement generating the error, because the trailing 0 tells dg that the
position/line number doesn't matter (i.e. dg should allow the error to be
reported at any line); so these could be brought under 80 chars.
Thanks, Alan
Charles Baylis wrote:
This is another attempt at fixing this PR63870 for AArch64 (ARM is
still to come).
As before, the Q register variants are handled by moving the check for
the lane bounds into builtin expansion. The handling of lane numbers
is made consistent wrt endianess with other NEON single lane
operations - lane numbers in RTL are flipped for big-endian, and
flipped back at assembly time.
The D register variants are now handled by adding new builtins for all
the 64bit operations. These behave identically to Q register variants,
except that the permitted lane bounds are different.
In the iterators used by the relevant patterns are changed from VQ and
VALLDIF so that the correct vector sizes are used in the endian-flip
at assembly time.
Finally, a set of machine-generated test cases is added. These do need
to be in separate files, because of testsuite limitations.
Regression tested on qemu for aarch64-linux-gnu with no regressions
and all new tests pass.
OK for trunk?
gcc/ChangeLog:
<DATE> Charles Baylis <charles.bay...@linaro.org>
PR target/63870
* config/aarch64/aarch64-builtins.c (enum aarch64_type_qualifiers):
Add qualifier_struct_load_store_lane_index.
(aarch64_types_loadstruct_lane_qualifiers): Use
qualifier_struct_load_store_lane_index for lane index argument for
last argument.
(aarch64_types_storestruct_lane_qualifiers): Ditto.
(builtin_simd_arg): Add SIMD_ARG_STRUCT_LOAD_STORE_LANE_INDEX.
(aarch64_simd_expand_args): Add new argument describing mode of
builtin. Check lane bounds for arguments with
SIMD_ARG_STRUCT_LOAD_STORE_LANE_INDEX.
(aarch64_simd_expand_builtin): Emit error for incorrect lane indices
if marked with SIMD_ARG_STRUCT_LOAD_STORE_LANE_INDEX.
(aarch64_simd_expand_builtin): Handle arguments with
qualifier_struct_load_store_lane_index. Pass machine mode of builtin to
aarch64_simd_expand_args.
* config/aarch64/aarch64-simd-builtins.def: Declare ld[234]_lane and
vst[234]_lane with BUILTIN_VALLDIF.
* config/aarch64/aarch64-simd.md:
(aarch64_vec_load_lanesoi_lane<mode>): Use VALLDIF iterator. Perform
endianness reversal on lane index.
(aarch64_vec_load_lanesci_lane<mode>): Ditto.
(aarch64_vec_load_lanesxi_lane<mode>): Ditto.
(vec_store_lanesoi_lane<mode>): Use VALLDIF iterator. Fix typo
in attribute.
(vec_store_lanesci_lane<mode>): Use VALLDIF iterator.
(vec_store_lanesxi_lane<mode>): Ditto.
(aarch64_ld2_lane<mode>): Use VALLDIF iterator. Remove endianness
reversal of lane index.
(aarch64_ld3_lane<mode>): Ditto.
(aarch64_ld4_lane<mode>): Ditto.
(aarch64_st2_lane<mode>): Ditto.
(aarch64_st3_lane<mode>): Ditto.
(aarch64_st4_lane<mode>): Ditto.
* config/aarch64/arm_neon.h (__LD2_LANE_FUNC): Rename mode parameter
to qmode. Add new mode parameter. Update uses.
(__LD3_LANE_FUNC): Ditto.
(__LD4_LANE_FUNC): Ditto.
(__ST2_LANE_FUNC): Ditto.
(__ST3_LANE_FUNC): Ditto.
(__ST4_LANE_FUNC): Ditto.
<DATE> Charles Baylis <charles.bay...@linaro.org>
* gcc.target/aarch64/simd/vld2_lane_f32_indices_1.c: New test.
* gcc.target/aarch64/simd/vld2_lane_f64_indices_1.c: New test.
* gcc.target/aarch64/simd/vld2_lane_p8_indices_1.c: New test.
* gcc.target/aarch64/simd/vld2_lane_s16_indices_1.c: New test.
* gcc.target/aarch64/simd/vld2_lane_s32_indices_1.c: New test.
* gcc.target/aarch64/simd/vld2_lane_s64_indices_1.c: New test.
* gcc.target/aarch64/simd/vld2_lane_s8_indices_1.c: New test.
* gcc.target/aarch64/simd/vld2_lane_u16_indices_1.c: New test.
* gcc.target/aarch64/simd/vld2_lane_u32_indices_1.c: New test.
* gcc.target/aarch64/simd/vld2_lane_u64_indices_1.c: New test.
* gcc.target/aarch64/simd/vld2_lane_u8_indices_1.c: New test.
* gcc.target/aarch64/simd/vld2q_lane_f32_indices_1.c: New test.
* gcc.target/aarch64/simd/vld2q_lane_f64_indices_1.c: New test.
* gcc.target/aarch64/simd/vld2q_lane_p8_indices_1.c: New test.
* gcc.target/aarch64/simd/vld2q_lane_s16_indices_1.c: New test.
* gcc.target/aarch64/simd/vld2q_lane_s32_indices_1.c: New test.
* gcc.target/aarch64/simd/vld2q_lane_s64_indices_1.c: New test.
* gcc.target/aarch64/simd/vld2q_lane_s8_indices_1.c: New test.
* gcc.target/aarch64/simd/vld2q_lane_u16_indices_1.c: New test.
* gcc.target/aarch64/simd/vld2q_lane_u32_indices_1.c: New test.
* gcc.target/aarch64/simd/vld2q_lane_u64_indices_1.c: New test.
* gcc.target/aarch64/simd/vld2q_lane_u8_indices_1.c: New test.
* gcc.target/aarch64/simd/vld3_lane_f32_indices_1.c: New test.
* gcc.target/aarch64/simd/vld3_lane_f64_indices_1.c: New test.
* gcc.target/aarch64/simd/vld3_lane_p8_indices_1.c: New test.
* gcc.target/aarch64/simd/vld3_lane_s16_indices_1.c: New test.
* gcc.target/aarch64/simd/vld3_lane_s32_indices_1.c: New test.
* gcc.target/aarch64/simd/vld3_lane_s64_indices_1.c: New test.
* gcc.target/aarch64/simd/vld3_lane_s8_indices_1.c: New test.
* gcc.target/aarch64/simd/vld3_lane_u16_indices_1.c: New test.
* gcc.target/aarch64/simd/vld3_lane_u32_indices_1.c: New test.
* gcc.target/aarch64/simd/vld3_lane_u64_indices_1.c: New test.
* gcc.target/aarch64/simd/vld3_lane_u8_indices_1.c: New test.
* gcc.target/aarch64/simd/vld3q_lane_f32_indices_1.c: New test.
* gcc.target/aarch64/simd/vld3q_lane_f64_indices_1.c: New test.
* gcc.target/aarch64/simd/vld3q_lane_p8_indices_1.c: New test.
* gcc.target/aarch64/simd/vld3q_lane_s16_indices_1.c: New test.
* gcc.target/aarch64/simd/vld3q_lane_s32_indices_1.c: New test.
* gcc.target/aarch64/simd/vld3q_lane_s64_indices_1.c: New test.
* gcc.target/aarch64/simd/vld3q_lane_s8_indices_1.c: New test.
* gcc.target/aarch64/simd/vld3q_lane_u16_indices_1.c: New test.
* gcc.target/aarch64/simd/vld3q_lane_u32_indices_1.c: New test.
* gcc.target/aarch64/simd/vld3q_lane_u64_indices_1.c: New test.
* gcc.target/aarch64/simd/vld3q_lane_u8_indices_1.c: New test.
* gcc.target/aarch64/simd/vld4_lane_f32_indices_1.c: New test.
* gcc.target/aarch64/simd/vld4_lane_f64_indices_1.c: New test.
* gcc.target/aarch64/simd/vld4_lane_p8_indices_1.c: New test.
* gcc.target/aarch64/simd/vld4_lane_s16_indices_1.c: New test.
* gcc.target/aarch64/simd/vld4_lane_s32_indices_1.c: New test.
* gcc.target/aarch64/simd/vld4_lane_s64_indices_1.c: New test.
* gcc.target/aarch64/simd/vld4_lane_s8_indices_1.c: New test.
* gcc.target/aarch64/simd/vld4_lane_u16_indices_1.c: New test.
* gcc.target/aarch64/simd/vld4_lane_u32_indices_1.c: New test.
* gcc.target/aarch64/simd/vld4_lane_u64_indices_1.c: New test.
* gcc.target/aarch64/simd/vld4_lane_u8_indices_1.c: New test.
* gcc.target/aarch64/simd/vld4q_lane_f32_indices_1.c: New test.
* gcc.target/aarch64/simd/vld4q_lane_f64_indices_1.c: New test.
* gcc.target/aarch64/simd/vld4q_lane_p8_indices_1.c: New test.
* gcc.target/aarch64/simd/vld4q_lane_s16_indices_1.c: New test.
* gcc.target/aarch64/simd/vld4q_lane_s32_indices_1.c: New test.
* gcc.target/aarch64/simd/vld4q_lane_s64_indices_1.c: New test.
* gcc.target/aarch64/simd/vld4q_lane_s8_indices_1.c: New test.
* gcc.target/aarch64/simd/vld4q_lane_u16_indices_1.c: New test.
* gcc.target/aarch64/simd/vld4q_lane_u32_indices_1.c: New test.
* gcc.target/aarch64/simd/vld4q_lane_u64_indices_1.c: New test.
* gcc.target/aarch64/simd/vld4q_lane_u8_indices_1.c: New test.
* gcc.target/aarch64/simd/vst2_lane_f32_indices_1.c: New test.
* gcc.target/aarch64/simd/vst2_lane_f64_indices_1.c: New test.
* gcc.target/aarch64/simd/vst2_lane_p8_indices_1.c: New test.
* gcc.target/aarch64/simd/vst2_lane_s16_indices_1.c: New test.
* gcc.target/aarch64/simd/vst2_lane_s32_indices_1.c: New test.
* gcc.target/aarch64/simd/vst2_lane_s64_indices_1.c: New test.
* gcc.target/aarch64/simd/vst2_lane_s8_indices_1.c: New test.
* gcc.target/aarch64/simd/vst2_lane_u16_indices_1.c: New test.
* gcc.target/aarch64/simd/vst2_lane_u32_indices_1.c: New test.
* gcc.target/aarch64/simd/vst2_lane_u64_indices_1.c: New test.
* gcc.target/aarch64/simd/vst2_lane_u8_indices_1.c: New test.
* gcc.target/aarch64/simd/vst2q_lane_f32_indices_1.c: New test.
* gcc.target/aarch64/simd/vst2q_lane_f64_indices_1.c: New test.
* gcc.target/aarch64/simd/vst2q_lane_p8_indices_1.c: New test.
* gcc.target/aarch64/simd/vst2q_lane_s16_indices_1.c: New test.
* gcc.target/aarch64/simd/vst2q_lane_s32_indices_1.c: New test.
* gcc.target/aarch64/simd/vst2q_lane_s64_indices_1.c: New test.
* gcc.target/aarch64/simd/vst2q_lane_s8_indices_1.c: New test.
* gcc.target/aarch64/simd/vst2q_lane_u16_indices_1.c: New test.
* gcc.target/aarch64/simd/vst2q_lane_u32_indices_1.c: New test.
* gcc.target/aarch64/simd/vst2q_lane_u64_indices_1.c: New test.
* gcc.target/aarch64/simd/vst2q_lane_u8_indices_1.c: New test.
* gcc.target/aarch64/simd/vst3_lane_f32_indices_1.c: New test.
* gcc.target/aarch64/simd/vst3_lane_f64_indices_1.c: New test.
* gcc.target/aarch64/simd/vst3_lane_p8_indices_1.c: New test.
* gcc.target/aarch64/simd/vst3_lane_s16_indices_1.c: New test.
* gcc.target/aarch64/simd/vst3_lane_s32_indices_1.c: New test.
* gcc.target/aarch64/simd/vst3_lane_s64_indices_1.c: New test.
* gcc.target/aarch64/simd/vst3_lane_s8_indices_1.c: New test.
* gcc.target/aarch64/simd/vst3_lane_u16_indices_1.c: New test.
* gcc.target/aarch64/simd/vst3_lane_u32_indices_1.c: New test.
* gcc.target/aarch64/simd/vst3_lane_u64_indices_1.c: New test.
* gcc.target/aarch64/simd/vst3_lane_u8_indices_1.c: New test.
* gcc.target/aarch64/simd/vst3q_lane_f32_indices_1.c: New test.
* gcc.target/aarch64/simd/vst3q_lane_f64_indices_1.c: New test.
* gcc.target/aarch64/simd/vst3q_lane_p8_indices_1.c: New test.
* gcc.target/aarch64/simd/vst3q_lane_s16_indices_1.c: New test.
* gcc.target/aarch64/simd/vst3q_lane_s32_indices_1.c: New test.
* gcc.target/aarch64/simd/vst3q_lane_s64_indices_1.c: New test.
* gcc.target/aarch64/simd/vst3q_lane_s8_indices_1.c: New test.
* gcc.target/aarch64/simd/vst3q_lane_u16_indices_1.c: New test.
* gcc.target/aarch64/simd/vst3q_lane_u32_indices_1.c: New test.
* gcc.target/aarch64/simd/vst3q_lane_u64_indices_1.c: New test.
* gcc.target/aarch64/simd/vst3q_lane_u8_indices_1.c: New test.
* gcc.target/aarch64/simd/vst4_lane_f32_indices_1.c: New test.
* gcc.target/aarch64/simd/vst4_lane_f64_indices_1.c: New test.
* gcc.target/aarch64/simd/vst4_lane_p8_indices_1.c: New test.
* gcc.target/aarch64/simd/vst4_lane_s16_indices_1.c: New test.
* gcc.target/aarch64/simd/vst4_lane_s32_indices_1.c: New test.
* gcc.target/aarch64/simd/vst4_lane_s64_indices_1.c: New test.
* gcc.target/aarch64/simd/vst4_lane_s8_indices_1.c: New test.
* gcc.target/aarch64/simd/vst4_lane_u16_indices_1.c: New test.
* gcc.target/aarch64/simd/vst4_lane_u32_indices_1.c: New test.
* gcc.target/aarch64/simd/vst4_lane_u64_indices_1.c: New test.
* gcc.target/aarch64/simd/vst4_lane_u8_indices_1.c: New test.
* gcc.target/aarch64/simd/vst4q_lane_f32_indices_1.c: New test.
* gcc.target/aarch64/simd/vst4q_lane_f64_indices_1.c: New test.
* gcc.target/aarch64/simd/vst4q_lane_p8_indices_1.c: New test.
* gcc.target/aarch64/simd/vst4q_lane_s16_indices_1.c: New test.
* gcc.target/aarch64/simd/vst4q_lane_s32_indices_1.c: New test.
* gcc.target/aarch64/simd/vst4q_lane_s64_indices_1.c: New test.
* gcc.target/aarch64/simd/vst4q_lane_s8_indices_1.c: New test.
* gcc.target/aarch64/simd/vst4q_lane_u16_indices_1.c: New test.
* gcc.target/aarch64/simd/vst4q_lane_u32_indices_1.c: New test.
* gcc.target/aarch64/simd/vst4q_lane_u64_indices_1.c: New test.
* gcc.target/aarch64/simd/vst4q_lane_u8_indices_1.c: New test.