On 05/01/17 10:44, Kyrill Tkachov wrote:
> Hi Andre,
>
> On 09/11/16 10:11, Andre Vieira (lists) wrote:
>> Hi,
>>
>> This patch implements support for the ARM ACLE Coprocessor CDP
>> intrinsics. See below a table mapping the intrinsics to their respective
>> instructions:
>>
>> +----------------------------------------------------+--------------------------------------+
>> | Intrinsic signature                                | Instruction pattern                  |
>> +----------------------------------------------------+--------------------------------------+
>> |void __arm_cdp(coproc, opc1, CRd, CRn, CRm, opc2)   |CDP coproc, opc1, CRd, CRn, CRm, opc2 |
>> +----------------------------------------------------+--------------------------------------+
>> |void __arm_cdp2(coproc, opc1, CRd, CRn, CRm, opc2)  |CDP2 coproc, opc1, CRd, CRn, CRm, opc2|
>> +----------------------------------------------------+--------------------------------------+
>>
>> Note that any untyped variable in the intrinsic signature is required to
>> be a compile-time constant and has the type 'unsigned int'. We do some
>> boundary checks for coproc:[0-15], opc1:[0-15], CR*:[0-31], opc2:[0-7].
>> If any of these requirements is not met, a diagnostic is issued.
>>
>> I renamed neon_const_bounds in this patch, to arm_const_bounds, simply
>> because it is also used in the Coprocessor intrinsics. It also requires
>> the expansion of the builtin framework such that it accepts 'void'
>> modes and intrinsics with 6 arguments.
>>
>> I also changed acle.exp to run tests for multiple options, where all lto
>> option sets are appended with -ffat-lto-objects to allow for assembly scans.
>>
>> Is this OK for trunk?
>
> This is okay if bootstrap and testing is ok (as part of the whole series)
> modulo a couple of nits in the documentation below.
>
> Thanks,
> Kyrill
>
>> Regards,
>> Andre
>>
>> gcc/ChangeLog:
>> 2016-11-09  Andre Vieira  <andre.simoesdiasvie...@arm.com>
>>
>>     * config/arm/arm.md (<cdp>): New.
>>     * config/arm/arm.c (neon_const_bounds): Rename this ...
>>     (arm_const_bounds): ... this.
>>     (arm_coproc_builtin_available): New.
>>     * config/arm/arm-builtins.c (SIMD_MAX_BUILTIN_ARGS): Increase.
>>     (arm_type_qualifiers): Add 'qualifier_unsigned_immediate'.
>>     (CDP_QUALIFIERS): Define to...
>>     (arm_cdp_qualifiers): ... this. New.
>>     (void_UP): Define.
>>     (arm_expand_builtin_args): Add case for 6 arguments.
>>     * config/arm/arm-protos.h (neon_const_bounds): Rename this ...
>>     (arm_const_bounds): ... this.
>>     (arm_coproc_builtin_available): New.
>>     * config/arm/arm_acle.h (__arm_cdp): New.
>>     (__arm_cdp2): New.
>>     * config/arm/arm_acle_builtins.def (cdp): New.
>>     (cdp2): New.
>>     * config/arm/iterators.md (CDPI,CDP,cdp): New.
>>     * config/arm/neon.md: Rename all 'neon_const_bounds' to
>>     'arm_const_bounds'.
>>     * config/arm/types.md (coproc): New.
>>     * config/arm/unspecs.md (VUNSPEC_CDP, VUNSPEC_CDP2): New.
>>     * gcc/doc/extend.texi (ACLE): Add a mention of Coprocessor intrinsics.
>>
>> gcc/testsuite/ChangeLog:
>> 2016-11-09  Andre Vieira  <andre.simoesdiasvie...@arm.com>
>>
>>     * gcc.target/arm/acle/acle.exp: Run tests for different options
>>     and make sure fat-lto-objects is used such that we can still do
>>     assemble scans.
>>     * gcc.target/arm/acle/cdp.c: New.
>>     * gcc.target/arm/acle/cdp2.c: New.
>>     * lib/target-supports.exp (check_effective_target_arm_coproc1_ok):
>>     New.
>>     (check_effective_target_arm_coproc1_ok_nocache): New.
>>     (check_effective_target_arm_coproc2_ok): New.
>>     (check_effective_target_arm_coproc2_ok_nocache): New.
>>     (check_effective_target_arm_coproc3_ok): New.
>>     (check_effective_target_arm_coproc3_ok_nocache): New.
>
> --- a/gcc/doc/sourcebuild.texi
> +++ b/gcc/doc/sourcebuild.texi
> @@ -1675,6 +1675,21 @@ and @code{MOVT} instructions available.
>  ARM target generates Thumb-1 code for @code{-mthumb} with
>  @code{CBZ} and @code{CBNZ} instructions available.
>
> +@item arm_coproc1_ok
> +@anchor{arm_coproc1_ok}
> +ARM target supports the following coprocessor instruction: @code{CDP},
> +@code{LDC}, @code{STC}, @code{MCR} and @code{MRC}.
>
> s/instruction/instructions/
>
> +@item arm_coproc2_ok
> +@anchor{arm_coproc2_ok}
> +ARM target supports the all the coprocessor instructions also listed as
> +supported in @ref{arm_coproc1_ok} and the following: @code{CDP2}, @code{LDC2},
> +@code{LDC2l}, @code{STC2}, @code{STC2l}, @code{MCR2} and @code{MRC2}.
> +
>
> s/the all the/all the/.
> Also, I'd prefer to say "in addition to the following" rather than "and
> the following"
>
> +@item arm_coproc3_ok
> +ARM target supports the all the coprocessor instructions also listed as
> +supported in @ref{arm_coproc2_ok} and the following: @code{MCRR}, @code{MCRR2},
> +@code{MRRC}, and @code{MRRC2}.
>
> Likewise.
>

Hi,
I reworked this patch after the review comments and rebased it. While doing
so I noticed that I had grouped MCRR/MRRC and MCRR2/MRRC2 together: the first
two are supported in ARMv5TE, but the latter two only in ARMv6 and onwards.
I fixed the testsuite checks in this patch and the code generation in the
later patch of the series.

I ran the patch series through a bootstrap and a full regression test on
arm-none-linux-gnueabihf.

Is this OK for trunk?

Regards,
Andre

gcc/ChangeLog:
2017-01-xx  Andre Vieira  <andre.simoesdiasvie...@arm.com>

    * config/arm/arm.md (<cdp>): New.
    * config/arm/arm.c (neon_const_bounds): Rename this ...
    (arm_const_bounds): ... this.
    (arm_coproc_builtin_available): New.
    * config/arm/arm-builtins.c (SIMD_MAX_BUILTIN_ARGS): Increase.
    (arm_type_qualifiers): Add 'qualifier_unsigned_immediate'.
    (CDP_QUALIFIERS): Define to...
    (arm_cdp_qualifiers): ... this. New.
    (void_UP): Define.
    (arm_expand_builtin_args): Add case for 6 arguments.
    * config/arm/arm-protos.h (neon_const_bounds): Rename this ...
    (arm_const_bounds): ... this.
    (arm_coproc_builtin_available): New.
    * config/arm/arm_acle.h (__arm_cdp): New.
    (__arm_cdp2): New.
    * config/arm/arm_acle_builtins.def (cdp): New.
    (cdp2): New.
    * config/arm/iterators.md (CDPI,CDP,cdp): New.
    * config/arm/neon.md: Rename all 'neon_const_bounds' to
    'arm_const_bounds'.
    * config/arm/types.md (coproc): New.
    * config/arm/unspecs.md (VUNSPEC_CDP, VUNSPEC_CDP2): New.
    * gcc/doc/extend.texi (ACLE): Add a mention of Coprocessor intrinsics.

gcc/testsuite/ChangeLog:
2017-01-xx  Andre Vieira  <andre.simoesdiasvie...@arm.com>

    * gcc.target/arm/acle/acle.exp: Run tests for different options
    and make sure fat-lto-objects is used such that we can still do
    assemble scans.
    * gcc.target/arm/acle/cdp.c: New.
    * gcc.target/arm/acle/cdp2.c: New.
    * lib/target-supports.exp (check_effective_target_arm_coproc1_ok):
    New.
    (check_effective_target_arm_coproc1_ok_nocache): New.
    (check_effective_target_arm_coproc2_ok): New.
    (check_effective_target_arm_coproc2_ok_nocache): New.
    (check_effective_target_arm_coproc3_ok): New.
    (check_effective_target_arm_coproc3_ok_nocache): New.
    (check_effective_target_arm_coproc4_ok): New.
    (check_effective_target_arm_coproc4_ok_nocache): New.
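For readers following the thread, the operand ranges quoted above (coproc and opc1 in [0-15], CR* in [0-31], opc2 in [0-7]) and the assembly string that cdp.c and cdp2.c scan for can be modelled on the host roughly as follows. This is only a sketch under those stated ranges: `cdp_operands_ok` and `format_cdp` are hypothetical helper names made up for illustration and are not part of the patch or of GCC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical host-side model, not part of the patch.  The ranges are the
   ones stated in the patch description: coproc and opc1 in [0, 15],
   CRd/CRn/CRm in [0, 31], opc2 in [0, 7].  */
static bool
cdp_operands_ok (unsigned coproc, unsigned opc1, unsigned crd,
                 unsigned crn, unsigned crm, unsigned opc2)
{
  return coproc < 16 && opc1 < 16
         && crd < 32 && crn < 32 && crm < 32
         && opc2 < 8;
}

/* Format the operands in the shape the cdp.c/cdp2.c tests scan for.
   Returns the number of characters written, or -1 if an operand is out
   of range or the buffer is too small.  */
static int
format_cdp (char *buf, size_t len, bool cdp2, unsigned coproc, unsigned opc1,
            unsigned crd, unsigned crn, unsigned crm, unsigned opc2)
{
  assert (buf != NULL);
  if (!cdp_operands_ok (coproc, opc1, crd, crn, crm, opc2))
    return -1;
  int n = snprintf (buf, len, "%s\tp%u, #%u, CR%u, CR%u, CR%u, #%u",
                    cdp2 ? "cdp2" : "cdp", coproc, opc1, crd, crn, crm, opc2);
  return (n < 0 || (size_t) n >= len) ? -1 : n;
}
```

In the compiler itself the equivalent check is done by arm_const_bounds, which issues a diagnostic rather than returning an error code; the model above only mirrors the accepted ranges.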
diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index febbec9fca079ac03b93edec970ebc537e25309b..2bb9e22bb8cf7ae2d8a5698e970af4845016d93c 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -39,7 +39,7 @@
 #include "case-cfn-macros.h"
 #include "sbitmap.h"

-#define SIMD_MAX_BUILTIN_ARGS 5
+#define SIMD_MAX_BUILTIN_ARGS 7

 enum arm_type_qualifiers
 {
@@ -54,6 +54,7 @@ enum arm_type_qualifiers
   /* Used when expanding arguments if an operand could
      be an immediate. */
   qualifier_immediate = 0x8, /* 1 << 3 */
+  qualifier_unsigned_immediate = 0x9,
   qualifier_maybe_immediate = 0x10, /* 1 << 4 */
   /* void foo (...). */
   qualifier_void = 0x20, /* 1 << 5 */
@@ -165,6 +166,18 @@ arm_unsigned_binop_qualifiers[SIMD_MAX_BUILTIN_ARGS]
       qualifier_unsigned };
 #define UBINOP_QUALIFIERS (arm_unsigned_binop_qualifiers)

+/* void (unsigned immediate, unsigned immediate, unsigned immediate,
+         unsigned immediate, unsigned immediate, unsigned immediate). */
+static enum arm_type_qualifiers
+arm_cdp_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_void, qualifier_unsigned_immediate,
+      qualifier_unsigned_immediate,
+      qualifier_unsigned_immediate,
+      qualifier_unsigned_immediate,
+      qualifier_unsigned_immediate,
+      qualifier_unsigned_immediate };
+#define CDP_QUALIFIERS \
+  (arm_cdp_qualifiers)
 /* The first argument (return type) of a store should be void type,
    which we represent with qualifier_void. Their first operand will be
    a DImode pointer to the location to store to, so we must use
@@ -201,6 +214,7 @@ arm_storestruct_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS]
 #define oi_UP OImode
 #define hf_UP HFmode
 #define si_UP SImode
+#define void_UP VOIDmode

 #define UP(X) X##_UP

@@ -2226,6 +2240,10 @@ constant_arg:
         pat = GEN_FCN (icode) (target, op[0], op[1], op[2], op[3], op[4]);
         break;

+      case 6:
+        pat = GEN_FCN (icode) (target, op[0], op[1], op[2], op[3], op[4], op[5]);
+        break;
+
       default:
         gcc_unreachable ();
       }
@@ -2252,6 +2270,10 @@ constant_arg:
         pat = GEN_FCN (icode) (op[0], op[1], op[2], op[3], op[4]);
         break;

+      case 6:
+        pat = GEN_FCN (icode) (op[0], op[1], op[2], op[3], op[4], op[5]);
+        break;
+
       default:
         gcc_unreachable ();
       }
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 9a8166bf4316c82722b3a299c8a1fcda878561a3..4d6a3ed3d47952728c3c4c1a8bd5ec0b9274bb16 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -96,7 +96,7 @@ extern rtx neon_make_constant (rtx);
 extern tree arm_builtin_vectorized_function (unsigned int, tree, tree);
 extern void neon_expand_vector_init (rtx, rtx);
 extern void neon_lane_bounds (rtx, HOST_WIDE_INT, HOST_WIDE_INT, const_tree);
-extern void neon_const_bounds (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
+extern void arm_const_bounds (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
 extern HOST_WIDE_INT neon_element_bits (machine_mode);
 extern void neon_emit_pair_result_insn (machine_mode,
                                         rtx (*) (rtx, rtx, rtx, rtx),
@@ -176,6 +176,7 @@ extern void arm_expand_compare_and_swap (rtx op[]);
 extern void arm_split_compare_and_swap (rtx op[]);
 extern void arm_split_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx, rtx);
 extern rtx arm_load_tp (rtx);
+extern bool arm_coproc_builtin_available (enum unspecv);

 #if defined TREE_CODE
 extern void arm_init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree);
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 81a1b85812860739c8b414e467476ab3c26cecd5..64599981961d80c5493a88f30743b98a138ca932 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -12206,7 +12206,7 @@ neon_lane_bounds (rtx operand, HOST_WIDE_INT low, HOST_WIDE_INT high,
 /* Bounds-check constants. */

 void
-neon_const_bounds (rtx operand, HOST_WIDE_INT low, HOST_WIDE_INT high)
+arm_const_bounds (rtx operand, HOST_WIDE_INT low, HOST_WIDE_INT high)
 {
   bounds_check (operand, low, high, NULL_TREE, "constant");
 }
@@ -30888,4 +30888,34 @@ arm_expand_divmod_libfunc (rtx libfunc, machine_mode mode,
   *rem_p = remainder;
 }

+/* This function checks for the availability of the coprocessor builtin passed
+   in BUILTIN for the current target. Returns true if it is available and
+   false otherwise. If a BUILTIN is passed for which this function has not
+   been implemented it will cause an exception. */
+
+bool
+arm_coproc_builtin_available (enum unspecv builtin)
+{
+  /* None of these builtins are available in Thumb mode if the target only
+     supports Thumb-1. */
+  if (TARGET_THUMB1)
+    return false;
+
+  switch (builtin)
+    {
+      case VUNSPEC_CDP:
+        if (arm_arch4)
+          return true;
+        break;
+      case VUNSPEC_CDP2:
+        /* Only present in ARMv5*, ARMv6 (but not ARMv6-M), ARMv7* and
+           ARMv8-{A,M}. */
+        if (arm_arch5)
+          return true;
+        break;
+      default:
+        gcc_unreachable ();
+    }
+  return false;
+}
 #include "gt-arm.h"
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 7a0ac7f8476cddb51cd93716af85f9cb25ef7090..b5325013c2179c06e0079f35a5c5bd0ae9388d4c 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -11919,6 +11919,26 @@
   DONE;
 })

+(define_insn "<cdp>"
+  [(unspec_volatile [(match_operand:SI 0 "immediate_operand" "n")
+                     (match_operand:SI 1 "immediate_operand" "n")
+                     (match_operand:SI 2 "immediate_operand" "n")
+                     (match_operand:SI 3 "immediate_operand" "n")
+                     (match_operand:SI 4 "immediate_operand" "n")
+                     (match_operand:SI 5 "immediate_operand" "n")] CDPI)]
+  "arm_coproc_builtin_available (VUNSPEC_<CDP>)"
+{
+  arm_const_bounds (operands[0], 0, 16);
+  arm_const_bounds (operands[1], 0, 16);
+  arm_const_bounds (operands[2], 0, (1 << 5));
+  arm_const_bounds (operands[3], 0, (1 << 5));
+  arm_const_bounds (operands[4], 0, (1 << 5));
+  arm_const_bounds (operands[5], 0, 8);
+  return "<cdp>\\tp%c0, %1, CR%c2, CR%c3, CR%c4, %5";
+}
+  [(set_attr "length" "4")
+   (set_attr "type" "coproc")])
+
 ;; Vector bits common to IWMMXT and Neon
 (include "vec-common.md")
 ;; Load the Intel Wireless Multimedia Extension patterns
diff --git a/gcc/config/arm/arm_acle.h b/gcc/config/arm/arm_acle.h
index 03cd197b6c4c56c419072d52c46e85a5f2eb98ba..08add2b7ac79f487dea92477d39b9db886a3f027 100644
--- a/gcc/config/arm/arm_acle.h
+++ b/gcc/config/arm/arm_acle.h
@@ -32,6 +32,26 @@
 extern "C" {
 #endif

+#if (!__thumb__ || __thumb2__) && __ARM_ARCH >= 4
+__extension__ static __inline void __attribute__ ((__always_inline__))
+__arm_cdp (const unsigned int __coproc, const unsigned int __opc1,
+           const unsigned int __CRd, const unsigned int __CRn,
+           const unsigned int __CRm, const unsigned int __opc2)
+{
+  return __builtin_arm_cdp (__coproc, __opc1, __CRd, __CRn, __CRm, __opc2);
+}
+
+#if __ARM_ARCH >= 5
+__extension__ static __inline void __attribute__ ((__always_inline__))
+__arm_cdp2 (const unsigned int __coproc, const unsigned int __opc1,
+            const unsigned int __CRd, const unsigned int __CRn,
+            const unsigned int __CRm, const unsigned int __opc2)
+{
+  return __builtin_arm_cdp2 (__coproc, __opc1, __CRd, __CRn, __CRm, __opc2);
+}
+#endif /* __ARM_ARCH >= 5. */
+#endif /* (!__thumb__ || __thumb2__) && __ARM_ARCH >= 4. */
+
 #ifdef __ARM_FEATURE_CRC32
 __extension__ static __inline uint32_t __attribute__ ((__always_inline__))
 __crc32b (uint32_t __a, uint8_t __b)
diff --git a/gcc/config/arm/arm_acle_builtins.def b/gcc/config/arm/arm_acle_builtins.def
index 81ab7720971ba042a5d64c22b6bd19710147e602..03b5bf88ef2632bceedba1e64c0f83bc50337364 100644
--- a/gcc/config/arm/arm_acle_builtins.def
+++ b/gcc/config/arm/arm_acle_builtins.def
@@ -24,3 +24,5 @@ VAR1 (UBINOP, crc32w, si)
 VAR1 (UBINOP, crc32cb, si)
 VAR1 (UBINOP, crc32ch, si)
 VAR1 (UBINOP, crc32cw, si)
+VAR1 (CDP, cdp, void)
+VAR1 (CDP, cdp2, void)
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 4f04f1cc0f45205d75ba3200607e074d0e1a96bb..86d6aa70e5766bc42a4209f14e929942ee63b773 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -943,3 +943,8 @@
 ;; Attributes for VFMA_LANE/ VFMS_LANE
 (define_int_attr neon_vfm_lane_as
  [(UNSPEC_VFMA_LANE "a") (UNSPEC_VFMS_LANE "s")])
+
+;; An iterator for the CDP coprocessor instructions
+(define_int_iterator CDPI [VUNSPEC_CDP VUNSPEC_CDP2])
+(define_int_attr cdp [(VUNSPEC_CDP "cdp") (VUNSPEC_CDP2 "cdp2")])
+(define_int_attr CDP [(VUNSPEC_CDP "CDP") (VUNSPEC_CDP2 "CDP2")])
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 40f3a32befef869a0899bf47aa33c25486a8d178..cf281df0292d0f511d7d63e828886d860a3a8201 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -3654,7 +3654,7 @@ if (BYTES_BIG_ENDIAN)
                           VCVT_US_N))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[2], 1, 33);
+  arm_const_bounds (operands[2], 1, 33);
   return "vcvt.<sup>%#32.f32\t%<V_reg>0, %<V_reg>1, %2";
 }
   [(set_attr "type" "neon_fp_to_int_<V_elem_ch><q>")]
@@ -3668,7 +3668,7 @@ if (BYTES_BIG_ENDIAN)
                           VCVT_US_N))]
   "TARGET_NEON_FP16INST"
 {
-  neon_const_bounds (operands[2], 0, 17);
+  arm_const_bounds (operands[2], 0, 17);
   return "vcvt.<sup>%#16.f16\t%<V_reg>0, %<V_reg>1, %2";
 }
   [(set_attr "type" "neon_fp_to_int_<VH_elem_ch><q>")]
@@ -3681,7 +3681,7 @@ if (BYTES_BIG_ENDIAN)
                           VCVT_US_N))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[2], 1, 33);
+  arm_const_bounds (operands[2], 1, 33);
   return "vcvt.f32.<sup>%#32\t%<V_reg>0, %<V_reg>1, %2";
 }
   [(set_attr "type" "neon_int_to_fp_<V_elem_ch><q>")]
@@ -3695,7 +3695,7 @@ if (BYTES_BIG_ENDIAN)
                           VCVT_US_N))]
   "TARGET_NEON_FP16INST"
 {
-  neon_const_bounds (operands[2], 0, 17);
+  arm_const_bounds (operands[2], 0, 17);
   return "vcvt.f16.<sup>%#16\t%<V_reg>0, %<V_reg>1, %2";
 }
   [(set_attr "type" "neon_int_to_fp_<VH_elem_ch><q>")]
@@ -4300,7 +4300,7 @@ if (BYTES_BIG_ENDIAN)
                           UNSPEC_VEXT))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[3], 0, GET_MODE_NUNITS (<MODE>mode));
+  arm_const_bounds (operands[3], 0, GET_MODE_NUNITS (<MODE>mode));
   return "vext.<V_sz_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2, %3";
 }
   [(set_attr "type" "neon_ext<q>")]
@@ -4397,7 +4397,7 @@ if (BYTES_BIG_ENDIAN)
                           VSHR_N))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[2], 1, neon_element_bits (<MODE>mode) + 1);
+  arm_const_bounds (operands[2], 1, neon_element_bits (<MODE>mode) + 1);
   return "v<shift_op>.<sup>%#<V_sz_elem>\t%<V_reg>0, %<V_reg>1, %2";
 }
   [(set_attr "type" "neon_shift_imm<q>")]
@@ -4411,7 +4411,7 @@ if (BYTES_BIG_ENDIAN)
                           VSHRN_N))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[2], 1, neon_element_bits (<MODE>mode) / 2 + 1);
+  arm_const_bounds (operands[2], 1, neon_element_bits (<MODE>mode) / 2 + 1);
   return "v<shift_op>.<V_if_elem>\t%P0, %q1, %2";
 }
   [(set_attr "type" "neon_shift_imm_narrow_q")]
@@ -4425,7 +4425,7 @@ if (BYTES_BIG_ENDIAN)
                           VQSHRN_N))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[2], 1, neon_element_bits (<MODE>mode) / 2 + 1);
+  arm_const_bounds (operands[2], 1, neon_element_bits (<MODE>mode) / 2 + 1);
   return "v<shift_op>.<sup>%#<V_sz_elem>\t%P0, %q1, %2";
 }
   [(set_attr "type" "neon_sat_shift_imm_narrow_q")]
@@ -4439,7 +4439,7 @@ if (BYTES_BIG_ENDIAN)
                           VQSHRUN_N))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[2], 1, neon_element_bits (<MODE>mode) / 2 + 1);
+  arm_const_bounds (operands[2], 1, neon_element_bits (<MODE>mode) / 2 + 1);
   return "v<shift_op>.<V_s_elem>\t%P0, %q1, %2";
 }
   [(set_attr "type" "neon_sat_shift_imm_narrow_q")]
@@ -4452,7 +4452,7 @@ if (BYTES_BIG_ENDIAN)
                           UNSPEC_VSHL_N))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[2], 0, neon_element_bits (<MODE>mode));
+  arm_const_bounds (operands[2], 0, neon_element_bits (<MODE>mode));
   return "vshl.<V_if_elem>\t%<V_reg>0, %<V_reg>1, %2";
 }
   [(set_attr "type" "neon_shift_imm<q>")]
@@ -4465,7 +4465,7 @@ if (BYTES_BIG_ENDIAN)
                           VQSHL_N))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[2], 0, neon_element_bits (<MODE>mode));
+  arm_const_bounds (operands[2], 0, neon_element_bits (<MODE>mode));
   return "vqshl.<sup>%#<V_sz_elem>\t%<V_reg>0, %<V_reg>1, %2";
 }
   [(set_attr "type" "neon_sat_shift_imm<q>")]
@@ -4478,7 +4478,7 @@ if (BYTES_BIG_ENDIAN)
                           UNSPEC_VQSHLU_N))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[2], 0, neon_element_bits (<MODE>mode));
+  arm_const_bounds (operands[2], 0, neon_element_bits (<MODE>mode));
   return "vqshlu.<V_s_elem>\t%<V_reg>0, %<V_reg>1, %2";
 }
   [(set_attr "type" "neon_sat_shift_imm<q>")]
@@ -4492,7 +4492,7 @@ if (BYTES_BIG_ENDIAN)
   "TARGET_NEON"
 {
   /* The boundaries are: 0 < imm <= size. */
-  neon_const_bounds (operands[2], 0, neon_element_bits (<MODE>mode) + 1);
+  arm_const_bounds (operands[2], 0, neon_element_bits (<MODE>mode) + 1);
   return "vshll.<sup>%#<V_sz_elem>\t%q0, %P1, %2";
 }
   [(set_attr "type" "neon_shift_imm_long")]
@@ -4507,7 +4507,7 @@ if (BYTES_BIG_ENDIAN)
                           VSRA_N))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[3], 1, neon_element_bits (<MODE>mode) + 1);
+  arm_const_bounds (operands[3], 1, neon_element_bits (<MODE>mode) + 1);
   return "v<shift_op>.<sup>%#<V_sz_elem>\t%<V_reg>0, %<V_reg>2, %3";
 }
   [(set_attr "type" "neon_shift_acc<q>")]
@@ -4521,7 +4521,7 @@ if (BYTES_BIG_ENDIAN)
                           UNSPEC_VSRI))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[3], 1, neon_element_bits (<MODE>mode) + 1);
+  arm_const_bounds (operands[3], 1, neon_element_bits (<MODE>mode) + 1);
   return "vsri.<V_sz_elem>\t%<V_reg>0, %<V_reg>2, %3";
 }
   [(set_attr "type" "neon_shift_reg<q>")]
@@ -4535,7 +4535,7 @@ if (BYTES_BIG_ENDIAN)
                           UNSPEC_VSLI))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[3], 0, neon_element_bits (<MODE>mode));
+  arm_const_bounds (operands[3], 0, neon_element_bits (<MODE>mode));
   return "vsli.<V_sz_elem>\t%<V_reg>0, %<V_reg>2, %3";
 }
   [(set_attr "type" "neon_shift_reg<q>")]
diff --git a/gcc/config/arm/types.md b/gcc/config/arm/types.md
index 3de138ca3c8960d4bf93e4db4d27413099bc4f72..b0b375c6ddfbe69fff9abc3bdb6bcd592dd341f2 100644
--- a/gcc/config/arm/types.md
+++ b/gcc/config/arm/types.md
@@ -539,6 +539,10 @@
 ; crypto_sha1_slow
 ; crypto_sha256_fast
 ; crypto_sha256_slow
+;
+; The classification below is for coprocessor instructions
+;
+; coproc

 (define_attr "type"
  "adc_imm,\
@@ -1073,7 +1077,8 @@
   crypto_sha1_fast,\
   crypto_sha1_slow,\
   crypto_sha256_fast,\
-  crypto_sha256_slow"
+  crypto_sha256_slow,\
+  coproc"
    (const_string "untyped"))

 ; Is this an (integer side) multiply with a 32-bit (or smaller) result?
diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
index 719ea08c0c44d71b5f4ee6c7ac40e118d7dae60f..01dd700a0af8043ce40ada939f9b0c34d846eded 100644
--- a/gcc/config/arm/unspecs.md
+++ b/gcc/config/arm/unspecs.md
@@ -150,6 +150,8 @@
   VUNSPEC_GET_FPSCR         ; Represent fetch of FPSCR content.
   VUNSPEC_SET_FPSCR         ; Represent assign of FPSCR content.
   VUNSPEC_PROBE_STACK_RANGE ; Represent stack range probing.
+  VUNSPEC_CDP               ; Represent the coprocessor cdp instruction.
+  VUNSPEC_CDP2              ; Represent the coprocessor cdp2 instruction.
 ])

 ;; Enumerators for NEON unspecs.
diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
index 29f62e51097d634357997bc9245bbb9952affc92..befdea9edd90f1b8c6cb81cb833c07bd2454fa80 100644
--- a/gcc/config/arm/vfp.md
+++ b/gcc/config/arm/vfp.md
@@ -1886,7 +1886,7 @@
    (float_truncate:HF (float:SF (match_dup 0))))]
   "TARGET_VFP_FP16INST"
 {
-  neon_const_bounds (operands[2], 1, 33);
+  arm_const_bounds (operands[2], 1, 33);
   return "vcvt.f16.<sup>32\t%0, %0, %2\;vmov.f32\t%3, %0";
 }
   [(set_attr "conds" "unconditional")
@@ -1903,7 +1903,7 @@
 {
   rtx op1 = gen_reg_rtx (SImode);

-  neon_const_bounds (operands[2], 1, 33);
+  arm_const_bounds (operands[2], 1, 33);

   emit_move_insn (op1, operands[1]);
   emit_insn (gen_neon_vcvth<sup>_nhf_unspec (op1, op1, operands[2],
@@ -1927,7 +1927,7 @@
                     VCVT_SI_US_N))]
   "TARGET_VFP_FP16INST"
 {
-  neon_const_bounds (operands[2], 1, 33);
+  arm_const_bounds (operands[2], 1, 33);
   return "vmov.f32\t%0, %1\;vcvt.<sup>%#32.f16\t%0, %0, %2";
 }
   [(set_attr "conds" "unconditional")
@@ -1945,7 +1945,7 @@
 {
   rtx op1 = gen_reg_rtx (SImode);

-  neon_const_bounds (operands[2], 1, 33);
+  arm_const_bounds (operands[2], 1, 33);
   emit_insn (gen_neon_vcvth<sup>_nsi_unspec (op1, operands[1], operands[2]));
   emit_move_insn (operands[0], op1);
   DONE;
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 30bdcf07ad8b1bd086bbb87acb36fb0333944087..e85da3a03130f14a91c4cc5931fe275b11509939 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -12625,8 +12625,9 @@ The built-in intrinsics for the Advanced SIMD extension are available when
 NEON is enabled.

 Currently, ARM and AArch64 back ends do not support ACLE 2.0 fully. Both
-back ends support CRC32 intrinsics from @file{arm_acle.h}. The ARM back end's
-16-bit floating-point Advanced SIMD intrinsics currently comply to ACLE v1.1.
+back ends support CRC32 intrinsics and the ARM back end supports the
+Coprocessor intrinsics, all from @file{arm_acle.h}. The ARM back end's 16-bit
+floating-point Advanced SIMD intrinsics currently comply to ACLE v1.1.
 AArch64's back end does not have support for 16-bit floating point Advanced SIMD
 intrinsics yet.

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 204518d38daa3e0545f150bb3c4a8a1caee9330a..292a3c7e0a4d29650510db0685cb8d09411d3f7c 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1678,6 +1678,25 @@ div instruction.
 ARM target supports ARMv8-M Security Extensions, enabled by the @code{-mcmse}
 option.

+@item arm_coproc1_ok
+@anchor{arm_coproc1_ok}
+ARM target supports the following coprocessor instructions: @code{CDP},
+@code{LDC}, @code{STC}, @code{MCR} and @code{MRC}.
+
+@item arm_coproc2_ok
+@anchor{arm_coproc2_ok}
+ARM target supports all the coprocessor instructions also listed as supported
+in @ref{arm_coproc1_ok} in addition to the following: @code{CDP2}, @code{LDC2},
+@code{LDC2l}, @code{STC2}, @code{STC2l}, @code{MCR2} and @code{MRC2}.
+
+@item arm_coproc3_ok
+@anchor{arm_coproc3_ok}
+ARM target supports all the coprocessor instructions also listed as supported
+in @ref{arm_coproc2_ok} in addition to the following: @code{MCRR} and @code{MRRC}.
+
+@item arm_coproc4_ok
+ARM target supports all the coprocessor instructions also listed as supported
+in @ref{arm_coproc3_ok} in addition to the following: @code{MCRR2} and @code{MRRC2}.
 @end table

 @subsubsection AArch64-specific attributes
diff --git a/gcc/testsuite/gcc.target/arm/acle/acle.exp b/gcc/testsuite/gcc.target/arm/acle/acle.exp
index c05080ebf1953b3443823a6665ccd0a1a09edb3a..aebf71cfbae594d951960c9ebfd3608003f7df78 100644
--- a/gcc/testsuite/gcc.target/arm/acle/acle.exp
+++ b/gcc/testsuite/gcc.target/arm/acle/acle.exp
@@ -27,9 +27,26 @@ load_lib gcc-dg.exp
 # Initialize `dg'.
 dg-init

+set saved-dg-do-what-default ${dg-do-what-default}
+set dg-do-what-default "assemble"
+
+set saved-lto_torture_options ${LTO_TORTURE_OPTIONS}
+
+# Add -ffat-lto-objects option to all LTO options such that we can do assembly
+# scans.
+proc add_fat_objects { list } {
+    set res {}
+    foreach el $list {set res [lappend res [concat $el " -ffat-lto-objects"]]}
+    return $res
+};
+set LTO_TORTURE_OPTIONS [add_fat_objects ${LTO_TORTURE_OPTIONS}]
+
 # Main loop.
-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cCS\]]] \
+gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cCS\]]] \
         "" ""

+# Restore globals
+set dg-do-what-default ${saved-dg-do-what-default}
+set LTO_TORTURE_OPTIONS ${saved-lto_torture_options}
 # All done.
 dg-finish
diff --git a/gcc/testsuite/gcc.target/arm/acle/cdp.c b/gcc/testsuite/gcc.target/arm/acle/cdp.c
new file mode 100644
index 0000000000000000000000000000000000000000..28b218e7cfcdb7d6ce1381feb4c6dea3ff08a620
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/acle/cdp.c
@@ -0,0 +1,14 @@
+/* Test the cdp ACLE intrinsic. */
+
+/* { dg-do assemble } */
+/* { dg-options "-save-temps" } */
+/* { dg-require-effective-target arm_coproc1_ok } */
+
+#include "arm_acle.h"
+
+void test_cdp (void)
+{
+  __arm_cdp (10, 1, 2, 3, 4, 5);
+}
+
+/* { dg-final { scan-assembler "cdp\tp10, #1, CR2, CR3, CR4, #5\n" } } */
diff --git a/gcc/testsuite/gcc.target/arm/acle/cdp2.c b/gcc/testsuite/gcc.target/arm/acle/cdp2.c
new file mode 100644
index 0000000000000000000000000000000000000000..00bcd502b563cfe6df1e5d4c2e53f8034063d47e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/acle/cdp2.c
@@ -0,0 +1,14 @@
+/* Test the cdp2 ACLE intrinsic. */
+
+/* { dg-do assemble } */
+/* { dg-options "-save-temps" } */
+/* { dg-require-effective-target arm_coproc2_ok } */
+
+#include "arm_acle.h"
+
+void test_cdp2 (void)
+{
+  __arm_cdp2 (10, 4, 3, 2, 1, 0);
+}
+
+/* { dg-final { scan-assembler "cdp2\tp10, #4, CR3, CR2, CR1, #0\n" } } */
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index e4e015e721620e649b879aa398a59c550b5cbac8..342304da4b5fd02c70956496bcd03cdabaf78b01 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -8239,3 +8239,78 @@ proc check_effective_target_store_merge { } {

     return 0
 }
+
+# Return 1 if the target supports coprocessor instructions: cdp, ldc, stc, mcr and
+# mrc.
+proc check_effective_target_arm_coproc1_ok_nocache { } {
+    if { ![istarget arm*-*-*] } {
+        return 0
+    }
+    return [check_no_compiler_messages_nocache arm_coproc1_ok assembly {
+        #if (__thumb__ && !__thumb2__) || __ARM_ARCH < 4
+        #error FOO
+        #endif
+    }]
+}
+
+proc check_effective_target_arm_coproc1_ok { } {
+    return [check_cached_effective_target arm_coproc1_ok \
+                check_effective_target_arm_coproc1_ok_nocache]
+}
+
+# Return 1 if the target supports all coprocessor instructions checked by
+# check_effective_target_arm_coproc1_ok in addition to the following: cdp2,
+# ldc2, ldc2l, stc2, stc2l, mcr2 and mrc2.
+proc check_effective_target_arm_coproc2_ok_nocache { } {
+    if { ![check_effective_target_arm_coproc1_ok] } {
+        return 0
+    }
+    return [check_no_compiler_messages_nocache arm_coproc2_ok assembly {
+        #if __ARM_ARCH < 5
+        #error FOO
+        #endif
+    }]
+}
+
+proc check_effective_target_arm_coproc2_ok { } {
+    return [check_cached_effective_target arm_coproc2_ok \
+                check_effective_target_arm_coproc2_ok_nocache]
+}
+
+# Return 1 if the target supports all coprocessor instructions checked by
+# check_effective_target_arm_coproc2_ok in addition to the following: mcrr and
+# mrrc.
+proc check_effective_target_arm_coproc3_ok_nocache { } {
+    if { ![check_effective_target_arm_coproc2_ok] } {
+        return 0
+    }
+    return [check_no_compiler_messages_nocache arm_coproc3_ok assembly {
+        #if __ARM_ARCH < 6 && !defined (__ARM_ARCH_5TE__)
+        #error FOO
+        #endif
+    }]
+}
+
+proc check_effective_target_arm_coproc3_ok { } {
+    return [check_cached_effective_target arm_coproc3_ok \
+                check_effective_target_arm_coproc3_ok_nocache]
+}
+
+# Return 1 if the target supports all coprocessor instructions checked by
+# check_effective_target_arm_coproc3_ok in addition to the following: mcrr2 and
+# mrrc2.
+proc check_effective_target_arm_coproc4_ok_nocache { } {
+    if { ![check_effective_target_arm_coproc3_ok] } {
+        return 0
+    }
+    return [check_no_compiler_messages_nocache arm_coproc4_ok assembly {
+        #if __ARM_ARCH < 6
+        #error FOO
+        #endif
+    }]
+}
+
+proc check_effective_target_arm_coproc4_ok { } {
+    return [check_cached_effective_target arm_coproc4_ok \
+                check_effective_target_arm_coproc4_ok_nocache]
+}
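
The four effective-target checks above nest, so each level implies the previous one. A rough host-side model of that nesting may make the MCRR/MRRC versus MCRR2/MRRC2 split easier to see. It is only a sketch: `struct arm_target` and `coproc_level` are hypothetical names invented here, and the three fields simply stand in for the preprocessor conditions the checks test (`__ARM_ARCH`, `__thumb__ && !__thumb2__`, and `__ARM_ARCH_5TE__`).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a target; the field names are made up for
   this sketch and mirror the macros tested in target-supports.exp.  */
struct arm_target
{
  int arch;      /* Value of __ARM_ARCH.  */
  bool thumb1;   /* True when __thumb__ && !__thumb2__.  */
  bool is_5te;   /* True when __ARM_ARCH_5TE__ is defined.  */
};

/* Return the highest N for which arm_coprocN_ok would hold, or 0 if none
   does.  Each level requires the one below it, as in the Tcl procs.  */
static int
coproc_level (const struct arm_target *t)
{
  assert (t != NULL);
  if (t->thumb1 || t->arch < 4)
    return 0;                   /* No coprocessor instructions at all.  */
  if (t->arch < 5)
    return 1;                   /* cdp, ldc, stc, mcr, mrc.  */
  if (t->arch < 6 && !t->is_5te)
    return 2;                   /* Adds cdp2, ldc2, stc2, mcr2, mrc2.  */
  if (t->arch < 6)
    return 3;                   /* ARMv5TE: adds mcrr and mrrc.  */
  return 4;                     /* ARMv6 and onwards: adds mcrr2, mrrc2.  */
}
```

Under this model an ARMv5TE target reaches level 3 and a Thumb-1-only target stays at level 0 regardless of its architecture version.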