On 28/11/14 09:23, Yangfei (Felix) wrote:
Hi,
This patch converts the vpmaxX and vpminX intrinsics to use builtin
functions instead of the previous inline assembly.
Regtested for aarch64-linux-gnu on QEMU. It also passes Christophe
Lyon's glorious testsuite.
OK for the trunk?
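
For readers checking that the conversion preserves behaviour, the pairwise
semantics can be modelled in portable C. This is only an illustrative sketch
and not code from the patch: the names model_max_s32 and model_vpmax_s32 are
hypothetical, and it assumes the usual SMAXP lane layout for the 2-lane
signed case (adjacent pairs of the first operand fill the low lanes, adjacent
pairs of the second operand fill the high lanes).

```c
#include <stdint.h>

/* Hypothetical reference model, not part of the patch.  */
static inline int32_t model_max_s32(int32_t x, int32_t y)
{
  return x > y ? x : y;
}

/* Pairwise maximum of two 2-lane vectors, modelling vpmax_s32 (a, b):
   lane 0 of the result is max (a[0], a[1]),
   lane 1 of the result is max (b[0], b[1]).
   Results are computed into temporaries first so that r may alias a or b.  */
static inline void model_vpmax_s32(int32_t r[2], const int32_t a[2],
                                   const int32_t b[2])
{
  int32_t lane0 = model_max_s32(a[0], a[1]);
  int32_t lane1 = model_max_s32(b[0], b[1]);
  r[0] = lane0;
  r[1] = lane1;
}
```

Passing the same vector as both operands leaves the maximum of that vector in
every result lane, so in this model a pairwise operation on a 2-lane vector
also acts as a reduction.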
Index: gcc/ChangeLog
===================================================================
--- gcc/ChangeLog (revision 218128)
+++ gcc/ChangeLog (working copy)
@@ -1,3 +1,19 @@
+2014-11-28 Felix Yang <felix.y...@huawei.com>
+
+ * config/aarch64/aarch64-simd.md (aarch64_<maxmin_uns>p<mode>): New
+ pattern.
+ * config/aarch64/aarch64-simd-builtins.def (smaxp, sminp, umaxp,
+ uminp, smax_nanp, smin_nanp): New builtins.
+ * config/aarch64/arm_neon.h (vpmax_s8, vpmax_s16, vpmax_s32,
+ vpmax_u8, vpmax_u16, vpmax_u32, vpmaxq_s8, vpmaxq_s16, vpmaxq_s32,
+ vpmaxq_u8, vpmaxq_u16, vpmaxq_u32, vpmax_f32, vpmaxq_f32, vpmaxq_f64,
+ vpmaxqd_f64, vpmaxs_f32, vpmaxnm_f32, vpmaxnmq_f32, vpmaxnmq_f64,
+ vpmaxnmqd_f64, vpmaxnms_f32, vpmin_s8, vpmin_s16, vpmin_s32, vpmin_u8,
+ vpmin_u16, vpmin_u32, vpminq_s8, vpminq_s16, vpminq_s32, vpminq_u8,
+ vpminq_u16, vpminq_u32, vpmin_f32, vpminq_f32, vpminq_f64, vpminqd_f64,
+ vpmins_f32, vpminnm_f32, vpminnmq_f32, vpminnmq_f64, vpminnmqd_f64,
+ vpminnms_f32): Rewrite using builtin functions.
+
You'll need to rebase over Alan Lawrance's patch.
https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00279.html
__extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
Index: gcc/config/aarch64/aarch64-simd.md
===================================================================
--- gcc/config/aarch64/aarch64-simd.md (revision 218128)
+++ gcc/config/aarch64/aarch64-simd.md (working copy)
@@ -1015,6 +1015,28 @@
DONE;
})
+;; Pairwise Integer Max/Min operations.
+(define_insn "aarch64_<maxmin_uns>p<mode>"
+ [(set (match_operand:VQ_S 0 "register_operand" "=w")
+ (unspec:VQ_S [(match_operand:VQ_S 1 "register_operand" "w")
+ (match_operand:VQ_S 2 "register_operand" "w")]
+ MAXMINV))]
+ "TARGET_SIMD"
+ "<maxmin_uns_op>p\t%0.<Vtype>, %1.<Vtype>, %2.<Vtype>"
+ [(set_attr "type" "neon_minmax<q>")]
+)
+
Could you roll aarch64_reduc_<maxmin_uns>_internalv2si into this pattern?

Thanks,
Tejas.