Re: [PATCH] [ARM] Fix widen-sum pattern in neon.md.
On 04/15/2015 03:13 AM, Ramana Radhakrishnan wrote:
> On Thu, Mar 5, 2015 at 1:34 PM, Xingxing Pan wrote:
>> Hi,
>>
>> Expanding the widen-sum pattern always fails. The vectorizer expects the
>> operands to have the same size, while the current implementation of the
>> widen-sum pattern does not conform to this.
>>
>> This patch implements the widen-sum pattern with vpadal. Change the vaddw
>> pattern to anonymous. Add widen-sum test cases for neon.
>
> Can you please respin addressing James and Kyrill's comments?
>
> Ramana
>
>> --
>> Regards,
>> Xingxing

Hi,

Sorry for the late response. The pattern has been rewritten to use neon_vpadal's "0" constraints. I have run vect.exp and neon.exp on an armv7 board.

vect.exp has two new XFAILs:

XFAIL: gcc.dg/vect/slp-reduc-3.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
XFAIL: gcc.dg/vect/slp-reduc-3.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 1

This is because the widen-sum optimization precedes SLP. The xfail predicate vect_widen_sum_hi_to_si becomes true when widen-sum is enabled.

neon.exp has four new XFAILs:

XFAIL: gcc.target/arm/neon/vect-widen-sum-char2short-s-d.c scan-tree-dump-times vect "pattern recognized.*w\\+" 1
XFAIL: gcc.target/arm/neon/vect-widen-sum-char2short-s-d.c scan-rtl-dump-times expand "UNSPEC_VPADAL" 1
XFAIL: gcc.target/arm/neon/vect-widen-sum-char2short-s.c scan-tree-dump-times vect "pattern recognized.*w\\+" 1
XFAIL: gcc.target/arm/neon/vect-widen-sum-char2short-s.c scan-rtl-dump-times expand "UNSPEC_VPADAL" 1

If the widen-sum pattern is successfully expanded, "w+" and "UNSPEC_VPADAL" should appear in the dump file, as they do for the other vect-widen-sum-*.c tests. But vect-widen-sum-char2short-s[-d].c is special: at tree level the signed operations are converted into unsigned operations, which destroys the widen-sum pattern. That is due to the workaround for PR tree-optimization/25125. I have simply added an xfail, following gcc.dg/vect/vect-reduc-pattern-2c.c.
-- Regards, Xingxing commit c44b5bd19efb029b8bbd4e3c7e2d631bdc482b7c Author: Xingxing Pan Date: Sun Apr 19 15:54:43 2015 +0800 Fix widen-sum pattern in neon.md. gcc/ 2015-04-19 Xingxing Pan * config/arm/iterators.md (VWSD): New. (V_widen_sum_d): New. * config/arm/neon.md (widen_ssum3): Redefined. (widen_usum3): Ditto. (neon_svaddw3): New anonymous define_insn. (neon_uvaddw3): Ditto. gcc/testsuite/ 2015-04-19 Xingxing Pan * gcc.target/arm/neon/vect-widen-sum-char2short-s-d.c: New. * gcc.target/arm/neon/vect-widen-sum-char2short-s.c: New. * gcc.target/arm/neon/vect-widen-sum-char2short-u-d.c: New. * gcc.target/arm/neon/vect-widen-sum-char2short-u.c: New. * gcc.target/arm/neon/vect-widen-sum-short2int-s-d.c: New. * gcc.target/arm/neon/vect-widen-sum-short2int-s.c: New. * gcc.target/arm/neon/vect-widen-sum-short2int-u-d.c: New. * gcc.target/arm/neon/vect-widen-sum-short2int-u.c: New. * lib/target-supports.exp (check_effective_target_vect_widen_sum_hi_to_si_pattern): Return 1 for ARM NEON. (check_effective_target_vect_widen_sum_hi_to_si): Ditto. (check_effective_target_vect_widen_sum_qi_to_hi): Ditto. diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md index f7f8ab7..f73278d 100644 --- a/gcc/config/arm/iterators.md +++ b/gcc/config/arm/iterators.md @@ -95,6 +95,9 @@ ;; Widenable modes. (define_mode_iterator VW [V8QI V4HI V2SI]) +;; Widenable modes. Used by widen sum. +(define_mode_iterator VWSD [V8QI V4HI V16QI V8HI]) + ;; Narrowable modes. (define_mode_iterator VN [V8HI V4SI V2DI]) @@ -555,9 +558,14 @@ ;; Same as V_widen, but lower-case. (define_mode_attr V_widen_l [(V8QI "v8hi") (V4HI "v4si") ( V2SI "v2di")]) -;; Widen. Result is half the number of elements, but widened to double-width. +;; Widen. Result is half the number of elements, but widened to double-width. (define_mode_attr V_unpack [(V16QI "V8HI") (V8HI "V4SI") (V4SI "V2DI")]) +;; Widen. Result is half the number of elements, but widened to double-width. +;; Used by widen sum. 
+(define_mode_attr V_widen_sum_d [(V8QI "V4HI") (V4HI "V2SI") + (V16QI "V8HI") (V8HI "V4SI")]) + ;; Conditions to be used in extenddi patterns. (define_mode_attr qhs_zextenddi_cond [(SI "") (HI "&& arm_arch6") (QI "")]) (define_mode_attr qhs_sextenddi_cond [(SI "") (HI "&& arm_arch6") diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index 63c327e..839883f 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -1174,7 +1174,29 @@ ;; Widening operations -(define_insn
[PATCH] [ARM] Fix widen-sum pattern in neon.md.
Hi,

Expanding the widen-sum pattern always fails. The vectorizer expects the operands to have the same size, while the current implementation of the widen-sum pattern does not conform to this.

This patch implements the widen-sum pattern with vpadal. Change the vaddw pattern to anonymous. Add widen-sum test cases for neon.

--
Regards,
Xingxing

commit 62637f371a3329ff56644526bc5dbf9356cbdd6c
Author: Xingxing Pan
Date: Wed Feb 25 16:44:25 2015 +0800

    Fix widen-sum pattern in neon.md.

    2015-03-05  Xingxing Pan

    config/arm/
    * iterators.md: (VWSD): New define_mode_iterator.
    (V_widen_sum_d): New define_mode_attr.
    * neon.md (widen_ssum3): Redefined.
    (widen_usum3): Ditto.
    (neon_svaddw3): New anonymous define_insn.
    (neon_uvaddw3): Ditto.

    testsuite/gcc.target/arm/neon/
    * vect-widen-sum-char2short-s-d.c: New file.
    * vect-widen-sum-char2short-s.c: Ditto.
    * vect-widen-sum-char2short-u-d.c: Ditto.
    * vect-widen-sum-char2short-u.c: Ditto.
    * vect-widen-sum-short2int-s-d.c: Ditto.
    * vect-widen-sum-short2int-s.c: Ditto.
    * vect-widen-sum-short2int-u-d.c: Ditto.
    * vect-widen-sum-short2int-u.c: Ditto.

    testsuite/lib/
    * target-supports.exp:
    (check_effective_target_vect_widen_sum_hi_to_si_pattern): Return 1 for ARM NEON.
    (check_effective_target_vect_widen_sum_hi_to_si): Ditto.
    (check_effective_target_vect_widen_sum_qi_to_hi): Ditto.

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index f7f8ab7..4ba5901 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -95,6 +95,9 @@
 ;; Widenable modes.
 (define_mode_iterator VW [V8QI V4HI V2SI])

+;; Widenable modes. Used by widen sum.
+(define_mode_iterator VWSD [V8QI V4HI V16QI V8HI])
+
 ;; Narrowable modes.
 (define_mode_iterator VN [V8HI V4SI V2DI])

@@ -558,6 +561,11 @@
 ;; Widen. Result is half the number of elements, but widened to double-width.
 (define_mode_attr V_unpack [(V16QI "V8HI") (V8HI "V4SI") (V4SI "V2DI")])

+;; Widen. Result is half the number of elements, but widened to double-width.
+;; Used by widen sum.
+(define_mode_attr V_widen_sum_d [(V8QI "V4HI") (V4HI "V2SI")
+ (V16QI "V8HI") (V8HI "V4SI")])
+
 ;; Conditions to be used in extenddi patterns.
 (define_mode_attr qhs_zextenddi_cond [(SI "") (HI "&& arm_arch6") (QI "")])
 (define_mode_attr qhs_sextenddi_cond [(SI "") (HI "&& arm_arch6")
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 63c327e..6cac36d 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -1174,7 +1174,31 @@

 ;; Widening operations

-(define_insn "widen_ssum3"
+(define_expand "widen_usum3"
+ [(match_operand: 0 "s_register_operand" "")
+ (match_operand:VWSD 1 "s_register_operand" "")
+ (match_operand: 2 "s_register_operand" "")]
+ "TARGET_NEON"
+ {
+emit_move_insn(operands[0], operands[2]);
+emit_insn (gen_neon_vpadalu (operands[0], operands[0], operands[1]));
+DONE;
+ }
+)
+
+(define_expand "widen_ssum3"
+ [(match_operand: 0 "s_register_operand" "")
+ (match_operand:VWSD 1 "s_register_operand" "")
+ (match_operand: 2 "s_register_operand" "")]
+ "TARGET_NEON"
+ {
+emit_move_insn(operands[0], operands[2]);
+emit_insn (gen_neon_vpadals (operands[0], operands[0], operands[1]));
+DONE;
+ }
+)
+
+(define_insn "*neon_svaddw3"
 [(set (match_operand: 0 "s_register_operand" "=w")
 (plus: (sign_extend:
 (match_operand:VW 1 "s_register_operand" "%w"))
@@ -1184,7 +1208,7 @@
 [(set_attr "type" "neon_add_widen")]
 )

-(define_insn "widen_usum3"
+(define_insn "*neon_uvaddw3"
 [(set (match_operand: 0 "s_register_operand" "=w")
 (plus: (zero_extend:
 (match_operand:VW 1 "s_register_operand" "%w"))
diff --git a/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-s-d.c b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-s-d.c
new file mode 100644
index 000..c81c325
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-s-d.c
@@ -0,0 +1,64 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-O2 -ffast-math -ftree-vectorize -mvectorize-with-neon-double -fdump-tree-vect-details -fdump-rtl-expand" } */
+/* { dg-add-options arm_neon } */
+
+/* { dg-final { scan-tree-dump-times "pattern recognized.*w\\\+" 1 "vect" { xfail *-*-* } } } */
+/* { dg-final { clea
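For readers unfamiliar with the instruction, here is a minimal C sketch of what the widen-sum reduction computes and why a pairwise add-accumulate maps onto it. It is an illustration only, not part of the patch: the names widen_sum_scalar, vpadal_u8_model and widen_sum_vpadal are made up, and vpadal_u8_model is a plain-C model of the unsigned 64-bit form (eight bytes accumulated into four halfwords), which corresponds to the V8QI to V4HI entry of V_widen_sum_d above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar reference: a char-to-short widening sum.  Each byte is widened
   before being added, and the accumulator wraps modulo 2^16.  */
uint16_t widen_sum_scalar(const uint8_t *in, size_t n)
{
    uint16_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc = (uint16_t)(acc + in[i]);
    return acc;
}

/* Hypothetical plain-C model of an unsigned pairwise add-accumulate long
   on a 64-bit vector: every adjacent pair of bytes is widened, added,
   and accumulated into one halfword lane.  Note that the accumulator has
   half as many elements as the input, and that it is both read and
   written, which is why the expander first copies operand 2 into
   operand 0.  */
void vpadal_u8_model(uint16_t acc[4], const uint8_t in[8])
{
    for (int lane = 0; lane < 4; lane++)
        acc[lane] = (uint16_t)(acc[lane] + in[2 * lane] + in[2 * lane + 1]);
}

/* Widening sum built from the model above.  Assumes n is a multiple of 8;
   a real vectorized loop would also need an epilogue for the remainder.  */
uint16_t widen_sum_vpadal(const uint8_t *in, size_t n)
{
    uint16_t acc[4] = { 0, 0, 0, 0 };

    assert(n % 8 == 0);
    for (size_t i = 0; i < n; i += 8)
        vpadal_u8_model(acc, in + i);

    /* Final horizontal reduction of the four partial sums.  */
    return (uint16_t)(acc[0] + acc[1] + acc[2] + acc[3]);
}
```

Because addition modulo 2^16 is associative and commutative, the lane-wise partial sums reduce to the same value as the scalar loop, so the order in which pairs are accumulated does not matter for the unsigned case.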
Re: [PATCH][AArch64]: Fix rtl type in aarch64.md.
On 02/28/2015 04:39 PM, James Greenhalgh wrote:
> On Sat, Feb 28, 2015 at 01:29:15AM +, Xingxing Pan wrote:
>> On 02/27/2015 04:30 PM, Marcus Shawcroft wrote:
>>> On 26 February 2015 at 06:22, Xingxing Pan wrote:
>>>> This patch fixes the type of mov_aarch64 in aarch64.md. Is it OK for trunk?
>>>
>>> OK, thank you
>>> /Marcus
>>
>> Could someone help apply the patch? I don't have SVN write access yet.
>
> Thanks for the patch, I've committed it on your behalf as revision 221075.
>
> Cheers,
> James

Thanks.

--
Regards,
Xingxing
Re: [PATCH][AArch64]: Fix rtl type in aarch64.md.
On 02/27/2015 04:30 PM, Marcus Shawcroft wrote:
> On 26 February 2015 at 06:22, Xingxing Pan wrote:
>> Hi,
>>
>> This patch fixes the type of mov_aarch64 in aarch64.md. Is it OK for trunk?
>
> OK, thank you
> /Marcus

Hi,

Could someone help apply the patch? I don't have SVN write access yet.

--
Regards,
Xingxing
Re: [PATCH, 2/2][ARM]: New CPU support for Marvell Whitney
On 02/25/2015 10:20 PM, James Greenhalgh wrote: On Wed, Feb 25, 2015 at 01:42:39PM +, Xingxing Pan wrote: > Hi, > > This patch expanding the following RTL types. And it has been merged to the latest code base. > > (neon_logic): Expand to neon_logic_reg and neon_logic_imm. > (neon_logic_q): Expand to neon_logic_reg_q and neon_logic_imm_q. > (neon_from_gp): Expand to neon_from_gp and neon_from_gp_scalar. > (neon_from_gp_q): Expand to neon_from_gp_q and neon_from_gp_scalar_q. > (neon_to_gp): Expand to neon_to_gp and neon_to_gp_scalar. > (neon_to_gp_q): Expand to neon_to_gp_q and neon_to_gp_scalar_q. > > Is it OK for trunk? > > -- > Regards, > Xingxing I've had a look through the AArch64 parts, and they look OK to me (though only Marcus or Richard can approve them), I have one additional comment. > ;; In this insn, operand 1 should be low, and operand 2 the high part of the > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md > index 8f157ce..8be2ebf 100644 > --- a/gcc/config/aarch64/aarch64.md > +++ b/gcc/config/aarch64/aarch64.md > @@ -828,7 +828,7 @@ >} > } > [(set_attr "type" "mov_reg,mov_imm,mov_imm,load1,load1,store1,store1,\ > - neon_from_gp,neon_from_gp, neon_dup") > + neon_to_gp_scalar,neon_from_gp, neon_dup") > (set_attr "simd" "*,*,yes,*,*,*,*,yes,yes,yes")] > ) Here you change neon_from_gp to neon_to_gp_scalar. This looks like the correct thing to do, but would you mind pulling it out to a separate patch, first changing neon_from_gp to neon_to_gp? I'd just like to have the bug-fix separate from the bigger infrastructure change. Thanks, James Hi James, Thanks for your advice. I have submitted another patch to change the type from neon_from_gp to neon_to_gp. See https://gcc.gnu.org/ml/gcc-patches/2015-02/msg01566.html. Attach the updated patch. -- Regards, Xingxing Expand several arm types. 2015-02-26 Xingxing Pan * config/arm/types.md: (neon_logic): Expand to neon_logic_reg and neon_logic_imm. 
(neon_logic_q): Expand to neon_logic_reg_q and neon_logic_imm_q. (neon_from_gp): Expand to neon_from_gp and neon_from_gp_scalar. (neon_from_gp_q): Expand to neon_from_gp_q and neon_from_gp_scalar_q. (neon_to_gp): Expand to neon_to_gp and neon_to_gp_scalar. (neon_to_gp_q): Expand to neon_to_gp_q and neon_to_gp_scalar_q. * config/aarch64/aarch64-simd.md: Ditto. * config/aarch64/aarch64.md: Ditto. * config/aarch64/thunderx.md: Ditto. * config/arm/arm.md: Ditto. * config/arm/cortex-a15-neon.md: Ditto. * config/arm/cortex-a17-neon.md: Ditto. * config/arm/cortex-a57.md: Ditto. * config/arm/cortex-a8-neon.md: Ditto. * config/arm/cortex-a9-neon.md: Ditto. * config/arm/marvell-whitney.md: Ditto. * config/arm/neon.md: Ditto. * config/arm/xgene1.md: Ditto. diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 0557570..611d14c 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -115,7 +115,7 @@ } } [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\ - neon_logic, neon_to_gp, neon_from_gp,\ + neon_logic_reg, neon_to_gp_scalar, neon_from_gp_scalar,\ mov_reg, neon_move")] ) @@ -147,7 +147,7 @@ } } [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\ - neon_logic, multiple, multiple, multiple,\ + neon_logic_reg, multiple, multiple, multiple,\ neon_move") (set_attr "length" "4,4,4,8,8,8,4")] ) @@ -218,7 +218,7 @@ (match_operand:VQ 2 "vect_par_cnst_lo_half" "")))] "TARGET_SIMD && reload_completed" "umov\t%0, %1.d[0]" - [(set_attr "type" "neon_to_gp") + [(set_attr "type" "neon_to_gp_scalar") (set_attr "length" "4") ]) @@ -229,7 +229,7 @@ (match_operand:VQ 2 "vect_par_cnst_hi_half" "")))] "TARGET_SIMD && reload_completed" "umov\t%0, %1.d[1]" - [(set_attr "type" "neon_to_gp") + [(set_attr "type" "neon_to_gp_scalar") (set_attr "length" "4") ]) @@ -239,7 +239,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "orn\t%0., %2., %1." 
- [(set_attr "type" "neon_logic")] + [(set_attr "type" "neon_logic_reg")] ) (define_insn "bic3" @@ -248,7 +248,7 @@ (match_operand:VDQ_I 2 "
Re: [patch 1/2][ARM]: New CPU support for Marvell Whitney
On 02/25/2015 09:32 PM, Xingxing Pan wrote: Hi, This patch merges pipeline description for marvell-whitney to latest code base. Is it OK for trunk? Refactor the commit message. -- Regards, Xingxing Add pipeline description for marvell-whitney. 2015-02-26 Xingxing Pan * config/arm/arm-cores.def: Add new core marvell-whitney. * config/arm/arm-protos.h: (marvell_whitney_vector_mode_qi): Declare. (marvell_whitney_inner_shift): Ditto. * config/arm/arm-tables.opt: Regenerated. * config/arm/arm-tune.md: Regenerated. * config/arm/arm.c (arm_marvell_whitney_tune): New structure. (arm_issue_rate): Add marvell_whitney. (marvell_whitney_vector_mode_qi): New function. (marvell_whitney_inner_shift): Ditto. * config/arm/arm.md: Include marvell-whitney.md. (generic_sched): Add marvell_whitney. (generic_vfp): Ditto. * config/arm/bpabi.h (BE8_LINK_SPEC): Add marvell-whitney. * config/arm/t-arm (MD_INCLUDES): Add marvell-whitney.md. * config/arm/marvell-whitney.md: New file. * doc/invoke.texi: Document marvell-whitney. 
diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def index d7e730d..fc76eb5 100644 --- a/gcc/config/arm/arm-cores.def +++ b/gcc/config/arm/arm-cores.def @@ -159,6 +159,7 @@ ARM_CORE("cortex-m7", cortexm7, cortexm7, 7EM, FL_LDSCHED, cortex_m7) ARM_CORE("cortex-m4", cortexm4, cortexm4, 7EM, FL_LDSCHED, v7m) ARM_CORE("cortex-m3", cortexm3, cortexm3, 7M, FL_LDSCHED, v7m) ARM_CORE("marvell-pj4", marvell_pj4, marvell_pj4, 7A, FL_LDSCHED, 9e) +ARM_CORE("marvell-whitney", marvell_whitney, marvell_whitney, 7A, FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, marvell_whitney) /* V7 big.LITTLE implementations */ ARM_CORE("cortex-a15.cortex-a7", cortexa15cortexa7, cortexa7, 7A, FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a15) diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h index 307babb..d047dbc 100644 --- a/gcc/config/arm/arm-protos.h +++ b/gcc/config/arm/arm-protos.h @@ -231,6 +231,9 @@ extern void arm_order_regs_for_local_alloc (void); extern int arm_max_conditional_execute (); +extern bool marvell_whitney_vector_mode_qi (rtx_insn *insn); +extern bool marvell_whitney_inner_shift (rtx_insn *insn); + /* Vectorizer cost model implementation. 
*/ struct cpu_vec_costs { const int scalar_stmt_cost; /* Cost of any scalar operation, excluding diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt index 3450e5b..f0f9f3f 100644 --- a/gcc/config/arm/arm-tables.opt +++ b/gcc/config/arm/arm-tables.opt @@ -298,6 +298,9 @@ EnumValue Enum(processor_type) String(marvell-pj4) Value(marvell_pj4) EnumValue +Enum(processor_type) String(marvell-whitney) Value(marvell_whitney) + +EnumValue Enum(processor_type) String(cortex-a15.cortex-a7) Value(cortexa15cortexa7) EnumValue diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md index d459f27..fbfab2e 100644 --- a/gcc/config/arm/arm-tune.md +++ b/gcc/config/arm/arm-tune.md @@ -31,7 +31,8 @@ cortexa15,cortexa17,cortexr4, cortexr4f,cortexr5,cortexr7, cortexm7,cortexm4,cortexm3, - marvell_pj4,cortexa15cortexa7,cortexa17cortexa7, - cortexa53,cortexa57,cortexa72, - xgene1,cortexa57cortexa53,cortexa72cortexa53" + marvell_pj4,marvell_whitney,cortexa15cortexa7, + cortexa17cortexa7,cortexa53,cortexa57, + cortexa72,xgene1,cortexa57cortexa53, + cortexa72cortexa53" (const (symbol_ref "((enum attr_tune) arm_tune)"))) diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 7bf5b4d..e68287f 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -2000,6 +2000,25 @@ const struct tune_params arm_cortex_a9_tune = ARM_SCHED_AUTOPREF_OFF /* Sched L2 autopref. */ }; +const struct tune_params arm_marvell_whitney_tune = +{ + arm_9e_rtx_costs, + &cortexa9_extra_costs, + cortex_a9_sched_adjust_cost, + 1, /* Constant limit. */ + 5, /* Max cond insns. */ + ARM_PREFETCH_BENEFICIAL(4,32,32), + false, /* Prefer constant pool. */ + arm_default_branch_cost, + false, /* Prefer LDRD/STRD. */ + {true, true}, /* Prefer non short circuit. */ + &arm_default_vec_cost,/* Vectorizer costs. */ + false,/* Prefer Neon for 64-bits bitops. */ + false, false, /* Prefer 32-bit encodings. */ + false, /* Prefer Neon for stringops. */ + 8 /* Maximum insns to inline memset. 
*/ +}; + const struct tune_params arm_cortex_a12_tune = { arm_9e_rtx_costs, @@ -11717,6 +11736,51 @@ fa726te_sched_adjust_cost (rtx_insn *insn, rtx link, rtx_insn *dep, int * cost) return true; } +/* Return true if vector element size is byte. */ +bool +marvell_whitney_vector_mode_qi (rtx_insn *insn) +{ + machine_mode mode; + + if (GET_CODE (PATTERN (insn)) == SET) +{ + mode = GET_MODE (SET_DE
[PATCH][AArch64]: Fix rtl type in aarch64.md.
Hi,

This patch fixes the type of mov_aarch64 in aarch64.md. Is it OK for trunk?

--
Regards,
Xingxing

[AArch64] Fix define_insn type in aarch64.md.

2015-02-26  Xingxing Pan

    * config/aarch64/aarch64.md: (mov_aarch64): Change type to neon_to_gp.

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 7103e0d..534a862 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -828,7 +828,7 @@
 }
 }
 [(set_attr "type" "mov_reg,mov_imm,mov_imm,load1,load1,store1,store1,\
- neon_from_gp,neon_from_gp, neon_dup")
+ neon_to_gp,neon_from_gp,neon_dup")
 (set_attr "simd" "*,*,yes,*,*,*,*,yes,yes,yes")]
 )
Re: [PATCH, 2/2][ARM]: New CPU support for Marvell Whitney
On 02/25/2015 09:42 PM, Xingxing Pan wrote: Hi, This patch expanding the following RTL types. And it has been merged to the latest code base. (neon_logic): Expand to neon_logic_reg and neon_logic_imm. (neon_logic_q): Expand to neon_logic_reg_q and neon_logic_imm_q. (neon_from_gp): Expand to neon_from_gp and neon_from_gp_scalar. (neon_from_gp_q): Expand to neon_from_gp_q and neon_from_gp_scalar_q. (neon_to_gp): Expand to neon_to_gp and neon_to_gp_scalar. (neon_to_gp_q): Expand to neon_to_gp_q and neon_to_gp_scalar_q. Is it OK for trunk? Fix typos in commit message. -- Regards, Xingxing commit b0d0ebf6a2553bc7b6cc8f72fbaa0104938d0d41 Author: Xingxing Pan Date: Wed Feb 25 14:46:25 2015 +0800 2015-02-25 Xingxing Pan * config/arm/types.md: (neon_logic): Expand to neon_logic_reg and neon_logic_imm. (neon_logic_q): Expand to neon_logic_reg_q and neon_logic_imm_q. (neon_from_gp): Expand to neon_from_gp and neon_from_gp_scalar. (neon_from_gp_q): Expand to neon_from_gp_q and neon_from_gp_scalar_q. (neon_to_gp): Expand to neon_to_gp and neon_to_gp_scalar. (neon_to_gp_q): Expand to neon_to_gp_q and neon_to_gp_scalar_q. * config/aarch64/aarch64-simd.md: Ditto. * config/aarch64/aarch64.md: Ditto. * config/aarch64/thunderx.md: Ditto. * config/arm/arm.md: Ditto. * config/arm/cortex-a15-neon.md: Ditto. * config/arm/cortex-a17-neon.md: Ditto. * config/arm/cortex-a57.md: Ditto. * config/arm/cortex-a8-neon.md: Ditto. * config/arm/cortex-a9-neon.md: Ditto. * config/arm/marvell-whitney.md: Ditto. * config/arm/neon.md: Ditto. * config/arm/xgene1.md: Ditto. 
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 0557570..611d14c 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -115,7 +115,7 @@ } } [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\ - neon_logic, neon_to_gp, neon_from_gp,\ + neon_logic_reg, neon_to_gp_scalar, neon_from_gp_scalar,\ mov_reg, neon_move")] ) @@ -147,7 +147,7 @@ } } [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\ - neon_logic, multiple, multiple, multiple,\ + neon_logic_reg, multiple, multiple, multiple,\ neon_move") (set_attr "length" "4,4,4,8,8,8,4")] ) @@ -218,7 +218,7 @@ (match_operand:VQ 2 "vect_par_cnst_lo_half" "")))] "TARGET_SIMD && reload_completed" "umov\t%0, %1.d[0]" - [(set_attr "type" "neon_to_gp") + [(set_attr "type" "neon_to_gp_scalar") (set_attr "length" "4") ]) @@ -229,7 +229,7 @@ (match_operand:VQ 2 "vect_par_cnst_hi_half" "")))] "TARGET_SIMD && reload_completed" "umov\t%0, %1.d[1]" - [(set_attr "type" "neon_to_gp") + [(set_attr "type" "neon_to_gp_scalar") (set_attr "length" "4") ]) @@ -239,7 +239,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "orn\t%0., %2., %1." - [(set_attr "type" "neon_logic")] + [(set_attr "type" "neon_logic_reg")] ) (define_insn "bic3" @@ -248,7 +248,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "bic\t%0., %2., %1." - [(set_attr "type" "neon_logic")] + [(set_attr "type" "neon_logic_reg")] ) (define_insn "add3" @@ -444,7 +444,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "and\t%0., %1., %2." - [(set_attr "type" "neon_logic")] + [(set_attr "type" "neon_logic_reg")] ) (define_insn "ior3" @@ -453,7 +453,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "orr\t%0., %1., %2." - [(set_attr "type" "neon_logic")] + [(set_attr "type" "neon_logic_reg")] ) (define_insn "xor3" @@ -462,7 +462,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "eor\t%0., %1., %2." 
- [(set_attr "type" "neon_logic")] + [(set_attr "type" "neon_logic_reg")] ) (define_insn "one_cmpl2" @@ -470,7 +470,7 @@ (not:VDQ_I (match_operand:VDQ_I 1 "register_operand" "w&
Re: [PATCH, 2/2][ARM]: New CPU support for Marvell Whitney
Hi, This patch expanding the following RTL types. And it has been merged to the latest code base. (neon_logic): Expand to neon_logic_reg and neon_logic_imm. (neon_logic_q): Expand to neon_logic_reg_q and neon_logic_imm_q. (neon_from_gp): Expand to neon_from_gp and neon_from_gp_scalar. (neon_from_gp_q): Expand to neon_from_gp_q and neon_from_gp_scalar_q. (neon_to_gp): Expand to neon_to_gp and neon_to_gp_scalar. (neon_to_gp_q): Expand to neon_to_gp_q and neon_to_gp_scalar_q. Is it OK for trunk? -- Regards, Xingxing commit b0d0ebf6a2553bc7b6cc8f72fbaa0104938d0d41 Author: Xingxing Pan Date: Wed Feb 25 14:46:25 2015 +0800 2015-02-25 Xingxing Pan * config/arm/types.md: (neon_logic): Expand to neon_logic_reg and neon_logic_imm. (neon_logic_q): Expand to neon_logic_reg_q and neon_logic_imm_q. (neon_from_gp): Expand to neon_from_gp and neon_from_gp_scalar. (neon_from_gp_q): Expand to neon_from_gp_q and neon_from_gp_scalar_q. (neon_to_gp): Expand to neon_to_gp and neon_to_gp_scalar. (neon_to_gp_q): Expand to neon_to_gp_q and neon_to_gp_scalar_q. * config/aarch64/aarch64-simd.md: Ditto. * config/aarch64/aarch64.md: Ditto. * config/aarch64/thunderx.md: Ditto. * config/arm/arm.md: Ditto. * config/arm/cortex-a15-neon.md: Ditto. * config/arm/cortex-a17-neon.md: Ditto. * config/arm/cortex-a57.md: Ditto. * config/arm/cortex-a8-neon.md: Ditto. * config/arm/cortex-a9-neon.md: Ditto. * config/arm/marvell-whitney.md: Ditto. * config/arm/neon.md: Ditto. * config/arm/types.md: Ditto. * config/arm/xgene1.md: Ditto. 
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 0557570..611d14c 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -115,7 +115,7 @@ } } [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\ - neon_logic, neon_to_gp, neon_from_gp,\ + neon_logic_reg, neon_to_gp_scalar, neon_from_gp_scalar,\ mov_reg, neon_move")] ) @@ -147,7 +147,7 @@ } } [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\ - neon_logic, multiple, multiple, multiple,\ + neon_logic_reg, multiple, multiple, multiple,\ neon_move") (set_attr "length" "4,4,4,8,8,8,4")] ) @@ -218,7 +218,7 @@ (match_operand:VQ 2 "vect_par_cnst_lo_half" "")))] "TARGET_SIMD && reload_completed" "umov\t%0, %1.d[0]" - [(set_attr "type" "neon_to_gp") + [(set_attr "type" "neon_to_gp_scalar") (set_attr "length" "4") ]) @@ -229,7 +229,7 @@ (match_operand:VQ 2 "vect_par_cnst_hi_half" "")))] "TARGET_SIMD && reload_completed" "umov\t%0, %1.d[1]" - [(set_attr "type" "neon_to_gp") + [(set_attr "type" "neon_to_gp_scalar") (set_attr "length" "4") ]) @@ -239,7 +239,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "orn\t%0., %2., %1." - [(set_attr "type" "neon_logic")] + [(set_attr "type" "neon_logic_reg")] ) (define_insn "bic3" @@ -248,7 +248,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "bic\t%0., %2., %1." - [(set_attr "type" "neon_logic")] + [(set_attr "type" "neon_logic_reg")] ) (define_insn "add3" @@ -444,7 +444,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "and\t%0., %1., %2." - [(set_attr "type" "neon_logic")] + [(set_attr "type" "neon_logic_reg")] ) (define_insn "ior3" @@ -453,7 +453,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "orr\t%0., %1., %2." - [(set_attr "type" "neon_logic")] + [(set_attr "type" "neon_logic_reg")] ) (define_insn "xor3" @@ -462,7 +462,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "eor\t%0., %1., %2." 
- [(set_attr "type" "neon_logic")] + [(set_attr "type" "neon_logic_reg")] ) (define_insn "one_cmpl2" @@ -470,7 +470,7 @@ (not:VDQ_I (match_operand:VDQ_I 1 "register_operand" "w")))] "TARGET_SIMD"
Re: [patch 1/2][ARM]: New CPU support for Marvell Whitney
Hi, This patch merges pipeline description for marvell-whitney to latest code base. Is it OK for trunk? -- Regards, Xingxing commit 83974dde8d9f773df1004aa1d5e3b05d8a33f5e0 Author: Xingxing Pan Date: Wed Feb 25 10:24:40 2015 +0800 2015-02-25 Xingxing Pan * config/arm/arm-cores.def: Add new core marvell-whitney. * config/arm/arm-protos.h: (marvell_whitney_vector_mode_qi): Declare. (marvell_whitney_inner_shift): Ditto. * config/arm/arm-tables.opt: Regenerated. * config/arm/arm-tune.md: Regenerated. * config/arm/arm.c (arm_marvell_whitney_tune): New structure. (arm_issue_rate): Add marvell_whitney. (marvell_whitney_vector_mode_qi): New function. (marvell_whitney_inner_shift): Ditto. * config/arm/arm.md: Include marvell-whitney.md. (generic_sched): Add marvell_whitney. (generic_vfp): Ditto. * config/arm/bpabi.h (BE8_LINK_SPEC): Add marvell-whitney. * config/arm/t-arm (MD_INCLUDES): Add marvell-whitney.md. * config/arm/marvell-whitney.md: New file. * doc/invoke.texi: Document marvell-whitney. 
diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def index d7e730d..fc76eb5 100644 --- a/gcc/config/arm/arm-cores.def +++ b/gcc/config/arm/arm-cores.def @@ -159,6 +159,7 @@ ARM_CORE("cortex-m7", cortexm7, cortexm7, 7EM, FL_LDSCHED, cortex_m7) ARM_CORE("cortex-m4", cortexm4, cortexm4, 7EM, FL_LDSCHED, v7m) ARM_CORE("cortex-m3", cortexm3, cortexm3, 7M, FL_LDSCHED, v7m) ARM_CORE("marvell-pj4", marvell_pj4, marvell_pj4, 7A, FL_LDSCHED, 9e) +ARM_CORE("marvell-whitney", marvell_whitney, marvell_whitney, 7A, FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, marvell_whitney) /* V7 big.LITTLE implementations */ ARM_CORE("cortex-a15.cortex-a7", cortexa15cortexa7, cortexa7, 7A, FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a15) diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h index 307babb..d047dbc 100644 --- a/gcc/config/arm/arm-protos.h +++ b/gcc/config/arm/arm-protos.h @@ -231,6 +231,9 @@ extern void arm_order_regs_for_local_alloc (void); extern int arm_max_conditional_execute (); +extern bool marvell_whitney_vector_mode_qi (rtx_insn *insn); +extern bool marvell_whitney_inner_shift (rtx_insn *insn); + /* Vectorizer cost model implementation. 
*/ struct cpu_vec_costs { const int scalar_stmt_cost; /* Cost of any scalar operation, excluding diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt index 3450e5b..f0f9f3f 100644 --- a/gcc/config/arm/arm-tables.opt +++ b/gcc/config/arm/arm-tables.opt @@ -298,6 +298,9 @@ EnumValue Enum(processor_type) String(marvell-pj4) Value(marvell_pj4) EnumValue +Enum(processor_type) String(marvell-whitney) Value(marvell_whitney) + +EnumValue Enum(processor_type) String(cortex-a15.cortex-a7) Value(cortexa15cortexa7) EnumValue diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md index d459f27..fbfab2e 100644 --- a/gcc/config/arm/arm-tune.md +++ b/gcc/config/arm/arm-tune.md @@ -31,7 +31,8 @@ cortexa15,cortexa17,cortexr4, cortexr4f,cortexr5,cortexr7, cortexm7,cortexm4,cortexm3, - marvell_pj4,cortexa15cortexa7,cortexa17cortexa7, - cortexa53,cortexa57,cortexa72, - xgene1,cortexa57cortexa53,cortexa72cortexa53" + marvell_pj4,marvell_whitney,cortexa15cortexa7, + cortexa17cortexa7,cortexa53,cortexa57, + cortexa72,xgene1,cortexa57cortexa53, + cortexa72cortexa53" (const (symbol_ref "((enum attr_tune) arm_tune)"))) diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 7bf5b4d..e68287f 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -2000,6 +2000,25 @@ const struct tune_params arm_cortex_a9_tune = ARM_SCHED_AUTOPREF_OFF /* Sched L2 autopref. */ }; +const struct tune_params arm_marvell_whitney_tune = +{ + arm_9e_rtx_costs, + &cortexa9_extra_costs, + cortex_a9_sched_adjust_cost, + 1, /* Constant limit. */ + 5, /* Max cond insns. */ + ARM_PREFETCH_BENEFICIAL(4,32,32), + false, /* Prefer constant pool. */ + arm_default_branch_cost, + false, /* Prefer LDRD/STRD. */ + {true, true}, /* Prefer non short circuit. */ + &arm_default_vec_cost,/* Vectorizer costs. */ + false,/* Prefer Neon for 64-bits bitops. */ + false, false, /* Prefer 32-bit encodings. */ + false, /* Prefer Neon for stringops. */ + 8 /* Maximum insns to inline memset. 
*/ +}; + const struct tune_params arm_cortex_a12_tune = { arm_9e_rtx_costs, @@ -11717,6 +11736,51 @@ fa726te_sched_adjust_cost (rtx_insn *insn, rtx link, rtx_insn *dep, int * cost) return true; } +/* Return true if vector element size is byte. */ +bool +marvell_whitney_vector_mode_qi (rtx_insn *insn) +{ + machine_mode mode; + + if (GET_CODE (PATT
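The ChangeLog above lists marvell_whitney_inner_shift, but the function body is cut off in this excerpt. Going by the comment on the earlier revision quoted in the review further down the thread ("Return true if INSN has shift operation but is not a shift insn"), the idea can be sketched in plain C roughly as follows. This is a hypothetical stand-alone model, not GCC code: ex_node, ex_code and the ex_* functions are invented for illustration and merely stand in for rtx, RTX_CODE and the sub-rtx walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical expression codes; only the four shift/rotate codes that the
   quoted patch tests for are modelled, plus two non-shift codes.  */
enum ex_code {
    EX_REG, EX_PLUS, EX_ASHIFT, EX_ASHIFTRT, EX_LSHIFTRT, EX_ROTATERT
};

/* Hypothetical binary expression node; unused operands are NULL.  */
struct ex_node {
    enum ex_code code;
    const struct ex_node *op[2];
};

static bool ex_is_shift(enum ex_code c)
{
    return c == EX_ASHIFT || c == EX_ASHIFTRT
           || c == EX_LSHIFTRT || c == EX_ROTATERT;
}

/* Depth-first walk: true if any node in the tree is a shift.  */
static bool ex_contains_shift(const struct ex_node *e)
{
    if (e == NULL)
        return false;
    if (ex_is_shift(e->code))
        return true;
    return ex_contains_shift(e->op[0]) || ex_contains_shift(e->op[1]);
}

/* True if SRC contains a shift somewhere inside it but is not itself a
   shift at the top level.  */
bool ex_inner_shift(const struct ex_node *src)
{
    if (src == NULL || ex_is_shift(src->code))
        return false;
    return ex_contains_shift(src);
}

/* Small self-check covering the three interesting cases.  */
void ex_inner_shift_selfcheck(void)
{
    const struct ex_node r0 = { EX_REG, { NULL, NULL } };
    const struct ex_node r1 = { EX_REG, { NULL, NULL } };
    const struct ex_node shl = { EX_ASHIFT, { &r0, &r1 } };
    const struct ex_node add = { EX_PLUS, { &r0, &shl } };

    assert(ex_inner_shift(&add));   /* r0 + (r0 << r1): shifted operand */
    assert(!ex_inner_shift(&shl));  /* a plain shift is excluded */
    assert(!ex_inner_shift(&r0));   /* no shift anywhere */
}
```

A predicate of this shape would let a pipeline description distinguish an arithmetic instruction with a shifted operand from a stand-alone shift; whether marvell-whitney.md uses it exactly that way is not visible in this excerpt.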
Re: [patch 1/2][ARM]: New CPU support for Marvell Whitney
On 09/01/2015 19:22, Kyrill Tkachov wrote:

Hi Xingxing,

On 19/12/14 11:01, Xingxing Pan wrote:

+/* Return true if vector element size is byte. */

Minor nit: two spaces after the full stop and before the */. Same in other places in the patch.

+bool
+marvell_whitney_vector_element_size_is_byte (rtx insn)
+{
+  if (GET_CODE (PATTERN (insn)) == SET)
+    {
+      if ((GET_MODE (SET_DEST (PATTERN (insn))) == V8QImode) ||
+          (GET_MODE (SET_DEST (PATTERN (insn))) == V16QImode))
+        return true;
+    }
+
+  return false;
+}

I see this is called from inside marvell-whitney.md. It seems to me that this function takes RTX insns. Can the type of this be strengthened to rtx_insn * ?

Also, this should be refactored and written a bit more generally by checking for VECTOR_MODE_P and then GET_MODE_INNER for QImode, saving you the trouble of enumerating the different vector QI modes.

+
+/* Return true if INSN has shift operation but is not a shift insn. */
+bool
+marvell_whitney_non_shift_with_shift_operand (rtx insn)

Similar comment. Can this be strengthened to rtx_insn * ?

Thanks,
Kyrill

+{
+  rtx pat = PATTERN (insn);
+
+  if (GET_CODE (pat) != SET)
+    return false;
+
+  /* Is not a shift insn. */
+  rtx rvalue = SET_SRC (pat);
+  RTX_CODE code = GET_CODE (rvalue);
+  if (code == ASHIFT || code == ASHIFTRT
+      || code == LSHIFTRT || code == ROTATERT)
+    return false;
+
+  subrtx_iterator::array_type array;
+  FOR_EACH_SUBRTX (iter, array, rvalue, ALL)
+    {
+      /* Has shift operation. */
+      RTX_CODE code = GET_CODE (*iter);
+      if (code == ASHIFT || code == ASHIFTRT
+          || code == LSHIFTRT || code == ROTATERT)
+        return true;
+    }
+
+  return false;
+}

Hi Kyrill,

Thanks for the advice. The refactored patch is attached.

--
Regards,
Xingxing

commit 3627056607b1e8604ac8d85ed44fdc7d3209cd3e
Author: Xingxing Pan
Date: Thu Dec 18 16:58:05 2014 +0800

2015-01-13 Xingxing Pan

* config/arm/arm-cores.def: Add new core marvell-whitney.
* config/arm/arm-protos.h: (marvell_whitney_vector_mode_qi): Declare.
(marvell_whitney_inner_shift): Ditto.
* config/arm/arm-tables.opt: Regenerated.
* config/arm/arm-tune.md: Regenerated.
* config/arm/arm.c (arm_marvell_whitney_tune): New structure.
(arm_issue_rate): Add marvell_whitney.
(marvell_whitney_vector_mode_qi): New function.
(marvell_whitney_inner_shift): Ditto.
* config/arm/arm.md: Include marvell-whitney.md.
(generic_sched): Add marvell_whitney.
(generic_vfp): Ditto.
* config/arm/bpabi.h (BE8_LINK_SPEC): Add marvell-whitney.
* config/arm/t-arm (MD_INCLUDES): Add marvell-whitney.md.
* config/arm/marvell-whitney.md: New file.
* doc/invoke.texi: Document marvell-whitney.

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index 6fa5d99..26eb7ab 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -159,6 +159,7 @@ ARM_CORE("cortex-m7", cortexm7, cortexm7, 7EM, FL_LDSCHED, cortex_m7)
 ARM_CORE("cortex-m4", cortexm4, cortexm4, 7EM, FL_LDSCHED, v7m)
 ARM_CORE("cortex-m3", cortexm3, cortexm3, 7M, FL_LDSCHED, v7m)
 ARM_CORE("marvell-pj4", marvell_pj4, marvell_pj4, 7A, FL_LDSCHED, 9e)
+ARM_CORE("marvell-whitney", marvell_whitney, marvell_whitney, 7A, FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, marvell_whitney)
 
 /* V7 big.LITTLE implementations */
 ARM_CORE("cortex-a15.cortex-a7", cortexa15cortexa7, cortexa7, 7A, FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a15)
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index fc45348..45001ae 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -231,6 +231,9 @@ extern void arm_order_regs_for_local_alloc (void);
 
 extern int arm_max_conditional_execute ();
 
+extern bool marvell_whitney_vector_mode_qi (rtx_insn *insn);
+extern bool marvell_whitney_inner_shift (rtx_insn *insn);
+
 /* Vectorizer cost model implementation. */
 struct cpu_vec_costs {
   const int scalar_stmt_cost; /* Cost of any scalar operation, excluding
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index ece9d5e..dc5f364 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -298,6 +298,9 @@ EnumValue
 Enum(processor_type) String(marvell-pj4) Value(marvell_pj4)
 
 EnumValue
+Enum(processor_type) String(marvell-whitney) Value(marvell_whitney)
+
+EnumValue
 Enum(processor_type) String(cortex-a15.cortex-a7) Value(cortexa15cortexa7)
 
 EnumValue
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index 452820ab..c73c33c 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -31,6 +31,7 @@
  cortexa15,cortexa17,cortexr4,
  cortexr4f,cortexr5,cortexr7,
  cortexm7,cortexm4,cortexm3,
- marvell_pj4,cortexa15cortexa7,cortexa17corte
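For readers following the review above, here is a minimal standalone sketch of the two checks under discussion: a byte-element test that asks "is this a vector mode, and is its inner mode QI?" instead of listing V8QI and V16QI, and a test for "not a shift itself, but contains a shift somewhere inside". It is not the attached patch, whose function bodies are not shown in this thread, and it does not use GCC's internal API. The types Mode, ModeInfo, Code and Expr, and the functions mode_info, vector_mode_qi and inner_shift, are all hypothetical stand-ins for machine modes, VECTOR_MODE_P / GET_MODE_INNER, RTX_CODE and rtx.

```cpp
// Standalone model only: nothing below is GCC code.
#include <cassert>
#include <vector>

// Hypothetical stand-in for a handful of machine modes.
enum Mode { QI, HI, SI, V8QI, V16QI, V4HI, V4SI };

struct ModeInfo
{
  bool is_vector;  // plays the role of VECTOR_MODE_P
  Mode inner;      // plays the role of GET_MODE_INNER
};

// Hypothetical mode table lookup.
static ModeInfo mode_info (Mode m)
{
  switch (m)
    {
    case V8QI:  return { true, QI };
    case V16QI: return { true, QI };
    case V4HI:  return { true, HI };
    case V4SI:  return { true, SI };
    default:    return { false, m };
    }
}

// True for any vector mode whose elements are bytes, without
// enumerating the individual vector QI modes.
static bool vector_mode_qi (Mode m)
{
  ModeInfo info = mode_info (m);
  return info.is_vector && info.inner == QI;
}

// Hypothetical stand-in for a few expression codes.
enum Code { REG, PLUS, ASHIFT, ASHIFTRT, LSHIFTRT, ROTATERT };

// Hypothetical expression tree: a code plus its operands.
struct Expr
{
  Code code;
  std::vector<Expr> operands;
};

static bool is_shift (Code c)
{
  return c == ASHIFT || c == ASHIFTRT || c == LSHIFTRT || c == ROTATERT;
}

// True if any node of the tree rooted at E, including E itself, is a shift.
static bool contains_shift (const Expr &e)
{
  if (is_shift (e.code))
    return true;
  for (const Expr &op : e.operands)
    if (contains_shift (op))
      return true;
  return false;
}

// True if SRC is not itself a shift but has a shift somewhere inside it,
// for example an add with a shifted operand.
static bool inner_shift (const Expr &src)
{
  return !is_shift (src.code) && contains_shift (src);
}
```

In this model, vector_mode_qi (V16QI) holds while vector_mode_qi (V4HI) does not, and inner_shift holds for a PLUS whose second operand is an ASHIFT but not for the ASHIFT on its own. In the real sources the same shape would presumably be expressed with VECTOR_MODE_P, GET_MODE_INNER and FOR_EACH_SUBRTX, as suggested in the review.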
Re: [PING][patch 1/2][ARM]: New CPU support for Marvell Whitney
On 19/12/2014 19:01, Xingxing Pan wrote:

On 19/12/2014 18:38, Kyrill Tkachov wrote:

Hi Xingxing,

It seems that your mail client mangled this patch; at least the following hunk doesn't apply, even when I try to get it from the web archives. Could you please resend it as an attachment perhaps?

Thanks,
Kyrill

On 18/12/14 10:13, Xingxing Pan wrote:

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index 423ee9e..b0ffbe1 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -159,6 +159,7 @@ ARM_CORE("cortex-m7", cortexm7, cortexm7, 7EM, FL_LDSCHED, cortex_m7)
 ARM_CORE("cortex-m4", cortexm4, cortexm4, 7EM, FL_LDSCHED, v7m)
 ARM_CORE("cortex-m3", cortexm3, cortexm3, 7M, FL_LDSCHED, v7m)
 ARM_CORE("marvell-pj4", marvell_pj4, marvell_pj4, 7A, FL_LDSCHED, 9e)
+ARM_CORE("marvell-whitney",marvell_whitney, marvell_whitney, 7A, FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, marvell_whitney)

Hi Kyrill,

I've changed the code to use the "tune" attribute directly. The new patch is attached.

Thanks,
Xingxing

Any comments are welcome.

Thanks,
Xingxing
Re: [PATCH, 2/2][ARM]: New CPU support for Marvell Whitney
On 19/12/2014 18:29, Xingxing Pan wrote:

On 19/12/2014 17:35, James Greenhalgh wrote:

On Fri, Dec 19, 2014 at 08:19:17AM +, Xingxing Pan wrote:

Hi,

This patch expands the arm types neon_logic, neon_from_gp and neon_to_gp. This change mainly benefits marvell-whitney cores and will not affect the pipeline descriptions of other ARM cores.

neon_logic is expanded to neon_logic_reg and neon_logic_imm, corresponding respectively to the predicates s_register_operand and imm_for_neon_logic_operand.

neon_from/to_gp is expanded to neon_reg_from/to_gp and neon_lane_from/to_gp, depending on whether the neon side is a single register or a register lane.

Sorry to ask for churn here, but the naming scheme for lane operations elsewhere in types.md seems to be:

neon_<_scalar><_q>

as in:

; neon_mul_s_scalar
; neon_mul_s_scalar_q

I think the types you are introducing should be:

neon_from_gp_scalar
neon_to_gp_scalar

Thanks,
James

Hi James,

Thanks for your comment. I've changed the type names.

Regards,
Xingxing

2014-12-19 Xingxing Pan

* config/arm/types.md: (neon_logic): Expand to neon_logic_reg and neon_logic_imm.
(neon_logic_q): Expand to neon_logic_reg_q and neon_logic_imm_q.
(neon_from_gp): Expand to neon_from_gp and neon_from_gp_scalar.
(neon_from_gp_q): Expand to neon_from_gp_q and neon_from_gp_scalar_q.
(neon_to_gp): Expand to neon_to_gp and neon_to_gp_scalar.
(neon_to_gp_q): Expand to neon_to_gp_q and neon_to_gp_scalar_q.
* config/aarch64/aarch64-simd.md: Ditto.
* config/aarch64/aarch64.md: Ditto.
* config/aarch64/thunderx.md: Ditto.
* config/arm/arm.md: Ditto.
* config/arm/cortex-a15-neon.md: Ditto.
* config/arm/cortex-a17-neon.md: Ditto.
* config/arm/cortex-a8-neon.md: Ditto.
* config/arm/cortex-a9-neon.md: Ditto.
* config/arm/neon.md: Ditto.
* config/arm/whitney.md: Ditto.
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index d4256a5..63a2b7e 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -115,7 +115,7 @@ } } [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\ - neon_logic, neon_to_gp, neon_from_gp,\ + neon_logic_reg, neon_to_gp_scalar, neon_from_gp_scalar,\ mov_reg, neon_move")] ) @@ -147,7 +147,7 @@ } } [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\ - neon_logic, multiple, multiple, multiple,\ + neon_logic_reg, multiple, multiple, multiple,\ neon_move") (set_attr "length" "4,4,4,8,8,8,4")] ) @@ -227,7 +227,7 @@ (match_operand:VQ 2 "vect_par_cnst_lo_half" "")))] "TARGET_SIMD && reload_completed" "umov\t%0, %1.d[0]" - [(set_attr "type" "neon_to_gp") + [(set_attr "type" "neon_to_gp_scalar") (set_attr "length" "4") ]) @@ -238,7 +238,7 @@ (match_operand:VQ 2 "vect_par_cnst_hi_half" "")))] "TARGET_SIMD && reload_completed" "umov\t%0, %1.d[1]" - [(set_attr "type" "neon_to_gp") + [(set_attr "type" "neon_to_gp_scalar") (set_attr "length" "4") ]) @@ -248,7 +248,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "orn\t%0., %2., %1." - [(set_attr "type" "neon_logic")] + [(set_attr "type" "neon_logic_reg")] ) (define_insn "bic3" @@ -257,7 +257,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "bic\t%0., %2., %1." - [(set_attr "type" "neon_logic")] + [(set_attr "type" "neon_logic_reg")] ) (define_insn "add3" @@ -440,7 +440,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "and\t%0., %1., %2." - [(set_attr "type" "neon_logic")] + [(set_attr "type" "neon_logic_reg")] ) (define_insn "ior3" @@ -449,7 +449,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "orr\t%0., %1., %2." - [(set_attr "type" "neon_logic")] + [(set_attr "type" "neon_logic_reg")] ) (define_insn "xor3" @@ -458,7 +458,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "eor\t%0., %1., %2
Re: [patch 1/2][ARM]: New CPU support for Marvell Whitney
On 19/12/2014 18:38, Kyrill Tkachov wrote:

Hi Xingxing,

It seems that your mail client mangled this patch; at least the following hunk doesn't apply, even when I try to get it from the web archives. Could you please resend it as an attachment perhaps?

Thanks,
Kyrill

On 18/12/14 10:13, Xingxing Pan wrote:

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index 423ee9e..b0ffbe1 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -159,6 +159,7 @@ ARM_CORE("cortex-m7", cortexm7, cortexm7, 7EM, FL_LDSCHED, cortex_m7)
 ARM_CORE("cortex-m4", cortexm4, cortexm4, 7EM, FL_LDSCHED, v7m)
 ARM_CORE("cortex-m3", cortexm3, cortexm3, 7M, FL_LDSCHED, v7m)
 ARM_CORE("marvell-pj4", marvell_pj4, marvell_pj4, 7A, FL_LDSCHED, 9e)
+ARM_CORE("marvell-whitney",marvell_whitney, marvell_whitney, 7A, FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, marvell_whitney)

Hi Kyrill,

I've changed the code to use the "tune" attribute directly. The new patch is attached.

Thanks,
Xingxing

commit 56745b611d40b77e1911075159f89959335d0298
Author: Xingxing Pan
Date: Thu Dec 18 16:58:05 2014 +0800

2014-12-18 Xingxing Pan

* config/arm/arm-cores.def: Add new core marvell-whitney.
* config/arm/arm-protos.h: (marvell_whitney_vector_element_size_is_byte): Declare.
(marvell_whitney_non_shift_with_shift_operand): Ditto.
* config/arm/arm-tables.opt: Regenerated.
* config/arm/arm-tune.md: Regenerated.
* config/arm/arm.c (arm_marvell_whitney_tune): New structure.
(arm_issue_rate): Add marvell_whitney.
(marvell_whitney_vector_element_size_is_byte): New function.
(marvell_whitney_non_shift_with_shift_operand): Ditto.
* config/arm/arm.md: Include marvell-whitney.md.
(generic_sched): Add marvell_whitney.
(generic_vfp): Ditto.
* config/arm/bpabi.h (BE8_LINK_SPEC): Add marvell-whitney.
* config/arm/t-arm (MD_INCLUDES): Add marvell-whitney.md.
* config/arm/marvell-whitney.md: New file.
* doc/invoke.texi: Document marvell-whitney.
diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def index 423ee9e..b0ffbe1 100644 --- a/gcc/config/arm/arm-cores.def +++ b/gcc/config/arm/arm-cores.def @@ -159,6 +159,7 @@ ARM_CORE("cortex-m7", cortexm7, cortexm7, 7EM, FL_LDSCHED, cortex_m7) ARM_CORE("cortex-m4", cortexm4, cortexm4, 7EM, FL_LDSCHED, v7m) ARM_CORE("cortex-m3", cortexm3, cortexm3, 7M, FL_LDSCHED, v7m) ARM_CORE("marvell-pj4", marvell_pj4, marvell_pj4, 7A, FL_LDSCHED, 9e) +ARM_CORE("marvell-whitney", marvell_whitney, marvell_whitney, 7A, FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, marvell_whitney) /* V7 big.LITTLE implementations */ ARM_CORE("cortex-a15.cortex-a7", cortexa15cortexa7, cortexa7, 7A, FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a15) diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h index 20cfa9f..e86db1e 100644 --- a/gcc/config/arm/arm-protos.h +++ b/gcc/config/arm/arm-protos.h @@ -231,6 +231,9 @@ extern void arm_order_regs_for_local_alloc (void); extern int arm_max_conditional_execute (); +extern bool marvell_whitney_vector_element_size_is_byte (rtx insn); +extern bool marvell_whitney_non_shift_with_shift_operand (rtx insn); + /* Vectorizer cost model implementation. 
*/ struct cpu_vec_costs { const int scalar_stmt_cost; /* Cost of any scalar operation, excluding diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt index 9b1886e..3371ce3 100644 --- a/gcc/config/arm/arm-tables.opt +++ b/gcc/config/arm/arm-tables.opt @@ -298,6 +298,9 @@ EnumValue Enum(processor_type) String(marvell-pj4) Value(marvell_pj4) EnumValue +Enum(processor_type) String(marvell-whitney) Value(marvell_whitney) + +EnumValue Enum(processor_type) String(cortex-a15.cortex-a7) Value(cortexa15cortexa7) EnumValue diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md index d300c51..c73c33c 100644 --- a/gcc/config/arm/arm-tune.md +++ b/gcc/config/arm/arm-tune.md @@ -28,9 +28,10 @@ cortexm1smallmultiply,cortexm0smallmultiply,cortexm0plussmallmultiply, genericv7a,cortexa5,cortexa7, cortexa8,cortexa9,cortexa12, - cortexa15,cortexa17,cortexr4,cortexr4f, - cortexr5,cortexr7,cortexm7, - cortexm4,cortexm3,marvell_pj4, - cortexa15cortexa7,cortexa17cortexa7,cortexa53, - cortexa57,cortexa57cortexa53" + cortexa15,cortexa17,cortexr4, + cortexr4f,cortexr5,cortexr7, + cortexm7,cortexm4,cortexm3, + marvell_pj4,marvell_whitney,cortexa15cortexa7, + cortexa17cortexa7,cortexa53,cortexa57, + cortexa57cortexa53" (const (symbol_ref "((enum attr_tune) arm_tune)"))) diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 0
Re: [PATCH, 2/2][ARM]: New CPU support for Marvell Whitney
On 19/12/2014 17:35, James Greenhalgh wrote:

On Fri, Dec 19, 2014 at 08:19:17AM +, Xingxing Pan wrote:

Hi,

This patch expands the arm types neon_logic, neon_from_gp and neon_to_gp. This change mainly benefits marvell-whitney cores and will not affect the pipeline descriptions of other ARM cores.

neon_logic is expanded to neon_logic_reg and neon_logic_imm, corresponding respectively to the predicates s_register_operand and imm_for_neon_logic_operand.

neon_from/to_gp is expanded to neon_reg_from/to_gp and neon_lane_from/to_gp, depending on whether the neon side is a single register or a register lane.

Sorry to ask for churn here, but the naming scheme for lane operations elsewhere in types.md seems to be:

neon_<_scalar><_q>

as in:

; neon_mul_s_scalar
; neon_mul_s_scalar_q

I think the types you are introducing should be:

neon_from_gp_scalar
neon_to_gp_scalar

Thanks,
James

Hi James,

Thanks for your comment. I've changed the type names.

Regards,
Xingxing

2014-12-19 Xingxing Pan

* config/arm/types.md: (neon_logic): Expand to neon_logic_reg and neon_logic_imm.
(neon_logic_q): Expand to neon_logic_reg_q and neon_logic_imm_q.
(neon_from_gp): Expand to neon_from_gp and neon_from_gp_scalar.
(neon_from_gp_q): Expand to neon_from_gp_q and neon_from_gp_scalar_q.
(neon_to_gp): Expand to neon_to_gp and neon_to_gp_scalar.
(neon_to_gp_q): Expand to neon_to_gp_q and neon_to_gp_scalar_q.
* config/aarch64/aarch64-simd.md: Ditto.
* config/aarch64/aarch64.md: Ditto.
* config/aarch64/thunderx.md: Ditto.
* config/arm/arm.md: Ditto.
* config/arm/cortex-a15-neon.md: Ditto.
* config/arm/cortex-a17-neon.md: Ditto.
* config/arm/cortex-a8-neon.md: Ditto.
* config/arm/cortex-a9-neon.md: Ditto.
* config/arm/neon.md: Ditto.
* config/arm/whitney.md: Ditto.
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index d4256a5..63a2b7e 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -115,7 +115,7 @@ } } [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\ - neon_logic, neon_to_gp, neon_from_gp,\ + neon_logic_reg, neon_to_gp_scalar, neon_from_gp_scalar,\ mov_reg, neon_move")] ) @@ -147,7 +147,7 @@ } } [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\ - neon_logic, multiple, multiple, multiple,\ + neon_logic_reg, multiple, multiple, multiple,\ neon_move") (set_attr "length" "4,4,4,8,8,8,4")] ) @@ -227,7 +227,7 @@ (match_operand:VQ 2 "vect_par_cnst_lo_half" "")))] "TARGET_SIMD && reload_completed" "umov\t%0, %1.d[0]" - [(set_attr "type" "neon_to_gp") + [(set_attr "type" "neon_to_gp_scalar") (set_attr "length" "4") ]) @@ -238,7 +238,7 @@ (match_operand:VQ 2 "vect_par_cnst_hi_half" "")))] "TARGET_SIMD && reload_completed" "umov\t%0, %1.d[1]" - [(set_attr "type" "neon_to_gp") + [(set_attr "type" "neon_to_gp_scalar") (set_attr "length" "4") ]) @@ -248,7 +248,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "orn\t%0., %2., %1." - [(set_attr "type" "neon_logic")] + [(set_attr "type" "neon_logic_reg")] ) (define_insn "bic3" @@ -257,7 +257,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "bic\t%0., %2., %1." - [(set_attr "type" "neon_logic")] + [(set_attr "type" "neon_logic_reg")] ) (define_insn "add3" @@ -440,7 +440,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "and\t%0., %1., %2." - [(set_attr "type" "neon_logic")] + [(set_attr "type" "neon_logic_reg")] ) (define_insn "ior3" @@ -449,7 +449,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "orr\t%0., %1., %2." - [(set_attr "type" "neon_logic")] + [(set_attr "type" "neon_logic_reg")] ) (define_insn "xor3" @@ -458,7 +458,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "eor\t%0., %1., %2." 
- [(set_attr "type" "neon_logic")
Re: [PATCH][ARM] Fix reservation pattern in cortex-a9-neon.md
Brilliant!

Xingxing

On 19/12/2014 17:44, James Greenhalgh wrote:

On Fri, Dec 19, 2014 at 02:46:51AM +, Xingxing Pan wrote:

Hi,

This patch fixes the reservation pattern of cortex_a9_neon_vmov in cortex-a9-neon.md. Is it OK for trunk?

This patch is obvious, and fixes my typo. I couldn't see your name or email address in the MAINTAINERS file, so I've committed this under the "obvious" rule on your behalf as revision 218895.

Thanks,
James

2014-12-19 Xingxing Pan

Note that there should be two spaces between your name and email address, as so:

2014-12-19 Xingxing Pan
[PATCH, 2/2][ARM]: New CPU support for Marvell Whitney
Hi,

This patch expands the arm types neon_logic, neon_from_gp and neon_to_gp. This change mainly benefits marvell-whitney cores and will not affect the pipeline descriptions of other ARM cores.

neon_logic is expanded to neon_logic_reg and neon_logic_imm, corresponding respectively to the predicates s_register_operand and imm_for_neon_logic_operand.

neon_from/to_gp is expanded to neon_reg_from/to_gp and neon_lane_from/to_gp, depending on whether the neon side is a single register or a register lane.

Tested on linux-gnueabi; no new regressions were found. OK for trunk?

Regards,
Xingxing

2014-12-19 Xingxing Pan

* config/arm/types.md: (neon_logic): Expand to neon_logic_reg and neon_logic_imm.
(neon_logic_q): Expand to neon_logic_reg_q and neon_logic_imm_q.
(neon_from_gp): Expand to neon_reg_from_gp and neon_lane_from_gp.
(neon_from_gp_q): Expand to neon_reg_from_gp_q and neon_lane_from_gp_q.
(neon_to_gp): Expand to neon_reg_to_gp and neon_lane_to_gp.
(neon_to_gp_q): Expand to neon_reg_to_gp_q and neon_lane_to_gp_q.
* config/aarch64/aarch64-simd.md: Ditto.
* config/aarch64/aarch64.md: Ditto.
* config/aarch64/thunderx.md: Ditto.
* config/arm/arm.md: Ditto.
* config/arm/cortex-a15-neon.md: Ditto.
* config/arm/cortex-a17-neon.md: Ditto.
* config/arm/cortex-a8-neon.md: Ditto.
* config/arm/cortex-a9-neon.md: Ditto.
* config/arm/neon.md: Ditto.
* config/arm/whitney.md: Ditto.
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index d4256a5..ea92940 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -49,7 +49,7 @@ "@ dup\\t%0., %1 dup\\t%0., %1.[0]" - [(set_attr "type" "neon_from_gp, neon_dup")] + [(set_attr "type" "neon_reg_from_gp, neon_dup")] ) (define_insn "aarch64_simd_dup" @@ -115,7 +115,7 @@ } } [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\ - neon_logic, neon_to_gp, neon_from_gp,\ + neon_logic_reg, neon_lane_to_gp, neon_lane_from_gp,\ mov_reg, neon_move")] ) @@ -147,7 +147,7 @@ } } [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\ - neon_logic, multiple, multiple, multiple,\ + neon_logic_reg, multiple, multiple, multiple,\ neon_move") (set_attr "length" "4,4,4,8,8,8,4")] ) @@ -227,7 +227,7 @@ (match_operand:VQ 2 "vect_par_cnst_lo_half" "")))] "TARGET_SIMD && reload_completed" "umov\t%0, %1.d[0]" - [(set_attr "type" "neon_to_gp") + [(set_attr "type" "neon_lane_to_gp") (set_attr "length" "4") ]) @@ -238,7 +238,7 @@ (match_operand:VQ 2 "vect_par_cnst_hi_half" "")))] "TARGET_SIMD && reload_completed" "umov\t%0, %1.d[1]" - [(set_attr "type" "neon_to_gp") + [(set_attr "type" "neon_lane_to_gp") (set_attr "length" "4") ]) @@ -248,7 +248,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "orn\t%0., %2., %1." - [(set_attr "type" "neon_logic")] + [(set_attr "type" "neon_logic_reg")] ) (define_insn "bic3" @@ -257,7 +257,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "bic\t%0., %2., %1." - [(set_attr "type" "neon_logic")] + [(set_attr "type" "neon_logic_reg")] ) (define_insn "add3" @@ -440,7 +440,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "and\t%0., %1., %2." - [(set_attr "type" "neon_logic")] + [(set_attr "type" "neon_logic_reg")] ) (define_insn "ior3" @@ -449,7 +449,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "orr\t%0., %1., %2." 
- [(set_attr "type" "neon_logic")] + [(set_attr "type" "neon_logic_reg")] ) (define_insn "xor3" @@ -458,7 +458,7 @@ (match_operand:VDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "eor\t%0., %1., %2." - [(set_a
[PATCH][ARM] Fix reservation pattern in cortex-a9-neon.md
Hi,

This patch fixes the reservation pattern of cortex_a9_neon_vmov in cortex-a9-neon.md. Is it OK for trunk?

Regards,
Xingxing

2014-12-19 Xingxing Pan

* config/arm/cortex-a9-neon.md (cortex_a9_neon_vmov): Change reservation from cortex_a8_neon_dp to cortex_a9_neon_dp.

diff --git a/gcc/config/arm/cortex-a9-neon.md b/gcc/config/arm/cortex-a9-neon.md
index 3ff93f9..5c02b32 100644
--- a/gcc/config/arm/cortex-a9-neon.md
+++ b/gcc/config/arm/cortex-a9-neon.md
@@ -376,7 +376,7 @@
 (define_insn_reservation "cortex_a9_neon_vmov" 3
   (and (eq_attr "tune" "cortexa9")
        (eq_attr "cortex_a9_neon_type" "neon_vmov"))
-  "cortex_a8_neon_dp")
+  "cortex_a9_neon_dp")
 
 ;; Instructions using this reservation read their (D|Q)n operands at N2,
 ;; their (D|Q)m operands at N1, their (D|Q)d operands at N3, and
[patch 1/2][ARM]: New CPU support for Marvell Whitney
Hi,

This patch contains the pipeline description for the Marvell Whitney core. Tested on arm-linux-gnueabi; no new regressions were found. Is it OK for trunk?

Regards,
Xingxing

2014-12-18 Xingxing Pan

* config/arm/arm-cores.def: Add new core marvell-whitney.
* config/arm/arm-protos.h: (marvell_whitney_vector_element_size_is_byte): Declare.
(marvell_whitney_non_shift_with_shift_operand): Ditto.
* config/arm/arm-tables.opt: Regenerated.
* config/arm/arm-tune.md: Regenerated.
* config/arm/arm.c (arm_marvell_whitney_tune): New structure.
(arm_issue_rate): Add marvell_whitney.
(marvell_whitney_vector_element_size_is_byte): New function.
(marvell_whitney_non_shift_with_shift_operand): Ditto.
* config/arm/arm.md: Include marvell-whitney.md.
(generic_sched): Add marvell_whitney.
(generic_vfp): Ditto.
* config/arm/bpabi.h (BE8_LINK_SPEC): Add marvell-whitney.
* config/arm/t-arm (MD_INCLUDES): Add marvell-whitney.md.
* config/arm/marvell-whitney.md: New file.
* doc/invoke.texi: Document marvell-whitney.

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index 423ee9e..b0ffbe1 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -159,6 +159,7 @@ ARM_CORE("cortex-m7", cortexm7, cortexm7, 7EM, FL_LDSCHED, cortex_m7)
 ARM_CORE("cortex-m4", cortexm4, cortexm4, 7EM, FL_LDSCHED, v7m)
 ARM_CORE("cortex-m3", cortexm3, cortexm3, 7M, FL_LDSCHED, v7m)
 ARM_CORE("marvell-pj4", marvell_pj4, marvell_pj4, 7A, FL_LDSCHED, 9e)
+ARM_CORE("marvell-whitney",marvell_whitney, marvell_whitney, 7A, FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, marvell_whitney)
 
 /* V7 big.LITTLE implementations */
 ARM_CORE("cortex-a15.cortex-a7", cortexa15cortexa7, cortexa7, 7A, FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a15)
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 20cfa9f..e86db1e 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -231,6 +231,9 @@ extern void arm_order_regs_for_local_alloc (void);
 
 extern int arm_max_conditional_execute ();
 
+extern bool marvell_whitney_vector_element_size_is_byte (rtx insn);
+extern bool marvell_whitney_non_shift_with_shift_operand (rtx insn);
+
 /* Vectorizer cost model implementation. */
 struct cpu_vec_costs {
   const int scalar_stmt_cost; /* Cost of any scalar operation, excluding
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index 9b1886e..3371ce3 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -298,6 +298,9 @@ EnumValue
 Enum(processor_type) String(marvell-pj4) Value(marvell_pj4)
 
 EnumValue
+Enum(processor_type) String(marvell-whitney) Value(marvell_whitney)
+
+EnumValue
 Enum(processor_type) String(cortex-a15.cortex-a7) Value(cortexa15cortexa7)
 
 EnumValue
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index d300c51..c73c33c 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -28,9 +28,10 @@
  cortexm1smallmultiply,cortexm0smallmultiply,cortexm0plussmallmultiply,
  genericv7a,cortexa5,cortexa7,
  cortexa8,cortexa9,cortexa12,
- cortexa15,cortexa17,cortexr4,cortexr4f,
- cortexr5,cortexr7,cortexm7,
- cortexm4,cortexm3,marvell_pj4,
- cortexa15cortexa7,cortexa17cortexa7,cortexa53,
- cortexa57,cortexa57cortexa53"
+ cortexa15,cortexa17,cortexr4,
+ cortexr4f,cortexr5,cortexr7,
+ cortexm7,cortexm4,cortexm3,
+ marvell_pj4,marvell_whitney,cortexa15cortexa7,
+ cortexa17cortexa7,cortexa53,cortexa57,
+ cortexa57cortexa53"
  (const (symbol_ref "((enum attr_tune) arm_tune)")))
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 0ec526b..183da4c 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1914,6 +1914,25 @@ const struct tune_params arm_cortex_a9_tune =
   8 /* Maximum insns to inline memset. */
 };
 
+const struct tune_params arm_marvell_whitney_tune =
+{
+  arm_9e_rtx_costs,
+  &cortexa9_extra_costs,
+  cortex_a9_sched_adjust_cost,
+  1, /* Constant limit. */
+  5, /* Max cond insns. */
+  ARM_PREFETCH_BENEFICIAL(4,32,32),
+  false, /* Prefer constant pool. */
+  arm_default_branch_cost,
+  false, /* Prefer LDRD/STRD. */
+  {true, true},/* Prefer non short circuit. */
+  &arm_default_vec_cost,/* Vectorizer costs. */
+  false,