Re: [PATCH] [ARM] Fix widen-sum pattern in neon.md.

2015-04-19 Thread Xingxing Pan

On 04/15/2015 03:13 AM, Ramana Radhakrishnan wrote:

On Thu, Mar 5, 2015 at 1:34 PM, Xingxing Pan  wrote:

Hi,

Expanding the widen-sum pattern always fails. The vectorizer expects the
operands to have the same size, while the current implementation of the
widen-sum pattern does not conform to this.

This patch implements the widen-sum pattern with vpadal, makes the vaddw
pattern anonymous, and adds widen-sum test cases for NEON.



Can you please respin, addressing James's and Kyrill's comments?


Ramana


--
Regards,
Xingxing


Hi,

Sorry for the late response.

The pattern is rewritten to utilize neon_vpadal's "0" 
constraints. I have run vect.exp and neon.exp on an ARMv7 board.


vect.exp has two new XFAILs:
XFAIL: gcc.dg/vect/slp-reduc-3.c scan-tree-dump-times vect "vectorizing 
stmts using SLP" 1
XFAIL: gcc.dg/vect/slp-reduc-3.c -flto -ffat-lto-objects 
scan-tree-dump-times vect "vectorizing stmts using SLP" 1


This is because widen-sum optimization precedes SLP. The xfail predicate 
vect_widen_sum_hi_to_si becomes true when widen-sum is enabled.


neon.exp has four new XFAILs:
XFAIL: gcc.target/arm/neon/vect-widen-sum-char2short-s-d.c 
scan-tree-dump-times vect "pattern recognized.*w\\+" 1
XFAIL: gcc.target/arm/neon/vect-widen-sum-char2short-s-d.c 
scan-rtl-dump-times expand "UNSPEC_VPADAL" 1
XFAIL: gcc.target/arm/neon/vect-widen-sum-char2short-s.c 
scan-tree-dump-times vect "pattern recognized.*w\\+" 1
XFAIL: gcc.target/arm/neon/vect-widen-sum-char2short-s.c 
scan-rtl-dump-times expand "UNSPEC_VPADAL" 1


If the widen-sum pattern is successfully expanded, "w+" and 
"UNSPEC_VPADAL" should appear in the dump file, as they do for the other 
vect-widen-sum-*.c tests. But vect-widen-sum-char2short-s[-d].c is 
special because at tree level the signed operations are converted 
into unsigned operations, which destroys the widen-sum pattern. That is 
due to the workaround for PR tree-optimization/25125. I just added an xfail, 
following gcc.dg/vect/vect-reduc-pattern-2c.c.
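
For readers unfamiliar with the pattern, here is a minimal sketch of the kind of
reduction loop that widen-sum is meant to cover: narrow elements accumulated
into a wider sum. This is an illustration only, not the content of the new
vect-widen-sum-*.c files; the function names are made up. The signed variant
corresponds to the char2short-s case that the message says is rewritten into
unsigned operations at tree level.

```c
#include <assert.h>
#include <stddef.h>

/* Unsigned char elements accumulated into an unsigned short sum.
   Hypothetical name, used only for this illustration.  */
unsigned short
sum_u8_to_u16 (const unsigned char *a, size_t n)
{
  unsigned short sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += a[i];
  return sum;
}

/* Signed char elements accumulated into a short sum.
   Hypothetical name, used only for this illustration.  */
short
sum_s8_to_s16 (const signed char *a, size_t n)
{
  short sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += a[i];
  return sum;
}

/* Small self-check with values that stay inside the 16-bit range.  */
void
check_sums (void)
{
  const unsigned char u[4] = { 250, 251, 252, 253 };
  const signed char s[4] = { -100, -100, 100, 50 };

  assert (sum_u8_to_u16 (u, 4) == 1006);
  assert (sum_s8_to_s16 (s, 4) == -50);
}
```

Whether a given compiler actually vectorizes these loops with a widen-sum
pattern depends on the target and options; the sketch only shows the source
shape.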



--
Regards,
Xingxing
commit c44b5bd19efb029b8bbd4e3c7e2d631bdc482b7c
Author: Xingxing Pan 
Date:   Sun Apr 19 15:54:43 2015 +0800

Fix widen-sum pattern in neon.md.

gcc/

2015-04-19  Xingxing Pan  

* config/arm/iterators.md (VWSD): New.
  (V_widen_sum_d): New.
* config/arm/neon.md (widen_ssum3): Redefined.
(widen_usum3): Ditto.
(neon_svaddw3): New anonymous define_insn.
(neon_uvaddw3): Ditto.

gcc/testsuite/

2015-04-19  Xingxing Pan  

* gcc.target/arm/neon/vect-widen-sum-char2short-s-d.c: New.
* gcc.target/arm/neon/vect-widen-sum-char2short-s.c: New.
* gcc.target/arm/neon/vect-widen-sum-char2short-u-d.c: New.
* gcc.target/arm/neon/vect-widen-sum-char2short-u.c: New.
* gcc.target/arm/neon/vect-widen-sum-short2int-s-d.c: New.
* gcc.target/arm/neon/vect-widen-sum-short2int-s.c: New.
* gcc.target/arm/neon/vect-widen-sum-short2int-u-d.c: New.
* gcc.target/arm/neon/vect-widen-sum-short2int-u.c: New.
* lib/target-supports.exp
(check_effective_target_vect_widen_sum_hi_to_si_pattern): Return 1 for ARM NEON.
(check_effective_target_vect_widen_sum_hi_to_si): Ditto.
(check_effective_target_vect_widen_sum_qi_to_hi): Ditto.

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index f7f8ab7..f73278d 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -95,6 +95,9 @@
 ;; Widenable modes.
 (define_mode_iterator VW [V8QI V4HI V2SI])
 
+;; Widenable modes.  Used by widen sum.
+(define_mode_iterator VWSD [V8QI V4HI V16QI V8HI])
+
 ;; Narrowable modes.
 (define_mode_iterator VN [V8HI V4SI V2DI])
 
@@ -555,9 +558,14 @@
 ;; Same as V_widen, but lower-case.
 (define_mode_attr V_widen_l [(V8QI "v8hi") (V4HI "v4si") ( V2SI "v2di")])
 
-;; Widen. Result is half the number of elements, but widened to double-width.
+;; Widen.  Result is half the number of elements, but widened to double-width.
 (define_mode_attr V_unpack   [(V16QI "V8HI") (V8HI "V4SI") (V4SI "V2DI")])
 
+;; Widen.  Result is half the number of elements, but widened to double-width.
+;; Used by widen sum.
+(define_mode_attr V_widen_sum_d [(V8QI "V4HI") (V4HI "V2SI")
+ (V16QI "V8HI") (V8HI "V4SI")])
+
 ;; Conditions to be used in extenddi patterns.
 (define_mode_attr qhs_zextenddi_cond [(SI "") (HI "&& arm_arch6") (QI "")])
 (define_mode_attr qhs_sextenddi_cond [(SI "") (HI "&& arm_arch6")
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 63c327e..839883f 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -1174,7 +1174,29 @@
 
 ;; Widening operations
 
-(define_insn 

[PATCH] [ARM] Fix widen-sum pattern in neon.md.

2015-03-05 Thread Xingxing Pan

Hi,

Expanding the widen-sum pattern always fails. The vectorizer expects 
the operands to have the same size, while the current implementation of 
the widen-sum pattern does not conform to this.


This patch implements the widen-sum pattern with vpadal, makes the 
vaddw pattern anonymous, and adds widen-sum test cases for NEON.
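
As a rough illustration of why vpadal fits here, the following is a scalar
model of one signed pairwise add-and-accumulate-long step on 64-bit vectors
(the V8QI to V4HI case). It is a sketch, not the code GCC emits; the function
names are made up. The accumulator and the narrow input occupy the same number
of bytes, which is the equal-size property the vectorizer expects, and the sum
of the accumulator lanes grows by exactly the sum of the input lanes.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of a signed pairwise add-and-accumulate-long step:
   each 16-bit accumulator lane receives the sum of two adjacent 8-bit
   input lanes.  Hypothetical name, for illustration only.  */
void
model_vpadal_s8 (int16_t acc[4], const int8_t in[8])
{
  for (int i = 0; i < 4; i++)
    acc[i] = (int16_t) (acc[i] + in[2 * i] + in[2 * i + 1]);
}

/* Sum of all four accumulator lanes.  */
int
lane_sum (const int16_t v[4])
{
  int sum = 0;
  for (int i = 0; i < 4; i++)
    sum += v[i];
  return sum;
}

/* Small self-check of the model.  */
void
check_model (void)
{
  int16_t acc[4] = { 100, 200, 300, 400 };
  const int8_t in[8] = { 1, -2, 3, -4, 5, -6, 7, -8 };

  /* Four 16-bit lanes and eight 8-bit lanes are both 8 bytes.  */
  assert (sizeof acc == sizeof in);

  model_vpadal_s8 (acc, in);
  assert (acc[0] == 99 && acc[1] == 199 && acc[2] == 299 && acc[3] == 399);

  /* The lane total grew by the sum of the inputs, which is -4.  */
  assert (lane_sum (acc) == 1000 - 4);
}
```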


--
Regards,
Xingxing
commit 62637f371a3329ff56644526bc5dbf9356cbdd6c
Author: Xingxing Pan 
Date:   Wed Feb 25 16:44:25 2015 +0800

Fix widen-sum pattern in neon.md.

2015-03-05  Xingxing Pan  

config/arm/
* iterators.md:
(VWSD): New define_mode_iterator.
(V_widen_sum_d): New define_mode_attr.
* neon.md
(widen_ssum3): Redefined.
(widen_usum3): Ditto.
(neon_svaddw3): New anonymous define_insn.
(neon_uvaddw3): Ditto.
testsuite/gcc.target/arm/neon/
* vect-widen-sum-char2short-s-d.c: New file.
* vect-widen-sum-char2short-s.c: Ditto.
* vect-widen-sum-char2short-u-d.c: Ditto.
* vect-widen-sum-char2short-u.c: Ditto.
* vect-widen-sum-short2int-s-d.c: Ditto.
* vect-widen-sum-short2int-s.c: Ditto.
* vect-widen-sum-short2int-u-d.c: Ditto.
* vect-widen-sum-short2int-u.c: Ditto.
testsuite/lib/
* target-supports.exp:
(check_effective_target_vect_widen_sum_hi_to_si_pattern): Return 1 for ARM NEON.
(check_effective_target_vect_widen_sum_hi_to_si): Ditto.
(check_effective_target_vect_widen_sum_qi_to_hi): Ditto.

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index f7f8ab7..4ba5901 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -95,6 +95,9 @@
 ;; Widenable modes.
 (define_mode_iterator VW [V8QI V4HI V2SI])
 
+;; Widenable modes. Used by widen sum.
+(define_mode_iterator VWSD [V8QI V4HI V16QI V8HI])
+
 ;; Narrowable modes.
 (define_mode_iterator VN [V8HI V4SI V2DI])
 
@@ -558,6 +561,11 @@
 ;; Widen. Result is half the number of elements, but widened to double-width.
 (define_mode_attr V_unpack   [(V16QI "V8HI") (V8HI "V4SI") (V4SI "V2DI")])
 
+;; Widen. Result is half the number of elements, but widened to double-width.
+;; Used by widen sum.
+(define_mode_attr V_widen_sum_d [(V8QI "V4HI") (V4HI "V2SI")
+ (V16QI "V8HI") (V8HI "V4SI")])
+
 ;; Conditions to be used in extenddi patterns.
 (define_mode_attr qhs_zextenddi_cond [(SI "") (HI "&& arm_arch6") (QI "")])
 (define_mode_attr qhs_sextenddi_cond [(SI "") (HI "&& arm_arch6")
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 63c327e..6cac36d 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -1174,7 +1174,31 @@
 
 ;; Widening operations
 
-(define_insn "widen_ssum3"
+(define_expand "widen_usum3"
+ [(match_operand: 0 "s_register_operand" "")
+  (match_operand:VWSD 1 "s_register_operand" "")
+  (match_operand: 2 "s_register_operand" "")]
+  "TARGET_NEON"
+  {
+emit_move_insn(operands[0], operands[2]);
+emit_insn (gen_neon_vpadalu (operands[0], operands[0], operands[1]));
+DONE;
+  }
+)
+
+(define_expand "widen_ssum3"
+ [(match_operand: 0 "s_register_operand" "")
+  (match_operand:VWSD 1 "s_register_operand" "")
+  (match_operand: 2 "s_register_operand" "")]
+  "TARGET_NEON"
+  {
+emit_move_insn(operands[0], operands[2]);
+emit_insn (gen_neon_vpadals (operands[0], operands[0], operands[1]));
+DONE;
+  }
+)
+
+(define_insn "*neon_svaddw3"
   [(set (match_operand: 0 "s_register_operand" "=w")
 	(plus: (sign_extend:
 			  (match_operand:VW 1 "s_register_operand" "%w"))
@@ -1184,7 +1208,7 @@
   [(set_attr "type" "neon_add_widen")]
 )
 
-(define_insn "widen_usum3"
+(define_insn "*neon_uvaddw3"
   [(set (match_operand: 0 "s_register_operand" "=w")
 	(plus: (zero_extend:
 			  (match_operand:VW 1 "s_register_operand" "%w"))
diff --git a/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-s-d.c b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-s-d.c
new file mode 100644
index 000..c81c325
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-s-d.c
@@ -0,0 +1,64 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-O2 -ffast-math -ftree-vectorize -mvectorize-with-neon-double -fdump-tree-vect-details -fdump-rtl-expand" } */
+/* { dg-add-options arm_neon } */
+
+/* { dg-final { scan-tree-dump-times "pattern recognized.*w\\\+" 1 "vect" { xfail *-*-* } } } */
+/* { dg-final { clea

Re: [PATCH][AArch64]: Fix rtl type in aarch64.md.

2015-02-28 Thread Xingxing Pan

On 02/28/2015 04:39 PM, James Greenhalgh wrote:

On Sat, Feb 28, 2015 at 01:29:15AM +, Xingxing Pan wrote:

On 02/27/2015 04:30 PM, Marcus Shawcroft wrote:

On 26 February 2015 at 06:22, Xingxing Pan  wrote:

This patch fixes the type of mov_aarch64 in aarch64.md.
Is it OK for trunk?


OK, thank you /Marcus



Could someone help apply the patch? I don't have SVN write
access yet.


Thanks for the patch, I've committed it on your behalf as
revision 221075.

Cheers,
James



Thanks.

--
Regards,
Xingxing


Re: [PATCH][AArch64]: Fix rtl type in aarch64.md.

2015-02-27 Thread Xingxing Pan

On 02/27/2015 04:30 PM, Marcus Shawcroft wrote:

On 26 February 2015 at 06:22, Xingxing Pan  wrote:

Hi,

This patch fixes the type of mov_aarch64 in aarch64.md.
Is it OK for trunk?


OK, thank you /Marcus



Hi,

Could someone help apply the patch? I don't have SVN write 
access yet.


--
Regards,
Xingxing


Re: [PATCH, 2/2][ARM]: New CPU support for Marvell Whitney

2015-02-25 Thread Xingxing Pan

On 02/25/2015 10:20 PM, James Greenhalgh wrote:

On Wed, Feb 25, 2015 at 01:42:39PM +, Xingxing Pan wrote:
> Hi,
>
> This patch expanding the following RTL types. And it has been merged to the 
latest code base.
>
>   (neon_logic): Expand to neon_logic_reg and neon_logic_imm.
>   (neon_logic_q): Expand to neon_logic_reg_q and neon_logic_imm_q.
>   (neon_from_gp): Expand to neon_from_gp and neon_from_gp_scalar.
>   (neon_from_gp_q): Expand to neon_from_gp_q and 
neon_from_gp_scalar_q.
>   (neon_to_gp): Expand to neon_to_gp and neon_to_gp_scalar.
>   (neon_to_gp_q): Expand to neon_to_gp_q and neon_to_gp_scalar_q.
>
> Is it OK for trunk?
>
> --
> Regards,
> Xingxing

I've had a look through the AArch64 parts, and they look OK to me
(though only Marcus or Richard can approve them), I have one additional
comment.

>   ;; In this insn, operand 1 should be low, and operand 2 the high part of the
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 8f157ce..8be2ebf 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -828,7 +828,7 @@
>}
>   }
> [(set_attr "type" "mov_reg,mov_imm,mov_imm,load1,load1,store1,store1,\
> - neon_from_gp,neon_from_gp, neon_dup")
> + neon_to_gp_scalar,neon_from_gp, neon_dup")
>  (set_attr "simd" "*,*,yes,*,*,*,*,yes,yes,yes")]
>   )

Here you change neon_from_gp to neon_to_gp_scalar.

This looks like the correct thing to do, but would you mind pulling it out
to a separate patch, first changing neon_from_gp to neon_to_gp?

I'd just like to have the bug-fix separate from the bigger infrastructure
change.

Thanks,
James


Hi James,

Thanks for your advice. I have submitted another patch to change the type from 
neon_from_gp to neon_to_gp.
See https://gcc.gnu.org/ml/gcc-patches/2015-02/msg01566.html.

The updated patch is attached.

--
Regards,
Xingxing
Expand several arm types.

2015-02-26  Xingxing Pan  

* config/arm/types.md:
(neon_logic): Expand to neon_logic_reg and neon_logic_imm.
(neon_logic_q): Expand to neon_logic_reg_q and neon_logic_imm_q.
(neon_from_gp): Expand to neon_from_gp and neon_from_gp_scalar.
(neon_from_gp_q): Expand to neon_from_gp_q and neon_from_gp_scalar_q.
(neon_to_gp): Expand to neon_to_gp and neon_to_gp_scalar.
(neon_to_gp_q): Expand to neon_to_gp_q and neon_to_gp_scalar_q.
* config/aarch64/aarch64-simd.md: Ditto.
* config/aarch64/aarch64.md: Ditto.
* config/aarch64/thunderx.md: Ditto.
* config/arm/arm.md: Ditto.
* config/arm/cortex-a15-neon.md: Ditto.
* config/arm/cortex-a17-neon.md: Ditto.
* config/arm/cortex-a57.md: Ditto.
* config/arm/cortex-a8-neon.md: Ditto.
* config/arm/cortex-a9-neon.md: Ditto.
* config/arm/marvell-whitney.md: Ditto.
* config/arm/neon.md: Ditto.
* config/arm/xgene1.md: Ditto.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 0557570..611d14c 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -115,7 +115,7 @@
  }
 }
   [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\
- neon_logic, neon_to_gp, neon_from_gp,\
+ neon_logic_reg, neon_to_gp_scalar, neon_from_gp_scalar,\
  mov_reg, neon_move")]
 )
 
@@ -147,7 +147,7 @@
 }
 }
   [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\
- neon_logic, multiple, multiple, multiple,\
+ neon_logic_reg, multiple, multiple, multiple,\
  neon_move")
(set_attr "length" "4,4,4,8,8,8,4")]
 )
@@ -218,7 +218,7 @@
   (match_operand:VQ 2 "vect_par_cnst_lo_half" "")))]
   "TARGET_SIMD && reload_completed"
   "umov\t%0, %1.d[0]"
-  [(set_attr "type" "neon_to_gp")
+  [(set_attr "type" "neon_to_gp_scalar")
(set_attr "length" "4")
   ])
 
@@ -229,7 +229,7 @@
   (match_operand:VQ 2 "vect_par_cnst_hi_half" "")))]
   "TARGET_SIMD && reload_completed"
   "umov\t%0, %1.d[1]"
-  [(set_attr "type" "neon_to_gp")
+  [(set_attr "type" "neon_to_gp_scalar")
(set_attr "length" "4")
   ])
 
@@ -239,7 +239,7 @@
 		(match_operand:VDQ_I 2 "register_operand" "w")))]
  "TARGET_SIMD"
  "orn\t%0., %2., %1."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )
 
 (define_insn "bic3"
@@ -248,7 +248,7 @@
 		(match_operand:VDQ_I 2 "

Re: [patch 1/2][ARM]: New CPU support for Marvell Whitney

2015-02-25 Thread Xingxing Pan

On 02/25/2015 09:32 PM, Xingxing Pan wrote:

Hi,

This patch merges the pipeline description for marvell-whitney into the
latest code base.
Is it OK for trunk?


Refactor the commit message.

--
Regards,
Xingxing
Add pipeline description for marvell-whitney.

2015-02-26  Xingxing Pan  

* config/arm/arm-cores.def: Add new core marvell-whitney.
* config/arm/arm-protos.h:
(marvell_whitney_vector_mode_qi): Declare.
(marvell_whitney_inner_shift): Ditto.
* config/arm/arm-tables.opt: Regenerated.
* config/arm/arm-tune.md: Regenerated.
* config/arm/arm.c (arm_marvell_whitney_tune): New structure.
(arm_issue_rate): Add marvell_whitney.
(marvell_whitney_vector_mode_qi): New function.
(marvell_whitney_inner_shift): Ditto.
* config/arm/arm.md: Include marvell-whitney.md.
(generic_sched): Add marvell_whitney.
(generic_vfp): Ditto.
* config/arm/bpabi.h (BE8_LINK_SPEC): Add marvell-whitney.
* config/arm/t-arm (MD_INCLUDES): Add marvell-whitney.md.
* config/arm/marvell-whitney.md: New file.
* doc/invoke.texi: Document marvell-whitney.

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index d7e730d..fc76eb5 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -159,6 +159,7 @@ ARM_CORE("cortex-m7",		cortexm7, cortexm7,		7EM, FL_LDSCHED, cortex_m7)
 ARM_CORE("cortex-m4",		cortexm4, cortexm4,		7EM, FL_LDSCHED, v7m)
 ARM_CORE("cortex-m3",		cortexm3, cortexm3,		7M,  FL_LDSCHED, v7m)
 ARM_CORE("marvell-pj4",		marvell_pj4, marvell_pj4,	7A,  FL_LDSCHED, 9e)
+ARM_CORE("marvell-whitney",	marvell_whitney, marvell_whitney, 7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, marvell_whitney)
 
 /* V7 big.LITTLE implementations */
 ARM_CORE("cortex-a15.cortex-a7", cortexa15cortexa7, cortexa7,	7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a15)
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 307babb..d047dbc 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -231,6 +231,9 @@ extern void arm_order_regs_for_local_alloc (void);
 
 extern int arm_max_conditional_execute ();
 
+extern bool marvell_whitney_vector_mode_qi (rtx_insn *insn);
+extern bool marvell_whitney_inner_shift (rtx_insn *insn);
+
 /* Vectorizer cost model implementation.  */
 struct cpu_vec_costs {
   const int scalar_stmt_cost;   /* Cost of any scalar operation, excluding
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index 3450e5b..f0f9f3f 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -298,6 +298,9 @@ EnumValue
 Enum(processor_type) String(marvell-pj4) Value(marvell_pj4)
 
 EnumValue
+Enum(processor_type) String(marvell-whitney) Value(marvell_whitney)
+
+EnumValue
 Enum(processor_type) String(cortex-a15.cortex-a7) Value(cortexa15cortexa7)
 
 EnumValue
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index d459f27..fbfab2e 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -31,7 +31,8 @@
 	cortexa15,cortexa17,cortexr4,
 	cortexr4f,cortexr5,cortexr7,
 	cortexm7,cortexm4,cortexm3,
-	marvell_pj4,cortexa15cortexa7,cortexa17cortexa7,
-	cortexa53,cortexa57,cortexa72,
-	xgene1,cortexa57cortexa53,cortexa72cortexa53"
+	marvell_pj4,marvell_whitney,cortexa15cortexa7,
+	cortexa17cortexa7,cortexa53,cortexa57,
+	cortexa72,xgene1,cortexa57cortexa53,
+	cortexa72cortexa53"
 	(const (symbol_ref "((enum attr_tune) arm_tune)")))
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 7bf5b4d..e68287f 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -2000,6 +2000,25 @@ const struct tune_params arm_cortex_a9_tune =
   ARM_SCHED_AUTOPREF_OFF			/* Sched L2 autopref.  */
 };
 
+const struct tune_params arm_marvell_whitney_tune =
+{
+  arm_9e_rtx_costs,
+  &cortexa9_extra_costs,
+  cortex_a9_sched_adjust_cost,
+  1,		/* Constant limit.  */
+  5,		/* Max cond insns.  */
+  ARM_PREFETCH_BENEFICIAL(4,32,32),
+  false,	/* Prefer constant pool.  */
+  arm_default_branch_cost,
+  false,	/* Prefer LDRD/STRD.  */
+  {true, true},	/* Prefer non short circuit.  */
+  &arm_default_vec_cost,/* Vectorizer costs.  */
+  false,/* Prefer Neon for 64-bits bitops.  */
+  false, false, /* Prefer 32-bit encodings.  */
+  false,	/* Prefer Neon for stringops.  */
+  8		/* Maximum insns to inline memset.  */
+};
+
 const struct tune_params arm_cortex_a12_tune =
 {
   arm_9e_rtx_costs,
@@ -11717,6 +11736,51 @@ fa726te_sched_adjust_cost (rtx_insn *insn, rtx link, rtx_insn *dep, int * cost)
   return true;
 }
 
+/* Return true if vector element size is byte.  */
+bool
+marvell_whitney_vector_mode_qi (rtx_insn *insn)
+{
+  machine_mode mode;
+
+  if (GET_CODE (PATTERN (insn)) == SET)
+{
+  mode = GET_MODE (SET_DE

[PATCH][AArch64]: Fix rtl type in aarch64.md.

2015-02-25 Thread Xingxing Pan

Hi,

This patch fixes the type of mov_aarch64 in aarch64.md.
Is it OK for trunk?

--
Regards,
Xingxing
[AArch64] Fix define_insn type in aarch64.md.

2015-02-26  Xingxing Pan  

* config/aarch64/aarch64.md:
  (mov_aarch64): Change type to neon_to_gp.

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 7103e0d..534a862 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -828,7 +828,7 @@
  }
 }
   [(set_attr "type" "mov_reg,mov_imm,mov_imm,load1,load1,store1,store1,\
- neon_from_gp,neon_from_gp, neon_dup")
+ neon_to_gp,neon_from_gp,neon_dup")
(set_attr "simd" "*,*,yes,*,*,*,*,yes,yes,yes")]
 )
 


Re: [PATCH, 2/2][ARM]: New CPU support for Marvell Whitney

2015-02-25 Thread Xingxing Pan

On 02/25/2015 09:42 PM, Xingxing Pan wrote:

Hi,

This patch expands the following RTL types, and it has been merged to
the latest code base.

 (neon_logic): Expand to neon_logic_reg and neon_logic_imm.
 (neon_logic_q): Expand to neon_logic_reg_q and neon_logic_imm_q.
 (neon_from_gp): Expand to neon_from_gp and neon_from_gp_scalar.
 (neon_from_gp_q): Expand to neon_from_gp_q and
neon_from_gp_scalar_q.
 (neon_to_gp): Expand to neon_to_gp and neon_to_gp_scalar.
 (neon_to_gp_q): Expand to neon_to_gp_q and neon_to_gp_scalar_q.

Is it OK for trunk?



Fix typos in commit message.

--
Regards,
Xingxing
commit b0d0ebf6a2553bc7b6cc8f72fbaa0104938d0d41
Author: Xingxing Pan 
Date:   Wed Feb 25 14:46:25 2015 +0800

2015-02-25  Xingxing Pan  

* config/arm/types.md:
(neon_logic): Expand to neon_logic_reg and neon_logic_imm.
(neon_logic_q): Expand to neon_logic_reg_q and neon_logic_imm_q.
(neon_from_gp): Expand to neon_from_gp and neon_from_gp_scalar.
(neon_from_gp_q): Expand to neon_from_gp_q and neon_from_gp_scalar_q.
(neon_to_gp): Expand to neon_to_gp and neon_to_gp_scalar.
(neon_to_gp_q): Expand to neon_to_gp_q and neon_to_gp_scalar_q.
* config/aarch64/aarch64-simd.md: Ditto.
* config/aarch64/aarch64.md: Ditto.
* config/aarch64/thunderx.md: Ditto.
* config/arm/arm.md: Ditto.
* config/arm/cortex-a15-neon.md: Ditto.
* config/arm/cortex-a17-neon.md: Ditto.
* config/arm/cortex-a57.md: Ditto.
* config/arm/cortex-a8-neon.md: Ditto.
* config/arm/cortex-a9-neon.md: Ditto.
* config/arm/marvell-whitney.md: Ditto.
* config/arm/neon.md: Ditto.
* config/arm/xgene1.md: Ditto.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 0557570..611d14c 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -115,7 +115,7 @@
  }
 }
   [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\
- neon_logic, neon_to_gp, neon_from_gp,\
+ neon_logic_reg, neon_to_gp_scalar, neon_from_gp_scalar,\
  mov_reg, neon_move")]
 )
 
@@ -147,7 +147,7 @@
 }
 }
   [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\
- neon_logic, multiple, multiple, multiple,\
+ neon_logic_reg, multiple, multiple, multiple,\
  neon_move")
(set_attr "length" "4,4,4,8,8,8,4")]
 )
@@ -218,7 +218,7 @@
   (match_operand:VQ 2 "vect_par_cnst_lo_half" "")))]
   "TARGET_SIMD && reload_completed"
   "umov\t%0, %1.d[0]"
-  [(set_attr "type" "neon_to_gp")
+  [(set_attr "type" "neon_to_gp_scalar")
(set_attr "length" "4")
   ])
 
@@ -229,7 +229,7 @@
   (match_operand:VQ 2 "vect_par_cnst_hi_half" "")))]
   "TARGET_SIMD && reload_completed"
   "umov\t%0, %1.d[1]"
-  [(set_attr "type" "neon_to_gp")
+  [(set_attr "type" "neon_to_gp_scalar")
(set_attr "length" "4")
   ])
 
@@ -239,7 +239,7 @@
 		(match_operand:VDQ_I 2 "register_operand" "w")))]
  "TARGET_SIMD"
  "orn\t%0., %2., %1."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )
 
 (define_insn "bic3"
@@ -248,7 +248,7 @@
 		(match_operand:VDQ_I 2 "register_operand" "w")))]
  "TARGET_SIMD"
  "bic\t%0., %2., %1."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )
 
 (define_insn "add3"
@@ -444,7 +444,7 @@
 		 (match_operand:VDQ_I 2 "register_operand" "w")))]
   "TARGET_SIMD"
   "and\t%0., %1., %2."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )
 
 (define_insn "ior3"
@@ -453,7 +453,7 @@
 		 (match_operand:VDQ_I 2 "register_operand" "w")))]
   "TARGET_SIMD"
   "orr\t%0., %1., %2."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )
 
 (define_insn "xor3"
@@ -462,7 +462,7 @@
 		 (match_operand:VDQ_I 2 "register_operand" "w")))]
   "TARGET_SIMD"
   "eor\t%0., %1., %2."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )
 
 (define_insn "one_cmpl2"
@@ -470,7 +470,7 @@
 (not:VDQ_I (match_operand:VDQ_I 1 "register_operand" "w&

Re: [PATCH, 2/2][ARM]: New CPU support for Marvell Whitney

2015-02-25 Thread Xingxing Pan

Hi,

This patch expands the following RTL types, and it has been merged to the 
latest code base.

(neon_logic): Expand to neon_logic_reg and neon_logic_imm.
(neon_logic_q): Expand to neon_logic_reg_q and neon_logic_imm_q.
(neon_from_gp): Expand to neon_from_gp and neon_from_gp_scalar.
(neon_from_gp_q): Expand to neon_from_gp_q and neon_from_gp_scalar_q.
(neon_to_gp): Expand to neon_to_gp and neon_to_gp_scalar.
(neon_to_gp_q): Expand to neon_to_gp_q and neon_to_gp_scalar_q.

Is it OK for trunk?

--
Regards,
Xingxing
commit b0d0ebf6a2553bc7b6cc8f72fbaa0104938d0d41
Author: Xingxing Pan 
Date:   Wed Feb 25 14:46:25 2015 +0800

2015-02-25  Xingxing Pan  

* config/arm/types.md:
(neon_logic): Expand to neon_logic_reg and neon_logic_imm.
(neon_logic_q): Expand to neon_logic_reg_q and neon_logic_imm_q.
(neon_from_gp): Expand to neon_from_gp and neon_from_gp_scalar.
(neon_from_gp_q): Expand to neon_from_gp_q and neon_from_gp_scalar_q.
(neon_to_gp): Expand to neon_to_gp and neon_to_gp_scalar.
(neon_to_gp_q): Expand to neon_to_gp_q and neon_to_gp_scalar_q.
* config/aarch64/aarch64-simd.md: Ditto.
* config/aarch64/aarch64.md: Ditto.
* config/aarch64/thunderx.md: Ditto.
* config/arm/arm.md: Ditto.
* config/arm/cortex-a15-neon.md: Ditto.
* config/arm/cortex-a17-neon.md: Ditto.
* config/arm/cortex-a57.md: Ditto.
* config/arm/cortex-a8-neon.md: Ditto.
* config/arm/cortex-a9-neon.md: Ditto.
* config/arm/marvell-whitney.md: Ditto.
* config/arm/neon.md: Ditto.
* config/arm/types.md: Ditto.
* config/arm/xgene1.md: Ditto.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 0557570..611d14c 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -115,7 +115,7 @@
  }
 }
   [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\
- neon_logic, neon_to_gp, neon_from_gp,\
+ neon_logic_reg, neon_to_gp_scalar, neon_from_gp_scalar,\
  mov_reg, neon_move")]
 )
 
@@ -147,7 +147,7 @@
 }
 }
   [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\
- neon_logic, multiple, multiple, multiple,\
+ neon_logic_reg, multiple, multiple, multiple,\
  neon_move")
(set_attr "length" "4,4,4,8,8,8,4")]
 )
@@ -218,7 +218,7 @@
   (match_operand:VQ 2 "vect_par_cnst_lo_half" "")))]
   "TARGET_SIMD && reload_completed"
   "umov\t%0, %1.d[0]"
-  [(set_attr "type" "neon_to_gp")
+  [(set_attr "type" "neon_to_gp_scalar")
(set_attr "length" "4")
   ])
 
@@ -229,7 +229,7 @@
   (match_operand:VQ 2 "vect_par_cnst_hi_half" "")))]
   "TARGET_SIMD && reload_completed"
   "umov\t%0, %1.d[1]"
-  [(set_attr "type" "neon_to_gp")
+  [(set_attr "type" "neon_to_gp_scalar")
(set_attr "length" "4")
   ])
 
@@ -239,7 +239,7 @@
 		(match_operand:VDQ_I 2 "register_operand" "w")))]
  "TARGET_SIMD"
  "orn\t%0., %2., %1."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )
 
 (define_insn "bic3"
@@ -248,7 +248,7 @@
 		(match_operand:VDQ_I 2 "register_operand" "w")))]
  "TARGET_SIMD"
  "bic\t%0., %2., %1."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )
 
 (define_insn "add3"
@@ -444,7 +444,7 @@
 		 (match_operand:VDQ_I 2 "register_operand" "w")))]
   "TARGET_SIMD"
   "and\t%0., %1., %2."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )
 
 (define_insn "ior3"
@@ -453,7 +453,7 @@
 		 (match_operand:VDQ_I 2 "register_operand" "w")))]
   "TARGET_SIMD"
   "orr\t%0., %1., %2."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )
 
 (define_insn "xor3"
@@ -462,7 +462,7 @@
 		 (match_operand:VDQ_I 2 "register_operand" "w")))]
   "TARGET_SIMD"
   "eor\t%0., %1., %2."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )
 
 (define_insn "one_cmpl2"
@@ -470,7 +470,7 @@
 (not:VDQ_I (match_operand:VDQ_I 1 "register_operand" "w")))]
   "TARGET_SIMD"

Re: [patch 1/2][ARM]: New CPU support for Marvell Whitney

2015-02-25 Thread Xingxing Pan

Hi,

This patch merges the pipeline description for marvell-whitney into the latest code base.
Is it OK for trunk?

--
Regards,
Xingxing
commit 83974dde8d9f773df1004aa1d5e3b05d8a33f5e0
Author: Xingxing Pan 
Date:   Wed Feb 25 10:24:40 2015 +0800

2015-02-25 Xingxing Pan  

* config/arm/arm-cores.def: Add new core marvell-whitney.
* config/arm/arm-protos.h:
(marvell_whitney_vector_mode_qi): Declare.
(marvell_whitney_inner_shift): Ditto.
* config/arm/arm-tables.opt: Regenerated.
* config/arm/arm-tune.md: Regenerated.
* config/arm/arm.c (arm_marvell_whitney_tune): New structure.
(arm_issue_rate): Add marvell_whitney.
(marvell_whitney_vector_mode_qi): New function.
(marvell_whitney_inner_shift): Ditto.
* config/arm/arm.md: Include marvell-whitney.md.
(generic_sched): Add marvell_whitney.
(generic_vfp): Ditto.
* config/arm/bpabi.h (BE8_LINK_SPEC): Add marvell-whitney.
* config/arm/t-arm (MD_INCLUDES): Add marvell-whitney.md.
* config/arm/marvell-whitney.md: New file.
* doc/invoke.texi: Document marvell-whitney.

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index d7e730d..fc76eb5 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -159,6 +159,7 @@ ARM_CORE("cortex-m7",		cortexm7, cortexm7,		7EM, FL_LDSCHED, cortex_m7)
 ARM_CORE("cortex-m4",		cortexm4, cortexm4,		7EM, FL_LDSCHED, v7m)
 ARM_CORE("cortex-m3",		cortexm3, cortexm3,		7M,  FL_LDSCHED, v7m)
 ARM_CORE("marvell-pj4",		marvell_pj4, marvell_pj4,	7A,  FL_LDSCHED, 9e)
+ARM_CORE("marvell-whitney",	marvell_whitney, marvell_whitney, 7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, marvell_whitney)
 
 /* V7 big.LITTLE implementations */
 ARM_CORE("cortex-a15.cortex-a7", cortexa15cortexa7, cortexa7,	7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a15)
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 307babb..d047dbc 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -231,6 +231,9 @@ extern void arm_order_regs_for_local_alloc (void);
 
 extern int arm_max_conditional_execute ();
 
+extern bool marvell_whitney_vector_mode_qi (rtx_insn *insn);
+extern bool marvell_whitney_inner_shift (rtx_insn *insn);
+
 /* Vectorizer cost model implementation.  */
 struct cpu_vec_costs {
   const int scalar_stmt_cost;   /* Cost of any scalar operation, excluding
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index 3450e5b..f0f9f3f 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -298,6 +298,9 @@ EnumValue
 Enum(processor_type) String(marvell-pj4) Value(marvell_pj4)
 
 EnumValue
+Enum(processor_type) String(marvell-whitney) Value(marvell_whitney)
+
+EnumValue
 Enum(processor_type) String(cortex-a15.cortex-a7) Value(cortexa15cortexa7)
 
 EnumValue
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index d459f27..fbfab2e 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -31,7 +31,8 @@
 	cortexa15,cortexa17,cortexr4,
 	cortexr4f,cortexr5,cortexr7,
 	cortexm7,cortexm4,cortexm3,
-	marvell_pj4,cortexa15cortexa7,cortexa17cortexa7,
-	cortexa53,cortexa57,cortexa72,
-	xgene1,cortexa57cortexa53,cortexa72cortexa53"
+	marvell_pj4,marvell_whitney,cortexa15cortexa7,
+	cortexa17cortexa7,cortexa53,cortexa57,
+	cortexa72,xgene1,cortexa57cortexa53,
+	cortexa72cortexa53"
 	(const (symbol_ref "((enum attr_tune) arm_tune)")))
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 7bf5b4d..e68287f 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -2000,6 +2000,25 @@ const struct tune_params arm_cortex_a9_tune =
   ARM_SCHED_AUTOPREF_OFF			/* Sched L2 autopref.  */
 };
 
+const struct tune_params arm_marvell_whitney_tune =
+{
+  arm_9e_rtx_costs,
+  &cortexa9_extra_costs,
+  cortex_a9_sched_adjust_cost,
+  1,		/* Constant limit.  */
+  5,		/* Max cond insns.  */
+  ARM_PREFETCH_BENEFICIAL(4,32,32),
+  false,	/* Prefer constant pool.  */
+  arm_default_branch_cost,
+  false,	/* Prefer LDRD/STRD.  */
+  {true, true},	/* Prefer non short circuit.  */
+  &arm_default_vec_cost,/* Vectorizer costs.  */
+  false,/* Prefer Neon for 64-bits bitops.  */
+  false, false, /* Prefer 32-bit encodings.  */
+  false,	/* Prefer Neon for stringops.  */
+  8		/* Maximum insns to inline memset.  */
+};
+
 const struct tune_params arm_cortex_a12_tune =
 {
   arm_9e_rtx_costs,
@@ -11717,6 +11736,51 @@ fa726te_sched_adjust_cost (rtx_insn *insn, rtx link, rtx_insn *dep, int * cost)
   return true;
 }
 
+/* Return true if vector element size is byte.  */
+bool
+marvell_whitney_vector_mode_qi (rtx_insn *insn)
+{
+  machine_mode mode;
+
+  if (GET_CODE (PATT

Re: [patch 1/2][ARM]: New CPU support for Marvell Whitney

2015-01-13 Thread Xingxing Pan

On 09/01/2015 19:22, Kyrill Tkachov wrote:

Hi Xingxing,

On 19/12/14 11:01, Xingxing Pan wrote:

+/* Return true if vector element size is byte. */

Minor nit: two spaces after the full stop and before */.  Same in other places
in the patch.


+bool
+marvell_whitney_vector_element_size_is_byte (rtx insn)
+{
+  if (GET_CODE (PATTERN (insn)) == SET)
+{
+  if ((GET_MODE (SET_DEST (PATTERN (insn))) == V8QImode) ||
+  (GET_MODE (SET_DEST (PATTERN (insn))) == V16QImode))
+   return true;
+}
+
+  return false;
+}


I see this is called from inside marvell-whitney.md. It seems to me that
this function takes RTX insns. Can the type of this be strengthened to
rtx_insn * ?
Also, this should be refactored and written a bit more generally by
checking for VECTOR_MODE_P and then GET_MODE_INNER for QImode, saving
you the trouble of enumerating the different vector QI modes.
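
A minimal sketch of the more general check suggested above, written as standalone C so it can be compiled outside GCC. The mock_* names and the small mode table are hypothetical stand-ins for GCC's machine_mode, VECTOR_MODE_P and GET_MODE_INNER; they are assumptions for illustration, not the real GCC definitions or the code from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a few GCC machine modes.  */
enum mock_mode
{
  MOCK_QI, MOCK_HI, MOCK_SI,
  MOCK_V8QI, MOCK_V16QI, MOCK_V4HI, MOCK_V4SI,
  MOCK_MODE_COUNT
};

struct mock_mode_info
{
  bool is_vector;        /* Plays the role of VECTOR_MODE_P.  */
  enum mock_mode inner;  /* Plays the role of GET_MODE_INNER.  */
};

static const struct mock_mode_info mock_modes[MOCK_MODE_COUNT] =
{
  [MOCK_QI]    = { false, MOCK_QI },
  [MOCK_HI]    = { false, MOCK_HI },
  [MOCK_SI]    = { false, MOCK_SI },
  [MOCK_V8QI]  = { true,  MOCK_QI },
  [MOCK_V16QI] = { true,  MOCK_QI },
  [MOCK_V4HI]  = { true,  MOCK_HI },
  [MOCK_V4SI]  = { true,  MOCK_SI },
};

bool
mock_vector_mode_p (enum mock_mode mode)
{
  assert (mode < MOCK_MODE_COUNT);
  return mock_modes[mode].is_vector;
}

enum mock_mode
mock_mode_inner (enum mock_mode mode)
{
  assert (mode < MOCK_MODE_COUNT);
  return mock_modes[mode].inner;
}

/* True for any vector mode whose elements are bytes, without
   enumerating the individual vector QI modes.  */
bool
mock_vector_mode_qi (enum mock_mode mode)
{
  return mock_vector_mode_p (mode) && mock_mode_inner (mode) == MOCK_QI;
}
```

With this shape, a newly added byte-element vector mode only needs a table entry; the predicate itself does not change.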



+
+/* Return true if INSN has shift operation but is not a shift insn. */
+bool
+marvell_whitney_non_shift_with_shift_operand (rtx insn)


Similar comment. Can this be strengthened to rtx_insn * ?

Thanks,
Kyrill

+{
+  rtx pat = PATTERN (insn);
+
+  if (GET_CODE (pat) != SET)
+return false;
+
+  /* Is not a shift insn. */
+  rtx rvalue = SET_SRC (pat);
+  RTX_CODE code = GET_CODE (rvalue);
+  if (code == ASHIFT || code == ASHIFTRT
+  || code == LSHIFTRT || code == ROTATERT)
+return false;
+
+  subrtx_iterator::array_type array;
+  FOR_EACH_SUBRTX (iter, array, rvalue, ALL)
+{
+  /* Has shift operation. */
+  RTX_CODE code = GET_CODE (*iter);
+  if (code == ASHIFT || code == ASHIFTRT
+  || code == LSHIFTRT || code == ROTATERT)
+return true;
+}
+
+  return false;
+}
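
The logic of the quoted function can be sketched as standalone C: reject a source expression that is itself a shift, then report whether any sub-expression is a shift. The mock_rtx tree and the mock_* names below are hypothetical simplifications (two operands per node, plain recursion instead of FOR_EACH_SUBRTX), assumed only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a few RTX codes.  */
enum mock_code
{
  MOCK_REG, MOCK_PLUS,
  MOCK_ASHIFT, MOCK_ASHIFTRT, MOCK_LSHIFTRT, MOCK_ROTATERT
};

/* Simplified expression node with at most two operands.  */
struct mock_rtx
{
  enum mock_code code;
  const struct mock_rtx *op0;
  const struct mock_rtx *op1;
};

static bool
mock_shift_code_p (enum mock_code code)
{
  return code == MOCK_ASHIFT || code == MOCK_ASHIFTRT
	 || code == MOCK_LSHIFTRT || code == MOCK_ROTATERT;
}

/* True if X or any of its sub-expressions is a shift.  */
static bool
mock_contains_shift (const struct mock_rtx *x)
{
  if (x == NULL)
    return false;
  if (mock_shift_code_p (x->code))
    return true;
  return mock_contains_shift (x->op0) || mock_contains_shift (x->op1);
}

/* True if SRC has a shift operation inside it but is not itself
   a shift.  */
bool
mock_inner_shift (const struct mock_rtx *src)
{
  assert (src != NULL);
  if (mock_shift_code_p (src->code))
    return false;
  return mock_contains_shift (src->op0) || mock_contains_shift (src->op1);
}
```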




Hi Kyrill,

Thanks for the advice. The refactored patch is attached.

--
Regards,
Xingxing
commit 3627056607b1e8604ac8d85ed44fdc7d3209cd3e
Author: Xingxing Pan 
Date:   Thu Dec 18 16:58:05 2014 +0800

2015-01-13 Xingxing Pan  

* config/arm/arm-cores.def: Add new core marvell-whitney.
* config/arm/arm-protos.h:
(marvell_whitney_vector_mode_qi): Declare.
(marvell_whitney_inner_shift): Ditto.
* config/arm/arm-tables.opt: Regenerated.
* config/arm/arm-tune.md: Regenerated.
* config/arm/arm.c (arm_marvell_whitney_tune): New structure.
(arm_issue_rate): Add marvell_whitney.
(marvell_whitney_vector_mode_qi): New function.
(marvell_whitney_inner_shift): Ditto.
* config/arm/arm.md: Include marvell-whitney.md.
(generic_sched): Add marvell_whitney.
(generic_vfp): Ditto.
* config/arm/bpabi.h (BE8_LINK_SPEC): Add marvell-whitney.
* config/arm/t-arm (MD_INCLUDES): Add marvell-whitney.md.
* config/arm/marvell-whitney.md: New file.
* doc/invoke.texi: Document marvell-whitney.

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index 6fa5d99..26eb7ab 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -159,6 +159,7 @@ ARM_CORE("cortex-m7",		cortexm7, cortexm7,		7EM, FL_LDSCHED, cortex_m7)
 ARM_CORE("cortex-m4",		cortexm4, cortexm4,		7EM, FL_LDSCHED, v7m)
 ARM_CORE("cortex-m3",		cortexm3, cortexm3,		7M,  FL_LDSCHED, v7m)
 ARM_CORE("marvell-pj4",		marvell_pj4, marvell_pj4,	7A,  FL_LDSCHED, 9e)
+ARM_CORE("marvell-whitney",	marvell_whitney, marvell_whitney, 7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, marvell_whitney)
 
 /* V7 big.LITTLE implementations */
 ARM_CORE("cortex-a15.cortex-a7", cortexa15cortexa7, cortexa7,	7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a15)
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index fc45348..45001ae 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -231,6 +231,9 @@ extern void arm_order_regs_for_local_alloc (void);
 
 extern int arm_max_conditional_execute ();
 
+extern bool marvell_whitney_vector_mode_qi (rtx_insn *insn);
+extern bool marvell_whitney_inner_shift (rtx_insn *insn);
+
 /* Vectorizer cost model implementation.  */
 struct cpu_vec_costs {
   const int scalar_stmt_cost;   /* Cost of any scalar operation, excluding
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index ece9d5e..dc5f364 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -298,6 +298,9 @@ EnumValue
 Enum(processor_type) String(marvell-pj4) Value(marvell_pj4)
 
 EnumValue
+Enum(processor_type) String(marvell-whitney) Value(marvell_whitney)
+
+EnumValue
 Enum(processor_type) String(cortex-a15.cortex-a7) Value(cortexa15cortexa7)
 
 EnumValue
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index 452820ab..c73c33c 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -31,6 +31,7 @@
 	cortexa15,cortexa17,cortexr4,
 	cortexr4f,cortexr5,cortexr7,
 	cortexm7,cortexm4,cortexm3,
-	marvell_pj4,cortexa15cortexa7,cortexa17corte

Re: [PING][patch 1/2][ARM]: New CPU support for Marvell Whitney

2014-12-22 Thread Xingxing Pan

On 19/12/2014 19:01, Xingxing Pan wrote:

On 19/12/2014 18:38, Kyrill Tkachov wrote:

Hi Xingxing,

It seems that your mail client mangled this patch, at least the
following hunk doesn't apply, even when I try to get it from the web
archives.
Could you please resend it as an attachment perhaps?

Thanks,
Kyrill

On 18/12/14 10:13, Xingxing Pan wrote:

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index 423ee9e..b0ffbe1 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -159,6 +159,7 @@ ARM_CORE("cortex-m7",   cortexm7,
cortexm7, 7EM,
FL_LDSCHED, cortex_m7)
 ARM_CORE("cortex-m4", cortexm4, cortexm4, 7EM,
FL_LDSCHED, v7m)
 ARM_CORE("cortex-m3", cortexm3, cortexm3, 7M,
FL_LDSCHED, v7m)
 ARM_CORE("marvell-pj4",   marvell_pj4, marvell_pj4,
7A,  FL_LDSCHED, 9e)
+ARM_CORE("marvell-whitney",marvell_whitney,
marvell_whitney,
7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, marvell_whitney)





Hi Kyrill,

I've changed the code to use the "tune" attribute directly. The new patch is
attached.

Thanks,
Xingxing


Any comments are welcome.

Thanks,
Xingxing


Re: [PATCH, 2/2][ARM]: New CPU support for Marvell Whitney

2014-12-19 Thread Xingxing Pan

On 19/12/2014 18:29, Xingxing Pan wrote:

On 19/12/2014 17:35, James Greenhalgh wrote:

On Fri, Dec 19, 2014 at 08:19:17AM +, Xingxing Pan wrote:

Hi,

This patch expands the ARM types neon_logic, neon_from_gp and
neon_to_gp. The change mainly benefits marvell-whitney cores and
will not affect the pipeline descriptions of other ARM cores.

neon_logic is expanded to neon_logic_reg and neon_logic_imm,
corresponding respectively to the predicates s_register_operand and
imm_for_neon_logic_operand.

neon_from/to_gp is expanded to neon_reg_from/to_gp and
neon_lane_from/to_gp, depending on whether the neon side is a single
register or a register lane.


Sorry to ask for churn here, but the naming scheme for lane operations
elsewhere in types.md seems to be:

neon_<_scalar><_q>

as in:

; neon_mul_s_scalar
; neon_mul_s_scalar_q

I think the types you are introducing should be:

   neon_from_gp_scalar
   neon_to_gp_scalar

Thanks,
James


Hi James,

Thanks for your comment. I've changed the type names.

Regards,
Xingxing


     2014-12-19  Xingxing Pan  

 * config/arm/types.md:
 (neon_logic): Expand to neon_logic_reg and neon_logic_imm.
 (neon_logic_q): Expand to neon_logic_reg_q and neon_logic_imm_q.
 (neon_from_gp): Expand to neon_from_gp and neon_from_gp_scalar.
 (neon_from_gp_q): Expand to neon_from_gp_q and
neon_from_gp_scalar_q.
 (neon_to_gp): Expand to neon_to_gp and neon_to_gp_scalar.
 (neon_to_gp_q): Expand to neon_to_gp_q and neon_to_gp_scalar_q.
 * config/aarch64/aarch64-simd.md: Ditto.
 * config/aarch64/aarch64.md: Ditto.
 * config/aarch64/thunderx.md: Ditto.
 * config/arm/arm.md: Ditto.
 * config/arm/cortex-a15-neon.md: Ditto.
 * config/arm/cortex-a17-neon.md: Ditto.
 * config/arm/cortex-a8-neon.md: Ditto.
 * config/arm/cortex-a9-neon.md: Ditto.
 * config/arm/neon.md: Ditto.
 * config/arm/whitney.md: Ditto.

diff --git a/gcc/config/aarch64/aarch64-simd.md
b/gcc/config/aarch64/aarch64-simd.md
index d4256a5..63a2b7e 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -115,7 +115,7 @@
   }
  }
[(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\
- neon_logic, neon_to_gp, neon_from_gp,\
+ neon_logic_reg, neon_to_gp_scalar,
neon_from_gp_scalar,\
   mov_reg, neon_move")]
  )

@@ -147,7 +147,7 @@
  }
  }
[(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\
- neon_logic, multiple, multiple, multiple,\
+ neon_logic_reg, multiple, multiple, multiple,\
   neon_move")
 (set_attr "length" "4,4,4,8,8,8,4")]
  )
@@ -227,7 +227,7 @@
(match_operand:VQ 2 "vect_par_cnst_lo_half" "")))]
"TARGET_SIMD && reload_completed"
"umov\t%0, %1.d[0]"
-  [(set_attr "type" "neon_to_gp")
+  [(set_attr "type" "neon_to_gp_scalar")
 (set_attr "length" "4")
])

@@ -238,7 +238,7 @@
(match_operand:VQ 2 "vect_par_cnst_hi_half" "")))]
"TARGET_SIMD && reload_completed"
"umov\t%0, %1.d[1]"
-  [(set_attr "type" "neon_to_gp")
+  [(set_attr "type" "neon_to_gp_scalar")
 (set_attr "length" "4")
])

@@ -248,7 +248,7 @@
  (match_operand:VDQ_I 2 "register_operand" "w")))]
   "TARGET_SIMD"
   "orn\t%0., %2., %1."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
  )

  (define_insn "bic3"
@@ -257,7 +257,7 @@
  (match_operand:VDQ_I 2 "register_operand" "w")))]
   "TARGET_SIMD"
   "bic\t%0., %2., %1."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
  )

  (define_insn "add3"
@@ -440,7 +440,7 @@
   (match_operand:VDQ_I 2 "register_operand" "w")))]
"TARGET_SIMD"
"and\t%0., %1., %2."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
  )

  (define_insn "ior3"
@@ -449,7 +449,7 @@
   (match_operand:VDQ_I 2 "register_operand" "w")))]
"TARGET_SIMD"
"orr\t%0., %1., %2."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
  )

  (define_insn "xor3"
@@ -458,7 +458,7 @@
   (match_operand:VDQ_I 2 "register_operand" "w")))]
"TARGET_SIMD"
"eor\t%0., %1., %2

Re: [patch 1/2][ARM]: New CPU support for Marvell Whitney

2014-12-19 Thread Xingxing Pan

On 19/12/2014 18:38, Kyrill Tkachov wrote:

Hi Xingxing,

It seems that your mail client mangled this patch, at least the
following hunk doesn't apply, even when I try to get it from the web
archives.
Could you please resend it as an attachment perhaps?

Thanks,
Kyrill

On 18/12/14 10:13, Xingxing Pan wrote:

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index 423ee9e..b0ffbe1 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -159,6 +159,7 @@ ARM_CORE("cortex-m7",   cortexm7,
cortexm7, 7EM,
FL_LDSCHED, cortex_m7)
 ARM_CORE("cortex-m4", cortexm4, cortexm4, 7EM,
FL_LDSCHED, v7m)
 ARM_CORE("cortex-m3", cortexm3, cortexm3, 7M,
FL_LDSCHED, v7m)
 ARM_CORE("marvell-pj4",   marvell_pj4, marvell_pj4,
7A,  FL_LDSCHED, 9e)
+ARM_CORE("marvell-whitney",marvell_whitney, marvell_whitney,
7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, marvell_whitney)





Hi Kyrill,

I've changed the code to use the "tune" attribute directly. The new patch is
attached.


Thanks,
Xingxing
commit 56745b611d40b77e1911075159f89959335d0298
Author: Xingxing Pan 
Date:   Thu Dec 18 16:58:05 2014 +0800

2014-12-18 Xingxing Pan 

* config/arm/arm-cores.def: Add new core marvell-whitney.
* config/arm/arm-protos.h:
(marvell_whitney_vector_element_size_is_byte): Declare.
(marvell_whitney_non_shift_with_shift_operand): Ditto.
* config/arm/arm-tables.opt: Regenerated.
* config/arm/arm-tune.md: Regenerated.
* config/arm/arm.c (arm_marvell_whitney_tune): New structure.
(arm_issue_rate): Add marvell_whitney.
(marvell_whitney_vector_element_size_is_byte): New function.
(marvell_whitney_non_shift_with_shift_operand): Ditto.
* config/arm/arm.md: Include marvell-whitney.md.
(generic_sched): Add marvell_whitney.
(generic_vfp): Ditto.
* config/arm/bpabi.h (BE8_LINK_SPEC): Add marvell-whitney.
* config/arm/t-arm (MD_INCLUDES): Add marvell-whitney.md.
* config/arm/marvell-whitney.md: New file.
* doc/invoke.texi: Document marvell-whitney.

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index 423ee9e..b0ffbe1 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -159,6 +159,7 @@ ARM_CORE("cortex-m7",		cortexm7, cortexm7,		7EM, FL_LDSCHED, cortex_m7)
 ARM_CORE("cortex-m4",		cortexm4, cortexm4,		7EM, FL_LDSCHED, v7m)
 ARM_CORE("cortex-m3",		cortexm3, cortexm3,		7M,  FL_LDSCHED, v7m)
 ARM_CORE("marvell-pj4",		marvell_pj4, marvell_pj4,	7A,  FL_LDSCHED, 9e)
+ARM_CORE("marvell-whitney",		marvell_whitney, marvell_whitney,	7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, marvell_whitney)
 
 /* V7 big.LITTLE implementations */
 ARM_CORE("cortex-a15.cortex-a7", cortexa15cortexa7, cortexa7,	7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a15)
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 20cfa9f..e86db1e 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -231,6 +231,9 @@ extern void arm_order_regs_for_local_alloc (void);
 
 extern int arm_max_conditional_execute ();
 
+extern bool marvell_whitney_vector_element_size_is_byte (rtx insn);
+extern bool marvell_whitney_non_shift_with_shift_operand (rtx insn);
+
 /* Vectorizer cost model implementation.  */
 struct cpu_vec_costs {
   const int scalar_stmt_cost;   /* Cost of any scalar operation, excluding
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index 9b1886e..3371ce3 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -298,6 +298,9 @@ EnumValue
 Enum(processor_type) String(marvell-pj4) Value(marvell_pj4)
 
 EnumValue
+Enum(processor_type) String(marvell-whitney) Value(marvell_whitney)
+
+EnumValue
 Enum(processor_type) String(cortex-a15.cortex-a7) Value(cortexa15cortexa7)
 
 EnumValue
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index d300c51..c73c33c 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -28,9 +28,10 @@
 	cortexm1smallmultiply,cortexm0smallmultiply,cortexm0plussmallmultiply,
 	genericv7a,cortexa5,cortexa7,
 	cortexa8,cortexa9,cortexa12,
-	cortexa15,cortexa17,cortexr4,cortexr4f,
-	cortexr5,cortexr7,cortexm7,
-	cortexm4,cortexm3,marvell_pj4,
-	cortexa15cortexa7,cortexa17cortexa7,cortexa53,
-	cortexa57,cortexa57cortexa53"
+	cortexa15,cortexa17,cortexr4,
+	cortexr4f,cortexr5,cortexr7,
+	cortexm7,cortexm4,cortexm3,
+	marvell_pj4,marvell_whitney,cortexa15cortexa7,
+	cortexa17cortexa7,cortexa53,cortexa57,
+	cortexa57cortexa53"
 	(const (symbol_ref "((enum attr_tune) arm_tune)")))
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 0

Re: [PATCH, 2/2][ARM]: New CPU support for Marvell Whitney

2014-12-19 Thread Xingxing Pan

On 19/12/2014 17:35, James Greenhalgh wrote:

On Fri, Dec 19, 2014 at 08:19:17AM +, Xingxing Pan wrote:

Hi,

This patch expands the ARM types neon_logic, neon_from_gp and
neon_to_gp. The change mainly benefits marvell-whitney cores and
will not affect the pipeline descriptions of other ARM cores.

neon_logic is expanded to neon_logic_reg and neon_logic_imm,
corresponding respectively to the predicates s_register_operand and
imm_for_neon_logic_operand.

neon_from/to_gp is expanded to neon_reg_from/to_gp and
neon_lane_from/to_gp, depending on whether the neon side is a single
register or a register lane.


Sorry to ask for churn here, but the naming scheme for lane operations
elsewhere in types.md seems to be:

neon_<_scalar><_q>

as in:

; neon_mul_s_scalar
; neon_mul_s_scalar_q

I think the types you are introducing should be:

   neon_from_gp_scalar
   neon_to_gp_scalar

Thanks,
James


Hi James,

Thanks for your comment. I've changed the type names.

Regards,
Xingxing


    2014-12-19  Xingxing Pan  

* config/arm/types.md:
(neon_logic): Expand to neon_logic_reg and neon_logic_imm.
(neon_logic_q): Expand to neon_logic_reg_q and neon_logic_imm_q.
(neon_from_gp): Expand to neon_from_gp and neon_from_gp_scalar.
(neon_from_gp_q): Expand to neon_from_gp_q and neon_from_gp_scalar_q.
(neon_to_gp): Expand to neon_to_gp and neon_to_gp_scalar.
(neon_to_gp_q): Expand to neon_to_gp_q and neon_to_gp_scalar_q.
* config/aarch64/aarch64-simd.md: Ditto.
* config/aarch64/aarch64.md: Ditto.
* config/aarch64/thunderx.md: Ditto.
* config/arm/arm.md: Ditto.
* config/arm/cortex-a15-neon.md: Ditto.
* config/arm/cortex-a17-neon.md: Ditto.
* config/arm/cortex-a8-neon.md: Ditto.
* config/arm/cortex-a9-neon.md: Ditto.
* config/arm/neon.md: Ditto.
* config/arm/whitney.md: Ditto.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index d4256a5..63a2b7e 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -115,7 +115,7 @@
  }
 }
   [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\
- neon_logic, neon_to_gp, neon_from_gp,\
+ neon_logic_reg, neon_to_gp_scalar, neon_from_gp_scalar,\
  mov_reg, neon_move")]
 )

@@ -147,7 +147,7 @@
 }
 }
   [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\
- neon_logic, multiple, multiple, multiple,\
+ neon_logic_reg, multiple, multiple, multiple,\
  neon_move")
(set_attr "length" "4,4,4,8,8,8,4")]
 )
@@ -227,7 +227,7 @@
   (match_operand:VQ 2 "vect_par_cnst_lo_half" "")))]
   "TARGET_SIMD && reload_completed"
   "umov\t%0, %1.d[0]"
-  [(set_attr "type" "neon_to_gp")
+  [(set_attr "type" "neon_to_gp_scalar")
(set_attr "length" "4")
   ])

@@ -238,7 +238,7 @@
   (match_operand:VQ 2 "vect_par_cnst_hi_half" "")))]
   "TARGET_SIMD && reload_completed"
   "umov\t%0, %1.d[1]"
-  [(set_attr "type" "neon_to_gp")
+  [(set_attr "type" "neon_to_gp_scalar")
(set_attr "length" "4")
   ])

@@ -248,7 +248,7 @@
(match_operand:VDQ_I 2 "register_operand" "w")))]
  "TARGET_SIMD"
  "orn\t%0., %2., %1."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )

 (define_insn "bic3"
@@ -257,7 +257,7 @@
(match_operand:VDQ_I 2 "register_operand" "w")))]
  "TARGET_SIMD"
  "bic\t%0., %2., %1."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )

 (define_insn "add3"
@@ -440,7 +440,7 @@
 (match_operand:VDQ_I 2 "register_operand" "w")))]
   "TARGET_SIMD"
   "and\t%0., %1., %2."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )

 (define_insn "ior3"
@@ -449,7 +449,7 @@
 (match_operand:VDQ_I 2 "register_operand" "w")))]
   "TARGET_SIMD"
   "orr\t%0., %1., %2."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
 )

 (define_insn "xor3"
@@ -458,7 +458,7 @@
 (match_operand:VDQ_I 2 "register_operand" "w")))]
   "TARGET_SIMD"
   "eor\t%0., %1., %2."
-  [(set_attr "type" "neon_logic")

Re: [PATCH][ARM] Fix reservation pattern in cortex-a9-neon.md

2014-12-19 Thread Xingxing Pan

Brilliant!

Xingxing

On 19/12/2014 17:44, James Greenhalgh wrote:

On Fri, Dec 19, 2014 at 02:46:51AM +, Xingxing Pan wrote:

Hi,

This patch fixes the reservation pattern of cortex_a9_neon_vmov in
cortex-a9-neon.md.

Is it OK for trunk?


This patch is obvious, and fixes my typo.

I couldn't see your name or email address in the MAINTAINERS file, so
I've committed this under the "obvious" rule on your behalf as
revision 218895.

Thanks,
James


  2014-12-19  Xingxing Pan 


Note that there should be two spaces between your name and email address,
as so:

2014-12-19  Xingxing Pan  






[PATCH, 2/2][ARM]: New CPU support for Marvell Whitney

2014-12-19 Thread Xingxing Pan

Hi,

This patch expands the ARM types neon_logic, neon_from_gp and
neon_to_gp. The change mainly benefits marvell-whitney cores and
will not affect the pipeline descriptions of other ARM cores.

neon_logic is expanded to neon_logic_reg and neon_logic_imm,
corresponding respectively to the predicates s_register_operand and
imm_for_neon_logic_operand.

neon_from/to_gp is expanded to neon_reg_from/to_gp and
neon_lane_from/to_gp, depending on whether the neon side is a single
register or a register lane.
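
A minimal sketch of the split described above, as standalone C rather than machine-description syntax. The type names come from this patch; the mock_* enum, functions and boolean parameters are hypothetical and only model which new type an instruction would be given.

```c
#include <assert.h>
#include <stdbool.h>

/* The new type names proposed in this patch, modelled as an enum.  */
enum mock_neon_type
{
  MOCK_NEON_LOGIC_REG, MOCK_NEON_LOGIC_IMM,
  MOCK_NEON_REG_FROM_GP, MOCK_NEON_LANE_FROM_GP,
  MOCK_NEON_REG_TO_GP, MOCK_NEON_LANE_TO_GP
};

/* neon_logic: the register form corresponds to s_register_operand,
   the immediate form to imm_for_neon_logic_operand.  */
enum mock_neon_type
mock_logic_type (bool operand_is_immediate)
{
  return operand_is_immediate ? MOCK_NEON_LOGIC_IMM : MOCK_NEON_LOGIC_REG;
}

/* neon_from_gp / neon_to_gp: chosen by whether the neon side is a
   single register or a register lane.  */
enum mock_neon_type
mock_transfer_type (bool to_gp, bool neon_side_is_lane)
{
  if (to_gp)
    return neon_side_is_lane ? MOCK_NEON_LANE_TO_GP : MOCK_NEON_REG_TO_GP;
  return neon_side_is_lane ? MOCK_NEON_LANE_FROM_GP : MOCK_NEON_REG_FROM_GP;
}

/* Spelling of each type as it would appear in a "type" attribute.  */
const char *
mock_type_name (enum mock_neon_type type)
{
  switch (type)
    {
    case MOCK_NEON_LOGIC_REG:    return "neon_logic_reg";
    case MOCK_NEON_LOGIC_IMM:    return "neon_logic_imm";
    case MOCK_NEON_REG_FROM_GP:  return "neon_reg_from_gp";
    case MOCK_NEON_LANE_FROM_GP: return "neon_lane_from_gp";
    case MOCK_NEON_REG_TO_GP:    return "neon_reg_to_gp";
    case MOCK_NEON_LANE_TO_GP:   return "neon_lane_to_gp";
    }
  assert (!"unknown mock type");
  return "";
}
```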

Tested on linux-gnueabi; no new regressions were found. OK for trunk?

Regards,
Xingxing


   2014-12-19  Xingxing Pan  

   * config/arm/types.md:
   (neon_logic): Expand to neon_logic_reg and neon_logic_imm.
   (neon_logic_q): Expand to neon_logic_reg_q and neon_logic_imm_q.
   (neon_from_gp): Expand to neon_reg_from_gp and neon_lane_from_gp.
   (neon_from_gp_q): Expand to neon_reg_from_gp_q and neon_lane_from_gp_q.
   (neon_to_gp): Expand to neon_reg_to_gp and neon_lane_to_gp.
   (neon_to_gp_q): Expand to neon_reg_to_gp_q and neon_lane_to_gp_q.
   * config/aarch64/aarch64-simd.md: Ditto.
   * config/aarch64/aarch64.md: Ditto.
   * config/aarch64/thunderx.md: Ditto.
   * config/arm/arm.md: Ditto.
   * config/arm/cortex-a15-neon.md: Ditto.
   * config/arm/cortex-a17-neon.md: Ditto.
   * config/arm/cortex-a8-neon.md: Ditto.
   * config/arm/cortex-a9-neon.md: Ditto.
   * config/arm/neon.md: Ditto.
   * config/arm/whitney.md: Ditto.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index d4256a5..ea92940 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -49,7 +49,7 @@
  "@
   dup\\t%0., %1
   dup\\t%0., %1.[0]"
-  [(set_attr "type" "neon_from_gp, neon_dup")]
+  [(set_attr "type" "neon_reg_from_gp, neon_dup")]
)

(define_insn "aarch64_simd_dup"
@@ -115,7 +115,7 @@
 }
}
  [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\
- neon_logic, neon_to_gp, neon_from_gp,\
+ neon_logic_reg, neon_lane_to_gp, neon_lane_from_gp,\
 mov_reg, neon_move")]
)

@@ -147,7 +147,7 @@
}
}
  [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\
- neon_logic, multiple, multiple, multiple,\
+ neon_logic_reg, multiple, multiple, multiple,\
 neon_move")
   (set_attr "length" "4,4,4,8,8,8,4")]
)
@@ -227,7 +227,7 @@
  (match_operand:VQ 2 "vect_par_cnst_lo_half" "")))]
  "TARGET_SIMD && reload_completed"
  "umov\t%0, %1.d[0]"
-  [(set_attr "type" "neon_to_gp")
+  [(set_attr "type" "neon_lane_to_gp")
   (set_attr "length" "4")
  ])

@@ -238,7 +238,7 @@
  (match_operand:VQ 2 "vect_par_cnst_hi_half" "")))]
  "TARGET_SIMD && reload_completed"
  "umov\t%0, %1.d[1]"
-  [(set_attr "type" "neon_to_gp")
+  [(set_attr "type" "neon_lane_to_gp")
   (set_attr "length" "4")
  ])

@@ -248,7 +248,7 @@
  (match_operand:VDQ_I 2 "register_operand" "w")))]
 "TARGET_SIMD"
 "orn\t%0., %2., %1."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
)

(define_insn "bic3"
@@ -257,7 +257,7 @@
  (match_operand:VDQ_I 2 "register_operand" "w")))]
 "TARGET_SIMD"
 "bic\t%0., %2., %1."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
)

(define_insn "add3"
@@ -440,7 +440,7 @@
   (match_operand:VDQ_I 2 "register_operand" "w")))]
  "TARGET_SIMD"
  "and\t%0., %1., %2."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
)

(define_insn "ior3"
@@ -449,7 +449,7 @@
   (match_operand:VDQ_I 2 "register_operand" "w")))]
  "TARGET_SIMD"
  "orr\t%0., %1., %2."
-  [(set_attr "type" "neon_logic")]
+  [(set_attr "type" "neon_logic_reg")]
)

(define_insn "xor3"
@@ -458,7 +458,7 @@
   (match_operand:VDQ_I 2 "register_operand" "w")))]
  "TARGET_SIMD"
  "eor\t%0., %1., %2."
-  [(set_a

[PATCH][ARM] Fix reservation pattern in cortex-a9-neon.md

2014-12-18 Thread Xingxing Pan

Hi,

This patch fixes the reservation pattern of cortex_a9_neon_vmov in
cortex-a9-neon.md.

Is it OK for trunk?

Regards,
Xingxing


2014-12-19  Xingxing Pan 

* config/arm/cortex-a9-neon.md (cortex_a9_neon_vmov): Change reservation
from cortex_a8_neon_dp to cortex_a9_neon_dp.

diff --git a/gcc/config/arm/cortex-a9-neon.md b/gcc/config/arm/cortex-a9-neon.md
index 3ff93f9..5c02b32 100644
--- a/gcc/config/arm/cortex-a9-neon.md
+++ b/gcc/config/arm/cortex-a9-neon.md
@@ -376,7 +376,7 @@
 (define_insn_reservation "cortex_a9_neon_vmov" 3
   (and (eq_attr "tune" "cortexa9")
(eq_attr "cortex_a9_neon_type" "neon_vmov"))
-  "cortex_a8_neon_dp")
+  "cortex_a9_neon_dp")

 ;; Instructions using this reservation read their (D|Q)n operands at N2,
 ;; their (D|Q)m operands at N1, their (D|Q)d operands at N3, and




[patch 1/2][ARM]: New CPU support for Marvell Whitney

2014-12-18 Thread Xingxing Pan

Hi,

This patch contains the pipeline description for the Marvell Whitney core.
Tested on arm-linux-gnueabi; no new regressions were found.

Is it OK for trunk?

Regards,
Xingxing


 2014-12-18 Xingxing Pan 

 * config/arm/arm-cores.def: Add new core marvell-whitney.
 * config/arm/arm-protos.h:
 (marvell_whitney_vector_element_size_is_byte): Declare.
 (marvell_whitney_non_shift_with_shift_operand): Ditto.
 * config/arm/arm-tables.opt: Regenerated.
 * config/arm/arm-tune.md: Regenerated.
 * config/arm/arm.c (arm_marvell_whitney_tune): New structure.
 (arm_issue_rate): Add marvell_whitney.
 (marvell_whitney_vector_element_size_is_byte): New function.
 (marvell_whitney_non_shift_with_shift_operand): Ditto.
 * config/arm/arm.md: Include marvell-whitney.md.
 (generic_sched): Add marvell_whitney.
 (generic_vfp): Ditto.
 * config/arm/bpabi.h (BE8_LINK_SPEC): Add marvell-whitney.
 * config/arm/t-arm (MD_INCLUDES): Add marvell-whitney.md.
 * config/arm/marvell-whitney.md: New file.
 * doc/invoke.texi: Document marvell-whitney.

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index 423ee9e..b0ffbe1 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -159,6 +159,7 @@ ARM_CORE("cortex-m7",   cortexm7, 
cortexm7, 7EM,

FL_LDSCHED, cortex_m7)
  ARM_CORE("cortex-m4", cortexm4, cortexm4, 7EM, 
FL_LDSCHED, v7m)
  ARM_CORE("cortex-m3", cortexm3, cortexm3, 7M, 
FL_LDSCHED, v7m)
  ARM_CORE("marvell-pj4",   marvell_pj4, marvell_pj4, 
 7A,  FL_LDSCHED, 9e)

+ARM_CORE("marvell-whitney",marvell_whitney, marvell_whitney,
7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, marvell_whitney)

  /* V7 big.LITTLE implementations */
  ARM_CORE("cortex-a15.cortex-a7", cortexa15cortexa7, cortexa7, 7A,
FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a15)
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 20cfa9f..e86db1e 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -231,6 +231,9 @@ extern void arm_order_regs_for_local_alloc (void);

  extern int arm_max_conditional_execute ();

+extern bool marvell_whitney_vector_element_size_is_byte (rtx insn);
+extern bool marvell_whitney_non_shift_with_shift_operand (rtx insn);
+
  /* Vectorizer cost model implementation.  */
  struct cpu_vec_costs {
const int scalar_stmt_cost;   /* Cost of any scalar operation, 
excluding

diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index 9b1886e..3371ce3 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -298,6 +298,9 @@ EnumValue
  Enum(processor_type) String(marvell-pj4) Value(marvell_pj4)

  EnumValue
+Enum(processor_type) String(marvell-whitney) Value(marvell_whitney)
+
+EnumValue
  Enum(processor_type) String(cortex-a15.cortex-a7) 
Value(cortexa15cortexa7)


  EnumValue
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index d300c51..c73c33c 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -28,9 +28,10 @@

cortexm1smallmultiply,cortexm0smallmultiply,cortexm0plussmallmultiply,
genericv7a,cortexa5,cortexa7,
cortexa8,cortexa9,cortexa12,
-   cortexa15,cortexa17,cortexr4,cortexr4f,
-   cortexr5,cortexr7,cortexm7,
-   cortexm4,cortexm3,marvell_pj4,
-   cortexa15cortexa7,cortexa17cortexa7,cortexa53,
-   cortexa57,cortexa57cortexa53"
+   cortexa15,cortexa17,cortexr4,
+   cortexr4f,cortexr5,cortexr7,
+   cortexm7,cortexm4,cortexm3,
+   marvell_pj4,marvell_whitney,cortexa15cortexa7,
+   cortexa17cortexa7,cortexa53,cortexa57,
+   cortexa57cortexa53"
(const (symbol_ref "((enum attr_tune) arm_tune)")))
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 0ec526b..183da4c 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1914,6 +1914,25 @@ const struct tune_params arm_cortex_a9_tune =
8   /* Maximum insns to 
inline memset.  */

  };

+const struct tune_params arm_marvell_whitney_tune =
+{
+  arm_9e_rtx_costs,
+  &cortexa9_extra_costs,
+  cortex_a9_sched_adjust_cost,
+  1,   /* Constant limit.  */
+  5,   /* Max cond insns.  */
+  ARM_PREFETCH_BENEFICIAL(4,32,32),
+  false,   /* Prefer constant pool.  */
+  arm_default_branch_cost,
+  false,   /* Prefer LDRD/STRD.  */
+  {true, true},/* Prefer non 
short circuit.  */

+  &arm_default_vec_cost,/* Vectorizer costs.  */
+  false,