Re: [patch 1/2][ARM]: New CPU support for Marvell Whitney

2015-02-26 Thread Kyrill Tkachov

Hi Xingxing,
Some comments inline on the tuning struct and pipeline description.

On 25/02/15 13:32, Xingxing Pan wrote:
 Hi,

 This patch merges pipeline description for marvell-whitney to latest 
code base.

 Is it OK for trunk?

 -- Regards, Xingxing
 whitney_p1_20150225.patch
A couple of questions:

+const struct tune_params arm_marvell_whitney_tune =
+{
+  arm_9e_rtx_costs,
+  cortexa9_extra_costs,
+  cortex_a9_sched_adjust_cost,

Are you sure the cortexa9_extra_costs are best for this? Did you try 
other cost

tables? Do you think it would be good to write your own?
The pipeline description you posted seems to be very intricate.
If such complexity is needed to get the most out of the core, then perhaps
it could benefit from a custom cost table?

Also, are you sure cortex_a9_sched_adjust_cost is appropriate?
It's a very Cortex-A9 specific hook and we don't use it for any other cores.
Did it give meaningful improvements in performance?

I don't know much about this core (any public documentation you can point
us at, perhaps?) so I'll trust you that these are optimal...


+  1,/* Constant limit.  */
+  5,/* Max cond insns.  */
+  ARM_PREFETCH_BENEFICIAL(4,32,32),
+  false,/* Prefer constant pool.  */
+  arm_default_branch_cost,
+  false,/* Prefer LDRD/STRD.  */
+  {true, true},/* Prefer non short circuit.  */
+  arm_default_vec_cost,/* Vectorizer costs.  */
+  false,/* Prefer Neon for 
64-bits bitops.  */
+  false, false, /* Prefer 32-bit 
encodings.  */

+  false,/* Prefer Neon for stringops.  */
+  8/* Maximum insns to inline memset.  */
+};
+


As Maxim said, you need to take into account the two new fields that were
recently added, the fusion and prefetcher modelling ones.


 diff --git a/gcc/config/arm/marvell-whitney.md 
b/gcc/config/arm/marvell-whitney.md

 new file mode 100644
 index 000..fffbd29
 --- /dev/null
 +++ b/gcc/config/arm/marvell-whitney.md
 @@ -0,0 +1,678 @@
 +;; Marvell ARM Processor Pipeline Description
 +;; Copyright (C) 2011-2014 Free Software Foundation, Inc.

Update the year to 2015 .


 +;;
 +;; Contributed by Marvell.
 +
 +;; This file is part of GCC.
 +;;
 +;; GCC is free software; you can redistribute it and/or modify it
 +;; under the terms of the GNU General Public License as published
 +;; by the Free Software Foundation; either version 3, or (at your
 +;; option) any later version.
 +;;
 +;; GCC is distributed in the hope that it will be useful, but WITHOUT
 +;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
 +;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
 +;; License for more details.
 +;;
 +;; You should have received a copy of the GNU General Public License
 +;; along with GCC; see the file COPYING3.  If not see
 +;; http://www.gnu.org/licenses/.
 +
 +(define_attr whitney_type
 + wTP1,wTP2,wTP3,wTP4,wTP5,wTP6,wTP7,wTP8,wTP9,wTP10,wTP11,wTP12,wTP13,\
 + 
wTP14,wTP15,wTP16,wTP17,wTP18,wTP19,wTP20,wTP21,wTP22,wTP23,wTP24,wTP25,\
 + 
wTP26,wTP27,wTP28,wTP29,wTP30,wTP31,wTP32,wTP33,wTP34,wTP35,wTP36,wTP37,\
 + 
wTP38,wTP39,wTP40,wTP41,wTP42,wTP43,wTP44,wTP45,wTP46,wTP47,wTP48,wTP49,\

 + wTP50,wTP51,wTP52,wTP53,wTP54,wTP55,wTP56,wTP57,wTP58,wTP59,\

These types are very hard to remember and understand, and hence maintain.
I'm assuming the wTP* notation stands for Whitney TyPe?
Look at the other pipeline descriptions in config/arm/ and try to give 
them more

descriptive names, like: whitney_alu_simple, whitney_alu_shift_reg etc.
It will make it much easier to understand.

I've tried this patch out and added:
(automata_option v)
(automata_option time)
(automata_option stats)
(automata_option progress)

to the .md file to get some compiler build-time stats on the automata.
I see that the whitney_pipe automaton is the largest automaton in terms
of NDFA states, and second largest in minimal DFA states:

44170 NDFA states,  188695 NDFA arcs
44170 DFA states,   188695 DFA arcs
 7070 minimal DFA states,   66279 minimal DFA arcs
  923 all insns 30 insn equivalence classes

Could the complexity of it be reduced without sacrificing code quality?
In some cores we break up the automaton into two automata, roughly 
separated into
an integer/memory unit and another one for FP/NEON units to reduce the 
state space.
Look at cortex-a15.md and cortex-a15-neon.md or some other description 
of that kind.



 +  (cond [
 +  (eq_attr type alu_imm,alu_sreg,alus_imm,alus_sreg,adc_reg,\
 + adc_imm,bfm,branch,call,clz,extend,logic_imm,\
 + logic_reg,logics_imm,logics_reg,logics_shift_reg,\
 + mov_imm,mov_reg,mvn_imm,mvn_reg,shift_imm,\
 +   shift_reg,rev)
 +  (const_string wTP1)
 +  (eq_attr type 

Re: [patch 1/2][ARM]: New CPU support for Marvell Whitney

2015-02-26 Thread Maxim Kuvyrkov
 On Feb 26, 2015, at 10:36 AM, Xingxing Pan xxing...@marvell.com wrote:
 
 On 02/25/2015 09:32 PM, Xingxing Pan wrote:
 Hi,
 
 This patch merges pipeline description for marvell-whitney to latest
 code base.
 Is it OK for trunk?
 
 Refactor the commit message.
 

...

 Add pipeline description for marvell-whitney.
 
 2015-02-26  Xingxing Pan  xxing...@marvell.com
 
 * config/arm/arm-cores.def: Add new core marvell-whitney.
 * config/arm/arm-protos.h:
 (marvell_whitney_vector_mode_qi): Declare.
 (marvell_whitney_inner_shift): Ditto.
 * config/arm/arm-tables.opt: Regenerated.
 * config/arm/arm-tune.md: Regenerated.
 * config/arm/arm.c (arm_marvell_whitney_tune): New structure.
 (arm_issue_rate): Add marvell_whitney.
 (marvell_whitney_vector_mode_qi): New function.
 (marvell_whitney_inner_shift): Ditto.
 * config/arm/arm.md: Include marvell-whitney.md.
 (generic_sched): Add marvell_whitney.
 (generic_vfp): Ditto.
 * config/arm/bpabi.h (BE8_LINK_SPEC): Add marvell-whitney.
 * config/arm/t-arm (MD_INCLUDES): Add marvell-whitney.md.
 * config/arm/marvell-whitney.md: New file.
 * doc/invoke.texi: Document marvell-whitney.
 

...

 diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
 index 7bf5b4d..e68287f 100644
 --- a/gcc/config/arm/arm.c
 +++ b/gcc/config/arm/arm.c
 @@ -2000,6 +2000,25 @@ const struct tune_params arm_cortex_a9_tune =
ARM_SCHED_AUTOPREF_OFF /* Sched L2 autopref.  */
  };
  
 +const struct tune_params arm_marvell_whitney_tune =
 +{
 +  arm_9e_rtx_costs,
 +  cortexa9_extra_costs,
 +  cortex_a9_sched_adjust_cost,
 +  1, /* Constant limit.  */
 +  5, /* Max cond insns.  */
 +  ARM_PREFETCH_BENEFICIAL(4,32,32),
 +  false, /* Prefer constant pool.  */
 +  arm_default_branch_cost,
 +  false, /* Prefer LDRD/STRD.  */
 +  {true, true},  /* Prefer non short 
 circuit.  */
 +  arm_default_vec_cost,/* Vectorizer costs.  */
 +  false,/* Prefer Neon for 64-bits 
 bitops.  */
 +  false, false, /* Prefer 32-bit encodings.  
 */
 +  false, /* Prefer Neon for stringops.  
 */
 +  8  /* Maximum insns to inline 
 memset.  */
 +};
 +

You need to bootstrap GCC with your patch applied.  As it is now, the above 
tune table is missing at least two entries (one for insn fusing, and one for 
sched L2 autoprefetcher).  It should cause a bootstrap fail.

  const struct tune_params arm_cortex_a12_tune =
  {
arm_9e_rtx_costs,
 @@ -11717,6 +11736,51 @@ fa726te_sched_adjust_cost (rtx_insn *insn, rtx link, 
 rtx_insn *dep, int * cost)
return true;
  }
  
 +/* Return true if vector element size is byte.  */
 +bool
 +marvell_whitney_vector_mode_qi (rtx_insn *insn)
 +{
 +  machine_mode mode;
 +
 +  if (GET_CODE (PATTERN (insn)) == SET)
 +{
 +  mode = GET_MODE (SET_DEST (PATTERN (insn)));
 +  if (VECTOR_MODE_P (mode)  GET_MODE_INNER (mode) == QImode)
 + return true;
 +}
 +
 +  return false;
 +}

It is preferable to avoid using predicates in DFA definitions.  Predicates 
prevent optimizations of the DFA, which causes them to be bigger and slower.  
This predicate can be easily replaced by an attribute  (set_attr 
dest_vect_mode mode) on affected instructions (the proper name for the 
attribute needs some thinking though).

That said, I don't a have strong opinion on this case.  It may be a worthwhile 
tradeoff to use the predicate to avoid adding new attribute to a dozen of 
instructions.  If you / other reviewers prefer to keep the predicate -- please 
add a comment noting that it is used by the DFA. 

 +
 +/* Return true if a non-shift insn has a shift operand.  */
 +bool
 +marvell_whitney_inner_shift (rtx_insn *insn)
 +{
 +  rtx pat = PATTERN (insn);
 +
 +  if (GET_CODE (pat) != SET)
 +return false;
 +
 +  /* Is not a shift insn.  */
 +  rtx rvalue = SET_SRC (pat);
 +  RTX_CODE code = GET_CODE (rvalue);
 +  if (code == ASHIFT || code == ASHIFTRT
 +  || code == LSHIFTRT || code == ROTATERT)
 +return false;
 +
 +  subrtx_iterator::array_type array;
 +  FOR_EACH_SUBRTX (iter, array, rvalue, ALL)
 +{
 +  /* Has shift operation.  */
 +  RTX_CODE code = GET_CODE (*iter);
 +  if (code == ASHIFT || code == ASHIFTRT
 +   || code == LSHIFTRT || code == ROTATERT)
 + return true;
 +}
 +
 +  return false;
 +}

Same comment as above.

Otherwise looks good.

Thanks!

--
Maxim Kuvyrkov
www.linaro.org






Re: [patch 1/2][ARM]: New CPU support for Marvell Whitney

2015-02-26 Thread Ramana Radhakrishnan
On Thu, Dec 18, 2014 at 10:13 AM, Xingxing Pan xxing...@marvell.com wrote:
 Hi,

 This patch contains Marvell Whitney core's pipeline description.
 Test on arm-linux-gnueabi and no new regression are found.

 Is it OK for trunk?

I haven't had a chance to review this in detail but I think it's too
late to add such an invasive patch to the backend given we are in
stage4 today.


Ramana


 Regards,
 Xingxing


  2014-12-18 Xingxing Pan xxing...@marvell.com

  * config/arm/arm-cores.def: Add new core marvell-whitney.
  * config/arm/arm-protos.h:
  (marvell_whitney_vector_element_size_is_byte): Declare.
  (marvell_whitney_non_shift_with_shift_operand): Ditto.
  * config/arm/arm-tables.opt: Regenerated.
  * config/arm/arm-tune.md: Regenerated.
  * config/arm/arm.c (arm_marvell_whitney_tune): New structure.
  (arm_issue_rate): Add marvell_whitney.
  (marvell_whitney_vector_element_size_is_byte): New function.
  (marvell_whitney_non_shift_with_shift_operand): Ditto.
  * config/arm/arm.md: Include marvell-whitney.md.
  (generic_sched): Add marvell_whitney.
  (generic_vfp): Ditto.
  * config/arm/bpabi.h (BE8_LINK_SPEC): Add marvell-whitney.
  * config/arm/t-arm (MD_INCLUDES): Add marvell-whitney.md.
  * config/arm/marvell-whitney.md: New file.
  * doc/invoke.texi: Document marvell-whitney.

 diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
 index 423ee9e..b0ffbe1 100644
 --- a/gcc/config/arm/arm-cores.def
 +++ b/gcc/config/arm/arm-cores.def
 @@ -159,6 +159,7 @@ ARM_CORE(cortex-m7,   cortexm7, cortexm7,
 7EM,
 FL_LDSCHED, cortex_m7)
   ARM_CORE(cortex-m4, cortexm4, cortexm4, 7EM,
 FL_LDSCHED, v7m)
   ARM_CORE(cortex-m3, cortexm3, cortexm3, 7M,
 FL_LDSCHED, v7m)
   ARM_CORE(marvell-pj4,   marvell_pj4, marvell_pj4,  7A,
 FL_LDSCHED, 9e)
 +ARM_CORE(marvell-whitney,marvell_whitney, marvell_whitney,
 7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, marvell_whitney)

   /* V7 big.LITTLE implementations */
   ARM_CORE(cortex-a15.cortex-a7, cortexa15cortexa7, cortexa7, 7A,
 FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a15)
 diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
 index 20cfa9f..e86db1e 100644
 --- a/gcc/config/arm/arm-protos.h
 +++ b/gcc/config/arm/arm-protos.h
 @@ -231,6 +231,9 @@ extern void arm_order_regs_for_local_alloc (void);

   extern int arm_max_conditional_execute ();

 +extern bool marvell_whitney_vector_element_size_is_byte (rtx insn);
 +extern bool marvell_whitney_non_shift_with_shift_operand (rtx insn);
 +
   /* Vectorizer cost model implementation.  */
   struct cpu_vec_costs {
 const int scalar_stmt_cost;   /* Cost of any scalar operation, excluding
 diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
 index 9b1886e..3371ce3 100644
 --- a/gcc/config/arm/arm-tables.opt
 +++ b/gcc/config/arm/arm-tables.opt
 @@ -298,6 +298,9 @@ EnumValue
   Enum(processor_type) String(marvell-pj4) Value(marvell_pj4)

   EnumValue
 +Enum(processor_type) String(marvell-whitney) Value(marvell_whitney)
 +
 +EnumValue
   Enum(processor_type) String(cortex-a15.cortex-a7) Value(cortexa15cortexa7)

   EnumValue
 diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
 index d300c51..c73c33c 100644
 --- a/gcc/config/arm/arm-tune.md
 +++ b/gcc/config/arm/arm-tune.md
 @@ -28,9 +28,10 @@

 cortexm1smallmultiply,cortexm0smallmultiply,cortexm0plussmallmultiply,
 genericv7a,cortexa5,cortexa7,
 cortexa8,cortexa9,cortexa12,
 -   cortexa15,cortexa17,cortexr4,cortexr4f,
 -   cortexr5,cortexr7,cortexm7,
 -   cortexm4,cortexm3,marvell_pj4,
 -   cortexa15cortexa7,cortexa17cortexa7,cortexa53,
 -   cortexa57,cortexa57cortexa53
 +   cortexa15,cortexa17,cortexr4,
 +   cortexr4f,cortexr5,cortexr7,
 +   cortexm7,cortexm4,cortexm3,
 +   marvell_pj4,marvell_whitney,cortexa15cortexa7,
 +   cortexa17cortexa7,cortexa53,cortexa57,
 +   cortexa57cortexa53
 (const (symbol_ref ((enum attr_tune) arm_tune
 diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
 index 0ec526b..183da4c 100644
 --- a/gcc/config/arm/arm.c
 +++ b/gcc/config/arm/arm.c
 @@ -1914,6 +1914,25 @@ const struct tune_params arm_cortex_a9_tune =
 8   /* Maximum insns to inline
 memset.  */
   };

 +const struct tune_params arm_marvell_whitney_tune =
 +{
 +  arm_9e_rtx_costs,
 +  cortexa9_extra_costs,
 +  cortex_a9_sched_adjust_cost,
 +  1,   /* Constant limit.  */
 +  5,   /* Max cond insns.  */
 +  ARM_PREFETCH_BENEFICIAL(4,32,32),
 +  false,   /* Prefer constant pool.  */
 +  arm_default_branch_cost,
 +  false,   /* Prefer 

Re: [patch 1/2][ARM]: New CPU support for Marvell Whitney

2015-02-26 Thread Maxim Kuvyrkov
 On Feb 26, 2015, at 1:12 PM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:
 
 

...
 I've tried this patch out and added:
 (automata_option v)
 (automata_option time)
 (automata_option stats)
 (automata_option progress)
 
 to the .md file to get some compiler build-time stats on the automata.
 I see that the whitney_pipe automaton is the largest automaton in terms
 of NDFA states, and second largest in minimal DFA states:
 
44170 NDFA states,  188695 NDFA arcs
44170 DFA states,   188695 DFA arcs
 7070 minimal DFA states,   66279 minimal DFA arcs
  923 all insns 30 insn equivalence classes
 
 Could the complexity of it be reduced without sacrificing code quality?
 In some cores we break up the automaton into two automata, roughly separated 
 into
 an integer/memory unit and another one for FP/NEON units to reduce the state 
 space.
 Look at cortex-a15.md and cortex-a15-neon.md or some other description of 
 that kind.

I think converting the C predicates to attributes would reduce the number of 
states since DFA generator will be able to fold them.

--
Maxim Kuvyrkov
www.linaro.org





Re: [patch 1/2][ARM]: New CPU support for Marvell Whitney

2015-02-25 Thread Xingxing Pan

Hi,

This patch merges pipeline description for marvell-whitney to latest code base.
Is it OK for trunk?

--
Regards,
Xingxing
commit 83974dde8d9f773df1004aa1d5e3b05d8a33f5e0
Author: Xingxing Pan xxing...@marvell.com
Date:   Wed Feb 25 10:24:40 2015 +0800

2015-02-25 Xingxing Pan  xxing...@marvell.com

* config/arm/arm-cores.def: Add new core marvell-whitney.
* config/arm/arm-protos.h:
(marvell_whitney_vector_mode_qi): Declare.
(marvell_whitney_inner_shift): Ditto.
* config/arm/arm-tables.opt: Regenerated.
* config/arm/arm-tune.md: Regenerated.
* config/arm/arm.c (arm_marvell_whitney_tune): New structure.
(arm_issue_rate): Add marvell_whitney.
(marvell_whitney_vector_mode_qi): New function.
(marvell_whitney_inner_shift): Ditto.
* config/arm/arm.md: Include marvell-whitney.md.
(generic_sched): Add marvell_whitney.
(generic_vfp): Ditto.
* config/arm/bpabi.h (BE8_LINK_SPEC): Add marvell-whitney.
* config/arm/t-arm (MD_INCLUDES): Add marvell-whitney.md.
* config/arm/marvell-whitney.md: New file.
* doc/invoke.texi: Document marvell-whitney.

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index d7e730d..fc76eb5 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -159,6 +159,7 @@ ARM_CORE(cortex-m7,		cortexm7, cortexm7,		7EM, FL_LDSCHED, cortex_m7)
 ARM_CORE(cortex-m4,		cortexm4, cortexm4,		7EM, FL_LDSCHED, v7m)
 ARM_CORE(cortex-m3,		cortexm3, cortexm3,		7M,  FL_LDSCHED, v7m)
 ARM_CORE(marvell-pj4,		marvell_pj4, marvell_pj4,	7A,  FL_LDSCHED, 9e)
+ARM_CORE(marvell-whitney,	marvell_whitney, marvell_whitney, 7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, marvell_whitney)
 
 /* V7 big.LITTLE implementations */
 ARM_CORE(cortex-a15.cortex-a7, cortexa15cortexa7, cortexa7,	7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a15)
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 307babb..d047dbc 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -231,6 +231,9 @@ extern void arm_order_regs_for_local_alloc (void);
 
 extern int arm_max_conditional_execute ();
 
+extern bool marvell_whitney_vector_mode_qi (rtx_insn *insn);
+extern bool marvell_whitney_inner_shift (rtx_insn *insn);
+
 /* Vectorizer cost model implementation.  */
 struct cpu_vec_costs {
   const int scalar_stmt_cost;   /* Cost of any scalar operation, excluding
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index 3450e5b..f0f9f3f 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -298,6 +298,9 @@ EnumValue
 Enum(processor_type) String(marvell-pj4) Value(marvell_pj4)
 
 EnumValue
+Enum(processor_type) String(marvell-whitney) Value(marvell_whitney)
+
+EnumValue
 Enum(processor_type) String(cortex-a15.cortex-a7) Value(cortexa15cortexa7)
 
 EnumValue
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index d459f27..fbfab2e 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -31,7 +31,8 @@
 	cortexa15,cortexa17,cortexr4,
 	cortexr4f,cortexr5,cortexr7,
 	cortexm7,cortexm4,cortexm3,
-	marvell_pj4,cortexa15cortexa7,cortexa17cortexa7,
-	cortexa53,cortexa57,cortexa72,
-	xgene1,cortexa57cortexa53,cortexa72cortexa53
+	marvell_pj4,marvell_whitney,cortexa15cortexa7,
+	cortexa17cortexa7,cortexa53,cortexa57,
+	cortexa72,xgene1,cortexa57cortexa53,
+	cortexa72cortexa53
 	(const (symbol_ref ((enum attr_tune) arm_tune
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 7bf5b4d..e68287f 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -2000,6 +2000,25 @@ const struct tune_params arm_cortex_a9_tune =
   ARM_SCHED_AUTOPREF_OFF			/* Sched L2 autopref.  */
 };
 
+const struct tune_params arm_marvell_whitney_tune =
+{
+  arm_9e_rtx_costs,
+  cortexa9_extra_costs,
+  cortex_a9_sched_adjust_cost,
+  1,		/* Constant limit.  */
+  5,		/* Max cond insns.  */
+  ARM_PREFETCH_BENEFICIAL(4,32,32),
+  false,	/* Prefer constant pool.  */
+  arm_default_branch_cost,
+  false,	/* Prefer LDRD/STRD.  */
+  {true, true},	/* Prefer non short circuit.  */
+  arm_default_vec_cost,/* Vectorizer costs.  */
+  false,/* Prefer Neon for 64-bits bitops.  */
+  false, false, /* Prefer 32-bit encodings.  */
+  false,	/* Prefer Neon for stringops.  */
+  8		/* Maximum insns to inline memset.  */
+};
+
 const struct tune_params arm_cortex_a12_tune =
 {
   arm_9e_rtx_costs,
@@ -11717,6 +11736,51 @@ fa726te_sched_adjust_cost (rtx_insn *insn, rtx link, rtx_insn *dep, int * cost)
   return true;
 }
 
+/* Return true if vector element size is byte.  */
+bool
+marvell_whitney_vector_mode_qi (rtx_insn *insn)
+{
+  machine_mode mode;
+
+  if (GET_CODE (PATTERN (insn)) == SET)
+{
+  mode = GET_MODE (SET_DEST 

Re: [patch 1/2][ARM]: New CPU support for Marvell Whitney

2015-02-25 Thread Xingxing Pan

On 02/25/2015 09:32 PM, Xingxing Pan wrote:

Hi,

This patch merges pipeline description for marvell-whitney to latest
code base.
Is it OK for trunk?


Refactor the commit message.

--
Regards,
Xingxing
Add pipeline description for marvell-whitney.

2015-02-26  Xingxing Pan  xxing...@marvell.com

* config/arm/arm-cores.def: Add new core marvell-whitney.
* config/arm/arm-protos.h:
(marvell_whitney_vector_mode_qi): Declare.
(marvell_whitney_inner_shift): Ditto.
* config/arm/arm-tables.opt: Regenerated.
* config/arm/arm-tune.md: Regenerated.
* config/arm/arm.c (arm_marvell_whitney_tune): New structure.
(arm_issue_rate): Add marvell_whitney.
(marvell_whitney_vector_mode_qi): New function.
(marvell_whitney_inner_shift): Ditto.
* config/arm/arm.md: Include marvell-whitney.md.
(generic_sched): Add marvell_whitney.
(generic_vfp): Ditto.
* config/arm/bpabi.h (BE8_LINK_SPEC): Add marvell-whitney.
* config/arm/t-arm (MD_INCLUDES): Add marvell-whitney.md.
* config/arm/marvell-whitney.md: New file.
* doc/invoke.texi: Document marvell-whitney.

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index d7e730d..fc76eb5 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -159,6 +159,7 @@ ARM_CORE(cortex-m7,		cortexm7, cortexm7,		7EM, FL_LDSCHED, cortex_m7)
 ARM_CORE(cortex-m4,		cortexm4, cortexm4,		7EM, FL_LDSCHED, v7m)
 ARM_CORE(cortex-m3,		cortexm3, cortexm3,		7M,  FL_LDSCHED, v7m)
 ARM_CORE(marvell-pj4,		marvell_pj4, marvell_pj4,	7A,  FL_LDSCHED, 9e)
+ARM_CORE(marvell-whitney,	marvell_whitney, marvell_whitney, 7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, marvell_whitney)
 
 /* V7 big.LITTLE implementations */
 ARM_CORE(cortex-a15.cortex-a7, cortexa15cortexa7, cortexa7,	7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a15)
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 307babb..d047dbc 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -231,6 +231,9 @@ extern void arm_order_regs_for_local_alloc (void);
 
 extern int arm_max_conditional_execute ();
 
+extern bool marvell_whitney_vector_mode_qi (rtx_insn *insn);
+extern bool marvell_whitney_inner_shift (rtx_insn *insn);
+
 /* Vectorizer cost model implementation.  */
 struct cpu_vec_costs {
   const int scalar_stmt_cost;   /* Cost of any scalar operation, excluding
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index 3450e5b..f0f9f3f 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -298,6 +298,9 @@ EnumValue
 Enum(processor_type) String(marvell-pj4) Value(marvell_pj4)
 
 EnumValue
+Enum(processor_type) String(marvell-whitney) Value(marvell_whitney)
+
+EnumValue
 Enum(processor_type) String(cortex-a15.cortex-a7) Value(cortexa15cortexa7)
 
 EnumValue
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index d459f27..fbfab2e 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -31,7 +31,8 @@
 	cortexa15,cortexa17,cortexr4,
 	cortexr4f,cortexr5,cortexr7,
 	cortexm7,cortexm4,cortexm3,
-	marvell_pj4,cortexa15cortexa7,cortexa17cortexa7,
-	cortexa53,cortexa57,cortexa72,
-	xgene1,cortexa57cortexa53,cortexa72cortexa53
+	marvell_pj4,marvell_whitney,cortexa15cortexa7,
+	cortexa17cortexa7,cortexa53,cortexa57,
+	cortexa72,xgene1,cortexa57cortexa53,
+	cortexa72cortexa53
 	(const (symbol_ref ((enum attr_tune) arm_tune
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 7bf5b4d..e68287f 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -2000,6 +2000,25 @@ const struct tune_params arm_cortex_a9_tune =
   ARM_SCHED_AUTOPREF_OFF			/* Sched L2 autopref.  */
 };
 
+const struct tune_params arm_marvell_whitney_tune =
+{
+  arm_9e_rtx_costs,
+  cortexa9_extra_costs,
+  cortex_a9_sched_adjust_cost,
+  1,		/* Constant limit.  */
+  5,		/* Max cond insns.  */
+  ARM_PREFETCH_BENEFICIAL(4,32,32),
+  false,	/* Prefer constant pool.  */
+  arm_default_branch_cost,
+  false,	/* Prefer LDRD/STRD.  */
+  {true, true},	/* Prefer non short circuit.  */
+  arm_default_vec_cost,/* Vectorizer costs.  */
+  false,/* Prefer Neon for 64-bits bitops.  */
+  false, false, /* Prefer 32-bit encodings.  */
+  false,	/* Prefer Neon for stringops.  */
+  8		/* Maximum insns to inline memset.  */
+};
+
 const struct tune_params arm_cortex_a12_tune =
 {
   arm_9e_rtx_costs,
@@ -11717,6 +11736,51 @@ fa726te_sched_adjust_cost (rtx_insn *insn, rtx link, rtx_insn *dep, int * cost)
   return true;
 }
 
+/* Return true if vector element size is byte.  */
+bool
+marvell_whitney_vector_mode_qi (rtx_insn *insn)
+{
+  machine_mode mode;
+
+  if (GET_CODE (PATTERN (insn)) == SET)
+{
+  mode = GET_MODE (SET_DEST (PATTERN (insn)));
+  if (VECTOR_MODE_P (mode)  GET_MODE_INNER (mode) == QImode)
+	

Re: [patch 1/2][ARM]: New CPU support for Marvell Whitney

2015-01-13 Thread Xingxing Pan

On 09/01/2015 19:22, Kyrill Tkachov wrote:

Hi Xingxing,

On 19/12/14 11:01, Xingxing Pan wrote:

+/* Return true if vector element size is byte. */

Minor nit: two spaces after full stop and before */ Same in other places
in the patch.


+bool
+marvell_whitney_vector_element_size_is_byte (rtx insn)
+{
+  if (GET_CODE (PATTERN (insn)) == SET)
+{
+  if ((GET_MODE (SET_DEST (PATTERN (insn))) == V8QImode) ||
+  (GET_MODE (SET_DEST (PATTERN (insn))) == V16QImode))
+   return true;
+}
+
+  return false;
+}


I see this is called from inside marvell-whitney.md. It seems to me that
this function takes RTX insns. Can the type of this be strengthened to
rtx_insn * ?
Also, this should be refactored and written a bit more generally by
checking for VECTOR_MODE_P and then GET_MODE_INNER for QImode, saving
you the trouble of enumerating the different vector QI modes.



+
+/* Return true if INSN has shift operation but is not a shift insn. */
+bool
+marvell_whitney_non_shift_with_shift_operand (rtx insn)


Similar comment. Can this be strengthened to rtx_insn * ?

Thanks,
Kyrill

+{
+  rtx pat = PATTERN (insn);
+
+  if (GET_CODE (pat) != SET)
+return false;
+
+  /* Is not a shift insn. */
+  rtx rvalue = SET_SRC (pat);
+  RTX_CODE code = GET_CODE (rvalue);
+  if (code == ASHIFT || code == ASHIFTRT
+  || code == LSHIFTRT || code == ROTATERT)
+return false;
+
+  subrtx_iterator::array_type array;
+  FOR_EACH_SUBRTX (iter, array, rvalue, ALL)
+{
+  /* Has shift operation. */
+  RTX_CODE code = GET_CODE (*iter);
+  if (code == ASHIFT || code == ASHIFTRT
+  || code == LSHIFTRT || code == ROTATERT)
+return true;
+}
+
+  return false;
+}




Hi Kyrill,

Thanks for advice. Refactored patch is attached.

--
Regards,
Xingxing
commit 3627056607b1e8604ac8d85ed44fdc7d3209cd3e
Author: Xingxing Pan xxing...@marvell.com
Date:   Thu Dec 18 16:58:05 2014 +0800

2015-01-13 Xingxing Pan  xxing...@marvell.com

* config/arm/arm-cores.def: Add new core marvell-whitney.
* config/arm/arm-protos.h:
(marvell_whitney_vector_mode_qi): Declare.
(marvell_whitney_inner_shift): Ditto.
* config/arm/arm-tables.opt: Regenerated.
* config/arm/arm-tune.md: Regenerated.
* config/arm/arm.c (arm_marvell_whitney_tune): New structure.
(arm_issue_rate): Add marvell_whitney.
(marvell_whitney_vector_mode_qi): New function.
(marvell_whitney_inner_shift): Ditto.
* config/arm/arm.md: Include marvell-whitney.md.
(generic_sched): Add marvell_whitney.
(generic_vfp): Ditto.
* config/arm/bpabi.h (BE8_LINK_SPEC): Add marvell-whitney.
* config/arm/t-arm (MD_INCLUDES): Add marvell-whitney.md.
* config/arm/marvell-whitney.md: New file.
* doc/invoke.texi: Document marvell-whitney.

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index 6fa5d99..26eb7ab 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -159,6 +159,7 @@ ARM_CORE(cortex-m7,		cortexm7, cortexm7,		7EM, FL_LDSCHED, cortex_m7)
 ARM_CORE(cortex-m4,		cortexm4, cortexm4,		7EM, FL_LDSCHED, v7m)
 ARM_CORE(cortex-m3,		cortexm3, cortexm3,		7M,  FL_LDSCHED, v7m)
 ARM_CORE(marvell-pj4,		marvell_pj4, marvell_pj4,	7A,  FL_LDSCHED, 9e)
+ARM_CORE(marvell-whitney,	marvell_whitney, marvell_whitney, 7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, marvell_whitney)
 
 /* V7 big.LITTLE implementations */
 ARM_CORE(cortex-a15.cortex-a7, cortexa15cortexa7, cortexa7,	7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a15)
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index fc45348..45001ae 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -231,6 +231,9 @@ extern void arm_order_regs_for_local_alloc (void);
 
 extern int arm_max_conditional_execute ();
 
+extern bool marvell_whitney_vector_mode_qi (rtx_insn *insn);
+extern bool marvell_whitney_inner_shift (rtx_insn *insn);
+
 /* Vectorizer cost model implementation.  */
 struct cpu_vec_costs {
   const int scalar_stmt_cost;   /* Cost of any scalar operation, excluding
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index ece9d5e..dc5f364 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -298,6 +298,9 @@ EnumValue
 Enum(processor_type) String(marvell-pj4) Value(marvell_pj4)
 
 EnumValue
+Enum(processor_type) String(marvell-whitney) Value(marvell_whitney)
+
+EnumValue
 Enum(processor_type) String(cortex-a15.cortex-a7) Value(cortexa15cortexa7)
 
 EnumValue
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index 452820ab..c73c33c 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -31,6 +31,7 @@
 	cortexa15,cortexa17,cortexr4,
 	cortexr4f,cortexr5,cortexr7,
 	cortexm7,cortexm4,cortexm3,
-	marvell_pj4,cortexa15cortexa7,cortexa17cortexa7,
-	cortexa53,cortexa57,cortexa57cortexa53
+	

Re: [patch 1/2][ARM]: New CPU support for Marvell Whitney

2015-01-09 Thread Kyrill Tkachov

Hi Xingxing,

On 19/12/14 11:01, Xingxing Pan wrote:

+/* Return true if vector element size is byte. */
Minor nit: two spaces after full stop and before */ Same in other places 
in the patch.



+bool
+marvell_whitney_vector_element_size_is_byte (rtx insn)
+{
+  if (GET_CODE (PATTERN (insn)) == SET)
+{
+  if ((GET_MODE (SET_DEST (PATTERN (insn))) == V8QImode) ||
+  (GET_MODE (SET_DEST (PATTERN (insn))) == V16QImode))
+   return true;
+}
+
+  return false;
+}


I see this is called from inside marvell-whitney.md. It seems to me that 
this function takes RTX insns. Can the type of this be strengthened to 
rtx_insn * ?
Also, this should be refactored and written a bit more generally by 
checking for VECTOR_MODE_P and then GET_MODE_INNER for QImode, saving 
you the trouble of enumerating the different vector QI modes.




+
+/* Return true if INSN has shift operation but is not a shift insn. */
+bool
+marvell_whitney_non_shift_with_shift_operand (rtx insn)


Similar comment. Can this be strengthened to rtx_insn * ?

Thanks,
Kyrill

+{
+  rtx pat = PATTERN (insn);
+
+  if (GET_CODE (pat) != SET)
+return false;
+
+  /* Is not a shift insn. */
+  rtx rvalue = SET_SRC (pat);
+  RTX_CODE code = GET_CODE (rvalue);
+  if (code == ASHIFT || code == ASHIFTRT
+  || code == LSHIFTRT || code == ROTATERT)
+return false;
+
+  subrtx_iterator::array_type array;
+  FOR_EACH_SUBRTX (iter, array, rvalue, ALL)
+{
+  /* Has shift operation. */
+  RTX_CODE code = GET_CODE (*iter);
+  if (code == ASHIFT || code == ASHIFTRT
+  || code == LSHIFTRT || code == ROTATERT)
+return true;
+}
+
+  return false;
+}





Re: [patch 1/2][ARM]: New CPU support for Marvell Whitney

2014-12-19 Thread Kyrill Tkachov

Hi Xingxing,

On 18/12/14 10:13, Xingxing Pan wrote:

+
+(define_attr tune_marvell_whitney yes,no
+  (const (if_then_else (eq_attr tune marvell_whitney)
+   (const_string yes)
+   (const_string no


Why do you need this? The canonical way we do this in the other .md 
files is check

for the tune attribute directly.


+
+(define_automaton whitney_pipe)
+
+(define_cpu_unit wCU1 whitney_pipe)
+(define_cpu_unit wCU2 whitney_pipe)
+(define_cpu_unit wCU3 whitney_pipe)
+(define_cpu_unit wCU4 whitney_pipe)
+(define_cpu_unit wCU5 whitney_pipe)
+(define_cpu_unit wCU6 whitney_pipe)
+(define_cpu_unit wCU7 whitney_pipe)
+(define_cpu_unit wCU8 whitney_pipe)
+(define_cpu_unit wCU9 whitney_pipe)
+(define_cpu_unit wCU10 whitney_pipe)
+(define_cpu_unit wCU11 whitney_pipe)
+(define_cpu_unit wCU12 whitney_pipe)
+(define_cpu_unit wCU13 whitney_pipe)
+
+(define_insn_reservation wIR1 1
+  (and (eq_attr tune_marvell_whitney yes)
+   (eq_attr whitney_type wTP1))
+  wCU1|wCU2|(wCU3+wCU4)
+)


Further to the comment above, this could be written as:

+(define_insn_reservation wIR1 1
+  (and (eq_attr tune marvell_whitney)
+   (eq_attr whitney_type wTP1))
+  wCU1|wCU2|(wCU3+wCU4)
+)


Same with the other reservations in this file.

Kyrill


+
+(define_insn_reservation wIR2 2
+  (and (eq_attr tune_marvell_whitney yes)
+   (eq_attr whitney_type wTP2))
+  wCU1|wCU2|(wCU3+wCU4)
+)





Re: [patch 1/2][ARM]: New CPU support for Marvell Whitney

2014-12-19 Thread Kyrill Tkachov

Hi Xingxin,

It seems that your mail client mangled this patch, at least the 
following hunk doesn't apply, even when I try to get it from the web 
archives.

Could you please resend it as an attachment perhaps?

Thanks,
Kyrill

On 18/12/14 10:13, Xingxing Pan wrote:

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index 423ee9e..b0ffbe1 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -159,6 +159,7 @@ ARM_CORE(cortex-m7,   cortexm7,
cortexm7, 7EM,
FL_LDSCHED, cortex_m7)
ARM_CORE(cortex-m4, cortexm4, cortexm4, 7EM,
FL_LDSCHED, v7m)
ARM_CORE(cortex-m3, cortexm3, cortexm3, 7M,
FL_LDSCHED, v7m)
ARM_CORE(marvell-pj4,   marvell_pj4, marvell_pj4,
   7A,  FL_LDSCHED, 9e)
+ARM_CORE(marvell-whitney,marvell_whitney, marvell_whitney,
7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, marvell_whitney)





Re: [patch 1/2][ARM]: New CPU support for Marvell Whitney

2014-12-19 Thread Xingxing Pan

On 19/12/2014 18:38, Kyrill Tkachov wrote:

Hi Xingxin,

It seems that your mail client mangled this patch, at least the
following hunk doesn't apply, even when I try to get it from the web
archives.
Could you please resend it as an attachment perhaps?

Thanks,
Kyrill

On 18/12/14 10:13, Xingxing Pan wrote:

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index 423ee9e..b0ffbe1 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -159,6 +159,7 @@ ARM_CORE(cortex-m7,   cortexm7,
cortexm7, 7EM,
FL_LDSCHED, cortex_m7)
 ARM_CORE(cortex-m4, cortexm4, cortexm4, 7EM,
FL_LDSCHED, v7m)
 ARM_CORE(cortex-m3, cortexm3, cortexm3, 7M,
FL_LDSCHED, v7m)
 ARM_CORE(marvell-pj4,   marvell_pj4, marvell_pj4,
7A,  FL_LDSCHED, 9e)
+ARM_CORE(marvell-whitney,marvell_whitney, marvell_whitney,
7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, marvell_whitney)





Hi Kyrill,

I've changed the code to use tune attribute directly. The new patch is 
attached.


Thanks,
Xingxing
commit 56745b611d40b77e1911075159f89959335d0298
Author: Xingxing Pan xxing...@marvell.com
Date:   Thu Dec 18 16:58:05 2014 +0800

2014-12-18 Xingxing Pan xxing...@marvell.com

* config/arm/arm-cores.def: Add new core marvell-whitney.
* config/arm/arm-protos.h:
(marvell_whitney_vector_element_size_is_byte): Declare.
(marvell_whitney_non_shift_with_shift_operand): Ditto.
* config/arm/arm-tables.opt: Regenerated.
* config/arm/arm-tune.md: Regenerated.
* config/arm/arm.c (arm_marvell_whitney_tune): New structure.
(arm_issue_rate): Add marvell_whitney.
(marvell_whitney_vector_element_size_is_byte): New function.
(marvell_whitney_non_shift_with_shift_operand): Ditto.
* config/arm/arm.md: Include marvell-whitney.md.
(generic_sched): Add marvell_whitney.
(generic_vfp): Ditto.
* config/arm/bpabi.h (BE8_LINK_SPEC): Add marvell-whitney.
* config/arm/t-arm (MD_INCLUDES): Add marvell-whitney.md.
* config/arm/marvell-whitney.md: New file.
* doc/invoke.texi: Document marvell-whitney.

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index 423ee9e..b0ffbe1 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -159,6 +159,7 @@ ARM_CORE(cortex-m7,		cortexm7, cortexm7,		7EM, FL_LDSCHED, cortex_m7)
 ARM_CORE(cortex-m4,		cortexm4, cortexm4,		7EM, FL_LDSCHED, v7m)
 ARM_CORE(cortex-m3,		cortexm3, cortexm3,		7M,  FL_LDSCHED, v7m)
 ARM_CORE(marvell-pj4,		marvell_pj4, marvell_pj4,	7A,  FL_LDSCHED, 9e)
+ARM_CORE(marvell-whitney,		marvell_whitney, marvell_whitney,	7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, marvell_whitney)
 
 /* V7 big.LITTLE implementations */
 ARM_CORE(cortex-a15.cortex-a7, cortexa15cortexa7, cortexa7,	7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a15)
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 20cfa9f..e86db1e 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -231,6 +231,9 @@ extern void arm_order_regs_for_local_alloc (void);
 
 extern int arm_max_conditional_execute ();
 
+extern bool marvell_whitney_vector_element_size_is_byte (rtx insn);
+extern bool marvell_whitney_non_shift_with_shift_operand (rtx insn);
+
 /* Vectorizer cost model implementation.  */
 struct cpu_vec_costs {
   const int scalar_stmt_cost;   /* Cost of any scalar operation, excluding
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index 9b1886e..3371ce3 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -298,6 +298,9 @@ EnumValue
 Enum(processor_type) String(marvell-pj4) Value(marvell_pj4)
 
 EnumValue
+Enum(processor_type) String(marvell-whitney) Value(marvell_whitney)
+
+EnumValue
 Enum(processor_type) String(cortex-a15.cortex-a7) Value(cortexa15cortexa7)
 
 EnumValue
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index d300c51..c73c33c 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -28,9 +28,10 @@
 	cortexm1smallmultiply,cortexm0smallmultiply,cortexm0plussmallmultiply,
 	genericv7a,cortexa5,cortexa7,
 	cortexa8,cortexa9,cortexa12,
-	cortexa15,cortexa17,cortexr4,cortexr4f,
-	cortexr5,cortexr7,cortexm7,
-	cortexm4,cortexm3,marvell_pj4,
-	cortexa15cortexa7,cortexa17cortexa7,cortexa53,
-	cortexa57,cortexa57cortexa53
+	cortexa15,cortexa17,cortexr4,
+	cortexr4f,cortexr5,cortexr7,
+	cortexm7,cortexm4,cortexm3,
+	marvell_pj4,marvell_whitney,cortexa15cortexa7,
+	cortexa17cortexa7,cortexa53,cortexa57,
+	cortexa57cortexa53
 	(const (symbol_ref ((enum attr_tune) arm_tune
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 0ec526b..183da4c 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1914,6 +1914,25 @@ const struct tune_params arm_cortex_a9_tune =