[RFC AArch64][PR 63304] Handle literal pools for functions > 1 MiB in size.

2015-07-27 Thread Ramana Radhakrishnan
Hi,

This patch appears to fix the issue in PR63304 where we have
functions that are > 1MiB. The idea is to use adrp / ldr or adrp / add
instructions to address the literal pools under the control of a command
line option. The patch as attached turns this feature on by default as
that is the mode I've used for testing.
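
As a rough illustration of why the plain literal load runs out of reach in
very large functions (a sketch only, not part of the patch): ldr (literal)
encodes a signed 19-bit offset scaled by 4, giving about +/-1MiB, while adrp
encodes a signed 21-bit count of 4KiB pages, giving about +/-4GiB. The helper
names below are hypothetical and exist only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only; these are not GCC functions.  */

/* LDR (literal): signed 19-bit immediate scaled by 4, so the byte offset
   must be a multiple of 4 in the range [-2^20, 2^20 - 4].  */
bool
fits_ldr_literal (int64_t byte_offset)
{
  return (byte_offset % 4 == 0
          && byte_offset >= -(INT64_C (1) << 20)
          && byte_offset <= (INT64_C (1) << 20) - 4);
}

/* ADRP: signed 21-bit immediate counting 4KiB pages, so the page delta
   must lie in [-2^20, 2^20 - 1] pages, i.e. roughly +/-4GiB.  */
bool
fits_adrp (uint64_t pc, uint64_t target)
{
  /* After shifting by 12 both values fit in 52 bits, so the casts to a
     signed type are safe.  */
  int64_t page_delta = (int64_t) (target >> 12) - (int64_t) (pc >> 12);
  return (page_delta >= -(INT64_C (1) << 20)
          && page_delta <= (INT64_C (1) << 20) - 1);
}
```

For example, a pool 2MiB past the load (offset 0x200000) fails
fits_ldr_literal but is comfortably within fits_adrp, which is the case the
adrp / ldr sequence is meant to cover.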

My thought is that we turn this on by default on trunk but keep this
disabled by default for the release branches in order to get some
serious testing for this feature while it bakes on trunk.

As a follow-up I would like to see if estimate_num_insns or
something else can give us a heuristic to turn this on for "large" functions.
After all, the number of incidences of this is quite low in real life,
so maybe we should look to restrict this use as much as possible on the
grounds that this code generation implies an extra integer register for
addressing for every floating point and vector constant, and I don't think
that's great in code that may already have high register pressure.
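
A minimal sketch of the kind of size heuristic meant here, assuming an
instruction-count estimate is available as a plain integer and relying on the
fact that A64 instructions are 4 bytes each. The function name and the slack
parameter are hypothetical, and this is not something the attached patch
implements.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical heuristic, for illustration only; not a GCC function.
   ESTIMATED_INSNS stands in for whatever estimate_num_insns (or a
   similar estimator) would report for the function.  */

#define A64_INSN_BYTES 4                       /* Fixed-width encoding.  */
#define LDR_LITERAL_REACH (INT64_C (1) << 20)  /* Reach of ldr (literal).  */

/* Return true if a function of ESTIMATED_INSNS instructions, plus
   SLACK_BYTES of margin for the pool itself and for estimation error,
   might place a literal pool out of ldr (literal) range.  */
bool
may_need_non_pcrel_literals (int64_t estimated_insns, int64_t slack_bytes)
{
  int64_t estimated_bytes = estimated_insns * A64_INSN_BYTES;
  return estimated_bytes + slack_bytes >= LDR_LITERAL_REACH;
}
```

With such a check, only functions near or above the 1MiB mark would pay the
cost of the extra integer register per constant.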

Tested on aarch64-none-elf with no regressions.
Bootstrapped and regression tested on
aarch64-none-linux-gnu. Additionally a test run of SPEC2k showed
no issues other than a miscompare for eon which also
appeared in the base run. Thus I'm confident this is
reasonably sane for now.

I will also note that this patch will need rebasing on top of Kyrill's
work with target attributes and thus cannot be final for trunk; however, this
version is still applicable to all release branches.

More testing is still underway but in the meantime I'd like to
put this up for some comments please.

regards
Ramana

  Ramana Radhakrishnan  

PR target/63304
* config/aarch64/aarch64.c (aarch64_expand_mov_immediate): Handle
nopcrelative_literal_loads.
(aarch64_classify_address): Likewise.
(aarch64_constant_pool_reload_icode): Define.
(aarch64_secondary_reload): Handle secondary reloads for
literal pools.
(aarch64_override_options): Handle nopcrelative_literal_loads.
(aarch64_classify_symbol): Handle nopcrelative_literal_loads.
* config/aarch64/aarch64.md (aarch64_reload_movcp):
Define.
(aarch64_reload_movcp): Likewise.
* config/aarch64/aarch64.opt: New option mnopc-relative-literal-loads
* config/aarch64/predicates.md (aarch64_constant_pool_symref): New
predicate.
* doc/invoke.texi (mnopc-relative-literal-loads): Document.
---
 gcc/config/aarch64/aarch64.c | 102 +--
 gcc/config/aarch64/aarch64.md|  26 ++
 gcc/config/aarch64/aarch64.opt   |   4 ++
 gcc/config/aarch64/predicates.md |   4 ++
 gcc/doc/invoke.texi  |   8 +++
 5 files changed, 140 insertions(+), 4 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 020f63c..f37a031 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1612,11 +1612,27 @@ aarch64_expand_mov_immediate (rtx dest, rtx imm)
  aarch64_emit_move (dest, base);
  return;
}
+
  mem = force_const_mem (ptr_mode, imm);
  gcc_assert (mem);
+
+ /* If we aren't generating PC relative literals, then
+we need to expand the literal pool access carefully.
+This is something that needs to be done in a number
+of places, so could well live as a separate function.  */
+ if (nopcrelative_literal_loads)
+   {
+ gcc_assert (can_create_pseudo_p ());
+ base = gen_reg_rtx (ptr_mode);
+ aarch64_expand_mov_immediate (base, XEXP (mem, 0));
+ mem = gen_rtx_MEM (ptr_mode, base);
+   }
+
  if (mode != ptr_mode)
mem = gen_rtx_ZERO_EXTEND (mode, mem);
+
  emit_insn (gen_rtx_SET (dest, mem));
+
  return;
 
 case SYMBOL_SMALL_TLSGD:
@@ -3728,9 +3744,10 @@ aarch64_classify_address (struct aarch64_address_info *info,
  rtx sym, addend;
 
  split_const (x, &sym, &addend);
- return (GET_CODE (sym) == LABEL_REF
- || (GET_CODE (sym) == SYMBOL_REF
- && CONSTANT_POOL_ADDRESS_P (sym)));
+ return ((GET_CODE (sym) == LABEL_REF
+  || (GET_CODE (sym) == SYMBOL_REF
+  && CONSTANT_POOL_ADDRESS_P (sym)
+  && !nopcrelative_literal_loads)));
}
   return false;
 
@@ -4918,12 +4935,69 @@ aarch64_legitimize_reload_address (rtx *x_p,
 }
 
 
+/* Return the reload icode required for a constant pool in mode.  */
+static enum insn_code
+aarch64_constant_pool_reload_icode (machine_mode mode)
+{
+  switch (mode)
+{
+case SFmode:
+  return CODE_FOR_aarch64_reload_movcpsfdi;
+
+case DFmode:
+  return CODE_FOR_aarch64_reload_movcpdfdi;
+
+case TFmode:
+  return CODE_FOR_aarch64_reload_movcptfdi;
+
+case V8QImode:
+

Re: [RFC AArch64][PR 63304] Handle literal pools for functions > 1 MiB in size.

2015-09-14 Thread Ramana Radhakrishnan


On 27/08/15 15:07, Marcus Shawcroft wrote:
> On 27 July 2015 at 15:33, Ramana Radhakrishnan
>  wrote:
> 
>>   Ramana Radhakrishnan  
>>
>> PR target/63304
>> * config/aarch64/aarch64.c (aarch64_expand_mov_immediate): Handle
>> nopcrelative_literal_loads.
>> (aarch64_classify_address): Likewise.
>> (aarch64_constant_pool_reload_icode): Define.
>> (aarch64_secondary_reload): Handle secondary reloads for
>> literal pools.
>> (aarch64_override_options): Handle nopcrelative_literal_loads.
>> (aarch64_classify_symbol): Handle nopcrelative_literal_loads.
>> * config/aarch64/aarch64.md 
>> (aarch64_reload_movcp):
>> Define.
>> (aarch64_reload_movcp): Likewise.
>> * config/aarch64/aarch64.opt: New option mnopc-relative-literal-loads
>> * config/aarch64/predicates.md (aarch64_constant_pool_symref): New
>> predicate.
>> * doc/invoke.texi (mnopc-relative-literal-loads): Document.
> 
> This looks OK to me. It needs rebasing, but OK if the rebase is
> trivial.  Default on is fine.  Hold off on the back ports for a couple
> of weeks.
> Cheers
> /Marcus
> 

This is what I applied. I'll give it a week or so on trunk before backporting
to the release branches.
Since handling of literal pools more than 1MiB away is on by default, this
final rebased version switches the option name to the positive form
(mpc-relative-literal-loads) and handles it accordingly.

Tested on aarch64-none-elf, no regressions. Applied to trunk.

Thanks,
Ramana 


2015-09-14  Ramana Radhakrishnan  

PR target/63304
* config/aarch64/aarch64.c (aarch64_expand_mov_immediate): Handle
nopcrelative_literal_loads.
(aarch64_classify_address): Likewise.
(aarch64_constant_pool_reload_icode): Define.
(aarch64_secondary_reload): Handle secondary reloads for
literal pools.
(aarch64_override_options): Handle nopcrelative_literal_loads.
(aarch64_classify_symbol): Handle nopcrelative_literal_loads.
* config/aarch64/aarch64.md (aarch64_reload_movcp):
Define.
(aarch64_reload_movcp): Likewise.
* config/aarch64/aarch64.opt (mpc-relative-literal-loads): New option.
* config/aarch64/predicates.md (aarch64_constant_pool_symref): New
predicate.
* doc/invoke.texi (mpc-relative-literal-loads): Document.
Index: gcc/ChangeLog
===
--- gcc/ChangeLog   (revision 227737)
+++ gcc/ChangeLog   (working copy)
@@ -1,3 +1,22 @@
+2015-09-14  Ramana Radhakrishnan  
+
+   PR target/63304
+   * config/aarch64/aarch64.c (aarch64_expand_mov_immediate): Handle
+   nopcrelative_literal_loads.
+   (aarch64_classify_address): Likewise.
+   (aarch64_constant_pool_reload_icode): Define.
+   (aarch64_secondary_reload): Handle secondary reloads for
+   literal pools.
+   (aarch64_override_options): Handle nopcrelative_literal_loads.
+   (aarch64_classify_symbol): Handle nopcrelative_literal_loads.
+   * config/aarch64/aarch64.md (aarch64_reload_movcp):
+   Define.
+   (aarch64_reload_movcp): Likewise.
+   * config/aarch64/aarch64.opt (mpc-relative-literal-loads): New option.
+   * config/aarch64/predicates.md (aarch64_constant_pool_symref): New
+   predicate.
+   * doc/invoke.texi (mpc-relative-literal-loads): Document.
+
 2015-09-13  Olivier Hainque  
Eric Botcazou  
 
Index: gcc/config/aarch64/aarch64.c
===
--- gcc/config/aarch64/aarch64.c(revision 227737)
+++ gcc/config/aarch64/aarch64.c(working copy)
@@ -1734,11 +1734,27 @@
  aarch64_emit_move (dest, base);
  return;
}
+
  mem = force_const_mem (ptr_mode, imm);
  gcc_assert (mem);
+
+ /* If we aren't generating PC relative literals, then
+we need to expand the literal pool access carefully.
+This is something that needs to be done in a number
+of places, so could well live as a separate function.  */
+ if (nopcrelative_literal_loads)
+   {
+ gcc_assert (can_create_pseudo_p ());
+ base = gen_reg_rtx (ptr_mode);
+ aarch64_expand_mov_immediate (base, XEXP (mem, 0));
+ mem = gen_rtx_MEM (ptr_mode, base);
+   }
+
  if (mode != ptr_mode)
mem = gen_rtx_ZERO_EXTEND (mode, mem);
+
  emit_insn (gen_rtx_SET (dest, mem));
+
  return;
 
 case SYMBOL_SMALL_TLSGD:
@@ -3854,9 +3870,10 @@
  rtx sym, addend;
 
  split_const (x, &sym, &addend);
- return (GET_CODE (sym) == LABEL_REF
- || (GET_CODE (sym) == SYMBOL_REF
- && CONSTANT_POOL_ADDRESS_P (sym)));
+ return ((GET_CODE (sym) == LABEL_REF
+  || (GET_CODE (s

Re: [RFC AArch64][PR 63304] Handle literal pools for functions > 1 MiB in size.

2015-08-27 Thread Marcus Shawcroft
On 27 July 2015 at 15:33, Ramana Radhakrishnan
 wrote:

>   Ramana Radhakrishnan  
>
> PR target/63304
> * config/aarch64/aarch64.c (aarch64_expand_mov_immediate): Handle
> nopcrelative_literal_loads.
> (aarch64_classify_address): Likewise.
> (aarch64_constant_pool_reload_icode): Define.
> (aarch64_secondary_reload): Handle secondary reloads for
> literal pools.
> (aarch64_override_options): Handle nopcrelative_literal_loads.
> (aarch64_classify_symbol): Handle nopcrelative_literal_loads.
> * config/aarch64/aarch64.md 
> (aarch64_reload_movcp):
> Define.
> (aarch64_reload_movcp): Likewise.
> * config/aarch64/aarch64.opt: New option mnopc-relative-literal-loads
> * config/aarch64/predicates.md (aarch64_constant_pool_symref): New
> predicate.
> * doc/invoke.texi (mnopc-relative-literal-loads): Document.

This looks OK to me. It needs rebasing, but OK if the rebase is
trivial.  Default on is fine.  Hold off on the back ports for a couple
of weeks.
Cheers
/Marcus


Re: [RFC AArch64][PR 63304] Handle literal pools for functions > 1 MiB in size.

2015-09-11 Thread Ramana Radhakrishnan
On Thu, Aug 27, 2015 at 03:07:30PM +0100, Marcus Shawcroft wrote:
> On 27 July 2015 at 15:33, Ramana Radhakrishnan
>  wrote:
> 
> >   Ramana Radhakrishnan  
> >
> > PR target/63304
> > * config/aarch64/aarch64.c (aarch64_expand_mov_immediate): Handle
> > nopcrelative_literal_loads.
> > (aarch64_classify_address): Likewise.
> > (aarch64_constant_pool_reload_icode): Define.
> > (aarch64_secondary_reload): Handle secondary reloads for
> > literal pools.
> > (aarch64_override_options): Handle nopcrelative_literal_loads.
> > (aarch64_classify_symbol): Handle nopcrelative_literal_loads.
> > * config/aarch64/aarch64.md 
> > (aarch64_reload_movcp):
> > Define.
> > (aarch64_reload_movcp): Likewise.
> > * config/aarch64/aarch64.opt: New option 
> > mnopc-relative-literal-loads
> > * config/aarch64/predicates.md (aarch64_constant_pool_symref): New
> > predicate.
> > * doc/invoke.texi (mnopc-relative-literal-loads): Document.
> 
> This looks OK to me. It needs rebasing, but OK if the rebase is
> > trivial.  Default on is fine.  Hold off on the back ports for a couple
> of weeks.
> Cheers
> /Marcus

I didn't want to commit this and run off on holiday.

The rebase required is pretty much for Kyrill's work with saving
and restoring state for the target attributes stuff. So that's simple enough
and has been tested OK.

I had forgotten there was a prerequisite that requires a rebase after Alan's
recent work for F16; I've posted that again here, after the rebase, for
approval.

https://gcc.gnu.org/ml/gcc-patches/2015-07/msg02074.html

movtf is unnecessary as a separate expander. Move this to be with
the standard scalar floating point expanders.

Achieved by adding a new iterator and then using the same.

Tested cross on aarch64-none-elf with no regressions.

Ramana

* config/aarch64/aarch64.md (mov:GPF_F16): Use GPF_TF_F16.
(movtf): Delete.
* config/aarch64/iterators.md (GPF_TF_F16): New.
(GPF_F16): Delete.

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 2522982..58bb04a 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1043,8 +1043,8 @@
 })
 
 (define_expand "mov<mode>"
-  [(set (match_operand:GPF_F16 0 "nonimmediate_operand" "")
-   (match_operand:GPF_F16 1 "general_operand" ""))]
+  [(set (match_operand:GPF_TF_F16 0 "nonimmediate_operand" "")
+   (match_operand:GPF_TF_F16 1 "general_operand" ""))]
   ""
   {
 if (!TARGET_FLOAT)
@@ -1118,24 +1118,6 @@
  f_loadd,f_stored,load1,store1,mov_reg")]
 )
 
-(define_expand "movtf"
-  [(set (match_operand:TF 0 "nonimmediate_operand" "")
-   (match_operand:TF 1 "general_operand" ""))]
-  ""
-  {
-if (!TARGET_FLOAT)
-  {
-   aarch64_err_no_fpadvsimd (TFmode, "code");
-   FAIL;
-  }
-
-if (GET_CODE (operands[0]) == MEM
-&& ! (GET_CODE (operands[1]) == CONST_DOUBLE
- && aarch64_float_const_zero_rtx_p (operands[1])))
-  operands[1] = force_reg (TFmode, operands[1]);
-  }
-)
-
 (define_insn "*movtf_aarch64"
   [(set (match_operand:TF 0
 "nonimmediate_operand" "=w,?&r,w ,?r,w,?w,w,m,?r ,Ump,Ump")
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 475aa6e..c1a0ce2 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -38,8 +38,8 @@
 ;; Iterator for General Purpose Floating-point registers (32- and 64-bit modes)
 (define_mode_iterator GPF [SF DF])
 
-;; Iterator for General Purpose Float registers, inc __fp16.
-(define_mode_iterator GPF_F16 [HF SF DF])
+;; Iterator for all scalar floating point modes (HF, SF, DF and TF)
+(define_mode_iterator GPF_TF_F16 [HF SF DF TF])
 
 ;; Integer vector modes.
 (define_mode_iterator VDQ_I [V8QI V16QI V4HI V8HI V2SI V4SI V2DI])




Re: [RFC AArch64][PR 63304] Handle literal pools for functions > 1 MiB in size.

2015-09-11 Thread Richard Earnshaw
On 11/09/15 09:48, Ramana Radhakrishnan wrote:
> On Thu, Aug 27, 2015 at 03:07:30PM +0100, Marcus Shawcroft wrote:
>> On 27 July 2015 at 15:33, Ramana Radhakrishnan
>>  wrote:
>>
>>>   Ramana Radhakrishnan  
>>>
>>> PR target/63304
>>> * config/aarch64/aarch64.c (aarch64_expand_mov_immediate): Handle
>>> nopcrelative_literal_loads.
>>> (aarch64_classify_address): Likewise.
>>> (aarch64_constant_pool_reload_icode): Define.
>>> (aarch64_secondary_reload): Handle secondary reloads for
>>> literal pools.
>>> (aarch64_override_options): Handle nopcrelative_literal_loads.
>>> (aarch64_classify_symbol): Handle nopcrelative_literal_loads.
>>> * config/aarch64/aarch64.md 
>>> (aarch64_reload_movcp):
>>> Define.
>>> (aarch64_reload_movcp): Likewise.
>>> * config/aarch64/aarch64.opt: New option 
>>> mnopc-relative-literal-loads
>>> * config/aarch64/predicates.md (aarch64_constant_pool_symref): New
>>> predicate.
>>> * doc/invoke.texi (mnopc-relative-literal-loads): Document.
>>
>> This looks OK to me. It needs rebasing, but OK if the rebase is
>> trivial.  Default on is fine.  Hold off on the back ports for a couple
>> of weeks.
>> Cheers
>> /Marcus
> 
> I didn't want to commit this and run off on holiday.
> 
> The rebase required is pretty much for Kyrill's work with saving
> and restoring state for the target attributes stuff. So that's simple enough
> and has been tested OK.
> 
> I had forgotten there was a prerequisite that requires a rebase after Alan's
> recent work for F16; I've posted that again here, after the rebase, for
> approval.
> 
> https://gcc.gnu.org/ml/gcc-patches/2015-07/msg02074.html
> 
> movtf is unnecessary as a separate expander. Move this to be with
> the standard scalar floating point expanders.
> 
> Achieved by adding a new iterator and then using the same.
> 
> Tested cross on aarch64-none-elf with no regressions.
> 
> Ramana
> 
>   * config/aarch64/aarch64.md (mov:GPF_F16): Use GPF_TF_F16.
>   (movtf): Delete.
>   * config/aarch64/iterators.md (GPF_TF_F16): New.
>   (GPF_F16): Delete.

This is OK.

R.

> 
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 2522982..58bb04a 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -1043,8 +1043,8 @@
>  })
>  
>  (define_expand "mov<mode>"
> -  [(set (match_operand:GPF_F16 0 "nonimmediate_operand" "")
> - (match_operand:GPF_F16 1 "general_operand" ""))]
> +  [(set (match_operand:GPF_TF_F16 0 "nonimmediate_operand" "")
> + (match_operand:GPF_TF_F16 1 "general_operand" ""))]
>""
>{
>  if (!TARGET_FLOAT)
> @@ -1118,24 +1118,6 @@
>   f_loadd,f_stored,load1,store1,mov_reg")]
>  )
>  
> -(define_expand "movtf"
> -  [(set (match_operand:TF 0 "nonimmediate_operand" "")
> - (match_operand:TF 1 "general_operand" ""))]
> -  ""
> -  {
> -if (!TARGET_FLOAT)
> -  {
> - aarch64_err_no_fpadvsimd (TFmode, "code");
> - FAIL;
> -  }
> -
> -if (GET_CODE (operands[0]) == MEM
> -&& ! (GET_CODE (operands[1]) == CONST_DOUBLE
> -   && aarch64_float_const_zero_rtx_p (operands[1])))
> -  operands[1] = force_reg (TFmode, operands[1]);
> -  }
> -)
> -
>  (define_insn "*movtf_aarch64"
>[(set (match_operand:TF 0
>"nonimmediate_operand" "=w,?&r,w ,?r,w,?w,w,m,?r ,Ump,Ump")
> diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
> index 475aa6e..c1a0ce2 100644
> --- a/gcc/config/aarch64/iterators.md
> +++ b/gcc/config/aarch64/iterators.md
> @@ -38,8 +38,8 @@
>  ;; Iterator for General Purpose Floating-point registers (32- and 64-bit 
> modes)
>  (define_mode_iterator GPF [SF DF])
>  
> -;; Iterator for General Purpose Float registers, inc __fp16.
> -(define_mode_iterator GPF_F16 [HF SF DF])
> +;; Iterator for all scalar floating point modes (HF, SF, DF and TF)
> +(define_mode_iterator GPF_TF_F16 [HF SF DF TF])
>  
>  ;; Integer vector modes.
>  (define_mode_iterator VDQ_I [V8QI V16QI V4HI V8HI V2SI V4SI V2DI])
> 
>