On Tue, Mar 12, 2019 at 10:50 AM Richard Sandiford
<richard.sandif...@arm.com> wrote:
>
> Steve Ellcey <sell...@marvell.com> writes:
> > Richard,
> >
> > I don't necessarily disagree with anything in your comments and long
> > term I think that is the right direction, but I wonder if that level of
> > change is appropriate for GCC Stage 4 which is where we are now.  Your
> > changes would require fixes in shared code, whereas setting
> > REG_ALLOC_ORDER only affects Aarch64 and seems like a safer change.
>
> I guess it's weighing invasiveness in terms of lines of code/location of
> code vs. invasiveness in terms of the effect the code has.  Changing
> REG_ALLOC_ORDER affects all functinos that use significant numbers of
> SIMD registers, which could have unexpected knock-on effects like
> performance regressions or exposing latent bugs.  It would also make
> the RA try V24 before V16 for callers of vector PCS functions,
> even though they should prefer V16.
>
> Keeping the change specific to callees that use the new vector PCS
> feature seems more conservative from that point of view.
>
> > I am not sure how long it would take me to implement something along
> > the lines of what you are suggesting.
>
> OK, in that case, I thought I'd give a go.
>
> This patch adds a new target hook that says which registers a function
> can use without saving them in the prologue and restoring them in the
> epilogue.  By default this is call_used_reg_set.
>
> The patch only makes IRA use the hook, and in that context it's just
> an (important) optimisation.  But I think we need the same thing for
> passes like regrename, where it'll be a correctness issue.  I'll follow
> on with patches there if this one is OK.
>
> Tested on aarch64-linux-gnu (with and without SVE) and x86_64-linux-gnu.
> OK to install?

So why is call_used_reg_set not updated appropriately?  As you say
if you introduce sth else it becomes a correctness issue to use that "else"
rather than call_used_reg_set which was OK up until now.  I wonder how
other targets with a per-function different ABI handle this situation.  Or are
you the first and others at most affect calling conventions here?  That is,
it looks like making (parts of?) this_taret_hard_regs per function or
re-computed at set_cfun time.  There's alrady a hook for that btw,
targetm.set_current_function ().  Why not change call_used_reg_set there?

Richard.

> Richard
>
>
> 2019-03-12  Richard Sandiford  <richard.sandif...@arm.com>
>
> gcc/
>         PR target/89628
>         * target.def (function_scratch_regs): New hook.
>         * doc/tm.texi.in (TARGET_FUNCTION_SCRATCH_REGS): New placeholder.
>         * doc/tm.texi: Regenerate.
>         * targhooks.h (default_function_scratch_regs): Declare.
>         * targhooks.c (default_function_scratch_regs): New function.
>         * emit-rtl.h (rtl_data::scratch_regs): New member variable.
>         * function.c (init_dummy_function_start): Initialize it.
>         (init_function_start): Likewise.
>         * ira-color.c (calculate_saved_nregs): Use it instead of
>         call_used_reg_set.
>         * config/aarch64/aarch64.c (aarch64_function_scratch_regs): New
>         function.
>         (TARGET_FUNCTION_SCRATCH_REGS): Redefine.
>
> gcc/testsuite/
>         PR target/89628
>         * gcc.target/aarch64/pr89628.c: New test.
>
> Index: gcc/target.def
> ===================================================================
> --- gcc/target.def      2019-03-08 18:14:26.113008680 +0000
> +++ gcc/target.def      2019-03-12 09:34:25.447913041 +0000
> @@ -5763,6 +5763,18 @@ The default version of this hook always
>   default_hard_regno_scratch_ok)
>
>  DEFHOOK
> +(function_scratch_regs,
> + "This hook sets @var{regs} to the set of registers that function @var{fn}\n\
> +can use without having to save them in the prologue and restore them in 
> the\n\
> +epilogue.\n\
> +\n\
> +By default this set is the same as @code{call_used_reg_set}.  Targets only\n\
> +need to override the hook if some functions implement alternative ABIs 
> that\n\
> +save more registers or fewer registers than normal.",
> + void, (struct function *fn, HARD_REG_SET *regs),
> + default_function_scratch_regs)
> +
> +DEFHOOK
>  (hard_regno_call_part_clobbered,
>   "This hook should return true if @var{regno} is partly call-saved and\n\
>  partly call-clobbered, and if a value of mode @var{mode} would be partly\n\
> Index: gcc/doc/tm.texi.in
> ===================================================================
> --- gcc/doc/tm.texi.in  2019-03-08 18:14:25.577010718 +0000
> +++ gcc/doc/tm.texi.in  2019-03-12 09:34:25.443913054 +0000
> @@ -1705,6 +1705,11 @@ of @code{CALL_USED_REGISTERS}.
>  @cindex call-used register
>  @cindex call-clobbered register
>  @cindex call-saved register
> +@hook TARGET_FUNCTION_SCRATCH_REGS
> +
> +@cindex call-used register
> +@cindex call-clobbered register
> +@cindex call-saved register
>  @hook TARGET_HARD_REGNO_CALL_PART_CLOBBERED
>
>  @hook TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS
> Index: gcc/doc/tm.texi
> ===================================================================
> --- gcc/doc/tm.texi     2019-03-08 18:14:25.573010734 +0000
> +++ gcc/doc/tm.texi     2019-03-12 09:34:25.443913054 +0000
> @@ -1894,6 +1894,19 @@ of @code{CALL_USED_REGISTERS}.
>  @cindex call-used register
>  @cindex call-clobbered register
>  @cindex call-saved register
> +@deftypefn {Target Hook} void TARGET_FUNCTION_SCRATCH_REGS (struct function 
> *@var{fn}, HARD_REG_SET *@var{regs})
> +This hook sets @var{regs} to the set of registers that function @var{fn}
> +can use without having to save them in the prologue and restore them in the
> +epilogue.
> +
> +By default this set is the same as @code{call_used_reg_set}.  Targets only
> +need to override the hook if some functions implement alternative ABIs that
> +save more registers or fewer registers than normal.
> +@end deftypefn
> +
> +@cindex call-used register
> +@cindex call-clobbered register
> +@cindex call-saved register
>  @deftypefn {Target Hook} bool TARGET_HARD_REGNO_CALL_PART_CLOBBERED 
> (rtx_insn *@var{insn}, unsigned int @var{regno}, machine_mode @var{mode})
>  This hook should return true if @var{regno} is partly call-saved and
>  partly call-clobbered, and if a value of mode @var{mode} would be partly
> Index: gcc/targhooks.h
> ===================================================================
> --- gcc/targhooks.h     2019-03-08 18:14:25.849009683 +0000
> +++ gcc/targhooks.h     2019-03-12 09:34:25.447913041 +0000
> @@ -286,5 +286,6 @@ extern bool speculation_safe_value_not_n
>  extern rtx default_speculation_safe_value (machine_mode, rtx, rtx, rtx);
>  extern void default_remove_extra_call_preserved_regs (rtx_insn *,
>                                                       HARD_REG_SET *);
> +extern void default_function_scratch_regs (struct function *, HARD_REG_SET 
> *);
>
>  #endif /* GCC_TARGHOOKS_H */
> Index: gcc/targhooks.c
> ===================================================================
> --- gcc/targhooks.c     2019-03-08 18:14:25.845009699 +0000
> +++ gcc/targhooks.c     2019-03-12 09:34:25.447913041 +0000
> @@ -2379,4 +2379,10 @@ default_remove_extra_call_preserved_regs
>  {
>  }
>
> +void
> +default_function_scratch_regs (struct function *, HARD_REG_SET *regs)
> +{
> +  COPY_HARD_REG_SET (*regs, call_used_reg_set);
> +}
> +
>  #include "gt-targhooks.h"
> Index: gcc/emit-rtl.h
> ===================================================================
> --- gcc/emit-rtl.h      2019-03-08 18:15:33.700751752 +0000
> +++ gcc/emit-rtl.h      2019-03-12 09:34:25.443913054 +0000
> @@ -292,6 +292,9 @@ struct GTY(()) rtl_data {
>       sets them.  */
>    HARD_REG_SET asm_clobbers;
>
> +  /* Registers that the function can use without having to save them first.  
> */
> +  HARD_REG_SET scratch_regs;
> +
>    /* The highest address seen during shorten_branches.  */
>    int max_insn_address;
>  };
> Index: gcc/function.c
> ===================================================================
> --- gcc/function.c      2019-03-08 18:15:33.660751905 +0000
> +++ gcc/function.c      2019-03-12 09:34:25.447913041 +0000
> @@ -4875,6 +4875,7 @@ init_dummy_function_start (void)
>  {
>    push_dummy_function (false);
>    prepare_function_start ();
> +  COPY_HARD_REG_SET (crtl->scratch_regs, call_used_reg_set);
>  }
>
>  /* Generate RTL for the start of the function SUBR (a FUNCTION_DECL tree 
> node)
> @@ -4888,6 +4889,8 @@ init_function_start (tree subr)
>    initialize_rtl ();
>
>    prepare_function_start ();
> +  targetm.function_scratch_regs (DECL_STRUCT_FUNCTION (subr),
> +                                &crtl->scratch_regs);
>    decide_function_section (subr);
>
>    /* Warn if this value is an aggregate type,
> Index: gcc/ira-color.c
> ===================================================================
> --- gcc/ira-color.c     2019-03-08 18:15:26.824777889 +0000
> +++ gcc/ira-color.c     2019-03-12 09:34:25.447913041 +0000
> @@ -1660,7 +1660,7 @@ calculate_saved_nregs (int hard_regno, m
>    ira_assert (hard_regno >= 0);
>    for (i = hard_regno_nregs (hard_regno, mode) - 1; i >= 0; i--)
>      if (!allocated_hardreg_p[hard_regno + i]
> -       && !TEST_HARD_REG_BIT (call_used_reg_set, hard_regno + i)
> +       && !TEST_HARD_REG_BIT (crtl->scratch_regs, hard_regno + i)
>         && !LOCAL_REGNO (hard_regno + i))
>        nregs++;
>    return nregs;
> Index: gcc/config/aarch64/aarch64.c
> ===================================================================
> --- gcc/config/aarch64/aarch64.c        2019-03-08 18:15:38.228734542 +0000
> +++ gcc/config/aarch64/aarch64.c        2019-03-12 09:34:25.427913117 +0000
> @@ -1706,6 +1706,19 @@ aarch64_simd_call_p (rtx_insn *insn)
>    return aarch64_simd_decl_p (fndecl);
>  }
>
> +/* Implement TARGET_FUNCTION_SCRATCH_REGS.  Vector PCS functions have
> +   fewer available scratch registers than normal functions do.  */
> +
> +static void
> +aarch64_function_scratch_regs (function *fn, HARD_REG_SET *scratch_regs)
> +{
> +  default_function_scratch_regs (fn, scratch_regs);
> +  if (aarch64_simd_decl_p (fn->decl))
> +    for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
> +      if (FP_SIMD_SAVED_REGNUM_P (regno))
> +       CLEAR_HARD_REG_BIT (*scratch_regs, regno);
> +}
> +
>  /* Implement TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS.  If INSN calls
>     a function that uses the SIMD ABI, take advantage of the extra
>     call-preserved registers that the ABI provides.  */
> @@ -19208,6 +19221,9 @@ #define TARGET_MODES_TIEABLE_P aarch64_m
>  #define TARGET_HARD_REGNO_CALL_PART_CLOBBERED \
>    aarch64_hard_regno_call_part_clobbered
>
> +#undef TARGET_FUNCTION_SCRATCH_REGS
> +#define TARGET_FUNCTION_SCRATCH_REGS aarch64_function_scratch_regs
> +
>  #undef TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS
>  #define TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS \
>    aarch64_remove_extra_call_preserved_regs
> Index: gcc/testsuite/gcc.target/aarch64/pr89628.c
> ===================================================================
> --- /dev/null   2019-03-08 11:40:14.606883727 +0000
> +++ gcc/testsuite/gcc.target/aarch64/pr89628.c  2019-03-12 09:34:25.447913041 
> +0000
> @@ -0,0 +1,25 @@
> +/* { dg-options "-O2" } */
> +
> +typedef __Float32x4_t vec;
> +
> +__attribute__((aarch64_vector_pcs))
> +vec f(vec a0, vec a1, vec a2, vec a3, vec a4, vec a5, vec a6, vec a7)
> +{
> +  vec t0, t1, t2, t3, t4, t5, t6, t7, s0, s1, s2, s3;
> +  t0 = a0 - a7;
> +  t1 = a1 - a6;
> +  t2 = a2 - a5;
> +  t3 = a3 - a4;
> +  t4 = a4 - a3;
> +  t5 = a5 - a2;
> +  t6 = a6 - a1;
> +  t7 = a7 - a0;
> +  s0 = t0 * t1;
> +  s1 = t2 * t3;
> +  s2 = t4 * t5;
> +  s3 = t6 * t7;
> +  return s0 * s1 * s2 * s3 * a0 * a1 * a2 * a3 * a4 * a5 * a6 * a7;
> +}
> +
> +/* { dg-final { scan-assembler-not '\tldr\t' } } */
> +/* { dg-final { scan-assembler-not '\tstr\t' } } */

Reply via email to