Re: [EXT] Re: [Patch 4/4][Aarch64] v2: Implement Aarch64 SIMD ABI

2019-01-11 Thread Steve Ellcey
On Fri, 2019-01-11 at 12:30 +0100, Jakub Jelinek wrote:
> 
> > Yeah, like you say, this was originally posted in stage 1 and is the
> > last patch in the series.  Not committing it would leave the work
> > incomplete in GCC 9.  The problem is that we're now in stage 4 rather
> > than stage 3.
> > 
> > I don't think I'm neutral enough to make the call.  Richard, Jakub?
> 
> I'm ok with accepting this late as an exception, if it can be committed
> reasonably soon (within a week or at most two).
> 
>   Jakub

OK, I will make the comment change and check it in.  Note that this
does not give us the complete implementation yet.  Patch 3/4 was OK'ed
by Richard last week but I hadn't checked that in either.  So I will
check both of these in later today unless I hear otherwise.

That still leaves Patch 2/4, which is Aarch64-specific.

https://gcc.gnu.org/ml/gcc-patches/2018-12/msg01421.html

Jakub had some comments on the test changes, which I fixed, but I did
not get any feedback on the actual code changes, so I am not sure
whether those are OK or not.

Steve Ellcey
sell...@marvell.com


Re: [Patch 4/4][Aarch64] v2: Implement Aarch64 SIMD ABI

2019-01-11 Thread Jakub Jelinek
On Fri, Jan 11, 2019 at 11:22:59AM +, Richard Sandiford wrote:
> Steve Ellcey  writes:
> > If this looks good to you, can I go ahead and check it in?  I know
> > we are in Stage 3 now, but my recollection is that patches that were
> > initially submitted during Stage 1 could go ahead once approved.
> 
> Yeah, like you say, this was originally posted in stage 1 and is the
> last patch in the series.  Not committing it would leave the work
> incomplete in GCC 9.  The problem is that we're now in stage 4 rather
> than stage 3.
> 
> I don't think I'm neutral enough to make the call.  Richard, Jakub?

I'm ok with accepting this late as an exception, if it can be committed
reasonably soon (within a week or at most two).

Jakub


Re: [Patch 4/4][Aarch64] v2: Implement Aarch64 SIMD ABI

2019-01-11 Thread Richard Sandiford
Steve Ellcey  writes:
> OK, I fixed the issues in your last email.  I initially found one
> regression while testing.  In lra_create_live_ranges_1 I had removed
> > the 'call_p = false' statement but did not replace it with anything.
> This resulted in no regressions on aarch64 but caused a single
> regression on x86 (gcc.target/i386/pr87759.c).  I replaced the
> line with 'call_insn = NULL' and the regression went away so I
> have clean bootstraps and no regressions on aarch64 and x86 now.

Looks good to me bar the parameter description below.

> If this looks good to you, can I go ahead and check it in?  I know
> we are in Stage 3 now, but my recollection is that patches that were
> initially submitted during Stage 1 could go ahead once approved.

Yeah, like you say, this was originally posted in stage 1 and is the
last patch in the series.  Not committing it would leave the work
incomplete in GCC 9.  The problem is that we're now in stage 4 rather
than stage 3.

I don't think I'm neutral enough to make the call.  Richard, Jakub?

> diff --git a/gcc/lra-lives.c b/gcc/lra-lives.c
> index a00ec38..b77b675 100644
> --- a/gcc/lra-lives.c
> +++ b/gcc/lra-lives.c
> @@ -576,25 +576,39 @@ lra_setup_reload_pseudo_preferenced_hard_reg (int regno,
>  
>  /* Check that REGNO living through calls and setjumps, set up conflict
> regs using LAST_CALL_USED_REG_SET, and clear corresponding bits in
> -   PSEUDOS_LIVE_THROUGH_CALLS and PSEUDOS_LIVE_THROUGH_SETJUMPS.  */
> +   PSEUDOS_LIVE_THROUGH_CALLS and PSEUDOS_LIVE_THROUGH_SETJUMPS.
> +   CALL_INSN may be the specific call we want to check that REGNO lives
> +   through or a call that is guaranteed to clobber REGNO if any call
> +   in the current block clobbers REGNO.  */

I think it would be more accurate to say:

   CALL_INSN is a call that is representative of all calls in the region
   described by the PSEUDOS_LIVE_THROUGH_* sets, in terms of the registers
   that it preserves and clobbers.  */

> +
>  static inline void
>  check_pseudos_live_through_calls (int regno,
> -   HARD_REG_SET last_call_used_reg_set)
> +   HARD_REG_SET last_call_used_reg_set,
> +   rtx_insn *call_insn)
>  {
>int hr;
> +  rtx_insn *old_call_insn;
>  
>if (! sparseset_bit_p (pseudos_live_through_calls, regno))
>  return;
> +
> +  gcc_assert (call_insn && CALL_P (call_insn));
> +  old_call_insn = lra_reg_info[regno].call_insn;
> +  if (!old_call_insn
> +  || (targetm.return_call_with_max_clobbers
> +   && targetm.return_call_with_max_clobbers (old_call_insn, call_insn)
> +  == call_insn))
> +lra_reg_info[regno].call_insn = call_insn;
> +
>sparseset_clear_bit (pseudos_live_through_calls, regno);
>IOR_HARD_REG_SET (lra_reg_info[regno].conflict_hard_regs,
>   last_call_used_reg_set);
>  
>for (hr = 0; HARD_REGISTER_NUM_P (hr); hr++)
> -if (targetm.hard_regno_call_part_clobbered (hr,
> +if (targetm.hard_regno_call_part_clobbered (call_insn, hr,
>   PSEUDO_REGNO_MODE (regno)))
>add_to_hard_reg_set (&lra_reg_info[regno].conflict_hard_regs,
>  PSEUDO_REGNO_MODE (regno), hr);
> -  lra_reg_info[regno].call_p = true;
>if (! sparseset_bit_p (pseudos_live_through_setjumps, regno))
>  return;
>sparseset_clear_bit (pseudos_live_through_setjumps, regno);

BTW, I think we could save some compile time by moving the "for" loop
into the new "if", since otherwise call_insn should have no new
information.  But that was true before as well (since we could have
skipped the loop if lra_reg_info[regno].call_p was already true),
so it's really a separate issue.
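
For the record, the kind of restructuring I mean is roughly this (sketch
only, untested; names as in the hunk above):

  if (!old_call_insn
      || (targetm.return_call_with_max_clobbers
          && (targetm.return_call_with_max_clobbers (old_call_insn, call_insn)
              == call_insn)))
    {
      lra_reg_info[regno].call_insn = call_insn;
      /* CALL_INSN adds new information, so only rescan the
         part-clobbered registers here.  */
      for (hr = 0; HARD_REGISTER_NUM_P (hr); hr++)
        if (targetm.hard_regno_call_part_clobbered (call_insn, hr,
                                                    PSEUDO_REGNO_MODE (regno)))
          add_to_hard_reg_set (&lra_reg_info[regno].conflict_hard_regs,
                               PSEUDO_REGNO_MODE (regno), hr);
    }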

Thanks,
Richard


Re: [Patch 4/4][Aarch64] v2: Implement Aarch64 SIMD ABI

2019-01-10 Thread Steve Ellcey
OK, I fixed the issues in your last email.  I initially found one
regression while testing.  In lra_create_live_ranges_1 I had removed
the 'call_p = false' statement but did not replace it with anything.
This resulted in no regressions on aarch64 but caused a single
regression on x86 (gcc.target/i386/pr87759.c).  I replaced the
line with 'call_insn = NULL' and the regression went away so I
have clean bootstraps and no regressions on aarch64 and x86 now.

If this looks good to you, can I go ahead and check it in?  I know
we are in Stage 3 now, but my recollection is that patches that were
initially submitted during Stage 1 could go ahead once approved.

Steve Ellcey
sell...@marvell.com



2019-01-10  Steve Ellcey  

* config/aarch64/aarch64.c (aarch64_simd_call_p): New function.
(aarch64_hard_regno_call_part_clobbered): Add insn argument.
(aarch64_return_call_with_max_clobbers): New function.
(TARGET_RETURN_CALL_WITH_MAX_CLOBBERS): New macro.
* config/avr/avr.c (avr_hard_regno_call_part_clobbered): Add insn
argument.
* config/i386/i386.c (ix86_hard_regno_call_part_clobbered): Ditto.
* config/mips/mips.c (mips_hard_regno_call_part_clobbered): Ditto.
* config/rs6000/rs6000.c (rs6000_hard_regno_call_part_clobbered): Ditto.
* config/s390/s390.c (s390_hard_regno_call_part_clobbered): Ditto.
* cselib.c (cselib_process_insn): Add argument to
targetm.hard_regno_call_part_clobbered call.
* ira-conflicts.c (ira_build_conflicts): Ditto.
* ira-costs.c (ira_tune_allocno_costs): Ditto.
* lra-constraints.c (inherit_reload_reg): Ditto.
* lra-int.h (struct lra_reg): Add call_insn field, remove call_p field.
* lra-lives.c (check_pseudos_live_through_calls): Add call_insn
argument.  Call targetm.return_call_with_max_clobbers.
Add argument to targetm.hard_regno_call_part_clobbered call.
(calls_have_same_clobbers_p): New function.
(process_bb_lives): Add call_insn and last_call_insn variables.
Pass call_insn to check_pseudos_live_through_calls.
Modify if stmt to check targetm.return_call_with_max_clobbers.
Update setting of flush variable.
(lra_create_live_ranges_1): Set call_insn to NULL instead of call_p
to false.
* lra.c (initialize_lra_reg_info_element): Set call_insn to NULL.
* regcprop.c (copyprop_hardreg_forward_1): Add argument to
targetm.hard_regno_call_part_clobbered call.
* reginfo.c (choose_hard_reg_mode): Ditto.
* regrename.c (check_new_reg_p): Ditto.
* reload.c (find_equiv_reg): Ditto.
* reload1.c (emit_reload_insns): Ditto.
* sched-deps.c (deps_analyze_insn): Ditto.
* sel-sched.c (init_regs_for_mode): Ditto.
(mark_unavailable_hard_regs): Ditto.
* targhooks.c (default_dwarf_frame_reg_mode): Ditto.
* target.def (hard_regno_call_part_clobbered): Add insn argument.
(return_call_with_max_clobbers): New target function.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (TARGET_RETURN_CALL_WITH_MAX_CLOBBERS): New hook.
* hooks.c (hook_bool_uint_mode_false): Change to
hook_bool_insn_uint_mode_false.
* hooks.h (hook_bool_uint_mode_false): Ditto.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 1c300af..7a1f838 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1655,14 +1655,53 @@ aarch64_reg_save_mode (tree fndecl, unsigned regno)
 	   : (aarch64_simd_decl_p (fndecl) ? E_TFmode : E_DFmode);
 }
 
+/* Return true if the instruction is a call to a SIMD function, false
+   if it is not a SIMD function or if we do not know anything about
+   the function.  */
+
+static bool
+aarch64_simd_call_p (rtx_insn *insn)
+{
+  rtx symbol;
+  rtx call;
+  tree fndecl;
+
+  gcc_assert (CALL_P (insn));
+  call = get_call_rtx_from (insn);
+  symbol = XEXP (XEXP (call, 0), 0);
+  if (GET_CODE (symbol) != SYMBOL_REF)
+return false;
+  fndecl = SYMBOL_REF_DECL (symbol);
+  if (!fndecl)
+return false;
+
+  return aarch64_simd_decl_p (fndecl);
+}
+
 /* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED.  The callee only saves
the lower 64 bits of a 128-bit register.  Tell the compiler the callee
clobbers the top 64 bits when restoring the bottom 64 bits.  */
 
 static bool
-aarch64_hard_regno_call_part_clobbered (unsigned int regno, machine_mode mode)
+aarch64_hard_regno_call_part_clobbered (rtx_insn *insn, unsigned int regno,
+	machine_mode mode)
+{
+  bool simd_p = insn && CALL_P (insn) && aarch64_simd_call_p (insn);
+  return FP_REGNUM_P (regno)
+	 && maybe_gt (GET_MODE_SIZE (mode), simd_p ? 16 : 8);
+}
+
+/* Implement TARGET_RETURN_CALL_WITH_MAX_CLOBBERS.  */
+
+rtx_insn *
+aarch64_return_call_with_max_clobbers (rtx_insn *call_1, rtx_insn *call_2)
 {
-  return FP_REGNUM_P (regno) && maybe_gt (GET_MODE_SIZE (mode), 8);
+  gcc_a

Re: [Patch 4/4][Aarch64] v2: Implement Aarch64 SIMD ABI

2019-01-10 Thread Richard Sandiford
Steve Ellcey  writes:
> On Wed, 2019-01-09 at 10:00 +, Richard Sandiford wrote:
>
> Thanks for the quick turnaround on the comments Richard.  Here is a new
> version where I tried to address all the issues you raised.  One thing
> I noticed is that I think your calls_have_same_clobbers_p function only
> works if, when return_call_with_max_clobbers is called with two calls
> that clobber the same set of registers, it always returns the first
> call.
>
> I don't think my original function had that guarantee but I changed it 
> so that it would and documented that requirement in target.def.  I
> couldn't see a better way to implement the calls_have_same_clobbers_p
> function other than doing that.

Yeah, I think that's a good guarantee to have.

> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 1c300af..d88be6c 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -1655,14 +1655,56 @@ aarch64_reg_save_mode (tree fndecl, unsigned regno)
>  /* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED.  The callee only saves
> the lower 64 bits of a 128-bit register.  Tell the compiler the callee
> clobbers the top 64 bits when restoring the bottom 64 bits.  */
>  
>  static bool
> -aarch64_hard_regno_call_part_clobbered (unsigned int regno, machine_mode mode)
> +aarch64_hard_regno_call_part_clobbered (rtx_insn *insn, unsigned int regno,
> + machine_mode mode)
>  {
> -  return FP_REGNUM_P (regno) && maybe_gt (GET_MODE_SIZE (mode), 8);
> +  bool simd_p = insn && CALL_P (insn) && aarch64_simd_call_p (insn);
> +  return FP_REGNUM_P (regno)
> +  && maybe_gt (GET_MODE_SIZE (mode), simd_p ? 16 : 8);
> +}
> +
> +/* Implement TARGET_RETURN_CALL_WITH_MAX_CLOBBERS.  */
> +
> +rtx_insn *
> +aarch64_return_call_with_max_clobbers (rtx_insn *call_1, rtx_insn *call_2)
> +{
> +  gcc_assert (CALL_P (call_1) && CALL_P (call_2));
> +
> +  if (aarch64_simd_call_p (call_1) == aarch64_simd_call_p (call_2))
> +return call_1;
> +
> +  if (aarch64_simd_call_p (call_2))
> +return call_1;
> +  else
> +return call_2;

Think this is simpler as:

  gcc_assert (CALL_P (call_1) && CALL_P (call_2));

  if (!aarch64_simd_call_p (call_1) || aarch64_simd_call_p (call_2))
return call_1;
  else
return call_2;

> diff --git a/gcc/lra-lives.c b/gcc/lra-lives.c
> index a00ec38..61149e1 100644
> --- a/gcc/lra-lives.c
> +++ b/gcc/lra-lives.c
> @@ -579,22 +579,32 @@ lra_setup_reload_pseudo_preferenced_hard_reg (int regno,
> PSEUDOS_LIVE_THROUGH_CALLS and PSEUDOS_LIVE_THROUGH_SETJUMPS.  */
>  static inline void
>  check_pseudos_live_through_calls (int regno,
> -   HARD_REG_SET last_call_used_reg_set)
> +   HARD_REG_SET last_call_used_reg_set,
> +   rtx_insn *call_insn)

Should document the new parameter.

> @@ -906,17 +933,22 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p)
>  
> bool flush = (! hard_reg_set_empty_p (last_call_used_reg_set)
>   && ! hard_reg_set_equal_p (last_call_used_reg_set,
> -this_call_used_reg_set));
> +this_call_used_reg_set)
> + && ! calls_have_same_clobbers_p (call_insn,
> +  last_call_insn));

This should be || with the current test, not &&.  We need to check
that last_call_insn is nonnull first.
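
I.e. something along the lines of (sketch only, untested):

  bool flush = ((! hard_reg_set_empty_p (last_call_used_reg_set)
                 && ! hard_reg_set_equal_p (last_call_used_reg_set,
                                            this_call_used_reg_set))
                || (last_call_insn
                    && ! calls_have_same_clobbers_p (call_insn,
                                                     last_call_insn)));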

> EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, j)
>   {
> IOR_HARD_REG_SET (lra_reg_info[j].actual_call_used_reg_set,
>   this_call_used_reg_set);
> +
> if (flush)
> - check_pseudos_live_through_calls
> -   (j, last_call_used_reg_set);
> + check_pseudos_live_through_calls (j,
> +   last_call_used_reg_set,
> +   curr_insn);
>   }

Should be last_call_insn rather than curr_insn.  I.e. when we flush,
we apply the properties of the previous call to pseudos live after
the new call.
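
I.e. (sketch):

  if (flush)
    check_pseudos_live_through_calls (j, last_call_used_reg_set,
                                      last_call_insn);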

Looks good otherwise.

Thanks,
Richard


Re: [Patch 4/4][Aarch64] v2: Implement Aarch64 SIMD ABI

2019-01-09 Thread Steve Ellcey
On Wed, 2019-01-09 at 10:00 +, Richard Sandiford wrote:

Thanks for the quick turnaround on the comments Richard.  Here is a new
version where I tried to address all the issues you raised.  One thing
I noticed is that I think your calls_have_same_clobbers_p function only
works if, when return_call_with_max_clobbers is called with two calls
that clobber the same set of registers, it always returns the first
call.

I don't think my original function had that guarantee but I changed it 
so that it would and documented that requirement in target.def.  I
couldn't see a better way to implement the calls_have_same_clobbers_p
function other than doing that.

Steve Ellcey
sell...@marvell.com


2019-01-09  Steve Ellcey  

* config/aarch64/aarch64.c (aarch64_simd_call_p): New function.
(aarch64_hard_regno_call_part_clobbered): Add insn argument.
(aarch64_return_call_with_max_clobbers): New function.
(TARGET_RETURN_CALL_WITH_MAX_CLOBBERS): New macro.
* config/avr/avr.c (avr_hard_regno_call_part_clobbered): Add insn
argument.
* config/i386/i386.c (ix86_hard_regno_call_part_clobbered): Ditto.
* config/mips/mips.c (mips_hard_regno_call_part_clobbered): Ditto.
* config/rs6000/rs6000.c (rs6000_hard_regno_call_part_clobbered): Ditto.
* config/s390/s390.c (s390_hard_regno_call_part_clobbered): Ditto.
* cselib.c (cselib_process_insn): Add argument to
targetm.hard_regno_call_part_clobbered call.
* ira-conflicts.c (ira_build_conflicts): Ditto.
* ira-costs.c (ira_tune_allocno_costs): Ditto.
* lra-constraints.c (inherit_reload_reg): Ditto.
* lra-int.h (struct lra_reg): Add call_insn field, remove call_p field.
* lra-lives.c (check_pseudos_live_through_calls): Add call_insn
argument.  Call targetm.return_call_with_max_clobbers.
Add argument to targetm.hard_regno_call_part_clobbered call.
(calls_have_same_clobbers_p): New function.
(process_bb_lives): Add call_insn and last_call_insn variables.
Pass call_insn to check_pseudos_live_through_calls.
Modify if stmt to check targetm.return_call_with_max_clobbers.
Update setting of flush variable.
* lra.c (initialize_lra_reg_info_element): Set call_insn to NULL.
* regcprop.c (copyprop_hardreg_forward_1): Add argument to
targetm.hard_regno_call_part_clobbered call.
* reginfo.c (choose_hard_reg_mode): Ditto.
* regrename.c (check_new_reg_p): Ditto.
* reload.c (find_equiv_reg): Ditto.
* reload1.c (emit_reload_insns): Ditto.
* sched-deps.c (deps_analyze_insn): Ditto.
* sel-sched.c (init_regs_for_mode): Ditto.
(mark_unavailable_hard_regs): Ditto.
* targhooks.c (default_dwarf_frame_reg_mode): Ditto.
* target.def (hard_regno_call_part_clobbered): Add insn argument.
(return_call_with_max_clobbers): New target function.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (TARGET_RETURN_CALL_WITH_MAX_CLOBBERS): New hook.
* hooks.c (hook_bool_uint_mode_false): Change to
hook_bool_insn_uint_mode_false.
* hooks.h (hook_bool_uint_mode_false): Ditto.


diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 1c300af..d88be6c 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1655,14 +1655,56 @@ aarch64_reg_save_mode (tree fndecl, unsigned regno)
 	   : (aarch64_simd_decl_p (fndecl) ? E_TFmode : E_DFmode);
 }
 
+/* Return true if the instruction is a call to a SIMD function, false
+   if it is not a SIMD function or if we do not know anything about
+   the function.  */
+
+static bool
+aarch64_simd_call_p (rtx_insn *insn)
+{
+  rtx symbol;
+  rtx call;
+  tree fndecl;
+
+  gcc_assert (CALL_P (insn));
+  call = get_call_rtx_from (insn);
+  symbol = XEXP (XEXP (call, 0), 0);
+  if (GET_CODE (symbol) != SYMBOL_REF)
+return false;
+  fndecl = SYMBOL_REF_DECL (symbol);
+  if (!fndecl)
+return false;
+
+  return aarch64_simd_decl_p (fndecl);
+}
+
 /* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED.  The callee only saves
the lower 64 bits of a 128-bit register.  Tell the compiler the callee
clobbers the top 64 bits when restoring the bottom 64 bits.  */
 
 static bool
-aarch64_hard_regno_call_part_clobbered (unsigned int regno, machine_mode mode)
+aarch64_hard_regno_call_part_clobbered (rtx_insn *insn, unsigned int regno,
+	machine_mode mode)
 {
-  return FP_REGNUM_P (regno) && maybe_gt (GET_MODE_SIZE (mode), 8);
+  bool simd_p = insn && CALL_P (insn) && aarch64_simd_call_p (insn);
+  return FP_REGNUM_P (regno)
+	 && maybe_gt (GET_MODE_SIZE (mode), simd_p ? 16 : 8);
+}
+
+/* Implement TARGET_RETURN_CALL_WITH_MAX_CLOBBERS.  */
+
+rtx_insn *
+aarch64_return_call_with_max_clobbers (rtx_insn *call_1, rtx_insn *call_2)
+{
+  gcc_assert (CALL_P (call_1) && CALL_P (call_2));
+
+  if (aarch64_simd_call_p (call_1) == aarch6

Re: [Patch 4/4][Aarch64] v2: Implement Aarch64 SIMD ABI

2019-01-09 Thread Richard Sandiford
Steve Ellcey  writes:
>  /* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED.  The callee only saves
> the lower 64 bits of a 128-bit register.  Tell the compiler the callee
> clobbers the top 64 bits when restoring the bottom 64 bits.  */
>  
>  static bool
> -aarch64_hard_regno_call_part_clobbered (unsigned int regno, machine_mode mode)
> +aarch64_hard_regno_call_part_clobbered (rtx_insn *insn, unsigned int regno,
> + machine_mode mode)
>  {
> -  return FP_REGNUM_P (regno) && maybe_gt (GET_MODE_SIZE (mode), 8);
> +  bool simd_p = insn && CALL_P (insn) && aarch64_simd_call_p (insn);
> +  return FP_REGNUM_P (regno) && maybe_gt (GET_MODE_SIZE (mode), simd_p ? 16: 8);

Should be a space before the ":" (which pushes the line over 80 chars).

> diff --git a/gcc/hooks.h b/gcc/hooks.h
> index 9e4bc29..dc6b2e1 100644
> --- a/gcc/hooks.h
> +++ b/gcc/hooks.h
> @@ -40,7 +40,9 @@ extern bool hook_bool_const_rtx_insn_const_rtx_insn_true (const rtx_insn *,
>  extern bool hook_bool_mode_uhwi_false (machine_mode,
>  unsigned HOST_WIDE_INT);
>  extern bool hook_bool_puint64_puint64_true (poly_uint64, poly_uint64);
> -extern bool hook_bool_uint_mode_false (unsigned int, machine_mode);
> +extern bool hook_bool_insn_uint_mode_false (rtx_insn *,
> + unsigned int,
> + machine_mode);

No need to break the line after "rtx_insn *,".

>  extern bool hook_bool_uint_mode_true (unsigned int, machine_mode);
>  extern bool hook_bool_tree_false (tree);
>  extern bool hook_bool_const_tree_false (const_tree);
> diff --git a/gcc/ira-conflicts.c b/gcc/ira-conflicts.c
> index b57468b..b697e57 100644
> --- a/gcc/ira-conflicts.c
> +++ b/gcc/ira-conflicts.c
> @@ -808,7 +808,8 @@ ira_build_conflicts (void)
>regs must conflict with them.  */
> for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
>   if (!TEST_HARD_REG_BIT (call_used_reg_set, regno)
> - && targetm.hard_regno_call_part_clobbered (regno,
> + && targetm.hard_regno_call_part_clobbered (NULL,
> +regno,
>  obj_mode))
> {
>   SET_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), regno);

No need to break the line after "NULL,".

> diff --git a/gcc/ira-costs.c b/gcc/ira-costs.c
> index e5d8804..7f60712 100644
> --- a/gcc/ira-costs.c
> +++ b/gcc/ira-costs.c
> @@ -2379,7 +2379,8 @@ ira_tune_allocno_costs (void)
>  *crossed_calls_clobber_regs)
> && (ira_hard_reg_set_intersection_p (regno, mode,
>  call_used_reg_set)
> -   || targetm.hard_regno_call_part_clobbered (regno,
> +   || targetm.hard_regno_call_part_clobbered (NULL,
> +  regno,
>mode)))
>   cost += (ALLOCNO_CALL_FREQ (a)
>* (ira_memory_move_cost[mode][rclass][0]

Same here.

> diff --git a/gcc/lra-int.h b/gcc/lra-int.h
> index 9d9e81d..ccc7b00 100644
> --- a/gcc/lra-int.h
> +++ b/gcc/lra-int.h
> @@ -117,6 +117,8 @@ struct lra_reg
>/* This member is set up in lra-lives.c for subsequent
>   assignments.  */
>lra_copy_t copies;
> +  /* Call instruction that may affect this register.  */
> +  rtx_insn *call_insn;
>  };
>  
>  /* References to the common info about each register.  */

If we do this right, I think the new field should be able to replace call_p.
The pseudo crosses a call iff call_insn is nonnull.

I think the field belongs after:

  poly_int64 offset;

since it comes under:

  /* The following fields are defined only for pseudos.  */

rather than:

  /* These members are set up in lra-lives.c and updated in
 lra-coalesce.c.  */
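
So something like this, with the other fields elided (placement sketch only):

  struct lra_reg
  {
    /* ...  */
    /* The following fields are defined only for pseudos.  */
    /* ...  */
    poly_int64 offset;
    /* Call instruction, if any, that may affect how call-clobbered
       registers are treated for this pseudo (null if the pseudo does
       not cross a call).  */
    rtx_insn *call_insn;
    /* These members are set up in lra-lives.c and updated in
       lra-coalesce.c.  */
    /* ...  */
  };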

> diff --git a/gcc/lra-lives.c b/gcc/lra-lives.c
> index 7b60691..0b96891 100644
> --- a/gcc/lra-lives.c
> +++ b/gcc/lra-lives.c
> @@ -579,18 +579,26 @@ lra_setup_reload_pseudo_preferenced_hard_reg (int regno,
> PSEUDOS_LIVE_THROUGH_CALLS and PSEUDOS_LIVE_THROUGH_SETJUMPS.  */
>  static inline void
>  check_pseudos_live_through_calls (int regno,
> -   HARD_REG_SET last_call_used_reg_set)
> +   HARD_REG_SET last_call_used_reg_set,
> +   rtx_insn *call_insn)
>  {
>int hr;
>  
> +  if (call_insn && CALL_P (call_insn) && 
> targetm.return_call_with_max_clobbers)
> +lra_reg_info[regno].call_insn =
> +  targetm.return_call_with_max_clobbers (call_insn,
> +  lra_reg_info[regno].call_insn);
> +

This should happen...

>if (! sparseset_bit_p (pseudos_live_through_

Re: [Patch 4/4][Aarch64] v2: Implement Aarch64 SIMD ABI

2019-01-08 Thread Steve Ellcey
On Mon, 2019-01-07 at 17:38 +, Richard Sandiford wrote:
> 
> Yeah, this was the kind of thing I had in mind, thanks.

Here is an updated version of the patch.  I bootstrapped and tested
on aarch64 and x86.  I didn't test the other platforms where I changed
the arguments to hard_regno_call_part_clobbered but I think they should
be OK.  I believe I addressed all the issues you brought up.  The ones
I am least confident of are the lra-lives.c changes.  I think they are
right and testing had no regressions, but they are probably the changes
that need to be checked most closely.

Steve Ellcey
sell...@marvell.com


2019-01-08  Steve Ellcey  

* config/aarch64/aarch64.c (aarch64_simd_call_p): New function.
(aarch64_hard_regno_call_part_clobbered): Add insn argument.
(aarch64_return_call_with_max_clobbers): New function.
(TARGET_RETURN_CALL_WITH_MAX_CLOBBERS): New macro.
* config/avr/avr.c (avr_hard_regno_call_part_clobbered): Add insn
argument.
* config/i386/i386.c (ix86_hard_regno_call_part_clobbered): Ditto.
* config/mips/mips.c (mips_hard_regno_call_part_clobbered): Ditto.
* config/rs6000/rs6000.c (rs6000_hard_regno_call_part_clobbered): Ditto.
* config/s390/s390.c (s390_hard_regno_call_part_clobbered): Ditto.
* cselib.c (cselib_process_insn): Add argument to
targetm.hard_regno_call_part_clobbered call.
* ira-conflicts.c (ira_build_conflicts): Ditto.
* ira-costs.c (ira_tune_allocno_costs): Ditto.
* lra-constraints.c (inherit_reload_reg): Ditto.
* lra-int.h (struct lra_reg): Add call_insn field.
* lra-lives.c (check_pseudos_live_through_calls): Add call_insn
argument.  Call targetm.return_call_with_max_clobbers.
Add argument to targetm.hard_regno_call_part_clobbered call.
(process_bb_lives): Use new target function
targetm.return_call_with_max_clobbers to set call_insn.
Pass call_insn to check_pseudos_live_through_calls.
Modify if stmt to check targetm.return_call_with_max_clobbers.
* lra.c (initialize_lra_reg_info_element): Set call_insn to NULL.
* regcprop.c (copyprop_hardreg_forward_1): Add argument to
targetm.hard_regno_call_part_clobbered call.
* reginfo.c (choose_hard_reg_mode): Ditto.
* regrename.c (check_new_reg_p): Ditto.
* reload.c (find_equiv_reg): Ditto.
* reload1.c (emit_reload_insns): Ditto.
* sched-deps.c (deps_analyze_insn): Ditto.
* sel-sched.c (init_regs_for_mode): Ditto.
(mark_unavailable_hard_regs): Ditto.
* targhooks.c (default_dwarf_frame_reg_mode): Ditto.
* target.def (hard_regno_call_part_clobbered): Add insn argument.
(return_call_with_max_clobbers): New target function.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (TARGET_RETURN_CALL_WITH_MAX_CLOBBERS): New hook.
* hooks.c (hook_bool_uint_mode_false): Change to
hook_bool_insn_uint_mode_false.
* hooks.h (hook_bool_uint_mode_false): Ditto.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 1c45243..2063292 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1644,14 +1644,51 @@ aarch64_reg_save_mode (tree fndecl, unsigned regno)
 	   : (aarch64_simd_decl_p (fndecl) ? E_TFmode : E_DFmode);
 }
 
+/* Return true if the instruction is a call to a SIMD function, false
+   if it is not a SIMD function or if we do not know anything about
+   the function.  */
+
+static bool
+aarch64_simd_call_p (rtx_insn *insn)
+{
+  rtx symbol;
+  rtx call;
+  tree fndecl;
+
+  gcc_assert (CALL_P (insn));
+  call = get_call_rtx_from (insn);
+  symbol = XEXP (XEXP (call, 0), 0);
+  if (GET_CODE (symbol) != SYMBOL_REF)
+return false;
+  fndecl = SYMBOL_REF_DECL (symbol);
+  if (!fndecl)
+return false;
+
+  return aarch64_simd_decl_p (fndecl);
+}
+
 /* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED.  The callee only saves
the lower 64 bits of a 128-bit register.  Tell the compiler the callee
clobbers the top 64 bits when restoring the bottom 64 bits.  */
 
 static bool
-aarch64_hard_regno_call_part_clobbered (unsigned int regno, machine_mode mode)
+aarch64_hard_regno_call_part_clobbered (rtx_insn *insn, unsigned int regno,
+	machine_mode mode)
 {
-  return FP_REGNUM_P (regno) && maybe_gt (GET_MODE_SIZE (mode), 8);
+  bool simd_p = insn && CALL_P (insn) && aarch64_simd_call_p (insn);
+  return FP_REGNUM_P (regno) && maybe_gt (GET_MODE_SIZE (mode), simd_p ? 16: 8);
+}
+
+/* Implement TARGET_RETURN_CALL_WITH_MAX_CLOBBERS.  */
+
+rtx_insn *
+aarch64_return_call_with_max_clobbers (rtx_insn *call_1, rtx_insn *call_2)
+{
+  gcc_assert (CALL_P (call_1));
+  if (call_2 == NULL_RTX || aarch64_simd_call_p (call_2))
+return call_1;
+  else
+return call_2;
 }
 
 /* Implement REGMODE_NATURAL_SIZE.  */
@@ -18764,6 +18801,10 @@ aarch64_libgcc_floating_mode_supported_p
 #defin

Re: [Patch 4/4][Aarch64] v2: Implement Aarch64 SIMD ABI

2019-01-07 Thread Richard Sandiford
Steve Ellcey  writes:
> On Thu, 2018-12-06 at 12:25 +, Richard Sandiford wrote:
>> 
>> Since we're looking at the call insns anyway, we could have a hook that
>> "jousts" two calls and picks the one that preserves *fewer* registers.
>> This would mean that loop produces a single instruction that conservatively
>> describes the call-preserved registers.  We could then stash that
>> instruction in lra_reg instead of the current check_part_clobbered
>> boolean.
>> 
>> The hook should by default be a null pointer, so that we can avoid
>> the instruction walk on targets that don't need it.
>> 
>> That would mean that LRA would always have a call instruction to hand
>> when asking about call-preserved information.  So I think we should
>> add an insn parameter to targetm.hard_regno_call_part_clobbered,
> with a null insn selecting the default behaviour.  I know it's
>> going to be a pain to update all callers and targets, sorry.
>
> Richard,  here is an updated version of this patch.  It is not
> completely tested yet but I wanted to send this out and make
> sure it is what you had in mind and see if you had any comments about
> the new target function while I am testing it (including building
> some of the other targets).

Yeah, this was the kind of thing I had in mind, thanks.

>  /* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED.  The callee only saves
> the lower 64 bits of a 128-bit register.  Tell the compiler the callee
> clobbers the top 64 bits when restoring the bottom 64 bits.  */
>  
>  static bool
> -aarch64_hard_regno_call_part_clobbered (unsigned int regno, machine_mode mode)
> +aarch64_hard_regno_call_part_clobbered (rtx_insn *insn,
> + unsigned int regno,
> + machine_mode mode)
>  {
> +  if (insn && CALL_P (insn) && aarch64_simd_call_p (insn))
> +return false;
>return FP_REGNUM_P (regno) && maybe_gt (GET_MODE_SIZE (mode), 8);

This should be choosing between 8 and 16 for the maybe_gt, since
even SIMD functions clobber bits 128 and above for SVE.
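
I.e. something like (sketch):

  bool simd_p = insn && CALL_P (insn) && aarch64_simd_call_p (insn);
  return FP_REGNUM_P (regno)
         && maybe_gt (GET_MODE_SIZE (mode), simd_p ? 16 : 8);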

> +/* Implement TARGET_RETURN_CALL_WITH_MAX_CLOBBERS.  */
> +
> +rtx_insn *
> +aarch64_return_call_with_max_clobbers (rtx_insn *call_1, rtx_insn *call_2)
> +{
> +  gcc_assert (CALL_P (call_1));
> +  if ((call_2 == NULL_RTX) || aarch64_simd_call_p (call_2))
> +return call_1;
> +  else
> +return call_2;

Nit: redundant parens in "(call_2 == NULL_RTX)".

> diff --git a/gcc/config/avr/avr.c b/gcc/config/avr/avr.c
> index 023308b..2cf993d 100644
> --- a/gcc/config/avr/avr.c
> +++ b/gcc/config/avr/avr.c
> @@ -12181,7 +12181,9 @@ avr_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
>  /* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED.  */
>  
>  static bool
> -avr_hard_regno_call_part_clobbered (unsigned regno, machine_mode mode)
> +avr_hard_regno_call_part_clobbered (rtx_insn *insn ATTRIBUTE_UNUSED,
> + unsigned regno,
> + machine_mode mode)
>  {

Also very minor, sorry, but: I think it's usual to put the parameters
on the same line when they fit.  Same for the other hooks.

> @@ -2919,7 +2930,7 @@ the local anchor could be shared by other accesses to nearby locations.
>  
>  The hook returns true if it succeeds, storing the offset of the
>  anchor from the base in @var{offset1} and the offset of the final address
> -from the anchor in @var{offset2}.  The default implementation returns false.
> +from the anchor in @var{offset2}.  ehe defnult implementation returns false.
>  @end deftypefn

Stray change here.

> diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
> index 7ffcd35..31a567a 100644
> --- a/gcc/lra-constraints.c
> +++ b/gcc/lra-constraints.c
> @@ -5368,16 +5368,24 @@ inherit_reload_reg (bool def_p, int original_regno,
>  static inline bool
>  need_for_call_save_p (int regno)
>  {
> +  machine_mode pmode = PSEUDO_REGNO_MODE (regno);
> +  int new_regno = reg_renumber[regno];
> +
>lra_assert (regno >= FIRST_PSEUDO_REGISTER && reg_renumber[regno] >= 0);
> -  return (usage_insns[regno].calls_num < calls_num
> -   && (overlaps_hard_reg_set_p
> -   ((flag_ipa_ra &&
> - ! hard_reg_set_empty_p (lra_reg_info[regno].actual_call_used_reg_set))
> -? lra_reg_info[regno].actual_call_used_reg_set
> -: call_used_reg_set,
> -PSEUDO_REGNO_MODE (regno), reg_renumber[regno])
> -   || (targetm.hard_regno_call_part_clobbered
> -   (reg_renumber[regno], PSEUDO_REGNO_MODE (regno);
> +
> +  if (usage_insns[regno].calls_num >= calls_num)
> +return false;
> +
> +  if (flag_ipa_ra
> +  && !hard_reg_set_empty_p (lra_reg_info[regno].actual_call_used_reg_set))
> +return (overlaps_hard_reg_set_p
> + (lra_reg_info[regno].actual_call_used_reg_set, pmode, new_regno)
> + || targetm.hard_regno_call_part_clobbered
> + (lra_reg_info[regno].call_insn, new_regno, pm

Re: [Patch 4/4][Aarch64] v2: Implement Aarch64 SIMD ABI

2019-01-04 Thread Steve Ellcey
On Thu, 2018-12-06 at 12:25 +, Richard Sandiford wrote:
> 
> Since we're looking at the call insns anyway, we could have a hook that
> "jousts" two calls and picks the one that preserves *fewer* registers.
> This would mean that loop produces a single instruction that conservatively
> describes the call-preserved registers.  We could then stash that
> instruction in lra_reg instead of the current check_part_clobbered
> boolean.
> 
> The hook should by default be a null pointer, so that we can avoid
> the instruction walk on targets that don't need it.
> 
> That would mean that LRA would always have a call instruction to hand
> when asking about call-preserved information.  So I think we should
> add an insn parameter to targetm.hard_regno_call_part_clobbered,
> with a null insn selecting the default behaviour.  I know it's
> going to be a pain to update all callers and targets, sorry.

Richard,  here is an updated version of this patch.  It is not
completely tested yet but I wanted to send this out and make
sure it is what you had in mind and see if you had any comments about
the new target function while I am testing it (including building
some of the other targets).

Steve Ellcey
sell...@cavium.com


2019-01-04  Steve Ellcey  

* config/aarch64/aarch64.c (aarch64_simd_call_p): New function.
(aarch64_hard_regno_call_part_clobbered): Add insn argument.
(aarch64_return_call_with_max_clobbers): New function.
(TARGET_RETURN_CALL_WITH_MAX_CLOBBERS): New macro.
* config/avr/avr.c (avr_hard_regno_call_part_clobbered): Add insn
argument.
* config/i386/i386.c (ix86_hard_regno_call_part_clobbered): Ditto.
* config/mips/mips.c (mips_hard_regno_call_part_clobbered): Ditto.
* config/rs6000/rs6000.c (rs6000_hard_regno_call_part_clobbered): Ditto.
* config/s390/s390.c (s390_hard_regno_call_part_clobbered): Ditto.
* cselib.c (cselib_process_insn): Add argument to
targetm.hard_regno_call_part_clobbered call.
* ira-conflicts.c (ira_build_conflicts): Ditto.
* ira-costs.c (ira_tune_allocno_costs): Ditto.
* lra-constraints.c (inherit_reload_reg): Ditto, plus refactor
return statement.
* lra-int.h (struct lra_reg): Add call_insn field.
* lra-lives.c (check_pseudos_live_through_calls): Add call_insn
argument.  Add argument to targetm.hard_regno_call_part_clobbered
call.
(process_bb_lives): Use new target function
targetm.return_call_with_max_clobbers to set call_insn.
Pass call_insn to check_pseudos_live_through_calls.
Set call_insn in lra_reg_info.
* lra.c (initialize_lra_reg_info_element): Set call_insn to NULL.
* regcprop.c (copyprop_hardreg_forward_1): Add argument to
targetm.hard_regno_call_part_clobbered call.
* reginfo.c (choose_hard_reg_mode): Ditto.
* regrename.c (check_new_reg_p): Ditto.
* reload.c (find_equiv_reg): Ditto.
* reload1.c (emit_reload_insns): Ditto.
* sched-deps.c (deps_analyze_insn): Ditto.
* sel-sched.c (init_regs_for_mode): Ditto.
(mark_unavailable_hard_regs): Ditto.
* targhooks.c (default_dwarf_frame_reg_mode): Ditto.
* target.def (hard_regno_call_part_clobbered): Add insn argument.
(return_call_with_max_clobbers): New target function.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (TARGET_RETURN_CALL_WITH_MAX_CLOBBERS): New hook.
* hooks.c (hook_bool_uint_mode_false): Change to
hook_bool_insn_uint_mode_false.
* hooks.h (hook_bool_uint_mode_false): Ditto.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index c5036c8..87af31b 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1565,16 +1565,55 @@ aarch64_reg_save_mode (tree fndecl, unsigned regno)
 	   : (aarch64_simd_decl_p (fndecl) ? E_TFmode : E_DFmode);
 }
 
+/* Return true if the instruction is a call to a SIMD function, false
+   if it is not a SIMD function or if we do not know anything about
+   the function.  */
+
+static bool
+aarch64_simd_call_p (rtx_insn *insn)
+{
+  rtx symbol;
+  rtx call;
+  tree fndecl;
+
+  gcc_assert (CALL_P (insn));
+  call = get_call_rtx_from (insn);
+  symbol = XEXP (XEXP (call, 0), 0);
+  if (GET_CODE (symbol) != SYMBOL_REF)
+return false;
+  fndecl = SYMBOL_REF_DECL (symbol);
+  if (!fndecl)
+return false;
+
+  return aarch64_simd_decl_p (fndecl);
+}
+
 /* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED.  The callee only saves
the lower 64 bits of a 128-bit register.  Tell the compiler the callee
clobbers the top 64 bits when restoring the bottom 64 bits.  */
 
 static bool
-aarch64_hard_regno_call_part_clobbered (unsigned int regno, machine_mode mode)
+aarch64_hard_regno_call_part_clobbered (rtx_insn *insn,
+	unsigned int regno,
+	machine_mode mode)
 {
+  if (insn && CALL_P (insn) && aarch64_simd_call_p (insn)

Re: [Patch 4/4][Aarch64] v2: Implement Aarch64 SIMD ABI

2018-12-06 Thread Richard Sandiford
Steve Ellcey  writes:
> This is patch 4 to support the Aarch64 SIMD ABI [1] in GCC.
>
> It defines a new target hook targetm.check_part_clobbered that
> takes a rtx_insn and checks to see if it is a call to a function
> that may clobber partial registers.  It returns true by default,
> which results in the current behaviour, but if we can determine
> that the function will not do any partial clobbers (like the
> Aarch64 SIMD functions) then it returns false.

Sorry, I have a feeling this is going to be at least partly going
back on what I said before, but...

The patch only really deals with one user of the part-clobbered info,
namely LRA.  And as it happens, that caller does have access to the
relevant call insns (which was a concern before), since you walk them in:

  /* Check to see if any call might do a partial clobber.  */
  partial_clobber_in_bb = false;
  FOR_BB_INSNS_REVERSE_SAFE (bb, curr_insn, next)
{
  if (CALL_P (curr_insn)
  && targetm.check_part_clobbered (curr_insn))
{
  partial_clobber_in_bb = true;
  break;
}
}

Since we're looking at the call insns anyway, we could have a hook that
"jousts" two calls and picks the one that preserves *fewer* registers.
This would mean that loop produces a single instruction that conservatively
describes the call-preserved registers.  We could then stash that
instruction in lra_reg instead of the current check_part_clobbered
boolean.

The hook should by default be a null pointer, so that we can avoid
the instruction walk on targets that don't need it.
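
Roughly this shape, reusing the loop above (sketch only; the hook name
here is just for illustration):

  rtx_insn *worst_call = NULL;
  FOR_BB_INSNS_REVERSE_SAFE (bb, curr_insn, next)
    if (CALL_P (curr_insn))
      worst_call = (worst_call
                    ? targetm.return_call_with_max_clobbers (curr_insn,
                                                             worst_call)
                    : curr_insn);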

That would mean that LRA would always have a call instruction to hand
when asking about call-preserved information.  So I think we should
add an insn parameter to targetm.hard_regno_call_part_clobbered,
with a null insn selecting the default behaviour.  I know it's
going to be a pain to update all callers and targets, sorry.

This would also cope with the fact that, when SVE is enabled, SIMD
functions *do* still part-clobber the registers, just in a wider mode.
The current patch doesn't handle that, and it would be hard to fix without
pessimistically treating the functions as clobbering above 64 bits
rather than 128 bits.

(Really, it would be good to overhaul the whole handling of ABIs
so that we have all the information about an ABI in one structure
and can ask "what ABI does this call use"?  But that's a lot of work.
The above should be good enough as long as the call-preserved behaviour
of ABIs follows a total ordering, which it does for AArch64.)

Thanks,
Richard


[Patch 4/4][Aarch64] v2: Implement Aarch64 SIMD ABI

2018-11-08 Thread Steve Ellcey
This is patch 4 to support the Aarch64 SIMD ABI [1] in GCC.

It defines a new target hook targetm.check_part_clobbered that
takes a rtx_insn and checks to see if it is a call to a function
that may clobber partial registers.  It returns true by default,
which results in the current behaviour, but if we can determine
that the function will not do any partial clobbers (like the
Aarch64 SIMD functions) then it returns false.

Steve Ellcey
sell...@cavium.com



2018-11-08  Steve Ellcey  

* config/aarch64/aarch64.c (aarch64_check_part_clobbered): New function.
(TARGET_CHECK_PART_CLOBBERED): New macro.
* doc/tm.texi.in (TARGET_CHECK_PART_CLOBBERED): New hook.
* lra-constraints.c (need_for_call_save_p): Use check_part_clobbered.
* lra-int.h (check_part_clobbered): New field in lra_reg struct.
* lra-lives.c (check_pseudos_live_through_calls): Pass in
check_partial_clobber bool argument and use it.
(process_bb_lives): Check basic block for functions that may do
partial clobbers.  Pass this to check_pseudos_live_through_calls.
* lra.c (initialize_lra_reg_info_element): Initialize
check_part_clobbered to false.
* target.def (check_part_clobbered): New target hook.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index c82c7b6..c2de4111 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1480,6 +1480,17 @@ aarch64_hard_regno_call_part_clobbered (unsigned int regno, machine_mode mode)
   return FP_REGNUM_P (regno) && maybe_gt (GET_MODE_SIZE (mode), 8);
 }
 
+/* Implement TARGET_CHECK_PART_CLOBBERED.  SIMD functions never save
+   partial registers, so they return false.  */
+
+static bool
+aarch64_check_part_clobbered(rtx_insn *insn)
+{
+  if (aarch64_simd_call_p (insn))
+return false;
+  return true;
+}
+
 /* Implement REGMODE_NATURAL_SIZE.  */
 poly_uint64
 aarch64_regmode_natural_size (machine_mode mode)
@@ -18294,6 +18305,9 @@ aarch64_libgcc_floating_mode_supported_p
 #define TARGET_HARD_REGNO_CALL_PART_CLOBBERED \
   aarch64_hard_regno_call_part_clobbered
 
+#undef TARGET_CHECK_PART_CLOBBERED
+#define TARGET_CHECK_PART_CLOBBERED aarch64_check_part_clobbered
+
 #undef TARGET_CONSTANT_ALIGNMENT
 #define TARGET_CONSTANT_ALIGNMENT aarch64_constant_alignment
 
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index e8af1bf..7dd6c54 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -1704,6 +1704,8 @@ of @code{CALL_USED_REGISTERS}.
 @cindex call-saved register
 @hook TARGET_HARD_REGNO_CALL_PART_CLOBBERED
 
+@hook TARGET_CHECK_PART_CLOBBERED
+
 @findex fixed_regs
 @findex call_used_regs
 @findex global_regs
diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index ab61989..89483d3 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -5325,16 +5325,23 @@ inherit_reload_reg (bool def_p, int original_regno,
 static inline bool
 need_for_call_save_p (int regno)
 {
+  machine_mode pmode = PSEUDO_REGNO_MODE (regno);
+  int new_regno = reg_renumber[regno];
+
   lra_assert (regno >= FIRST_PSEUDO_REGISTER && reg_renumber[regno] >= 0);
-  return (usage_insns[regno].calls_num < calls_num
-	  && (overlaps_hard_reg_set_p
-	  ((flag_ipa_ra &&
-		! hard_reg_set_empty_p (lra_reg_info[regno].actual_call_used_reg_set))
-	   ? lra_reg_info[regno].actual_call_used_reg_set
-	   : call_used_reg_set,
-	   PSEUDO_REGNO_MODE (regno), reg_renumber[regno])
-	  || (targetm.hard_regno_call_part_clobbered
-		  (reg_renumber[regno], PSEUDO_REGNO_MODE (regno);
+
+  if (usage_insns[regno].calls_num >= calls_num)
+return false;
+
+  if (flag_ipa_ra
+  && !hard_reg_set_empty_p (lra_reg_info[regno].actual_call_used_reg_set))
+return (overlaps_hard_reg_set_p
+		(lra_reg_info[regno].actual_call_used_reg_set, pmode, new_regno)
+	|| (lra_reg_info[regno].check_part_clobbered
+		&& targetm.hard_regno_call_part_clobbered (new_regno, pmode)));
+  else
+return (overlaps_hard_reg_set_p (call_used_reg_set, pmode, new_regno)
+|| targetm.hard_regno_call_part_clobbered (new_regno, pmode));
 }
 
 /* Global registers occurring in the current EBB.  */
diff --git a/gcc/lra-int.h b/gcc/lra-int.h
index 5267b53..e6aacd2 100644
--- a/gcc/lra-int.h
+++ b/gcc/lra-int.h
@@ -117,6 +117,8 @@ struct lra_reg
   /* This member is set up in lra-lives.c for subsequent
  assignments.  */
   lra_copy_t copies;
+  /* Whether or not the register is partially clobbered.  */
+  bool check_part_clobbered;
 };
 
 /* References to the common info about each register.  */
diff --git a/gcc/lra-lives.c b/gcc/lra-lives.c
index 0bf8cd0..b2dfe0e 100644
--- a/gcc/lra-lives.c
+++ b/gcc/lra-lives.c
@@ -597,7 +597,8 @@ lra_setup_reload_pseudo_preferenced_hard_reg (int regno,
PSEUDOS_LIVE_THROUGH_CALLS and PSEUDOS_LIVE_THROUGH_SETJUMPS.  */
 static inline void
 check_pseudos_live_through_calls (int regno,
-  HARD_REG_SET last_call_used_reg_set)
+  HARD_R