Re: [pushed] [RA]: Improve cost calculation of pseudos with equivalences

2023-09-17 Thread Jeff Law via Gcc-patches




On 9/14/23 09:28, Vladimir Makarov via Gcc-patches wrote:
I've committed the following patch.  The reason for this patch is 
explained in its commit message.


The patch was successfully bootstrapped and tested on x86-64, aarch64, 
and ppc64le.



ra-equiv-cost.patch_ZN7cObject4dropEP12cOwnedObject-stores

commit 3c834d85f2ec42c60995c2b678196a06cb744959
Author: Vladimir N. Makarov
Date:   Thu Sep 14 10:26:48 2023 -0400

 [RA]: Improve cost calculation of pseudos with equivalences
 
 RISCV target developers reported that RA can spill a pseudo used in a
 loop although there are enough registers to assign.  It happens when
 the pseudo has an equivalence outside the loop and the equivalence is
 not merged into insns using the pseudo.  IRA sets the memory cost
 to zero when the pseudo has an equivalence, which means that the
 pseudo will probably be spilled.  This approach worked well for i686
 (different approaches were benchmarked a long time ago on spec2k).
 However, common sense says that the code is wrong, and this was
 confirmed by RISCV developers.
 
 I've tried the following patch on I7-9700k and it improved spec17 fp
 by 1.5% (21.1 vs 20.8) although spec17 int is a bit worse by 0.45%
 (8.54 vs 8.58).  The average generated code size is practically the
 same (0.001% difference).
 
 In the future we probably need to try a more sophisticated cost
 calculation that takes into account that the equiv cannot be
 combined into usage insns and the costs of reloads because of this.
 
 gcc/ChangeLog:
 
 * ira-costs.cc (find_costs_and_classes): Decrease memory cost
 by equiv savings.

Thanks for diving into this!

What's rather strange is when I do an A/B test with this patch on RISC-V 
it appears to be a pretty consistent loss for integer code.  This would 
seem to match your findings on x86 as well.


I still need to dig into it more deeply, but I see higher ALU as well as 
higher load/store traffic.  The load/store traffic in the one case I've 
looked at so far (omnetpp) appears to be prologue/epilogue related. 
Essentially we're using an additional callee-saved register on paths 
that don't trigger at runtime.


Jeff



[pushed] [RA]: Improve cost calculation of pseudos with equivalences

2023-09-14 Thread Vladimir Makarov via Gcc-patches
I've committed the following patch.  The reason for this patch is 
explained in its commit message.


The patch was successfully bootstrapped and tested on x86-64, aarch64, 
and ppc64le.


commit 3c834d85f2ec42c60995c2b678196a06cb744959
Author: Vladimir N. Makarov 
Date:   Thu Sep 14 10:26:48 2023 -0400

[RA]: Improve cost calculation of pseudos with equivalences

RISCV target developers reported that RA can spill a pseudo used in a
loop although there are enough registers to assign.  It happens when
the pseudo has an equivalence outside the loop and the equivalence is
not merged into insns using the pseudo.  IRA sets the memory cost
to zero when the pseudo has an equivalence, which means that the
pseudo will probably be spilled.  This approach worked well for i686
(different approaches were benchmarked a long time ago on spec2k).
However, common sense says that the code is wrong, and this was
confirmed by RISCV developers.

I've tried the following patch on I7-9700k and it improved spec17 fp
by 1.5% (21.1 vs 20.8) although spec17 int is a bit worse by 0.45%
(8.54 vs 8.58).  The average generated code size is practically the
same (0.001% difference).

In the future we probably need to try a more sophisticated cost
calculation that takes into account that the equiv cannot be
combined into usage insns and the costs of reloads because of this.

gcc/ChangeLog:

* ira-costs.cc (find_costs_and_classes): Decrease memory cost
by equiv savings.

diff --git a/gcc/ira-costs.cc b/gcc/ira-costs.cc
index d9e700e8947..8c93ace5094 100644
--- a/gcc/ira-costs.cc
+++ b/gcc/ira-costs.cc
@@ -1947,15 +1947,8 @@ find_costs_and_classes (FILE *dump_file)
 	}
 	  if (i >= first_moveable_pseudo && i < last_moveable_pseudo)
 	i_mem_cost = 0;
-	  else if (equiv_savings < 0)
-	i_mem_cost = -equiv_savings;
-	  else if (equiv_savings > 0)
-	{
-	  i_mem_cost = 0;
-	  for (k = cost_classes_ptr->num - 1; k >= 0; k--)
-		i_costs[k] += equiv_savings;
-	}
-
+	  else
+	i_mem_cost -= equiv_savings;
 	  best_cost = (1 << (HOST_BITS_PER_INT - 2)) - 1;
 	  best = ALL_REGS;
 	  alt_class = NO_REGS;
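
For readers who want to see the effect of the hunk in isolation, here is a
minimal standalone sketch of the two code paths, not the actual GCC code.
The struct pseudo_costs type, the MAX_CLASSES limit, and the function names
adjust_old and adjust_new are hypothetical stand-ins invented for this
illustration; only the arithmetic is taken from the removed and added lines
of the diff.  The branch for moveable pseudos is left out.

```c
#include <assert.h>

#define MAX_CLASSES 4

/* Hypothetical stand-in for the per-pseudo values that the hunk
   touches; this is not a GCC type.  */
struct pseudo_costs
{
  int mem_cost;                 /* plays the role of i_mem_cost */
  int class_cost[MAX_CLASSES];  /* plays the role of i_costs[k] */
  int num_classes;              /* plays the role of cost_classes_ptr->num */
};

/* Arithmetic of the lines removed by the patch: a positive saving
   zeroes the memory cost and penalizes every register class, a
   negative saving overwrites the memory cost.  */
void
adjust_old (struct pseudo_costs *p, int equiv_savings)
{
  assert (p->num_classes >= 0 && p->num_classes <= MAX_CLASSES);
  if (equiv_savings < 0)
    p->mem_cost = -equiv_savings;
  else if (equiv_savings > 0)
    {
      p->mem_cost = 0;
      for (int k = p->num_classes - 1; k >= 0; k--)
        p->class_cost[k] += equiv_savings;
    }
}

/* Arithmetic of the lines added by the patch: the memory cost is
   only decreased by the saving, register class costs are untouched.  */
void
adjust_new (struct pseudo_costs *p, int equiv_savings)
{
  p->mem_cost -= equiv_savings;
}
```

With made-up numbers (memory cost 8000, class costs 2000 and 4000, equiv
savings 3000), adjust_old leaves a memory cost of 0 and class costs of 5000
and 7000, so memory looks cheapest.  adjust_new leaves a memory cost of 5000
and class costs of 2000 and 4000, so a register class still looks cheaper.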