Re: [PATCH 6/6] ira: Handle "soft" conflicts between cap and non-cap allocnos

2022-01-10 Thread Vladimir Makarov via Gcc-patches



On 2022-01-06 09:48, Richard Sandiford wrote:

This patch looks for allocno conflicts of the following form:

- One allocno (X) is a cap allocno for some non-cap allocno X2.
- X2 belongs to some loop L2.
- The other allocno (Y) is a non-cap allocno.
- Y is an ancestor of some allocno Y2 in L2.
- Y2 is not referenced in L2 (that is, ALLOCNO_NREFS (Y2) == 0).
- Y can use a different allocation from Y2.

In this case, Y's register is live across L2 but is not used within it,
whereas X's register is used only within L2.  The conflict is therefore
only "soft", in that it can easily be avoided by spilling Y2 inside L2
without affecting any insn references.

In principle we could do this for ALLOCNO_NREFS (Y2) != 0 too, with the
callers then taking Y2's ALLOCNO_MEMORY_COST into account.  There would
then be no "cliff edge" between a Y2 that has no references and a Y2 that
has (say) a single cold reference.

However, doing that isn't necessary for the PR and seems to give
variable results in practice.  (fotonik3d_r improves slightly but
namd_r regresses slightly.)  It therefore seemed better to start
with the higher-value zero-reference case and see how things go.

On top of the previous patches in the series, this fixes the exchange2
regression seen in GCC 11.

gcc/
PR rtl-optimization/98782
* ira-int.h (ira_soft_conflict): Declare.
* ira-costs.c (max_soft_conflict_loop_depth): New constant.
(ira_soft_conflict): New function.
(spill_soft_conflicts): Likewise.
(assign_hard_reg): Use them to handle the case described by
the comment above ira_soft_conflict.
(improve_allocation): Likewise.
* ira.c (check_allocation): Allow allocnos with "soft" conflicts
to share the same register.

gcc/testsuite/
* gcc.target/aarch64/reg-alloc-4.c: New test.


OK.  If something goes wrong with the patches (e.g. a lot of GCC 
testsuite failures or performance degradation), we can revert only the 
last 3 of them as ones actually changing the heuristics.  But I hope it 
will be not necessary.


Thank you again for working on the PR.  Fixing it required big efforts 
in thinking, testing and benchmarking.





[PATCH 6/6] ira: Handle "soft" conflicts between cap and non-cap allocnos

2022-01-06 Thread Richard Sandiford via Gcc-patches
This patch looks for allocno conflicts of the following form:

- One allocno (X) is a cap allocno for some non-cap allocno X2.
- X2 belongs to some loop L2.
- The other allocno (Y) is a non-cap allocno.
- Y is an ancestor of some allocno Y2 in L2.
- Y2 is not referenced in L2 (that is, ALLOCNO_NREFS (Y2) == 0).
- Y can use a different allocation from Y2.

In this case, Y's register is live across L2 but is not used within it,
whereas X's register is used only within L2.  The conflict is therefore
only "soft", in that it can easily be avoided by spilling Y2 inside L2
without affecting any insn references.

In principle we could do this for ALLOCNO_NREFS (Y2) != 0 too, with the
callers then taking Y2's ALLOCNO_MEMORY_COST into account.  There would
then be no "cliff edge" between a Y2 that has no references and a Y2 that
has (say) a single cold reference.

However, doing that isn't necessary for the PR and seems to give
variable results in practice.  (fotonik3d_r improves slightly but
namd_r regresses slightly.)  It therefore seemed better to start
with the higher-value zero-reference case and see how things go.

On top of the previous patches in the series, this fixes the exchange2
regression seen in GCC 11.

gcc/
PR rtl-optimization/98782
* ira-int.h (ira_soft_conflict): Declare.
* ira-costs.c (max_soft_conflict_loop_depth): New constant.
(ira_soft_conflict): New function.
(spill_soft_conflicts): Likewise.
(assign_hard_reg): Use them to handle the case described by
the comment above ira_soft_conflict.
(improve_allocation): Likewise.
* ira.c (check_allocation): Allow allocnos with "soft" conflicts
to share the same register.

gcc/testsuite/
* gcc.target/aarch64/reg-alloc-4.c: New test.
---
 gcc/ira-color.c   | 284 --
 gcc/ira-int.h |   1 +
 gcc/ira.c |   2 +
 .../gcc.target/aarch64/reg-alloc-4.c  |  69 +
 4 files changed, 326 insertions(+), 30 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/reg-alloc-4.c

diff --git a/gcc/ira-color.c b/gcc/ira-color.c
index 1487afc5ef1..36f3f4d70f3 100644
--- a/gcc/ira-color.c
+++ b/gcc/ira-color.c
@@ -36,6 +36,11 @@ along with GCC; see the file COPYING3.  If not see
 #include "reload.h"
 #include "cfgloop.h"
 
+/* To prevent soft conflict detection becoming quadratic in the
+   loop depth.  Only for very pathological cases, so it hardly
+   seems worth a --param.  */
+const int max_soft_conflict_loop_depth = 64;
+
 typedef struct allocno_hard_regs *allocno_hard_regs_t;
 
 /* The structure contains information about hard registers can be
@@ -1707,6 +1712,167 @@ calculate_saved_nregs (int hard_regno, machine_mode 
mode)
   return nregs;
 }
 
+/* Allocnos A1 and A2 are known to conflict.  Check whether, in some loop L
+   that is either the current loop or a nested subloop, the conflict is of
+   the following form:
+
+   - One allocno (X) is a cap allocno for some non-cap allocno X2.
+
+   - X2 belongs to some loop L2.
+
+   - The other allocno (Y) is a non-cap allocno.
+
+   - Y is an ancestor of some allocno Y2 in L2.  (Note that such a Y2
+ must exist, given that X and Y conflict.)
+
+   - Y2 is not referenced in L2 (that is, ALLOCNO_NREFS (Y2) == 0).
+
+   - Y can use a different allocation from Y2.
+
+   In this case, Y's register is live across L2 but is not used within it,
+   whereas X's register is used only within L2.  The conflict is therefore
+   only "soft", in that it can easily be avoided by spilling Y2 inside L2
+   without affecting any insn references.
+
+   If the conflict does have this form, return the Y2 that would need to be
+   spilled in order to allow X and Y (and thus A1 and A2) to use the same
+   register.  Return null otherwise.  Returning null is conservatively correct;
+   any nonnnull return value is an optimization.  */
+ira_allocno_t
+ira_soft_conflict (ira_allocno_t a1, ira_allocno_t a2)
+{
+  /* Search for the loop L and its associated allocnos X and Y.  */
+  int search_depth = 0;
+  while (ALLOCNO_CAP_MEMBER (a1) && ALLOCNO_CAP_MEMBER (a2))
+{
+  a1 = ALLOCNO_CAP_MEMBER (a1);
+  a2 = ALLOCNO_CAP_MEMBER (a2);
+  if (search_depth++ > max_soft_conflict_loop_depth)
+   return nullptr;
+}
+  /* This must be true if A1 and A2 conflict.  */
+  ira_assert (ALLOCNO_LOOP_TREE_NODE (a1) == ALLOCNO_LOOP_TREE_NODE (a2));
+
+  /* Make A1 the cap allocno (X in the comment above) and A2 the
+ non-cap allocno (Y in the comment above).  */
+  if (ALLOCNO_CAP_MEMBER (a2))
+std::swap (a1, a2);
+  if (!ALLOCNO_CAP_MEMBER (a1))
+return nullptr;
+
+  /* Search for the real allocno that A1 caps (X2 in the comment above).  */
+  do
+{
+  a1 = ALLOCNO_CAP_MEMBER (a1);
+  if (search_depth++ > max_soft_conflict_loop_depth)
+   return nullptr;
+}
+  while (ALLOCNO_CAP_MEMBER