Hi All,

He's a respin of the patch which doesn't change the complexity of insert.

Richard S, since you approved the original patch could you take a look at this 
fix.

---

Hi All,

CSE uses equivalence classes to keep track of expressions that all have the same
values at the current point in the program.

Normal equivalences through SETs only insert and perform lookups in this set but
equivalence determined from comparisons, e.g.

(insn 46 44 47 7 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg:SI 105 [ iD.2893 ])
            (const_int 0 [0]))) "cse.c":18:22 7 {*cmpsi_ccno_1}
     (expr_list:REG_DEAD (reg:SI 105 [ iD.2893 ])
        (nil)))

creates the equivalence EQ on (reg:SI 105 [ iD.2893 ]) and (const_int 0 [0]).

This causes a merge to happen between the two equivalence sets denoted by
(const_int 0 [0]) and (reg:SI 105 [ iD.2893 ]) respectively.

The operation happens through merge_equiv_classes however this function has an
invariant that the classes to be merge not contain any duplicates.  This is
because it frees entries before merging.

The given testcase when using the supplied flags trigger an ICE due to the
equivalence set being

(rr) p dump_class (class1)
Equivalence chain for (reg:SI 105 [ iD.2893 ]):
(reg:SI 105 [ iD.2893 ])
$3 = void

(rr) p dump_class (class2)
Equivalence chain for (const_int 0 [0]):
(const_int 0 [0])
(reg:SI 97 [ _10 ])
(reg:SI 97 [ _10 ])
$4 = void

This happens because the original INSN being recorded is

(insn 18 17 24 2 (set (subreg:V1SI (reg:SI 97 [ _10 ]) 0)
        (const_vector:V1SI [
                (const_int 0 [0])
            ])) "cse.c":11:9 1363 {*movv1si_internal}
     (expr_list:REG_UNUSED (reg:SI 97 [ _10 ])
        (nil)))

and we end up generating two equivalences. the first one is simply that
reg:SI 97 is 0.  The second one is that 0 can be extracted from the V1SI, so
subreg (subreg:V1SI (reg:SI 97) 0) 0 == 0.  This nested subreg gets folded away
to just reg:SI 97 and we re-insert the same equivalence.

This patch changes it so that once we figure out the bucket to insert into we
check if the equivalence set already contains the entry and if so just return
the existing entry and exit.

While doing so it also calculates the new insertion point such that this code
does not increase the worse case complexity of insert_with_costs.

Bootstrapped Regtested on aarch64-none-linux-gnu,
x86_64-pc-linux-gnu and no regressions.


Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

        PR rtl-optimization/103404
        * cse.c (insert_with_costs): Check if item exists already before adding
        a new entry in the equivalence class.

gcc/testsuite/ChangeLog:

        PR rtl-optimization/103404
        * gcc.target/i386/pr103404.c: New test.

--- inline copy of patch ---

diff --git a/gcc/cse.c b/gcc/cse.c
index 
c1c7d0ca27b73c4b944b4719f95fece74e0358d5..be6be52376d3fb328ca6e5cced8a0456439a9a04
 100644
--- a/gcc/cse.c
+++ b/gcc/cse.c
@@ -1528,6 +1528,7 @@ insert_with_costs (rtx x, struct table_elt *classp, 
unsigned int hash,
                   machine_mode mode, int cost, int reg_cost)
 {
   struct table_elt *elt;
+  struct table_elt *ins_loc = NULL, *next = NULL;
 
   /* If X is a register and we haven't made a quantity for it,
      something is wrong.  */
@@ -1537,6 +1538,51 @@ insert_with_costs (rtx x, struct table_elt *classp, 
unsigned int hash,
   if (REG_P (x) && REGNO (x) < FIRST_PSEUDO_REGISTER)
     add_to_hard_reg_set (&hard_regs_in_table, GET_MODE (x), REGNO (x));
 
+  /* We cannot allow a duplicate to be entered into the equivalence sets
+     and so we should perform a check before we do any allocations or
+     change the buckets and simultaneously determine final insertion
+     point.  */
+  if (classp)
+    {
+      struct table_elt *p, *tmp;
+      ins_loc = classp->first_same_value;
+
+      /* Check the first element on its own.  */
+      if (exp_equiv_p (ins_loc->exp, x, 1, false))
+       return ins_loc;
+
+      /* And also cost it on its own as the conditions for it are slightly
+        different from the others.  */
+      if (preferable (cost, reg_cost, ins_loc->cost, ins_loc->regcost) >= 0)
+       {
+         /* Skip over elements that are cheaper than the element being
+            inserted as unequal costs means it can't be a duplicate.  */
+         for (p = ins_loc;
+              (next = p->next_same_value)
+                && preferable (next->cost, next->regcost, cost, reg_cost) < 0;
+              p = next)
+           ;
+
+         /* Record p as the place where we want to insert elt if we are to
+            insert an element.  */
+         ins_loc = p;
+
+      /* The normal search stops when we encounter an element which has the 
same
+        costs as us.  That's where we need to insert, but we still need to 
check
+        the remaining list of equal cost things for a duplicate.  */
+      for (; p; p = tmp)
+       {
+         if (exp_equiv_p (p->exp, x, 1, false))
+           return p;
+
+         if (preferable (cost, reg_cost, p->cost, p->regcost) <= 0)
+           break;
+
+         tmp = p->next_same_value;
+       }
+      }
+    }
+
   /* Put an element for X into the right hash bucket.  */
 
   elt = free_element_chain;
@@ -1579,22 +1625,15 @@ insert_with_costs (rtx x, struct table_elt *classp, 
unsigned int hash,
        }
       else
        {
-         /* Insert not at head of the class.  */
-         /* Put it after the last element cheaper than X.  */
-         struct table_elt *p, *next;
-
-         for (p = classp;
-              (next = p->next_same_value) && CHEAPER (next, elt);
-              p = next)
-           ;
-
-         /* Put it after P and before NEXT.  */
+         /* Insert not at head of the class.
+            Put it after the last element cheaper than X.
+            Put it after INS_LOC and before NEXT.  */
          elt->next_same_value = next;
          if (next)
            next->prev_same_value = elt;
 
-         elt->prev_same_value = p;
-         p->next_same_value = elt;
+         elt->prev_same_value = ins_loc;
+         ins_loc->next_same_value = elt;
          elt->first_same_value = classp;
        }
     }
diff --git a/gcc/testsuite/gcc.target/i386/pr103404.c 
b/gcc/testsuite/gcc.target/i386/pr103404.c
new file mode 100644
index 
0000000000000000000000000000000000000000..66f33645301db09503fc0977fd0f061a19e56ea5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr103404.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-Og -fcse-follow-jumps -fno-dce 
-fno-early-inlining -fgcse -fharden-conditional-branches -frerun-cse-after-loop 
-fno-tree-ccp -mavx5124fmaps -std=c99 -w" } */
+
+typedef unsigned __attribute__((__vector_size__ (4))) U;
+typedef unsigned __attribute__((__vector_size__ (16))) V;
+typedef unsigned __attribute__((__vector_size__ (64))) W;
+
+int x, y;
+
+V v;
+W w;
+
+inline
+int bar (U a)
+{
+  a |= x;
+  W k =
+    __builtin_shufflevector (v, 5 / a,
+                            2, 4, 0, 2, 4, 1, 0, 1,
+                            1, 2, 1, 3, 0, 4, 4, 0);
+  w = k;
+  y = 0;
+}
+
+int
+foo ()
+{
+  bar ((U){0xffffffff});
+  for (unsigned i; i < sizeof (foo);)
+    ;
+}
+

Attachment: rb15109.patch
Description: rb15109.patch

Reply via email to