On Mon, 29 Nov 2021, Tamar Christina wrote: > Hi All, > > CSE uses equivalence classes to keep track of expressions that all have the > same > values at the current point in the program. > > Normal equivalences through SETs only insert and perform lookups in this set > but > equivalence determined from comparisons, e.g. > > (insn 46 44 47 7 (set (reg:CCZ 17 flags) > (compare:CCZ (reg:SI 105 [ iD.2893 ]) > (const_int 0 [0]))) "cse.c":18:22 7 {*cmpsi_ccno_1} > (expr_list:REG_DEAD (reg:SI 105 [ iD.2893 ]) > (nil))) > > creates the equivalence EQ on (reg:SI 105 [ iD.2893 ]) and (const_int 0 [0]). > > This causes a merge to happen between the two equivalence sets denoted by > (const_int 0 [0]) and (reg:SI 105 [ iD.2893 ]) respectively. > > The operation happens through merge_equiv_classes however this function has an > invariant that the classes to be merge not contain any duplicates. This is > because it frees entries before merging. > > The given testcase when using the supplied flags trigger an ICE due to the > equivalence set being > > (rr) p dump_class (class1) > Equivalence chain for (reg:SI 105 [ iD.2893 ]): > (reg:SI 105 [ iD.2893 ]) > $3 = void > > (rr) p dump_class (class2) > Equivalence chain for (const_int 0 [0]): > (const_int 0 [0]) > (reg:SI 97 [ _10 ]) > (reg:SI 97 [ _10 ]) > $4 = void > > This happens because the original INSN being recorded is > > (insn 18 17 24 2 (set (subreg:V1SI (reg:SI 97 [ _10 ]) 0) > (const_vector:V1SI [ > (const_int 0 [0]) > ])) "cse.c":11:9 1363 {*movv1si_internal} > (expr_list:REG_UNUSED (reg:SI 97 [ _10 ]) > (nil))) > > and we end up generating two equivalences. the first one is simply that > reg:SI 97 is 0. The second one is that 0 can be extracted from the V1SI, so > subreg (subreg:V1SI (reg:SI 97) 0) 0 == 0. This nested subreg gets folded > away > to just reg:SI 97 and we re-insert the same equivalence. > > This patch changes it so that once we figure out the bucket to insert into we > check if the equivalence set already contains the entry and if so just return > the existing entry and exit. > > Bootstrapped Regtested on aarch64-none-linux-gnu, > x86_64-pc-linux-gnu and no regressions. > > > Ok for master? > > Thanks, > Tamar > > gcc/ChangeLog: > > PR rtl-optimization/103404 > * cse.c (insert_with_costs): Check if item exists already before adding > a new entry in the equivalence class. > > gcc/testsuite/ChangeLog: > > PR rtl-optimization/103404 > * gcc.target/i386/pr103404.c: New test. > > --- inline copy of patch -- > diff --git a/gcc/cse.c b/gcc/cse.c > index > c1c7d0ca27b73c4b944b4719f95fece74e0358d5..08295246c594109e947276051c6776e4cabca4ec > 100644 > --- a/gcc/cse.c > +++ b/gcc/cse.c > @@ -1537,6 +1537,17 @@ insert_with_costs (rtx x, struct table_elt *classp, > unsigned int hash, > if (REG_P (x) && REGNO (x) < FIRST_PSEUDO_REGISTER) > add_to_hard_reg_set (&hard_regs_in_table, GET_MODE (x), REGNO (x)); > > + /* We cannot allow a duplicate to be entered into the equivalence sets > + and so we should perform a check before we do any allocations or > + change the buckets. */ > + if (classp) > + { > + struct table_elt *p; > + for (p = classp; p; p = p->next_same_value) > + if (exp_equiv_p (p->exp, x, 1, false)) > + return p;
not really a review, leaving that to who approved the original change, but these things always look bad - this linear list walk makes insert_with_costs quadratic. Is there any mitigation (like limiting the number of entries?), is that already an existing problem elsewhere in CSE? > + } > + > /* Put an element for X into the right hash bucket. */ > > elt = free_element_chain; > diff --git a/gcc/testsuite/gcc.target/i386/pr103404.c > b/gcc/testsuite/gcc.target/i386/pr103404.c > new file mode 100644 > index > 0000000000000000000000000000000000000000..66f33645301db09503fc0977fd0f061a19e56ea5 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr103404.c > @@ -0,0 +1,32 @@ > +/* { dg-do compile } */ > +/* { dg-additional-options "-Og -fcse-follow-jumps -fno-dce > -fno-early-inlining -fgcse -fharden-conditional-branches > -frerun-cse-after-loop -fno-tree-ccp -mavx5124fmaps -std=c99 -w" } */ > + > +typedef unsigned __attribute__((__vector_size__ (4))) U; > +typedef unsigned __attribute__((__vector_size__ (16))) V; > +typedef unsigned __attribute__((__vector_size__ (64))) W; > + > +int x, y; > + > +V v; > +W w; > + > +inline > +int bar (U a) > +{ > + a |= x; > + W k = > + __builtin_shufflevector (v, 5 / a, > + 2, 4, 0, 2, 4, 1, 0, 1, > + 1, 2, 1, 3, 0, 4, 4, 0); > + w = k; > + y = 0; > +} > + > +int > +foo () > +{ > + bar ((U){0xffffffff}); > + for (unsigned i; i < sizeof (foo);) > + ; > +} > + > > > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)