Re: [PATCH]middle-end cse: Make sure duplicate elements are not entered into the equivalence set [PR103404]

2021-12-06 Thread Richard Sandiford via Gcc-patches
Tamar Christina  writes:
> Hi,
>
> As discussed off-line this can only happen with a V1 mode, so here's a much 
> simpler patch.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu,
> x86_64-pc-linux-gnu and no regressions.
>
>
> Ok for master?

OK, thanks.

Richard

> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> PR rtl-optimization/103404
> * cse.c (find_sets_in_insn): Don't select elements out of a V1 mode
> subreg.
>
> gcc/testsuite/ChangeLog:
>
> PR rtl-optimization/103404
> * gcc.target/i386/pr103404.c: New test.
>
> --- inline copy of patch ---
>
> diff --git a/gcc/cse.c b/gcc/cse.c
> index 
> c1c7d0ca27b73c4b944b4719f95fece74e0358d5..dc5d5aed047c7776f44b159a4286390d6499c18d
>  100644
> --- a/gcc/cse.c
> +++ b/gcc/cse.c
> @@ -4275,7 +4275,12 @@ find_sets_in_insn (rtx_insn *insn, vec 
> *psets)
>else if (GET_CODE (SET_SRC (x)) == CALL)
> ;
>else if (GET_CODE (SET_SRC (x)) == CONST_VECTOR
> -  && GET_MODE_CLASS (GET_MODE (SET_SRC (x))) != MODE_VECTOR_BOOL)
> +  && GET_MODE_CLASS (GET_MODE (SET_SRC (x))) != MODE_VECTOR_BOOL
> +  /* Prevent duplicates from being generated if the type is a V1
> + type and a subreg.  Folding this will result in the same
> + element as folding x itself.  */
> +  && !(SUBREG_P (SET_DEST (x))
> +   && known_eq (GET_MODE_NUNITS (GET_MODE (SET_SRC (x))), 
> 1)))
> {
>   /* First register the vector itself.  */
>   add_to_set (psets, x);
> diff --git a/gcc/testsuite/gcc.target/i386/pr103404.c 
> b/gcc/testsuite/gcc.target/i386/pr103404.c
> new file mode 100644
> index 
> ..66f33645301db09503fc0977fd0f061a19e56ea5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr103404.c
> @@ -0,0 +1,32 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-Og -fcse-follow-jumps -fno-dce 
> -fno-early-inlining -fgcse -fharden-conditional-branches 
> -frerun-cse-after-loop -fno-tree-ccp -mavx5124fmaps -std=c99 -w" } */
> +
> +typedef unsigned __attribute__((__vector_size__ (4))) U;
> +typedef unsigned __attribute__((__vector_size__ (16))) V;
> +typedef unsigned __attribute__((__vector_size__ (64))) W;
> +
> +int x, y;
> +
> +V v;
> +W w;
> +
> +inline
> +int bar (U a)
> +{
> +  a |= x;
> +  W k =
> +__builtin_shufflevector (v, 5 / a,
> +2, 4, 0, 2, 4, 1, 0, 1,
> +1, 2, 1, 3, 0, 4, 4, 0);
> +  w = k;
> +  y = 0;
> +}
> +
> +int
> +foo ()
> +{
> +  bar ((U){0x});
> +  for (unsigned i; i < sizeof (foo);)
> +;
> +}
> +


RE: [PATCH]middle-end cse: Make sure duplicate elements are not entered into the equivalence set [PR103404]

2021-12-06 Thread Tamar Christina via Gcc-patches
Hi,

As discussed off-line this can only happen with a V1 mode, so here's a much 
simpler patch.

Bootstrapped Regtested on aarch64-none-linux-gnu,
x86_64-pc-linux-gnu and no regressions.


Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

PR rtl-optimization/103404
* cse.c (find_sets_in_insn): Don't select elements out of a V1 mode
subreg.

gcc/testsuite/ChangeLog:

PR rtl-optimization/103404
* gcc.target/i386/pr103404.c: New test.

--- inline copy of patch ---

diff --git a/gcc/cse.c b/gcc/cse.c
index 
c1c7d0ca27b73c4b944b4719f95fece74e0358d5..dc5d5aed047c7776f44b159a4286390d6499c18d
 100644
--- a/gcc/cse.c
+++ b/gcc/cse.c
@@ -4275,7 +4275,12 @@ find_sets_in_insn (rtx_insn *insn, vec 
*psets)
   else if (GET_CODE (SET_SRC (x)) == CALL)
;
   else if (GET_CODE (SET_SRC (x)) == CONST_VECTOR
-  && GET_MODE_CLASS (GET_MODE (SET_SRC (x))) != MODE_VECTOR_BOOL)
+  && GET_MODE_CLASS (GET_MODE (SET_SRC (x))) != MODE_VECTOR_BOOL
+  /* Prevent duplicates from being generated if the type is a V1
+ type and a subreg.  Folding this will result in the same
+ element as folding x itself.  */
+  && !(SUBREG_P (SET_DEST (x))
+   && known_eq (GET_MODE_NUNITS (GET_MODE (SET_SRC (x))), 1)))
{
  /* First register the vector itself.  */
  add_to_set (psets, x);
diff --git a/gcc/testsuite/gcc.target/i386/pr103404.c 
b/gcc/testsuite/gcc.target/i386/pr103404.c
new file mode 100644
index 
..66f33645301db09503fc0977fd0f061a19e56ea5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr103404.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-Og -fcse-follow-jumps -fno-dce 
-fno-early-inlining -fgcse -fharden-conditional-branches -frerun-cse-after-loop 
-fno-tree-ccp -mavx5124fmaps -std=c99 -w" } */
+
+typedef unsigned __attribute__((__vector_size__ (4))) U;
+typedef unsigned __attribute__((__vector_size__ (16))) V;
+typedef unsigned __attribute__((__vector_size__ (64))) W;
+
+int x, y;
+
+V v;
+W w;
+
+inline
+int bar (U a)
+{
+  a |= x;
+  W k =
+__builtin_shufflevector (v, 5 / a,
+2, 4, 0, 2, 4, 1, 0, 1,
+1, 2, 1, 3, 0, 4, 4, 0);
+  w = k;
+  y = 0;
+}
+
+int
+foo ()
+{
+  bar ((U){0x});
+  for (unsigned i; i < sizeof (foo);)
+;
+}
+


rb15109.patch
Description: rb15109.patch


RE: [PATCH]middle-end cse: Make sure duplicate elements are not entered into the equivalence set [PR103404]

2021-12-02 Thread Tamar Christina via Gcc-patches
Hi All,

He's a respin of the patch which doesn't change the complexity of insert.

Richard S, since you approved the original patch could you take a look at this 
fix.

---

Hi All,

CSE uses equivalence classes to keep track of expressions that all have the same
values at the current point in the program.

Normal equivalences through SETs only insert and perform lookups in this set but
equivalence determined from comparisons, e.g.

(insn 46 44 47 7 (set (reg:CCZ 17 flags)
(compare:CCZ (reg:SI 105 [ iD.2893 ])
(const_int 0 [0]))) "cse.c":18:22 7 {*cmpsi_ccno_1}
 (expr_list:REG_DEAD (reg:SI 105 [ iD.2893 ])
(nil)))

creates the equivalence EQ on (reg:SI 105 [ iD.2893 ]) and (const_int 0 [0]).

This causes a merge to happen between the two equivalence sets denoted by
(const_int 0 [0]) and (reg:SI 105 [ iD.2893 ]) respectively.

The operation happens through merge_equiv_classes however this function has an
invariant that the classes to be merge not contain any duplicates.  This is
because it frees entries before merging.

The given testcase when using the supplied flags trigger an ICE due to the
equivalence set being

(rr) p dump_class (class1)
Equivalence chain for (reg:SI 105 [ iD.2893 ]):
(reg:SI 105 [ iD.2893 ])
$3 = void

(rr) p dump_class (class2)
Equivalence chain for (const_int 0 [0]):
(const_int 0 [0])
(reg:SI 97 [ _10 ])
(reg:SI 97 [ _10 ])
$4 = void

This happens because the original INSN being recorded is

(insn 18 17 24 2 (set (subreg:V1SI (reg:SI 97 [ _10 ]) 0)
(const_vector:V1SI [
(const_int 0 [0])
])) "cse.c":11:9 1363 {*movv1si_internal}
 (expr_list:REG_UNUSED (reg:SI 97 [ _10 ])
(nil)))

and we end up generating two equivalences. the first one is simply that
reg:SI 97 is 0.  The second one is that 0 can be extracted from the V1SI, so
subreg (subreg:V1SI (reg:SI 97) 0) 0 == 0.  This nested subreg gets folded away
to just reg:SI 97 and we re-insert the same equivalence.

This patch changes it so that once we figure out the bucket to insert into we
check if the equivalence set already contains the entry and if so just return
the existing entry and exit.

While doing so it also calculates the new insertion point such that this code
does not increase the worse case complexity of insert_with_costs.

Bootstrapped Regtested on aarch64-none-linux-gnu,
x86_64-pc-linux-gnu and no regressions.


Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

PR rtl-optimization/103404
* cse.c (insert_with_costs): Check if item exists already before adding
a new entry in the equivalence class.

gcc/testsuite/ChangeLog:

PR rtl-optimization/103404
* gcc.target/i386/pr103404.c: New test.

--- inline copy of patch ---

diff --git a/gcc/cse.c b/gcc/cse.c
index 
c1c7d0ca27b73c4b944b4719f95fece74e0358d5..be6be52376d3fb328ca6e5cced8a0456439a9a04
 100644
--- a/gcc/cse.c
+++ b/gcc/cse.c
@@ -1528,6 +1528,7 @@ insert_with_costs (rtx x, struct table_elt *classp, 
unsigned int hash,
   machine_mode mode, int cost, int reg_cost)
 {
   struct table_elt *elt;
+  struct table_elt *ins_loc = NULL, *next = NULL;
 
   /* If X is a register and we haven't made a quantity for it,
  something is wrong.  */
@@ -1537,6 +1538,51 @@ insert_with_costs (rtx x, struct table_elt *classp, 
unsigned int hash,
   if (REG_P (x) && REGNO (x) < FIRST_PSEUDO_REGISTER)
 add_to_hard_reg_set (&hard_regs_in_table, GET_MODE (x), REGNO (x));
 
+  /* We cannot allow a duplicate to be entered into the equivalence sets
+ and so we should perform a check before we do any allocations or
+ change the buckets and simultaneously determine final insertion
+ point.  */
+  if (classp)
+{
+  struct table_elt *p, *tmp;
+  ins_loc = classp->first_same_value;
+
+  /* Check the first element on its own.  */
+  if (exp_equiv_p (ins_loc->exp, x, 1, false))
+   return ins_loc;
+
+  /* And also cost it on its own as the conditions for it are slightly
+different from the others.  */
+  if (preferable (cost, reg_cost, ins_loc->cost, ins_loc->regcost) >= 0)
+   {
+ /* Skip over elements that are cheaper than the element being
+inserted as unequal costs means it can't be a duplicate.  */
+ for (p = ins_loc;
+  (next = p->next_same_value)
+&& preferable (next->cost, next->regcost, cost, reg_cost) < 0;
+  p = next)
+   ;
+
+ /* Record p as the place where we want to insert elt if we are to
+insert an element.  */
+ ins_loc = p;
+
+  /* The normal search stops when we encounter an element which has the 
same
+costs as us.  That's where we need to insert, but we still need to 
check
+the remaining list of equal cost things for a duplicate.  */
+  for (; p; p = tmp)
+   {
+ if (exp_equiv_p (p->exp, x, 1, false))
+   return p;
+
+ if (preferab

RE: [PATCH]middle-end cse: Make sure duplicate elements are not entered into the equivalence set [PR103404]

2021-11-29 Thread Tamar Christina via Gcc-patches
> -Original Message-
> From: Richard Biener 
> Sent: Monday, November 29, 2021 9:02 AM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; j...@tachyum.com;
> Richard Sandiford 
> Subject: Re: [PATCH]middle-end cse: Make sure duplicate elements are not
> entered into the equivalence set [PR103404]
> 
> On Mon, 29 Nov 2021, Tamar Christina wrote:
> 
> > Hi All,
> >
> > CSE uses equivalence classes to keep track of expressions that all
> > have the same values at the current point in the program.
> >
> > Normal equivalences through SETs only insert and perform lookups in
> > this set but equivalence determined from comparisons, e.g.
> >
> > (insn 46 44 47 7 (set (reg:CCZ 17 flags)
> > (compare:CCZ (reg:SI 105 [ iD.2893 ])
> > (const_int 0 [0]))) "cse.c":18:22 7 {*cmpsi_ccno_1}
> >  (expr_list:REG_DEAD (reg:SI 105 [ iD.2893 ])
> > (nil)))
> >
> > creates the equivalence EQ on (reg:SI 105 [ iD.2893 ]) and (const_int 0 
> > [0]).
> >
> > This causes a merge to happen between the two equivalence sets denoted
> > by (const_int 0 [0]) and (reg:SI 105 [ iD.2893 ]) respectively.
> >
> > The operation happens through merge_equiv_classes however this
> > function has an invariant that the classes to be merge not contain any
> > duplicates.  This is because it frees entries before merging.
> >
> > The given testcase when using the supplied flags trigger an ICE due to
> > the equivalence set being
> >
> > (rr) p dump_class (class1)
> > Equivalence chain for (reg:SI 105 [ iD.2893 ]):
> > (reg:SI 105 [ iD.2893 ])
> > $3 = void
> >
> > (rr) p dump_class (class2)
> > Equivalence chain for (const_int 0 [0]):
> > (const_int 0 [0])
> > (reg:SI 97 [ _10 ])
> > (reg:SI 97 [ _10 ])
> > $4 = void
> >
> > This happens because the original INSN being recorded is
> >
> > (insn 18 17 24 2 (set (subreg:V1SI (reg:SI 97 [ _10 ]) 0)
> > (const_vector:V1SI [
> > (const_int 0 [0])
> > ])) "cse.c":11:9 1363 {*movv1si_internal}
> >  (expr_list:REG_UNUSED (reg:SI 97 [ _10 ])
> > (nil)))
> >
> > and we end up generating two equivalences. the first one is simply
> > that reg:SI 97 is 0.  The second one is that 0 can be extracted from
> > the V1SI, so subreg (subreg:V1SI (reg:SI 97) 0) 0 == 0.  This nested
> > subreg gets folded away to just reg:SI 97 and we re-insert the same
> equivalence.
> >
> > This patch changes it so that once we figure out the bucket to insert
> > into we check if the equivalence set already contains the entry and if
> > so just return the existing entry and exit.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
> > and no regressions.
> >
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > PR rtl-optimization/103404
> > * cse.c (insert_with_costs): Check if item exists already before
> adding
> > a new entry in the equivalence class.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR rtl-optimization/103404
> > * gcc.target/i386/pr103404.c: New test.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/cse.c b/gcc/cse.c
> > index
> >
> c1c7d0ca27b73c4b944b4719f95fece74e0358d5..08295246c594109e947276051c
> 67
> > 76e4cabca4ec 100644
> > --- a/gcc/cse.c
> > +++ b/gcc/cse.c
> > @@ -1537,6 +1537,17 @@ insert_with_costs (rtx x, struct table_elt *classp,
> unsigned int hash,
> >if (REG_P (x) && REGNO (x) < FIRST_PSEUDO_REGISTER)
> >  add_to_hard_reg_set (&hard_regs_in_table, GET_MODE (x), REGNO
> > (x));
> >
> > +  /* We cannot allow a duplicate to be entered into the equivalence sets
> > + and so we should perform a check before we do any allocations or
> > + change the buckets.  */
> > +  if (classp)
> > +{
> > +  struct table_elt *p;
> > +  for (p = classp; p; p = p->next_same_value)
> > +   if (exp_equiv_p (p->exp, x, 1, false))
> > + return p;
> 
> not really a review, leaving that to who approved the original change, but
> these things always look bad - this linear list walk makes insert_with_costs
> quadratic.  Is there any mitigation (like limiting the number of entries?), is
> that already an existing problem elsewhere in CSE?

Hmm I believe insert_with_costs is already quadratic as it walks the list after 
alloc

Re: [PATCH]middle-end cse: Make sure duplicate elements are not entered into the equivalence set [PR103404]

2021-11-29 Thread Richard Biener via Gcc-patches
On Mon, 29 Nov 2021, Tamar Christina wrote:

> Hi All,
> 
> CSE uses equivalence classes to keep track of expressions that all have the 
> same
> values at the current point in the program.
> 
> Normal equivalences through SETs only insert and perform lookups in this set 
> but
> equivalence determined from comparisons, e.g.
> 
> (insn 46 44 47 7 (set (reg:CCZ 17 flags)
> (compare:CCZ (reg:SI 105 [ iD.2893 ])
> (const_int 0 [0]))) "cse.c":18:22 7 {*cmpsi_ccno_1}
>  (expr_list:REG_DEAD (reg:SI 105 [ iD.2893 ])
> (nil)))
> 
> creates the equivalence EQ on (reg:SI 105 [ iD.2893 ]) and (const_int 0 [0]).
> 
> This causes a merge to happen between the two equivalence sets denoted by
> (const_int 0 [0]) and (reg:SI 105 [ iD.2893 ]) respectively.
> 
> The operation happens through merge_equiv_classes however this function has an
> invariant that the classes to be merge not contain any duplicates.  This is
> because it frees entries before merging.
> 
> The given testcase when using the supplied flags trigger an ICE due to the
> equivalence set being
> 
> (rr) p dump_class (class1)
> Equivalence chain for (reg:SI 105 [ iD.2893 ]):
> (reg:SI 105 [ iD.2893 ])
> $3 = void
> 
> (rr) p dump_class (class2)
> Equivalence chain for (const_int 0 [0]):
> (const_int 0 [0])
> (reg:SI 97 [ _10 ])
> (reg:SI 97 [ _10 ])
> $4 = void
> 
> This happens because the original INSN being recorded is
> 
> (insn 18 17 24 2 (set (subreg:V1SI (reg:SI 97 [ _10 ]) 0)
> (const_vector:V1SI [
> (const_int 0 [0])
> ])) "cse.c":11:9 1363 {*movv1si_internal}
>  (expr_list:REG_UNUSED (reg:SI 97 [ _10 ])
> (nil)))
> 
> and we end up generating two equivalences. the first one is simply that
> reg:SI 97 is 0.  The second one is that 0 can be extracted from the V1SI, so
> subreg (subreg:V1SI (reg:SI 97) 0) 0 == 0.  This nested subreg gets folded 
> away
> to just reg:SI 97 and we re-insert the same equivalence.
> 
> This patch changes it so that once we figure out the bucket to insert into we
> check if the equivalence set already contains the entry and if so just return
> the existing entry and exit.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu,
> x86_64-pc-linux-gnu and no regressions.
> 
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   PR rtl-optimization/103404
>   * cse.c (insert_with_costs): Check if item exists already before adding
>   a new entry in the equivalence class.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR rtl-optimization/103404
>   * gcc.target/i386/pr103404.c: New test.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/cse.c b/gcc/cse.c
> index 
> c1c7d0ca27b73c4b944b4719f95fece74e0358d5..08295246c594109e947276051c6776e4cabca4ec
>  100644
> --- a/gcc/cse.c
> +++ b/gcc/cse.c
> @@ -1537,6 +1537,17 @@ insert_with_costs (rtx x, struct table_elt *classp, 
> unsigned int hash,
>if (REG_P (x) && REGNO (x) < FIRST_PSEUDO_REGISTER)
>  add_to_hard_reg_set (&hard_regs_in_table, GET_MODE (x), REGNO (x));
>  
> +  /* We cannot allow a duplicate to be entered into the equivalence sets
> + and so we should perform a check before we do any allocations or
> + change the buckets.  */
> +  if (classp)
> +{
> +  struct table_elt *p;
> +  for (p = classp; p; p = p->next_same_value)
> + if (exp_equiv_p (p->exp, x, 1, false))
> +   return p;

not really a review, leaving that to who approved the original change,
but these things always look bad - this linear list walk makes
insert_with_costs quadratic.  Is there any mitigation (like limiting
the number of entries?), is that already an existing problem elsewhere
in CSE?

> +}
> +
>/* Put an element for X into the right hash bucket.  */
>  
>elt = free_element_chain;
> diff --git a/gcc/testsuite/gcc.target/i386/pr103404.c 
> b/gcc/testsuite/gcc.target/i386/pr103404.c
> new file mode 100644
> index 
> ..66f33645301db09503fc0977fd0f061a19e56ea5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr103404.c
> @@ -0,0 +1,32 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-Og -fcse-follow-jumps -fno-dce 
> -fno-early-inlining -fgcse -fharden-conditional-branches 
> -frerun-cse-after-loop -fno-tree-ccp -mavx5124fmaps -std=c99 -w" } */
> +
> +typedef unsigned __attribute__((__vector_size__ (4))) U;
> +typedef unsigned __attribute__((__vector_size__ (16))) V;
> +typedef unsigned __attribute__((__vector_size__ (64))) W;
> +
> +int x, y;
> +
> +V v;
> +W w;
> +
> +inline
> +int bar (U a)
> +{
> +  a |= x;
> +  W k =
> +__builtin_shufflevector (v, 5 / a,
> +  2, 4, 0, 2, 4, 1, 0, 1,
> +  1, 2, 1, 3, 0, 4, 4, 0);
> +  w = k;
> +  y = 0;
> +}
> +
> +int
> +foo ()
> +{
> +  bar ((U){0x});
> +  for (unsigned i; i < sizeof (foo);)
> +;
> +}
> +
> 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 9040

[PATCH]middle-end cse: Make sure duplicate elements are not entered into the equivalence set [PR103404]

2021-11-28 Thread Tamar Christina via Gcc-patches
Hi All,

CSE uses equivalence classes to keep track of expressions that all have the same
values at the current point in the program.

Normal equivalences through SETs only insert and perform lookups in this set but
equivalence determined from comparisons, e.g.

(insn 46 44 47 7 (set (reg:CCZ 17 flags)
(compare:CCZ (reg:SI 105 [ iD.2893 ])
(const_int 0 [0]))) "cse.c":18:22 7 {*cmpsi_ccno_1}
 (expr_list:REG_DEAD (reg:SI 105 [ iD.2893 ])
(nil)))

creates the equivalence EQ on (reg:SI 105 [ iD.2893 ]) and (const_int 0 [0]).

This causes a merge to happen between the two equivalence sets denoted by
(const_int 0 [0]) and (reg:SI 105 [ iD.2893 ]) respectively.

The operation happens through merge_equiv_classes however this function has an
invariant that the classes to be merge not contain any duplicates.  This is
because it frees entries before merging.

The given testcase when using the supplied flags trigger an ICE due to the
equivalence set being

(rr) p dump_class (class1)
Equivalence chain for (reg:SI 105 [ iD.2893 ]):
(reg:SI 105 [ iD.2893 ])
$3 = void

(rr) p dump_class (class2)
Equivalence chain for (const_int 0 [0]):
(const_int 0 [0])
(reg:SI 97 [ _10 ])
(reg:SI 97 [ _10 ])
$4 = void

This happens because the original INSN being recorded is

(insn 18 17 24 2 (set (subreg:V1SI (reg:SI 97 [ _10 ]) 0)
(const_vector:V1SI [
(const_int 0 [0])
])) "cse.c":11:9 1363 {*movv1si_internal}
 (expr_list:REG_UNUSED (reg:SI 97 [ _10 ])
(nil)))

and we end up generating two equivalences. the first one is simply that
reg:SI 97 is 0.  The second one is that 0 can be extracted from the V1SI, so
subreg (subreg:V1SI (reg:SI 97) 0) 0 == 0.  This nested subreg gets folded away
to just reg:SI 97 and we re-insert the same equivalence.

This patch changes it so that once we figure out the bucket to insert into we
check if the equivalence set already contains the entry and if so just return
the existing entry and exit.

Bootstrapped Regtested on aarch64-none-linux-gnu,
x86_64-pc-linux-gnu and no regressions.


Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

PR rtl-optimization/103404
* cse.c (insert_with_costs): Check if item exists already before adding
a new entry in the equivalence class.

gcc/testsuite/ChangeLog:

PR rtl-optimization/103404
* gcc.target/i386/pr103404.c: New test.

--- inline copy of patch -- 
diff --git a/gcc/cse.c b/gcc/cse.c
index 
c1c7d0ca27b73c4b944b4719f95fece74e0358d5..08295246c594109e947276051c6776e4cabca4ec
 100644
--- a/gcc/cse.c
+++ b/gcc/cse.c
@@ -1537,6 +1537,17 @@ insert_with_costs (rtx x, struct table_elt *classp, 
unsigned int hash,
   if (REG_P (x) && REGNO (x) < FIRST_PSEUDO_REGISTER)
 add_to_hard_reg_set (&hard_regs_in_table, GET_MODE (x), REGNO (x));
 
+  /* We cannot allow a duplicate to be entered into the equivalence sets
+ and so we should perform a check before we do any allocations or
+ change the buckets.  */
+  if (classp)
+{
+  struct table_elt *p;
+  for (p = classp; p; p = p->next_same_value)
+   if (exp_equiv_p (p->exp, x, 1, false))
+ return p;
+}
+
   /* Put an element for X into the right hash bucket.  */
 
   elt = free_element_chain;
diff --git a/gcc/testsuite/gcc.target/i386/pr103404.c 
b/gcc/testsuite/gcc.target/i386/pr103404.c
new file mode 100644
index 
..66f33645301db09503fc0977fd0f061a19e56ea5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr103404.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-Og -fcse-follow-jumps -fno-dce 
-fno-early-inlining -fgcse -fharden-conditional-branches -frerun-cse-after-loop 
-fno-tree-ccp -mavx5124fmaps -std=c99 -w" } */
+
+typedef unsigned __attribute__((__vector_size__ (4))) U;
+typedef unsigned __attribute__((__vector_size__ (16))) V;
+typedef unsigned __attribute__((__vector_size__ (64))) W;
+
+int x, y;
+
+V v;
+W w;
+
+inline
+int bar (U a)
+{
+  a |= x;
+  W k =
+__builtin_shufflevector (v, 5 / a,
+2, 4, 0, 2, 4, 1, 0, 1,
+1, 2, 1, 3, 0, 4, 4, 0);
+  w = k;
+  y = 0;
+}
+
+int
+foo ()
+{
+  bar ((U){0x});
+  for (unsigned i; i < sizeof (foo);)
+;
+}
+


-- 
diff --git a/gcc/cse.c b/gcc/cse.c
index c1c7d0ca27b73c4b944b4719f95fece74e0358d5..08295246c594109e947276051c6776e4cabca4ec 100644
--- a/gcc/cse.c
+++ b/gcc/cse.c
@@ -1537,6 +1537,17 @@ insert_with_costs (rtx x, struct table_elt *classp, unsigned int hash,
   if (REG_P (x) && REGNO (x) < FIRST_PSEUDO_REGISTER)
 add_to_hard_reg_set (&hard_regs_in_table, GET_MODE (x), REGNO (x));
 
+  /* We cannot allow a duplicate to be entered into the equivalence sets
+ and so we should perform a check before we do any allocations or
+ change the buckets.  */
+  if (classp)
+{
+  struct table_elt *p;
+  for (p = classp; p; p = p->next_same_value)
+	if (exp_equiv_p (p->exp, x,