[Bug tree-optimization/121405] [13/14/15/16 Regression] Another missed VN via a copy (but via an int copy)

2025-08-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121405

--- Comment #10 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:53f491ccd1e59fad77fb2cb30d1a58b9e5e5f63c

commit r16-3066-g53f491ccd1e59fad77fb2cb30d1a58b9e5e5f63c
Author: Richard Biener 
Date:   Wed Aug 6 12:31:13 2025 +0200

tree-optimization/121405 - missed VN with aggregate copy

The following handles value-numbering of a BIT_FIELD_REF of
a register that's defined by a load by looking up a subset
load, similar to how we handle bit-and masked loads.  This
allows the testcase to be simplified by two FRE passes;
the first one creates the BIT_FIELD_REF.

PR tree-optimization/121405
* tree-ssa-sccvn.cc (visit_nary_op): Handle BIT_FIELD_REF
with reference def by looking up a combination of both.

* gcc.dg/tree-ssa/ssa-fre-107.c: New testcase.
* gcc.target/i386/pr90579.c: Adjust.
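
The kind of source this commit targets can be sketched roughly as follows. This is a hypothetical reduction written for illustration only, not the contents of ssa-fre-107.c or of any attachment; the names vec_char_2 and sum_after_copy are made up, and whether this exact function reproduces the miss on a given target is an assumption.

```c
#include <string.h>

/* Hypothetical stand-in for the vec_char_16 type seen in the dumps.  */
struct vec_char_2 { char raw[2]; };

/* Build a temporary element by element, copy it as a whole, then read
   the elements back.  GCC may lower the fixed-size memcpy to a single
   integer load/store pair, which is the "int copy" in the bug title.  */
int
sum_after_copy (char t0, char t1)
{
  struct vec_char_2 tmp;
  struct vec_char_2 b;
  tmp.raw[0] = t0;
  tmp.raw[1] = t1;
  memcpy (&b, &tmp, sizeof b);
  /* Ideally value numbering sees these two reads as t0 and t1.  */
  return b.raw[0] + b.raw[1];
}
```

Compiling something like this with -O2 -fdump-tree-fre-details and reading the FRE dumps is one way to check whether the element reads get replaced by t0 and t1.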

2025-08-06 Thread rguenth at gcc dot gnu.org via Gcc-bugs

--- Comment #9 from Richard Biener  ---
Hmm, so it's FRE1 that replaces

  a_20 = MEM[(struct vec_char_16 *)&D.3024];
  _5 = a_20;
  MEM  [(char * {ref-all})&b] = _5;
  _1 = b.raw[0];

with

  a_20 = MEM[(struct vec_char_16 *)&D.3024];
  MEM  [(char * {ref-all})&b] = a_20;
  _15 = BIT_FIELD_REF ;

via

Value numbering stmt = _1 = b.raw[0];
Inserting name _12 for expression BIT_FIELD_REF 
Setting value number of _1 to _12 (changed)

so handling BIT_FIELD_REFs as proposed then only works in FRE2:

Value numbering stmt = _7 = BIT_FIELD_REF ;
Setting value number of _7 to t1_5(D) (changed)


That means in the

  /* 4) Assignment from an SSA name which definition we may be able
 to access pieces from or we can combine to a larger entity.  */

case, where we end up creating the BIT_FIELD_REF, we could try to
handle it like an aggregate copy.

Meanwhile the following handles it in FRE2 (or when there's a BIT_FIELD_REF
already):

diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index 00315d154e4..6b859de6ba9 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -5633,6 +5633,27 @@ visit_nary_op (tree lhs, gassign *stmt)
}
}
   break;
+    case BIT_FIELD_REF:
+      if (TREE_CODE (TREE_OPERAND (rhs1, 0)) == SSA_NAME)
+        {
+          tree op0 = TREE_OPERAND (rhs1, 0);
+          gassign *ass = dyn_cast <gassign *> (SSA_NAME_DEF_STMT (op0));
+          if (ass
+              && !gimple_has_volatile_ops (ass)
+              && vn_get_stmt_kind (ass) == VN_REFERENCE)
+            {
+              tree last_vuse = gimple_vuse (ass);
+              tree op = build3 (BIT_FIELD_REF, TREE_TYPE (rhs1),
+                                gimple_assign_rhs1 (ass),
+                                TREE_OPERAND (rhs1, 1), TREE_OPERAND (rhs1, 2));
+              tree result = vn_reference_lookup (op, gimple_vuse (ass),
+                                                 default_vn_walk_kind,
+                                                 NULL, true, &last_vuse);
+              if (result)
+                return set_ssa_val_to (lhs, result);
+            }
+        }
+      break;
     case TRUNC_DIV_EXPR:
       if (TYPE_UNSIGNED (type))
         break;
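
The idea behind the hunk above can be illustrated with a toy model. This is not GCC code: the types toy_store and toy_load and the function toy_lookup_bit_field_ref are invented for this sketch, value numbers are plain ints, and endianness and partial-overlap handling are deliberately simplified. It only shows the composition step: a BIT_FIELD_REF of a register defined by a load is turned into a smaller load from the same memory, and that smaller load is looked up among the known stores.

```c
#include <stddef.h>

/* One recorded store: SIZE bytes at byte OFFSET hold value id VAL.  */
struct toy_store { unsigned offset, size; int val; };

/* A register defined by a load of SIZE bytes at byte OFFSET.  */
struct toy_load { unsigned offset, size; };

/* Model of BIT_FIELD_REF <reg, BITSIZE, BITPOS> where REG is defined by
   LOAD: compose both into a smaller load and look that up among the
   recorded stores.  Returns the stored value id, or -1 if unknown.  */
int
toy_lookup_bit_field_ref (const struct toy_store *stores, size_t n,
                          struct toy_load load,
                          unsigned bitsize, unsigned bitpos)
{
  /* Only byte-aligned pieces are modelled.  */
  if (bitsize == 0 || bitsize % 8 != 0 || bitpos % 8 != 0)
    return -1;
  unsigned off = load.offset + bitpos / 8;
  unsigned size = bitsize / 8;
  /* The piece must lie inside the original load.  */
  if (bitpos / 8 + size > load.size)
    return -1;
  /* Walk from the most recent store backwards.  */
  for (size_t i = n; i-- > 0; )
    {
      if (stores[i].offset == off && stores[i].size == size)
        return stores[i].val;
      /* A store that only partially covers the piece hides older ones.  */
      if (stores[i].offset < off + size
          && off < stores[i].offset + stores[i].size)
        return -1;
    }
  return -1;
}
```

With two one-byte stores at offsets 0 and 1 and a two-byte load at offset 0, asking for the 8 bits at bit position 8 finds the second store's value without ever needing a value for the two-byte load as a whole.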

2025-08-05 Thread rguenth at gcc dot gnu.org via Gcc-bugs

Richard Biener  changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
   Last reconfirmed|                              |2025-08-06
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
             Status|UNCONFIRMED                   |ASSIGNED
     Ever confirmed|0                             |1

2025-08-05 Thread rguenth at gcc dot gnu.org via Gcc-bugs

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #8 from Richard Biener  ---
Note I particularly dislike the hoops to jump through to allow simplification
to pick up the to-be-inserted stmts given those representations do not have a
leader.  That seems excessively costly.

Another (costly) possibility would be to look up the BIT_FIELD_REF as a load
itself by composing it with its SSA def op0.  That would side-step the
intermediate CTOR "value".

2025-08-05 Thread rguenth at gcc dot gnu.org via Gcc-bugs

--- Comment #7 from Richard Biener  ---
Created attachment 62054
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=62054&action=edit
another related old patch

2025-08-05 Thread rguenth at gcc dot gnu.org via Gcc-bugs

--- Comment #6 from Richard Biener  ---
Created attachment 62053
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=62053&action=edit
old patch

I think it is this one, originally done for PR92645, but there's a related
one for PR93507.

2025-08-05 Thread rguenth at gcc dot gnu.org via Gcc-bugs

--- Comment #5 from Richard Biener  ---
   :
  MEM  [(struct vec_char_16 *)&D.3024] = t0_7(D);
  MEM  [(struct vec_char_16 *)&D.3024 + 1B] = t1_8(D);
  a_20 = MEM[(struct vec_char_16 *)&D.3024];
  MEM  [(char * {ref-all})&b] = a_20;
  _15 = BIT_FIELD_REF ;

So this isn't a "copy", but rather what we miss is to record a value
for the compound load by MEM[(struct vec_char_16 *)&D.3024].  I do
have some old patches that build up a { t0_7(D), t1_8(D) } CTOR,
marked for insertion, and value-number to that.  With this,
BIT_FIELD_REF folding on that should work.  But I chickened out
because actually inserting such a CTOR isn't always profitable.
We do this dance for vector constants.
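
As a rough picture of why a CTOR value would make the BIT_FIELD_REF fold, here is a toy sketch. It is not GCC's folding code: toy_ctor and toy_fold_bit_field_ref are invented names, value numbers are plain ints, and only the case where the reference selects exactly one element is modelled.

```c
#include <stddef.h>

/* Toy CONSTRUCTOR: NELTS value ids, each ELT_BITS wide.  */
struct toy_ctor { const int *elts; size_t nelts; unsigned elt_bits; };

/* Fold BIT_FIELD_REF <ctor, BITSIZE, BITPOS> to a single element when the
   reference selects exactly one; return -1 when it does not.  */
int
toy_fold_bit_field_ref (struct toy_ctor c, unsigned bitsize, unsigned bitpos)
{
  if (c.elt_bits == 0
      || bitsize != c.elt_bits
      || bitpos % c.elt_bits != 0)
    return -1;
  size_t idx = bitpos / c.elt_bits;
  return idx < c.nelts ? c.elts[idx] : -1;
}
```

If the compound load were value-numbered to a two-element constructor of the stored values, a reference to the second 8-bit piece would fold to the second element. The cost concern raised above is about materializing such a constructor, which this sketch does not model.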

2025-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #62046|0                           |1
        is obsolete|                            |
  Attachment #62047|0                           |1
        is obsolete|                            |

--- Comment #4 from Andrew Pinski  ---
Created attachment 62049
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=62049&action=edit
This version shows the problem on most targets (except for strict align
targets)

2025-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs

--- Comment #3 from Andrew Pinski  ---
Created attachment 62047
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=62047&action=edit
Reduced using int rather than char

Combine in GCC 15+ is not able to do the combine in this case. Instead we get:
        salq    $32, %rsi
        shrq    $32, %rsi
        leal    (%rdi,%rsi), %eax
        ret

Which should be just the leal.
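
To see why the two shifts are redundant, the sequence can be modelled in C. This is only an arithmetic illustration of the instructions quoted above, with made-up function names; it is not the attached testcase. The shift pair clears the upper 32 bits of the second register, but the 32-bit result of the add depends only on the low 32 bits of both inputs, so dropping the shifts gives the same value.

```c
#include <stdint.h>

/* What the emitted sequence computes: salq/shrq clear the upper 32 bits
   of the second register, then leal adds and keeps the low 32 bits.  */
uint32_t
with_shifts (uint64_t rdi, uint64_t rsi)
{
  rsi <<= 32;
  rsi >>= 32;
  return (uint32_t) (rdi + rsi);
}

/* The same result from the add alone.  */
uint32_t
lea_only (uint64_t rdi, uint64_t rsi)
{
  return (uint32_t) (rdi + rsi);
}
```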

2025-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs

--- Comment #2 from Andrew Pinski  ---
This is the .optimized from highway where I saw this:
```
  MEM  [(struct Vec128 *)&D.741237] = _4040;
  MEM  [(struct Vec128 *)&D.741237 + 1B] = _4047;
  MEM  [(struct Vec128 *)&D.741237 + 2B] = _4054;
  MEM  [(struct Vec128 *)&D.741237 + 3B] = _4061;
  MEM  [(struct Vec128 *)&D.741237 + 4B] = _4068;
  MEM  [(struct Vec128 *)&D.741237 + 5B] = _4075;
  MEM  [(struct Vec128 *)&D.741237 + 6B] = _4082;
  MEM  [(struct Vec128 *)&D.741237 + 7B] = _4089;
  MEM  [(struct Vec128 *)&D.741237 + 8B] = _4096;
  MEM  [(struct Vec128 *)&D.741237 + 9B] = _4103;
  MEM  [(struct Vec128 *)&D.741237 + 10B] = _4110;
  MEM  [(struct Vec128 *)&D.741237 + 11B] = _4117;
  MEM  [(struct Vec128 *)&D.741237 + 12B] = _4124;
  MEM  [(struct Vec128 *)&D.741237 + 13B] = _4131;
  MEM  [(struct Vec128 *)&D.741237 + 14B] = _4138;
  MEM  [(struct Vec128 *)&D.741237 + 15B] = _4145;
  SR.14270_2654 = MEM[(struct Vec128 *)&D.741237];
...
  _3521 = (unsigned char) SR.14270_2654;
  _3522 = -_3521;
  _3523 = (signed char) _3522;
  a.raw[0] = _3523;
  _5255 = BIT_FIELD_REF ;
  _3531 = (unsigned char) _5255;
  _3532 = -_3531;
  _3533 = (signed char) _3532;
  a.raw[1] = _3533;
```

2025-08-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |13.5
      Known to fail|                            |5.1.0
            Summary|Another missed VN via a     |[13/14/15/16 Regression]
                   |copy (but via an int copy)  |Another missed VN via a
                   |                            |copy (but via an int copy)
           Severity|enhancement                 |normal
      Known to work|                            |4.9.4

--- Comment #1 from Andrew Pinski  ---
So when memcpy started being turned into a copy via an integer type, the
optimization started to be missed.

If the vectorizer creates a CONSTRUCTOR here, then GCC is able to optimize it
(either -fno-vect-cost-model or on aarch64).