[Bug tree-optimization/121405] [13/14/15/16 Regression] Another missed VN via a copy (but via an int copy)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121405

--- Comment #10 from GCC Commits ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:53f491ccd1e59fad77fb2cb30d1a58b9e5e5f63c

commit r16-3066-g53f491ccd1e59fad77fb2cb30d1a58b9e5e5f63c
Author: Richard Biener
Date:   Wed Aug 6 12:31:13 2025 +0200

    tree-optimization/121405 - missed VN with aggregate copy

    The following handles value-numbering of a BIT_FIELD_REF of a register
    that's defined by a load by looking up a subset load, similar to how we
    handle bit-and masked loads.  This allows the testcase to be simplified
    by two FRE passes; the first one will create the BIT_FIELD_REF.

            PR tree-optimization/121405
            * tree-ssa-sccvn.cc (visit_nary_op): Handle BIT_FIELD_REF
            with reference def by looking up a combination of both.

            * gcc.dg/tree-ssa/ssa-fre-107.c: New testcase.
            * gcc.target/i386/pr90579.c: Adjust.
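The new testcase gcc.dg/tree-ssa/ssa-fre-107.c is not shown in this thread; the following is a hedged, invented approximation (the function name `copy_then_read` and the exact struct layout are assumptions) of the pattern the commit message describes: an aggregate copy followed by a byte-wise read that FRE should value-number back to the originally stored scalar, eliding the copy.

```c
#include <assert.h>
#include <string.h>

struct vec_char_16 { char raw[2]; };

/* Store two scalars, copy the aggregate through a second object, then
   read one byte back.  FRE should fold the final read straight to t1,
   making the copy dead.  */
char
copy_then_read (char t0, char t1)
{
  struct vec_char_16 a, b;
  a.raw[0] = t0;
  a.raw[1] = t1;
  memcpy (&b, &a, sizeof b);   /* typically lowered to one integer-typed copy */
  return b.raw[1];             /* should simplify to t1 */
}
```

Regardless of whether the compiler performs the folding, the function's observable behavior is simply to return its second argument.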
--- Comment #9 from Richard Biener ---
Hmm, so it's FRE1 that replaces
a_20 = MEM[(struct vec_char_16 *)&D.3024];
_5 = a_20;
MEM [(char * {ref-all})&b] = _5;
_1 = b.raw[0];
with
a_20 = MEM[(struct vec_char_16 *)&D.3024];
MEM [(char * {ref-all})&b] = a_20;
_15 = BIT_FIELD_REF ;
via
Value numbering stmt = _1 = b.raw[0];
Inserting name _12 for expression BIT_FIELD_REF
Setting value number of _1 to _12 (changed)
so handling BIT_FIELD_REFs as proposed then only works in FRE2:
Value numbering stmt = _7 = BIT_FIELD_REF ;
Setting value number of _7 to t1_5(D) (changed)
That means in the

  /* 4) Assignment from an SSA name which definition we may be able
        to access pieces from or we can combine to a larger entity.  */

case, where we end up creating the BIT_FIELD_REF, we could try to handle
it like an aggregate copy.

Meanwhile the following handles it in FRE2 (or when there's a
BIT_FIELD_REF already):
diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index 00315d154e4..6b859de6ba9 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -5633,6 +5633,27 @@ visit_nary_op (tree lhs, gassign *stmt)
}
}
break;
+    case BIT_FIELD_REF:
+      if (TREE_CODE (TREE_OPERAND (rhs1, 0)) == SSA_NAME)
+        {
+          tree op0 = TREE_OPERAND (rhs1, 0);
+          gassign *ass = dyn_cast <gassign *> (SSA_NAME_DEF_STMT (op0));
+          if (ass
+              && !gimple_has_volatile_ops (ass)
+              && vn_get_stmt_kind (ass) == VN_REFERENCE)
+            {
+              tree last_vuse = gimple_vuse (ass);
+              tree op = build3 (BIT_FIELD_REF, TREE_TYPE (rhs1),
+                                gimple_assign_rhs1 (ass),
+                                TREE_OPERAND (rhs1, 1), TREE_OPERAND (rhs1, 2));
+              tree result = vn_reference_lookup (op, gimple_vuse (ass),
+                                                 default_vn_walk_kind,
+                                                 NULL, true, &last_vuse);
+              if (result)
+                return set_ssa_val_to (lhs, result);
+            }
+        }
+      break;
case TRUNC_DIV_EXPR:
if (TYPE_UNSIGNED (type))
break;
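Loosely, the lookup this patch performs can be pictured at the source level. The following is an invented sketch (the function name `high_byte` is an assumption, and whether the extraction is actually represented as a BIT_FIELD_REF depends on the target and pass pipeline): a register defined by a load has one of its bytes extracted, and the patch re-expresses that extraction as a subset load from the original memory reference and looks it up.

```c
#include <assert.h>

/* Load a 16-bit value, then extract its upper byte.  In GIMPLE terms the
   extraction of the loaded register may appear as BIT_FIELD_REF <a, 8, 8>;
   the patch looks that up as a combined subset load of *p instead.  */
unsigned char
high_byte (const unsigned short *p)
{
  unsigned short a = *p;   /* a_20 = MEM[...]: register defined by a load */
  return a >> 8;           /* extract bits 8..15 of the loaded value */
}
```

At the C level this is just a value computation, so the result is independent of byte order.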
Richard Biener changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
   Last reconfirmed|                             |2025-08-06
           Assignee|unassigned at gcc dot gnu.org|rguenth at gcc dot gnu.org
             Status|UNCONFIRMED                  |ASSIGNED
     Ever confirmed|0                            |1
Richard Biener changed:

           What    |Removed|Added
----------------------------------------------------------------------------
                 CC|       |rguenth at gcc dot gnu.org

--- Comment #8 from Richard Biener ---
Note I particularly dislike the hoops one has to jump through to allow
simplification to pick up the to-be-inserted stmts, given those
representations do not have a leader.  That seems excessively costly.

Another (costly) possibility would be to look up the BIT_FIELD_REF as a
load itself, by composing it with its SSA def op0.  That would side-step
the intermediate CTOR "value".
--- Comment #7 from Richard Biener ---
Created attachment 62054
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=62054&action=edit
another related old patch
--- Comment #6 from Richard Biener ---
Created attachment 62053
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=62053&action=edit
old patch

I think this is the one, originally done for PR92645; there's also a
related one for PR93507.
--- Comment #5 from Richard Biener ---
:
MEM [(struct vec_char_16 *)&D.3024] = t0_7(D);
MEM [(struct vec_char_16 *)&D.3024 + 1B] = t1_8(D);
a_20 = MEM[(struct vec_char_16 *)&D.3024];
MEM [(char * {ref-all})&b] = a_20;
_15 = BIT_FIELD_REF ;
So this isn't a "copy", but rather what we miss is to record a value
for the compound load by MEM[(struct vec_char_16 *)&D.3024]. I do
have some old patches that build up a { t0_7(D), t1_8(D) } CTOR,
marked for insertion, and value-number to that.  With this,
BIT_FIELD_REF folding on that should work.  But I chickened out
because actually inserting such CTOR isn't always profitable.
We do this dance for vector constants.
Andrew Pinski changed:

           What              |Removed|Added
----------------------------------------------------------------------------
Attachment #62046 is obsolete|0      |1
Attachment #62047 is obsolete|0      |1

--- Comment #4 from Andrew Pinski ---
Created attachment 62049
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=62049&action=edit
This version shows the problem on most targets (except for strict-alignment
targets)
--- Comment #3 from Andrew Pinski ---
Created attachment 62047
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=62047&action=edit
Reduced using int rather than char

Combine in GCC 15+ is not able to do the combine in this case.  Instead
we get:
        salq    $32, %rsi
        shrq    $32, %rsi
        leal    (%rdi,%rsi), %eax
        ret
which should be just the leal.
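The attached reduced testcase is not inlined in the thread. As a hedged illustration of what the asm above computes: the salq/shrq pair merely zero-extends the low 32 bits of the second operand, so the whole sequence is semantically equivalent to the following invented function (`add_low32` is not from the attachment), which ideally compiles to the single leal.

```c
#include <assert.h>

/* Add the low 32 bits of b (zero-extended) to a; the truncating cast is
   what the shift-left/shift-right-by-32 pair in the asm implements.  */
unsigned int
add_low32 (unsigned int a, unsigned long b)
{
  return a + (unsigned int) b;
}
```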
--- Comment #2 from Andrew Pinski ---
This is the .optimized dump from highway where I saw this:
```
  MEM [(struct Vec128 *)&D.741237] = _4040;
  MEM [(struct Vec128 *)&D.741237 + 1B] = _4047;
  MEM [(struct Vec128 *)&D.741237 + 2B] = _4054;
  MEM [(struct Vec128 *)&D.741237 + 3B] = _4061;
  MEM [(struct Vec128 *)&D.741237 + 4B] = _4068;
  MEM [(struct Vec128 *)&D.741237 + 5B] = _4075;
  MEM [(struct Vec128 *)&D.741237 + 6B] = _4082;
  MEM [(struct Vec128 *)&D.741237 + 7B] = _4089;
  MEM [(struct Vec128 *)&D.741237 + 8B] = _4096;
  MEM [(struct Vec128 *)&D.741237 + 9B] = _4103;
  MEM [(struct Vec128 *)&D.741237 + 10B] = _4110;
  MEM [(struct Vec128 *)&D.741237 + 11B] = _4117;
  MEM [(struct Vec128 *)&D.741237 + 12B] = _4124;
  MEM [(struct Vec128 *)&D.741237 + 13B] = _4131;
  MEM [(struct Vec128 *)&D.741237 + 14B] = _4138;
  MEM [(struct Vec128 *)&D.741237 + 15B] = _4145;
  SR.14270_2654 = MEM[(struct Vec128 *)&D.741237];
...
  _3521 = (unsigned char) SR.14270_2654;
  _3522 = -_3521;
  _3523 = (signed char) _3522;
  a.raw[0] = _3523;
  _5255 = BIT_FIELD_REF ;
  _3531 = (unsigned char) _5255;
  _3532 = -_3531;
  _3533 = (signed char) _3532;
  a.raw[1] = _3533;
```
Andrew Pinski changed:

           What    |Removed                   |Added
----------------------------------------------------------------------------
   Target Milestone|---                       |13.5
      Known to fail|                          |5.1.0
            Summary|Another missed VN via a   |[13/14/15/16 Regression]
                   |copy (but via an int copy)|Another missed VN via a
                   |                          |copy (but via an int copy)
           Severity|enhancement               |normal
      Known to work|                          |4.9.4

--- Comment #1 from Andrew Pinski ---
So when memcpy started to be expanded using an integer-typed copy, the
optimization started to be missed.  If the vectorizer creates a CONSTRUCTOR
here, then GCC is able to optimize it (either with -fno-vect-cost-model or
on aarch64).
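As a hedged, invented sketch of the lowering comment #1 describes (the name `copy_via_int` and the 4-byte struct are assumptions, not the reporter's testcase): a small memcpy is expanded as a single integer-typed copy, and value numbering must then see through that integer copy to fold the later element read.

```c
#include <assert.h>
#include <string.h>

struct s4 { char raw[4]; };

/* The 4-byte memcpy is typically expanded as one 32-bit integer move;
   the subsequent byte read should still fold to a.raw[2] directly.  */
char
copy_via_int (struct s4 a)
{
  struct s4 b;
  memcpy (&b, &a, 4);
  return b.raw[2];
}
```

Whether or not the fold happens, the function simply returns the third element of its argument.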
