[Bug tree-optimization/114413] BB SLP sub-graph merging fails to CSE nodes

2024-06-20 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114413

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |15.0
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #2 from Richard Biener  ---
This should be largely fixed now.  The missing piece that might be important in
some cases is CSE of permutes (or two-operator nodes) and of extern CTORs.

[Bug tree-optimization/114413] BB SLP sub-graph merging fails to CSE nodes

2024-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114413

--- Comment #1 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:46bb4ce4d30ab749d40f6f4cef6f1fb7c7813452

commit r15-1467-g46bb4ce4d30ab749d40f6f4cef6f1fb7c7813452
Author: Richard Biener 
Date:   Wed Jun 19 12:57:27 2024 +0200

tree-optimization/114413 - SLP CSE after permute optimization

We currently fail to re-CSE SLP nodes after optimizing permutes,
which results in inaccurate cost estimates.  For gcc.dg/vect/bb-slp-32.c
this shows up as not re-using the SLP node with the load and arithmetic
for both the store and the reduction.  The following implements
CSE by re-bst-mapping nodes as the finalization part of vect_optimize_slp.
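
To illustrate the idea of re-mapping nodes by their scalar statements, here
is a minimal standalone sketch.  It is not the code from tree-vect-slp.cc:
the Node type, the integer statement ids, and the functions cse_node,
count_reachable and run_demo are all hypothetical names made up for this
example, and it assumes that two nodes covering the same scalar statements
in the same lane order compute the same thing.  Nodes without scalar
statements are left unmerged, mirroring the limitation mentioned in the
commit message.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-in for an SLP node; this is NOT GCC's slp_tree.
struct Node {
    std::vector<int> scalar_stmts;  // ids of the scalar stmts, in lane order
    std::vector<Node*> children;
};

using Key = std::vector<int>;

// Walk the graph from n and redirect every child edge to the first node
// seen with the same scalar stmts.  Returns the canonical node for n.
Node* cse_node(Node* n, std::map<Key, Node*>& canon, std::set<Node*>& visited) {
    if (!n->scalar_stmts.empty()) {
        auto it = canon.find(n->scalar_stmts);
        if (it != canon.end())
            return it->second;  // duplicate, or n itself on a second visit
        canon.emplace(n->scalar_stmts, n);
    } else if (!visited.insert(n).second) {
        return n;  // node without scalar stmts: never merged, walked once
    }
    for (Node*& child : n->children)
        child = cse_node(child, canon, visited);
    return n;
}

// Count the distinct nodes reachable from the given roots.
std::size_t count_reachable(const std::vector<Node*>& roots) {
    std::set<Node*> seen;
    std::vector<Node*> work(roots.begin(), roots.end());
    while (!work.empty()) {
        Node* n = work.back();
        work.pop_back();
        if (!seen.insert(n).second)
            continue;
        for (Node* child : n->children)
            work.push_back(child);
    }
    return seen.size();
}

struct DemoResult {
    std::size_t before;  // reachable nodes before CSE
    std::size_t after;   // reachable nodes after CSE
    bool shared;         // do both roots use the same arithmetic node?
};

// Toy graph loosely modelled on the situation described above: a store and
// a reduction that each own a private copy of the same load + arithmetic.
DemoResult run_demo() {
    std::vector<std::unique_ptr<Node>> arena;  // owns every node
    auto make = [&](Key stmts, std::vector<Node*> kids) -> Node* {
        arena.push_back(
            std::make_unique<Node>(Node{std::move(stmts), std::move(kids)}));
        return arena.back().get();
    };

    Node* load1 = make({3, 4}, {});
    Node* add1 = make({1, 2}, {load1});
    Node* store = make({10, 11}, {add1});
    Node* load2 = make({3, 4}, {});
    Node* add2 = make({1, 2}, {load2});
    Node* reduc = make({20}, {add2});

    std::vector<Node*> roots = {store, reduc};
    DemoResult r;
    r.before = count_reachable(roots);

    std::map<Key, Node*> canon;  // one map shared across all roots
    std::set<Node*> visited;
    for (Node* root : roots)
        cse_node(root, canon, visited);

    r.after = count_reachable(roots);
    r.shared = store->children[0] == reduc->children[0];
    return r;
}
```

On this toy graph run_demo() reports 6 reachable nodes before and 4 after,
with the store and the reduction sharing one arithmetic node.  Sharing a
single map across all roots is what lets the second sub-graph find the
nodes of the first.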

I've tried to make the CSE part of permute materialization but it
isn't a very good fit there.  I've not bothered to implement something
more complete, also handling external defs or defs without
SLP_TREE_SCALAR_STMTS.

I realize this might result in more BB SLP, which in turn might slow
down code, given that costing for BB SLP is difficult (even the fact that
we now vectorize gcc.dg/vect/bb-slp-32.c on x86_64 might not be a good
idea).  This is nevertheless feeding more accurate info to costing, which
is good.

PR tree-optimization/114413
* tree-vect-slp.cc (release_scalar_stmts_to_slp_tree_map):
New function, split out from ...
(vect_analyze_slp): ... here.  Call it.
(vect_cse_slp_nodes): New function.
(vect_optimize_slp): Call it.

* gcc.dg/vect/bb-slp-32.c: Expect CSE and vectorization on x86.

[Bug tree-optimization/114413] BB SLP sub-graph merging fails to CSE nodes

2024-06-19 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114413

Richard Biener  changed:

           What    |Removed                       |Added
------------------------------------------------------------------------------
             Status|UNCONFIRMED                   |ASSIGNED
     Ever confirmed|0                             |1
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
   Last reconfirmed|                              |2024-06-19