https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117079
--- Comment #4 from Christoph Müllner ---
The reason that we don't have "MEM " in the dump
anymore is that we now have "MEM ".
Further, the size of the function in the test case shrinks from 225
instructions down to 109 (almost all vector instructions).
I tried to measure a performance difference on my AMD Ryzen 9 5950X
(-march=native) by calling the test function four times in a loop with
1024l * 1024 * 1024 * 1024 (i.e. 2^40) iterations.
However, I did not see enough evidence to claim that the new code is better
(memory bandwidth is probably the limit):
* old: 4m34.405s, 4m47.825s, 4m38.187s
* new: 4m34.722s, 4m34.936s, 4m34.922s
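For anyone who wants to repeat this kind of measurement, here is a minimal sketch of a harness with the same shape (four calls per loop iteration). It is not the code I ran: kernel() is a hypothetical stand-in for the test-case function (a small alternating add/subtract pattern, chosen only because the code path in question deals with two-operator SLP nodes), and run_benchmark() and the input values are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for the test-case function: alternating
   add/subtract across lanes.  noinline keeps the calls in the loop.
   Unsigned arithmetic avoids signed-overflow UB.  */
__attribute__ ((noinline)) void
kernel (uint32_t *restrict out, const uint32_t *restrict a,
        const uint32_t *restrict b)
{
  out[0] = a[0] + b[0];
  out[1] = a[1] - b[1];
  out[2] = a[2] + b[2];
  out[3] = a[3] - b[3];
}

/* Call the kernel four times per iteration and accumulate a checksum
   so the results are observably used.  */
uint64_t
run_benchmark (uint64_t iterations)
{
  uint32_t a[4] = { 5, 6, 7, 8 };
  uint32_t b[4] = { 1, 2, 3, 4 };
  uint32_t out[4] = { 0, 0, 0, 0 };
  uint64_t checksum = 0;

  for (uint64_t i = 0; i < iterations; i++)
    for (int call = 0; call < 4; call++)
      {
        kernel (out, a, b);
        checksum += (uint64_t) out[0] + out[1] + out[2] + out[3];
      }

  return checksum;
}
```

Calling run_benchmark (1024l * 1024 * 1024 * 1024) from main () and timing the binary once per compiler build would mirror the setup described above. A kernel this small works on data that stays in cache, so it will not reproduce a memory-bandwidth limit.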
I propose fixing the failing test case by adjusting its test condition.
A patch for that is on the list:
https://gcc.gnu.org/pipermail/gcc-patches/2025-January/673551.html
FWIW, here is a small code change that will bring back the old behavior for
analysis:
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -2595,7 +2595,7 @@ out:
       auto_vec two_op_perm_indices[2];
       vec two_op_scalar_stmts[2] = {vNULL, vNULL};
-      if (two_operators && oprnds_info.length () == 2 && group_size > 2)
+      if (false && two_operators && oprnds_info.length () == 2 && group_size > 2)
         {
           unsigned idx = 0;
           hash_map seen;