[Bug tree-optimization/122797] [16 Regression] ffmpeg is miscompiled since r16-5115

2025-11-24 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122797

Sam James  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #10 from Sam James  ---
Fixed.

[Bug tree-optimization/122797] [16 Regression] ffmpeg is miscompiled since r16-5115

2025-11-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122797

--- Comment #9 from GCC Commits  ---
The master branch has been updated by Robin Dapp :

https://gcc.gnu.org/g:54e9c0be90fa87767d57ff682e044959feb754f2

commit r16-5562-g54e9c0be90fa87767d57ff682e044959feb754f2
Author: Robin Dapp 
Date:   Sat Nov 22 20:53:25 2025 +0100

vect: Use start value in vect_load_perm_consecutive_p [PR122797].

vect_load_perm_consecutive_p is used in a spot where we want to check
that a permutation is consecutive and starting with 0.  Originally I
wanted to add this way of checking to the function but what I ended
up with is checking whether the given permutation is consecutive
starting from a certain index.  Thus, we will return true for
e.g. {1, 2, 3} which doesn't make sense in the context of the PR.
This patch corrects it.

PR tree-optimization/122797

gcc/ChangeLog:

* tree-vect-slp.cc (vect_load_perm_consecutive_p): Check
permutation start at element 0 with value instead of starting
at a given element.
(vect_optimize_slp_pass::remove_redundant_permutations):
Use start value of 0.
* tree-vectorizer.h (vect_load_perm_consecutive_p): Set default
value to UINT_MAX.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr122797.c: New test.
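
For illustration, the value-based semantics described in the ChangeLog can be modelled on a plain array. This is a minimal sketch and not the code from tree-vect-slp.cc: the name `consecutive_from_value`, the array-plus-length interface, and the choice to treat an empty permutation as not consecutive are all assumptions made for the example. Only the `UINT_MAX` default ("any start value") and the start value of 0 used by the caller come from the commit message.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a value-based consecutiveness check.
   START_VAL == UINT_MAX means "accept any start value"; otherwise the
   permutation must begin with START_VAL and increase by one per element.
   An empty permutation is treated as not consecutive (sketch assumption).  */
bool
consecutive_from_value (const unsigned *perm, size_t len, unsigned start_val)
{
  assert (perm != NULL || len == 0);
  if (len == 0)
    return false;

  /* With the sentinel, take the start value from the first element.  */
  unsigned start = (start_val == UINT_MAX) ? perm[0] : start_val;

  for (size_t i = 0; i < len; i++)
    if (perm[i] != start + i)
      return false;
  return true;
}
```

With this model, `{1, 2, 3}` is rejected when the start value is 0 but accepted with `UINT_MAX` or with a start value of 1, while `{0, 1, 2}` is accepted with a start value of 0.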

[Bug tree-optimization/122797] [16 Regression] ffmpeg is miscompiled since r16-5115

2025-11-22 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122797

--- Comment #8 from Sam James  ---
(In reply to Robin Dapp from comment #7)
> Any specific options for x86 apart from -O3?

-O2 (or -O3) is enough. Thanks Robin!

[Bug tree-optimization/122797] [16 Regression] ffmpeg is miscompiled since r16-5115

2025-11-22 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122797

Robin Dapp  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |rdapp at gcc dot gnu.org
 Status|NEW |ASSIGNED

[Bug tree-optimization/122797] [16 Regression] ffmpeg is miscompiled since r16-5115

2025-11-22 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122797

--- Comment #7 from Robin Dapp  ---
Also fails on riscv with e.g.

-O3 -march=rv64gcbv_zvbb_zvl512b -DQEMU -mtune=generic-ooo -mmax-vectorization
--param=riscv-autovec-mode=V4QI

Any specific options for x86 apart from -O3?

The issue is pretty simple/stupid and I believe is the whole reason why I added
the second parameter of

vect_load_perm_consecutive_p (slp_tree node, unsigned start_idx)

in the first place:

@@ -7855,15 +7867,7 @@ vect_optimize_slp_pass::remove_redundant_permutations ()
   else
{
  loop_vec_info loop_vinfo = as_a <loop_vec_info> (m_vinfo);
- stmt_vec_info load_info;
- bool this_load_permuted = false;
- unsigned j;
- FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), j, load_info)
-   if (SLP_TREE_LOAD_PERMUTATION (node)[j] != j)
- {
-   this_load_permuted = true;
-   break;
- }
+ bool this_load_permuted = !vect_load_perm_consecutive_p (node);

This is obviously wrong: what I wanted was for start_idx to be a start value,
not a start index.  It usually doesn't help to know that a permutation is
consecutive from somewhere in the middle.  What is helpful, though, is to know
that it is consecutive starting with value x.

Testing a patch.
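
The mismatch between the removed loop and the index-based helper can be sketched on plain arrays. This is only a model under stated assumptions: `is_identity_perm` mirrors the open-coded `perm[j] != j` loop from the hunk above, while `consecutive_from_index` is a hypothetical stand-in for a check that starts at a given index; neither is the actual GCC code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Model of the open-coded check removed by the hunk: the permutation is
   the identity only if every element equals its own position.  */
bool
is_identity_perm (const unsigned *perm, size_t len)
{
  assert (perm != NULL || len == 0);
  for (size_t j = 0; j < len; j++)
    if (perm[j] != j)
      return false;
  return true;
}

/* Hypothetical model of an index-based check: it only verifies that the
   values increase by one from position START_IDX onwards, whatever the
   value stored at that position is.  */
bool
consecutive_from_index (const unsigned *perm, size_t len, size_t start_idx)
{
  assert (perm != NULL || len == 0);
  if (len == 0 || start_idx >= len)
    return false;
  for (size_t i = start_idx + 1; i < len; i++)
    if (perm[i] != perm[start_idx] + (i - start_idx))
      return false;
  return true;
}
```

For `{1, 2, 3}`, `is_identity_perm` returns false (the load is permuted), but `consecutive_from_index` with a start index of 0 returns true, so negating it would wrongly report the load as not permuted.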

[Bug tree-optimization/122797] [16 Regression] ffmpeg is miscompiled since r16-5115

2025-11-22 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122797

Sam James  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2025-11-22

--- Comment #6 from Sam James  ---
Confirmed (and thanks, I was chasing this down myself and ended up hitting a
different problem -- PR122793).

[Bug tree-optimization/122797] [16 Regression] ffmpeg is miscompiled since r16-5115

2025-11-22 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122797

--- Comment #5 from Sam James  ---
Created attachment 62880
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=62880&action=edit
abort.c

[Bug tree-optimization/122797] [16 Regression] ffmpeg is miscompiled since r16-5115

2025-11-22 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122797

Sam James  changed:

   What|Removed |Added

   Keywords|needs-reduction |

--- Comment #4 from Sam James  ---
Thank you!

[Bug tree-optimization/122797] [16 Regression] ffmpeg is miscompiled since r16-5115

2025-11-22 Thread kasper93 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122797

--- Comment #3 from Kacper Michajłow  ---
I've reduced the above function to a standalone testcase. You can compare the
resulting value with any other working compiler. {src,dst}_stride are set to 0
in global variables, just so things are not optimized out.

```
#include <stdio.h>

/* Strides live in globals so the compiler cannot fold the loop away. */
int src_stride = 0;
int dst_stride = 0;

int main() {
    char src[12] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
    char dst[16];
    char *s = src;
    char *d = dst;
    for (int i = 0; i < 2; i++) {
        /* Each output is the sum of five consecutive source bytes. */
        d[0] = s[0] + s[1] + s[2] + s[3] + s[4];
        d[1] = s[1] + s[2] + s[3] + s[4] + s[5];
        d[2] = s[2] + s[3] + s[4] + s[5] + s[6];
        d[3] = s[3] + s[4] + s[5] + s[6] + s[7];
        d[4] = s[4] + s[5] + s[6] + s[7] + s[8];
        d[5] = s[5] + s[6] + s[7] + s[8] + s[9];
        d[6] = s[6] + s[7] + s[8] + s[9] + s[10];
        d[7] = s[7] + s[8] + s[9] + s[10] + s[11];
        s += src_stride;
        d += dst_stride;
    }
    printf("%d", d[0]);
}
```
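
As a cross-check that does not depend on a second compiler, the expected outputs can be computed with a plain loop. This is a minimal sketch; the helper name `ref_sum5` is an assumption made for the example and does not appear in the attachment. With the input above, the value for d[0] is 1 + 2 + 3 + 4 + 5 = 15.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference helper: the value the test case stores in d[k],
   i.e. the sum of five consecutive source bytes starting at s[k],
   converted to char as in the original assignments.  */
char
ref_sum5 (const char *s, size_t k)
{
  assert (s != NULL);
  int sum = 0;
  for (size_t t = 0; t < 5; t++)
    sum += s[k + t];
  return (char) sum;
}
```

Calling `ref_sum5 (src, 0)` on the `src` array from the test case gives 15, which is what a correctly compiled build of the reproducer should print.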

[Bug tree-optimization/122797] [16 Regression] ffmpeg is miscompiled since r16-5115

2025-11-21 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122797

Jeffrey A. Law  changed:

   What|Removed |Added

   Priority|P3  |P1

[Bug tree-optimization/122797] [16 Regression] ffmpeg is miscompiled since r16-5115

2025-11-21 Thread kasper93 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122797

--- Comment #2 from Kacper Michajłow  ---
Created attachment 62877
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=62877&action=edit
min_avg_cavs_filt8_hv_ii.c

I've extracted one of the affected functions, see min_avg_cavs_filt8_hv_ii.c.
Hopefully this will be easier for you to start off from.