[Bug tree-optimization/97043] latent wrong-code with SLP vectorization

2022-02-18 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043

--- Comment #4 from CVS Commits  ---
The releases/gcc-9 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:e75e5d2c41d294c4da4adfe610204ce5d97c3a4e

commit r9-9959-ge75e5d2c41d294c4da4adfe610204ce5d97c3a4e
Author: Richard Biener 
Date:   Mon Sep 14 11:25:04 2020 +0200

tree-optimization/97043 - fix latent wrong-code with SLP vectorization

When the unrolling decision comes late and would have prevented
eliding a SLP load permutation we can end up generating aligned
loads when the load is in fact unaligned.  Most of the time
alignment analysis figures out the load is in fact unaligned
but that cannot be relied upon.

The following removes the SLP load permutation eliding based on
the still premature vectorization factor.

2020-09-14  Richard Biener  

PR tree-optimization/97043
* tree-vect-slp.c (vect_analyze_slp_instance): Do not
elide a load permutation if the current vectorization
factor is one.

(cherry picked from commit e93428a8b056aed83a7678d4dc8272131ab671ba)
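
The ChangeLog entry above amounts to dropping one disjunct from the test that decides whether a load permutation may be elided (the patch itself is quoted in comment #1 of this bug). What follows is only a minimal sketch of that test before and after the change, not GCC code. The struct, the function names and the parameters are hypothetical stand-ins for DR_GROUP_SIZE, DR_GROUP_GAP and the unrolling factor, and any other conditions checked in vect_analyze_slp_instance are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one SLP load group; not a GCC data structure.  */
struct load_group {
  unsigned slp_group_size;  /* lanes used by the SLP instance */
  unsigned dr_group_size;   /* stands in for DR_GROUP_SIZE */
  unsigned dr_group_gap;    /* stands in for DR_GROUP_GAP */
};

/* The SLP instance covers the whole interleaving group and no gap
   follows it, so unrolling cannot expose a gap.  */
static bool covers_whole_group(const struct load_group *g) {
  return g->slp_group_size == g->dr_group_size && g->dr_group_gap == 0;
}

/* Before the fix: an unrolling factor of one was enough to elide,
   even though that factor could still grow later in the analysis.  */
static bool may_elide_before(const struct load_group *g,
                             unsigned unrolling_factor) {
  assert(unrolling_factor >= 1);
  return unrolling_factor == 1 || covers_whole_group(g);
}

/* After the fix: the (possibly premature) factor is ignored.  */
static bool may_elide_after(const struct load_group *g,
                            unsigned unrolling_factor) {
  assert(unrolling_factor >= 1);
  (void) unrolling_factor;
  return covers_whole_group(g);
}
```

For a group with slp_group_size 2 and dr_group_size 3, the old test allows the elision while the factor is still one and the new test does not. That is the case where later unrolling would expose the gap.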

[Bug tree-optimization/97043] latent wrong-code with SLP vectorization

2021-02-08 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043
Bug 97043 depends on bug 97236, which changed state.

Bug 97236 Summary: [8 Regression] g:e93428a8b056aed83a7678 triggers vlc miscompile
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97236

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|---                         |FIXED

[Bug tree-optimization/97043] latent wrong-code with SLP vectorization

2021-02-03 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #3 from Richard Biener  ---
Fixed, not backporting further.

[Bug tree-optimization/97043] latent wrong-code with SLP vectorization

2021-02-03 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043
Bug 97043 depends on bug 97236, which changed state.

Bug 97236 Summary: [8/9 Regression] g:e93428a8b056aed83a7678 triggers vlc miscompile
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97236

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---

[Bug tree-optimization/97043] latent wrong-code with SLP vectorization

2020-10-06 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043
Bug 97043 depends on bug 97236, which changed state.

Bug 97236 Summary: [10 Regression] g:e93428a8b056aed83a7678 triggers vlc miscompile
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97236

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

[Bug tree-optimization/97043] latent wrong-code with SLP vectorization

2020-09-14 Thread cvs-commit at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043

--- Comment #2 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:e93428a8b056aed83a7678d4dc8272131ab671ba

commit r10-8759-ge93428a8b056aed83a7678d4dc8272131ab671ba
Author: Richard Biener 
Date:   Mon Sep 14 11:25:04 2020 +0200

tree-optimization/97043 - fix latent wrong-code with SLP vectorization

When the unrolling decision comes late and would have prevented
eliding a SLP load permutation we can end up generating aligned
loads when the load is in fact unaligned.  Most of the time
alignment analysis figures out the load is in fact unaligned
but that cannot be relied upon.

The following removes the SLP load permutation eliding based on
the still premature vectorization factor.

2020-09-14  Richard Biener  

PR tree-optimization/97043
* tree-vect-slp.c (vect_analyze_slp_instance): Do not
elide a load permutation if the current vectorization
factor is one.

[Bug tree-optimization/97043] latent wrong-code with SLP vectorization

2020-09-14 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org   |rguenth at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

[Bug tree-optimization/97043] latent wrong-code with SLP vectorization

2020-09-14 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code
     Ever confirmed|0                           |1
             Blocks|                            |96522
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2020-09-14

--- Comment #1 from Richard Biener  ---
This blocks backporting the fix for PR96522, causing the gcc.dg/vect/pr81410.c
testcase to FAIL execution with an unaligned access using an aligned load.

The trunk rev. that fixed this is gbc484e250990393e887f7239157cc85ce6fadcce

A pragmatic fix might be

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index f6331eeea86..3fdf56f9335 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2309,9 +2309,8 @@ vect_analyze_slp_instance (vec_info *vinfo,
           /* The load requires permutation when unrolling exposes
              a gap either because the group is larger than the SLP
              group-size or because there is a gap between the groups.  */
-          && (known_eq (unrolling_factor, 1U)
-              || (group_size == DR_GROUP_SIZE (first_stmt_info)
-                  && DR_GROUP_GAP (first_stmt_info) == 0)))
+          && group_size == DR_GROUP_SIZE (first_stmt_info)
+          && DR_GROUP_GAP (first_stmt_info) == 0)
         {
           SLP_TREE_LOAD_PERMUTATION (load_node).release ();
           continue;

with biggest effects eventually on load-lane targets (arm/aarch64) where
we then eventually prefer more of those.  For the testcase in question
we then generate the following, matching trunk

movdqa  (%rdx), %xmm2
movdqa  16(%rdx), %xmm0
shufpd  $1, 32(%rdx), %xmm0

instead of

movdqa  (%rdx), %xmm1
addq    $48, %rdx
movdqu  -24(%rdx), %xmm2

(or with the backport of PR96522 a wrong movdqa in place of the movdqu).
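
To see why an aligned load is wrong in the second sequence, note that movdqu -24(%rdx) runs after addq $48, %rdx, so it reads from the old %rdx plus 24. The following is only a sketch of that address arithmetic in C. The helper names are hypothetical, and it assumes %rdx is 16-byte aligned on entry, which the preceding movdqa (%rdx) already requires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when addr is a multiple of align; align must be a power of two.  */
static bool is_aligned(uintptr_t addr, uintptr_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return (addr & (align - 1)) == 0;
}

/* Address read by `movdqu -24(%rdx)` once `addq $48, %rdx` has run,
   expressed relative to the value %rdx held before the add.  */
static uintptr_t second_load_address(uintptr_t rdx_before) {
  uintptr_t rdx_after = rdx_before + 48;
  return rdx_after - 24;
}
```

With a 16-byte aligned base such as 0x1000, second_load_address yields 0x1018, which is 8-byte aligned but not 16-byte aligned. A movdqa at that address would fault, so the load has to be movdqu.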


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96522
[Bug 96522] [9/10 Regression] Incorrect with with -O -fno-tree-pta