[Bug tree-optimization/97043] latent wrong-code with SLP vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043

--- Comment #4 from CVS Commits ---
The releases/gcc-9 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:e75e5d2c41d294c4da4adfe610204ce5d97c3a4e

commit r9-9959-ge75e5d2c41d294c4da4adfe610204ce5d97c3a4e
Author: Richard Biener
Date:   Mon Sep 14 11:25:04 2020 +0200

    tree-optimization/97043 - fix latent wrong-code with SLP vectorization

    When the unrolling decision comes late and would have prevented
    eliding a SLP load permutation we can end up generating aligned
    loads when the load is in fact unaligned.  Most of the time
    alignment analysis figures out the load is in fact unaligned but
    that cannot be relied upon.

    The following removes the SLP load permutation eliding based on
    the still premature vectorization factor.

    2020-09-14  Richard Biener

            PR tree-optimization/97043
            * tree-vect-slp.c (vect_analyze_slp_instance): Do not
            elide a load permutation if the current vectorization factor
            is one.

    (cherry picked from commit e93428a8b056aed83a7678d4dc8272131ab671ba)
[Bug tree-optimization/97043] latent wrong-code with SLP vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043
Bug 97043 depends on bug 97236, which changed state.

Bug 97236 Summary: [8 Regression] g:e93428a8b056aed83a7678 triggers vlc miscompile
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97236

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|---                         |FIXED
[Bug tree-optimization/97043] latent wrong-code with SLP vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #3 from Richard Biener ---
Fixed, not backporting further.
[Bug tree-optimization/97043] latent wrong-code with SLP vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043
Bug 97043 depends on bug 97236, which changed state.

Bug 97236 Summary: [8/9 Regression] g:e93428a8b056aed83a7678 triggers vlc miscompile
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97236

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---
[Bug tree-optimization/97043] latent wrong-code with SLP vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043
Bug 97043 depends on bug 97236, which changed state.

Bug 97236 Summary: [10 Regression] g:e93428a8b056aed83a7678 triggers vlc miscompile
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97236

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED
[Bug tree-optimization/97043] latent wrong-code with SLP vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043

--- Comment #2 from CVS Commits ---
The releases/gcc-10 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:e93428a8b056aed83a7678d4dc8272131ab671ba

commit r10-8759-ge93428a8b056aed83a7678d4dc8272131ab671ba
Author: Richard Biener
Date:   Mon Sep 14 11:25:04 2020 +0200

    tree-optimization/97043 - fix latent wrong-code with SLP vectorization

    When the unrolling decision comes late and would have prevented
    eliding a SLP load permutation we can end up generating aligned
    loads when the load is in fact unaligned.  Most of the time
    alignment analysis figures out the load is in fact unaligned but
    that cannot be relied upon.

    The following removes the SLP load permutation eliding based on
    the still premature vectorization factor.

    2020-09-14  Richard Biener

            PR tree-optimization/97043
            * tree-vect-slp.c (vect_analyze_slp_instance): Do not
            elide a load permutation if the current vectorization factor
            is one.
[Bug tree-optimization/97043] latent wrong-code with SLP vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
             Status|NEW                         |ASSIGNED
[Bug tree-optimization/97043] latent wrong-code with SLP vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code
     Ever confirmed|0                           |1
             Blocks|                            |96522
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2020-09-14

--- Comment #1 from Richard Biener ---
This blocks backporting the fix for PR96522, causing the
gcc.dg/vect/pr81410.c testcase to FAIL execution with an unaligned access
using an aligned load.

The trunk rev. that fixed this is gbc484e250990393e887f7239157cc85ce6fadcce

A pragmatic fix might be

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index f6331eeea86..3fdf56f9335 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2309,9 +2309,8 @@ vect_analyze_slp_instance (vec_info *vinfo,
              /* The load requires permutation when unrolling exposes
                 a gap either because the group is larger than the SLP
                 group-size or because there is a gap between the groups.  */
-             && (known_eq (unrolling_factor, 1U)
-                 || (group_size == DR_GROUP_SIZE (first_stmt_info)
-                     && DR_GROUP_GAP (first_stmt_info) == 0)))
+             && group_size == DR_GROUP_SIZE (first_stmt_info)
+             && DR_GROUP_GAP (first_stmt_info) == 0)
            {
              SLP_TREE_LOAD_PERMUTATION (load_node).release ();
              continue;

with biggest effects eventually on load-lane targets (arm/aarch64) where we
then eventually prefer more of those.  For the testcase in question we then
generate the following, matching trunk

        movdqa  (%rdx), %xmm2
        movdqa  16(%rdx), %xmm0
        shufpd  $1, 32(%rdx), %xmm0

instead of

        movdqa  (%rdx), %xmm1
        addq    $48, %rdx
        movdqu  -24(%rdx), %xmm2

(or with the backport of PR96522 a wrong movdqa in place of the movdqu).


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96522
[Bug 96522] [9/10 Regression] Incorrect with with -O -fno-tree-pta
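
Editorial illustration (not part of the bug thread): the change proposed in
comment #1 can be modelled outside GCC as a pair of predicates. This is only a
sketch. The struct and function names below are hypothetical, the real test in
vect_analyze_slp_instance is the tail of a larger condition, and known_eq on a
poly-int is approximated here by a plain integer comparison. The is_aligned
helper merely restates the address arithmetic visible in the second assembly
listing, assuming %rdx starts out 16-byte aligned.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the values the patched condition reads. */
struct load_group {
    unsigned group_size;     /* SLP group size */
    unsigned dr_group_size;  /* models DR_GROUP_SIZE (first_stmt_info) */
    unsigned dr_group_gap;   /* models DR_GROUP_GAP (first_stmt_info) */
};

/* Before the patch: an unrolling factor of 1 was enough on its own to
   drop the load permutation, even though that factor could still change. */
bool may_elide_old(const struct load_group *g, unsigned unrolling_factor)
{
    return unrolling_factor == 1
           || (g->group_size == g->dr_group_size && g->dr_group_gap == 0);
}

/* After the patch: only a gap-free group that exactly matches the SLP
   group size qualifies; the unrolling factor is no longer consulted. */
bool may_elide_new(const struct load_group *g)
{
    return g->group_size == g->dr_group_size && g->dr_group_gap == 0;
}

/* True when byte_offset from an aligned base keeps the given alignment. */
bool is_aligned(unsigned long byte_offset, unsigned long alignment)
{
    return byte_offset % alignment == 0;
}
```

With illustrative values group_size = 2, dr_group_size = 3 and no gap,
may_elide_old returns true for an unrolling factor of 1 while may_elide_new
returns false, which is the behavioural difference the patch introduces. In
the second listing the movdqu reads from byte offset 24 relative to the
pre-increment %rdx (48 - 24), and is_aligned(24, 16) is false, so a movdqa at
that address would fault.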