[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

Andrew Pinski changed:

What       |Removed    |Added
CC         |           |19373742 at buaa dot edu.cn

--- Comment #25 from Andrew Pinski ---
*** Bug 111951 has been marked as a duplicate of this bug. ***
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

Tamar Christina changed:

What       |Removed    |Added
Resolution |---        |FIXED
Status     |REOPENED   |RESOLVED

--- Comment #24 from Tamar Christina ---
ok, should be actually fixed now
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

--- Comment #23 from CVS Commits ---
The master branch has been updated by Tamar Christina:

https://gcc.gnu.org/g:9ed6b22eb4188c57bb3f5cdba5a7effa95395186

commit r14-4861-g9ed6b22eb4188c57bb3f5cdba5a7effa95395186
Author: Tamar Christina
Date:   Mon Oct 23 14:07:20 2023 +0100

    middle-end: don't keep .MEM guard nodes for PHI nodes who dominate loop [PR111860]

    The previous patch tried to remove PHI nodes that dominated the first
    loop, however the correct fix is to only remove .MEM nodes.

    This patch thus makes the condition a bit stricter and only tries to
    remove MEM phi nodes.

    I couldn't figure out a way to easily determine if a particular PHI is
    vUSE related, so the patch does:

    1. check if the definition is a vDEF and not defined in main loop.
    2. check if the definition is a PHI and not defined in main loop.
    3. check if the definition is a default definition.

    For no 2 and 3 we may misidentify the PHI, in both cases the value is
    defined outside of the loop version block which also makes it ok to
    remove.

    gcc/ChangeLog:

            PR tree-optimization/111860
            * tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
            Drop .MEM nodes only.

    gcc/testsuite/ChangeLog:

            PR tree-optimization/111860
            * gcc.dg/vect/pr111860-2.c: New test.
            * gcc.dg/vect/pr111860-3.c: New test.
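The three checks listed in the commit message above can be pictured with a small toy model. This is only a sketch: DefKind, Definition and may_drop_guard_phi are hypothetical names invented for illustration, and nothing here mirrors the actual data structures or code in tree-vect-loop-manip.cc.

```python
from dataclasses import dataclass
from enum import Enum, auto


class DefKind(Enum):
    """How the PHI argument is defined (hypothetical classification)."""
    VDEF = auto()     # a statement that produces a new memory state
    PHI = auto()      # the definition is itself a PHI node
    DEFAULT = auto()  # default definition, no defining statement
    OTHER = auto()    # anything else, e.g. an ordinary scalar assignment


@dataclass
class Definition:
    kind: DefKind
    in_main_loop: bool = False  # assumption: caller knows where the def lives


def may_drop_guard_phi(defn: Definition) -> bool:
    """Return True when the toy model would drop the guard PHI."""
    # Check 3: default definitions are always outside the loop.
    if defn.kind is DefKind.DEFAULT:
        return True
    # Checks 1 and 2: a vDEF or a PHI that is not defined in the main loop.
    if defn.kind in (DefKind.VDEF, DefKind.PHI):
        return not defn.in_main_loop
    # Everything else is kept.
    return False
```

As the commit message notes, checks 2 and 3 may also match PHIs that are not memory related; the model keeps that behaviour rather than trying to refine it.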
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

Andrew Pinski changed:

What       |Removed    |Added
CC         |           |manuel.lauss at googlemail dot com

--- Comment #22 from Andrew Pinski ---
*** Bug 111902 has been marked as a duplicate of this bug. ***
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

--- Comment #21 from Tamar Christina ---
patch submitted https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633734.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

--- Comment #20 from Tamar Christina ---
(In reply to David Binderman from comment #19)
> Created attachment 56154 [details]
> C source code
>
> You might like to have a go at getting the attached code working:
>
> $ ~/gcc/results/bin/gcc -c -w -O3 bug967B.c
> bug967B.c: In function ‘__wcstod128_l_internal’:
> bug967B.c:10:1: error: stmt with wrong VUSE
>    10 | __wcstod128_l_internal() {
>       | ^~
>
> I have 20+ other cases. I can provide them, if you like.

No need :) They're all the same bug. The idea for the fix was correct, but the
way I checked if the loop was versioned wasn't strong enough.

All the reported testcases now pass. I'll start regressions.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

David Binderman changed:

What       |Removed    |Added
CC         |           |dcb314 at hotmail dot com

--- Comment #19 from David Binderman ---
Created attachment 56154
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56154&action=edit
C source code

You might like to have a go at getting the attached code working:

$ ~/gcc/results/bin/gcc -c -w -O3 bug967B.c
bug967B.c: In function ‘__wcstod128_l_internal’:
bug967B.c:10:1: error: stmt with wrong VUSE
   10 | __wcstod128_l_internal() {
      | ^~

I have 20+ other cases. I can provide them, if you like.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

Tamar Christina changed:

What       |Removed    |Added
Status     |RESOLVED   |REOPENED
Resolution |FIXED      |---

--- Comment #18 from Tamar Christina ---
Fix is too conservative, when there's no use in either loop it fails as
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111877 shows.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

Tamar Christina changed:

What       |Removed    |Added
CC         |           |zsojka at seznam dot cz

--- Comment #17 from Tamar Christina ---
*** Bug 111877 has been marked as a duplicate of this bug. ***
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

--- Comment #16 from David Binderman ---
Created attachment 56153
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56153&action=edit
C source code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

Tamar Christina changed:

What       |Removed    |Added
Status     |ASSIGNED   |RESOLVED
Resolution |---        |FIXED

--- Comment #15 from Tamar Christina ---
Fixed, thanks for the report
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

--- Comment #14 from CVS Commits ---
The master branch has been updated by Tamar Christina:

https://gcc.gnu.org/g:217a0fcb852aeb4aa9e3fb9baec6ff947c8de3d4

commit r14-4746-g217a0fcb852aeb4aa9e3fb9baec6ff947c8de3d4
Author: Tamar Christina
Date:   Thu Oct 19 13:44:01 2023 +0100

    middle-end: don't create LC-SSA PHI variables for PHI nodes who dominate loop

    As the testcase shows, when a PHI node dominates the loop there is no
    new definition inside the loop. As such there would be no PHI nodes to
    update.

    When we maintain LCSSA form we create an intermediate node in between
    the two loops to thread along the value. However later on when we
    update the second loop we don't have any PHI nodes to update and so
    adjust_phi_and_debug_stmts does nothing. This leaves us with an
    incorrect phi node. Normally this does nothing and just gets ignored.
    But in the case of the vUSE chain we end up corrupting the chain.

    As such whenever a PHI node's argument dominates the loop, we should
    remove the newly created PHI node after edge redirection.

    The one exception to this is when the loop has been versioned. In such
    cases the versioned loop may not use the value but the second loop can.

    When this happens and we add the loop guard unless the join block has
    the PHI it can't find the original value for use inside the guard block.

    The next refactoring in the series moves the formation of the guard
    block inside peeling itself. Here we have all the information and
    wouldn't need to re-create it later.

    gcc/ChangeLog:

            PR tree-optimization/111860
            * tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
            Remove PHI nodes that dominate loop.

    gcc/testsuite/ChangeLog:

            PR tree-optimization/111860
            * gcc.dg/vect/pr111860.c: New test.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

--- Comment #13 from Tamar Christina ---
Patch posted https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633569.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

--- Comment #12 from Tamar Christina ---
yes, patch was tested on both aarch64 and x86, but I did not test libgomp
indeed.

In any case, waiting for regression run to finish and will submit patch.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

--- Comment #11 from David Binderman ---
(In reply to Andrew Pinski from comment #10)
> Maybe since it was in the libgomp testsuite you missed it when you tested
> your patch.

I usually find that compiling all the C,C++ and Fortran code in the
gcc/testsuite directory and below with "-g -O3 -march=native -Wall"
and searching for "internal compiler error" can be useful.

testsuite $ find . -name \*.c -print | wc -l
51192
testsuite $ find . \( -name \*.C -o -name \*.cc -o -name \*.cpp -o -name \*.cxx \) | wc -l
20083
testsuite $ find . -type f -print | grep -E -i "\.f$|\.f[0-9][0-9]$" | wc -l
8036
testsuite $

Bonus points for two different architectures (arm & x86_64 ?).
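The sweep described in the comment above could be scripted roughly as follows. This is a minimal sketch, not the commenter's actual tooling: the function names are made up for illustration, the compiler is assumed to be reachable as "gcc" with Fortran support installed, and the file patterns only approximate the find/grep expressions quoted above.

```python
import os
import re
import subprocess
from pathlib import Path

ICE_MARKER = "internal compiler error"

# Approximation of the find/grep patterns quoted in the comment.
C_FAMILY = re.compile(r"\.(c|C|cc|cpp|cxx)$")
FORTRAN = re.compile(r"\.f([0-9][0-9])?$", re.IGNORECASE)


def is_source(path):
    """True for files the sweep should try to compile."""
    name = str(path)
    return bool(C_FAMILY.search(name) or FORTRAN.search(name))


def has_ice(diagnostics):
    """True when compiler diagnostics report an internal compiler error."""
    return ICE_MARKER in diagnostics


def sweep(root, compiler=("gcc",),
          flags=("-g", "-O3", "-march=native", "-Wall")):
    """Compile every source file under root; return those that hit an ICE."""
    failures = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or not is_source(path):
            continue
        # Object output is discarded; only the diagnostics matter here.
        cmd = [*compiler, *flags, "-c", str(path), "-o", os.devnull]
        result = subprocess.run(cmd, capture_output=True, text=True)
        if has_ice(result.stderr):
            failures.append(path)
    return failures
```

Calling sweep("gcc/testsuite") would then list the files that trigger an ICE. Many testsuite files expect their own options, so ordinary compile errors are ignored and only the ICE marker is reported.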
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

--- Comment #10 from Andrew Pinski ---
Just an FYI, I do get a similar ICE on:
libgomp/testsuite/libgomp.fortran/simd3.f90
Testcase on aarch64-linux-gnu now too. Maybe since it was in the libgomp
testsuite you missed it when you tested your patch.

/home/ubuntu/src/upstream-gcc-aarch64/gcc/libgomp/testsuite/libgomp.fortran/simd3.f90:56:18: Error: stmt with wrong VUSE
# VUSE <.MEM_68>
_21 = D.3326[_50];
expected .MEM_95
/home/ubuntu/src/upstream-gcc-aarch64/gcc/libgomp/testsuite/libgomp.fortran/simd3.f90:56:18: Error: PHI node with wrong VUSE on edge from BB 32
.MEM_131 = PHI <.MEM_68(32), .MEM_68(29)>
expected .MEM_95
/home/ubuntu/src/upstream-gcc-aarch64/gcc/libgomp/testsuite/libgomp.fortran/simd3.f90:56:18: Error: PHI node with wrong VUSE on edge from BB 29
.MEM_131 = PHI <.MEM_68(32), .MEM_68(29)>
expected .MEM_95
during GIMPLE pass: vect
/home/ubuntu/src/upstream-gcc-aarch64/gcc/libgomp/testsuite/libgomp.fortran/simd3.f90:56:18: internal compiler error: verify_ssa failed
0x12312eb verify_ssa(bool, bool)
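The invariant behind the "stmt with wrong VUSE" diagnostics above can be illustrated with a simplified straight-line model: each memory statement must use the memory state that reaches it. This is a sketch under that assumption only. check_vuse_chain is a hypothetical helper, the real verifier also handles PHI nodes and CFG edges, and the .MEM names are borrowed from the log purely as labels.

```python
def check_vuse_chain(entry_mem, stmts):
    """Check a straight-line sequence of memory statements.

    entry_mem: name of the memory state live on entry.
    stmts: list of (text, vuse, vdef) tuples in execution order; vuse or
           vdef is None when the statement has no such operand.
    Returns a list of error strings, empty when the chain is consistent.
    """
    errors = []
    current = entry_mem
    for text, vuse, vdef in stmts:
        # A use must refer to the memory state currently reaching it.
        if vuse is not None and vuse != current:
            errors.append(
                f"stmt with wrong VUSE: {text} uses {vuse}, expected {current}"
            )
        # A store (or call) advances the chain to a new memory state.
        if vdef is not None:
            current = vdef
    return errors
```

In this model a load that still names a stale state such as .MEM_68 while .MEM_95 is the reaching state gets reported, which is the shape of the first error in the log.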
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

Andrew Pinski changed:

What       |Removed    |Added
CC         |           |shaohua.li at inf dot ethz.ch

--- Comment #9 from Andrew Pinski ---
*** Bug 111869 has been marked as a duplicate of this bug. ***
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

Andrew Pinski changed:

What       |Removed    |Added
Status     |NEW        |ASSIGNED

--- Comment #8 from Andrew Pinski ---
.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

Tamar Christina changed:

What       |Removed    |Added
CC         |           |seurer at gcc dot gnu.org

--- Comment #7 from Tamar Christina ---
*** Bug 111868 has been marked as a duplicate of this bug. ***
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

--- Comment #6 from Tamar Christina ---
Ok, so the problem is that the loop never creates memory references, and so
after redirecting the edges when we update the new references we do so by
trying to update the PHI nodes. But since the loop has no MEM phi node there's
nothing to update, yet we created a new artificial node during redirect.

Because there's no PHI node, adjust_phi_and_debug_stmts isn't strong enough
here. So I can either remove phi nodes whose SSA vars have not been defined
inside the loop body, or I'll need to replace adjust_phi_and_debug_stmts with
something that goes through all uses inside the new loop and exit.

Which do you prefer richi? It seems like removing the PHI node after redirect
is the simplest one and one less thing to keep updated.