[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |19373742 at buaa dot edu.cn

--- Comment #25 from Andrew Pinski  ---
*** Bug 111951 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

Tamar Christina  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|REOPENED                    |RESOLVED

--- Comment #24 from Tamar Christina  ---
ok, should be actually fixed now

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

--- Comment #23 from CVS Commits  ---
The master branch has been updated by Tamar Christina :

https://gcc.gnu.org/g:9ed6b22eb4188c57bb3f5cdba5a7effa95395186

commit r14-4861-g9ed6b22eb4188c57bb3f5cdba5a7effa95395186
Author: Tamar Christina 
Date:   Mon Oct 23 14:07:20 2023 +0100

middle-end: don't keep .MEM guard nodes for PHI nodes who dominate loop [PR111860]

The previous patch tried to remove PHI nodes that dominated the first loop,
however the correct fix is to only remove .MEM nodes.

This patch thus makes the condition a bit stricter and only tries to remove
MEM phi nodes.

I couldn't figure out a way to easily determine if a particular PHI is vUSE
related, so the patch does:

1. check if the definition is a vDEF and not defined in main loop.
2. check if the definition is a PHI and not defined in main loop.
3. check if the definition is a default definition.

For checks 2 and 3 we may misidentify the PHI, but in both cases the value is
defined outside of the loop version block, which also makes it ok to remove.

gcc/ChangeLog:

PR tree-optimization/111860
* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
Drop .MEM nodes only.

gcc/testsuite/ChangeLog:

PR tree-optimization/111860
* gcc.dg/vect/pr111860-2.c: New test.
* gcc.dg/vect/pr111860-3.c: New test.
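
The three checks listed in the commit message above can be modelled outside GCC as a small predicate. This is a toy sketch in Python, not the code in tree-vect-loop-manip.cc: the `Def` record and its fields are hypothetical stand-ins for whatever the patch actually queries on the SSA name's defining statement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Def:
    """Hypothetical summary of the statement that defines a PHI argument."""
    is_virtual_def: bool = False   # statement carries a vDEF (writes memory)
    is_phi: bool = False           # the definition is itself a PHI node
    is_default_def: bool = False   # SSA default definition (no defining stmt)
    in_main_loop: bool = False     # defining block belongs to the main loop


def may_drop_guard_phi(d: Def) -> bool:
    """Mirror the three checks from the commit message (toy model only)."""
    if d.is_virtual_def and not d.in_main_loop:   # check 1
        return True
    if d.is_phi and not d.in_main_loop:           # check 2, may misidentify
        return True
    return d.is_default_def                       # check 3, may misidentify
```

For example, `may_drop_guard_phi(Def(is_virtual_def=True))` is true, while the same definition with `in_main_loop=True` is kept. Checks 2 and 3 cannot tell a memory PHI from a scalar one, which is the misidentification the commit message accepts.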

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |manuel.lauss at googlemail dot com

--- Comment #22 from Andrew Pinski  ---
*** Bug 111902 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-20 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

--- Comment #21 from Tamar Christina  ---
patch submitted
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633734.html

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

--- Comment #20 from Tamar Christina  ---
(In reply to David Binderman from comment #19)
> Created attachment 56154 [details]
> C source code
> 
> You might like to have a go at getting the attached code working:
> 
> $ ~/gcc/results/bin/gcc -c -w -O3  bug967B.c
> bug967B.c: In function ‘__wcstod128_l_internal’:
> bug967B.c:10:1: error: stmt with wrong VUSE
>    10 | __wcstod128_l_internal() {
>       | ^~
> 
> I have 20+ other cases. I can provide them, if you like.

No need :) They're all the same bug.  The idea for the fix was correct, but the
way I checked if the loop was versioned wasn't strong enough.

All the reported testcases now pass. I'll start regressions.

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-19 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

David Binderman  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dcb314 at hotmail dot com

--- Comment #19 from David Binderman  ---
Created attachment 56154
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56154&action=edit
C source code

You might like to have a go at getting the attached code working:

$ ~/gcc/results/bin/gcc -c -w -O3  bug967B.c
bug967B.c: In function ‘__wcstod128_l_internal’:
bug967B.c:10:1: error: stmt with wrong VUSE
   10 | __wcstod128_l_internal() {
      | ^~

I have 20+ other cases. I can provide them, if you like.

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

Tamar Christina  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---

--- Comment #18 from Tamar Christina  ---
The fix is too conservative: when there's no use in either loop it fails, as
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111877 shows.

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

Tamar Christina  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |zsojka at seznam dot cz

--- Comment #17 from Tamar Christina  ---
*** Bug 111877 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-19 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

--- Comment #16 from David Binderman  ---
Created attachment 56153
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56153&action=edit
C source code

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

Tamar Christina  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #15 from Tamar Christina  ---
Fixed, thanks for the report

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-19 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

--- Comment #14 from CVS Commits  ---
The master branch has been updated by Tamar Christina :

https://gcc.gnu.org/g:217a0fcb852aeb4aa9e3fb9baec6ff947c8de3d4

commit r14-4746-g217a0fcb852aeb4aa9e3fb9baec6ff947c8de3d4
Author: Tamar Christina 
Date:   Thu Oct 19 13:44:01 2023 +0100

middle-end: don't create LC-SSA PHI variables for PHI nodes who dominate loop

As the testcase shows, when a PHI node dominates the loop there is no new
definition inside the loop.  As such there would be no PHI nodes to update.

When we maintain LCSSA form we create an intermediate node in between the two
loops to thread along the value.  However, later on when we update the second
loop we don't have any PHI nodes to update and so adjust_phi_and_debug_stmts
does nothing.  This leaves us with an incorrect PHI node.  Normally this does
nothing and just gets ignored.  But in the case of the vUSE chain we end up
corrupting the chain.

As such whenever a PHI node's argument dominates the loop, we should remove
the newly created PHI node after edge redirection.

The one exception to this is when the loop has been versioned.  In such cases
the versioned loop may not use the value but the second loop can.

When this happens and we add the loop guard, unless the join block has the PHI
it can't find the original value for use inside the guard block.

The next refactoring in the series moves the formation of the guard block
inside peeling itself.  Here we have all the information and wouldn't
need to re-create it later.

gcc/ChangeLog:

PR tree-optimization/111860
* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
Remove PHI nodes that dominate loop.

gcc/testsuite/ChangeLog:

PR tree-optimization/111860
* gcc.dg/vect/pr111860.c: New test.
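
The rule in the commit message above (drop the new PHI when its argument dominates the loop, unless the loop was versioned) can be illustrated on a toy control-flow graph. The sketch below is an assumption-laden model in Python: the CFG encoding, `dominators` and `should_remove_lcssa_phi` are hypothetical names, and none of it is GCC's dominance API.

```python
def dominators(cfg, entry):
    """Iterative dominator sets for a CFG given as {block: [successors]}.

    Every block, including ones without successors, must be a key of cfg.
    """
    blocks = set(cfg)
    preds = {b: set() for b in blocks}
    for b, succs in cfg.items():
        for s in succs:
            preds[s].add(b)

    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in blocks - {entry}:
            incoming = [dom[p] for p in preds[b]]
            new = {b} | (set.intersection(*incoming) if incoming else set())
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom


def should_remove_lcssa_phi(dom, def_block, loop_header, versioned):
    """Toy decision: remove the PHI if its argument's defining block
    dominates the loop header and the loop was not versioned."""
    return (not versioned) and def_block in dom[loop_header]


# A preheader, a two-block loop and an exit block.
cfg = {
    "entry": ["pre"],
    "pre": ["header"],
    "header": ["body", "exit"],
    "body": ["header"],
    "exit": [],
}
dom = dominators(cfg, "entry")
```

With this graph, a value defined in `pre` dominates the loop, so `should_remove_lcssa_phi(dom, "pre", "header", versioned=False)` holds, while a value defined in `body` does not qualify. The later commit r14-4861 narrowed the removal to .MEM PHI nodes only, which this sketch does not model.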

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

--- Comment #13 from Tamar Christina  ---
Patch posted https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633569.html

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

--- Comment #12 from Tamar Christina  ---
yes, patch was tested on both aarch64 and x86, but I did not test libgomp
indeed.

In any case, waiting for regression run to finish and will submit patch.

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-19 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

--- Comment #11 from David Binderman  ---
(In reply to Andrew Pinski from comment #10)
> Maybe since it was in the libgomp testsuite you missed it when you tested
> your patch.

I usually find that compiling all the C,C++ and Fortran code
in the gcc/testsuite directory and below with "-g -O3 -march=native -Wall"
and searching for "internal compiler error" can be useful.

testsuite $ find . -name \*.c -print | wc -l
51192
testsuite $ find . \( -name \*.C -o -name \*.cc -o -name \*.cpp -o -name \*.cxx \) | wc -l
20083
testsuite $ find . -type f -print | grep -E -i "\.f$|\.f[0-9][0-9]$" | wc -l
8036
testsuite $ 

Bonus points for two different architectures (arm & x86_64 ?).
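
The sweep described above can be scripted roughly as follows. This is a minimal sketch that covers only the C files: the `sweep` and `has_ice` helpers are hypothetical, and the `-c` and `-o /dev/null` flags are assumptions added so that nothing is linked or written; only "-g -O3 -march=native -Wall" and the "internal compiler error" search string come from the comment.

```python
import subprocess
from pathlib import Path

# Flags from the comment, plus -c / -o /dev/null (an assumption) to skip linking.
FLAGS = ["-c", "-g", "-O3", "-march=native", "-Wall", "-o", "/dev/null"]


def has_ice(compiler_output: str) -> bool:
    """True if the compiler output reports an internal compiler error."""
    return "internal compiler error" in compiler_output


def sweep(root, compiler="gcc", suffixes=(".c",)):
    """Compile every matching file under root and collect the ones that ICE."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            proc = subprocess.run(
                [compiler, *FLAGS, str(path)],
                capture_output=True,
                text=True,
                errors="replace",  # some testsuite files are not valid UTF-8
            )
            if has_ice(proc.stderr):
                hits.append(path)
    return hits
```

Calling `sweep("gcc/testsuite", compiler="/path/to/new/gcc")` (the compiler path is a placeholder) returns the files that triggered an ICE. Ordinary compile errors are expected for many testsuite files and are ignored here.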

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

--- Comment #10 from Andrew Pinski  ---
Just an FYI, I do get a similar ICE on:
libgomp/testsuite/libgomp.fortran/simd3.f90

Testcase on aarch64-linux-gnu now too.

Maybe since it was in the libgomp testsuite you missed it when you tested your
patch.


/home/ubuntu/src/upstream-gcc-aarch64/gcc/libgomp/testsuite/libgomp.fortran/simd3.f90:56:18:
Error: stmt with wrong VUSE
# VUSE <.MEM_68>
_21 = D.3326[_50];
expected .MEM_95
/home/ubuntu/src/upstream-gcc-aarch64/gcc/libgomp/testsuite/libgomp.fortran/simd3.f90:56:18:
Error: PHI node with wrong VUSE on edge from BB 32
.MEM_131 = PHI <.MEM_68(32), .MEM_68(29)>
expected .MEM_95
/home/ubuntu/src/upstream-gcc-aarch64/gcc/libgomp/testsuite/libgomp.fortran/simd3.f90:56:18:
Error: PHI node with wrong VUSE on edge from BB 29
.MEM_131 = PHI <.MEM_68(32), .MEM_68(29)>
expected .MEM_95
during GIMPLE pass: vect
/home/ubuntu/src/upstream-gcc-aarch64/gcc/libgomp/testsuite/libgomp.fortran/simd3.f90:56:18:
internal compiler error: verify_ssa failed
0x12312eb verify_ssa(bool, bool)

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |shaohua.li at inf dot ethz.ch

--- Comment #9 from Andrew Pinski  ---
*** Bug 111869 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #8 from Andrew Pinski  ---
.

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

Tamar Christina  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |seurer at gcc dot gnu.org

--- Comment #7 from Tamar Christina  ---
*** Bug 111868 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

--- Comment #6 from Tamar Christina  ---
Ok, so the problem is that the loop never creates memory references, and so
after redirecting the edges when we update the new references we do so by
trying to update the PHI nodes.

But since the loop has no MEM phi node there's nothing to update but we created
a new artificial node during redirect.

Because there's no PHI node that means that adjust_phi_and_debug_stmts isn't
strong enough here.

So I can either remove PHI nodes whose SSA vars have not been defined inside
the loop body, or I'll need to replace adjust_phi_and_debug_stmts with
something that goes through all uses inside the new loop and exit.

Which do you prefer richi? It seems like removing the PHI node after redirect
is the simplest one and one less thing to keep updated.
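
The gap described above, where a PHI-driven update has nothing to work on when the loop contains no memory PHI, can be shown with a toy model. The sketch below is Python, not GIMPLE: `Stmt`, `Phi`, `Loop` and both update functions are hypothetical, and `update_phis_only` only loosely imitates the idea of updating through PHI nodes rather than reproducing adjust_phi_and_debug_stmts.

```python
from dataclasses import dataclass, field


@dataclass
class Stmt:
    text: str
    vuse: str          # name of the memory state this statement reads


@dataclass
class Phi:
    result: str
    args: list         # incoming SSA names, one per edge


@dataclass
class Loop:
    phis: list = field(default_factory=list)
    stmts: list = field(default_factory=list)


def update_phis_only(loop, old, new):
    """Rename `old` to `new` in PHI arguments only; return rewrite count."""
    count = 0
    for phi in loop.phis:
        for i, arg in enumerate(phi.args):
            if arg == old:
                phi.args[i] = new
                count += 1
    return count


def update_all_uses(loop, old, new):
    """Rename `old` to `new` in PHI arguments and in every statement's vuse."""
    count = update_phis_only(loop, old, new)
    for stmt in loop.stmts:
        if stmt.vuse == old:
            stmt.vuse = new
            count += 1
    return count


# A loop that only reads memory: no memory PHI, one load.
loop = Loop(stmts=[Stmt("load", vuse="mem_old")])
```

On this loop `update_phis_only(loop, "mem_old", "mem_new")` rewrites nothing, so the load keeps a stale vuse, whereas `update_all_uses` fixes it. That corresponds to the second option in the comment; the first option, removing the artificial PHI so the old name stays valid, avoids the rename altogether.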