[Bug tree-optimization/113137] [14 regression] Failed bootstrap with -O3 -march=znver2

2024-01-12 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113137

Tamar Christina  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #17 from Tamar Christina  ---
Fixed. Thanks for the report and let me know if there's something still broken.

[Bug tree-optimization/113137] [14 regression] Failed bootstrap with -O3 -march=znver2

2024-01-12 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113137

--- Comment #16 from GCC Commits  ---
The master branch has been updated by Tamar Christina :

https://gcc.gnu.org/g:411de96dbf2bdafc7a90ebbfc63e68afd6388d29

commit r14-7195-g411de96dbf2bdafc7a90ebbfc63e68afd6388d29
Author: Tamar Christina 
Date:   Fri Jan 12 15:25:34 2024 +

middle-end: maintain LCSSA form when peeled vector iterations have virtual operands

This patch fixes several interconnected issues.

1. When picking an exit we wanted to check that niter_desc.may_be_zero is not
   true, i.e. we want to pick an exit which we know will iterate at least
   once.  However niter_desc.may_be_zero is not a boolean.  It is a tree that
   encodes a boolean value.  !niter_desc.may_be_zero just checks whether we
   have some information, not what the information is.  This leads us to pick
   a more difficult to vectorize exit more often than we should.

2. Because we had this bug, we used to pick an alternative exit much more
   often, which exposed one issue: when the loop accesses memory and we
   "invert it" we would corrupt the VUSE chain.  This is because on a peeled
   vector iteration every exit restarts the loop (i.e. they're all early) BUT
   since we may have performed a store, the VUSE would need to be updated.
   This version maintains virtual PHIs correctly in these cases.  Note that we
   can't simply remove all of them and recreate them because we need the PHI
   nodes still in the right order for if skip_vector.

3. Since we're moving the stores to a safe location I don't think we actually
   need to analyze whether the store is in range of the memref, because if we
   ever get there, we know that the loads must be in range, and if the loads
   are in range and we get to the store we know the early breaks were not
   taken and so the scalar loop would have done the VF stores too.

4. Instead of searching for where to move stores to, they should always be in
   the exit belonging to the latch.  We can only ever delay stores and even if
   we pick a different exit than the latch one as the main one, effects still
   happen in program order when vectorized.  If we don't move the stores to
   the latch exit but instead to wherever we pick as the "main" exit then we
   can perform incorrect memory accesses (luckily these are trapped by
   verify_ssa).

5. We only used to analyze loads inside the same BB as an early break, and
   also we'd never analyze the ones inside the block where we'd be moving
   memory references to.  This is obviously bogus and to fix it this patch
   splits apart the two constraints.  We first validate that all load memory
   references are in bounds and only after that do we perform the alias checks
   for the writes.  This makes the code simpler to understand and more
   trivially correct.
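
The reasoning in points 3 and 4 can be illustrated outside the compiler with a
toy model.  The sketch below is not GCC code and is not taken from the patch:
the function names, the `vf` parameter and the copy-until-zero loop are all
made up for illustration.  The blocked version performs every load and
early-break test for a block of `vf` elements first and only then performs the
(delayed) stores; if any break would be taken, the block is handed to the
scalar loop, which redoes it in program order.

```cpp
#include <cstddef>
#include <vector>

// Scalar reference loop: store, then test the early break.
std::size_t copy_until_zero_scalar(const std::vector<int>& src,
                                   std::vector<int>& dst) {
  for (std::size_t i = 0; i < src.size(); ++i) {
    dst[i] = src[i];             // store
    if (src[i] == 0) return i;   // early break
  }
  return src.size();
}

// Hypothetical blocked model of the same loop (vf is an assumed block size).
std::size_t copy_until_zero_blocked(const std::vector<int>& src,
                                    std::vector<int>& dst,
                                    std::size_t vf = 4) {
  std::size_t i = 0;
  while (i + vf <= src.size()) {
    bool any_break = false;
    for (std::size_t k = 0; k < vf; ++k)   // all loads and break tests first
      any_break |= (src[i + k] == 0);
    if (any_break) break;                  // let the scalar loop redo this block
    for (std::size_t k = 0; k < vf; ++k)   // delayed stores: no break was taken,
      dst[i + k] = src[i + k];             // so the scalar loop stores these too
    i += vf;
  }
  for (; i < src.size(); ++i) {            // scalar loop, original program order
    dst[i] = src[i];
    if (src[i] == 0) return i;
  }
  return src.size();
}
```

Both functions leave `dst` in the same state and return the same index, which
is the property the delayed stores have to preserve.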

gcc/ChangeLog:

PR tree-optimization/113137
PR tree-optimization/113136
PR tree-optimization/113172
PR tree-optimization/113178
* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
Maintain PHIs on inverted loops.
(vect_do_peeling): Maintain virtual PHIs on inverted loops.
* tree-vect-loop.cc (vec_init_loop_exit_info): Pick exit closest to latch.
(vect_create_loop_vinfo): Record all conds instead of only alt ones.

gcc/testsuite/ChangeLog:

PR tree-optimization/113137
PR tree-optimization/113136
PR tree-optimization/113172
PR tree-optimization/113178
* g++.dg/vect/vect-early-break_4-pr113137.cc: New test.
* g++.dg/vect/vect-early-break_5-pr113137.cc: New test.
* gcc.dg/vect/vect-early-break_95-pr113137.c: New test.
* gcc.dg/vect/vect-early-break_96-pr113136.c: New test.
* gcc.dg/vect/vect-early-break_97-pr113172.c: New test.

[Bug tree-optimization/113137] [14 regression] Failed bootstrap with -O3 -march=znver2

2024-01-12 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113137

--- Comment #15 from Tamar Christina  ---
(In reply to David Binderman from comment #14)
> (In reply to Tamar Christina from comment #13)
> > Patch submitted
> 
> Two weeks have elapsed and the patch doesn't seem to appear in git.
> 
> Is it perhaps stuck somewhere ?

Maintainers were on holiday until this week.  Everything has been approved now;
I'm making the final changes the maintainers wanted and will regtest on various
architectures.

I expect to commit the patches sometime today.  Sorry for the delay.

[Bug tree-optimization/113137] [14 regression] Failed bootstrap with -O3 -march=znver2

2024-01-12 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113137

--- Comment #14 from David Binderman  ---
(In reply to Tamar Christina from comment #13)
> Patch submitted

Two weeks have elapsed and the patch doesn't seem to appear in git.

Is it perhaps stuck somewhere ?

[Bug tree-optimization/113137] [14 regression] Failed bootstrap with -O3 -march=znver2

2023-12-29 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113137

--- Comment #13 from Tamar Christina  ---
Patch submitted

[Bug tree-optimization/113137] [14 regression] Failed bootstrap with -O3 -march=znver2

2023-12-29 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113137

--- Comment #12 from Tamar Christina  ---
ok, x86_64 bootstrap and regtest with -O3 and --enable-checking=yes,rtl,extra
now passes.

aarch64 hit a small issue in libgcc that I'm not sure I should be allowing.
Will investigate, either fix or disable it, and post patches.

[Bug tree-optimization/113137] [14 regression] Failed bootstrap with -O3 -march=znver2

2023-12-29 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113137

--- Comment #11 from Tamar Christina  ---
Created attachment 56963
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56963&action=edit
maintain-lcssa-peeled.patch

patch undergoing testing for both this and PR113136

[Bug tree-optimization/113137] [14 regression] Failed bootstrap with -O3 -march=znver2

2023-12-28 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113137

--- Comment #10 from Tamar Christina  ---
Ok, so this bug is simply fixed by:

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index f51ae3e719e..e7a5917bc4c 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -976,7 +976,8 @@ vec_init_loop_exit_info (class loop *loop)
       if (number_of_iterations_exit_assumptions (loop, exit, &niter_desc, NULL)
           && !chrec_contains_undetermined (niter_desc.niter))
         {
-          if (!niter_desc.may_be_zero || !candidate)
+          tree may_be_zero = niter_desc.may_be_zero;
+          if ((may_be_zero && integer_zerop (may_be_zero)) || !candidate)
             candidate = exit;
         }
     }

because niter_desc.may_be_zero is not a boolean but instead a tree that encodes
a boolean.
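
The difference between the two conditions can be sketched with a toy stand-in
for a tree-encoded boolean.  This is only an illustration under assumptions:
`Expr`, its `Kind` values and both function names are hypothetical and are not
GCC's `tree` API; the comparison against `ConstFalse` plays the role that
integer_zerop plays in the diff.

```cpp
// Toy stand-in for a tree-encoded boolean; NOT GCC's `tree` type.
struct Expr {
  enum Kind { ConstFalse, ConstTrue, Symbolic } kind;
};

// Old shape of the test: only asks whether an expression is present at all.
bool known_to_iterate_old(const Expr* may_be_zero) {
  return !may_be_zero;
}

// New shape of the test: the expression must be present AND be constant false.
bool known_to_iterate_new(const Expr* may_be_zero) {
  return may_be_zero && may_be_zero->kind == Expr::ConstFalse;
}
```

With a constant-false expression the old test answers "no" and the new test
answers "yes", so the old test rejects exactly the exits it was meant to
prefer.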

Due to this we were forcing much more complicated loops than required.  However,
we *should* be able to handle these complicated loops since we don't know when
they'll occur, so I'll post a companion patch to fix those too.

[Bug tree-optimization/113137] [14 regression] Failed bootstrap with -O3 -march=znver2

2023-12-28 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113137

--- Comment #9 from Tamar Christina  ---
Ok, have a working patch but it's a bit ugly, working on cleaning it up.

[Bug tree-optimization/113137] [14 regression] Failed bootstrap with -O3 -march=znver2

2023-12-27 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113137

David Binderman  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dcb314 at hotmail dot com

--- Comment #8 from David Binderman  ---
I see this one also in a build of package LFSC.

$ ~/gcc/results/bin/gcc -c -w -O3 bug992.cc
foundBugs $ ~/gcc/results/bin/gcc -c -w -O3 -march=znver2 bug992.cc 
/home/dcb38/rpmbuild/BUILD/LFSC-bbc1798864dbc328092356d4c01f02ddc39ea6bd/src/code.cpp: In function ‘Expr* read_code()’:
/home/dcb38/rpmbuild/BUILD/LFSC-bbc1798864dbc328092356d4c01f02ddc39ea6bd/src/code.cpp:112:7: error: PHI node with wrong VUSE on edge from BB 114
  112 | Expr *read_code()
      |       ^
.MEM_824 = PHI <.MEM_1387(114)>
expected .MEM_1042

[Bug tree-optimization/113137] [14 regression] Failed bootstrap with -O3 -march=znver2

2023-12-27 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113137

--- Comment #7 from Tamar Christina  ---
Have updated the memory analysis part to support inverted loops.  Now working on
wiring through virtual PHIs during peeling.

Aside from this missing part the CFG looks correct.  Will have a final patch in
a bit.

[Bug tree-optimization/113137] [14 regression] Failed bootstrap with -O3 -march=znver2

2023-12-26 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113137

Tamar Christina  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
     Ever confirmed|0                            |1
   Target Milestone|---                          |14.0
           Priority|P3                           |P1
             Status|UNCONFIRMED                  |ASSIGNED
   Last reconfirmed|                             |2023-12-26
           Assignee|unassigned at gcc dot gnu.org|tnfchris at gcc dot gnu.org

--- Comment #6 from Tamar Christina  ---
Thanks for the report and testcases.

[Bug tree-optimization/113137] [14 regression] Failed bootstrap with -O3 -march=znver2

2023-12-26 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113137

--- Comment #5 from Tamar Christina  ---
Simpler reproducer:

int b;
void a() __attribute__((__noreturn__));
void c() {
  char *buf;
  int bufsz = 64;
  while (b) {
    !bufsz ? a(), 0 : *buf++ = bufsz--;
    b -= 4;
  }
}

The loop has an inverted control flow that accesses memory.
The testsuite doesn't seem to have existing cases for these, but due to the
inverted nature the loop's main exit is before the latch exit, which we now
consider the early exit.

Because we only analyze early exits we miss that there's a memory reference
that needs to be moved, and because it wasn't moved the VUSEs don't line up.
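
To make the two exits and the store easier to see, here is a runnable rewrite
of the same loop with the control flow spelled out.  It is a sketch under
assumptions, not the testcase added to the testsuite: `c_explicit`, the `buf`
parameter and the returned store count are introduced for illustration, and
`a()` is given a body so the example links.  Which of the two exits the
vectorizer treats as the main one is not modelled here.

```cpp
#include <cstdlib>

int b;

// Stand-in body for the noreturn function from the reproducer.
[[noreturn]] void a() { std::abort(); }

// Same loop as the reproducer, with both exits written out explicitly.
// Returns the number of stores performed (added only for checking).
int c_explicit(char *buf) {
  int bufsz = 64;
  int stores = 0;
  for (;;) {
    if (!b) break;                        // exit controlled by b
    if (!bufsz) a();                      // exit into the noreturn call
    *buf++ = static_cast<char>(bufsz--);  // store: changes the memory state
    ++stores;
    b -= 4;
  }
  return stores;
}
```

Every iteration that reaches the store changes memory, so any exit taken after
it has to observe the updated virtual operand rather than the one from loop
entry.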

Will fix tomorrow when back at work.

[Bug tree-optimization/113137] [14 regression] Failed bootstrap with -O3 -march=znver2

2023-12-26 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113137

--- Comment #4 from Tamar Christina  ---
*** Bug 113135 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/113137] [14 regression] Failed bootstrap with -O3 -march=znver2

2023-12-26 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113137

--- Comment #2 from Tamar Christina  ---
*** Bug 113146 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/113137] [14 regression] Failed bootstrap with -O3 -march=znver2

2023-12-26 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113137

--- Comment #3 from Tamar Christina  ---
*** Bug 113139 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/113137] [14 regression] Failed bootstrap with -O3 -march=znver2

2023-12-25 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113137

--- Comment #1 from Sam James  ---
Created attachment 56937
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56937&action=edit
reduced.ii