On Tue, 20 Jun 2017, Alan Hayward wrote:

> 
> > On 19 Jun 2017, at 13:35, Richard Biener <rguent...@suse.de> wrote:
> >
> > On Mon, 19 Jun 2017, Christophe Lyon wrote:
> >
> >> Hi Richard,
> >>
> >> On 16 June 2017 at 14:18, Richard Biener <rguent...@suse.de> wrote:
> >>> On Wed, 14 Jun 2017, Richard Biener wrote:
> >>>
> >>>>
> >>>> niter estimates are not kept up-to-date (they reference gimple stmts
> >>>> and trees) in the keep-loop-stuff infrastructure so similar to the
> >>>> SCEV cache we rely on people freeing it after passes.
> >>>>
> >>>> The following brings us a step closer to that by freeing them whenever
> >>>> SCEV is invalidated (we only compute them when SCEV is active) plus
> >>>> removing the odd record-bounds pass that just computes them, leaving
> >>>> scavenging to following passes.
> >>>>
> >>>> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> >>>
> >>> Some awkward interactions with peeling means I'm installing the
> >>> following less aggressive variant.
> >>>
> >>> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.
> >>>
> >>> Richard.
> >>>
> >>> 2017-06-16  Richard Biener  <rguent...@suse.de>
> >>>
> >>>        PR tree-optimization/81090
> >>>        * passes.def (pass_record_bounds): Remove.
> >>>        * tree-pass.h (make_pass_record_bounds): Likewise.
> >>>        * tree-ssa-loop.c (pass_data_record_bounds, pass_record_bounds,
> >>>        make_pass_record_bounds): Likewise.
> >>>        * tree-ssa-loop-ivcanon.c (canonicalize_induction_variables): Do
> >>>        not free niter estimates at the beginning but at the end.
> >>>        * tree-scalar-evolution.c (scev_finalize): Free niter estimates.
> >>>
> >>>        * gcc.dg/graphite/pr81090.c: New testcase.
> >>>
> >>
> >> Sorry to bother you again...
> >> With this commit (r249249), I've noticed regressions on aarch64/arm:
> >> FAIL:    gcc.dg/vect/pr65947-9.c -flto -ffat-lto-objects
> >> scan-tree-dump-not vect "LOOP VECTORIZED"
> >> FAIL:    gcc.dg/vect/pr65947-9.c scan-tree-dump-not vect "LOOP VECTORIZED"
> >
> > So the testcase gets vectorized now (for whatever reason) and still passes
> > execution.  Not sure why the testcase checked for not being vectorized.
> >
> > Alan?
> >
> > Richard.
> 
> I’ve not looked at the new patch, but pr65947-9.c was added to test:
> 
> + /* Condition reduction with maximum possible loop size.  Will fail to
> +    vectorize because the vectorisation requires a slot for default values.  
> */
> 
> So, in the pr65947-9.c, if nothing passes the IF clause, then LAST needs to be
> set to -72.

So the runtime part of the testcase fails to test this case and we expect
it to FAIL if vectorized?

Index: testsuite/gcc.dg/vect/pr65947-9.c
===================================================================
--- testsuite/gcc.dg/vect/pr65947-9.c   (revision 249145)
+++ testsuite/gcc.dg/vect/pr65947-9.c   (working copy)
@@ -34,9 +34,9 @@ main (void)
 
   check_vect ();
 
-  char ret = condition_reduction (a, 16);
+  char ret = condition_reduction (a, 1);
 
-  if (ret != 10)
+  if (ret != -72)
     abort ();
 
   return 0;

On aarch64 I can reproduce the inline copy in main to be vectorized
(doesn't happen on x86_64).  niter analysis says:

Analyzing # of iterations of loop 1
  exit condition [253, + , 4294967295] != 0
  bounds on difference of bases: -253 ... -253
  result:
    # of iterations 253, bounded by 253
Analyzing # of iterations of loop 1
  exit condition [253, + , 4294967295] != 0
  bounds on difference of bases: -253 ... -253
  result:
    # of iterations 253, bounded by 253
Statement (exit)if (ivtmp_45 != 0)
 is executed at most 253 (bounded by 253) + 1 times in loop 1.

so it fits there.  While the offline copy has

Analyzing # of iterations of loop 1
  exit condition [254, + , 4294967295] != 0
  bounds on difference of bases: -254 ... -254
  result:
    # of iterations 254, bounded by 254
Analyzing # of iterations of loop 1
  exit condition [254, + , 4294967295] != 0
  bounds on difference of bases: -254 ... -254
  result:
    # of iterations 254, bounded by 254
Statement (exit)if (ivtmp_7 != 0)
 is executed at most 254 (bounded by 254) + 1 times in loop 1.

we peeled one iteration (ch_loop does that) so we have the place
left.

Marking the function noinline works as a fix I guess.

Tested on x86_64-unknown-linux-gnu, installed.

Richard.

2017-06-20  Richard Biener  <rguent...@suse.de>

        * gcc.dg/vect/pr65947-9.c: Adjust.

Index: gcc/testsuite/gcc.dg/vect/pr65947-9.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/pr65947-9.c       (revision 249145)
+++ gcc/testsuite/gcc.dg/vect/pr65947-9.c       (working copy)
@@ -9,10 +9,10 @@ extern void abort (void) __attribute__ (
 /* Condition reduction with maximum possible loop size.  Will fail to
    vectorize because the vectorisation requires a slot for default values.  */
 
-char
+signed char __attribute__((noinline,noclone))
 condition_reduction (char *a, char min_v)
 {
-  char last = -72;
+  signed char last = -72;
 
   for (int i = 0; i < N; i++)
     if (a[i] < min_v)
@@ -21,10 +21,10 @@ condition_reduction (char *a, char min_v
   return last;
 }
 
-char
-main (void)
+int
+main ()
 {
-  char a[N] = {
+  signed char a[N] = {
   11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
   1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
   21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
@@ -34,11 +34,14 @@ main (void)
 
   check_vect ();
 
-  char ret = condition_reduction (a, 16);
-
+  signed char ret = condition_reduction (a, 16);
   if (ret != 10)
     abort ();
 
+  ret = condition_reduction (a, 1);
+  if (ret != -72)
+    abort ();
+
   return 0;
 }
 

Reply via email to