[Bug middle-end/111125] [14 Regression] tree-ssa.exp and vect.exp failures after commit r14-3281-g99b5921bfc8f91

2023-08-24 Thread thiago.bauermann at linaro dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111125

--- Comment #8 from Thiago Jung Bauermann ---
Confirmed. All the failures I reported are fixed in trunk. Thank you!

2023-08-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

Richard Biener changed:

               What|Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from Richard Biener ---
Should be all fixed now.

2023-08-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

--- Comment #6 from CVS Commits ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:43da77a4f1636280c4259402c9c2c543e6ec6c0b

commit r14-3444-g43da77a4f1636280c4259402c9c2c543e6ec6c0b
Author: Richard Biener 
Date:   Thu Aug 24 11:10:43 2023 +0200

tree-optimization/111125 - avoid BB vectorization in novector loops

When a loop is marked with

  #pragma GCC novector

the following makes sure to also skip BB vectorization for contained
blocks.  That avoids gcc.dg/vect/bb-slp-29.c failing on aarch64
because of extra BB vectorization therein.  I'm not specifically
dealing with sub-loops of novector loops, since the desired semantics
are not documented.

PR tree-optimization/111125
* tree-vect-slp.cc (vect_slp_function): Split at novector
loop entry, do not push blocks in novector loops.
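
As an illustration of the case this commit addresses, here is a minimal
sketch of a checking loop marked with the pragma, modelled on the loop
quoted from gcc.dg/vect/bb-slp-29.c in comment #2.  The function names
compute and count_mismatches, the array size N and the values of A and B
are assumptions made for the example, not taken from the testcase.
Compilers that do not know the pragma are expected to warn and ignore it.

```c
#include <stddef.h>

#define N 16   /* assumed size, not taken from the testcase */
#define A 3    /* assumed coefficient */
#define B 4    /* assumed coefficient */

/* Fill dst from src; this loop may be vectorized as usual.  */
static void
compute (int *dst, const int *src)
{
  for (size_t i = 0; i < N / 2; i++)
    dst[i] = A * src[i] + B * src[i + 1];
}

/* Check the result in a loop that asks GCC not to vectorize it.
   With the fix, the blocks inside this loop are also skipped by
   BB vectorization.  Returns the number of mismatching elements.  */
static int
count_mismatches (const int *dst, const int *src)
{
  int bad = 0;
#pragma GCC novector
  for (size_t i = 0; i < N / 2; i++)
    if (dst[i] != A * src[i] + B * src[i + 1])
      bad++;
  return bad;
}
```

Whether the pragma was honoured can be inspected with GCC's vectorizer
dumps (for example -fopt-info-vec); the exact output depends on the
compiler version and target.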

2023-08-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

--- Comment #5 from CVS Commits ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:e80f7c13f64e10c6a3354c5d6b42da60b21ed0b8

commit r14-3440-ge80f7c13f64e10c6a3354c5d6b42da60b21ed0b8
Author: Richard Biener 
Date:   Thu Aug 24 10:30:12 2023 +0200

tree-optimization/111125 - properly cost BB reduction remain stmt handling

We assume that all root stmts which compose the total reduction chain
are vectorized but fail to account for the cost of adding back the
scalar defs we are not vectorizing.  The following rectifies this,
fixing the gcc.dg/tree-ssa/slsr-11.c FAIL on aarch64.

PR tree-optimization/111125
* tree-vect-slp.cc (vectorizable_bb_reduc_epilogue): Account
for the remain_defs processing.
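
The accounting described in this commit can be pictured with a toy model.
This is only a sketch under stated assumptions: GCC's real costing lives
in tree-vect-slp.cc and goes through target cost hooks, and every name
below (bb_reduc_cost, total_vector_cost, profitable_p, scalar_add_cost)
is hypothetical.  The idea shown is that each scalar def left out of the
vectorized reduction still needs an add to fold it back into the result,
and that add belongs on the vector side of the comparison.

```c
/* Toy model only; not GCC code.  All names are hypothetical.  */
struct bb_reduc_cost
{
  unsigned vector_cost;      /* cost of the vectorized root stmts */
  unsigned scalar_cost;      /* cost of the scalar stmts being replaced */
  unsigned n_remain_defs;    /* scalar defs that are not vectorized */
  unsigned scalar_add_cost;  /* assumed cost of one scalar add */
};

/* Vector-side cost once the adds that fold the remaining scalar defs
   back into the reduction result are counted.  */
static unsigned
total_vector_cost (const struct bb_reduc_cost *c)
{
  return c->vector_cost + c->n_remain_defs * c->scalar_add_cost;
}

/* A tie is treated as profitable here, in line with the dump in
   comment #1, where vectorization went ahead at 9 versus 9.  */
static int
profitable_p (const struct bb_reduc_cost *c)
{
  return total_vector_cost (c) <= c->scalar_cost;
}
```

With vector and scalar cost both at 9 and no remaining defs counted, the
toy model accepts the vectorization; counting even one remaining def at
cost 1 tips it to unprofitable.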

2023-08-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

--- Comment #4 from CVS Commits ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:308e716266787f84ba4a47546317dae83be8901c

commit r14-3436-g308e716266787f84ba4a47546317dae83be8901c
Author: Richard Biener 
Date:   Thu Aug 24 10:55:06 2023 +0200

testsuite/111125 - disable BB vectorization for the test

The test is for loop vectorization producing non-canonical
multiplications.  We can now BB vectorize the whole function
when the target supports .REDUC_PLUS for V2SImode but we don't
have a dejagnu selector for that.  Disable BB vectorization
like we disabled epilogue vectorization.

PR testsuite/111125
* gcc.dg/vect/pr53773.c: Disable BB vectorization.

2023-08-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

--- Comment #3 from Richard Biener ---
gcc.dg/vect/pr53773.c is interesting - we vectorize the function to

   [local count: 118111600]:
  _20 = {integral_4(D), decimal_5(D)};
  if (power_ten_6(D) > 0)
    goto ; [89.00%]
  else
    goto ; [11.00%]

   [local count: 955630224]:
  # power_ten_19 = PHI 
  # vect_integral_15.4_1 = PHI 
  vect_integral_9.5_12 = vect_integral_15.4_1 * { 10, 10 };
  power_ten_11 = power_ten_19 + -1;
  if (power_ten_11 != 0)
    goto ; [89.00%]
  else
    goto ; [11.00%]

   [local count: 118111600]:
  # vect_integral_16.7_21 = PHI 
  _22 = VIEW_CONVERT_EXPR(vect_integral_16.7_21);
  _23 = .REDUC_PLUS (_22); [tail call]
  _24 = (int) _23;
  return _24;

where loop vectorization fails because

/space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/pr53773.c:9:20: note: Analyze phi: integral_15 = PHI 
/space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/pr53773.c:9:20: missed: Peeling for epilogue is not supported for nonlinear induction except neg when iteration count is unknown.
/space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/pr53773.c:9:20: missed:  not vectorized: can't create required epilog loop

Loop vectorization doesn't try SLP here because we only SLP reduction groups,
not induction groups.

So I think this vectorization is quite nice, possibly even better than
the loop vectorization we expect.  Generated code:

foo:
.LFB0:
        .cfi_startproc
        fmov    s31, w0
        ins     v31.s[1], w1
        cmp     w2, 0
        ble     .L2
        movi    v30.2s, 0xa
        .p2align 3,,7
.L3:
        mul     v31.2s, v31.2s, v30.2s
        subs    w2, w2, #1
        bne     .L3
.L2:
        addp    v31.2s, v31.2s, v31.2s
        fmov    w0, s31
        ret

The path for power_ten == 0 is of course sub-optimal.  Note that with
costing enabled this is again deemed not profitable (we do not weight stmts
by profile, so in-loop stmts cost the same as out-of-loop stmts).

I'm going to adjust the testcase, disabling BB vectorization.
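
For readers without the testsuite at hand, the scalar function behind the
dump can be sketched as follows.  This is a reconstruction from the GIMPLE
and assembly quoted in this comment, not the text of pr53773.c; the
function and parameter names are taken from the dump and the assembly
label, while the exact loop form is an assumption.

```c
/* Scale both values by 10 power_ten times and return their sum.
   The two multiplications per iteration are what the BB vectorizer
   packs into one V2SI multiply, followed by a .REDUC_PLUS at the end.  */
int
foo (int integral, int decimal, int power_ten)
{
  while (power_ten > 0)
    {
      integral *= 10;
      decimal *= 10;
      power_ten--;
    }
  return integral + decimal;
}
```

For power_ten == 0 the loop is skipped entirely, which is the path on
which building the vector and reducing it again is pure overhead.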

2023-08-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

--- Comment #2 from Richard Biener ---
For gcc.dg/vect/bb-slp-29.c we are now vectorizing

#pragma GCC novector
  for (i = 0; i < N/2; i++)
    {
      if (dst[i] != A * src[i] + B * src[i+1])
        abort ();
    }

in particular the multiplication and the addition (but not the load which
had predictive commoning applied).  When cost modeling is enabled this
vectorization is not deemed profitable (but the vect testsuite runs with
-fno-vect-cost-model).

I wonder if we want to exempt basic blocks within loops marked with novector
from BB vectorization.

2023-08-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

Richard Biener changed:

               What|Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
             Status|UNCONFIRMED                   |ASSIGNED
   Last reconfirmed|                              |2023-08-24
     Ever confirmed|0                             |1

--- Comment #1 from Richard Biener ---
For gcc.dg/tree-ssa/slsr-11.c we vectorize the reduction to

   [local count: 1073741824]:
  _15 = {s_5(D), s_5(D)};
  vect_a3_11.3_16 = _15 * { 6, 4 };
  vect__3.4_17 = (vector(2) long int) vect_a3_11.3_16;
  a1_6 = s_5(D) * 2;
  _18 = VIEW_CONVERT_EXPR(vect__3.4_17);
  _19 = .REDUC_PLUS (_18);
  _20 = (unsigned long) a1_6;
  _21 = (unsigned long) c_7(D);
  _29 = _21 * 2;
  _31 = _19 + _29;
  _30 = _20 + _21;
  _27 = _30 + _31;
  _28 = (long int) _27;
  return _28;

note: Cost model analysis for part in loop 0:
  Vector cost: 9
  Scalar cost: 9

That doesn't look profitable.  I think there's something off with the scalar
accounting; I'll have a look there.
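
A scalar function consistent with the dump above can be sketched as
follows.  It is a reconstruction, not the text of slsr-11.c: s, c, a1 and
a3 appear in the dump, while the function name f and the names a2, x1, x2
and x3 are assumptions.  The dump sums 6*s and 4*s in the vector part and
adds 2*s, c and 2*c as scalars, which is what this computes.

```c
/* Three multiples of s, each added to c, then accumulated.
   The vectorizer turns the 6*s and 4*s products into one vector
   multiply and leaves the 2*s product and the c terms scalar.  */
long
f (int s, long c)
{
  int a1 = 2 * s;
  int a2 = 4 * s;
  int a3 = 6 * s;
  long x1 = c + a1;
  long x2 = c + a2;
  long x3 = c + a3;
  return x1 + x2 + x3;   /* equals 12 * s + 3 * c */
}
```

Only two of the three products end up in the vector, so the remaining
scalar adds are the part whose cost matters for the profitability
comparison.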

2023-08-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

Richard Biener changed:

               What|Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |14.0
           Keywords|                            |testsuite-fail
            Summary|tree-ssa.exp and vect.exp   |[14 Regression]
                   |failures after commit       |tree-ssa.exp and vect.exp
                   |r14-3281-g99b5921bfc8f91    |failures after commit
                   |                            |r14-3281-g99b5921bfc8f91