[Bug tree-optimization/43423] gcc should vectorize this loop through if-conversion

2021-09-13 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43423

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
   Target Milestone|---                         |7.0
             Status|ASSIGNED                    |RESOLVED

--- Comment #14 from Andrew Pinski  ---
So GCC 7 is able to optimize this loop fully and split it into two at -O3
(r7-3966) after my comment #12.

Also starting with GCC 7, the loop is vectorized at -O2 -ftree-vectorize,
since tree-if-conv.c can do the if-conversion (I don't have the
revision).

So this is all fixed anyways.
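
For readers without the test case at hand, the splitting mentioned above can be sketched at source level. The loop body below is reconstructed from the rewritten version quoted in comment #4, so its exact spelling is an assumption; the names foo_branchy and foo_split are hypothetical, and the second function only illustrates the transformation, it is not GCC's actual output.

```c
#include <assert.h>

#define N 100
int a[N], b[N], c[N];

/* Loop with a condition on the induction variable, as in this PR. */
void foo_branchy(int n, int mid)
{
  assert(n >= 0 && n <= N);
  for (int i = 0; i < n; i++)
    if (i < mid)
      a[i] = a[i] + b[i];
    else
      a[i] = a[i] + c[i];
}

/* Hand-split equivalent: one branch-free loop on each side of mid. */
void foo_split(int n, int mid)
{
  assert(n >= 0 && n <= N);
  /* Clamp mid into [0, n] so both loop ranges stay valid. */
  int split = mid < 0 ? 0 : (mid > n ? n : mid);
  int i = 0;
  for (; i < split; i++)
    a[i] = a[i] + b[i];
  for (; i < n; i++)
    a[i] = a[i] + c[i];
}
```

Each of the two loops in foo_split has a straight-line body, which is the shape the vectorizer handles without needing if-conversion.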

[Bug tree-optimization/43423] gcc should vectorize this loop through if-conversion

2021-07-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43423

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Severity|normal                      |enhancement
                 CC|                            |pinskia at gcc dot gnu.org

--- Comment #13 from Andrew Pinski  ---
The improvement in comment #12 is something which I am working on.

[Bug tree-optimization/43423] gcc should vectorize this loop through if-conversion

2016-08-27 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43423

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |53947

--- Comment #12 from Andrew Pinski  ---
.L8:
ldr q4, [x9, x8]
cmgt v2.4s, v6.4s, v0.4s
ldr q3, [x10, x8]
add w12, w12, 1
ldr q1, [x2, x8]
add v0.4s, v0.4s, v5.4s
add v3.4s, v3.4s, v4.4s << this one
add v1.4s, v1.4s, v4.4s  << this one
bit v1.16b, v3.16b, v2.16b
str q1, [x9, x8]
add x8, x8, 16
cmp w7, w12
bhi .L8

This is the trunk on aarch64-linux-gnu.  Range splitting is not there, but
more can be done even without range splitting: there is one extra add.

PRE produces:
  <bb 4>:
  _2 = b[i_18];
  _3 = _2 + pretmp_14;
  goto ;

  <bb 5>:
  _5 = c[i_18];
  _6 = _5 + pretmp_14;

  :
  # cstore_17 = PHI <_3(4), _6(5)>

But we could do better and do:
  <bb 4>:
  _2 = b[i_18];
  goto ;

  <bb 5>:
  _5 = c[i_18];

  :
  # _N = PHI <_2(4), _5(5)>
  _cstore_17 = _N + pretmp_14;
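
The same idea can be sketched at C level. This is only an illustration of the two shapes, not code taken from GCC: the function names are hypothetical, and ai stands in for pretmp_14, which is presumably the hoisted a[i] load (an assumption).

```c
#include <assert.h>

/* Shape PRE currently produces: one add on each arm, merged afterwards. */
int add_per_arm(int take_b, int ai, int bi, int ci)
{
  int t;
  if (take_b)
    t = bi + ai;   /* _3 = _2 + pretmp_14 */
  else
    t = ci + ai;   /* _6 = _5 + pretmp_14 */
  return t;        /* cstore_17 = PHI <_3, _6> */
}

/* Proposed shape: select the differing operand first, then add once. */
int add_after_select(int take_b, int ai, int bi, int ci)
{
  int operand = take_b ? bi : ci;  /* _N = PHI <_2, _5> */
  return operand + ai;             /* the single remaining add */
}

/* Hypothetical self-check: both shapes agree on a few sample inputs. */
void check_same_result(void)
{
  for (int take_b = 0; take_b <= 1; take_b++)
    for (int ai = -2; ai <= 2; ai++)
      assert(add_per_arm(take_b, ai, 10, 20)
             == add_after_select(take_b, ai, 10, 20));
}
```

In the vectorized loop this would correspond to selecting between the b and c vectors before the add, so only one vector add is needed per iteration.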


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations

[Bug tree-optimization/43423] gcc should vectorize this loop through if-conversion

2010-05-25 Thread spop at gcc dot gnu dot org


--- Comment #11 from spop at gcc dot gnu dot org  2010-05-25 23:33 ---
This is not an IV type problem: the number of iterations may be zero when mid ==
0 or mid == n, so the number of iterations analysis has a condition under which
niter may_be_zero.
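
A small sketch of why either half can be empty, assuming the iteration space [0, n) is split at mid. The struct and function names are hypothetical; this is plain arithmetic on the loop bounds and has nothing to do with GCC's internal niter API.

```c
#include <assert.h>

/* Trip counts of the two loops obtained by splitting [0, n) at mid. */
struct split_niter
{
  int first;   /* iterations with i < mid  */
  int second;  /* iterations with i >= mid */
};

struct split_niter split_trip_counts(int n, int mid)
{
  assert(n >= 0);
  /* Clamp mid into [0, n] so neither count goes negative. */
  int split = mid < 0 ? 0 : (mid > n ? n : mid);
  struct split_niter r = { split, n - split };
  return r;
}
```

With mid == 0 the first count is zero, and with mid == n the second count is zero, which is the condition the analysis has to carry along for each split loop.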

I sent out a patch that makes niter return a COND_EXPR
instead of a chrec_dont_know:
http://gcc.gnu.org/ml/gcc-patches/2010-05/msg01927.html

With that patch I now get 
  note: not vectorized: data ref analysis failed D.2726_51 = a[var.9_55];


[Bug tree-optimization/43423] gcc should vectorize this loop through if-conversion

2010-05-24 Thread spop at gcc dot gnu dot org


--- Comment #10 from spop at gcc dot gnu dot org  2010-05-24 23:02 ---
note: not vectorized: number of iterations cannot be computed.

Graphite has a problem with the generation of induction variable types
that makes the code harder to analyze after Graphite.  I will try to get this
fixed so that this loop is vectorized with the iteration range splitting that
Graphite does by default.

Sebastian


[Bug tree-optimization/43423] gcc should vectorize this loop through if-conversion

2010-05-24 Thread changpeng dot fang at amd dot com


--- Comment #9 from changpeng dot fang at amd dot com  2010-05-24 22:47 ---
(In reply to comment #8)
> -fgraphite-identity does iteration splitting for this case.

Do you know why it could not be vectorized after iteration 
range splitting?


[Bug tree-optimization/43423] gcc should vectorize this loop through if-conversion

2010-05-24 Thread spop at gcc dot gnu dot org


--- Comment #8 from spop at gcc dot gnu dot org  2010-05-24 22:44 ---
-fgraphite-identity does iteration splitting for this case.


[Bug tree-optimization/43423] gcc should vectorize this loop through if-conversion

2010-05-07 Thread changpeng dot fang at amd dot com


--- Comment #7 from changpeng dot fang at amd dot com  2010-05-07 21:41 ---
(In reply to comment #4)
> (In reply to comment #3)
> > Subject: Re:  gcc should vectorize this loop 
> > through "iteration range splitting"
> > You mean that the problem is the if-conversion of the stores
> > "a[i] = ..."
> 
> If we rewrite the code like:
> int a[100], b[100], c[100];
> 
> void foo(int n, int mid)
> {
>   int i;
>   for(i=0; i<n; i++)
>     {
>       int t;
>       int ai = a[i], bi = b[i], ci = c[i];
>       if (i < mid)
>         t = ai + bi;
>       else
>         t = ai + ci;
>       a[i] = t;
>     }
> }
> 
> --- CUT ---
> This gets vectorized as we produce an if-cvt first.
> 

There are both correctness and performance issues in the rewritten code:
it reads both b[i] and c[i] on every iteration, whereas the original loop
reads only one of them per iteration.
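
The difference in memory accesses can be made visible with a small instrumented sketch. The counters and the load_b/load_c helpers are hypothetical additions for illustration, and the body of foo_original is reconstructed from the discussion, so treat its exact form as an assumption.

```c
#include <assert.h>

#define N 100
int a[N], b[N], c[N];
int loads_b, loads_c;  /* hypothetical instrumentation counters */

static int load_b(int i) { loads_b++; return b[i]; }
static int load_c(int i) { loads_c++; return c[i]; }

/* Original shape: each iteration reads either b[i] or c[i], never both. */
void foo_original(int n, int mid)
{
  assert(n >= 0 && n <= N);
  for (int i = 0; i < n; i++)
    if (i < mid)
      a[i] = a[i] + load_b(i);
    else
      a[i] = a[i] + load_c(i);
}

/* Rewritten shape from comment #4: both loads happen unconditionally. */
void foo_rewritten(int n, int mid)
{
  assert(n >= 0 && n <= N);
  for (int i = 0; i < n; i++)
    {
      int t;
      int ai = a[i], bi = load_b(i), ci = load_c(i);
      if (i < mid)
        t = ai + bi;
      else
        t = ai + ci;
      a[i] = t;
    }
}
```

For n = 100 and mid = 37, foo_original performs 37 loads from b and 63 from c, while foo_rewritten performs 100 from each. The extra loads cost bandwidth, and they would be invalid if b or c were only readable over the range the original loop touches.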


[Bug tree-optimization/43423] gcc should vectorize this loop through if-conversion

2010-04-08 Thread spop at gcc dot gnu dot org


--- Comment #6 from spop at gcc dot gnu dot org  2010-04-08 17:47 ---
I changed the title of this bug to match the comments in the PR:
we should vectorize this loop using if-conversion, and not "iteration
range splitting".

Also note that in general, by doing an "iteration range splitting" the data 
locality in the two loops could be worse than in the if-converted loop.


-- 

spop at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|gcc should vectorize this   |gcc should vectorize this
                   |loop through "iteration     |loop through if-conversion
                   |range splitting"            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43423