[Bug tree-optimization/61194] [4.9/4.10 Regression] vectorization failed with bit-precision arithmetic not supported even if conversion to int is requested

2014-07-16 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61194

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

   Target Milestone|4.9.1   |4.9.2

--- Comment #16 from Jakub Jelinek jakub at gcc dot gnu.org ---
GCC 4.9.1 has been released.


[Bug tree-optimization/61194] [4.9/4.10 Regression] vectorization failed with bit-precision arithmetic not supported even if conversion to int is requested

2014-05-17 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61194

--- Comment #15 from Marc Glisse glisse at gcc dot gnu.org ---
Seems related to PR 57328.


[Bug tree-optimization/61194] [4.9/4.10 Regression] vectorization failed with bit-precision arithmetic not supported even if conversion to int is requested

2014-05-16 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61194

--- Comment #6 from Richard Biener rguenth at gcc dot gnu.org ---
Created attachment 32803
  -- https://gcc.gnu.org/bugzilla/attachment.cgi?id=32803&action=edit
patch

Like this.


[Bug tree-optimization/61194] [4.9/4.10 Regression] vectorization failed with bit-precision arithmetic not supported even if conversion to int is requested

2014-05-16 Thread vincenzo.innocente at cern dot ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61194

--- Comment #7 from vincenzo Innocente vincenzo.innocente at cern dot ch ---
great!

the original version (that vectorized in 4.8.1)
void barX() {
  for (int i=0; i<1024; ++i) {
    k[i] = (x[i]>0) & (w[i]<y[i]);
    z[i] = (k[i]) ? z[i] : y[i];
  }
}

does not vectorize yet.

On the other hand I am very happy to see
void bar() {
  for (int i=0; i<1024; ++i) {
    auto c = ( (x[i]>0) & (w[i]<y[i])) | (y[i]>0.5f);
    z[i] = c ? y[i] : z[i];
  }
}
vectorized
if (c) z[i] = y[i];
does not even with -ftree-loop-if-convert-stores
not a real issue at least for what I am concerned


[Bug tree-optimization/61194] [4.9/4.10 Regression] vectorization failed with bit-precision arithmetic not supported even if conversion to int is requested

2014-05-16 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61194

--- Comment #8 from rguenther at suse dot de rguenther at suse dot de ---
On Fri, 16 May 2014, vincenzo.innocente at cern dot ch wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61194
> 
> --- Comment #7 from vincenzo Innocente vincenzo.innocente at cern dot ch ---
> great!
> 
> the original version (that vectorized in 4.8.1)
> void barX() {
>   for (int i=0; i<1024; ++i) {
>     k[i] = (x[i]>0) & (w[i]<y[i]);
>     z[i] = (k[i]) ? z[i] : y[i];
>   }
> }
> 
> does not vectorize yet.

That's because we hit

check_bool_pattern (var=ssa_name 0x76c36e10, loop_vinfo=0x1f3e900, bb_vinfo=0x0)
    at /space/rguenther/src/svn/trunk/gcc/tree-vect-patterns.c:2596
2596                               dt))
...
2605          if (!has_single_use (def))
2606            return false;

because

  _5 = x[i_18];
  _6 = _5 > 0.0;
  _7 = w[i_18];
  _8 = y[i_18];
  _9 = _7 < _8;
  _10 = _9 & _6;
  _11 = (int) _10;
  k[i_18] = _11;
  iftmp.0_13 = z[i_18];
  iftmp.0_2 = _10 ? iftmp.0_13 : _8;

thus we have CSEd the load from k and propagated from the
conversion.  VRP does this:

   _11 = (int) _10;
-  k[i_1] = _11;
-  if (_11 != 0)
+  k[i_18] = _11;
+  if (_10 != 0)

and -fno-tree-vrp fixes the regression.  If k were of type
_Bool then it likely wouldn't vectorize with 4.8 either.

The vectorizer cannot handle multi-uses of a pattern part
(in this case it's the start which would be doable, but it's
far from trivial ...).  That said,

static float x[1024];
static float y[1024];
static float z[1024];
static float w[1024];

static _Bool k[1024];

void __attribute__((noinline,noclone)) barX()
{
  int i;
  for (i=0; i<1024; ++i) {
    k[i] = (x[i]>0) & (w[i]<y[i]);
    z[i] = (k[i]) ? z[i] : y[i];
  }
}

is not vectorized even in 4.8 for the cited reason.

> On the other hand I am very happy to see
> void bar() {
>   for (int i=0; i<1024; ++i) {
>     auto c = ( (x[i]>0) & (w[i]<y[i])) | (y[i]>0.5f);
>     z[i] = c ? y[i] : z[i];
>   }
> }
> vectorized
> if (c) z[i] = y[i];
> does not even with -ftree-loop-if-convert-stores
> not a real issue at least for what I am concerned

I think it doesn't introduce data races unless you
also specify --param allow-store-data-races=1.

I also don't see the testcases vectorized when using
&& instead of &.

If not already there, these warrant (different) bugreports.
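
For readers comparing the two store forms discussed above, here is a minimal C sketch. It only checks that the ternary form and the conditional-store form give the same result in single-threaded code. It says nothing about which one a given GCC version vectorizes. The function names, the pointer-based signatures, the input values and the comparison directions are assumptions made for illustration and are not taken from the report.

```c
#include <stddef.h>

/* Ternary form: z[i] is stored on every iteration. */
void select_ternary(float *z, const float *x, const float *y,
                    const float *w, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        int c = ((x[i] > 0) & (w[i] < y[i])) | (y[i] > 0.5f);
        z[i] = c ? y[i] : z[i];
    }
}

/* Conditional-store form: z[i] is written only when c is true. */
void select_if_store(float *z, const float *x, const float *y,
                     const float *w, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        int c = ((x[i] > 0) & (w[i] < y[i])) | (y[i] > 0.5f);
        if (c)
            z[i] = y[i];
    }
}

/* Hypothetical helper: fill arbitrary inputs, run both forms,
   and return the number of elements where the results differ. */
int count_select_mismatches(void)
{
    enum { N = 1024 };
    static float x[N], y[N], w[N], z1[N], z2[N];
    int bad = 0;

    for (int i = 0; i < N; ++i) {
        x[i] = (float)(i % 7) - 3.0f;
        y[i] = (float)(i % 5) * 0.25f;
        w[i] = (float)(i % 3) - 1.0f;
        z1[i] = z2[i] = -1.0f;
    }
    select_ternary(z1, x, y, w, N);
    select_if_store(z2, x, y, w, N);
    for (int i = 0; i < N; ++i)
        bad += (z1[i] != z2[i]);
    return bad;
}
```

The observable difference between the two forms is that the ternary version writes z[i] unconditionally, which is why the data-race remark above applies to converting the second form into the first.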


[Bug tree-optimization/61194] [4.9/4.10 Regression] vectorization failed with bit-precision arithmetic not supported even if conversion to int is requested

2014-05-16 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61194

--- Comment #9 from Richard Biener rguenth at gcc dot gnu.org ---
Author: rguenth
Date: Fri May 16 11:21:11 2014
New Revision: 210514

URL: http://gcc.gnu.org/viewcvs?rev=210514&root=gcc&view=rev
Log:
2014-05-16  Richard Biener  rguent...@suse.de

PR tree-optimization/61194
* tree-vect-patterns.c (adjust_bool_pattern): Also handle
bool patterns ending in a COND_EXPR.

* gcc.dg/vect/pr61194.c: New testcase.

Added:
trunk/gcc/testsuite/gcc.dg/vect/pr61194.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vect-patterns.c
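
The kind of loop this change targets, a bool pattern that feeds a COND_EXPR directly, can be sketched as a small self-checking C fragment. This is not the content of the committed pr61194.c. The function names, the input values and the comparison directions are assumptions chosen so that the expected result is easy to state.

```c
static float x[1024], y[1024], z[1024], w[1024];

/* The select condition is a bitwise AND of two comparisons,
   used directly as the condition of the ?: expression. */
void barX_inline(void)
{
    for (int i = 0; i < 1024; ++i)
        z[i] = ((x[i] > 0) & (w[i] < 0)) ? z[i] : y[i];
}

/* Hypothetical driver: initialise the arrays, run the loop, and
   return how many elements differ from a scalar reference using &&. */
int check_barX_inline(void)
{
    int bad = 0;

    for (int i = 0; i < 1024; ++i) {
        x[i] = -10.0f + (float)i;   /* positive once i > 10  */
        w[i] = 100.0f - (float)i;   /* negative once i > 100 */
        z[i] = 0.0f;
        y[i] = 1.0f;
    }
    barX_inline();
    for (int i = 0; i < 1024; ++i) {
        float expect = (x[i] > 0 && w[i] < 0) ? 0.0f : 1.0f;
        bad += (z[i] != expect);
    }
    return bad;
}
```

Compiling such a file with -fopt-info-vec, as in the command shown later in this thread, is one way to see whether the loop is reported as vectorized by a particular compiler build.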


[Bug tree-optimization/61194] [4.9/4.10 Regression] vectorization failed with bit-precision arithmetic not supported even if conversion to int is requested

2014-05-16 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61194

--- Comment #10 from Richard Biener rguenth at gcc dot gnu.org ---
Created attachment 32805
  -- https://gcc.gnu.org/bugzilla/attachment.cgi?id=32805&action=edit
patch fixing the regression

This would fix the regression (also without the previous patch?)


[Bug tree-optimization/61194] [4.9/4.10 Regression] vectorization failed with bit-precision arithmetic not supported even if conversion to int is requested

2014-05-16 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61194

--- Comment #11 from Richard Biener rguenth at gcc dot gnu.org ---
(In reply to Richard Biener from comment #10)
> Created attachment 32805 [details]
> patch fixing the regression
> 
> This would fix the regression (also without the previous patch?)

It does, on the 4.9 branch at least, for

static float x[1024];
static float y[1024];
static float z[1024];
static float w[1024];

static int k[1024];

void __attribute__((noinline,noclone)) barX()
{
  int i;
  for (i=0; i<1024; ++i)
    {
      k[i] = x[i]>0;
      k[i] &= w[i]<y[i];
      z[i] = (k[i]) ? z[i] : y[i];
    }
}

but it doesn't change the outcome of the big testcase in the original report.
It does together with the other patch though:

 g++-4.9 t.C -Ofast -ftree-loop-if-convert-stores  -fopt-info-vec -B. -fopenmp
t.C:11:5: note: loop vectorized
t.C:19:23: note: loop vectorized
t.C:24:5: note: loop vectorized
t.C:29:5: note: loop vectorized
t.C:35:5: note: loop vectorized
t.C:41:5: note: loop vectorized
t.C:47:5: note: loop vectorized

bar2 still not vectorized there.

But with 4.7 I see the same as with 4.8 and 4.9:

35: LOOP VECTORIZED.
41: LOOP VECTORIZED.
47: LOOP VECTORIZED.

so where exactly does the regression part appear for you?  Is that only
for the code in comment#1?


[Bug tree-optimization/61194] [4.9/4.10 Regression] vectorization failed with bit-precision arithmetic not supported even if conversion to int is requested

2014-05-16 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61194

--- Comment #12 from Richard Biener rguenth at gcc dot gnu.org ---
void bar2() {
  for (int i=0; i<1024; ++i) {
    k[i] = x[i]>0; j[i] = w[i]<0;
    z[i] = ( k[i] & j[i]) ? z[i] : y[i];
  }
}

has similar issues (non-single-uses due to CSE and propagating from the
conversion sources):

  _5 = x[i_20];
  _6 = _5 > 0.0;
  _7 = (int) _6;
  k[i_20] = _7;
  _9 = w[i_20];
  _10 = _9 < 0.0;
  _11 = (int) _10;
  j[i_20] = _11;
  _18 = _10 & _6;
  iftmp.0_14 = z[i_20];
  iftmp.0_15 = y[i_20];
  iftmp.0_2 = _18 ? iftmp.0_14 : iftmp.0_15;
  z[i_20] = iftmp.0_2;

This is generally caused by optimizing code to use smaller precisions.  So
I think we need a more general solution for this than just the 2nd patch
I attached (which I won't pursue - I figure the first one would be way
more useful as it results in the same result for your initial large testcase
where the 2nd patch doesn't make a difference).
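
To make the "non-single-use" point concrete, the following C sketch writes the bar2 loop twice: once as in the source, and once by hand in a shape that mirrors the dump above, where each comparison result (_6, _10) is used both for the int store and for the AND. It is only an illustration of the data flow. The function names, the pointer-based signatures, the test inputs and the comparison directions are assumptions, and the second function is not compiler output.

```c
enum { BAR2_N = 1024 };

/* As written in the source: flags are stored to k and j, then reloaded. */
void bar2_as_written(float *z, const float *x, const float *y,
                     const float *w, int *k, int *j)
{
    for (int i = 0; i < BAR2_N; ++i) {
        k[i] = x[i] > 0;
        j[i] = w[i] < 0;
        z[i] = (k[i] & j[i]) ? z[i] : y[i];
    }
}

/* Hand-written mirror of the dump: the reloads are gone, so each
   boolean has two uses (the widening store and the AND). */
void bar2_like_dump(float *z, const float *x, const float *y,
                    const float *w, int *k, int *j)
{
    for (int i = 0; i < BAR2_N; ++i) {
        _Bool b6 = x[i] > 0.0f;    /* _6  */
        k[i] = (int)b6;            /* first use of _6  */
        _Bool b10 = w[i] < 0.0f;   /* _10 */
        j[i] = (int)b10;           /* first use of _10 */
        _Bool b18 = b10 & b6;      /* _18: second use of both */
        z[i] = b18 ? z[i] : y[i];
    }
}

/* Hypothetical helper: run both versions on the same inputs and
   return the number of differing elements across z, k and j. */
int count_bar2_mismatches(void)
{
    static float x[BAR2_N], y[BAR2_N], w[BAR2_N], z1[BAR2_N], z2[BAR2_N];
    static int k1[BAR2_N], j1[BAR2_N], k2[BAR2_N], j2[BAR2_N];
    int bad = 0;

    for (int i = 0; i < BAR2_N; ++i) {
        x[i] = (float)(i % 4) - 1.5f;
        w[i] = (float)(i % 6) - 2.5f;
        y[i] = 1.0f;
        z1[i] = z2[i] = 0.0f;
    }
    bar2_as_written(z1, x, y, w, k1, j1);
    bar2_like_dump(z2, x, y, w, k2, j2);
    for (int i = 0; i < BAR2_N; ++i)
        bad += (z1[i] != z2[i]) + (k1[i] != k2[i]) + (j1[i] != j2[i]);
    return bad;
}
```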


[Bug tree-optimization/61194] [4.9/4.10 Regression] vectorization failed with bit-precision arithmetic not supported even if conversion to int is requested

2014-05-16 Thread vincenzo.innocente at cern dot ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61194

--- Comment #13 from vincenzo Innocente vincenzo.innocente at cern dot ch ---
I confirm that with the last patch the regression is also gone in a more
complex actual application I had.

The regression concerns only comments 2 and 3.

All the other cases in comment 1 were various attempts of mine to see whether
anything had changed that allowed vectorization using a different syntax.
I am happy that they all vectorize now (except bar2...).

When I wrote the original test case in 2011, I introduced the int vector to
make it vectorize (most probably I also submitted a bug report on the subject).


[Bug tree-optimization/61194] [4.9/4.10 Regression] vectorization failed with bit-precision arithmetic not supported even if conversion to int is requested

2014-05-16 Thread vincenzo.innocente at cern dot ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61194

--- Comment #14 from vincenzo Innocente vincenzo.innocente at cern dot ch ---
Provided that future patches make the code in comments 1 and 2 (and bar)
vectorize, that is fine with me.
If it also ends up vectorizing with bool instead of int, even better.
(I am not sure that bit/byte handling is really more efficient in SSE and AVX
than plain 32-bit int.)


[Bug tree-optimization/61194] [4.9/4.10 Regression] vectorization failed with bit-precision arithmetic not supported even if conversion to int is requested

2014-05-15 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61194

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                      |Added

             Status|UNCONFIRMED                  |ASSIGNED
   Last reconfirmed|                             |2014-05-15
           Assignee|unassigned at gcc dot gnu.org|rguenth at gcc dot gnu.org
            Summary|[4.9 Regression]             |[4.9/4.10 Regression]
                   |vectorization failed with    |vectorization failed with
                   |bit-precision arithmetic     |bit-precision arithmetic
                   |not supported even if        |not supported even if
                   |conversion to int is         |conversion to int is
                   |requested                    |requested
     Ever confirmed|0                            |1

--- Comment #3 from Richard Biener rguenth at gcc dot gnu.org ---
I see on trunk after if-conversion

  _6 = _5 > 0.0;
  _9 = _7 < _8;
  _10 = _9 & _6;
  _11 = (int) _10;
  k[i_18] = _11;
  iftmp.0_13 = z[i_18];
  iftmp.0_2 = _10 ? iftmp.0_13 : _8;
  z[i_18] = iftmp.0_2;

so what happens is that we do have bit-precision arithmetic with the
bitwise and.

This is a regression because of the way we lower comparisons now I guess.

I will have a look.


[Bug tree-optimization/61194] [4.9/4.10 Regression] vectorization failed with bit-precision arithmetic not supported even if conversion to int is requested

2014-05-15 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61194

--- Comment #4 from Richard Biener rguenth at gcc dot gnu.org ---
Actually the vectorizer punts on the comparisons itself.  The pattern
recognizer handles some of them as

  patt_10 = _4 > 0.0 ? 1 : 0;

but not those feeding the BIT expressions which would need to be widened then
(though they are supported as bit-precision).


[Bug tree-optimization/61194] [4.9/4.10 Regression] vectorization failed with bit-precision arithmetic not supported even if conversion to int is requested

2014-05-15 Thread vincenzo.innocente at cern dot ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61194

--- Comment #5 from vincenzo Innocente vincenzo.innocente at cern dot ch ---
Of course, if you can make
z[i] = ( (x[i]>0) & (w[i]<0)) ? z[i] : y[i];
vectorize, that would be even better!