[Bug tree-optimization/50596] Problems in vectorization of condition expression

2011-10-25 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596

--- Comment #18 from Jakub Jelinek jakub at gcc dot gnu.org 2011-10-25 
08:02:16 UTC ---
Author: jakub
Date: Tue Oct 25 08:02:08 2011
New Revision: 180424

URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=180424
Log:
PR tree-optimization/50596
* tree-vect-stmts.c (vect_mark_relevant): Only use
FOR_EACH_IMM_USE_FAST if lhs is SSA_NAME.
(vectorizable_store): If is_pattern_stmt_p look through
VIEW_CONVERT_EXPR on lhs.
* tree-vect-patterns.c (check_bool_pattern, adjust_bool_pattern):
Use unsigned type instead of signed.
(vect_recog_bool_pattern): Optimize also stores into bool memory in
addition to casts from bool to integral types.
(vect_mark_pattern_stmts): If pattern_stmt already has vinfo
created, don't create it again.

* gcc.dg/vect/vect-cond-10.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/vect/vect-cond-10.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vect-patterns.c
trunk/gcc/tree-vect-stmts.c


[Bug tree-optimization/50596] Problems in vectorization of condition expression

2011-10-25 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||FIXED

--- Comment #19 from Jakub Jelinek jakub at gcc dot gnu.org 2011-10-25 
08:23:32 UTC ---
Bool stores are handled now too.


[Bug tree-optimization/50596] Problems in vectorization of condition expression

2011-10-16 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596

--- Comment #16 from Jakub Jelinek jakub at gcc dot gnu.org 2011-10-16 
13:10:26 UTC ---
Author: jakub
Date: Sun Oct 16 13:10:20 2011
New Revision: 180057

URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=180057
Log:
PR tree-optimization/50596
* tree-vectorizer.h (NUM_PATTERNS): Increase to 7.
* tree-vect-patterns.c (vect_vect_recog_func_ptrs): Add
vect_recog_bool_pattern.
(check_bool_pattern, adjust_bool_pattern_cast,
adjust_bool_pattern, vect_recog_bool_pattern): New functions.

* gcc.dg/vect/vect-cond-9.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/vect/vect-cond-9.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vect-patterns.c
trunk/gcc/tree-vectorizer.h


[Bug tree-optimization/50596] Problems in vectorization of condition expression

2011-10-16 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596

--- Comment #17 from vincenzo Innocente vincenzo.innocente at cern dot ch 
2011-10-16 13:47:22 UTC ---
cool!
even
signed char k[1024];
61void foo6() {
62  for (int i=0; i!=N; ++i)
63k[i] = (a[i]b[i])  (c[i]d[i]);
vectorize!
with 
bool k[1024];
does not. I can survive though.
I will have to measure performance. I suspect that using
int k[1024];
will be faster…
Anyhow great achievement


[Bug tree-optimization/50596] Problems in vectorization of condition expression

2011-10-07 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596

--- Comment #13 from vincenzo Innocente vincenzo.innocente at cern dot ch 
2011-10-07 07:35:40 UTC ---
is not PR50649 caused by your changes?


[Bug tree-optimization/50596] Problems in vectorization of condition expression

2011-10-07 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596

--- Comment #14 from vincenzo Innocente vincenzo.innocente at cern dot ch 
2011-10-07 10:15:03 UTC ---
signed char k[1024];
void foo6() {
  for (int i=0; i!=N; ++i)
k[i] = (a[i]b[i] ? -1 : 0)  (c[i]d[i] ? -1 : 0);
}

requires -fno-tree-pre to vectorize
w/o I get
not vectorized: relevant stmt not supported: prephitmp.214_16 = D.2173_10 
D.2174_11 ? iftmp.2_2 : 0;



btw in almost all code of mine -fno-tree-pre produces always faster code when
vectorization matters!


[Bug tree-optimization/50596] Problems in vectorization of condition expression

2011-10-07 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596

--- Comment #15 from Jakub Jelinek jakub at gcc dot gnu.org 2011-10-07 
10:31:13 UTC ---
float a[1024], b[1024], c[1024], d[1024];
int j[1024];

void
f1 (void)
{
  int i;
  for (i = 0; i  1024; ++i)
{
  unsigned int x = a[i]  b[i] ? -1 : 0;
  unsigned int y = c[i]  d[i] ? -1 : 0;
  j[i] = (x  y)  31;
}
}

vectorizes fine and generates quite good code IMHO.  Something similar I'd like
to achieve with the vect_recog_bool_pattern I'm working on even for some of
your testcases.


[Bug tree-optimization/50596] Problems in vectorization of condition expression

2011-10-06 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596

--- Comment #9 from Jakub Jelinek jakub at gcc dot gnu.org 2011-10-06 
11:57:54 UTC ---
Created attachment 25428
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=25428
gcc47-vect-condexpr-mixed.patch

I believe at least some simple case of bool could be handled in
tree-vect-patterns.c by transforming bool lhs assignments with comparison on
rhs
into COND_EXPRs on char/short/int/long (depending on the comparison operand
size), /|/^ could be handled too and finally either cast to some integer type
or memory store).  Before trying to write it, I tried to write something
simpler, in particular a pattern recognizer that allows to vectorize mixed size
type COND_EXPRs (so far only with INTEGER_CST then/else).  For the case where
COND_EXPR lhs type is wider than comparison type I think it must be
INTEGER_CSTs, otherwise we can't ensure that they fit into the narrower integer
type.  But for lhs type narrower than comparison type
  lhs = cmp0  cmp1 ? val1 : val2;
(where sizeof (lhs)  sizeof (cmp0)) the above in theory could be transformed
into (for itype an integer type with the same sign as val1's type, but size of
cmp0) into:
  val1' = (itype) val1;
  val2' = (itype) val2;
  lhs' = cmp0  cmp1 ? val1' : val2';
  lhs = (__typeof (lhs)) lhs';
but we'd need more than one def_stmt for that.

This patch allows e.g. vectorization of:
float a[1024], b[1024];
unsigned char k[1024];

void
foo (void)
{
  int i;
  for (i = 0; i  1024; ++i)
k[i] = a[i]  b[i] ? -1 : 0;
}
on i?86/x86_64 which couldn't be previously vectorized.

Ira, does this sound reasonable?  How should a testcase look like (I think it
will be currently only vectorized on i?86/x86_64, as it needs mixed mode vcond
support, which, while probably implementable for e.g. altivec, is currently
i386 backend only feature)?  If this makes sense, I'll try to do the bool
pattern recognition next.


[Bug tree-optimization/50596] Problems in vectorization of condition expression

2011-10-06 Thread irar at il dot ibm.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596

Ira Rosen irar at il dot ibm.com changed:

   What|Removed |Added

 CC||irar at il dot ibm.com

--- Comment #10 from Ira Rosen irar at il dot ibm.com 2011-10-06 12:31:29 UTC 
---
(In reply to comment #9)

 
 Ira, does this sound reasonable?  

Looks good to me. (You can probably use build_nonstandard_integer_type()
instead of lang_hooks.types.type_for_mode).

 How should a testcase look like (I think it
 will be currently only vectorized on i?86/x86_64, as it needs mixed mode vcond
 support, which, while probably implementable for e.g. altivec, is currently
 i386 backend only feature)?  

I am not sure I understand the question. Are you asking how to check that it
gets vectorized only on i?86/x86_64? If so, you need a new proc in
lib/target-supports.exp (something like vect_cond_mixed_types).


[Bug tree-optimization/50596] Problems in vectorization of condition expression

2011-10-06 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

  Attachment #25428|0   |1
is obsolete||

--- Comment #11 from Jakub Jelinek jakub at gcc dot gnu.org 2011-10-06 
13:30:36 UTC ---
Created attachment 25429
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=25429
/tmp/gcc47-vect-condexpr-mixed.patch

Thanks, here is an updated patch.


[Bug tree-optimization/50596] Problems in vectorization of condition expression

2011-10-06 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596

--- Comment #12 from Jakub Jelinek jakub at gcc dot gnu.org 2011-10-06 
17:49:43 UTC ---
Author: jakub
Date: Thu Oct  6 17:49:36 2011
New Revision: 179626

URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=179626
Log:
PR tree-optimization/50596
* tree-vectorizer.h (vect_is_simple_cond): New prototype.
(NUM_PATTERNS): Change to 6.
* tree-vect-patterns.c (vect_recog_mixed_size_cond_pattern): New
function.
(vect_vect_recog_func_ptrs): Add vect_recog_mixed_size_cond_pattern.
(vect_mark_pattern_stmts): Don't create stmt_vinfo for def_stmt
if it already has one, and don't set STMT_VINFO_VECTYPE in it
if it is already set.
* tree-vect-stmts.c (vect_mark_stmts_to_be_vectorized): Handle
COND_EXPR in pattern stmts.
(vect_is_simple_cond): No longer static.

* lib/target-supports.exp (check_effective_target_vect_cond_mixed):
New.
* gcc.dg/vect/vect-cond-8.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/vect/vect-cond-8.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/lib/target-supports.exp
trunk/gcc/tree-vect-patterns.c
trunk/gcc/tree-vect-stmts.c
trunk/gcc/tree-vectorizer.h


[Bug tree-optimization/50596] Problems in vectorization of condition expression

2011-10-05 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596

--- Comment #8 from Jakub Jelinek jakub at gcc dot gnu.org 2011-10-05 
07:10:56 UTC ---
Until http://gcc.gnu.org/viewcvs?root=gccview=revrev=176563
float a[1024], b[1024], c[1024], d[1024];
int j[1024];

void
foo (void)
{
  int i;
  for (i = 0; i  1024; ++i)
{
  int x = a[i]  b[i];
  int y = c[i]  d[i];
  j[i] = x  c[i]  y;
}
}

didn't use any bool types, just int and float, still it couldn't vectorize:
pr50596-2.c:8: note: not vectorized: relevant stmt not supported: x_5 =
D.2699_3  D.2700_4;

I think we could use VECT_COND_EXPR vect1  vect2, { 1, 1, ...}, { 0, 0, ...
}
for that (and hopefully the backends optimize that well, e.g. into
anding the comparison mask with { 1, 1, ... } or doing per-element right shift
by element width - 1 on the mask.

With bool it would be nice if at least for non-stores we would pick the best
suitable wider integer vector type (in this case where the bools are set by
comparison operation and feed  that is afterwards cast to int the best is
obviously int vector).


[Bug tree-optimization/50596] Problems in vectorization of condition expression

2011-10-04 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 CC||irar at gcc dot gnu.org,
   ||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek jakub at gcc dot gnu.org 2011-10-04 
07:04:25 UTC ---
The first problem with vectorization of ori function is similar to why the
first loop below is not vectorized and second is:
float a[1024], b[1024], c[1024], d[1024], e[1024];
void foo (void)
{
  for (int i = 0; i  1024; i++)
a[i] = b[i]  c[i] ? d[i] : e[i];
}
void bar (void)
{
  for (int i = 0; i  1024; i++)
{
  float d_ = d[i], e_ = e[i];
  a[i] = b[i]  c[i] ? d_ : e_;
}
}

gcc doesn't think it is ok to load d[i] resp. e[i] unconditionally.  In this
exact case where the loop bound is known and it is an static array of at least
that size it is probably fine, but if d or e was a pointer which might point to
a smaller array, d[i] or e[i] accesses might segfault.

That said, we still have control flow that even ifcvt doesn't fix up even with:
void
f2 ()
{
  for (int i = 0; i != N; ++i)
{
  float c_ = c[i], d_ = d[i];
  z[i] = a[i]  b[i]  c_  d_;
}
}

void
f3 ()
{
  for (int i = 0; i != N; ++i)
{
  float a_ = a[i], b_ = b[i], c_ = c[i], d_ = d[i];
  z[i] = a_  b_  c_  d_;
}
}

Note even if there would be no control flow, we'd still give up on bool not
being vectorized.  Bool is problematic, we'd have to use an unsigned char
vector instead (if bool is QImode) for vcond.  But it would be a vcond with
different
datamode and cmpmode size, we'd either need to do it using a V4SFmode/V8SFmode
vcond, then VEC_PACK_TRUNC_EXPR them into V16QImode/V32QImode.

Anyway, I think handling _Bool/bool somehow is now much more urgent than it has
been before, given Kai's/Richard's change to use _Bool/bool much more often in
GIMPLE.  If a bool SSA_NAME just feeds some COND_EXPR, we could just use some
wider type, or we could use wider vcond etc.


[Bug tree-optimization/50596] Problems in vectorization of condition expression

2011-10-04 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596

--- Comment #3 from vincenzo Innocente vincenzo.innocente at cern dot ch 
2011-10-04 09:11:53 UTC ---
for (int i = 0; i  1024; i++)
a[i] = b[i]  c[i] ? d[i] : e[i];
DOES vectorize with
 -ftree-loop-if-convert-stores
even with 
float * a; float * b; float * c; float * d; float * e;


[Bug tree-optimization/50596] Problems in vectorization of condition expression

2011-10-04 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596

Richard Guenther rguenth at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2011-10-04
 CC||rguenth at gcc dot gnu.org
 Ever Confirmed|0   |1

--- Comment #4 from Richard Guenther rguenth at gcc dot gnu.org 2011-10-04 
11:09:58 UTC ---
I agree with the need to at least support vectorizing loads and stores of
1-bit unsigned precision values.  We need to be careful with arithmetic
and conversions though (which is why we reject bools right now).


[Bug tree-optimization/50596] Problems in vectorization of condition expression

2011-10-04 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596

--- Comment #5 from Jakub Jelinek jakub at gcc dot gnu.org 2011-10-04 
11:13:58 UTC ---
(In reply to comment #4)
 I agree with the need to at least support vectorizing loads and stores of
 1-bit unsigned precision values.  We need to be careful with arithmetic
 and conversions though (which is why we reject bools right now).

We could represent the arithmetic and conversions (or at least subset thereof)
using *COND_EXPRs etc.  In any case, the bool representation is desirable for
the scalar loop, so this isn't something we should be doing in ifcvt, it needs
to be done in the vectorizer itself.


[Bug tree-optimization/50596] Problems in vectorization of condition expression

2011-10-04 Thread rguenther at suse dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596

--- Comment #6 from rguenther at suse dot de rguenther at suse dot de 
2011-10-04 11:26:51 UTC ---
On Tue, 4 Oct 2011, jakub at gcc dot gnu.org wrote:

 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596
 
 --- Comment #5 from Jakub Jelinek jakub at gcc dot gnu.org 2011-10-04 
 11:13:58 UTC ---
 (In reply to comment #4)
  I agree with the need to at least support vectorizing loads and stores of
  1-bit unsigned precision values.  We need to be careful with arithmetic
  and conversions though (which is why we reject bools right now).
 
 We could represent the arithmetic and conversions (or at least subset thereof)
 using *COND_EXPRs etc.  In any case, the bool representation is desirable for
 the scalar loop, so this isn't something we should be doing in ifcvt, it needs
 to be done in the vectorizer itself.

Sure.  Note that in GIMPLE

 bool = (bool) int;

isn't equivalent to bool = int != 0 but to a truncation to 1-bit
precision.  Thus for the truncation a BIT_AND is enough.  I'm just
worried about N-precision signed to mode-precision sign-extension
(for the 1-bit case we can use a COND_EXPR, but for more bits
it gets more difficult).

Richard.


[Bug tree-optimization/50596] Problems in vectorization of condition expression

2011-10-04 Thread rguenther at suse dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596

--- Comment #7 from rguenther at suse dot de rguenther at suse dot de 
2011-10-04 11:28:18 UTC ---
On Tue, 4 Oct 2011, jakub at gcc dot gnu.org wrote:

 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596
 
 --- Comment #5 from Jakub Jelinek jakub at gcc dot gnu.org 2011-10-04 
 11:13:58 UTC ---
 (In reply to comment #4)
  I agree with the need to at least support vectorizing loads and stores of
  1-bit unsigned precision values.  We need to be careful with arithmetic
  and conversions though (which is why we reject bools right now).
 
 We could represent the arithmetic and conversions (or at least subset thereof)
 using *COND_EXPRs etc.  In any case, the bool representation is desirable for
 the scalar loop, so this isn't something we should be doing in ifcvt, it needs
 to be done in the vectorizer itself.

Oh, and we don't handle expanding N-bit precision arithmetic on
vector types properly - for scalars we do the necessary truncation
at RTL expansion time.  So I think we should give up for that case
for now.


[Bug tree-optimization/50596] Problems in vectorization of condition expression

2011-10-03 Thread vincenzo.innocente at cern dot ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50596

--- Comment #1 from vincenzo Innocente vincenzo.innocente at cern dot ch 
2011-10-03 08:40:53 UTC ---
manage to vectorize this

int j[1024];
void foo5() {
  for (int i=0; i!=N; ++i)
j[i] = (a[i]b[i] ? -1 : 0)  (c[i]d[i] ? -1 : 0);
}


which is not bad, still a funny syntax (at least for those who are not used to
code in native SSE)