To: 'gcc@gcc.gnu.org'
Subject: Vectorizer question
Hello Everyone,
I have a question regarding the vectorizer. In the following code
below...
int func (int x, int y)
{
  if (x == y)
    return (x + y);
  else
    return (x - y);
}
If we force the x and y
Hello Everyone,
I have a question regarding the vectorizer. In the following code
below...
int func (int x, int y)
{
  if (x == y)
    return (x + y);
  else
    return (x - y);
}
If we force the x and y to be vectors of vectorlength 4, then will the
On 5/16/2012 4:01 PM, Iyer, Balaji V wrote:
Hello Everyone,
I have a question regarding the vectorizer. In the following code
below...
int func (int x, int y)
{
  if (x == y)
    return (x + y);
  else
    return (x - y);
}
If we force the x and y to be
Jakub Jelinek ja...@redhat.com wrote on 15/12/2011 03:51:25 PM:
On Thu, Dec 15, 2011 at 03:35:34PM +0200, Ira Rosen wrote:
This patch also fixes
a problem where vect_determine_vectorization_factor would iterate the
same
stmt twice - for some reason both the original stmt and pattern
Jakub Jelinek ja...@redhat.com wrote on 15/12/2011 09:02:57 AM:
On Thu, Dec 15, 2011 at 08:32:26AM +0200, Ira Rosen wrote:
+ cond = build2 (LT_EXPR, boolean_type_node, oprnd0, build_int_cst
(itype, 0));
+ gsi = gsi_for_stmt (last_stmt);
+ if (rhs_code == TRUNC_DIV_EXPR)
+
On Thu, 15 Dec 2011, Ira Rosen wrote:
Jakub Jelinek ja...@redhat.com wrote on 15/12/2011 09:02:57 AM:
On Thu, Dec 15, 2011 at 08:32:26AM +0200, Ira Rosen wrote:
+ cond = build2 (LT_EXPR, boolean_type_node, oprnd0, build_int_cst
(itype, 0));
+ gsi = gsi_for_stmt (last_stmt);
On Thu, Dec 15, 2011 at 10:02:15AM +0100, Richard Guenther wrote:
But it's really ugly to insert part of pattern sequence, don't you think?
It indeed is. The issue in the past was ICEing with -fno-tree-dce
when the pattern stmts did not have regular RTL expansion support
and the
Jakub Jelinek ja...@redhat.com wrote on 15/12/2011 12:54:29 PM:
Perhaps it would be even cleaner to get rid of the pattern stmt and def
stmt
seq distinction and just have pattern as whole be represented as
gimple_seq,
but perhaps that cleanup can be deferred for later.
Sounds good.
This
On Thu, Dec 15, 2011 at 03:35:34PM +0200, Ira Rosen wrote:
This patch also fixes
a problem where vect_determine_vectorization_factor would iterate the
same
stmt twice - for some reason both the original stmt and pattern stmt (and
def stmt) are marked as relevant,
Do you have a testcase
Jakub Jelinek ja...@redhat.com wrote on 15/12/2011 03:51:25 PM:
On Thu, Dec 15, 2011 at 03:35:34PM +0200, Ira Rosen wrote:
This patch also fixes
a problem where vect_determine_vectorization_factor would iterate the
same
stmt twice - for some reason both the original stmt and pattern
On Tue, Dec 13, 2011 at 05:57:40PM +0400, Kirill Yukhin wrote:
Let me hack up a quick pattern recognizer for this...
Here it is, untested so far.
On the testcase doing 200 f1+f2+f3+f4 calls in the loop with -O3 -mavx
on Sandybridge (so, vectorized just with 16 byte vectors) gives:
vanilla
Hi Jakub,
For 401.bzip2 it looks perfect. This loop is vectorized:
.L6:
vmovdqa (%rsi,%rax), %ymm0
addl $1, %ecx
vpsrad $8, %ymm0, %ymm0
vpsrld $31, %ymm0, %ymm1
vpaddd %ymm1, %ymm0, %ymm0
vpsrad $1, %ymm0, %ymm0
vpaddd %ymm2,
On Wed, Dec 14, 2011 at 01:25:13PM +0100, Jakub Jelinek wrote:
On Tue, Dec 13, 2011 at 05:57:40PM +0400, Kirill Yukhin wrote:
Let me hack up a quick pattern recognizer for this...
Here it is, untested so far.
On the testcase doing 200 f1+f2+f3+f4 calls in the loop with -O3 -mavx
on
Jakub Jelinek ja...@redhat.com wrote on 14/12/2011 02:25:13 PM:
@@ -1573,6 +1576,211 @@ vect_recog_vector_vector_shift_pattern (
return pattern_stmt;
}
+/* Detect a signed division by power of two constant that wouldn't be
+ otherwise vectorized:
+
+ type a_t, b_t;
+
+ S1
On Thu, Dec 15, 2011 at 08:32:26AM +0200, Ira Rosen wrote:
+ cond = build2 (LT_EXPR, boolean_type_node, oprnd0, build_int_cst
(itype, 0));
+ gsi = gsi_for_stmt (last_stmt);
+ if (rhs_code == TRUNC_DIV_EXPR)
+{
+ tree var = vect_recog_temp_ssa_var (itype, NULL);
+
Hi guys,
While looking at Spec2006/401.bzip2 I found such a loop:
for (i = 1; i <= alphaSize; i++) {
  j = weight[i] >> 8;
  j = 1 + (j / 2);
  weight[i] = j << 8;
}
This loop is not vectorizable (using Intel's AVX2) because the division
by two is not recognized as a right shift:
5: ==
On Tue, 13 Dec 2011, Kirill Yukhin wrote:
Hi guys,
While looking at Spec2006/401.bzip2 I found such a loop:
for (i = 1; i <= alphaSize; i++) {
  j = weight[i] >> 8;
  j = 1 + (j / 2);
  weight[i] = j << 8;
}
This loop is not vectorizable (using Intel's AVX2) because division
On Tue, Dec 13, 2011 at 02:07:11PM +0100, Richard Guenther wrote:
Hi guys,
While looking at Spec2006/401.bzip2 I found such a loop:
for (i = 1; i <= alphaSize; i++) {
  j = weight[i] >> 8;
  j = 1 + (j / 2);
  weight[i] = j << 8;
}
It would be helpful to have a
The full case attached.
Jakub, you are right, we have to convert signed ints into something a
bit more tricky.
BTW, here is the output for that case from the Intel compiler:
vpxor %ymm1, %ymm1, %ymm1 #184.23
vmovdqu .L_2il0floatpacket.12(%rip), %ymm0
On Tue, Dec 13, 2011 at 05:42:16PM +0400, Kirill Yukhin wrote:
The full case attached.
Jakub, you are right, we have to convert signed ints into something a
bit more tricky.
BTW, here is the output for that case from the Intel compiler:
Ah, so that matches to do j / 2 in the pattern recognizer as
Great!
Thanks, K
Let me hack up a quick pattern recognizer for this...
Jakub
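
For readers following the thread, here is a minimal sketch of why a signed j / 2 cannot be replaced by a plain right shift, and what the "more tricky" form looks like. It mirrors the vpsrld $31 / vpaddd / vpsrad $1 sequence in the vectorized assembly quoted in this thread. The function name div2_by_shifts is hypothetical and is not taken from the patch, and the code assumes that >> on a negative int32_t is an arithmetic shift (true for GCC, implementation-defined in ISO C).

```c
#include <stdint.h>

/* Truncating signed division by 2 using only shifts and an add.
   Hypothetical helper for illustration; assumes arithmetic >> on
   negative values, as GCC provides. */
int32_t div2_by_shifts(int32_t j)
{
    /* Logical shift of the sign bit: 1 if j is negative, else 0. */
    int32_t bias = (int32_t)((uint32_t)j >> 31);

    /* A bare j >> 1 rounds toward minus infinity (-7 >> 1 == -4).
       Adding the bias first makes the result round toward zero,
       which is what the C expression j / 2 requires. */
    return (j + bias) >> 1;
}
```

For non-negative j the bias is zero and this reduces to j >> 1; only negative odd values differ from the plain shift.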
The attached testcase yields (on a core2 duo, gcc trunk):
gfortran -O3 -ftree-vectorize -ffast-math -march=native test.f90
time ./a.out
real 0m3.414s
ifort -xT -O3 test.f90
time ./a.out
real 0m1.556s
The assembly contains:
         ifort  gfortran
mulpd    140    0
mulsd
2008/8/18 VandeVondele Joost [EMAIL PROTECTED]:
The attached testcase yields (on a core2 duo, gcc trunk):
gfortran -O3 -ftree-vectorize -ffast-math -march=native test.f90
time ./a.out
real 0m3.414s
ifort -xT -O3 test.f90
time ./a.out
real 0m1.556s
The assembly contains:
It would be nice to have a stand-alone testcase for this, so please
file a bugreport.
I've opened PR37150 for this.
Thanks,
Joost