[Bug tree-optimization/37416] [4.4 Regression] Failure to return number of loop iterations

2008-11-22 Thread irar at il dot ibm dot com


--- Comment #2 from irar at il dot ibm dot com  2008-11-22 15:08 ---
(In reply to comment #1)
> This bug is shamefully incomplete.  There is no way anyone willing to give
> this a look can know what to look for.
> For example, a few things one would have to know before he/she can even begin
> to consider whether/how to analyze the problem:
> 1. What is the target where you see this?
> 2. What compiler flags are you using?
-O3

> 3. Where do you look for the number of iterations (which dump)?
vectorizer's dump

> 4. What "missed-optimization" does this cause (something not vectorized)?
the loop is not vectorized because the number of iterations is unknown

> Please read http://gcc.gnu.org/bugs.html#report before filing more bugs.


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

  GCC build triplet||x86_64-suse-linux
   GCC host triplet||x86_64-suse-linux
 GCC target triplet||x86_64-suse-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37416



[Bug tree-optimization/38464] [4.4 Regression] vect/costmodel/ppc/costmodel-slp-12.c fails to vectorize

2008-12-11 Thread irar at il dot ibm dot com


--- Comment #2 from irar at il dot ibm dot com  2008-12-11 08:02 ---
Fixed.


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38464



[Bug tree-optimization/38529] [4.3/4.4 regression] ICE with nested loops

2008-12-15 Thread irar at il dot ibm dot com


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever Confirmed|0   |1
   Last reconfirmed|-00-00 00:00:00 |2008-12-15 08:26:30
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38529



[Bug tree-optimization/38529] [4.3/4.4 regression] ICE with nested loops

2008-12-15 Thread irar at il dot ibm dot com


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 AssignedTo|unassigned at gcc dot gnu   |irar at il dot ibm dot com
   |dot org |
 Status|NEW |ASSIGNED
   Last reconfirmed|2008-12-15 08:26:30 |2008-12-15 14:42:26
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38529



[Bug tree-optimization/37194] [4.3/4.4 Regression] Autovectorization of small constant iteration loop degrades performance

2008-12-30 Thread irar at il dot ibm dot com


--- Comment #7 from irar at il dot ibm dot com  2008-12-30 14:57 ---
(In reply to comment #6)
> t.i:3: note: Vectorization may not be profitable.
> why doesn't the cost model then disallow vectorization here?

This is misleading. It only means that there exists a loop bound threshold, either
defined by the user or calculated by the cost model. It does not mean that
the cost model has decided that vectorization is not profitable.
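
As a rough illustration of what a loop bound threshold means (a sketch only, not
GCC's implementation; the function name, the `threshold` parameter and the
4-wide body standing in for vector code are all assumptions made up for this
example), the vectorized path is guarded by a check on the iteration count and
the scalar loop handles everything else:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration: sum N ints, taking the "vector" path only
   when N reaches THRESHOLD.  The 4-wide body stands in for vector code.  */
int
sum_with_threshold (const int *a, size_t n, size_t threshold,
                    int *used_vector)
{
  int sum = 0;
  size_t i = 0;

  assert (a != NULL || n == 0);
  assert (used_vector != NULL);

  *used_vector = 0;
  if (n >= threshold)
    {
      *used_vector = 1;
      /* Process four elements per iteration.  */
      for (; i + 4 <= n; i += 4)
        sum += a[i] + a[i + 1] + a[i + 2] + a[i + 3];
    }

  /* Scalar loop: the whole loop below the threshold, the remainder
     otherwise.  */
  for (; i < n; i++)
    sum += a[i];

  return sum;
}
```

The existence of such a check says nothing about which path is faster for a
given input; it only records that a threshold is in play.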

I am adding this to our cleanup todo list.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37194



[Bug tree-optimization/37194] [4.3/4.4 Regression] Autovectorization of small constant iteration loop degrades performance

2009-01-05 Thread irar at il dot ibm dot com


--- Comment #8 from irar at il dot ibm dot com  2009-01-05 13:58 ---
To handle unknown alignment of data, the vectorizer creates a prolog loop to
peel a statically unknown number of scalar iterations (0<=n [...]


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37194



[Bug tree-optimization/38721] [alias-improvements] vectorizer miscompiles gfortran.fortran-torture/execute/elemental.f90 at -O3

2009-01-05 Thread irar at il dot ibm dot com


--- Comment #1 from irar at il dot ibm dot com  2009-01-05 13:19 ---
Here is a reduced testcase:

program test_elemental
   implicit none
   integer, dimension (2, 4) :: a
   integer, dimension (2, 4) :: b
   integer(kind = 8), dimension(2) :: c

   a = reshape ((/2, 3, 4, 5, 6, 7, 8, 9/), (/2, 4/))
   b = 0
   a = e_fn (a(:, 4:1:-1), 1 + b)
   ! This tests intrinsic elemental conversion functions.
   c = 2 * a(1, 1)
   if (any (c .ne. 14)) call abort

   ! This triggered bug due to building ss chains in the wrong order.
   b = 0;
   a = a - e_fn (a, b)
   if (any (a .ne. 0)) call abort

contains

elemental integer(kind=4) function e_fn (p, q)
   integer, intent(in) :: p, q
   e_fn = p - q
end function
end program


The problem is that dse2 removes the stores to array A.4 which is used by the
vectorized code:

  A.4[0] = D.1635_155;
  ...
  A.4[7] = D.1635_165;
  vect_pA.67_156 = (vector integer(kind=4) *) &A.4;
  vect_pa.73_197 = (vector integer(kind=4) *) &a;
  vect_var_.68_254 = *vect_pA.67_156;
  *vect_pa.73_197 = vect_var_.68_254;
  vect_pA.63_256 = vect_pA.67_156 + 16;
  vect_pa.69_257 = vect_pa.73_197 + 16;
  vect_var_.68_170 = *vect_pA.63_256;
  *vect_pa.69_257 = vect_var_.68_170;

We propagate alias info from the scalar to vector ref in
vect_create_data_ref_ptr() (in tree-vect-transform.c):

  /** (2) Add aliasing information to the new vector-pointer:
      (The points-to info (DR_PTR_INFO) may be defined later.)  **/

  tag = DR_SYMBOL_TAG (dr);
  gcc_assert (tag);

  /* If tag is a variable (and NOT_A_TAG) than a new symbol memory
     tag must be created with tag added to its may alias list.  */
  if (!MTAG_P (tag))
    new_type_alias (vect_ptr, tag, DR_REF (dr));
  else
    set_symbol_mem_tag (vect_ptr, tag);

Those lines do not exist on the branch. Do you take care of this somewhere
else?

Ira


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever Confirmed|0   |1
   Last reconfirmed|-00-00 00:00:00 |2009-01-05 13:19:53
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38721



[Bug tree-optimization/37194] [4.3/4.4 Regression] Autovectorization of small constant iteration loop degrades performance

2009-01-08 Thread irar at il dot ibm dot com


--- Comment #12 from irar at il dot ibm dot com  2009-01-08 09:25 ---
(In reply to comment #11)
> fixed for 4.3.3?
> Thanks.

No, still waiting for approval.


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|FIXED   |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37194



[Bug tree-optimization/38529] [4.3 regression] ICE with nested loops

2009-01-10 Thread irar at il dot ibm dot com


--- Comment #4 from irar at il dot ibm dot com  2009-01-11 07:48 ---
Fixed on 4.3 branch as well.


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38529



[Bug tree-optimization/37194] [4.3 Regression] Autovectorization of small constant iteration loop degrades performance

2009-01-10 Thread irar at il dot ibm dot com


--- Comment #14 from irar at il dot ibm dot com  2009-01-11 07:57 ---
Fixed.


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 Status|REOPENED|RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37194



[Bug tree-optimization/37021] Fortran Complex reduction / multiplication not vectorized

2009-01-25 Thread irar at il dot ibm dot com


--- Comment #6 from irar at il dot ibm dot com  2009-01-25 09:12 ---
(In reply to comment #5)
> So,
>  4) The vectorized version sucks because we have to use peeling for niters
> because we need to unroll the loop once and cannot apply SLP here.

What do you mean by "unroll the loop once"?

> Q1: does SLP work with reductions at all?

No. SLP currently originates from groups of strided stores.
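
For illustration only, here is a loop of the kind that contains a group of
strided stores (the function name and the data are made up; whether any given
GCC version applies SLP to it is not claimed here). Each iteration writes two
adjacent elements, and both stores advance by the same stride:

```c
#include <assert.h>
#include <stddef.h>

/* Each iteration stores to out[2*i] and out[2*i + 1]: two stores with the
   same stride (two elements) and adjacent offsets form one group.  */
void
scale_pairs (int *out, const int *in, size_t n, int x, int y)
{
  assert (out != NULL && in != NULL);

  for (size_t i = 0; i < n; i++)
    {
      out[2 * i] = in[2 * i] * x;
      out[2 * i + 1] = in[2 * i + 1] * y;
    }
}
```

A reduction, by contrast, accumulates into a scalar and has no such group of
stores to start from.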

> Q2: does SLP do pattern recognition?

Pattern recognition is done before SLP, and SLP handles stmts that were marked
as part of a pattern. There is no SLP-specific pattern recognition.

> First of all we would need to recognize a complex reduction as a single
> vectorized reduction.  Second we need to vectorize the complex multiplication
> with SLP, feeding the reduction with one resulting complex vector.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37021



[Bug tree-optimization/37021] Fortran Complex reduction / multiplication not vectorized

2009-01-25 Thread irar at il dot ibm dot com


--- Comment #8 from irar at il dot ibm dot com  2009-01-25 12:17 ---
(In reply to comment #7)
> > > Q1: does SLP work with reductions at all?
> > 
> > No. SLP currently originates from groups of strided stores.
> Ah, I see.  In this loop we have two reductions, so to apply SLP
> we would need to see that we can use a group of reductions for SLP?

Yes, I think this will work.

> > > Q2: does SLP do pattern recognition?
> > 
> > Pattern recognition is done before SLP, and SLP handles stmts that were
> > marked as part of a pattern. There is no SLP-specific pattern recognition.
> Ok, but with a reduction it won't help me here.
> Can a loop be vectorized with just pattern recognition?  Hm, if I
> remember correctly we detect scalar patterns and then vectorize them.
> We don't support detecting "vector patterns" from scalar code, correct?

Yes, if I understand you correctly, we detect scalar patterns, but adding
vector pattern detection does not seem to be complicated.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37021



[Bug tree-optimization/38968] Complex matrix product is not vectorized

2009-01-26 Thread irar at il dot ibm dot com


--- Comment #3 from irar at il dot ibm dot com  2009-01-26 13:09 ---
(In reply to comment #2)
> Now, I wonder why we do not just use alignment + misalign in that case.

I think you are right.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38968



[Bug middle-end/40021] [4.5 Regression] Revision 146817 miscompiled DAXPY in BLAS

2009-05-05 Thread irar at il dot ibm dot com


--- Comment #6 from irar at il dot ibm dot com  2009-05-05 12:41 ---
Reproduced on x86_64-suse-linux.

It seems that, somehow, the vectorized version of the loop at line 29 is executed,
even though the number of scalar iterations is 1.


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 CC||irar at il dot ibm dot com
 Status|UNCONFIRMED |NEW
 Ever Confirmed|0   |1
   Last reconfirmed|-00-00 00:00:00 |2009-05-05 12:41:15
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40021



[Bug tree-optimization/40074] [4.4/4.5 Regression] ICE in vect_get_vec_def_for_operand, at tree-vect-stmts.c:944

2009-05-10 Thread irar at il dot ibm dot com


--- Comment #13 from irar at il dot ibm dot com  2009-05-10 09:20 ---
(In reply to comment #12)
> Well, that revision only enabled vectorization support for more things...
> (which is probably what makes this a regression in the first place).

Right, I think it is something in the strided accesses detection. I am looking
into it now.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40074



[Bug tree-optimization/40074] [4.4/4.5 Regression] ICE in vect_get_vec_def_for_operand, at tree-vect-stmts.c:944

2009-05-10 Thread irar at il dot ibm dot com


--- Comment #14 from irar at il dot ibm dot com  2009-05-10 11:00 ---
I am testing:

Index: tree-vect-data-refs.c
===
--- tree-vect-data-refs.c   (revision 147329)
+++ tree-vect-data-refs.c   (working copy)
@@ -1424,7 +1424,7 @@ vect_analyze_group_access (struct data_r
   /* First stmt in the interleaving chain. Check the chain.  */
   gimple next = DR_GROUP_NEXT_DR (vinfo_for_stmt (stmt));
   struct data_reference *data_ref = dr;
-  unsigned int count = 1;
+  unsigned int count = 1, gaps = 0;
   tree next_step;
   tree prev_init = DR_INIT (data_ref);
   gimple prev = stmt;
@@ -1490,6 +1490,8 @@ vect_analyze_group_access (struct data_r
fprintf (vect_dump, "interleaved store with gaps");
  return false;
}
+
+  gaps += diff - 1;
}

   /* Store the gap from the previous member of the group. If there is no
@@ -1506,8 +1508,9 @@ vect_analyze_group_access (struct data_r
  the type to get COUNT_IN_BYTES.  */
   count_in_bytes = type_size * count;

-  /* Check that the size of the interleaving is not greater than STEP.  */
-  if (dr_step < count_in_bytes)
+  /* Check that the size of the interleaving (including gaps) is not greater
+     than STEP.  */
+  if (dr_step && dr_step < count_in_bytes + gaps * type_size)
 {
   if (vect_print_dump_info (REPORT_DETAILS))
 {

It fixes the reduced testcase, but I failed to compile the original one, so
maybe someone could check that the above patch fixes the ICE for the original
testcase?
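
The arithmetic of the new check can be sketched in isolation. This is a
standalone illustration, not GCC code: `count_gaps` and `group_fits_in_step`
are hypothetical helpers, element positions are given directly instead of
being derived from DR_INIT, and it assumes that the guarded branch rejects
the group (the hunk above does not show the whole branch):

```c
#include <assert.h>
#include <stdbool.h>

/* Sum the gaps (in elements) between consecutive group members, given
   their strictly increasing element positions POS[0..COUNT-1].  This
   mirrors `gaps += diff - 1' in the patch.  */
unsigned int
count_gaps (const unsigned int *pos, unsigned int count)
{
  unsigned int gaps = 0;

  for (unsigned int i = 1; i < count; i++)
    {
      assert (pos[i] > pos[i - 1]);
      gaps += pos[i] - pos[i - 1] - 1;
    }
  return gaps;
}

/* Return true if COUNT accesses of TYPE_SIZE bytes each, plus GAPS
   skipped elements, fit within DR_STEP bytes.  A zero step is accepted,
   mirroring the `dr_step &&' guard in the patch.  */
bool
group_fits_in_step (unsigned int dr_step, unsigned int count,
                    unsigned int gaps, unsigned int type_size)
{
  unsigned int count_in_bytes = type_size * count;

  if (dr_step && dr_step < count_in_bytes + gaps * type_size)
    return false;
  return true;
}
```

With three 4-byte accesses at element positions 0, 1 and 3, there is one gap,
so the group spans 16 bytes. A step of 12 bytes would pass a check that ignores
gaps but is rejected once the gap is counted.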

Thanks,
Ira


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 AssignedTo|unassigned at gcc dot gnu   |irar at il dot ibm dot com
   |dot org |
 Status|NEW |ASSIGNED
   Last reconfirmed|2009-05-08 20:59:57 |2009-05-10 11:00:34
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40074



[Bug tree-optimization/40074] [4.4/4.5 Regression] ICE in vect_get_vec_def_for_operand, at tree-vect-stmts.c:944

2009-05-11 Thread irar at il dot ibm dot com


--- Comment #18 from irar at il dot ibm dot com  2009-05-11 12:45 ---
Fixed.


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40074



[Bug tree-optimization/40233] New: Test failures with "alignment of array elements is greater than element size"

2009-05-24 Thread irar at il dot ibm dot com
for excess errors)
WARNING: g++.dg/torture/stackalign/eh-alloca-1.C  -O3 -g  compilation failed to
produce executable
FAIL: g++.dg/torture/stackalign/eh-global-1.C  -O3 -fomit-frame-pointer  (test
for excess errors)
WARNING: g++.dg/torture/stackalign/eh-global-1.C  -O3 -fomit-frame-pointer 
compilation failed to produce executable
FAIL: g++.dg/torture/stackalign/eh-global-1.C  -O3 -g  (test for excess errors)
WARNING: g++.dg/torture/stackalign/eh-global-1.C  -O3 -g  compilation failed to
produce executable
FAIL: g++.dg/torture/stackalign/eh-inline-1.C  -O3 -fomit-frame-pointer  (test
for excess errors)
WARNING: g++.dg/torture/stackalign/eh-inline-1.C  -O3 -fomit-frame-pointer 
compilation failed to produce executable
FAIL: g++.dg/torture/stackalign/eh-inline-1.C  -O3 -g  (test for excess errors)
WARNING: g++.dg/torture/stackalign/eh-inline-1.C  -O3 -g  compilation failed to
produce executable
FAIL: g++.dg/torture/stackalign/eh-inline-2.C  -O3 -fomit-frame-pointer  (test
for excess errors)
WARNING: g++.dg/torture/stackalign/eh-inline-2.C  -O3 -fomit-frame-pointer 
compilation failed to produce executable
FAIL: g++.dg/torture/stackalign/eh-inline-2.C  -O3 -g  (test for excess errors)
WARNING: g++.dg/torture/stackalign/eh-inline-2.C  -O3 -g  compilation failed to
produce executable
FAIL: g++.dg/torture/stackalign/eh-vararg-1.C  -O3 -fomit-frame-pointer  (test
for excess errors)
WARNING: g++.dg/torture/stackalign/eh-vararg-1.C  -O3 -fomit-frame-pointer 
compilation failed to produce executable
FAIL: g++.dg/torture/stackalign/eh-vararg-1.C  -O3 -g  (test for excess errors)
WARNING: g++.dg/torture/stackalign/eh-vararg-1.C  -O3 -g  compilation failed to
produce executable
FAIL: g++.dg/torture/stackalign/eh-vararg-2.C  -O3 -fomit-frame-pointer  (test
for excess errors)
WARNING: g++.dg/torture/stackalign/eh-vararg-2.C  -O3 -fomit-frame-pointer 
compilation failed to produce executable
FAIL: g++.dg/torture/stackalign/eh-vararg-2.C  -O3 -g  (test for excess errors)
WARNING: g++.dg/torture/stackalign/eh-vararg-2.C  -O3 -g  compilation failed to
produce executable

The failures start from revision 147829 - basic block SLP. SLP checks if there
is a vector type for the scalar type used in a basic block. It calls
make_vector_type() for a vector type, where array of this type is built for
debug representation purposes in build_array_type():

at ../../gcc/gcc/stor-layout.c:1848
1848  error ("alignment of array elements is greater than element size");
(gdb) back
#0  layout_type (type=0x2b2860eb2240) at ../../gcc/gcc/stor-layout.c:1848
#1  0x008dc33c in type_hash_lookup (hashcode=2524125531, type=0x40)
at ../../gcc/gcc/tree.c:4721
#2  0x008dc3c9 in type_hash_canon (hashcode=2524125531, type=0x40)
at ../../gcc/gcc/tree.c:4772
#3  0x008dd1d1 in build_array_type (elt_type=0x2b2860e52600,
index_type=0x2b2860dd90c0) at ../../gcc/gcc/tree.c:5851
#4  0x008f4d1d in make_vector_type (innertype=0x2b2860e52600,
nunits=4, mode=VOIDmode) at ../../gcc/gcc/tree.c:7441
#5  0x0089d9c8 in get_vectype_for_scalar_type
(scalar_type=0x2b2860e52600) at ../../gcc/gcc/tree-vect-stmts.c:4348
#6  0x00bbc3ef in vect_analyze_data_refs (loop_vinfo=, bb_vinfo=)
at ../../gcc/gcc/tree-vect-data-refs.c:2050
...
(gdb) p debug_generic_expr (type)
aligned[4]
$6 = void


-- 
   Summary: Test failures with "alignment of array elements is
greater than element size"
   Product: gcc
   Version: 4.5.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: irar at il dot ibm dot com
 GCC build triplet: x86_64-suse-linux
  GCC host triplet: x86_64-suse-linux
GCC target triplet: x86_64-suse-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40233



[Bug middle-end/40240] [4.5 regression] ICE in execute_cse_reciprocals, at tree-ssa-math-opts.c:469

2009-05-25 Thread irar at il dot ibm dot com


--- Comment #2 from irar at il dot ibm dot com  2009-05-25 08:20 ---
(In reply to comment #1)
> this is likely being fixed by Ira

I committed the fix. Could you please check if it really fixes this one as
well?

Thanks,
Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40240



[Bug middle-end/40244] [4.5 Regression] Revision 147829 caused extra failures

2009-05-26 Thread irar at il dot ibm dot com


--- Comment #1 from irar at il dot ibm dot com  2009-05-26 08:58 ---
(In reply to comment #0)
> On Linux/ia64, revision 147829:
> http://gcc.gnu.org/ml/gcc-cvs/2009-05/msg00806.html
> caused:
> FAIL: Matrix4f -O3 compilation from source

Could you please provide more information? It doesn't fail on x86_64...

> FAIL: gcc.dg/vect/bb-slp-10.c scan-tree-dump-times slp "unsupported alignment
> in basic block." 1
> FAIL: gcc.dg/vect/bb-slp-4.c scan-tree-dump-times slp "basic block vectorized
> using SLP" 0

I think they can be fixed as follows. Could you please check?

Index: testsuite/gcc.dg/vect/bb-slp-4.c
===
--- testsuite/gcc.dg/vect/bb-slp-4.c(revision 147862)
+++ testsuite/gcc.dg/vect/bb-slp-4.c(working copy)
@@ -18,14 +18,10 @@ main1 ()

   *pout++ = *pin++;
   *pout++ = *pin++;
-  *pout++ = *pin++;
-  *pout++ = *pin++;

   /* Check results.  */
   if (out[0] != in[0]
-  || out[1] != in[1]
-  || out[2] != in[2]
-  || out[3] != in[3])
+  || out[1] != in[1])
 abort();

   return 0;
Index: testsuite/gcc.dg/vect/bb-slp-10.c
===
--- testsuite/gcc.dg/vect/bb-slp-10.c   (revision 147862)
+++ testsuite/gcc.dg/vect/bb-slp-10.c   (working copy)
@@ -14,7 +14,7 @@ main1 (unsigned int x, unsigned int y)
 {
   int i;
   unsigned int *pin = &in[0];
-  unsigned int *pout = &out[2];
+  unsigned int *pout = &out[1];
   unsigned int a0, a1, a2, a3;

   /* Misaligned store.  */
@@ -29,10 +29,10 @@ main1 (unsigned int x, unsigned int y)
   *pout++ = a3 * y;

   /* Check results.  */
-  if (out[2] != (in[0] + 23) * x
-  || out[3] != (in[1] + 142) * y
-  || out[4] != (in[2] + 2) * x
-  || out[5] != (in[3] + 31) * y)
+  if (out[1] != (in[0] + 23) * x
+  || out[2] != (in[1] + 142) * y
+  || out[3] != (in[2] + 2) * x
+  || out[4] != (in[3] + 31) * y)
 abort();

   return 0;

Thanks,
Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40244



[Bug tree-optimization/40254] [4.5 Regression] SPEC2006 403.gcc miscompares

2009-05-27 Thread irar at il dot ibm dot com


--- Comment #4 from irar at il dot ibm dot com  2009-05-27 08:43 ---
The bug is in data-refs analysis for basic blocks: two accesses that are not
adjacent (reload.c:1370) are considered as adjacent, and, therefore, get
vectorized together, causing the wrong code generation.
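
To make "adjacent" concrete, here is a minimal sketch of an adjacency test on a
simplified access descriptor. It is purely illustrative: `struct access`, its
fields and `adjacent_p` are hypothetical and are not GCC's data structures. Two
accesses are treated as adjacent only if they share a base and a variable
offset, and their constant parts differ by exactly one access size:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified description of a memory access.  */
struct access
{
  int base_id;  /* which object is accessed */
  long offset;  /* variable part of the address, in bytes */
  long init;    /* constant part of the address, in bytes */
  long size;    /* size of the access, in bytes */
};

/* Return true if B starts exactly where A ends.  */
bool
adjacent_p (const struct access *a, const struct access *b)
{
  assert (a->size > 0);
  return a->base_id == b->base_id
         && a->offset == b->offset
         && b->init - a->init == a->size;
}
```

In this model, if the variable offset of an access were lost and recorded as
zero, two accesses that are really far apart could compare as adjacent, which
is the kind of misclassification that leads to wrong code.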


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 AssignedTo|unassigned at gcc dot gnu   |irar at il dot ibm dot com
   |dot org |
 Status|UNCONFIRMED |ASSIGNED
 Ever Confirmed|0   |1
   Last reconfirmed|-00-00 00:00:00 |2009-05-27 08:43:46
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40254



[Bug tree-optimization/40254] [4.5 Regression] SPEC2006 403.gcc miscompares

2009-05-27 Thread irar at il dot ibm dot com


--- Comment #5 from irar at il dot ibm dot com  2009-05-27 09:59 ---
I'll test this patch tomorrow:

Index: tree-data-ref.c
===
--- tree-data-ref.c (revision 147903)
+++ tree-data-ref.c (working copy)
@@ -718,17 +725,26 @@ dr_analyze_innermost (struct data_refere
   base_iv.no_overflow = true;
 }

-  if (!poffset || !in_loop)
+  if (!poffset)
 {
   offset_iv.base = ssize_int (0);
   offset_iv.step = ssize_int (0);
 }
-  else if (!simple_iv (loop, loop_containing_stmt (stmt),
-  poffset, &offset_iv, false))
+  else
 {
-  if (dump_file && (dump_flags & TDF_DETAILS))
-   fprintf (dump_file, "failed: evolution of offset is not affine.\n");
-  return false;
+  if (!in_loop)
+{
+  offset_iv.base = poffset;
+  offset_iv.step = ssize_int (0);
+}
+  else if (!simple_iv (loop, loop_containing_stmt (stmt),
+  poffset, &offset_iv, false))
+{
+  if (dump_file && (dump_flags & TDF_DETAILS))
+fprintf (dump_file, "failed: evolution of offset is not"
+" affine.\n");
+  return false;
+}
 }

   init = ssize_int (pbitpos / BITS_PER_UNIT);


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40254



[Bug middle-end/40244] [4.5 Regression] Revision 147829 caused extra failures

2009-05-30 Thread irar at il dot ibm dot com


--- Comment #5 from irar at il dot ibm dot com  2009-05-30 16:53 ---
(In reply to comment #4)
> (In reply to comment #1)
> > (In reply to comment #0)
> > > On Linux/ia64, revision 147829:
> > > http://gcc.gnu.org/ml/gcc-cvs/2009-05/msg00806.html
> > > caused:
> > > FAIL: Matrix4f -O3 compilation from source
> > 
> > Could you please provide some information, it doesn't fail on x86_64...
> > 
> > > FAIL: gcc.dg/vect/bb-slp-10.c scan-tree-dump-times slp "unsupported
> > > alignment in basic block." 1
> > > FAIL: gcc.dg/vect/bb-slp-4.c scan-tree-dump-times slp "basic block
> > > vectorized using SLP" 0
> > 
> > I think they can be fixed as following. Could you please check?
> > 
> Yes, it fixed the problem. Thanks.

Thanks.
Is Matrix4f OK now too?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40244



[Bug testsuite/40244] [4.5 Regression] Revision 147829 caused extra failures

2009-05-30 Thread irar at il dot ibm dot com


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 AssignedTo|unassigned at gcc dot gnu   |irar at il dot ibm dot com
   |dot org |
 Status|NEW |ASSIGNED
  Component|middle-end  |testsuite
   Last reconfirmed|2009-05-29 07:52:46 |2009-05-31 06:45:04
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40244



[Bug testsuite/40244] [4.5 Regression] Revision 147829 caused extra failures

2009-05-31 Thread irar at il dot ibm dot com


--- Comment #8 from irar at il dot ibm dot com  2009-05-31 09:04 ---
Fixed.


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40244



[Bug tree-optimization/39129] The meaning of 'BB' in "too many BBs in loop"

2009-05-31 Thread irar at il dot ibm dot com


--- Comment #4 from irar at il dot ibm dot com  2009-05-31 10:55 ---
So, will "too many basic blocks in loop" be good enough? That is the actual
reason: the loop form is not suitable for the vectorizer because there are too
many basic blocks in it.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39129



[Bug tree-optimization/39129] The meaning of 'BB' in "too many BBs in loop"

2009-05-31 Thread irar at il dot ibm dot com


--- Comment #6 from irar at il dot ibm dot com  2009-05-31 12:33 ---
For non-empty latch block we actually print "not vectorized: unexpected loop
form." So I can change it to "not vectorized: non-empty latch block", and
instead of "too many BBs" I can write "control flow in loop". 


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39129



[Bug tree-optimization/39129] The meaning of 'BB' in "too many BBs in loop"

2009-06-01 Thread irar at il dot ibm dot com


--- Comment #9 from irar at il dot ibm dot com  2009-06-01 08:20 ---
Fixed.


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39129



[Bug tree-optimization/40348] Powerpc spe segfaults in vectorizing powf (a[i], 0.5f)

2009-06-07 Thread irar at il dot ibm dot com


--- Comment #2 from irar at il dot ibm dot com  2009-06-07 07:59 ---
So, I guess this patch fixes it? 

Thanks,
Ira

Index: tree-vect-patterns.c
===
--- tree-vect-patterns.c(revision 148035)
+++ tree-vect-patterns.c(working copy)
@@ -515,6 +515,9 @@ vect_recog_pow_pattern (gimple last_stmt
   && REAL_VALUES_EQUAL (TREE_REAL_CST (exp), dconsthalf))
 {
   tree newfn = mathfn_built_in (TREE_TYPE (base), BUILT_IN_SQRT);
+  if (!newfn)
+return NULL;
+
   *type_in = get_vectype_for_scalar_type (TREE_TYPE (base));
   if (*type_in)
{


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40348



[Bug testsuite/40359] [4.5 Regression] Revision 148211 caused a lot of failures in the vect test suite.

2009-06-15 Thread irar at il dot ibm dot com


--- Comment #12 from irar at il dot ibm dot com  2009-06-15 09:58 ---
(In reply to comment #9)
> The patch in comment #8 fixes the failures reported in comment #7. I now see
> (powerpc-apple-darwin9 with -m64):
> FAIL: gcc.dg/vect/vect-42.c scan-tree-dump-times vect "Alignment of access
> forced using versioning" 3

Is this target ([istarget *-*-darwin*] && [is-effective-target lp64]) (meaning
vector_alignment_reachable is false for it)?

If so, why do we do peeling? And why, in that case, doesn't it XPASS
"Alignment of access forced using peeling" 1 "vect"?

Otherwise, vector_alignment_reachable is true, and it is not supposed to look
for the versioning string at all (since the target is not vect_no_align,
right?).

It doesn't make sense to me either way...
Revital, maybe you can try to add brackets: { ! { vector_alignment_reachable }
} instead of { ! vector_alignment_reachable} ?

Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40359



[Bug testsuite/40359] [4.5 Regression] Revision 148211 caused a lot of failures in the vect test suite.

2009-06-16 Thread irar at il dot ibm dot com


--- Comment #17 from irar at il dot ibm dot com  2009-06-16 07:36 ---
Dominique, 

Could you please try this patch? (I changed (!a && !b) to !(a || b).)

Thanks,
Ira



Index: vect-42.c
===
--- vect-42.c   (revision 148487)
+++ vect-42.c   (working copy)
@@ -63,7 +63,7 @@
 }

 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using
versioning" 3 "vect" { target { vect_no_align || { { !
vector_alignment_reachable} && {!vect_hw_misalign} } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using
versioning" 3 "vect" { target { vect_no_align || { !  {
vector_alignment_reachable || vect_hw_misalign } } } } } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4
"vect" { xfail { { vect_no_align || vect_hw_misalign } || { !
vector_alignment_reachable } } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using
peeling" 1 "vect" { xfail { { vect_no_align || vect_hw_misalign } || { !
vector_alignment_reachable } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added
------------------------
 CC||irar at il dot ibm dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40359



[Bug testsuite/40359] [4.5 Regression] Revision 148211 caused a lot of failures in the vect test suite.

2009-06-16 Thread irar at il dot ibm dot com


--- Comment #19 from irar at il dot ibm dot com  2009-06-16 10:18 ---
(In reply to comment #18)
> > Could you please try this patch (I changed (!a && !b) to !(a || b)).
> I am currently regtesting on my ppc and it takes a long time. Meanwhile I am
> not sure to understand what you expect with this change: if I am not mistaken
> !(a || b) == (!a && !b) .

Yes, the problem is that we think that the test is correct and it doesn't work
because of some syntax/brackets/space problems.
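
That the two forms are logically equivalent (De Morgan's law) can be confirmed
by enumerating all four input combinations. This is a small standalone check in
C with made-up function names; it says nothing about how the DejaGnu selector
parser treats the braces and spaces, which is the actual suspicion here:

```c
#include <assert.h>
#include <stdbool.h>

/* The original form of the condition: (!a && !b).  */
bool
form_and (bool a, bool b)
{
  return !a && !b;
}

/* The rewritten form of the condition: !(a || b).  */
bool
form_or (bool a, bool b)
{
  return !(a || b);
}

/* Assert that both forms agree on every combination of inputs.  */
void
check_de_morgan (void)
{
  for (int a = 0; a <= 1; a++)
    for (int b = 0; b <= 1; b++)
      assert (form_and (a != 0, b != 0) == form_or (a != 0, b != 0));
}
```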


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40359



[Bug testsuite/40359] [4.5 Regression] Revision 148211 caused a lot of failures in the vect test suite.

2009-06-16 Thread irar at il dot ibm dot com


--- Comment #21 from irar at il dot ibm dot com  2009-06-16 11:08 ---
(In reply to comment #20)
> What are the expected patterns for the 3 variables
> with -m32 and -m64?

I am not sure, this is why I asked you if the target is 
([istarget *-*-darwin*] && [is-effective-target lp64]).

vect_no_align and vect_hw_misalign have to be false, so, I guess,
vector_alignment_reachable is different for -m32 and -m64, since the behaviour
is different. 

"Alignment of access forced using versioning" means the vectorizer uses loop
versioning to force alignment. It happens when there is no misalignment support
at all (vect_no_align) or when other methods fail: loop peeling doesn't help
(!vector_alignment_reachable) and also there is no hardware misalignment
support (!vect_hw_misalign).
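
The condition described above can be written as a single predicate. This is a
sketch in C for clarity only: the parameter names are borrowed from the
effective-target keywords, but the function itself is hypothetical and is not
part of the testsuite:

```c
#include <stdbool.h>

/* Return true if the test should expect "Alignment of access forced
   using versioning": either there is no misalignment support at all, or
   peeling cannot reach alignment and the hardware has no misaligned
   access support.  */
bool
versioning_expected (bool vect_no_align,
                     bool vector_alignment_reachable,
                     bool vect_hw_misalign)
{
  return vect_no_align
         || (!vector_alignment_reachable && !vect_hw_misalign);
}
```

With vect_no_align false, the predicate holds only when both
vector_alignment_reachable and vect_hw_misalign are false.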

From the dump you attached, I see that loop peeling was done, therefore,
vector_alignment_reachable is true, and it must not look for "Alignment of
access forced using versioning". But it does. This is what makes me think that it
is just a syntax problem.

On the other hand, I don't understand the difference with -m32 and -m64. It
seems to me, that ([istarget *-*-darwin*] && [is-effective-target lp64]) is
false for -m32 and, possibly, true for -m64. But that contradicts the dump.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40359



[Bug testsuite/40359] [4.5 Regression] Revision 148211 caused a lot of failures in the vect test suite.

2009-06-17 Thread irar at il dot ibm dot com


--- Comment #23 from irar at il dot ibm dot com  2009-06-17 08:22 ---
(In reply to comment #22)
> My understanding is that ([istarget *-*-darwin*] && [is-effective-target 
> lp64])
> should return false for -m32 and true for -m64. At least it is how it works on
> other tests I have looked at. Is there anyway to check it?

You can add 
/* { dg-final { scan-tree-dump-times "bla bla bla" 1 "vect" { target
vector_alignment_reachable } } } */
to some test. It should fail for -m32 and not be run for -m64 (since we think that
vector_alignment_reachable is true for -m32 and false for -m64).


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40359



[Bug testsuite/40359] [4.5 Regression] Revision 148211 caused a lot of failures in the vect test suite.

2009-06-17 Thread irar at il dot ibm dot com


--- Comment #25 from irar at il dot ibm dot com  2009-06-17 11:06 ---
(In reply to comment #24)
> If I add to vect-42.c (with my patch) the line
>
> /* { dg-final { scan-tree-dump-times "bla bla bla" 1 "vect" { target
> vector_alignment_reachable } } } */
...
> i.e., the test is done for -m32 (and fail) but not for -m64.

So, vector_alignment_reachable is true for -m32 and false for -m64.

...
> i.e., vect_hw_misalign is false for both -m32 and -m64.
> So it looks that vect_hw_misalign has the opposite meaning of that assumed in
> comment #16:
> > hmmm... versioning should not be done for targets that support
> > vect_hw_misalign... 

Why? vect_hw_misalign means that misaligned data accesses are supported by
hardware, therefore, we don't need to do versioning. And we expect versioning
here with -m64 since both vect_hw_misalign and vector_alignment_reachable are
false.

> Final note, the change in comment #17 does not help.

Thanks for checking.


I still don't understand why this test works on -m64
/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling"
1 "vect" { xfail { { vect_no_align || vect_hw_misalign } || { !
vector_alignment_reachable } } } } } */
vector_alignment_reachable is false, so there should be no peeling according to
the test. But it is there, and the test doesn't XPASS...

And, of course, I don't understand why we do peeling, i.e., why the builtin
vector_alignment_reachable returns true.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40359



[Bug testsuite/40359] [4.5 Regression] Revision 148211 caused a lot of failures in the vect test suite.

2009-06-17 Thread irar at il dot ibm dot com


--- Comment #29 from irar at il dot ibm dot com  2009-06-17 12:40 ---
Oh, so the first dump you attached (in comment #11) was for -m32. Now it makes
sense.

I think we have to distinguish between vect_no_align and the other cases. I
will prepare a patch tomorrow.

Thanks,
Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40359



[Bug middle-end/40475] [4.5 Regression] gcc.dg/vect/vect-nest-cycle-[12].c

2009-06-17 Thread irar at il dot ibm dot com


--- Comment #1 from irar at il dot ibm dot com  2009-06-17 12:46 ---
Could you please attach a vectorizer dump for one of them? I need to know what
prevented vectorization.

Thanks,
Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40475



[Bug middle-end/40475] [4.5 Regression] gcc.dg/vect/vect-nest-cycle-[12].c

2009-06-18 Thread irar at il dot ibm dot com


--- Comment #4 from irar at il dot ibm dot com  2009-06-18 07:17 ---
Created an attachment (id=18017)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18017&action=view)
patch to fix the tests

Thanks. It's misalignment.
Could you please check the attached patch?


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 AssignedTo|unassigned at gcc dot gnu   |irar at il dot ibm dot com
   |dot org |
 Status|UNCONFIRMED |ASSIGNED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40475



[Bug testsuite/40359] [4.5 Regression] Revision 148211 caused a lot of failures in the vect test suite.

2009-06-18 Thread irar at il dot ibm dot com


--- Comment #31 from irar at il dot ibm dot com  2009-06-18 08:03 ---
Created an attachment (id=18019)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18019&action=view)
patch to fix vect-42.c

I think the easiest way to fix it is to change the test to have one vectorizable
loop again, as before http://gcc.gnu.org/viewcvs?view=rev&revision=147851.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40359



[Bug testsuite/40359] [4.5 Regression] Revision 148211 caused a lot of failures in the vect test suite.

2009-06-18 Thread irar at il dot ibm dot com


--- Comment #33 from irar at il dot ibm dot com  2009-06-18 09:14 ---
Created an attachment (id=18020)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18020&action=view)
fix vect-42.c

OK, now I understand why we need two loops here (we need to pass the arrays as
parameters to avoid versioning for alias).
So, I split the checks for vect_no_align and the others. I hope it works this
time.
Thanks.


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

  Attachment #18019|0   |1
is obsolete||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40359



[Bug testsuite/40475] [4.5 Regression] gcc.dg/vect/vect-nest-cycle-[12].c

2009-06-21 Thread irar at il dot ibm dot com


--- Comment #7 from irar at il dot ibm dot com  2009-06-21 07:32 ---
Fixed.


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40475



[Bug tree-optimization/40542] [4.3/4.4/4.5 Regression] vectorizes access to volatile array

2009-06-28 Thread irar at il dot ibm dot com


--- Comment #2 from irar at il dot ibm dot com  2009-06-28 10:57 ---
So, the solution is to prevent vectorization of volatile types, like in the
patch below?

Index: tree-vect-data-refs.c
===
--- tree-vect-data-refs.c   (revision 149023)
+++ tree-vect-data-refs.c   (working copy)
@@ -1896,6 +1896,14 @@ vect_analyze_data_refs (loop_vec_info lo
   return false;
 }

+  if (TYPE_VOLATILE (TREE_TYPE (DR_REF (dr))))
+{
+  if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
+fprintf (vect_dump, "not vectorized: memory access of volatile "
+"type");
+  return false;
+}
+
   stmt = DR_STMT (dr);
   stmt_info = vinfo_for_stmt (stmt);
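
To illustrate the kind of loop the check above is meant to leave scalar, here is
a minimal sketch in C. The function name sum_volatile is made up for this
example. In C, every access to a volatile object is an observable side effect,
so the loads below have to be performed one by one, in order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: sums p[0..n-1], where each p[i] is a volatile
   access.  The compiler must issue exactly n separate loads, in source
   order, rather than combining them into wider vector loads. */
int
sum_volatile (volatile int *p, int n)
{
  assert (p != NULL && n >= 0);

  int sum = 0;
  for (int i = 0; i < n; i++)
    sum += p[i];
  return sum;
}
```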


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40542



[Bug tree-optimization/40542] [4.3/4.4/4.5 Regression] vectorizes access to volatile array

2009-06-30 Thread irar at il dot ibm dot com


--- Comment #7 from irar at il dot ibm dot com  2009-06-30 12:02 ---
Fixed.


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40542



[Bug fortran/31067] MINLOC should sometimes be inlined (gas_dyn is sooooo sloooow)

2009-07-04 Thread irar at il dot ibm dot com


--- Comment #27 from irar at il dot ibm dot com  2009-07-05 06:48 ---
(In reply to comment #23)
> because there are two reductions in that loop which I think the vectorizer
> cannot handle:

Actually, the vectorizer can vectorize two reductions. I think the problem is
the cond_expr in the reduction:

>   pos.0_3 = [cond_expr] D.1599_29 ? pos.0_32 : pos.0_31;
>   limit.2_5 = [cond_expr] D.1599_29 ? limit.2_22 : limit.2_8;

I'll look into it.

Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31067



[Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing

2009-07-16 Thread irar at il dot ibm dot com


--- Comment #2 from irar at il dot ibm dot com  2009-07-16 12:29 ---
pr40770.c:20: note: ==> examining statement: sincostmp.21_1 = __builtin_cexpi
(D.1625_3);
pr40770.c:20: note: get vectype for scalar type:  complex double
pr40770.c:20: note: not vectorized: unsupported data-type complex double

make_vector_type returns NULL for this type.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770



[Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing

2009-07-16 Thread irar at il dot ibm dot com


--- Comment #6 from irar at il dot ibm dot com  2009-07-16 17:31 ---
(In reply to comment #3)
> > make_vector_type returns NULL for this type.
> Yes - there is no vector type for complex double.  But the vectorizer
> could query for a vector type for the complex component type (double)
> and divide the vector element count by 2 (for complex) to get the
> vectorization factor which would be 1 here.  

I see.

> Should SLP the be possible
> for that loop?

Not with the current implementation - SLP needs strided stores to start. Here
the stores are not even adjacent. I think it would be better to vectorize this
loop with regular loop-based vectorization to avoid permutations. I'll take a
closer look on Sunday.

Ira

> Richard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770



[Bug tree-optimization/40801] internal compiler error: in vect_get_vec_def_for_stmt_copy, at tree-vect-stmts.c:1096

2009-07-19 Thread irar at il dot ibm dot com


--- Comment #3 from irar at il dot ibm dot com  2009-07-19 09:35 ---
Testing a fix.

Ira


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 AssignedTo|unassigned at gcc dot gnu   |irar at il dot ibm dot com
   |dot org |
 Status|NEW |ASSIGNED
   Last reconfirmed|2009-07-18 19:15:43 |2009-07-19 09:35:55
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40801



[Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing

2009-07-20 Thread irar at il dot ibm dot com


--- Comment #7 from irar at il dot ibm dot com  2009-07-20 11:18 ---
AFAIU, querying for the component type of a complex type is not difficult to
implement.
I think that loop-based vectorization is preferable here, so we should stay
with a vectorization factor of 2 for doubles.

The next problem is to vectorize 
  D.1611_4 = IMAGPART_EXPR ;
and
  D.1612_6 = REALPART_EXPR ;

Currently, we support only loads and stores with IMAGPART/REALPART_EXPR,
vectorizing them as strided accesses, with extract odd and even operations for
loads. So, we will have to support interleaving of non-memory variables. 

Does __builtin_cexpi have a vector implementation? If so, does it return two
vectors?

If not, I guess, we need something like:

  sincostmp.1 = __builtin_cexpi (xd[i]);
  sincostmp.2 = __builtin_cexpi (xd[i+1]);
  v1 = VEC_EXTRACT_EVEN (sincostmp.1, sincostmp.2);
  v2 = VEC_EXTRACT_ODD (sincostmp.1, sincostmp.2);
  sf[i:i+1] = v1;
  cf[i:i+1] = v2;
  i = i + 2;

Or we can use the two vectors from vectorized __builtin_cexpi as parameters of
extract operations.
Does that make sense?
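
The extract-even/extract-odd step in the pseudo-code above can be modelled in
plain C as follows. This is only a scalar sketch: the vec2 type and the
functions extract_even, extract_odd and split_pairs are hypothetical stand-ins,
not GCC internals, and each pair of doubles plays the role of one complex
result.

```c
#include <assert.h>

/* Hypothetical stand-in for a 2-element vector of doubles. */
typedef struct { double e[2]; } vec2;

/* Scalar model of "extract even": view a and b as one concatenated
   4-element sequence and keep elements 0 and 2. */
vec2
extract_even (vec2 a, vec2 b)
{
  vec2 r = { { a.e[0], b.e[0] } };
  return r;
}

/* Scalar model of "extract odd": keep elements 1 and 3. */
vec2
extract_odd (vec2 a, vec2 b)
{
  vec2 r = { { a.e[1], b.e[1] } };
  return r;
}

/* De-interleave n (first, second) pairs stored back to back in `pairs`
   into two separate arrays, two pairs per iteration.  n must be even. */
void
split_pairs (const double *pairs, double *first, double *second, int n)
{
  assert (n >= 0 && n % 2 == 0);

  for (int i = 0; i < n; i += 2)
    {
      vec2 p0 = { { pairs[2 * i], pairs[2 * i + 1] } };
      vec2 p1 = { { pairs[2 * i + 2], pairs[2 * i + 3] } };
      vec2 v1 = extract_even (p0, p1);
      vec2 v2 = extract_odd (p0, p1);
      first[i] = v1.e[0];
      first[i + 1] = v1.e[1];
      second[i] = v2.e[0];
      second[i + 1] = v2.e[1];
    }
}
```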

Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770



[Bug fortran/31067] MINLOC should sometimes be inlined (gas_dyn is sooooo sloooow)

2009-07-20 Thread irar at il dot ibm dot com


--- Comment #28 from irar at il dot ibm dot com  2009-07-20 12:03 ---
I've just committed a patch that adds support of cond_expr in reductions in
nested cycles (http://gcc.gnu.org/ml/gcc-patches/2009-07/msg01124.html). 

cond_expr cannot be vectorized in reduction of inner-most loop, because such
reduction changes the order of computation, and that cannot be done for
cond_expr.

Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31067



[Bug tree-optimization/40801] internal compiler error: in vect_get_vec_def_for_stmt_copy, at tree-vect-stmts.c:1096

2009-07-26 Thread irar at il dot ibm dot com


--- Comment #5 from irar at il dot ibm dot com  2009-07-26 07:04 ---
Fixed.


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40801



[Bug fortran/31067] MINLOC should sometimes be inlined (gas_dyn is sooooo sloooow)

2009-07-26 Thread irar at il dot ibm dot com


--- Comment #32 from irar at il dot ibm dot com  2009-07-26 07:48 ---
(In reply to comment #30)
> Regarding the just committed inline version: It would be interesting to know
> whether it is vectorizable (with/without -ffinite-math-only [i.e.
> -ffast-math]).

It depends on where it is inlined. It has to be vectorized in outer loop (see
my previous comment), so it needs another loop around it.

Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31067



[Bug fortran/31067] MINLOC should sometimes be inlined (gas_dyn is sooooo sloooow)

2009-07-27 Thread irar at il dot ibm dot com


--- Comment #34 from irar at il dot ibm dot com  2009-07-27 08:36 ---
(In reply to comment #33)
> Using the example from comment 23 with
...
> gfortran shows: test.f90:12: note: not vectorized: unsupported use in stmt.
> and needs 2.272s. (By comparison. 4.4 needs 3.688s.)

This is for the inner loop vectorization. For the outer loop we get:
tmp.f90:11: note: not vectorized: control flow in loop.
because of the if's. Maybe loop unswitching can help us. 
Vectorizable outer-loops look like this:

(pre-header)
   |
  header <---+
   | |
  inner-loop |
   | |
  tail --+
   |
(exit-bb)
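
As an illustration of that shape, a C loop nest along the following lines has a
header, a single inner loop and a tail, with no other control flow. This is only
a sketch: the names and trip counts are made up, and no claim is made that a
particular GCC version or target vectorizes its outer loop.

```c
#include <assert.h>

#define ROWS 4   /* outer trip count (assumed for the example) */
#define COLS 3   /* inner trip count (assumed for the example) */

/* out[i] = sum over j of in[i][j].
   The outer loop body consists of a header (initialization), one inner
   loop and a tail (store); there are no if statements. */
void
row_sums (int in[ROWS][COLS], int out[ROWS])
{
  assert (in != 0 && out != 0);

  for (int i = 0; i < ROWS; i++)
    {
      int sum = 0;                    /* header */
      for (int j = 0; j < COLS; j++)  /* inner-loop */
        sum += in[i][j];
      out[i] = sum;                   /* tail */
    }
}
```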


Does ifort vectorize the exact same implementation of minloc?

Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31067



[Bug fortran/31067] MINLOC should sometimes be inlined (gas_dyn is sooooo sloooow)

2009-07-27 Thread irar at il dot ibm dot com


--- Comment #38 from irar at il dot ibm dot com  2009-07-27 12:44 ---
I am not sure that that kind of computation can be generated automatically,
since in general the order of calculation of cond_expr cannot be changed. 

However, the loop can be split:

  for (i = 0; i < end; i++)
    if (arr[i] < limit)
      limit = arr[i];

  for (i = 0; i < end; i++)
    if (arr[i] == limit)
      {
        pos = i + 1;
        break;
      }

making the first loop vectorizable (inner-most loop vectorization).
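
A self-contained version of the split loops, written as a C function, might look
like the sketch below. The function name minloc_split, the use of int elements
and the convention of returning 0 for an empty array are assumptions made for
this example.

```c
#include <assert.h>

/* Returns the 1-based position of the first minimum of arr[0..end-1],
   or 0 if end == 0 (a convention assumed for this sketch). */
int
minloc_split (const int *arr, int end)
{
  assert (end >= 0);
  if (end == 0)
    return 0;

  /* Loop 1: plain minimum reduction.  The order in which elements are
     combined does not affect the result. */
  int limit = arr[0];
  for (int i = 1; i < end; i++)
    if (arr[i] < limit)
      limit = arr[i];

  /* Loop 2: scalar search for the first element equal to the minimum. */
  int pos = 0;
  for (int i = 0; i < end; i++)
    if (arr[i] == limit)
      {
        pos = i + 1;
        break;
      }
  return pos;
}
```

The second loop stays scalar, but it stops at the first match, while the first
loop carries no position and is an ordinary min reduction.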

Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31067



[Bug fortran/31067] MINLOC should sometimes be inlined (gas_dyn is sooooo sloooow)

2009-07-28 Thread irar at il dot ibm dot com


--- Comment #41 from irar at il dot ibm dot com  2009-07-28 08:12 ---
That requires pattern recognition. MIN/MAX_EXPR are recognized by the first
phiopt pass, so MIN/MAXLOC should be either also recognized there or in the
vectorizer. (The phiopt pass transforms if clause to MIN/MAX_EXPR. The
vectorizer gets COND_EXPR after if-conversion pass).
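
As a source-level illustration of the two forms mentioned above, the sketch
below shows an if clause and the equivalent select. The function names are
made up, and the code only shows that both forms compute the same value; it
does not reproduce what phiopt or if-conversion emit.

```c
#include <assert.h>

/* Branchy form, as it might be written in the source. */
int
min_if (int limit, int x)
{
  if (x < limit)
    limit = x;
  return limit;
}

/* Select form: the same value expressed without control flow, which is
   what a MIN_EXPR computes. */
int
min_select (int limit, int x)
{
  return x < limit ? x : limit;
}

/* Minimum of a[0..n-1] using the select form; n must be positive. */
int
min_reduce (const int *a, int n)
{
  assert (n > 0);

  int limit = a[0];
  for (int i = 1; i < n; i++)
    limit = min_select (limit, a[i]);
  return limit;
}
```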


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31067



[Bug middle-end/37150] vectorizer misses some loops

2009-08-06 Thread irar at il dot ibm dot com


--- Comment #10 from irar at il dot ibm dot com  2009-08-06 10:49 ---
Yes. The problem is that only a basic implementation was added. To vectorize
this code, several improvements are needed: support for stmt group sizes greater
than the vector size, allowing loads and stores to the same location, initiating
SLP analysis from groups of loads, support for misaligned accesses, etc. 

Finding a benchmark could really help to push these items to the top of
vectorizer's todo list.

Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37150



[Bug tree-optimization/41008] [4.5 Regression] ICE in vect_is_simple_reduction, at tree-vect-loop.c:1708

2009-08-09 Thread irar at il dot ibm dot com


--- Comment #3 from irar at il dot ibm dot com  2009-08-09 12:15 ---
Fixed.


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41008



[Bug tree-optimization/41019] Variate_generator with mt19937 and normal_distribution produces wrong sequence for "-O3".

2009-08-12 Thread irar at il dot ibm dot com


--- Comment #6 from irar at il dot ibm dot com  2009-08-12 12:14 ---
Looks like a problem in data-ref analysis:

Creating dr for this_6(D)->_M_x[__k_87]
...
base_address: this_6(D)
offset from base address: 0
constant offset from base address: 0
step: 8
aligned to: 128
base_object: this_6(D)->_M_x[0]

And the vectorizer creates accesses relative to this_6(D) (base_address
above) with zero offset (instead of this_6(D)->_M_x[0] or with an offset of
_M_x).
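
To make the concern concrete, here is a small C sketch of how the address of an
array member is composed from a base address, the member's offset and a step.
The struct layout and the names state and elem_addr are assumptions for
illustration; they are not taken from the failing code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a class whose array member does not sit at
   offset 0 (this layout is an assumption for the example). */
struct state
{
  long header;
  long x[8];
};

/* Address of s->x[k], composed the way a data reference describes it:
   base address + offset of the member + k * step. */
long *
elem_addr (struct state *s, size_t k)
{
  assert (s != NULL && k < 8);

  char *base = (char *) s;
  size_t member_offset = offsetof (struct state, x);
  size_t step = sizeof (long);
  return (long *) (base + member_offset + k * step);
}
```

Dropping member_offset from the sum would make every access land at the start of
the object instead of inside x, which is the kind of error suspected above.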

Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41019



[Bug tree-optimization/41019] Variate_generator with mt19937 and normal_distribution produces wrong sequence for "-O3".

2009-08-12 Thread irar at il dot ibm dot com


--- Comment #8 from irar at il dot ibm dot com  2009-08-13 05:40 ---
(In reply to comment #7)
> Oh.  Did you manage to reduce or reproduce with a smaller testcase?

No, I just looked at the vectorized loops. The guilty one is

bin/../lib/gcc/x86_64-unknown-linux-gnu/4.5.0/../../../../include/c++/4.5.0/tr1/random.tcc:231

Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41019



[Bug tree-optimization/41019] Variate_generator with mt19937 and normal_distribution produces wrong sequence for "-O3".

2009-08-13 Thread irar at il dot ibm dot com


--- Comment #10 from irar at il dot ibm dot com  2009-08-13 11:34 ---
Reduced testcase:

#include <stdio.h>
#include 

#define N 4

long int a[N];
int main ()
{
  int k;

  for (k = 0; k < N; ++k)
    a[k] = a[k] != 5 ? 12 : 10;

  for (k = 0; k < N; ++k)
    printf ("%ld ", a[k]);

  printf ("\n");

  return 0;
}

%gcc -O3 t.c
% ./a.out
0 0 0 0

%gcc -O2 t.c
% ./a.out
12 12 12 12

If the type of 'a' is int, there is no problem. The vectorizer produces almost
the same code in both cases (except for number of iterations and types).
I am attaching the assembly for int and long int versions.

Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41019



[Bug tree-optimization/41019] Variate_generator with mt19937 and normal_distribution produces wrong sequence for "-O3".

2009-08-13 Thread irar at il dot ibm dot com


--- Comment #11 from irar at il dot ibm dot com  2009-08-13 11:36 ---
Created an attachment (id=18350)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18350&action=view)
The assembly for the long int version (wrong code)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41019



[Bug tree-optimization/41019] Variate_generator with mt19937 and normal_distribution produces wrong sequence for "-O3".

2009-08-13 Thread irar at il dot ibm dot com


--- Comment #12 from irar at il dot ibm dot com  2009-08-13 11:37 ---
Created an attachment (id=18351)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18351&action=view)
The assembly for the int version (correct)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41019



[Bug tree-optimization/25211] [4.1/4.2 Regression] verify_ssa ICE for mesa with -Os -ftree-loop-linear

2005-12-14 Thread irar at il dot ibm dot com


--- Comment #4 from irar at il dot ibm dot com  2005-12-14 13:11 ---
I think the reason why this ICE occurs with my patch
(http://gcc.gnu.org/viewcvs?view=rev&rev=102356) is that my patch enables
data-refs analysis for INDIRECT_REFs. A similar ICE in PR 20256 also occurs
without my patch, since the data-refs there are ARRAY_REFs, and ARRAY_REFs were
already supported before.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25211



[Bug tree-optimization/25371] -ftree-vectorize results in internal compiler error on AMD64

2005-12-18 Thread irar at il dot ibm dot com


--- Comment #3 from irar at il dot ibm dot com  2005-12-18 08:15 ---
I failed to reproduce this ICE on ppc and i686.
The vectorizer's dump file would help.


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 CC||irar at il dot ibm dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25371



[Bug tree-optimization/21591] not vectorizing a loop with access to structs

2006-09-13 Thread irar at il dot ibm dot com


--- Comment #7 from irar at il dot ibm dot com  2006-09-13 08:32 ---
I think the problem here is that we only check the SMT and not the NMT. I am
preparing a patch to fix this. The NMT is stored in the ptr_info_def of the
data-ref, and the SMT will be checked only if the NMT does not exist.


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 CC||irar at il dot ibm dot com,
   ||dnovillo at redhat dot com
 AssignedTo|unassigned at gcc dot gnu   |irar at il dot ibm dot com
   |dot org |
 Status|NEW |ASSIGNED
   Last reconfirmed|2006-02-21 01:04:59 |2006-09-13 08:32:31
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21591



[Bug tree-optimization/18438] vectorizer failed for vector matrix multiplication

2006-09-19 Thread irar at il dot ibm dot com


--- Comment #3 from irar at il dot ibm dot com  2006-09-19 07:10 ---
> t.c:20: note: not vectorized: mixed data-types
> t.c:20: note: can't determine vectorization factor.
>
> Removing flags[i] = true;

Multiple data-types vectorization is already supported in the autovect branch,
and the patches for mainline (starting from
http://gcc.gnu.org/ml/gcc-patches/2006-02/msg00941.html) will be committed as
soon as 4.3 is open.  


> we get:
> t.c:20: note: not consecutive access
> t.c:20: note: not vectorized: complicated access pattern.

Vectorization of strided accesses is also already implemented in the autovect
branch (and will be committed to the mainline 4.3). However, this case contains
stores with gaps (stores to opoints[i][0], opoints[i][1], and opoints[i][2],
without a store to opoints[i][3]), and only loads with gaps are currently
supported.

Therefore, this loop will be vectorizable in the autovect branch (and soon in
the mainline 4.3) if a store to opoints[i][3] is added.
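
For illustration, the difference between a store group with a gap and one
without can be sketched in C as below. The function names, the element type and
the array sizes are assumptions and are not taken from the test case in this PR.

```c
#include <assert.h>

#define NPOINTS 4   /* number of rows (assumed for the example) */

/* Stores to columns 0..2 only: column 3 is a "gap", so the stores to
   each row do not cover a full group of 4 consecutive elements. */
void
scale_with_gap (float out[NPOINTS][4], float in[NPOINTS][4], float s)
{
  assert (out != 0 && in != 0);

  for (int i = 0; i < NPOINTS; i++)
    {
      out[i][0] = in[i][0] * s;
      out[i][1] = in[i][1] * s;
      out[i][2] = in[i][2] * s;
    }
}

/* Same loop with the gap filled: every element of each row is stored. */
void
scale_no_gap (float out[NPOINTS][4], float in[NPOINTS][4], float s)
{
  assert (out != 0 && in != 0);

  for (int i = 0; i < NPOINTS; i++)
    {
      out[i][0] = in[i][0] * s;
      out[i][1] = in[i][1] * s;
      out[i][2] = in[i][2] * s;
      out[i][3] = in[i][3] * s;
    }
}
```

Note that filling the gap changes what the loop writes, so it is only an option
when overwriting the fourth element is acceptable.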

Ira


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 CC|                        |irar at il dot ibm dot com
   Last reconfirmed|2005-12-21 03:49:03 |2006-09-19 07:10:15
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18438



[Bug tree-optimization/19049] not vectorizing a fortran loop

2006-09-19 Thread irar at il dot ibm dot com


--- Comment #7 from irar at il dot ibm dot com  2006-09-19 07:29 ---
Even though vectorization of strided accesses is already implemented in the
autovect branch (and will be committed to the mainline 4.3), this case contains
a store with a gap (store to a[i] without a store to a[i-1]), and such stores
are not supported (the current implementation supports only loads with gaps).

Note, however, that adding a store to a[i-1] will create a data dependence in
the loop.

Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19049



[Bug tree-optimization/26969] [4.1 Regression] ICE with -O1 -funswitch-loops -ftree-vectorize

2006-10-18 Thread irar at il dot ibm dot com


--- Comment #15 from irar at il dot ibm dot com  2006-10-18 11:03 ---
(In reply to comment #13)
> We need to check if above patch fixes PR26969 as well.

Checked, it does not.


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 CC|    |irar at il dot ibm dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26969



[Bug tree-optimization/26362] ICE on the autovect-branch (gfortran example)

2007-01-28 Thread irar at il dot ibm dot com


--- Comment #3 from irar at il dot ibm dot com  2007-01-28 10:45 ---
The current versions of both mainline and the autovect branch do not ICE.
Strided loads are not implemented for SSE; I opened PR 30211 for that.
I think this PR can be closed.

Ira


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 CC||irar at il dot ibm dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26362



[Bug tree-optimization/27659] ICE on autovect-branch

2007-01-28 Thread irar at il dot ibm dot com


--- Comment #3 from irar at il dot ibm dot com  2007-01-28 11:38 ---
I tried to reproduce this on x86 with current autovect branch and mainline with
.../g++ -fpreprocessed tmp.ii -S -O3 -ftree-vectorize -msse2 -ansi
-fdump-tree-vect-details. It does not ICE, and the loop is vectorized.

Ira


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 CC||irar at il dot ibm dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27659



[Bug c/30843] [4.3 Regression] ice for legal code with -ftree-vectorize -O2

2007-02-19 Thread irar at il dot ibm dot com


--- Comment #5 from irar at il dot ibm dot com  2007-02-19 11:18 ---
Subject: Re:  ice for legal code with -ftree-vectorize -O2

I know what the problem is. If we don't remove the store while iterating,
we can't get it later (the si), can we?

Ira




From: "dorit at il dot ibm dot com" <[EMAIL PROTECTED].gnu.org>
To: Ira Rosen/Haifa/[EMAIL PROTECTED]
Date: 18/02/2007 23:52
Subject: [Bug c/30843] ice for legal code with -ftree-vectorize -O2
Please respond to: [EMAIL PROTECTED]gnu.org

--

dorit at il dot ibm dot com changed:

   What|Removed |Added


 CC|                        |irar at il dot ibm dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30843



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30843



[Bug c/30843] [4.3 Regression] ice for legal code with -ftree-vectorize -O2

2007-02-19 Thread irar at il dot ibm dot com


--- Comment #6 from irar at il dot ibm dot com  2007-02-19 12:41 ---
Sorry about the last comment, it was sent by mistake.

Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30843



[Bug bootstrap/30921] New: Bootstrap failure with -ftree-vectorize on i386

2007-02-21 Thread irar at il dot ibm dot com
Bootstrap with vectorization enabled fails on i386 starting from revision
121767:
http://gcc.gnu.org/viewcvs?view=rev&revision=121767

Ira


-- 
   Summary: Bootstrap failure with -ftree-vectorize on i386
   Product: gcc
   Version: 4.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: bootstrap
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: irar at il dot ibm dot com
 GCC build triplet: i386-redhat-linux
  GCC host triplet: i386-redhat-linux
GCC target triplet: i386-redhat-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30921



[Bug bootstrap/30921] Bootstrap failure with -ftree-vectorize on i386

2007-02-22 Thread irar at il dot ibm dot com


--- Comment #1 from irar at il dot ibm dot com  2007-02-22 07:58 ---
Here is the ChangeLog entry for that patch:

2007-02-09  Richard Henderson  <[EMAIL PROTECTED]>

* config/i386/constraints.md (Ym): New constraint.
* config/i386/i386.md (movsi_1): Change Y2 to Yi constraints.
(movdi_1_rex64): Split sse and xmm general register moves from
memory move alternatives.  Use conditional register constraints.
(movsf_1, movdf_integer): Likewise.
(zero_extendsidi2_32, zero_extendsidi2_rex64): Likewise.
(movdf_integer_rex64): New.
(pushsf_rex64): Fix output constraints.
* config/i386/sse.md (sse2_loadld): Split rm alternative, use Yi.
(sse2_stored): Likewise.
(sse2_storeq_rex64): New.
* config/i386/i386.c (x86_inter_unit_moves): Enable for not 
amd and not generic.
(ix86_secondary_memory_needed): Don't bypass TARGET_INTER_UNIT_MOVES
for optimize_size.  Remove SF/DFmode hack.

Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30921



[Bug bootstrap/30921] Bootstrap failure with -ftree-vectorize on i386

2007-02-22 Thread irar at il dot ibm dot com


--- Comment #3 from irar at il dot ibm dot com  2007-02-22 08:22 ---
(In reply to comment #2)
> (In reply to comment #0)
> > Bootstrap with vectorization enabled fails on i386 starting from revision
> > 121767:
> > http://gcc.gnu.org/viewcvs?view=rev&revision=121767
> Could you post exact steps how to reproduce this failure?

Run

make bootstrap BOOT_CFLAGS="-O2 -g -ftree-vectorize -msse2"

Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30921



[Bug tree-optimization/24309] [4.1/4.2/4.3 Regression] ICE with -O3 -ftree-loop-linear

2007-03-05 Thread irar at il dot ibm dot com


--- Comment #15 from irar at il dot ibm dot com  2007-03-05 09:30 ---
I tried the reduced testcase on powerpc with -ftree-loop-linear and both -O2
and -O3 on 4.1, 4.2 and 4.3, and it works fine.

Ira


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 CC||irar at il dot ibm dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24309



[Bug tree-optimization/25371] -ftree-vectorize results in internal compiler error on AMD64

2007-03-11 Thread irar at il dot ibm dot com


--- Comment #6 from irar at il dot ibm dot com  2007-03-11 10:33 ---
Harsha, could you please attach vectorizer's dump file (produced with
-fdump-tree-vect-details)?

Thanks,
Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25371



[Bug tree-optimization/31343] New: ICE in data-refs dependence testing

2007-03-25 Thread irar at il dot ibm dot com
An attempt to divide by zero is made (causing ICE on the attached test
case) for evolution functions with zero step.

For the following evolution functions of pS[i_15].x and pS[i_15].y from the
attached test
  (chrec_a = {{0, +, 1}_1, +, 0}_2)
  (chrec_b = {{1, +, 1}_1, +, 0}_2)
the difference (-1) is calculated, and then the check whether the step
(0) divides the difference is performed in function
chrec_steps_divide_constant_p (tree-data-ref.c), causing ICE.
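
The hazard can be shown with a small C sketch of a divisibility test that
checks for a zero step first. The helper step_divides is hypothetical and is not
GCC's chrec_steps_divide_constant_p; the convention that a zero step divides
only a zero difference is an assumption of this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: returns true if `step` evenly divides `diff`.
   The zero step is handled before the modulo, because `diff % 0` is
   undefined behavior in C. */
bool
step_divides (long step, long diff)
{
  if (step == 0)
    return diff == 0;          /* assumed convention: 0 divides only 0 */
  if (step == 1 || step == -1)
    return true;               /* also avoids LONG_MIN % -1 */
  return diff % step == 0;
}
```

With the values from this report (step 0, difference -1) the helper returns
false instead of dividing by zero.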


-- 
   Summary: ICE in data-refs dependence testing
   Product: gcc
   Version: 4.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: irar at il dot ibm dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31343



[Bug tree-optimization/31343] ICE in data-refs dependence testing

2007-03-25 Thread irar at il dot ibm dot com


--- Comment #1 from irar at il dot ibm dot com  2007-03-25 10:02 ---
Created an attachment (id=13281)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13281&action=view)
test case


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31343



[Bug tree-optimization/32806] New: Missing optimization to remove backword dependencies

2007-07-18 Thread irar at il dot ibm dot com
for (i=0; ihttp://gcc.gnu.org/bugzilla/show_bug.cgi?id=32806



[Bug bootstrap/33031] Bootstrap fails on gcc/tree.c

2007-08-09 Thread irar at il dot ibm dot com


--- Comment #1 from irar at il dot ibm dot com  2007-08-09 08:44 ---
I got this too on x86_64-linux.
I guess the guilty patch is 

r127306 | chaoyingfu | 2007-08-09 01:29:12 +0300 (Thu, 09 Aug 2007) | 213 lines

since it added the function fixed_zerop:

* tree.c 
 ...
(fixed_zerop): New function.

Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33031



[Bug tree-optimization/33447] New: Non-empty latch block prevents loop vectorization

2007-09-16 Thread irar at il dot ibm dot com
The following loop (from linpk.f90) contains a non-empty latch block before
tree optimizations:

Source code:
Line
   m = MOD(N,4)
323IF ( m.NE.0 ) THEN
324   DO i = 1 , m
325  Dy(i) = Dy(i) + Da*Dx(i)
326   ENDDO
327   IF ( N.LT.4 ) RETURN
328ENDIF
329mp1 = m + 1
330DO i = mp1 , N , 4
331   Dy(i) = Dy(i) + Da*Dx(i)
332   Dy(i+1) = Dy(i+1) + Da*Dx(i+1)
333   Dy(i+2) = Dy(i+2) + Da*Dx(i+2)
334   Dy(i+3) = Dy(i+3) + Da*Dx(i+3)
335ENDDO 

The first SSA dump:

:
  ...

  if (countm1.32_8 == 0)
goto ;
  else
goto ;

:
  countm1.32_98 = countm1.32_8 + 4294967295;
  goto ;

This is also related to PR 28643 and PR 33244. However, in these PRs some tree
optimization puts stmts/phi nodes in the latch block, while in the linpk
example the latch block is non-empty to begin with.
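
For readers more comfortable with C, the Fortran fragment can be rendered
roughly as below. The counted form of the second loop is an assumption modelled
on the dump above (an exit test on countm1, then a decrement before jumping
back); the function name daxpy_counted is made up and indices are 0-based.

```c
#include <assert.h>

/* dy[i] += da * dx[i] for i in 0..n-1: first the n % 4 leading
   elements, then the rest four at a time. */
void
daxpy_counted (int n, double da, const double *dx, double *dy)
{
  assert (n >= 0);

  int m = n % 4;
  for (int i = 0; i < m; i++)
    dy[i] += da * dx[i];
  if (n < 4)
    return;

  /* n >= 4 here, so (n - m) / 4 >= 1 and the body runs at least once. */
  unsigned countm1 = (unsigned) ((n - m) / 4) - 1u;
  int i = m;
  for (;;)
    {
      dy[i] += da * dx[i];
      dy[i + 1] += da * dx[i + 1];
      dy[i + 2] += da * dx[i + 2];
      dy[i + 3] += da * dx[i + 3];
      i += 4;
      if (countm1 == 0)
        break;
      countm1 -= 1;   /* runs after the exit test, on the way back */
    }
}
```

The decrement that follows the exit test is the statement that corresponds to
the non-empty latch block in the dump.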


-- 
   Summary: Non-empty latch block prevents loop vectorization
   Product: gcc
   Version: 4.3.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: irar at il dot ibm dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33447



[Bug middle-end/33449] [4.3 regression] ICE for fortran code with -O2 -ftree-vectorize

2007-09-17 Thread irar at il dot ibm dot com


--- Comment #4 from irar at il dot ibm dot com  2007-09-17 08:59 ---
(In reply to comment #3)
> I can reproduce that on x86_64-linux with trunk rev. 128442. 

Dorit's fix is revision 128514, so it is not supposed to work on 128442...
Anyway, I am trying to reproduce this ICE on x86_64-linux now, with the current
trunk (128538).

Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33449



[Bug middle-end/33449] [4.3 regression] ICE for fortran code with -O2 -ftree-vectorize

2007-09-17 Thread irar at il dot ibm dot com


--- Comment #5 from irar at il dot ibm dot com  2007-09-17 09:54 ---
(In reply to comment #4)
> Anyway, I am trying to reproduce this ICE on x86_64-linux now, with the 
> current
> trunk (128538).

It doesn't ICE for me. (The loop gets vectorized).

Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33449



[Bug target/33505] Vectorizer (or spu target builtins) and PCH don't get along

2007-09-30 Thread irar at il dot ibm dot com


--- Comment #1 from irar at il dot ibm dot com  2007-09-30 09:42 ---
I managed to reproduce it. 

In http://gcc.gnu.org/ml/gcc-patches/2007-09/msg01559.html Richard suggested
adding a GTY(()) to
struct spu_builtin_description spu_builtins[] = {
#define DEF_BUILTIN(fcode, icode, name, type, params) \
  {fcode, icode, name, type, params, NULL_TREE},
#include "spu-builtins.def"
#undef DEF_BUILTIN
};

Actually there is a GTY(()) in spu-builtins.h
 extern GTY(()) struct spu_builtin_description spu_builtins[];

Anyway, I tried the following and it didn't help:
Index: spu.c
===
--- spu.c   (revision 128708)
+++ spu.c   (working copy)
@@ -4459,7 +4459,7 @@
 ^L
 /* Create the built-in types and functions */

-struct spu_builtin_description spu_builtins[] = {
+struct spu_builtin_description GTY (()) spu_builtins[] = {
 #define DEF_BUILTIN(fcode, icode, name, type, params) \
   {fcode, icode, name, type, params, NULL_TREE},
 #include "spu-builtins.def"

Ira


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 CC|                    |irar at il dot ibm dot com,
   ||richard dot guenther at
   ||gmail dot com
 Status|UNCONFIRMED |NEW
 Ever Confirmed|0   |1
   Last reconfirmed|-00-00 00:00:00 |2007-09-30 09:42:56
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33505



[Bug middle-end/33597] Internal compiler error while compiling libswcale from ffmpeg

2007-09-30 Thread irar at il dot ibm dot com


--- Comment #6 from irar at il dot ibm dot com  2007-09-30 10:37 ---
(In reply to comment #5)
> Patch in testing:

Thanks for fixing this! 
(I've just started to test the exact same patch :))

Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33597



[Bug target/33505] Vectorizer (or spu target builtins) and PCH don't get along

2007-10-02 Thread irar at il dot ibm dot com


--- Comment #3 from irar at il dot ibm dot com  2007-10-02 09:22 ---
(In reply to comment #2)
> This is kinda on my list of stuff to forward port from the internal PS3
> toolchain.

Maybe I can help with testing this patch for mainline?

Thanks,
Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33505



[Bug tree-optimization/33680] [4.3 Regression] ICE when compilling elbg.c from ffmpeg (vectorizer)

2007-10-07 Thread irar at il dot ibm dot com


--- Comment #7 from irar at il dot ibm dot com  2007-10-07 12:31 ---
(In reply to comment #3)

I get:
pr33680.c: In function 'f':
pr33680.c:1: error: expected an SSA_NAME object
pr33680.c:1: error: in statement
D.1618_93 = D.1556 /[ex] 4;
pr33680.c:1: internal compiler error: verify_ssa failed

The problem is that D.1556 is a VAR_DECL and not an SSA_NAME.

This stmt is created while gimplifying data-ref base in
vect_create_addr_base_for_vector_ref(). The expr is
(int[0:D.1553] *) newcentroid.1_22 + (long unsigned int) dim_4(D) * 8

 
sizes-gimplified type_1 BLK size 
unit size  
...

D.1618 = D.1556 /[ex] 4 is created, taking D.1556 as the unit size in
gimplify_compound_lval. Later, D.1618 is replaced with the SSA_NAME
D.1618_93, since it is an lhs (in gimplify_modify_expr).

> Vectorizer produces invalid Gimple SSA code:
> 
>   D.1769_169 = D.1599 /[ex] 4;
> 
> D.1599 should be renamed.
> 

Where should it be renamed? In gimplify_smth? 

Thanks,
Ira


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 CC|                    |irar at il dot ibm dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33680



[Bug tree-optimization/33680] [4.3 Regression] ICE when compilling elbg.c from ffmpeg (vectorizer)

2007-10-09 Thread irar at il dot ibm dot com


--- Comment #9 from irar at il dot ibm dot com  2007-10-09 12:49 ---
(In reply to comment #8)
> If you use force_gimple_operand_bsi, it takes care of that itself.

Thanks! I will try to see if we can use it. The problem is that we don't have a
bsi: we insert those stmts using bsi_insert_on_edge_immediate on
loop_preheader_edge.

> If you e.g. use force_gimple_operand instead, you need to take care of
> calling mark_symbols_for_renaming yourself.

In order to do this, we will have to go through the statement list created by
force_gimple_operand, and I am not sure that it's a good idea.

Thanks,
Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33680



[Bug tree-optimization/33680] [4.3 Regression] ICE when compilling elbg.c from ffmpeg (vectorizer)

2007-10-10 Thread irar at il dot ibm dot com


--- Comment #11 from irar at il dot ibm dot com  2007-10-10 13:23 ---
I understand that those symbols have to be renamed, I am just saying that maybe
it should be done in the gimplifier and not in the vectorizer. But since
force_gimple_operand_bsi also goes through the statements list, I guess it is
reasonable to do the same thing in the vectorizer. Or we can add a new API like
force_gimple_operand_and_mark_for_renaming.

Anyway, I tried your patch. Now we get a different ICE:
internal compiler error: in referenced_var_lookup, at tree-dfa.c:642

D.1556 is marked for renaming, but update_ssa then cannot find it:
htab_find_with_hash (tree-dfa.c:641) returns NULL.

#0  referenced_var_lookup (uid=1556) at ../../gcc/gcc/tree-dfa.c:642
#1  0x006f9308 in update_ssa (update_flags=2048) at
../../gcc/gcc/tree-into-ssa.c:3207
#2  0x00aac184 in vect_transform_loop (loop_vinfo=0xe94410) at
../../gcc/gcc/tree-vect-transform.c:7431
#3  0x007fae09 in vectorize_loops () at
../../gcc/gcc/tree-vectorizer.c:2507
#4  0x00631726 in execute_one_pass (pass=0xdfc0c0) at
../../gcc/gcc/passes.c:1116
#5  0x006318ec in execute_pass_list (pass=0xdfc0c0) at
../../gcc/gcc/passes.c:1169
#6  0x006318fe in execute_pass_list (pass=0xdfbee0) at
../../gcc/gcc/passes.c:1170
#7  0x006318fe in execute_pass_list (pass=0xdfb2e0) at
../../gcc/gcc/passes.c:1170
#8  0x007086ce in tree_rest_of_compilation (fndecl=0x2ba807b05800) at
../../gcc/gcc/tree-optimize.c:404
#9  0x0088a054 in cgraph_expand_function (node=0x2ba807b05900) at
../../gcc/gcc/cgraphunit.c:1070
#10 0x0088bbe7 in cgraph_optimize () at ../../gcc/gcc/cgraphunit.c:1139
#11 0x004144fe in c_write_global_declarations () at
../../gcc/gcc/c-decl.c:8077
#12 0x006ad2e7 in toplev_main (argc=, argv=) at ../../gcc/gcc/toplev.c:1052
#13 0x2ba8077d5154 in __libc_start_main () from /lib64/libc.so.6
#14 0x00403cf9 in _start ()


Thanks,
Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33680



[Bug tree-optimization/33680] [4.3 Regression] ICE when compilling elbg.c from ffmpeg (vectorizer)

2007-10-11 Thread irar at il dot ibm dot com


--- Comment #13 from irar at il dot ibm dot com  2007-10-11 10:43 ---
Maybe we can fix DCE not to eliminate such vars?

Or somehow fix split_constant_offset?
The following patch changes the base from 
(int[0:D.1553] *) newcentroid.1_22 + (long unsigned int) dim_4(D) * 8
to (int[0:D.1553] *) D.1560_21 + (long unsigned int) dim_4(D) * 8
and, hence, there is no need for the size of newcentroid.1_22:

Index: tree-data-ref.c
===
--- tree-data-ref.c (revision 128902)
+++ tree-data-ref.c (working copy)
@@ -579,8 +579,10 @@ split_constant_offset (tree exp, tree *v
  {
split_constant_offset (def_stmt_rhs, &var0, &off0);
var0 = fold_convert (type, var0);
-   *var = var0;
-   *off = off0;
+   split_constant_offset (var0, &var2, &off2);
+   *var = var2;
+   *off = fold_build2 (PLUS_EXPR, TREE_TYPE (off2),
+off0, off2);
return;
  }
  }
Maybe we can check whether the base is of a VLA type and then try to split it
further as above (and not vectorize if we fail)?
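
To make the idea of "splitting once more and adding the offsets" concrete, here
is a toy model in C. It is only a sketch, not GCC code: struct toy_expr,
toy_split_once and toy_split_again are hypothetical names, and the model peels
one "base + constant" layer per call, whereas the real split_constant_offset
works on trees and follows SSA definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Toy expression: value = base + offset.  A leaf has base == NULL.
   Hypothetical type, unrelated to GCC's tree representation.  */
struct toy_expr
{
  const struct toy_expr *base;
  long offset;
};

/* Peel one layer: EXP == *VAR + *OFF.  A leaf splits into itself + 0.  */
static void
toy_split_once (const struct toy_expr *exp,
                const struct toy_expr **var, long *off)
{
  assert (exp != NULL);
  if (exp->base == NULL)
    {
      *var = exp;
      *off = 0;
    }
  else
    {
      *var = exp->base;
      *off = exp->offset;
    }
}

/* Mirror of the patched code path: split, then split the resulting
   variable part again and return the sum of both offsets.  */
static void
toy_split_again (const struct toy_expr *exp,
                 const struct toy_expr **var, long *off)
{
  const struct toy_expr *var0, *var2;
  long off0, off2;

  toy_split_once (exp, &var0, &off0);
  toy_split_once (var0, &var2, &off2);
  *var = var2;
  *off = off0 + off2;
}
```

With top = (mid + 16) and mid = (leaf + 8), toy_split_once stops at mid with
offset 16, while toy_split_again reaches leaf with offset 24, so the
intermediate name is no longer needed as the base.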

Thanks,
Ira


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 CC||rakdver at gcc dot gnu dot
   ||org
   Priority|P1  |P3


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33680



[Bug tree-optimization/33680] [4.3 Regression] ICE when compilling elbg.c from ffmpeg (vectorizer)

2007-10-11 Thread irar at il dot ibm dot com


--- Comment #14 from irar at il dot ibm dot com  2007-10-11 12:34 ---
BTW, without this patch http://gcc.gnu.org/ml/gcc-patches/2007-07/msg02122.html
there is no ICE and the loop gets vectorized.

Ira


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 CC||Jan dot Sjodin at amd dot
   ||com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33680



[Bug tree-optimization/33680] [4.3 Regression] ICE when compilling elbg.c from ffmpeg (vectorizer)

2007-10-15 Thread irar at il dot ibm dot com


--- Comment #16 from irar at il dot ibm dot com  2007-10-15 10:42 ---
This patch fixes the ICE and doesn't cause regressions in the vectorizer
testsuite:

Index: tree-data-ref.c
===
--- tree-data-ref.c (revision 129292)
+++ tree-data-ref.c (working copy)
@@ -571,11 +571,16 @@ split_constant_offset (tree exp, tree *v
if (TREE_CODE (def_stmt) == GIMPLE_MODIFY_STMT)
  {
tree def_stmt_rhs = GIMPLE_STMT_OPERAND (def_stmt, 1);
+tree arr = NULL_TREE;
+
+if (TREE_CODE (def_stmt_rhs) == ADDR_EXPR)
+  arr = TREE_OPERAND (def_stmt_rhs, 0);

if (!TREE_SIDE_EFFECTS (def_stmt_rhs)
&& EXPR_P (def_stmt_rhs)
&& !REFERENCE_CLASS_P (def_stmt_rhs)
-   && !get_call_expr_in (def_stmt_rhs))
+   && !get_call_expr_in (def_stmt_rhs)
+&& (!arr || TREE_THIS_NOTRAP (arr)))
  {
split_constant_offset (def_stmt_rhs, &var0, &off0);
var0 = fold_convert (type, var0);

This way we avoid arrays with unknown size.

Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33680



[Bug tree-optimization/33804] ICE in vect_transform_stmt, at tree-vect-transform.c:6131 with -ftree-vectorize

2007-10-18 Thread irar at il dot ibm dot com


--- Comment #3 from irar at il dot ibm dot com  2007-10-18 08:33 ---
It works fine for me (and the loop gets SLPed) on powerpc-64 and x86_64.

Could you please run it with -fdump-tree-vect-details and attach the dump file?

Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33804



[Bug tree-optimization/33812] New: Vectorizer testcases fail

2007-10-18 Thread irar at il dot ibm dot com
With current trunk (r129433) vectorizer testcases fail on powerpc64:

FAIL: gcc.dg/vect/vect-64.c (internal compiler error)
FAIL: gcc.dg/vect/vect-64.c (test for excess errors)
WARNING: gcc.dg/vect/vect-64.c compilation failed to produce executable
FAIL: gcc.dg/vect/vect-68.c (internal compiler error)
FAIL: gcc.dg/vect/vect-68.c (test for excess errors)
WARNING: gcc.dg/vect/vect-68.c compilation failed to produce executable
FAIL: gcc.dg/vect/vect-70.c (internal compiler error)
FAIL: gcc.dg/vect/vect-70.c (test for excess errors)
WARNING: gcc.dg/vect/vect-70.c compilation failed to produce executable
FAIL: gcc.dg/vect/no-scevccp-slp-31.c (internal compiler error)
FAIL: gcc.dg/vect/no-scevccp-slp-31.c (test for excess errors)
WARNING: gcc.dg/vect/no-scevccp-slp-31.c compilation failed to produce
executable

The tests ICE:
vect-64.c: In function 'main1':
vect-64.c:75: internal compiler error: in change_address_1, at emit-rtl.c:1888
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.

#0  change_address_1 (memref=0xf7e44f40, mode=SImode, addr=0xf7e44f30,
validate=1) at ../../gcc/gcc/emit-rtl.c:1888
#1  0x10179fd4 in validize_mem (ref=0xf7e44f40) at ../../gcc/gcc/explow.c:546
#2  0x101a88e8 in emit_move_insn (x=0xf7e38e00, y=0xf7e44f40) at
../../gcc/gcc/expr.c:3403
#3  0x105543e4 in rs6000_emit_epilogue (sibcall=0) at
../../gcc/gcc/config/rs6000/rs6000.c:16095
#4  0x28000488 in ?? ()
#5  0x105ec5ec in gen_epilogue () at
../../gcc/gcc/config/rs6000/rs6000.md:14476
#6  0x102242b4 in rest_of_handle_thread_prologue_and_epilogue () at
../../gcc/gcc/function.c:5298
#7  0x102a7af4 in execute_one_pass (pass=0x108f6a78) at
../../gcc/gcc/passes.c:1117
#8  0x102a7d68 in execute_pass_list (pass=0x108f6a78) at
../../gcc/gcc/passes.c:1170
#9  0x102a7d80 in execute_pass_list (pass=0x108f6ee8) at
../../gcc/gcc/passes.c:1171
#10 0x102a7d80 in execute_pass_list (pass=0x108f6eb4) at
../../gcc/gcc/passes.c:1171
#11 0x103ae68c in tree_rest_of_compilation (fndecl=0xf7dbc100) at
../../gcc/gcc/tree-optimize.c:404
#12 0x105705fc in cgraph_expand_function (node=0xf7dbc300) at
../../gcc/gcc/cgraphunit.c:1060
#13 0x105728c4 in cgraph_optimize () at ../../gcc/gcc/cgraphunit.c:1123
#14 0x10017914 in c_write_global_declarations () at ../../gcc/gcc/c-decl.c:8077
#15 0x1033fff0 in toplev_main (argc=, argv=) at ../../gcc/gcc/toplev.c:1055
#16 0x1009e370 in main (argc=0, argv=0x0) at ../../gcc/gcc/main.c:35

This doesn't happen with r129290.

Ira


-- 
   Summary: Vectorizer testcases fail
   Product: gcc
   Version: 4.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: irar at il dot ibm dot com
 GCC build triplet: powerpc64-suse-linux
  GCC host triplet: powerpc64-suse-linux
GCC target triplet: powerpc64-suse-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33812



[Bug tree-optimization/33804] ICE in vect_transform_stmt, at tree-vect-transform.c:6131 with -ftree-vectorize

2007-10-21 Thread irar at il dot ibm dot com


--- Comment #5 from irar at il dot ibm dot com  2007-10-21 08:45 ---
(In reply to comment #4)
> Created an attachment (id=14370)
> --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14370&action=view)
> Vectorization dump file
> 
Thanks!

The vectorizer fails in the transformation phase, in function
vectorizable_operation:
  if (icode == CODE_FOR_nothing)
{
  if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "op not supported by target.");
  if (GET_MODE_SIZE (vec_mode) != UNITS_PER_WORD
  || LOOP_VINFO_VECT_FACTOR (loop_vinfo)
 < vect_min_worthwhile_factor (code))
return false;
  if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "proceeding using word mode.");
}

During the analysis we also get CODE_FOR_nothing, but at that stage
LOOP_VINFO_VECT_FACTOR (loop_vinfo) > vect_min_worthwhile_factor (code),
hence we proceed using word mode.
At the end of the analysis, we change the vectorization factor (divide it by 4)
to perform pure SLP on the loop, so during the transformation phase, when we
get to the same code again, we probably get
LOOP_VINFO_VECT_FACTOR (loop_vinfo) < vect_min_worthwhile_factor (code)
and we fail.

The idea was that we should not fail to vectorize during the transformation,
since everything was checked during the analysis, therefore, a gcc_assert was
put here.
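
To illustrate the mismatch, here is a toy model in C. It is a sketch under
stated assumptions, not the vectorizer code: the threshold of 4, the function
names and the sample factors are hypothetical, and the real check uses
LOOP_VINFO_VECT_FACTOR and vect_min_worthwhile_factor.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for vect_min_worthwhile_factor (code).  */
enum { TOY_MIN_WORTHWHILE_FACTOR = 4 };

/* Toy version of the word-mode check: the statement is accepted only
   if the vectorization factor reaches the threshold.  */
static bool
toy_word_mode_ok (int vect_factor)
{
  assert (vect_factor > 0);
  return vect_factor >= TOY_MIN_WORTHWHILE_FACTOR;
}

/* Return true if the same check passes with the factor seen during
   analysis but fails once the factor has been divided for pure SLP.  */
static bool
toy_phases_disagree (int vf_at_analysis, int slp_divisor)
{
  assert (slp_divisor > 0 && vf_at_analysis >= slp_divisor);

  int vf_at_transform = vf_at_analysis / slp_divisor;
  return toy_word_mode_ok (vf_at_analysis)
         && !toy_word_mode_ok (vf_at_transform);
}
```

For example, a factor of 8 passes the check during analysis, but after dividing
by 4 the factor is 2 and the same check fails, which is the situation that
trips the assert.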

I'll have to think about how to fix this problem.

Thanks,
Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33804



[Bug rtl-optimization/33846] [4.3 Regression] ICE in trunc_int_for_mode, at explow.c:55

2007-10-21 Thread irar at il dot ibm dot com


--- Comment #4 from irar at il dot ibm dot com  2007-10-21 11:02 ---
The problem is with a vector shift that takes a scalar shift argument.
For the code created by the vectorizer:
  vect_var_.49_103 = ~vect_var_.47_101;
  vect_var_.50_105 = vect_var_.49_103 >> 31;

(ashiftrt:V4SI (not:V4SI (reg:V4SI 100))
(const_int 31 [0x1f]))
is created.

The failure is in
explow.c:55  gcc_assert (SCALAR_INT_MODE_P (mode)); 
since MODE is V4SImode.
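
For reference, a scalar loop of the following shape is the kind of source that
would vectorize into the two statements above. This is a hypothetical reduced
example, not the test case from this PR, and I am not claiming that it
reproduces the ICE. It assumes 32-bit int and relies on GCC's arithmetic right
shift of negative values, which ISO C leaves implementation-defined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced loop: bitwise NOT followed by a right shift by
   a scalar constant.  With 32-bit int and an arithmetic shift, each
   result is -1 for non-negative inputs and 0 for negative inputs.  */
void
not_then_shift (const int *in, int *out, size_t n)
{
  assert (in != NULL && out != NULL);

  for (size_t i = 0; i < n; i++)
    out[i] = ~in[i] >> 31;
}
```

The shift count 31 is a scalar applied to every element, which is what ends up
as the (const_int 31) operand of the V4SI ashiftrt.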

Ira


-- 

irar at il dot ibm dot com changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever Confirmed|0   |1
   Last reconfirmed|-00-00 00:00:00 |2007-10-21 11:02:08
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33846



[Bug tree-optimization/33804] ICE in vect_transform_stmt, at tree-vect-transform.c:6131 with -ftree-vectorize

2007-10-21 Thread irar at il dot ibm dot com


--- Comment #6 from irar at il dot ibm dot com  2007-10-21 12:52 ---
The solution can be simply not to check whether the vectorization is worthwhile
during the transformation. The decision whether to vectorize or not should be
made during the analysis anyway.
The vectorization factor can get smaller only when there is only
SLP-kind vectorization in the loop, and the VF is the unrolling factor
needed to operate on full vectors. So the profitability of this loop
vectorization doesn't change.

Index: tree-vect-transform.c
===
--- tree-vect-transform.c   (revision 129404)
+++ tree-vect-transform.c   (working copy)
@@ -3865,18 +3865,21 @@ vectorizable_operation (tree stmt, block
 {
   if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "op not supported by target.");
+  /* Check only during analysis.  */
   if (GET_MODE_SIZE (vec_mode) != UNITS_PER_WORD
-  || LOOP_VINFO_VECT_FACTOR (loop_vinfo)
-< vect_min_worthwhile_factor (code))
+  || (LOOP_VINFO_VECT_FACTOR (loop_vinfo)
+ < vect_min_worthwhile_factor (code)
+  && !vec_stmt))
 return false;
   if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "proceeding using word mode.");
 }

-  /* Worthwhile without SIMD support?  */
+  /* Worthwhile without SIMD support? Check only during analysis.  */
   if (!VECTOR_MODE_P (TYPE_MODE (vectype))
   && LOOP_VINFO_VECT_FACTOR (loop_vinfo)
-< vect_min_worthwhile_factor (code))
+< vect_min_worthwhile_factor (code)
+  && !vec_stmt)
 {
   if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "not worthwhile without SIMD support.");

Tested on the vectorizer testsuite on x86_64-linux.
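
The effect of the added "&& !vec_stmt" condition can be sketched with a toy
predicate in C. This is not GCC code: the function name and the threshold of 4
are hypothetical, and the "transforming" parameter stands in for vec_stmt being
non-NULL, which is how the patch tells the transformation phase apart from the
analysis.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for vect_min_worthwhile_factor (code).  */
enum { TOY_MIN_WORTHWHILE_FACTOR = 4 };

/* Toy version of the patched check: a too-small vectorization factor
   rejects the statement only during analysis (transforming == false).  */
static bool
toy_word_mode_accepted (int vect_factor, bool transforming)
{
  assert (vect_factor > 0);

  if (vect_factor < TOY_MIN_WORTHWHILE_FACTOR && !transforming)
    return false;
  return true;
}
```

With this shape, a factor that was reduced after the analysis can no longer
make the transformation phase reject a statement that the analysis accepted.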

Ira


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33804



[Bug tree-optimization/20122] Wrong code with gcc 4.0 tree-vectorizer

2005-02-24 Thread irar at il dot ibm dot com

--- Additional Comments From irar at il dot ibm dot com  2005-02-24 13:41 ---
I found the problem that causes this. I'll send the patch next week.
Ira

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20122


[Bug tree-optimization/20122] Wrong code with gcc 4.0 tree-vectorizer

2005-03-02 Thread irar at il dot ibm dot com


-- 
   What|Removed |Added

 AssignedTo|unassigned at gcc dot gnu   |irar at il dot ibm dot com
   |dot org |
 Status|NEW |ASSIGNED
   Last reconfirmed|2005-03-02 11:42:36 |2005-03-02 12:43:57
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20122


[Bug tree-optimization/20122] Wrong code with gcc 4.0 tree-vectorizer

2005-03-02 Thread irar at il dot ibm dot com

--- Additional Comments From irar at il dot ibm dot com  2005-03-02 12:45 ---
Fixed in http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01788.html. Waiting for 
review.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20122

