[Bug tree-optimization/33860] [4.3 Regression] ICE in vectorizable_load, at tree-vect-transform.c:5503

2007-12-02 Thread pinskia at gcc dot gnu dot org


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |4.3.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33860



[Bug tree-optimization/33860] [4.3 Regression] ICE in vectorizable_load, at tree-vect-transform.c:5503

2007-11-13 Thread tbm at cyrius dot com


--- Comment #6 from tbm at cyrius dot com  2007-11-13 08:27 ---
So I guess this can be closed?


-- 

tbm at cyrius dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dorit at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33860



[Bug tree-optimization/33860] [4.3 Regression] ICE in vectorizable_load, at tree-vect-transform.c:5503

2007-11-13 Thread dorit at gcc dot gnu dot org


--- Comment #7 from dorit at gcc dot gnu dot org  2007-11-13 13:29 ---
fixed


-- 

dorit at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33860



[Bug tree-optimization/33860] [4.3 Regression] ICE in vectorizable_load, at tree-vect-transform.c:5503

2007-10-23 Thread dorit at gcc dot gnu dot org


--- Comment #5 from dorit at gcc dot gnu dot org  2007-10-23 19:50 ---
Subject: Bug 33860

Author: dorit
Date: Tue Oct 23 19:50:18 2007
New Revision: 129587

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=129587
Log:
PR tree-optimization/33860
* tree-vect-transform.c (vect_analyze_data_ref_access): Don't allow
interleaved accesses in case the dr is inside the inner-loop during
outer-loop vectorization.


Added:
trunk/gcc/testsuite/g++.dg/vect/pr33860.cc
trunk/gcc/testsuite/g++.dg/vect/pr33860a.cc
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vect-analyze.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33860



[Bug tree-optimization/33860] [4.3 Regression] ICE in vectorizable_load, at tree-vect-transform.c:5503

2007-10-22 Thread tbm at cyrius dot com


--- Comment #1 from tbm at cyrius dot com  2007-10-22 14:11 ---
/* Testcase by Martin Michlmayr [EMAIL PROTECTED] */

class Matrix
{
  public:
    double data[4][4];
    Matrix operator* (const Matrix matrix) const;
    void makeRotationAboutVector (void);
};
void Matrix::makeRotationAboutVector (void)
{
   Matrix irx;
   *this = irx * (*this);
}
Matrix Matrix::operator* (const Matrix matrix) const
{
  Matrix ret;
  for (int i = 0; i < 4; i++)
    for (int j = 0; j < 4; j++)
      ret.data[j][i] = matrix.data[j][2] + matrix.data[j][3];
  return ret;
}


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33860



[Bug tree-optimization/33860] [4.3 Regression] ICE in vectorizable_load, at tree-vect-transform.c:5503

2007-10-22 Thread tbm at cyrius dot com


--- Comment #3 from tbm at cyrius dot com  2007-10-22 14:12 ---
Breakpoint 1, fancy_abort (file=0xd7bb78 "gcc/tree-vect-transform.c",
line=5503, function=0xd7d000 "vectorizable_load") at gcc/diagnostic.c:659
659 {
(gdb) where
#0  fancy_abort (file=0xd7bb78 "gcc/tree-vect-transform.c", line=5503,
function=0xd7d000 "vectorizable_load") at gcc/diagnostic.c:659
#1  0x00bc149b in vectorizable_load (stmt=0x2b48329082d0, bsi=0x0,
vec_stmt=0x0,
slp_node=0x0) at gcc/tree-vect-transform.c:5503
#2  0x00ba4488 in vect_analyze_operations (loop_vinfo=0x11304f0)
at gcc/tree-vect-analyze.c:484
#3  0x00bac0db in vect_analyze_loop (loop=<value optimized out>)
at gcc/tree-vect-analyze.c:4341
#4  0x00934ae0 in vectorize_loops () at gcc/tree-vectorizer.c:2501
#5  0x00765847 in execute_one_pass (pass=0x101ce20)
at gcc/passes.c:1117
#6  0x00765a0c in execute_pass_list (pass=0x101ce20)
at gcc/passes.c:1170
#7  0x00765a1e in execute_pass_list (pass=0x101cc40)
at gcc/passes.c:1171
#8  0x00765a1e in execute_pass_list (pass=0x101c040)
at gcc/passes.c:1171
#9  0x00840c1e in tree_rest_of_compilation (fndecl=0x2b48328f0b00)
at gcc/tree-optimize.c:404
#10 0x009c3bc2 in cgraph_expand_function (node=0x2b48328ff300)
at gcc/cgraphunit.c:1060
#11 0x009c5668 in cgraph_optimize () at gcc/cgraphunit.c:1123
#12 0x004ac2cf in cp_write_global_declarations ()
at gcc/cp/decl2.c:3410
#13 0x007e3ec7 in toplev_main (argc=<value optimized out>,
argv=<value optimized out>)
at gcc/toplev.c:1055
#14 0x2b483244cb44 in __libc_start_main () from /lib/libc.so.6
#15 0x004043f9 in _start ()
(gdb)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33860



[Bug tree-optimization/33860] [4.3 Regression] ICE in vectorizable_load, at tree-vect-transform.c:5503

2007-10-22 Thread tbm at cyrius dot com


--- Comment #2 from tbm at cyrius dot com  2007-10-22 14:12 ---
Created an attachment (id=14388)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14388&action=view)
preprocessed source


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33860



[Bug tree-optimization/33860] [4.3 Regression] ICE in vectorizable_load, at tree-vect-transform.c:5503

2007-10-22 Thread dorit at gcc dot gnu dot org


--- Comment #4 from dorit at gcc dot gnu dot org  2007-10-22 22:54 ---
There's some bad interaction here between the data-interleaving support and the
outer-loop support. The two are not yet supported together, but this case still
slipped through the checks during the analysis phase. This patch fixes that by
not allowing us to detect interleaved accesses in the inner-loop during
outer-loop vectorization:

--- tree-vect-analyze.c 2007-10-22 08:34:45.000000000 +0200
+++ tree-vect-analyze.dn.c  2007-10-22 22:23:01.000000000 +0200
@@ -2321,6 +2321,10 @@

   if (nested_in_vect_loop_p (loop, stmt))
     {
+      /* Interleaved accesses are not yet supported within outer-loop
+         vectorization for references in the inner-loop.  */
+      DR_GROUP_FIRST_DR (vinfo_for_stmt (stmt)) = NULL_TREE;
+
       /* For the rest of the analysis we use the outer-loop step.  */
       step = STMT_VINFO_DR_STEP (stmt_info);
       dr_step = TREE_INT_CST_LOW (step);

(yet to be bootstrapped etc.)
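
As an illustration only, the idea behind the patch can be sketched with a toy
model. The names below (DataRef, analyze_access, group_first) are hypothetical
and are not GCC's internal data structures or API; the sketch only assumes that
a reference carries a pointer to the head of its interleaving group, and that
clearing this pointer makes later phases treat the reference as a plain,
non-interleaved access:

```cpp
// Hypothetical, simplified stand-in for a data-reference record.
// This is NOT GCC's representation; it only models the two facts
// the patch cares about.
struct DataRef {
    bool in_inner_loop;          // reference sits in the inner loop of a nest
    const DataRef *group_first;  // head of its interleaving group, or nullptr
};

// Toy version of the analysis step: when the outer loop is being
// vectorized, a reference inside the inner loop must not be treated
// as part of an interleaving group, so its group link is cleared.
// Returns true if the reference is still considered interleaved.
bool analyze_access(DataRef &dr, bool vectorizing_outer_loop) {
    if (vectorizing_outer_loop && dr.in_inner_loop) {
        dr.group_first = nullptr;  // drop any group membership
    }
    return dr.group_first != nullptr;
}
```

In this model, a grouped inner-loop reference stays grouped during ordinary
inner-loop vectorization and loses its group only when the outer loop is the
one being vectorized.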

By the way, on powerpc-linux, this testcase gets vectorized with this fix
(after changing the doubles to floats, and forcing alignment of the data array
with attribute aligned), without taking advantage of the fact that the two
loads are interleaved. 

Also, I suspect that the vectorized code here is considerably worse than the
original scalar code;
instead of: (ld,ld,add,store) * 16
we have: (vload,realign,splat,vload,realign,splat,vadd,vstore) * 4
with additional overhead outside the loop.
After the ICE is fixed we should probably add this as a missed-optimization PR
(both in terms of the cost model, and in terms of exploiting the data reuse of
the interleaved accesses).
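
To make the data reuse mentioned above concrete, here is a minimal sketch of
the testcase's loop nest next to a hand-written variant that exploits the
reuse. Mat4, scalar_kernel and reuse_kernel are illustrative names that do not
appear in the testcase, and the second function is only one possible way to
use the reuse, not what the vectorizer generates:

```cpp
// Same shape as the testcase's Matrix: 4x4 doubles.
struct Mat4 { double data[4][4]; };

// Loop nest as in the testcase: 16 iterations, each doing
// two loads, one add and one store.
Mat4 scalar_kernel(const Mat4 &m) {
    Mat4 ret{};
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            ret.data[j][i] = m.data[j][2] + m.data[j][3];
    return ret;
}

// Hand-written equivalent: the two loads are adjacent in memory and do
// not depend on i, so each row's sum is computed once and then stored
// to all four columns of that row.
Mat4 reuse_kernel(const Mat4 &m) {
    Mat4 ret{};
    for (int j = 0; j < 4; j++) {
        const double sum = m.data[j][2] + m.data[j][3];
        for (int i = 0; i < 4; i++)
            ret.data[j][i] = sum;
    }
    return ret;
}
```

Counting operations as in the comment, the scalar nest issues 16 * 4 = 64
scalar operations and the vectorized loop 4 * 8 = 32 vector operations plus
the overhead outside the loop, while the variant above needs 8 loads, 4 adds
and 16 stores.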


-- 

dorit at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |dorit at gcc dot gnu dot org
                   |dot org                     |
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2007-10-22 22:54:32
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33860