[Bug tree-optimization/48172] [4.5/4.6/4.7 Regression] incorrect vectorization of loop in GCC 4.5.* with -O3

2011-05-13 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48172

--- Comment #14 from Richard Guenther rguenth at gcc dot gnu.org 2011-05-13 
08:31:28 UTC ---
Author: rguenth
Date: Fri May 13 08:31:18 2011
New Revision: 173725

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=173725
Log:
2011-05-13  Richard Guenther  rguent...@suse.de

PR tree-optimization/48172
* tree-vect-loop-manip.c (vect_vfa_segment_size): Avoid
multiplying by number of iterations for equal step.
(vect_create_cond_for_alias_checks): Likewise.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-vect-loop-manip.c


[Bug tree-optimization/48172] [4.5/4.6/4.7 Regression] incorrect vectorization of loop in GCC 4.5.* with -O3

2011-05-12 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48172

--- Comment #8 from Richard Guenther rguenth at gcc dot gnu.org 2011-05-12 
10:40:02 UTC ---
Created attachment 24236
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24236
patch

Patch I'm going to test.


[Bug tree-optimization/48172] [4.5/4.6/4.7 Regression] incorrect vectorization of loop in GCC 4.5.* with -O3

2011-05-12 Thread irar at il dot ibm.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48172

--- Comment #9 from Ira Rosen irar at il dot ibm.com 2011-05-12 11:48:56 UTC 
---
(In reply to comment #8)
> Created attachment 24236 [details]
> patch
> 
> Patch I'm going to test.

So, segment_length = scalar_step * vf * scalar_niters?
I think we don't need vf here.

Also, why not do that only for different steps?

Thanks,
Ira


[Bug tree-optimization/48172] [4.5/4.6/4.7 Regression] incorrect vectorization of loop in GCC 4.5.* with -O3

2011-05-12 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48172

--- Comment #10 from Richard Guenther rguenth at gcc dot gnu.org 2011-05-12 
12:14:48 UTC ---
Author: rguenth
Date: Thu May 12 12:14:45 2011
New Revision: 173703

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=173703
Log:
2011-05-12  Richard Guenther  rguent...@suse.de

PR tree-optimization/48172
* tree-vect-loop-manip.c (vect_vfa_segment_size): Do not exclude
the number of iterations from the segment size calculation.
(vect_create_cond_for_alias_checks): Adjust.

* gcc.dg/vect/pr48172.c: New testcase.

Added:
trunk/gcc/testsuite/gcc.dg/vect/pr48172.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vect-loop-manip.c


[Bug tree-optimization/48172] [4.5/4.6/4.7 Regression] incorrect vectorization of loop in GCC 4.5.* with -O3

2011-05-12 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48172

--- Comment #12 from Richard Guenther rguenth at gcc dot gnu.org 2011-05-12 
12:46:10 UTC ---
Like this?

Index: gcc/tree-vect-loop-manip.c
===================================================================
--- gcc/tree-vect-loop-manip.c  (revision 173703)
+++ gcc/tree-vect-loop-manip.c  (working copy)
@@ -2353,23 +2353,19 @@ vect_create_cond_for_align_checks (loop_

Input:
  DR: The data reference.
- VECT_FACTOR: vectorization factor.
- SCALAR_LOOP_NITERS: number of iterations.
+ LENGTH_FACTOR: segment length to consider.

Return an expression whose value is the size of segment which will be
accessed by DR.  */

 static tree
-vect_vfa_segment_size (struct data_reference *dr, int vect_factor,
+vect_vfa_segment_size (struct data_reference *dr, tree length_factor,
   tree scalar_loop_niters)
 {
   tree segment_length;
   segment_length = size_binop (MULT_EXPR,
   fold_convert (sizetype, DR_STEP (dr)),
-  size_int (vect_factor));
-  segment_length = size_binop (MULT_EXPR,
-  segment_length,
-  fold_convert (sizetype, scalar_loop_niters));
+  fold_convert (sizetype, length_factor));
   if (vect_supportable_dr_alignment (dr, false)
 == dr_explicit_realign_optimized)
 {
@@ -2465,10 +2461,12 @@ vect_create_cond_for_alias_checks (loop_
 vect_create_addr_base_for_vector_ref (stmt_b, cond_expr_stmt_list,
  NULL_TREE, loop);

-  segment_length_a = vect_vfa_segment_size (dr_a, vect_factor,
-   scalar_loop_iters);
-  segment_length_b = vect_vfa_segment_size (dr_b, vect_factor,
-   scalar_loop_iters);
+  if (!operand_equal_p (DR_STEP (dr_a), DR_STEP (dr_b), 0))
+   length_factor = scalar_loop_iters;
+  else
+   length_factor = size_int (vect_factor);
+  segment_length_a = vect_vfa_segment_size (dr_a, length_factor);
+  segment_length_b = vect_vfa_segment_size (dr_b, length_factor);

   if (vect_print_dump_info (REPORT_DR_DETAILS))
{

I also think that the re-alignment adjustment needs to be multiplied
by DR_STEP (maybe we only support it for DR_STEP == 1 at the moment).
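
The selection logic in the patch above can be illustrated outside of GCC. The following is only a minimal sketch using plain integers instead of trees: `pick_length_factor` and `segment_length` are hypothetical names, not GCC functions, and the realignment adjustment is deliberately left out.

```c
#include <stddef.h>

/* Hypothetical stand-in for the choice made in
   vect_create_cond_for_alias_checks: when both data references advance
   by the same step, one vector iteration (vect_factor scalar iterations)
   is enough; otherwise the whole scalar loop has to be covered.  */
size_t
pick_length_factor (size_t step_a, size_t step_b,
                    size_t vect_factor, size_t scalar_niters)
{
  return step_a != step_b ? scalar_niters : vect_factor;
}

/* Hypothetical stand-in for vect_vfa_segment_size without the
   realignment adjustment: bytes covered by a reference that advances
   STEP bytes per scalar iteration.  */
size_t
segment_length (size_t step, size_t length_factor)
{
  return step * length_factor;
}
```

For example, with steps of 8 and 4 bytes over 513 iterations the two segments become 4104 and 2052 bytes, while two references that both step by 4 bytes with a vectorization factor of 4 keep the short 16-byte segment.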


[Bug tree-optimization/48172] [4.5/4.6/4.7 Regression] incorrect vectorization of loop in GCC 4.5.* with -O3

2011-05-12 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48172

--- Comment #11 from Richard Guenther rguenth at gcc dot gnu.org 2011-05-12 
12:38:15 UTC ---
(In reply to comment #9)
> (In reply to comment #8)
> > Created attachment 24236 [details]
> > patch
> > 
> > Patch I'm going to test.
> 
> So, segment_length = scalar_step * vf * scalar_niters?
> I think we don't need vf here.

Hm, right.  I'll prepare a followup.

> Also, why not do that only for different steps?

We don't know this at this point.  Maybe we can change the structure
of the code somewhat.  I'll have a look.

> Thanks,
> Ira


[Bug tree-optimization/48172] [4.5/4.6/4.7 Regression] incorrect vectorization of loop in GCC 4.5.* with -O3

2011-05-12 Thread irar at il dot ibm.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48172

--- Comment #13 from Ira Rosen irar at il dot ibm.com 2011-05-12 13:02:39 UTC 
---
(In reply to comment #12)
> Like this?
> 

Yes, looks good to me.

> 
> I also think that the re-alignment adjustment needs to be multiplied
> by DR_STEP (maybe we only support it for DR_STEP == 1 at the moment).

The realignment adjustment is for the case when we load two consecutive aligned
vectors and extract the relevant elements from them (in Altivec): for a[1:4] we
load a[0:3] and a[4:7]. So, the adjustment adds one more vector size to cover
that additional loaded vector. I don't see why it needs to be multiplied by
DR_STEP.

Thanks,
Ira
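
The a[1:4] example above can be turned into a small calculation. This is only an illustrative sketch, not GCC code: `realign_load_range` and `struct elem_range` are hypothetical names, and indices are counted in elements of an array whose element 0 is assumed to be vector-aligned.

```c
#include <stddef.h>

/* Inclusive range of element indices.  */
struct elem_range
{
  size_t first;
  size_t last;
};

/* Hypothetical helper: for a vector access of VF elements starting at
   element START, return the elements covered by the aligned vector
   loads needed to assemble it.  */
struct elem_range
realign_load_range (size_t start, size_t vf)
{
  struct elem_range r;
  /* Round the first element down to a vector boundary.  */
  r.first = start - start % vf;
  /* Round the last element (start + vf - 1) up to the end of its
     aligned vector.  */
  r.last = (start + vf - 1) / vf * vf + vf - 1;
  return r;
}
```

With start = 1 and vf = 4 this yields elements 0 through 7, that is a[0:3] plus a[4:7]: one vector more than the four elements actually used. In this model the extra amount is one vector regardless of the scalar step, which is consistent with the explanation above.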


[Bug tree-optimization/48172] [4.5/4.6/4.7 Regression] incorrect vectorization of loop in GCC 4.5.* with -O3

2011-04-28 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48172

Richard Guenther rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.5.3                       |4.5.4

--- Comment #7 from Richard Guenther rguenth at gcc dot gnu.org 2011-04-28 
14:51:31 UTC ---
GCC 4.5.3 is being released, adjusting target milestone.


[Bug tree-optimization/48172] [4.5/4.6/4.7 Regression] incorrect vectorization of loop in GCC 4.5.* with -O3

2011-04-10 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48172

Richard Guenther rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2


[Bug tree-optimization/48172] [4.5/4.6/4.7 Regression] incorrect vectorization of loop in GCC 4.5.* with -O3

2011-04-07 Thread xunxun1982 at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48172

PcX xunxun1982 at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xunxun1982 at gmail dot com

--- Comment #6 from PcX xunxun1982 at gmail dot com 2011-04-08 00:15:36 UTC 
---
Using MinGW GCC 4.5.2, the behavior is different.
With -O3 alone, the test passes.
With -O3 -march=native, the result is: COMPILER BUG: array[1025] should
be 98177 but is 0.


[Bug tree-optimization/48172] [4.5/4.6/4.7 Regression] incorrect vectorization of loop in GCC 4.5.* with -O3

2011-03-19 Thread irar at il dot ibm.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48172

Ira Rosen irar at il dot ibm.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |irar at il dot ibm.com

--- Comment #5 from Ira Rosen irar at il dot ibm.com 2011-03-19 17:11:46 UTC 
---
(In reply to comment #4)

 
> In particular for all tests the segment size we use for the alias tests
> is not enough for data-refs with differing DR_STEP.  It would need to
> take the number of iterations into account.

Right, instead of checking
init_addr1 + step1 * vf < init_addr2
we should check something like
init_addr1 + step1 * vf + scalar_niters * (step1 - step2) < init_addr2
(or the other direction).
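
One way to picture a check that takes the number of iterations into account is to compare the byte ranges each reference touches over the whole scalar loop. The following is a minimal sketch under that assumption, not the formula above and not the GCC implementation; `segment_bytes` and `segments_disjoint` are hypothetical names, and addresses are modelled as byte offsets from the start of the array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: bytes touched by a reference that starts at
   some address, advances STEP bytes per scalar iteration for NITERS
   iterations, and accesses ELEM_SIZE bytes each time.  */
size_t
segment_bytes (size_t step, size_t niters, size_t elem_size)
{
  return niters == 0 ? 0 : step * (niters - 1) + elem_size;
}

/* True if the half-open byte ranges [a, a + len_a) and
   [b, b + len_b) do not intersect.  Both directions are checked.  */
bool
segments_disjoint (size_t a, size_t len_a, size_t b, size_t len_b)
{
  return a + len_a <= b || b + len_b <= a;
}
```

For a 4-byte load stepping by 8 bytes from offset 0 and a 4-byte store stepping by 4 bytes from offset 2056, both over 513 iterations, the ranges are [0, 4100) and [2056, 4108), which intersect. With segments of only 32 and 16 bytes the same references look disjoint.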


[Bug tree-optimization/48172] [4.5/4.6/4.7 Regression] incorrect vectorization of loop in GCC 4.5.* with -O3

2011-03-18 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48172

Richard Guenther rguenth at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
  Known to work||4.4.4
   Keywords||wrong-code
   Last reconfirmed||2011.03.18 09:29:27
 Ever Confirmed|0   |1
Summary|incorrect vectorization of  |[4.5/4.6/4.7 Regression]
   |loop in GCC 4.5.* with -O3  |incorrect vectorization of
   ||loop in GCC 4.5.* with -O3
   Target Milestone|--- |4.5.3
  Known to fail||4.5.0, 4.5.2, 4.6.0, 4.7.0

--- Comment #1 from Richard Guenther rguenth at gcc dot gnu.org 2011-03-18 
09:29:27 UTC ---
Not vectorized on the 4.4 branch because of

t.c:23: note: not vectorized: unsupported unaligned store.
t.c:16: note: vect_model_induction_cost: inside_cost = 6, outside_cost = 4 .
t.c:16: note: not vectorized: relevant stmt not supported: i.16_8 = (uint32_t) i_60;


Confirmed.

It's not the conversion but the unaligned store it seems, also fails
with 4.5.0.


[Bug tree-optimization/48172] [4.5/4.6/4.7 Regression] incorrect vectorization of loop in GCC 4.5.* with -O3

2011-03-18 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48172

--- Comment #2 from Richard Guenther rguenth at gcc dot gnu.org 2011-03-18 
09:48:39 UTC ---
(compute_affine_dependence
  (stmt_a =
D.3677_12 = array[D.3676_11];
)
  (stmt_b =
array[D.3675_10] = D.3680_16;
)
(subscript_dependence_tester
(analyze_overlapping_iterations
  (chrec_a = {0, +, 2}_2)
  (chrec_b = {514, +, 1}_2)
(analyze_siv_subscript
(analyze_subscript_affine_affine
  (overlaps_a = [257 + 1 * x_1]
)
  (overlaps_b = [0 + 2 * x_1]
)
)
)
  (overlap_iterations_a = [257 + 1 * x_1]
)
  (overlap_iterations_b = [0 + 2 * x_1]
)
)
(Dependence relation cannot be represented by distance vector.)
)
(compute_affine_dependence
  (stmt_a =
D.3679_15 = array[D.3678_14];
)
  (stmt_b =
array[D.3675_10] = D.3680_16;
)
(subscript_dependence_tester
(analyze_overlapping_iterations
  (chrec_a = {1, +, 2}_2)
  (chrec_b = {514, +, 1}_2)
(analyze_siv_subscript
(analyze_subscript_affine_affine
  (overlaps_a = [257 + 1 * x_1]
)
  (overlaps_b = [1 + 2 * x_1]
)
)
)
  (overlap_iterations_a = [257 + 1 * x_1]
)
  (overlap_iterations_b = [1 + 2 * x_1]
)
)
(Dependence relation cannot be represented by distance vector.)
)
)

...

t.c:23: note: versioning for alias required: bad dist vector for
array[D.3676_11] and array[D.3675_10]
t.c:23: note: mark for run-time aliasing test between array[D.3676_11] and
array[D.3675_10]
t.c:23: note: versioning for alias required: bad dist vector for
array[D.3678_14] and array[D.3675_10]
t.c:23: note: mark for run-time aliasing test between array[D.3678_14] and
array[D.3675_10]

and the alias check looks like

  vect_parray.14_32 = &array;
  vect_parray.17_31 = &array[514];
  D.3707_29 = vect_parray.14_32 + 32;
  D.3708_28 = D.3707_29 < vect_parray.17_31;
  D.3709_63 = vect_parray.17_31 + 16;
  D.3710_64 = D.3709_63 < vect_parray.14_32;
  D.3711_65 = D.3708_28 || D.3710_64;
  D.3713_78 = !D.3711_65;
  if (D.3713_78 != 0)
    goto <bb 12>;
  else
    goto <bb 9>;

which doesn't at all test something sensible.
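
To see why, the comparison computed into D.3711_65 can be evaluated by hand. The sketch below is only an illustration with hypothetical function names; it assumes 4-byte array elements (so array[514] sits at byte offset 2056) and 513 loop iterations, as in the reduced testcase in this comment.

```c
#include <stdbool.h>
#include <stddef.h>

/* The comparison from the dump, with addresses expressed as byte
   offsets from the start of the array: a 32-byte segment for the load
   and a 16-byte segment for the store.  */
bool
dump_check_sees_no_overlap (size_t load_base, size_t store_base)
{
  return load_base + 32 < store_base || store_base + 16 < load_base;
}

/* Brute-force ground truth: does any store index 514 + i coincide with
   a load index 2 * j or 2 * j + 1 within NITERS iterations?  */
bool
accesses_really_overlap (size_t niters)
{
  for (size_t i = 0; i < niters; i++)
    for (size_t j = 0; j < niters; j++)
      if (514 + i == 2 * j || 514 + i == 2 * j + 1)
        return true;
  return false;
}
```

Here dump_check_sees_no_overlap (0, 514 * 4) is true because 32 < 2056, yet accesses_really_overlap (513) is also true: iteration 0 stores to array[514], which iteration 257 loads as array[2*257].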

Shortened non-runtime testcase:

#define ASIZE 1028
#define HALF (ASIZE/2)
unsigned int array[ASIZE];

void foo(void)
{
  int i;
  for (i = 0; i < HALF-1; i++)
    array[HALF+i] = array[2*i] + array[2*i + 1];
}
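
A run-time variant of this reduced testcase could compare the loop against a reference that the vectorizer cannot transform. The following is a hedged sketch, not the testcase that was later added as gcc.dg/vect/pr48172.c: `count_mismatches` and the `volatile` reference array are assumptions made for illustration.

```c
#include <stddef.h>

#define ASIZE 1028
#define HALF (ASIZE/2)

unsigned int array[ASIZE];

/* The loop under test, as in the reduced testcase.  */
void
foo (void)
{
  int i;
  for (i = 0; i < HALF-1; i++)
    array[HALF+i] = array[2*i] + array[2*i + 1];
}

/* Hypothetical checker: runs foo () and the same loop on a volatile
   copy (volatile accesses keep the reference loop scalar), then
   returns the number of elements that differ.  */
size_t
count_mismatches (void)
{
  static volatile unsigned int ref[ASIZE];
  size_t i, bad = 0;

  for (i = 0; i < ASIZE; i++)
    array[i] = ref[i] = (unsigned int) i;

  foo ();
  for (i = 0; i < HALF-1; i++)
    ref[HALF+i] = ref[2*i] + ref[2*i + 1];

  for (i = 0; i < ASIZE; i++)
    if (array[i] != ref[i])
      bad++;
  return bad;
}
```

A correct compiler should give count_mismatches () == 0 at any optimization level; a non-zero result at -O3 would point to a miscompiled foo ().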


[Bug tree-optimization/48172] [4.5/4.6/4.7 Regression] incorrect vectorization of loop in GCC 4.5.* with -O3

2011-03-18 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48172

Richard Guenther rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
                 CC|                            |irar at gcc dot gnu.org
         AssignedTo|unassigned at gcc dot       |rguenth at gcc dot gnu.org
                   |gnu.org                     |

--- Comment #3 from Richard Guenther rguenth at gcc dot gnu.org 2011-03-18 
10:00:34 UTC ---
Versioning for alias only seems to consider the case that DR_STEP is the
same for all DRs, right?

Index: gcc/tree-vect-data-refs.c
===================================================================
--- gcc/tree-vect-data-refs.c   (revision 171097)
+++ gcc/tree-vect-data-refs.c   (working copy)
@@ -528,6 +528,14 @@ vect_mark_for_runtime_alias_test (ddr_p
   print_generic_expr (vect_dump, DR_REF (DDR_B (ddr)), TDF_SLIM);
 }

+  if (!operand_equal_p (DR_STEP (DDR_A (ddr)), DR_STEP (DDR_B (ddr)), 0))
+{
+  if (vect_print_dump_info (REPORT_DR_DETAILS))
+   fprintf (vect_dump, "versioning not supported for accesses with "
+            "different step.");
+  return false;
+}
+
   if (optimize_loop_nest_for_size_p (loop))
 {
   if (vect_print_dump_info (REPORT_DR_DETAILS))

fixes it for me.


[Bug tree-optimization/48172] [4.5/4.6/4.7 Regression] incorrect vectorization of loop in GCC 4.5.* with -O3

2011-03-18 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48172

--- Comment #4 from Richard Guenther rguenth at gcc dot gnu.org 2011-03-18 
11:57:11 UTC ---
The patch FAILs

FAIL: gcc.dg/vect/pr37539.c scan-tree-dump-times vect vectorized 1 loops 2
FAIL: gcc.dg/vect/pr43432.c scan-tree-dump-times vect vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-multitypes-11.c scan-tree-dump-times vect vectorized 1 loops 2
FAIL: gcc.dg/vect/vect-multitypes-12.c scan-tree-dump-times vect vectorized 1 loops 2
FAIL: gcc.dg/vect/vect-multitypes-16.c scan-tree-dump-times vect vectorized 1 loops 1

on x86_64.  I can't see how the alias check is ok for pr37539.c given that
we could call ayuv2yuyv_ref with &d[0], &d[4].  Similar for pr43432.c.
For vect-multitypes-11.c type-based aliasing should handle the case
instead of

/space/rguenther/src/svn/trunk/gcc/testsuite/gcc.dg/vect/vect-multitypes-11.c:14:
note: versioning for alias required: can't determine dependence between x[i_16]
and MEM[(int *)D.3264_7]

not sure why this doesn't happen.  For -fno-strict-aliasing the runtime test
looks bogus as well.  vect-multitypes-12.c and vect-multitypes-16.c look
similar (but as character types are involved TBAA doesn't help and the
vectorization does not appear to be safe).

In particular for all tests the segment size we use for the alias tests
is not enough for data-refs with differing DR_STEP.  It would need to
take the number of iterations into account.