[Bug tree-optimization/59594] [4.9 Regression] wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu

2014-01-29 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59594

--- Comment #7 from Jakub Jelinek jakub at gcc dot gnu.org ---
Author: jakub
Date: Wed Jan 29 09:27:43 2014
New Revision: 207225

URL: http://gcc.gnu.org/viewcvs?rev=207225root=gccview=rev
Log:
PR tree-optimization/59594
* tree-vect-data-refs.c (vect_analyze_data_ref_accesses): Sort
a copy of the datarefs vector rather than the vector itself.

* gcc.dg/vect/no-vfa-vect-depend-2.c: New test.
* gcc.dg/vect/no-vfa-vect-depend-3.c: New test.
* gcc.dg/vect/pr59594.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-2.c
trunk/gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-3.c
trunk/gcc/testsuite/gcc.dg/vect/pr59594.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vect-data-refs.c


[Bug tree-optimization/59594] [4.9 Regression] wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu

2014-01-29 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59594

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Jakub Jelinek jakub at gcc dot gnu.org ---
Fixed.


[Bug tree-optimization/59594] [4.9 Regression] wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu

2014-01-24 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59594

--- Comment #6 from Jakub Jelinek jakub at gcc dot gnu.org ---
On:
#define N 1024
int ia[N + 1];
int ib[N + 1];

void
f1 (void)
{
  int i;
  for (i = 0; i  N; i++)
{
  ia[i + 1] = 1;
  ib[i] = ia[i];
}
}

void
f2 (void)
{
  int i;
  for (i = 0; i  N; i++)
{
  ia[i] = 1;
  ib[i] = ia[i + 1];
}
}

void
f3 (void)
{
  int i;
  for (i = N - 1; i = 0; i--)
{
  ia[i + 1] = 1;
  ib[i] = ia[i];
}
}

void
f4 (void)
{
  int i;
  for (i = N - 1; i = 0; i--)
{
  ia[i] = 1;
  ib[i] = ia[i + 1];
}
}

we properly vectorize f2 and f3 where the write/read DDR is DDR_REVERSED_P and
not f1/f4.

On
#define N 1024
int ia[N + 1];
int ib[N + 1];

void
f1 (void)
{
  int i;
  for (i = 0; i  N; i++)
{
  ia[i + 1] = 1;
  ia[i] = 2;
}
}

void
f2 (void)
{
  int i;
  for (i = 0; i  N; i++)
{
  ia[i] = 1;
  ia[i + 1] = 2;
}
}

void
f3 (void)
{
  int i;
  for (i = N - 1; i = 0; i--)
{
  ia[i + 1] = 1;
  ia[i] = 2;
}
}

void
f4 (void)
{
  int i;
  for (i = N - 1; i = 0; i--)
{
  ia[i] = 1;
  ia[i + 1] = 2;
}
}

we don't vectorize f1 and f2, in both cases for the write/write DDR
DDR_REVERSED_P is false, and vectorize f3/f4, where DDR_REVERSED_P is true in
both cases.  f2 and f3 shouldn't be vectorizable (at least not as is, when we'd
be trying to vectorize the two stores just by putting a vector store at that
position), f1 and f4 can.  So, this leads me to believe that for write/write we
don't have a way to differentiate between the bad and good cases using dist  0
 DDR_REVERSED_P test.  In that case, I'd think best would be to not ignore
dist  0  DDR_REVERSED_P (ddr)
ddrs if (!DR_IS_READ (dra)  !DR_IS_READ (drb)).


[Bug tree-optimization/59594] [4.9 Regression] wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu

2014-01-22 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59594

--- Comment #5 from Jakub Jelinek jakub at gcc dot gnu.org ---
Created attachment 31919
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=31919action=edit
gcc49-pr59594.patch

Untested patch for discussion.  The reason why we (incorrectly) vectorize the
testcase is that we ignore the data dependency, on the testcase both the b[a]
read vs. b[a+1] store and b[a] store vs. b[a+1] store DDRs have dist 1 and
DDR_REVERSED_P set and we ignore those.  Now on say:
int printf (const char *, ...);

int a;
static int b[1024];

int
main ()
{
  for (a = 0; a = 512; a++)
{
  b[a - 1] = b[a];
  b[a] = 1;
}
  printf (%d\n, b[1]);
  return 0;
}
only the b[a] read vs. b[a-1] store is dist 1 DDR_REVERSED_P and b[a] store vs.
b[a-1] store is dist 1 !DDR_REVERSED_P, thus we don't vectorize it (correctly).
Unfortunately not ignoring dist  0  DDR_REVERSED_P ddrs for negative step
regresses the testcase I've attached, where there is a write after read ddr and
it works properly with the current check.  While the attached patch keeps that
testcase (no-vfa-vect-depend*.c) working and fixes the test (pr59594.c), the
conditions are piled completely randomly, I'm afraid I don't know why it is so,
if for the DDR_REVERSED_P continue it matters whether step is positive or
negative, or if that is irrelevant and all the write after write DDR_REVERSED_P
ddrs need to be checked normally (abs (dist) = *max_vf), or if say only write
after read should be treated as the code treats it right now and even read
after write is problematic.  The DDR_REVERSED_P stuff has been added in 2007
for PR32377, see e.g. http://gcc.gnu.org/ml/gcc-patches/2007-09/msg01067.html

Richard, any ideas?


[Bug tree-optimization/59594] [4.9 Regression] wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu

2014-01-13 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59594

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 CC||hjl.tools at gmail dot com,
   ||jakub at gcc dot gnu.org,
   ||meibf at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek jakub at gcc dot gnu.org ---
(In reply to H.J. Lu from comment #1)
 It is caused by r204062.

That is only a part of the story, this testcase seems to be interesting.

Starting with r204062 until r204560 this has been broken because of an ldist
bug.
r204561 fixed that again and the testcase worked until r206147.
r206148 (aka negative step vectorization support) started to ICE on this.
With r206178 it stopped ICEing and works again with -O3 -mtune=generic, not
sure if that has been intentional to change the generic tuning with that patch.
 H.J.?
Anyway, with -O3 -mtune=core-avx2 e.g. r206178 still ICEs and r206179 (which
fixed the ICE) starts the miscompilation.


[Bug tree-optimization/59594] [4.9 Regression] wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu

2014-01-13 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59594

--- Comment #3 from H.J. Lu hjl.tools at gmail dot com ---
(In reply to Jakub Jelinek from comment #2)
 (In reply to H.J. Lu from comment #1)
  It is caused by r204062.
 
 That is only a part of the story, this testcase seems to be interesting.
 
 Starting with r204062 until r204560 this has been broken because of an ldist
 bug.
 r204561 fixed that again and the testcase worked until r206147.
 r206148 (aka negative step vectorization support) started to ICE on this.
 With r206178 it stopped ICEing and works again with -O3 -mtune=generic, not
 sure if that has been intentional to change the generic tuning with that
 patch.  H.J.?

There was a typo in r206178 and was fixed by r206180:

http://gcc.gnu.org/git/?p=gcc.git;a=commit;h=ff385162ee38caf3100ba4a0f682241c3b0d681d

Can you try r206180 on this?

 Anyway, with -O3 -mtune=core-avx2 e.g. r206178 still ICEs and r206179 (which
 fixed the ICE) starts the miscompilation.


[Bug tree-optimization/59594] [4.9 Regression] wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu

2014-01-13 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59594

--- Comment #4 from Jakub Jelinek jakub at gcc dot gnu.org ---
You're right, r206180 fixed this (i.e. that it FAILs at runtime even for
-mtune=generic).  In any case, sounds like this is a problem in determination
of the aliasing, we should have refused to vectorize this (of course unless we
are smart enough to find out all the b[a] = 1; stores but the b[0] = 1; is
redundant, but I suppose the vectorizer isn't the right place to do that
optimization.


[Bug tree-optimization/59594] [4.9 Regression] wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu

2013-12-24 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59594

H.J. Lu hjl.tools at gmail dot com changed:

   What|Removed |Added

   Priority|P3  |P1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2013-12-24
   Target Milestone|--- |4.9.0
Summary|wrong code (by tree |[4.9 Regression] wrong code
   |vectorizer) at -O3 on   |(by tree vectorizer) at -O3
   |x86_64-linux-gnu|on x86_64-linux-gnu
 Ever confirmed|0   |1


[Bug tree-optimization/59594] [4.9 Regression] wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu

2013-12-24 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59594

H.J. Lu hjl.tools at gmail dot com changed:

   What|Removed |Added

 CC||rguenther at suse dot de

--- Comment #1 from H.J. Lu hjl.tools at gmail dot com ---
It is caused by r204062.