https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83202

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2017-11-29
             Blocks|                            |53947
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
With += 4 the inner loop doesn't iterate, so it's effectively

void test(double data[4][4])
{
  for (int i = 0; i < 4; i++)
  {
    data[i][i] = data[i][i] * data[i][i];
    data[i][i+1] = data[i][i+1] * data[i][i+1];
  }
}

We fail to SLP here because we get confused by the computed group size of 5:
each iteration stores two adjacent elements, and there's a gap of three
elements before the first store of the next iteration.
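
To make the arithmetic behind the group size of 5 concrete, here is a minimal
sketch that only computes flat element offsets for a row-major double[4][4].
The helper names (flat_offset, group_stride, group_gap) are made up for
illustration and are not part of GCC; this is not how the vectorizer itself
represents access groups.

```c
#include <stddef.h>

enum { N = 4 };  /* row length of double data[4][4] */

/* Flat element offset of data[row][col] from &data[0][0] (row-major). */
static ptrdiff_t flat_offset(int row, int col)
{
    return (ptrdiff_t)row * N + col;
}

/* Distance, in elements, between the first store of iteration i
   (data[i][i]) and the first store of iteration i+1. */
static ptrdiff_t group_stride(int i)
{
    return flat_offset(i + 1, i + 1) - flat_offset(i, i);
}

/* Number of elements skipped between the last store of iteration i
   (data[i][i+1]) and the first store of iteration i+1. */
static ptrdiff_t group_gap(int i)
{
    return flat_offset(i + 1, i + 1) - flat_offset(i, i + 1) - 1;
}
```

Under these assumptions group_stride(i) is 5 for every i and group_gap(i) is 3,
so each group of 5 elements contains only 2 stores.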

When later doing BB vectorization we fail to analyze dependences, likely
because we do not analyze refs as thoroughly as we do for loops.

For your second example we fail to loop vectorize because we completely
peel the inner loop in cunrolli, leaving control flow inside the loop...
I have a patch for that one.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
