https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88978

            Bug ID: 88978
           Summary: Failed outer loop vectorization with grouped accesses
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

We fail to vectorize outer loops when there are grouped accesses in the
inner loop:

int a[1024];
int b[1024][1024];

void foo ()
{
  for (int i = 0; i < 512; ++i)
    {
      int a1 = a[2*i];
      int a2 = a[2*i+1];
      for (int j = 0; j < 1024; ++j)
        {
          b[j][2*i] = a1;
          b[j][2*i+1] = a2;
        }
    }
}

This is mostly because we cannot do SLP here (for implementation reasons).
By contrast, the following is vectorized just fine, with interleaving
applied to the outer loop accesses:

int a[1024];
int b[1024][1024];

void foo ()
{
  for (int i = 0; i < 512; ++i)
    {
      int a1 = a[2*i];
      int a2 = a[2*i+1];
      for (int j = 0; j < 1024; ++j)
        b[j][i] = a1+a2;
    }
}

The guard in question is the following. It is premature, since it fires
before SLP is even tried, and its message is somewhat inaccurate: what
fails is grouped accesses in the inner loop when doing outer loop
vectorization, not grouped accesses in an outer loop.

static bool
vect_analyze_data_ref_access (dr_vec_info *dr_info)
{
...
  if (loop && nested_in_vect_loop_p (loop, stmt_info))
    {
      if (dump_enabled_p ())
        dump_printf_loc (MSG_NOTE, vect_location,
                         "grouped access in outer loop.\n");
      return false;
    }
