https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106365

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
int __attribute__((noinline,noclone))
foo (int *out)
{
  int mask[] = { 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
      0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1 };
  int i;
  for (i = 0; i < 32; ++i)
    {
      if (mask[i])
        out[i] = i;
    }
  return out[7];
}
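
To check the expected value by hand, here is a minimal sketch that wraps the
testcase in a small harness.  The helper name run_foo and the -1 fill value
are assumptions made for illustration; they are not part of the PR.

```c
#include <assert.h>

/* Same function as the testcase above. */
int __attribute__((noinline,noclone))
foo (int *out)
{
  int mask[] = { 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
      0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1 };
  int i;
  for (i = 0; i < 32; ++i)
    {
      if (mask[i])
        out[i] = i;
    }
  return out[7];
}

/* Hypothetical helper (not from the PR): fill the buffer with -1, run foo,
   and check that only the odd-indexed elements were stored to.  */
int
run_foo (void)
{
  int out[32];
  int i, ret;
  for (i = 0; i < 32; ++i)
    out[i] = -1;
  ret = foo (out);
  for (i = 0; i < 32; ++i)
    assert (out[i] == ((i & 1) ? i : -1));
  return ret;
}
```

Since mask[7] is 1, the store out[7] = 7 always happens, so run_foo should
yield 7 regardless of whether the loop is vectorized.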

Testcase for x86_64 and .MASK_STORE; since mask[7] is 1, the return could be
optimized to the constant 7.  FRE sees

  .MASK_STORE (out_41(D), 32B, mask__7.9_47, { 0, 1, 2, 3, 4, 5, 6, 7 });
  _10 = &mask[8] + 32;
  MEM <vector(8) int> [(int *)_10] = { 0, 1, 0, 1, 0, 1, 0, 1 };

and 'mask' having its address taken makes it clobbered by .MASK_STORE.  There's
also the older issue that when mask is incoming but marked __restrict, that
isn't good enough, because __restrict and calls don't work together.
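
One way to picture the incoming-mask case is the sketch below.  It is a
hypothetical variant, not taken from the PR: the names bar and run_bar and the
parameter layout are assumptions, and it only shows the source shape, not what
any particular GCC version does with it.

```c
#include <assert.h>

/* Hypothetical variant: 'mask' is a parameter instead of a local array, and
   both pointers are marked __restrict to promise that they do not alias.  */
int __attribute__((noinline,noclone))
bar (int * __restrict out, const int * __restrict mask)
{
  int i;
  for (i = 0; i < 32; ++i)
    {
      if (mask[i])
        out[i] = i;
    }
  return out[7];
}

/* Hypothetical helper: build the same 0, 1, 0, 1, ... mask as the testcase
   and call bar on a buffer pre-filled with -1.  */
int
run_bar (void)
{
  int mask[32];
  int out[32];
  int i, ret;
  for (i = 0; i < 32; ++i)
    {
      mask[i] = i & 1;
      out[i] = -1;
    }
  ret = bar (out, mask);
  assert (out[6] == -1);  /* even element: masked off, left untouched */
  assert (out[8] == -1);
  return ret;
}
```

With the alternating mask, run_bar yields 7 just like the local-array version;
the point of the remark above is only that the __restrict promise by itself
does not help disambiguate 'mask' against the .MASK_STORE call.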

The IL with .LEN_STORE might suffer similar issues at the point FRE gets
to see it.

We might be able to improve BB SLP to not code-gen

  _10 = &mask[8] + 32;
  MEM <vector(8) int> [(int *)_10] = { 0, 1, 0, 1, 0, 1, 0, 1 };

here, since that is what makes 'mask' addressable again.  I have a patch for
this in testing.
