https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108226

            Bug ID: 108226
           Summary: __restrict on inlined function parameters does not
                    function as expected
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jhaberman at gmail dot com
  Target Milestone: ---

In bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58526 and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60712#c3 it is said that
restrict/__restrict on inlined function parameters was fixed in GCC 5.  But I
ran into a case where __restrict does not work as expected:

// Godbolt link for this example: https://godbolt.org/z/e5j93Ex3v

long g;

static void Func1(void* p1, int* p2) {
  switch (*p2) {
    case 2:
      __builtin_memcpy(p1, &g, 1);
      return;
    case 1:
      __builtin_memcpy(p1, &g, 8);
      return;
    case 0: {
      __builtin_memcpy(p1, &g, 16);
      return;
    }
  }
}

static void Func2(char* __restrict p1, int* __restrict p2) {
  *p2 = 1;
  *p1 = 123;
  Func1(p1, p2);
}

void Func3(char* p1, int* p2) {
  *p2 = 1;
  Func2(p1, p2);
}

The __restrict qualifiers on Func2() should allow the switch() to be
optimized away.  Clang optimizes it away; GCC does not.
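
To make the expectation concrete, here is a hand-written sketch of what
Func3() could reduce to once Func2() and Func1() are inlined.  It is my own
illustration, not compiler output.  The name Func3_folded is hypothetical, and
the sketch assumes an 8-byte long, as the original example does.  Because p1
and p2 are __restrict in Func2(), the store through p1 cannot change *p2, so
the switch() always sees 1:

```c
long g;  /* same global as in the example above */

/* Assumption of this sketch: long is 8 bytes, as on LP64 targets. */
_Static_assert(sizeof(long) == 8, "sketch assumes an 8-byte long");

/* Hypothetical hand-folded equivalent of Func3() from the first example.
   Only valid when the byte at p1 does not overlap *p2, which is what the
   __restrict qualifiers on Func2() promise. */
void Func3_folded(char* p1, int* p2) {
  *p2 = 1;                      /* the two stores of 1 collapse into one */
  *p1 = 123;                    /* cannot modify *p2 under __restrict */
  __builtin_memcpy(p1, &g, 8);  /* only "case 1" is reachable */
}
```

Note that the memcpy overwrites the byte stored through p1, so only the store
to *p2 and the 8-byte copy are observable afterwards.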

It appears that __restrict on function parameters can even make the code worse.
Consider a slight variation on this example:

// Godbolt link for this example: https://godbolt.org/z/Y61qajETd

long g;

static void Func1(void* p1, int* p2) {
  switch (*p2) {
    case 2:
      __builtin_memcpy(p1, &g, 1);
      return;
    case 1:
      __builtin_memcpy(p1, &g, 8);
      return;
    case 0: {
      __builtin_memcpy(p1, &g, 16);
      return;
    }
  }
}

// If we remove __restrict here, GCC succeeds in optimizing away the switch().
static void Func2(char* __restrict p1, int* __restrict p2) {
  *p1 = 123;
  *p2 = 1;
  Func1(p1, p2);
}

void Func3(char* p1, int* p2) {
  *p2 = 1;
  Func2(p1, p2);
}

In this case, it should be straightforward to optimize away the switch(), even
without __restrict.  But GCC fails to optimize it away unless we *remove*
__restrict.
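
The reason no aliasing information is needed here: the store *p2 = 1 is the
last write before the call to Func1(), so the load in the switch() must read 1
whether or not p1 and p2 overlap.  A hand-written sketch of the folded result
follows.  Again this is my own illustration rather than compiler output, the
name Func3_variant_folded is hypothetical, and an 8-byte long is assumed:

```c
long g;  /* same global as in the example above */

/* Assumption of this sketch: long is 8 bytes, as on LP64 targets. */
_Static_assert(sizeof(long) == 8, "sketch assumes an 8-byte long");

/* Hypothetical hand-folded equivalent of Func3() from the second example.
   No aliasing assumption is used: both stores to *p2 are kept, and the
   second one is the last write before the point where the switch() was. */
void Func3_variant_folded(char* p1, int* p2) {
  *p2 = 1;
  *p1 = 123;                    /* may overlap *p2; overwritten just below */
  *p2 = 1;                      /* last store before the former switch() */
  __builtin_memcpy(p1, &g, 8);  /* only "case 1" is reachable */
}
```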
