[Bug tree-optimization/86259] [8/9 Regression] min(4, strlen(s)) optimized to strlen(s) with -flto

msebor at gcc dot gnu.org Sun, 24 Jun 2018 13:39:26 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86259


--- Comment #14 from Martin Sebor <msebor at gcc dot gnu.org> ---
> You say that
> 
>  struct { int a; int b; } s, s2;
>  memcpy (&s2.a, &s.a, sizeof (s));
> 
> is invalid, aka not copying the whole structure since you pass in a
> pointer to s2.a rather than s2?

Yes.  It's invalid for the same reason as the following:

  int *p = &s.a;
  int *q = &s2.a;

  *q = *p;    // okay
  ++q, ++p;   // okay
  *q = *p;    // undefined #1
  ++q, ++p;   // undefined #2

Here, s.a and s.b are effectively arrays of one element, i.e., int[1].  As Joseph
explained in comment #13, pointer arithmetic is defined only for and within
each array object, irrespective of whether the array is itself an element of
another array or a member of a struct.  So above, #1 is undefined because it
accesses a past-the-end element of an array, and #2 is undefined because the
only valid arithmetic on a past-the-end pointer is to subtract one from it.

This is specified in 6.5.6 Additive operators, p8 of C11.  In C89, the
corresponding text is in section 3.3.6.  The same restriction applies to all
library functions, including memcpy and strcpy.
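As a minimal sketch of how the copy in the first quoted example could be written without stepping from one member into the next: either pass pointers to the complete objects, or assign each member through its own lvalue.  The type name struct pair and the function names copy_whole and copy_members are hypothetical, introduced only for illustration.

```c
#include <string.h>

/* Hypothetical named type standing in for the anonymous struct above. */
struct pair { int a; int b; };

/* Copy the whole object.  Both pointers designate the complete struct,
   so all sizeof (struct pair) bytes are within bounds. */
void copy_whole (struct pair *dst, const struct pair *src)
{
  memcpy (dst, src, sizeof *dst);
}

/* Copy member by member.  Each access stays within the one-element
   "array" formed by the member it names. */
void copy_members (struct pair *dst, const struct pair *src)
{
  dst->a = src->a;
  dst->b = src->b;
}
```

Neither function forms a pointer to s.a and then advances it into s.b, which is the step marked undefined above.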

> Alternatively
> 
>  struct { int x; int a; int b; } s, s2;
>  memcpy (&s2.a, &s.a, sizeof (s) - sizeof (int));
> 
> is similarly invalid, not copying the last two elements?

Yes.  It's invalid for the same reasons as above.  This would be valid:

  struct S { int x; int a; int b; } s, s2;
  memcpy ((char*)&s2 + offsetof (struct S, a),
          (char*)&s + offsetof (struct S, a),
          sizeof (s) - offsetof (struct S, a));

This is valid because &s and &s2 point to the objects s and s2.  Any object can
be interpreted as an array of char with a size equal to the object itself and
the pointer arithmetic is valid anywhere within the object.
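The same idea as a small self-contained sketch, wrapped in a function so it can be compiled and checked.  The function name copy_tail is hypothetical; offsetof comes from <stddef.h>.

```c
#include <stddef.h>
#include <string.h>

struct S { int x; int a; int b; };

/* Copy every byte of *src from member a through the end of the struct,
   leaving dst->x untouched.  The pointers are derived from the address
   of the whole object, so the arithmetic is valid anywhere within it. */
void copy_tail (struct S *dst, const struct S *src)
{
  const size_t off = offsetof (struct S, a);

  memcpy ((char *) dst + off, (const char *) src + off, sizeof *dst - off);
}
```

For example, with s = { 1, 2, 3 } and s2 = { 9, 0, 0 }, calling copy_tail (&s2, &s) leaves s2.x equal to 9 and sets s2.a and s2.b to 2 and 3.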

> 
> I don't see how str* are in anyway less special than mem* here in case you
> come up with an exception for mem*.

In the standard, the restrictions above apply equally to expressions and to
library functions.  For better or worse, GCC treats memxxx() functions
differently than strxxx() functions because of legacy invalid code that relies
on the second memcpy example working.  This is evident from the effects of
_FORTIFY_SOURCE.  For example, the following gets a warning when compiled with
_FORTIFY_SOURCE=1 and aborts with a buffer overflow error at runtime:

  #include <string.h>

  int main (void)
  {
    struct {
      char a[4];
      char b[3];
    } a;

    a.a[0] = '0';
    strcpy (a.a + 1, "12345");

    __builtin_puts (a.a);
  }

In function ‘strcpy’,
    inlined from ‘main’ at c.c:12:5:
/usr/include/bits/string3.h:110:10: warning: ‘__builtin___memcpy_chk’ writing 6
bytes into a region of size 3 overflows the destination [-Wstringop-overflow=]
...
*** buffer overflow detected ***: ./a.out terminated

(The region of size 3 is the space remaining in the a.a array at offset 1.)

but because the special treatment applies only to the memxxx functions and not
to string functions, the equivalent program using memcpy() compiles without a
warning and runs successfully to completion:

  #include <string.h>

  int main (void)
  {
    struct {
      char a[4];
      char b[3];
    } a;

    a.a[0] = '0';
    memcpy (a.a + 1, "12345", 6);

    __builtin_puts (a.a);
  }

Because the special treatment applies only to raw memory functions like memcpy
and memmove and not to string functions like strcpy or strlen, none of the test
cases here is valid even under GCC's relaxed rules.
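For comparison, here is a sketch of how the memcpy variant could be written so that it does not depend on GCC's relaxed treatment at all: the destination pointer is derived from the address of the whole struct rather than from the member a.a.  The names struct buf and fill are hypothetical, and the assert is only a guard for the size assumption made by the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical named type with the same layout as the struct above. */
struct buf { char a[4]; char b[3]; };

/* Store "012345" plus the terminating NUL across the whole object.
   base points to the complete struct viewed as an array of char, so
   the 6-byte copy at offset 1 stays within its sizeof (struct buf)
   bytes. */
void fill (struct buf *p)
{
  char *base = (char *) p;
  const size_t off = offsetof (struct buf, a);

  assert (off + 1 + 6 <= sizeof *p);   /* guard for this sketch */

  base[off] = '0';
  memcpy (base + off + 1, "12345", 6);
}
```

Reading the result back needs the same care: treating p->a as a string walks past the end of the four-element array, so the bytes should be read through a pointer to the whole object, e.g. with memcmp (p, "012345", 7).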
