On Sun, Jan 3, 2010 at 6:46 AM, Joshua Haberman <[email protected]> wrote:
> The aliasing policies that GCC implements seem to be more strict than
> what is in the C99 standard. I am wondering if this is true or whether
> I am mistaken (I am not an expert on the standard, so the latter is
> definitely possible).
>
> The relevant text is:
>
> An object shall have its stored value accessed only by an lvalue
> expression that has one of the following types:
>
> * a type compatible with the effective type of the object,
> [...]
> * an aggregate or union type that includes one of the aforementioned
> types among its members (including, recursively, a member of a
> subaggregate or contained union), or
Interpreting this sentence literally, the way you do, removes nearly all
of the benefit of type-based aliasing when disambiguating a pointer
dereference against a direct object reference, so it cannot be the
intended interpretation (and thus we do not allow this). It would
basically force us to treat *ptr vs. Obj as *ptr vs. *(Obj *)ptr2.
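For illustration, here is a minimal sketch (not part of the original
mail; the names obj, store_via_struct_ptr and store_via_int_ptr are
hypothetical) of accesses that stay well defined under GCC's reading,
because the object really is a struct A. In both functions the compiler
has to assume the store may change obj.x and must reload it. What GCC
does not accept is the reverse direction, where a plain int is reached
through a struct A *.

```c
struct A { int x; };

/* An object whose declared type really is struct A. */
static struct A obj;

/* Store through a struct A pointer, then read the named object.
   The pointer may legitimately point at obj, so obj.x is reloaded. */
int store_via_struct_ptr(struct A *a) {
    obj.x = 1;
    a->x = 5;
    return obj.x;
}

/* Store through an int pointer. The lvalue *p has type int, which is
   compatible with the type of the member obj.x, so this may alias too. */
int store_via_int_ptr(int *p) {
    obj.x = 1;
    *p = 5;
    return obj.x;
}
```

Called as store_via_struct_ptr(&obj) or store_via_int_ptr(&obj.x), both
functions have to observe the store made through the pointer.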
> To me this allows the following:
>
> int i;
> union u { int x; } *pu = (union u*)&i;
> printf("%d\n", pu->x);
>
> In this example, the object "i", which is of type "int", is having its
> stored value accessed by an lvalue expression of type "union u", which
> includes the type "int" among its members.
>
> I have seen other articles that interpret the standard in this way.
> See section "Casting through a union (2)" from this article, which
> claims that casts of this sort are legal and that GCC's warnings
> against them are false positives:
> http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html
Yes, this article contains many mistakes, but the author failed to listen.
> However, this appears to be contrary to GCC's documentation. From the
> manpage:
>
> Similarly, access by taking the address, casting the resulting
> pointer and dereferencing the result has undefined behavior, even
> if the cast uses a union type, e.g.:
>
> int f() {
>   double d = 3.0;
>   return ((union a_union *) &d)->i;
> }
>
> I have also been able to experimentally verify that GCC will mis-compile
> this fragment if we expect the behavior the standard specifies:
>
> int g;
> struct A { int x; };
> int foo(struct A *a) {
>   if (g) a->x = 5;
>   return g;
> }
>
> With GCC 4.3.3 -O3 on x86-64 (Ubuntu), g is only loaded once:
>
> 0000000000000000 <foo>:
> 0: 8b 05 00 00 00 00 mov eax,DWORD PTR [rip+0x0] # 6 <foo+0x6>
> 6: 85 c0 test eax,eax
> 8: 74 06 je 10 <foo+0x10>
> a: c7 07 05 00 00 00 mov DWORD PTR [rdi],0x5
> 10: f3 c3 repz ret
>
> But this is incorrect if foo() was called as:
>
> foo((struct A*)&g);
>
> Here is another example:
>
> struct A { int x; };
> struct B { int x; };
> int foo(struct A *a, struct B *b) {
>   if (a->x) b->x = 5;
>   return a->x;
> }
>
> When I compile this, a->x is only loaded once, even though foo()
> could have been called like this:
>
> int i;
> foo((struct A*)&i, (struct B*)&i);
>
> From this I conclude that GCC diverges from the standard, in that it does not
> allow casts of this sort. In one sense this is good (because the policy GCC
> implements is more aggressive, and yet still reasonable) but on the other hand
> it means (if I am not mistaken) that GCC will incorrectly optimize strictly
> conforming programs.
Correct. GCC follows its own documentation here, not some random
web sites, and maybe not the strict reading of the standard. There are
other corner cases where it does so, namely with the union type rule
(for which I cannot come up with a standard reference at the moment).
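
As a hedged sketch of alternatives to the pointer-cast idiom from the
manpage example (this code is not from the original mail, and the names
double_bits_memcpy, double_bits_union and union double_or_bits are
hypothetical): copying the bytes with memcpy does not depend on the
aliasing rules at all, and GCC's documentation for -fstrict-aliasing
states that type punning is allowed when the memory is accessed through
the union type itself. The sketch assumes double is 64 bits wide.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a double as an integer by copying its bytes.
   No lvalue of the wrong type is ever dereferenced. */
uint64_t double_bits_memcpy(double d) {
    uint64_t bits;
    assert(sizeof bits == sizeof d);  /* assumption: 64-bit double */
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* Hypothetical union used for punning through the union object. */
union double_or_bits {
    double d;
    uint64_t bits;
};

/* Write one member and read the other through the union object itself,
   not through a pointer cast from an unrelated object. */
uint64_t double_bits_union(double d) {
    union double_or_bits u;
    u.d = d;
    return u.bits;
}
```

The difference from the manpage example is that the object here is
declared as a union from the start, instead of casting the address of a
plain double to a union pointer.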
Richard.
> Clarifications are most welcome!
>
> Josh