On Wed, Mar 18, 2015 at 9:04 AM, Hal Finkel <[email protected]> wrote: > > > ------------------------------ > > *From: *"Daniel Berlin" <[email protected]> > *To: *"Jeroen Dobbelaere" <[email protected]> > *Cc: *"Hal Finkel" <[email protected]>, > [email protected], > [email protected], "[email protected] Developers" < > [email protected]> > *Sent: *Wednesday, March 18, 2015 10:27:28 AM > *Subject: *Re: [PATCH] Fix for bug 21725: wrong results with union and > strict-aliasing > > > > On Wed, Mar 18, 2015 at 2:22 AM, Jeroen Dobbelaere < > [email protected]> wrote: > >> >> >> On Tue, Mar 17, 2015 at 11:55 PM, Daniel Berlin <[email protected]> >> wrote: >> >>> [..] >>> >>> In theory, the last time i remember, you weren't allow to set one member >>> of a union and read another. >>> But uh, that's not real user code :) >>> >>> (and IIRC, it does not say anything real) >>> >>> >> This is indeed correct. The main issue is that llvm thinks it is allowed >> to reorder 'stores' (as it thinks they are not aliasing), >> so that a legal read afterwards will result in a wrong answer (if an >> optimization or the backend reorders the stores based on the wrong aliasing >> information). >> > > I thought both LLVM and GCC guaranteed type punning through unions would > work? > > > Yes, LLVM does this (at least for "local" type punning, for some heuristic > definition of local), because we always allow BasicAA to override TBAA when > BasicAA is *sure* there is aliasing. > > > >> >> >> >>>> If you access a member (or nested member) of a union, starting from the >>>> union itself, then it depends if the other type is also accessible through >>>> the union. >>>> >>>> >>>> So: >>>> >>>> int foo(union foo* a, float* b, int* c) { >>>> a->a=1; >>>> *b=2; >>>> // compiler must assume a->a and *b can alias >>>> // compiler must not assume *b and *c alias (access not through union) >>>> } >>>> >>>> (Also see section 3.10 of the c++03 standard; >>>> >>> >>> >>> This, IMHO, does not say what you seem to think it does :) >>> >>> For C++03, 3.10 only includes the word "union" here: "If a program >>> attempts to access the stored value of an object through an lvalue of other >>> than one of the following types the behavior is undefined: >>> >>> — the dynamic type of the object, >>> — a cv-qualified version of the dynamic type of the object, >>> — a type that is the signed or unsigned type corresponding to the >>> dynamic type of the object, >>> — a type that is the signed or unsigned type corresponding to a >>> cv-qualified version of the dynamic type of the object, >>> — an aggregate or union type that includes one of the aforementioned >>> types among its members (including, recursively, a member of a subaggregate >>> or contained union), >>> — a type that is a (possibly cv-qualified) base class type of the >>> dynamic type of the object, >>> — a char or unsigned char type." >>> >>> >>> C++ standard experts, at least on the GCC side, did not view this as >>> saying "all accesses must have an explicit union access", but that "It must >>> be part of a union type", but about whether you try to access it through a >>> union that doesn't have the right actual types in it. >>> >>> The type of those objects is right the type of the object. There is, >>> IMHO, nothing illegal about those accesses. >>> >>> >>> >> So, with that interpretation, the mere presence of a 'union { short s; >> int i; }' (in the same or a different compilation unit) should influence >> the (type based) aliasing of all short and int pointers ? >> > > It's actually worse than this interpretation, since someone pointed out on > the gcc side that "ihateunions" could be in a different TU, and you'd never > see the union at all. > While I think this is the right interpretation (as do others), that's not > the point. > I'm actually on your side here :) > > But remember that Hal said that if hte language has semantics, we should > follow them. > > > I did. It is also true that standards have defects ;) If a strict reading > of the language semantics implies a 'spooky action at a distance' effect, > then we'll need to draw a line in the sand somewhere to avoid that. I also > think your interpretation that implies that is incorrect... > > > The whole point of this is essentially to point out that the language > semantics weren't completely thought out here :) > Because of that, our options are either "make up a rule people can follow" > or "disable TBAA". > > GCC, and every other compiler i'm aware of, has chosen the former, at > least for C++03 (I don't know what C++11 or 14 say here). > > > Yes; fair enough. What I want is that if we make up a rule, then we make > that rule well defined and documented (not just whatever the implementation > happens to do today); and we should also write a paper for the C++ > Standards Committee proposing that as an official solution (I'll be happy > to present at the next meeting, and we have approximately 1 month before > the next paper deadline, IIRC). > > In the current draft, I relevant wording looks equivalent: > > — an aggregate or union type that includes one of the aforementioned types > among its elements > or non-static data members (including, recursively, an element or > non-static data member of > a subaggregate or contained union), > > > However, another point to consider is that, "C++ standard experts, at > least on the GCC side, did not view this as saying "all accesses must have > an explicit union access", but that..." may not be a reasonable reading of > the standard. The text says, "If a program attempts to access the stored > value of an object through a glvalue of other than one > of the following types the behavior is undefined: ...", and so I read that > as saying that the access must be through a glvalue of an aggregate or > union type (with a relevant type as one of its members). So the union or > aggregate type *must* be explicitly present in the access. I don't believe > any other interpretations, especially those which imply non-local effects, > are reasonable. > > Exactly what rule did GCC implement? >
If you use an explicit union access, it assumes they alias, otherwise, all bets are off. Note: As you all know, I have really no dog in this fight, nor am i a language lawyer. Among other things, i'm an aliasing guy. I just do the edge of what the language lawyers tell me is allowed :)
_______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
