On 05/01/2017 12:49 PM, Daniel Berlin wrote:


On Fri, Apr 21, 2017 at 4:03 AM, Hal Finkel via Phabricator <revi...@reviews.llvm.org <mailto:revi...@reviews.llvm.org>> wrote:

    hfinkel added a comment.

    In https://reviews.llvm.org/D32199#732737
    <https://reviews.llvm.org/D32199#732737>, @rsmith wrote:

    > In https://reviews.llvm.org/D32199#732189
    <https://reviews.llvm.org/D32199#732189>, @hfinkel wrote:
    >
    > > In https://reviews.llvm.org/D32199#731472
    <https://reviews.llvm.org/D32199#731472>, @rsmith wrote:
    > >
    > > > 1. C's "effective type" rule allows writes to set the type
    pretty much unconditionally, unless the storage is for a variable
    with a declared type
    > >
    > >
    > > To come back to this point: We don't really implement these
    rules now, and it is not clear that we will. The problem here is
    that, if we take the specification literally, then we can't use
    our current TBAA at all. The problem is that if we have:
    > >
    > >   write x, !tbaa "int"
    > >   read x, !tbaa "int"
    > >   write x, !tbaa "float"
    > >
    > >
    > > TBAA will currently tell us that the "float" write aliases
    with neither the preceding read nor the preceding write.
    >
    >
    > Right, C's TBAA rules do not (in general) permit a store to be
    reordered before a memory operation of a different type, they only
    allow loads to be moved before stores. (Put another way, they do
    not tell you that pointers point to distinct memory locations,
    just that a stored value cannot be observed by a load of a
    different type.) You get the more general "distinct memory
    locations" result only for objects of a declared type.
    >
    > C++ is similar, except that (because object lifetimes do not
    currently begin magically due to a store) you /can/ reorder stores
    past a memory operation of a different type if you know no
    object's lifetime began in between. (But currently we do not
    record all lifetime events in IR, so we can't do that today. Also,
    we may be about to lose the property that you can statically
    determine a small number of places that might start an object
    lifetime.)
    >
    > > Also, a strict reading of C's access rules seems to rule out
    the premise underlying our struct-path TBAA entirely. So long as
    I'm accessing a value using a struct that has some member,
    including recursively, with that type, then it's fine. The
    matching of the relative offsets is a sufficient, but not
    necessary, condition for well-defined access. C++ has essentially
    the same language (and, thus, potentially the same problem).
    >
    > I agree this rule is garbage, but it's not as permissive as I
    think you're suggesting. The rule says that you can use an lvalue
    of struct type to access memory of struct field type. In C this
    happens during struct assignment, for instance. It does *not*
    permit using an lvalue of struct field type to access unrelated
    fields of the same struct. So C appears to allow this nonsense:
    >
    >   char *p = malloc(8);
    >   *(int*)p = 0;
    >   *(int*)(p + 4) = 0;
    >   struct S {int n; float f;} s = *(struct S*)p; // use lvalue of
    type `struct S` to access object of effective type `int`, to
    initialize a `float`
    >
    >
    > but not this nonsense:
    >
    >   float q = ((struct S*)p)->f; // ub, cannot use lvalue of type
    `float` to access object of effective type `int`
    >
    >
    > ... which just means that we can't make much use of TBAA when
    emitting struct copies in C.
    >
    > In C++, on the other hand, the rule is even more garbage, since
    there is no way to perform a memory access with a glvalue of class
    type. (The closest you get is that a defaulted union
    construction/assignment copies the object representation, but
    that's expressed in terms of copying a sequence of unsigned chars,
    and in any case those are member functions and so already require
    an object of the correct type to exist.) See wg21.link/cwg2051


    Our struct-path TBAA does the following:

      struct X { int a, b; };
      X x { 50, 100 };
      X *o = (X*) (((int*) &x) + 1);

      int a_is_b = o->a; // This is UB (or so we say)?


This is UB.
A good resource for this stuff is http://www.cl.cam.ac.uk/~pes20/cerberus/ <http://www.cl.cam.ac.uk/%7Epes20/cerberus/> which has a long document where they exlpore all of these and what various compilers do, along with what the standard seems to say.

http://www.cl.cam.ac.uk/~pes20/cerberus/notes30-full.pdf is 172 pages, and so I may have missed it, but I don't see this case. Also, I'd really like to see where the standard says this is UB. I don't see it.

Thanks again,
Hal

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to