On Fri, Jan 31, 2020 at 1:05 PM Uecker, Martin
<martin.uec...@med.uni-goettingen.de> wrote:
>
> Am Freitag, den 31.01.2020, 09:02 +0100 schrieb Richard Biener:
> > On Thu, Jan 30, 2020 at 6:09 PM Uecker, Martin
> > <martin.uec...@med.uni-goettingen.de> wrote:
> > >
> > > Am Donnerstag, den 30.01.2020, 16:50 +0000 schrieb Michael Matz:
> > > > Hi,
> > > >
> > > > On Thu, 30 Jan 2020, Uecker, Martin wrote:
> > > >
> > > > > > guarantees face serious implementation difficulties I think
> > > > > > so the only alternative to PNVI (which I think is implementable
> > > > > > but at an optimization opportunity cost) is one that makes
> > > > > > two pointers with the same value always have the same
> > > > > > provenance (and otherwise make the behavior undefined).
> > > > >
> > > > > This would need to come with precise rules about
> > > > > when the occurrence of two such pointers is UB,
> > > > > e.g. comparisons of such pointers, or that
> > > > > two such pointers are cast to int in the same
> > > > > execution.
> > > > >
> > > > > The mere existence of such pointers should be
> > > > > quite common and should not already be UB.
> > > > >
> > > > > But I am uncomfortable with the idea that
> > > > > comparison of pointers is always allowed except
> > > > > for some special case which then is UB. This
> > > > > might cause rare and very difficult to find bugs.
> > > >
> > > > As Richi said, the comparison itself wouldn't be UB, all comparisons
> > > > would be allowed.  But _if_ the pointers compare equal, they must have
> > > > the same (or overlapping) provenance (i.e. when they have not, then
> > > > _that_ is UB).
> > >
> > > Sorry, I still don't get it.  In the following example,
> > >
> > > int a[1];
> > > int b[1];
> > >
> > > it is often the case that &a[1] and &b[0] compare equal
> > > because they have the same address but the pointers
> > > have different provenance.
> > >
> > > Or does there need to be an actual evaluation of a comparison
> > > operation? In this case, I do not see the difference to what
> > > I said.
> >
> > I guess I wanted to say that if you do
> >
> >   if (&a[1] == &b[0])
> >     if (&a[1] != &b[0])
> >       abort ();
> >
> > then the abort might happen.  I'm using the term "undefined behavior"
> > here.  So whenever you create a value based on two pointers with
> > the same value and different provenance you invoke undefined behavior.
>
> Yes, but it is tricky because one needs to define
> "create a value based on two pointers with..."
>
> Assuming one does not track provenance through integers,
> the only ways to create expressions using two pointers
> are comparisons, pointer subtraction, and the ternary
> operator.
>
> The ternary operator seems unproblematic. For pointer
> subtraction, the standard already requires same provenance.
>
> For comparisons, one could consider making this case UB.
> But I fear this could be the source of subtle bugs.
>
> Then there is the question about what happens if a
> program inspects the representation bytes of a
> pointer directly...

At least the pointer is then no longer a register ;)
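
For concreteness, inspecting the representation bytes could look like the
sketch below.  The helper name same_pointer_bytes is made up for this
illustration and is not from any proposal.  Note that taking &p forces the
pointer object into memory, which is presumably the point of the remark
about registers.  Whether such a byte-wise comparison should carry any
provenance is exactly the open question; the sketch only shows the
mechanism.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not from the thread): returns 1 if the two pointer
   objects hold identical representation bytes, 0 otherwise. */
static int same_pointer_bytes(int *p, int *q)
{
    unsigned char pb[sizeof p];
    unsigned char qb[sizeof q];

    memcpy(pb, &p, sizeof p);   /* bytes of the pointer itself, not of *p */
    memcpy(qb, &q, sizeof q);
    return memcmp(pb, qb, sizeof pb) == 0;
}

/* Usage: a copy of a pointer has the same value and the same provenance.
   Identical bytes are what common implementations give; the standard
   itself does not promise a unique representation per pointer value. */
static void demo_pointer_bytes(void)
{
    int x = 0;
    int *p = &x;
    int *q = p;

    assert(same_pointer_bytes(p, q));
}
```

The example deliberately stays away from the one-after-the-object case, where
the bytes may match although the provenance differs.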

> > That allows the compiler to optimize
> >
> > int *q, *r;
> > if (q == r)
> >   *r = 1;
> >
> > into
> >
> > if (q == r)
> >   *q = 1;
> >
> > which it is currently not allowed to do because of that dread one-after-the
> > object equality compare, not because of PNVI, but similar cases
>
> Yes, but as provenance is tracked at compile-time, you could still
> do the optimization if you assign the right provenance to the
> replaced variable, i.e. you replace 'r' with 'q' but keep the
> provenance of 'r'. So while this puts a burden on the compiler
> writers, it seems feasible. Or am I missing something?

With SSA it's not easy, since q before the comparison is the same SSA name, so it's

  *q_1 = 2;
  if (q_1 == r_2)
    *r_2 = 1;  ->  *q_1 = 1;

and we cannot change the provenance of q_1 since that affects the
earlier store.  We'd have to somehow attach provenance to all
_operations_ where it matters (the dereference in this case).  That's
a much larger change.
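
Written out as plain C, the SSA snippet corresponds roughly to the two
functions below.  The function names are invented for this sketch; q and r
stand for q_1 and r_2.  The demo only uses pointers where both versions
are well-defined and agree, so it shows what the substitution preserves,
not the problematic one-after-the-object case.

```c
#include <assert.h>

/* Source form of the SSA snippet: q_1 is used both for the earlier store
   and in the comparison. */
static void store_original(int *q, int *r)
{
    *q = 2;          /* earlier store through q_1 */
    if (q == r)
        *r = 1;      /* dereference with r_2's provenance */
}

/* Result of the value-equivalence substitution r_2 -> q_1. */
static void store_substituted(int *q, int *r)
{
    *q = 2;
    if (q == r)
        *q = 1;      /* same address, but now q_1's provenance */
}

static void demo_store(void)
{
    int x = 0, y = 0;

    store_original(&x, &x);
    assert(x == 1);
    store_substituted(&x, &x);
    assert(x == 1);

    /* Distinct objects, neither pointer one past the end: never equal. */
    store_original(&x, &y);
    assert(x == 2 && y == 0);
    store_substituted(&x, &y);
    assert(x == 2 && y == 0);
}
```

Since q_1 is a single SSA name, giving it r_2's provenance for the second
store would also change what the first store is assumed to access.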

> > obviously can be constructed with integers (and make our life difficult
> > as we're tracking provenance through integers).
>
> As in PNVI integers do not have provenance, such an optimization would
> always be valid for integers as would all other natural algebraic
> optimizations for integers. I consider this a major strength of
> the proposal and I kind of hoped that compiler writers would agree.

Yes, sure - it avoids this class of problems.  PNVI is probably the
very simplest approach to fix whatever problem it tries to fix ;)
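
A minimal sketch of the integer case, assuming the implementation provides
uintptr_t (it is optional in the standard).  The function names are made
up for this illustration.  If integers carry no provenance, the two
functions are equivalent for every input, so replacing j by i under
i == j is just an ordinary value-equivalence.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: uses j inside the guarded branch. */
static uintptr_t pick_original(uintptr_t i, uintptr_t j)
{
    if (i == j)
        return j;
    return 0;
}

/* After substituting i for j, which is valid when integers have no
   provenance attached. */
static uintptr_t pick_substituted(uintptr_t i, uintptr_t j)
{
    if (i == j)
        return i;
    return 0;
}

static void demo_pick(void)
{
    int x = 0;
    uintptr_t a = (uintptr_t)&x;   /* implementation-defined conversion */

    assert(pick_original(a, a) == pick_substituted(a, a));
    assert(pick_original(a, a + 1) == pick_substituted(a, a + 1));
}
```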

> > Compilers fundamentally work with value-equivalences, the above example
> > shows we may not.  That's IMHO a defect in the standard.
>
> I consider provenance to be part of the value. Think about
> architectures with descriptors that actually trap if you use
> the wrong pointer. This nicely corresponds to a concept
> of abstract pointers which are not simply the address of a
> memory location.
>
> The problem we have is that we cannot (cheaply) track provenance
> at runtime on modern CPUs and only the address part of the pointer
> is available at runtime. For the standard, this implies
> that the rules must work for both abstract pointers with provenance
> and address-only pointers where information about provenance
> is not available. Whenever there is a discrepancy between
> these two models, we can either make it UB or use the semantics
> of the address-only case.
>
> The only really problematic case we have with PNVI is comparisons
> of one-after-the object pointers with a pointer of different
> provenance. The only choices we have are to make this UB or
> to make the result well-defined and based on the address.
> Both choices have disadvantages.
>
> If we track provenance through integers, there are many
> other difficult problems. The reason is that you then
> cannot work with value-equivalences anymore even for
> integer expressions which are much more complex.
> The amount of additional problems we create here
> is the main reason we want to have PNVI and not
> track provenance through integers.

I understand that.

Richard.

> Best,
> Martin
>
>
