On Mon, Sep 16, 2019 at 01:17:53PM +0800, Boqun Feng wrote:
> Hi Alan,
>
> I spent some time reading this, really helpful! Thanks.
>
> Please see comments below:
>
> On Fri, Sep 06, 2019 at 02:11:29PM -0400, Alan Stern wrote:
> [...]
> > If two memory accesses aren't concurrent then one must execute before
> > the other. Therefore the LKMM decides two accesses aren't concurrent
> > if they can be connected by a sequence of hb, pb, and rb links
> > (together referred to as xb, for "executes before"). However, there
> > are two complicating factors.
> >
> > If X is a load and X executes before a store Y, then indeed there is
> > no danger of X and Y being concurrent. After all, Y can't have any
> > effect on the value obtained by X until the memory subsystem has
> > propagated Y from its own CPU to X's CPU, which won't happen until
> > some time after Y executes and thus after X executes. But if X is a
> > store, then even if X executes before Y it is still possible that X
> > will propagate to Y's CPU just as Y is executing. In such a case X
> > could very well interfere somehow with Y, and we would have to
> > consider X and Y to be concurrent.
> >
> > Therefore when X is a store, for X and Y to be non-concurrent the LKMM
> > requires not only that X must execute before Y but also that X must
> > propagate to Y's CPU before Y executes. (Or vice versa, of course, if
> > Y executes before X -- then Y must propagate to X's CPU before X
> > executes if Y is a store.) This is expressed by the visibility
> > relation (vis), where X ->vis Y is defined to hold if there is an
> > intermediate event Z such that:
> >
> >         X is connected to Z by a possibly empty sequence of
> >         cumul-fence links followed by an optional rfe link (if none of
> >         these links are present, X and Z are the same event),
> >
>
> I wonder whether we could add an optional ->coe or ->fre between X and
> the possibly empty sequence of cumul-fence, similar to the definition
> of ->prop. This makes sense because if we have
>
>         X ->coe X' (and X' is on CPU C)
>
> then X must already have been propagated to C before X' executed,
> according to our operational model:
>
>         "... In particular, it must arrange for the store to be co-later than
>         (i.e., to overwrite) any other store to the same location which has
>         already propagated to CPU C."
>
> In other words, we can define ->vis as:
>
>         let vis = prop ; ((strong-fence ; [Marked] ; xbstar) | (xbstar & int))
>
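
To make the ->coe observation above concrete, here is a minimal
litmus-style sketch; the test name, the event labels X, X' and Y, and
the exists clause are illustrative additions, not something taken from
explanation.txt:

        C coe-vis-sketch

        (*
         * In executions where x ends up 2, P0's store X is coherence-before
         * P1's store X' (X ->coe X').  Per the operational-model text quoted
         * above, X must then have been propagated to P1 before X' executed,
         * and the load Y is po-after X' on the same CPU.
         *)

        {}

        P0(int *x)
        {
                WRITE_ONCE(*x, 1);      /* X  */
        }

        P1(int *x, int *y)
        {
                int r1;

                WRITE_ONCE(*x, 2);      /* X' */
                r1 = READ_ONCE(*y);     /* Y  */
        }

        exists (x=2)
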
Hmm.. so the problem with this approach is that the (xbstar & int) part
doesn't satisfy the requirement of visibility... i.e. X ->prop Z
->(xbstar & int) Y may not guarantee that, when Y executes, X has
already been propagated to Y's CPU.

Lemme think more on this...

Regards,
Boqun

> and, for this document, reference the "prop" section in
> explanation.txt. This could make the model simpler (both the
> description and the explanation), and a further benefit is that we
> won't need this commit in Paul's dev branch:
>
>         c683f2c807d2 "tools/memory-model: Fix data race detection for unordered
>         store and load"
>
> and the data race rules will look more symmetrical ;-)
>
> Thoughts? Or am I missing something subtle here?
>
> Regards,
> Boqun
>
> > and either:
> >
> >         Z is connected to Y by a strong-fence link followed by a
> >         possibly empty sequence of xb links,
> >
> > or:
> >
> >         Z is on the same CPU as Y and is connected to Y by a possibly
> >         empty sequence of xb links (again, if the sequence is empty it
> >         means Z and Y are the same event).
> >
> > The motivations behind this definition are straightforward:
> >
> >         cumul-fence memory barriers force stores that are po-before
> >         the barrier to propagate to other CPUs before stores that are
> >         po-after the barrier.
> >
> >         An rfe link from an event W to an event R says that R reads
> >         from W, which certainly means that W must have propagated to
> >         R's CPU before R executed.
> >
> >         strong-fence memory barriers force stores that are po-before
> >         the barrier, or that propagate to the barrier's CPU before the
> >         barrier executes, to propagate to all CPUs before any events
> >         po-after the barrier can execute.
> >
> > To see how this works out in practice, consider our old friend, the MP
> > pattern (with fences and statement labels, but without the conditional
> > test):
> >
> >         int buf = 0, flag = 0;
> >
> >         P0()
> >         {
> >                 X: WRITE_ONCE(buf, 1);
> >                 smp_wmb();
> >                 W: WRITE_ONCE(flag, 1);
> >         }
> >
> >         P1()
> >         {
> >                 int r1;
> >                 int r2 = 0;
> >
> >                 Z: r1 = READ_ONCE(flag);
> >                 smp_rmb();
> >                 Y: r2 = READ_ONCE(buf);
> >         }
> >
> > The smp_wmb() memory barrier gives a cumul-fence link from X to W, and
> > assuming r1 = 1 at the end, there is an rfe link from W to Z. This
> > means that the store to buf must propagate from P0 to P1 before Z
> > executes. Next, Z and Y are on the same CPU and the smp_rmb() fence
> > provides an xb link from Z to Y (i.e., it forces Z to execute before
> > Y). Therefore we have X ->vis Y: X must propagate to Y's CPU before Y
> > executes.
> >
> [...]
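
As a reference, the MP pattern quoted above can be written out as a
self-contained litmus test in the style of the ones under
tools/memory-model/litmus-tests/; the test name and the exists clause
below are illustrative additions:

        C MP+wmb+rmb-sketch

        (*
         * The r1=1 /\ r2=0 outcome should be forbidden: as argued in the
         * quoted text, X ->vis Y, so by the time Y executes the store to buf
         * has already reached P1.
         *)

        {}

        P0(int *buf, int *flag)
        {
                WRITE_ONCE(*buf, 1);    /* X */
                smp_wmb();
                WRITE_ONCE(*flag, 1);   /* W */
        }

        P1(int *buf, int *flag)
        {
                int r1;
                int r2;

                r1 = READ_ONCE(*flag);  /* Z */
                smp_rmb();
                r2 = READ_ONCE(*buf);   /* Y */
        }

        exists (1:r1=1 /\ 1:r2=0)

If I'm not mistaken, herd7 with linux-kernel.cat reports this outcome
as never reachable, which matches the X ->vis Y argument above.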