Re: Plain accesses and data races in the Linux Kernel Memory Model

Andrea Parri Thu, 17 Jan 2019 07:04:10 -0800

On Wed, Jan 16, 2019 at 10:36:58PM +0100, Andrea Parri wrote:
> [...]
> 
> > The difficulty with incorporating plain accesses in the memory model
> > is that the compiler has very few constraints on how it treats plain
> > accesses.  It can eliminate them, duplicate them, rearrange them,
> > merge them, split them up, and goodness knows what else.  To make some
> > sense of this, I have taken the view that a plain access can exist
> > (perhaps multiple times) within a certain bounded region of code.
> > Ordering of two accesses X and Y means that we guarantee at least one
> > instance of the X access must be executed before any instances of the
> > Y access.  (This is assuming that neither of the accesses is
> > completely eliminated by the compiler; otherwise there is nothing to
> > order!)
> > 
> > After adding some simple definitions for the sets of plain and marked
> > accesses and for compiler barriers, the patch updates the ppo
> > relation.  The basic idea here is that ppo can be broken down into
> > categories: memory barriers, overwrites, and dependencies (including
> > dep-rfi).
> > 
> >     Memory barriers always provide ordering (compiler barriers do
> >     not but they have indirect effects).
> > 
> >     Overwriting always provides ordering.  This may seem
> >     surprising in the case where both X and Y are plain writes,
> >     but in that case the memory model will say that X can be
> >     eliminated unless there is at least a compiler barrier between
> >     X and Y, and this barrier will enforce the ordering.
> > 
> >     Some dependencies provide ordering and some don't.  Going by
> >     cases:
> > 
> >             An address dependency to a read provides ordering when
> >             the source is a marked read, even when the target is a
> >             plain read.  This is necessary if rcu_dereference() is
> >             to work correctly; it is tantamount to assuming that
> >             the compiler never speculates address dependencies.
> >             However, if the source is a plain read then there is
> >             no ordering.  This is because of Alpha, which does not
> >             respect address dependencies to reads (on Alpha,
> >             marked reads include a memory barrier to enforce the
> >             ordering but plain reads do not).
> 
> Can the compiler (maybe, it does?) transform, at the C or at the "asm"
> level, LB1's P0 in LB2's P0 (LB1 and LB2 are reported below)?
> 
> C LB1
> 
> {
>       int *x = &a;
> }
> 
> P0(int **x, int *y)
> {
>       int *r0;
> 
>       r0 = rcu_dereference(*x);
>       *r0 = 0;
>       smp_wmb();
>       WRITE_ONCE(*y, 1);
> }
> 
> P1(int **x, int *y, int *b)
> {
>       int r0;
> 
>       r0 = READ_ONCE(*y);
>       rcu_assign_pointer(*x, b);
> }
> 
> exists (0:r0=b /\ 1:r0=1)
> 
> 
> C LB2
> 
> {
>       int *x = &a;
> }
> 
> P0(int **x, int *y)
> {
>       int *r0;
> 
>       r0 = rcu_dereference(*x);
>       if (*r0)
>               *r0 = 0;
>       smp_wmb();
>       WRITE_ONCE(*y, 1);
> }
> 
> P1(int **x, int *y, int *b)
> {
>       int r0;
> 
>       r0 = READ_ONCE(*y);
>       rcu_assign_pointer(*x, b);
> }
> 
> exists (0:r0=b /\ 1:r0=1)
> 
> LB1 and LB2 are data-race free, according to the patch; LB1's "exists"
> clause is not satisfiable, while LB2's "exists" clause is satisfiable.
> 
> I'm adding Nick to Cc (I never spoke with him, but from what I see in
> LKML, he must understand compiler better than I currently do... ;-/ )
> 
>   Andrea
> 
> 
> > 
> >             An address dependency to a write always provides
> >             ordering.  Neither the compiler nor the CPU can
> >             speculate the address of a write, because a wrong
> >             guess could generate a data race.  (Question: do we
> >             need to include the case where the source is a plain
> >             read?)
> > 
> >             A data or control dependency to a write provides
> >             ordering if the target is a marked write.  This is
> >             because the compiler is obliged to translate a marked
> >             write as a single machine instruction; if it
> >             speculates such a write there will be no opportunity
> >             to correct a mistake.
> > 
> >             Dep-rfi (i.e., a data or address dependency from a
> >             read to a write which is then read from on the same
> >             CPU) provides ordering between the two reads if the
> >             target is a marked read.  This is again because the
> >             marked read will be translated as a machine-level load
> >             instruction, and then the CPU will guarantee the
> >             ordering.
> > 
> >             There is a special case (data;rfi) that doesn't
> >             provide ordering in itself but can contribute to other
> >             orderings.  A data;rfi link corresponds to situations
> >             where a value is stored in a temporary shared variable
> >             and then loaded back again.  Since the compiler might
> >             choose to eliminate the temporary, its accesses can't
> >             be said to be ordered -- but the accesses around it
> >             might be.  As a simple example, consider:
> > 
> >                     r1 = READ_ONCE(ptr);
> >                     tmp = r1;
> >                     r2 = tmp;
> >                     WRITE_ONCE(*r2, 5);
> > 
> >             The plain accesses involving tmp don't have any
> >             particular ordering requirements, but we do know that
> >             the READ_ONCE must be ordered before the WRITE_ONCE.
> >             The chain of relations is:
> > 
> >                     [marked] ; data ; rfi ; addr ; [marked]
> > 
> >             showing that a data;rfi has been inserted into an
> >             address dependency from a marked read to a marked
> >             write.  In general, any number of data;rfi links can
> >             be inserted in each of the other kinds of dependencies.


As a more general comment (disclaimer), I'm not sure we want to/can add
all the constraints above.  On one hand, for some of them, I ignore the
existence of current use cases in the source (and I don't quite see my-
self encouraging their adoption...); on the other hand, these certainly
do not make the model "simpler" or easier to maintain (in a sound way).

Moreover, I doubt that runtime checkers a la KTSan will ever be able to
assist the developer by supporting these "dependency orderings". [1]

Maybe we could start by adding those orderings that we know are "widely"
relied upon _and_ used by the developers, and later add more/strengthen
the model as needed (where feasible).

Thoughts?

  Andrea

[1] https://groups.google.com/d/msg/ktsan/bVZ1c6H2NE0/gapvllYNBQAJ

Re: Plain accesses and data races in the Linux Kernel Memory Model

Reply via email to