On Fri, Oct 06, 2017 at 11:07:23AM +0200, Peter Zijlstra wrote:
> On Thu, Oct 05, 2017 at 11:22:04AM -0700, Paul E. McKenney wrote:
> > Hmmm...  Here is what I was worried about:
> > 
> >     C C-PaulEMcKenney-W+RWC4+2017-10-05
> > 
> >     {
> >     }
> > 
> >     P0(int *a, int *x)
> >     {
> >             WRITE_ONCE(*a, 1);
> >             smp_mb(); /* Lock acquisition for rcu_node ->lock. */
> >             WRITE_ONCE(*x, 1);
> >     }
> > 
> >     P1(int *x, int *y, spinlock_t *l)
> >     {
> >             r3 = READ_ONCE(*x);
> >             smp_mb(); /* Lock acquisition for rcu_node ->lock. */
> >             spin_lock(l); /* Locking in complete(). */
> >             WRITE_ONCE(*y, 1);
> >             spin_unlock(l);
> >     }
> > 
> >     P2(int *y, int *b, spinlock_t *l)
> >     {
> >             spin_lock(l); /* Locking in wait_for_completion. */
> >             r4 = READ_ONCE(*y);
> >             spin_unlock(l);
> >             r1 = READ_ONCE(*b);
> >     }
> > 
> >     P3(int *b, int *a)
> >     {
> >             WRITE_ONCE(*b, 1);
> >             smp_mb();
> >             r2 = READ_ONCE(*a);
> >     }
> > 
> >     exists (1:r3=1 /\ 2:r4=1 /\ 2:r1=0 /\ 3:r2=0)
> 
> /me goes and install this herd thing again.. I'm sure I had it running
> _somewhere_.. A well.
> 
>       C C-PaulEMcKenney-W+RWC4+2017-10-05
> 
>       {
>       }
> 
>       P0(int *a, int *x)
>       {
>               WRITE_ONCE(*a, 1);
>               smp_mb(); /* Lock acquisition for rcu_node ->lock. */
>               WRITE_ONCE(*x, 1);
>       }
> 
>       P1(int *x, int *y)
>       {
>               r3 = READ_ONCE(*x);
>               smp_mb(); /* Lock acquisition for rcu_node ->lock. */
>               smp_store_release(y, 1);
>       }
> 
>       P2(int *y, int *b)
>       {
>               r4 = smp_load_acquire(y);
>               r1 = READ_ONCE(*b);
>       }
> 
>       P3(int *b, int *a)
>       {
>               WRITE_ONCE(*b, 1);
>               smp_mb();
>               r2 = READ_ONCE(*a);
>       }
> 
>       exists (1:r3=1 /\ 2:r4=1 /\ 2:r1=0 /\ 3:r2=0)
> 
> 
> Is what I was thinking of, I think that is the minimal ordering
> complete()/wait_for_completion() need to provide.

OK, I will bite...  What do the smp_store_release() and the
smp_load_acquire() correspond to?  I see just plain locking in
wait_for_completion() and complete().

> (also, that r# numbering confuses the hell out of me, its not related to
> P nor to the variables)

Yeah, it is random, sorry!!!

> Test C-PaulEMcKenney-W+RWC4+2017-10-05 Allowed
> States 15
> 1:r3=0; 2:r1=0; 2:r4=0; 3:r2=0;
> 1:r3=0; 2:r1=0; 2:r4=0; 3:r2=1;
> 1:r3=0; 2:r1=0; 2:r4=1; 3:r2=0;
> 1:r3=0; 2:r1=0; 2:r4=1; 3:r2=1;
> 1:r3=0; 2:r1=1; 2:r4=0; 3:r2=0;
> 1:r3=0; 2:r1=1; 2:r4=0; 3:r2=1;
> 1:r3=0; 2:r1=1; 2:r4=1; 3:r2=0;
> 1:r3=0; 2:r1=1; 2:r4=1; 3:r2=1;
> 1:r3=1; 2:r1=0; 2:r4=0; 3:r2=0;
> 1:r3=1; 2:r1=0; 2:r4=0; 3:r2=1;
> 1:r3=1; 2:r1=0; 2:r4=1; 3:r2=1;
> 1:r3=1; 2:r1=1; 2:r4=0; 3:r2=0;
> 1:r3=1; 2:r1=1; 2:r4=0; 3:r2=1;
> 1:r3=1; 2:r1=1; 2:r4=1; 3:r2=0;
> 1:r3=1; 2:r1=1; 2:r4=1; 3:r2=1;
> No
> Witnesses
> Positive: 0 Negative: 15
> Condition exists (1:r3=1 /\ 2:r4=1 /\ 2:r1=0 /\ 3:r2=0)
> Observation C-PaulEMcKenney-W+RWC4+2017-10-05 Never 0 15
> Time C-PaulEMcKenney-W+RWC4+2017-10-05 0.04
> Hash=f7f8ad6eab33e90718a394bcb021557d

But yes, looking closer, this corresponds to the rule of thumb about
non-rf relations and full memory barriers.  We have two non-rf relations
(P2->P3 and P3->P0), so we need two full barriers, one each between the
non-rf relations.

So I dropped that patch yesterday.  The main thing I was missing was
that there is no ordering-free fastpath in wait_for_completion() and
complete(): Each unconditionally acquires the lock.  So the smp_mb()
that I was trying to add doesn't need to be there.

                                                        Thanx, Paul

Reply via email to