On Fri, Nov 22, 2013 at 02:46:30PM +0100, Peter Zijlstra wrote:
> On Thu, Nov 21, 2013 at 10:02:37AM -0800, Paul E. McKenney wrote:
> > > --- a/kernel/events/ring_buffer.c
> > > +++ b/kernel/events/ring_buffer.c
> > 
> > My patch does not cover this file.  Wouldn't hurt for them to be
> > separate.
> 
> Oh sure, but I wanted to present the RFC with at least one working
> example to illustrate why I even bother and to aid in discussion.
> 
> > > @@ -62,18 +62,18 @@ static void perf_output_put_handle(struc
> > >    *   kernel                             user
> > >    *
> > >    *   READ ->data_tail                   READ ->data_head
> > > -  *   smp_mb()   (A)                     smp_rmb()       (C)
> > > +  *   barrier()  (A)                     smp_rmb()       (C)
> > 
> > We need a conditional for this to work.  I know that the required
> > conditional is there in the code, but we need it explicitly in this
> > example as well.
> 
> Agreed, I skimped on that because I didn't quite know how to write that
> best.

Indeed, we still seem to be converging on that.

> How about the below version?

Much better!  Might even be correct.  ;-)

                                                        Thanx, Paul

> ---
> --- a/kernel/events/ring_buffer.c
> +++ b/kernel/events/ring_buffer.c
> @@ -61,19 +61,20 @@ static void perf_output_put_handle(struc
>        *
>        *   kernel                             user
>        *
> -      *   READ ->data_tail                   READ ->data_head
> -      *   smp_mb()   (A)                     smp_rmb()       (C)
> -      *   WRITE $data                        READ $data
> -      *   smp_wmb()  (B)                     smp_mb()        (D)
> -      *   STORE ->data_head                  WRITE ->data_tail
> +      *   if (LOAD ->data_tail) {            LOAD ->data_head
> +      *                      (A)             smp_rmb()       (C)
> +      *      STORE $data                     LOAD $data
> +      *      smp_wmb()       (B)             smp_mb()        (D)
> +      *      STORE ->data_head               STORE ->data_tail
> +      *   }
>        *
>        * Where A pairs with D, and B pairs with C.
>        *
> -      * I don't think A needs to be a full barrier because we won't in fact
> -      * write data until we see the store from userspace. So we simply don't
> -      * issue the data WRITE until we observe it. Be conservative for now.
> +      * In our case (A) is a control dependency that separates the load of
> +      * the ->data_tail and the stores of $data. In case ->data_tail
> +      * indicates there is no room in the buffer to store $data we do not.
>        *
> -      * OTOH, D needs to be a full barrier since it separates the data READ
> +      * D needs to be a full barrier since it separates the data READ
>        * from the tail WRITE.
>        *
>        * For B a WMB is sufficient since it separates two WRITEs, and for C
> @@ -81,7 +82,7 @@ static void perf_output_put_handle(struc
>        *
>        * See perf_output_begin().
>        */
> -     smp_wmb();
> +     smp_wmb(); /* B, matches C */
>       rb->user_page->data_head = head;
> 
>       /*
> @@ -144,17 +145,26 @@ int perf_output_begin(struct perf_output
>               if (!rb->overwrite &&
>                   unlikely(CIRC_SPACE(head, tail, perf_data_size(rb)) < size))
>                       goto fail;
> +
> +             /*
> +              * The above forms a control dependency barrier separating the
> +              * @tail load above from the data stores below. Since the @tail
> +              * load is required to compute the branch to fail below.
> +              *
> +              * A, matches D; the full memory barrier userspace SHOULD issue
> +              * after reading the data and before storing the new tail
> +              * position.
> +              *
> +              * See perf_output_put_handle().
> +              */
> +
>               head += size;
>       } while (local_cmpxchg(&rb->head, offset, head) != offset);
> 
>       /*
> -      * Separate the userpage->tail read from the data stores below.
> -      * Matches the MB userspace SHOULD issue after reading the data
> -      * and before storing the new tail position.
> -      *
> -      * See perf_output_put_handle().
> +      * We rely on the implied barrier() by local_cmpxchg() to ensure
> +      * none of the data stores below can be lifted up by the compiler.
>        */
> -     smp_mb();
> 
>       if (unlikely(head - local_read(&rb->wakeup) > rb->watermark))
>               local_add(rb->watermark, &rb->wakeup);
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to