On 23/04/2012 15:19, Michael S. Tsirkin wrote:
> virtio has the equivalent of:
>
>     if (vq->last_avail_index != vring_avail_idx(vq)) {
>         read descriptor head at vq->last_avail_index;
>     }
>
> In theory, the processor can reorder the descriptor head read so
> that it happens speculatively before the index read.  This would
> trigger the following race:
>
>     host descriptor head read  <- reads invalid head from ring
>     guest writes valid descriptor head
>     guest writes avail index
>     host avail index read      <- observes valid index
>
> As a result, the host will use an invalid head value.  This was not
> observed in the field by me, but after the experience with the
> previous two races I think it is prudent to address this theoretical
> race condition.

I think your fix is right, but it is not needed on x86: lfence is only
required for weakly-ordered memory types.  So says Intel, and AMD is
even clearer: "Loads from differing memory types may be performed out
of order, in particular between WC/WC+ and other memory types".
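For anyone following the thread, this is the standard message-passing
idiom: the consumer's read barrier pairs with a write barrier on the
producer side.  A minimal sketch with made-up names (not the actual
virtio structures):

    /* guest (producer) */
    ring[head] = desc;            /* 1. fill in the descriptor          */
    smp_wmb();                    /* 2. descriptor visible before index */
    avail_idx = head + 1;         /* 3. publish the new index           */

    /* host (consumer) */
    if (last_avail_idx != avail_idx) {  /* 4. read the index            */
        smp_rmb();                      /* 5. no data read before index */
        desc = ring[last_avail_idx];    /* 6. descriptor is now valid   */
    }

Without the barrier at step 5, the load at step 6 may be satisfied
speculatively before the load at step 4, which is exactly the race
described above.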
> Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
> ---
>  hw/virtio.c    | 5 +++++
>  qemu-barrier.h | 7 ++++++-
>  2 files changed, 11 insertions(+), 1 deletions(-)
>
> diff --git a/hw/virtio.c b/hw/virtio.c
> index def0bf1..c081e1b 100644
> --- a/hw/virtio.c
> +++ b/hw/virtio.c
> @@ -287,6 +287,11 @@ static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
>                       idx, vring_avail_idx(vq));
>          exit(1);
>      }
> +    /* On success, callers read a descriptor at vq->last_avail_idx.
> +     * Make sure descriptor read does not bypass avail index read. */
> +    if (num_heads) {
> +        smp_rmb();
> +    }
>
>      return num_heads;
>  }
> diff --git a/qemu-barrier.h b/qemu-barrier.h
> index f6722a8..39aa0b0 100644
> --- a/qemu-barrier.h
> +++ b/qemu-barrier.h
> @@ -24,10 +24,13 @@
>  #define smp_mb() asm volatile("lock; addl $0,0(%%esp) " ::: "memory")
>  #endif
>
> +#define smp_rmb() smp_mb()
> +
>  #elif defined(__x86_64__)
>
>  #define smp_wmb() barrier()
>  #define smp_mb() asm volatile("mfence" ::: "memory")
> +#define smp_rmb() asm volatile("lfence" ::: "memory")
>
>  #elif defined(_ARCH_PPC)
>
> @@ -38,6 +41,7 @@
>   */
>  #define smp_wmb() asm volatile("eieio" ::: "memory")
>  #define smp_mb() asm volatile("eieio" ::: "memory")
> +#define smp_rmb() asm volatile("eieio" ::: "memory")

rmb() is lwsync on PPC, not eieio.

I would be grateful if, instead of fixing the qemu-barrier.h parts of
the patches, you picked up the (sole) patch in the atomics branch of
git://github.com/bonzini/qemu.git.  The constructs there are more
complete than what we have in qemu-barrier.h, and the memory barriers
are based on http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
(written by people who know much more than me :)).

Paolo

>
>  #else
>
> @@ -45,10 +49,11 @@
>   * For (host) platforms we don't have explicit barrier definitions
>   * for, we use the gcc __sync_synchronize() primitive to generate a
>   * full barrier. This should be safe on all platforms, though it may
> - * be overkill for wmb().
> + * be overkill for wmb() and rmb().
>   */
>  #define smp_wmb() __sync_synchronize()
>  #define smp_mb() __sync_synchronize()
> +#define smp_rmb() __sync_synchronize()
>
>  #endif
>
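P.S. In case a concrete alternative helps the comparison, a sketch of
what the PPC section could look like with lwsync (my reading of the
architecture, untested; the atomics branch above is still the better
route):

    #elif defined(_ARCH_PPC)

    /* eieio orders stores (to cacheable memory) but not loads, so it
     * is fine for smp_wmb() but not for smp_rmb().  lwsync orders
     * load/load, load/store and store/store, which covers a read
     * barrier. */
    #define smp_wmb() asm volatile("eieio" ::: "memory")
    #define smp_rmb() asm volatile("lwsync" ::: "memory")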