On Wed, Sep 16, 2020 at 10:40:03PM +0200, Philippe Mathieu-Daudé wrote:

I think the current use of volatile is fine. It's widely used in device
drivers (see Linux and DPDK) so I'm not sure eliminating it is
worthwhile.

> Follow docs/devel/atomics.rst guidelines and use atomic operations.
> 
> Cc: Paolo Bonzini <pbonz...@redhat.com>
> Cc: Richard Henderson <r...@twiddle.net>
> Cc: Dr. David Alan Gilbert <dgilb...@redhat.com>
> Signed-off-by: Philippe Mathieu-Daudé <phi...@redhat.com>
> ---
>  block/nvme.c | 38 ++++++++++++++++++++------------------
>  1 file changed, 20 insertions(+), 18 deletions(-)
> 
> diff --git a/block/nvme.c b/block/nvme.c
> index b91749713e0..be80ea1f410 100644
> --- a/block/nvme.c
> +++ b/block/nvme.c
> @@ -46,7 +46,7 @@ typedef struct {
>      uint8_t  *queue;
>      uint64_t iova;
>      /* Hardware MMIO register */
> -    volatile uint32_t *doorbell;
> +    uint32_t *doorbell;
>  } NVMeQueue;
>  
>  typedef struct {
> @@ -82,7 +82,7 @@ typedef struct {
>  } NVMeQueuePair;
>  
>  /* Memory mapped registers */
> -typedef volatile struct {
> +typedef struct {
>      NvmeBar ctrl;
>      struct {
>          uint32_t sq_tail;
> @@ -273,8 +273,7 @@ static void nvme_kick(NVMeQueuePair *q)
>      trace_nvme_kick(s, q->index);
>      assert(!(q->sq.tail & 0xFF00));
>      /* Fence the write to submission queue entry before notifying the 
> device. */
> -    smp_wmb();
> -    *q->sq.doorbell = cpu_to_le32(q->sq.tail);
> +    atomic_rcu_set(q->sq.doorbell, cpu_to_le32(q->sq.tail));

I suggest using smp_wmb()/smp_rmb() instead of atomic operations since
operation doesn't actually need to be atomic.

This hunk can be dropped from the patch, the existing behavior is
already correct.

> @@ -734,10 +731,11 @@ static int nvme_init(BlockDriverState *bs, const char 
> *device, int namespace,
>      timeout_ms = MIN(500 * ((cap >> 24) & 0xFF), 30000);
>  
>      /* Reset device to get a clean state. */
> -    s->regs->ctrl.cc &= const_le32(0xFE);
> +    atomic_set(&s->regs->ctrl.cc,
> +               cpu_to_le32(atomic_read(&s->regs->ctrl.cc) & 
> const_le32(0xFE)));
>      /* Wait for CSTS.RDY = 0. */
>      deadline = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + timeout_ms * 
> SCALE_MS;
> -    while (s->regs->ctrl.csts & const_le32(0x1)) {
> +    while (atomic_read(&s->regs->ctrl.csts) & const_le32(0x1)) {
>          if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
>              error_setg(errp, "Timeout while waiting for device to reset (%"
>                               PRId64 " ms)",

Linux drivers use readl()/writel() to perform memory loads/stores with
appropriate constraints for MMIO accesses (e.g. the instructions cannot
be optimized by the compiler). QEMU lacks an API like this because it
didn't contain userspace drivers before block/nvme.c.

The semantics needed here are that the compiler must perform the memory
access and cannot optimize it.

Please introduce an API for hardware register accesses instead of
(ab)using atomics.

Stefan

Attachment: signature.asc
Description: PGP signature

Reply via email to