On Wed, Sep 16, 2020 at 10:40:03PM +0200, Philippe Mathieu-Daudé wrote: I think the current use of volatile is fine. It's widely used in device drivers (see Linux and DPDK) so I'm not sure eliminating it is worthwhile.
> Follow docs/devel/atomics.rst guidelines and use atomic operations. > > Cc: Paolo Bonzini <pbonz...@redhat.com> > Cc: Richard Henderson <r...@twiddle.net> > Cc: Dr. David Alan Gilbert <dgilb...@redhat.com> > Signed-off-by: Philippe Mathieu-Daudé <phi...@redhat.com> > --- > block/nvme.c | 38 ++++++++++++++++++++------------------ > 1 file changed, 20 insertions(+), 18 deletions(-) > > diff --git a/block/nvme.c b/block/nvme.c > index b91749713e0..be80ea1f410 100644 > --- a/block/nvme.c > +++ b/block/nvme.c > @@ -46,7 +46,7 @@ typedef struct { > uint8_t *queue; > uint64_t iova; > /* Hardware MMIO register */ > - volatile uint32_t *doorbell; > + uint32_t *doorbell; > } NVMeQueue; > > typedef struct { > @@ -82,7 +82,7 @@ typedef struct { > } NVMeQueuePair; > > /* Memory mapped registers */ > -typedef volatile struct { > +typedef struct { > NvmeBar ctrl; > struct { > uint32_t sq_tail; > @@ -273,8 +273,7 @@ static void nvme_kick(NVMeQueuePair *q) > trace_nvme_kick(s, q->index); > assert(!(q->sq.tail & 0xFF00)); > /* Fence the write to submission queue entry before notifying the > device. */ > - smp_wmb(); > - *q->sq.doorbell = cpu_to_le32(q->sq.tail); > + atomic_rcu_set(q->sq.doorbell, cpu_to_le32(q->sq.tail)); I suggest using smp_wmb()/smp_rmb() instead of atomic operations since operation doesn't actually need to be atomic. This hunk can be dropped from the patch, the existing behavior is already correct. > @@ -734,10 +731,11 @@ static int nvme_init(BlockDriverState *bs, const char > *device, int namespace, > timeout_ms = MIN(500 * ((cap >> 24) & 0xFF), 30000); > > /* Reset device to get a clean state. */ > - s->regs->ctrl.cc &= const_le32(0xFE); > + atomic_set(&s->regs->ctrl.cc, > + cpu_to_le32(atomic_read(&s->regs->ctrl.cc) & > const_le32(0xFE))); > /* Wait for CSTS.RDY = 0. */ > deadline = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + timeout_ms * > SCALE_MS; > - while (s->regs->ctrl.csts & const_le32(0x1)) { > + while (atomic_read(&s->regs->ctrl.csts) & const_le32(0x1)) { > if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) { > error_setg(errp, "Timeout while waiting for device to reset (%" > PRId64 " ms)", Linux drivers use readl()/writel() to perform memory loads/stores with appropriate constraints for MMIO accesses (e.g. the instructions cannot be optimized by the compiler). QEMU lacks an API like this because it didn't contain userspace drivers before block/nvme.c. The semantics needed here are that the compiler must perform the memory access and cannot optimize it. Please introduce an API for hardware register accesses instead of (ab)using atomics. Stefan
signature.asc
Description: PGP signature