On Tue, Jul 19, 2022 at 4:42 PM Eugenio Perez Martin <epere...@redhat.com> wrote:
>
> On Tue, Jul 19, 2022 at 9:38 AM Jason Wang <jasow...@redhat.com> wrote:
> >
> >
> > On 2022/7/16 01:05, Eugenio Perez Martin wrote:
> > > On Fri, Jul 15, 2022 at 10:48 AM Jason Wang <jasow...@redhat.com> wrote:
> > >> On Fri, Jul 15, 2022 at 1:39 PM Eugenio Perez Martin
> > >> <epere...@redhat.com> wrote:
> > >>> On Fri, Jul 15, 2022 at 5:59 AM Jason Wang <jasow...@redhat.com> wrote:
> > >>>> On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <epere...@redhat.com>
> > >>>> wrote:
> > >>>>> It allows the Shadow Control VirtQueue to wait for the device to use
> > >>>>> the available buffers.
> > >>>>>
> > >>>>> Signed-off-by: Eugenio Pérez <epere...@redhat.com>
> > >>>>> ---
> > >>>>>  hw/virtio/vhost-shadow-virtqueue.h |  1 +
> > >>>>>  hw/virtio/vhost-shadow-virtqueue.c | 22 ++++++++++++++++++++++
> > >>>>>  2 files changed, 23 insertions(+)
> > >>>>>
> > >>>>> diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
> > >>>>> index 1692541cbb..b5c6e3b3b4 100644
> > >>>>> --- a/hw/virtio/vhost-shadow-virtqueue.h
> > >>>>> +++ b/hw/virtio/vhost-shadow-virtqueue.h
> > >>>>> @@ -89,6 +89,7 @@ void vhost_svq_push_elem(VhostShadowVirtqueue *svq, const SVQElement *elem,
> > >>>>>  int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
> > >>>>>                    size_t out_num, const struct iovec *in_sg, size_t in_num,
> > >>>>>                    SVQElement *elem);
> > >>>>> +size_t vhost_svq_poll(VhostShadowVirtqueue *svq);
> > >>>>>
> > >>>>>  void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
> > >>>>>  void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd);
> > >>>>> diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
> > >>>>> index 5244896358..31a267f721 100644
> > >>>>> --- a/hw/virtio/vhost-shadow-virtqueue.c
> > >>>>> +++ b/hw/virtio/vhost-shadow-virtqueue.c
> > >>>>> @@ -486,6 +486,28 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
> > >>>>>      } while (!vhost_svq_enable_notification(svq));
> > >>>>>  }
> > >>>>>
> > >>>>> +/**
> > >>>>> + * Poll the SVQ for one device used buffer.
> > >>>>> + *
> > >>>>> + * This function races with main event loop SVQ polling, so extra
> > >>>>> + * synchronization is needed.
> > >>>>> + *
> > >>>>> + * Return the length written by the device.
> > >>>>> + */
> > >>>>> +size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
> > >>>>> +{
> > >>>>> +    do {
> > >>>>> +        uint32_t len;
> > >>>>> +        SVQElement *elem = vhost_svq_get_buf(svq, &len);
> > >>>>> +        if (elem) {
> > >>>>> +            return len;
> > >>>>> +        }
> > >>>>> +
> > >>>>> +        /* Make sure we read new used_idx */
> > >>>>> +        smp_rmb();
> > >>>> There's already one smp_rmb() in vhost_svq_get_buf(). So this seems
> > >>>> useless?
> > >>>>
> > >>> That rmb is after checking for new entries with (vq->last_used_idx !=
> > >>> svq->shadow_used_idx), to avoid reordering the used_idx read with the
> > >>> read of the actual used entry. So my understanding is that the
> > >>> compiler is free to skip that check within the while loop.
> > >> What do you mean by "that check" here?
> > >>
> > > The first check of (presumably cached) last_used_idx !=
> > > shadow_used_idx, right before the SVQ's vring fetch of
> > > svq->vring.used->idx.
> > >
> > >>> Maybe the right solution is to add it in vhost_svq_more_used after
> > >>> the condition (vq->last_used_idx != svq->shadow_used_idx) is false?
> > >> I'm not sure I get the goal of the smp_rmb() here. What barrier does
> > >> it pair with?
> > >>
> > > It pairs with the actual device memory write.
> > >
> > > Note that I'm worried about compiler optimizations or reordering
> > > causing an infinite loop, not about whether the memory is updated
> > > properly.
> > >
> > > Let's say we just returned from vhost_svq_add, and avail_idx -
> > > used_idx > 0.
> > > The device still did not update the SVQ vring used idx, and
> > > qemu is very fast, so it completes a full call of vhost_svq_get_buf
> > > before the device is able to increment the used index. We can trace
> > > through all of vhost_svq_get_buf without hitting a memory barrier.
> > >
> > > If the compiler inlines enough and we delete the new smp_rmb barrier,
> > > this is what it sees:
> > >
> > > size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
> > > {
> > >     do {
> > >         more_used = false;
> > >         /* The next conditional returns false */
> > >         if (svq->last_used_idx != svq->shadow_used_idx) {
> > >             goto useful;
> > >         }
> > >
> > >         svq->shadow_used_idx = cpu_to_le16(svq->vring.used->idx);
> > >
> > >         /* The next conditional returns false too */
> > >         if (!(svq->last_used_idx != svq->shadow_used_idx)) {
> > >             continue;
> > >         }
> > >
> > > useful:
> > >         /* actual code to handle the new used buffer */
> > >         break;
> > >     } while (true);
> > > }
> > >
> > > And qemu does not need to read any of those variables again
> > > (svq->vring.used->idx, svq->last_used_idx, svq->shadow_used_idx),
> > > since nothing modifies them within the loop before the "useful" tag.
> > > So it could freely rewrite this as:
> > >
> > > size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
> > > {
> > >     if (svq->last_used_idx == svq->shadow_used_idx &&
> > >         svq->last_used_idx == svq->vring.used->idx) {
> > >         for (;;);
> > >     }
> > > }
> > >
> > > That's why I think the right place for the mb is right after the
> > > caller sees (potentially cached) last_used_idx == shadow_used_idx,
> > > and it needs to read a value paired with the "device's mb" in the
> > > SVQ vring.
> > >
> >
> > I think you need "volatile" instead of the memory barriers.
>
> The kernel's READ_ONCE implementation uses a volatile cast but also
> a memory read barrier after it.
This sounds strange; volatile should not be tied to any barriers. And
this is what I've seen in the kernel's code:

/*
 * Use __READ_ONCE() instead of READ_ONCE() if you do not require any
 * atomicity. Note that this may result in tears!
 */
#ifndef __READ_ONCE
#define __READ_ONCE(x) (*(const volatile __unqual_scalar_typeof(x) *)&(x))
#endif

Thanks

> I guess it's because the compiler can
> reorder non-volatile accesses around volatile ones. In the proposed
> code, that barrier is provided by the vhost_svq_more_used caller
> (vhost_svq_get_buf). I think no other caller should need it.
>
> Thanks!
>
> > If I understand correctly, you want the load from memory instead of
> > from a register here.
> >
> > Thanks
> >
> > >
> > >
> > > We didn't have that problem before, since we clear the event_notifier
> > > right before the do{}while(), and the event loop should hit a memory
> > > barrier in the next select / poll / read / whatever syscall to check
> > > that the event notifier fd is set again.
> > >
> > >> Since we are in the busy loop, we will read the new used_idx for
> > >> sure,
> > > I'm not so sure of that, but maybe I've missed something.
> > >
> > > I'm sending v3 with this comment pending, so we can iterate faster.
> > >
> > > Thanks!
> > >
> > >> and we can't forecast when the used_idx is committed to memory.
> > >>
> > >
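[Editor's note: the "cached index vs. volatile re-read" concern in the thread can be shown in a standalone sketch. This is not QEMU's code; the struct, field, and function names below are illustrative, and the macro is a simplified stand-in for the kernel's __READ_ONCE. The point is only that the volatile cast forces a fresh load of the device-owned index on every call, which is what forbids the compiler from caching it in a register and spinning forever.]

```c
#include <assert.h>
#include <stdint.h>

/* Simplified READ_ONCE-style macro: the volatile-qualified lvalue forces
 * the compiler to emit a real memory load each time it is evaluated. */
#define READ_IDX_ONCE(x) (*(const volatile uint16_t *)&(x))

/* Toy stand-in for the shadow virtqueue state under discussion. */
struct toy_svq {
    uint16_t last_used_idx;   /* consumer's private progress counter  */
    uint16_t shadow_used_idx; /* cached copy of the device used index */
    uint16_t used_idx;        /* written by the "device"              */
};

/* Returns nonzero once the device has published a new used entry.
 * With a plain (non-volatile) read of used_idx, a compiler that inlines
 * this into a busy loop could legally hoist the load out of the loop;
 * the volatile read makes each call observe the current value. */
static int toy_more_used(struct toy_svq *svq)
{
    if (svq->last_used_idx != svq->shadow_used_idx) {
        return 1;
    }
    svq->shadow_used_idx = READ_IDX_ONCE(svq->used_idx);
    return svq->last_used_idx != svq->shadow_used_idx;
}
```

Note that, as the thread says, volatile alone only fixes the infinite-loop hazard; ordering the index read against the read of the used ring entry still needs a read barrier.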
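[Editor's note: the barrier pairing Eugenio describes ("it pairs with the actual device memory write") can also be sketched in isolation. The sketch below uses C11 fences as stand-ins for the kernel's smp_wmb()/smp_rmb(); all names are illustrative, not QEMU's. The device writes the payload and then publishes the index behind a write fence; the polling side reads the index and then, behind a read fence, the payload.]

```c
#include <stdatomic.h>
#include <stdint.h>

static uint32_t payload_len;       /* data written by the "device"   */
static _Atomic uint16_t used_idx;  /* publication index, starts at 0 */

/* Device side: write the data, then publish the index.
 * The release fence plays the role of smp_wmb(). */
static void device_publish(uint32_t len)
{
    payload_len = len;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&used_idx, 1, memory_order_relaxed);
}

/* Driver side: busy-poll the index, then read the data.
 * The atomic load acts like READ_ONCE() (it cannot be cached in a
 * register across iterations), and the acquire fence plays the role
 * of smp_rmb(), pairing with the device's write fence so the payload
 * read cannot be satisfied with a stale value. */
static uint32_t driver_poll(void)
{
    while (atomic_load_explicit(&used_idx, memory_order_relaxed) == 0) {
        /* busy wait */
    }
    atomic_thread_fence(memory_order_acquire);
    return payload_len;
}
```

In this framing the debate in the thread is about where the acquire-side fence belongs on the polling path, not about whether one is needed at all.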