On Tue, Jul 19, 2022 at 4:42 PM Eugenio Perez Martin <epere...@redhat.com> wrote:
>
> On Tue, Jul 19, 2022 at 9:38 AM Jason Wang <jasow...@redhat.com> wrote:
> >
> >
> > On 2022/7/16 01:05, Eugenio Perez Martin wrote:
> > > On Fri, Jul 15, 2022 at 10:48 AM Jason Wang <jasow...@redhat.com> wrote:
> > >> On Fri, Jul 15, 2022 at 1:39 PM Eugenio Perez Martin
> > >> <epere...@redhat.com> wrote:
> > >>> On Fri, Jul 15, 2022 at 5:59 AM Jason Wang <jasow...@redhat.com> wrote:
> > >>>> On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <epere...@redhat.com>
> > >>>> wrote:
> > >>>>> It allows the Shadow Control VirtQueue to wait for the device to use
> > >>>>> the available buffers.
> > >>>>>
> > >>>>> Signed-off-by: Eugenio Pérez <epere...@redhat.com>
> > >>>>> ---
> > >>>>>  hw/virtio/vhost-shadow-virtqueue.h |  1 +
> > >>>>>  hw/virtio/vhost-shadow-virtqueue.c | 22 ++++++++++++++++++++++
> > >>>>>  2 files changed, 23 insertions(+)
> > >>>>>
> > >>>>> diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
> > >>>>> index 1692541cbb..b5c6e3b3b4 100644
> > >>>>> --- a/hw/virtio/vhost-shadow-virtqueue.h
> > >>>>> +++ b/hw/virtio/vhost-shadow-virtqueue.h
> > >>>>> @@ -89,6 +89,7 @@ void vhost_svq_push_elem(VhostShadowVirtqueue *svq, const SVQElement *elem,
> > >>>>>  int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
> > >>>>>                    size_t out_num, const struct iovec *in_sg, size_t in_num,
> > >>>>>                    SVQElement *elem);
> > >>>>> +size_t vhost_svq_poll(VhostShadowVirtqueue *svq);
> > >>>>>
> > >>>>>  void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
> > >>>>>  void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd);
> > >>>>> diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
> > >>>>> index 5244896358..31a267f721 100644
> > >>>>> --- a/hw/virtio/vhost-shadow-virtqueue.c
> > >>>>> +++ b/hw/virtio/vhost-shadow-virtqueue.c
> > >>>>> @@ -486,6 +486,28 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
> > >>>>>      } while (!vhost_svq_enable_notification(svq));
> > >>>>>  }
> > >>>>>
> > >>>>> +/**
> > >>>>> + * Poll the SVQ for one device used buffer.
> > >>>>> + *
> > >>>>> + * This function races with main event loop SVQ polling, so extra
> > >>>>> + * synchronization is needed.
> > >>>>> + *
> > >>>>> + * Return the length written by the device.
> > >>>>> + */
> > >>>>> +size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
> > >>>>> +{
> > >>>>> +    do {
> > >>>>> +        uint32_t len;
> > >>>>> +        SVQElement *elem = vhost_svq_get_buf(svq, &len);
> > >>>>> +        if (elem) {
> > >>>>> +            return len;
> > >>>>> +        }
> > >>>>> +
> > >>>>> +        /* Make sure we read new used_idx */
> > >>>>> +        smp_rmb();
> > >>>> There's already one smp_rmb() in vhost_svq_get_buf(). So this seems
> > >>>> useless?
> > >>>>
> > >>> That rmb is after checking for new entries with (vq->last_used_idx !=
> > >>> svq->shadow_used_idx), to avoid reordering the used_idx read with the
> > >>> read of the actual used entry. So my understanding is that the
> > >>> compiler is free to skip that check within the while loop.
> > >> What do you mean by "that check" here?
> > >>
> > > The first check of (presumably cached) last_used_idx !=
> > > shadow_used_idx, right before the SVQ's vring fetch of
> > > svq->vring.used->idx.
> > >
> > >>> Maybe the right solution is to add it in vhost_svq_more_used after
> > >>> the condition (vq->last_used_idx != svq->shadow_used_idx) is false?
> > >> I'm not sure I get the goal of the smp_rmb() here. What barrier does
> > >> it pair with?
> > >>
> > > It pairs with the actual device memory write.
> > >
> > > Note that I'm worried about compiler optimizations or reordering
> > > causing an infinite loop, not about whether the memory is updated
> > > properly.
> > >
> > > Let's say we just returned from vhost_svq_add, and avail_idx -
> > > used_idx > 0.
> > > The device still did not update the SVQ vring used idx, and
> > > qemu is very fast, so it completes a full call of vhost_svq_get_buf
> > > before the device is able to increment the used index. We can trace
> > > through all of vhost_svq_get_buf without hitting a memory barrier.
> > >
> > > If the compiler inlines enough and we delete the new smp_rmb barrier,
> > > this is what it sees:
> > >
> > > size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
> > > {
> > >     do {
> > >         more_used = false;
> > >         /* The next conditional returns false */
> > >         if (svq->last_used_idx != svq->shadow_used_idx) {
> > >             goto useful;
> > >         }
> > >
> > >         svq->shadow_used_idx = cpu_to_le16(svq->vring.used->idx);
> > >
> > >         /* The next conditional returns false too */
> > >         if (!(svq->last_used_idx != svq->shadow_used_idx)) {
> > >             continue;
> > >         }
> > >
> > > useful:
> > >         /* actual code to handle the new used buffer */
> > >         break;
> > >     } while (true);
> > > }
> > >
> > > And qemu does not need to read any of those variables again
> > > (svq->vring.used->idx, svq->last_used_idx, svq->shadow_used_idx),
> > > since nothing modifies them within the loop before the "useful" tag.
> > > So it could freely rewrite this as:
> > >
> > > size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
> > > {
> > >     if (svq->last_used_idx == svq->shadow_used_idx &&
> > >         svq->last_used_idx == svq->vring.used->idx) {
> > >         for (;;);
> > >     }
> > > }
> > >
> > > That's why I think the right place for the mb is right after the
> > > caller sees (potentially cached) last_used_idx == shadow_used_idx,
> > > and it needs to read a value paired with the "device's mb" in the
> > > SVQ vring.
> > >
> >
> > I think you need "volatile" instead of the memory barriers.
>
> The kernel's READ_ONCE implementation uses a volatile cast but also
> a memory read barrier after it.
This sounds strange; volatile should not be tied to any barriers. And
this is what I've seen in the kernel's code:

/*
 * Use __READ_ONCE() instead of READ_ONCE() if you do not require any
 * atomicity. Note that this may result in tears!
 */
#ifndef __READ_ONCE
#define __READ_ONCE(x) (*(const volatile __unqual_scalar_typeof(x) *)&(x))
#endif

Thanks

> I guess it's because the compiler can
> reorder non-volatile accesses around volatile ones. In the proposed
> code, that barrier is provided by the vhost_svq_more_used caller
> (vhost_svq_get_buf). I think no other caller should need it.
>
> Thanks!
>
> > If I understand correctly, you want the load from memory instead of
> > from a register here.
> >
> > Thanks
> >
> > >
> > >
> > > We didn't have that problem before, since we clear the event_notifier
> > > right before the do{}while(), and the event loop should hit a memory
> > > barrier in the next select / poll / read / whatever syscall to check
> > > that the event notifier fd is set again.
> > >
> > >> Since we are in the busy loop, we will read the new used_idx for
> > >> sure,
> > > I'm not so sure of that, but maybe I've missed something.
> > >
> > > I'm sending v3 with this comment pending, so we can iterate faster.
> > >
> > > Thanks!
> > >
> > >> and we can't forecast when the used_idx is committed to memory.
> > >>
> > >
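[Editor's note: the "cached index vs. volatile re-read" concern in the thread can be shown in a standalone sketch. This is not QEMU's code; the struct, field, and function names below are illustrative, and the macro is a simplified stand-in for the kernel's __READ_ONCE. The point is only that the volatile cast forces a fresh load of the device-owned index on every call, which is what forbids the compiler from caching it in a register and spinning forever.]

```c
#include <assert.h>
#include <stdint.h>

/* Simplified READ_ONCE-style macro: the volatile-qualified lvalue forces
 * the compiler to emit a real memory load each time it is evaluated. */
#define READ_IDX_ONCE(x) (*(const volatile uint16_t *)&(x))

/* Toy stand-in for the shadow virtqueue state under discussion. */
struct toy_svq {
    uint16_t last_used_idx;   /* consumer's private progress counter  */
    uint16_t shadow_used_idx; /* cached copy of the device used index */
    uint16_t used_idx;        /* written by the "device"              */
};

/* Returns nonzero once the device has published a new used entry.
 * With a plain (non-volatile) read of used_idx, a compiler that inlines
 * this into a busy loop could legally hoist the load out of the loop;
 * the volatile read makes each call observe the current value. */
static int toy_more_used(struct toy_svq *svq)
{
    if (svq->last_used_idx != svq->shadow_used_idx) {
        return 1;
    }
    svq->shadow_used_idx = READ_IDX_ONCE(svq->used_idx);
    return svq->last_used_idx != svq->shadow_used_idx;
}
```

Note that, as the thread says, volatile alone only fixes the infinite-loop hazard; ordering the index read against the read of the used ring entry still needs a read barrier.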
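[Editor's note: the barrier pairing Eugenio describes ("it pairs with the actual device memory write") can also be sketched in isolation. The sketch below uses C11 fences as stand-ins for the kernel's smp_wmb()/smp_rmb(); all names are illustrative, not QEMU's. The device writes the payload and then publishes the index behind a write fence; the polling side reads the index and then, behind a read fence, the payload.]

```c
#include <stdatomic.h>
#include <stdint.h>

static uint32_t payload_len;       /* data written by the "device"   */
static _Atomic uint16_t used_idx;  /* publication index, starts at 0 */

/* Device side: write the data, then publish the index.
 * The release fence plays the role of smp_wmb(). */
static void device_publish(uint32_t len)
{
    payload_len = len;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&used_idx, 1, memory_order_relaxed);
}

/* Driver side: busy-poll the index, then read the data.
 * The atomic load acts like READ_ONCE() (it cannot be cached in a
 * register across iterations), and the acquire fence plays the role
 * of smp_rmb(), pairing with the device's write fence so the payload
 * read cannot be satisfied with a stale value. */
static uint32_t driver_poll(void)
{
    while (atomic_load_explicit(&used_idx, memory_order_relaxed) == 0) {
        /* busy wait */
    }
    atomic_thread_fence(memory_order_acquire);
    return payload_len;
}
```

In this framing the debate in the thread is about where the acquire-side fence belongs on the polling path, not about whether one is needed at all.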