On Fri, Jan 13, 2017 at 10:29:46PM +0000, Felipe Franciosi wrote:
>
> > On 13 Jan 2017, at 10:18, Michael S. Tsirkin <m...@redhat.com> wrote:
> >
> > On Fri, Jan 13, 2017 at 05:15:22PM +0000, Felipe Franciosi wrote:
> >>
> >>> On 13 Jan 2017, at 09:04, Michael S. Tsirkin <m...@redhat.com> wrote:
> >>>
> >>> On Fri, Jan 13, 2017 at 03:09:46PM +0000, Felipe Franciosi wrote:
> >>>> Hi Marc-Andre,
> >>>>
> >>>>> On 13 Jan 2017, at 07:03, Marc-André Lureau <mlur...@redhat.com> wrote:
> >>>>>
> >>>>> Hi
> >>>>>
> >>>>> ----- Original Message -----
> >>>>>> Currently, VQs are started as soon as a SET_VRING_KICK is received.
> >>>>>> That is too early in the VQ setup process, as the backend might not yet have
> >>>>>
> >>>>> I think we may want to reconsider queue_set_started(), move it
> >>>>> elsewhere, since kick/call fds aren't mandatory to process the rings.
> >>>>
> >>>> Hmm. The fds aren't mandatory, but I imagine in that case we should
> >>>> still receive SET_VRING_KICK/CALL messages without an fd (ie. with the
> >>>> VHOST_MSG_VQ_NOFD_MASK flag set). Wouldn't that be the case?
> >>>
> >>> Please look at docs/specs/vhost-user.txt, Starting and stopping rings
> >>>
> >>> The spec says:
> >>>     Client must start ring upon receiving a kick (that is, detecting that
> >>>     file descriptor is readable) on the descriptor specified by
> >>>     VHOST_USER_SET_VRING_KICK, and stop ring upon receiving
> >>>     VHOST_USER_GET_VRING_BASE.
> >>
> >> Yes I have seen the spec, but there is a race with the current
> >> libvhost-user code which needs attention. My initial proposal (which got
> >> turned down) was to send a spurious notification upon seeing a callfd.
> >> Then I came up with this proposal. See below.
> >>
> >>>
> >>>
> >>>>>
> >>>>>> a callfd to notify in case it received a kick and fully processed the
> >>>>>> request/command. This patch only starts a VQ when a SET_VRING_CALL is
> >>>>>> received.
> >>>>>
> >>>>> I don't like that much, as soon as the kick fd is received, it should
> >>>>> start polling it imho. callfd is optional, it may have one and not the
> >>>>> other.
> >>>>
> >>>> So the question is whether we should be receiving a SET_VRING_CALL
> >>>> anyway or not, regardless of an fd being sent. (I think we do, but I
> >>>> haven't done extensive testing with other device types.)
> >>>
> >>> I would say not, only KICK is mandatory and that is also not enough
> >>> to process ring. You must wait for it to be readable.
> >>
> >> The problem is that Qemu takes time between sending the kickfd and the
> >> callfd. Hence the race. Consider this scenario:
> >>
> >> 1) Guest configures the device
> >> 2) Guest puts a request on a virtq
> >> 3) Guest kicks
> >> 4) Qemu starts configuring the backend
> >> 4.a) Qemu sends the masked callfds
> >> 4.b) Qemu sends the virtq sizes and addresses
> >> 4.c) Qemu sends the kickfds
> >>
> >> (When using MQ, Qemu will only send the callfd once all VQs are configured)
> >>
> >> 5) The backend starts listening on the kickfd upon receiving it
> >> 6) The backend picks up the guest's request
> >> 7) The backend processes the request
> >> 8) The backend puts the response on the used ring
> >> 9) The backend notifies the masked callfd
> >>
> >> 4.d) Qemu sends the callfds
> >>
> >> At which point the guest missed the notification and gets stuck.
> >>
> >> Perhaps you prefer my initial proposal of sending a spurious notification
> >> when the backend sees a callfd?
> >>
> >> Felipe
> >
> > I thought we read the masked callfd when we unmask it,
> > and forward the interrupt. See kvm_irqfd_assign:
> >
> >         /*
> >          * Check if there was an event already pending on the eventfd
> >          * before we registered, and trigger it as if we didn't miss it.
> >          */
> >         events = f.file->f_op->poll(f.file, &irqfd->pt);
> >
> >         if (events & POLLIN)
> >                 schedule_work(&irqfd->inject);
> >
> >
> >
> > Is this a problem you observe in practice?
>
> Thanks for pointing out this code; I wasn't aware of it.
>
> Indeed I'm encountering it in practice. And I've checked that my kernel has
> the code above.
>
> Starts to sound like a race:
> Qemu registers the new notifier with kvm
> Backend kicks the (now no longer registered) maskfd

vhost-user is not supposed to use maskfd at all.

We have this code:

        if (net->nc->info->type == NET_CLIENT_DRIVER_VHOST_USER) {
            dev->use_guest_notifier_mask = false;
        }

Isn't it effective?

> Qemu sends the new callfd to the application
>
> It's not hard to repro. How could this situation be avoided?
>
> Cheers,
> Felipe
>
>
> >
> >>
> >>>
> >>>>>
> >>>>> Perhaps it's best for now to delay the callfd notification with a flag
> >>>>> until it is received?
> >>>>
> >>>> The other idea is to always kick when we receive the callfd. I remember
> >>>> discussing that alternative with you before libvhost-user went in. The
> >>>> protocol says both the driver and the backend must handle spurious
> >>>> kicks. This approach also fixes the bug.
> >>>>
> >>>> I'm happy with whatever alternative you want, as long as it makes
> >>>> libvhost-user usable for storage devices.
> >>>>
> >>>> Thanks,
> >>>> Felipe
> >>>>
> >>>>
> >>>>>
> >>>>>
> >>>>>> Signed-off-by: Felipe Franciosi <fel...@nutanix.com>
> >>>>>> ---
> >>>>>>  contrib/libvhost-user/libvhost-user.c | 26 +++++++++++++-------------
> >>>>>>  1 file changed, 13 insertions(+), 13 deletions(-)
> >>>>>>
> >>>>>> diff --git a/contrib/libvhost-user/libvhost-user.c b/contrib/libvhost-user/libvhost-user.c
> >>>>>> index af4faad..a46ef90 100644
> >>>>>> --- a/contrib/libvhost-user/libvhost-user.c
> >>>>>> +++ b/contrib/libvhost-user/libvhost-user.c
> >>>>>> @@ -607,19 +607,6 @@ vu_set_vring_kick_exec(VuDev *dev, VhostUserMsg *vmsg)
> >>>>>>          DPRINT("Got kick_fd: %d for vq: %d\n", vmsg->fds[0], index);
> >>>>>>      }
> >>>>>>
> >>>>>> -    dev->vq[index].started = true;
> >>>>>> -    if (dev->iface->queue_set_started) {
> >>>>>> -        dev->iface->queue_set_started(dev, index, true);
> >>>>>> -    }
> >>>>>> -
> >>>>>> -    if (dev->vq[index].kick_fd != -1 && dev->vq[index].handler) {
> >>>>>> -        dev->set_watch(dev, dev->vq[index].kick_fd, VU_WATCH_IN,
> >>>>>> -                       vu_kick_cb, (void *)(long)index);
> >>>>>> -
> >>>>>> -        DPRINT("Waiting for kicks on fd: %d for vq: %d\n",
> >>>>>> -               dev->vq[index].kick_fd, index);
> >>>>>> -    }
> >>>>>> -
> >>>>>>      return false;
> >>>>>>  }
> >>>>>>
> >>>>>> @@ -661,6 +648,19 @@ vu_set_vring_call_exec(VuDev *dev, VhostUserMsg *vmsg)
> >>>>>>
> >>>>>>      DPRINT("Got call_fd: %d for vq: %d\n", vmsg->fds[0], index);
> >>>>>>
> >>>>>> +    dev->vq[index].started = true;
> >>>>>> +    if (dev->iface->queue_set_started) {
> >>>>>> +        dev->iface->queue_set_started(dev, index, true);
> >>>>>> +    }
> >>>>>> +
> >>>>>> +    if (dev->vq[index].kick_fd != -1 && dev->vq[index].handler) {
> >>>>>> +        dev->set_watch(dev, dev->vq[index].kick_fd, VU_WATCH_IN,
> >>>>>> +                       vu_kick_cb, (void *)(long)index);
> >>>>>> +
> >>>>>> +        DPRINT("Waiting for kicks on fd: %d for vq: %d\n",
> >>>>>> +               dev->vq[index].kick_fd, index);
> >>>>>> +    }
> >>>>>> +
> >>>>>>      return false;
> >>>>>>  }
> >>>>>>
> >>>>>> --
> >>>>>> 1.9.4
> >>>>>>
> >>>>>>
> >>>>
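
For reference, the "delay the callfd notification with a flag until it is
received" idea quoted above could look roughly like the sketch below. It is a
standalone model and not libvhost-user code: struct demo_vq and the demo_*
functions are hypothetical names made up for illustration, and the only thing
assumed is Linux eventfd(2). It covers the case where no callfd has arrived
yet; it does not by itself help if the backend was handed a masked fd that is
later replaced.

```c
/* Standalone sketch (Linux only); NOT libvhost-user code. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical, simplified per-queue state. */
struct demo_vq {
    int call_fd;          /* -1 until a callfd has been received */
    bool notify_pending;  /* a notification was requested too early */
};

void demo_vq_init(struct demo_vq *vq)
{
    vq->call_fd = -1;
    vq->notify_pending = false;
}

/*
 * Signal the guest if a callfd is available; otherwise only record
 * that a notification is owed. Returns true if the eventfd was written.
 */
bool demo_vq_notify(struct demo_vq *vq)
{
    if (vq->call_fd == -1) {
        vq->notify_pending = true;
        return false;
    }
    return eventfd_write(vq->call_fd, 1) == 0;
}

/*
 * Install the callfd and deliver a notification that was deferred
 * while no callfd existed. Returns true if one was delivered.
 */
bool demo_vq_set_call_fd(struct demo_vq *vq, int fd)
{
    vq->call_fd = fd;
    if (!vq->notify_pending) {
        return false;
    }
    vq->notify_pending = false;
    return demo_vq_notify(vq);
}

/*
 * Checking helper: read and reset the eventfd counter.
 * Returns 0 if nothing is pending (fd must be EFD_NONBLOCK).
 */
uint64_t demo_drain(int fd)
{
    eventfd_t value = 0;

    if (eventfd_read(fd, &value) != 0) {
        return 0;
    }
    return value;
}
```

With this model, a demo_vq_notify() issued before any callfd exists is not
lost: the later demo_vq_set_call_fd() writes the eventfd once, so whoever
polls that fd afterwards still sees it readable.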