> On 13 Jan 2017, at 10:18, Michael S. Tsirkin <m...@redhat.com> wrote:
> 
> On Fri, Jan 13, 2017 at 05:15:22PM +0000, Felipe Franciosi wrote:
>> 
>>> On 13 Jan 2017, at 09:04, Michael S. Tsirkin <m...@redhat.com> wrote:
>>> 
>>> On Fri, Jan 13, 2017 at 03:09:46PM +0000, Felipe Franciosi wrote:
>>>> Hi Marc-Andre,
>>>> 
>>>>> On 13 Jan 2017, at 07:03, Marc-André Lureau <mlur...@redhat.com> wrote:
>>>>> 
>>>>> Hi
>>>>> 
>>>>> ----- Original Message -----
>>>>>> Currently, VQs are started as soon as a SET_VRING_KICK is received. That
>>>>>> is too early in the VQ setup process, as the backend might not yet have
>>>>> 
>>>>> I think we may want to reconsider queue_set_started() and move it 
>>>>> elsewhere, since kick/call fds aren't mandatory for processing the rings.
>>>> 
>>>> Hmm. The fds aren't mandatory, but I imagine in that case we should still 
>>>> receive SET_VRING_KICK/CALL messages without an fd (i.e. with the 
>>>> VHOST_USER_VRING_NOFD_MASK flag set). Wouldn't that be the case?
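>>>> 
>>>> For what it's worth, here is a minimal sketch of how a handler could tell 
>>>> the two cases apart (the masks are the ones libvhost-user.h defines; the 
>>>> handler name is illustrative):
>>>> 
>>>>     /* Sketch: a SET_VRING_KICK handler detecting the no-fd case. */
>>>>     static bool
>>>>     handle_set_vring_kick(VuDev *dev, VhostUserMsg *vmsg)
>>>>     {
>>>>         int index = vmsg->payload.u64 & VHOST_USER_VRING_IDX_MASK;
>>>> 
>>>>         if (vmsg->payload.u64 & VHOST_USER_VRING_NOFD_MASK) {
>>>>             /* No fd accompanies the message: nothing to watch. */
>>>>             dev->vq[index].kick_fd = -1;
>>>>         } else {
>>>>             dev->vq[index].kick_fd = vmsg->fds[0];
>>>>         }
>>>>         return false;
>>>>     }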
>>> 
>>> Please look at docs/specs/vhost-user.txt, Starting and stopping rings
>>> 
>>> The spec says:
>>>     Client must start ring upon receiving a kick (that is, detecting that
>>>     file descriptor is readable) on the descriptor specified by
>>>     VHOST_USER_SET_VRING_KICK, and stop ring upon receiving
>>>     VHOST_USER_GET_VRING_BASE.
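>>> 
>>> As a sketch, that lifecycle boils down to the following on the backend 
>>> side (process_ring() stands in for the device's actual ring processing):
>>> 
>>>     /* Sketch: only process the ring once the kick fd turns readable. */
>>>     struct pollfd pfd = { .fd = kick_fd, .events = POLLIN };
>>> 
>>>     while (poll(&pfd, 1, -1) > 0 && (pfd.revents & POLLIN)) {
>>>         uint64_t n;
>>>         read(kick_fd, &n, sizeof(n));  /* drain the eventfd counter */
>>>         process_ring();                /* the ring is now "started" */
>>>     }
>>>     /* The ring is stopped when VHOST_USER_GET_VRING_BASE arrives on
>>>      * the control socket, which is handled outside this loop. */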
>> 
>> Yes, I have seen the spec, but there is a race in the current libvhost-user 
>> code which needs attention. My initial proposal (which got turned down) was 
>> to send a spurious notification upon seeing a callfd; this patch is the 
>> alternative I came up with. See below.
>> 
>>> 
>>> 
>>>>> 
>>>>>> a callfd to notify in case it received a kick and fully processed the
>>>>>> request/command. This patch only starts a VQ when a SET_VRING_CALL is
>>>>>> received.
>>>>> 
>>>>> I don't like that much; as soon as the kick fd is received, the backend 
>>>>> should start polling it, imho. The callfd is optional: a device may have 
>>>>> one and not the other.
>>>> 
>>>> So the question is whether we should receive a SET_VRING_CALL anyway, 
>>>> regardless of whether an fd is sent. (I think we do, but I haven't done 
>>>> extensive testing with other device types.)
>>> 
>>> I would say not; only KICK is mandatory, and even that is not enough to 
>>> process the ring. You must wait for the fd to become readable.
>> 
>> The problem is that Qemu takes time between sending the kickfd and the 
>> callfd. Hence the race. Consider this scenario:
>> 
>> 1) Guest configures the device
>> 2) Guest puts a request on a virtq
>> 3) Guest kicks
>> 4) Qemu starts configuring the backend
>> 4.a) Qemu sends the masked callfds
>> 4.b) Qemu sends the virtq sizes and addresses
>> 4.c) Qemu sends the kickfds
>> 
>> (When using MQ, Qemu will only send the callfd once all VQs are configured)
>> 
>> 5) The backend starts listening on the kickfd upon receiving it
>> 6) The backend picks up the guest's request
>> 7) The backend processes the request
>> 8) The backend puts the response on the used ring
>> 9) The backend notifies the masked callfd
>> 
>> 4.d) Qemu sends the callfds
>> 
>> At this point the guest has missed the notification and gets stuck.
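>> 
>> For clarity, step 9 is just an eventfd write on whatever callfd the backend 
>> holds at that moment; a sketch, with field names as in libvhost-user:
>> 
>>     /* Signal the callfd received in step 4.a, which at this point is
>>      * still the masked one; the write is silently absorbed there. */
>>     if (dev->vq[index].call_fd != -1) {
>>         eventfd_write(dev->vq[index].call_fd, 1);
>>     }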
>> 
>> Perhaps you prefer my initial proposal of sending a spurious notification 
>> when the backend sees a callfd?
>> 
>> Felipe
> 
> I thought we read the masked callfd when we unmask it,
> and forward the interrupt. See kvm_irqfd_assign:
> 
>        /*
>         * Check if there was an event already pending on the eventfd
>         * before we registered, and trigger it as if we didn't miss it.
>         */
>        events = f.file->f_op->poll(f.file, &irqfd->pt);
> 
>        if (events & POLLIN)
>                schedule_work(&irqfd->inject);
> 
> 
> 
> Is this a problem you observe in practice?

Thanks for pointing out this code; I wasn't aware of it.

Indeed, I'm encountering it in practice, and I've checked that my kernel has 
the code above.

This is starting to sound like a race:
1) Qemu registers the new notifier with kvm
2) The backend kicks the (now no longer registered) masked callfd
3) Qemu sends the new callfd to the application
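
To make the window concrete, here is a sketch of the interleaving (the 
function names below are illustrative, not actual Qemu or kernel symbols):

    /* Qemu, unmasking the queue's guest notifier: */
    kvm_register_irqfd(new_callfd);   /* kvm now checks/forwards new_callfd */
    /* <-- window: the backend still only knows about the masked fd --> */
    send_set_vring_call(new_callfd);  /* backend switches fds only here */

    /* Backend, racing inside the window: */
    eventfd_write(masked_fd, 1);      /* nobody polls masked_fd: event lost */

Note that the pending-event check in kvm_irqfd_assign quoted above runs 
against new_callfd, so it cannot see the event left sitting on masked_fd.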

It's not hard to repro. How could this situation be avoided?

Cheers,
Felipe


> 
>> 
>>> 
>>>>> 
>>>>> Perhaps it's best for now to delay the callfd notification with a flag 
>>>>> until it is received?
>>>> 
>>>> The other idea is to always send a notification when we receive the 
>>>> callfd. I remember discussing that alternative with you before 
>>>> libvhost-user went in. The protocol says both the driver and the backend 
>>>> must handle spurious notifications and kicks. This approach also fixes 
>>>> the bug.
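>>>> 
>>>> Concretely, that alternative is tiny: in the SET_VRING_CALL handler, right 
>>>> after storing the new fd, do something like this (sketch only):
>>>> 
>>>>     /* Fire a spurious notification as soon as a callfd is installed,
>>>>      * so an event pending on the old masked fd cannot be lost; the
>>>>      * driver is required to tolerate spurious notifications. */
>>>>     dev->vq[index].call_fd = vmsg->fds[0];
>>>>     if (dev->vq[index].call_fd != -1) {
>>>>         eventfd_write(dev->vq[index].call_fd, 1);
>>>>     }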
>>>> 
>>>> I'm happy with whatever alternative you want, as long as it makes 
>>>> libvhost-user usable for storage devices.
>>>> 
>>>> Thanks,
>>>> Felipe
>>>> 
>>>> 
>>>>> 
>>>>> 
>>>>>> Signed-off-by: Felipe Franciosi <fel...@nutanix.com>
>>>>>> ---
>>>>>> contrib/libvhost-user/libvhost-user.c | 26 +++++++++++++-------------
>>>>>> 1 file changed, 13 insertions(+), 13 deletions(-)
>>>>>> 
>>>>>> diff --git a/contrib/libvhost-user/libvhost-user.c
>>>>>> b/contrib/libvhost-user/libvhost-user.c
>>>>>> index af4faad..a46ef90 100644
>>>>>> --- a/contrib/libvhost-user/libvhost-user.c
>>>>>> +++ b/contrib/libvhost-user/libvhost-user.c
>>>>>> @@ -607,19 +607,6 @@ vu_set_vring_kick_exec(VuDev *dev, VhostUserMsg 
>>>>>> *vmsg)
>>>>>>       DPRINT("Got kick_fd: %d for vq: %d\n", vmsg->fds[0], index);
>>>>>>   }
>>>>>> 
>>>>>> -    dev->vq[index].started = true;
>>>>>> -    if (dev->iface->queue_set_started) {
>>>>>> -        dev->iface->queue_set_started(dev, index, true);
>>>>>> -    }
>>>>>> -
>>>>>> -    if (dev->vq[index].kick_fd != -1 && dev->vq[index].handler) {
>>>>>> -        dev->set_watch(dev, dev->vq[index].kick_fd, VU_WATCH_IN,
>>>>>> -                       vu_kick_cb, (void *)(long)index);
>>>>>> -
>>>>>> -        DPRINT("Waiting for kicks on fd: %d for vq: %d\n",
>>>>>> -               dev->vq[index].kick_fd, index);
>>>>>> -    }
>>>>>> -
>>>>>>   return false;
>>>>>> }
>>>>>> 
>>>>>> @@ -661,6 +648,19 @@ vu_set_vring_call_exec(VuDev *dev, VhostUserMsg 
>>>>>> *vmsg)
>>>>>> 
>>>>>>   DPRINT("Got call_fd: %d for vq: %d\n", vmsg->fds[0], index);
>>>>>> 
>>>>>> +    dev->vq[index].started = true;
>>>>>> +    if (dev->iface->queue_set_started) {
>>>>>> +        dev->iface->queue_set_started(dev, index, true);
>>>>>> +    }
>>>>>> +
>>>>>> +    if (dev->vq[index].kick_fd != -1 && dev->vq[index].handler) {
>>>>>> +        dev->set_watch(dev, dev->vq[index].kick_fd, VU_WATCH_IN,
>>>>>> +                       vu_kick_cb, (void *)(long)index);
>>>>>> +
>>>>>> +        DPRINT("Waiting for kicks on fd: %d for vq: %d\n",
>>>>>> +               dev->vq[index].kick_fd, index);
>>>>>> +    }
>>>>>> +
>>>>>>   return false;
>>>>>> }
>>>>>> 
>>>>>> --
>>>>>> 1.9.4
>>>>>> 
>>>>>> 
>>>> 

