On 01/06/2023 16:08, Stefan Hajnoczi wrote:
> On Wed, May 31, 2023 at 04:49:39PM -0400, Vivek Goyal wrote:
>> On Wed, May 31, 2023 at 10:34:15PM +0200, Peter-Jan Gootzen wrote:
>>> On 31/05/2023 21:18, Vivek Goyal wrote:
>>>> On Wed, May 31, 2023 at 07:10:32PM +0200, Peter-Jan Gootzen wrote:
>>>>> When the Virtio queue is full, a work item is scheduled
>>>>> to execute in 1ms that retries adding the request to the queue.
>>>>> This is a long time on the scale at which a virtio-fs device
>>>>> can operate. When using a DPU, the baseline is around 40us
>>>>> without going to a remote server (4k, QD=1).
>>>>> This patch queues requests when the Virtio queue is full and,
>>>>> when a completed request is taken off, immediately refills the
>>>>> queue with queued requests.
>>>>>
>>>>> This reduces the 99.9th percentile latencies in our tests by
>>>>> 60x and slightly increases the overall throughput when using a
>>>>> queue depth of 2x the Virtio queue size with a
>>>>> DPU-powered virtio-fs device.
>>>>>
>>>>> Signed-off-by: Peter-Jan Gootzen <peter-...@gootzen.net>
>>>>> ---
>>>>> V1 -> V2: No longer scheduling dispatch work when not needed,
>>>>> and changed the delayed_work structs to work_struct structs
>>>>>
>>>>>  fs/fuse/virtio_fs.c | 32 +++++++++++++++++---------------
>>>>>  1 file changed, 17 insertions(+), 15 deletions(-)
>>>>>
>>>>> diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
>>>>> index 4d8d4f16c727..a676297db09b 100644
>>>>> --- a/fs/fuse/virtio_fs.c
>>>>> +++ b/fs/fuse/virtio_fs.c
>>>>> @@ -45,7 +45,7 @@ struct virtio_fs_vq {
>>>>>   struct work_struct done_work;
>>>>>   struct list_head queued_reqs;
>>>>>   struct list_head end_reqs;      /* End these requests */
>>>>> - struct delayed_work dispatch_work;
>>>>> + struct work_struct dispatch_work;
>>>>>   struct fuse_dev *fud;
>>>>>   bool connected;
>>>>>   long in_flight;
>>>>> @@ -202,7 +202,7 @@ static void virtio_fs_drain_queue(struct virtio_fs_vq *fsvq)
>>>>>   }
>>>>>  
>>>>>   flush_work(&fsvq->done_work);
>>>>> - flush_delayed_work(&fsvq->dispatch_work);
>>>>> + flush_work(&fsvq->dispatch_work);
>>>>>  }
>>>>>  
>>>>>  static void virtio_fs_drain_all_queues_locked(struct virtio_fs *fs)
>>>>> @@ -346,6 +346,9 @@ static void virtio_fs_hiprio_done_work(struct work_struct *work)
>>>>>                   dec_in_flight_req(fsvq);
>>>>>           }
>>>>>   } while (!virtqueue_enable_cb(vq) && likely(!virtqueue_is_broken(vq)));
>>>>> +
>>>>> + if (!list_empty(&fsvq->queued_reqs))
>>>>> +         schedule_work(&fsvq->dispatch_work);
>>>>>   spin_unlock(&fsvq->lock);
>>>>>  }
>>>>>  
>>>>> @@ -353,7 +356,7 @@ static void virtio_fs_request_dispatch_work(struct work_struct *work)
>>>>>  {
>>>>>   struct fuse_req *req;
>>>>>   struct virtio_fs_vq *fsvq = container_of(work, struct virtio_fs_vq,
>>>>> -                                          dispatch_work.work);
>>>>> +                                          dispatch_work);
>>>>>   int ret;
>>>>>  
>>>>>   pr_debug("virtio-fs: worker %s called.\n", __func__);
>>>>> @@ -388,8 +391,6 @@ static void virtio_fs_request_dispatch_work(struct work_struct *work)
>>>>>                   if (ret == -ENOMEM || ret == -ENOSPC) {
>>>>>                           spin_lock(&fsvq->lock);
>>>>>                           list_add_tail(&req->list, &fsvq->queued_reqs);
>>>>> -                         schedule_delayed_work(&fsvq->dispatch_work,
>>>>> -                                               msecs_to_jiffies(1));
>>>>
>>>> A full virtqueue is only one of the reasons queueing a request can
>>>> fail. What if the virtqueue is empty but we could not queue the
>>>> request because of lack of memory (-ENOMEM)? In that case we will put
>>>> the request on queued_reqs and it might never be dispatched, because
>>>> there is no completion to trigger the dispatch work (assume no further
>>>> new requests are coming). That means deadlock?
>>>>
>>>> Thanks
>>>> Vivek
>>>>
>>>
>>> Good catch, that will deadlock.
>>>
>>> Is the default kernel behavior to retry a file system request
>>> indefinitely until memory becomes available?
>>
>> As of now that seems to be the behavior. I think I had copied this
>> code from another driver. 
>>
>> But I guess one can argue that if memory is not available, we should
>> return -ENOMEM to user space instead of retrying in the kernel.
>>
>> Stefan, Miklos, WDYT?
> 
> My understanding is that file system syscalls may return ENOMEM, so this
> is okay.
> 
> Stefan

Then I propose handling only -ENOSPC as a special case and letting all
other errors go through to userspace.
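
Roughly what I have in mind for the error path in
virtio_fs_request_dispatch_work() (just a sketch against the existing
error handling, so the surrounding context may differ slightly):

	/* Sketch: special-case only -ENOSPC, fail everything else */
	ret = virtio_fs_enqueue_req(fsvq, req, true);
	if (ret < 0) {
		if (ret == -ENOSPC) {
			/* Virtqueue full: keep the request on queued_reqs,
			 * it gets re-dispatched from the completion path */
			spin_lock(&fsvq->lock);
			list_add_tail(&req->list, &fsvq->queued_reqs);
			spin_unlock(&fsvq->lock);
			return;
		}
		/* Any other error (e.g. -ENOMEM): fail the request so the
		 * error propagates to user space instead of retrying */
		req->out.h.error = ret;
		spin_lock(&fsvq->lock);
		dec_in_flight_req(fsvq);
		spin_unlock(&fsvq->lock);
		pr_err("virtio-fs: virtio_fs_enqueue_req() failed %d\n", ret);
		fuse_request_end(req);
		return;
	}

The equivalent -ENOMEM/-ENOSPC handling in
virtio_fs_wake_pending_and_unlock() would change in the same way.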

Noob Linux contributor question: how often should I send in a new revision of
the patch? Should I wait for more comments or send in a V3 with that fix now?

Best,
Peter-Jan