On 01/06/2023 20:45, Vivek Goyal wrote:
> On Thu, Jun 01, 2023 at 10:08:50AM -0400, Stefan Hajnoczi wrote:
>> On Wed, May 31, 2023 at 04:49:39PM -0400, Vivek Goyal wrote:
>>> On Wed, May 31, 2023 at 10:34:15PM +0200, Peter-Jan Gootzen wrote:
>>>> On 31/05/2023 21:18, Vivek Goyal wrote:
>>>>> On Wed, May 31, 2023 at 07:10:32PM +0200, Peter-Jan Gootzen wrote:
>>>>>> When the Virtio queue is full, a work item is scheduled
>>>>>> to execute in 1ms that retries adding the request to the queue.
>>>>>> This is a large amount of time on the scale on which a
>>>>>> virtio-fs device can operate. When using a DPU this is around
>>>>>> 40us baseline without going to a remote server (4k, QD=1).
>>>>>> This patch queues requests when the Virtio queue is full,
>>>>>> and when a completed request is taken off, immediately fills
>>>>>> it back up with queued requests.
>>>>>>
>>>>>> This reduces the 99.9th percentile latencies in our tests by
>>>>>> 60x and slightly increases the overall throughput, when using a
>>>>>> queue depth 2x the size of the Virtio queue size, with a
>>>>>> DPU-powered virtio-fs device.
>>>>>>
>>>>>> Signed-off-by: Peter-Jan Gootzen <peter-...@gootzen.net>
>>>>>> ---
>>>>>> V1 -> V2: Not scheduling dispatch work anymore when not needed
>>>>>> and changed delayed_work structs to work_struct structs
>>>>>>
>>>>>>  fs/fuse/virtio_fs.c | 32 +++++++++++++++++---------------
>>>>>>  1 file changed, 17 insertions(+), 15 deletions(-)
>>>>>>
>>>>>> diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
>>>>>> index 4d8d4f16c727..a676297db09b 100644
>>>>>> --- a/fs/fuse/virtio_fs.c
>>>>>> +++ b/fs/fuse/virtio_fs.c
>>>>>> @@ -45,7 +45,7 @@ struct virtio_fs_vq {
>>>>>>          struct work_struct done_work;
>>>>>>          struct list_head queued_reqs;
>>>>>>          struct list_head end_reqs;      /* End these requests */
>>>>>> -        struct delayed_work dispatch_work;
>>>>>> +        struct work_struct dispatch_work;
>>>>>>          struct fuse_dev *fud;
>>>>>>          bool connected;
>>>>>>          long in_flight;
>>>>>> @@ -202,7 +202,7 @@ static void virtio_fs_drain_queue(struct 
>>>>>> virtio_fs_vq *fsvq)
>>>>>>          }
>>>>>>  
>>>>>>          flush_work(&fsvq->done_work);
>>>>>> -        flush_delayed_work(&fsvq->dispatch_work);
>>>>>> +        flush_work(&fsvq->dispatch_work);
>>>>>>  }
>>>>>>  
>>>>>>  static void virtio_fs_drain_all_queues_locked(struct virtio_fs *fs)
>>>>>> @@ -346,6 +346,9 @@ static void virtio_fs_hiprio_done_work(struct 
>>>>>> work_struct *work)
>>>>>>                          dec_in_flight_req(fsvq);
>>>>>>                  }
>>>>>>          } while (!virtqueue_enable_cb(vq) && 
>>>>>> likely(!virtqueue_is_broken(vq)));
>>>>>> +
>>>>>> +        if (!list_empty(&fsvq->queued_reqs))
>>>>>> +                schedule_work(&fsvq->dispatch_work);
>>>>>>          spin_unlock(&fsvq->lock);
>>>>>>  }
>>>>>>  
>>>>>> @@ -353,7 +356,7 @@ static void virtio_fs_request_dispatch_work(struct 
>>>>>> work_struct *work)
>>>>>>  {
>>>>>>          struct fuse_req *req;
>>>>>>          struct virtio_fs_vq *fsvq = container_of(work, struct 
>>>>>> virtio_fs_vq,
>>>>>> -                                                 dispatch_work.work);
>>>>>> +                                                 dispatch_work);
>>>>>>          int ret;
>>>>>>  
>>>>>>          pr_debug("virtio-fs: worker %s called.\n", __func__);
>>>>>> @@ -388,8 +391,6 @@ static void virtio_fs_request_dispatch_work(struct 
>>>>>> work_struct *work)
>>>>>>                          if (ret == -ENOMEM || ret == -ENOSPC) {
>>>>>>                                  spin_lock(&fsvq->lock);
>>>>>>                                  list_add_tail(&req->list, 
>>>>>> &fsvq->queued_reqs);
>>>>>> -                                
>>>>>> schedule_delayed_work(&fsvq->dispatch_work,
>>>>>> -                                                      
>>>>>> msecs_to_jiffies(1));
>>>>>
>>>>> Virtqueue being full is only one of the reasons for failure to queue
>>>>> the request. What if virtqueue is empty but we could not queue the
>>>>> request because lack of memory (-ENOMEM). In that case we will queue
>>>>> the request and it might not be dispatched because there is no completion.
>>>>> (Assume there is no further new request coming). That means deadlock?
>>>>>
>>>>> Thanks
>>>>> Vivek
>>>>>
>>>>
>>>> Good catch that will deadlock.
>>>>
>>>> Is default kernel behavior to indefinitely retry a file system
>>>> request until memory is available?
>>>
>>> As of now that seems to be the behavior. I think I had copied this
>>> code from another driver. 
>>>
>>> But I guess one can argue that if memory is not available, then
>>> return -ENOMEM to user space instead of retrying in kernel.
>>>
>>> Stefan, Miklos, WDYT?
>>
>> My understanding is that file system syscalls may return ENOMEM, so this
>> is okay.
> 
> Ok. Fair enough. Thanks.
> 
> One more question. How do we know virtqueue is full. Is -ENOSPC is the
> correct error code to check and retry indefinitely. Are there other
> situations where -ENOSPC can be returned. Peter's current patch rely
> on the fact that there is atleast one completion happening after
> queuing of request which will kick the worker thread and submit the
> request later.
> 
> We need to watch out for race conditions very closely. If that assumption
> is not valid in some cases or there are races between getting -ENOSPC
> and request completions, we will have a deadlock scenario.
> 
> Thanks
> Vivek
> 

When following the code path, -ENOSPC is only returned when the queue is
full. So that assumption is correct.

As for race conditions where a request might be lost. If -ENOSPC is
returned and in the mean time the queue has become full again, the 
worker will simply fail and be re-executed when something is taken
off the queue again. It is possible that this will happen indefinitely,
but that's also possible in the current version of the driver.

Peter-Jan
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Reply via email to