On 31/05/2023 21:18, Vivek Goyal wrote:
> On Wed, May 31, 2023 at 07:10:32PM +0200, Peter-Jan Gootzen wrote:
>> When the Virtio queue is full, a work item is scheduled
>> to execute in 1ms that retries adding the request to the queue.
>> This is a long delay on the time scale at which a
>> virtio-fs device can operate. When using a DPU, the baseline
>> latency is around 40us without going to a remote server (4k, QD=1).
>> This patch queues requests when the Virtio queue is full,
>> and when a completed request is taken off, immediately fills
>> it back up with queued requests.
>>
>> This reduces the 99.9th percentile latencies in our tests by
>> 60x and slightly increases the overall throughput, when using a
>> queue depth 2x the size of the Virtio queue size, with a
>> DPU-powered virtio-fs device.
>>
>> Signed-off-by: Peter-Jan Gootzen <peter-...@gootzen.net>
>> ---
>> V1 -> V2: Not scheduling dispatch work anymore when not needed
>> and changed delayed_work structs to work_struct structs
>>
>>  fs/fuse/virtio_fs.c | 32 +++++++++++++++++---------------
>>  1 file changed, 17 insertions(+), 15 deletions(-)
>>
>> diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
>> index 4d8d4f16c727..a676297db09b 100644
>> --- a/fs/fuse/virtio_fs.c
>> +++ b/fs/fuse/virtio_fs.c
>> @@ -45,7 +45,7 @@ struct virtio_fs_vq {
>>      struct work_struct done_work;
>>      struct list_head queued_reqs;
>>      struct list_head end_reqs;      /* End these requests */
>> -    struct delayed_work dispatch_work;
>> +    struct work_struct dispatch_work;
>>      struct fuse_dev *fud;
>>      bool connected;
>>      long in_flight;
>> @@ -202,7 +202,7 @@ static void virtio_fs_drain_queue(struct virtio_fs_vq *fsvq)
>>      }
>>  
>>      flush_work(&fsvq->done_work);
>> -    flush_delayed_work(&fsvq->dispatch_work);
>> +    flush_work(&fsvq->dispatch_work);
>>  }
>>  
>>  static void virtio_fs_drain_all_queues_locked(struct virtio_fs *fs)
>> @@ -346,6 +346,9 @@ static void virtio_fs_hiprio_done_work(struct work_struct *work)
>>                      dec_in_flight_req(fsvq);
>>              }
>>      } while (!virtqueue_enable_cb(vq) && likely(!virtqueue_is_broken(vq)));
>> +
>> +    if (!list_empty(&fsvq->queued_reqs))
>> +            schedule_work(&fsvq->dispatch_work);
>>      spin_unlock(&fsvq->lock);
>>  }
>>  
>> @@ -353,7 +356,7 @@ static void virtio_fs_request_dispatch_work(struct work_struct *work)
>>  {
>>      struct fuse_req *req;
>>      struct virtio_fs_vq *fsvq = container_of(work, struct virtio_fs_vq,
>> -                                             dispatch_work.work);
>> +                                             dispatch_work);
>>      int ret;
>>  
>>      pr_debug("virtio-fs: worker %s called.\n", __func__);
>> @@ -388,8 +391,6 @@ static void virtio_fs_request_dispatch_work(struct work_struct *work)
>>                      if (ret == -ENOMEM || ret == -ENOSPC) {
>>                              spin_lock(&fsvq->lock);
>>                              list_add_tail(&req->list, &fsvq->queued_reqs);
>> -                            schedule_delayed_work(&fsvq->dispatch_work,
>> -                                                  msecs_to_jiffies(1));
> 
> The virtqueue being full is only one of the reasons a request can
> fail to queue. What if the virtqueue is empty but we could not queue
> the request because of a lack of memory (-ENOMEM)? In that case we will
> queue the request, and it might never be dispatched because no completion
> will arrive (assume no further new requests come in). That means deadlock?
> 
> Thanks
> Vivek
> 

Good catch, that would indeed deadlock.

Is it default kernel behavior to retry a file system
request indefinitely until memory is available?

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization