On Fri, Oct 17, 2025 at 11:43:30AM +0200, Fiona Ebner wrote:
> When scsi_req_dequeue() is reached via
> scsi_req_cancel_async()
> virtio_scsi_tmf_cancel_req()
> virtio_scsi_do_tmf_aio_context(),
> there is a deadlock when trying to acquire the SCSI device's requests
> lock, because it was already acquired in
> virtio_scsi_do_tmf_aio_context().
>
> In particular, the issue happens with a FreeBSD guest (13, 14, 15,
> maybe more), when it cancels SCSI requests, because of timeout.
>
> This is a regression caused by commit da6eebb33b ("virtio-scsi:
> perform TMFs in appropriate AioContexts") and the introduction of the
> requests_lock earlier.
>
> To fix the issue, only cancel the requests after releasing the
> requests_lock. For this, the SCSI device's requests are iterated while
> holding the requests_lock and the requests to be cancelled are
> collected in a list. Then, the collected requests are cancelled
> one by one while not holding the requests_lock. This is safe, because
> only requests from the current AioContext are collected and acted
> upon.
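
For illustration only, the collect-then-cancel approach described above can be sketched outside of QEMU roughly as follows. This is a minimal sketch under stated assumptions, not the actual patch: Request, Device, Node, device_add_request(), cancel_request() and cancel_for_context() are hypothetical stand-ins, a plain pthread mutex plays the role of the requests_lock, and an integer ctx field plays the role of the AioContext. Matching requests are prepended to a temporary list while the lock is held, and cancelled only after it has been released.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the QEMU SCSIRequest/SCSIDevice types. */
typedef struct Request {
    int tag;
    int ctx;                /* stand-in for the owning AioContext */
    bool cancelled;
    struct Request *next;
} Request;

typedef struct Device {
    pthread_mutex_t requests_lock;
    Request *requests;      /* protected by requests_lock */
} Device;

typedef struct Node {
    Request *req;
    struct Node *next;
} Node;

static void device_add_request(Device *dev, Request *req)
{
    pthread_mutex_lock(&dev->requests_lock);
    req->next = dev->requests;
    dev->requests = req;
    pthread_mutex_unlock(&dev->requests_lock);
}

/*
 * Takes requests_lock itself to unlink the request. With a non-recursive
 * mutex, calling this while already holding the lock would deadlock.
 */
static void cancel_request(Device *dev, Request *req)
{
    assert(!req->cancelled);
    pthread_mutex_lock(&dev->requests_lock);
    for (Request **p = &dev->requests; *p; p = &(*p)->next) {
        if (*p == req) {
            *p = req->next;
            break;
        }
    }
    req->next = NULL;
    req->cancelled = true;
    pthread_mutex_unlock(&dev->requests_lock);
}

/* Returns the number of requests cancelled for the given context. */
static int cancel_for_context(Device *dev, int ctx)
{
    Node *to_cancel = NULL;
    int count = 0;

    /* Phase 1: collect matching requests while holding the lock. */
    pthread_mutex_lock(&dev->requests_lock);
    for (Request *r = dev->requests; r; r = r->next) {
        if (r->ctx == ctx) {
            Node *node = malloc(sizeof(*node));
            if (!node) {
                abort();
            }
            node->req = r;
            node->next = to_cancel;   /* O(1) prepend, order not preserved */
            to_cancel = node;
        }
    }
    pthread_mutex_unlock(&dev->requests_lock);

    /*
     * Phase 2: cancel without holding the lock. This sketch assumes, as the
     * commit message does, that only the current context touches these
     * requests, so they cannot go away in between.
     */
    while (to_cancel) {
        Node *node = to_cancel;
        to_cancel = node->next;
        cancel_request(dev, node->req);
        free(node);
        count++;
    }
    return count;
}
```

The temporary list is built by prepending, so the requests are cancelled in the reverse of the order in which they were found. That does not matter in this sketch because each cancellation is independent of the others.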
>
> Originally reported by Proxmox VE users:
> https://bugzilla.proxmox.com/show_bug.cgi?id=6810
> https://forum.proxmox.com/threads/173914/
>
> Fixes: da6eebb33b ("virtio-scsi: perform TMFs in appropriate AioContexts")
> Suggested-by: Stefan Hajnoczi <[email protected]>
> Signed-off-by: Fiona Ebner <[email protected]>
> ---
>
> Changes in v2:
> * Different approach, collect requests for cancelling in a list for a
> localized solution rather than keeping track of the lock status via
> function arguments.
>
> hw/scsi/virtio-scsi.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)

Thanks, applied to my block tree:
https://gitlab.com/stefanha/qemu/commits/block

I replaced g_list_append() with g_list_prepend() like in
scsi_device_for_each_req_async_bh(). The GLib documentation says the
following (https://docs.gtk.org/glib/type_func.List.append.html):

  g_list_append() has to traverse the entire list to find the end,
  which is inefficient when adding multiple elements. A common idiom
  to avoid the inefficiency is to use g_list_prepend() and reverse the
  list with g_list_reverse() when all elements have been added.

We don't call g_list_reverse() in scsi_device_for_each_req_async_bh()
and I don't think it's necessary here either.

Stefan
