On 03.11.23 at 14:12, Fiona Ebner wrote:
> Hi,
>
> I ran into a strange issue where guest IO would get completely stuck
> during certain block jobs a while ago and finally managed to find a
> small reproducer [0]. I'm using a VM with virtio-blk-pci (or
> virtio-scsi-pci) with an iothread and running
>
> fio --name=file --size=100M --direct=1 --rw=randwrite --bs=4k
> --ioengine=psync --numjobs=5 --runtime=1200 --time_based
>
> in the guest. Then I'm issuing the QMP command with the reproducer in a
> loop. Usually, the guest IO will get stuck after about 1-3 minutes,
> sometimes fio can manage to continue with a lower speed for a while (but
> trying to Ctrl+C it or doing other IO in the guest will already be
> broken), which I guess could be a hint that it's an issue with notifiers?
>
> Bisecting (to declare a commit good, I waited 10 minutes) led me to this
> patch, i.e. commit 1665d9326f ("virtio-blk: implement
> BlockDevOps->drained_begin()") and for SCSI, I verified that the issue
> similarly starts happening after 766aa2de0f ("virtio-scsi: implement
> BlockDevOps->drained_begin()").
>
> Both issues are still present on current master (i.e. 1c98a821a2
> ("tests/qtest: Introduce tests for AMD/Xilinx Versal TRNG device"))
>
> Happy to provide more information and hints about how to debug the issue
> further.
>
I think I was finally able to get to the bottom of this and have a
plausible-sounding pet theory now. It involves the VirtIO notifier
optimization during poll mode. Let's step through some debug prints I
added. First number is always the thread ID (I'm sorry that I used
warn_report rather than proper tracing):

> 247050 nodefd 29 poll_set_started 1

The iothread starts poll mode for the node with fd 29 which is the
virtio host notifier.

> 247050 0x55e515185270 poll begin for vq
> 247050 0x55e515185270 setting notification for vq 0

virtio_queue_set_notification is called to disable notification.

> 247050 nodefd 29 poll_set_started 1 done
> 247050 0x55e515185270 handle vq suppress_notifications 0 num_reqs 1
> 247050 0x55e515185270 handle vq suppress_notifications 0 num_reqs 4

virtio-blk handling some requests, note that suppress_notifications is 0
because we are in poll mode.

> 247048 nodefd 29 addr 0x55e51496ed70 marking as deleted

Main thread marks the node for deletion when beginning drain, i.e.
detaches the host notifier.

> 247048 nodefd 29 addr 0x55e513cdcd20 is_new 1 adding node

Main thread adds a new node when ending drain, i.e. attaches the host
notifier.

> 247050 nodefd 29 addr 0x55e51496ed70 remove deleted handler

The iothread removes the handler marked for removal. In particular from
the node_poll list:

QLIST_SAFE_REMOVE(node, node_poll);

> 247050 disabling poll mode before fdmon_ops->wait

This is just before the call to
poll_set_started(ctx, &ready_list, false)

Whoops!! Nobody ends poll mode for the node with fd 29, because the old
node was deleted from the node_poll list already and new node is not
part of it, i.e. nobody has started poll mode for the new node.

> 247050 0x55e515185270 handle vq suppress_notifications 0 num_reqs 0

fdmon_ops->wait() returns one last time (not sure why) but no actual
requests.

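To make the sequence up to this point concrete, here is a toy model in
plain C. It is not QEMU code: ToyNode, ToyVq, toy_poll_set_started and
toy_drain_race are names I made up for illustration, and the model only
encodes my theory from the trace (poll begin disables notification, poll
end re-enables it, and only nodes on the poll list get either callback).
The workaround flag stands in for calling
virtio_queue_set_notification(vq, true) when the drain ends.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an AIO handler node; not the QEMU struct. */
typedef struct ToyNode {
    int fd;
    bool deleted;      /* marked for deletion by the main thread */
    bool on_poll_list; /* member of the iothread's poll list */
} ToyNode;

/* Hypothetical stand-in for the virtqueue's notification state. */
typedef struct ToyVq {
    bool notification_enabled;
} ToyVq;

/*
 * Model of starting/ending poll mode: only nodes that are on the poll
 * list and not marked deleted get their begin/end callback. Poll begin
 * disables notification, poll end re-enables it.
 */
static void toy_poll_set_started(ToyNode **nodes, size_t n, ToyVq *vq,
                                 bool started)
{
    for (size_t i = 0; i < n; i++) {
        ToyNode *node = nodes[i];
        if (!node->on_poll_list || node->deleted) {
            continue;
        }
        vq->notification_enabled = !started;
    }
}

/*
 * Replays the sequence from the debug prints. Returns whether
 * notification is enabled once the iothread has left poll mode.
 */
bool toy_drain_race(bool workaround)
{
    ToyNode old_node = { .fd = 29, .deleted = false, .on_poll_list = true };
    ToyNode new_node = { .fd = 29, .deleted = false, .on_poll_list = false };
    ToyNode *nodes[] = { &old_node, &new_node };
    size_t n = 1; /* only the old node exists at first */
    ToyVq vq = { .notification_enabled = true };

    /* iothread: start poll mode, which disables notification */
    toy_poll_set_started(nodes, n, &vq, true);
    assert(!vq.notification_enabled);

    /* main thread, drain begin: mark the old node as deleted */
    old_node.deleted = true;

    /* main thread, drain end: add a new node, not on the poll list */
    n = 2;
    if (workaround) {
        /* models re-enabling notification when the drain ends */
        vq.notification_enabled = true;
    }

    /* iothread: remove the deleted handler, also from the poll list */
    old_node.on_poll_list = false;

    /* iothread: end poll mode; no node is left to get the end callback */
    toy_poll_set_started(nodes, n, &vq, false);

    return vq.notification_enabled;
}
```

In this model toy_drain_race(false) returns false, i.e. notification
stays disabled after poll mode has ended, while toy_drain_race(true)
returns true. It says nothing about whether the real fix belongs in the
AIO code or in the VirtIO drain callbacks.
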
> 247050 disabling poll mode before fdmon_ops->wait

After this, the fdmon_ops->wait() (it's fdmon_poll_wait in my case) will
just wait forever (or until triggering QMP 'stop' and 'cont' which
restarts the dataplane).

A minimal workaround seems to be either calling

event_notifier_set(virtio_queue_get_host_notifier(vq));

or

virtio_queue_set_notification(vq, true);

in drained_end (for both VirtIO SCSI/block). But is this an actual issue
with the AIO interface/implementation? Or should it rather be considered
a bug in the VirtIO SCSI/block drain implementation, because of the
notification optimization?

Best Regards,
Fiona

>
> [0]:
>
>> diff --git a/blockdev.c b/blockdev.c
>> index db2725fe74..bf2e0fc22c 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -2986,6 +2986,11 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
>>      bool zero_target;
>>      int ret;
>>
>> +    bdrv_drain_all_begin();
>> +    bdrv_drain_all_end();
>> +    return;
>> +
>> +
>>      bs = qmp_get_root_bs(arg->device, errp);
>>      if (!bs) {
>>          return;
>
>
>