Hi,

On 30.05.23 at 18:32, Kevin Wolf wrote:
> From: Stefan Hajnoczi <stefa...@redhat.com>
>
> Detach ioeventfds during drained sections to stop I/O submission from
> the guest. virtio-blk is no longer reliant on aio_disable_external()
> after this patch. This will allow us to remove the
> aio_disable_external() API once all other code that relies on it is
> converted.
>
> Take extra care to avoid attaching/detaching ioeventfds if the data
> plane is started/stopped during a drained section. This should be rare,
> but maybe the mirror block job can trigger it.
>
> Signed-off-by: Stefan Hajnoczi <stefa...@redhat.com>
> Message-Id: <20230516190238.8401-18-stefa...@redhat.com>
> Signed-off-by: Kevin Wolf <kw...@redhat.com>

I ran into a strange issue where guest IO would get completely stuck
during certain block jobs a while ago and finally managed to find a
small reproducer [0].

I'm using a VM with virtio-blk-pci (or virtio-scsi-pci) with an iothread
and running

fio --name=file --size=100M --direct=1 --rw=randwrite --bs=4k --ioengine=psync --numjobs=5 --runtime=1200 --time_based

in the guest. Then I'm issuing the QMP command with the reproducer in a
loop. Usually, the guest IO will get stuck after about 1-3 minutes.
Sometimes fio can manage to continue with a lower speed for a while (but
trying to Ctrl+C it or doing other IO in the guest will already be
broken), which I guess could be a hint that it's an issue with
notifiers?

Bisecting (to declare a commit good, I waited 10 minutes) led me to this
patch, i.e. commit 1665d9326f ("virtio-blk: implement
BlockDevOps->drained_begin()") and for SCSI, I verified that the issue
similarly starts happening after 766aa2de0f ("virtio-scsi: implement
BlockDevOps->drained_begin()").

Both issues are still present on current master (i.e. 1c98a821a2
("tests/qtest: Introduce tests for AMD/Xilinx Versal TRNG device")).

Happy to provide more information and hints about how to debug the issue
further.

Best Regards,
Fiona

[0]:

> diff --git a/blockdev.c b/blockdev.c
> index db2725fe74..bf2e0fc22c 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -2986,6 +2986,11 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
>      bool zero_target;
>      int ret;
>  
> +    bdrv_drain_all_begin();
> +    bdrv_drain_all_end();
> +    return;
> +
> +
>      bs = qmp_get_root_bs(arg->device, errp);
>      if (!bs) {
>          return;
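
The QMP loop mentioned above can be sketched roughly as follows. This is
only a minimal sketch under stated assumptions, not the exact script that
was used: it assumes QEMU exposes a QMP UNIX socket at the hypothetical
path /tmp/qmp.sock, that the patched command is drive-mirror (the
reproducer modifies qmp_drive_mirror()), and that "drive0" and
"/tmp/unused.raw" are placeholder values for the mandatory arguments. The
0.1 second pause between calls is likewise an arbitrary choice.

```python
import json
import socket
import time

QMP_SOCKET = "/tmp/qmp.sock"  # hypothetical path, adjust to your -qmp option


def drive_mirror_cmd(device, target):
    """Build a drive-mirror command. With the reproducer applied, QEMU only
    runs bdrv_drain_all_begin()/bdrv_drain_all_end() and returns."""
    return {
        "execute": "drive-mirror",
        "arguments": {"device": device, "target": target, "sync": "full"},
    }


def qmp_execute(chan, cmd):
    """Send one command and return its reply, skipping asynchronous events."""
    chan.write(json.dumps(cmd) + "\n")
    chan.flush()
    while True:
        line = chan.readline()
        if not line:
            raise ConnectionError("QMP connection closed")
        msg = json.loads(line)
        if "return" in msg or "error" in msg:
            return msg


def main():
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(QMP_SOCKET)
        chan = sock.makefile("rw")
        json.loads(chan.readline())  # consume the QMP greeting
        qmp_execute(chan, {"execute": "qmp_capabilities"})
        while True:
            # Placeholder device and target names (assumptions).
            qmp_execute(chan, drive_mirror_cmd("drive0", "/tmp/unused.raw"))
            time.sleep(0.1)  # arbitrary pause between drain cycles


if __name__ == "__main__":
    main()
```

Run it on the host while fio is running in the guest; it loops until
interrupted.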