On Fri, 03/24 15:27, Stefan Hajnoczi wrote:
> On Thu, Mar 23, 2017 at 06:57:14PM +0100, Paolo Bonzini wrote:
> >
> >
> > On 23/03/2017 18:44, Stefan Hajnoczi wrote:
> > >> It's possible to wedge QEMU if the guest tries to reset a virtio-pci
> > >> device as QEMU is also using the drive for a blockjob. This patchset
> > >> aims to allow us to safely pause/resume jobs attached to individual
> > >> nodes in a manner similar to how bdrv_drain_all_begin/end do.
> > >
> > > Weird, I thought the 0 nanosecond sleep that block jobs do in their
> > > main loop allows aio_poll() loops to finish.
> >
> > The 0 nanosecond sleep is now done in the BDS AioContext rather than in
> > the "non-aio_poll-aware" main loop:
> >
> > commit 0b9caf9b3166c8deb3c4f3a774c2384b069dc29c
> > Author: Fam Zheng <f...@redhat.com>
> > Date: Tue Aug 26 15:15:43 2014 +0800
> >
> >     coroutine: Drop co_sleep_ns
> >
> >     block_job_sleep_ns is the only user. Since we are moving towards
> >     AioContext aware code, it's better to use the explicit version and drop
> >     the old one.
> >
> >     Signed-off-by: Fam Zheng <f...@redhat.com>
> >     Reviewed-by: Benoît Canet <benoit.ca...@nodalink.com>
> >     Signed-off-by: Stefan Hajnoczi <stefa...@redhat.com>
>
> But we hold the AioContext lock and are calling aio_poll(), so I would
> expect our loop to terminate. The blockjob coroutine should still be
> leaving this little gap in activity during which the aio_poll() loop
> finishes.

I am not sure, but at each gap aio_poll() is already free to fire the
0 nanosecond sleep timer callback, which resumes the job and generates
more I/O, so the loop may never see the device go idle. The correct fix
is to set the pause flag, as this series does.

Fam
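
To illustrate the suspected behaviour, here is a toy model in C. It is
not QEMU code: ToyJob, toy_poll() and toy_drain() are hypothetical names,
and the "one request in flight, one 0 ns timer" setup is an assumption
made only to show why a poll loop might never observe an idle device
unless the job checks a pause flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; every name below is hypothetical, not a QEMU API. */
typedef struct {
    int in_flight;     /* requests the drain loop is waiting on */
    bool timer_armed;  /* a 0 ns sleep timer is pending */
    bool paused;       /* pause flag the job checks before doing work */
} ToyJob;

/* One aio_poll()-like pass: deliver a completion, then run expired timers. */
static void toy_poll(ToyJob *job)
{
    if (job->in_flight > 0) {
        job->in_flight--;              /* one I/O completion is delivered */
    }
    if (job->timer_armed) {
        job->timer_armed = false;      /* a 0 ns timer has always expired */
        if (!job->paused) {
            job->in_flight++;          /* job wakes up and submits more I/O */
            job->timer_armed = true;   /* ...then sleeps for 0 ns again */
        }
    }
}

/*
 * Poll until nothing is in flight. Returns the number of polls needed,
 * or -1 if the loop gave up after max_polls passes.
 */
static int toy_drain(bool set_pause, int max_polls)
{
    ToyJob job = { .in_flight = 1, .timer_armed = true, .paused = set_pause };

    assert(max_polls > 0);
    for (int polls = 1; polls <= max_polls; polls++) {
        toy_poll(&job);
        if (job.in_flight == 0) {
            return polls;
        }
    }
    return -1;
}
```

In this model toy_drain(false, n) never reaches zero in-flight requests
for any n, because the timer refills the queue inside the same poll that
emptied it, while toy_drain(true, n) goes idle on the first pass.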