Stefan Hajnoczi <stefa...@redhat.com> writes: > On Sun, Feb 14, 2016 at 02:22:10PM +0800, Fam Zheng wrote: >> On Tue, 02/09 13:47, Stefan Hajnoczi wrote: >> > On Mon, Feb 08, 2016 at 03:17:23PM +0000, Dr. David Alan Gilbert wrote: >> > > Does this make sense to everyone else, or does anyone have any better >> > > suggestions? >> > >> > As a concrete example, any monitor command that calls bdrv_drain_all() >> > can hang forever with the QEMU global mutex held if I/O requests are >> > stuck (e.g. NFS mount is unreachable). >> > >> > bdrv_aio_cancel() can also hang but is mostly exposed to device >> > emulation, not the monitor. >> > >> > One solution for these block layer functions is to add a timeout >> > argument and let them return an error. This way the monitor and device >> > emulation do not hang forever. >> >> Yes, there are a few places in block layer invoking aio_poll() in a loop >> waiting for certain events, and a disconnected network link could make QEMU >> hang. In these cases a timeout is a huge improvement. Maybe we can mark the >> BDS as "hanging" (-EIO is returned for all further requests) and let >> bdrv_drain_all() return. > > No, we need to be very careful about hung requests that are still in > flight. They could complete after a long time and modify the disk or > guest RAM. > > The only thing bdrv_drain_all() can return is -ETIME. Beyond that it > cannot pretend that requests have been drained since that could lead to > data corruption/loss. > > The caller has to abort whatever it was trying to do since it has no > guarantee that requests have drained. Maybe the remaining requests will > fail and go away, maybe they will complete. We don't know... > >> > >> > The benefit of the timeout is that both monitor and device emulation >> > hangs are tackled. It also doesn't require monitor changes. >> > >> > I'm not sure who chooses the timeout value and which value makes sense >> > (policy vs mechanism separation)... >> >> Default to 30 seconds like Linux, and make it tunable through command line >> options as well as QMP? > > Yes. Either commands that block can take an optional timeout (seconds) > parameter, or we could have a global disk I/O timeout value. I prefer > passing an optional per-command value. > > Libvirt and other sophisticated clients could begin using it in a > backwards-compatible way. Existing code wouldn't use it and still > suffer from the hang problem (but at least QEMU is > backwards-compatible).
I'm not arguing against timeouts, I just want to remind everyone that blocking while holding the BQL is a bug, and limiting the block time with timeouts is a work-around, no more, no less.