On Mon, Feb 08, 2016 at 03:17:23PM +0000, Dr. David Alan Gilbert wrote:
> Does this make sense to everyone else, or does anyone have any better
> suggestions?

As a concrete example, any monitor command that calls bdrv_drain_all()
can hang forever with the QEMU global mutex held if I/O requests are
stuck (e.g. NFS mount is unreachable).  bdrv_aio_cancel() can also hang
but is mostly exposed to device emulation, not the monitor.

One solution for these block layer functions is to add a timeout
argument and let them return an error.  This way the monitor and device
emulation do not hang forever.

The benefit of the timeout is that both monitor and device emulation
hangs are tackled.  It also doesn't require monitor changes.

I'm not sure who chooses the timeout value and which value makes sense
(policy vs mechanism separation)...

Stefan
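
To make the timeout idea concrete, here is a minimal standalone sketch of
a drain loop that gives up after a deadline and returns an error instead
of waiting forever.  None of this is existing QEMU code: FakeBlockState,
fake_poll() and drain_all_timeout() are hypothetical names invented for
illustration, and a real implementation would wait in the event loop
rather than spin.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the block layer's pending request state. */
typedef struct {
    int in_flight;  /* number of requests that have not completed yet */
    bool stuck;     /* true simulates e.g. an unreachable NFS server */
} FakeBlockState;

/* Current monotonic time in milliseconds. */
static int64_t now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Complete one pending request, unless the backend is stuck. */
static void fake_poll(FakeBlockState *s)
{
    if (!s->stuck && s->in_flight > 0) {
        s->in_flight--;
    }
}

/*
 * Wait for all in-flight requests to finish, but no longer than
 * timeout_ms.  Returns 0 on success or -ETIMEDOUT if requests are
 * still pending when the deadline passes.
 */
static int drain_all_timeout(FakeBlockState *s, int64_t timeout_ms)
{
    assert(s != NULL);
    int64_t deadline = now_ms() + timeout_ms;

    while (s->in_flight > 0) {
        if (now_ms() >= deadline) {
            return -ETIMEDOUT;  /* caller decides how to report this */
        }
        fake_poll(s);
    }
    return 0;
}
```

A caller such as a monitor command would check the return value and fail
the command on -ETIMEDOUT.  The timeout is passed in by the caller here,
which keeps the mechanism in the block layer and leaves the choice of
value (the policy question) open.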