On Mon, Mar 13, 2023 at 5:32 PM Kevin Wolf <kw...@redhat.com> wrote:
> > So I still think that this bug is a symptom of a problem in the design
> > of request queuing.
> >
> > In fact, shouldn't request queuing be enabled at the _end_ of
> > bdrv_drained_begin (once the BlockBackend has reached a quiescent
> > state on its own terms), rather than at the beginning (which leads to
> > deadlocks like this one)?
>
> 1. I want to have exclusive access to the node. This one wants request
>    queuing from the start to avoid losing time unnecessarily until the
>    guest stops sending new requests.
>
> 2. I want to wait for my requests to complete. This one never wants
>    request queuing. Enabling it at the end of bdrv_drained_begin()
>    wouldn't hurt it (because it has already achieved its goal then), but
>    it's also not necessary for the same reason.

Right, doing it at the end would be needed to avoid the deadlocks. On
the other hand, case 1 can (and I think should) be handled by
.drained_begin, or shortcut through aio_disable_external() for those
devices that use ioeventfd.

Paolo

> So maybe what we could take from this is that request queuing should be
> temporarily disabled while we're in blk_drain*() because these
> interfaces are only meant for case 2. In all other cases, it should
> continue to work as it does now.