Am 19.05.2022 um 13:27 hat Stefan Hajnoczi geschrieben: > On Wed, May 18, 2022 at 06:14:17PM +0200, Kevin Wolf wrote: > > If we want to use drain for locking, we need to make sure that drain > > actually does the job correctly. I see two major problems with it: > > > > The first one is that drain only covers I/O paths, but we need to > > protect against _anything_ touching block nodes. This might mean a > > massive audit and making sure that everything in QEMU that could > > possibly touch a block node is integrated with drain. > > > I think Emanuele has argued before that because writes to the graph only > > happen in the main thread and we believe that currently only I/O > > requests are processed in iothreads, this is safe and we don't actually > > need to audit everything. > > I'm interested in the non-I/O code path cases you're thinking about: > > Block jobs receive callbacks during drain. They are safe.
We've had bugs in the callbacks before, so while they are probably as safe as they can get when each user has to actively support drain, I'm a bit less than 100% confident. > Exports: > - The nbd export has code to deal with drain and looks safe. > - vhost-user-blk uses aio_set_fd_handler(is_external=true) for virtqueue > kick fds but not for the vhost-user UNIX domain socket (not sure if > that's a problem). > - FUSE uses aio_set_fd_handler(is_external=true) and looks safe. > > The monitor runs with the BQL in the main loop and doesn't use > coroutines. It should be safe. The monitor does use coroutines (though I think this doesn't make a difference; the more important point is that this coroutine runs in qemu_aio_context while executing commands) and is not safe with respect to drain and we're seeing bugs coming from it: https://lists.gnu.org/archive/html/qemu-block/2022-03/msg00582.html The broader point here is that even running with the BQL in the main loop doesn't help you at all if the graph writer it could interfere with polls or is a coroutine that yields. You're only safe if _both_ sides run with the BQL in the main thread and neither poll nor yield. This is the condition I explained under which Emanuele's reasoning holds true. So you can choose between verifying that the monitor actively respects drain (it doesn't currently) or verifying that no graph writer can poll or yield during its drain section so that holding the BQL is enough (not true today and I'm not sure if it could be made true even in theory). > Anything else? How to figure this out is precisely my question to you. :-) Maybe migration completion? Some random BHs? A guest enabling a virtio-blk device so that the dataplane gets started and its backend is moved to a different AioContext? I'm not sure if these cases are bad. Maybe they aren't. But how do I tell without reading every code path? Well, and the follow-up question: How do we make sure that the list of "anything else" doesn't change over time when we rely on auditing every item on it for the correctness of drain? Kevin
signature.asc
Description: PGP signature