children: new rwlock

Kevin Wolf Thu, 19 May 2022 05:54:42 -0700

Am 19.05.2022 um 13:27 hat Stefan Hajnoczi geschrieben:
> On Wed, May 18, 2022 at 06:14:17PM +0200, Kevin Wolf wrote:
> > If we want to use drain for locking, we need to make sure that drain
> > actually does the job correctly. I see two major problems with it:
> > 
> > The first one is that drain only covers I/O paths, but we need to
> > protect against _anything_ touching block nodes. This might mean a
> > massive audit and making sure that everything in QEMU that could
> > possibly touch a block node is integrated with drain.
> 
> > I think Emanuele has argued before that because writes to the graph only
> > happen in the main thread and we believe that currently only I/O
> > requests are processed in iothreads, this is safe and we don't actually
> > need to audit everything.
> 
> I'm interested in the non-I/O code path cases you're thinking about:
> 
> Block jobs receive callbacks during drain. They are safe.


We've had bugs in the callbacks before, so while they are probably as
safe as they can get when each user has to actively support drain, I'm
a bit less than 100% confident.

> Exports:
> - The nbd export has code to deal with drain and looks safe.
> - vhost-user-blk uses aio_set_fd_handler(is_external=true) for virtqueue
>   kick fds but not for the vhost-user UNIX domain socket (not sure if
>   that's a problem).
> - FUSE uses aio_set_fd_handler(is_external=true) and looks safe.
> 
> The monitor runs with the BQL in the main loop and doesn't use
> coroutines. It should be safe.

The monitor does use coroutines (though I think this doesn't make a
difference; the more important point is that this coroutine runs in
qemu_aio_context while executing commands) and is not safe with respect
to drain and we're seeing bugs coming from it:

https://lists.gnu.org/archive/html/qemu-block/2022-03/msg00582.html

The broader point here is that even running with the BQL in the main
loop doesn't help you at all if the graph writer it could interfere with
polls or is a coroutine that yields. You're only safe if _both_ sides
run with the BQL in the main thread and neither poll nor yield. This is
the condition I explained under which Emanuele's reasoning holds true.

So you can choose between verifying that the monitor actively respects
drain (it doesn't currently) or verifying that no graph writer can poll
or yield during its drain section so that holding the BQL is enough (not
true today and I'm not sure if it could be made true even in theory).

> Anything else?

How to figure this out is precisely my question to you. :-) Maybe
migration completion? Some random BHs? A guest enabling a virtio-blk
device so that the dataplane gets started and its backend is moved to a
different AioContext? I'm not sure if these cases are bad. Maybe they
aren't. But how do I tell without reading every code path?

Well, and the follow-up question: How do we make sure that the list of
"anything else" doesn't change over time when we rely on auditing every
item on it for the correctness of drain?

Kevin

signature.asc
Description: PGP signature

Re: [RFC PATCH v2 0/8] Removal of AioContext lock, bs->parents and ->children: new rwlock

Reply via email to