On 16.09.2014 at 14:35, Paolo Bonzini wrote:
> On 16/09/2014 at 14:34, Kevin Wolf wrote:
> > I think bdrv_invalidate_cache() really needs to call bdrv_drain_all()
> > before starting to reopen stuff. There could be requests in flight
> > without holding the lock and if you can indeed reopen their BDS under
> > their feet without breaking things (I doubt it), that would be pure
> > luck.
> 
> But even that's not enough without a lock if .bdrv_invalidate_cache (the
> callback) is called from a coroutine.  As soon as it yields, another
> request can come in, for example from the NBD server.

Yes, that's true. We can't fix this problem in qcow2, though, because
it's a more general one.  I think we must make sure that
bdrv_invalidate_cache() doesn't yield.

We could do that either by forbidding bdrv_invalidate_cache() from
running in a coroutine and moving the problem to the caller (where and
why is it even called from a coroutine?), or possibly by creating a new
coroutine for the driver callback and running it in a nested event loop
that only handles bdrv_invalidate_cache() callbacks, so that the NBD
server doesn't get a chance to process new requests in this thread.

Forbidding it from running in a coroutine sounds easier, but I don't yet
see which caller would have to be fixed.
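
To make the first option a bit more concrete, here is a toy model of the
ordering I have in mind. This is not QEMU code: ToyBDS, toy_in_coroutine,
toy_drain_all() and toy_invalidate_cache() are all made-up stand-ins, and
the "drain" is just a counter reaching zero. It only illustrates the idea
of refusing coroutine context, draining in-flight requests, and reopening
only after that.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins only; every name here is hypothetical, not a QEMU API. */
typedef struct ToyBDS {
    int in_flight;   /* requests currently being processed */
    int generation;  /* bumped each time the image is "reopened" */
} ToyBDS;

/* Stands in for a "are we running inside a coroutine?" check. */
bool toy_in_coroutine;

/* Complete every outstanding request before returning. */
void toy_drain_all(ToyBDS *bs)
{
    while (bs->in_flight > 0) {
        /* A real event loop would run request completions here. */
        bs->in_flight--;
    }
}

/*
 * Returns 0 on success, -1 if called from coroutine context.
 * Refusing coroutine context means this function can never yield
 * halfway through and let a new request sneak in.
 */
int toy_invalidate_cache(ToyBDS *bs)
{
    if (toy_in_coroutine) {
        return -1;  /* the caller has to be fixed instead */
    }

    toy_drain_all(bs);
    assert(bs->in_flight == 0);

    bs->generation++;  /* "reopen" with nothing in flight */
    return 0;
}
```

With toy_in_coroutine set, toy_invalidate_cache() fails and leaves the
state untouched. Otherwise it drains first and bumps generation only once
in_flight has reached zero. The nested event loop variant isn't modelled
here.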

Kevin
