On Sun, 02/15 10:40, Fam Zheng wrote: > Even if the caller has the old #AioContext, there can be a deadlock, due > to the leading bdrv_drain_all: > > Suppose there are three io threads (a, b, c) with each owning a BDS > (bds_a, bds_b, bds_c), and a and b want to move their own BDS to c at > the same time: > > iothread a iothread b > -------------------------------------------------------------------------- > bdrv_set_aio_context(bds_a, c) bdrv_set_aio_context(bds_b, c) > -> bdrv_drain_all() -> bdrv_drain_all() > -> acquire a (OK, already has) -> acquire a (blocked) > -> acquire b (blocked) -> acquire b > -> acquire c -> acquire c
This doesn't recap the essence of the bug because one may argue that aio_context_acquire(c) is needed before either thread calling bdrv_set_aio_context. Actually it doesn't matter in this case, because iothread b can as well do bdrv_set_aio_context(bds_b, d). I'll update the commit log with v3. Fam > > Current caller of bdrv_set_aio_context outside BQL is > virtio-scsi-dataplane, which will be fixed in the next patches. > > Signed-off-by: Fam Zheng <f...@redhat.com> > Reviewed-by: Paolo Bonzini <pbonz...@redhat.com> > --- > include/block/block.h | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/include/block/block.h b/include/block/block.h > index 321295e..4fce25d 100644 > --- a/include/block/block.h > +++ b/include/block/block.h > @@ -546,8 +546,7 @@ AioContext *bdrv_get_aio_context(BlockDriverState *bs); > * Changes the #AioContext used for fd handlers, timers, and BHs by this > * BlockDriverState and all its children. > * > - * This function must be called from the old #AioContext or with a lock held > so > - * the old #AioContext is not executing. > + * This function must be called with iothread lock held. > */ > void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context); > > -- > 2.1.0 > >