Eric Blake <ebl...@redhat.com> writes: > When iothreads are in use, the failure to grab the aio context results > in an assertion failure when trying to unlock things during blk_unref, > when trying to unlock a mutex that was not locked. In short, all > calls to nbd_export_put need to done while within the correct aio > context. But since nbd_export_put can recursively reach itself via > nbd_export_close, and recursively grabbing the context would deadlock, > we can't do the context grab directly in those functions, but must do > so in their callers. > > Hoist the use of the correct aio_context from nbd_export_new() to its > caller qmp_nbd_server_add(). Then tweak qmp_nbd_server_remove(), > nbd_eject_notifier(), and nbd_esport_close_all() to grab the right > context, so that all callers during qemu now own the context before > nbd_export_put() can call blk_unref(). > > Remaining uses in qemu-nbd don't matter (since that use case does not > support iothreads). > > Suggested-by: Kevin Wolf <kw...@redhat.com> > Signed-off-by: Eric Blake <ebl...@redhat.com> > --- > > With this in place, my emailed formula [1] for causing an iothread > assertion failure no longer hits, and all the -nbd and -qcow2 iotests > still pass. I would still like to update iotests to cover things (I > could not quickly figure out how to make iotest 222 use iothreads - > either we modify that one or add a new one), but wanted to get review > started on this first. > > [1] https://lists.gnu.org/archive/html/qemu-devel/2019-09/msg03383.html > > include/block/nbd.h | 1 + > blockdev-nbd.c | 14 ++++++++++++-- > nbd/server.c | 23 +++++++++++++++++++---- > 3 files changed, 32 insertions(+), 6 deletions(-) > > diff --git a/include/block/nbd.h b/include/block/nbd.h > index 21550747cf35..316fd705a9e4 100644 > --- a/include/block/nbd.h > +++ b/include/block/nbd.h > @@ -340,6 +340,7 @@ void nbd_export_put(NBDExport *exp); > > BlockBackend *nbd_export_get_blockdev(NBDExport *exp); > > +AioContext *nbd_export_aio_context(NBDExport *exp); > NBDExport *nbd_export_find(const char *name); > void nbd_export_close_all(void); > > diff --git a/blockdev-nbd.c b/blockdev-nbd.c > index 213f226ac1c4..6a8b206e1d74 100644 > --- a/blockdev-nbd.c > +++ b/blockdev-nbd.c > @@ -151,6 +151,7 @@ void qmp_nbd_server_add(const char *device, bool > has_name, const char *name, > BlockBackend *on_eject_blk; > NBDExport *exp; > int64_t len; > + AioContext *aio_context; > > if (!nbd_server) { > error_setg(errp, "NBD server not running"); > @@ -173,11 +174,13 @@ void qmp_nbd_server_add(const char *device, bool > has_name, const char *name, > return; > } > > + aio_context = bdrv_get_aio_context(bs); > + aio_context_acquire(aio_context); > len = bdrv_getlength(bs); > if (len < 0) { > error_setg_errno(errp, -len, > "Failed to determine the NBD export's length"); > - return; > + goto out; > } > > if (!has_writable) { > @@ -190,13 +193,16 @@ void qmp_nbd_server_add(const char *device, bool > has_name, const char *name, > exp = nbd_export_new(bs, 0, len, name, NULL, bitmap, !writable, > !writable, > NULL, false, on_eject_blk, errp); > if (!exp) { > - return; > + goto out; > } > > /* The list of named exports has a strong reference to this export now > and > * our only way of accessing it is through nbd_export_find(), so we can > drop > * the strong reference that is @exp. */ > nbd_export_put(exp); > + > + out: > + aio_context_release(aio_context); > } > > void qmp_nbd_server_remove(const char *name, > @@ -204,6 +210,7 @@ void qmp_nbd_server_remove(const char *name, > Error **errp) > { > NBDExport *exp; > + AioContext *aio_context; > > if (!nbd_server) { > error_setg(errp, "NBD server not running"); > @@ -220,7 +227,10 @@ void qmp_nbd_server_remove(const char *name, > mode = NBD_SERVER_REMOVE_MODE_SAFE; > } > > + aio_context = nbd_export_aio_context(exp); > + aio_context_acquire(aio_context); > nbd_export_remove(exp, mode, errp); > + aio_context_release(aio_context); > } > > void qmp_nbd_server_stop(Error **errp) > diff --git a/nbd/server.c b/nbd/server.c > index 378784c1e54a..3003381c86b4 100644 > --- a/nbd/server.c > +++ b/nbd/server.c > @@ -1458,7 +1458,12 @@ static void blk_aio_detach(void *opaque) > static void nbd_eject_notifier(Notifier *n, void *data) > { > NBDExport *exp = container_of(n, NBDExport, eject_notifier); > + AioContext *aio_context; > + > + aio_context = exp->ctx; > + aio_context_acquire(aio_context); > nbd_export_close(exp); > + aio_context_release(aio_context); > } > > NBDExport *nbd_export_new(BlockDriverState *bs, uint64_t dev_offset, > @@ -1477,12 +1482,11 @@ NBDExport *nbd_export_new(BlockDriverState *bs, > uint64_t dev_offset, > * NBD exports are used for non-shared storage migration. Make sure > * that BDRV_O_INACTIVE is cleared and the image is ready for write > * access since the export could be available before migration handover. > + * ctx was acquired in the caller. > */ > assert(name); > ctx = bdrv_get_aio_context(bs); > - aio_context_acquire(ctx); > bdrv_invalidate_cache(bs, NULL); > - aio_context_release(ctx); > > /* Don't allow resize while the NBD server is running, otherwise we don't > * care what happens with the node. */ > @@ -1490,7 +1494,7 @@ NBDExport *nbd_export_new(BlockDriverState *bs, > uint64_t dev_offset, > if (!readonly) { > perm |= BLK_PERM_WRITE; > } > - blk = blk_new(bdrv_get_aio_context(bs), perm, > + blk = blk_new(ctx, perm, > BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED | > BLK_PERM_WRITE | BLK_PERM_GRAPH_MOD); > ret = blk_insert_bs(blk, bs, errp); > @@ -1557,7 +1561,7 @@ NBDExport *nbd_export_new(BlockDriverState *bs, > uint64_t dev_offset, > } > > exp->close = close; > - exp->ctx = blk_get_aio_context(blk); > + exp->ctx = ctx; > blk_add_aio_context_notifier(blk, blk_aio_attached, blk_aio_detach, exp); > > if (on_eject_blk) { > @@ -1590,11 +1594,18 @@ NBDExport *nbd_export_find(const char *name) > return NULL; > } > > +AioContext * > +nbd_export_aio_context(NBDExport *exp) > +{ > + return exp->ctx; > +} > + > void nbd_export_close(NBDExport *exp) > { > NBDClient *client, *next; > > nbd_export_get(exp); > +
I'm not sure if this new line was added here on purpose. > /* > * TODO: Should we expand QMP NbdServerRemoveNode enum to allow a > * close mode that stops advertising the export to new clients but > @@ -1684,9 +1695,13 @@ BlockBackend *nbd_export_get_blockdev(NBDExport *exp) > void nbd_export_close_all(void) > { > NBDExport *exp, *next; > + AioContext *aio_context; > > QTAILQ_FOREACH_SAFE(exp, &exports, next, next) { > + aio_context = exp->ctx; > + aio_context_acquire(aio_context); > nbd_export_close(exp); > + aio_context_release(aio_context); > } > } Otherwise, LGTM. Reviewed-by: Sergio Lopez <s...@redhat.com>
signature.asc
Description: PGP signature