Eric Blake <[email protected]> writes:

> On 12/9/19 10:06 AM, Kevin Wolf wrote:
>> On 28.11.2019 at 11:41, Sergio Lopez wrote:
>>> bdrv_try_set_aio_context() requires that the old context is held, and
>>> the new context is not held. Fix all the occurrences where it's not
>>> done this way.
>>>
>>> Suggested-by: Max Reitz <[email protected]>
>>> Signed-off-by: Sergio Lopez <[email protected]>
>>> ---
>
>> Or in fact, I think you need to hold the AioContext of a bs to
>> bdrv_unref() it, so maybe 'goto out' is right, but you need to unref
>> target_bs while you still hold old_context.
>
> I suspect https://bugzilla.redhat.com/show_bug.cgi?id=1779036 is also
> a symptom of this. The v5 patch did not fix this simple test case:
>
> $ qemu-img create -f qcow2 f1 100m
> $ qemu-img create -f qcow2 f2 100m
> $ ./qemu-kvm -nodefaults -nographic -qmp stdio -object iothread,id=io0 \
>   -drive driver=qcow2,id=drive1,file=f1,if=none \
>   -device virtio-scsi-pci,id=scsi0,iothread=io0 \
>   -device scsi-hd,id=image1,drive=drive1 \
>   -drive driver=qcow2,id=drive2,file=f2,if=none \
>   -device virtio-blk-pci,id=image2,drive=drive2,iothread=io0
>
> {'execute':'qmp_capabilities'}
>
> {'execute':'transaction','arguments':{'actions':[
> {'type':'blockdev-snapshot-sync','data':{'device':'drive1',
> 'snapshot-file':'sn1','mode':'absolute-paths','format':'qcow2'}},
> {'type':'blockdev-snapshot-sync','data':{'device':'drive2',
> 'snapshot-file':'/aa/sn1','mode':'absolute-paths','format':'qcow2'}}]}}
>
> which is an aio context bug somewhere on the error path of
> blockdev-snapshot-sync (the first one has to be rolled back because
> the second part of the transaction fails early on a nonexistent
> directory)

This is slightly different. The problem resides in
external_snapshot_abort():

static void external_snapshot_abort(BlkActionState *common)
{
    ExternalSnapshotState *state =
                             DO_UPCAST(ExternalSnapshotState, common, common);
    if (state->new_bs) {
        if (state->overlay_appended) {
            AioContext *aio_context;

            aio_context = bdrv_get_aio_context(state->old_bs);
            aio_context_acquire(aio_context);

            bdrv_ref(state->old_bs);   /* we can't let bdrv_set_backind_hd()
                                          close state->old_bs; we need it */
            bdrv_set_backing_hd(state->new_bs, NULL, &error_abort);
            bdrv_replace_node(state->new_bs, state->old_bs, &error_abort);
            bdrv_unref(state->old_bs); /* bdrv_replace_node() ref'ed old_bs */

            aio_context_release(aio_context);
        }
    }
}

bdrv_set_backing_hd() returns state->old_bs to the main AioContext,
while bdrv_replace_node() expects state->new_bs and state->old_bs to be
using the same AioContext.
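
To make the constraint concrete, here is a toy model of the situation. It is only a sketch: ToyContext, ToyNode and every toy_* function are made-up stand-ins, not QEMU types or APIs, and "held" is a plain counter rather than a real lock. toy_try_set_context() encodes the rule quoted above (old context held, new context not held), and toy_restore_context() shows the release/acquire hand-off needed to move a node back while the caller keeps holding the context it wants.

```c
#include <assert.h>

/* Toy stand-ins for illustration only; these are NOT the QEMU types. */
typedef struct ToyContext {
    const char *name;
    int held;                 /* how many times the caller holds it */
} ToyContext;

typedef struct ToyNode {
    ToyContext *ctx;          /* context the node currently lives in */
} ToyNode;

static void toy_acquire(ToyContext *c) { c->held++; }
static void toy_release(ToyContext *c) { assert(c->held > 0); c->held--; }

/* Models the rule: the old context must be held, the new one must not. */
static int toy_try_set_context(ToyNode *n, ToyContext *new_ctx)
{
    if (n->ctx->held == 0 || new_ctx->held != 0) {
        return -1;
    }
    n->ctx = new_ctx;
    return 0;
}

/*
 * The caller holds 'wanted'. If the node has drifted to another context,
 * swap which context is held, move the node, then swap back.
 */
static int toy_restore_context(ToyNode *n, ToyContext *wanted)
{
    ToyContext *cur = n->ctx;
    int ret = 0;

    if (cur != wanted) {
        toy_release(wanted);
        toy_acquire(cur);
        ret = toy_try_set_context(n, wanted);
        toy_release(cur);
        toy_acquire(wanted);
    }
    return ret;
}

/* Walks through the scenario; returns 0 if it behaves as described. */
int toy_demo(void)
{
    ToyContext main_ctx = { "main", 0 };
    ToyContext io_ctx = { "io0", 0 };
    ToyNode old_bs = { &io_ctx };

    toy_acquire(&io_ctx);
    old_bs.ctx = &main_ctx;   /* simulated side effect: node falls back to main */

    /* Moving it back directly fails: main is not held, io0 is. */
    if (toy_try_set_context(&old_bs, &io_ctx) != -1) {
        return 1;
    }
    if (toy_restore_context(&old_bs, &io_ctx) != 0) {
        return 2;
    }
    if (old_bs.ctx != &io_ctx || io_ctx.held != 1 || main_ctx.held != 0) {
        return 3;
    }
    toy_release(&io_ctx);
    return 0;
}
```

The point of the model is only the ordering: the direct move is rejected because the wrong context is held, while the hand-off leaves the caller holding the same context it started with.
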
I'm thinking of sending this as a separate patch:

diff --git a/blockdev.c b/blockdev.c
index e33abd7fd2..6c73ac4e32 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1731,6 +1731,8 @@ static void external_snapshot_abort(BlkActionState *common)
     if (state->new_bs) {
         if (state->overlay_appended) {
             AioContext *aio_context;
+            AioContext *tmp_context;
+            int ret;
 
             aio_context = bdrv_get_aio_context(state->old_bs);
             aio_context_acquire(aio_context);
@@ -1738,6 +1740,25 @@ static void external_snapshot_abort(BlkActionState *common)
             bdrv_ref(state->old_bs);   /* we can't let bdrv_set_backind_hd()
                                           close state->old_bs; we need it */
             bdrv_set_backing_hd(state->new_bs, NULL, &error_abort);
+
+            /*
+             * The call to bdrv_set_backing_hd() above returns state->old_bs to
+             * the main AioContext. As we're still going to be using it, return
+             * it to the AioContext it was before.
+             */
+            tmp_context = bdrv_get_aio_context(state->old_bs);
+            if (aio_context != tmp_context) {
+                aio_context_release(aio_context);
+                aio_context_acquire(tmp_context);
+
+                ret = bdrv_try_set_aio_context(state->old_bs,
+                                               aio_context, NULL);
+                assert(ret == 0);
+
+                aio_context_release(tmp_context);
+                aio_context_acquire(aio_context);
+            }
+
             bdrv_replace_node(state->new_bs, state->old_bs, &error_abort);
             bdrv_unref(state->old_bs); /* bdrv_replace_node() ref'ed old_bs */

What do you think?
Sergio.
