24.11.2020 13:04, Tuguoyi wrote:
The following steps cause a QEMU assertion failure:
- pause the VM
- save a memory snapshot into a local file through fd migration
- doing the above again triggers the QEMU assertion failure

Hi!

Why do you need such a scenario? Isn't it more correct and safer to just 
error out on savevm if the disks are already inactive? Inactive disks are a 
sign that the VM may have been migrated to another host and may already be 
running there, so creating any kind of snapshot of this old state may be a 
bad idea. I mean, you are trying to allow a doubtful feature just to avoid an 
assertion. If you don't have strong reasons for the feature, it's better to 
turn the crash into a clean error-out.

As far as I remember, bdrv_inactivate_all() is the only source of inactivation, 
so it's really an "inactive" state of the whole VM, not just some disks being 
inactive. And your change makes the code look as if some disks may be inactive 
and some not, which would actually be unexpected behavior.
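
To illustrate the kind of check I mean -- a rough sketch only, not a tested
patch: the helper name, its placement and the error handling are mine, not
existing QEMU API. It only relies on BDRV_O_INACTIVE being a graph-wide
marker, as argued above:

    /*
     * Sketch: detect the "already inactivated" state up front and turn
     * it into a clean error, instead of letting bdrv_inactivate_recurse()
     * assert later.
     */
    static bool bdrv_graph_already_inactivated(void)
    {
        BlockDriverState *bs;
        BdrvNextIterator it;

        /* BDRV_O_INACTIVE on any node means the graph was inactivated,
         * e.g. by a previously completed migration. */
        for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
            if (bs->open_flags & BDRV_O_INACTIVE) {
                return true;
            }
        }
        return false;
    }

The savevm path could then refuse early, before the snapshot migration is
even started (errp here stands for whatever Error ** the caller has):

    if (bdrv_graph_already_inactivated()) {
        error_setg(errp, "Block devices are already inactive; the VM state "
                   "may already belong to another host");
        return;
    }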


The backtrace looks like:
#0  0x00007fbf958c5c37 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fbf958c9028 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fbf958bebf6 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007fbf958beca2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#4  0x000055ca8decd39d in bdrv_inactivate_recurse (bs=0x55ca90c80400) at 
/build/qemu-5.0/block.c:5724
#5  0x000055ca8dece967 in bdrv_inactivate_all () at 
/build//qemu-5.0/block.c:5792
#6  0x000055ca8de5539d in qemu_savevm_state_complete_precopy_non_iterable 
(inactivate_disks=true, in_postcopy=false, f=0x55ca907044b0)
     at /build/qemu-5.0/migration/savevm.c:1401
#7  qemu_savevm_state_complete_precopy (f=0x55ca907044b0, 
iterable_only=iterable_only@entry=false, 
inactivate_disks=inactivate_disks@entry=true)
     at /build/qemu-5.0/migration/savevm.c:1453
#8  0x000055ca8de4f581 in migration_completion (s=0x55ca8f64d9f0) at 
/build/qemu-5.0/migration/migration.c:2941
#9  migration_iteration_run (s=0x55ca8f64d9f0) at 
/build/qemu-5.0/migration/migration.c:3295
#10 migration_thread (opaque=opaque@entry=0x55ca8f64d9f0) at 
/build/qemu-5.0/migration/migration.c:3459
#11 0x000055ca8dfc6716 in qemu_thread_start (args=<optimized out>) at 
/build/qemu-5.0/util/qemu-thread-posix.c:519
#12 0x00007fbf95c5f184 in start_thread () from 
/lib/x86_64-linux-gnu/libpthread.so.0
#13 0x00007fbf9598cbed in clone () from /lib/x86_64-linux-gnu/libc.so.6

When the first migration completes, bdrv_inactivate_recurse() sets the
BDRV_O_INACTIVE flag in bs->open_flags. During the second migration,
bdrv_inactivate_recurse() asserts that BDRV_O_INACTIVE is not already set,
so the second run trips the assertion and QEMU aborts.
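
(For reference, the failing check is, roughly, this precondition in
bdrv_inactivate_recurse() -- paraphrased, not an exact quote of block.c:)

    /* The node must not be inactive yet when the flag gets set, so a
     * second bdrv_inactivate_all() run trips the assertion. */
    assert(!(bs->open_flags & BDRV_O_INACTIVE));
    bs->open_flags |= BDRV_O_INACTIVE;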

This patch makes bdrv_inactivate_all() skip a bs that is already inactive
instead of inactivating it again.

Signed-off-by: Tuguoyi <tu.gu...@h3c.com>
---
  block.c | 7 ++++++-
  1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index f1cedac..02361e1 100644
--- a/block.c
+++ b/block.c
@@ -5938,6 +5938,11 @@ static int bdrv_inactivate_recurse(BlockDriverState *bs)
      return 0;
  }
+static bool bdrv_is_inactive(BlockDriverState *bs)
+{
+    return bs->open_flags & BDRV_O_INACTIVE;
+}
+
  int bdrv_inactivate_all(void)
  {
      BlockDriverState *bs = NULL;
@@ -5958,7 +5963,7 @@ int bdrv_inactivate_all(void)
          /* Nodes with BDS parents are covered by recursion from the last
           * parent that gets inactivated. Don't inactivate them a second
           * time if that has already happened. */
-        if (bdrv_has_bds_parent(bs, false)) {
+        if (bdrv_has_bds_parent(bs, false) || bdrv_is_inactive(bs)) {
              continue;
          }
          ret = bdrv_inactivate_recurse(bs);




--
Best regards,
Vladimir
