On Mon, Aug 14, 2023 at 11:54:26AM -0700, Steve Sistare wrote:
> Migration of a guest in the suspended runstate is broken.  The incoming
> migration code automatically tries to wake the guest, which is wrong;
> the guest should end migration in the same runstate it started.  Further,
> for a restored snapshot, the automatic wakeup fails.  The runstate is
> RUNNING, but the guest is not.  See the commit messages for the details.

Hi Steve,

I drafted two small patches to show what I meant, on top of this series.
Before applying these two, one needs to revert patch 1 in this series.

After applied, it should also pass all three new suspend tests.  We can
continue the discussion here based on the patches.

Thanks,

=======
>From 2e495b08be4c56d5d8a47ba1657bae6e316c6254 Mon Sep 17 00:00:00 2001
From: Peter Xu <pet...@redhat.com>
Date: Thu, 17 Aug 2023 12:32:00 -0400
Subject: [PATCH 1/2] cpus: Allow vm_prepare_start() to take vm state as input

It was by default always setting the state as RUNNING, but logically
SUSPENDED (acpi s1) should also fall into "vm running" case, where it's
only the vcpus that are stopped running (while everything else is).

Adding such a state parameter to be prepared when we want to prepare start
when not allowing vcpus to start yet (RUN_STATE_SUSPENDED).

Note: I found that not all vm state notifiers are ready for SUSPENDED when
having running=true set.  Here let's always pass in RUNNING irrelevant of
the state passed into vm_prepare_start(), and leave that for later to
figure out.  So far there should have (hopefully) no impact functional
wise.

For this specific patch, no functional change at all should be intended,
because all callers are still passing over RUNNING.

Signed-off-by: Peter Xu <pet...@redhat.com>
---
 include/sysemu/runstate.h |  3 ++-
 gdbstub/softmmu.c         |  2 +-
 softmmu/cpus.c            | 12 +++++++++---
 3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
index 7beb29c2e2..7d889ab7c7 100644
--- a/include/sysemu/runstate.h
+++ b/include/sysemu/runstate.h
@@ -39,8 +39,9 @@ void vm_start(void);
  * vm_prepare_start: Prepare for starting/resuming the VM
  *
  * @step_pending: whether any of the CPUs is about to be single-stepped by gdb
+ * @state: the vm state to setup
  */
-int vm_prepare_start(bool step_pending);
+int vm_prepare_start(bool step_pending, RunState state);
 int vm_stop(RunState state);
 int vm_stop_force_state(RunState state);
 int vm_shutdown(void);
diff --git a/gdbstub/softmmu.c b/gdbstub/softmmu.c
index f509b7285d..a43e8328c0 100644
--- a/gdbstub/softmmu.c
+++ b/gdbstub/softmmu.c
@@ -565,7 +565,7 @@ int gdb_continue_partial(char *newstates)
             }
         }
 
-        if (vm_prepare_start(step_requested)) {
+        if (vm_prepare_start(step_requested, RUN_STATE_RUNNING)) {
             return 0;
         }
 
diff --git a/softmmu/cpus.c b/softmmu/cpus.c
index fed20ffb5d..000fac79b7 100644
--- a/softmmu/cpus.c
+++ b/softmmu/cpus.c
@@ -681,7 +681,7 @@ int vm_stop(RunState state)
  * Returns -1 if the vCPUs are not to be restarted (e.g. if they are already
  * running or in case of an error condition), 0 otherwise.
  */
-int vm_prepare_start(bool step_pending)
+int vm_prepare_start(bool step_pending, RunState state)
 {
     RunState requested;
 
@@ -713,14 +713,20 @@ int vm_prepare_start(bool step_pending)
     qapi_event_send_resume();
 
     cpu_enable_ticks();
-    runstate_set(RUN_STATE_RUNNING);
+    runstate_set(state);
+    /*
+     * FIXME: ignore "state" being passed in for now, notify always with
+     * RUNNING. Because some of the vm state change handlers may not expect
+     * other states (e.g. SUSPENDED) passed in with running=true.  This can
+     * be modified after proper investigation over all vm state notifiers.
+     */
     vm_state_notify(1, RUN_STATE_RUNNING);
     return 0;
 }
 
 void vm_start(void)
 {
-    if (!vm_prepare_start(false)) {
+    if (!vm_prepare_start(false, RUN_STATE_RUNNING)) {
         resume_all_vcpus();
     }
 }
-- 
2.41.0

=======

>From 4a0936eafd03952d58ab380271559c4a2049b96e Mon Sep 17 00:00:00 2001
From: Peter Xu <pet...@redhat.com>
Date: Thu, 17 Aug 2023 12:44:29 -0400
Subject: [PATCH 2/2] fixup

Signed-off-by: Peter Xu <pet...@redhat.com>
---
 migration/migration.c        | 9 +++++----
 tests/qtest/migration-test.c | 9 +++++----
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index e6b8024b03..b004475af6 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -497,7 +497,7 @@ static void process_incoming_migration_bh(void *opaque)
         migration_incoming_disable_colo();
         vm_start();
     } else {
-        runstate_set(global_state_get_runstate());
+        vm_prepare_start(false, global_state_get_runstate());
     }
     /*
      * This must happen after any state changes since as soon as an external
@@ -1143,15 +1143,16 @@ void migrate_set_state(int *state, int old_state, int 
new_state)
 
 void migrate_set_runstate(void)
 {
-    if (!global_state_received() ||
-        global_state_get_runstate() == RUN_STATE_RUNNING) {
+    RunState state = global_state_get_runstate();
+
+    if (!global_state_received() || state == RUN_STATE_RUNNING) {
         if (autostart) {
             vm_start();
         } else {
             runstate_set(RUN_STATE_PAUSED);
         }
     } else {
-        runstate_set(global_state_get_runstate());
+        vm_prepare_start(false, state);
     }
 }
 
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 3c9e487754..5cc8b914df 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1614,15 +1614,16 @@ static void test_precopy_common(MigrateCommon *args)
             qtest_qmp_assert_success(to, "{ 'execute' : 'cont'}");
         }
 
+        /* Always get RESUME first after switchover */
+        if (!dst_state.resume_seen) {
+            qtest_qmp_eventwait(to, "RESUME");
+        }
+
         if (args->start.suspend_me) {
             /* wakeup succeeds only if guest is suspended */
             qtest_qmp_assert_success(to, "{'execute': 'system_wakeup'}");
         }
 
-        if (!dst_state.resume_seen) {
-            qtest_qmp_eventwait(to, "RESUME");
-        }
-
         wait_for_serial("dest_serial");
     }
 
-- 
2.41.0


-- 
Peter Xu


Reply via email to