Re: [PATCH V1 2/3] migration: fix suspended runstate
On 8/17/2023 2:19 PM, Peter Xu wrote:
> On Wed, Aug 16, 2023 at 01:48:13PM -0400, Steven Sistare wrote:
>> On 8/14/2023 3:37 PM, Peter Xu wrote:
>>> On Mon, Aug 14, 2023 at 02:53:56PM -0400, Steven Sistare wrote:
>>>>> Can we just call vm_state_notify() earlier?
>>>>
>>>> We cannot.  The guest is not running yet, and will not be until later.
>>>> We cannot call notifiers that perform actions that complete, or react
>>>> to, the guest entering a running state.
>>>
>>> I tried to look at a few examples of the notifiees, and most of those I
>>> read do not react to "vcpu running" but to "vm running" (in which case I
>>> think "suspended" mode falls into the "vm running" case); most of them
>>> don't care about the RunState parameter passed in, only the bool
>>> "running".
>>>
>>> In reality, when running=true, it must be RUNNING so far.
>>>
>>> In that case does it mean we should notify right after the switchover,
>>> since after migration the vm is indeed running, only if the vcpus are
>>> not in suspend?
>>
>> I cannot parse your question, but maybe this answers it.
>> If the outgoing VM is running and not suspended, then the incoming side
>> tests for autostart==true and calls vm_start, which calls the notifiers,
>> right after the switchover.
>
> I meant IMHO SUSPENDED should be seen as a "vm running" case, just like
> RUNNING.  Then we should invoke vm_prepare_start(); it just needs some
> touch ups.
>
>>> One example (of a possible issue) is vfio_vmstate_change(), where IIUC
>>> if we try to suspend a VM it should stay in VFIO_DEVICE_STATE_RUNNING
>>> for that device; this kind of proves to me that SUSPEND is actually one
>>> of the running=true states.
>>>
>>> If we postpone all notifiers here, even after we switch over to the
>>> dest qemu, until the next suspend wakeup, I think it means these
>>> devices will not be in VFIO_DEVICE_STATE_RUNNING after switchover but
>>> perhaps in VFIO_DEVICE_STATE_STOP.
>>
>> or VFIO_DEVICE_STATE_RESUMING, which is set in vfio_load_setup.
>> AFAIK it is OK to remain in that state until wakeup is called later.
>
> So let me provide another clue of why I think we should call
> vm_prepare_start()..
>
> Firstly, I think the RESUME event should always be there right after we
> switch over, no matter whether suspended or not.  I just noticed that
> your test case would work because you put "wakeup" before RESUME.  I'm
> not sure whether that's correct.  I'd bet people could rely on that
> RESUME to identify the switchover.

Yes, possibly.

> More importantly, I'm wondering whether the RTC should still be running
> during suspended mode?  Sorry again if my knowledge there is limited, so
> correct me otherwise - but my understanding is that during suspend mode
> (s1 or s3, frankly I can't tell which one this belongs to..), the rtc
> should still be running along with the system clock.  It means we
> _should_ at least call cpu_enable_ticks() to enable the rtc:

Agreed.  The comment above cpu_get_ticks says:

 * return the time elapsed in VM between vm_start and vm_stop

Suspending a guest does not call vm_stop, so ticks keep running.
I also verified that experimentally.

> /*
>  * enable cpu_get_ticks()
>  * Caller must hold BQL which serves as mutex for vm_clock_seqlock.
>  */
> void cpu_enable_ticks(void)
>
> I think that'll enable cpu_get_tsc() and make it start to work right.
>
>>> Ideally I think we should here call vm_state_notify() with running=true
>>> and state=SUSPEND, but since I do see some hooks are not well prepared
>>> for SUSPEND over running=true, I'd think we should on the safe side
>>> call vm_state_notify(running=true, state=RUNNING) even for SUSPEND at
>>> the switchover phase.  With that IIUC it'll naturally work (e.g. when
>>> we wake up again later we just need to call no notifiers).
>>
>> Notifiers are just one piece; all the code in vm_prepare_start must be
>> called.
>> Is it correct to call all of that long before we actually resume the
>> CPUs in wakeup?  I don't know, but what is the point?
>
> The point is not only cleanliness (again, I really, really don't like
> that new global.. sorry), but also that now I think we should make the
> vm running.

I do believe it is safe to call vm_prepare_start immediately, since the
vcpus are never running when it is called.

>> The wakeup code still needs modification to conditionally resume the
>> vcpus.  The scheme would be roughly:
>>
>> loadvm_postcopy_handle_run_bh()
>>     runstate = global_state_get_runstate();
>>     if (runstate == RUN_STATE_RUNNING) {
>>         vm_start()
>>     } else if (runstate == RUN_STATE_SUSPENDED) {
>>         vm_prepare_start();      // the start of vm_start()
>>     }
>>
>> qemu_system_wakeup_request()
>>     if (some condition)
>>         resume_all_vcpus();      // the remainder of vm_start()
>>     else
>>         runstate_set(RUN_STATE_RUNNING)
>
> No it doesn't.  wakeup_reason is set there, and the main loop does the
> resuming.  See:
>
>     if (qemu_wak
Re: [PATCH V1 2/3] migration: fix suspended runstate
On Wed, Aug 16, 2023 at 01:48:13PM -0400, Steven Sistare wrote:
> On 8/14/2023 3:37 PM, Peter Xu wrote:
> > On Mon, Aug 14, 2023 at 02:53:56PM -0400, Steven Sistare wrote:
> > >>> Can we just call vm_state_notify() earlier?
> > >>
> > >> We cannot.  The guest is not running yet, and will not be until
> > >> later.  We cannot call notifiers that perform actions that complete,
> > >> or react to, the guest entering a running state.
> >
> > I tried to look at a few examples of the notifiees, and most of those I
> > read do not react to "vcpu running" but to "vm running" (in which case
> > I think "suspended" mode falls into the "vm running" case); most of
> > them don't care about the RunState parameter passed in, only the bool
> > "running".
> >
> > In reality, when running=true, it must be RUNNING so far.
> >
> > In that case does it mean we should notify right after the switchover,
> > since after migration the vm is indeed running, only if the vcpus are
> > not in suspend?
>
> I cannot parse your question, but maybe this answers it.
> If the outgoing VM is running and not suspended, then the incoming side
> tests for autostart==true and calls vm_start, which calls the notifiers,
> right after the switchover.

I meant IMHO SUSPENDED should be seen as a "vm running" case, just like
RUNNING.  Then we should invoke vm_prepare_start(); it just needs some
touch ups.

> > One example (of a possible issue) is vfio_vmstate_change(), where IIUC
> > if we try to suspend a VM it should stay in VFIO_DEVICE_STATE_RUNNING
> > for that device; this kind of proves to me that SUSPEND is actually one
> > of the running=true states.
> >
> > If we postpone all notifiers here, even after we switch over to the
> > dest qemu, until the next suspend wakeup, I think it means these
> > devices will not be in VFIO_DEVICE_STATE_RUNNING after switchover but
> > perhaps in VFIO_DEVICE_STATE_STOP.
>
> or VFIO_DEVICE_STATE_RESUMING, which is set in vfio_load_setup.
> AFAIK it is OK to remain in that state until wakeup is called later.

So let me provide another clue of why I think we should call
vm_prepare_start()..

Firstly, I think the RESUME event should always be there right after we
switch over, no matter whether suspended or not.  I just noticed that your
test case would work because you put "wakeup" before RESUME.  I'm not sure
whether that's correct.  I'd bet people could rely on that RESUME to
identify the switchover.

More importantly, I'm wondering whether the RTC should still be running
during suspended mode?  Sorry again if my knowledge there is limited, so
correct me otherwise - but my understanding is that during suspend mode
(s1 or s3, frankly I can't tell which one this belongs to..), the rtc
should still be running along with the system clock.  It means we
_should_ at least call cpu_enable_ticks() to enable the rtc:

/*
 * enable cpu_get_ticks()
 * Caller must hold BQL which serves as mutex for vm_clock_seqlock.
 */
void cpu_enable_ticks(void)

I think that'll enable cpu_get_tsc() and make it start to work right.

> > Ideally I think we should here call vm_state_notify() with running=true
> > and state=SUSPEND, but since I do see some hooks are not well prepared
> > for SUSPEND over running=true, I'd think we should on the safe side
> > call vm_state_notify(running=true, state=RUNNING) even for SUSPEND at
> > the switchover phase.  With that IIUC it'll naturally work (e.g. when
> > we wake up again later we just need to call no notifiers).
>
> Notifiers are just one piece; all the code in vm_prepare_start must be
> called.
> Is it correct to call all of that long before we actually resume the
> CPUs in wakeup?  I don't know, but what is the point?

The point is not only cleanliness (again, I really, really don't like that
new global.. sorry), but also that now I think we should make the vm
running.

> The wakeup code still needs modification to conditionally resume the
> vcpus.  The scheme would be roughly:
>
> loadvm_postcopy_handle_run_bh()
>     runstate = global_state_get_runstate();
>     if (runstate == RUN_STATE_RUNNING) {
>         vm_start()
>     } else if (runstate == RUN_STATE_SUSPENDED) {
>         vm_prepare_start();      // the start of vm_start()
>     }
>
> qemu_system_wakeup_request()
>     if (some condition)
>         resume_all_vcpus();      // the remainder of vm_start()
>     else
>         runstate_set(RUN_STATE_RUNNING)

No it doesn't.  wakeup_reason is set there, and the main loop does the
resuming.  See:

    if (qemu_wakeup_requested()) {
        pause_all_vcpus();
        qemu_system_wakeup();
        notifier_list_notify(&wakeup_notifiers, &wakeup_reason);
        wakeup_reason = QEMU_WAKEUP_REASON_NONE;
        resume_all_vcpus();
        qapi_event_send_wakeup();
    }

> How is that better than my patches
>   [PATCH V3 01/10] vl: start on wakeup request
>   [PATCH V3 02/10] migration: preserve suspended runstate
>
> loadvm_postcopy_handle_run_bh()
>     runstate = global_s
Re: [PATCH V1 2/3] migration: fix suspended runstate
On 8/14/2023 3:37 PM, Peter Xu wrote:
> On Mon, Aug 14, 2023 at 02:53:56PM -0400, Steven Sistare wrote:
>>> Can we just call vm_state_notify() earlier?
>>
>> We cannot.  The guest is not running yet, and will not be until later.
>> We cannot call notifiers that perform actions that complete, or react
>> to, the guest entering a running state.
>
> I tried to look at a few examples of the notifiees, and most of those I
> read do not react to "vcpu running" but to "vm running" (in which case I
> think "suspended" mode falls into the "vm running" case); most of them
> don't care about the RunState parameter passed in, only the bool
> "running".
>
> In reality, when running=true, it must be RUNNING so far.
>
> In that case does it mean we should notify right after the switchover,
> since after migration the vm is indeed running, only if the vcpus are
> not in suspend?

I cannot parse your question, but maybe this answers it.
If the outgoing VM is running and not suspended, then the incoming side
tests for autostart==true and calls vm_start, which calls the notifiers,
right after the switchover.

> One example (of a possible issue) is vfio_vmstate_change(), where IIUC
> if we try to suspend a VM it should stay in VFIO_DEVICE_STATE_RUNNING
> for that device; this kind of proves to me that SUSPEND is actually one
> of the running=true states.
>
> If we postpone all notifiers here, even after we switch over to the dest
> qemu, until the next suspend wakeup, I think it means these devices will
> not be in VFIO_DEVICE_STATE_RUNNING after switchover but perhaps in
> VFIO_DEVICE_STATE_STOP.

or VFIO_DEVICE_STATE_RESUMING, which is set in vfio_load_setup.
AFAIK it is OK to remain in that state until wakeup is called later.

> Ideally I think we should here call vm_state_notify() with running=true
> and state=SUSPEND, but since I do see some hooks are not well prepared
> for SUSPEND over running=true, I'd think we should on the safe side call
> vm_state_notify(running=true, state=RUNNING) even for SUSPEND at the
> switchover phase.  With that IIUC it'll naturally work (e.g. when we
> wake up again later we just need to call no notifiers).

Notifiers are just one piece; all the code in vm_prepare_start must be
called.  Is it correct to call all of that long before we actually resume
the CPUs in wakeup?  I don't know, but what is the point?  The wakeup
code still needs modification to conditionally resume the vcpus.  The
scheme would be roughly:

loadvm_postcopy_handle_run_bh()
    runstate = global_state_get_runstate();
    if (runstate == RUN_STATE_RUNNING) {
        vm_start()
    } else if (runstate == RUN_STATE_SUSPENDED) {
        vm_prepare_start();          // the start of vm_start()
    }

qemu_system_wakeup_request()
    if (some condition)
        resume_all_vcpus();          // the remainder of vm_start()
    else
        runstate_set(RUN_STATE_RUNNING)

How is that better than my patches
  [PATCH V3 01/10] vl: start on wakeup request
  [PATCH V3 02/10] migration: preserve suspended runstate

loadvm_postcopy_handle_run_bh()
    runstate = global_state_get_runstate();
    if (runstate == RUN_STATE_RUNNING)
        vm_start()
    else
        runstate_set(runstate);      // eg RUN_STATE_SUSPENDED

qemu_system_wakeup_request()
    if (!vm_started)
        vm_start();
    else
        runstate_set(RUN_STATE_RUNNING);

Recall this thread started with your comment "It then can avoid touching
the system wakeup code which seems cleaner".  We still need to touch the
wakeup code.

- Steve
Re: [PATCH V1 2/3] migration: fix suspended runstate
On Mon, Aug 14, 2023 at 02:53:56PM -0400, Steven Sistare wrote:
> > Can we just call vm_state_notify() earlier?
>
> We cannot.  The guest is not running yet, and will not be until later.
> We cannot call notifiers that perform actions that complete, or react
> to, the guest entering a running state.

I tried to look at a few examples of the notifiees, and most of those I
read do not react to "vcpu running" but to "vm running" (in which case I
think "suspended" mode falls into the "vm running" case); most of them
don't care about the RunState parameter passed in, only the bool
"running".

In reality, when running=true, it must be RUNNING so far.

In that case does it mean we should notify right after the switchover,
since after migration the vm is indeed running, only if the vcpus are not
in suspend?

One example (of a possible issue) is vfio_vmstate_change(), where IIUC if
we try to suspend a VM it should stay in VFIO_DEVICE_STATE_RUNNING for
that device; this kind of proves to me that SUSPEND is actually one of
the running=true states.

If we postpone all notifiers here, even after we switch over to the dest
qemu, until the next suspend wakeup, I think it means these devices will
not be in VFIO_DEVICE_STATE_RUNNING after switchover but perhaps in
VFIO_DEVICE_STATE_STOP.

Ideally I think we should here call vm_state_notify() with running=true
and state=SUSPEND, but since I do see some hooks are not well prepared
for SUSPEND over running=true, I'd think we should on the safe side call
vm_state_notify(running=true, state=RUNNING) even for SUSPEND at the
switchover phase.  With that IIUC it'll naturally work (e.g. when we wake
up again later we just need to call no notifiers).

-- Peter Xu
Re: [PATCH V1 2/3] migration: fix suspended runstate
On 7/26/2023 4:18 PM, Peter Xu wrote:
> On Fri, Jun 30, 2023 at 09:50:41AM -0400, Steven Sistare wrote:
>> On 6/26/2023 2:27 PM, Peter Xu wrote:
>>> On Fri, Jun 23, 2023 at 02:25:05PM -0400, Steven Sistare wrote:
On 6/21/2023 4:28 PM, Peter Xu wrote:
> On Wed, Jun 21, 2023 at 03:15:42PM -0400, Steven Sistare wrote:
>> On 6/20/2023 5:46 PM, Peter Xu wrote:
>>> On Thu, Jun 15, 2023 at 01:26:39PM -0700, Steve Sistare wrote:

Migration of a guest in the suspended state is broken.  The incoming
migration code automatically tries to wake the guest, which IMO is
wrong -- the guest should end migration in the same state it started.
Further, the wakeup is done by calling qemu_system_wakeup_request(),
which bypasses vm_start().  The guest appears to be in the running
state, but it is not.

To fix, leave the guest in the suspended state, but call
qemu_system_start_on_wakeup_request() so the guest is properly resumed
later, when the client sends a system_wakeup command.

Signed-off-by: Steve Sistare
---
 migration/migration.c | 11 ++++-------
 softmmu/runstate.c    |  1 +
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 17b4b47..851fe6d 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void *opaque)
         vm_start();
     } else {
         runstate_set(global_state_get_runstate());
+        if (runstate_check(RUN_STATE_SUSPENDED)) {
+            /* Force vm_start to be called later. */
+            qemu_system_start_on_wakeup_request();
+        }

>>>
>>> Is this really needed, along with patch 1?
>>>
>>> I have a very limited knowledge on suspension, so I'm prone to making
>>> mistakes..
>>>
>>> But from what I read this, qemu_system_wakeup_request() (the existing
>>> one, not after patch 1 is applied) will set up wakeup_reason and kick
>>> the main thread using qemu_notify_event().  Then IIUC e.g. the vcpu
>>> wakeups will be done in the main thread later on, after
>>> qemu_wakeup_requested() returns true.
>>
>> Correct, here:
>>
>>     if (qemu_wakeup_requested()) {
>>         pause_all_vcpus();
>>         qemu_system_wakeup();
>>         notifier_list_notify(&wakeup_notifiers, &wakeup_reason);
>>         wakeup_reason = QEMU_WAKEUP_REASON_NONE;
>>         resume_all_vcpus();
>>         qapi_event_send_wakeup();
>>     }
>>
>> However, that is not sufficient, because vm_start() was never called on
>> the incoming side.  vm_start calls the vm state notifiers for
>> RUN_STATE_RUNNING, among other things.
>>
>> Without my fixes, it "works" because the outgoing migration
>> automatically wakes a suspended guest, which sets the state to running,
>> which is saved in global state:
>>
>>     void migration_completion(MigrationState *s)
>>         qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
>>         global_state_store()
>>
>> Then the incoming migration calls vm_start here:
>>
>>     migration/migration.c
>>     if (!global_state_received() ||
>>         global_state_get_runstate() == RUN_STATE_RUNNING) {
>>         if (autostart) {
>>             vm_start();
>>
>> vm_start must be called for correctness.
>
> I see.  Though I had a feeling that this is still not the right way to
> do it, at least not as clean.
>
> One question is, would the above work for postcopy when the VM is
> suspended during the switchover?

Good catch, that is broken.  I added qemu_system_start_on_wakeup_request
to loadvm_postcopy_handle_run_bh and now it works.

    if (global_state_get_runstate() == RUN_STATE_RUNNING) {
        if (autostart) {
            vm_start();
        } else {
            runstate_set(RUN_STATE_PAUSED);
        }
    } else {
        runstate_set(global_state_get_runstate());
        if (runstate_check(RUN_STATE_SUSPENDED)) {
            qemu_system_start_on_wakeup_request();
        }
    }

> I think I see your point that vm_start() (mostly vm_prepare_start())
> contains a bunch of operations that maybe we must have before starting
> the VM, but then.. should we just make that vm_start() unconditional
> when loading the VM completes?  I just don't see anything that won't
> need it (besides -S), even COLO.
>
> So I'm wondering about something like this:
>
> ===8<===
> --- a/migration/migration.c
> +++ b/migration/migration.c
Re: [PATCH V1 2/3] migration: fix suspended runstate
On Fri, Jun 30, 2023 at 09:50:41AM -0400, Steven Sistare wrote:
> On 6/26/2023 2:27 PM, Peter Xu wrote:
> > On Fri, Jun 23, 2023 at 02:25:05PM -0400, Steven Sistare wrote:
> >> On 6/21/2023 4:28 PM, Peter Xu wrote:
> >>> On Wed, Jun 21, 2023 at 03:15:42PM -0400, Steven Sistare wrote:
> On 6/20/2023 5:46 PM, Peter Xu wrote:
> > On Thu, Jun 15, 2023 at 01:26:39PM -0700, Steve Sistare wrote:
> >> Migration of a guest in the suspended state is broken.  The incoming
> >> migration code automatically tries to wake the guest, which IMO is
> >> wrong -- the guest should end migration in the same state it started.
> >> Further, the wakeup is done by calling qemu_system_wakeup_request(),
> >> which bypasses vm_start().  The guest appears to be in the running
> >> state, but it is not.
> >>
> >> To fix, leave the guest in the suspended state, but call
> >> qemu_system_start_on_wakeup_request() so the guest is properly
> >> resumed later, when the client sends a system_wakeup command.
> >>
> >> Signed-off-by: Steve Sistare
> >> ---
> >>  migration/migration.c | 11 ++++-------
> >>  softmmu/runstate.c    |  1 +
> >>  2 files changed, 5 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/migration/migration.c b/migration/migration.c
> >> index 17b4b47..851fe6d 100644
> >> --- a/migration/migration.c
> >> +++ b/migration/migration.c
> >> @@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void *opaque)
> >>          vm_start();
> >>      } else {
> >>          runstate_set(global_state_get_runstate());
> >> +        if (runstate_check(RUN_STATE_SUSPENDED)) {
> >> +            /* Force vm_start to be called later. */
> >> +            qemu_system_start_on_wakeup_request();
> >> +        }
> >
> > Is this really needed, along with patch 1?
> >
> > I have a very limited knowledge on suspension, so I'm prone to making
> > mistakes..
> >
> > But from what I read this, qemu_system_wakeup_request() (the existing
> > one, not after patch 1 is applied) will set up wakeup_reason and kick
> > the main thread using qemu_notify_event().  Then IIUC e.g. the vcpu
> > wakeups will be done in the main thread later on, after
> > qemu_wakeup_requested() returns true.
>
> Correct, here:
>
>     if (qemu_wakeup_requested()) {
>         pause_all_vcpus();
>         qemu_system_wakeup();
>         notifier_list_notify(&wakeup_notifiers, &wakeup_reason);
>         wakeup_reason = QEMU_WAKEUP_REASON_NONE;
>         resume_all_vcpus();
>         qapi_event_send_wakeup();
>     }
>
> However, that is not sufficient, because vm_start() was never called on
> the incoming side.  vm_start calls the vm state notifiers for
> RUN_STATE_RUNNING, among other things.
>
> Without my fixes, it "works" because the outgoing migration
> automatically wakes a suspended guest, which sets the state to running,
> which is saved in global state:
>
>     void migration_completion(MigrationState *s)
>         qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
>         global_state_store()
>
> Then the incoming migration calls vm_start here:
>
>     migration/migration.c
>     if (!global_state_received() ||
>         global_state_get_runstate() == RUN_STATE_RUNNING) {
>         if (autostart) {
>             vm_start();
>
> vm_start must be called for correctness.
> >>>
> >>> I see.  Though I had a feeling that this is still not the right way
> >>> to do it, at least not as clean.
> >>>
> >>> One question is, would the above work for postcopy when the VM is
> >>> suspended during the switchover?
> >>
> >> Good catch, that is broken.
> >> I added qemu_system_start_on_wakeup_request to
> >> loadvm_postcopy_handle_run_bh and now it works.
> >>
> >>     if (global_state_get_runstate() == RUN_STATE_RUNNING) {
> >>         if (autostart) {
> >>             vm_start();
> >>         } else {
> >>             runstate_set(RUN_STATE_PAUSED);
> >>         }
> >>     } else {
> >>         runstate_set(global_state_get_runstate());
> >>         if (runstate_check(RUN_STATE_SUSPENDED)) {
> >>             qemu_system_start_on_wakeup_request();
> >>         }
> >>     }
> >>
> >>> I think I see your point that vm_start() (mostly vm_prepare_start())
> >>> contains a bunch of operations that maybe we must have before
> >>> starting the VM, but then.. should we just make that vm_start()
> >>> unconditional when loading the VM completes?  I just don't see
> >>> anything that won't need it (besides -S), even COLO.
> >>>
> >>> So I'm wondering about something like this:
> >>>
> >>> ===8<===
> >>> --- a/migration/migration.c
> >>> +++ b/migration/migration.c
> >>> @@ -481,19 +481,28 @@ static void proc
Re: [PATCH V1 2/3] migration: fix suspended runstate
On 6/26/2023 2:27 PM, Peter Xu wrote:
> On Fri, Jun 23, 2023 at 02:25:05PM -0400, Steven Sistare wrote:
>> On 6/21/2023 4:28 PM, Peter Xu wrote:
>>> On Wed, Jun 21, 2023 at 03:15:42PM -0400, Steven Sistare wrote:
On 6/20/2023 5:46 PM, Peter Xu wrote:
> On Thu, Jun 15, 2023 at 01:26:39PM -0700, Steve Sistare wrote:
>> Migration of a guest in the suspended state is broken.  The incoming
>> migration code automatically tries to wake the guest, which IMO is
>> wrong -- the guest should end migration in the same state it started.
>> Further, the wakeup is done by calling qemu_system_wakeup_request(),
>> which bypasses vm_start().  The guest appears to be in the running
>> state, but it is not.
>>
>> To fix, leave the guest in the suspended state, but call
>> qemu_system_start_on_wakeup_request() so the guest is properly resumed
>> later, when the client sends a system_wakeup command.
>>
>> Signed-off-by: Steve Sistare
>> ---
>>  migration/migration.c | 11 ++++-------
>>  softmmu/runstate.c    |  1 +
>>  2 files changed, 5 insertions(+), 7 deletions(-)
>>
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 17b4b47..851fe6d 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void *opaque)
>>          vm_start();
>>      } else {
>>          runstate_set(global_state_get_runstate());
>> +        if (runstate_check(RUN_STATE_SUSPENDED)) {
>> +            /* Force vm_start to be called later. */
>> +            qemu_system_start_on_wakeup_request();
>> +        }
>
> Is this really needed, along with patch 1?
>
> I have a very limited knowledge on suspension, so I'm prone to making
> mistakes..
>
> But from what I read this, qemu_system_wakeup_request() (the existing
> one, not after patch 1 is applied) will set up wakeup_reason and kick
> the main thread using qemu_notify_event().  Then IIUC e.g. the vcpu
> wakeups will be done in the main thread later on, after
> qemu_wakeup_requested() returns true.

Correct, here:

    if (qemu_wakeup_requested()) {
        pause_all_vcpus();
        qemu_system_wakeup();
        notifier_list_notify(&wakeup_notifiers, &wakeup_reason);
        wakeup_reason = QEMU_WAKEUP_REASON_NONE;
        resume_all_vcpus();
        qapi_event_send_wakeup();
    }

However, that is not sufficient, because vm_start() was never called on
the incoming side.  vm_start calls the vm state notifiers for
RUN_STATE_RUNNING, among other things.

Without my fixes, it "works" because the outgoing migration automatically
wakes a suspended guest, which sets the state to running, which is saved
in global state:

    void migration_completion(MigrationState *s)
        qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
        global_state_store()

Then the incoming migration calls vm_start here:

    migration/migration.c
    if (!global_state_received() ||
        global_state_get_runstate() == RUN_STATE_RUNNING) {
        if (autostart) {
            vm_start();

vm_start must be called for correctness.
>>>
>>> I see.  Though I had a feeling that this is still not the right way to
>>> do it, at least not as clean.
>>>
>>> One question is, would the above work for postcopy when the VM is
>>> suspended during the switchover?
>>
>> Good catch, that is broken.
>> I added qemu_system_start_on_wakeup_request to
>> loadvm_postcopy_handle_run_bh and now it works.
>>
>>     if (global_state_get_runstate() == RUN_STATE_RUNNING) {
>>         if (autostart) {
>>             vm_start();
>>         } else {
>>             runstate_set(RUN_STATE_PAUSED);
>>         }
>>     } else {
>>         runstate_set(global_state_get_runstate());
>>         if (runstate_check(RUN_STATE_SUSPENDED)) {
>>             qemu_system_start_on_wakeup_request();
>>         }
>>     }
>>
>>> I think I see your point that vm_start() (mostly vm_prepare_start())
>>> contains a bunch of operations that maybe we must have before starting
>>> the VM, but then.. should we just make that vm_start() unconditional
>>> when loading the VM completes?  I just don't see anything that won't
>>> need it (besides -S), even COLO.
>>>
>>> So I'm wondering about something like this:
>>>
>>> ===8<===
>>> --- a/migration/migration.c
>>> +++ b/migration/migration.c
>>> @@ -481,19 +481,28 @@ static void process_incoming_migration_bh(void *opaque)
>>>
>>>      dirty_bitmap_mig_before_vm_start();
>>>
>>> -    if (!global_state_received() ||
>>> -        global_state_get_runstate() == RUN_STATE_RUNNING) {
>>> -        if (autostart) {
>>> -            vm_start();
>>> -        } else {
>>> -            runstate_se
Re: [PATCH V1 2/3] migration: fix suspended runstate
On Mon, Jun 26, 2023 at 02:27:32PM -0400, Peter Xu wrote:
> On Fri, Jun 23, 2023 at 02:25:05PM -0400, Steven Sistare wrote:
> > On 6/21/2023 4:28 PM, Peter Xu wrote:
> > > On Wed, Jun 21, 2023 at 03:15:42PM -0400, Steven Sistare wrote:
> > >> On 6/20/2023 5:46 PM, Peter Xu wrote:
> > >>> On Thu, Jun 15, 2023 at 01:26:39PM -0700, Steve Sistare wrote:
> > Migration of a guest in the suspended state is broken.  The incoming
> > migration code automatically tries to wake the guest, which IMO is
> > wrong -- the guest should end migration in the same state it started.
> > Further, the wakeup is done by calling qemu_system_wakeup_request(),
> > which bypasses vm_start().  The guest appears to be in the running
> > state, but it is not.
> >
> > To fix, leave the guest in the suspended state, but call
> > qemu_system_start_on_wakeup_request() so the guest is properly resumed
> > later, when the client sends a system_wakeup command.
> >
> > Signed-off-by: Steve Sistare
> > ---
> >  migration/migration.c | 11 ++++-------
> >  softmmu/runstate.c    |  1 +
> >  2 files changed, 5 insertions(+), 7 deletions(-)
> >
> > diff --git a/migration/migration.c b/migration/migration.c
> > index 17b4b47..851fe6d 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void *opaque)
> >          vm_start();
> >      } else {
> >          runstate_set(global_state_get_runstate());
> > +        if (runstate_check(RUN_STATE_SUSPENDED)) {
> > +            /* Force vm_start to be called later. */
> > +            qemu_system_start_on_wakeup_request();
> > +        }
> > >>>
> > >>> Is this really needed, along with patch 1?
> > >>>
> > >>> I have a very limited knowledge on suspension, so I'm prone to
> > >>> making mistakes..
> > >>>
> > >>> But from what I read this, qemu_system_wakeup_request() (the
> > >>> existing one, not after patch 1 is applied) will set up
> > >>> wakeup_reason and kick the main thread using qemu_notify_event().
> > >>> Then IIUC e.g. the vcpu wakeups will be done in the main thread
> > >>> later on, after qemu_wakeup_requested() returns true.
> > >>
> > >> Correct, here:
> > >>
> > >>     if (qemu_wakeup_requested()) {
> > >>         pause_all_vcpus();
> > >>         qemu_system_wakeup();
> > >>         notifier_list_notify(&wakeup_notifiers, &wakeup_reason);
> > >>         wakeup_reason = QEMU_WAKEUP_REASON_NONE;
> > >>         resume_all_vcpus();
> > >>         qapi_event_send_wakeup();
> > >>     }
> > >>
> > >> However, that is not sufficient, because vm_start() was never
> > >> called on the incoming side.  vm_start calls the vm state notifiers
> > >> for RUN_STATE_RUNNING, among other things.
> > >>
> > >> Without my fixes, it "works" because the outgoing migration
> > >> automatically wakes a suspended guest, which sets the state to
> > >> running, which is saved in global state:
> > >>
> > >>     void migration_completion(MigrationState *s)
> > >>         qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
> > >>         global_state_store()
> > >>
> > >> Then the incoming migration calls vm_start here:
> > >>
> > >>     migration/migration.c
> > >>     if (!global_state_received() ||
> > >>         global_state_get_runstate() == RUN_STATE_RUNNING) {
> > >>         if (autostart) {
> > >>             vm_start();
> > >>
> > >> vm_start must be called for correctness.
> > >
> > > I see.  Though I had a feeling that this is still not the right way
> > > to do it, at least not as clean.
> > >
> > > One question is, would the above work for postcopy when the VM is
> > > suspended during the switchover?
> >
> > Good catch, that is broken.
> > I added qemu_system_start_on_wakeup_request to
> > loadvm_postcopy_handle_run_bh and now it works.
> >
> >     if (global_state_get_runstate() == RUN_STATE_RUNNING) {
> >         if (autostart) {
> >             vm_start();
> >         } else {
> >             runstate_set(RUN_STATE_PAUSED);
> >         }
> >     } else {
> >         runstate_set(global_state_get_runstate());
> >         if (runstate_check(RUN_STATE_SUSPENDED)) {
> >             qemu_system_start_on_wakeup_request();
> >         }
> >     }
> >
> > > I think I see your point that vm_start() (mostly vm_prepare_start())
> > > contains a bunch of operations that maybe we must have before
> > > starting the VM, but then.. should we just make that vm_start()
> > > unconditional when loading the VM completes?  I just don't see
> > > anything that won't need it (besides -S), even COLO.
> > >
> > > So I'm wondering about something like this:
> > >
> > > ===8<===
> > > --- a/migration/migration.c
> > > +++ b/migration/migration.c
> > > @@ -481,19 +481,28 @@ static void process_incoming_migration_bh(void *opaque)
> > >
Re: [PATCH V1 2/3] migration: fix suspended runstate
On Fri, Jun 23, 2023 at 02:25:05PM -0400, Steven Sistare wrote: > On 6/21/2023 4:28 PM, Peter Xu wrote: > > On Wed, Jun 21, 2023 at 03:15:42PM -0400, Steven Sistare wrote: > >> On 6/20/2023 5:46 PM, Peter Xu wrote: > >>> On Thu, Jun 15, 2023 at 01:26:39PM -0700, Steve Sistare wrote: > Migration of a guest in the suspended state is broken. The incoming > migration code automatically tries to wake the guest, which IMO is > wrong -- the guest should end migration in the same state it started. > Further, the wakeup is done by calling qemu_system_wakeup_request(), > which > bypasses vm_start(). The guest appears to be in the running state, but > it is not. > > To fix, leave the guest in the suspended state, but call > qemu_system_start_on_wakeup_request() so the guest is properly resumed > later, when the client sends a system_wakeup command. > > Signed-off-by: Steve Sistare > --- > migration/migration.c | 11 --- > softmmu/runstate.c| 1 + > 2 files changed, 5 insertions(+), 7 deletions(-) > > diff --git a/migration/migration.c b/migration/migration.c > index 17b4b47..851fe6d 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void > *opaque) > vm_start(); > } else { > runstate_set(global_state_get_runstate()); > +if (runstate_check(RUN_STATE_SUSPENDED)) { > +/* Force vm_start to be called later. */ > +qemu_system_start_on_wakeup_request(); > +} > >>> > >>> Is this really needed, along with patch 1? > >>> > >>> I have a very limited knowledge on suspension, so I'm prone to making > >>> mistakes.. > >>> > >>> But from what I read this, qemu_system_wakeup_request() (existing one, not > >>> after patch 1 applied) will setup wakeup_reason and kick the main thread > >>> using qemu_notify_event(). Then IIUC the e.g. vcpu wakeups will be done > >>> in > >>> the main thread later on after qemu_wakeup_requested() returns true. 
> >> > >> Correct, here: > >> > >> if (qemu_wakeup_requested()) { > >> pause_all_vcpus(); > >> qemu_system_wakeup(); > >> notifier_list_notify(&wakeup_notifiers, &wakeup_reason); > >> wakeup_reason = QEMU_WAKEUP_REASON_NONE; > >> resume_all_vcpus(); > >> qapi_event_send_wakeup(); > >> } > >> > >> However, that is not sufficient, because vm_start() was never called on > >> the incoming > >> side. vm_start calls the vm state notifiers for RUN_STATE_RUNNING, among > >> other things. > >> > >> > >> Without my fixes, it "works" because the outgoing migration automatically > >> wakes a suspended > >> guest, which sets the state to running, which is saved in global state: > >> > >> void migration_completion(MigrationState *s) > >> qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL); > >> global_state_store() > >> > >> Then the incoming migration calls vm_start here: > >> > >> migration/migration.c > >> if (!global_state_received() || > >> global_state_get_runstate() == RUN_STATE_RUNNING) { > >> if (autostart) { > >> vm_start(); > >> > >> vm_start must be called for correctness. > > > > I see. Though I had a feeling that this is still not the right way to do, > > at least not as clean. > > > > One question is, would above work for postcopy when VM is suspended during > > the switchover? > > Good catch, that is broken. > I added qemu_system_start_on_wakeup_request to loadvm_postcopy_handle_run_bh > and now it works. > > if (global_state_get_runstate() == RUN_STATE_RUNNING) { > if (autostart) { > vm_start(); > } else { > runstate_set(RUN_STATE_PAUSED); > } > } else { > runstate_set(global_state_get_runstate()); > if (runstate_check(RUN_STATE_SUSPENDED)) { > qemu_system_start_on_wakeup_request(); > } > } > > > I think I see your point that vm_start() (mostly vm_prepare_start()) > > contains a bunch of operations that maybe we must have before starting the > > VM, but then.. should we just make that vm_start() unconditional when > > loading VM completes? 
I just don't see anything won't need it (besides > > -S), even COLO. > > > > So I'm wondering about something like this: > > > > ===8<=== > > --- a/migration/migration.c > > +++ b/migration/migration.c > > @@ -481,19 +481,28 @@ static void process_incoming_migration_bh(void > > *opaque) > > > > dirty_bitmap_mig_before_vm_start(); > > > > -if (!global_state_received() || > > -global_state_get_runstate() == RUN_STATE_RUNNING) { > > -if (autostart) { > > -vm_start(); > > -} else { > > -runstate_set(RUN_STATE_PAUSED); > > -} > > -} else
Re: [PATCH V1 2/3] migration: fix suspended runstate
On 6/23/2023 2:25 PM, Steven Sistare wrote: > On 6/21/2023 4:28 PM, Peter Xu wrote: >> On Wed, Jun 21, 2023 at 03:15:42PM -0400, Steven Sistare wrote: >>> On 6/20/2023 5:46 PM, Peter Xu wrote: On Thu, Jun 15, 2023 at 01:26:39PM -0700, Steve Sistare wrote: > Migration of a guest in the suspended state is broken. The incoming > migration code automatically tries to wake the guest, which IMO is > wrong -- the guest should end migration in the same state it started. > Further, the wakeup is done by calling qemu_system_wakeup_request(), which > bypasses vm_start(). The guest appears to be in the running state, but > it is not. > > To fix, leave the guest in the suspended state, but call > qemu_system_start_on_wakeup_request() so the guest is properly resumed > later, when the client sends a system_wakeup command. > > Signed-off-by: Steve Sistare > --- > migration/migration.c | 11 --- > softmmu/runstate.c| 1 + > 2 files changed, 5 insertions(+), 7 deletions(-) > > diff --git a/migration/migration.c b/migration/migration.c > index 17b4b47..851fe6d 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void > *opaque) > vm_start(); > } else { > runstate_set(global_state_get_runstate()); > +if (runstate_check(RUN_STATE_SUSPENDED)) { > +/* Force vm_start to be called later. */ > +qemu_system_start_on_wakeup_request(); > +} Is this really needed, along with patch 1? I have a very limited knowledge on suspension, so I'm prone to making mistakes.. But from what I read this, qemu_system_wakeup_request() (existing one, not after patch 1 applied) will setup wakeup_reason and kick the main thread using qemu_notify_event(). Then IIUC the e.g. vcpu wakeups will be done in the main thread later on after qemu_wakeup_requested() returns true. 
>>> >>> Correct, here: >>> >>> if (qemu_wakeup_requested()) { >>> pause_all_vcpus(); >>> qemu_system_wakeup(); >>> notifier_list_notify(&wakeup_notifiers, &wakeup_reason); >>> wakeup_reason = QEMU_WAKEUP_REASON_NONE; >>> resume_all_vcpus(); >>> qapi_event_send_wakeup(); >>> } >>> >>> However, that is not sufficient, because vm_start() was never called on the >>> incoming >>> side. vm_start calls the vm state notifiers for RUN_STATE_RUNNING, among >>> other things. >>> >>> >>> Without my fixes, it "works" because the outgoing migration automatically >>> wakes a suspended >>> guest, which sets the state to running, which is saved in global state: >>> >>> void migration_completion(MigrationState *s) >>> qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL); >>> global_state_store() >>> >>> Then the incoming migration calls vm_start here: >>> >>> migration/migration.c >>> if (!global_state_received() || >>> global_state_get_runstate() == RUN_STATE_RUNNING) { >>> if (autostart) { >>> vm_start(); >>> >>> vm_start must be called for correctness. >> >> I see. Though I had a feeling that this is still not the right way to do, >> at least not as clean. >> >> One question is, would above work for postcopy when VM is suspended during >> the switchover? > > Good catch, that is broken. > I added qemu_system_start_on_wakeup_request to loadvm_postcopy_handle_run_bh > and now it works. > > if (global_state_get_runstate() == RUN_STATE_RUNNING) { > if (autostart) { > vm_start(); > } else { > runstate_set(RUN_STATE_PAUSED); > } > } else { > runstate_set(global_state_get_runstate()); > if (runstate_check(RUN_STATE_SUSPENDED)) { > qemu_system_start_on_wakeup_request(); > } > } > >> I think I see your point that vm_start() (mostly vm_prepare_start()) >> contains a bunch of operations that maybe we must have before starting the >> VM, but then.. should we just make that vm_start() unconditional when >> loading VM completes? 
I just don't see anything won't need it (besides >> -S), even COLO. >> >> So I'm wondering about something like this: >> >> ===8<=== >> --- a/migration/migration.c >> +++ b/migration/migration.c >> @@ -481,19 +481,28 @@ static void process_incoming_migration_bh(void *opaque) >> >> dirty_bitmap_mig_before_vm_start(); >> >> -if (!global_state_received() || >> -global_state_get_runstate() == RUN_STATE_RUNNING) { >> -if (autostart) { >> -vm_start(); >> -} else { >> -runstate_set(RUN_STATE_PAUSED); >> -} >> -} else if (migration_incoming_colo_enabled()) { >> +if (migration_incoming_colo_enabled()) { >> migration_incoming_disable_colo(); >> +/*
Re: [PATCH V1 2/3] migration: fix suspended runstate
On 6/21/2023 4:28 PM, Peter Xu wrote: > On Wed, Jun 21, 2023 at 03:15:42PM -0400, Steven Sistare wrote: >> On 6/20/2023 5:46 PM, Peter Xu wrote: >>> On Thu, Jun 15, 2023 at 01:26:39PM -0700, Steve Sistare wrote: Migration of a guest in the suspended state is broken. The incoming migration code automatically tries to wake the guest, which IMO is wrong -- the guest should end migration in the same state it started. Further, the wakeup is done by calling qemu_system_wakeup_request(), which bypasses vm_start(). The guest appears to be in the running state, but it is not. To fix, leave the guest in the suspended state, but call qemu_system_start_on_wakeup_request() so the guest is properly resumed later, when the client sends a system_wakeup command. Signed-off-by: Steve Sistare --- migration/migration.c | 11 --- softmmu/runstate.c| 1 + 2 files changed, 5 insertions(+), 7 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 17b4b47..851fe6d 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void *opaque) vm_start(); } else { runstate_set(global_state_get_runstate()); +if (runstate_check(RUN_STATE_SUSPENDED)) { +/* Force vm_start to be called later. */ +qemu_system_start_on_wakeup_request(); +} >>> >>> Is this really needed, along with patch 1? >>> >>> I have a very limited knowledge on suspension, so I'm prone to making >>> mistakes.. >>> >>> But from what I read this, qemu_system_wakeup_request() (existing one, not >>> after patch 1 applied) will setup wakeup_reason and kick the main thread >>> using qemu_notify_event(). Then IIUC the e.g. vcpu wakeups will be done in >>> the main thread later on after qemu_wakeup_requested() returns true. 
>> >> Correct, here: >> >> if (qemu_wakeup_requested()) { >> pause_all_vcpus(); >> qemu_system_wakeup(); >> notifier_list_notify(&wakeup_notifiers, &wakeup_reason); >> wakeup_reason = QEMU_WAKEUP_REASON_NONE; >> resume_all_vcpus(); >> qapi_event_send_wakeup(); >> } >> >> However, that is not sufficient, because vm_start() was never called on the >> incoming >> side. vm_start calls the vm state notifiers for RUN_STATE_RUNNING, among >> other things. >> >> >> Without my fixes, it "works" because the outgoing migration automatically >> wakes a suspended >> guest, which sets the state to running, which is saved in global state: >> >> void migration_completion(MigrationState *s) >> qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL); >> global_state_store() >> >> Then the incoming migration calls vm_start here: >> >> migration/migration.c >> if (!global_state_received() || >> global_state_get_runstate() == RUN_STATE_RUNNING) { >> if (autostart) { >> vm_start(); >> >> vm_start must be called for correctness. > > I see. Though I had a feeling that this is still not the right way to do, > at least not as clean. > > One question is, would above work for postcopy when VM is suspended during > the switchover? Good catch, that is broken. I added qemu_system_start_on_wakeup_request to loadvm_postcopy_handle_run_bh and now it works. if (global_state_get_runstate() == RUN_STATE_RUNNING) { if (autostart) { vm_start(); } else { runstate_set(RUN_STATE_PAUSED); } } else { runstate_set(global_state_get_runstate()); if (runstate_check(RUN_STATE_SUSPENDED)) { qemu_system_start_on_wakeup_request(); } } > I think I see your point that vm_start() (mostly vm_prepare_start()) > contains a bunch of operations that maybe we must have before starting the > VM, but then.. should we just make that vm_start() unconditional when > loading VM completes? I just don't see anything won't need it (besides > -S), even COLO. 
> > So I'm wondering about something like this: > > ===8<=== > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -481,19 +481,28 @@ static void process_incoming_migration_bh(void *opaque) > > dirty_bitmap_mig_before_vm_start(); > > -if (!global_state_received() || > -global_state_get_runstate() == RUN_STATE_RUNNING) { > -if (autostart) { > -vm_start(); > -} else { > -runstate_set(RUN_STATE_PAUSED); > -} > -} else if (migration_incoming_colo_enabled()) { > +if (migration_incoming_colo_enabled()) { > migration_incoming_disable_colo(); > +/* COLO should always have autostart=1 or we can enforce it here */ > +} > + > +if (autostart) { > +RunState run_state = global_state_get_runstate(); > vm_start();
Re: [PATCH V1 2/3] migration: fix suspended runstate
On Wed, Jun 21, 2023 at 03:15:42PM -0400, Steven Sistare wrote: > On 6/20/2023 5:46 PM, Peter Xu wrote: > > On Thu, Jun 15, 2023 at 01:26:39PM -0700, Steve Sistare wrote: > >> Migration of a guest in the suspended state is broken. The incoming > >> migration code automatically tries to wake the guest, which IMO is > >> wrong -- the guest should end migration in the same state it started. > >> Further, the wakeup is done by calling qemu_system_wakeup_request(), which > >> bypasses vm_start(). The guest appears to be in the running state, but > >> it is not. > >> > >> To fix, leave the guest in the suspended state, but call > >> qemu_system_start_on_wakeup_request() so the guest is properly resumed > >> later, when the client sends a system_wakeup command. > >> > >> Signed-off-by: Steve Sistare > >> --- > >> migration/migration.c | 11 --- > >> softmmu/runstate.c| 1 + > >> 2 files changed, 5 insertions(+), 7 deletions(-) > >> > >> diff --git a/migration/migration.c b/migration/migration.c > >> index 17b4b47..851fe6d 100644 > >> --- a/migration/migration.c > >> +++ b/migration/migration.c > >> @@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void > >> *opaque) > >> vm_start(); > >> } else { > >> runstate_set(global_state_get_runstate()); > >> +if (runstate_check(RUN_STATE_SUSPENDED)) { > >> +/* Force vm_start to be called later. */ > >> +qemu_system_start_on_wakeup_request(); > >> +} > > > > Is this really needed, along with patch 1? > > > > I have a very limited knowledge on suspension, so I'm prone to making > > mistakes.. > > > > But from what I read this, qemu_system_wakeup_request() (existing one, not > > after patch 1 applied) will setup wakeup_reason and kick the main thread > > using qemu_notify_event(). Then IIUC the e.g. vcpu wakeups will be done in > > the main thread later on after qemu_wakeup_requested() returns true. 
> > Correct, here: > > if (qemu_wakeup_requested()) { > pause_all_vcpus(); > qemu_system_wakeup(); > notifier_list_notify(&wakeup_notifiers, &wakeup_reason); > wakeup_reason = QEMU_WAKEUP_REASON_NONE; > resume_all_vcpus(); > qapi_event_send_wakeup(); > } > > However, that is not sufficient, because vm_start() was never called on the > incoming > side. vm_start calls the vm state notifiers for RUN_STATE_RUNNING, among > other things. > > > Without my fixes, it "works" because the outgoing migration automatically > wakes a suspended > guest, which sets the state to running, which is saved in global state: > > void migration_completion(MigrationState *s) > qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL); > global_state_store() > > Then the incoming migration calls vm_start here: > > migration/migration.c > if (!global_state_received() || > global_state_get_runstate() == RUN_STATE_RUNNING) { > if (autostart) { > vm_start(); > > vm_start must be called for correctness. I see. Though I had a feeling that this is still not the right way to do, at least not as clean. One question is, would above work for postcopy when VM is suspended during the switchover? I think I see your point that vm_start() (mostly vm_prepare_start()) contains a bunch of operations that maybe we must have before starting the VM, but then.. should we just make that vm_start() unconditional when loading VM completes? I just don't see anything won't need it (besides -S), even COLO. 
So I'm wondering about something like this:

===8<===
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -481,19 +481,28 @@ static void process_incoming_migration_bh(void *opaque)
 
     dirty_bitmap_mig_before_vm_start();
 
-    if (!global_state_received() ||
-        global_state_get_runstate() == RUN_STATE_RUNNING) {
-        if (autostart) {
-            vm_start();
-        } else {
-            runstate_set(RUN_STATE_PAUSED);
-        }
-    } else if (migration_incoming_colo_enabled()) {
+    if (migration_incoming_colo_enabled()) {
         migration_incoming_disable_colo();
+        /* COLO should always have autostart=1 or we can enforce it here */
+    }
+
+    if (autostart) {
+        RunState run_state = global_state_get_runstate();
+
+        vm_start();
+        switch (run_state) {
+        case RUN_STATE_RUNNING:
+            break;
+        case RUN_STATE_SUSPENDED:
+            qemu_system_suspend();
+            break;
+        default:
+            runstate_set(run_state);
+            break;
+        }
     } else {
-        runstate_set(global_state_get_runstate());
+        runstate_set(RUN_STATE_PAUSED);
     }
===8<===

IIUC this can drop qemu_system_start_on_wakeup_request() along with the
other global var.  Would something like it work for us?

-- 
Peter Xu
Re: [PATCH V1 2/3] migration: fix suspended runstate
On 6/20/2023 5:46 PM, Peter Xu wrote: > On Thu, Jun 15, 2023 at 01:26:39PM -0700, Steve Sistare wrote: >> Migration of a guest in the suspended state is broken. The incoming >> migration code automatically tries to wake the guest, which IMO is >> wrong -- the guest should end migration in the same state it started. >> Further, the wakeup is done by calling qemu_system_wakeup_request(), which >> bypasses vm_start(). The guest appears to be in the running state, but >> it is not. >> >> To fix, leave the guest in the suspended state, but call >> qemu_system_start_on_wakeup_request() so the guest is properly resumed >> later, when the client sends a system_wakeup command. >> >> Signed-off-by: Steve Sistare >> --- >> migration/migration.c | 11 --- >> softmmu/runstate.c| 1 + >> 2 files changed, 5 insertions(+), 7 deletions(-) >> >> diff --git a/migration/migration.c b/migration/migration.c >> index 17b4b47..851fe6d 100644 >> --- a/migration/migration.c >> +++ b/migration/migration.c >> @@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void *opaque) >> vm_start(); >> } else { >> runstate_set(global_state_get_runstate()); >> +if (runstate_check(RUN_STATE_SUSPENDED)) { >> +/* Force vm_start to be called later. */ >> +qemu_system_start_on_wakeup_request(); >> +} > > Is this really needed, along with patch 1? > > I have a very limited knowledge on suspension, so I'm prone to making > mistakes.. > > But from what I read this, qemu_system_wakeup_request() (existing one, not > after patch 1 applied) will setup wakeup_reason and kick the main thread > using qemu_notify_event(). Then IIUC the e.g. vcpu wakeups will be done in > the main thread later on after qemu_wakeup_requested() returns true. 
Correct, here:

    if (qemu_wakeup_requested()) {
        pause_all_vcpus();
        qemu_system_wakeup();
        notifier_list_notify(&wakeup_notifiers, &wakeup_reason);
        wakeup_reason = QEMU_WAKEUP_REASON_NONE;
        resume_all_vcpus();
        qapi_event_send_wakeup();
    }

However, that is not sufficient, because vm_start() was never called on the
incoming side.  vm_start calls the vm state notifiers for RUN_STATE_RUNNING,
among other things.

Without my fixes, it "works" because the outgoing migration automatically
wakes a suspended guest, which sets the state to running, which is saved in
global state:

    void migration_completion(MigrationState *s)
        qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
        global_state_store()

Then the incoming migration calls vm_start here:

    migration/migration.c
        if (!global_state_received() ||
            global_state_get_runstate() == RUN_STATE_RUNNING) {
            if (autostart) {
                vm_start();

vm_start must be called for correctness.

- Steve

>> }
>> /*
>>  * This must happen after any state changes since as soon as an external
>> @@ -2101,7 +2105,6 @@ static int postcopy_start(MigrationState *ms)
>>     qemu_mutex_lock_iothread();
>>     trace_postcopy_start_set_run();
>>
>> -   qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
>>     global_state_store();
>>     ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
>>     if (ret < 0) {
>> @@ -2307,7 +2310,6 @@ static void migration_completion(MigrationState *s)
>>     if (s->state == MIGRATION_STATUS_ACTIVE) {
>>         qemu_mutex_lock_iothread();
>>         s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>> -       qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
>>
>>         s->vm_old_state = runstate_get();
>>         global_state_store();
>> @@ -3102,11 +3104,6 @@ static void *bg_migration_thread(void *opaque)
>>
>>     qemu_mutex_lock_iothread();
>>
>> -   /*
>> -    * If VM is currently in suspended state, then, to make a valid runstate
>> -    * transition in vm_stop_force_state() we need to wakeup it up.
>> -    */
>> -   qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
>
> Removal of these three places seems reasonable to me, or we won't persist
> the SUSPEND state.
>
> Above comment was the major reason I used to have thought it was needed
> (again, based on zero knowledge around this..), but perhaps it was just
> wrong?  I would assume vm_stop_force_state() will still just work with
> suspended, am I right?
>
>> s->vm_old_state = runstate_get();
>>
>> global_state_store();
>> diff --git a/softmmu/runstate.c b/softmmu/runstate.c
>> index e127b21..771896c 100644
>> --- a/softmmu/runstate.c
>> +++ b/softmmu/runstate.c
>> @@ -159,6 +159,7 @@ static const RunStateTransition runstate_transitions_def[] = {
>>     { RUN_STATE_RUNNING, RUN_STATE_SUSPENDED },
>>     { RUN_STATE_SUSPENDED, RUN_STATE_RUNNING },
>>     { RUN_STATE_SUSPENDED, RUN_STATE_FINISH_MIGRATE },
>> +   { RUN_STATE_SUSPENDED, RUN_STATE_PAUSED },
>>     { RUN_STATE_SUSPENDED, RUN_STATE_PRELA
Re: [PATCH V1 2/3] migration: fix suspended runstate
On Thu, Jun 15, 2023 at 01:26:39PM -0700, Steve Sistare wrote: > Migration of a guest in the suspended state is broken. The incoming > migration code automatically tries to wake the guest, which IMO is > wrong -- the guest should end migration in the same state it started. > Further, the wakeup is done by calling qemu_system_wakeup_request(), which > bypasses vm_start(). The guest appears to be in the running state, but > it is not. > > To fix, leave the guest in the suspended state, but call > qemu_system_start_on_wakeup_request() so the guest is properly resumed > later, when the client sends a system_wakeup command. > > Signed-off-by: Steve Sistare > --- > migration/migration.c | 11 --- > softmmu/runstate.c| 1 + > 2 files changed, 5 insertions(+), 7 deletions(-) > > diff --git a/migration/migration.c b/migration/migration.c > index 17b4b47..851fe6d 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void *opaque) > vm_start(); > } else { > runstate_set(global_state_get_runstate()); > +if (runstate_check(RUN_STATE_SUSPENDED)) { > +/* Force vm_start to be called later. */ > +qemu_system_start_on_wakeup_request(); > +} Is this really needed, along with patch 1? I have a very limited knowledge on suspension, so I'm prone to making mistakes.. But from what I read this, qemu_system_wakeup_request() (existing one, not after patch 1 applied) will setup wakeup_reason and kick the main thread using qemu_notify_event(). Then IIUC the e.g. vcpu wakeups will be done in the main thread later on after qemu_wakeup_requested() returns true. 
> }
> /*
>  * This must happen after any state changes since as soon as an external
> @@ -2101,7 +2105,6 @@ static int postcopy_start(MigrationState *ms)
>     qemu_mutex_lock_iothread();
>     trace_postcopy_start_set_run();
>
> -   qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
>     global_state_store();
>     ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
>     if (ret < 0) {
> @@ -2307,7 +2310,6 @@ static void migration_completion(MigrationState *s)
>     if (s->state == MIGRATION_STATUS_ACTIVE) {
>         qemu_mutex_lock_iothread();
>         s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> -       qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
>
>         s->vm_old_state = runstate_get();
>         global_state_store();
> @@ -3102,11 +3104,6 @@ static void *bg_migration_thread(void *opaque)
>
>     qemu_mutex_lock_iothread();
>
> -   /*
> -    * If VM is currently in suspended state, then, to make a valid runstate
> -    * transition in vm_stop_force_state() we need to wakeup it up.
> -    */
> -   qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);

Removal of these three places seems reasonable to me, or we won't persist
the SUSPEND state.

Above comment was the major reason I used to have thought it was needed
(again, based on zero knowledge around this..), but perhaps it was just
wrong?  I would assume vm_stop_force_state() will still just work with
suspended, am I right?

> s->vm_old_state = runstate_get();
>
> global_state_store();
> diff --git a/softmmu/runstate.c b/softmmu/runstate.c
> index e127b21..771896c 100644
> --- a/softmmu/runstate.c
> +++ b/softmmu/runstate.c
> @@ -159,6 +159,7 @@ static const RunStateTransition runstate_transitions_def[] = {
>     { RUN_STATE_RUNNING, RUN_STATE_SUSPENDED },
>     { RUN_STATE_SUSPENDED, RUN_STATE_RUNNING },
>     { RUN_STATE_SUSPENDED, RUN_STATE_FINISH_MIGRATE },
> +   { RUN_STATE_SUSPENDED, RUN_STATE_PAUSED },
>     { RUN_STATE_SUSPENDED, RUN_STATE_PRELAUNCH },
>     { RUN_STATE_SUSPENDED, RUN_STATE_COLO},
>
> --
> 1.8.3.1
>

-- 
Peter Xu
[PATCH V1 2/3] migration: fix suspended runstate
Migration of a guest in the suspended state is broken.  The incoming
migration code automatically tries to wake the guest, which IMO is
wrong -- the guest should end migration in the same state it started.
Further, the wakeup is done by calling qemu_system_wakeup_request(), which
bypasses vm_start().  The guest appears to be in the running state, but
it is not.

To fix, leave the guest in the suspended state, but call
qemu_system_start_on_wakeup_request() so the guest is properly resumed
later, when the client sends a system_wakeup command.

Signed-off-by: Steve Sistare
---
 migration/migration.c | 11 ---
 softmmu/runstate.c    |  1 +
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 17b4b47..851fe6d 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void *opaque)
         vm_start();
     } else {
         runstate_set(global_state_get_runstate());
+        if (runstate_check(RUN_STATE_SUSPENDED)) {
+            /* Force vm_start to be called later. */
+            qemu_system_start_on_wakeup_request();
+        }
     }
     /*
      * This must happen after any state changes since as soon as an external
@@ -2101,7 +2105,6 @@ static int postcopy_start(MigrationState *ms)
     qemu_mutex_lock_iothread();
     trace_postcopy_start_set_run();
 
-    qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
     global_state_store();
     ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
     if (ret < 0) {
@@ -2307,7 +2310,6 @@ static void migration_completion(MigrationState *s)
     if (s->state == MIGRATION_STATUS_ACTIVE) {
         qemu_mutex_lock_iothread();
         s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
-        qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
 
         s->vm_old_state = runstate_get();
         global_state_store();
@@ -3102,11 +3104,6 @@ static void *bg_migration_thread(void *opaque)
 
     qemu_mutex_lock_iothread();
 
-    /*
-     * If VM is currently in suspended state, then, to make a valid runstate
-     * transition in vm_stop_force_state() we need to wakeup it up.
-     */
-    qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
     s->vm_old_state = runstate_get();
 
     global_state_store();
diff --git a/softmmu/runstate.c b/softmmu/runstate.c
index e127b21..771896c 100644
--- a/softmmu/runstate.c
+++ b/softmmu/runstate.c
@@ -159,6 +159,7 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_RUNNING, RUN_STATE_SUSPENDED },
     { RUN_STATE_SUSPENDED, RUN_STATE_RUNNING },
     { RUN_STATE_SUSPENDED, RUN_STATE_FINISH_MIGRATE },
+    { RUN_STATE_SUSPENDED, RUN_STATE_PAUSED },
     { RUN_STATE_SUSPENDED, RUN_STATE_PRELAUNCH },
     { RUN_STATE_SUSPENDED, RUN_STATE_COLO},

-- 
1.8.3.1