Re: [PATCH V1 2/3] migration: fix suspended runstate

2023-08-24 Thread Steven Sistare
On 8/17/2023 2:19 PM, Peter Xu wrote:
> On Wed, Aug 16, 2023 at 01:48:13PM -0400, Steven Sistare wrote:
>> On 8/14/2023 3:37 PM, Peter Xu wrote:
>>> On Mon, Aug 14, 2023 at 02:53:56PM -0400, Steven Sistare wrote:
> Can we just call vm_state_notify() earlier?

 We cannot.  The guest is not running yet, and will not be until later.
 We cannot call notifiers that perform actions that complete, or react to, 
 the guest entering a running state.
>>>
>>> I tried to look at a few examples of the notifiees, and most of the ones I
>>> read react not to "vcpu running" but to "vm running" (in which case I think
>>> "suspended" mode falls into the "vm running" case); most of them do not
>>> care about the RunState parameter passed in, only the bool "running".
>>>
>>> In reality, when running=true, it must be RUNNING so far.
>>>
>>> In that case does it mean we should notify right after the switchover,
>>> since after migration the vm is indeed running only if the vcpus are not
>>> during suspend?
>>
>> I cannot parse your question, but maybe this answers it.
>> If the outgoing VM is running and not suspended, then the incoming side
>> tests for autostart==true and calls vm_start, which calls the notifiers,
>> right after the switchover.
> 
> I meant that IMHO SUSPENDED should be seen as a "vm running" case, just like
> RUNNING.  Then we should invoke vm_prepare_start(); it just needs some touch
> ups.
> 
>>
>>> One example (of a possible issue) is vfio_vmstate_change(), where IIUC if
>>> we try to suspend a VM the device should stay in VFIO_DEVICE_STATE_RUNNING;
>>> this kind of proves to me that SUSPEND is actually one of the
>>> running=true states.
>>>
>>> If we postpone all notifiers here, even after we have switched over to the
>>> dest qemu, until the next suspend wakeup, I think it means these devices
>>> will not be in VFIO_DEVICE_STATE_RUNNING after switchover but perhaps
>>> VFIO_DEVICE_STATE_STOP.
>>
>> or VFIO_DEVICE_STATE_RESUMING, which is set in vfio_load_setup.
>> AFAIK it is OK to remain in that state until wakeup is called later.
> 
> So let me provide another clue of why I think we should call
> vm_prepare_start()..
> 
> Firstly, I think the RESUME event should always be there right after we
> switch over, no matter whether we were suspended or not.  I just noticed
> that your test case would work because you put "wakeup" before RESUME.  I'm
> not sure whether that's correct.  I'd bet people could rely on that RESUME
> to identify the switchover.

Yes, possibly.

> More importantly, I'm wondering whether RTC should still be running during
> the suspended mode?  Sorry again if my knowledge over there is just
> limited, so correct me otherwise - but my understanding is during suspend
> mode (s1 or s3, frankly I can't tell which one this belongs..), rtc should
> still be running along with the system clock.  It means we _should_ at
> least call cpu_enable_ticks() to enable rtc:

Agreed.  The comment above cpu_get_ticks says:
  * return the time elapsed in VM between vm_start and vm_stop
Suspending a guest does not call vm_stop, so ticks keep running.
I also verified that experimentally.

> /*
>  * enable cpu_get_ticks()
>  * Caller must hold BQL which serves as mutex for vm_clock_seqlock.
>  */
> void cpu_enable_ticks(void)
> 
> I think that'll enable cpu_get_tsc() and make it start to work right.
> 
>>
>>> Ideally I think we should here call vm_state_notify() with running=true and
>>> state=SUSPEND, but since I do see some hooks are not well prepared for
>>> SUSPEND with running=true, I'd think we should, on the safe side, call
>>> vm_state_notify(running=true, state=RUNNING) even for SUSPEND at the
>>> switchover phase.  With that IIUC it'll naturally work (e.g. when we wake
>>> up again later we simply don't need to call the notifiers).
>>
>> Notifiers are just one piece, all the code in vm_prepare_start must be 
>> called.
>> Is it correct to call all of that long before we actually resume the CPUs in
>> wakeup?  I don't know, but what is the point?
> 
> The point is not only cleanliness (again, I really, really don't like that
> new global.. sorry), but also that now I think we should make the vm running.

I do believe it is safe to call vm_prepare_start immediately, since the vcpus
are never running when it is called.

>> The wakeup code still needs
>> modification to conditionally resume the vcpus.  The scheme would be roughly:
>>
>> loadvm_postcopy_handle_run_bh()
>>     runstate = global_state_get_runstate();
>>     if (runstate == RUN_STATE_RUNNING) {
>>         vm_start()
>>     } else if (runstate == RUN_STATE_SUSPENDED) {
>>         vm_prepare_start();   // the start of vm_start()
>>     }
>>
>> qemu_system_wakeup_request()
>>     if (some condition)
>>         resume_all_vcpus();   // the remainder of vm_start()
>>     else
>>         runstate_set(RUN_STATE_RUNNING)
> 
> No it doesn't.  wakeup_reason is set there, main loop does the resuming.
> See:
> 
> if 

Re: [PATCH V1 2/3] migration: fix suspended runstate

2023-08-17 Thread Peter Xu
On Wed, Aug 16, 2023 at 01:48:13PM -0400, Steven Sistare wrote:
> On 8/14/2023 3:37 PM, Peter Xu wrote:
> > On Mon, Aug 14, 2023 at 02:53:56PM -0400, Steven Sistare wrote:
> >>> Can we just call vm_state_notify() earlier?
> >>
> >> We cannot.  The guest is not running yet, and will not be until later.
> >> We cannot call notifiers that perform actions that complete, or react to, 
> >> the guest entering a running state.
> > 
> > I tried to look at a few examples of the notifiees, and most of the ones I
> > read react not to "vcpu running" but to "vm running" (in which case I think
> > "suspended" mode falls into the "vm running" case); most of them do not
> > care about the RunState parameter passed in, only the bool "running".
> > 
> > In reality, when running=true, it must be RUNNING so far.
> > 
> > In that case does it mean we should notify right after the switchover,
> > since after migration the vm is indeed running only if the vcpus are not
> > during suspend?
> 
> I cannot parse your question, but maybe this answers it.
> If the outgoing VM is running and not suspended, then the incoming side
> tests for autostart==true and calls vm_start, which calls the notifiers,
> right after the switchover.

I meant that IMHO SUSPENDED should be seen as a "vm running" case, just like
RUNNING.  Then we should invoke vm_prepare_start(); it just needs some touch
ups.

> 
> > One example (of a possible issue) is vfio_vmstate_change(), where IIUC if
> > we try to suspend a VM the device should stay in VFIO_DEVICE_STATE_RUNNING;
> > this kind of proves to me that SUSPEND is actually one of the
> > running=true states.
> > 
> > If we postpone all notifiers here, even after we have switched over to the
> > dest qemu, until the next suspend wakeup, I think it means these devices
> > will not be in VFIO_DEVICE_STATE_RUNNING after switchover but perhaps
> > VFIO_DEVICE_STATE_STOP.
> 
> or VFIO_DEVICE_STATE_RESUMING, which is set in vfio_load_setup.
> AFAIK it is OK to remain in that state until wakeup is called later.

So let me provide another clue of why I think we should call
vm_prepare_start()..

Firstly, I think the RESUME event should always be there right after we
switch over, no matter whether we were suspended or not.  I just noticed
that your test case would work because you put "wakeup" before RESUME.  I'm
not sure whether that's correct.  I'd bet people could rely on that RESUME
to identify the switchover.

More importantly, I'm wondering whether RTC should still be running during
the suspended mode?  Sorry again if my knowledge over there is just
limited, so correct me otherwise - but my understanding is during suspend
mode (s1 or s3, frankly I can't tell which one this belongs..), rtc should
still be running along with the system clock.  It means we _should_ at
least call cpu_enable_ticks() to enable rtc:

/*
 * enable cpu_get_ticks()
 * Caller must hold BQL which serves as mutex for vm_clock_seqlock.
 */
void cpu_enable_ticks(void)

I think that'll enable cpu_get_tsc() and make it start to work right.

> 
> > Ideally I think we should here call vm_state_notify() with running=true and
> > state=SUSPEND, but since I do see some hooks are not well prepared for
> > SUSPEND with running=true, I'd think we should, on the safe side, call
> > vm_state_notify(running=true, state=RUNNING) even for SUSPEND at the
> > switchover phase.  With that IIUC it'll naturally work (e.g. when we wake
> > up again later we simply don't need to call the notifiers).
> 
> Notifiers are just one piece, all the code in vm_prepare_start must be called.
> Is it correct to call all of that long before we actually resume the CPUs in
> wakeup?  I don't know, but what is the point?

The point is not only cleanliness (again, I really, really don't like that
new global.. sorry), but also that now I think we should make the vm running.

> The wakeup code still needs
> modification to conditionally resume the vcpus.  The scheme would be roughly:
> 
> loadvm_postcopy_handle_run_bh()
>     runstate = global_state_get_runstate();
>     if (runstate == RUN_STATE_RUNNING) {
>         vm_start()
>     } else if (runstate == RUN_STATE_SUSPENDED) {
>         vm_prepare_start();   // the start of vm_start()
>     }
>
> qemu_system_wakeup_request()
>     if (some condition)
>         resume_all_vcpus();   // the remainder of vm_start()
>     else
>         runstate_set(RUN_STATE_RUNNING)

No it doesn't.  wakeup_reason is set there, main loop does the resuming.
See:

if (qemu_wakeup_requested()) {
    pause_all_vcpus();
    qemu_system_wakeup();
    notifier_list_notify(&wakeup_notifiers, &wakeup_reason);
    wakeup_reason = QEMU_WAKEUP_REASON_NONE;
    resume_all_vcpus();
    qapi_event_send_wakeup();
}

> 
> How is that better than my patches
> [PATCH V3 01/10] vl: start on wakeup request
> [PATCH V3 02/10] migration: preserve suspended runstate
> 
> loadvm_postcopy_handle_run_bh()
> runstate = 

Re: [PATCH V1 2/3] migration: fix suspended runstate

2023-08-16 Thread Steven Sistare
On 8/14/2023 3:37 PM, Peter Xu wrote:
> On Mon, Aug 14, 2023 at 02:53:56PM -0400, Steven Sistare wrote:
>>> Can we just call vm_state_notify() earlier?
>>
>> We cannot.  The guest is not running yet, and will not be until later.
>> We cannot call notifiers that perform actions that complete, or react to, 
>> the guest entering a running state.
> 
> I tried to look at a few examples of the notifiees, and most of the ones I
> read react not to "vcpu running" but to "vm running" (in which case I think
> "suspended" mode falls into the "vm running" case); most of them do not
> care about the RunState parameter passed in, only the bool "running".
> 
> In reality, when running=true, it must be RUNNING so far.
> 
> In that case does it mean we should notify right after the switchover,
> since after migration the vm is indeed running only if the vcpus are not
> during suspend?

I cannot parse your question, but maybe this answers it.
If the outgoing VM is running and not suspended, then the incoming side
tests for autostart==true and calls vm_start, which calls the notifiers,
right after the switchover.

> One example (of a possible issue) is vfio_vmstate_change(), where IIUC if
> we try to suspend a VM the device should stay in VFIO_DEVICE_STATE_RUNNING;
> this kind of proves to me that SUSPEND is actually one of the
> running=true states.
> 
> If we postpone all notifiers here, even after we have switched over to the
> dest qemu, until the next suspend wakeup, I think it means these devices
> will not be in VFIO_DEVICE_STATE_RUNNING after switchover but perhaps
> VFIO_DEVICE_STATE_STOP.

or VFIO_DEVICE_STATE_RESUMING, which is set in vfio_load_setup.
AFAIK it is OK to remain in that state until wakeup is called later.

> Ideally I think we should here call vm_state_notify() with running=true and
> state=SUSPEND, but since I do see some hooks are not well prepared for
> SUSPEND with running=true, I'd think we should, on the safe side, call
> vm_state_notify(running=true, state=RUNNING) even for SUSPEND at the
> switchover phase.  With that IIUC it'll naturally work (e.g. when we wake
> up again later we simply don't need to call the notifiers).

Notifiers are just one piece, all the code in vm_prepare_start must be called.
Is it correct to call all of that long before we actually resume the CPUs in
wakeup?  I don't know, but what is the point?  The wakeup code still needs
modification to conditionally resume the vcpus.  The scheme would be roughly:

loadvm_postcopy_handle_run_bh()
    runstate = global_state_get_runstate();
    if (runstate == RUN_STATE_RUNNING) {
        vm_start()
    } else if (runstate == RUN_STATE_SUSPENDED) {
        vm_prepare_start();   // the start of vm_start()
    }

qemu_system_wakeup_request()
    if (some condition)
        resume_all_vcpus();   // the remainder of vm_start()
    else
        runstate_set(RUN_STATE_RUNNING)

How is that better than my patches
[PATCH V3 01/10] vl: start on wakeup request
[PATCH V3 02/10] migration: preserve suspended runstate

loadvm_postcopy_handle_run_bh()
    runstate = global_state_get_runstate();
    if (runstate == RUN_STATE_RUNNING)
        vm_start()
    else
        runstate_set(runstate);   // eg RUN_STATE_SUSPENDED

qemu_system_wakeup_request()
    if (!vm_started)
        vm_start();
    else
        runstate_set(RUN_STATE_RUNNING);

Recall this thread started with your comment "It then can avoid touching the 
system wakeup code which seems cleaner".  We still need to touch the wakeup
code.

- Steve



Re: [PATCH V1 2/3] migration: fix suspended runstate

2023-08-14 Thread Peter Xu
On Mon, Aug 14, 2023 at 02:53:56PM -0400, Steven Sistare wrote:
> > Can we just call vm_state_notify() earlier?
> 
> We cannot.  The guest is not running yet, and will not be until later.
> We cannot call notifiers that perform actions that complete, or react to, 
> the guest entering a running state.

I tried to look at a few examples of the notifiees, and most of the ones I
read react not to "vcpu running" but to "vm running" (in which case I think
"suspended" mode falls into the "vm running" case); most of them do not
care about the RunState parameter passed in, only the bool "running".

In reality, when running=true, it must be RUNNING so far.

In that case does it mean we should notify right after the switchover,
since after migration the vm is indeed running only if the vcpus are not
during suspend?

One example (of a possible issue) is vfio_vmstate_change(), where IIUC if
we try to suspend a VM the device should stay in VFIO_DEVICE_STATE_RUNNING;
this kind of proves to me that SUSPEND is actually one of the
running=true states.

If we postpone all notifiers here, even after we have switched over to the
dest qemu, until the next suspend wakeup, I think it means these devices
will not be in VFIO_DEVICE_STATE_RUNNING after switchover but perhaps
VFIO_DEVICE_STATE_STOP.

Ideally I think we should here call vm_state_notify() with running=true and
state=SUSPEND, but since I do see some hooks are not well prepared for
SUSPEND with running=true, I'd think we should, on the safe side, call
vm_state_notify(running=true, state=RUNNING) even for SUSPEND at the
switchover phase.  With that IIUC it'll naturally work (e.g. when we wake up
again later we simply don't need to call the notifiers).

-- 
Peter Xu




Re: [PATCH V1 2/3] migration: fix suspended runstate

2023-08-14 Thread Steven Sistare
On 7/26/2023 4:18 PM, Peter Xu wrote:
> On Fri, Jun 30, 2023 at 09:50:41AM -0400, Steven Sistare wrote:
>> On 6/26/2023 2:27 PM, Peter Xu wrote:
>>> On Fri, Jun 23, 2023 at 02:25:05PM -0400, Steven Sistare wrote:
 On 6/21/2023 4:28 PM, Peter Xu wrote:
> On Wed, Jun 21, 2023 at 03:15:42PM -0400, Steven Sistare wrote:
>> On 6/20/2023 5:46 PM, Peter Xu wrote:
>>> On Thu, Jun 15, 2023 at 01:26:39PM -0700, Steve Sistare wrote:
 Migration of a guest in the suspended state is broken.  The incoming
 migration code automatically tries to wake the guest, which IMO is
 wrong -- the guest should end migration in the same state it started.
 Further, the wakeup is done by calling qemu_system_wakeup_request(), 
 which
 bypasses vm_start().  The guest appears to be in the running state, but
 it is not.

 To fix, leave the guest in the suspended state, but call
 qemu_system_start_on_wakeup_request() so the guest is properly resumed
 later, when the client sends a system_wakeup command.

 Signed-off-by: Steve Sistare 
 ---
  migration/migration.c | 11 ---
  softmmu/runstate.c|  1 +
  2 files changed, 5 insertions(+), 7 deletions(-)

 diff --git a/migration/migration.c b/migration/migration.c
 index 17b4b47..851fe6d 100644
 --- a/migration/migration.c
 +++ b/migration/migration.c
 @@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void 
 *opaque)
  vm_start();
  } else {
  runstate_set(global_state_get_runstate());
 +if (runstate_check(RUN_STATE_SUSPENDED)) {
 +/* Force vm_start to be called later. */
 +qemu_system_start_on_wakeup_request();
 +}
>>>
>>> Is this really needed, along with patch 1?
>>>
>>> I have a very limited knowledge on suspension, so I'm prone to making
>>> mistakes..
>>>
>>> But from what I read this, qemu_system_wakeup_request() (existing one, 
>>> not
>>> after patch 1 applied) will setup wakeup_reason and kick the main thread
>>> using qemu_notify_event().  Then IIUC the e.g. vcpu wakeups will be 
>>> done in
>>> the main thread later on after qemu_wakeup_requested() returns true.
>>
>> Correct, here:
>>
>> if (qemu_wakeup_requested()) {
>>     pause_all_vcpus();
>>     qemu_system_wakeup();
>>     notifier_list_notify(&wakeup_notifiers, &wakeup_reason);
>>     wakeup_reason = QEMU_WAKEUP_REASON_NONE;
>>     resume_all_vcpus();
>>     qapi_event_send_wakeup();
>> }
>>
>> However, that is not sufficient, because vm_start() was never called on 
>> the incoming
>> side.  vm_start calls the vm state notifiers for RUN_STATE_RUNNING, 
>> among other things.
>>
>>
>> Without my fixes, it "works" because the outgoing migration 
>> automatically wakes a suspended
>> guest, which sets the state to running, which is saved in global state:
>>
>> void migration_completion(MigrationState *s)
>> qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
>> global_state_store()
>>
>> Then the incoming migration calls vm_start here:
>>
>> migration/migration.c
>> if (!global_state_received() ||
>>     global_state_get_runstate() == RUN_STATE_RUNNING) {
>>     if (autostart) {
>>         vm_start();
>>
>> vm_start must be called for correctness.
>
> I see.  Though I had a feeling that this is still not the right way to do,
> at least not as clean.
>
> One question is, would above work for postcopy when VM is suspended during
> the switchover?

 Good catch, that is broken.
 I added qemu_system_start_on_wakeup_request to 
 loadvm_postcopy_handle_run_bh
 and now it works.

 if (global_state_get_runstate() == RUN_STATE_RUNNING) {
     if (autostart) {
         vm_start();
     } else {
         runstate_set(RUN_STATE_PAUSED);
     }
 } else {
     runstate_set(global_state_get_runstate());
     if (runstate_check(RUN_STATE_SUSPENDED)) {
         qemu_system_start_on_wakeup_request();
     }
 }

> I think I see your point that vm_start() (mostly vm_prepare_start())
> contains a bunch of operations that maybe we must have before starting the
> VM, but then.. should we just make that vm_start() unconditional when
> loading the VM completes?  I just don't see anything that won't need it
> (besides -S), even COLO.
>
> So I'm wondering about something like this:
>
> ===8<===
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -481,19 

Re: [PATCH V1 2/3] migration: fix suspended runstate

2023-07-26 Thread Peter Xu
On Fri, Jun 30, 2023 at 09:50:41AM -0400, Steven Sistare wrote:
> On 6/26/2023 2:27 PM, Peter Xu wrote:
> > On Fri, Jun 23, 2023 at 02:25:05PM -0400, Steven Sistare wrote:
> >> On 6/21/2023 4:28 PM, Peter Xu wrote:
> >>> On Wed, Jun 21, 2023 at 03:15:42PM -0400, Steven Sistare wrote:
>  On 6/20/2023 5:46 PM, Peter Xu wrote:
> > On Thu, Jun 15, 2023 at 01:26:39PM -0700, Steve Sistare wrote:
> >> Migration of a guest in the suspended state is broken.  The incoming
> >> migration code automatically tries to wake the guest, which IMO is
> >> wrong -- the guest should end migration in the same state it started.
> >> Further, the wakeup is done by calling qemu_system_wakeup_request(), 
> >> which
> >> bypasses vm_start().  The guest appears to be in the running state, but
> >> it is not.
> >>
> >> To fix, leave the guest in the suspended state, but call
> >> qemu_system_start_on_wakeup_request() so the guest is properly resumed
> >> later, when the client sends a system_wakeup command.
> >>
> >> Signed-off-by: Steve Sistare 
> >> ---
> >>  migration/migration.c | 11 ---
> >>  softmmu/runstate.c|  1 +
> >>  2 files changed, 5 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/migration/migration.c b/migration/migration.c
> >> index 17b4b47..851fe6d 100644
> >> --- a/migration/migration.c
> >> +++ b/migration/migration.c
> >> @@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void 
> >> *opaque)
> >>  vm_start();
> >>  } else {
> >>  runstate_set(global_state_get_runstate());
> >> +if (runstate_check(RUN_STATE_SUSPENDED)) {
> >> +/* Force vm_start to be called later. */
> >> +qemu_system_start_on_wakeup_request();
> >> +}
> >
> > Is this really needed, along with patch 1?
> >
> > I have a very limited knowledge on suspension, so I'm prone to making
> > mistakes..
> >
> > But from what I read this, qemu_system_wakeup_request() (existing one, 
> > not
> > after patch 1 applied) will setup wakeup_reason and kick the main thread
> > using qemu_notify_event().  Then IIUC the e.g. vcpu wakeups will be 
> > done in
> > the main thread later on after qemu_wakeup_requested() returns true.
> 
>  Correct, here:
> 
>  if (qemu_wakeup_requested()) {
>      pause_all_vcpus();
>      qemu_system_wakeup();
>      notifier_list_notify(&wakeup_notifiers, &wakeup_reason);
>      wakeup_reason = QEMU_WAKEUP_REASON_NONE;
>      resume_all_vcpus();
>      qapi_event_send_wakeup();
>  }
> 
>  However, that is not sufficient, because vm_start() was never called on 
>  the incoming
>  side.  vm_start calls the vm state notifiers for RUN_STATE_RUNNING, 
>  among other things.
> 
> 
>  Without my fixes, it "works" because the outgoing migration 
>  automatically wakes a suspended
>  guest, which sets the state to running, which is saved in global state:
> 
>  void migration_completion(MigrationState *s)
>  qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
>  global_state_store()
> 
>  Then the incoming migration calls vm_start here:
> 
>  migration/migration.c
>  if (!global_state_received() ||
>      global_state_get_runstate() == RUN_STATE_RUNNING) {
>      if (autostart) {
>          vm_start();
> 
>  vm_start must be called for correctness.
> >>>
> >>> I see.  Though I had a feeling that this is still not the right way to do,
> >>> at least not as clean.
> >>>
> >>> One question is, would above work for postcopy when VM is suspended during
> >>> the switchover?
> >>
> >> Good catch, that is broken.
> >> I added qemu_system_start_on_wakeup_request to 
> >> loadvm_postcopy_handle_run_bh
> >> and now it works.
> >>
> >> if (global_state_get_runstate() == RUN_STATE_RUNNING) {
> >>     if (autostart) {
> >>         vm_start();
> >>     } else {
> >>         runstate_set(RUN_STATE_PAUSED);
> >>     }
> >> } else {
> >>     runstate_set(global_state_get_runstate());
> >>     if (runstate_check(RUN_STATE_SUSPENDED)) {
> >>         qemu_system_start_on_wakeup_request();
> >>     }
> >> }
> >>
> >>> I think I see your point that vm_start() (mostly vm_prepare_start())
> >>> contains a bunch of operations that maybe we must have before starting
> >>> the VM, but then.. should we just make that vm_start() unconditional
> >>> when loading the VM completes?  I just don't see anything that won't
> >>> need it (besides -S), even COLO.
> >>>
> >>> So I'm wondering about something like this:
> >>>
> >>> ===8<===
> >>> --- a/migration/migration.c
> >>> +++ b/migration/migration.c
> >>> @@ -481,19 +481,28 @@ static void 

Re: [PATCH V1 2/3] migration: fix suspended runstate

2023-06-30 Thread Steven Sistare
On 6/26/2023 2:27 PM, Peter Xu wrote:
> On Fri, Jun 23, 2023 at 02:25:05PM -0400, Steven Sistare wrote:
>> On 6/21/2023 4:28 PM, Peter Xu wrote:
>>> On Wed, Jun 21, 2023 at 03:15:42PM -0400, Steven Sistare wrote:
 On 6/20/2023 5:46 PM, Peter Xu wrote:
> On Thu, Jun 15, 2023 at 01:26:39PM -0700, Steve Sistare wrote:
>> Migration of a guest in the suspended state is broken.  The incoming
>> migration code automatically tries to wake the guest, which IMO is
>> wrong -- the guest should end migration in the same state it started.
>> Further, the wakeup is done by calling qemu_system_wakeup_request(), 
>> which
>> bypasses vm_start().  The guest appears to be in the running state, but
>> it is not.
>>
>> To fix, leave the guest in the suspended state, but call
>> qemu_system_start_on_wakeup_request() so the guest is properly resumed
>> later, when the client sends a system_wakeup command.
>>
>> Signed-off-by: Steve Sistare 
>> ---
>>  migration/migration.c | 11 ---
>>  softmmu/runstate.c|  1 +
>>  2 files changed, 5 insertions(+), 7 deletions(-)
>>
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 17b4b47..851fe6d 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void 
>> *opaque)
>>  vm_start();
>>  } else {
>>  runstate_set(global_state_get_runstate());
>> +if (runstate_check(RUN_STATE_SUSPENDED)) {
>> +/* Force vm_start to be called later. */
>> +qemu_system_start_on_wakeup_request();
>> +}
>
> Is this really needed, along with patch 1?
>
> I have a very limited knowledge on suspension, so I'm prone to making
> mistakes..
>
> But from what I read this, qemu_system_wakeup_request() (existing one, not
> after patch 1 applied) will setup wakeup_reason and kick the main thread
> using qemu_notify_event().  Then IIUC the e.g. vcpu wakeups will be done 
> in
> the main thread later on after qemu_wakeup_requested() returns true.

 Correct, here:

 if (qemu_wakeup_requested()) {
     pause_all_vcpus();
     qemu_system_wakeup();
     notifier_list_notify(&wakeup_notifiers, &wakeup_reason);
     wakeup_reason = QEMU_WAKEUP_REASON_NONE;
     resume_all_vcpus();
     qapi_event_send_wakeup();
 }

 However, that is not sufficient, because vm_start() was never called on 
 the incoming
 side.  vm_start calls the vm state notifiers for RUN_STATE_RUNNING, among 
 other things.


 Without my fixes, it "works" because the outgoing migration automatically 
 wakes a suspended
 guest, which sets the state to running, which is saved in global state:

 void migration_completion(MigrationState *s)
 qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
 global_state_store()

 Then the incoming migration calls vm_start here:

 migration/migration.c
 if (!global_state_received() ||
     global_state_get_runstate() == RUN_STATE_RUNNING) {
     if (autostart) {
         vm_start();

 vm_start must be called for correctness.
>>>
>>> I see.  Though I had a feeling that this is still not the right way to do,
>>> at least not as clean.
>>>
>>> One question is, would above work for postcopy when VM is suspended during
>>> the switchover?
>>
>> Good catch, that is broken.
>> I added qemu_system_start_on_wakeup_request to loadvm_postcopy_handle_run_bh
>> and now it works.
>>
>> if (global_state_get_runstate() == RUN_STATE_RUNNING) {
>>     if (autostart) {
>>         vm_start();
>>     } else {
>>         runstate_set(RUN_STATE_PAUSED);
>>     }
>> } else {
>>     runstate_set(global_state_get_runstate());
>>     if (runstate_check(RUN_STATE_SUSPENDED)) {
>>         qemu_system_start_on_wakeup_request();
>>     }
>> }
>>
>>> I think I see your point that vm_start() (mostly vm_prepare_start())
>>> contains a bunch of operations that maybe we must have before starting the
>>> VM, but then.. should we just make that vm_start() unconditional when
>>> loading the VM completes?  I just don't see anything that won't need it
>>> (besides -S), even COLO.
>>>
>>> So I'm wondering about something like this:
>>>
>>> ===8<===
>>> --- a/migration/migration.c
>>> +++ b/migration/migration.c
>>> @@ -481,19 +481,28 @@ static void process_incoming_migration_bh(void 
>>> *opaque)
>>>  
>>>  dirty_bitmap_mig_before_vm_start();
>>>  
>>> -if (!global_state_received() ||
>>> -global_state_get_runstate() == RUN_STATE_RUNNING) {
>>> -if (autostart) {
>>> -vm_start();
>>> -} else {
>>> -

Re: [PATCH V1 2/3] migration: fix suspended runstate

2023-06-26 Thread Peter Xu
On Mon, Jun 26, 2023 at 02:27:32PM -0400, Peter Xu wrote:
> On Fri, Jun 23, 2023 at 02:25:05PM -0400, Steven Sistare wrote:
> > On 6/21/2023 4:28 PM, Peter Xu wrote:
> > > On Wed, Jun 21, 2023 at 03:15:42PM -0400, Steven Sistare wrote:
> > >> On 6/20/2023 5:46 PM, Peter Xu wrote:
> > >>> On Thu, Jun 15, 2023 at 01:26:39PM -0700, Steve Sistare wrote:
> >  Migration of a guest in the suspended state is broken.  The incoming
> >  migration code automatically tries to wake the guest, which IMO is
> >  wrong -- the guest should end migration in the same state it started.
> >  Further, the wakeup is done by calling qemu_system_wakeup_request(), 
> >  which
> >  bypasses vm_start().  The guest appears to be in the running state, but
> >  it is not.
> > 
> >  To fix, leave the guest in the suspended state, but call
> >  qemu_system_start_on_wakeup_request() so the guest is properly resumed
> >  later, when the client sends a system_wakeup command.
> > 
> >  Signed-off-by: Steve Sistare 
> >  ---
> >   migration/migration.c | 11 ---
> >   softmmu/runstate.c|  1 +
> >   2 files changed, 5 insertions(+), 7 deletions(-)
> > 
> >  diff --git a/migration/migration.c b/migration/migration.c
> >  index 17b4b47..851fe6d 100644
> >  --- a/migration/migration.c
> >  +++ b/migration/migration.c
> >  @@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void 
> >  *opaque)
> >   vm_start();
> >   } else {
> >   runstate_set(global_state_get_runstate());
> >  +if (runstate_check(RUN_STATE_SUSPENDED)) {
> >  +/* Force vm_start to be called later. */
> >  +qemu_system_start_on_wakeup_request();
> >  +}
> > >>>
> > >>> Is this really needed, along with patch 1?
> > >>>
> > >>> I have a very limited knowledge on suspension, so I'm prone to making
> > >>> mistakes..
> > >>>
> > >>> But from what I read this, qemu_system_wakeup_request() (existing one, 
> > >>> not
> > >>> after patch 1 applied) will setup wakeup_reason and kick the main thread
> > >>> using qemu_notify_event().  Then IIUC the e.g. vcpu wakeups will be 
> > >>> done in
> > >>> the main thread later on after qemu_wakeup_requested() returns true.
> > >>
> > >> Correct, here:
> > >>
> > >> if (qemu_wakeup_requested()) {
> > >>     pause_all_vcpus();
> > >>     qemu_system_wakeup();
> > >>     notifier_list_notify(&wakeup_notifiers, &wakeup_reason);
> > >>     wakeup_reason = QEMU_WAKEUP_REASON_NONE;
> > >>     resume_all_vcpus();
> > >>     qapi_event_send_wakeup();
> > >> }
> > >>
> > >> However, that is not sufficient, because vm_start() was never called on 
> > >> the incoming
> > >> side.  vm_start calls the vm state notifiers for RUN_STATE_RUNNING, 
> > >> among other things.
> > >>
> > >>
> > >> Without my fixes, it "works" because the outgoing migration 
> > >> automatically wakes a suspended
> > >> guest, which sets the state to running, which is saved in global state:
> > >>
> > >> void migration_completion(MigrationState *s)
> > >> qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
> > >> global_state_store()
> > >>
> > >> Then the incoming migration calls vm_start here:
> > >>
> > >> migration/migration.c
> > >> if (!global_state_received() ||
> > >>     global_state_get_runstate() == RUN_STATE_RUNNING) {
> > >>     if (autostart) {
> > >>         vm_start();
> > >>
> > >> vm_start must be called for correctness.
> > > 
> > > I see.  Though I had a feeling that this is still not the right way to do,
> > > at least not as clean.
> > > 
> > > One question is, would above work for postcopy when VM is suspended during
> > > the switchover?
> > 
> > Good catch, that is broken.
> > I added qemu_system_start_on_wakeup_request to loadvm_postcopy_handle_run_bh
> > and now it works.
> > 
> > if (global_state_get_runstate() == RUN_STATE_RUNNING) {
> >     if (autostart) {
> >         vm_start();
> >     } else {
> >         runstate_set(RUN_STATE_PAUSED);
> >     }
> > } else {
> >     runstate_set(global_state_get_runstate());
> >     if (runstate_check(RUN_STATE_SUSPENDED)) {
> >         qemu_system_start_on_wakeup_request();
> >     }
> > }
> > 
> > > I think I see your point that vm_start() (mostly vm_prepare_start())
> > > contains a bunch of operations that maybe we must have before starting the
> > > VM, but then.. should we just make that vm_start() unconditional when
> > > loading VM completes?  I just don't see anything that won't need it (besides
> > > -S), even COLO.
> > > 
> > > So I'm wondering about something like this:
> > > 
> > > ===8<===
> > > --- a/migration/migration.c
> > > +++ b/migration/migration.c
> > > @@ -481,19 +481,28 @@ static void process_incoming_migration_bh(void 
> > > *opaque)
> > >  
> > >  

Re: [PATCH V1 2/3] migration: fix suspended runstate

2023-06-26 Thread Peter Xu
On Fri, Jun 23, 2023 at 02:25:05PM -0400, Steven Sistare wrote:
> On 6/21/2023 4:28 PM, Peter Xu wrote:
> > On Wed, Jun 21, 2023 at 03:15:42PM -0400, Steven Sistare wrote:
> >> On 6/20/2023 5:46 PM, Peter Xu wrote:
> >>> On Thu, Jun 15, 2023 at 01:26:39PM -0700, Steve Sistare wrote:
>  Migration of a guest in the suspended state is broken.  The incoming
>  migration code automatically tries to wake the guest, which IMO is
>  wrong -- the guest should end migration in the same state it started.
>  Further, the wakeup is done by calling qemu_system_wakeup_request(), 
>  which
>  bypasses vm_start().  The guest appears to be in the running state, but
>  it is not.
> 
>  To fix, leave the guest in the suspended state, but call
>  qemu_system_start_on_wakeup_request() so the guest is properly resumed
>  later, when the client sends a system_wakeup command.
> 
>  Signed-off-by: Steve Sistare 
>  ---
>   migration/migration.c | 11 ---
>   softmmu/runstate.c|  1 +
>   2 files changed, 5 insertions(+), 7 deletions(-)
> 
>  diff --git a/migration/migration.c b/migration/migration.c
>  index 17b4b47..851fe6d 100644
>  --- a/migration/migration.c
>  +++ b/migration/migration.c
>  @@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void 
>  *opaque)
>   vm_start();
>   } else {
>   runstate_set(global_state_get_runstate());
>  +if (runstate_check(RUN_STATE_SUSPENDED)) {
>  +/* Force vm_start to be called later. */
>  +qemu_system_start_on_wakeup_request();
>  +}
> >>>
> >>> Is this really needed, along with patch 1?
> >>>
> >>> I have a very limited knowledge on suspension, so I'm prone to making
> >>> mistakes..
> >>>
> >>> But from what I read this, qemu_system_wakeup_request() (existing one, not
> >>> after patch 1 applied) will setup wakeup_reason and kick the main thread
> >>> using qemu_notify_event().  Then IIUC the e.g. vcpu wakeups will be done 
> >>> in
> >>> the main thread later on after qemu_wakeup_requested() returns true.
> >>
> >> Correct, here:
> >>
> >> if (qemu_wakeup_requested()) {
> >> pause_all_vcpus();
> >> qemu_system_wakeup();
> >> notifier_list_notify(&wakeup_notifiers, &wakeup_reason);
> >> wakeup_reason = QEMU_WAKEUP_REASON_NONE;
> >> resume_all_vcpus();
> >> qapi_event_send_wakeup();
> >> }
> >>
> >> However, that is not sufficient, because vm_start() was never called on 
> >> the incoming
> >> side.  vm_start calls the vm state notifiers for RUN_STATE_RUNNING, among 
> >> other things.
> >>
> >>
> >> Without my fixes, it "works" because the outgoing migration automatically 
> >> wakes a suspended
> >> guest, which sets the state to running, which is saved in global state:
> >>
> >> void migration_completion(MigrationState *s)
> >> qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
> >> global_state_store()
> >>
> >> Then the incoming migration calls vm_start here:
> >>
> >> migration/migration.c
> >> if (!global_state_received() ||
> >> global_state_get_runstate() == RUN_STATE_RUNNING) {
> >> if (autostart) {
> >> vm_start();
> >>
> >> vm_start must be called for correctness.
> > 
> > I see.  Though I had a feeling that this is still not the right way to do,
> > at least not as clean.
> > 
> > One question is, would above work for postcopy when VM is suspended during
> > the switchover?
> 
> Good catch, that is broken.
> I added qemu_system_start_on_wakeup_request to loadvm_postcopy_handle_run_bh
> and now it works.
> 
> if (global_state_get_runstate() == RUN_STATE_RUNNING) {
> if (autostart) {
> vm_start();
> } else {
> runstate_set(RUN_STATE_PAUSED);
> }
> } else {
> runstate_set(global_state_get_runstate());
> if (runstate_check(RUN_STATE_SUSPENDED)) {
> qemu_system_start_on_wakeup_request();
> }
> }
> 
> > I think I see your point that vm_start() (mostly vm_prepare_start())
> > contains a bunch of operations that maybe we must have before starting the
> > VM, but then.. should we just make that vm_start() unconditional when
> > loading VM completes?  I just don't see anything that won't need it (besides
> > -S), even COLO.
> > 
> > So I'm wondering about something like this:
> > 
> > ===8<===
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -481,19 +481,28 @@ static void process_incoming_migration_bh(void 
> > *opaque)
> >  
> >  dirty_bitmap_mig_before_vm_start();
> >  
> > -if (!global_state_received() ||
> > -global_state_get_runstate() == RUN_STATE_RUNNING) {
> > -if (autostart) {
> > -vm_start();
> > -} else {
> > -runstate_set(RUN_STATE_PAUSED);
> > -}
> > -} else if (migration_incoming_colo_enabled()) {

Re: [PATCH V1 2/3] migration: fix suspended runstate

2023-06-23 Thread Steven Sistare
On 6/23/2023 2:25 PM, Steven Sistare wrote:
> On 6/21/2023 4:28 PM, Peter Xu wrote:
>> On Wed, Jun 21, 2023 at 03:15:42PM -0400, Steven Sistare wrote:
>>> On 6/20/2023 5:46 PM, Peter Xu wrote:
 On Thu, Jun 15, 2023 at 01:26:39PM -0700, Steve Sistare wrote:
> Migration of a guest in the suspended state is broken.  The incoming
> migration code automatically tries to wake the guest, which IMO is
> wrong -- the guest should end migration in the same state it started.
> Further, the wakeup is done by calling qemu_system_wakeup_request(), which
> bypasses vm_start().  The guest appears to be in the running state, but
> it is not.
>
> To fix, leave the guest in the suspended state, but call
> qemu_system_start_on_wakeup_request() so the guest is properly resumed
> later, when the client sends a system_wakeup command.
>
> Signed-off-by: Steve Sistare 
> ---
>  migration/migration.c | 11 ---
>  softmmu/runstate.c|  1 +
>  2 files changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 17b4b47..851fe6d 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void 
> *opaque)
>  vm_start();
>  } else {
>  runstate_set(global_state_get_runstate());
> +if (runstate_check(RUN_STATE_SUSPENDED)) {
> +/* Force vm_start to be called later. */
> +qemu_system_start_on_wakeup_request();
> +}

 Is this really needed, along with patch 1?

 I have a very limited knowledge on suspension, so I'm prone to making
 mistakes..

 But from what I read this, qemu_system_wakeup_request() (existing one, not
 after patch 1 applied) will setup wakeup_reason and kick the main thread
 using qemu_notify_event().  Then IIUC the e.g. vcpu wakeups will be done in
 the main thread later on after qemu_wakeup_requested() returns true.
>>>
>>> Correct, here:
>>>
>>> if (qemu_wakeup_requested()) {
>>> pause_all_vcpus();
>>> qemu_system_wakeup();
>>> notifier_list_notify(&wakeup_notifiers, &wakeup_reason);
>>> wakeup_reason = QEMU_WAKEUP_REASON_NONE;
>>> resume_all_vcpus();
>>> qapi_event_send_wakeup();
>>> }
>>>
>>> However, that is not sufficient, because vm_start() was never called on the 
>>> incoming
>>> side.  vm_start calls the vm state notifiers for RUN_STATE_RUNNING, among 
>>> other things.
>>>
>>>
>>> Without my fixes, it "works" because the outgoing migration automatically 
>>> wakes a suspended
>>> guest, which sets the state to running, which is saved in global state:
>>>
>>> void migration_completion(MigrationState *s)
>>> qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
>>> global_state_store()
>>>
>>> Then the incoming migration calls vm_start here:
>>>
>>> migration/migration.c
>>> if (!global_state_received() ||
>>> global_state_get_runstate() == RUN_STATE_RUNNING) {
>>> if (autostart) {
>>> vm_start();
>>>
>>> vm_start must be called for correctness.
>>
>> I see.  Though I had a feeling that this is still not the right way to do,
>> at least not as clean.
>>
>> One question is, would above work for postcopy when VM is suspended during
>> the switchover?
> 
> Good catch, that is broken.
> I added qemu_system_start_on_wakeup_request to loadvm_postcopy_handle_run_bh
> and now it works.
> 
> if (global_state_get_runstate() == RUN_STATE_RUNNING) {
> if (autostart) {
> vm_start();
> } else {
> runstate_set(RUN_STATE_PAUSED);
> }
> } else {
> runstate_set(global_state_get_runstate());
> if (runstate_check(RUN_STATE_SUSPENDED)) {
> qemu_system_start_on_wakeup_request();
> }
> }
> 
>> I think I see your point that vm_start() (mostly vm_prepare_start())
>> contains a bunch of operations that maybe we must have before starting the
>> VM, but then.. should we just make that vm_start() unconditional when
>> loading VM completes?  I just don't see anything that won't need it (besides
>> -S), even COLO.
>>
>> So I'm wondering about something like this:
>>
>> ===8<===
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -481,19 +481,28 @@ static void process_incoming_migration_bh(void *opaque)
>>  
>>  dirty_bitmap_mig_before_vm_start();
>>  
>> -if (!global_state_received() ||
>> -global_state_get_runstate() == RUN_STATE_RUNNING) {
>> -if (autostart) {
>> -vm_start();
>> -} else {
>> -runstate_set(RUN_STATE_PAUSED);
>> -}
>> -} else if (migration_incoming_colo_enabled()) {
>> +if (migration_incoming_colo_enabled()) {
>>  migration_incoming_disable_colo();
>> +/* COLO should always have autostart=1 or we can enforce it here */

Re: [PATCH V1 2/3] migration: fix suspended runstate

2023-06-23 Thread Steven Sistare
On 6/21/2023 4:28 PM, Peter Xu wrote:
> On Wed, Jun 21, 2023 at 03:15:42PM -0400, Steven Sistare wrote:
>> On 6/20/2023 5:46 PM, Peter Xu wrote:
>>> On Thu, Jun 15, 2023 at 01:26:39PM -0700, Steve Sistare wrote:
 Migration of a guest in the suspended state is broken.  The incoming
 migration code automatically tries to wake the guest, which IMO is
 wrong -- the guest should end migration in the same state it started.
 Further, the wakeup is done by calling qemu_system_wakeup_request(), which
 bypasses vm_start().  The guest appears to be in the running state, but
 it is not.

 To fix, leave the guest in the suspended state, but call
 qemu_system_start_on_wakeup_request() so the guest is properly resumed
 later, when the client sends a system_wakeup command.

 Signed-off-by: Steve Sistare 
 ---
  migration/migration.c | 11 ---
  softmmu/runstate.c|  1 +
  2 files changed, 5 insertions(+), 7 deletions(-)

 diff --git a/migration/migration.c b/migration/migration.c
 index 17b4b47..851fe6d 100644
 --- a/migration/migration.c
 +++ b/migration/migration.c
 @@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void 
 *opaque)
  vm_start();
  } else {
  runstate_set(global_state_get_runstate());
 +if (runstate_check(RUN_STATE_SUSPENDED)) {
 +/* Force vm_start to be called later. */
 +qemu_system_start_on_wakeup_request();
 +}
>>>
>>> Is this really needed, along with patch 1?
>>>
>>> I have a very limited knowledge on suspension, so I'm prone to making
>>> mistakes..
>>>
>>> But from what I read this, qemu_system_wakeup_request() (existing one, not
>>> after patch 1 applied) will setup wakeup_reason and kick the main thread
>>> using qemu_notify_event().  Then IIUC the e.g. vcpu wakeups will be done in
>>> the main thread later on after qemu_wakeup_requested() returns true.
>>
>> Correct, here:
>>
>> if (qemu_wakeup_requested()) {
>> pause_all_vcpus();
>> qemu_system_wakeup();
>> notifier_list_notify(&wakeup_notifiers, &wakeup_reason);
>> wakeup_reason = QEMU_WAKEUP_REASON_NONE;
>> resume_all_vcpus();
>> qapi_event_send_wakeup();
>> }
>>
>> However, that is not sufficient, because vm_start() was never called on the 
>> incoming
>> side.  vm_start calls the vm state notifiers for RUN_STATE_RUNNING, among 
>> other things.
>>
>>
>> Without my fixes, it "works" because the outgoing migration automatically 
>> wakes a suspended
>> guest, which sets the state to running, which is saved in global state:
>>
>> void migration_completion(MigrationState *s)
>> qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
>> global_state_store()
>>
>> Then the incoming migration calls vm_start here:
>>
>> migration/migration.c
>> if (!global_state_received() ||
>> global_state_get_runstate() == RUN_STATE_RUNNING) {
>> if (autostart) {
>> vm_start();
>>
>> vm_start must be called for correctness.
> 
> I see.  Though I had a feeling that this is still not the right way to do,
> at least not as clean.
> 
> One question is, would above work for postcopy when VM is suspended during
> the switchover?

Good catch, that is broken.
I added qemu_system_start_on_wakeup_request to loadvm_postcopy_handle_run_bh
and now it works.

if (global_state_get_runstate() == RUN_STATE_RUNNING) {
if (autostart) {
vm_start();
} else {
runstate_set(RUN_STATE_PAUSED);
}
} else {
runstate_set(global_state_get_runstate());
if (runstate_check(RUN_STATE_SUSPENDED)) {
qemu_system_start_on_wakeup_request();
}
}

> I think I see your point that vm_start() (mostly vm_prepare_start())
> contains a bunch of operations that maybe we must have before starting the
> VM, but then.. should we just make that vm_start() unconditional when
> loading VM completes?  I just don't see anything that won't need it (besides
> -S), even COLO.
> 
> So I'm wondering about something like this:
> 
> ===8<===
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -481,19 +481,28 @@ static void process_incoming_migration_bh(void *opaque)
>  
>  dirty_bitmap_mig_before_vm_start();
>  
> -if (!global_state_received() ||
> -global_state_get_runstate() == RUN_STATE_RUNNING) {
> -if (autostart) {
> -vm_start();
> -} else {
> -runstate_set(RUN_STATE_PAUSED);
> -}
> -} else if (migration_incoming_colo_enabled()) {
> +if (migration_incoming_colo_enabled()) {
>  migration_incoming_disable_colo();
> +/* COLO should always have autostart=1 or we can enforce it here */
> +}
> +
> +if (autostart) {
> +RunState run_state = global_state_get_runstate();
>  vm_start();

This will 

Re: [PATCH V1 2/3] migration: fix suspended runstate

2023-06-21 Thread Peter Xu
On Wed, Jun 21, 2023 at 03:15:42PM -0400, Steven Sistare wrote:
> On 6/20/2023 5:46 PM, Peter Xu wrote:
> > On Thu, Jun 15, 2023 at 01:26:39PM -0700, Steve Sistare wrote:
> >> Migration of a guest in the suspended state is broken.  The incoming
> >> migration code automatically tries to wake the guest, which IMO is
> >> wrong -- the guest should end migration in the same state it started.
> >> Further, the wakeup is done by calling qemu_system_wakeup_request(), which
> >> bypasses vm_start().  The guest appears to be in the running state, but
> >> it is not.
> >>
> >> To fix, leave the guest in the suspended state, but call
> >> qemu_system_start_on_wakeup_request() so the guest is properly resumed
> >> later, when the client sends a system_wakeup command.
> >>
> >> Signed-off-by: Steve Sistare 
> >> ---
> >>  migration/migration.c | 11 ---
> >>  softmmu/runstate.c|  1 +
> >>  2 files changed, 5 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/migration/migration.c b/migration/migration.c
> >> index 17b4b47..851fe6d 100644
> >> --- a/migration/migration.c
> >> +++ b/migration/migration.c
> >> @@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void 
> >> *opaque)
> >>  vm_start();
> >>  } else {
> >>  runstate_set(global_state_get_runstate());
> >> +if (runstate_check(RUN_STATE_SUSPENDED)) {
> >> +/* Force vm_start to be called later. */
> >> +qemu_system_start_on_wakeup_request();
> >> +}
> > 
> > Is this really needed, along with patch 1?
> > 
> > I have a very limited knowledge on suspension, so I'm prone to making
> > mistakes..
> > 
> > But from what I read this, qemu_system_wakeup_request() (existing one, not
> > after patch 1 applied) will setup wakeup_reason and kick the main thread
> > using qemu_notify_event().  Then IIUC the e.g. vcpu wakeups will be done in
> > the main thread later on after qemu_wakeup_requested() returns true.
> 
> Correct, here:
> 
> if (qemu_wakeup_requested()) {
> pause_all_vcpus();
> qemu_system_wakeup();
> notifier_list_notify(&wakeup_notifiers, &wakeup_reason);
> wakeup_reason = QEMU_WAKEUP_REASON_NONE;
> resume_all_vcpus();
> qapi_event_send_wakeup();
> }
> 
> However, that is not sufficient, because vm_start() was never called on the 
> incoming
> side.  vm_start calls the vm state notifiers for RUN_STATE_RUNNING, among 
> other things.
> 
> 
> Without my fixes, it "works" because the outgoing migration automatically 
> wakes a suspended
> guest, which sets the state to running, which is saved in global state:
> 
> void migration_completion(MigrationState *s)
> qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
> global_state_store()
> 
> Then the incoming migration calls vm_start here:
> 
> migration/migration.c
> if (!global_state_received() ||
> global_state_get_runstate() == RUN_STATE_RUNNING) {
> if (autostart) {
> vm_start();
> 
> vm_start must be called for correctness.

I see.  Though I had a feeling that this is still not the right way to do,
at least not as clean.

One question is, would above work for postcopy when VM is suspended during
the switchover?

I think I see your point that vm_start() (mostly vm_prepare_start())
contains a bunch of operations that maybe we must have before starting the
VM, but then.. should we just make that vm_start() unconditional when
loading VM completes?  I just don't see anything that won't need it (besides
-S), even COLO.

So I'm wondering about something like this:

===8<===
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -481,19 +481,28 @@ static void process_incoming_migration_bh(void *opaque)
 
 dirty_bitmap_mig_before_vm_start();
 
-if (!global_state_received() ||
-global_state_get_runstate() == RUN_STATE_RUNNING) {
-if (autostart) {
-vm_start();
-} else {
-runstate_set(RUN_STATE_PAUSED);
-}
-} else if (migration_incoming_colo_enabled()) {
+if (migration_incoming_colo_enabled()) {
 migration_incoming_disable_colo();
+/* COLO should always have autostart=1 or we can enforce it here */
+}
+
+if (autostart) {
+RunState run_state = global_state_get_runstate();
 vm_start();
+switch (run_state) {
+case RUN_STATE_RUNNING:
+break;
+case RUN_STATE_SUSPENDED:
+qemu_system_suspend();
+break;
+default:
+runstate_set(run_state);
+break;
+}
 } else {
-runstate_set(global_state_get_runstate());
+runstate_set(RUN_STATE_PAUSED);
 }
===8<===

IIUC this can drop qemu_system_start_on_wakeup_request() along with the
other global var.  Would something like it work for us?
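[Editor's note: the dispatch Peter proposes above can be modeled as a tiny self-contained C sketch. All function names here (vm_start_model, system_suspend_model, incoming_bh_model) are simplified stand-ins, not QEMU APIs; the point is only the control flow: always run the full vm_start() path when autostart is set, then re-enter SUSPENDED via a suspend request rather than a raw runstate_set().]

```c
#include <stdbool.h>

typedef enum { RS_RUNNING, RS_SUSPENDED, RS_PAUSED } RunState;

static RunState current;

/* stand-in for vm_start(): notifiers etc. always run here */
static void vm_start_model(void)       { current = RS_RUNNING; }
/* stand-in for qemu_system_suspend(): drop back to suspended */
static void system_suspend_model(void) { current = RS_SUSPENDED; }

/* model of the proposed process_incoming_migration_bh() tail */
RunState incoming_bh_model(bool autostart, RunState saved)
{
    if (autostart) {
        vm_start_model();
        switch (saved) {
        case RS_RUNNING:
            break;
        case RS_SUSPENDED:
            system_suspend_model();
            break;
        default:
            current = saved;
            break;
        }
    } else {
        current = RS_PAUSED;
    }
    return current;
}
```

Note how a guest saved as SUSPENDED still goes through the full start path before being suspended again, which is exactly what preserves the notifier side effects.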

-- 
Peter Xu




Re: [PATCH V1 2/3] migration: fix suspended runstate

2023-06-21 Thread Steven Sistare
On 6/20/2023 5:46 PM, Peter Xu wrote:
> On Thu, Jun 15, 2023 at 01:26:39PM -0700, Steve Sistare wrote:
>> Migration of a guest in the suspended state is broken.  The incoming
>> migration code automatically tries to wake the guest, which IMO is
>> wrong -- the guest should end migration in the same state it started.
>> Further, the wakeup is done by calling qemu_system_wakeup_request(), which
>> bypasses vm_start().  The guest appears to be in the running state, but
>> it is not.
>>
>> To fix, leave the guest in the suspended state, but call
>> qemu_system_start_on_wakeup_request() so the guest is properly resumed
>> later, when the client sends a system_wakeup command.
>>
>> Signed-off-by: Steve Sistare 
>> ---
>>  migration/migration.c | 11 ---
>>  softmmu/runstate.c|  1 +
>>  2 files changed, 5 insertions(+), 7 deletions(-)
>>
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 17b4b47..851fe6d 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void *opaque)
>>  vm_start();
>>  } else {
>>  runstate_set(global_state_get_runstate());
>> +if (runstate_check(RUN_STATE_SUSPENDED)) {
>> +/* Force vm_start to be called later. */
>> +qemu_system_start_on_wakeup_request();
>> +}
> 
> Is this really needed, along with patch 1?
> 
> I have a very limited knowledge on suspension, so I'm prone to making
> mistakes..
> 
> But from what I read this, qemu_system_wakeup_request() (existing one, not
> after patch 1 applied) will setup wakeup_reason and kick the main thread
> using qemu_notify_event().  Then IIUC the e.g. vcpu wakeups will be done in
> the main thread later on after qemu_wakeup_requested() returns true.

Correct, here:

if (qemu_wakeup_requested()) {
pause_all_vcpus();
qemu_system_wakeup();
notifier_list_notify(&wakeup_notifiers, &wakeup_reason);
wakeup_reason = QEMU_WAKEUP_REASON_NONE;
resume_all_vcpus();
qapi_event_send_wakeup();
}

However, that is not sufficient, because vm_start() was never called on the 
incoming
side.  vm_start calls the vm state notifiers for RUN_STATE_RUNNING, among other 
things.


Without my fixes, it "works" because the outgoing migration automatically wakes 
a suspended
guest, which sets the state to running, which is saved in global state:

void migration_completion(MigrationState *s)
qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
global_state_store()

Then the incoming migration calls vm_start here:

migration/migration.c
if (!global_state_received() ||
global_state_get_runstate() == RUN_STATE_RUNNING) {
if (autostart) {
vm_start();

vm_start must be called for correctness.
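[Editor's note: the correctness argument above is that vm_start() drives the registered VM-state-change notifiers, while the bare wakeup path does not. A minimal sketch, with all names (add_vm_change_state_handler, vm_start_model, bare_wakeup_model) as simplified stand-ins rather than QEMU's real API:]

```c
#include <stdbool.h>

#define MAX_NOTIFIERS 8
typedef void (*VMChangeHandler)(bool running);

static VMChangeHandler handlers[MAX_NOTIFIERS];
static int nr_handlers;
static bool vm_running;
static int notified;

void add_vm_change_state_handler(VMChangeHandler h)
{
    handlers[nr_handlers++] = h;
}

void count_notify(bool running)
{
    if (running) {
        notified++;
    }
}

/* stand-in for vm_start(): flips the state AND invokes every notifier */
void vm_start_model(void)
{
    vm_running = true;
    for (int i = 0; i < nr_handlers; i++) {
        handlers[i](true);
    }
}

/* the pre-fix wakeup path: state flips but notifiers never run */
void bare_wakeup_model(void)
{
    vm_running = true;
}
```

With only bare_wakeup_model() the guest looks running, yet no notifier (vfio, block, etc. in real QEMU) ever saw running=true; that is the bug being described.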

- Steve

>>  }
>>  /*
>>   * This must happen after any state changes since as soon as an external
>> @@ -2101,7 +2105,6 @@ static int postcopy_start(MigrationState *ms)
>>  qemu_mutex_lock_iothread();
>>  trace_postcopy_start_set_run();
>>  
>> -qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
>>  global_state_store();
>>  ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
>>  if (ret < 0) {
>> @@ -2307,7 +2310,6 @@ static void migration_completion(MigrationState *s)
>>  if (s->state == MIGRATION_STATUS_ACTIVE) {
>>  qemu_mutex_lock_iothread();
>>  s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>> -qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
>>  
>>  s->vm_old_state = runstate_get();
>>  global_state_store();
>> @@ -3102,11 +3104,6 @@ static void *bg_migration_thread(void *opaque)
>>  
>>  qemu_mutex_lock_iothread();
>>  
>> -/*
>> - * If VM is currently in suspended state, then, to make a valid runstate
>> - * transition in vm_stop_force_state() we need to wakeup it up.
>> - */
>> -qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
> 
> Removal of these three places seems reasonable to me, or we won't persist
> the SUSPEND state.
> 
> Above comment was the major reason I used to have thought it was needed
> (again, based on zero knowledge around this..), but perhaps it was just
> wrong?  I would assume vm_stop_force_state() will still just work with
> suspended, am I right?
> 
>>  s->vm_old_state = runstate_get();
>>  
>>  global_state_store();
>> diff --git a/softmmu/runstate.c b/softmmu/runstate.c
>> index e127b21..771896c 100644
>> --- a/softmmu/runstate.c
>> +++ b/softmmu/runstate.c
>> @@ -159,6 +159,7 @@ static const RunStateTransition 
>> runstate_transitions_def[] = {
>>  { RUN_STATE_RUNNING, RUN_STATE_SUSPENDED },
>>  { RUN_STATE_SUSPENDED, RUN_STATE_RUNNING },
>>  { RUN_STATE_SUSPENDED, RUN_STATE_FINISH_MIGRATE },
>> +{ RUN_STATE_SUSPENDED, RUN_STATE_PAUSED },
>>  { RUN_STATE_SUSPENDED, RUN_STATE_PRELAUNCH },
>>

Re: [PATCH V1 2/3] migration: fix suspended runstate

2023-06-20 Thread Peter Xu
On Thu, Jun 15, 2023 at 01:26:39PM -0700, Steve Sistare wrote:
> Migration of a guest in the suspended state is broken.  The incoming
> migration code automatically tries to wake the guest, which IMO is
> wrong -- the guest should end migration in the same state it started.
> Further, the wakeup is done by calling qemu_system_wakeup_request(), which
> bypasses vm_start().  The guest appears to be in the running state, but
> it is not.
> 
> To fix, leave the guest in the suspended state, but call
> qemu_system_start_on_wakeup_request() so the guest is properly resumed
> later, when the client sends a system_wakeup command.
> 
> Signed-off-by: Steve Sistare 
> ---
>  migration/migration.c | 11 ---
>  softmmu/runstate.c|  1 +
>  2 files changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 17b4b47..851fe6d 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void *opaque)
>  vm_start();
>  } else {
>  runstate_set(global_state_get_runstate());
> +if (runstate_check(RUN_STATE_SUSPENDED)) {
> +/* Force vm_start to be called later. */
> +qemu_system_start_on_wakeup_request();
> +}

Is this really needed, along with patch 1?

I have a very limited knowledge on suspension, so I'm prone to making
mistakes..

But from what I read this, qemu_system_wakeup_request() (existing one, not
after patch 1 applied) will setup wakeup_reason and kick the main thread
using qemu_notify_event().  Then IIUC the e.g. vcpu wakeups will be done in
the main thread later on after qemu_wakeup_requested() returns true.
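[Editor's note: the deferred-wakeup pattern Peter describes can be sketched as below. All identifiers are simplified stand-ins for the real qemu_system_wakeup_request() / qemu_wakeup_requested() machinery: the request only latches a reason and kicks the main loop; the pause/notify/resume work happens on the next main-loop pass.]

```c
#include <stdbool.h>

typedef enum { WAKEUP_NONE, WAKEUP_OTHER } WakeupReason;

static WakeupReason wakeup_reason = WAKEUP_NONE;
static int vcpus_running = 1;
static int notifier_calls;

/* stand-in for qemu_system_wakeup_request(): callable from any thread,
 * defers the real work; real code also calls qemu_notify_event() here */
void wakeup_request(WakeupReason r)
{
    wakeup_reason = r;
}

bool wakeup_requested(void)
{
    return wakeup_reason != WAKEUP_NONE;
}

/* stand-in for the main-loop fragment quoted in the replies below */
void main_loop_iteration(void)
{
    if (wakeup_requested()) {
        vcpus_running = 0;      /* pause_all_vcpus() */
        notifier_calls++;       /* notifier_list_notify(...) */
        wakeup_reason = WAKEUP_NONE;
        vcpus_running = 1;      /* resume_all_vcpus() */
    }
}
```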

>  }
>  /*
>   * This must happen after any state changes since as soon as an external
> @@ -2101,7 +2105,6 @@ static int postcopy_start(MigrationState *ms)
>  qemu_mutex_lock_iothread();
>  trace_postcopy_start_set_run();
>  
> -qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
>  global_state_store();
>  ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
>  if (ret < 0) {
> @@ -2307,7 +2310,6 @@ static void migration_completion(MigrationState *s)
>  if (s->state == MIGRATION_STATUS_ACTIVE) {
>  qemu_mutex_lock_iothread();
>  s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> -qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
>  
>  s->vm_old_state = runstate_get();
>  global_state_store();
> @@ -3102,11 +3104,6 @@ static void *bg_migration_thread(void *opaque)
>  
>  qemu_mutex_lock_iothread();
>  
> -/*
> - * If VM is currently in suspended state, then, to make a valid runstate
> - * transition in vm_stop_force_state() we need to wakeup it up.
> - */
> -qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);

Removal of these three places seems reasonable to me, or we won't persist
the SUSPEND state.

Above comment was the major reason I used to have thought it was needed
(again, based on zero knowledge around this..), but perhaps it was just
wrong?  I would assume vm_stop_force_state() will still just work with
suspended, am I right?

>  s->vm_old_state = runstate_get();
>  
>  global_state_store();
> diff --git a/softmmu/runstate.c b/softmmu/runstate.c
> index e127b21..771896c 100644
> --- a/softmmu/runstate.c
> +++ b/softmmu/runstate.c
> @@ -159,6 +159,7 @@ static const RunStateTransition 
> runstate_transitions_def[] = {
>  { RUN_STATE_RUNNING, RUN_STATE_SUSPENDED },
>  { RUN_STATE_SUSPENDED, RUN_STATE_RUNNING },
>  { RUN_STATE_SUSPENDED, RUN_STATE_FINISH_MIGRATE },
> +{ RUN_STATE_SUSPENDED, RUN_STATE_PAUSED },
>  { RUN_STATE_SUSPENDED, RUN_STATE_PRELAUNCH },
>  { RUN_STATE_SUSPENDED, RUN_STATE_COLO},
>  
> -- 
> 1.8.3.1
> 

-- 
Peter Xu
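[Editor's note: the softmmu/runstate.c hunk in the patch above adds one pair to the transition table. A toy validator, assuming a simplified mirror of that table (names shortened, not the real RunState enum), shows why the pair is needed: runstate_set() permits only listed edges.]

```c
#include <stdbool.h>

typedef enum { R_RUNNING, R_SUSPENDED, R_PAUSED, R_FINISH_MIGRATE } RunState;

typedef struct { RunState from, to; } Transition;

static const Transition table[] = {
    { R_RUNNING,   R_SUSPENDED },
    { R_SUSPENDED, R_RUNNING },
    { R_SUSPENDED, R_FINISH_MIGRATE },
    { R_SUSPENDED, R_PAUSED },      /* the pair the patch adds */
};

bool transition_allowed(RunState from, RunState to)
{
    for (unsigned i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (table[i].from == from && table[i].to == to) {
            return true;
        }
    }
    return false;
}
```

Without the added pair, an incoming guest restored as SUSPENDED could not be moved to PAUSED (e.g. by the non-autostart path) without tripping the transition check.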




[PATCH V1 2/3] migration: fix suspended runstate

2023-06-15 Thread Steve Sistare
Migration of a guest in the suspended state is broken.  The incoming
migration code automatically tries to wake the guest, which IMO is
wrong -- the guest should end migration in the same state it started.
Further, the wakeup is done by calling qemu_system_wakeup_request(), which
bypasses vm_start().  The guest appears to be in the running state, but
it is not.

To fix, leave the guest in the suspended state, but call
qemu_system_start_on_wakeup_request() so the guest is properly resumed
later, when the client sends a system_wakeup command.

Signed-off-by: Steve Sistare 
---
 migration/migration.c | 11 ---
 softmmu/runstate.c|  1 +
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 17b4b47..851fe6d 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void *opaque)
 vm_start();
 } else {
 runstate_set(global_state_get_runstate());
+if (runstate_check(RUN_STATE_SUSPENDED)) {
+/* Force vm_start to be called later. */
+qemu_system_start_on_wakeup_request();
+}
 }
 /*
  * This must happen after any state changes since as soon as an external
@@ -2101,7 +2105,6 @@ static int postcopy_start(MigrationState *ms)
 qemu_mutex_lock_iothread();
 trace_postcopy_start_set_run();
 
-qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
 global_state_store();
 ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
 if (ret < 0) {
@@ -2307,7 +2310,6 @@ static void migration_completion(MigrationState *s)
 if (s->state == MIGRATION_STATUS_ACTIVE) {
 qemu_mutex_lock_iothread();
 s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
-qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
 
 s->vm_old_state = runstate_get();
 global_state_store();
@@ -3102,11 +3104,6 @@ static void *bg_migration_thread(void *opaque)
 
 qemu_mutex_lock_iothread();
 
-/*
- * If VM is currently in suspended state, then, to make a valid runstate
- * transition in vm_stop_force_state() we need to wakeup it up.
- */
-qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
 s->vm_old_state = runstate_get();
 
 global_state_store();
diff --git a/softmmu/runstate.c b/softmmu/runstate.c
index e127b21..771896c 100644
--- a/softmmu/runstate.c
+++ b/softmmu/runstate.c
@@ -159,6 +159,7 @@ static const RunStateTransition runstate_transitions_def[] 
= {
 { RUN_STATE_RUNNING, RUN_STATE_SUSPENDED },
 { RUN_STATE_SUSPENDED, RUN_STATE_RUNNING },
 { RUN_STATE_SUSPENDED, RUN_STATE_FINISH_MIGRATE },
+{ RUN_STATE_SUSPENDED, RUN_STATE_PAUSED },
 { RUN_STATE_SUSPENDED, RUN_STATE_PRELAUNCH },
 { RUN_STATE_SUSPENDED, RUN_STATE_COLO},
 
-- 
1.8.3.1
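[Editor's note: the qemu_system_start_on_wakeup_request() mechanism the patch relies on can be sketched as a latched flag, with all names here hypothetical stand-ins: once latched, the next wakeup goes through the full vm_start() path instead of merely resuming vcpus, so RUNNING-state notifiers fire.]

```c
#include <stdbool.h>

static bool start_on_wakeup;
static bool vm_started;   /* did the full vm_start() path run? */

/* stand-in for qemu_system_start_on_wakeup_request() */
void start_on_wakeup_request(void)
{
    start_on_wakeup = true;
}

/* stand-in for the wakeup handler */
void wakeup_model(void)
{
    if (start_on_wakeup) {
        vm_started = true;        /* vm_start(): notifiers + vcpus */
        start_on_wakeup = false;  /* one-shot latch */
    } else {
        /* plain resume path: vcpus only, no notifiers */
    }
}
```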