On 11/20/2023 2:59 PM, Peter Xu wrote: > On Mon, Nov 13, 2023 at 10:33:50AM -0800, Steve Sistare wrote: >> A vm in the suspended state is not completely stopped. The VCPUs have been >> paused, but the cpu clock still runs, and runstate notifiers for the >> transition to stopped have not been called. Modify vm_stop_force_state to >> completely stop the vm if the current state is suspended, to be called for >> live migration and snapshots. >> >> Suggested-by: Peter Xu <pet...@redhat.com> >> Signed-off-by: Steve Sistare <steven.sist...@oracle.com> >> --- >> system/cpus.c | 8 ++++++-- >> 1 file changed, 6 insertions(+), 2 deletions(-) >> >> diff --git a/system/cpus.c b/system/cpus.c >> index f72c4be..c772708 100644 >> --- a/system/cpus.c >> +++ b/system/cpus.c >> @@ -255,6 +255,8 @@ void cpu_interrupt(CPUState *cpu, int mask) >> static int do_vm_stop(RunState state, bool send_stop, bool force) >> { >> int ret = 0; >> + bool running = runstate_is_running(); >> + bool suspended = runstate_check(RUN_STATE_SUSPENDED); >> >> if (qemu_in_vcpu_thread()) { >> qemu_system_vmstop_request_prepare(); >> @@ -267,10 +269,12 @@ static int do_vm_stop(RunState state, bool send_stop, >> bool force) >> return 0; >> } >> >> - if (runstate_is_running()) { >> + if (running || (suspended && force)) { >> runstate_set(state); >> cpu_disable_ticks(); > > Not directly relevant, but this is weird that I just notice. > > If we disable ticks before stopping vCPUs, IIUC it means vcpus can see > stall ticks. I checked the vm_start() and indeed that one did it in the > other way round: we'll stop vCPUs before stopping the ticks. > >> - pause_all_vcpus(); >> + if (running) { >> + pause_all_vcpus(); >> + } >> vm_state_notify(0, state); >> if (send_stop) { >> qapi_event_send_stop(); > > IIUC the "force" is not actually needed. It's only used when SUSPENDED, > right? > > In general, considering all above, I'm wondering something like this would > be much cleaner (and dropping force)?
If we drop force, then all calls to vm_stop will completely stop the suspended state, eg an hmp "stop" command. This causes two problems. First, that is a change in user-visible behavior for something that currently works, vs the migration code where we are fixing brokenness. Second, it does not quite work, because the state becomes RUN_STATE_PAUSED, so the suspended state is forgotten, and the hmp "cont" will try to set the running state. I could fix that by introducing a new state RUN_STATE_SUSPENDED_STOPPED, but again it is a user-visible change in existing behavior. (I even implemented that while developing, then I realized it was not needed to fix the migration bugs.) - Steve > ===8<=== > static int do_vm_stop(RunState state, bool send_stop) > { > + bool suspended = runstate_check(RUN_STATE_SUSPENDED); > + bool running = runstate_is_running(); > int ret = 0; > > - if (runstate_is_running()) { > + /* > + * RUNNING: VM and vCPUs are all running > + * SUSPENDED: VM is running, VCPUs are stopped > + * Others: VM and vCPUs are all stopped > + */ > + > + /* Whether do we need to stop vCPUs? */ > + if (running) { > + pause_all_vcpus(); > + } > + > + /* Whether do we need to stop the VM in general? */ > + if (running || suspended) { > runstate_set(state); > cpu_disable_ticks(); > - pause_all_vcpus(); > vm_state_notify(0, state); > if (send_stop) { > qapi_event_send_stop(); >