On 11/22/2023 11:21 AM, Peter Xu wrote: > On Wed, Nov 22, 2023 at 09:38:06AM +0000, Daniel P. Berrangé wrote: >> On Mon, Nov 20, 2023 at 04:44:50PM -0500, Peter Xu wrote: >>> On Mon, Nov 20, 2023 at 03:55:54PM -0500, Steven Sistare wrote: >>>> If we drop force, then all calls to vm_stop will completely stop the >>>> suspended state, eg an hmp "stop" command. This causes two problems. >>>> First, that is a change in user-visible behavior for something that >>>> currently works, >>> >>> IMHO it depends on what should be the correct behavior. IOW, when VM is in >>> SUSPENDED state and then the user sends "stop" QMP command, what should we >>> expect? >> >> I would say that from a mgmtm app POV "stop" is initiating a state >> transition, from RUN_STATE_RUNNING to RUN_STATE_PAUSED and "cont" >> is doing the reverse from PAUSED to RUNNING. >> >> It is a little more complicated than that as there are some other >> states like INMIGRATE that are conceptually equiv to RUNNING, >> and states where the transition simply doesn't make sense. > > In the qemu impl, INMIGRATE is, imho, more equiv of PAUSED - do_vm_stop() > mostly ignores every state except RUNNING (putting bdrv operations aside). > IOW, anything besides "running" is treated as "not running". > > But then Paolo fixed that in 1e9981465f ("qmp: handle stop/cont in > INMIGRATE state"), wiring that to autostart. > > Now we seem to find that "suspended" should actually fall within (where > "vm" is running, but "vcpu" is not), and it seems we should treat "vm" and > "vcpu" differently. > >> >> So my question is if we're in "SUSPENDED" and someone issues "stop", >> what state do we go into, and perhaps more importantly what state >> do we go to in a subsequent "cont". > > I think we must stop the "vm", not only the "vcpu". I discussed this bit > in the other thread more or less: it's because qemu_system_wakeup_request() > can be called in many places, e.g. acpi_pm_tmr_timer(). > > It means even after the VM is "stop"ped by the admin QMP (where qmp_stop() > ignores SUSPENDED, keep the "vm" running), it can silently got waken up > without admin even noticing it. I'm not sure what Libvirt will behave if > it suddenly receives a QAPI_EVENT_WAKEUP randomly after a "stop". > >> >> If you say SUSPENDED ---(stop)---> PAUSED ---(cont)---> SUSPENDED >> then we create a problem, because the decision for the transition >> out of PAUSED needs memory of the previous state. > > Right, that's why I think we at least need one more boolean to remember the > suspended state, then when we switch from any STOP states into any RUN > states, we know where to go. Here STOP states I meant anything except > RUNNING and SUSPENDED, while RUN -> RUNNING or SUSPENDED. > >> >>> My understanding is we should expect to fully stop the VM, including the >>> ticks, for example. Keeping the ticks running even after QMP "stop" >>> doesn't sound right, isn't it? >> >> The "stop" QMP command is documented as >> >> "Stop all guest VCPU execution" >> >> the devil is in the detail though, and we've not documented any detail. >> >> Whether or not timers keep running across stop/cont I think can be >> argued to be an impl detail, as long as the headline goal "vcpus >> don't execute" is satisfied. > > "stop" was since qemu v0.14, so I guess we can't blame the missing of > details or any form of inaccuracy.. Obviously we do more than "stop vCPU > executions" in the current implementation. > > But after we reach a consensus on how we should fix the current suspended > problem, we may want to update the documentation to start containing more > information. > >> >>>> vs the migration code where we are fixing brokenness. >>> >>> This is not a migration-only bug if above holds, IMO. >>> >>>> Second, it does not quite work, because the state becomes >>>> RUN_STATE_PAUSED, so the suspended state is forgotten, and the hmp "cont" >>>> will try to set the running state. I could fix that by introducing a new >>>> state RUN_STATE_SUSPENDED_STOPPED, but again it is a user-visible change >>>> in existing behavior. (I even implemented that while developing, then I >>>> realized it was not needed to fix the migration bugs.) >>> >>> Good point. >> >> We have added new guest states periodically. It is a user visible >> change, but you could argue that it is implementing a new feature >> ie the ability to "stop" a "suspended" guest, and so is justified. >> >> S3 is so little used in virt, so I'm not surprised we're finding >> long standing edge cases that have never been thought about before. >> >>> Now with above comments, what's your thoughts on whether we should change >>> the user behavior? My answer is still a yes. >>> >>> Maybe SUSPENDED should not be a RunState at all? SUSPENDED is guest visible >>> behavior, while something like QMP "stop" is not guest visible. Maybe we >>> should remember it separately? >> >> Yes, every time I look at this area I come away thinking that >> the RunState enum is a mess, overloading too many different >> concepts onto the same single field. >> >> Specifically "SUSPENDED" vs "RUNNING" is a reflection of guest >> state (ie whether or not the VM is in S3), but pretty much all >> the others are a reflection of QEMU host state. I kind of feel >> that SUSPENDED (S3) probably shouldn't have been a RunState at >> all. I'd probably put guest-panicked into a separate thing too. >> >> But we're stuck with what we have. > > IMO compatibility is only necessary if at least the existing code is > running well. But now I see it a major flaw in suspended state and I can't > see how it can go right if with current code base.. My concern is instead > that after suspended will be used more (e.g., assuming CPR will rely on it) > we can have more chance to confuse/oops a mgmt app like Libvirt, like I > described above. > > In summary, I think a current solution to me is only to fix at least > suspended state for good, by: > > - adding vm_suspended boolean to remember machine RUNNING / SUSPENDED > state. When "cont" we need to switch to either RUNNING / SUSPENDED > depending on the boolean. > > - keeping SUSPENDED state as RunState (for compatibility, otherwise we'll > need another interface to fetch that boolean anyway), even though not > query-able during any !RUNNING && !SUSPENDED state.. hopefully not a > big deal > > - enrich ducumentation of qmp_stop/qmp_cont to describe what they really do > > - (with suspended working all right...) fix migration of SUSPENDED state > > I don't expect a lot of code changes is needed, maybe even less than the > current series (because we don't need special knob to differenciate > migration or non-migration callers of do_vm_stop()). IMHO this series is > already doing some of that but just decided to ignore outside-migration > states for suspeneded. > > We may want to add some test cases though to verify the suspended state > transitions (maybe easier to put into migration-test with the new ACPI > guest code), but optional.
FYI, here is a brief update before today's meeting. I have implemented this and I am testing libvirt and its various save + restore commands, when the guest is suspended running (RUN_STATE_SUSPENDED), and suspended stopped (RUN_STATE_PAUSED with vm_was_suspended = true). There are a few failures, and I am still investigating to see whether they can be fixed in qemu, or need a fix in libvirt. I will send more details later. - Steve