Hi there,

At the risk of being repetitive, I'm going to summarize the problems and solutions we've discussed over the last few days regarding qemu-ga's shutdown and suspend commands.
Gleb and Igor, you may be interested in items 2 and 4.

Basically, we have four issues:

1. The guest-shutdown and guest-suspend-* commands are unable to detect
   errors while performing their operation. That is, qemu-ga will report
   success to clients even if an error happens while shutting down or
   suspending.

   This happens because the operation is executed in a child process and
   qemu-ga doesn't wait() for its child processes, to avoid blocking.

   Possible solutions:

    A. Don't fix this and preserve qemu-ga's non-blocking behavior

    B. Change qemu-ga to wait() for its children and report errors. This
       has the implication of making the command a blocking call

2. The guest-shutdown and guest-suspend-* commands may not emit a success
   response. In fact, the guest-suspend-* commands may emit the response
   only after the guest resumes.

   This happens because the guest may shut down or suspend before qemu-ga
   is able to emit a success response.

   Solution: Change qemu-ga to never emit a success response. Clients
   should do the following to check for success:

    o guest-suspend-disk: if the guest suspends through ACPI, an exit
      status of 3 (I chose a random number). Otherwise, an exit status
      of 0

    o guest-suspend-ram or hybrid: wait for the SUSPEND event and/or poll
      for a RunState change to suspended (the RunState change doesn't
      exist upstream yet, I will submit a patch)

    o guest-shutdown: an exit status of 0

3. There's a possible race in the suspend code while trying to detect
   suspend support in the guest.

   This happens because the suspend code got complex while trying to
   preserve qemu-ga's non-blocking behavior described in item 1.

   Possible solutions:

    A. Just fix the race (which makes the code more complex)

    B. Do solution 1.B (which also simplifies the code considerably)

4.
   Libvirt is facing a problem when a device is hot-plugged and user-space
   then suspends to disk: if libvirt is not told to make the new device
   persistent, it will be unable to correctly resume the VM later, since
   its command line won't include the newly added device.

   This happens because libvirt doesn't know the VM suspended to disk.

   Solution: Implement the solution for item 2 above (i.e. exit with a
   different exit status, e.g. 3). There isn't much to be done if the
   guest doesn't suspend through ACPI.

PS: This problem is outside qemu-ga's realm, but it would be interesting to find a "unified" solution.
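
To make solution 1.B a bit more concrete, here is a minimal sketch of what "wait() for the child and report errors" could look like. It is not taken from the qemu-ga source: run_and_wait() is a hypothetical helper, and returning 127 when exec fails is an assumption borrowed from the shell convention. The point is only that the parent blocks in waitpid() and can then tell success from failure.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not qemu-ga code): run argv[0] in a child process
 * and block until it terminates.
 *
 * Returns the child's exit status on normal termination, or -1 if
 * fork()/waitpid() failed or the child was killed by a signal.
 */
static int run_and_wait(char *const argv[])
{
    pid_t pid;
    int status;

    assert(argv != NULL && argv[0] != NULL);

    pid = fork();
    if (pid < 0) {
        return -1;              /* could not create the child at all */
    }

    if (pid == 0) {
        /* Child: replace ourselves with the requested command. */
        execvp(argv[0], argv);
        _exit(127);             /* only reached if exec failed (assumed code) */
    }

    /* Parent: this is the blocking part. Retry if a signal interrupts us. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            return -1;
        }
    }

    if (!WIFEXITED(status)) {
        return -1;              /* child died from a signal */
    }

    return WEXITSTATUS(status);
}
```

A caller would treat any non-zero return as a failure and report an error to the client instead of success. The cost is the one noted in item 1: the command does not return until the child has finished.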