Re: [Qemu-devel] ACPI PCI hotplug table updates
On Thu, 4 Oct 2018 13:24:40 -0400
"Michael S. Tsirkin" wrote:

> On Thu, Oct 04, 2018 at 01:57:21PM +0200, Igor Mammedov wrote:
> > On Wed, 3 Oct 2018 10:44:20 -0700
> > open sorcerer <0p3n.s0rc3...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I am digging into an issue where qmp_device_del does not actually delete
> > > devices when a guest OS is in prelaunch. This seems to be due to the guest
> > > OS not handling ACPI events because it is not currently running. If I
> > > assume correctly, qmp should allow you to add/remove devices while the
> > > host is down, or if not possible, publish an error message.
> > may I ask why one would delete a device at -S pause point, isn't it easier
> > to start QEMU without it, to begin with?
> >
> > > I think fixing this issue is as simple as making sure that the VM is in a
> > > safe state to ignore the hotplug ACPI dance but eject the disk, something
> > > like:
> > in prelaunch runstate where '-S' option pauses VM, it is practically paused
> > at the first instruction to be executed. So device_add at that point is
> > considered as hotplug with all actions already executed on hardware level
> > (interrupts sent, devices responsible for hotplug handling have changed
> > state).
> > So if one wished to delete device at that point, one would have to roll back
> > related state changes.
> > If one would additionally use -incoming CLI option, it becomes more
> > complicated as we might end up in prelaunch runstate with VM in running
> > state (see possible transitions in runstate_transitions_def[])
> > I'd say prelaunch runstate can't be used for removing devices that do not
> > support surprise removal (in our case PCI doesn't).
>
> I'd say the point is this. In prelaunch guest did not observe any
> device state yet, we could make device_add look just like
> a non-hotplugged device. And we could make device_del pretend
> there was a reset immediately afterwards.
>
> Not sure why it matters to anyone, but it's doable I think.
In case we came to prelaunch from a freshly started QEMU with -S, with no
other disrupting things in between (migration, checkpointing, ...), it's
theoretically possible.

However, even then, to make a clean device_del in that state for devices
that expect guest cooperation, one would need to chain unplug_request
(which is what device_del translates to) with whatever hot-unplug hardware
is used and simulate the guest unplugging the device. On top of that we
might need to rebuild/reload firmware tables (ARM); x86 should work, as it
will rebuild ACPI tables on the first access. In the generic case we might
need to fix up something else elsewhere.

When I looked into early NUMA configuration, I failed to convince myself
that using prelaunch, changing its semantics to coldplug, and fixing up the
already built machine was a safe/robust thing to do. (The resulting
prelaunch-based RFC even worked fine, but I wouldn't bet that it wouldn't
fall apart in one of the other combinations through which the prelaunch
runstate can be reached.)

As a result we ended up with the new preconfig option/runstate, to which we
can gradually move machine building steps.

One possible way to deal with the subject would be to queue
-device/device_add at the preconfig stage and use this queue later to add
devices to the board (not sure if it's a sound idea in general). This early
it should be possible to remove a device from the queue.

But why would one add a device and immediately remove it ... :/
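To make the queueing idea above concrete, here is a minimal sketch of it as a toy model in Python. Nothing here is QEMU code: the class PendingDeviceQueue and its method names are hypothetical and only borrow the device_add/device_del vocabulary. It assumes that requests made before the board is built are merely recorded, so removing one is plain bookkeeping with no guest cooperation involved.

```python
from collections import OrderedDict


class PendingDeviceQueue:
    """Hypothetical model: device requests recorded before the board is built."""

    def __init__(self):
        self._pending = OrderedDict()  # device id -> properties, in request order
        self._realized = False

    def device_add(self, dev_id, **props):
        # Once the board exists, adding a device would be a real hotplug,
        # which this toy model deliberately refuses to handle.
        if self._realized:
            raise RuntimeError("board already built; this would be a hotplug")
        if dev_id in self._pending:
            raise ValueError(f"duplicate device id: {dev_id}")
        self._pending[dev_id] = props

    def device_del(self, dev_id):
        # Before the board exists, removal just drops the queued request.
        if self._realized:
            raise RuntimeError("board already built; unplug needs the guest")
        if dev_id not in self._pending:
            raise KeyError(dev_id)
        del self._pending[dev_id]

    def realize(self):
        """Return the surviving requests, in order, for board construction."""
        self._realized = True
        return list(self._pending.items())


queue = PendingDeviceQueue()
queue.device_add("disk0", driver="virtio-blk-pci")
queue.device_add("net0", driver="virtio-net-pci")
queue.device_del("disk0")  # still only a queue entry, so this is trivial
devices = queue.realize()
```

After realize() the model rejects further changes, which mirrors the point that from then on add/remove is genuine hotplug and needs the hotplug machinery.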
Re: [Qemu-devel] ACPI PCI hotplug table updates
On Thu, Oct 04, 2018 at 01:57:21PM +0200, Igor Mammedov wrote:
> On Wed, 3 Oct 2018 10:44:20 -0700
> open sorcerer <0p3n.s0rc3...@gmail.com> wrote:
>
> > Hi,
> >
> > I am digging into an issue where qmp_device_del does not actually delete
> > devices when a guest OS is in prelaunch. This seems to be due to the guest
> > OS not handling ACPI events because it is not currently running. If I
> > assume correctly, qmp should allow you to add/remove devices while the host
> > is down, or if not possible, publish an error message.
> may I ask why one would delete a device at -S pause point, isn't it easier
> to start QEMU without it, to begin with?
>
> > I think fixing this issue is as simple as making sure that the VM is in a
> > safe state to ignore the hotplug ACPI dance but eject the disk, something
> > like:
> in prelaunch runstate where '-S' option pauses VM, it is practically paused
> at the first instruction to be executed. So device_add at that point is
> considered as hotplug with all actions already executed on hardware level
> (interrupts sent, devices responsible for hotplug handling have changed state).
> So if one wished to delete device at that point, one would have to roll back
> related state changes.
> If one would additionally use -incoming CLI option, it becomes more
> complicated as we might end up in prelaunch runstate with VM in running state
> (see possible transitions in runstate_transitions_def[])
> I'd say prelaunch runstate can't be used for removing devices that do not
> support surprise removal (in our case PCI doesn't).

I'd say the point is this. In prelaunch guest did not observe any
device state yet, we could make device_add look just like
a non-hotplugged device. And we could make device_del pretend
there was a reset immediately afterwards.

Not sure why it matters to anyone, but it's doable I think.

> > prelaunch, preconfig, shutdown: ignore acpi and deal with cleaning devices
> > other non-running: bubble up error
> > running: default behavior
> >
> > I was trying to validate that this change would be safe (keep in mind I am
> > learning ACPI in little pieces while digging) using GDB, and code
> > inspection. While stepping through with GDB I noticed that the PCI slots
> > are controlled by memory region and the opaque acpi pci hp state object. I
> > was unable thus far to find any code executed that modifies the ACPI tables
> > beyond just the pci hotplug state.
> >
> > I also tried to test using "while true; do acpidump | md5; sleep 1; done"
> > in the guest OS and then add/remove a virtio-blk-pci device (which
> > exercised the ACPI callbacks via piix4 callbacks). The output of the
> > acpidump -> md5 was consistent during each phase of the data collection
> > which I believe implied that the acpi tables were not modified by the PCI
> > hotplug.
> >
> > Can someone help me understand:
> >
> > 1. Are the ACPI tables not modified when doing PCI hotplug?
> > 2. Do the general changes proposed seem safe?
> > 3. Are there resources or documentation I can read to help me understand
> > this problem further? I have skimmed through a lot of different documents
> > and watched some YouTube videos, but the ACPI documentation is hard to read
> > and sift through and the YouTube videos are generally too high level.
> Regarding ACPI based PCI hotplug you can look at
>   docs/specs/acpi_pci_hotplug.txt
>   hw/acpi/pcihp.c
>   ACPI AML part in build_append_pci_bus_devices()
>
> >
> > Thanks.
Re: [Qemu-devel] ACPI PCI hotplug table updates
On Wed, 3 Oct 2018 10:44:20 -0700
open sorcerer <0p3n.s0rc3...@gmail.com> wrote:

> Hi,
>
> I am digging into an issue where qmp_device_del does not actually delete
> devices when a guest OS is in prelaunch. This seems to be due to the guest
> OS not handling ACPI events because it is not currently running. If I
> assume correctly, qmp should allow you to add/remove devices while the host
> is down, or if not possible, publish an error message.
may I ask why one would delete a device at -S pause point, isn't it easier
to start QEMU without it, to begin with?

> I think fixing this issue is as simple as making sure that the VM is in a
> safe state to ignore the hotplug ACPI dance but eject the disk, something
> like:
in prelaunch runstate where '-S' option pauses VM, it is practically paused
at the first instruction to be executed. So device_add at that point is
considered as hotplug with all actions already executed on hardware level
(interrupts sent, devices responsible for hotplug handling have changed state).
So if one wished to delete device at that point, one would have to roll back
related state changes.
If one would additionally use -incoming CLI option, it becomes more
complicated as we might end up in prelaunch runstate with VM in running state
(see possible transitions in runstate_transitions_def[])
I'd say prelaunch runstate can't be used for removing devices that do not
support surprise removal (in our case PCI doesn't).

> prelaunch, preconfig, shutdown: ignore acpi and deal with cleaning devices
> other non-running: bubble up error
> running: default behavior
>
> I was trying to validate that this change would be safe (keep in mind I am
> learning ACPI in little pieces while digging) using GDB, and code
> inspection. While stepping through with GDB I noticed that the PCI slots
> are controlled by memory region and the opaque acpi pci hp state object. I
> was unable thus far to find any code executed that modifies the ACPI tables
> beyond just the pci hotplug state.
>
> I also tried to test using "while true; do acpidump | md5; sleep 1; done"
> in the guest OS and then add/remove a virtio-blk-pci device (which
> exercised the ACPI callbacks via piix4 callbacks). The output of the
> acpidump -> md5 was consistent during each phase of the data collection
> which I believe implied that the acpi tables were not modified by the PCI
> hotplug.
>
> Can someone help me understand:
>
> 1. Are the ACPI tables not modified when doing PCI hotplug?
> 2. Do the general changes proposed seem safe?
> 3. Are there resources or documentation I can read to help me understand
> this problem further? I have skimmed through a lot of different documents
> and watched some YouTube videos, but the ACPI documentation is hard to read
> and sift through and the YouTube videos are generally too high level.
Regarding ACPI based PCI hotplug you can look at
  docs/specs/acpi_pci_hotplug.txt
  hw/acpi/pcihp.c
  ACPI AML part in build_append_pci_bus_devices()

>
> Thanks.
[Qemu-devel] ACPI PCI hotplug table updates
Hi,

I am digging into an issue where qmp_device_del does not actually delete
devices when a guest OS is in prelaunch. This seems to be due to the guest
OS not handling ACPI events because it is not currently running. If I
assume correctly, qmp should allow you to add/remove devices while the host
is down, or if not possible, publish an error message.

I think fixing this issue is as simple as making sure that the VM is in a
safe state to ignore the hotplug ACPI dance but eject the disk, something
like:

prelaunch, preconfig, shutdown: ignore acpi and deal with cleaning devices
other non-running: bubble up error
running: default behavior

I was trying to validate that this change would be safe (keep in mind I am
learning ACPI in little pieces while digging) using GDB, and code
inspection. While stepping through with GDB I noticed that the PCI slots
are controlled by memory region and the opaque acpi pci hp state object. I
was unable thus far to find any code executed that modifies the ACPI tables
beyond just the pci hotplug state.

I also tried to test using "while true; do acpidump | md5; sleep 1; done"
in the guest OS and then add/remove a virtio-blk-pci device (which
exercised the ACPI callbacks via piix4 callbacks). The output of the
acpidump -> md5 was consistent during each phase of the data collection
which I believe implied that the acpi tables were not modified by the PCI
hotplug.

Can someone help me understand:

1. Are the ACPI tables not modified when doing PCI hotplug?
2. Do the general changes proposed seem safe?
3. Are there resources or documentation I can read to help me understand
this problem further? I have skimmed through a lot of different documents
and watched some YouTube videos, but the ACPI documentation is hard to read
and sift through and the YouTube videos are generally too high level.

Thanks.
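The three-way policy proposed in this message can be written down as a small decision function. This is only a sketch of the proposal as stated, in Python rather than QEMU's C, and it is not how QEMU behaves. The RunState members are a subset of QEMU's run state names; DelAction and device_del_action are hypothetical names introduced purely for illustration.

```python
from enum import Enum, auto


class RunState(Enum):
    # Subset of QEMU run state names, enough to illustrate the proposal.
    PRELAUNCH = auto()
    PRECONFIG = auto()
    SHUTDOWN = auto()
    PAUSED = auto()
    SUSPENDED = auto()
    RUNNING = auto()


class DelAction(Enum):
    # Hypothetical outcomes, one per line of the proposed policy.
    REMOVE_DIRECTLY = auto()       # ignore ACPI, clean up the device
    REPORT_ERROR = auto()          # bubble up an error to the QMP client
    REQUEST_GUEST_UNPLUG = auto()  # default behavior: ACPI unplug request


# States in which, per the proposal, no guest is around to handle ACPI events.
NO_GUEST_STATES = frozenset(
    {RunState.PRELAUNCH, RunState.PRECONFIG, RunState.SHUTDOWN}
)


def device_del_action(state: RunState) -> DelAction:
    """Map a run state to the action the proposal assigns to device_del."""
    if state is RunState.RUNNING:
        return DelAction.REQUEST_GUEST_UNPLUG
    if state in NO_GUEST_STATES:
        return DelAction.REMOVE_DIRECTLY
    # Any other non-running state: refuse rather than silently do nothing.
    return DelAction.REPORT_ERROR


for state in RunState:
    print(f"{state.name.lower()}: {device_del_action(state).name}")
```

The sketch covers only the decision itself. Whether direct removal is actually safe in the first group of states is exactly the open question here.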
Re: [Qemu-devel] ACPI PCI hotplug table updates
On Wed, Oct 03, 2018 at 10:44:20AM -0700, open sorcerer wrote:
> Hi,
>
> I am digging into an issue where qmp_device_del does not actually delete
> devices when a guest OS is in prelaunch.

What exactly is meant by prelaunch? E.g. is it prelaunch while BIOS is
doing the PCI bus scan?

> This seems to be due to the guest OS
> not handling ACPI events because it is not currently running. If I assume
> correctly, qmp should allow you to add/remove devices while the host is down,
> or if not possible, publish an error message.
>
> I think fixing this issue is as simple as making sure that the VM is in a safe
> state to ignore the hotplug ACPI dance but eject the disk, something like:
>
> prelaunch, preconfig, shutdown: ignore acpi and deal with cleaning devices
> other non-running: bubble up error
> running: default behavior
>
> I was trying to validate that this change would be safe (keep in mind I am
> learning ACPI in little pieces while digging) using GDB, and code inspection.
> While stepping through with GDB I noticed that the PCI slots are controlled by
> memory region and the opaque acpi pci hp state object. I was unable thus far
> to find any code executed that modifies the ACPI tables beyond just the pci
> hotplug state.
>
> I also tried to test using "while true; do acpidump | md5; sleep 1; done" in
> the guest OS and then add/remove a virtio-blk-pci device (which exercised the
> ACPI callbacks via piix4 callbacks). The output of the acpidump -> md5 was
> consistent during each phase of the data collection which I believe implied
> that the acpi tables were not modified by the PCI hotplug.
>
> Can someone help me understand:
>
> 1. Are the ACPI tables not modified when doing PCI hotplug?

Yes.

> 2. Do the general changes proposed seem safe?
> 3. Are there resources or documentation I can read to help me understand this
> problem further? I have skimmed through a lot of different documents and
> watched some YouTube videos, but the ACPI documentation is hard to read and
> sift through and the YouTube videos are generally too high level.
>
> Thanks.
>

We are generally trying to move away from ACPI hotplug to native PCIe
hotplug. You can read up on that in the PCI Express spec.

--
MST