Re: [Qemu-devel] ACPI PCI hotplug table updates

2018-10-05 Thread Igor Mammedov
On Thu, 4 Oct 2018 13:24:40 -0400
"Michael S. Tsirkin"  wrote:

> On Thu, Oct 04, 2018 at 01:57:21PM +0200, Igor Mammedov wrote:
> > On Wed, 3 Oct 2018 10:44:20 -0700
> > open sorcerer <0p3n.s0rc3...@gmail.com> wrote:
> >   
> > > Hi,
> > > 
> > > I am digging into an issue where qmp_device_del does not actually delete
> > > devices when a guest OS is in prelaunch. This seems to be due to the guest
> > > OS not handling ACPI events because it is not currently running. If I
> > > assume correctly, qmp should allow you to add/remove devices while the host
> > > is down, or if not possible, publish an error message.
> > may I ask why one would delete a device at -S pause point, isn't it easier
> > to start QEMU without it, to begin with?
> >   
> > > I think fixing this issue is as simple as making sure that the VM is in a
> > > safe state to ignore the hotplug ACPI dance but eject the disk, something
> > > like:  
> > In the prelaunch runstate, where the '-S' option pauses the VM, it is practically
> > paused at the first instruction to be executed. So device_add at that point is
> > considered a hotplug, with all actions already executed at the hardware level
> > (interrupts sent, devices responsible for hotplug handling have changed state).
> > So if one wished to delete a device at that point, one would have to roll back
> > the related state changes.
> > If one additionally used the -incoming CLI option, it becomes more complicated,
> > as we might end up in the prelaunch runstate with the VM in a running state
> > (see possible transitions in runstate_transitions_def[]).
> > I'd say the prelaunch runstate can't be used for removing devices that do not
> > support surprise removal (in our case PCI doesn't).
> 
> I'd say the point is this. In prelaunch the guest has not observed any
> device state yet, so we could make device_add look just like
> a non-hotplugged device. And we could make device_del pretend
> there was a reset immediately afterwards.
> 
> Not sure why it matters to anyone, but it's doable I think.
In case we came to prelaunch from a freshly started QEMU with -S
and with nothing disruptive in between (migration, checkpointing, ...),
it's theoretically possible.

However, even then, to do a clean device_del in that state for
devices that expect guest cooperation, one would need to chain
unplug_request (which is what device_del translates to) with
whatever hot-unplug hardware is used and simulate the guest unplugging it.

And then on top of that we might need to rebuild/reload firmware tables
(ARM) (x86 should work, as it will rebuild ACPI tables on the
first access). In the generic case we might need to fix up something
else elsewhere.

When I looked into early NUMA configuration, I failed to
convince myself that using prelaunch, changing its semantics to
coldplug, and fixing up an already built machine is a safe/robust thing
to do. (The resulting prelaunch-based RFC even worked fine, but I
wouldn't bet that it wouldn't fall apart in the other combinations
through which the prelaunch runstate can be reached.)

As a result we ended up with a new preconfig option/runstate, to which
we can gradually move machine building steps. One possible way
to deal with the subject would be to queue -device/device_add at the
preconfig stage and use this queue later to add devices to the board
(not sure if it's a sound idea in general).
This early it should be possible to remove a device from the queue.
But why would one add a device and immediately remove it ... :/
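
The queueing idea above could look roughly like the following. This is a minimal Python sketch for illustration only, not QEMU code: `PendingDevices` and its methods are hypothetical names, and the board-building step that would consume the queue is not modeled.

```python
from collections import OrderedDict
from typing import Dict, List


class PendingDevices:
    """Hypothetical queue of device requests collected before the board is built."""

    def __init__(self) -> None:
        # Keyed by device id so a later removal can find the request.
        self._pending: "OrderedDict[str, Dict[str, str]]" = OrderedDict()

    def add(self, dev_id: str, props: Dict[str, str]) -> None:
        if dev_id in self._pending:
            raise ValueError(f"duplicate device id: {dev_id}")
        self._pending[dev_id] = dict(props)

    def remove(self, dev_id: str) -> None:
        # Nothing has been realized yet, so removal only forgets the request.
        if dev_id not in self._pending:
            raise KeyError(dev_id)
        del self._pending[dev_id]

    def drain(self) -> List[Dict[str, str]]:
        """Return the surviving requests in insertion order and empty the queue."""
        items = [dict(props, id=dev_id) for dev_id, props in self._pending.items()]
        self._pending.clear()
        return items
```

Because no hardware state exists while a request is still queued, removal needs no guest cooperation and no rollback, which is the property the prelaunch runstate lacks.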







Re: [Qemu-devel] ACPI PCI hotplug table updates

2018-10-04 Thread Michael S. Tsirkin
On Thu, Oct 04, 2018 at 01:57:21PM +0200, Igor Mammedov wrote:
> On Wed, 3 Oct 2018 10:44:20 -0700
> open sorcerer <0p3n.s0rc3...@gmail.com> wrote:
> 
> > Hi,
> > 
> > I am digging into an issue where qmp_device_del does not actually delete
> > devices when a guest OS is in prelaunch. This seems to be due to the guest
> > OS not handling ACPI events because it is not currently running. If I
> > assume correctly, qmp should allow you to add/remove devices while the host
> > is down, or if not possible, publish an error message.
> may I ask why one would delete a device at -S pause point, isn't it easier
> to start QEMU without it, to begin with?
> 
> > I think fixing this issue is as simple as making sure that the VM is in a
> > safe state to ignore the hotplug ACPI dance but eject the disk, something
> > like:
> In the prelaunch runstate, where the '-S' option pauses the VM, it is practically
> paused at the first instruction to be executed. So device_add at that point is
> considered a hotplug, with all actions already executed at the hardware level
> (interrupts sent, devices responsible for hotplug handling have changed state).
> So if one wished to delete a device at that point, one would have to roll back
> the related state changes.
> If one additionally used the -incoming CLI option, it becomes more complicated,
> as we might end up in the prelaunch runstate with the VM in a running state
> (see possible transitions in runstate_transitions_def[]).
> I'd say the prelaunch runstate can't be used for removing devices that do not
> support surprise removal (in our case PCI doesn't).

I'd say the point is this. In prelaunch the guest has not observed any
device state yet, so we could make device_add look just like
a non-hotplugged device. And we could make device_del pretend
there was a reset immediately afterwards.

Not sure why it matters to anyone, but it's doable I think.



> > prelaunch, preconfig, shutdown: ignore acpi and deal with cleaning devices
> > other non-running: bubble up error
> > running: default behavior
> > 
> > I was trying to validate that this change would be safe (keep in mind I am
> > learning ACPI in little pieces while digging) using GDB and code
> > inspection. While stepping through with GDB I noticed that the PCI slots
> > are controlled by a memory region and the opaque acpi pci hp state object. I
> > was unable thus far to find any code executed that modifies the ACPI tables
> > beyond just the pci hotplug state.
> > 
> > I also tried to test using "while true; do acpidump | md5; sleep 1; done"
> > in the guest OS and then add/remove a virtio-blk-pci device (which
> > exercised the ACPI callbacks via piix4 callbacks). The output of the
> > acpidump -> md5 was consistent during each phase of the data collection
> > which I believe implied that the acpi tables were not modified by the PCI
> > hotplug.
> > 
> > Can someone help me understand:
> > 
> > 1. Are the ACPI tables not modified when doing PCI hotplug?
> > 2. Do the general changes proposed seem safe?
> > 3. Are there resources or documentation I can read to help me understand
> > this problem further? I have skimmed through a lot of different documents
> > and watched some YouTube videos, but the ACPI documentation is hard to read
> > and sift through and the YouTube videos are generally too high level.
> Regarding ACPI based PCI hotplug you can look at
>   docs/specs/acpi_pci_hotplug.txt
>   hw/acpi/pcihp.c
>   ACPI AML part in build_append_pci_bus_devices()
> 
> > 
> > Thanks.



Re: [Qemu-devel] ACPI PCI hotplug table updates

2018-10-04 Thread Igor Mammedov
On Wed, 3 Oct 2018 10:44:20 -0700
open sorcerer <0p3n.s0rc3...@gmail.com> wrote:

> Hi,
> 
> I am digging into an issue where qmp_device_del does not actually delete
> devices when a guest OS is in prelaunch. This seems to be due to the guest
> OS not handling ACPI events because it is not currently running. If I
> assume correctly, qmp should allow you to add/remove devices while the host
> is down, or if not possible, publish an error message.
may I ask why one would delete a device at -S pause point, isn't it easier
to start QEMU without it, to begin with?

> I think fixing this issue is as simple as making sure that the VM is in a
> safe state to ignore the hotplug ACPI dance but eject the disk, something
> like:
In the prelaunch runstate, where the '-S' option pauses the VM, it is practically
paused at the first instruction to be executed. So device_add at that point is
considered a hotplug, with all actions already executed at the hardware level
(interrupts sent, devices responsible for hotplug handling have changed state).
So if one wished to delete a device at that point, one would have to roll back
the related state changes.
If one additionally used the -incoming CLI option, it becomes more complicated,
as we might end up in the prelaunch runstate with the VM in a running state
(see possible transitions in runstate_transitions_def[]).
I'd say the prelaunch runstate can't be used for removing devices that do not
support surprise removal (in our case PCI doesn't).

> prelaunch, preconfig, shutdown: ignore acpi and deal with cleaning devices
> other non-running: bubble up error
> running: default behavior
> 
> I was trying to validate that this change would be safe (keep in mind I am
> learning ACPI in little pieces while digging) using GDB and code
> inspection. While stepping through with GDB I noticed that the PCI slots
> are controlled by a memory region and the opaque acpi pci hp state object. I
> was unable thus far to find any code executed that modifies the ACPI tables
> beyond just the pci hotplug state.
> 
> I also tried to test using "while true; do acpidump | md5; sleep 1; done"
> in the guest OS and then add/remove a virtio-blk-pci device (which
> exercised the ACPI callbacks via piix4 callbacks). The output of the
> acpidump -> md5 was consistent during each phase of the data collection
> which I believe implied that the acpi tables were not modified by the PCI
> hotplug.
> 
> Can someone help me understand:
> 
> 1. Are the ACPI tables not modified when doing PCI hotplug?
> 2. Do the general changes proposed seem safe?
> 3. Are there resources or documentation I can read to help me understand
> this problem further? I have skimmed through a lot of different documents
> and watched some YouTube videos, but the ACPI documentation is hard to read
> and sift through and the YouTube videos are generally too high level.
Regarding ACPI based PCI hotplug you can look at
  docs/specs/acpi_pci_hotplug.txt
  hw/acpi/pcihp.c
  ACPI AML part in build_append_pci_bus_devices()

> 
> Thanks.




[Qemu-devel] ACPI PCI hotplug table updates

2018-10-03 Thread open sorcerer
Hi,

I am digging into an issue where qmp_device_del does not actually delete
devices when a guest OS is in prelaunch. This seems to be due to the guest
OS not handling ACPI events because it is not currently running. If I
assume correctly, qmp should allow you to add/remove devices while the host
is down, or if not possible, publish an error message.

I think fixing this issue is as simple as making sure that the VM is in a
safe state to ignore the hotplug ACPI dance but eject the disk, something
like:

prelaunch, preconfig, shutdown: ignore acpi and deal with cleaning devices
other non-running: bubble up error
running: default behavior
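
The three cases above can be read as a small decision table. The following Python sketch is only an illustration of that table, not QEMU code: `RunState`, `DelAction`, and `device_del_action` are hypothetical names, and only the runstate strings mirror names QEMU uses. It encodes the proposal as stated, without judging whether the "no guest" states are actually safe.

```python
from enum import Enum


class RunState(Enum):
    # A subset of runstate names; the enum itself is illustrative.
    PRELAUNCH = "prelaunch"
    PRECONFIG = "preconfig"
    SHUTDOWN = "shutdown"
    PAUSED = "paused"
    INMIGRATE = "inmigrate"
    RUNNING = "running"


class DelAction(Enum):
    REMOVE_DIRECTLY = "ignore ACPI and clean up the device"
    REPORT_ERROR = "bubble up an error to the QMP client"
    REQUEST_UNPLUG = "default behavior: ask the guest to unplug"


# States in which the proposal assumes no guest code can react to ACPI events.
NO_GUEST_STATES = {RunState.PRELAUNCH, RunState.PRECONFIG, RunState.SHUTDOWN}


def device_del_action(state: RunState) -> DelAction:
    """Map a runstate to the behavior listed in the proposal (hypothetical helper)."""
    if state is RunState.RUNNING:
        return DelAction.REQUEST_UNPLUG
    if state in NO_GUEST_STATES:
        return DelAction.REMOVE_DIRECTLY
    # Every other non-running state falls through to an error.
    return DelAction.REPORT_ERROR
```

For example, `device_del_action(RunState.PAUSED)` falls into the "other non-running" case and yields `DelAction.REPORT_ERROR`.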

I was trying to validate that this change would be safe (keep in mind I am
learning ACPI in little pieces while digging) using GDB and code
inspection. While stepping through with GDB I noticed that the PCI slots
are controlled by a memory region and the opaque acpi pci hp state object. I
was unable thus far to find any code executed that modifies the ACPI tables
beyond just the pci hotplug state.

I also tried to test using "while true; do acpidump | md5; sleep 1; done"
in the guest OS and then add/remove a virtio-blk-pci device (which
exercised the ACPI callbacks via piix4 callbacks). The output of the
acpidump -> md5 was consistent during each phase of the data collection
which I believe implied that the acpi tables were not modified by the PCI
hotplug.
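
The same check can be expressed over captured dumps instead of a shell loop. This is a minimal Python sketch under the assumption that each snapshot is the raw output of one `acpidump` run, collected by whatever means is convenient; the helper names `digests` and `tables_unchanged` are made up for illustration.

```python
import hashlib
from typing import Iterable, List


def digests(snapshots: Iterable[bytes]) -> List[str]:
    """Return the MD5 hex digest of each table dump, in order."""
    return [hashlib.md5(s).hexdigest() for s in snapshots]


def tables_unchanged(snapshots: Iterable[bytes]) -> bool:
    """True if every snapshot hashes to the same value (or there are none)."""
    return len(set(digests(snapshots))) <= 1
```

Feeding it one dump taken before the hotplug, one after the add, and one after the remove gives a single yes/no answer. Identical hashes only show that the table bytes did not change; they say nothing about hotplug state held outside the tables.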

Can someone help me understand:

1. Are the ACPI tables not modified when doing PCI hotplug?
2. Do the general changes proposed seem safe?
3. Are there resources or documentation I can read to help me understand
this problem further? I have skimmed through a lot of different documents
and watched some YouTube videos, but the ACPI documentation is hard to read
and sift through and the YouTube videos are generally too high level.

Thanks.


Re: [Qemu-devel] ACPI PCI hotplug table updates

2018-10-03 Thread Michael S. Tsirkin
On Wed, Oct 03, 2018 at 10:44:20AM -0700, open sorcerer wrote:
> Hi,
> 
> I am digging into an issue where qmp_device_del does not actually delete
> devices when a guest OS is in prelaunch.

What exactly is meant by prelaunch? E.g. is it prelaunch while bios is
doing the pci bus scan?

> This seems to be due to the guest OS
> not handling ACPI events because it is not currently running. If I assume
> correctly, qmp should allow you to add/remove devices while the host is down,
> or if not possible, publish an error message.
> 
> I think fixing this issue is as simple as making sure that the VM is in a safe
> state to ignore the hotplug ACPI dance but eject the disk, something like:
> 
> prelaunch, preconfig, shutdown: ignore acpi and deal with cleaning devices
> other non-running: bubble up error
> running: default behavior
> 
> I was trying to validate that this change would be safe (keep in mind I am
> learning ACPI in little pieces while digging) using GDB and code inspection.
> While stepping through with GDB I noticed that the PCI slots are controlled by
> a memory region and the opaque acpi pci hp state object. I was unable thus far
> to find any code executed that modifies the ACPI tables beyond just the pci
> hotplug state.
> 
> I also tried to test using "while true; do acpidump | md5; sleep 1; done" in
> the guest OS and then add/remove a virtio-blk-pci device (which exercised the
> ACPI callbacks via piix4 callbacks). The output of the acpidump -> md5 was
> consistent during each phase of the data collection which I believe implied
> that the acpi tables were not modified by the PCI hotplug.
> 
> Can someone help me understand:
> 
> 1. Are the ACPI tables not modified when doing PCI hotplug?

Yes, they are not modified.

> 2. Do the general changes proposed seem safe?
> 3. Are there resources or documentation I can read to help me understand this
> problem further? I have skimmed through a lot of different documents and
> watched some YouTube videos, but the ACPI documentation is hard to read and
> sift through and the YouTube videos are generally too high level.
> 
> Thanks.
> 

We are generally trying to move away from ACPI hotplug to native
PCIE hotplug. You can read up on that in the pci express spec.

-- 
MST