Hi Michael,
On Fri, Jul 19, 2019 at 02:35:14AM +0000, Zhangbo (Oscar) wrote:
Hi All:
I have 2 questions about (un)hotplug on pcie-root-port.
First Question (hotplug failure because of redundant PCI_EXP_LNKSTA_DLLLA bit
set):
during VM boot, qemu sets PCI_EXP_LNKSTA_DLLLA according to this process:
pcie_cap_init() -> pcie_cap_v1_fill(),
even if there's no pcie device added to the VM.
I noticed that during hotplug, qemu also sets PCI_EXP_LNKSTA_DLLLA in
pcie_cap_slot_hotplug_cb().
It means that the bit PCI_EXP_LNKSTA_DLLLA is set TWICE.
Why set this bit while initializing the pcie-root-port? It seems unnecessary.
Makes sense.
The bad side of this is it causes HOTPLUG FAILURE if we boot the VM and
hotplug a pcie device at the same time:
In the VM kernel, because this bit is set, it senses a PDC event; the
process is:
pciehp_probe -> pcie_init -> pcie_init_slot ->
pciehp_queue_pushbutton_work.
If the 2 PDC events get too close, the VM kernel will wrongly unplug the
device.
Suggestion to the 1st problem:
Can I remove the setting of the PCI_EXP_LNKSTA_DLLLA bit during
pcie_cap_init()?
We raise this question here because we found that if the
PCI_EXP_LNKSTA_DLLLA bit is set by default, the Linux kernel may try
to send a PCI_EXP_HP_EV_PDC event after boot-up. If we hotplug a virtio
device while that PCI_EXP_HP_EV_PDC event is being processed (pciehp_ist
=> pciehp_handle_presence_or_link_change => pciehp_enable_slot),
the device may be accidentally powered down because the power state
detected is ON_STATE.
The kernel sends the PCI_EXP_HP_EV_PDC event when it probes the
pcie-root-port, i.e.:
pciehp_probe => pciehp_check_presence =>
pciehp_card_present_or_link_active => pciehp_check_link_active
pciehp_check_link_active returns true if the PCI_EXP_LNKSTA_DLLLA bit is
set.
We are going to send a patch that stops setting PCI_EXP_LNKSTA_DLLLA by
default to fix the problem here.
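
To make the probe-time check above concrete, here is a minimal sketch in C.
It is not the kernel or qemu code: struct port_model, port_init() and the
set_dllla_at_init flag are hypothetical names used only for illustration.
The bit values are the ones from the PCIe register headers. It shows why a
DLLLA bit set at init looks like "card present or link active" even when
the slot is empty.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values as defined in the PCIe register headers. */
#define PCI_EXP_LNKSTA_DLLLA 0x2000 /* Data Link Layer Link Active */
#define PCI_EXP_SLTSTA_PDS   0x0040 /* Presence Detect State */

/* Hypothetical model of the two status registers of one root port. */
struct port_model {
    uint16_t lnksta; /* Link Status */
    uint16_t sltsta; /* Slot Status */
};

/* Same idea as pciehp_check_link_active(): test the DLLLA bit. */
static bool link_active(const struct port_model *p)
{
    return (p->lnksta & PCI_EXP_LNKSTA_DLLLA) != 0;
}

/* Same idea as pciehp_card_present_or_link_active(). */
static bool card_present_or_link_active(const struct port_model *p)
{
    return (p->sltsta & PCI_EXP_SLTSTA_PDS) != 0 || link_active(p);
}

/*
 * Hypothetical init helper. set_dllla_at_init == true models the
 * behaviour described in this thread (bit set with an empty slot);
 * false models the proposed change.
 */
static struct port_model port_init(bool set_dllla_at_init)
{
    struct port_model p = { 0, 0 };

    if (set_dllla_at_init) {
        p.lnksta |= PCI_EXP_LNKSTA_DLLLA;
    }
    return p;
}
```

With port_init(true) the empty slot is reported as occupied at probe time;
with port_init(false) it is not, which is the effect the patch aims for.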
Second Question (pcie device unplug takes too long):
qemu only sends an ABP event to the VM kernel when unplugging pcie
devices. The VM kernel receives the ABP event and then sleeps 5s waiting
for a PDC event, which makes unplugging devices take too long.
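
The delay described above can be pictured with a small state machine. This
is only a simplified sketch under assumptions, not the pciehp
implementation: struct slot, the handler names and the integer "seconds"
clock are hypothetical, and details such as cancelling on a second button
press are left out. The state names and the 5s delay are the ones mentioned
in this thread.

```c
enum slot_state { ON_STATE, BLINKINGOFF_STATE, OFF_STATE };

#define BUTTON_DELAY_SEC 5 /* wait described in this thread */

/* Hypothetical slot model; time is a plain counter in seconds. */
struct slot {
    enum slot_state state;
    int deadline; /* time at which the pending power-off fires */
};

/* Attention button pressed at time 'now': start the delayed power-off. */
static void on_attention_button(struct slot *s, int now)
{
    if (s->state == ON_STATE) {
        s->state = BLINKINGOFF_STATE;
        s->deadline = now + BUTTON_DELAY_SEC;
    }
}

/* Presence change with the card gone: power off without waiting. */
static void on_presence_removed(struct slot *s)
{
    s->state = OFF_STATE;
}

/* Timer tick: finish the delayed power-off once the deadline passes. */
static void on_tick(struct slot *s, int now)
{
    if (s->state == BLINKINGOFF_STATE && now >= s->deadline) {
        s->state = OFF_STATE;
    }
}
```

In this model an ABP-only unplug reaches OFF_STATE after 5 ticks, while a
presence change reaches it immediately.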
Suggestion to the 2nd problem:
Can I send both ABP and *PDC* events to the kernel when unplugging devices?
I think we should not only set PDC but also try clearing presence bit,
even though the device is actually still there and mapped into guest
memory.
Maybe we should also not send the ABP event at all.
In both cases it's necessary to test with a non-linux guest
(e.g. a couple of versions of windows) to be sure we are not breaking
anything.
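
A rough sketch of the Slot Status update suggested above, assuming a
hypothetical helper signal_unplug() that operates on a plain 16-bit copy of
the register (this is not the qemu API). The bit values are the ones from
the PCIe register headers. The send_abp flag covers both variants under
discussion: with or without the ABP event.

```c
#include <stdbool.h>
#include <stdint.h>

/* Slot Status bits as defined in the PCIe register headers. */
#define PCI_EXP_SLTSTA_ABP 0x0001 /* Attention Button Pressed */
#define PCI_EXP_SLTSTA_PDC 0x0008 /* Presence Detect Changed */
#define PCI_EXP_SLTSTA_PDS 0x0040 /* Presence Detect State */

/*
 * Hypothetical helper: return the Slot Status value the guest would see
 * after an unplug request. Presence is cleared and PDC is raised; ABP is
 * raised only if send_abp is true.
 */
static uint16_t signal_unplug(uint16_t sltsta, bool send_abp)
{
    sltsta &= (uint16_t)~PCI_EXP_SLTSTA_PDS; /* card reported as gone */
    sltsta |= PCI_EXP_SLTSTA_PDC;            /* presence changed */
    if (send_abp) {
        sltsta |= PCI_EXP_SLTSTA_ABP;
    }
    return sltsta;
}
```

Whether the guest reacts correctly to either variant is exactly what the
Linux and Windows guest tests mentioned above would need to confirm.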
Thanks for your opinion. We will try sending the PCI_EXP_HP_EV_PDC event
instead of the PCI_EXP_HP_EV_ABP event at device unplug and run some
Windows guest compatibility tests.
We will reply later.