On Tue, 2015-07-28 at 07:48 +0000, Chen, Hanxiao wrote: > Hi, Alex > > > -----Original Message----- > > From: Alex Williamson [mailto:alex.william...@redhat.com] > > To: Chen, Hanxiao > > Cc: qemu-devel@nongnu.org; Chen, Fan > > Subject: Re: [Qemu-devel] [PATCH v12 00/15] vfio-pci: pass the aer error to > > guest > > > > On Thu, 2015-07-16 at 12:00 +0800, Chen Hanxiao wrote: > > > From: Chen Fan <chen.fan.f...@cn.fujitsu.com> > > > > > > For now, when qemu receives an error from host aer report > > > by vfio pci passthough devices, qemu just terminate the guest. > > > Usually user want to know what error occurred > > > rather than stop the guest. > > > > > > This patches add aer capability support for vfio device, > > > then pass the error to guest, and let guest driver to recover > > > from the error. > > > Turning on SERR# for error forwording in bridge control register > > > patch in seabios has been merged as commit 32ec3ee. > > > > > > notes: > > > this series patches enable aer support single/multi-function, > > > for multi-function, require all the function of the slot assigned to > > > VM and on the same slot. > > > > > > Chen Fan (15): > > > vfio: extract vfio_get_hot_reset_info as a single function > > > vfio: squeeze out vfio_pci_do_hot_reset for support bus reset > > > pcie: modify the capability size assert > > > vfio: make the 4 bytes aligned for capability size > > > vfio: add pcie extanded capability support > > > aer: impove pcie_aer_init to support vfio device > > > vfio: add aer support for vfio device > > > vfio: add check host bus reset is support or not > > > pci: add bus reset_notifiers callbacks for host bus reset > > > vfio: add sec_bus_reset notifier to notify physical bus reset is > > > needed > > > vfio: modify vfio_pci_hot_reset to support bus reset > > > vfio: do hot bus reset when do virtual secondary bus reset > > > pcie_aer: expose pcie_aer_msg() interface > > > vfio-pci: pass the aer error to guest > > > vfio: add 'aer' property to expose aercap > > > > > > hw/pci-bridge/ioh3420.c | 2 +- > > > hw/pci-bridge/xio3130_downstream.c | 2 +- > > > hw/pci-bridge/xio3130_upstream.c | 2 +- > > > hw/pci/pci.c | 16 + > > > hw/pci/pci_bridge.c | 6 + > > > hw/pci/pcie.c | 2 +- > > > hw/pci/pcie_aer.c | 6 +- > > > hw/vfio/pci.c | 636 > > > +++++++++++++++++++++++++++++++++---- > > > include/hw/pci/pci.h | 4 + > > > include/hw/pci/pci_bus.h | 2 + > > > include/hw/pci/pcie_aer.h | 3 +- > > > 11 files changed, 609 insertions(+), 72 deletions(-) > > > > > > This seems to be pretty much the same as v11 where I commented that I > > didn't think it was acceptable to have a feature dependent on having all > > the functions assigned without supporting hot-add of multi-function > > devices. Can you summarize what's changed here and whether that comment > > was addressed. It would be a courtesy to reviewers to provide at least > > a summary changelog with each new version. Thanks, > > > > We could hot-unplug all passthrough devices by device_del, > But currently Qemu could not hot-add multi-function pci device. > > See TODO in pcie_cap_slot_hotplug_cb: > /* TODO: multifunction hot-plug. > * Right now, only a device of function = 0 is allowed to be > * hot plugged/unplugged. > */ > assert(PCI_FUNC(pci_dev->devfn) == 0); > > So we had to limit this as a workaround. > > Why can't we add functions one by one to the same slot by device_add? > > If we could add functions one by one, > then we can enable aer for the devices once all dependence functions > were added by setting aer as a dynamic property(such as using > object_property_add, qom-set)
Adding functions individually is not supported by PCI, the specification indicates that function 0 must always be present and if function 0 reports the multi-function bit set, system software is to scan function 1 through 7. Therefore any hotplug notification to the slot should do nothing unless a function zero is present and real hardware doesn't typically have the ability to spawn new functions, so I wouldn't necessarily expect additional notifications to a slot to cause system software to re-scan for new functions. Also, if we add functions one by one, how do we know we'll ever see the complete set necessary to support the user requested AER feature? We cannot predict that the user will add the needed functions in the future. Paolo suggested a possible solution to this in a separate internal thread, that perhaps we could do multi-function hot-add so long as the zero function is added last, effectively signaling the closure of the slot. Modifications to the PCI-core would be needed such that non-zero functions could be hot added and config space hidden until the successful instantiation of the zero function (to prevent unsolicited discovery and use of the functions prior to exposure). Perhaps a slot closure callback would give us the same opportunity we have with machine_init_done to give a final check of the slot and abort if our aer requirements are not met. Multi-function hot-add is not currently supported because nothing requires it. If aer is to depend on multi-function, then it's going to need to bring multi-function hot-add up to a supportable level in order to be accepted. Thanks, Alex