On Thu, Feb 04, 2016 at 10:04:01AM +0800, Chen Fan wrote: > > On 02/03/2016 09:57 PM, Michael S. Tsirkin wrote: > >On Wed, Feb 03, 2016 at 04:54:01PM +0800, Chen Fan wrote: > >>On 01/17/2016 02:34 AM, Michael S. Tsirkin wrote: > >>>On Tue, Jan 12, 2016 at 10:43:01AM +0800, Cao jin wrote: > >>>>From: Chen Fan <chen.fan.f...@cn.fujitsu.com> > >>>> > >>>>For now, for vfio pci passthough devices when qemu receives > >>>>an error from host aer report, currentlly just terminate the guest, > >>>>but usually user want to know what error occurred but stopping the > >>>>guest, so this patches add aer capability support for vfio device, > >>>>and pass the error to guest, and have guest driver to recover > >>>>from the error. > >>>I would like to see a version of this patchset that doesn't > >>>depend on pci core changes. > >>>I think that if you make this simplifying assumption: > >>> > >>>- all devices on same bus in guest are on same bus in host > >>> > >>>then you can handle both reset and hotplug simply in function 0 > >>>since it will belong to vfio. > >>> > >>>So we can have a version without pci core changes that simply assumes > >>>this, and things will just work. > >>> > >>> > >>>Now, if we wanted to enforce this limitation, I think the > >>>cleanest way would be to add a callback in struct PCIDevice: > >>> > >>> bool is_valid_function(PCIDevice *newfunction) > >>> > >>>and call it as each function is added. > >>>This way aer function can validate that each function > >>>added shares the same bus. > >>>And this way issues will be detected directly and not when > >>>function 0 is added. > >>> > >>>I would prefer this validation code to be a patch on top so we can merge > >>>the functionality directly and avoid blocking it while we figure out the > >>>best api to validate things. > >>> > >>>I don't see why making guest topology match host would > >>>ever be a problem, but if it's required to support > >>>configurations where these differ, I'd like to see > >>>an attempt to address that be split out, after aer > >>>is supported. > >>Hi Michael, > >> > >>Just think about this more, I think we also should check the vfio > >>devices whether on the same bus at the time of function 0 is added. > >>because we don't know the affected devices by a bus reset have > >>already all been assigned to VM. > >This is something vfio in kernel should check. > >You can't rely on qemu being well behaved, so don't > >even try to catch cases which would break host in userspace. > > > >qemu should only worry about not breaking guest. > > > > > >>for example, the multi-function's hotplug. > >>devices on same bus in host are added to VM one by one. when we > >>test one device, we haven't yet added the other devices. > >>so I think > >>the patch should like below. then we could add a vfio_is_valid_function in > >>vfio > >>to test each device whether the affected devices on the same bus. > >> > >>Thanks, > >>Chen > >> > >>diff --git a/hw/pci/pci.c b/hw/pci/pci.c > >>index d940f79..7163b56 100644 > >>--- a/hw/pci/pci.c > >>+++ b/hw/pci/pci.c > >>@@ -1836,6 +1836,38 @@ PCIDevice *pci_find_device(PCIBus *bus, int bus_num, > >>uint8_t devfn) > >> return bus->devices[devfn]; > >> } > >> > >>+static int pci_bus_check_devices(PCIBus *bus) > >>+{ > >>+ PCIDeviceClass *pc; > >>+ int i, ret = 0; > >>+ > >>+ for (i = 0; i < ARRAY_SIZE(bus->devices); ++i) { > >>+ if (!bus->devices[i]) { > >>+ continue; > >>+ } > >>+ > >>+ pc = PCI_DEVICE_GET_CLASS(bus->devices[i]); > >>+ if (!pc->is_valid_func) { > >>+ continue; > >>+ } > >>+ > >>+ ret = pc->is_valid_func(bus->devices[i], bus); > >>+ if (!ret) { > >>+ return -1; > >>+ } > >>+ } > >>+ return 0; > >>+} > >>+ > >>+static bool pci_is_valid_function(PCIDevice *pdev, PCIBus *bus) > >>+{ > >>+ if (pdev->bus == bus) { > >>+ return true; > >>+ } > >>+ > >>+ return false; > >>+} > >>+ > >I don't really understand what is this one doing. > >Why do we need a default function? > if the vfio driver in kernel can handle the bus reset for any one > device in qemu without the affected devices assigned. I think > we don't need this default one. > BTW, IIRC at present the devices on the same bus in host can > be assigned to different VM, so if we want to support this kind of > bus reset for an independent device when enable aer, aren't we > limiting the case that others devices on the same bus must be > assigned to current VM? > > Thanks, > Chen
I don't believe this works at the moment, and I'd expect kernel to prevent this, so we should not rely on userspace code for this. Alex, could you comment please? > >> static void pci_qdev_realize(DeviceState *qdev, Error **errp) > >> { > >> PCIDevice *pci_dev = (PCIDevice *)qdev; > >>@@ -1878,6 +1910,14 @@ static void pci_qdev_realize(DeviceState *qdev, Error > >>**errp) > >> pci_qdev_unrealize(DEVICE(pci_dev), NULL); > >> return; > >> } > >>+ > >>+ if (DEVICE(pci_dev)->hotplugged && > >>+ pci_get_function_0(pci_dev) == pci_dev && > >>+ pci_bus_check_devices(bus)) { > >>+ error_setg(errp, "failed to hotplug function 0"); > >>+ pci_qdev_unrealize(DEVICE(pci_dev), NULL); > >>+ return; > >>+ } > >> } > >> > >> static void pci_default_realize(PCIDevice *dev, Error **errp) > >>@@ -2390,6 +2430,7 @@ static void pci_device_class_init(ObjectClass *klass, > >>void *data) > >> k->bus_type = TYPE_PCI_BUS; > >> k->props = pci_props; > >> pc->realize = pci_default_realize; > >>+ pc->is_valid_func = pci_is_valid_function; > >> } > >> > >> AddressSpace *pci_device_iommu_address_space(PCIDevice *dev) > >>diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h > >>index dedf277..a89580f 100644 > >>--- a/include/hw/pci/pci.h > >>+++ b/include/hw/pci/pci.h > >>@@ -191,6 +191,7 @@ typedef struct PCIDeviceClass { > >> > >> void (*realize)(PCIDevice *dev, Error **errp); > >> int (*init)(PCIDevice *dev);/* TODO convert to realize() and remove */ > >>+ bool (*is_valid_func)(PCIDevice *dev, PCIBus *bus); > >> PCIUnregisterFunc *exit; > >> PCIConfigReadFunc *config_read; > >> PCIConfigWriteFunc *config_write; > >> > >> > >> > >>> > >>> > >>>>v15-v16: > >>>> 10/14, 11/14 are new to introduce a reset sequence id to specify the > >>>> vfio devices has been reset for that reset. other patches aren't > >>>> modified. > >>>> > >>>>v14-v15: > >>>> 1. add device hot reset callback > >>>> 2. add bus_in_reset for vfio device to avoid multi do host bus reset > >>>> > >>>>v13-v14: > >>>> 1. for multifunction device, requiring all functions enable AER.(9/13) > >>>> 2. due to all affected functions receive error signal, ignore no > >>>> error occurred function. (12/13) > >>>> > >>>>v12-v13: > >>>> 1. since support multifuncion hotplug, here add callback to enable > >>>> aer. > >>>> 2. add pci device pre+post reset for aer host reset. > >>>> > >>>>Chen Fan (14): > >>>> vfio: extract vfio_get_hot_reset_info as a single function > >>>> vfio: squeeze out vfio_pci_do_hot_reset for support bus reset > >>>> pcie: modify the capability size assert > >>>> vfio: make the 4 bytes aligned for capability size > >>>> vfio: add pcie extanded capability support > >>>> aer: impove pcie_aer_init to support vfio device > >>>> vfio: add aer support for vfio device > >>>> vfio: add check host bus reset is support or not > >>>> add check reset mechanism when hotplug vfio device > >>>> pci: introduce pci bus pre reset > >>>> vfio: introduce last reset sequence id > >>>> pcie_aer: expose pcie_aer_msg() interface > >>>> vfio-pci: pass the aer error to guest > >>>> vfio: add 'aer' property to expose aercap > >>>> > >>>> hw/pci-bridge/ioh3420.c | 2 +- > >>>> hw/pci-bridge/xio3130_downstream.c | 2 +- > >>>> hw/pci-bridge/xio3130_upstream.c | 2 +- > >>>> hw/pci/pci.c | 42 +++ > >>>> hw/pci/pci_bridge.c | 3 + > >>>> hw/pci/pcie.c | 2 +- > >>>> hw/pci/pcie_aer.c | 6 +- > >>>> hw/vfio/pci.c | 616 > >>>> +++++++++++++++++++++++++++++++++---- > >>>> hw/vfio/pci.h | 9 + > >>>> include/hw/pci/pci.h | 1 + > >>>> include/hw/pci/pci_bus.h | 8 + > >>>> include/hw/pci/pcie_aer.h | 3 +- > >>>> 12 files changed, 624 insertions(+), 72 deletions(-) > >>>> > >>>>-- > >>>>1.9.3 > >>>> > >>>> > >>>. > >>> > >> > > > >. > > > >