On Tue, 22 Mar 2016 18:14:45 +0800 Chen Fan <chen.fan.f...@cn.fujitsu.com> wrote:
> On 03/22/2016 05:40 AM, Alex Williamson wrote:
> > On Mon, 21 Mar 2016 18:08:44 +0800
> > Cao jin <caoj.f...@cn.fujitsu.com> wrote:
> >
> >> From: Chen Fan <chen.fan.f...@cn.fujitsu.com>
> >>
> >> Due to all devices assigned to VM on the same way as host if enable
> >> aer, so we can easily do the hot reset by selecting the function #0
> >> to do the hot reset.
> >>
> >> Signed-off-by: Chen Fan <chen.fan.f...@cn.fujitsu.com>
> >> ---
> >>  hw/vfio/pci.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >>  hw/vfio/pci.h |  2 ++
> >>  2 files changed, 52 insertions(+)
> >>
> >> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> >> index 9902c87..718cde7 100644
> >> --- a/hw/vfio/pci.c
> >> +++ b/hw/vfio/pci.c
> >> @@ -1900,6 +1900,8 @@ static void vfio_check_hot_bus_reset(VFIOPCIDevice *vdev, Error **errp)
> >>      /* List all affected devices by bus reset */
> >>      devices = &info->devices[0];
> >>
> >> +    vdev->single_depend_dev = (info->count == 1);
> >> +
> >>      /* Verify that we have all the groups required */
> >>      for (i = 0; i < info->count; i++) {
> >>          PCIHostDeviceAddress host;
> >> @@ -2608,11 +2610,36 @@ static void vfio_put_device(VFIOPCIDevice *vdev)
> >>  static void vfio_err_notifier_handler(void *opaque)
> >>  {
> >>      VFIOPCIDevice *vdev = opaque;
> >> +    PCIDevice *pdev = &vdev->pdev;
> >>
> >>      if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
> >>          return;
> >>      }
> >>
> >> +    if (vdev->features & VFIO_FEATURE_ENABLE_AER) {
> >> +        VFIOPCIDevice *tmp;
> >> +        PCIDevice *dev;
> >> +        int devfn;
> >> +
> >> +        /*
> >> +         * If one device has aer capability on a bus, when aer occurred,
> >> +         * we should notify all devices on the bus there was an aer arrived,
> >> +         * then we are able to vote the device #0 to do host bus reset.
> >> +         */
> >> +        for (devfn = 0; devfn < 8; devfn++) {
> > ARI?
> >
> >> +            dev = pci_find_device(pdev->bus, pci_bus_num(pdev->bus),
> >> +                                  PCI_DEVFN(PCI_SLOT(pdev->devfn), devfn));
> >> +            if (!dev) {
> >> +                continue;
> >> +            }
> >> +            if (!object_dynamic_cast(OBJECT(dev), "vfio-pci")) {
> >> +                continue;
> >> +            }
> >> +            tmp = DO_UPCAST(VFIOPCIDevice, pdev, dev);
> >> +            tmp->aer_occurred = true;
> >> +        }
> >> +    }
> >> +
> >>      /*
> >>       * TBD. Retrieve the error details and decide what action
> >>       * needs to be taken. One of the actions could be to pass
> >> @@ -3075,6 +3102,29 @@ static void vfio_pci_reset(DeviceState *dev)
> >>
> >>      trace_vfio_pci_reset(vdev->vbasedev.name);
> >>
> >> +    if (vdev->aer_occurred) {
> >> +        PCIDevice *br = pci_bridge_get_device(pdev->bus);
> >> +
> >> +        if (br &&
> >> +            (pci_get_word(br->config + PCI_BRIDGE_CONTROL) &
> >> +             PCI_BRIDGE_CTL_BUS_RESET)) {
> >> +            /* simply voting the function 0 to do hot bus reset */
> >> +            if (pci_get_function_0(pdev) == pdev) {
> >> +                if (vdev->features & VFIO_FEATURE_ENABLE_AER) {
> >> +                    vfio_pci_hot_reset(vdev, vdev->single_depend_dev);
> >> +                } else {
> >> +                    /* if this device has not AER capability, code
> >> +                     * coming here indicates there is another function
> >> +                     * on the bus has AER capability.
> >> +                     * */
> > This shouldn't be possible, right?
> >
> >> +                    vfio_pci_hot_reset(vdev, false);
> >> +                }
> >> +            }
> >> +            vdev->aer_occurred = false;
> >> +            return;
> >> +        }
> >> +    }
> > Why do we care than an AER occurred now?  Can't we simply test:
> >
> >     if (vdev->features & VFIO_FEATURE_ENABLE_AER &&
> >         pci_get_function_0(pdev) == pdev) {
> >         PCIDevice *br = pci_bridge_get_device(pdev->bus);
> >
> >         if (pci_get_word(br->config + PCI_BRIDGE_CONTROL) &
> >             PCI_BRIDGE_CTL_BUS_RESET)) {
> >
> >             vfio_pci_hot_reset(vdev, vdev->single_depend_dev);
> >             return;
> >         }
> >     }
>
> do we have the case that only one/few of the devices affected
> by a bus reset assigned to VM enabled AER, then when bus
> reset, we let the function 0 do hot reset, which may not enable AER,
> but we should tell other devices on the bus that they don't need
> to do bus reset. so I just mark all devices on the bus when needing a
> hot reset.

I thought we were requiring all the bus reset affected devices to
enable AER, so that example should not be possible.  I think it matches
our target use case to make this a requirement.  Thanks,

Alex
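
For reference, the simplified test suggested in the quoted review can be
modeled outside QEMU roughly as below. This is only a sketch under stated
assumptions: ModelDev, get_word_le() and should_hot_reset() are hypothetical
stand-ins, not QEMU code, and the NULL check on the bridge config is kept
from the original patch rather than taken from the suggested snippet. The
two register constants carry the standard PCI-to-PCI bridge header values.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bridge Control register offset and its Secondary Bus Reset bit. */
#define PCI_BRIDGE_CONTROL        0x3e
#define PCI_BRIDGE_CTL_BUS_RESET  0x40

/* Hypothetical stand-in for VFIOPCIDevice: only what the check needs. */
typedef struct ModelDev {
    unsigned function;            /* PCI function number, 0..7 */
    bool aer_enabled;             /* models VFIO_FEATURE_ENABLE_AER */
    const uint8_t *bridge_config; /* upstream bridge config space, or NULL */
} ModelDev;

/* Read a little-endian 16-bit value from config space. */
static uint16_t get_word_le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Return true when this device should perform the hot reset: it has AER
 * enabled, it is function 0, and the upstream bridge currently has the
 * Secondary Bus Reset bit set. No per-device "AER occurred" flag is used.
 */
static bool should_hot_reset(const ModelDev *d)
{
    if (!d->aer_enabled || d->function != 0) {
        return false;
    }
    if (d->bridge_config == NULL) {
        return false; /* assumption: no upstream bridge, nothing to test */
    }
    return (get_word_le(d->bridge_config + PCI_BRIDGE_CONTROL) &
            PCI_BRIDGE_CTL_BUS_RESET) != 0;
}
```

With every affected device required to enable AER, function 0 always
satisfies the first condition, so the other functions need no extra state
to know they can skip the reset.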