On Sat, 25 Jun 2016 09:24:19 +0800 Zhou Jie <zhoujie2...@cn.fujitsu.com> wrote:
> Hi Alex, > > > We should never depend on the guest driver to behave in a certain way, > > but we need to prioritize what that actually means. vfio in the kernel > > has a responsibility first and foremost to the host kernel. User owned > > devices cannot be allowed to exploit or interfere with the host > > regardless of user behavior. The next priority is correct operation > > for the user. When the host kernel is handling the AER event between > > the error and resume notifies, it doesn't have device specific drivers, > > it's manipulating the device as a generic PCI device. That makes me > > think that vfio should not allow the user to interact (interfere) with > > the device during that process and that such interference can be > > limited to standard PCI level interactions. That means config space, > > and things that operate on config space (like interrupt ioctls and > > resets). On the QEMU side, we've sent a notification that an error > > occurred, how the user and the guest respond to that is beyond the > > concern of vfio in the kernel. If the user/guest driver continues to > > interact with resources on the device, that's fine, but I think vfio in > > the kernel does need to prevent the user from interfering with the PCI > > state of the device for that brief window when we know the host kernel > > is operating on the device. Otherwise the results are unpredictable > > and therefore unsupportable. Does that make sense? Thanks, > I understand. > > I want to alter the VFIO driver like following. > During err occurs and resume: > 1. Make config space read only. > Ignore config space writing to prevent the user from > interfering with the PCI state of the device. > User can get the error infomation by reading the config space. > 2. Disable INTx and MSI > Write "Command Register" to disable INTx and MSI. > 3. Do nothing for bar regions. > Guest driver may access bar regions. > It doesn't matter as device is going to be reset. > > The following code will be modified. > 1. vfio_pci_ioctl > add flag for aer support > 2. vfio_pci_ioctl > During err occurs and resume: > if (cmd == VFIO_DEVICE_SET_IRQS) return EAGAIN > if (cmd == VFIO_DEVICE_RESET) return EAGAIN > if (cmd == VFIO_DEVICE_GET_PCI_HOT_RESET_INFO) return EAGAIN > if (cmd == VFIO_DEVICE_PCI_HOT_RESET) return EAGAIN > 3. vfio_pci_write > During err occurs and resume: > block > 4. vfio_pci_aer_err_detected > Set aer state in "struct vfio_pci_device" > Write "Command Register" to disable INTx and MSI. > 5. vfio_pci_aer_resume > Clear aer state in "struct vfio_pci_device" > I don't need to enable INTx and MSI. > The device will be initalized by guest driver. The INTx/MSI part needs further definition for the user. Are we actually completely tearing down interrupts with the expectation that the user will re-enable them or are we just masking them such that the user needs to unmask? Also note that not all devices support DisINTx. Otherwise it seems like a reasonable approach, but I can't guarantee we won't find new issues along the way. For instance we'll need to test how -EAGAIN returns interact with existing QEMU and maybe decided whether there are cases that are better handled by doing an interruptible wait. Thanks, Alex