On Sat, 25 Jun 2016 09:24:19 +0800
Zhou Jie <zhoujie2...@cn.fujitsu.com> wrote:

> Hi Alex,
> 
> > We should never depend on the guest driver to behave in a certain way,
> > but we need to prioritize what that actually means.  vfio in the kernel
> > has a responsibility first and foremost to the host kernel.  User owned
> > devices cannot be allowed to exploit or interfere with the host
> > regardless of user behavior.  The next priority is correct operation
> > for the user.  When the host kernel is handling the AER event between
> > the error and resume notifies, it doesn't have device specific drivers,
> > it's manipulating the device as a generic PCI device.  That makes me
> > think that vfio should not allow the user to interact (interfere) with
> > the device during that process and that such interference can be
> > limited to standard PCI level interactions.  That means config space,
> > and things that operate on config space (like interrupt ioctls and
> > resets).  On the QEMU side, we've sent a notification that an error
> > occurred, how the user and the guest respond to that is beyond the
> > concern of vfio in the kernel.  If the user/guest driver continues to
> > interact with resources on the device, that's fine, but I think vfio in
> > the kernel does need to prevent the user from interfering with the PCI
> > state of the device for that brief window when we know the host kernel
> > is operating on the device.  Otherwise the results are unpredictable
> > and therefore unsupportable.  Does that make sense?  Thanks,  
> I understand.
> 
> I want to alter the VFIO driver like following.
> During err occurs and resume:
> 1. Make config space read only.
>     Ignore config space writing to prevent the user from
>     interfering with the PCI state of the device.
>     User can get the error infomation by reading the config space.
> 2. Disable INTx and MSI
>     Write "Command Register" to disable INTx and MSI.
> 3. Do nothing for bar regions.
>     Guest driver may access bar regions.
>     It doesn't matter as device is going to be reset.
> 
> The following code will be modified.
> 1. vfio_pci_ioctl
>     add flag for aer support
> 2. vfio_pci_ioctl
>     During err occurs and resume:
>     if (cmd == VFIO_DEVICE_SET_IRQS) return EAGAIN
>     if (cmd == VFIO_DEVICE_RESET) return EAGAIN
>     if (cmd == VFIO_DEVICE_GET_PCI_HOT_RESET_INFO) return EAGAIN
>     if (cmd == VFIO_DEVICE_PCI_HOT_RESET) return EAGAIN
> 3. vfio_pci_write
>     During err occurs and resume:
>     block
> 4. vfio_pci_aer_err_detected
>     Set aer state in "struct vfio_pci_device"
>     Write "Command Register" to disable INTx and MSI.
> 5. vfio_pci_aer_resume
>     Clear aer state in "struct vfio_pci_device"
>     I don't need to enable INTx and MSI.
>     The device will be initalized by guest driver.

The INTx/MSI part needs further definition for the user.  Are we
actually completely tearing down interrupts with the expectation that
the user will re-enable them or are we just masking them such that the
user needs to unmask?  Also note that not all devices support DisINTx.

Otherwise it seems like a reasonable approach, but I can't guarantee we
won't find new issues along the way.  For instance we'll need to test
how -EAGAIN returns interact with existing QEMU and maybe decided
whether there are cases that are better handled by doing an
interruptible wait.  Thanks,

Alex

Reply via email to