On Wed, 13 Jul 2016 09:04:16 +0800
Zhou Jie <zhoujie2...@cn.fujitsu.com> wrote:

> Hi Alex,
> 
> >> I will use workable state support flag
> >> to let user know whether the kenerl support block feature.
> >> And make configure space writing and ioctl function blocked.  
> >
> > And what of my suggestion that a user may desire to poll the state of
> > the device?  
> I will also add a poll function to vfio_fops.

Can you explain how this will work?  I was only suggesting that one of
the flag bits in vfio_device_info be allocated to report the current
state of blocking and the user could poll by repeatedly calling the
DEVICE_INFO ioctl.  Are you thinking of using POLLOUT/POLLIN?  I'm not
sure if those are a perfect match since it's really only the PCI config
region and a few ioctls where access is blocked, other operations may
proceed normally.
 
> > A user does know what the vfio driver has done if you define the
> > behavior that on an AER error reported event, as signaled to the user
> > via the error notification interrupt, vfio-pci will teardown device
> > interrupts to an uninitialized state.  The difference between the
> > command register approach you suggest and the teardown I suggest is
> > that the command register is simply masking interrupt deliver while the
> > teardown approach returns the device to an uninitialized interrupt
> > state.  Take a look at the device state when a bus reset occurs, what
> > state is saved and restored and what is left at a default PCI value.
> > The command register is saved and restored, so any manipulation we do
> > of it is racing the host kernel AER handling and bus reset.  What about
> > MSI and MSI-X?  Looks to me like those are left at the PCI default
> > initialization state, so now after an AER error we have irq handlers
> > and eventfds configured, while in fact the device has been
> > de-programmed.  To handle that we're expecting users to teardown the
> > interrupt state and re-establish it?  Again, why not just teardown the
> > interrupt state ourselves?  I dont' see the value in simply masking the
> > command register, especially when it doesn't handle the no-DisINTx case.  
> I understand.
> Thank you very much to explain this to me.
> I will teardown the interrupt state.
> 
> > We cannot depend on the behavior of any given driver and the fact that
> > the guest driver may teardown interrupts anyway is not a justification
> > that vfio shouldn't be doing this to make the device state presented to
> > the user consistent.  Thanks,  
> I understand.
> 
> The following code will be modified.
> 1. vfio_pci_ioctl
>     add a flag in vfio_device_info for workable_state support
> 2. vfio_pci_ioctl
>     During err occurs and resume:
>     if (cmd == VFIO_DEVICE_SET_IRQS || VFIO_DEVICE_RESET
>     || VFIO_DEVICE_GET_PCI_HOT_RESET_INFO || VFIO_DEVICE_PCI_HOT_RESET)
>     block for workable_state clearing
> 3. vfio_pci_write
>     During err occurs and resume:
>     block write to configure space
> 4. vfio_pci_aer_err_detected
>     Set workable_state to false in "struct vfio_pci_device"
>     teardown the interrupt
> 5. vfio_pci_aer_resume
>     Set workable_state to true in "struct vfio_pci_device"
> 6. vfio_fops
>     Add poll function

I would still suggest that the name "workable_state" is quite vague.
Something like aer_error_in_progress is much more specific.  Thanks,

Alex

Reply via email to