On Tue, 21 Jun 2016 20:41:32 +0800
Chen Fan <fan.c...@easystack.cn> wrote:

> On 2016年06月21日 11:13, Alex Williamson wrote:
> > On Tue, 21 Jun 2016 10:16:25 +0800
> > Zhou Jie <zhoujie2...@cn.fujitsu.com> wrote:
> >  
> >> Hi, Alex
> >>  
> >>> I was really hoping to hear your opinion, or at least some further
> >>> discussion of pros and cons rather than simply parroting back my idea.  
> >> I understand.
> >>  
> >>> My current thinking is that a resume notifier to userspace is poorly
> >>> defined, it's not clear what the user can and cannot do between an
> >>> error notification and the resume notification.  
> >> Yes, do nothing between that time is better.
> >>  
> >>> One approach to solve
> >>> that might be that the kernel internally handles the resume
> >>> notifications.  Maybe that means blocking the ioctl (interruptible
> >>> timeout) until the internal resume occurs, or maybe that means
> >>> returning -EAGAIN.  
> >> I don't think it is a good idea.
> >> The kernel give the error and resume notifications, it's enough.
> >> It's up to user to how to use them.  
> > Well that's exactly why it's poorly defined.  What does a resume
> > notification signal a user that they're allowed to do?  What can they
> > not do between error and resume notification.  Clearly you had issues
> > attempting to perform a reset during this time period since it was
> > racing with the kernel reset, so is a user allowed to do a hot reset
> > between error and resume?  Where do we define it?  Do we prevent it if
> > they try?  Why?  What about the reset ioctl?  How and why is that
> > different from a hot reset?  (hint, they can be the same)  Do we define
> > that resets are not allowed between error and resume, but other
> > operations like read/write or interrupt setup ioctls are allowed? Why?
> > Clearly we can't do anything that manipulates the device between error
> > and resume since it might be lost or ineffective, but where do we
> > define it and do we need to actively enforce those rules?  I'm arguing
> > that it's poorly defined, so "it's up to the user how to use them"
> > doesn't not give me any additional confidence in that approach.  We
> > can't trust the user to be polite, we can't even trust the user not to
> > be malicious.  
> Hi Alex,
>       on kernel side, I think if we don't trust the user behaviors, we 
> should
>   disable the access of vfio-pci interface once vfio-pci driver got the 
> error_detected,
>   we should disable all access to vfio fd regardless whether the vfio-pci
>   was assigned to a VM, we also can return a EAGAIN error if user try
>   to access it during the reset period until the host reset finished.
>       on qemu side, when we got a error_detect, we pass through the
> aer error to guest directly, ignore all access to vfio-pci during this 
> time,
> when qemu need to do a hot reset, we can retry to get the info from
> the get info ioctl until we got the info that vfio-pci has been reset 
> finished,
> then do the hot_reset ioctl if need, the kernel should ensure the ioctl 
> become
> //// accessible after host reset completed.
>

That sounds pretty thorough, the sticky point there is always disabling
the device mmaps w/o a revoke interface.  Do we invalidate the pfn
range and setup a fault handler that blocks on access?  I don't think
we have a whole lot of options, either block or sigbus, but having such
a mechanism might allow us to easily put a device in a "dead" state
where the user can't touch it, which could be useful for other purposes
too.  QEMU would also need to timeout after some number of reset
attempts and assume the device is not coming back.  Plus we'd need a
device flag to indicate this behavior.  Thanks,

Alex

Reply via email to