On Tue, 7 Jun 2016 03:03:32 +0000 "Tian, Kevin" <kevin.t...@intel.com> wrote:
> > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > Sent: Tuesday, June 07, 2016 3:31 AM
> >
> > On Mon, 6 Jun 2016 10:44:25 -0700
> > Neo Jia <c...@nvidia.com> wrote:
> >
> > > On Mon, Jun 06, 2016 at 04:29:11PM +0800, Dong Jia wrote:
> > > > On Sun, 5 Jun 2016 23:27:42 -0700
> > > > Neo Jia <c...@nvidia.com> wrote:
> > > >
> > > > 2. VFIO_DEVICE_CCW_CMD_REQUEST
> > > > This intends to handle an intercepted channel I/O instruction. It
> > > > basically need to do the following thing:
> > >
> > > May I ask how and when QEMU knows that he needs to issue such VFIO
> > > ioctl at first place?
> >
> > Yep, this is my question as well.  It sounds a bit like there's an
> > emulated device in QEMU that's trying to tell the mediated device when
> > to start an operation when we probably should be passing through
> > whatever i/o operations indicate that status directly to the mediated
> > device.  Thanks,
> >
> > Alex
>
> Below is copied from Dong's earlier post which said clear that
> a guest cmd submission will trigger the whole flow:
>
> ----
> Explanation:
> Q1-Q4: Qemu side process.
> K1-K6: Kernel side process.
>
> Q1. Intercept a ssch instruction.
> Q2. Translate the guest ccw program to a user space ccw program
>     (u_ccwchain).
> Q3. Call VFIO_DEVICE_CCW_CMD_REQUEST (u_ccwchain, orb, irb).
>     K1. Copy from u_ccwchain to kernel (k_ccwchain).
>     K2. Translate the user space ccw program to a kernel space ccw
>         program, which becomes runnable for a real device.
>     K3. With the necessary information contained in the orb passed in
>         by Qemu, issue the k_ccwchain to the device, and wait event q
>         for the I/O result.
>     K4. Interrupt handler gets the I/O result, and wakes up the wait q.
>     K5. CMD_REQUEST ioctl gets the I/O result, and uses the result to
>         update the user space irb.
>     K6. Copy irb and scsw back to user space.
> Q4. Update the irb for the guest.
> ----

Right, but this was the pre-mediated device approach.  Now we no longer
need step Q2, so we really only need Q1, and therefore Q3, to exist in
QEMU if those are operations that are not visible to the mediated
device, which they very well might be, since it's described as an
instruction rather than an i/o operation.  It's not terrible if that's
the case; vfio-pci has its own ioctl for doing a hot reset.

> My understanding is that such thing belongs to how device is mediated
> (so device driver specific), instead of something to be abstracted in
> VFIO which manages resource but doesn't care how resource is used.
>
> Actually we have same requirement in vGPU case, that a guest driver
> needs submit GPU commands through some MMIO register. vGPU device
> model will intercept the submission request (in its own way), do its
> necessary scan/audit to ensure correctness/security, and then submit
> to physical GPU through vendor specific interface.
>
> No difference with channel I/O here.

Well, if the GPU command is submitted through an MMIO register, is that
MMIO register part of the mediated device?  If so, could the mediated
device recognize the command and do the scan/audit itself?  QEMU must
not be the point at which mediation occurs for security purposes; QEMU
is userspace and userspace is not to be trusted.  I'm still open to
ioctls where it makes sense.  As above, we already have PCI-specific
ioctls, but we need to evaluate each one, why it needs to exist, and
whether we can skip it if the mediated device can trigger the action on
its own.  After all, that's why we're using the vfio api, so we can
re-use much of the existing infrastructure, especially for a vGPU that
exposes itself as a PCI device.  Thanks,

Alex
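
For illustration only, here is a rough Python model of the flow with step
Q2 dropped, where translation and validation happen on the kernel side
rather than in QEMU.  Every name in it (Ccw, Irb, MediatedDevice,
qemu_handle_ssch, the guest_to_host map) is hypothetical and simplified;
none of it is the actual vfio or vfio-ccw interface, and the device I/O
is simulated.

```python
from dataclasses import dataclass


@dataclass
class Ccw:
    """One channel command word; fields are simplified for illustration."""
    cmd: str
    addr: int  # guest address on the QEMU side, host address after K2


@dataclass
class Irb:
    """Stand-in for the interruption response block returned to the guest."""
    status: str = "pending"


class MediatedDevice:
    """Hypothetical kernel-side mediated device (steps K1-K6)."""

    def __init__(self, guest_to_host):
        # Hypothetical guest-to-host address map owned by the kernel side.
        self.guest_to_host = guest_to_host

    def cmd_request(self, guest_chain, orb, irb):
        # K1: copy the chain so later userspace changes cannot affect it.
        k_chain = [Ccw(c.cmd, c.addr) for c in guest_chain]
        # K2: translate in the kernel; userspace input is not trusted,
        # so any address outside the map is rejected.
        for ccw in k_chain:
            if ccw.addr not in self.guest_to_host:
                irb.status = "rejected"
                return irb
            ccw.addr = self.guest_to_host[ccw.addr]
        # K3-K4: issue to the device and wait for the result (simulated).
        result = self._issue(k_chain, orb)
        # K5-K6: hand the result back through the irb.
        irb.status = result
        return irb

    def _issue(self, k_chain, orb):
        # Placeholder for the real device I/O; always succeeds here.
        return "ok"


def qemu_handle_ssch(device, guest_chain, orb):
    """Q1 (intercept), Q3 (call into the device), Q4 (update guest irb)."""
    return device.cmd_request(guest_chain, orb, Irb())
```

The point of the sketch is only where the check lives: the userspace
caller forwards the guest's chain untouched, and the mediated device
decides whether it is acceptable.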