On Thu, 14 May 2020 09:32:15 -0700 John G Johnson <john.g.john...@oracle.com> wrote:
> Thanos and I have made some changes to the doc in response to the > feedback we’ve received. The biggest difference is that it is less reliant > on the reader being familiar with the current VFIO implementation. We’d > appreciate any additional feedback you could give on the changes. Thanks > in advance. > > Thanos and JJ > > > The link remains the same: > > https://docs.google.com/document/d/1FspkL0hVEnZqHbdoqGLUpyC38rSk_7HhY471TsVwyK8/edit?usp=sharing Hi, I'm confused by VFIO_USER_ADD_MEMORY_REGION vs VFIO_USER_IOMMU_MAP_DMA. The former seems intended to provide the server with access to the entire GPA space, while the latter indicates an IOVA to GPA mapping of those regions. Doesn't this break the basic isolation of a vIOMMU? This essentially says to me "here's all the guest memory, but please only access these regions for which we're providing DMA mappings". That invites abuse. Also regarding VFIO_USER_ADD_MEMORY_REGION, it's not clear to me how "an array of file descriptors will be sent as part of the message meta-data" works. Also consider s/SUB/DEL/. Why is the Device ID in the table specified as 0? How does a client learn their Device ID? VFIO_USER_DEVICE_GET_REGION_INFO (or anything else making use of a capability chain), the cap_offset and next pointers within the chain need to specify what their offset is relative to (ie. the start of the packet, the start of the vfio compatible data structure, etc). I assume the latter for client compatibility. Also on REGION_INFO, offset is specified as "the base offset to be given to the mmap() call for regions with the MMAP attribute". Base offset from what? Is the mmap performed on the socket fd? Do we not allow read/write, we need to use VFIO_USER_MMIO_READ/WRITE instead? Why do we specify "MMIO" in those operations versus simply "REGION"? Are we arbitrarily excluding support for I/O port regions or device specific regions? If these commands replace direct read and write to an fd offset, how is PCI config space handled? VFIO_USER_MMIO_READ specifies the count field is zero and the reply will include the count specifying the amount of data read. How does the client specify how much data to read? Via message size? VFIO_USER_DMA_READ/WRITE, is the address a GPA or IOVA? IMO the device should only ever have access via IOVA, which implies a DMA mapping exists for the device. Can you provide an example of why we need these commands since there seems little point to this interface if a device cannot directly interact with VM memory. The IOMMU commands should be unnecessary, a vIOMMU should be transparent to the server by virtue that the device only knows about IOVA mappings accessible to the device. Requiring the client to expose all memory to the server implies that the server must always be trusted. Interrupt info format, s/type/index/, s/vector/subindex/ In addition to the unused ioctls, the entire concept of groups and containers are not found in this specification. To some degree that makes sense and even mdevs and typically SR-IOV VFs have a 1:1 device to group relationship. However, the container is very much involved in the development of migration support, where it's the container that provides dirty bitmaps. Since we're doing map and unmap without that container concept here, perhaps we'd equally apply those APIs to this same socket. Thanks, Alex