On 06/07/2014 08:47 AM, Alexey Kardashevskiy wrote: > On 06/07/2014 02:57 AM, Alex Williamson wrote: >> On Fri, 2014-06-06 at 13:34 +1000, Alexey Kardashevskiy wrote: >>> This turns the sPAPR support on and enables VFIO container use >>> in the kernel. >>> >>> This extends vfio_connect_container to support VFIO_SPAPR_TCE_IOMMU type >>> in the host kernel. >>> >>> This registers a memory listener which sPAPR IOMMU will notify when >>> executing H_PUT_TCE/etc DMA calls. The listener then will notify the host >>> kernel about DMA map/unmap operation via VFIO_IOMMU_MAP_DMA/ >>> VFIO_IOMMU_UNMAP_DMA ioctls. >>> >>> This executes VFIO_IOMMU_ENABLE ioctl to make sure that the IOMMU is free >>> of mappings and can be exclusively given to the user. At the moment SPAPR >>> is the only platform requiring this call to be implemented. >>> >>> Note that the host kernel function implementing VFIO_IOMMU_DISABLE >>> is called automatically when container's fd is closed so there is >>> no need to call it explicitly from QEMU. This may change in the future. >> >> So you're saying we rely on a behavior that may change in the future... >> The kernel must do cleanup, it can never rely on userspace to do the >> right thing or we have a bug. Does that mean we can remove the "may >> change in the future" part of this comment or is that directed at QEMU's >> behavior and not the kernel's? > > > I wanted to say that if/when we support PCI hotplug or dynamic IOMMU group > reconfiguration (we might be able to do so in POWER9 or later, do not know > details), we will call DISABLE explicitly.
Any better? === Note that the host kernel function implementing VFIO_IOMMU_DISABLE is called automatically when container's fd is closed so there is no need to call it explicitly from QEMU. We may need to call VFIO_IOMMU_DISABLE explicitely in the future for some sort of dynamic reconfiguration (PCI hotplug or dynamic IOMMU group management). === Thanks! > > >> A comment in the code where we setup the >> release handler or call ENABLE would be a good idea too. Thanks, > > > > >> >> Alex >> >>> Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> >>> --- >>> Changes: >>> v8: >>> * added note about VFIO_IOMMU_DISABLE in the commit log >>> >>> v7: >>> * added more details in commit log >>> >>> v5: >>> * multiple returns converted to gotos >>> >>> v4: >>> * fixed format string to use %m which is a glibc extension: >>> "Print output of strerror(errno). No argument is required." >>> --- >>> hw/misc/vfio.c | 28 ++++++++++++++++++++++++++++ >>> 1 file changed, 28 insertions(+) >>> >>> diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c >>> index bb77934..78d2045 100644 >>> --- a/hw/misc/vfio.c >>> +++ b/hw/misc/vfio.c >>> @@ -3650,6 +3650,34 @@ static int vfio_connect_container(VFIOGroup *group, >>> AddressSpace *as) >>> >>> container->iommu_data.type1.initialized = true; >>> >>> + } else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU)) { >>> + ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &fd); >>> + if (ret) { >>> + error_report("vfio: failed to set group container: %m"); >>> + ret = -errno; >>> + goto free_container_exit; >>> + } >>> + >>> + ret = ioctl(fd, VFIO_SET_IOMMU, VFIO_SPAPR_TCE_IOMMU); >>> + if (ret) { >>> + error_report("vfio: failed to set iommu for container: %m"); >>> + ret = -errno; >>> + goto free_container_exit; >>> + } >>> + >>> + ret = ioctl(fd, VFIO_IOMMU_ENABLE); >>> + if (ret) { >>> + error_report("vfio: failed to enable container: %m"); >>> + ret = -errno; >>> + goto free_container_exit; >>> + } >>> + >>> + container->iommu_data.type1.listener = vfio_memory_listener; >>> + container->iommu_data.release = vfio_listener_release; >>> + >>> + memory_listener_register(&container->iommu_data.type1.listener, >>> + container->space->as); >>> + >>> } else { >>> error_report("vfio: No available IOMMU models"); >>> ret = -EINVAL; >> >> >> > > -- Alexey