On Sun, 2014-06-08 at 09:49 +1000, Alexey Kardashevskiy wrote: > On 06/07/2014 08:47 AM, Alexey Kardashevskiy wrote: > > On 06/07/2014 02:57 AM, Alex Williamson wrote: > >> On Fri, 2014-06-06 at 13:34 +1000, Alexey Kardashevskiy wrote: > >>> This turns the sPAPR support on and enables VFIO container use > >>> in the kernel. > >>> > >>> This extends vfio_connect_container to support VFIO_SPAPR_TCE_IOMMU type > >>> in the host kernel. > >>> > >>> This registers a memory listener which sPAPR IOMMU will notify when > >>> executing H_PUT_TCE/etc DMA calls. The listener then will notify the host > >>> kernel about DMA map/unmap operation via VFIO_IOMMU_MAP_DMA/ > >>> VFIO_IOMMU_UNMAP_DMA ioctls. > >>> > >>> This executes VFIO_IOMMU_ENABLE ioctl to make sure that the IOMMU is free > >>> of mappings and can be exclusively given to the user. At the moment SPAPR > >>> is the only platform requiring this call to be implemented. > >>> > >>> Note that the host kernel function implementing VFIO_IOMMU_DISABLE > >>> is called automatically when container's fd is closed so there is > >>> no need to call it explicitly from QEMU. This may change in the future. > >> > >> So you're saying we rely on a behavior that may change in the future... > >> The kernel must do cleanup, it can never rely on userspace to do the > >> right thing or we have a bug. Does that mean we can remove the "may > >> change in the future" part of this comment or is that directed at QEMU's > >> behavior and not the kernel's? > > > > > > I wanted to say that if/when we support PCI hotplug or dynamic IOMMU group > > reconfiguration (we might be able to do so in POWER9 or later, do not know > > details), we will call DISABLE explicitly. > > > Any better? > === > Note that the host kernel function implementing VFIO_IOMMU_DISABLE > is called automatically when container's fd is closed so there is > no need to call it explicitly from QEMU. We may need to call > VFIO_IOMMU_DISABLE explicitely in the future for some sort of dynamic > reconfiguration (PCI hotplug or dynamic IOMMU group management). > ===
That sounds better. I'd also appreciate a comment in the code where enable is called stating that disable is implicit in closing the container fd. Thanks, Alex > > > >> A comment in the code where we setup the > >> release handler or call ENABLE would be a good idea too. Thanks, > > > > > > > > > >> > >> Alex > >> > >>> Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> > >>> --- > >>> Changes: > >>> v8: > >>> * added note about VFIO_IOMMU_DISABLE in the commit log > >>> > >>> v7: > >>> * added more details in commit log > >>> > >>> v5: > >>> * multiple returns converted to gotos > >>> > >>> v4: > >>> * fixed format string to use %m which is a glibc extension: > >>> "Print output of strerror(errno). No argument is required." > >>> --- > >>> hw/misc/vfio.c | 28 ++++++++++++++++++++++++++++ > >>> 1 file changed, 28 insertions(+) > >>> > >>> diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c > >>> index bb77934..78d2045 100644 > >>> --- a/hw/misc/vfio.c > >>> +++ b/hw/misc/vfio.c > >>> @@ -3650,6 +3650,34 @@ static int vfio_connect_container(VFIOGroup > >>> *group, AddressSpace *as) > >>> > >>> container->iommu_data.type1.initialized = true; > >>> > >>> + } else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU)) { > >>> + ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &fd); > >>> + if (ret) { > >>> + error_report("vfio: failed to set group container: %m"); > >>> + ret = -errno; > >>> + goto free_container_exit; > >>> + } > >>> + > >>> + ret = ioctl(fd, VFIO_SET_IOMMU, VFIO_SPAPR_TCE_IOMMU); > >>> + if (ret) { > >>> + error_report("vfio: failed to set iommu for container: %m"); > >>> + ret = -errno; > >>> + goto free_container_exit; > >>> + } > >>> + > >>> + ret = ioctl(fd, VFIO_IOMMU_ENABLE); > >>> + if (ret) { > >>> + error_report("vfio: failed to enable container: %m"); > >>> + ret = -errno; > >>> + goto free_container_exit; > >>> + } > >>> + > >>> + container->iommu_data.type1.listener = vfio_memory_listener; > >>> + container->iommu_data.release = vfio_listener_release; > >>> + > >>> + memory_listener_register(&container->iommu_data.type1.listener, > >>> + container->space->as); > >>> + > >>> } else { > >>> error_report("vfio: No available IOMMU models"); > >>> ret = -EINVAL; > >> > >> > >> > > > > > >