(cut-n-paste from kernel patchset)

Each Partitionable Endpoint (IOMMU group) has an address range on a PCI bus where devices are allowed to do DMA. These ranges are called DMA windows. By default, there is a single DMA window, 1GB or 2GB in size, mapped at zero on a PCI bus.
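To make the notion of a DMA window concrete, here is a minimal sketch in C. The struct and both helpers are hypothetical names used only for illustration; they are not QEMU or kernel types. The sketch assumes a window is fully described by its bus offset, its size and its IOMMU page shift.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one DMA window; not a QEMU or kernel type. */
struct dma_window {
    uint64_t bus_offset;  /* where the window starts on the PCI bus */
    uint64_t size;        /* window size in bytes */
    unsigned page_shift;  /* log2 of the IOMMU page size, e.g. 12 for 4K */
};

/* True if the whole [addr, addr + len) range lies inside the window. */
static bool dma_window_contains(const struct dma_window *w,
                                uint64_t addr, uint64_t len)
{
    if (addr < w->bus_offset || len > w->size) {
        return false;
    }
    /* Written this way to avoid overflow in addr + len. */
    return addr - w->bus_offset <= w->size - len;
}

/* Number of TCEs (one per IOMMU page) needed to cover the window. */
static uint64_t dma_window_num_tces(const struct dma_window *w)
{
    assert(w->page_shift < 64);
    return w->size >> w->page_shift;
}
```

With a default window of 1GB at offset zero and 4K pages, a device may DMA anywhere below 1GB, and the table backing the window holds 262144 entries.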
PAPR defines a DDW RTAS API which allows pseries guests to query the hypervisor about DDW support and capabilities (page size mask for now). A pseries guest may request an additional DMA window (on top of the default one) using this RTAS API. The existing pseries Linux guests request an additional window as big as the guest RAM and map the entire guest window, which effectively creates a direct mapping of the guest memory to a PCI bus.

This patchset reworks PPC64 IOMMU code and adds necessary structures to support big windows.

Once a Linux guest discovers the presence of DDW, it does:
1. query the hypervisor about the number of available windows and page size masks;
2. create a window with the biggest possible page size (today 4K/64K/16M);
3. map the entire guest RAM via H_PUT_TCE* hypercalls;
4. switch dma_ops to direct_dma_ops on the selected PE.

Once this is done, H_PUT_TCE is not called anymore for 64bit devices and the guest does not waste time on DMA map/unmap operations.

Note that 32bit devices won't use DDW and will keep using the default DMA window so KVM optimizations will be required (to be posted later).

This patchset adds DDW support for pseries. The host kernel changes are required, posted as:

[PATCH kernel v11 00/34] powerpc/iommu/vfio: Enable Dynamic DMA windows

This patchset is based on git://github.com/dgibson/qemu.git spapr-next branch.

Please comment. Thanks!
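The arithmetic behind steps 1-3 of the guest sequence described above can be sketched as follows. This is an illustration under stated assumptions, not the guest or QEMU implementation: all function names are hypothetical, the page size mask uses a simplified encoding (bit N set means a page size of 2^N bytes is supported) rather than the PAPR layout, and the window size is assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed encoding: bit N set means a page size of 2^N bytes is supported.
 * This is a simplification for the sketch, not the PAPR mask layout.
 */
static unsigned biggest_page_shift(uint64_t page_size_mask)
{
    unsigned shift = 0;

    assert(page_size_mask != 0);
    while (page_size_mask >>= 1) {
        shift++;
    }
    return shift;
}

/* Smallest shift such that a 2^shift byte window covers all guest RAM. */
static unsigned window_shift_for_ram(uint64_t ram_size)
{
    unsigned shift = 0;

    assert(ram_size != 0 && ram_size <= (1ULL << 63));
    while ((1ULL << shift) < ram_size) {
        shift++;
    }
    return shift;
}

/* Number of TCEs the guest has to program to map the whole window. */
static uint64_t window_tce_count(unsigned window_shift, unsigned page_shift)
{
    assert(window_shift >= page_shift && window_shift - page_shift < 64);
    return 1ULL << (window_shift - page_shift);
}
```

For a guest with 3GB of RAM this gives a 4GB window; with 16M pages it needs 256 TCEs, against 65536 with 64K pages, which is why the guest picks the biggest page size on offer.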

Changes:
v8:
* reworked unreferencing in "spapr_iommu: Introduce "enabled" state for TCE table"
* added clean-up patch "spapr_iommu: Remove vfio_accel flag from sPAPRTCETable"
* rebased on latest spapr-next

v7:
* bunch of cleanups, renames after David+Thomas+Michael review
* patches are reorganized and those which do not need the host kernel headers update are put first and can be pulled if these are good enough :)

v6:
* spapr-pci-vfio-host-bridge is now a synonym of spapr-pci-host-bridge - same PHB can host emulated and VFIO devices
* changed patches order
* lot of small changes

v5:
* TCE tables got "enabled" state and are persistent, i.e. not recreated every reboot
* added v2 of SPAPR_TCE_IOMMU
* fixed migration for emulated PHB with enabled DDW
* huge pile of other changes

v4:
* reimplemented the whole thing
* machine reset and ddw-reset RTAS call both remove all TCE tables and create the default one
* IOMMU group id is not needed to use VFIO PHB anymore, multiple groups are supported on the same VFIO container and virtual PHB

v3:
* removed "reset" from API now
* reworked machine versions
* applied multiple comments
* includes David's machine QOM rework as this patchset adds a new machine type

v2:
* tested on emulated PHB
* removed "ddw" machine property, now it is PHB property
* disabled by default
* defined "pseries-2.2" machine which enables DDW by default
* fixed reset() and reference counting


Alexey Kardashevskiy (14):
  vmstate: Define VARRAY with VMS_ALLOC
  vfio: spapr: Move SPAPR-related code to a separate file
  spapr_pci_vfio: Enable multiple groups per container
  spapr_pci: Convert finish_realize() to dma_capabilities_update()+dma_init_window()
  spapr_iommu: Move table allocation to helpers
  spapr_iommu: Introduce "enabled" state for TCE table
  spapr_iommu: Remove vfio_accel flag from sPAPRTCETable
  spapr_iommu: Add root memory region
  spapr_pci: Do complete reset of DMA config when resetting PHB
  spapr_vfio_pci: Remove redundant spapr-pci-vfio-host-bridge
  spapr_pci: Enable vfio-pci hotplug
  linux headers update for DDW on SPAPR
  vfio: spapr: Add SPAPR IOMMU v2 support (DMA memory preregistering)
  spapr_pci/spapr_pci_vfio: Support Dynamic DMA Windows (DDW)

 hw/ppc/Makefile.objs          |   3 +
 hw/ppc/spapr.c                |   5 +
 hw/ppc/spapr_iommu.c          | 210 ++++++++++++++++++++++------
 hw/ppc/spapr_pci.c            | 263 +++++++++++++++++++++++++----------
 hw/ppc/spapr_pci_vfio.c       | 167 ++++++++++++----------
 hw/ppc/spapr_rtas_ddw.c       | 300 ++++++++++++++++++++++++++++++++++++++++
 hw/ppc/spapr_vio.c            |   9 +-
 hw/vfio/Makefile.objs         |   1 +
 hw/vfio/common.c              | 183 +++++-------------------
 hw/vfio/spapr.c               | 315 ++++++++++++++++++++++++++++++++++++++++++
 include/hw/pci-host/spapr.h   |  49 +++++--
 include/hw/ppc/spapr.h        |  33 +++--
 include/hw/vfio/vfio-common.h |  16 +++
 include/hw/vfio/vfio.h        |   2 +-
 include/migration/vmstate.h   |  10 ++
 linux-headers/linux/vfio.h    |  88 +++++++++++-
 trace-events                  |   9 +-
 17 files changed, 1299 insertions(+), 364 deletions(-)
 create mode 100644 hw/ppc/spapr_rtas_ddw.c
 create mode 100644 hw/vfio/spapr.c

--
2.4.0.rc3.8.gfb3e7d5