On Thu, Jun 18, 2015 at 09:37:22PM +1000, Alexey Kardashevskiy wrote:
> (cut-n-paste from kernel patchset)
>
> Each Partitionable Endpoint (IOMMU group) has an address range on a PCI bus
> where devices are allowed to do DMA. These ranges are called DMA windows.
> By default, there is a single DMA window, 1 or 2GB big, mapped at zero
> on a PCI bus.
>
> PAPR defines a DDW RTAS API which allows pseries guests
> to query the hypervisor about DDW support and capabilities (page size mask
> for now). A pseries guest may request additional (to the default)
> DMA windows using this RTAS API.
> The existing pseries Linux guests request an additional window as big as
> the guest RAM and map the entire guest window which effectively creates
> direct mapping of the guest memory to a PCI bus.
>
> This patchset reworks PPC64 IOMMU code and adds necessary structures
> to support big windows.
>
> Once a Linux guest discovers the presence of DDW, it does:
> 1. query hypervisor about number of available windows and page size masks;
> 2. create a window with the biggest possible page size (today 4K/64K/16M);
> 3. map the entire guest RAM via H_PUT_TCE* hypercalls;
> 4. switch dma_ops to direct_dma_ops on the selected PE.
>
> Once this is done, H_PUT_TCE is not called anymore for 64bit devices and
> the guest does not waste time on DMA map/unmap operations.
>
> Note that 32bit devices won't use DDW and will keep using the default
> DMA window so KVM optimizations will be required (to be posted later).
>
> This patchset adds DDW support for pseries. The host kernel changes are
> required, posted as:
>
> [PATCH kernel v11 00/34] powerpc/iommu/vfio: Enable Dynamic DMA windows
>
> This patchset is based on git://github.com/dgibson/qemu.git spapr-next branch.

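
For a sense of scale on step 3 above, the number of TCE entries needed to
map all of guest RAM depends only on the RAM size and the window's page size.
The following is a rough back-of-the-envelope sketch, not code from the
patchset: it assumes one TCE per IOMMU page, and the 16 GiB guest size and the
names tces_needed and tce_counts are purely illustrative.

```python
# Illustrative arithmetic only; nothing here is taken from the patchset.
# Assumption: one TCE maps exactly one IOMMU page, so covering guest RAM
# takes ceil(ram_bytes / page_size) entries.

# Page sizes mentioned in the quoted cover letter (4K/64K/16M), in bytes.
PAGE_SIZES = {"4K": 4 << 10, "64K": 64 << 10, "16M": 16 << 20}


def tces_needed(ram_bytes: int, page_size: int) -> int:
    """Return how many TCE entries cover ram_bytes with pages of page_size."""
    if ram_bytes < 0 or page_size <= 0:
        raise ValueError("ram_bytes must be >= 0 and page_size must be > 0")
    return -(-ram_bytes // page_size)  # ceiling division on integers


def tce_counts(ram_bytes: int) -> dict:
    """Return the TCE count for each candidate page size."""
    return {name: tces_needed(ram_bytes, size) for name, size in PAGE_SIZES.items()}


if __name__ == "__main__":
    ram = 16 << 30  # hypothetical 16 GiB guest
    for name, count in tce_counts(ram).items():
        print(f"{name:>3} pages: {count} TCEs")
```

Under these assumptions the 16M page size needs 4096 times fewer entries
than 4K pages, and per the quoted text the mapping is done once rather than on
every DMA map/unmap.
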
A couple of general queries - this touches on the kernel part as well as
the qemu part:

 * Am I correct in thinking that the point of doing the
   preregistration stuff is to allow the kernel to handle PUT_TCE
   in real mode?  i.e. that the advantage of doing preregistration
   rather than accounting on the DMA_MAP and DMA_UNMAP itself only
   appears once you have kernel KVM+VFIO acceleration?

 * Do you have test numbers to show that it's still worthwhile to have
   kernel acceleration once you have a guest using DDW?  With DDW in
   play, even if PUT_TCE is slow, it should be called a lot less
   often.

The reason I ask is that the preregistration handling is a pretty big
chunk of code that inserts itself into some pretty core kernel data
structures, all for one pretty specific use case.  We only want to do
that if there's a strong justification for it.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson