On 15/11/16 11:49, Joerg Roedel wrote:
> On Fri, Nov 11, 2016 at 06:30:45PM +0000, Robin Murphy wrote:
>> iommu_dma_init_domain() was originally written under the misconception
>> that dma_32bit_pfn represented some sort of size limit for IOVA
>> domains. Since the truth is almost the exact opposite of that, rework
>> the logic and comments to reflect its real purpose of optimising
>> lookups when allocating from a subset of the available space.
>>
>> Signed-off-by: Robin Murphy <robin.mur...@arm.com>
>> ---
>>  drivers/iommu/dma-iommu.c | 23 ++++++++++++++++++-----
>>  1 file changed, 18 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>> index c5ab8667e6f2..ae045a14b530 100644
>> --- a/drivers/iommu/dma-iommu.c
>> +++ b/drivers/iommu/dma-iommu.c
>> @@ -139,6 +139,7 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
>>  {
>>  	struct iova_domain *iovad = cookie_iovad(domain);
>>  	unsigned long order, base_pfn, end_pfn;
>> +	bool pci = dev && dev_is_pci(dev);
>>
>>  	if (!iovad)
>>  		return -ENODEV;
>> @@ -161,19 +162,31 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
>>  		end_pfn = min_t(unsigned long, end_pfn,
>>  				domain->geometry.aperture_end >> order);
>>  	}
>> +	/*
>> +	 * PCI devices may have larger DMA masks, but still prefer allocating
>> +	 * within a 32-bit mask to avoid DAC addressing. Such limitations don't
>> +	 * apply to the typical platform device, so for those we may as well
>> +	 * leave the cache limit at the top of the range they're likely to use.
>> +	 */
>> +	if (pci)
>> +		end_pfn = min_t(unsigned long, end_pfn,
>> +				DMA_BIT_MASK(32) >> order);
>
> Question, does it actually hurt platform devices to follow the same
> allocation strategy as pci devices? I mean, does it hurt enough to
> special-case the code here?
It hurts them in the sense that they get absolutely no benefit from trying a lower limit first (the device simply has as many address lines wired to the interconnect as it needs/can drive), so there's nothing to offset the cost of every allocation becoming try-fail-and-try-again once the sub-32-bit space fills up with, say, big GPU mappings.

The maintenance cost of a couple of one-liner if statements doesn't seem big enough to justify denying a significant set of devices straightforward and optimal allocation behaviour all the time, every time - bear in mind that the first products really using this code in anger (MT8173) don't have PCI at all.

I suppose it might bear saying that, as a Cambridge Water customer, I do tend to view PCI in the same light as USB/I2C/etc.: "just another external bus controller on the outside edge of 'the system'". From that perspective, avoiding DAC overhead really does look like the minority special case, although I appreciate that the view from the x86 angle is rather different.

I also suspect that, whether we want to or not, we may continue to see more on-chip 'PCI' devices which don't really need this either, because they're really just 'platform' devices talking directly to whatever coherent interconnect, with a config space bolted on somewhere.

Robin.
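To make the trade-off concrete, here is a toy C sketch of the two-pass behaviour under discussion. The bump-allocator model and all names are illustrative, not the kernel's real IOVA allocator: a PCI device first tries the sub-32-bit region (to avoid DAC cycles) and retries in the full range only on failure, while a platform device goes straight to a single-pass allocation from its full mask.

```c
#include <stdint.h>
#include <stdbool.h>

#define MASK_32 ((1ULL << 32) - 1)

/*
 * Toy model: one bump pointer per region, purely to show the
 * "try below 32 bits first, then retry with the full mask" pattern.
 * The real kernel allocator is an rbtree with a cached search limit.
 */
struct toy_iova_domain {
	uint64_t next_low;   /* next free address below 32 bits */
	uint64_t next_high;  /* next free address above 32 bits */
	uint64_t dma_mask;   /* device's full DMA mask */
};

static uint64_t toy_alloc(struct toy_iova_domain *d, uint64_t size, bool pci)
{
	/* First pass, PCI only: prefer 32-bit addresses to avoid DAC */
	if (pci && d->next_low + size <= MASK_32 + 1) {
		uint64_t addr = d->next_low;

		d->next_low += size;
		return addr;
	}
	/* Fallback (and the platform-device default): full range */
	if (d->next_high + size <= d->dma_mask + 1) {
		uint64_t addr = d->next_high;

		d->next_high += size;
		return addr;
	}
	return (uint64_t)-1; /* allocation failed */
}
```

Once `next_low` reaches the 4GB boundary, every PCI allocation pays for the failed first pass before succeeding in the high region - the cost that a platform device, which never benefits from the low pass, would incur for nothing.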