On Tue, May 30, 2017 at 11:13:36PM -0700, Ray Jui wrote:
> I did a little more digging myself and I think I now understand what you
> meant by identity mapping, i.e., configuring the MMU-500 with 1:1 mapping
> between the DMA address and the IOVA address.
>
> I think that should work. In the end, due to this MSI write parsing issue in
> our PCIe controller, the reason to use IOMMU is to allow the cache
> attributes (AxCACHE) of the MSI writes towards GICv3 ITS to be modified by
> the IOMMU to be device type, while leaving the rest of inbound reads/writes
> from/to DDR with more optimized cache attributes setting, to allow I/O
> coherency to be still enabled for the PCIe controller. In fact, the PCIe
> controller itself is fully capable of DMA to/from the full address space of
> our SoC including both DDR and any device memory.
>
> The 1:1 mapping will still pose some translation overhead like you
> suggested; however, the overhead of allocating page tables and locking will
> be gone. This sounds like the best possible option I have currently.

It might end up being pretty invasive to work around a hardware bug, so
we'll have to see what it looks like. Ideally, we could just use the SMMU
for everything as-is and work on clawing back the lost performance (it
should be possible to get ~95% of the perf if we sort out the locking, which
we *are* working on).

> May I ask, how do I start to try to get this identity mapping to work as an
> experiment and proof of concept? Any pointer or advise is highly appreciated
> as you can see I'm not very experienced with this. I found Will recently
> added the IOMMU_DOMAIN_IDENTITY support to the arm-smmu driver. But I
> suppose that is to bypass the SMMU completely, instead of still going
> through the MMU with 1:1 translation. Is my understanding correct?

Yes, I don't think IOMMU_DOMAIN_IDENTITY is what you need because you
actually need per-page control of memory attributes.

Robin might have a better idea, but I think you'll have to hack dma-iommu.c
so that you can have a version of the DMA ops that:

  * Initialises the identity map (I guess as normal WB cacheable?)
  * Reserves and maps the MSI region appropriately
  * Just returns the physical address for the dma address for map requests
    (return error for the MSI region)
  * Does nothing for unmap requests

But my strong preference would be to fix the locking overhead from the SMMU
so that the perf hit is acceptable.

Will
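
For illustration only, the four DMA-op behaviours listed above can be modelled
in plain user-space C. This is a rough sketch, not kernel code and not a patch
against dma-iommu.c: every name in it (struct identity_domain,
identity_domain_init, identity_map, identity_unmap, IDENTITY_MAP_ERROR) is
hypothetical, the MSI window address used in the usage note is a made-up
value, and the actual page-table setup with WB-cacheable and device
attributes is only described in comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the kernel types; assumptions for this sketch only. */
typedef uint64_t phys_addr_t;
typedef uint64_t dma_addr_t;

/* Hypothetical error cookie returned by identity_map() on failure. */
#define IDENTITY_MAP_ERROR (~(dma_addr_t)0)

/* Hypothetical per-device state: just the reserved MSI doorbell window. */
struct identity_domain {
	phys_addr_t msi_base;
	uint64_t msi_size;
};

/*
 * Steps 1 and 2: a real implementation would populate a 1:1 map here
 * (possibly as normal WB cacheable) and map the MSI window with device
 * attributes. This model only records the reserved window.
 */
void identity_domain_init(struct identity_domain *d,
			  phys_addr_t msi_base, uint64_t msi_size)
{
	assert(msi_size > 0);
	assert(msi_size - 1 <= UINT64_MAX - msi_base);
	d->msi_base = msi_base;
	d->msi_size = msi_size;
}

/* True if [pa, pa + len) intersects the reserved MSI window. */
static bool overlaps_msi(const struct identity_domain *d,
			 phys_addr_t pa, uint64_t len)
{
	phys_addr_t pa_end = pa + (len - 1);	/* inclusive end */
	phys_addr_t msi_end = d->msi_base + (d->msi_size - 1);

	return pa <= msi_end && d->msi_base <= pa_end;
}

/*
 * Step 3: map requests return the physical address unchanged as the DMA
 * address, and fail for anything touching the MSI window.
 */
dma_addr_t identity_map(const struct identity_domain *d,
			phys_addr_t pa, uint64_t len)
{
	/* Reject empty requests and ranges that wrap the address space. */
	if (len == 0 || len - 1 > UINT64_MAX - pa)
		return IDENTITY_MAP_ERROR;
	if (overlaps_msi(d, pa, len))
		return IDENTITY_MAP_ERROR;
	return (dma_addr_t)pa;
}

/* Step 4: unmap has nothing to tear down, so it does nothing. */
void identity_unmap(const struct identity_domain *d,
		    dma_addr_t addr, uint64_t len)
{
	(void)d;
	(void)addr;
	(void)len;
}
```

With a domain initialised for a made-up MSI window at 0x08000000 of size
0x10000, identity_map() of a 4096-byte buffer at 0x80000000 returns
0x80000000, while any request overlapping the window returns
IDENTITY_MAP_ERROR. No per-request allocation or locking is involved, which
is the property being traded against the remaining translation overhead.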