Re: [PATCH v2 00/33] iommu: Move iommu_group setup to IOMMU core code
On Tue Apr 14 20, Joerg Roedel wrote: Hi, here is the second version of this patch-set. The first version with some more introductory text can be found here: https://lore.kernel.org/lkml/20200407183742.4344-1-j...@8bytes.org/ Changes v1->v2: * Rebased to v5.7-rc1 * Re-wrote the arm-smmu changes as suggested by Robin Murphy * Re-worked the Exynos patches to hopefully not break the driver anymore * Fixed a missing mutex_unlock() reported by Marek Szyprowski, thanks for that. There is also a git-branch available with these patches applied: https://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git/log/?h=iommu-probe-device-v2 Please review. Thanks, Joerg Joerg Roedel (32): iommu: Move default domain allocation to separate function iommu/amd: Implement iommu_ops->def_domain_type call-back iommu/vt-d: Wire up iommu_ops->def_domain_type iommu/amd: Remove dma_mask check from check_device() iommu/amd: Return -ENODEV in add_device when device is not handled by IOMMU iommu: Add probe_device() and remove_device() call-backs iommu: Move default domain allocation to iommu_probe_device() iommu: Keep a list of allocated groups in __iommu_probe_device() iommu: Move new probe_device path to separate function iommu: Split off default domain allocation from group assignment iommu: Move iommu_group_create_direct_mappings() out of iommu_group_add_device() iommu: Export bus_iommu_probe() and make is safe for re-probing iommu/amd: Remove dev_data->passthrough iommu/amd: Convert to probe/release_device() call-backs iommu/vt-d: Convert to probe/release_device() call-backs iommu/arm-smmu: Convert to probe/release_device() call-backs iommu/pamu: Convert to probe/release_device() call-backs iommu/s390: Convert to probe/release_device() call-backs iommu/virtio: Convert to probe/release_device() call-backs iommu/msm: Convert to probe/release_device() call-backs iommu/mediatek: Convert to probe/release_device() call-backs iommu/mediatek-v1 Convert to probe/release_device() call-backs iommu/qcom: Convert to probe/release_device() call-backs iommu/rockchip: Convert to probe/release_device() call-backs iommu/tegra: Convert to probe/release_device() call-backs iommu/renesas: Convert to probe/release_device() call-backs iommu/omap: Remove orphan_dev tracking iommu/omap: Convert to probe/release_device() call-backs iommu/exynos: Use first SYSMMU in controllers list for IOMMU core iommu/exynos: Convert to probe/release_device() call-backs iommu: Remove add_device()/remove_device() code-paths iommu: Unexport iommu_group_get_for_dev() Sai Praneeth Prakhya (1): iommu: Add def_domain_type() callback in iommu_ops drivers/iommu/amd_iommu.c | 97 drivers/iommu/amd_iommu_types.h | 1 - drivers/iommu/arm-smmu-v3.c | 38 +-- drivers/iommu/arm-smmu.c| 39 ++-- drivers/iommu/exynos-iommu.c| 24 +- drivers/iommu/fsl_pamu_domain.c | 22 +- drivers/iommu/intel-iommu.c | 68 +- drivers/iommu/iommu.c | 393 +--- drivers/iommu/ipmmu-vmsa.c | 60 ++--- drivers/iommu/msm_iommu.c | 34 +-- drivers/iommu/mtk_iommu.c | 24 +- drivers/iommu/mtk_iommu_v1.c| 50 ++-- drivers/iommu/omap-iommu.c | 99 ++-- drivers/iommu/qcom_iommu.c | 24 +- drivers/iommu/rockchip-iommu.c | 26 +-- drivers/iommu/s390-iommu.c | 22 +- drivers/iommu/tegra-gart.c | 24 +- drivers/iommu/tegra-smmu.c | 31 +-- drivers/iommu/virtio-iommu.c| 41 +--- include/linux/iommu.h | 21 +- 20 files changed, 533 insertions(+), 605 deletions(-) -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu Hi Joerg, With this patchset, I have an epyc system where if I boot with iommu=nopt and force a dump I will see some io page faults for a nic on the system. The vmcore is harvested and the system reboots. I haven't reproduced it on other systems yet, but without the patchset I don't see the io page faults during the kdump. Regards, Jerry ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 0/8] Convert the intel iommu driver to the dma-iommu api
On 2020-05-29 3:11 p.m., Marek Szyprowski wrote: > Patches are pending: > https://lore.kernel.org/linux-iommu/20200513132114.6046-1-m.szyprow...@samsung.com/T/ Cool, nice! Though, I still don't think that fixes the issue in i915_scatterlist.h given it still ignores sg_dma_len() and strictly relies on sg_next()/sg_is_last() to stop iterating -- and I suspect this is the bug that got in Tom's way. >> However, as Robin pointed out, there are other ugly tricks like stopping >> iterating through the SGL when sg_dma_len() is zero. For example, the >> AMD driver appears to use drm_prime_sg_to_page_addr_arrays() which does >> this trick and thus likely isn't buggy (otherwise, I'd expect someone to >> have complained by now seeing AMD has already switched to IOMMU-DMA. > > I'm not sure that this is a trick. Stopping at zero sg_dma_len() was > somewhere documented. Well whatever you want to call it, it is ugly to have some drivers doing one thing with the returned value and others assuming there's an extra zero at the end. It just causes confusion for people reading/copying the code. It would be better if they are all consistent. However, I concede stopping at zero should not be broken, presently. Logan ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 0/8] Convert the intel iommu driver to the dma-iommu api
Hi Logan, On 29.05.2020 21:05, Logan Gunthorpe wrote: > On 2020-05-29 6:45 a.m., Christoph Hellwig wrote: >> On Thu, May 28, 2020 at 06:00:44PM -0600, Logan Gunthorpe wrote: This issue is most likely in the i915 driver and is most likely caused by the driver not respecting the return value of the dma_map_ops::map_sg function. You can see the driver ignoring the return value here: https://protect2.fireeye.com/url?k=ca25a34b-97f7b813-ca242804-0cc47a31c8b4-0ecdffc9f56851e1&q=1&u=https%3A%2F%2Fgithub.com%2Ftorvalds%2Flinux%2Fblob%2F7e0165b2f1a912a06e381e91f0f4e495f4ac3736%2Fdrivers%2Fgpu%2Fdrm%2Fi915%2Fgem%2Fi915_gem_dmabuf.c%23L51 Previously this didn’t cause issues because the intel map_sg always returned the same number of elements as the input scatter gather list but with the change to this dma-iommu api this is no longer the case. I wasn’t able to track the bug down to a specific line of code unfortunately. >> Mark did a big audit into the map_sg API abuse and initially had >> some i915 patches, but then gave up on them with this comment: >> >> "The biggest TODO is DRM/i915 driver and I don't feel brave enough to fix >> it fully. The driver creatively uses sg_table->orig_nents to store the >> size of the allocate scatterlist and ignores the number of the entries >> returned by dma_map_sg function. In this patchset I only fixed the >> sg_table objects exported by dmabuf related functions. I hope that I >> didn't break anything there." >> >> it would be really nice if the i915 maintainers could help with sorting >> that API abuse out. >> > I agree completely that the API abuse should be sorted out, but I think > that's much larger than just the i915 driver. Pretty much every dma-buf > map_dma_buf implementation I looked at ignores the returned nents of > sg_attrs. This sucks, but I don't think it's the bug Tom ran into. See: > > amdgpu_dma_buf_map > armada_gem_prime_map_dma_buf > drm_gem_map_dma_buf > i915_gem_map_dma_buf > tegra_gem_prime_map_dma_buf > > So this should probably be addressed by the whole GPU community. Patches are pending: https://lore.kernel.org/linux-iommu/20200513132114.6046-1-m.szyprow...@samsung.com/T/ > However, as Robin pointed out, there are other ugly tricks like stopping > iterating through the SGL when sg_dma_len() is zero. For example, the > AMD driver appears to use drm_prime_sg_to_page_addr_arrays() which does > this trick and thus likely isn't buggy (otherwise, I'd expect someone to > have complained by now seeing AMD has already switched to IOMMU-DMA. I'm not sure that this is a trick. Stopping at zero sg_dma_len() was somewhere documented. > As I tried to point out in my previous email, i915 does not do this > trick. In fact, it completely ignores sg_dma_len() and is thus > completely broken. See i915_scatterlist.h and the __sgt_iter() function. > So it doesn't sound to me like Mark's fix would address the issue at > all. Per my previous email, I'd guess that it can be fixed simply by > adjusting the __sgt_iter() function to do something more sensible. > (Better yet, if possible, ditch __sgt_iter() and use the common DRM > function that AMD uses). > > This would at least allow us to make progress with Tom's IOMMU-DMA patch > set and once that gets in, it will be harder for other drivers to make > the same mistake. Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [git pull] IOMMU Fixes for Linux v5.7-rc7
The pull request you sent on Fri, 29 May 2020 20:58:41 +0200: > git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git > tags/iommu-fixes-v5.7-rc7 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/b58f2140ea8605ee6ea0530d9c0cb5d049f9c7ca Thank you! -- Deet-doot-dot, I am a bot. https://korg.wiki.kernel.org/userdoc/prtracker ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] iommu/amd: Fix event counter availability check
The driver performs an extra check if the IOMMU's capabilities advertise presence of performance counters: it verifies that counters are writable by writing a hard-coded value to a counter and testing that reading that counter gives back the same value. Unfortunately it does so quite early, even before pci_enable_device is called for the IOMMU, i.e. when accessing its MMIO space is not guaranteed to work. On Ryzen 4500U CPU, this actually breaks the test: the driver assumes the counters are not writable, and disables the functionality. Moving init_iommu_perf_ctr just after iommu_flush_all_caches resolves the issue. This is the earliest point in amd_iommu_init_pci where the call succeeds on my laptop. Signed-off-by: Alexander Monakov Cc: Joerg Roedel Cc: Suravee Suthikulpanit Cc: iommu@lists.linux-foundation.org --- PS. I'm seeing another hiccup with IOMMU probing on my system: pci :00:00.2: can't derive routing for PCI INT A pci :00:00.2: PCI INT A: not connected Hopefully I can figure it out, but I'd appreciate hints. drivers/iommu/amd_iommu_init.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c index 5b81fd16f5fa..1b7ec6b6a282 100644 --- a/drivers/iommu/amd_iommu_init.c +++ b/drivers/iommu/amd_iommu_init.c @@ -1788,8 +1788,6 @@ static int __init iommu_init_pci(struct amd_iommu *iommu) if (iommu->cap & (1UL << IOMMU_CAP_NPCACHE)) amd_iommu_np_cache = true; - init_iommu_perf_ctr(iommu); - if (is_rd890_iommu(iommu->dev)) { int i, j; @@ -1891,8 +1889,10 @@ static int __init amd_iommu_init_pci(void) init_device_table_dma(); - for_each_iommu(iommu) + for_each_iommu(iommu) { iommu_flush_all_caches(iommu); + init_iommu_perf_ctr(iommu); + } if (!ret) print_iommu_info(); base-commit: 75caf310d16cc5e2f851c048cd597f5437013368 -- 2.26.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 0/8] Convert the intel iommu driver to the dma-iommu api
On 2020-05-29 6:45 a.m., Christoph Hellwig wrote: > On Thu, May 28, 2020 at 06:00:44PM -0600, Logan Gunthorpe wrote: >>> This issue is most likely in the i915 driver and is most likely caused by >>> the driver not respecting the return value of the dma_map_ops::map_sg >>> function. You can see the driver ignoring the return value here: >>> https://github.com/torvalds/linux/blob/7e0165b2f1a912a06e381e91f0f4e495f4ac3736/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c#L51 >>> >>> Previously this didn’t cause issues because the intel map_sg always >>> returned the same number of elements as the input scatter gather list but >>> with the change to this dma-iommu api this is no longer the case. I wasn’t >>> able to track the bug down to a specific line of code unfortunately. > > Mark did a big audit into the map_sg API abuse and initially had > some i915 patches, but then gave up on them with this comment: > > "The biggest TODO is DRM/i915 driver and I don't feel brave enough to fix > it fully. The driver creatively uses sg_table->orig_nents to store the > size of the allocate scatterlist and ignores the number of the entries > returned by dma_map_sg function. In this patchset I only fixed the > sg_table objects exported by dmabuf related functions. I hope that I > didn't break anything there." > > it would be really nice if the i915 maintainers could help with sorting > that API abuse out. > I agree completely that the API abuse should be sorted out, but I think that's much larger than just the i915 driver. Pretty much every dma-buf map_dma_buf implementation I looked at ignores the returned nents of sg_attrs. This sucks, but I don't think it's the bug Tom ran into. See: amdgpu_dma_buf_map armada_gem_prime_map_dma_buf drm_gem_map_dma_buf i915_gem_map_dma_buf tegra_gem_prime_map_dma_buf So this should probably be addressed by the whole GPU community. However, as Robin pointed out, there are other ugly tricks like stopping iterating through the SGL when sg_dma_len() is zero. For example, the AMD driver appears to use drm_prime_sg_to_page_addr_arrays() which does this trick and thus likely isn't buggy (otherwise, I'd expect someone to have complained by now seeing AMD has already switched to IOMMU-DMA. As I tried to point out in my previous email, i915 does not do this trick. In fact, it completely ignores sg_dma_len() and is thus completely broken. See i915_scatterlist.h and the __sgt_iter() function. So it doesn't sound to me like Mark's fix would address the issue at all. Per my previous email, I'd guess that it can be fixed simply by adjusting the __sgt_iter() function to do something more sensible. (Better yet, if possible, ditch __sgt_iter() and use the common DRM function that AMD uses). This would at least allow us to make progress with Tom's IOMMU-DMA patch set and once that gets in, it will be harder for other drivers to make the same mistake. Logan ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[git pull] IOMMU Fixes for Linux v5.7-rc7
Hi Linus, The following changes since commit 9cb1fd0efd195590b828b9b865421ad345a4a145: Linux 5.7-rc7 (2020-05-24 15:32:54 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git tags/iommu-fixes-v5.7-rc7 for you to fetch changes up to 7cc31613734c4870ae32f5265d576ef296621343: iommu: Fix reference count leak in iommu_group_alloc. (2020-05-29 15:27:50 +0200) IOMMU Fixes for Linux v5.7-rc7 Including: - Two compile test fixes for issues introduced during the 5.7-rc1 merge window. - A fix for a reference count leak in an error path of iommu_group_alloc(). Krzysztof Kozlowski (2): ia64: Hide the archdata.iommu field behind generic IOMMU_API x86: Hide the archdata.iommu field behind generic IOMMU_API Qiushi Wu (1): iommu: Fix reference count leak in iommu_group_alloc. arch/ia64/include/asm/device.h | 2 +- arch/x86/include/asm/device.h | 2 +- drivers/iommu/iommu.c | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) Please pull. Thanks, Joerg signature.asc Description: Digital signature ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 00/14] PCI: brcmstb: enable PCIe for STB chips
On Fri, May 29, 2020 at 1:49 PM Rob Herring wrote: > > On Tue, May 26, 2020 at 03:12:39PM -0400, Jim Quinlan wrote: > > v2: > > Commit: "device core: Add ability to handle multiple dma offsets" > > o Added helper func attach_dma_pfn_offset_map() in address.c (Chistoph) > > o Helpers funcs added to __phys_to_dma() & __dma_to_phys() (Christoph) > > o Added warning when multiple offsets are needed and !DMA_PFN_OFFSET_MAP > > o dev->dma_pfn_map => dev->dma_pfn_offset_map > > o s/frm/from/ for dma_pfn_offset_frm_{phys,dma}_addr() (Christoph) > > o In device.h: s/const void */const struct dma_pfn_offset_region */ > > o removed 'unlikely' from unlikely(dev->dma_pfn_offset_map) since > > guarded by CONFIG_DMA_PFN_OFFSET_MAP (Christoph) > > o Since dev->dma_pfn_offset is copied in usb/core/{usb,message}.c, now > > dev->dma_pfn_offset_map is copied as well. > > o Merged two of the DMA commits into one (Christoph). > > > > Commit "arm: dma-mapping: Invoke dma offset func if needed": > > o Use helper functions instead of #if CONFIG_DMA_PFN_OFFSET > > > > Other commits' changes: > > o Removed need for carrying of_id var in priv (Nicolas) > > o Commit message rewordings (Bjorn) > > o Commit log messages filled to 75 chars (Bjorn) > > o devm_reset_control_get_shared()) > > => devm_reset_control_get_optional_shared (Philipp) > > o Add call to reset_control_assert() in PCIe remove routines (Philipp) > > > > v1: > > This patchset expands the usefulness of the Broadcom Settop Box PCIe > > controller by building upon the PCIe driver used currently by the > > Raspbery Pi. Other forms of this patchset were submitted by me years > > ago and not accepted; the major sticking point was the code required > > for the DMA remapping needed for the PCIe driver to work [1]. > > > > There have been many changes to the DMA and OF subsystems since that > > time, making a cleaner and less intrusive patchset possible. This > > patchset implements a generalization of "dev->dma_pfn_offset", except > > that instead of a single scalar offset it provides for multiple > > offsets via a function which depends upon the "dma-ranges" property of > > the PCIe host controller. This is required for proper functionality > > of the BrcmSTB PCIe controller and possibly some other devices. > > If you can enable the h/w support without the multiple offset support, > then I'd split up this series. The latter part might take a bit more > time. > > Rob Unfortunately, the STB PCIe controller depends on the multiple PFN offset functionality. Thanks, Jim ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 09/14] device core: Add ability to handle multiple dma offsets
On Fri, May 29, 2020 at 1:35 PM Rob Herring wrote: > > On Wed, May 27, 2020 at 9:43 AM Jim Quinlan > wrote: > > > > Hi Nicolas, > > > > On Wed, May 27, 2020 at 11:00 AM Nicolas Saenz Julienne > > wrote: > > > > > > Hi Jim, > > > one thing comes to mind, there is a small test suite in > > > drivers/of/unittest.c > > > (specifically of_unittest_pci_dma_ranges()) you could extend it to > > > include your > > > use cases. > > Sure, will check out. > > > > > > On Tue, 2020-05-26 at 15:12 -0400, Jim Quinlan wrote: > > > > The new field in struct device 'dma_pfn_offset_map' is used to > > > > facilitate > > > > the use of multiple pfn offsets between cpu addrs and dma addrs. It is > > > > similar to 'dma_pfn_offset' except that the offset chosen depends on the > > > > cpu or dma address involved. > > > > > > > > Signed-off-by: Jim Quinlan > > > > --- > > > > drivers/of/address.c| 65 +++-- > > > > drivers/usb/core/message.c | 3 ++ > > > > drivers/usb/core/usb.c | 3 ++ > > > > include/linux/device.h | 10 +- > > > > include/linux/dma-direct.h | 10 -- > > > > include/linux/dma-mapping.h | 46 ++ > > > > kernel/dma/Kconfig | 13 > > > > 7 files changed, 144 insertions(+), 6 deletions(-) > > > > > > > > > > [...] > > > > > > > @@ -977,10 +1020,19 @@ int of_dma_get_range(struct device *dev, struct > > > > device_node *np, u64 *dma_addr, > > > > pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n", > > > >range.bus_addr, range.cpu_addr, range.size); > > > > > > > > + num_ranges++; > > > > if (dma_offset && range.cpu_addr - range.bus_addr != > > > > dma_offset) > > > > { > > > > - pr_warn("Can't handle multiple dma-ranges with > > > > different > > > > offsets on node(%pOF)\n", node); > > > > - /* Don't error out as we'd break some existing > > > > DTs */ > > > > - continue; > > > > + if (!IS_ENABLED(CONFIG_DMA_PFN_OFFSET_MAP)) { > > > > + pr_warn("Can't handle multiple dma-ranges > > > > with > > > > different offsets on node(%pOF)\n", node); > > > > + pr_warn("Perhaps set > > > > DMA_PFN_OFFSET_MAP=y?\n"); > > > > + /* > > > > + * Don't error out as we'd break some > > > > existing > > > > + * DTs that are using configs w/o > > > > + * CONFIG_DMA_PFN_OFFSET_MAP set. > > > > + */ > > > > + continue; > > > > > > dev->bus_dma_limit is set in of_dma_configure(), this function's caller, > > > based > > > on dma_start's value (set after this continue). So you'd be effectively > > > setting > > > the dev->bus_dma_limit to whatever we get from the first dma-range. > > I'm not seeing that at all. On the evaluation of each dma-range, > > dma_start and dma_end are re-evaluated to be the lowest and highest > > bus values of the dma-ranges seen so far. After all dma-ranges are > > examined, dev->bus_dma_limit being set to the highest. In fact, the > > current code -- ie before my commits -- already does this for multiple > > dma-ranges as long as the cpu-bus offset is the same in the > > dma-ranges. > > > > > > This can be troublesome depending on how the dma-ranges are setup, for > > > example > > > if the first dma-range doesn't include the CMA area, in arm64 generally > > > set as > > > high as possible in ZONE_DMA32, that would render it useless for > > > dma/{direct/swiotlb}. Again depending on the bus_dma_limit value, if > > > smaller > > > than ZONE_DMA you'd be unable to allocate any DMA memory. > > > > > > IMO, a solution to this calls for a revamp of dma-direct's dma_capable(): > > > match > > > the target DMA memory area with each dma-range we have to see if it fits. > > > > > > > + } > > > > + dma_multi_pfn_offset = true; > > > > } > > > > dma_offset = range.cpu_addr - range.bus_addr; > > > > > > > > @@ -991,6 +1043,13 @@ int of_dma_get_range(struct device *dev, struct > > > > device_node *np, u64 *dma_addr, > > > > dma_end = range.bus_addr + range.size; > > > > } > > > > > > > > + if (dma_multi_pfn_offset) { > > > > + dma_offset = 0; > > > > + ret = attach_dma_pfn_offset_map(dev, node, num_ranges); > > > > + if (ret) > > > > + return ret; > > > > + } > > > > + > > > > if (dma_start >= dma_end) { > > > > ret = -EINVAL; > > > > pr_debug("Invalid DMA ranges configuration on > > > > node(%pOF)\n", > > > > diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c > > > > index 6197938dcc2d..aaa3e58f5eb4 100644 > > > > ---
Re: [PATCH v2 00/14] PCI: brcmstb: enable PCIe for STB chips
On Tue, May 26, 2020 at 03:12:39PM -0400, Jim Quinlan wrote: > v2: > Commit: "device core: Add ability to handle multiple dma offsets" > o Added helper func attach_dma_pfn_offset_map() in address.c (Chistoph) > o Helpers funcs added to __phys_to_dma() & __dma_to_phys() (Christoph) > o Added warning when multiple offsets are needed and !DMA_PFN_OFFSET_MAP > o dev->dma_pfn_map => dev->dma_pfn_offset_map > o s/frm/from/ for dma_pfn_offset_frm_{phys,dma}_addr() (Christoph) > o In device.h: s/const void */const struct dma_pfn_offset_region */ > o removed 'unlikely' from unlikely(dev->dma_pfn_offset_map) since > guarded by CONFIG_DMA_PFN_OFFSET_MAP (Christoph) > o Since dev->dma_pfn_offset is copied in usb/core/{usb,message}.c, now > dev->dma_pfn_offset_map is copied as well. > o Merged two of the DMA commits into one (Christoph). > > Commit "arm: dma-mapping: Invoke dma offset func if needed": > o Use helper functions instead of #if CONFIG_DMA_PFN_OFFSET > > Other commits' changes: > o Removed need for carrying of_id var in priv (Nicolas) > o Commit message rewordings (Bjorn) > o Commit log messages filled to 75 chars (Bjorn) > o devm_reset_control_get_shared()) > => devm_reset_control_get_optional_shared (Philipp) > o Add call to reset_control_assert() in PCIe remove routines (Philipp) > > v1: > This patchset expands the usefulness of the Broadcom Settop Box PCIe > controller by building upon the PCIe driver used currently by the > Raspbery Pi. Other forms of this patchset were submitted by me years > ago and not accepted; the major sticking point was the code required > for the DMA remapping needed for the PCIe driver to work [1]. > > There have been many changes to the DMA and OF subsystems since that > time, making a cleaner and less intrusive patchset possible. This > patchset implements a generalization of "dev->dma_pfn_offset", except > that instead of a single scalar offset it provides for multiple > offsets via a function which depends upon the "dma-ranges" property of > the PCIe host controller. This is required for proper functionality > of the BrcmSTB PCIe controller and possibly some other devices. If you can enable the h/w support without the multiple offset support, then I'd split up this series. The latter part might take a bit more time. Rob ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 09/14] device core: Add ability to handle multiple dma offsets
On Wed, May 27, 2020 at 9:43 AM Jim Quinlan wrote: > > Hi Nicolas, > > On Wed, May 27, 2020 at 11:00 AM Nicolas Saenz Julienne > wrote: > > > > Hi Jim, > > one thing comes to mind, there is a small test suite in > > drivers/of/unittest.c > > (specifically of_unittest_pci_dma_ranges()) you could extend it to include > > your > > use cases. > Sure, will check out. > > > > On Tue, 2020-05-26 at 15:12 -0400, Jim Quinlan wrote: > > > The new field in struct device 'dma_pfn_offset_map' is used to facilitate > > > the use of multiple pfn offsets between cpu addrs and dma addrs. It is > > > similar to 'dma_pfn_offset' except that the offset chosen depends on the > > > cpu or dma address involved. > > > > > > Signed-off-by: Jim Quinlan > > > --- > > > drivers/of/address.c| 65 +++-- > > > drivers/usb/core/message.c | 3 ++ > > > drivers/usb/core/usb.c | 3 ++ > > > include/linux/device.h | 10 +- > > > include/linux/dma-direct.h | 10 -- > > > include/linux/dma-mapping.h | 46 ++ > > > kernel/dma/Kconfig | 13 > > > 7 files changed, 144 insertions(+), 6 deletions(-) > > > > > > > [...] > > > > > @@ -977,10 +1020,19 @@ int of_dma_get_range(struct device *dev, struct > > > device_node *np, u64 *dma_addr, > > > pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n", > > >range.bus_addr, range.cpu_addr, range.size); > > > > > > + num_ranges++; > > > if (dma_offset && range.cpu_addr - range.bus_addr != > > > dma_offset) > > > { > > > - pr_warn("Can't handle multiple dma-ranges with > > > different > > > offsets on node(%pOF)\n", node); > > > - /* Don't error out as we'd break some existing DTs > > > */ > > > - continue; > > > + if (!IS_ENABLED(CONFIG_DMA_PFN_OFFSET_MAP)) { > > > + pr_warn("Can't handle multiple dma-ranges > > > with > > > different offsets on node(%pOF)\n", node); > > > + pr_warn("Perhaps set > > > DMA_PFN_OFFSET_MAP=y?\n"); > > > + /* > > > + * Don't error out as we'd break some > > > existing > > > + * DTs that are using configs w/o > > > + * CONFIG_DMA_PFN_OFFSET_MAP set. > > > + */ > > > + continue; > > > > dev->bus_dma_limit is set in of_dma_configure(), this function's caller, > > based > > on dma_start's value (set after this continue). So you'd be effectively > > setting > > the dev->bus_dma_limit to whatever we get from the first dma-range. > I'm not seeing that at all. On the evaluation of each dma-range, > dma_start and dma_end are re-evaluated to be the lowest and highest > bus values of the dma-ranges seen so far. After all dma-ranges are > examined, dev->bus_dma_limit being set to the highest. In fact, the > current code -- ie before my commits -- already does this for multiple > dma-ranges as long as the cpu-bus offset is the same in the > dma-ranges. > > > > This can be troublesome depending on how the dma-ranges are setup, for > > example > > if the first dma-range doesn't include the CMA area, in arm64 generally set > > as > > high as possible in ZONE_DMA32, that would render it useless for > > dma/{direct/swiotlb}. Again depending on the bus_dma_limit value, if smaller > > than ZONE_DMA you'd be unable to allocate any DMA memory. > > > > IMO, a solution to this calls for a revamp of dma-direct's dma_capable(): > > match > > the target DMA memory area with each dma-range we have to see if it fits. > > > > > + } > > > + dma_multi_pfn_offset = true; > > > } > > > dma_offset = range.cpu_addr - range.bus_addr; > > > > > > @@ -991,6 +1043,13 @@ int of_dma_get_range(struct device *dev, struct > > > device_node *np, u64 *dma_addr, > > > dma_end = range.bus_addr + range.size; > > > } > > > > > > + if (dma_multi_pfn_offset) { > > > + dma_offset = 0; > > > + ret = attach_dma_pfn_offset_map(dev, node, num_ranges); > > > + if (ret) > > > + return ret; > > > + } > > > + > > > if (dma_start >= dma_end) { > > > ret = -EINVAL; > > > pr_debug("Invalid DMA ranges configuration on node(%pOF)\n", > > > diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c > > > index 6197938dcc2d..aaa3e58f5eb4 100644 > > > --- a/drivers/usb/core/message.c > > > +++ b/drivers/usb/core/message.c > > > @@ -1960,6 +1960,9 @@ int usb_set_configuration(struct usb_device *dev, > > > int > > > configuration) > > >*/ > > > intf->dev.dma_mask = dev->dev.dma_mask; > > > intf->dev.dma_pfn_offset
Re: Re: [PATCH] PCI: Relax ACS requirement for Intel RCiEP devices.
On 5/28/20 4:38 PM, Alex Williamson wrote: On Thu, 28 May 2020 13:57:42 -0700 Ashok Raj wrote: All Intel platforms guarantee that all root complex implementations must send transactions up to IOMMU for address translations. Hence for RCiEP devices that are Vendor ID Intel, can claim exception for lack of ACS support. 3.16 Root-Complex Peer to Peer Considerations When DMA remapping is enabled, peer-to-peer requests through the Root-Complex must be handled as follows: • The input address in the request is translated (through first-level, second-level or nested translation) to a host physical address (HPA). The address decoding for peer addresses must be done only on the translated HPA. Hardware implementations are free to further limit peer-to-peer accesses to specific host physical address regions (or to completely disallow peer-forwarding of translated requests). • Since address translation changes the contents (address field) of the PCI Express Transaction Layer Packet (TLP), for PCI Express peer-to-peer requests with ECRC, the Root-Complex hardware must use the new ECRC (re-computed with the translated address) if it decides to forward the TLP as a peer request. • Root-ports, and multi-function root-complex integrated endpoints, may support additional peerto-peer control features by supporting PCI Express Access Control Services (ACS) capability. Refer to ACS capability in PCI Express specifications for details. Since Linux didn't give special treatment to allow this exception, certain RCiEP MFD devices are getting grouped in a single iommu group. This doesn't permit a single device to be assigned to a guest for instance. In one vendor system: Device 14.x were grouped in a single IOMMU group. /sys/kernel/iommu_groups/5/devices/:00:14.0 /sys/kernel/iommu_groups/5/devices/:00:14.2 /sys/kernel/iommu_groups/5/devices/:00:14.3 After the patch: /sys/kernel/iommu_groups/5/devices/:00:14.0 /sys/kernel/iommu_groups/5/devices/:00:14.2 /sys/kernel/iommu_groups/6/devices/:00:14.3 <<< new group 14.0 and 14.2 are integrated devices, but legacy end points. Whereas 14.3 was a PCIe compliant RCiEP. 00:14.3 Network controller: Intel Corporation Device 9df0 (rev 30) Capabilities: [40] Express (v2) Root Complex Integrated Endpoint, MSI 00 This permits assigning this device to a guest VM. Fixes: f096c061f552 ("iommu: Rework iommu_group_get_for_pci_dev()") I don't really understand this Fixes tag. This seems like a feature, not a fix. If you want it in stable releases as a feature, request it via Cc: sta...@vger.kernel.org. I'd drop that tag, that's my nit. Otherwise: Reviewed-by: Alex Williamson I have tested this patch with 5.6.14 as well as a slightly modified version (without pci_acs_ctrl_enabled()) in a 3.10 enterprise linux kernel. Tested-by: Darrel Goeddel Signed-off-by: Ashok Raj To: Joerg Roedel To: Bjorn Helgaas Cc: linux-ker...@vger.kernel.org Cc: iommu@lists.linux-foundation.org Cc: Lu Baolu Cc: Alex Williamson Cc: Darrel Goeddel Cc: Mark Scott , Cc: Romil Sharma Cc: Ashok Raj --- v2: Moved functionality from iommu to pci quirks - Alex Williamson drivers/pci/quirks.c | 15 +++ 1 file changed, 15 insertions(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 28c9a2409c50..63373ca0a3fe 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -4682,6 +4682,20 @@ static int pci_quirk_mf_endpoint_acs(struct pci_dev *dev, u16 acs_flags) PCI_ACS_CR | PCI_ACS_UF | PCI_ACS_DT); } +static int pci_quirk_rciep_acs(struct pci_dev *dev, u16 acs_flags) +{ + /* +* RCiEP's are required to allow p2p only on translated addresses. +* Refer to Intel VT-d specification Section 3.16 Root-Complex Peer +* to Peer Considerations +*/ + if (pci_pcie_type(dev) != PCI_EXP_TYPE_RC_END) + return -ENOTTY; + + return pci_acs_ctrl_enabled(acs_flags, + PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF); +} + static int pci_quirk_brcm_acs(struct pci_dev *dev, u16 acs_flags) { /* @@ -4764,6 +4778,7 @@ static const struct pci_dev_acs_enabled { /* I219 */ { PCI_VENDOR_ID_INTEL, 0x15b7, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x15b8, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_INTEL, PCI_ANY_ID, pci_quirk_rciep_acs }, /* QCOM QDF2xxx root ports */ { PCI_VENDOR_ID_QCOM, 0x0400, pci_quirk_qcom_rp_acs }, { PCI_VENDOR_ID_QCOM, 0x0401, pci_quirk_qcom_rp_acs }, ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 01/10] iommu/amd: Move AMD IOMMU driver to a subdirectory
On Wed, May 27, 2020 at 01:53:04PM +0200, Joerg Roedel wrote: > From: Joerg Roedel > > The driver consists of five C files and three header files by now. > Move them to their own subdirectory to not clutter to iommu top-level > directory with them. > > Signed-off-by: Joerg Roedel > --- > MAINTAINERS | 2 +- > drivers/iommu/Makefile | 6 +++--- > drivers/iommu/{ => amd}/amd_iommu.h | 0 > drivers/iommu/{ => amd}/amd_iommu_proto.h| 0 > drivers/iommu/{ => amd}/amd_iommu_types.h| 0 > drivers/iommu/{amd_iommu_debugfs.c => amd/debugfs.c} | 0 > drivers/iommu/{amd_iommu_init.c => amd/init.c} | 2 +- > drivers/iommu/{amd_iommu.c => amd/iommu.c} | 2 +- > drivers/iommu/{amd_iommu_v2.c => amd/iommu_v2.c} | 0 > drivers/iommu/{amd_iommu_quirks.c => amd/quirks.c} | 0 > 10 files changed, 6 insertions(+), 6 deletions(-) > rename drivers/iommu/{ => amd}/amd_iommu.h (100%) > rename drivers/iommu/{ => amd}/amd_iommu_proto.h (100%) > rename drivers/iommu/{ => amd}/amd_iommu_types.h (100%) > rename drivers/iommu/{amd_iommu_debugfs.c => amd/debugfs.c} (100%) > rename drivers/iommu/{amd_iommu_init.c => amd/init.c} (99%) > rename drivers/iommu/{amd_iommu.c => amd/iommu.c} (99%) > rename drivers/iommu/{amd_iommu_v2.c => amd/iommu_v2.c} (100%) > rename drivers/iommu/{amd_iommu_quirks.c => amd/quirks.c} (100%) Okay, so there were a lot of conflicts creating my next branch after this patch-set was applied, so I skipped this patch to make resolving them a bit easier. I will send out this patch again separatly together with a patch doing the same for VT-d and merge it directly into the next branch I will send to Linus. Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/vt-d: Fix compile warning
On Fri, 29 May 2020 15:15:45 +0200 Joerg Roedel wrote: > Applied, thanks. > > On Thu, May 28, 2020 at 11:03:51AM -0700, Jacob Pan wrote: > > Make intel_svm_unbind_mm() a static function. > > > > Fixes: 064a57d7ddfc ("iommu/vt-d: Replace intel SVM APIs with > > generic SVA APIs") > > Please make sure the fixes tags (or any other tags) are not > line-wrapped in future patch submissions. > Got it, thanks. > Thanks, > > Joerg [Jacob Pan] ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 00/10] iommu/amd: Updates and Cleanups
Joerg, On 5/27/2020 6:53 PM, Joerg Roedel wrote: Hi, here is a collection of patches that clean up a few things in the AMD IOMMU driver. Foremost, it moves all related files of the driver into a separate subdirectory. But the patches also remove usage of dev->archdata.iommu and clean up dev_data handling and domain allocation. Patches are runtime-tested, including device-assignment. Please review. Regards, Joerg Joerg Roedel (10): iommu/amd: Move AMD IOMMU driver to a subdirectory iommu/amd: Unexport get_dev_data() iommu/amd: Let free_pagetable() not rely on domain->pt_root iommu/amd: Allocate page-table in protection_domain_init() iommu/amd: Free page-table in protection_domain_free() iommu/amd: Consolidate domain allocation/freeing iommu/amd: Remove PD_DMA_OPS_MASK iommu/amd: Merge private header files iommu/amd: Store dev_data as device iommu private data iommu/amd: Remove redundant devid checks MAINTAINERS | 2 +- drivers/iommu/Makefile| 6 +- .../{amd_iommu_proto.h => amd/amd_iommu.h}| 20 +- drivers/iommu/{ => amd}/amd_iommu_types.h | 0 .../{amd_iommu_debugfs.c => amd/debugfs.c}| 5 +- .../iommu/{amd_iommu_init.c => amd/init.c}| 6 +- drivers/iommu/{amd_iommu.c => amd/iommu.c}| 265 ++ .../iommu/{amd_iommu_v2.c => amd/iommu_v2.c} | 14 +- .../{amd_iommu_quirks.c => amd/quirks.c} | 0 drivers/iommu/amd_iommu.h | 14 - 10 files changed, 117 insertions(+), 215 deletions(-) rename drivers/iommu/{amd_iommu_proto.h => amd/amd_iommu.h} (88%) rename drivers/iommu/{ => amd}/amd_iommu_types.h (100%) rename drivers/iommu/{amd_iommu_debugfs.c => amd/debugfs.c} (89%) rename drivers/iommu/{amd_iommu_init.c => amd/init.c} (99%) rename drivers/iommu/{amd_iommu.c => amd/iommu.c} (95%) rename drivers/iommu/{amd_iommu_v2.c => amd/iommu_v2.c} (98%) rename drivers/iommu/{amd_iommu_quirks.c => amd/quirks.c} (100%) delete mode 100644 drivers/iommu/amd_iommu.h Thank you for cleaning up. Reviewed-by: Suravee Suthikulpanit ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu: Fix reference count leak in iommu_group_alloc.
On Wed, May 27, 2020 at 04:00:19PM -0500, wu000...@umn.edu wrote: > From: Qiushi Wu > > kobject_init_and_add() takes reference even when it fails. > Thus, when kobject_init_and_add() returns an error, > kobject_put() must be called to properly clean up the kobject. > > Fixes: d72e31c93746 ("iommu: IOMMU Groups") > Signed-off-by: Qiushi Wu > --- > drivers/iommu/iommu.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) Applied, thanks. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/vt-d: Fix compile warning
Applied, thanks. On Thu, May 28, 2020 at 11:03:51AM -0700, Jacob Pan wrote: > Make intel_svm_unbind_mm() a static function. > > Fixes: 064a57d7ddfc ("iommu/vt-d: Replace intel SVM APIs with generic > SVA APIs") Please make sure the fixes tags (or any other tags) are not line-wrapped in future patch submissions. Thanks, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v1 0/3] iommu/vt-d: real DMA sub-device info allocation
On Wed, May 27, 2020 at 10:56:14AM -0600, Jon Derrick wrote: > Jon Derrick (3): > iommu/vt-d: Only clear real DMA device's context entries > iommu/vt-d: Allocate domain info for real DMA sub-devices > iommu/vt-d: Remove real DMA lookup in find_domain > > drivers/iommu/intel-iommu.c | 31 +++ > include/linux/intel-iommu.h | 1 + > 2 files changed, 24 insertions(+), 8 deletions(-) Applied, thanks. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 00/10] iommu/amd: Updates and Cleanups
On Fri, May 29, 2020 at 07:15:13PM +0700, Suravee Suthikulpanit wrote: > Thank you for cleaning up. > > Reviewed-by: Suravee Suthikulpanit Thanks for you review, Suravee. Patches are now applied. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 0/2] iommu: Remove iommu_sva_ops::mm_exit()
On Thu, Apr 23, 2020 at 02:53:27PM +0200, Jean-Philippe Brucker wrote: > The IOMMU SVA API currently requires device drivers to implement an > mm_exit() callback, which stops device jobs that do DMA. This function > is called in the release() MMU notifier, when an address space that is > shared with a device exits. > > It has been noted several time during discussions about SVA that > cancelling DMA jobs can be slow and complex, and doing it in the > release() notifier might cause synchronization issues. Device drivers > must in any case call unbind() to remove their bond, after stopping DMA > from a more favorable context (release of a file descriptor). > > Patch 1 removes the mm_exit() callback from the uacce module, and patch > 2 removes it from the IOMMU API. Since v1 [1] I fixed the uacce unbind > reported by Zhangfei and added details in the commit message of patch 2. > > Jean-Philippe Brucker (2): > uacce: Remove mm_exit() op > iommu: Remove iommu_sva_ops::mm_exit() Applied, thanks. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 0/8] Convert the intel iommu driver to the dma-iommu api
On Thu, May 28, 2020 at 06:00:44PM -0600, Logan Gunthorpe wrote: > > This issue is most likely in the i915 driver and is most likely caused by > > the driver not respecting the return value of the dma_map_ops::map_sg > > function. You can see the driver ignoring the return value here: > > https://github.com/torvalds/linux/blob/7e0165b2f1a912a06e381e91f0f4e495f4ac3736/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c#L51 > > > > Previously this didn’t cause issues because the intel map_sg always > > returned the same number of elements as the input scatter gather list but > > with the change to this dma-iommu api this is no longer the case. I wasn’t > > able to track the bug down to a specific line of code unfortunately. Mark did a big audit into the map_sg API abuse and initially had some i915 patches, but then gave up on them with this comment: "The biggest TODO is DRM/i915 driver and I don't feel brave enough to fix it fully. The driver creatively uses sg_table->orig_nents to store the size of the allocate scatterlist and ignores the number of the entries returned by dma_map_sg function. In this patchset I only fixed the sg_table objects exported by dmabuf related functions. I hope that I didn't break anything there." it would be really nice if the i915 maintainers could help with sorting that API abuse out. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v7 04/24] iommu: Add a page fault handler
Hi, On 2020/5/20 1:54, Jean-Philippe Brucker wrote: > Some systems allow devices to handle I/O Page Faults in the core mm. For > example systems implementing the PCIe PRI extension or Arm SMMU stall > model. Infrastructure for reporting these recoverable page faults was > added to the IOMMU core by commit 0c830e6b3282 ("iommu: Introduce device > fault report API"). Add a page fault handler for host SVA. > > IOMMU driver can now instantiate several fault workqueues and link them > to IOPF-capable devices. Drivers can choose between a single global > workqueue, one per IOMMU device, one per low-level fault queue, one per > domain, etc. > > When it receives a fault event, supposedly in an IRQ handler, the IOMMU > driver reports the fault using iommu_report_device_fault(), which calls > the registered handler. The page fault handler then calls the mm fault > handler, and reports either success or failure with iommu_page_response(). > When the handler succeeded, the IOMMU retries the access. > > The iopf_param pointer could be embedded into iommu_fault_param. But > putting iopf_param into the iommu_param structure allows us not to care > about ordering between calls to iopf_queue_add_device() and > iommu_register_device_fault_handler(). > > Signed-off-by: Jean-Philippe Brucker > --- > v6->v7: Fix leak in iopf_queue_discard_partial() > --- > drivers/iommu/Kconfig | 4 + > drivers/iommu/Makefile | 1 + > include/linux/iommu.h | 51 + > drivers/iommu/io-pgfault.c | 459 + > 4 files changed, 515 insertions(+) > create mode 100644 drivers/iommu/io-pgfault.c > > diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig > index d9fa5b410015..15e9dc4e503c 100644 > --- a/drivers/iommu/Kconfig > +++ b/drivers/iommu/Kconfig > @@ -107,6 +107,10 @@ config IOMMU_SVA > bool > select IOASID > > +config IOMMU_PAGE_FAULT > + bool > + select IOMMU_SVA > + > config FSL_PAMU > bool "Freescale IOMMU support" > depends on PCI > diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile > index 40c800dd4e3e..bf5cb4ee8409 100644 > --- a/drivers/iommu/Makefile > +++ b/drivers/iommu/Makefile > @@ -4,6 +4,7 @@ obj-$(CONFIG_IOMMU_API) += iommu-traces.o > obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o > obj-$(CONFIG_IOMMU_DEBUGFS) += iommu-debugfs.o > obj-$(CONFIG_IOMMU_DMA) += dma-iommu.o > +obj-$(CONFIG_IOMMU_PAGE_FAULT) += io-pgfault.o > obj-$(CONFIG_IOMMU_IO_PGTABLE) += io-pgtable.o > obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o > obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE) += io-pgtable-arm.o > diff --git a/include/linux/iommu.h b/include/linux/iommu.h > index b62525747bd9..a462157c855b 100644 > --- a/include/linux/iommu.h > +++ b/include/linux/iommu.h > @@ -46,6 +46,7 @@ struct iommu_domain; > struct notifier_block; > struct iommu_sva; > struct iommu_fault_event; > +struct iopf_queue; > > /* iommu fault flags */ > #define IOMMU_FAULT_READ 0x0 > @@ -347,6 +348,7 @@ struct iommu_fault_param { > * struct dev_iommu - Collection of per-device IOMMU data > * > * @fault_param: IOMMU detected device fault reporting data > + * @iopf_param: I/O Page Fault queue and data > * @fwspec: IOMMU fwspec data > * @priv: IOMMU Driver private data > * > @@ -356,6 +358,7 @@ struct iommu_fault_param { > struct dev_iommu { > struct mutex lock; > struct iommu_fault_param*fault_param; > + struct iopf_device_param*iopf_param; > struct iommu_fwspec *fwspec; > void*priv; > }; [...] > #endif /* __LINUX_IOMMU_H */ > diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c > new file mode 100644 > index ..1f61c1bc05da > --- /dev/null > +++ b/drivers/iommu/io-pgfault.c > @@ -0,0 +1,459 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Handle device page faults > + * > + * Copyright (C) 2020 ARM Ltd. > + */ > + > +#include > +#include > +#include > +#include > +#include > + > +#include "iommu-sva.h" > + > +/** > + * struct iopf_queue - IO Page Fault queue > + * @wq: the fault workqueue > + * @devices: devices attached to this queue > + * @lock: protects the device list > + */ > +struct iopf_queue { > + struct workqueue_struct *wq; > + struct list_headdevices; > + struct mutexlock; > +}; > + > +/** > + * struct iopf_device_param - IO Page Fault data attached to a device > + * @dev: the device that owns this param > + * @queue: IOPF queue > + * @queue_list: index into queue->devices > + * @partial: faults that are part of a Page Request Group for which the last > + * request hasn't been submitted yet. > + */ > +struct iopf_device_param { > + struct device *dev; > + struct iopf_queue *queue; > + struct list_headqueue_list; > + struct list_headpartial; > +}; > +
RE: [PATCH V3 1/3] iommu: Add support to change default domain of an iommu_group
Hi Baolu, > -Original Message- > From: Lu Baolu > Sent: Thursday, May 28, 2020 7:43 PM > To: Prakhya, Sai Praneeth ; > iommu@lists.linux-foundation.org > Cc: baolu...@linux.intel.com; Christoph Hellwig ; Joerg Roedel > ; Raj, Ashok ; Will Deacon > ; Mehta, Sohil ; Robin > Murphy ; Jacob Pan > Subject: Re: [PATCH V3 1/3] iommu: Add support to change default domain of > an iommu_group > > Hi Sai, > > On 5/29/20 3:23 AM, Sai Praneeth Prakhya wrote: > > Presently, the default domain of an iommu_group is allocated during > > boot time (i.e. when a device is being added to a group) and it cannot > > be > > > This is inaccurate as Joerg's code has deferred default domain allocation and > attaching after group allocation. I'd suggest to remove this. Ok.. makes sense. I will remove it. I think it should have been like below to accurately describe Joerg's changes, is that correct? "Presently, the default domain of an iommu_group is allocated during boot time (i.e. during device probe and after the device is added to a group) and it cannot be" > > changed later. So, the device would typically be either in identity > > (also known as pass_through) mode (controlled by "iommu=pt" kernel > > command line > > > There are other kernel parameters to put device in pass_through mode. > I'd suggest to remove this. Ok.. makes sense. I will remove it. I think, you were talking about ARM parameters (iommu.passthrough).. am I right? > > argument) or the device would be in DMA mode as long as the machine is > > up and running. There is no way to change the default domain type > > dynamically i.e. after booting, a device cannot switch between > > identity mode and DMA mode. > > > > But, assume a use case wherein the user trusts the device and believes > > that the OS is secure enough and hence wants *only* this device to > > bypass IOMMU (so that it could be high performing) whereas all the > > other devices to go through IOMMU (so that the system is protected). > > Presently, this use case is not supported. It will be helpful if there > > is some way to change the default domain of a B:D.F dynamically. Hence, add > such support. >^ > > Currently default domain is per iommu_group, we have no per device default > domain yet. Probably, "default domain of an iommu group"? Makes sense. I will change it. > > A privileged user could request the kernel to change the default > > domain type of a iommu_group by writing to > > "/sys/kernel/iommu_groups//type" file. Presently, only three > > values are supported 1. identity: all the DMA transactions from the > > device in this group are > > *not* translated by the iommu 2. DMA: all the DMA > > transactions from the device in this group are > > translated by the iommu > > 3. auto: change to the type the device was booted with > > > > Note: > > 1. Default domain of an iommu group with two or more devices cannot be > > changed. > > 2. The device in the iommu group shouldn't be bound to any driver. > > 3. The device shouldn't be assigned to user for direct access. > > > > Please see "Documentation/ABI/testing/sysfs-kernel-iommu_groups" for > > more information. > > > > Cc: Christoph Hellwig > > Cc: Joerg Roedel > > Cc: Ashok Raj > > Cc: Will Deacon > > Cc: Lu Baolu > > Cc: Sohil Mehta > > Cc: Robin Murphy > > Cc: Jacob Pan > > Signed-off-by: Sai Praneeth Prakhya > > --- > > drivers/iommu/iommu.c | 211 > +- > > 1 file changed, 210 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index > > a4c2f122eb8b..2b6cca799055 100644 > > --- a/drivers/iommu/iommu.c > > +++ b/drivers/iommu/iommu.c > > @@ -92,6 +92,8 @@ static void __iommu_detach_group(struct > iommu_domain *domain, > > static int iommu_create_device_direct_mappings(struct iommu_group > *group, > >struct device *dev); > > static struct iommu_group *iommu_group_get_for_dev(struct device > > *dev); > > +static ssize_t iommu_group_store_type(struct iommu_group *group, > > + const char *buf, size_t count); > > > > #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store) > \ > > struct iommu_group_attribute iommu_group_attr_##_name = \ > > @@ -524,7 +526,8 @@ static IOMMU_GROUP_ATTR(name, S_IRUGO, > iommu_group_show_name, NULL); > > static IOMMU_GROUP_ATTR(reserved_regions, 0444, > > iommu_group_show_resv_regions, NULL); > > > > -static IOMMU_GROUP_ATTR(type, 0444, iommu_group_show_type, NULL); > > +static IOMMU_GROUP_ATTR(type, 0644, iommu_group_show_type, > > + iommu_group_store_type); > > > > static void iommu_group_release(struct kobject *kobj) > > { > > @@ -2847,3 +2850,209 @@ int iommu_sva_get_pasid(struct iommu_sva > *