[PATCH 2/2] arm64: dts: qcom: sm8250: Enable per-process page tables.
This is an SMMU for the Adreno GPU, and adding this compatible lets the
driver use per-fd page tables, which are required for security between
GPU clients.

Signed-off-by: Emma Anholt
---
Tested with a full deqp-vk run on RB5, which did involve some iommu
faults.

 arch/arm64/boot/dts/qcom/sm8250.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/sm8250.dtsi b/arch/arm64/boot/dts/qcom/sm8250.dtsi
index a92230bec1dd..483c0e0f1d1a 100644
--- a/arch/arm64/boot/dts/qcom/sm8250.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8250.dtsi
@@ -2513,7 +2513,7 @@ gpucc: clock-controller@3d90000 {
 		};
 
 		adreno_smmu: iommu@3da0000 {
-			compatible = "qcom,sm8250-smmu-500", "arm,mmu-500";
+			compatible = "qcom,sm8250-smmu-500", "arm,mmu-500", "qcom,adreno-smmu";
			reg = <0 0x03da0000 0 0x10000>;
			#iommu-cells = <2>;
			#global-interrupts = <2>;
-- 
2.36.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 1/2] iommu: arm-smmu-impl: Add 8250 display compatible to the client list.
Required for turning on per-process page tables for the GPU.

Signed-off-by: Emma Anholt
---
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index d8e1ef83c01b..bb9220937068 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -233,6 +233,7 @@ static const struct of_device_id qcom_smmu_client_of_match[] __maybe_unused = {
 	{ .compatible = "qcom,sc7280-mdss" },
 	{ .compatible = "qcom,sc7280-mss-pil" },
 	{ .compatible = "qcom,sc8180x-mdss" },
+	{ .compatible = "qcom,sm8250-mdss" },
 	{ .compatible = "qcom,sdm845-mdss" },
 	{ .compatible = "qcom,sdm845-mss-pil" },
 	{ }
-- 
2.36.1
Re: [PATCH v2 01/12] iommu/vt-d: debugfs: Remove device_domain_lock usage
On 2022/6/14 14:43, Tian, Kevin wrote:
>> From: Lu Baolu
>> Sent: Tuesday, June 14, 2022 10:51 AM
>>
>> The domain_translation_struct debugfs node is used to dump the DMAR
>> page tables for the PCI devices. It potentially races with setting
>> domains to devices. The existing code uses a global spinlock
>> device_domain_lock to avoid the races, but this is problematical as
>> this lock is only used to protect the device tracking lists of each
>> domain.
>
> is it really problematic at this point? Before following patches are
> applied using device_domain_lock should have similar effect as holding
> the group lock.
>
> Here it might make more sense to just focus on removing the use of
> device_domain_lock outside of iommu.c. Just that using group lock is
> cleaner and more compatible to following cleanups.
>
> and it's worth mentioning that racing with page table updates is out
> of the scope of this series. Probably also add a comment in the code
> to clarify this point.

Hi Kevin,

How do you like below updated patch?

From cecc9a0623780a11c4ea4d0a15aa6187f01541c4 Mon Sep 17 00:00:00 2001
From: Lu Baolu
Date: Sun, 29 May 2022 10:18:56 +0800
Subject: [PATCH 1/1] iommu/vt-d: debugfs: Remove device_domain_lock usage

The domain_translation_struct debugfs node is used to dump the DMAR
page tables for the PCI devices. It potentially races with setting
domains to devices. The existing code uses the global spinlock
device_domain_lock to avoid the races.

This removes the use of device_domain_lock outside of iommu.c by
replacing it with the group mutex lock. Using the group mutex lock is
cleaner and more compatible to following cleanups.

Signed-off-by: Lu Baolu
---
 drivers/iommu/intel/debugfs.c | 42 +--
 drivers/iommu/intel/iommu.c   |  2 +-
 drivers/iommu/intel/iommu.h   |  1 -
 3 files changed, 31 insertions(+), 14 deletions(-)

diff --git a/drivers/iommu/intel/debugfs.c b/drivers/iommu/intel/debugfs.c
index d927ef10641b..f4acd8993f60 100644
--- a/drivers/iommu/intel/debugfs.c
+++ b/drivers/iommu/intel/debugfs.c
@@ -342,13 +342,13 @@ static void pgtable_walk_level(struct seq_file *m, struct dma_pte *pde,
 	}
 }
 
-static int show_device_domain_translation(struct device *dev, void *data)
+static int __show_device_domain_translation(struct device *dev, void *data)
 {
-	struct device_domain_info *info = dev_iommu_priv_get(dev);
-	struct dmar_domain *domain = info->domain;
+	struct dmar_domain *domain;
 	struct seq_file *m = data;
 	u64 path[6] = { 0 };
 
+	domain = to_dmar_domain(iommu_get_domain_for_dev(dev));
 	if (!domain)
 		return 0;
 
@@ -359,20 +359,38 @@ static int show_device_domain_translation(struct device *dev, void *data)
 	pgtable_walk_level(m, domain->pgd, domain->agaw + 2, 0, path);
 	seq_putc(m, '\n');
 
-	return 0;
+	return 1;
 }
 
-static int domain_translation_struct_show(struct seq_file *m, void *unused)
+static int show_device_domain_translation(struct device *dev, void *data)
 {
-	unsigned long flags;
-	int ret;
+	struct iommu_group *group;
 
-	spin_lock_irqsave(&device_domain_lock, flags);
-	ret = bus_for_each_dev(&pci_bus_type, NULL, m,
-			       show_device_domain_translation);
-	spin_unlock_irqrestore(&device_domain_lock, flags);
+	group = iommu_group_get(dev);
+	if (group) {
+		/*
+		 * The group->mutex is held across the callback, which will
+		 * block calls to iommu_attach/detach_group/device. Hence,
+		 * the domain of the device will not change during traversal.
+		 *
+		 * All devices in an iommu group share a single domain, hence
+		 * we only dump the domain of the first device. Even though,
+		 * this code still possibly races with the iommu_unmap()
+		 * interface. This could be solved by RCU-freeing the page
+		 * table pages in the iommu_unmap() path.
+		 */
+		iommu_group_for_each_dev(group, data,
+					 __show_device_domain_translation);
+		iommu_group_put(group);
+	}
 
-	return ret;
+	return 0;
+}
+
+static int domain_translation_struct_show(struct seq_file *m, void *unused)
+{
+	return bus_for_each_dev(&pci_bus_type, NULL, m,
+				show_device_domain_translation);
 }
 DEFINE_SHOW_ATTRIBUTE(domain_translation_struct);
 
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 19024dc52735..a39d72a9d1cf 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -314,7 +314,7 @@ static int iommu_skip_te_disable;
 #define IDENTMAP_GFX		2
 #define IDENTMAP_AZALIA		4
 
-DEFINE_SPINLOCK(device_domain_lock);
+static DEFINE_SPINLOCK(device_domain_lock);
 static LIST_HEAD(device_domain_list);
 
 /*
diff --git a/drivers/iommu/intel/iommu.h
Re: [PATCH v2] iommu/vt-d: Make DMAR_UNITS_SUPPORTED a config setting
On 2022/6/15 05:12, Steve Wahl wrote:
> On Tue, Jun 14, 2022 at 12:01:45PM -0700, Jerry Snitselaar wrote:
> > On Tue, Jun 14, 2022 at 11:45:35AM -0500, Steve Wahl wrote:
> > > On Tue, Jun 14, 2022 at 10:21:29AM +0800, Baolu Lu wrote:
> > > > On 2022/6/14 09:54, Jerry Snitselaar wrote:
> > > > > On Mon, Jun 13, 2022 at 6:51 PM Baolu Lu wrote:
> > > > > > On 2022/6/14 09:44, Jerry Snitselaar wrote:
> > > > > > > On Mon, Jun 13, 2022 at 6:36 PM Baolu Lu wrote:
> > > > > > > > On 2022/6/14 04:57, Jerry Snitselaar wrote:
> > > > > > > > > On Thu, May 12, 2022 at 10:13:09AM -0500, Steve Wahl wrote:
> > > > > > > > > > To support up to 64 sockets with 10 DMAR units each
> > > > > > > > > > (640), make the value of DMAR_UNITS_SUPPORTED adjustable
> > > > > > > > > > by a config variable, CONFIG_DMAR_UNITS_SUPPORTED, and
> > > > > > > > > > make it's default 1024 when MAXSMP is set.
> > > > > > > > > >
> > > > > > > > > > If the available hardware exceeds DMAR_UNITS_SUPPORTED
> > > > > > > > > > (previously set to MAX_IO_APICS, or 128), it causes these
> > > > > > > > > > messages: "DMAR: Failed to allocate seq_id", "DMAR: Parse
> > > > > > > > > > DMAR table failure.", and "x2apic: IRQ remapping doesn't
> > > > > > > > > > support X2APIC mode x2apic disabled"; and the system
> > > > > > > > > > fails to boot properly.
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Steve Wahl
> > > > > > > > > > ---
> > > > > > > > > >
> > > > > > > > > > Note that we could not find a reason for connecting
> > > > > > > > > > DMAR_UNITS_SUPPORTED to MAX_IO_APICS as was done
> > > > > > > > > > previously. Perhaps it seemed like the two would continue
> > > > > > > > > > to match on earlier processors. There doesn't appear to
> > > > > > > > > > be kernel code that assumes that the value of one is
> > > > > > > > > > related to the other.
> > > > > > > > > >
> > > > > > > > > > v2: Make this value a config option, rather than a fixed
> > > > > > > > > > constant. The default values should match previous
> > > > > > > > > > configuration except in the MAXSMP case. Keeping the
> > > > > > > > > > value at a power of two was requested by Kevin Tian.
> > > > > > > > > >
> > > > > > > > > >  drivers/iommu/intel/Kconfig | 6 ++
> > > > > > > > > >  include/linux/dmar.h        | 6 +-
> > > > > > > > > >  2 files changed, 7 insertions(+), 5 deletions(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/drivers/iommu/intel/Kconfig b/drivers/iommu/intel/Kconfig
> > > > > > > > > > index 247d0f2d5fdf..fdbda77ac21e 100644
> > > > > > > > > > --- a/drivers/iommu/intel/Kconfig
> > > > > > > > > > +++ b/drivers/iommu/intel/Kconfig
> > > > > > > > > > @@ -9,6 +9,12 @@ config DMAR_PERF
> > > > > > > > > >  config DMAR_DEBUG
> > > > > > > > > >   bool
> > > > > > > > > >
> > > > > > > > > > +config DMAR_UNITS_SUPPORTED
> > > > > > > > > > +        int "Number of DMA Remapping Units supported"
> > > > > > > > >
> > > > > > > > > Also, should there be a "depends on (X86 || IA64)" here?
> > > > > > > >
> > > > > > > > Do you have any compilation errors or warnings?
> > > > > > > >
> > > > > > > > Best regards,
> > > > > > > > baolu
> > > > > > >
> > > > > > > I think it is probably harmless since it doesn't get used
> > > > > > > elsewhere, but our tooling was complaining to me because
> > > > > > > DMAR_UNITS_SUPPORTED was being autogenerated into the configs
> > > > > > > for the non-x86 architectures we build (aarch64, s390x,
> > > > > > > ppcle64). We have files corresponding to the config options
> > > > > > > that it looks at, and I had one for x86 and not the others so
> > > > > > > it noticed the discrepancy.
> > > > > >
> > > > > > So with "depends on (X86 || IA64)", that tool doesn't complain
> > > > > > anymore, right?
> > > > > >
> > > > > > Best regards,
> > > > > > baolu
> > > > >
> > > > > Yes, with the depends it no longer happens.
> > > >
> > > > The dmar code only exists on X86 and IA64 arch's. Adding this
> > > > depending makes sense to me. I will add it if no objections.
> > >
> > > I think that works after Baolu's patchset that makes intel-iommu.h
> > > private. I'm pretty sure it wouldn't have worked before that.
> > >
> > > No objections.
> >
> > Yes, I think applying it with the depends prior to Baolu's change
> > would still run into the issue from the KTR report if someone
> > compiled without INTEL_IOMMU enabled.
> >
> > This was dealing with being able to do something like:
> >
> > make allmodconfig ARCH=arm64 ; grep DMAR_UNITS .config
> >
> > and finding CONFIG_DMAR_UNITS_SUPPORTED=64.
> >
> > Thinking some more though, instead of the depends being on the arch
> > would depending on DMAR_TABLE or INTEL_IOMMU be more appropriate?
>
> At least in my limited exploration, depending on INTEL_IOMMU yields
> compile errors, but depending upon DMAR_TABLE appears to work fine.

DMAR_TABLE is used beyond INTEL_IOMMU, so depending on DMAR_TABLE seems
better. Steve, do you mind posting a v3 with this fixed?

Best regards,
baolu

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
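[Editor's note] Putting the thread's conclusion together, the entry would gain a `depends on DMAR_TABLE` line. The sketch below is an illustration of that suggestion, not the merged patch: only the `int` prompt line appears in the quoted diff, so the `default` lines and their values here are assumptions.

```kconfig
config DMAR_UNITS_SUPPORTED
	int "Number of DMA Remapping Units supported"
	depends on DMAR_TABLE
	default 1024 if MAXSMP
	default 128
```

With the `depends on`, `make allmodconfig ARCH=arm64` would no longer emit a `CONFIG_DMAR_UNITS_SUPPORTED` line at all, which is the outcome Jerry's tooling wanted.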
Re: [PATCH 5/7] iommu: Add domain_type_supported() callback in iommu_ops
Robin,

On 6/14/2022 4:51 PM, Robin Murphy wrote:
> On 2022-06-13 15:38, Suthikulpanit, Suravee wrote:
> > Robin,
> >
> > On 6/13/2022 4:31 PM, Robin Murphy wrote:
> > > > Introducing check_domain_type_supported() callback in iommu_ops,
> > > > which allows IOMMU generic layer to check with vendor-specific
> > > > IOMMU driver whether the requested type is supported. This allows
> > > > user to request types other than the default type.
> > >
> > > Note also that you're only adding this in the sysfs path - what
> > > about the "iommu.passthrough=" parameter or
> > > CONFIG_IOMMU_DEFAULT_PASSTHROUGH?
> >
> > For SNP case, we cannot enable SNP if iommu=off or iommu=pt or
> > iommu.passthrough=1 or CONFIG_IOMMU_DEFAULT_PASSTHROUGH=y. So, when
> > another driver tries to enable SNP, the IOMMU driver prevents it
> > (see iommu_sev_snp_supported() in patch 3).
>
> Ugh, I hadn't looked too closely at the other patches, but an
> interface that looks like a simple "is this feature supported?" check
> with a secret side-effect of changing global behaviour as well?
> Yuck :(
>
> What external drivers are expected to have the authority to affect the
> entire system and call that? The fact that you're exporting it
> suggests they could be loaded from modules *after* v2 features have
> been enabled and/or the user has configured a non-default identity
> domain for a group via sysfs... Fun!

I see your point. Currently, the function to enable SNP will be called
from the SEV driver when it tries to enable SNP support globally on the
system. This is done during fs_initcall(), which is early in the boot
process. I can also add a guard code to make sure that this won't be
done after a certain phase.

Instead, if we boot with iommu.passthrough=0, when another driver tries
to enable SNP, the IOMMU driver allows this and switches to SNP-enabled
mode. Subsequently, if user tries to switch a domain (via sysfs) to
IOMMU_DOMAIN_IDENTITY, the IOMMU needs to prevent this because it has
already switched to SNP-enabled mode.

> > > AFAICS there shouldn't need to be any core-level changes to support
> > > this. We already have drivers which don't support passthrough at
> > > all, so conditionally not supporting it should be no big deal. What
> > > should happen currently is that def_domain_type returns 0 for
> > > "don't care", then domain_alloc rejects IOMMU_DOMAIN_IDENTITY and
> > > returns NULL, so iommu_group_alloc_default_domain() falls back to
> > > IOMMU_DOMAIN_DMA.
> >
> > Technically, we can do it the way you suggest. But isn't this
> > confusing? At first, def_domain_type() returns 0 for "don't care",
> > but then it rejects the request to change to IOMMU_DOMAIN_IDENTITY
> > when trying to call domain_alloc().
>
> Yes, that's how it works; def_domain_type is responsible for quirking
> individual *devices* that need to have a specific domain type (in
> practice, devices which need identity mapping), while domain_alloc is
> responsible for saying which domain types the driver supports as a
> whole, by allocating them or not as appropriate.
>
> We don't have a particularly neat way to achieve the negative of
> def_domain_type - i.e. saying that a specific device *can't* use a
> specific otherwise-supported domain type - other than subsequently
> failing in attach_dev, but so far we've not needed such a thing. And
> if SNP is expected to be mutually exclusive with identity domain
> support globally, then we still shouldn't need it.

Thanks for your feedback.

Suravee
Re: [PATCH v2] iommu/vt-d: Make DMAR_UNITS_SUPPORTED a config setting
On Tue, Jun 14, 2022 at 12:01:45PM -0700, Jerry Snitselaar wrote:
> On Tue, Jun 14, 2022 at 11:45:35AM -0500, Steve Wahl wrote:
> > On Tue, Jun 14, 2022 at 10:21:29AM +0800, Baolu Lu wrote:
> > > On 2022/6/14 09:54, Jerry Snitselaar wrote:
> > > > On Mon, Jun 13, 2022 at 6:51 PM Baolu Lu wrote:
> > > > > On 2022/6/14 09:44, Jerry Snitselaar wrote:
> > > > > > On Mon, Jun 13, 2022 at 6:36 PM Baolu Lu wrote:
> > > > > > > On 2022/6/14 04:57, Jerry Snitselaar wrote:
> > > > > > > > On Thu, May 12, 2022 at 10:13:09AM -0500, Steve Wahl wrote:
> > > > > > > > > To support up to 64 sockets with 10 DMAR units each (640),
> > > > > > > > > make the value of DMAR_UNITS_SUPPORTED adjustable by a
> > > > > > > > > config variable, CONFIG_DMAR_UNITS_SUPPORTED, and make
> > > > > > > > > it's default 1024 when MAXSMP is set.
> > > > > > > > >
> > > > > > > > > If the available hardware exceeds DMAR_UNITS_SUPPORTED
> > > > > > > > > (previously set to MAX_IO_APICS, or 128), it causes these
> > > > > > > > > messages: "DMAR: Failed to allocate seq_id", "DMAR: Parse
> > > > > > > > > DMAR table failure.", and "x2apic: IRQ remapping doesn't
> > > > > > > > > support X2APIC mode x2apic disabled"; and the system fails
> > > > > > > > > to boot properly.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Steve Wahl
> > > > > > > > > ---
> > > > > > > > >
> > > > > > > > > Note that we could not find a reason for connecting
> > > > > > > > > DMAR_UNITS_SUPPORTED to MAX_IO_APICS as was done
> > > > > > > > > previously. Perhaps it seemed like the two would continue
> > > > > > > > > to match on earlier processors. There doesn't appear to be
> > > > > > > > > kernel code that assumes that the value of one is related
> > > > > > > > > to the other.
> > > > > > > > >
> > > > > > > > > v2: Make this value a config option, rather than a fixed
> > > > > > > > > constant. The default values should match previous
> > > > > > > > > configuration except in the MAXSMP case. Keeping the value
> > > > > > > > > at a power of two was requested by Kevin Tian.
> > > > > > > > >
> > > > > > > > >  drivers/iommu/intel/Kconfig | 6 ++
> > > > > > > > >  include/linux/dmar.h        | 6 +-
> > > > > > > > >  2 files changed, 7 insertions(+), 5 deletions(-)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/iommu/intel/Kconfig b/drivers/iommu/intel/Kconfig
> > > > > > > > > index 247d0f2d5fdf..fdbda77ac21e 100644
> > > > > > > > > --- a/drivers/iommu/intel/Kconfig
> > > > > > > > > +++ b/drivers/iommu/intel/Kconfig
> > > > > > > > > @@ -9,6 +9,12 @@ config DMAR_PERF
> > > > > > > > >  config DMAR_DEBUG
> > > > > > > > >   bool
> > > > > > > > >
> > > > > > > > > +config DMAR_UNITS_SUPPORTED
> > > > > > > > > +        int "Number of DMA Remapping Units supported"
> > > > > > > >
> > > > > > > > Also, should there be a "depends on (X86 || IA64)" here?
> > > > > > >
> > > > > > > Do you have any compilation errors or warnings?
> > > > > > >
> > > > > > > Best regards,
> > > > > > > baolu
> > > > > >
> > > > > > I think it is probably harmless since it doesn't get used
> > > > > > elsewhere, but our tooling was complaining to me because
> > > > > > DMAR_UNITS_SUPPORTED was being autogenerated into the configs
> > > > > > for the non-x86 architectures we build (aarch64, s390x,
> > > > > > ppcle64). We have files corresponding to the config options
> > > > > > that it looks at, and I had one for x86 and not the others so
> > > > > > it noticed the discrepancy.
> > > > >
> > > > > So with "depends on (X86 || IA64)", that tool doesn't complain
> > > > > anymore, right?
> > > > >
> > > > > Best regards,
> > > > > baolu
> > > >
> > > > Yes, with the depends it no longer happens.
> > >
> > > The dmar code only exists on X86 and IA64 arch's. Adding this
> > > depending makes sense to me. I will add it if no objections.
> >
> > I think that works after Baolu's patchset that makes intel-iommu.h
> > private. I'm pretty sure it wouldn't have worked before that.
> >
> > No objections.
>
> Yes, I think applying it with the depends prior to Baolu's change would
> still run into the issue from the KTR report if someone compiled
> without INTEL_IOMMU enabled.
>
> This was dealing with being able to do something like:
>
> make allmodconfig ARCH=arm64 ; grep DMAR_UNITS .config
>
> and finding CONFIG_DMAR_UNITS_SUPPORTED=64.
>
> Thinking some more though, instead of the depends being on the arch
> would depending on DMAR_TABLE or INTEL_IOMMU be more appropriate?

At least in my limited exploration, depending on INTEL_IOMMU yields
compile errors, but depending upon DMAR_TABLE appears to work fine.

--> Steve

-- 
Steve Wahl, Hewlett Packard Enterprise
Re: [PATCH 3/5] vfio/iommu_type1: Prefer to reuse domains vs match enforced cache coherency
Hi Kevin,

On Wed, Jun 08, 2022 at 11:48:27PM +0000, Tian, Kevin wrote:
> > > > The KVM mechanism for controlling wbinvd is only triggered during
> > > > kvm_vfio_group_add(), meaning it is a one-shot test done once the
> > > > devices are setup.
> > >
> > > It's not one-shot. kvm_vfio_update_coherency() is called in both
> > > group_add() and group_del(). Then the coherency property is
> > > checked dynamically in wbinvd emulation:
> >
> > From the perspective of managing the domains that is still
> > one-shot. It doesn't get updated when individual devices are
> > added/removed to domains.
>
> It's unchanged per-domain but dynamic per-vm when multiple
> domains are added/removed (i.e. kvm->arch.noncoherent_dma_count).
> It's the latter being checked in the kvm.

I am going to send a v2, yet not quite getting the point here.
Meanwhile, Jason is on leave. What, in your opinion, would be an
accurate description here?

Thanks
Nic
Re: [PATCH v2] iommu/vt-d: Make DMAR_UNITS_SUPPORTED a config setting
On Tue, Jun 14, 2022 at 11:45:35AM -0500, Steve Wahl wrote:
> On Tue, Jun 14, 2022 at 10:21:29AM +0800, Baolu Lu wrote:
> > On 2022/6/14 09:54, Jerry Snitselaar wrote:
> > > On Mon, Jun 13, 2022 at 6:51 PM Baolu Lu wrote:
> > > > On 2022/6/14 09:44, Jerry Snitselaar wrote:
> > > > > On Mon, Jun 13, 2022 at 6:36 PM Baolu Lu wrote:
> > > > > > On 2022/6/14 04:57, Jerry Snitselaar wrote:
> > > > > > > On Thu, May 12, 2022 at 10:13:09AM -0500, Steve Wahl wrote:
> > > > > > > > To support up to 64 sockets with 10 DMAR units each (640),
> > > > > > > > make the value of DMAR_UNITS_SUPPORTED adjustable by a
> > > > > > > > config variable, CONFIG_DMAR_UNITS_SUPPORTED, and make it's
> > > > > > > > default 1024 when MAXSMP is set.
> > > > > > > >
> > > > > > > > If the available hardware exceeds DMAR_UNITS_SUPPORTED
> > > > > > > > (previously set to MAX_IO_APICS, or 128), it causes these
> > > > > > > > messages: "DMAR: Failed to allocate seq_id", "DMAR: Parse
> > > > > > > > DMAR table failure.", and "x2apic: IRQ remapping doesn't
> > > > > > > > support X2APIC mode x2apic disabled"; and the system fails
> > > > > > > > to boot properly.
> > > > > > > >
> > > > > > > > Signed-off-by: Steve Wahl
> > > > > > > > ---
> > > > > > > >
> > > > > > > > Note that we could not find a reason for connecting
> > > > > > > > DMAR_UNITS_SUPPORTED to MAX_IO_APICS as was done previously.
> > > > > > > > Perhaps it seemed like the two would continue to match on
> > > > > > > > earlier processors. There doesn't appear to be kernel code
> > > > > > > > that assumes that the value of one is related to the other.
> > > > > > > >
> > > > > > > > v2: Make this value a config option, rather than a fixed
> > > > > > > > constant. The default values should match previous
> > > > > > > > configuration except in the MAXSMP case. Keeping the value
> > > > > > > > at a power of two was requested by Kevin Tian.
> > > > > > > >
> > > > > > > >  drivers/iommu/intel/Kconfig | 6 ++
> > > > > > > >  include/linux/dmar.h        | 6 +-
> > > > > > > >  2 files changed, 7 insertions(+), 5 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/iommu/intel/Kconfig b/drivers/iommu/intel/Kconfig
> > > > > > > > index 247d0f2d5fdf..fdbda77ac21e 100644
> > > > > > > > --- a/drivers/iommu/intel/Kconfig
> > > > > > > > +++ b/drivers/iommu/intel/Kconfig
> > > > > > > > @@ -9,6 +9,12 @@ config DMAR_PERF
> > > > > > > >  config DMAR_DEBUG
> > > > > > > >   bool
> > > > > > > >
> > > > > > > > +config DMAR_UNITS_SUPPORTED
> > > > > > > > +        int "Number of DMA Remapping Units supported"
> > > > > > >
> > > > > > > Also, should there be a "depends on (X86 || IA64)" here?
> > > > > >
> > > > > > Do you have any compilation errors or warnings?
> > > > > >
> > > > > > Best regards,
> > > > > > baolu
> > > > >
> > > > > I think it is probably harmless since it doesn't get used
> > > > > elsewhere, but our tooling was complaining to me because
> > > > > DMAR_UNITS_SUPPORTED was being autogenerated into the configs for
> > > > > the non-x86 architectures we build (aarch64, s390x, ppcle64). We
> > > > > have files corresponding to the config options that it looks at,
> > > > > and I had one for x86 and not the others so it noticed the
> > > > > discrepancy.
> > > >
> > > > So with "depends on (X86 || IA64)", that tool doesn't complain
> > > > anymore, right?
> > > >
> > > > Best regards,
> > > > baolu
> > >
> > > Yes, with the depends it no longer happens.
> >
> > The dmar code only exists on X86 and IA64 arch's. Adding this
> > depending makes sense to me. I will add it if no objections.
>
> I think that works after Baolu's patchset that makes intel-iommu.h
> private. I'm pretty sure it wouldn't have worked before that.
>
> No objections.

Yes, I think applying it with the depends prior to Baolu's change would
still run into the issue from the KTR report if someone compiled without
INTEL_IOMMU enabled.

This was dealing with being able to do something like:

make allmodconfig ARCH=arm64 ; grep DMAR_UNITS .config

and finding CONFIG_DMAR_UNITS_SUPPORTED=64.

Thinking some more though, instead of the depends being on the arch
would depending on DMAR_TABLE or INTEL_IOMMU be more appropriate?

Regards,
Jerry

> --> Steve
>
> -- 
> Steve Wahl, Hewlett Packard Enterprise
Re: [PATCH v8 3/3] iommu/mediatek: Allow page table PA up to 35bit
> Single memory zone feature will remove ZONE_DMA32 and ZONE_DMA. So add
> the quirk IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT to let level 1 and level 2
> pgtable support at most 35bit PA.
>
> Signed-off-by: Ning Li
> Signed-off-by: Yunfei Wang

Reviewed-by: Miles Chen

> ---
>  drivers/iommu/mtk_iommu.c | 14 +-
>  1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 3d62399e8865..4dbc33758711 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -138,6 +138,7 @@
>  /* PM and clock always on. e.g. infra iommu */
>  #define PM_CLK_AO			BIT(15)
>  #define IFA_IOMMU_PCIE_SUPPORT		BIT(16)
> +#define PGTABLE_PA_35_EN		BIT(17)
>  
>  #define MTK_IOMMU_HAS_FLAG_MASK(pdata, _x, mask)	\
>  	((((pdata)->flags) & (mask)) == (_x))
> @@ -240,6 +241,7 @@ struct mtk_iommu_data {
>  struct mtk_iommu_domain {
>  	struct io_pgtable_cfg		cfg;
>  	struct io_pgtable_ops		*iop;
> +	u32				ttbr;
>  
>  	struct mtk_iommu_bank_data	*bank;
>  	struct iommu_domain		domain;
> @@ -596,6 +598,9 @@ static int mtk_iommu_domain_finalise(struct mtk_iommu_domain *dom,
>  		.iommu_dev = data->dev,
>  	};
>  
> +	if (MTK_IOMMU_HAS_FLAG(data->plat_data, PGTABLE_PA_35_EN))
> +		dom->cfg.quirks |= IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT;
> +
>  	if (MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE))
>  		dom->cfg.oas = data->enable_4GB ? 33 : 32;
>  	else
> @@ -684,8 +689,8 @@ static int mtk_iommu_attach_device(struct iommu_domain *domain,
>  			goto err_unlock;
>  		}
>  		bank->m4u_dom = dom;
> -		writel(dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK,
> -		       bank->base + REG_MMU_PT_BASE_ADDR);
> +		bank->m4u_dom->ttbr = MTK_IOMMU_ADDR(dom->cfg.arm_v7s_cfg.ttbr);
> +		writel(bank->m4u_dom->ttbr, data->base + REG_MMU_PT_BASE_ADDR);
>  
>  		pm_runtime_put(m4udev);
>  	}
> @@ -1366,8 +1371,7 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev)
>  		writel_relaxed(reg->int_control[i], base + REG_MMU_INT_CONTROL0);
>  		writel_relaxed(reg->int_main_control[i], base + REG_MMU_INT_MAIN_CONTROL);
>  		writel_relaxed(reg->ivrp_paddr[i], base + REG_MMU_IVRP_PADDR);
> -		writel(m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK,
> -		       base + REG_MMU_PT_BASE_ADDR);
> +		writel(m4u_dom->ttbr, base + REG_MMU_PT_BASE_ADDR);
>  	} while (++i < data->plat_data->banks_num);
>  
>  	/*
> @@ -1401,7 +1405,7 @@ static const struct mtk_iommu_plat_data mt2712_data = {
>  static const struct mtk_iommu_plat_data mt6779_data = {
>  	.m4u_plat      = M4U_MT6779,
>  	.flags         = HAS_SUB_COMM_2BITS | OUT_ORDER_WR_EN | WR_THROT_EN |
> -			 MTK_IOMMU_TYPE_MM,
> +			 MTK_IOMMU_TYPE_MM | PGTABLE_PA_35_EN,
>  	.inv_sel_reg   = REG_MMU_INV_SEL_GEN2,
>  	.banks_num     = 1,
>  	.banks_enable  = {true},
> -- 
> 2.18.0
Re: [PATCH v8 2/3] iommu/mediatek: Rename MTK_IOMMU_TLB_ADDR to MTK_IOMMU_ADDR
> Rename MTK_IOMMU_TLB_ADDR to MTK_IOMMU_ADDR, and update MTK_IOMMU_ADDR
> definition for better generality.
>
> Signed-off-by: Ning Li
> Signed-off-by: Yunfei Wang

Reviewed-by: Miles Chen
Re: [PATCH v2] iommu/vt-d: Make DMAR_UNITS_SUPPORTED a config setting
On Tue, Jun 14, 2022 at 10:21:29AM +0800, Baolu Lu wrote:
> On 2022/6/14 09:54, Jerry Snitselaar wrote:
> > On Mon, Jun 13, 2022 at 6:51 PM Baolu Lu wrote:
> > > On 2022/6/14 09:44, Jerry Snitselaar wrote:
> > > > On Mon, Jun 13, 2022 at 6:36 PM Baolu Lu wrote:
> > > > > On 2022/6/14 04:57, Jerry Snitselaar wrote:
> > > > > > On Thu, May 12, 2022 at 10:13:09AM -0500, Steve Wahl wrote:
> > > > > > > To support up to 64 sockets with 10 DMAR units each (640),
> > > > > > > make the value of DMAR_UNITS_SUPPORTED adjustable by a config
> > > > > > > variable, CONFIG_DMAR_UNITS_SUPPORTED, and make it's default
> > > > > > > 1024 when MAXSMP is set.
> > > > > > >
> > > > > > > If the available hardware exceeds DMAR_UNITS_SUPPORTED
> > > > > > > (previously set to MAX_IO_APICS, or 128), it causes these
> > > > > > > messages: "DMAR: Failed to allocate seq_id", "DMAR: Parse
> > > > > > > DMAR table failure.", and "x2apic: IRQ remapping doesn't
> > > > > > > support X2APIC mode x2apic disabled"; and the system fails to
> > > > > > > boot properly.
> > > > > > >
> > > > > > > Signed-off-by: Steve Wahl
> > > > > > > ---
> > > > > > >
> > > > > > > Note that we could not find a reason for connecting
> > > > > > > DMAR_UNITS_SUPPORTED to MAX_IO_APICS as was done previously.
> > > > > > > Perhaps it seemed like the two would continue to match on
> > > > > > > earlier processors. There doesn't appear to be kernel code
> > > > > > > that assumes that the value of one is related to the other.
> > > > > > >
> > > > > > > v2: Make this value a config option, rather than a fixed
> > > > > > > constant. The default values should match previous
> > > > > > > configuration except in the MAXSMP case. Keeping the value at
> > > > > > > a power of two was requested by Kevin Tian.
> > > > > > >
> > > > > > >  drivers/iommu/intel/Kconfig | 6 ++
> > > > > > >  include/linux/dmar.h        | 6 +-
> > > > > > >  2 files changed, 7 insertions(+), 5 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/iommu/intel/Kconfig b/drivers/iommu/intel/Kconfig
> > > > > > > index 247d0f2d5fdf..fdbda77ac21e 100644
> > > > > > > --- a/drivers/iommu/intel/Kconfig
> > > > > > > +++ b/drivers/iommu/intel/Kconfig
> > > > > > > @@ -9,6 +9,12 @@ config DMAR_PERF
> > > > > > >  config DMAR_DEBUG
> > > > > > >   bool
> > > > > > >
> > > > > > > +config DMAR_UNITS_SUPPORTED
> > > > > > > +        int "Number of DMA Remapping Units supported"
> > > > > >
> > > > > > Also, should there be a "depends on (X86 || IA64)" here?
> > > > >
> > > > > Do you have any compilation errors or warnings?
> > > > >
> > > > > Best regards,
> > > > > baolu
> > > >
> > > > I think it is probably harmless since it doesn't get used
> > > > elsewhere, but our tooling was complaining to me because
> > > > DMAR_UNITS_SUPPORTED was being autogenerated into the configs for
> > > > the non-x86 architectures we build (aarch64, s390x, ppcle64). We
> > > > have files corresponding to the config options that it looks at,
> > > > and I had one for x86 and not the others so it noticed the
> > > > discrepancy.
> > >
> > > So with "depends on (X86 || IA64)", that tool doesn't complain
> > > anymore, right?
> > >
> > > Best regards,
> > > baolu
> >
> > Yes, with the depends it no longer happens.
>
> The dmar code only exists on X86 and IA64 arch's. Adding this depending
> makes sense to me. I will add it if no objections.

I think that works after Baolu's patchset that makes intel-iommu.h
private. I'm pretty sure it wouldn't have worked before that.

No objections.

--> Steve

-- 
Steve Wahl, Hewlett Packard Enterprise
Re: [PATCH v3 5/6] dt-bindings: iommu: mediatek: Add mediatek,pericfg phandle
On 09/06/2022 12:08, AngeloGioacchino Del Regno wrote:
> Add property "mediatek,pericfg" to let the mtk_iommu driver retrieve a
> phandle to the infracfg syscon instead of performing a per-soc
> compatible lookup in the entire devicetree and set it as a required
> property for MT8195's infra IOMMU.
>
> Signed-off-by: AngeloGioacchino Del Regno

Reviewed-by: Matthias Brugger

> ---
>  .../devicetree/bindings/iommu/mediatek,iommu.yaml | 14 ++
>  1 file changed, 14 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml b/Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
> index 4142a568b293..d5e3272a54e8 100644
> --- a/Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
> +++ b/Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
> @@ -116,6 +116,10 @@ properties:
>        Refer to bindings/memory-controllers/mediatek,smi-larb.yaml. It must sort
>        according to the local arbiter index, like larb0, larb1, larb2...
>  
> +  mediatek,pericfg:
> +    $ref: /schemas/types.yaml#/definitions/phandle
> +    description: The phandle to the mediatek pericfg syscon
> +
>    '#iommu-cells':
>      const: 1
>      description: |
> @@ -183,6 +187,16 @@ allOf:
>          required:
>            - mediatek,infracfg
>  
> +  - if:
> +      properties:
> +        compatible:
> +          contains:
> +            const: mediatek,mt8195-iommu-infra
> +
> +    then:
> +      required:
> +        - mediatek,pericfg
> +
>    - if: # The IOMMUs don't have larbs.
>        not:
>          properties:
Re: [PATCH v3 2/6] iommu: mtk_iommu: Lookup phandle to retrieve syscon to infracfg
On 09/06/2022 12:07, AngeloGioacchino Del Regno wrote: This driver will get support for more SoCs and the list of infracfg compatibles is expected to grow: in order to prevent getting this situation out of control and see a long list of compatible strings, add support to retrieve a handle to infracfg's regmap through a new "mediatek,infracfg" phandle. In order to keep retrocompatibility with older devicetrees, the old way is kept in place. Signed-off-by: AngeloGioacchino Del Regno Reviewed-by: Matthias Brugger --- drivers/iommu/mtk_iommu.c | 38 -- 1 file changed, 24 insertions(+), 14 deletions(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index bb9dd92c9898..90685946fcbe 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -1140,22 +1140,32 @@ static int mtk_iommu_probe(struct platform_device *pdev) data->protect_base = ALIGN(virt_to_phys(protect), MTK_PROTECT_PA_ALIGN); if (MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE)) { - switch (data->plat_data->m4u_plat) { - case M4U_MT2712: - p = "mediatek,mt2712-infracfg"; - break; - case M4U_MT8173: - p = "mediatek,mt8173-infracfg"; - break; - default: - p = NULL; + infracfg = syscon_regmap_lookup_by_phandle(dev->of_node, "mediatek,infracfg"); + if (IS_ERR(infracfg)) { + /* +* Legacy devicetrees will not specify a phandle to +* mediatek,infracfg: in that case, we use the older +* way to retrieve a syscon to infra. +* +* This is for retrocompatibility purposes only, hence +* no more compatibles shall be added to this. 
+*/ + switch (data->plat_data->m4u_plat) { + case M4U_MT2712: + p = "mediatek,mt2712-infracfg"; + break; + case M4U_MT8173: + p = "mediatek,mt8173-infracfg"; + break; + default: + p = NULL; + } + + infracfg = syscon_regmap_lookup_by_compatible(p); + if (IS_ERR(infracfg)) + return PTR_ERR(infracfg); } - infracfg = syscon_regmap_lookup_by_compatible(p); - - if (IS_ERR(infracfg)) - return PTR_ERR(infracfg); - ret = regmap_read(infracfg, REG_INFRA_MISC, ); if (ret) return ret; ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 2/4] dma-iommu: Add iommu_dma_opt_mapping_size()
On 06/06/2022 10:30, John Garry wrote: Add the IOMMU callback for DMA mapping API dma_opt_mapping_size(), which allows the drivers to know the optimal mapping limit and thus limit the requested IOVA lengths. This value is based on the IOVA rcache range limit, as IOVAs allocated above this limit must always be newly allocated, which may be quite slow. Can I please get some sort of ack from the IOMMU people on this one? Thanks, John EOM Signed-off-by: John Garry Reviewed-by: Damien Le Moal --- drivers/iommu/dma-iommu.c | 6 ++ drivers/iommu/iova.c | 5 + include/linux/iova.h | 2 ++ 3 files changed, 13 insertions(+) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index f90251572a5d..9e1586447ee8 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -1459,6 +1459,11 @@ static unsigned long iommu_dma_get_merge_boundary(struct device *dev) return (1UL << __ffs(domain->pgsize_bitmap)) - 1; } +static size_t iommu_dma_opt_mapping_size(void) +{ + return iova_rcache_range(); +} + static const struct dma_map_ops iommu_dma_ops = { .alloc = iommu_dma_alloc, .free = iommu_dma_free, @@ -1479,6 +1484,7 @@ static const struct dma_map_ops iommu_dma_ops = { .map_resource = iommu_dma_map_resource, .unmap_resource = iommu_dma_unmap_resource, .get_merge_boundary = iommu_dma_get_merge_boundary, + .opt_mapping_size = iommu_dma_opt_mapping_size, }; /* diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c index db77aa675145..9f00b58d546e 100644 --- a/drivers/iommu/iova.c +++ b/drivers/iommu/iova.c @@ -26,6 +26,11 @@ static unsigned long iova_rcache_get(struct iova_domain *iovad, static void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad); static void free_iova_rcaches(struct iova_domain *iovad); +unsigned long iova_rcache_range(void) +{ + return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1); +} + static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node) { struct iova_domain *iovad; diff --git a/include/linux/iova.h 
b/include/linux/iova.h index 320a70e40233..c6ba6d95d79c 100644 --- a/include/linux/iova.h +++ b/include/linux/iova.h @@ -79,6 +79,8 @@ static inline unsigned long iova_pfn(struct iova_domain *iovad, dma_addr_t iova) int iova_cache_get(void); void iova_cache_put(void); +unsigned long iova_rcache_range(void); + void free_iova(struct iova_domain *iovad, unsigned long pfn); void __free_iova(struct iova_domain *iovad, struct iova *iova); struct iova *alloc_iova(struct iova_domain *iovad, unsigned long size, ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 1/3] iommu/io-pgtable-arm-v7s: Add a quirk to allow pgtable PA up to 35bit
Hi, For some reason, this series has landed in my spam folder so apologies for the delay :/ On Sat, Jun 11, 2022 at 06:26:53PM +0800, yf.w...@mediatek.com wrote: > From: Yunfei Wang > > Single memory zone feature will remove ZONE_DMA32 and ZONE_DMA and > cause pgtable PA size larger than 32bit. > > Since Mediatek IOMMU hardware support at most 35bit PA in pgtable, > so add a quirk to allow the PA of pgtables support up to bit35. > > Signed-off-by: Ning Li > Signed-off-by: Yunfei Wang > --- > drivers/iommu/io-pgtable-arm-v7s.c | 48 ++ > include/linux/io-pgtable.h | 17 +++ > 2 files changed, 46 insertions(+), 19 deletions(-) > > diff --git a/drivers/iommu/io-pgtable-arm-v7s.c > b/drivers/iommu/io-pgtable-arm-v7s.c > index be066c1503d3..d4702d8d825a 100644 > --- a/drivers/iommu/io-pgtable-arm-v7s.c > +++ b/drivers/iommu/io-pgtable-arm-v7s.c > @@ -182,14 +182,8 @@ static bool arm_v7s_is_mtk_enabled(struct io_pgtable_cfg > *cfg) > (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT); > } > > -static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl, > - struct io_pgtable_cfg *cfg) > +static arm_v7s_iopte to_iopte_mtk(phys_addr_t paddr, arm_v7s_iopte pte) > { > - arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl); > - > - if (!arm_v7s_is_mtk_enabled(cfg)) > - return pte; > - > if (paddr & BIT_ULL(32)) > pte |= ARM_V7S_ATTR_MTK_PA_BIT32; > if (paddr & BIT_ULL(33)) > @@ -199,6 +193,17 @@ static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, > int lvl, > return pte; > } > > +static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl, > + struct io_pgtable_cfg *cfg) > +{ > + arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl); > + > + if (!arm_v7s_is_mtk_enabled(cfg)) > + return pte; > + > + return to_iopte_mtk(paddr, pte); nit, but can we rename and rework this so it reads a bit better, please? 
Something like: if (arm_v7s_is_mtk_enabled(cfg)) return to_mtk_iopte(paddr, pte); return pte; > static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl, > struct io_pgtable_cfg *cfg) > { > @@ -234,6 +239,7 @@ static arm_v7s_iopte *iopte_deref(arm_v7s_iopte pte, int > lvl, > static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp, > struct arm_v7s_io_pgtable *data) > { > + gfp_t gfp_l1 = __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA; > struct io_pgtable_cfg *cfg = >iop.cfg; > struct device *dev = cfg->iommu_dev; > phys_addr_t phys; > @@ -241,9 +247,11 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp, > size_t size = ARM_V7S_TABLE_SIZE(lvl, cfg); > void *table = NULL; > > + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT) > + gfp_l1 = GFP_KERNEL | __GFP_ZERO; I think it's a bit grotty to override the flags inline like this (same for the slab flag later on). Something like this is a bit cleaner: /* * Comment explaining why GFP_KERNEL is desirable here. * I'm assuming it's because the walker can address all of memory. */ gfp_l1 = cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT ? GFP_KERNEL : ARM_V7S_TABLE_GFP_DMA; ... __get_free_pages(gfp_l1 | __GFP_ZERO, ...); and similar for the slab flag. > if (lvl == 1) > - table = (void *)__get_free_pages( > - __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA, get_order(size)); > + table = (void *)__get_free_pages(gfp_l1, get_order(size)); > else if (lvl == 2) > table = kmem_cache_zalloc(data->l2_tables, gfp); > > @@ -251,7 +259,8 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp, > return NULL; > > phys = virt_to_phys(table); > - if (phys != (arm_v7s_iopte)phys) { > + if (phys != (arm_v7s_iopte)phys && > + !(cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT)) { > /* Doesn't fit in PTE */ Shouldn't we be checking that the address is within 35 bits here? Perhaps we should generate a mask from the oas instead of just using the cast. 
> dev_err(dev, "Page table does not fit in PTE: %pa", ); > goto out_free; > @@ -457,9 +466,14 @@ static arm_v7s_iopte arm_v7s_install_table(arm_v7s_iopte > *table, > arm_v7s_iopte curr, > struct io_pgtable_cfg *cfg) > { > + phys_addr_t phys = virt_to_phys(table); > arm_v7s_iopte old, new; > > - new = virt_to_phys(table) | ARM_V7S_PTE_TYPE_TABLE; > + new = phys | ARM_V7S_PTE_TYPE_TABLE; > + > + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT) > + new = to_iopte_mtk(phys, new); > + > if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS) >
Re: [PATCH 5/7] iommu: Add domain_type_supported() callback in iommu_ops
On 2022-06-13 15:38, Suthikulpanit, Suravee wrote: Robin, On 6/13/2022 4:31 PM, Robin Murphy wrote: On 2022-06-13 02:25, Suravee Suthikulpanit wrote: When user requests to change IOMMU domain to a new type, IOMMU generic layer checks the requested type against the default domain type returned by vendor-specific IOMMU driver. However, there is only one default domain type, and current mechanism does not allow if the requested type does not match the default type. I don't really follow the reasoning here. If a driver's def_domain_type callback returns a specific type, it's saying that the device *has* to have that specific domain type for driver/platform-specific reasons. Agree, and I understand this part. If that's not the case, then the driver shouldn't say so in the first place. Considering the case: 1. Boot w/ default domain = IOMMU_DOMAIN_DMA_FQ 2. User wants to change to IOMMU_DOMAIN_IDENTITY, which is not supported by IOMMU driver. In this case, IOMMU driver can return IOMMU_DOMAIN_DMA_FQ and prevent the mode change. 3. However, if user want to change to IOMMU_DOMAIN_DMA. The driver can support this. However, since the def_domain_type() returns IOMMU_DOMAIN_DMA_FQ, it ends up prevent the mode change. Why would a driver be forcing IOMMU_DOMAIN_DMA_FQ for a device though? Nobody's doing that today, and semantically it wouldn't really make sense - forcing translation to deny passthrough on a device-specific basis (beyond the common handling of untrusted devices) *might* be a thing, but the performance/strictness tradeoff of using a flush queue or not is surely a subjective user decision, not an objective platform one. IIUC, we should support step 3 above. Basically, with the newly proposed interface, it allows us to check with IOMMU driver if it can support certain domain types before trying to allocate the domain. Indeed we could do that - as a much more comprehensive change to the internal domain_alloc interfaces - but do we really need to? 
If we succeed at allocating a domain then we know it's supported; if it fails then we can't give the user what they asked for, regardless of the exact reason why - what do we gain from doubling the number of potential failure paths that we have to handle? Introducing check_domain_type_supported() callback in iommu_ops, which allows IOMMU generic layer to check with vendor-specific IOMMU driver whether the requested type is supported. This allows user to request types other than the default type. Note also that you're only adding this in the sysfs path - what about the "iommu.passthrough=" parameter or CONFIG_IOMMU_DEFAULT_PASSTHROUGH? For SNP case, we cannot enable SNP if iommu=off or iommu=pt or iommu.passthrough=1 or CONFIG_IOMMU_DEFAULT_PASSTHROUGH=y. So, when another driver tries to enable SNP, the IOMMU driver prevents it (see iommu_sev_snp_supported() in patch 3). Ugh, I hadn't looked too closely at the other patches, but an interface that looks like a simple "is this feature supported?" check with a secret side-effect of changing global behaviour as well? Yuck :( What external drivers are expected to have the authority to affect the entire system and call that? The fact that you're exporting it suggests they could be loaded from modules *after* v2 features have been enabled and/or the user has configured a non-default identity domain for a group via sysfs... Fun! Instead, if we boot with iommu.passhthrough=0, when another driver tries to enable SNP, the IOMMU driver allows this and switch to SNP enable mode. Subsequently, if user tries to switch a domain (via sysfs) to IOMMU_DOMAIN_IDENTITY, the IOMMU needs to prevent this because it has already switch to SNP-enabled mode. AFAICS there shouldn't need to be any core-level changes to support this. We already have drivers which don't support passthrough at all, so conditionally not supporting it should be no big deal. 
What should happen currently is that def_domain_type returns 0 for "don't care", then domain_alloc rejects IOMMU_DOMAIN_IDENTITY and and returns NULL, so iommu_group_alloc_default_domain() falls back to IOMMU_DOMAIN_DMA. Technically, we can do it the way you suggest. But isn't this confusing? At first, def_domain_type() returns 0 for "don't care", but then it rejects the request to change to IOMMU_DOMAIN_IDENTITY when trying to call domain_alloc(). Yes, that's how it works; def_domain_type is responsible for quirking individual *devices* that need to have a specific domain type (in practice, devices which need identity mapping), while domain_alloc is responsible for saying which domain types the driver supports as a whole, by allocating them or not as appropriate. We don't have a particularly neat way to achieve the negative of def_domain_type - i.e. saying that a specific device *can't* use a specific otherwise-supported domain type - other than subsequently failing in attach_dev, but so far we've not
[PATCH 10/10] ARM/dma-mapping: merge IOMMU ops
From: Robin Murphy The dma_sync_* operations are now the only difference between the coherent and non-coherent IOMMU ops. Some minor tweaks to make those safe for coherent devices with minimal overhead, and we can condense down to a single set of DMA ops. Signed-off-by: Robin Murphy Signed-off-by: Christoph Hellwig Tested-by: Marc Zyngier --- arch/arm/mm/dma-mapping.c | 37 + 1 file changed, 13 insertions(+), 24 deletions(-) diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index e7ccf7c82e025..e68d1d2ac4be0 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -1341,6 +1341,9 @@ static void arm_iommu_sync_sg_for_cpu(struct device *dev, struct scatterlist *s; int i; + if (dev->dma_coherent) + return; + for_each_sg(sg, s, nents, i) __dma_page_dev_to_cpu(sg_page(s), s->offset, s->length, dir); @@ -1360,6 +1363,9 @@ static void arm_iommu_sync_sg_for_device(struct device *dev, struct scatterlist *s; int i; + if (dev->dma_coherent) + return; + for_each_sg(sg, s, nents, i) __dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir); } @@ -1493,12 +1499,13 @@ static void arm_iommu_sync_single_for_cpu(struct device *dev, { struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev); dma_addr_t iova = handle & PAGE_MASK; - struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova)); + struct page *page; unsigned int offset = handle & ~PAGE_MASK; - if (!iova) + if (dev->dma_coherent || !iova) return; + page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova)); __dma_page_dev_to_cpu(page, offset, size, dir); } @@ -1507,12 +1514,13 @@ static void arm_iommu_sync_single_for_device(struct device *dev, { struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev); dma_addr_t iova = handle & PAGE_MASK; - struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova)); + struct page *page; unsigned int offset = handle & ~PAGE_MASK; - if (!iova) + if (dev->dma_coherent || !iova) return; + page = 
phys_to_page(iommu_iova_to_phys(mapping->domain, iova)); __dma_page_cpu_to_dev(page, offset, size, dir); } @@ -1536,22 +1544,6 @@ static const struct dma_map_ops iommu_ops = { .unmap_resource = arm_iommu_unmap_resource, }; -static const struct dma_map_ops iommu_coherent_ops = { - .alloc = arm_iommu_alloc_attrs, - .free = arm_iommu_free_attrs, - .mmap = arm_iommu_mmap_attrs, - .get_sgtable= arm_iommu_get_sgtable, - - .map_page = arm_iommu_map_page, - .unmap_page = arm_iommu_unmap_page, - - .map_sg = arm_iommu_map_sg, - .unmap_sg = arm_iommu_unmap_sg, - - .map_resource = arm_iommu_map_resource, - .unmap_resource = arm_iommu_unmap_resource, -}; - /** * arm_iommu_create_mapping * @bus: pointer to the bus holding the client device (for IOMMU calls) @@ -1750,10 +1742,7 @@ static void arm_setup_iommu_dma_ops(struct device *dev, u64 dma_base, u64 size, return; } - if (coherent) - set_dma_ops(dev, _coherent_ops); - else - set_dma_ops(dev, _ops); + set_dma_ops(dev, _ops); } static void arm_teardown_iommu_dma_ops(struct device *dev) -- 2.30.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 09/10] ARM/dma-mapping: consolidate IOMMU ops callbacks
From: Robin Murphy Merge the coherent and non-coherent callbacks down to a single implementation each, relying on the generic dev->dma_coherent flag at the points where the difference matters. Signed-off-by: Robin Murphy Signed-off-by: Christoph Hellwig Tested-by: Marc Zyngier --- arch/arm/mm/dma-mapping.c | 238 +- 1 file changed, 55 insertions(+), 183 deletions(-) diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 4055f2dc2859e..e7ccf7c82e025 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -1079,13 +1079,13 @@ static void __iommu_free_atomic(struct device *dev, void *cpu_addr, __free_from_pool(cpu_addr, size); } -static void *__arm_iommu_alloc_attrs(struct device *dev, size_t size, - dma_addr_t *handle, gfp_t gfp, unsigned long attrs, - int coherent_flag) +static void *arm_iommu_alloc_attrs(struct device *dev, size_t size, + dma_addr_t *handle, gfp_t gfp, unsigned long attrs) { pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL); struct page **pages; void *addr = NULL; + int coherent_flag = dev->dma_coherent ? 
COHERENT : NORMAL; *handle = DMA_MAPPING_ERROR; size = PAGE_ALIGN(size); @@ -1128,19 +1128,7 @@ static void *__arm_iommu_alloc_attrs(struct device *dev, size_t size, return NULL; } -static void *arm_iommu_alloc_attrs(struct device *dev, size_t size, - dma_addr_t *handle, gfp_t gfp, unsigned long attrs) -{ - return __arm_iommu_alloc_attrs(dev, size, handle, gfp, attrs, NORMAL); -} - -static void *arm_coherent_iommu_alloc_attrs(struct device *dev, size_t size, - dma_addr_t *handle, gfp_t gfp, unsigned long attrs) -{ - return __arm_iommu_alloc_attrs(dev, size, handle, gfp, attrs, COHERENT); -} - -static int __arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma, +static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma, void *cpu_addr, dma_addr_t dma_addr, size_t size, unsigned long attrs) { @@ -1154,35 +1142,24 @@ static int __arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma if (vma->vm_pgoff >= nr_pages) return -ENXIO; + if (!dev->dma_coherent) + vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot); + err = vm_map_pages(vma, pages, nr_pages); if (err) pr_err("Remapping memory failed: %d\n", err); return err; } -static int arm_iommu_mmap_attrs(struct device *dev, - struct vm_area_struct *vma, void *cpu_addr, - dma_addr_t dma_addr, size_t size, unsigned long attrs) -{ - vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot); - - return __arm_iommu_mmap_attrs(dev, vma, cpu_addr, dma_addr, size, attrs); -} - -static int arm_coherent_iommu_mmap_attrs(struct device *dev, - struct vm_area_struct *vma, void *cpu_addr, - dma_addr_t dma_addr, size_t size, unsigned long attrs) -{ - return __arm_iommu_mmap_attrs(dev, vma, cpu_addr, dma_addr, size, attrs); -} /* * free a page as defined by the above mapping. * Must not be called with IRQs disabled. 
*/ -static void __arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr, - dma_addr_t handle, unsigned long attrs, int coherent_flag) +static void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr, + dma_addr_t handle, unsigned long attrs) { + int coherent_flag = dev->dma_coherent ? COHERENT : NORMAL; struct page **pages; size = PAGE_ALIGN(size); @@ -1204,19 +1181,6 @@ static void __arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_ad __iommu_free_buffer(dev, pages, size, attrs); } -static void arm_iommu_free_attrs(struct device *dev, size_t size, -void *cpu_addr, dma_addr_t handle, -unsigned long attrs) -{ - __arm_iommu_free_attrs(dev, size, cpu_addr, handle, attrs, NORMAL); -} - -static void arm_coherent_iommu_free_attrs(struct device *dev, size_t size, - void *cpu_addr, dma_addr_t handle, unsigned long attrs) -{ - __arm_iommu_free_attrs(dev, size, cpu_addr, handle, attrs, COHERENT); -} - static int arm_iommu_get_sgtable(struct device *dev, struct sg_table *sgt, void *cpu_addr, dma_addr_t dma_addr, size_t size, unsigned long attrs) @@ -1236,8 +1200,7 @@ static int arm_iommu_get_sgtable(struct device *dev, struct sg_table *sgt, */ static int __map_sg_chunk(struct device *dev, struct scatterlist *sg, size_t size, dma_addr_t *handle, - enum dma_data_direction dir, unsigned long attrs, - bool is_coherent) +
[PATCH 08/10] ARM/dma-mapping: drop .dma_supported for IOMMU ops
From: Robin Murphy When an IOMMU is present, we trust that it should be capable of remapping any physical memory, and since the device masks represent the input (virtual) addresses to the IOMMU it makes no sense to validate them against physical PFNs anyway. Signed-off-by: Robin Murphy Signed-off-by: Christoph Hellwig Tested-by: Marc Zyngier --- arch/arm/mm/dma-mapping.c | 23 --- 1 file changed, 23 deletions(-) diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index ceb56928d01ec..4055f2dc2859e 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -104,25 +104,6 @@ static struct arm_dma_buffer *arm_dma_buffer_find(void *virt) * */ -#ifdef CONFIG_ARM_DMA_USE_IOMMU -/* - * Return whether the given device DMA address mask can be supported - * properly. For example, if your device can only drive the low 24-bits - * during bus mastering, then you would pass 0x00ff as the mask - * to this function. - */ -static int arm_dma_supported(struct device *dev, u64 mask) -{ - unsigned long max_dma_pfn = min(max_pfn - 1, arm_dma_pfn_limit); - - /* -* Translate the device's DMA mask to a PFN limit. This -* PFN number includes the page which we can DMA to. -*/ - return PHYS_PFN(dma_to_phys(dev, mask)) >= max_dma_pfn; -} -#endif - static void __dma_clear_buffer(struct page *page, size_t size, int coherent_flag) { /* @@ -1681,8 +1662,6 @@ static const struct dma_map_ops iommu_ops = { .map_resource = arm_iommu_map_resource, .unmap_resource = arm_iommu_unmap_resource, - - .dma_supported = arm_dma_supported, }; static const struct dma_map_ops iommu_coherent_ops = { @@ -1699,8 +1678,6 @@ static const struct dma_map_ops iommu_coherent_ops = { .map_resource = arm_iommu_map_resource, .unmap_resource = arm_iommu_unmap_resource, - - .dma_supported = arm_dma_supported, }; /** -- 2.30.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 07/10] ARM/dma-mapping: use dma-direct unconditionally
Use dma-direct unconditionally on arm. It has already been used for some time for LPAE and nommu configurations. This mostly changes the streaming mapping implementation and the (simple) coherent allocator for device that are DMA coherent. The existing complex allocator for uncached mappings for non-coherent devices is still used as is using the arch_dma_alloc/arch_dma_free hooks. Signed-off-by: Christoph Hellwig Reviewed-by: Arnd Bergmann Acked-by: Andre Przywara [highbank] Tested-by: Marc Zyngier --- arch/arm/Kconfig | 4 +- arch/arm/include/asm/dma-mapping.h | 24 -- arch/arm/mach-highbank/highbank.c | 2 +- arch/arm/mach-mvebu/coherency.c| 2 +- arch/arm/mm/dma-mapping.c | 365 ++--- 5 files changed, 19 insertions(+), 378 deletions(-) delete mode 100644 arch/arm/include/asm/dma-mapping.h diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index cd67e359958cb..4c18fe7b5d1cf 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -19,8 +19,8 @@ config ARM select ARCH_HAS_SET_MEMORY select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL select ARCH_HAS_STRICT_MODULE_RWX if MMU - select ARCH_HAS_SYNC_DMA_FOR_DEVICE if SWIOTLB || !MMU - select ARCH_HAS_SYNC_DMA_FOR_CPU if SWIOTLB || !MMU + select ARCH_HAS_SYNC_DMA_FOR_DEVICE + select ARCH_HAS_SYNC_DMA_FOR_CPU select ARCH_HAS_TEARDOWN_DMA_OPS if MMU select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST select ARCH_HAVE_CUSTOM_GPIO_H diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h deleted file mode 100644 index 6427b934bd11c..0 --- a/arch/arm/include/asm/dma-mapping.h +++ /dev/null @@ -1,24 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -#ifndef ASMARM_DMA_MAPPING_H -#define ASMARM_DMA_MAPPING_H - -#ifdef __KERNEL__ - -#include -#include - -#include -#include - -extern const struct dma_map_ops arm_dma_ops; -extern const struct dma_map_ops arm_coherent_dma_ops; - -static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus) -{ - if (IS_ENABLED(CONFIG_MMU) && 
!IS_ENABLED(CONFIG_ARM_LPAE)) - return _dma_ops; - return NULL; -} - -#endif /* __KERNEL__ */ -#endif diff --git a/arch/arm/mach-highbank/highbank.c b/arch/arm/mach-highbank/highbank.c index db607955a7e45..5d4f977ac7d2a 100644 --- a/arch/arm/mach-highbank/highbank.c +++ b/arch/arm/mach-highbank/highbank.c @@ -98,7 +98,7 @@ static int highbank_platform_notifier(struct notifier_block *nb, if (of_property_read_bool(dev->of_node, "dma-coherent")) { val = readl(sregs_base + reg); writel(val | 0xff01, sregs_base + reg); - set_dma_ops(dev, _coherent_dma_ops); + dev->dma_coherent = true; } return NOTIFY_OK; diff --git a/arch/arm/mach-mvebu/coherency.c b/arch/arm/mach-mvebu/coherency.c index 49e3c8d20c2fa..865ac4bc060df 100644 --- a/arch/arm/mach-mvebu/coherency.c +++ b/arch/arm/mach-mvebu/coherency.c @@ -98,7 +98,7 @@ static int mvebu_hwcc_notifier(struct notifier_block *nb, if (event != BUS_NOTIFY_ADD_DEVICE) return NOTIFY_DONE; - set_dma_ops(dev, _coherent_dma_ops); + dev->dma_coherent = true; return NOTIFY_OK; } diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index a09ce16c7ddbd..ceb56928d01ec 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -103,79 +103,8 @@ static struct arm_dma_buffer *arm_dma_buffer_find(void *virt) * before transfers and delay cache invalidation until transfer completion. * */ -static void __dma_page_cpu_to_dev(struct page *, unsigned long, - size_t, enum dma_data_direction); -static void __dma_page_dev_to_cpu(struct page *, unsigned long, - size_t, enum dma_data_direction); - -/** - * arm_dma_map_page - map a portion of a page for streaming DMA - * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices - * @page: page that buffer resides in - * @offset: offset into page for start of buffer - * @size: size of buffer to map - * @dir: DMA transfer direction - * - * Ensure that any data held in the cache is appropriately discarded - * or written back. 
- * - * The device owns this memory once this call has completed. The CPU - * can regain ownership by calling dma_unmap_page(). - */ -static dma_addr_t arm_dma_map_page(struct device *dev, struct page *page, -unsigned long offset, size_t size, enum dma_data_direction dir, -unsigned long attrs) -{ - if ((attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0) - __dma_page_cpu_to_dev(page, offset, size, dir); - return phys_to_dma(dev, page_to_phys(page) + offset); -} - -static dma_addr_t arm_coherent_dma_map_page(struct device *dev, struct page *page, -unsigned long offset, size_t size, enum dma_data_direction dir, -
[PATCH 05/10] ARM/dma-mapping: use dma_to_phys/phys_to_dma in the dma-mapping code
Use the helpers as expected by the dma-direct code in the old arm dma-mapping code to ease a gradual switch to the common DMA code. Signed-off-by: Christoph Hellwig Reviewed-by: Arnd Bergmann Tested-by: Marc Zyngier --- arch/arm/mm/dma-mapping.c | 24 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 90142183d1045..a09ce16c7ddbd 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -128,14 +128,14 @@ static dma_addr_t arm_dma_map_page(struct device *dev, struct page *page, { if ((attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0) __dma_page_cpu_to_dev(page, offset, size, dir); - return pfn_to_dma(dev, page_to_pfn(page)) + offset; + return phys_to_dma(dev, page_to_phys(page) + offset); } static dma_addr_t arm_coherent_dma_map_page(struct device *dev, struct page *page, unsigned long offset, size_t size, enum dma_data_direction dir, unsigned long attrs) { - return pfn_to_dma(dev, page_to_pfn(page)) + offset; + return phys_to_dma(dev, page_to_phys(page) + offset); } /** @@ -156,7 +156,7 @@ static void arm_dma_unmap_page(struct device *dev, dma_addr_t handle, size_t size, enum dma_data_direction dir, unsigned long attrs) { if ((attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0) - __dma_page_dev_to_cpu(pfn_to_page(dma_to_pfn(dev, handle)), + __dma_page_dev_to_cpu(phys_to_page(dma_to_phys(dev, handle)), handle & ~PAGE_MASK, size, dir); } @@ -164,7 +164,7 @@ static void arm_dma_sync_single_for_cpu(struct device *dev, dma_addr_t handle, size_t size, enum dma_data_direction dir) { unsigned int offset = handle & (PAGE_SIZE - 1); - struct page *page = pfn_to_page(dma_to_pfn(dev, handle-offset)); + struct page *page = phys_to_page(dma_to_phys(dev, handle-offset)); __dma_page_dev_to_cpu(page, offset, size, dir); } @@ -172,7 +172,7 @@ static void arm_dma_sync_single_for_device(struct device *dev, dma_addr_t handle, size_t size, enum dma_data_direction dir) { unsigned int offset = handle & (PAGE_SIZE - 1); - struct 
page *page = pfn_to_page(dma_to_pfn(dev, handle-offset)); + struct page *page = phys_to_page(dma_to_phys(dev, handle-offset)); __dma_page_cpu_to_dev(page, offset, size, dir); } @@ -190,7 +190,7 @@ static int arm_dma_supported(struct device *dev, u64 mask) * Translate the device's DMA mask to a PFN limit. This * PFN number includes the page which we can DMA to. */ - return dma_to_pfn(dev, mask) >= max_dma_pfn; + return PHYS_PFN(dma_to_phys(dev, mask)) >= max_dma_pfn; } static void __dma_clear_buffer(struct page *page, size_t size, int coherent_flag) @@ -681,7 +681,7 @@ static void *__dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, if (page) { unsigned long flags; - *handle = pfn_to_dma(dev, page_to_pfn(page)); + *handle = phys_to_dma(dev, page_to_phys(page)); buf->virt = args.want_vaddr ? addr : page; spin_lock_irqsave(_dma_bufs_lock, flags); @@ -721,7 +721,7 @@ static int __arm_dma_mmap(struct device *dev, struct vm_area_struct *vma, int ret = -ENXIO; unsigned long nr_vma_pages = vma_pages(vma); unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT; - unsigned long pfn = dma_to_pfn(dev, dma_addr); + unsigned long pfn = PHYS_PFN(dma_to_phys(dev, dma_addr)); unsigned long off = vma->vm_pgoff; if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, )) @@ -762,7 +762,7 @@ static void __arm_dma_free(struct device *dev, size_t size, void *cpu_addr, dma_addr_t handle, unsigned long attrs, bool is_coherent) { - struct page *page = pfn_to_page(dma_to_pfn(dev, handle)); + struct page *page = phys_to_page(dma_to_phys(dev, handle)); struct arm_dma_buffer *buf; struct arm_dma_free_args args = { .dev = dev, @@ -796,15 +796,15 @@ static int arm_dma_get_sgtable(struct device *dev, struct sg_table *sgt, void *cpu_addr, dma_addr_t handle, size_t size, unsigned long attrs) { - unsigned long pfn = dma_to_pfn(dev, handle); + phys_addr_t paddr = dma_to_phys(dev, handle); struct page *page; int ret; /* If the PFN is not valid, we do not have a struct page */ - if 
(!pfn_valid(pfn)) + if (!pfn_valid(PHYS_PFN(paddr))) return -ENXIO; - page = pfn_to_page(pfn); + page = phys_to_page(paddr); ret = sg_alloc_table(sgt, 1, GFP_KERNEL); if (unlikely(ret)) -- 2.30.2 ___ iommu mailing list
[PATCH 06/10] ARM/dma-mapping: use the generic versions of dma_to_phys/phys_to_dma by default
Only the footbridge platforms provide their own DMA address translation helpers, so switch to the generic version for all other platforms, and consolidate the footbridge implementation to remove two levels of indirection. Signed-off-by: Christoph Hellwig Reviewed-by: Arnd Bergmann Tested-by: Marc Zyngier --- arch/arm/Kconfig | 1 - arch/arm/include/asm/dma-direct.h | 41 +-- arch/arm/include/asm/memory.h | 2 - arch/arm/mach-footbridge/Kconfig | 1 + arch/arm/mach-footbridge/common.c | 19 + .../mach-footbridge/include/mach/dma-direct.h | 8 .../arm/mach-footbridge/include/mach/memory.h | 4 -- 7 files changed, 21 insertions(+), 55 deletions(-) create mode 100644 arch/arm/mach-footbridge/include/mach/dma-direct.h diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 7630ba9cb6ccc..cd67e359958cb 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -15,7 +15,6 @@ config ARM select ARCH_HAS_MEMBARRIER_SYNC_CORE select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE select ARCH_HAS_PTE_SPECIAL if ARM_LPAE - select ARCH_HAS_PHYS_TO_DMA select ARCH_HAS_SETUP_DMA_OPS select ARCH_HAS_SET_MEMORY select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL diff --git a/arch/arm/include/asm/dma-direct.h b/arch/arm/include/asm/dma-direct.h index 6fd52713b5d12..4f7bcde03abb5 100644 --- a/arch/arm/include/asm/dma-direct.h +++ b/arch/arm/include/asm/dma-direct.h @@ -1,40 +1 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -#ifndef ASM_ARM_DMA_DIRECT_H -#define ASM_ARM_DMA_DIRECT_H 1 - -#include - -/* - * dma_to_pfn/pfn_to_dma are architecture private - * functions used internally by the DMA-mapping API to provide DMA - * addresses. They must not be used by drivers. 
- */ -static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn) -{ - if (dev && dev->dma_range_map) - pfn = PFN_DOWN(translate_phys_to_dma(dev, PFN_PHYS(pfn))); - return (dma_addr_t)__pfn_to_bus(pfn); -} - -static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr) -{ - unsigned long pfn = __bus_to_pfn(addr); - - if (dev && dev->dma_range_map) - pfn = PFN_DOWN(translate_dma_to_phys(dev, PFN_PHYS(pfn))); - return pfn; -} - -static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr) -{ - unsigned int offset = paddr & ~PAGE_MASK; - return pfn_to_dma(dev, __phys_to_pfn(paddr)) + offset; -} - -static inline phys_addr_t dma_to_phys(struct device *dev, dma_addr_t dev_addr) -{ - unsigned int offset = dev_addr & ~PAGE_MASK; - return __pfn_to_phys(dma_to_pfn(dev, dev_addr)) + offset; -} - -#endif /* ASM_ARM_DMA_DIRECT_H */ +#include diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h index f673e13e0f942..a55a9038abc89 100644 --- a/arch/arm/include/asm/memory.h +++ b/arch/arm/include/asm/memory.h @@ -378,8 +378,6 @@ static inline unsigned long __virt_to_idmap(unsigned long x) #ifndef __virt_to_bus #define __virt_to_bus __virt_to_phys #define __bus_to_virt __phys_to_virt -#define __pfn_to_bus(x)__pfn_to_phys(x) -#define __bus_to_pfn(x)__phys_to_pfn(x) #endif /* diff --git a/arch/arm/mach-footbridge/Kconfig b/arch/arm/mach-footbridge/Kconfig index 728aff93fba9d..b5bbdcf2e4896 100644 --- a/arch/arm/mach-footbridge/Kconfig +++ b/arch/arm/mach-footbridge/Kconfig @@ -60,6 +60,7 @@ endmenu # Footbridge support config FOOTBRIDGE + select ARCH_HAS_PHYS_TO_DMA bool # Footbridge in host mode diff --git a/arch/arm/mach-footbridge/common.c b/arch/arm/mach-footbridge/common.c index 322495df271d5..5020eb96b025d 100644 --- a/arch/arm/mach-footbridge/common.c +++ b/arch/arm/mach-footbridge/common.c @@ -12,6 +12,7 @@ #include #include #include +#include #include #include @@ -335,17 +336,19 @@ unsigned long 
__bus_to_virt(unsigned long res) return res; } EXPORT_SYMBOL(__bus_to_virt); - -unsigned long __pfn_to_bus(unsigned long pfn) +#else +static inline unsigned long fb_bus_sdram_offset(void) { - return __pfn_to_phys(pfn) + (fb_bus_sdram_offset() - PHYS_OFFSET); + return BUS_OFFSET; } -EXPORT_SYMBOL(__pfn_to_bus); +#endif /* CONFIG_FOOTBRIDGE_ADDIN */ -unsigned long __bus_to_pfn(unsigned long bus) +dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr) { - return __phys_to_pfn(bus - (fb_bus_sdram_offset() - PHYS_OFFSET)); + return paddr + (fb_bus_sdram_offset() - PHYS_OFFSET); } -EXPORT_SYMBOL(__bus_to_pfn); -#endif +phys_addr_t dma_to_phys(struct device *dev, dma_addr_t dev_addr) +{ + return dev_addr - (fb_bus_sdram_offset() - PHYS_OFFSET); +} diff --git a/arch/arm/mach-footbridge/include/mach/dma-direct.h b/arch/arm/mach-footbridge/include/mach/dma-direct.h new file mode 100644 index 0..01f9e8367c009 --- /dev/null +++
[PATCH 04/10] ARM/dma-mapping: remove the unused virt_to_dma helper
virt_to_dma was only used by the now removed dmabounce code. Signed-off-by: Christoph Hellwig Reviewed-by: Arnd Bergmann Tested-by: Marc Zyngier --- arch/arm/include/asm/dma-direct.h | 10 +- 1 file changed, 1 insertion(+), 9 deletions(-) diff --git a/arch/arm/include/asm/dma-direct.h b/arch/arm/include/asm/dma-direct.h index 77fcb7ee5ec90..6fd52713b5d12 100644 --- a/arch/arm/include/asm/dma-direct.h +++ b/arch/arm/include/asm/dma-direct.h @@ -5,7 +5,7 @@ #include /* - * dma_to_pfn/pfn_to_dma/virt_to_dma are architecture private + * dma_to_pfn/pfn_to_dma are architecture private * functions used internally by the DMA-mapping API to provide DMA * addresses. They must not be used by drivers. */ @@ -25,14 +25,6 @@ static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr) return pfn; } -static inline dma_addr_t virt_to_dma(struct device *dev, void *addr) -{ - if (dev) - return pfn_to_dma(dev, virt_to_pfn(addr)); - - return (dma_addr_t)__virt_to_bus((unsigned long)(addr)); -} - static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr) { unsigned int offset = paddr & ~PAGE_MASK; -- 2.30.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 03/10] ARM/dma-mapping: mark various dma-mapping routines static in dma-mapping.c
With the dmabounce removal these aren't used outside of dma-mapping.c, so mark them static. Move the dma_map_ops declarations down a bit to avoid lots of forward declarations. Signed-off-by: Christoph Hellwig Reviewed-by: Arnd Bergmann Tested-by: Marc Zyngier --- arch/arm/include/asm/dma-mapping.h | 75 -- arch/arm/mm/dma-mapping.c | 100 + 2 files changed, 46 insertions(+), 129 deletions(-) diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h index 1e015a7ad86aa..6427b934bd11c 100644 --- a/arch/arm/include/asm/dma-mapping.h +++ b/arch/arm/include/asm/dma-mapping.h @@ -20,80 +20,5 @@ static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus) return NULL; } -/** - * arm_dma_alloc - allocate consistent memory for DMA - * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices - * @size: required memory size - * @handle: bus-specific DMA address - * @attrs: optinal attributes that specific mapping properties - * - * Allocate some memory for a device for performing DMA. This function - * allocates pages, and will return the CPU-viewed address, and sets @handle - * to be the device-viewed address. - */ -extern void *arm_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle, - gfp_t gfp, unsigned long attrs); - -/** - * arm_dma_free - free memory allocated by arm_dma_alloc - * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices - * @size: size of memory originally requested in dma_alloc_coherent - * @cpu_addr: CPU-view address returned from dma_alloc_coherent - * @handle: device-view address returned from dma_alloc_coherent - * @attrs: optinal attributes that specific mapping properties - * - * Free (and unmap) a DMA buffer previously allocated by - * arm_dma_alloc(). - * - * References to memory and mappings associated with cpu_addr/handle - * during and after this call executing are illegal. 
- */ -extern void arm_dma_free(struct device *dev, size_t size, void *cpu_addr, -dma_addr_t handle, unsigned long attrs); - -/** - * arm_dma_mmap - map a coherent DMA allocation into user space - * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices - * @vma: vm_area_struct describing requested user mapping - * @cpu_addr: kernel CPU-view address returned from dma_alloc_coherent - * @handle: device-view address returned from dma_alloc_coherent - * @size: size of memory originally requested in dma_alloc_coherent - * @attrs: optinal attributes that specific mapping properties - * - * Map a coherent DMA buffer previously allocated by dma_alloc_coherent - * into user space. The coherent DMA buffer must not be freed by the - * driver until the user space mapping has been released. - */ -extern int arm_dma_mmap(struct device *dev, struct vm_area_struct *vma, - void *cpu_addr, dma_addr_t dma_addr, size_t size, - unsigned long attrs); - -/* - * For SA-, IXP425, and ADI systems the dma-mapping functions are "magic" - * and utilize bounce buffers as needed to work around limited DMA windows. - * - * On the SA-, a bug limits DMA to only certain regions of RAM. - * On the IXP425, the PCI inbound window is 64MB (256MB total RAM) - * On some ADI engineering systems, PCI inbound window is 32MB (12MB total RAM) - * - * The following are helper functions used by the dmabounce subystem - * - */ - -/* - * The scatter list versions of the above methods. 
- */ -extern int arm_dma_map_sg(struct device *, struct scatterlist *, int, - enum dma_data_direction, unsigned long attrs); -extern void arm_dma_unmap_sg(struct device *, struct scatterlist *, int, - enum dma_data_direction, unsigned long attrs); -extern void arm_dma_sync_sg_for_cpu(struct device *, struct scatterlist *, int, - enum dma_data_direction); -extern void arm_dma_sync_sg_for_device(struct device *, struct scatterlist *, int, - enum dma_data_direction); -extern int arm_dma_get_sgtable(struct device *dev, struct sg_table *sgt, - void *cpu_addr, dma_addr_t dma_addr, size_t size, - unsigned long attrs); - #endif /* __KERNEL__ */ #endif diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 059cce0185706..90142183d1045 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -193,50 +193,6 @@ static int arm_dma_supported(struct device *dev, u64 mask) return dma_to_pfn(dev, mask) >= max_dma_pfn; } -const struct dma_map_ops arm_dma_ops = { - .alloc = arm_dma_alloc, - .free = arm_dma_free, - .alloc_pages= dma_direct_alloc_pages, - .free_pages = dma_direct_free_pages, - .mmap = arm_dma_mmap, - .get_sgtable= arm_dma_get_sgtable, -
[PATCH 01/10] ARM: sa1100/assabet: move dmabounce hack to ohci driver
From: Arnd Bergmann

The sa platform is one of the two remaining users of the old Arm specific "dmabounce" code, which is an earlier implementation of the generic swiotlb.

Linus Walleij submitted a patch that removes dmabounce support from the ixp4xx, and I had a look at the other user, which is the sa companion chip. Looking at how dmabounce is used, I could narrow it down to one driver on three machines:

 - dmabounce is only initialized on assabet/neponset, jornada720 and
   badge4, which are the platforms that have an sa and support DMA
   on it.
 - All three of these suffer from "erratum #7" that requires only
   doing DMA to half the memory sections based on one of the address
   lines, in addition, the neponset also can't DMA to the RAM that is
   connected to sa itself.
 - the pxa lubbock machine also has sa, but does not support DMA on
   it and does not set dmabounce.
 - only the OHCI and audio devices on sa support DMA, but as there is
   no audio driver for this hardware, only OHCI remains.

In the OHCI code, I noticed that two other platforms already have local bounce buffer support in the form of the "local_mem" allocator. Specifically, TMIO and SM501 use this on a few other ARM boards with 16KB or 128KB of local SRAM that can be accessed from the OHCI and from the CPU.

While this is not the same problem as on sa, I could not find a reason why we can't re-use the existing implementation but replace the physical SRAM address mapping with a locally allocated DMA buffer.

There are two main downsides:

 - rather than using a dynamically sized pool, this buffer needs to be
   allocated at probe time using a fixed size. Without having any idea
   of what it should be, I picked a size of 64KB, which is between what
   the other two OHCI front-ends use in their SRAM. If anyone has a
   better idea of what size is reasonable, this can be trivially
   changed.
- Previously, only USB transfers to unaddressable memory needed to go through the bounce buffer, now all of them do, which may impact runtime performance for USB endpoints that do a lot of transfers. On the upside, the local_mem support uses write-combining buffers, which should be a bit faster for transfers to the device compared to normal uncached coherent memory as used in dmabounce. Cc: Linus Walleij Cc: Russell King Cc: Christoph Hellwig Cc: Laurentiu Tudor Cc: linux-...@vger.kernel.org Signed-off-by: Arnd Bergmann Reviewed-by: Greg Kroah-Hartman Acked-by: Alan Stern Signed-off-by: Christoph Hellwig --- arch/arm/common/Kconfig| 2 +- arch/arm/common/sa.c | 64 -- drivers/usb/core/hcd.c | 17 +++-- drivers/usb/host/ohci-sa.c | 25 + 4 files changed, 40 insertions(+), 68 deletions(-) diff --git a/arch/arm/common/Kconfig b/arch/arm/common/Kconfig index c8e198631d418..bc158fd227e12 100644 --- a/arch/arm/common/Kconfig +++ b/arch/arm/common/Kconfig @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 config SA bool - select DMABOUNCE if !ARCH_PXA + select ZONE_DMA if ARCH_SA1100 config DMABOUNCE bool diff --git a/arch/arm/common/sa.c b/arch/arm/common/sa.c index 2343e2b6214d7..f5e6990b8856b 100644 --- a/arch/arm/common/sa.c +++ b/arch/arm/common/sa.c @@ -1389,70 +1389,9 @@ void sa_driver_unregister(struct sa_driver *driver) } EXPORT_SYMBOL(sa_driver_unregister); -#ifdef CONFIG_DMABOUNCE -/* - * According to the "Intel StrongARM SA- Microprocessor Companion - * Chip Specification Update" (June 2000), erratum #7, there is a - * significant bug in the SA SDRAM shared memory controller. If - * an access to a region of memory above 1MB relative to the bank base, - * it is important that address bit 10 _NOT_ be asserted. Depending - * on the configuration of the RAM, bit 10 may correspond to one - * of several different (processor-relative) address bits. - * - * This routine only identifies whether or not a given DMA address - * is susceptible to the bug. 
- * - * This should only get called for sa_device types due to the - * way we configure our device dma_masks. - */ -static int sa_needs_bounce(struct device *dev, dma_addr_t addr, size_t size) -{ - /* -* Section 4.6 of the "Intel StrongARM SA- Development Module -* User's Guide" mentions that jumpers R51 and R52 control the -* target of SA- DMA (either SDRAM bank 0 on Assabet, or -* SDRAM bank 1 on Neponset). The default configuration selects -* Assabet, so any address in bank 1 is necessarily invalid. -*/ - return (machine_is_assabet() || machine_is_pfs168()) && - (addr >= 0xc800 || (addr + size) >= 0xc800); -} - -static int sa_notifier_call(struct notifier_block *n, unsigned long action, - void *data) -{ - struct sa_dev *dev = to_sa_device(data); - -
[PATCH 02/10] ARM/dma-mapping: remove dmabounce
Remove the now unused dmabounce code. Signed-off-by: Christoph Hellwig Reviewed-by: Arnd Bergmann --- arch/arm/common/Kconfig| 4 - arch/arm/common/Makefile | 1 - arch/arm/common/dmabounce.c| 582 - arch/arm/include/asm/device.h | 3 - arch/arm/include/asm/dma-mapping.h | 29 -- 5 files changed, 619 deletions(-) delete mode 100644 arch/arm/common/dmabounce.c diff --git a/arch/arm/common/Kconfig b/arch/arm/common/Kconfig index bc158fd227e12..d2fdb1796f488 100644 --- a/arch/arm/common/Kconfig +++ b/arch/arm/common/Kconfig @@ -3,10 +3,6 @@ config SA bool select ZONE_DMA if ARCH_SA1100 -config DMABOUNCE - bool - select ZONE_DMA - config KRAIT_L2_ACCESSORS bool diff --git a/arch/arm/common/Makefile b/arch/arm/common/Makefile index 8cd574be94cfe..7bae8cbaafe78 100644 --- a/arch/arm/common/Makefile +++ b/arch/arm/common/Makefile @@ -6,7 +6,6 @@ obj-y += firmware.o obj-$(CONFIG_SA) += sa.o -obj-$(CONFIG_DMABOUNCE)+= dmabounce.o obj-$(CONFIG_KRAIT_L2_ACCESSORS) += krait-l2-accessors.o obj-$(CONFIG_SHARP_LOCOMO) += locomo.o obj-$(CONFIG_SHARP_PARAM) += sharpsl_param.o diff --git a/arch/arm/common/dmabounce.c b/arch/arm/common/dmabounce.c deleted file mode 100644 index 7996c04393d50..0 --- a/arch/arm/common/dmabounce.c +++ /dev/null @@ -1,582 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0-only -/* - * arch/arm/common/dmabounce.c - * - * Special dma_{map/unmap/dma_sync}_* routines for systems that have - * limited DMA windows. These functions utilize bounce buffers to - * copy data to/from buffers located outside the DMA region. This - * only works for systems in which DMA memory is at the bottom of - * RAM, the remainder of memory is at the top and the DMA memory - * can be marked as ZONE_DMA. Anything beyond that such as discontiguous - * DMA windows will require custom implementations that reserve memory - * areas at early bootup. 
- * - * Original version by Brad Parker (b...@heeltoe.com) - * Re-written by Christopher Hoover - * Made generic by Deepak Saxena - * - * Copyright (C) 2002 Hewlett Packard Company. - * Copyright (C) 2004 MontaVista Software, Inc. - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include -#include - -#undef STATS - -#ifdef STATS -#define DO_STATS(X) do { X ; } while (0) -#else -#define DO_STATS(X) do { } while (0) -#endif - -/* ** */ - -struct safe_buffer { - struct list_head node; - - /* original request */ - void*ptr; - size_t size; - int direction; - - /* safe buffer info */ - struct dmabounce_pool *pool; - void*safe; - dma_addr_t safe_dma_addr; -}; - -struct dmabounce_pool { - unsigned long size; - struct dma_pool *pool; -#ifdef STATS - unsigned long allocs; -#endif -}; - -struct dmabounce_device_info { - struct device *dev; - struct list_head safe_buffers; -#ifdef STATS - unsigned long total_allocs; - unsigned long map_op_count; - unsigned long bounce_count; - int attr_res; -#endif - struct dmabounce_pool small; - struct dmabounce_pool large; - - rwlock_t lock; - - int (*needs_bounce)(struct device *, dma_addr_t, size_t); -}; - -#ifdef STATS -static ssize_t dmabounce_show(struct device *dev, struct device_attribute *attr, - char *buf) -{ - struct dmabounce_device_info *device_info = dev->archdata.dmabounce; - return sprintf(buf, "%lu %lu %lu %lu %lu %lu\n", - device_info->small.allocs, - device_info->large.allocs, - device_info->total_allocs - device_info->small.allocs - - device_info->large.allocs, - device_info->total_allocs, - device_info->map_op_count, - device_info->bounce_count); -} - -static DEVICE_ATTR(dmabounce_stats, 0400, dmabounce_show, NULL); -#endif - - -/* allocate a 'safe' buffer and keep track of it */ -static inline struct safe_buffer * -alloc_safe_buffer(struct dmabounce_device_info *device_info, void *ptr, - size_t size, enum dma_data_direction dir) -{ - struct safe_buffer *buf; - 
struct dmabounce_pool *pool; - struct device *dev = device_info->dev; - unsigned long flags; - - dev_dbg(dev, "%s(ptr=%p, size=%d, dir=%d)\n", - __func__, ptr, size, dir); - - if (size <= device_info->small.size) { - pool = _info->small; - } else if (size <= device_info->large.size) { - pool = _info->large; - } else { - pool = NULL; - } - - buf = kmalloc(sizeof(struct safe_buffer),
fully convert arm to use dma-direct v3
Hi all,

arm is the last platform not using the dma-direct code for directly mapped DMA. With the dmabounce removal from Arnd we can easily switch arm to always use dma-direct now (it already does for LPAE configs and nommu). I'd love to merge this series through the dma-mapping tree as it gives us the opportunity for additional core dma-mapping improvements.

Changes since v2:
 - rebased to Linux 5.19-rc2

Changes since v1:
 - remove another unused function
 - improve a few commit logs
 - add three more patches from Robin

Diffstat:
 arch/arm/common/dmabounce.c | 582 -
 arch/arm/include/asm/dma-mapping.h | 128 ---
 b/arch/arm/Kconfig | 5
 b/arch/arm/common/Kconfig | 6
 b/arch/arm/common/Makefile | 1
 b/arch/arm/common/sa.c | 64 -
 b/arch/arm/include/asm/device.h | 3
 b/arch/arm/include/asm/dma-direct.h | 49 -
 b/arch/arm/include/asm/memory.h | 2
 b/arch/arm/mach-footbridge/Kconfig | 1
 b/arch/arm/mach-footbridge/common.c | 19
 b/arch/arm/mach-footbridge/include/mach/dma-direct.h | 8
 b/arch/arm/mach-footbridge/include/mach/memory.h | 4
 b/arch/arm/mach-highbank/highbank.c | 2
 b/arch/arm/mach-mvebu/coherency.c | 2
 b/arch/arm/mm/dma-mapping.c | 649 ++-
 b/drivers/usb/core/hcd.c | 17
 b/drivers/usb/host/ohci-sa.c | 25
 18 files changed, 137 insertions(+), 1430 deletions(-)
New IOMMU mailing list coming
Hi all,

after several problems with the current IOMMU mailing list (no DKIM support, poor b4 interoperability) we have decided to move the IOMMU development discussions to a new list hosted at lists.linux.dev.

The new list is up and running already; to subscribe, please send an email to iommu+subscr...@lists.linux.dev. It is not yet archived, but there will be a hard switch between the old and the new list on July 5th, 2022. After that date the old list will no longer work and the archive will switch to the new mailing list.

Sorry for any inconvenience this might cause.

Regards,

	Joerg
Re: [PATCH 1/1] iommu/vt-d: Add set_dev_pasid callbacks for default domain
On 2022/6/14 15:19, Tian, Kevin wrote: From: Baolu Lu Sent: Tuesday, June 14, 2022 2:13 PM On 2022/6/14 13:36, Tian, Kevin wrote: From: Baolu Lu Sent: Tuesday, June 14, 2022 12:48 PM On 2022/6/14 12:02, Tian, Kevin wrote: From: Lu Baolu Sent: Tuesday, June 14, 2022 11:44 AM This allows the upper layers to set a domain to a PASID of a device if the PASID feature is supported by the IOMMU hardware. The typical use cases are, for example, kernel DMA with PASID and hardware assisted mediated device drivers. why is it not part of the series for those use cases? There is no consumer of added callbacks in this patch... It could be. I just wanted to maintain the integrity of Intel IOMMU driver implementation. but let's not add dead code. and this patch is actually a right step simply from set_dev_pasid() p.o.v hence you should include in any series which first tries to use that interface. Yes, that's my intention. If it reviews well, we can include it in the driver's implementation. Then you should make it clear in the first place. otherwise a patch like this implies a review for merge. Yeah! Will update this in the next version. Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 11/12] iommu/vt-d: Use device_domain_lock accurately
On 2022/6/14 15:16, Tian, Kevin wrote: From: Lu Baolu Sent: Tuesday, June 14, 2022 10:52 AM The device_domain_lock is used to protect the device tracking list of a domain. Remove unnecessary spin_lock/unlock()'s and move the necessary ones around the list access. Signed-off-by: Lu Baolu --- drivers/iommu/intel/iommu.c | 68 +++-- 1 file changed, 27 insertions(+), 41 deletions(-) [...] +iommu_support_dev_iotlb(struct dmar_domain *domain, struct intel_iommu *iommu, + u8 bus, u8 devfn) { - struct device_domain_info *info; - - assert_spin_locked(_domain_lock); + struct device_domain_info *info = NULL, *tmp; + unsigned long flags; if (!iommu->qi) return NULL; - list_for_each_entry(info, >devices, link) - if (info->iommu == iommu && info->bus == bus && - info->devfn == devfn) { - if (info->ats_supported && info->dev) - return info; + spin_lock_irqsave(_domain_lock, flags); + list_for_each_entry(tmp, >devices, link) { + if (tmp->iommu == iommu && tmp->bus == bus && + tmp->devfn == devfn) { + if (tmp->ats_supported) + info = tmp; Directly returning with unlock here is clearer than adding another tmp variable... Sure. @@ -2460,15 +2450,14 @@ static int domain_add_dev_info(struct dmar_domain *domain, struct device *dev) if (!iommu) return -ENODEV; - spin_lock_irqsave(_domain_lock, flags); - info->domain = domain; ret = domain_attach_iommu(domain, iommu); - if (ret) { - spin_unlock_irqrestore(_domain_lock, flags); + if (ret) return ret; - } + + spin_lock_irqsave(_domain_lock, flags); list_add(>link, >devices); spin_unlock_irqrestore(_domain_lock, flags); + info->domain = domain; This is incorrect. You need fully initialize the object before adding it to the list. Otherwise a search right after above unlock and before assigning info->domain will get a wrong data Fair enough. Will fix it in the next version. Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 10/12] iommu/vt-d: Fold __dmar_remove_one_dev_info() into its caller
On 2022/6/14 15:07, Tian, Kevin wrote: From: Lu Baolu Sent: Tuesday, June 14, 2022 10:52 AM Fold __dmar_remove_one_dev_info() into dmar_remove_one_dev_info() which is its only caller. Make the spin lock critical range only cover the device list change code and remove some unnecessary checks. Signed-off-by: Lu Baolu --- drivers/iommu/intel/iommu.c | 34 +- 1 file changed, 9 insertions(+), 25 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index af22690f44c8..8345e0c0824c 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -295,7 +295,6 @@ static LIST_HEAD(dmar_satc_units); static int g_num_of_iommus; static void dmar_remove_one_dev_info(struct device *dev); -static void __dmar_remove_one_dev_info(struct device_domain_info *info); int dmar_disabled = !IS_ENABLED(CONFIG_INTEL_IOMMU_DEFAULT_ON); int intel_iommu_sm = IS_ENABLED(CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON); @@ -4141,20 +4140,14 @@ static void domain_context_clear(struct device_domain_info *info) _context_clear_one_cb, info); } -static void __dmar_remove_one_dev_info(struct device_domain_info *info) +static void dmar_remove_one_dev_info(struct device *dev) { - struct dmar_domain *domain; - struct intel_iommu *iommu; - - assert_spin_locked(_domain_lock); - - if (WARN_ON(!info)) - return; - - iommu = info->iommu; - domain = info->domain; + struct device_domain_info *info = dev_iommu_priv_get(dev); + struct dmar_domain *domain = info->domain; this local variable is not required as there is just one reference to info->domain. Yes. It could be removed and use info->domain directly. btw I didn't see info->domain is cleared in this path. Is it correct? It's better to clear here. I will make this change in my in-process blocking domain series. But it doesn't cause any real problems because the Intel IOMMU driver supports default domain, hence the logic here is info->domain is replaced, but not cleared. 
Best regards,
baolu
Re: [PATCH v2 06/12] iommu/vt-d: Acquiring lock in domain ID allocation helpers
On 2022/6/14 14:52, Tian, Kevin wrote:
>> From: Lu Baolu
>> Sent: Tuesday, June 14, 2022 10:52 AM
>>
>> The iommu->lock is used to protect the per-IOMMU domain ID resource.
>> Moving the lock into the ID alloc/free helpers makes the code more
>> compact. At the same time, the device_domain_lock is irrelevant to
>> the domain ID resource, remove its assertion as well. On the other
>> hand, the iommu->lock is never used in interrupt context, there's no
>> need to use the irqsave variant of the spinlock calls.
>
> I still prefer to separating reduction of lock ranges from changing
> irqsave. Locking is tricky. From bisect p.o.v. it will be a lot easier
> if we just change one logic in one patch.

Fair enough. I will do this in the next version.

Best regards,
baolu
Re: [PATCH v2 03/12] iommu/vt-d: Remove clearing translation data in disable_dmar_iommu()
On 2022/6/14 14:49, Tian, Kevin wrote:
>> From: Lu Baolu
>> Sent: Tuesday, June 14, 2022 10:51 AM
>>
>> The disable_dmar_iommu() is called when IOMMU initialization fails or
>> the IOMMU is hot-removed from the system. In both cases, there is no
>> need to clear the IOMMU translation data structures for devices. On
>> the initialization path, the device probing only happens after the
>> IOMMU is initialized successfully, hence there're no translation data
>> structures.
>
> Out of curiosity. With kexec the IOMMU may contain stale mappings from
> the old kernel. Then is it meaningful to disable IOMMU after the new
> kernel fails to initialize it properly?

For the kexec kernel, if the IOMMU is detected to be pre-enabled, the IOMMU driver will try to copy tables from the old kernel. If copying the tables fails, the IOMMU driver will disable the IOMMU and do the normal initialization.

Best regards,
baolu
RE: [PATCH 1/1] iommu/vt-d: Add set_dev_pasid callbacks for default domain
> From: Baolu Lu > Sent: Tuesday, June 14, 2022 2:13 PM > > On 2022/6/14 13:36, Tian, Kevin wrote: > >> From: Baolu Lu > >> Sent: Tuesday, June 14, 2022 12:48 PM > >> > >> On 2022/6/14 12:02, Tian, Kevin wrote: > From: Lu Baolu > Sent: Tuesday, June 14, 2022 11:44 AM > > This allows the upper layers to set a domain to a PASID of a device > if the PASID feature is supported by the IOMMU hardware. The typical > use cases are, for example, kernel DMA with PASID and hardware > assisted mediated device drivers. > > >>> why is it not part of the series for those use cases? There is no consumer > >>> of added callbacks in this patch... > >> It could be. I just wanted to maintain the integrity of Intel IOMMU > >> driver implementation. > > but let's not add dead code. and this patch is actually a right step > > simply from set_dev_pasid() p.o.v hence you should include in any > > series which first tries to use that interface. > > > > Yes, that's my intention. If it reviews well, we can include it in the > driver's implementation. > Then you should make it clear in the first place. otherwise a patch like this implies a review for merge. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v2 11/12] iommu/vt-d: Use device_domain_lock accurately
> From: Lu Baolu > Sent: Tuesday, June 14, 2022 10:52 AM > > The device_domain_lock is used to protect the device tracking list of > a domain. Remove unnecessary spin_lock/unlock()'s and move the necessary > ones around the list access. > > Signed-off-by: Lu Baolu > --- > drivers/iommu/intel/iommu.c | 68 +++-- > 1 file changed, 27 insertions(+), 41 deletions(-) > [...] > +iommu_support_dev_iotlb(struct dmar_domain *domain, struct > intel_iommu *iommu, > + u8 bus, u8 devfn) > { > - struct device_domain_info *info; > - > - assert_spin_locked(_domain_lock); > + struct device_domain_info *info = NULL, *tmp; > + unsigned long flags; > > if (!iommu->qi) > return NULL; > > - list_for_each_entry(info, >devices, link) > - if (info->iommu == iommu && info->bus == bus && > - info->devfn == devfn) { > - if (info->ats_supported && info->dev) > - return info; > + spin_lock_irqsave(_domain_lock, flags); > + list_for_each_entry(tmp, >devices, link) { > + if (tmp->iommu == iommu && tmp->bus == bus && > + tmp->devfn == devfn) { > + if (tmp->ats_supported) > + info = tmp; Directly returning with unlock here is clearer than adding another tmp variable... > @@ -2460,15 +2450,14 @@ static int domain_add_dev_info(struct > dmar_domain *domain, struct device *dev) > if (!iommu) > return -ENODEV; > > - spin_lock_irqsave(_domain_lock, flags); > - info->domain = domain; > ret = domain_attach_iommu(domain, iommu); > - if (ret) { > - spin_unlock_irqrestore(_domain_lock, flags); > + if (ret) > return ret; > - } > + > + spin_lock_irqsave(_domain_lock, flags); > list_add(>link, >devices); > spin_unlock_irqrestore(_domain_lock, flags); > + info->domain = domain; > This is incorrect. You need fully initialize the object before adding it to the list. Otherwise a search right after above unlock and before assigning info->domain will get a wrong data ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 01/12] iommu/vt-d: debugfs: Remove device_domain_lock usage
Hi Kevin,

Thanks for the review.

On 2022/6/14 14:43, Tian, Kevin wrote:
>> From: Lu Baolu
>> Sent: Tuesday, June 14, 2022 10:51 AM
>>
>> The domain_translation_struct debugfs node is used to dump the DMAR
>> page tables for the PCI devices. It potentially races with setting
>> domains to devices. The existing code uses a global spinlock
>> device_domain_lock to avoid the races, but this is problematic as
>> this lock is only used to protect the device tracking lists of each
>> domain.
>
> is it really problematic at this point? Before the following patches
> are applied, using device_domain_lock should have a similar effect as
> holding the group lock.

The device_domain_lock only protects the device tracking list of the
domain; it doesn't cover the domain pointer stored in the dev_info
structure. That's really protected by the group->mutex.

> Here it might make more sense to just focus on removing the use of
> device_domain_lock outside of iommu.c. Just that using the group lock
> is cleaner and more compatible with the following cleanups.

Fair enough. I will update the commit message with the above statement.

> and it's worth mentioning that racing with page table updates is out
> of the scope of this series. Probably also add a comment in the code
> to clarify this point.

Sure.

>> This replaces device_domain_lock with group->mutex to protect page
>> tables from setting a new domain. This also makes device_domain_lock
>> static as it is now only used inside the file.
>
> s/the file/iommu.c/

Sure.
>> Signed-off-by: Lu Baolu
>> ---
>>  drivers/iommu/intel/iommu.h   |  1 -
>>  drivers/iommu/intel/debugfs.c | 49 +--
>>  drivers/iommu/intel/iommu.c   |  2 +-
>>  3 files changed, 36 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
>> index a22adfbdf870..8a6d64d726c0 100644
>> --- a/drivers/iommu/intel/iommu.h
>> +++ b/drivers/iommu/intel/iommu.h
>> @@ -480,7 +480,6 @@ enum {
>>  #define VTD_FLAG_SVM_CAPABLE		(1 << 2)
>>
>>  extern int intel_iommu_sm;
>> -extern spinlock_t device_domain_lock;
>>
>>  #define sm_supported(iommu)	(intel_iommu_sm && ecap_smts((iommu)->ecap))
>>  #define pasid_supported(iommu)	(sm_supported(iommu) &&	\
>> diff --git a/drivers/iommu/intel/debugfs.c b/drivers/iommu/intel/debugfs.c
>> index d927ef10641b..5ebfe32265d5 100644
>> --- a/drivers/iommu/intel/debugfs.c
>> +++ b/drivers/iommu/intel/debugfs.c
>> @@ -342,15 +342,23 @@ static void pgtable_walk_level(struct seq_file *m, struct dma_pte *pde,
>>  	}
>>  }
>>
>> -static int show_device_domain_translation(struct device *dev, void *data)
>> +struct show_domain_opaque {
>> +	struct device *dev;
>> +	struct seq_file *m;
>> +};
>
> Sounds redundant as both bus_for_each_dev() and
> iommu_group_for_each_dev() declare the same fn type which accepts a
> device pointer and void *data.
>
>> +
>> +static int __show_device_domain_translation(struct device *dev, void *data)
>>  {
>> -	struct device_domain_info *info = dev_iommu_priv_get(dev);
>> -	struct dmar_domain *domain = info->domain;
>> -	struct seq_file *m = data;
>> +	struct show_domain_opaque *opaque = data;
>> +	struct device_domain_info *info;
>> +	struct seq_file *m = opaque->m;
>> +	struct dmar_domain *domain;
>>  	u64 path[6] = { 0 };
>>
>> -	if (!domain)
>> +	if (dev != opaque->dev)
>>  		return 0;
>
> not required.

Together with the above comment: the iommu group might have other
devices. I only want to dump the domain of the specific @opaque->dev.
It reads a bit confusingly, but it's the only helper I can use outside
of drivers/iommu/iommu.c.

Or, since all devices in the iommu group share the same domain, perhaps
only dump once?
>> +	info = dev_iommu_priv_get(dev);
>> +	domain = info->domain;
>>
>>  	seq_printf(m, "Device %s @0x%llx\n", dev_name(dev),
>>  		   (u64)virt_to_phys(domain->pgd));
>> @@ -359,20 +367,33 @@ static int show_device_domain_translation(struct device *dev, void *data)
>>  	pgtable_walk_level(m, domain->pgd, domain->agaw + 2, 0, path);
>>  	seq_putc(m, '\n');
>>
>> -	return 0;
>> +	return 1;
>>  }
>>
>> -static int domain_translation_struct_show(struct seq_file *m, void *unused)
>> +static int show_device_domain_translation(struct device *dev, void *data)
>>  {
>> -	unsigned long flags;
>> -	int ret;
>> +	struct show_domain_opaque opaque = {dev, data};
>> +	struct iommu_group *group;
>>
>> -	spin_lock_irqsave(&device_domain_lock, flags);
>> -	ret = bus_for_each_dev(&pci_bus_type, NULL, m,
>> -			       show_device_domain_translation);
>> -	spin_unlock_irqrestore(&device_domain_lock, flags);
>> +	group = iommu_group_get(dev);
>> +	if (group) {
>> +		/*
>> +		 * The group->mutex is held across the callback, which will
>> +		 * block calls to iommu_attach/detach_group/device. Hence,
>> +		 * the domain of the device will not change during traversal.
>> +		 */
>> +		iommu_group_for_each_dev(group, &opaque,
RE: [PATCH v2 10/12] iommu/vt-d: Fold __dmar_remove_one_dev_info() into its caller
> From: Lu Baolu
> Sent: Tuesday, June 14, 2022 10:52 AM
>
> Fold __dmar_remove_one_dev_info() into dmar_remove_one_dev_info(),
> which is its only caller. Make the spin lock critical range only cover
> the device list change code and remove some unnecessary checks.
>
> Signed-off-by: Lu Baolu
> ---
>  drivers/iommu/intel/iommu.c | 34 +-
>  1 file changed, 9 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index af22690f44c8..8345e0c0824c 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -295,7 +295,6 @@ static LIST_HEAD(dmar_satc_units);
>  static int g_num_of_iommus;
>
>  static void dmar_remove_one_dev_info(struct device *dev);
> -static void __dmar_remove_one_dev_info(struct device_domain_info *info);
>
>  int dmar_disabled = !IS_ENABLED(CONFIG_INTEL_IOMMU_DEFAULT_ON);
>  int intel_iommu_sm = IS_ENABLED(CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON);
> @@ -4141,20 +4140,14 @@ static void domain_context_clear(struct device_domain_info *info)
>  				 &domain_context_clear_one_cb, info);
>  }
>
> -static void __dmar_remove_one_dev_info(struct device_domain_info *info)
> +static void dmar_remove_one_dev_info(struct device *dev)
>  {
> -	struct dmar_domain *domain;
> -	struct intel_iommu *iommu;
> -
> -	assert_spin_locked(&device_domain_lock);
> -
> -	if (WARN_ON(!info))
> -		return;
> -
> -	iommu = info->iommu;
> -	domain = info->domain;
> +	struct device_domain_info *info = dev_iommu_priv_get(dev);
> +	struct dmar_domain *domain = info->domain;

this local variable is not required as there is just one reference to
info->domain.

btw I didn't see info->domain cleared in this path. Is that correct?
> +	struct intel_iommu *iommu = info->iommu;
> +	unsigned long flags;
>
> -	if (info->dev && !dev_is_real_dma_subdevice(info->dev)) {
> +	if (!dev_is_real_dma_subdevice(info->dev)) {
>  		if (dev_is_pci(info->dev) && sm_supported(iommu))
>  			intel_pasid_tear_down_entry(iommu, info->dev,
>  						    PASID_RID2PASID, false);
> @@ -4164,20 +4157,11 @@ static void __dmar_remove_one_dev_info(struct device_domain_info *info)
>  		intel_pasid_free_table(info->dev);
>  	}
>
> -	list_del(&info->link);
> -	domain_detach_iommu(domain, iommu);
> -}
> -
> -static void dmar_remove_one_dev_info(struct device *dev)
> -{
> -	struct device_domain_info *info;
> -	unsigned long flags;
> -
>  	spin_lock_irqsave(&device_domain_lock, flags);
> -	info = dev_iommu_priv_get(dev);
> -	if (info)
> -		__dmar_remove_one_dev_info(info);
> +	list_del(&info->link);
>  	spin_unlock_irqrestore(&device_domain_lock, flags);
> +
> +	domain_detach_iommu(domain, iommu);
>  }
>
>  static int md_domain_init(struct dmar_domain *domain, int guest_width)
> --
> 2.25.1
RE: [PATCH v2 09/12] iommu/vt-d: Check device list of domain in domain free path
> From: Lu Baolu
> Sent: Tuesday, June 14, 2022 10:52 AM
>
> When the IOMMU domain is about to be freed, it should not be set on any
> device. Instead of silently dealing with some bug cases, it's better to
> trigger a warning to report and fix any potential bugs the first time.
>
> Signed-off-by: Lu Baolu
> Reviewed-by: Jason Gunthorpe

Reviewed-by: Kevin Tian
RE: [PATCH v2 08/12] iommu/vt-d: Replace spin_lock_irqsave() with spin_lock()
> From: Lu Baolu
> Sent: Tuesday, June 14, 2022 10:52 AM
>
> The iommu->lock is used to protect changes in root/context/pasid tables
> and domain ID allocation. There's no use case that changes these
> resources in any interrupt context. Hence there's no need to disable
> interrupts while holding the spinlock.
>
> Signed-off-by: Lu Baolu

With this as the one place changing irqsave for all previous patches:

Reviewed-by: Kevin Tian
RE: [PATCH v2 06/12] iommu/vt-d: Acquiring lock in domain ID allocation helpers
> From: Lu Baolu
> Sent: Tuesday, June 14, 2022 10:52 AM
>
> The iommu->lock is used to protect the per-IOMMU domain ID resource.
> Moving the lock into the ID alloc/free helpers makes the code more
> compact. At the same time, the device_domain_lock is irrelevant to
> the domain ID resource, so remove its assertion as well.
>
> On the other hand, the iommu->lock is never used in interrupt context,
> so there's no need to use the irqsave variant of the spinlock calls.

I still prefer separating the reduction of lock ranges from the irqsave
change. Locking is tricky. From a bisect p.o.v. it will be a lot easier
if we just change one piece of logic per patch.
RE: [PATCH v2 03/12] iommu/vt-d: Remove clearing translation data in disable_dmar_iommu()
> From: Lu Baolu
> Sent: Tuesday, June 14, 2022 10:51 AM
>
> The disable_dmar_iommu() is called when IOMMU initialization fails or
> the IOMMU is hot-removed from the system. In both cases, there is no
> need to clear the IOMMU translation data structures for devices.
>
> On the initialization path, the device probing only happens after the
> IOMMU is initialized successfully, hence there are no translation data
> structures.

Out of curiosity: with kexec the IOMMU may contain stale mappings from
the old kernel. Is it then meaningful to disable the IOMMU after the new
kernel fails to initialize it properly?

> On the hot-remove path, there is no real use case where the IOMMU is
> hot-removed, but the devices that it manages are still alive in the
> system. The translation data structures were torn down during device
> release, hence there's no need to repeat it in the IOMMU hot-remove
> path either. This removes the unnecessary code and only leaves a check.
>
> Signed-off-by: Lu Baolu
> ---
>  drivers/iommu/intel/pasid.h |  1 +
>  drivers/iommu/intel/iommu.c | 21 +++--
>  2 files changed, 8 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/iommu/intel/pasid.h b/drivers/iommu/intel/pasid.h
> index 583ea67fc783..2afbb2afe8cc 100644
> --- a/drivers/iommu/intel/pasid.h
> +++ b/drivers/iommu/intel/pasid.h
> @@ -39,6 +39,7 @@
>   * only and pass-through transfer modes.
>   */
>  #define FLPT_DEFAULT_DID		1
> +#define NUM_RESERVED_DID		2
>
>  /*
>   * The SUPERVISOR_MODE flag indicates a first level translation which
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index ff6018f746e0..695aed474e5d 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -1715,23 +1715,16 @@ static int iommu_init_domains(struct intel_iommu *iommu)
>
>  static void disable_dmar_iommu(struct intel_iommu *iommu)
>  {
> -	struct device_domain_info *info, *tmp;
> -	unsigned long flags;
> -
>  	if (!iommu->domain_ids)
>  		return;
>
> -	spin_lock_irqsave(&device_domain_lock, flags);
> -	list_for_each_entry_safe(info, tmp, &device_domain_list, global) {
> -		if (info->iommu != iommu)
> -			continue;
> -
> -		if (!info->dev || !info->domain)
> -			continue;
> -
> -		__dmar_remove_one_dev_info(info);
> -	}
> -	spin_unlock_irqrestore(&device_domain_lock, flags);
> +	/*
> +	 * All iommu domains must have been detached from the devices,
> +	 * hence there should be no domain IDs in use.
> +	 */
> +	if (WARN_ON(bitmap_weight(iommu->domain_ids, cap_ndoms(iommu->cap))
> +		    != NUM_RESERVED_DID))
> +		return;
>
>  	if (iommu->gcmd & DMA_GCMD_TE)
>  		iommu_disable_translation(iommu);
> --
> 2.25.1
RE: [PATCH v2 01/12] iommu/vt-d: debugfs: Remove device_domain_lock usage
> From: Lu Baolu
> Sent: Tuesday, June 14, 2022 10:51 AM
>
> The domain_translation_struct debugfs node is used to dump the DMAR
> page tables for the PCI devices. It potentially races with setting
> domains to devices. The existing code uses a global spinlock
> device_domain_lock to avoid the races, but this is problematic as this
> lock is only used to protect the device tracking lists of each domain.

is it really problematic at this point? Before the following patches are
applied, using device_domain_lock should have a similar effect as
holding the group lock.

Here it might make more sense to just focus on removing the use of
device_domain_lock outside of iommu.c. Just that using the group lock is
cleaner and more compatible with the following cleanups.

and it's worth mentioning that racing with page table updates is out of
the scope of this series. Probably also add a comment in the code to
clarify this point.

> This replaces device_domain_lock with group->mutex to protect page
> tables from setting a new domain. This also makes device_domain_lock
> static as it is now only used inside the file.
s/the file/iommu.c/

> Signed-off-by: Lu Baolu
> ---
>  drivers/iommu/intel/iommu.h   |  1 -
>  drivers/iommu/intel/debugfs.c | 49 +--
>  drivers/iommu/intel/iommu.c   |  2 +-
>  3 files changed, 36 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
> index a22adfbdf870..8a6d64d726c0 100644
> --- a/drivers/iommu/intel/iommu.h
> +++ b/drivers/iommu/intel/iommu.h
> @@ -480,7 +480,6 @@ enum {
>  #define VTD_FLAG_SVM_CAPABLE		(1 << 2)
>
>  extern int intel_iommu_sm;
> -extern spinlock_t device_domain_lock;
>
>  #define sm_supported(iommu)	(intel_iommu_sm && ecap_smts((iommu)->ecap))
>  #define pasid_supported(iommu)	(sm_supported(iommu) &&	\
> diff --git a/drivers/iommu/intel/debugfs.c b/drivers/iommu/intel/debugfs.c
> index d927ef10641b..5ebfe32265d5 100644
> --- a/drivers/iommu/intel/debugfs.c
> +++ b/drivers/iommu/intel/debugfs.c
> @@ -342,15 +342,23 @@ static void pgtable_walk_level(struct seq_file *m, struct dma_pte *pde,
>  	}
>  }
>
> -static int show_device_domain_translation(struct device *dev, void *data)
> +struct show_domain_opaque {
> +	struct device *dev;
> +	struct seq_file *m;
> +};

Sounds redundant as both bus_for_each_dev() and
iommu_group_for_each_dev() declare the same fn type which accepts a
device pointer and void *data.

> +
> +static int __show_device_domain_translation(struct device *dev, void *data)
>  {
> -	struct device_domain_info *info = dev_iommu_priv_get(dev);
> -	struct dmar_domain *domain = info->domain;
> -	struct seq_file *m = data;
> +	struct show_domain_opaque *opaque = data;
> +	struct device_domain_info *info;
> +	struct seq_file *m = opaque->m;
> +	struct dmar_domain *domain;
>  	u64 path[6] = { 0 };
>
> -	if (!domain)
> +	if (dev != opaque->dev)
>  		return 0;

not required.
> +	info = dev_iommu_priv_get(dev);
> +	domain = info->domain;
>
>  	seq_printf(m, "Device %s @0x%llx\n", dev_name(dev),
>  		   (u64)virt_to_phys(domain->pgd));
> @@ -359,20 +367,33 @@ static int show_device_domain_translation(struct device *dev, void *data)
>  	pgtable_walk_level(m, domain->pgd, domain->agaw + 2, 0, path);
>  	seq_putc(m, '\n');
>
> -	return 0;
> +	return 1;
>  }
>
> -static int domain_translation_struct_show(struct seq_file *m, void *unused)
> +static int show_device_domain_translation(struct device *dev, void *data)
>  {
> -	unsigned long flags;
> -	int ret;
> +	struct show_domain_opaque opaque = {dev, data};
> +	struct iommu_group *group;
>
> -	spin_lock_irqsave(&device_domain_lock, flags);
> -	ret = bus_for_each_dev(&pci_bus_type, NULL, m,
> -			       show_device_domain_translation);
> -	spin_unlock_irqrestore(&device_domain_lock, flags);
> +	group = iommu_group_get(dev);
> +	if (group) {
> +		/*
> +		 * The group->mutex is held across the callback, which will
> +		 * block calls to iommu_attach/detach_group/device. Hence,
> +		 * the domain of the device will not change during traversal.
> +		 */
> +		iommu_group_for_each_dev(group, &opaque,
> +					 __show_device_domain_translation);
> +		iommu_group_put(group);
> +	}
>
> -	return ret;
> +	return 0;
> +}
> +
> +static int domain_translation_struct_show(struct seq_file *m, void *unused)
> +{
> +	return bus_for_each_dev(&pci_bus_type, NULL, m,
> +				show_device_domain_translation);
>  }
>  DEFINE_SHOW_ATTRIBUTE(domain_translation_struct);
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 19024dc52735..a39d72a9d1cf 100644
> --- a/drivers/iommu/intel/iommu.c
> +++
Re: [PATCH 1/1] iommu/vt-d: Add set_dev_pasid callbacks for default domain
On 2022/6/14 13:36, Tian, Kevin wrote:
>> From: Baolu Lu
>> Sent: Tuesday, June 14, 2022 12:48 PM
>>
>> On 2022/6/14 12:02, Tian, Kevin wrote:
>>>> From: Lu Baolu
>>>> Sent: Tuesday, June 14, 2022 11:44 AM
>>>>
>>>> This allows the upper layers to set a domain to a PASID of a device
>>>> if the PASID feature is supported by the IOMMU hardware. The typical
>>>> use cases are, for example, kernel DMA with PASID and hardware
>>>> assisted mediated device drivers.
>>>
>>> why is it not part of the series for those use cases? There is no
>>> consumer of the added callbacks in this patch...
>>
>> It could be. I just wanted to maintain the integrity of the Intel
>> IOMMU driver implementation.
>
> but let's not add dead code. and this patch is actually a right step
> simply from the set_dev_pasid() p.o.v., hence you should include it in
> any series which first tries to use that interface.

Yes, that's my intention. If it reviews well, we can include it in the
driver's implementation.

Best regards,
baolu