[PATCH 2/2] arm64: dts: qcom: sm8250: Enable per-process page tables.

2022-06-14 Thread Emma Anholt
This is an SMMU for the adreno gpu, and adding this compatible lets
the driver use per-fd page tables, which are required for security
between GPU clients.

Signed-off-by: Emma Anholt 
---

Tested with a full deqp-vk run on RB5, which did involve some iommu faults.

 arch/arm64/boot/dts/qcom/sm8250.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/sm8250.dtsi b/arch/arm64/boot/dts/qcom/sm8250.dtsi
index a92230bec1dd..483c0e0f1d1a 100644
--- a/arch/arm64/boot/dts/qcom/sm8250.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8250.dtsi
@@ -2513,7 +2513,7 @@ gpucc: clock-controller@3d9 {
};
 
adreno_smmu: iommu@3da {
-   compatible = "qcom,sm8250-smmu-500", "arm,mmu-500";
+   compatible = "qcom,sm8250-smmu-500", "arm,mmu-500", "qcom,adreno-smmu";
reg = <0 0x03da 0 0x1>;
#iommu-cells = <2>;
#global-interrupts = <2>;
-- 
2.36.1
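For readers following along: with "qcom,adreno-smmu" matched, the GPU driver can manage per-context page tables, and a consumer node references this SMMU with two cells per #iommu-cells = <2>. A hedged, illustrative fragment — the node name, unit address and stream-ID values below are placeholders, not taken from sm8250.dtsi:

```dts
gpu@3d00000 {
	/* two cells, matching #iommu-cells = <2>: stream ID and SID mask */
	iommus = <&adreno_smmu 0 0x400>;
};
```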

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 1/2] iommu: arm-smmu-impl: Add 8250 display compatible to the client list.

2022-06-14 Thread Emma Anholt
Required for turning on per-process page tables for the GPU.

Signed-off-by: Emma Anholt 
---

 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index d8e1ef83c01b..bb9220937068 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -233,6 +233,7 @@ static const struct of_device_id 
qcom_smmu_client_of_match[] __maybe_unused = {
{ .compatible = "qcom,sc7280-mdss" },
{ .compatible = "qcom,sc7280-mss-pil" },
{ .compatible = "qcom,sc8180x-mdss" },
+   { .compatible = "qcom,sm8250-mdss" },
{ .compatible = "qcom,sdm845-mdss" },
{ .compatible = "qcom,sdm845-mss-pil" },
{ }
-- 
2.36.1
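The table this patch extends is an ordinary NULL-terminated of_device_id list that arm-smmu-qcom walks to decide whether a client device gets the Qualcomm implementation. A minimal user-space sketch of that lookup, with a hypothetical match_compatible() standing in for the kernel's of_match_node():

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct of_device_id: the
 * client list is just a NULL-terminated table of compatible strings. */
struct of_device_id { const char *compatible; };

static const struct of_device_id qcom_smmu_client_of_match[] = {
	{ .compatible = "qcom,sc7280-mdss" },
	{ .compatible = "qcom,sm8250-mdss" },	/* the entry added by this patch */
	{ .compatible = "qcom,sdm845-mdss" },
	{ .compatible = NULL }			/* sentinel */
};

/* Minimal analogue of of_match_node(): return the matching entry or NULL. */
static const struct of_device_id *
match_compatible(const struct of_device_id *table, const char *compat)
{
	for (; table->compatible; table++)
		if (!strcmp(table->compatible, compat))
			return table;
	return NULL;
}
```

Without the new entry, a "qcom,sm8250-mdss" client would fall through the table and miss the Qualcomm quirks.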



Re: [PATCH v2 01/12] iommu/vt-d: debugfs: Remove device_domain_lock usage

2022-06-14 Thread Baolu Lu

On 2022/6/14 14:43, Tian, Kevin wrote:

From: Lu Baolu
Sent: Tuesday, June 14, 2022 10:51 AM

The domain_translation_struct debugfs node is used to dump the DMAR page
tables for the PCI devices. It potentially races with setting domains to
devices. The existing code uses a global spinlock device_domain_lock to
avoid the races, but this is problematical as this lock is only used to
protect the device tracking lists of each domain.

Is it really problematic at this point? Before the following patches are
applied, using device_domain_lock should have a similar effect to holding
the group lock.

Here it might make more sense to just focus on removing the use of
device_domain_lock outside of iommu.c. It is just that using the group
lock is cleaner and more compatible with the following cleanups.

And it's worth mentioning that racing with page table updates is out
of the scope of this series. Probably also add a comment in the code
to clarify this point.



Hi Kevin,

How do you like the updated patch below?

From cecc9a0623780a11c4ea4d0a15aa6187f01541c4 Mon Sep 17 00:00:00 2001
From: Lu Baolu 
Date: Sun, 29 May 2022 10:18:56 +0800
Subject: [PATCH 1/1] iommu/vt-d: debugfs: Remove device_domain_lock usage

The domain_translation_struct debugfs node is used to dump the DMAR page
tables for the PCI devices. It potentially races with setting domains to
devices. The existing code uses the global spinlock device_domain_lock to
avoid the races.

This removes the use of device_domain_lock outside of iommu.c by replacing
it with the group mutex lock. Using the group mutex lock is cleaner and
more compatible with the following cleanups.

Signed-off-by: Lu Baolu 
---
 drivers/iommu/intel/debugfs.c | 42 +--
 drivers/iommu/intel/iommu.c   |  2 +-
 drivers/iommu/intel/iommu.h   |  1 -
 3 files changed, 31 insertions(+), 14 deletions(-)

diff --git a/drivers/iommu/intel/debugfs.c b/drivers/iommu/intel/debugfs.c
index d927ef10641b..f4acd8993f60 100644
--- a/drivers/iommu/intel/debugfs.c
+++ b/drivers/iommu/intel/debugfs.c
@@ -342,13 +342,13 @@ static void pgtable_walk_level(struct seq_file *m, 
struct dma_pte *pde,

}
 }

-static int show_device_domain_translation(struct device *dev, void *data)
+static int __show_device_domain_translation(struct device *dev, void *data)
 {
-   struct device_domain_info *info = dev_iommu_priv_get(dev);
-   struct dmar_domain *domain = info->domain;
+   struct dmar_domain *domain;
struct seq_file *m = data;
u64 path[6] = { 0 };

+   domain = to_dmar_domain(iommu_get_domain_for_dev(dev));
if (!domain)
return 0;

@@ -359,20 +359,38 @@ static int show_device_domain_translation(struct 
device *dev, void *data)

pgtable_walk_level(m, domain->pgd, domain->agaw + 2, 0, path);
seq_putc(m, '\n');

-   return 0;
+   return 1;
 }

-static int domain_translation_struct_show(struct seq_file *m, void *unused)
+static int show_device_domain_translation(struct device *dev, void *data)
 {
-   unsigned long flags;
-   int ret;
+   struct iommu_group *group;

-   spin_lock_irqsave(&device_domain_lock, flags);
-   ret = bus_for_each_dev(&pci_bus_type, NULL, m,
-  show_device_domain_translation);
-   spin_unlock_irqrestore(&device_domain_lock, flags);
+   group = iommu_group_get(dev);
+   if (group) {
+   /*
+* The group->mutex is held across the callback, which will
+* block calls to iommu_attach/detach_group/device. Hence,
+* the domain of the device will not change during traversal.
+*
+* All devices in an iommu group share a single domain, hence
+* we only dump the domain of the first device. Even though,
+* this code still possibly races with the iommu_unmap()
+* interface. This could be solved by RCU-freeing the page
+* table pages in the iommu_unmap() path.
+*/
+   iommu_group_for_each_dev(group, data,
+__show_device_domain_translation);
+   iommu_group_put(group);
+   }

-   return ret;
+   return 0;
+}
+
+static int domain_translation_struct_show(struct seq_file *m, void *unused)
+{
+   return bus_for_each_dev(&pci_bus_type, NULL, m,
+   show_device_domain_translation);
 }
 DEFINE_SHOW_ATTRIBUTE(domain_translation_struct);

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 19024dc52735..a39d72a9d1cf 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -314,7 +314,7 @@ static int iommu_skip_te_disable;
 #define IDENTMAP_GFX   2
 #define IDENTMAP_AZALIA4

-DEFINE_SPINLOCK(device_domain_lock);
+static DEFINE_SPINLOCK(device_domain_lock);
 static LIST_HEAD(device_domain_list);

 /*
diff --git a/drivers/iommu/intel/iommu.h 
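The `return 1` from __show_device_domain_translation() relies on iommu_group_for_each_dev() stopping at the first non-zero callback return, which is how only the first device of each group gets dumped. A toy model of that iteration contract — the names here are illustrative, not the real iommu core API:

```c
#include <assert.h>

struct dev { int id; };

static int ndumped;

/* Callback in the style of __show_device_domain_translation(): dump one
 * device's domain, then return non-zero so iteration stops early. */
static int show_one(struct dev *d, void *data)
{
	(void)d; (void)data;
	ndumped++;
	return 1;	/* non-zero: stop after the first device */
}

/* Minimal analogue of iommu_group_for_each_dev(): call fn on each
 * device until it returns a non-zero value. */
static int for_each_dev(struct dev *devs, int n,
			int (*fn)(struct dev *, void *), void *data)
{
	int ret = 0;
	for (int i = 0; i < n && !ret; i++)
		ret = fn(&devs[i], data);
	return ret;
}
```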

Re: [PATCH v2] iommu/vt-d: Make DMAR_UNITS_SUPPORTED a config setting

2022-06-14 Thread Baolu Lu

On 2022/6/15 05:12, Steve Wahl wrote:

On Tue, Jun 14, 2022 at 12:01:45PM -0700, Jerry Snitselaar wrote:

On Tue, Jun 14, 2022 at 11:45:35AM -0500, Steve Wahl wrote:

On Tue, Jun 14, 2022 at 10:21:29AM +0800, Baolu Lu wrote:

On 2022/6/14 09:54, Jerry Snitselaar wrote:

On Mon, Jun 13, 2022 at 6:51 PM Baolu Lu  wrote:


On 2022/6/14 09:44, Jerry Snitselaar wrote:

On Mon, Jun 13, 2022 at 6:36 PM Baolu Lu  wrote:

On 2022/6/14 04:57, Jerry Snitselaar wrote:

On Thu, May 12, 2022 at 10:13:09AM -0500, Steve Wahl wrote:

To support up to 64 sockets with 10 DMAR units each (640), make the
value of DMAR_UNITS_SUPPORTED adjustable by a config variable,
CONFIG_DMAR_UNITS_SUPPORTED, and make its default 1024 when MAXSMP is
set.

If the available hardware exceeds DMAR_UNITS_SUPPORTED (previously set
to MAX_IO_APICS, or 128), it causes these messages: "DMAR: Failed to
allocate seq_id", "DMAR: Parse DMAR table failure.", and "x2apic: IRQ
remapping doesn't support X2APIC mode x2apic disabled"; and the system
fails to boot properly.

Signed-off-by: Steve Wahl
---

Note that we could not find a reason for connecting
DMAR_UNITS_SUPPORTED to MAX_IO_APICS as was done previously.  Perhaps
it seemed like the two would continue to match on earlier processors.
There doesn't appear to be kernel code that assumes that the value of
one is related to the other.

v2: Make this value a config option, rather than a fixed constant.  The
default values should match previous configuration except in the MAXSMP
case.  Keeping the value at a power of two was requested by Kevin Tian.

 drivers/iommu/intel/Kconfig | 6 ++
 include/linux/dmar.h| 6 +-
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/intel/Kconfig b/drivers/iommu/intel/Kconfig
index 247d0f2d5fdf..fdbda77ac21e 100644
--- a/drivers/iommu/intel/Kconfig
+++ b/drivers/iommu/intel/Kconfig
@@ -9,6 +9,12 @@ config DMAR_PERF
 config DMAR_DEBUG
bool

+config DMAR_UNITS_SUPPORTED
+int "Number of DMA Remapping Units supported"

Also, should there be a "depends on (X86 || IA64)" here?

Do you have any compilation errors or warnings?

Best regards,
baolu


I think it is probably harmless since it doesn't get used elsewhere,
but our tooling was complaining to me because DMAR_UNITS_SUPPORTED was
being autogenerated into the configs for the non-x86 architectures we
build (aarch64, s390x, ppcle64).
We have files corresponding to the config options that it looks at,
and I had one for x86 and not the others so it noticed the
discrepancy.


So with "depends on (X86 || IA64)", that tool doesn't complain anymore,
right?

Best regards,
baolu



Yes, with the depends it no longer happens.


The dmar code only exists on X86 and IA64 arch's. Adding this dependency
makes sense to me. I will add it if no objections.


I think that works after Baolu's patchset that makes intel-iommu.h
private.  I'm pretty sure it wouldn't have worked before that.

No objections.



Yes, I think applying it with the depends prior to Baolu's change would
still run into the issue from the KTR report if someone compiled without
INTEL_IOMMU enabled.

This was dealing with being able to do something like:

make allmodconfig ARCH=arm64 ; grep DMAR_UNITS .config

and finding CONFIG_DMAR_UNITS_SUPPORTED=64.

Thinking some more though, instead of the depends being on the arch
would depending on DMAR_TABLE or INTEL_IOMMU be more appropriate?


At least in my limited exploration, depending on INTEL_IOMMU yields
compile errors, but depending upon DMAR_TABLE appears to work fine.


DMAR_TABLE is used beyond INTEL_IOMMU, so depending on DMAR_TABLE seems
better.

Steve, do you mind posting a v3 with this fixed?

Best regards,
baolu
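For reference, a v3 Kconfig entry along the lines discussed might look like the sketch below. The "depends on DMAR_TABLE" line reflects this thread; the defaults are assumed from the v2 description (1024 for MAXSMP, previous values otherwise) and would need checking against the real posting:

```kconfig
config DMAR_UNITS_SUPPORTED
	int "Number of DMA Remapping Units supported"
	depends on DMAR_TABLE
	default 1024 if MAXSMP
	default 128 if X86_64
	default 64
```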



Re: [PATCH 5/7] iommu: Add domain_type_supported() callback in iommu_ops

2022-06-14 Thread Suthikulpanit, Suravee via iommu

Robin,

On 6/14/2022 4:51 PM, Robin Murphy wrote:

On 2022-06-13 15:38, Suthikulpanit, Suravee wrote:

Robin,

On 6/13/2022 4:31 PM, Robin Murphy wrote:



Introducing check_domain_type_supported() callback in iommu_ops,
which allows IOMMU generic layer to check with vendor-specific IOMMU driver
whether the requested type is supported. This allows user to request
types other than the default type.


Note also that you're only adding this in the sysfs path - what about the 
"iommu.passthrough=" parameter or CONFIG_IOMMU_DEFAULT_PASSTHROUGH?


For SNP case, we cannot enable SNP if iommu=off or iommu=pt or 
iommu.passthrough=1 or CONFIG_IOMMU_DEFAULT_PASSTHROUGH=y.
So, when another driver tries to enable SNP, the IOMMU driver prevents it (see 
iommu_sev_snp_supported() in patch 3).


Ugh, I hadn't looked too closely at the other patches, but an interface that looks like a 
simple "is this feature supported?" check with a secret side-effect of changing 
global behaviour as well? Yuck :(

What external drivers are expected to have the authority to affect the entire 
system and call that? The fact that you're exporting it suggests they could be 
loaded from modules *after* v2 features have been enabled and/or the user has 
configured a non-default identity domain for a group via sysfs... Fun!


I see your point.

Currently, the function to enable SNP will be called from the SEV driver
when it tries to enable SNP support globally on the system. This is done
during fs_initcall(), which is early in the boot process. I can also add
guard code to make sure that this won't be done after a certain phase.


Instead, if we boot with iommu.passthrough=0, when another driver tries to
enable SNP, the IOMMU driver allows this and switches to SNP-enabled mode.
Subsequently, if the user tries to switch a domain (via sysfs) to
IOMMU_DOMAIN_IDENTITY, the IOMMU needs to prevent this because it has
already switched to SNP-enabled mode.


AFAICS there shouldn't need to be any core-level changes to support this. We
already have drivers which don't support passthrough at all, so conditionally
not supporting it should be no big deal. What should happen currently is that
def_domain_type returns 0 for "don't care", then domain_alloc rejects
IOMMU_DOMAIN_IDENTITY and returns NULL, so
iommu_group_alloc_default_domain() falls back to IOMMU_DOMAIN_DMA.


Technically, we can do it the way you suggest. But isn't this confusing? At
first, def_domain_type() returns 0 for "don't care", but then the driver
rejects the request to change to IOMMU_DOMAIN_IDENTITY when domain_alloc()
is called.


Yes, that's how it works; def_domain_type is responsible for quirking 
individual *devices* that need to have a specific domain type (in practice, 
devices which need identity mapping), while domain_alloc is responsible for 
saying which domain types the driver supports as a whole, by allocating them or 
not as appropriate.

We don't have a particularly neat way to achieve the negative of 
def_domain_type - i.e. saying that a specific device *can't* use a specific 
otherwise-supported domain type - other than subsequently failing in 
attach_dev, but so far we've not needed such a thing. And if SNP is expected to 
be mutually exclusive with identity domain support globally, then we still 
shouldn't need it.
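The fallback Robin describes can be modelled in a few lines: a driver-side domain_alloc() that returns NULL for identity domains once SNP is on, and a core-side helper that retries with a DMA domain. Everything here is a hypothetical mock, not the real iommu core:

```c
#include <assert.h>
#include <stddef.h>

enum { DOMAIN_IDENTITY, DOMAIN_DMA };

struct domain { int type; };
static struct domain dma_domain = { DOMAIN_DMA };

static int snp_enabled = 1;

/* Driver-side policy: identity domains are unsupported once SNP is on,
 * so allocation is simply refused. */
static struct domain *driver_domain_alloc(int type)
{
	if (type == DOMAIN_IDENTITY && snp_enabled)
		return NULL;
	return &dma_domain;
}

/* Core-side behaviour: when the requested type is refused, fall back
 * to a DMA (translated) domain. */
static struct domain *alloc_default_domain(int requested)
{
	struct domain *d = driver_domain_alloc(requested);
	if (!d)
		d = driver_domain_alloc(DOMAIN_DMA);
	return d;
}
```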


Thanks for your feedback.

Suravee

Re: [PATCH v2] iommu/vt-d: Make DMAR_UNITS_SUPPORTED a config setting

2022-06-14 Thread Steve Wahl
On Tue, Jun 14, 2022 at 12:01:45PM -0700, Jerry Snitselaar wrote:
> On Tue, Jun 14, 2022 at 11:45:35AM -0500, Steve Wahl wrote:
> > On Tue, Jun 14, 2022 at 10:21:29AM +0800, Baolu Lu wrote:
> > > On 2022/6/14 09:54, Jerry Snitselaar wrote:
> > > > On Mon, Jun 13, 2022 at 6:51 PM Baolu Lu  
> > > > wrote:
> > > > > 
> > > > > On 2022/6/14 09:44, Jerry Snitselaar wrote:
> > > > > > On Mon, Jun 13, 2022 at 6:36 PM Baolu Lu  
> > > > > > wrote:
> > > > > > > On 2022/6/14 04:57, Jerry Snitselaar wrote:
> > > > > > > > On Thu, May 12, 2022 at 10:13:09AM -0500, Steve Wahl wrote:
> > > > > > > > > To support up to 64 sockets with 10 DMAR units each (640), 
> > > > > > > > > make the
> > > > > > > > > value of DMAR_UNITS_SUPPORTED adjustable by a config variable,
> > > > > > > > > CONFIG_DMAR_UNITS_SUPPORTED, and make it's default 1024 when 
> > > > > > > > > MAXSMP is
> > > > > > > > > set.
> > > > > > > > > 
> > > > > > > > > If the available hardware exceeds DMAR_UNITS_SUPPORTED 
> > > > > > > > > (previously set
> > > > > > > > > to MAX_IO_APICS, or 128), it causes these messages: "DMAR: 
> > > > > > > > > Failed to
> > > > > > > > > allocate seq_id", "DMAR: Parse DMAR table failure.", and 
> > > > > > > > > "x2apic: IRQ
> > > > > > > > > remapping doesn't support X2APIC mode x2apic disabled"; and 
> > > > > > > > > the system
> > > > > > > > > fails to boot properly.
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Steve Wahl
> > > > > > > > > ---
> > > > > > > > > 
> > > > > > > > > Note that we could not find a reason for connecting
> > > > > > > > > DMAR_UNITS_SUPPORTED to MAX_IO_APICS as was done previously.  
> > > > > > > > > Perhaps
> > > > > > > > > it seemed like the two would continue to match on earlier 
> > > > > > > > > processors.
> > > > > > > > > There doesn't appear to be kernel code that assumes that the 
> > > > > > > > > value of
> > > > > > > > > one is related to the other.
> > > > > > > > > 
> > > > > > > > > v2: Make this value a config option, rather than a fixed 
> > > > > > > > > constant.  The default
> > > > > > > > > values should match previous configuration except in the 
> > > > > > > > > MAXSMP case.  Keeping the
> > > > > > > > > value at a power of two was requested by Kevin Tian.
> > > > > > > > > 
> > > > > > > > > drivers/iommu/intel/Kconfig | 6 ++
> > > > > > > > > include/linux/dmar.h| 6 +-
> > > > > > > > > 2 files changed, 7 insertions(+), 5 deletions(-)
> > > > > > > > > 
> > > > > > > > > diff --git a/drivers/iommu/intel/Kconfig 
> > > > > > > > > b/drivers/iommu/intel/Kconfig
> > > > > > > > > index 247d0f2d5fdf..fdbda77ac21e 100644
> > > > > > > > > --- a/drivers/iommu/intel/Kconfig
> > > > > > > > > +++ b/drivers/iommu/intel/Kconfig
> > > > > > > > > @@ -9,6 +9,12 @@ config DMAR_PERF
> > > > > > > > > config DMAR_DEBUG
> > > > > > > > >bool
> > > > > > > > > 
> > > > > > > > > +config DMAR_UNITS_SUPPORTED
> > > > > > > > > +int "Number of DMA Remapping Units supported"
> > > > > > > > Also, should there be a "depends on (X86 || IA64)" here?
> > > > > > > Do you have any compilation errors or warnings?
> > > > > > > 
> > > > > > > Best regards,
> > > > > > > baolu
> > > > > > > 
> > > > > > I think it is probably harmless since it doesn't get used elsewhere,
> > > > > > but our tooling was complaining to me because DMAR_UNITS_SUPPORTED 
> > > > > > was
> > > > > > being autogenerated into the configs for the non-x86 architectures 
> > > > > > we
> > > > > > build (aarch64, s390x, ppcle64).
> > > > > > We have files corresponding to the config options that it looks at,
> > > > > > and I had one for x86 and not the others so it noticed the
> > > > > > discrepancy.
> > > > > 
> > > > > So with "depends on (X86 || IA64)", that tool doesn't complain 
> > > > > anymore,
> > > > > right?
> > > > > 
> > > > > Best regards,
> > > > > baolu
> > > > > 
> > > > 
> > > > Yes, with the depends it no longer happens.
> > > 
> > > The dmar code only exists on X86 and IA64 arch's. Adding this depending
> > > makes sense to me. I will add it if no objections.
> > 
> > I think that works after Baolu's patchset that makes intel-iommu.h
> > private.  I'm pretty sure it wouldn't have worked before that.
> > 
> > No objections.
> > 
> 
> Yes, I think applying it with the depends prior to Baolu's change would
> still run into the issue from the KTR report if someone compiled without
> INTEL_IOMMU enabled.
> 
> This was dealing with being able to do something like:
> 
> make allmodconfig ARCH=arm64 ; grep DMAR_UNITS .config
> 
> and finding CONFIG_DMAR_UNITS_SUPPORTED=64.
> 
> Thinking some more though, instead of the depends being on the arch
> would depending on DMAR_TABLE or INTEL_IOMMU be more appropriate?

At least in my limited exploration, depending on INTEL_IOMMU yields
compile errors, but depending upon DMAR_TABLE appears to work fine.

--> Steve

-- 
Steve Wahl, Hewlett Packard Enterprise

Re: [PATCH 3/5] vfio/iommu_type1: Prefer to reuse domains vs match enforced cache coherency

2022-06-14 Thread Nicolin Chen via iommu
Hi Kevin,

On Wed, Jun 08, 2022 at 11:48:27PM +, Tian, Kevin wrote:
> > > > The KVM mechanism for controlling wbinvd is only triggered during
> > > > kvm_vfio_group_add(), meaning it is a one-shot test done once the
> > devices
> > > > are setup.
> > >
> > > It's not one-shot. kvm_vfio_update_coherency() is called in both
> > > group_add() and group_del(). Then the coherency property is
> > > checked dynamically in wbinvd emulation:
> >
> > From the perspective of managing the domains that is still
> > one-shot. It doesn't get updated when individual devices are
> > added/removed to domains.
> 
> It's unchanged per-domain but dynamic per-vm when multiple
> domains are added/removed (i.e. kvm->arch.noncoherent_dma_count).
> It's the latter being checked in the kvm.

I am going to send a v2, but I am not quite getting the point here.
Meanwhile, Jason is on leave.

What, in your opinion, would be an accurate description here?

Thanks
Nic
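For what it's worth, the per-VM vs per-domain distinction Kevin is drawing can be sketched as a reference count: each non-coherent domain added bumps a per-VM counter, and wbinvd emulation consults that counter, so the answer changes as domains come and go even though each domain's own coherency property is fixed. A hedged toy model (loosely named after kvm->arch.noncoherent_dma_count; this is not KVM code):

```c
#include <assert.h>

struct vm { int noncoherent_dma_count; };

/* Attaching a non-coherent domain bumps the per-VM count. */
static void domain_added(struct vm *vm, int coherent)
{
	if (!coherent)
		vm->noncoherent_dma_count++;
}

/* Detaching it drops the count again. */
static void domain_removed(struct vm *vm, int coherent)
{
	if (!coherent)
		vm->noncoherent_dma_count--;
}

/* Checked dynamically at wbinvd-emulation time. */
static int wbinvd_needed(const struct vm *vm)
{
	return vm->noncoherent_dma_count > 0;
}
```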


Re: [PATCH v2] iommu/vt-d: Make DMAR_UNITS_SUPPORTED a config setting

2022-06-14 Thread Jerry Snitselaar
On Tue, Jun 14, 2022 at 11:45:35AM -0500, Steve Wahl wrote:
> On Tue, Jun 14, 2022 at 10:21:29AM +0800, Baolu Lu wrote:
> > On 2022/6/14 09:54, Jerry Snitselaar wrote:
> > > On Mon, Jun 13, 2022 at 6:51 PM Baolu Lu  wrote:
> > > > 
> > > > On 2022/6/14 09:44, Jerry Snitselaar wrote:
> > > > > On Mon, Jun 13, 2022 at 6:36 PM Baolu Lu  
> > > > > wrote:
> > > > > > On 2022/6/14 04:57, Jerry Snitselaar wrote:
> > > > > > > On Thu, May 12, 2022 at 10:13:09AM -0500, Steve Wahl wrote:
> > > > > > > > To support up to 64 sockets with 10 DMAR units each (640), make 
> > > > > > > > the
> > > > > > > > value of DMAR_UNITS_SUPPORTED adjustable by a config variable,
> > > > > > > > CONFIG_DMAR_UNITS_SUPPORTED, and make it's default 1024 when 
> > > > > > > > MAXSMP is
> > > > > > > > set.
> > > > > > > > 
> > > > > > > > If the available hardware exceeds DMAR_UNITS_SUPPORTED 
> > > > > > > > (previously set
> > > > > > > > to MAX_IO_APICS, or 128), it causes these messages: "DMAR: 
> > > > > > > > Failed to
> > > > > > > > allocate seq_id", "DMAR: Parse DMAR table failure.", and 
> > > > > > > > "x2apic: IRQ
> > > > > > > > remapping doesn't support X2APIC mode x2apic disabled"; and the 
> > > > > > > > system
> > > > > > > > fails to boot properly.
> > > > > > > > 
> > > > > > > > Signed-off-by: Steve Wahl
> > > > > > > > ---
> > > > > > > > 
> > > > > > > > Note that we could not find a reason for connecting
> > > > > > > > DMAR_UNITS_SUPPORTED to MAX_IO_APICS as was done previously.  
> > > > > > > > Perhaps
> > > > > > > > it seemed like the two would continue to match on earlier 
> > > > > > > > processors.
> > > > > > > > There doesn't appear to be kernel code that assumes that the 
> > > > > > > > value of
> > > > > > > > one is related to the other.
> > > > > > > > 
> > > > > > > > v2: Make this value a config option, rather than a fixed 
> > > > > > > > constant.  The default
> > > > > > > > values should match previous configuration except in the MAXSMP 
> > > > > > > > case.  Keeping the
> > > > > > > > value at a power of two was requested by Kevin Tian.
> > > > > > > > 
> > > > > > > > drivers/iommu/intel/Kconfig | 6 ++
> > > > > > > > include/linux/dmar.h| 6 +-
> > > > > > > > 2 files changed, 7 insertions(+), 5 deletions(-)
> > > > > > > > 
> > > > > > > > diff --git a/drivers/iommu/intel/Kconfig 
> > > > > > > > b/drivers/iommu/intel/Kconfig
> > > > > > > > index 247d0f2d5fdf..fdbda77ac21e 100644
> > > > > > > > --- a/drivers/iommu/intel/Kconfig
> > > > > > > > +++ b/drivers/iommu/intel/Kconfig
> > > > > > > > @@ -9,6 +9,12 @@ config DMAR_PERF
> > > > > > > > config DMAR_DEBUG
> > > > > > > >bool
> > > > > > > > 
> > > > > > > > +config DMAR_UNITS_SUPPORTED
> > > > > > > > +int "Number of DMA Remapping Units supported"
> > > > > > > Also, should there be a "depends on (X86 || IA64)" here?
> > > > > > Do you have any compilation errors or warnings?
> > > > > > 
> > > > > > Best regards,
> > > > > > baolu
> > > > > > 
> > > > > I think it is probably harmless since it doesn't get used elsewhere,
> > > > > but our tooling was complaining to me because DMAR_UNITS_SUPPORTED was
> > > > > being autogenerated into the configs for the non-x86 architectures we
> > > > > build (aarch64, s390x, ppcle64).
> > > > > We have files corresponding to the config options that it looks at,
> > > > > and I had one for x86 and not the others so it noticed the
> > > > > discrepancy.
> > > > 
> > > > So with "depends on (X86 || IA64)", that tool doesn't complain anymore,
> > > > right?
> > > > 
> > > > Best regards,
> > > > baolu
> > > > 
> > > 
> > > Yes, with the depends it no longer happens.
> > 
> > The dmar code only exists on X86 and IA64 arch's. Adding this depending
> > makes sense to me. I will add it if no objections.
> 
> I think that works after Baolu's patchset that makes intel-iommu.h
> private.  I'm pretty sure it wouldn't have worked before that.
> 
> No objections.
> 

Yes, I think applying it with the depends prior to Baolu's change would
still run into the issue from the KTR report if someone compiled without
INTEL_IOMMU enabled.

This was dealing with being able to do something like:

make allmodconfig ARCH=arm64 ; grep DMAR_UNITS .config

and finding CONFIG_DMAR_UNITS_SUPPORTED=64.

Thinking some more though, instead of the depends being on the arch
would depending on DMAR_TABLE or INTEL_IOMMU be more appropriate?

Regards,
Jerry

> --> Steve
> 
> -- 
> Steve Wahl, Hewlett Packard Enterprise



Re: [PATCH v8 3/3] iommu/mediatek: Allow page table PA up to 35bit

2022-06-14 Thread Miles Chen via iommu
> Single memory zone feature will remove ZONE_DMA32 and ZONE_DMA. So add
> the quirk IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT to let level 1 and level 2
> pgtable support at most 35bit PA.
> 
> Signed-off-by: Ning Li 
> Signed-off-by: Yunfei Wang 


Reviewed-by: Miles Chen  

> ---
>  drivers/iommu/mtk_iommu.c | 14 +-
>  1 file changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 3d62399e8865..4dbc33758711 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -138,6 +138,7 @@
>  /* PM and clock always on. e.g. infra iommu */
>  #define PM_CLK_AOBIT(15)
>  #define IFA_IOMMU_PCIE_SUPPORT   BIT(16)
> +#define PGTABLE_PA_35_EN BIT(17)
>  
>  #define MTK_IOMMU_HAS_FLAG_MASK(pdata, _x, mask) \
>   ((((pdata)->flags) & (mask)) == (_x))
> @@ -240,6 +241,7 @@ struct mtk_iommu_data {
>  struct mtk_iommu_domain {
>   struct io_pgtable_cfg   cfg;
>   struct io_pgtable_ops   *iop;
> + u32 ttbr;
>  
>   struct mtk_iommu_bank_data  *bank;
>   struct iommu_domain domain;
> @@ -596,6 +598,9 @@ static int mtk_iommu_domain_finalise(struct 
> mtk_iommu_domain *dom,
>   .iommu_dev = data->dev,
>   };
>  
> + if (MTK_IOMMU_HAS_FLAG(data->plat_data, PGTABLE_PA_35_EN))
> + dom->cfg.quirks |= IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT;
> +
>   if (MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE))
>   dom->cfg.oas = data->enable_4GB ? 33 : 32;
>   else
> @@ -684,8 +689,8 @@ static int mtk_iommu_attach_device(struct iommu_domain 
> *domain,
>   goto err_unlock;
>   }
>   bank->m4u_dom = dom;
> - writel(dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK,
> -bank->base + REG_MMU_PT_BASE_ADDR);
> + bank->m4u_dom->ttbr = MTK_IOMMU_ADDR(dom->cfg.arm_v7s_cfg.ttbr);
> + writel(bank->m4u_dom->ttbr, data->base + REG_MMU_PT_BASE_ADDR);
>  
>   pm_runtime_put(m4udev);
>   }
> @@ -1366,8 +1371,7 @@ static int __maybe_unused 
> mtk_iommu_runtime_resume(struct device *dev)
>   writel_relaxed(reg->int_control[i], base + 
> REG_MMU_INT_CONTROL0);
>   writel_relaxed(reg->int_main_control[i], base + 
> REG_MMU_INT_MAIN_CONTROL);
>   writel_relaxed(reg->ivrp_paddr[i], base + REG_MMU_IVRP_PADDR);
> - writel(m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK,
> -base + REG_MMU_PT_BASE_ADDR);
> + writel(m4u_dom->ttbr, base + REG_MMU_PT_BASE_ADDR);
>   } while (++i < data->plat_data->banks_num);
>  
>   /*
> @@ -1401,7 +1405,7 @@ static const struct mtk_iommu_plat_data mt2712_data = {
>  static const struct mtk_iommu_plat_data mt6779_data = {
>   .m4u_plat  = M4U_MT6779,
>   .flags = HAS_SUB_COMM_2BITS | OUT_ORDER_WR_EN | WR_THROT_EN |
> -  MTK_IOMMU_TYPE_MM,
> +  MTK_IOMMU_TYPE_MM | PGTABLE_PA_35_EN,
>   .inv_sel_reg   = REG_MMU_INV_SEL_GEN2,
>   .banks_num= 1,
>   .banks_enable = {true},
> -- 
> 2.18.0
> 
> 
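The PGTABLE_PA_35_EN flag added above is tested through the MTK_IOMMU_HAS_FLAG_MASK() helper defined in mtk_iommu.c. A self-contained reconstruction of the flag helpers, compilable outside the kernel, with BIT() spelled out and plat_data reduced to a bare flags word:

```c
#include <assert.h>

#define BIT(n)			(1U << (n))
#define PM_CLK_AO		BIT(15)
#define IFA_IOMMU_PCIE_SUPPORT	BIT(16)
#define PGTABLE_PA_35_EN	BIT(17)

struct mtk_iommu_plat_data { unsigned int flags; };

/* True iff the bits selected by mask hold exactly the value _x. */
#define MTK_IOMMU_HAS_FLAG_MASK(pdata, _x, mask) \
	((((pdata)->flags) & (mask)) == (_x))
/* Common case: test a single flag against itself as the mask. */
#define MTK_IOMMU_HAS_FLAG(pdata, _x) \
	MTK_IOMMU_HAS_FLAG_MASK(pdata, _x, _x)
```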


Re: [PATCH v8 2/3] iommu/mediatek: Rename MTK_IOMMU_TLB_ADDR to MTK_IOMMU_ADDR

2022-06-14 Thread Miles Chen via iommu
> Rename MTK_IOMMU_TLB_ADDR to MTK_IOMMU_ADDR, and update MTK_IOMMU_ADDR
> definition for better generality.
> 
> Signed-off-by: Ning Li 
> Signed-off-by: Yunfei Wang 

Reviewed-by: Miles Chen  


Re: [PATCH v2] iommu/vt-d: Make DMAR_UNITS_SUPPORTED a config setting

2022-06-14 Thread Steve Wahl
On Tue, Jun 14, 2022 at 10:21:29AM +0800, Baolu Lu wrote:
> On 2022/6/14 09:54, Jerry Snitselaar wrote:
> > On Mon, Jun 13, 2022 at 6:51 PM Baolu Lu  wrote:
> > > 
> > > On 2022/6/14 09:44, Jerry Snitselaar wrote:
> > > > On Mon, Jun 13, 2022 at 6:36 PM Baolu Lu  
> > > > wrote:
> > > > > On 2022/6/14 04:57, Jerry Snitselaar wrote:
> > > > > > On Thu, May 12, 2022 at 10:13:09AM -0500, Steve Wahl wrote:
> > > > > > > To support up to 64 sockets with 10 DMAR units each (640), make 
> > > > > > > the
> > > > > > > value of DMAR_UNITS_SUPPORTED adjustable by a config variable,
> > > > > > > CONFIG_DMAR_UNITS_SUPPORTED, and make it's default 1024 when 
> > > > > > > MAXSMP is
> > > > > > > set.
> > > > > > > 
> > > > > > > If the available hardware exceeds DMAR_UNITS_SUPPORTED 
> > > > > > > (previously set
> > > > > > > to MAX_IO_APICS, or 128), it causes these messages: "DMAR: Failed 
> > > > > > > to
> > > > > > > allocate seq_id", "DMAR: Parse DMAR table failure.", and "x2apic: 
> > > > > > > IRQ
> > > > > > > remapping doesn't support X2APIC mode x2apic disabled"; and the 
> > > > > > > system
> > > > > > > fails to boot properly.
> > > > > > > 
> > > > > > > Signed-off-by: Steve Wahl
> > > > > > > ---
> > > > > > > 
> > > > > > > Note that we could not find a reason for connecting
> > > > > > > DMAR_UNITS_SUPPORTED to MAX_IO_APICS as was done previously.  
> > > > > > > Perhaps
> > > > > > > it seemed like the two would continue to match on earlier 
> > > > > > > processors.
> > > > > > > There doesn't appear to be kernel code that assumes that the 
> > > > > > > value of
> > > > > > > one is related to the other.
> > > > > > > 
> > > > > > > v2: Make this value a config option, rather than a fixed 
> > > > > > > constant.  The default
> > > > > > > values should match previous configuration except in the MAXSMP 
> > > > > > > case.  Keeping the
> > > > > > > value at a power of two was requested by Kevin Tian.
> > > > > > > 
> > > > > > > drivers/iommu/intel/Kconfig | 6 ++
> > > > > > > include/linux/dmar.h| 6 +-
> > > > > > > 2 files changed, 7 insertions(+), 5 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/iommu/intel/Kconfig 
> > > > > > > b/drivers/iommu/intel/Kconfig
> > > > > > > index 247d0f2d5fdf..fdbda77ac21e 100644
> > > > > > > --- a/drivers/iommu/intel/Kconfig
> > > > > > > +++ b/drivers/iommu/intel/Kconfig
> > > > > > > @@ -9,6 +9,12 @@ config DMAR_PERF
> > > > > > > config DMAR_DEBUG
> > > > > > >bool
> > > > > > > 
> > > > > > > +config DMAR_UNITS_SUPPORTED
> > > > > > > +int "Number of DMA Remapping Units supported"
> > > > > > Also, should there be a "depends on (X86 || IA64)" here?
> > > > > Do you have any compilation errors or warnings?
> > > > > 
> > > > > Best regards,
> > > > > baolu
> > > > > 
> > > > I think it is probably harmless since it doesn't get used elsewhere,
> > > > but our tooling was complaining to me because DMAR_UNITS_SUPPORTED was
> > > > being autogenerated into the configs for the non-x86 architectures we
> > > > build (aarch64, s390x, ppcle64).
> > > > We have files corresponding to the config options that it looks at,
> > > > and I had one for x86 and not the others so it noticed the
> > > > discrepancy.
> > > 
> > > So with "depends on (X86 || IA64)", that tool doesn't complain anymore,
> > > right?
> > > 
> > > Best regards,
> > > baolu
> > > 
> > 
> > Yes, with the depends it no longer happens.
> 
> The dmar code only exists on the X86 and IA64 architectures. Adding this
> dependency makes sense to me. I will add it if there are no objections.

I think that works after Baolu's patchset that makes intel-iommu.h
private.  I'm pretty sure it wouldn't have worked before that.

No objections.
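For reference, the option with the dependency applied would look roughly like this; a sketch only — the MAXSMP/X86_64 defaults are assumptions inferred from the v2 description above, not necessarily the final committed values:

```kconfig
config DMAR_UNITS_SUPPORTED
	int "Number of DMA Remapping Units supported"
	depends on X86 || IA64
	default 1024 if MAXSMP
	default 128 if X86_64
	default 64
```

With the `depends on` line, the symbol no longer leaks into generated configs on aarch64, s390x, or ppc64le builds.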

--> Steve

-- 
Steve Wahl, Hewlett Packard Enterprise
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 5/6] dt-bindings: iommu: mediatek: Add mediatek,pericfg phandle

2022-06-14 Thread Matthias Brugger




On 09/06/2022 12:08, AngeloGioacchino Del Regno wrote:

Add property "mediatek,pericfg" to let the mtk_iommu driver retrieve
a phandle to the infracfg syscon instead of performing a per-soc
compatible lookup in the entire devicetree and set it as a required
property for MT8195's infra IOMMU.
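As a sketch of how a devicetree would consume the new property (the node name, unit address, `reg` value, and the `pericfg` phandle label below are illustrative assumptions, not taken from the binding):

```dts
iommu_infra: iommu@10315000 {
	compatible = "mediatek,mt8195-iommu-infra";
	reg = <0x10315000 0x1000>;
	mediatek,pericfg = <&pericfg>;
	#iommu-cells = <1>;
};
```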

Signed-off-by: AngeloGioacchino Del Regno 



Reviewed-by: Matthias Brugger 


---
  .../devicetree/bindings/iommu/mediatek,iommu.yaml  | 14 ++
  1 file changed, 14 insertions(+)

diff --git a/Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml 
b/Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
index 4142a568b293..d5e3272a54e8 100644
--- a/Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
+++ b/Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
@@ -116,6 +116,10 @@ properties:
Refer to bindings/memory-controllers/mediatek,smi-larb.yaml. It must 
sort
according to the local arbiter index, like larb0, larb1, larb2...
  
+  mediatek,pericfg:

+$ref: /schemas/types.yaml#/definitions/phandle
+description: The phandle to the mediatek pericfg syscon
+
'#iommu-cells':
  const: 1
  description: |
@@ -183,6 +187,16 @@ allOf:
required:
  - mediatek,infracfg
  
+  - if:

+  properties:
+compatible:
+  contains:
+const: mediatek,mt8195-iommu-infra
+
+then:
+  required:
+- mediatek,pericfg
+
- if: # The IOMMUs don't have larbs.
not:
  properties:



Re: [PATCH v3 2/6] iommu: mtk_iommu: Lookup phandle to retrieve syscon to infracfg

2022-06-14 Thread Matthias Brugger




On 09/06/2022 12:07, AngeloGioacchino Del Regno wrote:

This driver will get support for more SoCs and the list of infracfg
compatibles is expected to grow: in order to prevent getting this
situation out of control and see a long list of compatible strings,
add support to retrieve a handle to infracfg's regmap through a
new "mediatek,infracfg" phandle.

To keep backward compatibility with older devicetrees, the old
lookup is kept in place.

Signed-off-by: AngeloGioacchino Del Regno 



Reviewed-by: Matthias Brugger 


---
  drivers/iommu/mtk_iommu.c | 38 --
  1 file changed, 24 insertions(+), 14 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index bb9dd92c9898..90685946fcbe 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -1140,22 +1140,32 @@ static int mtk_iommu_probe(struct platform_device *pdev)
data->protect_base = ALIGN(virt_to_phys(protect), MTK_PROTECT_PA_ALIGN);
  
  	if (MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE)) {

-   switch (data->plat_data->m4u_plat) {
-   case M4U_MT2712:
-   p = "mediatek,mt2712-infracfg";
-   break;
-   case M4U_MT8173:
-   p = "mediatek,mt8173-infracfg";
-   break;
-   default:
-   p = NULL;
+   infracfg = syscon_regmap_lookup_by_phandle(dev->of_node, 
"mediatek,infracfg");
+   if (IS_ERR(infracfg)) {
+   /*
+* Legacy devicetrees will not specify a phandle to
+* mediatek,infracfg: in that case, we use the older
+* way to retrieve a syscon to infra.
+*
+* This is for retrocompatibility purposes only, hence
+* no more compatibles shall be added to this.
+*/
+   switch (data->plat_data->m4u_plat) {
+   case M4U_MT2712:
+   p = "mediatek,mt2712-infracfg";
+   break;
+   case M4U_MT8173:
+   p = "mediatek,mt8173-infracfg";
+   break;
+   default:
+   p = NULL;
+   }
+
+   infracfg = syscon_regmap_lookup_by_compatible(p);
+   if (IS_ERR(infracfg))
+   return PTR_ERR(infracfg);
}
  
-		infracfg = syscon_regmap_lookup_by_compatible(p);

-
-   if (IS_ERR(infracfg))
-   return PTR_ERR(infracfg);
-
	ret = regmap_read(infracfg, REG_INFRA_MISC, &val);
if (ret)
return ret;



Re: [PATCH v3 2/4] dma-iommu: Add iommu_dma_opt_mapping_size()

2022-06-14 Thread John Garry via iommu

On 06/06/2022 10:30, John Garry wrote:

Add the IOMMU callback for DMA mapping API dma_opt_mapping_size(), which
allows the drivers to know the optimal mapping limit and thus limit the
requested IOVA lengths.

This value is based on the IOVA rcache range limit, as IOVAs allocated
above this limit must always be newly allocated, which may be quite slow.
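As a quick sanity check of that limit: with 4 KiB pages the rcache range works out to 128 KiB. A sketch, where `IOVA_RANGE_CACHE_MAX_SIZE = 6` is an assumption taken from contemporary kernels (caches hold IOVAs of 1, 2, 4, ..., 32 pages):

```python
# Sketch of the rcache-based optimal mapping size computed in the patch below.
# IOVA_RANGE_CACHE_MAX_SIZE = 6 is an assumption, not quoted from this patch.
PAGE_SIZE = 4096  # typical 4 KiB page
IOVA_RANGE_CACHE_MAX_SIZE = 6

def iova_rcache_range():
    # mirrors: PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1)
    return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1)

print(iova_rcache_range())  # 131072, i.e. 128 KiB
```

Any requested IOVA length above this value bypasses the per-CPU caches and pays the slow allocation path, which is exactly why it makes a sensible `opt_mapping_size`.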



Can I please get some sort of ack from the IOMMU people on this one?

Thanks,
John

EOM


Signed-off-by: John Garry 
Reviewed-by: Damien Le Moal 
---
  drivers/iommu/dma-iommu.c | 6 ++
  drivers/iommu/iova.c  | 5 +
  include/linux/iova.h  | 2 ++
  3 files changed, 13 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index f90251572a5d..9e1586447ee8 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1459,6 +1459,11 @@ static unsigned long iommu_dma_get_merge_boundary(struct 
device *dev)
return (1UL << __ffs(domain->pgsize_bitmap)) - 1;
  }
  
+static size_t iommu_dma_opt_mapping_size(void)

+{
+   return iova_rcache_range();
+}
+
  static const struct dma_map_ops iommu_dma_ops = {
.alloc  = iommu_dma_alloc,
.free   = iommu_dma_free,
@@ -1479,6 +1484,7 @@ static const struct dma_map_ops iommu_dma_ops = {
.map_resource   = iommu_dma_map_resource,
.unmap_resource = iommu_dma_unmap_resource,
.get_merge_boundary = iommu_dma_get_merge_boundary,
+   .opt_mapping_size   = iommu_dma_opt_mapping_size,
  };
  
  /*

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index db77aa675145..9f00b58d546e 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -26,6 +26,11 @@ static unsigned long iova_rcache_get(struct iova_domain 
*iovad,
  static void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain 
*iovad);
  static void free_iova_rcaches(struct iova_domain *iovad);
  
+unsigned long iova_rcache_range(void)

+{
+   return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1);
+}
+
  static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node)
  {
struct iova_domain *iovad;
diff --git a/include/linux/iova.h b/include/linux/iova.h
index 320a70e40233..c6ba6d95d79c 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -79,6 +79,8 @@ static inline unsigned long iova_pfn(struct iova_domain 
*iovad, dma_addr_t iova)
  int iova_cache_get(void);
  void iova_cache_put(void);
  
+unsigned long iova_rcache_range(void);

+
  void free_iova(struct iova_domain *iovad, unsigned long pfn);
  void __free_iova(struct iova_domain *iovad, struct iova *iova);
  struct iova *alloc_iova(struct iova_domain *iovad, unsigned long size,




Re: [PATCH v8 1/3] iommu/io-pgtable-arm-v7s: Add a quirk to allow pgtable PA up to 35bit

2022-06-14 Thread Will Deacon
Hi,

For some reason, this series has landed in my spam folder so apologies
for the delay :/

On Sat, Jun 11, 2022 at 06:26:53PM +0800, yf.w...@mediatek.com wrote:
> From: Yunfei Wang 
> 
> The single memory zone feature will remove ZONE_DMA32 and ZONE_DMA, and
> can cause the pgtable PA size to be larger than 32 bits.
> 
> Since MediaTek IOMMU hardware supports at most a 35-bit PA in the
> pgtable, add a quirk to allow the PA of pgtables to reach bit 35.
> 
> Signed-off-by: Ning Li 
> Signed-off-by: Yunfei Wang 
> ---
>  drivers/iommu/io-pgtable-arm-v7s.c | 48 ++
>  include/linux/io-pgtable.h | 17 +++
>  2 files changed, 46 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
> b/drivers/iommu/io-pgtable-arm-v7s.c
> index be066c1503d3..d4702d8d825a 100644
> --- a/drivers/iommu/io-pgtable-arm-v7s.c
> +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> @@ -182,14 +182,8 @@ static bool arm_v7s_is_mtk_enabled(struct io_pgtable_cfg 
> *cfg)
>   (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT);
>  }
>  
> -static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
> - struct io_pgtable_cfg *cfg)
> +static arm_v7s_iopte to_iopte_mtk(phys_addr_t paddr, arm_v7s_iopte pte)
>  {
> - arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl);
> -
> - if (!arm_v7s_is_mtk_enabled(cfg))
> - return pte;
> -
>   if (paddr & BIT_ULL(32))
>   pte |= ARM_V7S_ATTR_MTK_PA_BIT32;
>   if (paddr & BIT_ULL(33))
> @@ -199,6 +193,17 @@ static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, 
> int lvl,
>   return pte;
>  }
>  
> +static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
> + struct io_pgtable_cfg *cfg)
> +{
> + arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl);
> +
> + if (!arm_v7s_is_mtk_enabled(cfg))
> + return pte;
> +
> + return to_iopte_mtk(paddr, pte);

nit, but can we rename and rework this so it reads a bit better, please?
Something like:


if (arm_v7s_is_mtk_enabled(cfg))
return to_mtk_iopte(paddr, pte);

return pte;
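For context, the MTK extension folds PA bits 32 and 33 of a wide physical address into spare bits of the 32-bit v7s PTE. A toy model of that packing — the flag bit positions below are illustrative assumptions, not the driver's real ARM_V7S_ATTR_MTK_PA_BIT* values:

```python
# Toy model: fold PA bits 32/33 into spare PTE bits (positions assumed).
MTK_PA_BIT32 = 1 << 9
MTK_PA_BIT33 = 1 << 4

def to_iopte_mtk(paddr, pte):
    # set a flag bit for each high PA bit that cannot fit in 32 bits
    if paddr & (1 << 32):
        pte |= MTK_PA_BIT32
    if paddr & (1 << 33):
        pte |= MTK_PA_BIT33
    return pte

pte = to_iopte_mtk(0x3_0000_0000, 0)  # PA with bits 32 and 33 set
```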


>  static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
> struct io_pgtable_cfg *cfg)
>  {
> @@ -234,6 +239,7 @@ static arm_v7s_iopte *iopte_deref(arm_v7s_iopte pte, int 
> lvl,
>  static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
>  struct arm_v7s_io_pgtable *data)
>  {
> + gfp_t gfp_l1 = __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA;
> 	struct io_pgtable_cfg *cfg = &data->iop.cfg;
>   struct device *dev = cfg->iommu_dev;
>   phys_addr_t phys;
> @@ -241,9 +247,11 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
>   size_t size = ARM_V7S_TABLE_SIZE(lvl, cfg);
>   void *table = NULL;
>  
> + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT)
> + gfp_l1 = GFP_KERNEL | __GFP_ZERO;

I think it's a bit grotty to override the flags inline like this (same for
the slab flag later on). Something like this is a bit cleaner:


/*
 * Comment explaining why GFP_KERNEL is desirable here.
 * I'm assuming it's because the walker can address all of memory.
 */
gfp_l1 = cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT ?
 GFP_KERNEL : ARM_V7S_TABLE_GFP_DMA;

...

__get_free_pages(gfp_l1 | __GFP_ZERO, ...);


and similar for the slab flag.

>   if (lvl == 1)
> - table = (void *)__get_free_pages(
> - __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA, get_order(size));
> + table = (void *)__get_free_pages(gfp_l1, get_order(size));
>   else if (lvl == 2)
>   table = kmem_cache_zalloc(data->l2_tables, gfp);
>  
> @@ -251,7 +259,8 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
>   return NULL;
>  
>   phys = virt_to_phys(table);
> - if (phys != (arm_v7s_iopte)phys) {
> + if (phys != (arm_v7s_iopte)phys &&
> + !(cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT)) {
>   /* Doesn't fit in PTE */

Shouldn't we be checking that the address is within 35 bits here? Perhaps we
should generate a mask from the oas instead of just using the cast.
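A minimal model of that suggested mask-based check, assuming a 35-bit output address size (oas) as in the MTK quirk under discussion:

```python
# Sketch: a page-table PA "fits in the PTE" only if no bits above the
# output address size are set, rather than relying on a 32-bit cast.
def fits_in_pte(phys, oas_bits=35):
    mask = (1 << oas_bits) - 1
    return (phys & ~mask) == 0

assert fits_in_pte(0x7_FFFF_F000)      # bit 34 set: within 35 bits
assert not fits_in_pte(0x8_0000_0000)  # bit 35 set: does not fit
```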

> 		dev_err(dev, "Page table does not fit in PTE: %pa", &phys);
>   goto out_free;
> @@ -457,9 +466,14 @@ static arm_v7s_iopte arm_v7s_install_table(arm_v7s_iopte 
> *table,
>  arm_v7s_iopte curr,
>  struct io_pgtable_cfg *cfg)
>  {
> + phys_addr_t phys = virt_to_phys(table);
>   arm_v7s_iopte old, new;
>  
> - new = virt_to_phys(table) | ARM_V7S_PTE_TYPE_TABLE;
> + new = phys | ARM_V7S_PTE_TYPE_TABLE;
> +
> + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT)
> + new = to_iopte_mtk(phys, new);
> +
>   if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS)
>   

Re: [PATCH 5/7] iommu: Add domain_type_supported() callback in iommu_ops

2022-06-14 Thread Robin Murphy

On 2022-06-13 15:38, Suthikulpanit, Suravee wrote:

Robin,

On 6/13/2022 4:31 PM, Robin Murphy wrote:

On 2022-06-13 02:25, Suravee Suthikulpanit wrote:

When user requests to change IOMMU domain to a new type, IOMMU generic
layer checks the requested type against the default domain type returned
by vendor-specific IOMMU driver.

However, there is only one default domain type, and the current mechanism
does not allow a change if the requested type does not match the default type.


I don't really follow the reasoning here. If a driver's 
def_domain_type callback returns a specific type, it's saying that the 
device *has* to have that specific domain type for 
driver/platform-specific reasons. 


Agree, and I understand this part.

If 
that's not the case, then the driver shouldn't say so in the first place.


Considering the case:
1. Boot w/ default domain = IOMMU_DOMAIN_DMA_FQ
2. The user wants to change to IOMMU_DOMAIN_IDENTITY, which is not supported 
by the IOMMU driver. In this case, the driver can return 
IOMMU_DOMAIN_DMA_FQ and prevent the mode change.
3. However, if the user wants to change to IOMMU_DOMAIN_DMA, the driver can 
support this. But since def_domain_type() returns 
IOMMU_DOMAIN_DMA_FQ, it ends up preventing the mode change.


Why would a driver be forcing IOMMU_DOMAIN_DMA_FQ for a device though? 
Nobody's doing that today, and semantically it wouldn't really make 
sense - forcing translation to deny passthrough on a device-specific 
basis (beyond the common handling of untrusted devices) *might* be a 
thing, but the performance/strictness tradeoff of using a flush queue or 
not is surely a subjective user decision, not an objective platform one.


IIUC, we should support step 3 above. Basically, the newly proposed 
interface allows us to check with the IOMMU driver whether it can support 
certain domain types before trying to allocate the domain.


Indeed we could do that - as a much more comprehensive change to the 
internal domain_alloc interfaces - but do we really need to? If we 
succeed at allocating a domain then we know it's supported; if it fails 
then we can't give the user what they asked for, regardless of the exact 
reason why - what do we gain from doubling the number of potential 
failure paths that we have to handle?



Introduce a check_domain_type_supported() callback in iommu_ops, which 
allows the IOMMU generic layer to check with the vendor-specific IOMMU 
driver whether the requested type is supported. This allows the user to 
request types other than the default type.


Note also that you're only adding this in the sysfs path - what about 
the "iommu.passthrough=" parameter or CONFIG_IOMMU_DEFAULT_PASSTHROUGH?


For SNP case, we cannot enable SNP if iommu=off or iommu=pt or 
iommu.passthrough=1 or CONFIG_IOMMU_DEFAULT_PASSTHROUGH=y.
So, when another driver tries to enable SNP, the IOMMU driver prevents 
it (see iommu_sev_snp_supported() in patch 3).


Ugh, I hadn't looked too closely at the other patches, but an interface 
that looks like a simple "is this feature supported?" check with a 
secret side-effect of changing global behaviour as well? Yuck :(


What external drivers are expected to have the authority to affect the 
entire system and call that? The fact that you're exporting it suggests 
they could be loaded from modules *after* v2 features have been enabled 
and/or the user has configured a non-default identity domain for a group 
via sysfs... Fun!


Instead, if we boot with iommu.passthrough=0, when another driver tries 
to enable SNP, the IOMMU driver allows this and switches to SNP-enabled mode.
Subsequently, if the user tries to switch a domain (via sysfs) to 
IOMMU_DOMAIN_IDENTITY, the IOMMU driver needs to prevent this because it 
has already switched to SNP-enabled mode.

AFAICS there shouldn't need to be any core-level changes to support 
this. We already have drivers which don't support passthrough at all, 
so conditionally not supporting it should be no big deal. What should 
happen currently is that def_domain_type returns 0 for "don't care", 
then domain_alloc rejects IOMMU_DOMAIN_IDENTITY and and returns NULL, 
so iommu_group_alloc_default_domain() falls back to IOMMU_DOMAIN_DMA.


Technically, we can do it the way you suggest. But isn't this confusing? 
At first, def_domain_type() returns 0 for "don't care",
but then it rejects the request to change to IOMMU_DOMAIN_IDENTITY when 
trying to call domain_alloc().


Yes, that's how it works; def_domain_type is responsible for quirking 
individual *devices* that need to have a specific domain type (in 
practice, devices which need identity mapping), while domain_alloc is 
responsible for saying which domain types the driver supports as a 
whole, by allocating them or not as appropriate.


We don't have a particularly neat way to achieve the negative of 
def_domain_type - i.e. saying that a specific device *can't* use a 
specific otherwise-supported domain type - other than subsequently 
failing in attach_dev, but so far we've not 

[PATCH 10/10] ARM/dma-mapping: merge IOMMU ops

2022-06-14 Thread Christoph Hellwig
From: Robin Murphy 

The dma_sync_* operations are now the only difference between the
coherent and non-coherent IOMMU ops. Some minor tweaks to make those
safe for coherent devices with minimal overhead, and we can condense
down to a single set of DMA ops.

Signed-off-by: Robin Murphy 
Signed-off-by: Christoph Hellwig 
Tested-by: Marc Zyngier 
---
 arch/arm/mm/dma-mapping.c | 37 +
 1 file changed, 13 insertions(+), 24 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index e7ccf7c82e025..e68d1d2ac4be0 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -1341,6 +1341,9 @@ static void arm_iommu_sync_sg_for_cpu(struct device *dev,
struct scatterlist *s;
int i;
 
+   if (dev->dma_coherent)
+   return;
+
for_each_sg(sg, s, nents, i)
__dma_page_dev_to_cpu(sg_page(s), s->offset, s->length, dir);
 
@@ -1360,6 +1363,9 @@ static void arm_iommu_sync_sg_for_device(struct device 
*dev,
struct scatterlist *s;
int i;
 
+   if (dev->dma_coherent)
+   return;
+
for_each_sg(sg, s, nents, i)
__dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
 }
@@ -1493,12 +1499,13 @@ static void arm_iommu_sync_single_for_cpu(struct device 
*dev,
 {
struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
dma_addr_t iova = handle & PAGE_MASK;
-   struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, 
iova));
+   struct page *page;
unsigned int offset = handle & ~PAGE_MASK;
 
-   if (!iova)
+   if (dev->dma_coherent || !iova)
return;
 
+   page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
__dma_page_dev_to_cpu(page, offset, size, dir);
 }
 
@@ -1507,12 +1514,13 @@ static void arm_iommu_sync_single_for_device(struct 
device *dev,
 {
struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
dma_addr_t iova = handle & PAGE_MASK;
-   struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, 
iova));
+   struct page *page;
unsigned int offset = handle & ~PAGE_MASK;
 
-   if (!iova)
+   if (dev->dma_coherent || !iova)
return;
 
+   page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
__dma_page_cpu_to_dev(page, offset, size, dir);
 }
 
@@ -1536,22 +1544,6 @@ static const struct dma_map_ops iommu_ops = {
.unmap_resource = arm_iommu_unmap_resource,
 };
 
-static const struct dma_map_ops iommu_coherent_ops = {
-   .alloc  = arm_iommu_alloc_attrs,
-   .free   = arm_iommu_free_attrs,
-   .mmap   = arm_iommu_mmap_attrs,
-   .get_sgtable= arm_iommu_get_sgtable,
-
-   .map_page   = arm_iommu_map_page,
-   .unmap_page = arm_iommu_unmap_page,
-
-   .map_sg = arm_iommu_map_sg,
-   .unmap_sg   = arm_iommu_unmap_sg,
-
-   .map_resource   = arm_iommu_map_resource,
-   .unmap_resource = arm_iommu_unmap_resource,
-};
-
 /**
  * arm_iommu_create_mapping
  * @bus: pointer to the bus holding the client device (for IOMMU calls)
@@ -1750,10 +1742,7 @@ static void arm_setup_iommu_dma_ops(struct device *dev, 
u64 dma_base, u64 size,
return;
}
 
-   if (coherent)
-   set_dma_ops(dev, &iommu_coherent_ops);
-   else
-   set_dma_ops(dev, &iommu_ops);
+   set_dma_ops(dev, &iommu_ops);
 }
 
 static void arm_teardown_iommu_dma_ops(struct device *dev)
-- 
2.30.2



[PATCH 09/10] ARM/dma-mapping: consolidate IOMMU ops callbacks

2022-06-14 Thread Christoph Hellwig
From: Robin Murphy 

Merge the coherent and non-coherent callbacks down to a single
implementation each, relying on the generic dev->dma_coherent
flag at the points where the difference matters.

Signed-off-by: Robin Murphy 
Signed-off-by: Christoph Hellwig 
Tested-by: Marc Zyngier 
---
 arch/arm/mm/dma-mapping.c | 238 +-
 1 file changed, 55 insertions(+), 183 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 4055f2dc2859e..e7ccf7c82e025 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -1079,13 +1079,13 @@ static void __iommu_free_atomic(struct device *dev, 
void *cpu_addr,
__free_from_pool(cpu_addr, size);
 }
 
-static void *__arm_iommu_alloc_attrs(struct device *dev, size_t size,
-   dma_addr_t *handle, gfp_t gfp, unsigned long attrs,
-   int coherent_flag)
+static void *arm_iommu_alloc_attrs(struct device *dev, size_t size,
+   dma_addr_t *handle, gfp_t gfp, unsigned long attrs)
 {
pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL);
struct page **pages;
void *addr = NULL;
+   int coherent_flag = dev->dma_coherent ? COHERENT : NORMAL;
 
*handle = DMA_MAPPING_ERROR;
size = PAGE_ALIGN(size);
@@ -1128,19 +1128,7 @@ static void *__arm_iommu_alloc_attrs(struct device *dev, 
size_t size,
return NULL;
 }
 
-static void *arm_iommu_alloc_attrs(struct device *dev, size_t size,
-   dma_addr_t *handle, gfp_t gfp, unsigned long attrs)
-{
-   return __arm_iommu_alloc_attrs(dev, size, handle, gfp, attrs, NORMAL);
-}
-
-static void *arm_coherent_iommu_alloc_attrs(struct device *dev, size_t size,
-   dma_addr_t *handle, gfp_t gfp, unsigned long attrs)
-{
-   return __arm_iommu_alloc_attrs(dev, size, handle, gfp, attrs, COHERENT);
-}
-
-static int __arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct 
*vma,
+static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
void *cpu_addr, dma_addr_t dma_addr, size_t size,
unsigned long attrs)
 {
@@ -1154,35 +1142,24 @@ static int __arm_iommu_mmap_attrs(struct device *dev, 
struct vm_area_struct *vma
if (vma->vm_pgoff >= nr_pages)
return -ENXIO;
 
+   if (!dev->dma_coherent)
+   vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot);
+
err = vm_map_pages(vma, pages, nr_pages);
if (err)
pr_err("Remapping memory failed: %d\n", err);
 
return err;
 }
-static int arm_iommu_mmap_attrs(struct device *dev,
-   struct vm_area_struct *vma, void *cpu_addr,
-   dma_addr_t dma_addr, size_t size, unsigned long attrs)
-{
-   vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot);
-
-   return __arm_iommu_mmap_attrs(dev, vma, cpu_addr, dma_addr, size, 
attrs);
-}
-
-static int arm_coherent_iommu_mmap_attrs(struct device *dev,
-   struct vm_area_struct *vma, void *cpu_addr,
-   dma_addr_t dma_addr, size_t size, unsigned long attrs)
-{
-   return __arm_iommu_mmap_attrs(dev, vma, cpu_addr, dma_addr, size, 
attrs);
-}
 
 /*
  * free a page as defined by the above mapping.
  * Must not be called with IRQs disabled.
  */
-static void __arm_iommu_free_attrs(struct device *dev, size_t size, void 
*cpu_addr,
-   dma_addr_t handle, unsigned long attrs, int coherent_flag)
+static void arm_iommu_free_attrs(struct device *dev, size_t size, void 
*cpu_addr,
+   dma_addr_t handle, unsigned long attrs)
 {
+   int coherent_flag = dev->dma_coherent ? COHERENT : NORMAL;
struct page **pages;
size = PAGE_ALIGN(size);
 
@@ -1204,19 +1181,6 @@ static void __arm_iommu_free_attrs(struct device *dev, 
size_t size, void *cpu_ad
__iommu_free_buffer(dev, pages, size, attrs);
 }
 
-static void arm_iommu_free_attrs(struct device *dev, size_t size,
-void *cpu_addr, dma_addr_t handle,
-unsigned long attrs)
-{
-   __arm_iommu_free_attrs(dev, size, cpu_addr, handle, attrs, NORMAL);
-}
-
-static void arm_coherent_iommu_free_attrs(struct device *dev, size_t size,
-   void *cpu_addr, dma_addr_t handle, unsigned long attrs)
-{
-   __arm_iommu_free_attrs(dev, size, cpu_addr, handle, attrs, COHERENT);
-}
-
 static int arm_iommu_get_sgtable(struct device *dev, struct sg_table *sgt,
 void *cpu_addr, dma_addr_t dma_addr,
 size_t size, unsigned long attrs)
@@ -1236,8 +1200,7 @@ static int arm_iommu_get_sgtable(struct device *dev, 
struct sg_table *sgt,
  */
 static int __map_sg_chunk(struct device *dev, struct scatterlist *sg,
  size_t size, dma_addr_t *handle,
- enum dma_data_direction dir, unsigned long attrs,
- bool is_coherent)
+   

[PATCH 08/10] ARM/dma-mapping: drop .dma_supported for IOMMU ops

2022-06-14 Thread Christoph Hellwig
From: Robin Murphy 

When an IOMMU is present, we trust that it should be capable
of remapping any physical memory, and since the device masks
represent the input (virtual) addresses to the IOMMU it makes
no sense to validate them against physical PFNs anyway.
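For reference, a toy model of the check being removed: the old hook translated the device's DMA mask to a PFN and compared it against the platform's maximum DMA-able PFN. Values below are illustrative, not from any real platform:

```python
# Sketch of the removed arm_dma_supported() logic.
# dma_to_phys(dev, mask) is simplified to a constant-offset translation.
PAGE_SHIFT = 12

def dma_supported(mask, max_dma_pfn, dma_offset=0):
    # the PFN derived from the mask must reach the highest DMA-able page
    return ((mask + dma_offset) >> PAGE_SHIFT) >= max_dma_pfn

assert dma_supported(0xffffffff, max_dma_pfn=0xfffff)      # 32-bit mask, 4 GiB of RAM
assert not dma_supported(0x00ffffff, max_dma_pfn=0xfffff)  # 24-bit mask too small
```

Behind an IOMMU this comparison is meaningless, since the mask constrains IOMMU input addresses, not physical PFNs — hence the removal.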

Signed-off-by: Robin Murphy 
Signed-off-by: Christoph Hellwig 
Tested-by: Marc Zyngier 
---
 arch/arm/mm/dma-mapping.c | 23 ---
 1 file changed, 23 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index ceb56928d01ec..4055f2dc2859e 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -104,25 +104,6 @@ static struct arm_dma_buffer *arm_dma_buffer_find(void 
*virt)
  *
  */
 
-#ifdef CONFIG_ARM_DMA_USE_IOMMU
-/*
- * Return whether the given device DMA address mask can be supported
- * properly.  For example, if your device can only drive the low 24-bits
- * during bus mastering, then you would pass 0x00ff as the mask
- * to this function.
- */
-static int arm_dma_supported(struct device *dev, u64 mask)
-{
-   unsigned long max_dma_pfn = min(max_pfn - 1, arm_dma_pfn_limit);
-
-   /*
-* Translate the device's DMA mask to a PFN limit.  This
-* PFN number includes the page which we can DMA to.
-*/
-   return PHYS_PFN(dma_to_phys(dev, mask)) >= max_dma_pfn;
-}
-#endif
-
 static void __dma_clear_buffer(struct page *page, size_t size, int 
coherent_flag)
 {
/*
@@ -1681,8 +1662,6 @@ static const struct dma_map_ops iommu_ops = {
 
.map_resource   = arm_iommu_map_resource,
.unmap_resource = arm_iommu_unmap_resource,
-
-   .dma_supported  = arm_dma_supported,
 };
 
 static const struct dma_map_ops iommu_coherent_ops = {
@@ -1699,8 +1678,6 @@ static const struct dma_map_ops iommu_coherent_ops = {
 
.map_resource   = arm_iommu_map_resource,
.unmap_resource = arm_iommu_unmap_resource,
-
-   .dma_supported  = arm_dma_supported,
 };
 
 /**
-- 
2.30.2



[PATCH 07/10] ARM/dma-mapping: use dma-direct unconditionally

2022-06-14 Thread Christoph Hellwig
Use dma-direct unconditionally on arm.  It has already been used for
some time for LPAE and nommu configurations.

This mostly changes the streaming mapping implementation and the (simple)
coherent allocator for devices that are DMA coherent.  The existing
complex allocator for uncached mappings for non-coherent devices is still
used as-is via the arch_dma_alloc/arch_dma_free hooks.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Arnd Bergmann 
Acked-by: Andre Przywara  [highbank]
Tested-by: Marc Zyngier 
---
 arch/arm/Kconfig   |   4 +-
 arch/arm/include/asm/dma-mapping.h |  24 --
 arch/arm/mach-highbank/highbank.c  |   2 +-
 arch/arm/mach-mvebu/coherency.c|   2 +-
 arch/arm/mm/dma-mapping.c  | 365 ++---
 5 files changed, 19 insertions(+), 378 deletions(-)
 delete mode 100644 arch/arm/include/asm/dma-mapping.h

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index cd67e359958cb..4c18fe7b5d1cf 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -19,8 +19,8 @@ config ARM
select ARCH_HAS_SET_MEMORY
select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL
select ARCH_HAS_STRICT_MODULE_RWX if MMU
-   select ARCH_HAS_SYNC_DMA_FOR_DEVICE if SWIOTLB || !MMU
-   select ARCH_HAS_SYNC_DMA_FOR_CPU if SWIOTLB || !MMU
+   select ARCH_HAS_SYNC_DMA_FOR_DEVICE
+   select ARCH_HAS_SYNC_DMA_FOR_CPU
select ARCH_HAS_TEARDOWN_DMA_OPS if MMU
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
select ARCH_HAVE_CUSTOM_GPIO_H
diff --git a/arch/arm/include/asm/dma-mapping.h 
b/arch/arm/include/asm/dma-mapping.h
deleted file mode 100644
index 6427b934bd11c..0
--- a/arch/arm/include/asm/dma-mapping.h
+++ /dev/null
@@ -1,24 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef ASMARM_DMA_MAPPING_H
-#define ASMARM_DMA_MAPPING_H
-
-#ifdef __KERNEL__
-
-#include 
-#include 
-
-#include 
-#include 
-
-extern const struct dma_map_ops arm_dma_ops;
-extern const struct dma_map_ops arm_coherent_dma_ops;
-
-static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
-{
-   if (IS_ENABLED(CONFIG_MMU) && !IS_ENABLED(CONFIG_ARM_LPAE))
-   return _dma_ops;
-   return NULL;
-}
-
-#endif /* __KERNEL__ */
-#endif
diff --git a/arch/arm/mach-highbank/highbank.c 
b/arch/arm/mach-highbank/highbank.c
index db607955a7e45..5d4f977ac7d2a 100644
--- a/arch/arm/mach-highbank/highbank.c
+++ b/arch/arm/mach-highbank/highbank.c
@@ -98,7 +98,7 @@ static int highbank_platform_notifier(struct notifier_block 
*nb,
if (of_property_read_bool(dev->of_node, "dma-coherent")) {
val = readl(sregs_base + reg);
writel(val | 0xff01, sregs_base + reg);
-   set_dma_ops(dev, &arm_coherent_dma_ops);
+   dev->dma_coherent = true;
}
 
return NOTIFY_OK;
diff --git a/arch/arm/mach-mvebu/coherency.c b/arch/arm/mach-mvebu/coherency.c
index 49e3c8d20c2fa..865ac4bc060df 100644
--- a/arch/arm/mach-mvebu/coherency.c
+++ b/arch/arm/mach-mvebu/coherency.c
@@ -98,7 +98,7 @@ static int mvebu_hwcc_notifier(struct notifier_block *nb,
 
if (event != BUS_NOTIFY_ADD_DEVICE)
return NOTIFY_DONE;
-   set_dma_ops(dev, &arm_coherent_dma_ops);
+   dev->dma_coherent = true;
 
return NOTIFY_OK;
 }
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index a09ce16c7ddbd..ceb56928d01ec 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -103,79 +103,8 @@ static struct arm_dma_buffer *arm_dma_buffer_find(void 
*virt)
  * before transfers and delay cache invalidation until transfer completion.
  *
  */
-static void __dma_page_cpu_to_dev(struct page *, unsigned long,
-   size_t, enum dma_data_direction);
-static void __dma_page_dev_to_cpu(struct page *, unsigned long,
-   size_t, enum dma_data_direction);
-
-/**
- * arm_dma_map_page - map a portion of a page for streaming DMA
- * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
- * @page: page that buffer resides in
- * @offset: offset into page for start of buffer
- * @size: size of buffer to map
- * @dir: DMA transfer direction
- *
- * Ensure that any data held in the cache is appropriately discarded
- * or written back.
- *
- * The device owns this memory once this call has completed.  The CPU
- * can regain ownership by calling dma_unmap_page().
- */
-static dma_addr_t arm_dma_map_page(struct device *dev, struct page *page,
-unsigned long offset, size_t size, enum dma_data_direction dir,
-unsigned long attrs)
-{
-   if ((attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
-   __dma_page_cpu_to_dev(page, offset, size, dir);
-   return phys_to_dma(dev, page_to_phys(page) + offset);
-}
-
-static dma_addr_t arm_coherent_dma_map_page(struct device *dev, struct page 
*page,
-unsigned long offset, size_t size, enum dma_data_direction dir,
-

[PATCH 05/10] ARM/dma-mapping: use dma_to_phys/phys_to_dma in the dma-mapping code

2022-06-14 Thread Christoph Hellwig
Use the helpers as expected by the dma-direct code in the old arm
dma-mapping code to ease a gradual switch to the common DMA code.
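A quick model of why the conversion is mechanical: with a constant bus offset, the pfn-based and phys-based formulations compute the same DMA address (toy values; the offset and addresses below are illustrative):

```python
# pfn_to_dma(page_to_pfn(page)) + offset  ==  phys_to_dma(page_to_phys(page) + offset)
# when the bus translation is a constant offset.
PAGE_SHIFT = 12
DMA_OFFSET = 0x8000_0000  # illustrative bus offset

def phys_to_dma(phys):
    return phys + DMA_OFFSET

def pfn_to_dma(pfn):
    return (pfn << PAGE_SHIFT) + DMA_OFFSET

page_phys = 0x4000_0000   # page-aligned physical address of the page
offset = 0x123            # offset within the page

old_form = pfn_to_dma(page_phys >> PAGE_SHIFT) + offset
new_form = phys_to_dma(page_phys + offset)
assert old_form == new_form
```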

Signed-off-by: Christoph Hellwig 
Reviewed-by: Arnd Bergmann 
Tested-by: Marc Zyngier 
---
 arch/arm/mm/dma-mapping.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 90142183d1045..a09ce16c7ddbd 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -128,14 +128,14 @@ static dma_addr_t arm_dma_map_page(struct device *dev, struct page *page,
 {
if ((attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
__dma_page_cpu_to_dev(page, offset, size, dir);
-   return pfn_to_dma(dev, page_to_pfn(page)) + offset;
+   return phys_to_dma(dev, page_to_phys(page) + offset);
 }
 
 static dma_addr_t arm_coherent_dma_map_page(struct device *dev, struct page *page,
 unsigned long offset, size_t size, enum dma_data_direction dir,
 unsigned long attrs)
 {
-   return pfn_to_dma(dev, page_to_pfn(page)) + offset;
+   return phys_to_dma(dev, page_to_phys(page) + offset);
 }
 
 /**
@@ -156,7 +156,7 @@ static void arm_dma_unmap_page(struct device *dev, dma_addr_t handle,
size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
if ((attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
-   __dma_page_dev_to_cpu(pfn_to_page(dma_to_pfn(dev, handle)),
+   __dma_page_dev_to_cpu(phys_to_page(dma_to_phys(dev, handle)),
  handle & ~PAGE_MASK, size, dir);
 }
 
@@ -164,7 +164,7 @@ static void arm_dma_sync_single_for_cpu(struct device *dev,
dma_addr_t handle, size_t size, enum dma_data_direction dir)
 {
unsigned int offset = handle & (PAGE_SIZE - 1);
-   struct page *page = pfn_to_page(dma_to_pfn(dev, handle-offset));
+   struct page *page = phys_to_page(dma_to_phys(dev, handle-offset));
__dma_page_dev_to_cpu(page, offset, size, dir);
 }
 
@@ -172,7 +172,7 @@ static void arm_dma_sync_single_for_device(struct device *dev,
dma_addr_t handle, size_t size, enum dma_data_direction dir)
 {
unsigned int offset = handle & (PAGE_SIZE - 1);
-   struct page *page = pfn_to_page(dma_to_pfn(dev, handle-offset));
+   struct page *page = phys_to_page(dma_to_phys(dev, handle-offset));
__dma_page_cpu_to_dev(page, offset, size, dir);
 }
 
@@ -190,7 +190,7 @@ static int arm_dma_supported(struct device *dev, u64 mask)
 * Translate the device's DMA mask to a PFN limit.  This
 * PFN number includes the page which we can DMA to.
 */
-   return dma_to_pfn(dev, mask) >= max_dma_pfn;
+   return PHYS_PFN(dma_to_phys(dev, mask)) >= max_dma_pfn;
 }
 
 static void __dma_clear_buffer(struct page *page, size_t size, int coherent_flag)
@@ -681,7 +681,7 @@ static void *__dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
if (page) {
unsigned long flags;
 
-   *handle = pfn_to_dma(dev, page_to_pfn(page));
+   *handle = phys_to_dma(dev, page_to_phys(page));
buf->virt = args.want_vaddr ? addr : page;
 
 		spin_lock_irqsave(&arm_dma_bufs_lock, flags);
@@ -721,7 +721,7 @@ static int __arm_dma_mmap(struct device *dev, struct vm_area_struct *vma,
int ret = -ENXIO;
unsigned long nr_vma_pages = vma_pages(vma);
unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
-   unsigned long pfn = dma_to_pfn(dev, dma_addr);
+   unsigned long pfn = PHYS_PFN(dma_to_phys(dev, dma_addr));
unsigned long off = vma->vm_pgoff;
 
	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
@@ -762,7 +762,7 @@ static void __arm_dma_free(struct device *dev, size_t size, void *cpu_addr,
   dma_addr_t handle, unsigned long attrs,
   bool is_coherent)
 {
-   struct page *page = pfn_to_page(dma_to_pfn(dev, handle));
+   struct page *page = phys_to_page(dma_to_phys(dev, handle));
struct arm_dma_buffer *buf;
struct arm_dma_free_args args = {
.dev = dev,
@@ -796,15 +796,15 @@ static int arm_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
 void *cpu_addr, dma_addr_t handle, size_t size,
 unsigned long attrs)
 {
-   unsigned long pfn = dma_to_pfn(dev, handle);
+   phys_addr_t paddr = dma_to_phys(dev, handle);
struct page *page;
int ret;
 
/* If the PFN is not valid, we do not have a struct page */
-   if (!pfn_valid(pfn))
+   if (!pfn_valid(PHYS_PFN(paddr)))
return -ENXIO;
 
-   page = pfn_to_page(pfn);
+   page = phys_to_page(paddr);
 
ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
if (unlikely(ret))
-- 
2.30.2

___
iommu mailing list

[PATCH 06/10] ARM/dma-mapping: use the generic versions of dma_to_phys/phys_to_dma by default

2022-06-14 Thread Christoph Hellwig
Only the footbridge platforms provide their own DMA address translation
helpers, so switch to the generic version for all other platforms, and
consolidate the footbridge implementation to remove two levels of
indirection.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Arnd Bergmann 
Tested-by: Marc Zyngier 
---
 arch/arm/Kconfig  |  1 -
 arch/arm/include/asm/dma-direct.h | 41 +--
 arch/arm/include/asm/memory.h |  2 -
 arch/arm/mach-footbridge/Kconfig  |  1 +
 arch/arm/mach-footbridge/common.c | 19 +
 .../mach-footbridge/include/mach/dma-direct.h |  8 
 .../arm/mach-footbridge/include/mach/memory.h |  4 --
 7 files changed, 21 insertions(+), 55 deletions(-)
 create mode 100644 arch/arm/mach-footbridge/include/mach/dma-direct.h

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 7630ba9cb6ccc..cd67e359958cb 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -15,7 +15,6 @@ config ARM
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
select ARCH_HAS_PTE_SPECIAL if ARM_LPAE
-   select ARCH_HAS_PHYS_TO_DMA
select ARCH_HAS_SETUP_DMA_OPS
select ARCH_HAS_SET_MEMORY
select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL
diff --git a/arch/arm/include/asm/dma-direct.h b/arch/arm/include/asm/dma-direct.h
index 6fd52713b5d12..4f7bcde03abb5 100644
--- a/arch/arm/include/asm/dma-direct.h
+++ b/arch/arm/include/asm/dma-direct.h
@@ -1,40 +1 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef ASM_ARM_DMA_DIRECT_H
-#define ASM_ARM_DMA_DIRECT_H 1
-
-#include 
-
-/*
- * dma_to_pfn/pfn_to_dma are architecture private
- * functions used internally by the DMA-mapping API to provide DMA
- * addresses. They must not be used by drivers.
- */
-static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
-{
-   if (dev && dev->dma_range_map)
-   pfn = PFN_DOWN(translate_phys_to_dma(dev, PFN_PHYS(pfn)));
-   return (dma_addr_t)__pfn_to_bus(pfn);
-}
-
-static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
-{
-   unsigned long pfn = __bus_to_pfn(addr);
-
-   if (dev && dev->dma_range_map)
-   pfn = PFN_DOWN(translate_dma_to_phys(dev, PFN_PHYS(pfn)));
-   return pfn;
-}
-
-static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
-{
-   unsigned int offset = paddr & ~PAGE_MASK;
-   return pfn_to_dma(dev, __phys_to_pfn(paddr)) + offset;
-}
-
-static inline phys_addr_t dma_to_phys(struct device *dev, dma_addr_t dev_addr)
-{
-   unsigned int offset = dev_addr & ~PAGE_MASK;
-   return __pfn_to_phys(dma_to_pfn(dev, dev_addr)) + offset;
-}
-
-#endif /* ASM_ARM_DMA_DIRECT_H */
+#include 
diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index f673e13e0f942..a55a9038abc89 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -378,8 +378,6 @@ static inline unsigned long __virt_to_idmap(unsigned long x)
 #ifndef __virt_to_bus
 #define __virt_to_bus  __virt_to_phys
 #define __bus_to_virt  __phys_to_virt
-#define __pfn_to_bus(x)__pfn_to_phys(x)
-#define __bus_to_pfn(x)__phys_to_pfn(x)
 #endif
 
 /*
diff --git a/arch/arm/mach-footbridge/Kconfig b/arch/arm/mach-footbridge/Kconfig
index 728aff93fba9d..b5bbdcf2e4896 100644
--- a/arch/arm/mach-footbridge/Kconfig
+++ b/arch/arm/mach-footbridge/Kconfig
@@ -60,6 +60,7 @@ endmenu
 
 # Footbridge support
 config FOOTBRIDGE
+   select ARCH_HAS_PHYS_TO_DMA
bool
 
 # Footbridge in host mode
diff --git a/arch/arm/mach-footbridge/common.c b/arch/arm/mach-footbridge/common.c
index 322495df271d5..5020eb96b025d 100644
--- a/arch/arm/mach-footbridge/common.c
+++ b/arch/arm/mach-footbridge/common.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -335,17 +336,19 @@ unsigned long __bus_to_virt(unsigned long res)
return res;
 }
 EXPORT_SYMBOL(__bus_to_virt);
-
-unsigned long __pfn_to_bus(unsigned long pfn)
+#else
+static inline unsigned long fb_bus_sdram_offset(void)
 {
-   return __pfn_to_phys(pfn) + (fb_bus_sdram_offset() - PHYS_OFFSET);
+   return BUS_OFFSET;
 }
-EXPORT_SYMBOL(__pfn_to_bus);
+#endif /* CONFIG_FOOTBRIDGE_ADDIN */
 
-unsigned long __bus_to_pfn(unsigned long bus)
+dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
 {
-   return __phys_to_pfn(bus - (fb_bus_sdram_offset() - PHYS_OFFSET));
+   return paddr + (fb_bus_sdram_offset() - PHYS_OFFSET);
 }
-EXPORT_SYMBOL(__bus_to_pfn);
 
-#endif
+phys_addr_t dma_to_phys(struct device *dev, dma_addr_t dev_addr)
+{
+   return dev_addr - (fb_bus_sdram_offset() - PHYS_OFFSET);
+}
diff --git a/arch/arm/mach-footbridge/include/mach/dma-direct.h b/arch/arm/mach-footbridge/include/mach/dma-direct.h
new file mode 100644
index 0000000000000..01f9e8367c009
--- /dev/null
+++ b/arch/arm/mach-footbridge/include/mach/dma-direct.h

[PATCH 04/10] ARM/dma-mapping: remove the unused virt_to_dma helper

2022-06-14 Thread Christoph Hellwig
virt_to_dma was only used by the now removed dmabounce code.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Arnd Bergmann 
Tested-by: Marc Zyngier 
---
 arch/arm/include/asm/dma-direct.h | 10 +-
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/arch/arm/include/asm/dma-direct.h b/arch/arm/include/asm/dma-direct.h
index 77fcb7ee5ec90..6fd52713b5d12 100644
--- a/arch/arm/include/asm/dma-direct.h
+++ b/arch/arm/include/asm/dma-direct.h
@@ -5,7 +5,7 @@
 #include 
 
 /*
- * dma_to_pfn/pfn_to_dma/virt_to_dma are architecture private
+ * dma_to_pfn/pfn_to_dma are architecture private
  * functions used internally by the DMA-mapping API to provide DMA
  * addresses. They must not be used by drivers.
  */
@@ -25,14 +25,6 @@ static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
return pfn;
 }
 
-static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
-{
-   if (dev)
-   return pfn_to_dma(dev, virt_to_pfn(addr));
-
-   return (dma_addr_t)__virt_to_bus((unsigned long)(addr));
-}
-
 static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
 {
unsigned int offset = paddr & ~PAGE_MASK;
-- 
2.30.2



[PATCH 03/10] ARM/dma-mapping: mark various dma-mapping routines static in dma-mapping.c

2022-06-14 Thread Christoph Hellwig
With the dmabounce removal these aren't used outside of dma-mapping.c,
so mark them static.  Move the dma_map_ops declarations down a bit
to avoid lots of forward declarations.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Arnd Bergmann 
Tested-by: Marc Zyngier 
---
 arch/arm/include/asm/dma-mapping.h |  75 --
 arch/arm/mm/dma-mapping.c  | 100 +
 2 files changed, 46 insertions(+), 129 deletions(-)

diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index 1e015a7ad86aa..6427b934bd11c 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -20,80 +20,5 @@ static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
return NULL;
 }
 
-/**
- * arm_dma_alloc - allocate consistent memory for DMA
- * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
- * @size: required memory size
- * @handle: bus-specific DMA address
- * @attrs: optinal attributes that specific mapping properties
- *
- * Allocate some memory for a device for performing DMA.  This function
- * allocates pages, and will return the CPU-viewed address, and sets @handle
- * to be the device-viewed address.
- */
-extern void *arm_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
-  gfp_t gfp, unsigned long attrs);
-
-/**
- * arm_dma_free - free memory allocated by arm_dma_alloc
- * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
- * @size: size of memory originally requested in dma_alloc_coherent
- * @cpu_addr: CPU-view address returned from dma_alloc_coherent
- * @handle: device-view address returned from dma_alloc_coherent
- * @attrs: optinal attributes that specific mapping properties
- *
- * Free (and unmap) a DMA buffer previously allocated by
- * arm_dma_alloc().
- *
- * References to memory and mappings associated with cpu_addr/handle
- * during and after this call executing are illegal.
- */
-extern void arm_dma_free(struct device *dev, size_t size, void *cpu_addr,
-dma_addr_t handle, unsigned long attrs);
-
-/**
- * arm_dma_mmap - map a coherent DMA allocation into user space
- * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
- * @vma: vm_area_struct describing requested user mapping
- * @cpu_addr: kernel CPU-view address returned from dma_alloc_coherent
- * @handle: device-view address returned from dma_alloc_coherent
- * @size: size of memory originally requested in dma_alloc_coherent
- * @attrs: optinal attributes that specific mapping properties
- *
- * Map a coherent DMA buffer previously allocated by dma_alloc_coherent
- * into user space.  The coherent DMA buffer must not be freed by the
- * driver until the user space mapping has been released.
- */
-extern int arm_dma_mmap(struct device *dev, struct vm_area_struct *vma,
-   void *cpu_addr, dma_addr_t dma_addr, size_t size,
-   unsigned long attrs);
-
-/*
- * For SA-1111, IXP425, and ADI systems  the dma-mapping functions are "magic"
- * and utilize bounce buffers as needed to work around limited DMA windows.
- *
- * On the SA-1111, a bug limits DMA to only certain regions of RAM.
- * On the IXP425, the PCI inbound window is 64MB (256MB total RAM)
- * On some ADI engineering systems, PCI inbound window is 32MB (12MB total RAM)
- *
- * The following are helper functions used by the dmabounce subystem
- *
- */
-
-/*
- * The scatter list versions of the above methods.
- */
-extern int arm_dma_map_sg(struct device *, struct scatterlist *, int,
-   enum dma_data_direction, unsigned long attrs);
-extern void arm_dma_unmap_sg(struct device *, struct scatterlist *, int,
-   enum dma_data_direction, unsigned long attrs);
-extern void arm_dma_sync_sg_for_cpu(struct device *, struct scatterlist *, int,
-   enum dma_data_direction);
-extern void arm_dma_sync_sg_for_device(struct device *, struct scatterlist *, int,
-   enum dma_data_direction);
-extern int arm_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
-   void *cpu_addr, dma_addr_t dma_addr, size_t size,
-   unsigned long attrs);
-
 #endif /* __KERNEL__ */
 #endif
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 059cce0185706..90142183d1045 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -193,50 +193,6 @@ static int arm_dma_supported(struct device *dev, u64 mask)
return dma_to_pfn(dev, mask) >= max_dma_pfn;
 }
 
-const struct dma_map_ops arm_dma_ops = {
-   .alloc  = arm_dma_alloc,
-   .free   = arm_dma_free,
-   .alloc_pages= dma_direct_alloc_pages,
-   .free_pages = dma_direct_free_pages,
-   .mmap   = arm_dma_mmap,
-   .get_sgtable= arm_dma_get_sgtable,
-   

[PATCH 01/10] ARM: sa1100/assabet: move dmabounce hack to ohci driver

2022-06-14 Thread Christoph Hellwig
From: Arnd Bergmann 

The sa1111 platform is one of the two remaining users of the old Arm
specific "dmabounce" code, which is an earlier implementation of the
generic swiotlb.

Linus Walleij submitted a patch that removes dmabounce support from
the ixp4xx, and I had a look at the other user, which is the sa1111
companion chip.

Looking at how dmabounce is used, I could narrow it down to one driver
on three machines:

 - dmabounce is only initialized on assabet/neponset, jornada720 and
   badge4, which are the platforms that have an sa1111 and support
   DMA on it.

 - All three of these suffer from "erratum #7" that requires only
   doing DMA to half the memory sections based on one of the address
   lines, in addition, the neponset also can't DMA to the RAM that
   is connected to sa1111 itself.

 - the pxa lubbock machine also has sa1111, but does not support DMA
   on it and does not set dmabounce.

 - only the OHCI and audio devices on sa1111 support DMA, but as
   there is no audio driver for this hardware, only OHCI remains.

In the OHCI code, I noticed that two other platforms already have
a local bounce buffer support in the form of the "local_mem"
allocator. Specifically, TMIO and SM501 use this on a few other ARM
boards with 16KB or 128KB of local SRAM that can be accessed from the
OHCI and from the CPU.

While this is not the same problem as on sa1111, I could not find a
reason why we can't re-use the existing implementation but replace the
physical SRAM address mapping with a locally allocated DMA buffer.

There are two main downsides:

 - rather than using a dynamically sized pool, this buffer needs
   to be allocated at probe time using a fixed size. Without
   having any idea of what it should be, I picked a size of
   64KB, which is between what the other two OHCI front-ends use
   in their SRAM. If anyone has a better idea of what size
   is reasonable, this can be trivially changed.

 - Previously, only USB transfers to unaddressable memory needed
   to go through the bounce buffer, now all of them do, which may
   impact runtime performance for USB endpoints that do a lot of
   transfers.

On the upside, the local_mem support uses write-combining buffers,
which should be a bit faster for transfers to the device compared to
normal uncached coherent memory as used in dmabounce.

Cc: Linus Walleij 
Cc: Russell King 
Cc: Christoph Hellwig 
Cc: Laurentiu Tudor 
Cc: linux-...@vger.kernel.org
Signed-off-by: Arnd Bergmann 
Reviewed-by: Greg Kroah-Hartman 
Acked-by: Alan Stern 
Signed-off-by: Christoph Hellwig 
---
 arch/arm/common/Kconfig|  2 +-
 arch/arm/common/sa1111.c   | 64 --
 drivers/usb/core/hcd.c | 17 +++--
 drivers/usb/host/ohci-sa1111.c | 25 +
 4 files changed, 40 insertions(+), 68 deletions(-)

diff --git a/arch/arm/common/Kconfig b/arch/arm/common/Kconfig
index c8e198631d418..bc158fd227e12 100644
--- a/arch/arm/common/Kconfig
+++ b/arch/arm/common/Kconfig
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 config SA1111
bool
-   select DMABOUNCE if !ARCH_PXA
+   select ZONE_DMA if ARCH_SA1100
 
 config DMABOUNCE
bool
diff --git a/arch/arm/common/sa1111.c b/arch/arm/common/sa1111.c
index 2343e2b6214d7..f5e6990b8856b 100644
--- a/arch/arm/common/sa1111.c
+++ b/arch/arm/common/sa1111.c
@@ -1389,70 +1389,9 @@ void sa1111_driver_unregister(struct sa1111_driver *driver)
 }
 EXPORT_SYMBOL(sa1111_driver_unregister);
 
-#ifdef CONFIG_DMABOUNCE
-/*
- * According to the "Intel StrongARM SA-1111 Microprocessor Companion
- * Chip Specification Update" (June 2000), erratum #7, there is a
- * significant bug in the SA-1111 SDRAM shared memory controller.  If
- * an access to a region of memory above 1MB relative to the bank base,
- * it is important that address bit 10 _NOT_ be asserted. Depending
- * on the configuration of the RAM, bit 10 may correspond to one
- * of several different (processor-relative) address bits.
- *
- * This routine only identifies whether or not a given DMA address
- * is susceptible to the bug.
- *
- * This should only get called for sa1111_device types due to the
- * way we configure our device dma_masks.
- */
-static int sa1111_needs_bounce(struct device *dev, dma_addr_t addr, size_t size)
-{
-   /*
-* Section 4.6 of the "Intel StrongARM SA-1111 Development Module
-* User's Guide" mentions that jumpers R51 and R52 control the
-* target of SA-1111 DMA (either SDRAM bank 0 on Assabet, or
-* SDRAM bank 1 on Neponset). The default configuration selects
-* Assabet, so any address in bank 1 is necessarily invalid.
-*/
-   return (machine_is_assabet() || machine_is_pfs168()) &&
-   (addr >= 0xc8000000 || (addr + size) >= 0xc8000000);
-}
-
-static int sa1111_notifier_call(struct notifier_block *n, unsigned long action,
-   void *data)
-{
-   struct sa1111_dev *dev = to_sa1111_device(data);
-
-   

[PATCH 02/10] ARM/dma-mapping: remove dmabounce

2022-06-14 Thread Christoph Hellwig
Remove the now unused dmabounce code.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Arnd Bergmann 
---
 arch/arm/common/Kconfig|   4 -
 arch/arm/common/Makefile   |   1 -
 arch/arm/common/dmabounce.c| 582 -
 arch/arm/include/asm/device.h  |   3 -
 arch/arm/include/asm/dma-mapping.h |  29 --
 5 files changed, 619 deletions(-)
 delete mode 100644 arch/arm/common/dmabounce.c

diff --git a/arch/arm/common/Kconfig b/arch/arm/common/Kconfig
index bc158fd227e12..d2fdb1796f488 100644
--- a/arch/arm/common/Kconfig
+++ b/arch/arm/common/Kconfig
@@ -3,10 +3,6 @@ config SA1111
bool
select ZONE_DMA if ARCH_SA1100
 
-config DMABOUNCE
-   bool
-   select ZONE_DMA
-
 config KRAIT_L2_ACCESSORS
bool
 
diff --git a/arch/arm/common/Makefile b/arch/arm/common/Makefile
index 8cd574be94cfe..7bae8cbaafe78 100644
--- a/arch/arm/common/Makefile
+++ b/arch/arm/common/Makefile
@@ -6,7 +6,6 @@
 obj-y  += firmware.o
 
 obj-$(CONFIG_SA1111)   += sa1111.o
-obj-$(CONFIG_DMABOUNCE)+= dmabounce.o
 obj-$(CONFIG_KRAIT_L2_ACCESSORS) += krait-l2-accessors.o
 obj-$(CONFIG_SHARP_LOCOMO) += locomo.o
 obj-$(CONFIG_SHARP_PARAM)  += sharpsl_param.o
diff --git a/arch/arm/common/dmabounce.c b/arch/arm/common/dmabounce.c
deleted file mode 100644
index 7996c04393d50..0
--- a/arch/arm/common/dmabounce.c
+++ /dev/null
@@ -1,582 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-/*
- *  arch/arm/common/dmabounce.c
- *
- *  Special dma_{map/unmap/dma_sync}_* routines for systems that have
- *  limited DMA windows. These functions utilize bounce buffers to
- *  copy data to/from buffers located outside the DMA region. This
- *  only works for systems in which DMA memory is at the bottom of
- *  RAM, the remainder of memory is at the top and the DMA memory
- *  can be marked as ZONE_DMA. Anything beyond that such as discontiguous
- *  DMA windows will require custom implementations that reserve memory
- *  areas at early bootup.
- *
- *  Original version by Brad Parker (b...@heeltoe.com)
- *  Re-written by Christopher Hoover 
- *  Made generic by Deepak Saxena 
- *
- *  Copyright (C) 2002 Hewlett Packard Company.
- *  Copyright (C) 2004 MontaVista Software, Inc.
- */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include 
-#include 
-
-#undef STATS
-
-#ifdef STATS
-#define DO_STATS(X) do { X ; } while (0)
-#else
-#define DO_STATS(X) do { } while (0)
-#endif
-
-/* ** */
-
-struct safe_buffer {
-   struct list_head node;
-
-   /* original request */
-   void*ptr;
-   size_t  size;
-   int direction;
-
-   /* safe buffer info */
-   struct dmabounce_pool *pool;
-   void*safe;
-   dma_addr_t  safe_dma_addr;
-};
-
-struct dmabounce_pool {
-   unsigned long   size;
-   struct dma_pool *pool;
-#ifdef STATS
-   unsigned long   allocs;
-#endif
-};
-
-struct dmabounce_device_info {
-   struct device *dev;
-   struct list_head safe_buffers;
-#ifdef STATS
-   unsigned long total_allocs;
-   unsigned long map_op_count;
-   unsigned long bounce_count;
-   int attr_res;
-#endif
-   struct dmabounce_pool   small;
-   struct dmabounce_pool   large;
-
-   rwlock_t lock;
-
-   int (*needs_bounce)(struct device *, dma_addr_t, size_t);
-};
-
-#ifdef STATS
-static ssize_t dmabounce_show(struct device *dev, struct device_attribute *attr,
- char *buf)
-{
-   struct dmabounce_device_info *device_info = dev->archdata.dmabounce;
-   return sprintf(buf, "%lu %lu %lu %lu %lu %lu\n",
-   device_info->small.allocs,
-   device_info->large.allocs,
-   device_info->total_allocs - device_info->small.allocs -
-   device_info->large.allocs,
-   device_info->total_allocs,
-   device_info->map_op_count,
-   device_info->bounce_count);
-}
-
-static DEVICE_ATTR(dmabounce_stats, 0400, dmabounce_show, NULL);
-#endif
-
-
-/* allocate a 'safe' buffer and keep track of it */
-static inline struct safe_buffer *
-alloc_safe_buffer(struct dmabounce_device_info *device_info, void *ptr,
- size_t size, enum dma_data_direction dir)
-{
-   struct safe_buffer *buf;
-   struct dmabounce_pool *pool;
-   struct device *dev = device_info->dev;
-   unsigned long flags;
-
-   dev_dbg(dev, "%s(ptr=%p, size=%d, dir=%d)\n",
-   __func__, ptr, size, dir);
-
-   if (size <= device_info->small.size) {
-   pool = &device_info->small;
-   } else if (size <= device_info->large.size) {
-   pool = &device_info->large;
-   } else {
-   pool = NULL;
-   }
-
-   buf = kmalloc(sizeof(struct safe_buffer), 

fully convert arm to use dma-direct v3

2022-06-14 Thread Christoph Hellwig
Hi all,

arm is the last platform not using the dma-direct code for directly
mapped DMA.  With the dmabounce removal from Arnd we can easily switch
arm to always use dma-direct now (it already does for LPAE configs
and nommu).  I'd love to merge this series through the dma-mapping tree
as it gives us the opportunity for additional core dma-mapping
improvements.

Changes since v2:
 - rebased to Linux 5.19-rc2

Changes since v1:
 - remove another unused function
 - improve a few commit logs
 - add three more patches from Robin

Diffstat:
 arch/arm/common/dmabounce.c  |  582 -
 arch/arm/include/asm/dma-mapping.h   |  128 ---
 b/arch/arm/Kconfig   |5 
 b/arch/arm/common/Kconfig|6 
 b/arch/arm/common/Makefile   |1 
 b/arch/arm/common/sa1111.c   |   64 -
 b/arch/arm/include/asm/device.h  |3 
 b/arch/arm/include/asm/dma-direct.h  |   49 -
 b/arch/arm/include/asm/memory.h  |2 
 b/arch/arm/mach-footbridge/Kconfig   |1 
 b/arch/arm/mach-footbridge/common.c  |   19 
 b/arch/arm/mach-footbridge/include/mach/dma-direct.h |8 
 b/arch/arm/mach-footbridge/include/mach/memory.h |4 
 b/arch/arm/mach-highbank/highbank.c  |2 
 b/arch/arm/mach-mvebu/coherency.c|2 
 b/arch/arm/mm/dma-mapping.c  |  649 ++-
 b/drivers/usb/core/hcd.c |   17 
 b/drivers/usb/host/ohci-sa1111.c |   25 
 18 files changed, 137 insertions(+), 1430 deletions(-)


New IOMMU mailing list coming

2022-06-14 Thread Jörg Rödel
Hi all,

after several problems with the current IOMMU mailing list (no DKIM
support, poor b4 interoperability) we have decided to move the IOMMU
development discussions to a new list hosted at lists.linux.dev.

The new list is up and running already, to subscribe please send an
email to iommu+subscr...@lists.linux.dev. It is not yet archived, but
there will be a hard switch between the old and the new list on

July 5th, 2022

After that date the old list will no longer work and the archive will
switch to the new mailing list.

Sorry for any inconvenience this might cause.

Regards,

Joerg


Re: [PATCH 1/1] iommu/vt-d: Add set_dev_pasid callbacks for default domain

2022-06-14 Thread Baolu Lu

On 2022/6/14 15:19, Tian, Kevin wrote:

From: Baolu Lu 
Sent: Tuesday, June 14, 2022 2:13 PM

On 2022/6/14 13:36, Tian, Kevin wrote:

From: Baolu Lu
Sent: Tuesday, June 14, 2022 12:48 PM

On 2022/6/14 12:02, Tian, Kevin wrote:

From: Lu Baolu
Sent: Tuesday, June 14, 2022 11:44 AM

This allows the upper layers to set a domain to a PASID of a device
if the PASID feature is supported by the IOMMU hardware. The typical
use cases are, for example, kernel DMA with PASID and hardware
assisted mediated device drivers.


why is it not part of the series for those use cases? There is no consumer
of the added callbacks in this patch...

It could be. I just wanted to maintain the integrity of the Intel IOMMU
driver implementation.

but let's not add dead code. And this patch is actually a right step
simply from a set_dev_pasid() p.o.v., hence you should include it in any
series which first tries to use that interface.



Yes, that's my intention. If it reviews well, we can include it in the
driver's implementation.



Then you should make it clear in the first place. Otherwise a patch
like this implies a review for merge.


Yeah! Will update this in the next version.

Best regards,
baolu

Re: [PATCH v2 11/12] iommu/vt-d: Use device_domain_lock accurately

2022-06-14 Thread Baolu Lu

On 2022/6/14 15:16, Tian, Kevin wrote:

From: Lu Baolu 
Sent: Tuesday, June 14, 2022 10:52 AM

The device_domain_lock is used to protect the device tracking list of
a domain. Remove unnecessary spin_lock/unlock()'s and move the necessary
ones around the list access.

Signed-off-by: Lu Baolu 
---
  drivers/iommu/intel/iommu.c | 68 +++--
  1 file changed, 27 insertions(+), 41 deletions(-)


[...]

+iommu_support_dev_iotlb(struct dmar_domain *domain, struct intel_iommu *iommu,
+   u8 bus, u8 devfn)
  {
-   struct device_domain_info *info;
-
-   assert_spin_locked(&device_domain_lock);
+   struct device_domain_info *info = NULL, *tmp;
+   unsigned long flags;

if (!iommu->qi)
return NULL;

-   list_for_each_entry(info, &domain->devices, link)
-   if (info->iommu == iommu && info->bus == bus &&
-   info->devfn == devfn) {
-   if (info->ats_supported && info->dev)
-   return info;
+   spin_lock_irqsave(&device_domain_lock, flags);
+   list_for_each_entry(tmp, &domain->devices, link) {
+   if (tmp->iommu == iommu && tmp->bus == bus &&
+   tmp->devfn == devfn) {
+   if (tmp->ats_supported)
+   info = tmp;


Directly returning with unlock here is clearer than adding
another tmp variable...


Sure.




@@ -2460,15 +2450,14 @@ static int domain_add_dev_info(struct dmar_domain *domain, struct device *dev)
if (!iommu)
return -ENODEV;

-   spin_lock_irqsave(&device_domain_lock, flags);
-   info->domain = domain;
ret = domain_attach_iommu(domain, iommu);
-   if (ret) {
-   spin_unlock_irqrestore(&device_domain_lock, flags);
+   if (ret)
return ret;
-   }
+
+   spin_lock_irqsave(&device_domain_lock, flags);
	list_add(&info->link, &domain->devices);
	spin_unlock_irqrestore(&device_domain_lock, flags);
+   info->domain = domain;



This is incorrect. You need to fully initialize the object before adding
it to the list. Otherwise a search right after the above unlock and
before assigning info->domain will get wrong data


Fair enough. Will fix it in the next version.

Best regards,
baolu


Re: [PATCH v2 10/12] iommu/vt-d: Fold __dmar_remove_one_dev_info() into its caller

2022-06-14 Thread Baolu Lu

On 2022/6/14 15:07, Tian, Kevin wrote:

From: Lu Baolu
Sent: Tuesday, June 14, 2022 10:52 AM

Fold __dmar_remove_one_dev_info() into dmar_remove_one_dev_info(),
which is its only caller. Make the spin lock critical range cover only
the device list change code and remove some unnecessary checks.

Signed-off-by: Lu Baolu
---
  drivers/iommu/intel/iommu.c | 34 +-
  1 file changed, 9 insertions(+), 25 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index af22690f44c8..8345e0c0824c 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -295,7 +295,6 @@ static LIST_HEAD(dmar_satc_units);
  static int g_num_of_iommus;

  static void dmar_remove_one_dev_info(struct device *dev);
-static void __dmar_remove_one_dev_info(struct device_domain_info *info);

  int dmar_disabled = !IS_ENABLED(CONFIG_INTEL_IOMMU_DEFAULT_ON);
   int intel_iommu_sm = IS_ENABLED(CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON);
@@ -4141,20 +4140,14 @@ static void domain_context_clear(struct device_domain_info *info)
    &domain_context_clear_one_cb, info);
  }

-static void __dmar_remove_one_dev_info(struct device_domain_info *info)
+static void dmar_remove_one_dev_info(struct device *dev)
  {
-   struct dmar_domain *domain;
-   struct intel_iommu *iommu;
-
-   assert_spin_locked(&device_domain_lock);
-
-   if (WARN_ON(!info))
-   return;
-
-   iommu = info->iommu;
-   domain = info->domain;
+   struct device_domain_info *info = dev_iommu_priv_get(dev);
+   struct dmar_domain *domain = info->domain;

this local variable is not required as there is just one reference
to info->domain.


Yes. It could be removed and use info->domain directly.



BTW, I didn't see info->domain cleared in this path. Is that correct?



It's better to clear here. I will make this change in my in-process
blocking domain series.

But it doesn't cause any real problems, because the Intel IOMMU driver
supports default domains; the logic here is that info->domain is
replaced, not cleared.

Best regards,
baolu


Re: [PATCH v2 06/12] iommu/vt-d: Acquiring lock in domain ID allocation helpers

2022-06-14 Thread Baolu Lu

On 2022/6/14 14:52, Tian, Kevin wrote:

From: Lu Baolu 
Sent: Tuesday, June 14, 2022 10:52 AM

The iommu->lock is used to protect the per-IOMMU domain ID resource.
Moving the lock into the ID alloc/free helpers makes the code more
compact. At the same time, the device_domain_lock is irrelevant to
the domain ID resource, remove its assertion as well.

On the other hand, the iommu->lock is never used in interrupt context,
there's no need to use the irqsave variant of the spinlock calls.


I still prefer separating the reduction of lock ranges from the irqsave change.
Locking is tricky. From a bisect p.o.v. it will be a lot easier if we change just
one piece of logic in one patch.



Fair enough. I will do this in the next version.

Best regards,
baolu


Re: [PATCH v2 03/12] iommu/vt-d: Remove clearing translation data in disable_dmar_iommu()

2022-06-14 Thread Baolu Lu

On 2022/6/14 14:49, Tian, Kevin wrote:

From: Lu Baolu
Sent: Tuesday, June 14, 2022 10:51 AM

The disable_dmar_iommu() is called when IOMMU initialization fails or
the IOMMU is hot-removed from the system. In both cases, there is no
need to clear the IOMMU translation data structures for devices.

On the initialization path, the device probing only happens after the
IOMMU is initialized successfully, hence there're no translation data
structures.

Out of curiosity. With kexec the IOMMU may contain stale mappings
from the old kernel. Then is it meaningful to disable IOMMU after the
new kernel fails to initialize it properly?


For kexec kernel, if the IOMMU is detected to be pre-enabled, the IOMMU
driver will try to copy tables from the old kernel. If copying table
fails, the IOMMU driver will disable IOMMU and do the normal
initialization.

Best regards,
baolu


RE: [PATCH 1/1] iommu/vt-d: Add set_dev_pasid callbacks for default domain

2022-06-14 Thread Tian, Kevin
> From: Baolu Lu 
> Sent: Tuesday, June 14, 2022 2:13 PM
> 
> On 2022/6/14 13:36, Tian, Kevin wrote:
> >> From: Baolu Lu
> >> Sent: Tuesday, June 14, 2022 12:48 PM
> >>
> >> On 2022/6/14 12:02, Tian, Kevin wrote:
>  From: Lu Baolu
>  Sent: Tuesday, June 14, 2022 11:44 AM
> 
>  This allows the upper layers to set a domain to a PASID of a device
>  if the PASID feature is supported by the IOMMU hardware. The typical
>  use cases are, for example, kernel DMA with PASID and hardware
>  assisted mediated device drivers.
> 
> >>> why is it not part of the series for those use cases? There is no consumer
> >>> of the added callbacks in this patch...
> >> It could be. I just wanted to maintain the integrity of Intel IOMMU
> >> driver implementation.
> > But let's not add dead code. And this patch is actually a right step
> > simply from a set_dev_pasid() p.o.v., hence you should include it in any
> > series which first tries to use that interface.
> >
> 
> Yes, that's my intention. If it reviews well, we can include it in the
> driver's implementation.
> 

Then you should make that clear in the first place; otherwise a patch
like this implies a review for merge.

RE: [PATCH v2 11/12] iommu/vt-d: Use device_domain_lock accurately

2022-06-14 Thread Tian, Kevin
> From: Lu Baolu 
> Sent: Tuesday, June 14, 2022 10:52 AM
> 
> The device_domain_lock is used to protect the device tracking list of
> a domain. Remove unnecessary spin_lock/unlock()'s and move the necessary
> ones around the list access.
> 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/intel/iommu.c | 68 +++--
>  1 file changed, 27 insertions(+), 41 deletions(-)
> 
[...]
> +iommu_support_dev_iotlb(struct dmar_domain *domain, struct
> intel_iommu *iommu,
> + u8 bus, u8 devfn)
>  {
> - struct device_domain_info *info;
> -
> - assert_spin_locked(&device_domain_lock);
> + struct device_domain_info *info = NULL, *tmp;
> + unsigned long flags;
> 
>   if (!iommu->qi)
>   return NULL;
> 
> - list_for_each_entry(info, &domain->devices, link)
> - if (info->iommu == iommu && info->bus == bus &&
> - info->devfn == devfn) {
> - if (info->ats_supported && info->dev)
> - return info;
> + spin_lock_irqsave(&device_domain_lock, flags);
> + list_for_each_entry(tmp, &domain->devices, link) {
> + if (tmp->iommu == iommu && tmp->bus == bus &&
> + tmp->devfn == devfn) {
> + if (tmp->ats_supported)
> + info = tmp;

Directly returning with unlock here is clearer than adding
another tmp variable...

> @@ -2460,15 +2450,14 @@ static int domain_add_dev_info(struct
> dmar_domain *domain, struct device *dev)
>   if (!iommu)
>   return -ENODEV;
> 
> - spin_lock_irqsave(&device_domain_lock, flags);
> - info->domain = domain;
>   ret = domain_attach_iommu(domain, iommu);
> - if (ret) {
> - spin_unlock_irqrestore(&device_domain_lock, flags);
> + if (ret)
>   return ret;
> - }
> +
> + spin_lock_irqsave(&device_domain_lock, flags);
> + list_add(&info->link, &domain->devices);
> + spin_unlock_irqrestore(&device_domain_lock, flags);
> + info->domain = domain;
> 

This is incorrect. You need to fully initialize the object before adding
it to the list. Otherwise a search right after the unlock above and
before info->domain is assigned will get wrong data.


Re: [PATCH v2 01/12] iommu/vt-d: debugfs: Remove device_domain_lock usage

2022-06-14 Thread Baolu Lu

Hi Kevin,

Thanks for your reviewing.

On 2022/6/14 14:43, Tian, Kevin wrote:

From: Lu Baolu 
Sent: Tuesday, June 14, 2022 10:51 AM

The domain_translation_struct debugfs node is used to dump the DMAR
page
tables for the PCI devices. It potentially races with setting domains to
devices. The existing code uses a global spinlock device_domain_lock to
avoid the races, but this is problematic as this lock is only used to
protect the device tracking lists of each domain.


is it really problematic at this point? Before the following patches are applied,
using device_domain_lock should have a similar effect to holding the group
lock.


The device_domain_lock only protects the device tracking list of the
domain, it doesn't include the domain pointer stored in the dev_info
structure. That's really protected by the group->mutex.



Here it might make more sense to just focus on removing the use of
device_domain_lock outside of iommu.c. Just that using the group lock is
cleaner and more compatible with the following cleanups.


Fair enough. I will update the commit message with above statement.


and it's worth mentioning that racing with page table updates is out
of the scope of this series. Probably also add a comment in the code
to clarify this point.


Sure.





This replaces device_domain_lock with group->mutex to protect page tables
from setting a new domain. This also makes device_domain_lock static as
it is now only used inside the file.


s/the file/iommu.c/


Sure.





Signed-off-by: Lu Baolu 
---
  drivers/iommu/intel/iommu.h   |  1 -
  drivers/iommu/intel/debugfs.c | 49 +--
  drivers/iommu/intel/iommu.c   |  2 +-
  3 files changed, 36 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index a22adfbdf870..8a6d64d726c0 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -480,7 +480,6 @@ enum {
  #define VTD_FLAG_SVM_CAPABLE  (1 << 2)

  extern int intel_iommu_sm;
-extern spinlock_t device_domain_lock;

   #define sm_supported(iommu)	(intel_iommu_sm && ecap_smts((iommu)->ecap))
   #define pasid_supported(iommu)	(sm_supported(iommu) && \
diff --git a/drivers/iommu/intel/debugfs.c b/drivers/iommu/intel/debugfs.c
index d927ef10641b..5ebfe32265d5 100644
--- a/drivers/iommu/intel/debugfs.c
+++ b/drivers/iommu/intel/debugfs.c
@@ -342,15 +342,23 @@ static void pgtable_walk_level(struct seq_file *m,
struct dma_pte *pde,
}
  }

-static int show_device_domain_translation(struct device *dev, void *data)
+struct show_domain_opaque {
+   struct device *dev;
+   struct seq_file *m;
+};


Sounds redundant, as both bus_for_each_dev() and
iommu_group_for_each_dev() declare the same fn type which
accepts a device pointer and void *data.


+
+static int __show_device_domain_translation(struct device *dev, void *data)
  {
-   struct device_domain_info *info = dev_iommu_priv_get(dev);
-   struct dmar_domain *domain = info->domain;
-   struct seq_file *m = data;
+   struct show_domain_opaque *opaque = data;
+   struct device_domain_info *info;
+   struct seq_file *m = opaque->m;
+   struct dmar_domain *domain;
u64 path[6] = { 0 };

-   if (!domain)
+   if (dev != opaque->dev)
return 0;


not required.


Together with above comment.

The iommu group might have other devices. I only want to dump the domain
of the specific @opaque->dev. It's a bit confusing to read, but it's the
only helper I can use outside of drivers/iommu/iommu.c.

Or, since all devices in the iommu group share the same domain, should
we only dump it once?




+   info = dev_iommu_priv_get(dev);
+   domain = info->domain;

seq_printf(m, "Device %s @0x%llx\n", dev_name(dev),
   (u64)virt_to_phys(domain->pgd));
@@ -359,20 +367,33 @@ static int show_device_domain_translation(struct
device *dev, void *data)
pgtable_walk_level(m, domain->pgd, domain->agaw + 2, 0, path);
seq_putc(m, '\n');

-   return 0;
+   return 1;
  }

-static int domain_translation_struct_show(struct seq_file *m, void *unused)
+static int show_device_domain_translation(struct device *dev, void *data)
  {
-   unsigned long flags;
-   int ret;
+   struct show_domain_opaque opaque = {dev, data};
+   struct iommu_group *group;

-   spin_lock_irqsave(&device_domain_lock, flags);
-   ret = bus_for_each_dev(&pci_bus_type, NULL, m,
-  show_device_domain_translation);
-   spin_unlock_irqrestore(&device_domain_lock, flags);
+   group = iommu_group_get(dev);
+   if (group) {
+   /*
+* The group->mutex is held across the callback, which will
+* block calls to iommu_attach/detach_group/device. Hence,
+* the domain of the device will not change during traversal.
+*/
+   iommu_group_for_each_dev(group, &opaque,
+ 

RE: [PATCH v2 10/12] iommu/vt-d: Fold __dmar_remove_one_dev_info() into its caller

2022-06-14 Thread Tian, Kevin
> From: Lu Baolu 
> Sent: Tuesday, June 14, 2022 10:52 AM
> 
> Fold __dmar_remove_one_dev_info() into dmar_remove_one_dev_info(),
> which is its only caller. Make the spinlock critical section cover only
> the device list change code, and remove some unnecessary checks.
> 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/intel/iommu.c | 34 +-
>  1 file changed, 9 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index af22690f44c8..8345e0c0824c 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -295,7 +295,6 @@ static LIST_HEAD(dmar_satc_units);
>  static int g_num_of_iommus;
> 
>  static void dmar_remove_one_dev_info(struct device *dev);
> -static void __dmar_remove_one_dev_info(struct device_domain_info *info);
> 
>  int dmar_disabled = !IS_ENABLED(CONFIG_INTEL_IOMMU_DEFAULT_ON);
>  int intel_iommu_sm =
> IS_ENABLED(CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON);
> @@ -4141,20 +4140,14 @@ static void domain_context_clear(struct
> device_domain_info *info)
>  &domain_context_clear_one_cb, info);
>  }
> 
> -static void __dmar_remove_one_dev_info(struct device_domain_info *info)
> +static void dmar_remove_one_dev_info(struct device *dev)
>  {
> - struct dmar_domain *domain;
> - struct intel_iommu *iommu;
> -
> - assert_spin_locked(&device_domain_lock);
> -
> - if (WARN_ON(!info))
> - return;
> -
> - iommu = info->iommu;
> - domain = info->domain;
> + struct device_domain_info *info = dev_iommu_priv_get(dev);
> + struct dmar_domain *domain = info->domain;

this local variable is not required as there is just one reference
to info->domain.

btw I didn't see info->domain is cleared in this path. Is it correct?

> + struct intel_iommu *iommu = info->iommu;
> + unsigned long flags;
> 
> - if (info->dev && !dev_is_real_dma_subdevice(info->dev)) {
> + if (!dev_is_real_dma_subdevice(info->dev)) {
>   if (dev_is_pci(info->dev) && sm_supported(iommu))
>   intel_pasid_tear_down_entry(iommu, info->dev,
>   PASID_RID2PASID, false);
> @@ -4164,20 +4157,11 @@ static void
> __dmar_remove_one_dev_info(struct device_domain_info *info)
>   intel_pasid_free_table(info->dev);
>   }
> 
> - list_del(&info->link);
> - domain_detach_iommu(domain, iommu);
> -}
> -
> -static void dmar_remove_one_dev_info(struct device *dev)
> -{
> - struct device_domain_info *info;
> - unsigned long flags;
> -
>   spin_lock_irqsave(&device_domain_lock, flags);
> - info = dev_iommu_priv_get(dev);
> - if (info)
> - __dmar_remove_one_dev_info(info);
> + list_del(&info->link);
>   spin_unlock_irqrestore(&device_domain_lock, flags);
> +
> + domain_detach_iommu(domain, iommu);
>  }
> 
>  static int md_domain_init(struct dmar_domain *domain, int guest_width)
> --
> 2.25.1



RE: [PATCH v2 09/12] iommu/vt-d: Check device list of domain in domain free path

2022-06-14 Thread Tian, Kevin
> From: Lu Baolu 
> Sent: Tuesday, June 14, 2022 10:52 AM
> 
> When the IOMMU domain is about to be freed, it should not be set on any
> device. Instead of silently dealing with some bug cases, it's better to
> trigger a warning so that any potential bugs are reported and fixed right away.
> 
> Signed-off-by: Lu Baolu 
> Reviewed-by: Jason Gunthorpe 

Reviewed-by: Kevin Tian 


RE: [PATCH v2 08/12] iommu/vt-d: Replace spin_lock_irqsave() with spin_lock()

2022-06-14 Thread Tian, Kevin
> From: Lu Baolu 
> Sent: Tuesday, June 14, 2022 10:52 AM
> 
> The iommu->lock is used to protect changes in root/context/pasid tables
> and domain ID allocation. There's no use case to change these resources
> in any interrupt context. Hence there's no need to disable interrupts
> when holding the spinlock.
> 
> Signed-off-by: Lu Baolu 

with this as one-place to changing irqsave for all previous patches:

Reviewed-by: Kevin Tian 



RE: [PATCH v2 06/12] iommu/vt-d: Acquiring lock in domain ID allocation helpers

2022-06-14 Thread Tian, Kevin
> From: Lu Baolu 
> Sent: Tuesday, June 14, 2022 10:52 AM
> 
> The iommu->lock is used to protect the per-IOMMU domain ID resource.
> Moving the lock into the ID alloc/free helpers makes the code more
> compact. At the same time, the device_domain_lock is irrelevant to
> the domain ID resource, remove its assertion as well.
> 
> On the other hand, the iommu->lock is never used in interrupt context,
> there's no need to use the irqsave variant of the spinlock calls.

I still prefer separating the reduction of lock ranges from the irqsave change.
Locking is tricky. From a bisect p.o.v. it will be a lot easier if we change just
one piece of logic in one patch.



RE: [PATCH v2 03/12] iommu/vt-d: Remove clearing translation data in disable_dmar_iommu()

2022-06-14 Thread Tian, Kevin
> From: Lu Baolu 
> Sent: Tuesday, June 14, 2022 10:51 AM
> 
> The disable_dmar_iommu() is called when IOMMU initialization fails or
> the IOMMU is hot-removed from the system. In both cases, there is no
> need to clear the IOMMU translation data structures for devices.
> 
> On the initialization path, the device probing only happens after the
> IOMMU is initialized successfully, hence there're no translation data
> structures.

Out of curiosity. With kexec the IOMMU may contain stale mappings
from the old kernel. Then is it meaningful to disable IOMMU after the
new kernel fails to initialize it properly?

> 
> On the hot-remove path, there is no real use case where the IOMMU is
> hot-removed, but the devices that it manages are still alive in the
> system. The translation data structures were torn down during device
> release, hence there's no need to repeat it in IOMMU hot-remove path
> either. This removes the unnecessary code and only leaves a check.
> 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/intel/pasid.h |  1 +
>  drivers/iommu/intel/iommu.c | 21 +++--
>  2 files changed, 8 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/iommu/intel/pasid.h b/drivers/iommu/intel/pasid.h
> index 583ea67fc783..2afbb2afe8cc 100644
> --- a/drivers/iommu/intel/pasid.h
> +++ b/drivers/iommu/intel/pasid.h
> @@ -39,6 +39,7 @@
>   * only and pass-through transfer modes.
>   */
>  #define FLPT_DEFAULT_DID 1
> +#define NUM_RESERVED_DID 2
> 
>  /*
>   * The SUPERVISOR_MODE flag indicates a first level translation which
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index ff6018f746e0..695aed474e5d 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -1715,23 +1715,16 @@ static int iommu_init_domains(struct
> intel_iommu *iommu)
> 
>  static void disable_dmar_iommu(struct intel_iommu *iommu)
>  {
> - struct device_domain_info *info, *tmp;
> - unsigned long flags;
> -
>   if (!iommu->domain_ids)
>   return;
> 
> - spin_lock_irqsave(&device_domain_lock, flags);
> - list_for_each_entry_safe(info, tmp, &device_domain_list, global) {
> - if (info->iommu != iommu)
> - continue;
> -
> - if (!info->dev || !info->domain)
> - continue;
> -
> - __dmar_remove_one_dev_info(info);
> - }
> - spin_unlock_irqrestore(&device_domain_lock, flags);
> + /*
> +  * All iommu domains must have been detached from the devices,
> +  * hence there should be no domain IDs in use.
> +  */
> + if (WARN_ON(bitmap_weight(iommu->domain_ids,
> cap_ndoms(iommu->cap))
> + != NUM_RESERVED_DID))
> + return;
> 
>   if (iommu->gcmd & DMA_GCMD_TE)
>   iommu_disable_translation(iommu);
> --
> 2.25.1



RE: [PATCH v2 01/12] iommu/vt-d: debugfs: Remove device_domain_lock usage

2022-06-14 Thread Tian, Kevin
> From: Lu Baolu 
> Sent: Tuesday, June 14, 2022 10:51 AM
> 
> The domain_translation_struct debugfs node is used to dump the DMAR
> page
> tables for the PCI devices. It potentially races with setting domains to
> devices. The existing code uses a global spinlock device_domain_lock to
avoid the races, but this is problematic as this lock is only used to
> protect the device tracking lists of each domain.

is it really problematic at this point? Before the following patches are applied,
using device_domain_lock should have a similar effect to holding the group
lock.

Here it might make more sense to just focus on removing the use of
device_domain_lock outside of iommu.c. Just that using the group lock is
cleaner and more compatible with the following cleanups.

and it's worth mentioning that racing with page table updates is out
of the scope of this series. Probably also add a comment in the code
to clarify this point.

> 
> This replaces device_domain_lock with group->mutex to protect page tables
> from setting a new domain. This also makes device_domain_lock static as
> it is now only used inside the file.

s/the file/iommu.c/

> 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/intel/iommu.h   |  1 -
>  drivers/iommu/intel/debugfs.c | 49 +--
>  drivers/iommu/intel/iommu.c   |  2 +-
>  3 files changed, 36 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
> index a22adfbdf870..8a6d64d726c0 100644
> --- a/drivers/iommu/intel/iommu.h
> +++ b/drivers/iommu/intel/iommu.h
> @@ -480,7 +480,6 @@ enum {
>  #define VTD_FLAG_SVM_CAPABLE (1 << 2)
> 
>  extern int intel_iommu_sm;
> -extern spinlock_t device_domain_lock;
> 
>  #define sm_supported(iommu)	(intel_iommu_sm && ecap_smts((iommu)->ecap))
>  #define pasid_supported(iommu)	(sm_supported(iommu) && \
> diff --git a/drivers/iommu/intel/debugfs.c b/drivers/iommu/intel/debugfs.c
> index d927ef10641b..5ebfe32265d5 100644
> --- a/drivers/iommu/intel/debugfs.c
> +++ b/drivers/iommu/intel/debugfs.c
> @@ -342,15 +342,23 @@ static void pgtable_walk_level(struct seq_file *m,
> struct dma_pte *pde,
>   }
>  }
> 
> -static int show_device_domain_translation(struct device *dev, void *data)
> +struct show_domain_opaque {
> + struct device *dev;
> + struct seq_file *m;
> +};

Sounds redundant, as both bus_for_each_dev() and
iommu_group_for_each_dev() declare the same fn type which
accepts a device pointer and void *data.

> +
> +static int __show_device_domain_translation(struct device *dev, void *data)
>  {
> - struct device_domain_info *info = dev_iommu_priv_get(dev);
> - struct dmar_domain *domain = info->domain;
> - struct seq_file *m = data;
> + struct show_domain_opaque *opaque = data;
> + struct device_domain_info *info;
> + struct seq_file *m = opaque->m;
> + struct dmar_domain *domain;
>   u64 path[6] = { 0 };
> 
> - if (!domain)
> + if (dev != opaque->dev)
>   return 0;

not required.

> + info = dev_iommu_priv_get(dev);
> + domain = info->domain;
> 
>   seq_printf(m, "Device %s @0x%llx\n", dev_name(dev),
>  (u64)virt_to_phys(domain->pgd));
> @@ -359,20 +367,33 @@ static int show_device_domain_translation(struct
> device *dev, void *data)
>   pgtable_walk_level(m, domain->pgd, domain->agaw + 2, 0, path);
>   seq_putc(m, '\n');
> 
> - return 0;
> + return 1;
>  }
> 
> -static int domain_translation_struct_show(struct seq_file *m, void *unused)
> +static int show_device_domain_translation(struct device *dev, void *data)
>  {
> - unsigned long flags;
> - int ret;
> + struct show_domain_opaque opaque = {dev, data};
> + struct iommu_group *group;
> 
> - spin_lock_irqsave(&device_domain_lock, flags);
> - ret = bus_for_each_dev(&pci_bus_type, NULL, m,
> -show_device_domain_translation);
> - spin_unlock_irqrestore(&device_domain_lock, flags);
> + group = iommu_group_get(dev);
> + if (group) {
> + /*
> +  * The group->mutex is held across the callback, which will
> +  * block calls to iommu_attach/detach_group/device. Hence,
> +  * the domain of the device will not change during traversal.
> +  */
> + iommu_group_for_each_dev(group, &opaque,
> +  __show_device_domain_translation);
> + iommu_group_put(group);
> + }
> 
> - return ret;
> + return 0;
> +}
> +
> +static int domain_translation_struct_show(struct seq_file *m, void *unused)
> +{
> + return bus_for_each_dev(&pci_bus_type, NULL, m,
> + show_device_domain_translation);
>  }
>  DEFINE_SHOW_ATTRIBUTE(domain_translation_struct);
> 
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 19024dc52735..a39d72a9d1cf 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ 

Re: [PATCH 1/1] iommu/vt-d: Add set_dev_pasid callbacks for default domain

2022-06-14 Thread Baolu Lu

On 2022/6/14 13:36, Tian, Kevin wrote:

From: Baolu Lu
Sent: Tuesday, June 14, 2022 12:48 PM

On 2022/6/14 12:02, Tian, Kevin wrote:

From: Lu Baolu
Sent: Tuesday, June 14, 2022 11:44 AM

This allows the upper layers to set a domain to a PASID of a device
if the PASID feature is supported by the IOMMU hardware. The typical
use cases are, for example, kernel DMA with PASID and hardware
assisted mediated device drivers.


why is it not part of the series for those use cases? There is no consumer
of the added callbacks in this patch...

It could be. I just wanted to maintain the integrity of Intel IOMMU
driver implementation.

But let's not add dead code. And this patch is actually a right step
simply from a set_dev_pasid() p.o.v., hence you should include it in any
series which first tries to use that interface.



Yes, that's my intention. If it reviews well, we can include it in the
driver's implementation.

Best regards,
baolu