Re: [RFC PATCH 2/2] dma-direct: Fix dma_direct_{alloc,free}() for Hyper-V IVMs

2022-07-06 Thread Christoph Hellwig
On Wed, Jul 06, 2022 at 09:50:27PM +0200, Andrea Parri (Microsoft) wrote:
> @@ -305,6 +306,21 @@ void *dma_direct_alloc(struct device *dev, size_t size,
>   ret = page_address(page);
>   if (dma_set_decrypted(dev, ret, size))
>   goto out_free_pages;
> +#ifdef CONFIG_HAS_IOMEM
> + /*
> +  * Remap the pages in the unencrypted physical address space
> +  * when dma_unencrypted_base is set (e.g., for Hyper-V AMD
> +  * SEV-SNP isolated guests).
> +  */
> + if (dma_unencrypted_base) {
> + phys_addr_t ret_pa = virt_to_phys(ret);
> +
> + ret_pa += dma_unencrypted_base;
> + ret = memremap(ret_pa, size, MEMREMAP_WB);
> + if (!ret)
> + goto out_encrypt_pages;
> + }
> +#endif


So:

this needs to move into dma_set_decrypted, otherwise we don't handle
the dma_alloc_pages case (never mind that this is pretty unreadable).

Which then again largely duplicates the code in swiotlb.  So I think
what we need here is a low-level helper that does the
set_memory_decrypted and memremap.  I'm not quite sure where it
should go, but maybe some of the people involved with memory
encryption might have good ideas.  unencrypted_base should go with
it and then both swiotlb and dma-direct can call it.
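
As a rough illustration of that idea, such a shared helper might look like the
sketch below; the name, placement and error handling are assumptions made here
for illustration, not something taken from the series:

static void *dma_remap_decrypted(void *vaddr, size_t size)
{
	/* decrypt the linear-map pages first */
	if (set_memory_decrypted((unsigned long)vaddr, PFN_UP(size)))
		return NULL;

	/* nothing more to do unless an unencrypted alias is required */
	if (!dma_unencrypted_base)
		return vaddr;

	/* e.g. Hyper-V SEV-SNP IVMs: remap above the shared GPA boundary */
	return memremap(virt_to_phys(vaddr) + dma_unencrypted_base,
			size, MEMREMAP_WB);
}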

> + /*
> +  * If dma_unencrypted_base is set, the virtual address returned by
> +  * dma_direct_alloc() is in the vmalloc address range.
> +  */
> + if (!dma_unencrypted_base && is_vmalloc_addr(cpu_addr)) {
>   vunmap(cpu_addr);
>   } else {
>   if (IS_ENABLED(CONFIG_ARCH_HAS_DMA_CLEAR_UNCACHED))
>   arch_dma_clear_uncached(cpu_addr, size);
> +#ifdef CONFIG_HAS_IOMEM
> + if (dma_unencrypted_base) {
> + memunmap(cpu_addr);
> + /* re-encrypt the pages using the original address */
> + cpu_addr = page_address(pfn_to_page(PHYS_PFN(
> + dma_to_phys(dev, dma_addr))));
> + }
> +#endif
>   if (dma_set_encrypted(dev, cpu_addr, size))

Same on the unmap side.  It might also be worth looking into reordering
the checks in some form instead of doing that raw dma_unencrypted_base
check before the unmap.
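
And a matching sketch for the free path, again with purely illustrative
names and placement:

static int dma_unmap_encrypted(struct device *dev, void *vaddr,
			       dma_addr_t dma_addr, size_t size)
{
	if (dma_unencrypted_base) {
		/* drop the unencrypted alias created at alloc time */
		memunmap(vaddr);
		/* recover the original linear-map address to re-encrypt */
		vaddr = page_address(pfn_to_page(PHYS_PFN(
					dma_to_phys(dev, dma_addr))));
	}
	return set_memory_encrypted((unsigned long)vaddr, PFN_UP(size));
}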
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: fully convert arm to use dma-direct v3

2022-07-06 Thread Christoph Hellwig
On Wed, Jun 29, 2022 at 08:41:32AM +0200, Greg Kroah-Hartman wrote:
> On Wed, Jun 29, 2022 at 08:28:37AM +0200, Christoph Hellwig wrote:
> > Any comments or additional testing?  It would be really great to get
> > this off the table.
> 
> For the USB bits:
> 
> Acked-by: Greg Kroah-Hartman 

So given that we're not making any progress on getting anyone interested
on the series, I'm tempted to just pull it into the dma-mapping tree
this weekend so that we'll finally have all architectures using the
common code.

Anyone who has real concerns, please scream now.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v4 02/11] iommu/vt-d: Remove clearing translation data in disable_dmar_iommu()

2022-07-06 Thread Tian, Kevin
> From: Lu Baolu 
> Sent: Wednesday, July 6, 2022 10:55 AM
> 
> The disable_dmar_iommu() is called when IOMMU initialization fails or
> the IOMMU is hot-removed from the system. In both cases, there is no
> need to clear the IOMMU translation data structures for devices.
> 
> On the initialization path, the device probing only happens after the
> IOMMU is initialized successfully, hence there're no translation data
> structures.
> 
> On the hot-remove path, there is no real use case where the IOMMU is
> hot-removed, but the devices that it manages are still alive in the
> system. The translation data structures were torn down during device
> release, hence there's no need to repeat it in IOMMU hot-remove path
> either. This removes the unnecessary code and only leaves a check.
> 
> Signed-off-by: Lu Baolu 

Reviewed-by: Kevin Tian 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 3/6] iommu/vt-d: Refactor iommu information of each domain

2022-07-06 Thread Baolu Lu

On 2022/7/7 09:01, Tian, Kevin wrote:

From: Lu Baolu 
Sent: Saturday, July 2, 2022 9:56 AM

-out_unlock:
+   set_bit(num, iommu->domain_ids);
+   info->refcnt = 1;
+   info->did = num;
+   info->iommu  = iommu;
+   domain->nid  = iommu->node;


One nit: this line should be removed, as it's incorrect to blindly update
domain->nid; we should just leave it to domain_update_iommu_cap() to
decide the right node. Otherwise this causes a policy conflict: here it
is the last attached device deciding the node, which is different from
domain_update_iommu_cap(), which picks the node of the first attached
device.


Agreed and updated. Thank you!



Otherwise,

Reviewed-by: Kevin Tian 


Best regards,
baolu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v3 02/11] iommu/vt-d: Remove clearing translation data in disable_dmar_iommu()

2022-07-06 Thread Tian, Kevin
> From: Baolu Lu 
> Sent: Sunday, July 3, 2022 12:34 PM
> 
> On 2022/7/1 15:58, Tian, Kevin wrote:
> >> From: Lu Baolu  Sent: Wednesday, June 29,
> >> 2022 3:47 PM
> >>
> >> The disable_dmar_iommu() is called when IOMMU initialization fails
> >> or the IOMMU is hot-removed from the system. In both cases, there
> >> is no need to clear the IOMMU translation data structures for
> >> devices.
> >>
> >> On the initialization path, the device probing only happens after
> >> the IOMMU is initialized successfully, hence there're no
> >> translation data structures.
> >>
> >> On the hot-remove path, there is no real use case where the IOMMU
> >> is hot-removed, but the devices that it manages are still alive in
> >> the system. The translation data structures were torn down during
> >> device release, hence there's no need to repeat it in IOMMU
> >> hot-remove path either. This removes the unnecessary code and only
> >> leaves a check.
> >>
> >> Signed-off-by: Lu Baolu 
> >
> > You probably overlooked my last comment on kexec:
> >
> >
> https://lore.kernel.org/lkml/BL1PR11MB52711A71AD9F11B7AE42694C8CAC9@BL1PR11MB5271.namprd11.prod.outlook.com/
> >
> >  I think my question is still not answered.
> 
> Sorry! I did overlook that comment. I can see your points now, though it
> seems to be irrelevant to the problems that this series tries to solve.
> 
> The failure path of copying the table still needs some improvement. At
> least the pages allocated for root/context tables should be freed in the
> failure path. Even worse, the software occupies a bit of the page table
> entry, which was feasible for the old ECS but no longer works for the
> new scalable mode.
> 
> All these problems deserve a separate series. We could address your
> concerns there. Does this work for you?

Yes, this makes sense to me.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v2 4/6] iommu/vt-d: Remove unnecessary check in intel_iommu_add()

2022-07-06 Thread Tian, Kevin
> From: Lu Baolu 
> Sent: Saturday, July 2, 2022 9:56 AM
> 
> The Intel IOMMU hot-add process starts from dmar_device_hotplug(). It
> uses the global dmar_global_lock to synchronize all the hot-add and
> hot-remove paths. In the hot-add path, the new IOMMU data structures
> are allocated firstly by dmar_parse_one_drhd() and then initialized by
> dmar_hp_add_drhd(). All the IOMMU units are allocated and initialized
> in the same synchronized path. There is no case where any IOMMU unit
> is created and then initialized for multiple times.
> 
> This removes the unnecessary check in intel_iommu_add() which is the
> last reference place of the global IOMMU array.
> 
> Signed-off-by: Lu Baolu 

Reviewed-by: Kevin Tian 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v2 3/6] iommu/vt-d: Refactor iommu information of each domain

2022-07-06 Thread Tian, Kevin
> From: Lu Baolu 
> Sent: Saturday, July 2, 2022 9:56 AM
> 
> -out_unlock:
> + set_bit(num, iommu->domain_ids);
> + info->refcnt = 1;
> + info->did   = num;
> + info->iommu = iommu;
> + domain->nid = iommu->node;

One nit: this line should be removed, as it's incorrect to blindly update
domain->nid; we should just leave it to domain_update_iommu_cap() to
decide the right node. Otherwise this causes a policy conflict: here it
is the last attached device deciding the node, which is different from
domain_update_iommu_cap(), which picks the node of the first attached
device.

Otherwise,

Reviewed-by: Kevin Tian 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v2 2/6] iommu/vt-d: Use IDA interface to manage iommu sequence id

2022-07-06 Thread Tian, Kevin
> From: Lu Baolu 
> Sent: Saturday, July 2, 2022 9:56 AM
> 
> Switch dmar unit sequence id allocation and release from bitmap to IDA
> interface.
> 
> Signed-off-by: Lu Baolu 

Reviewed-by: Kevin Tian 
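
For reference, the bitmap-to-IDA switch boils down to something like the
sketch below; the helper names here are illustrative, not the actual patch:

static DEFINE_IDA(dmar_seq_ids);

static int dmar_alloc_seq_id(void)
{
	/* lowest free id in [0, DMAR_UNITS_SUPPORTED), or a negative errno */
	return ida_alloc_range(&dmar_seq_ids, 0, DMAR_UNITS_SUPPORTED - 1,
			       GFP_KERNEL);
}

static void dmar_free_seq_id(int id)
{
	ida_free(&dmar_seq_ids, id);
}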
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[RFC PATCH 1/2] swiotlb, dma-direct: Move swiotlb_unencrypted_base to direct.c

2022-07-06 Thread Andrea Parri (Microsoft)
The variable will come in handy to enable dma_direct_{alloc,free}()
for Hyper-V AMD SEV-SNP Isolated VMs.

Rename swiotlb_unencrypted_base to dma_unencrypted_base to indicate
that the notion is not restricted to SWIOTLB.

No functional change.

Suggested-by: Michael Kelley 
Signed-off-by: Andrea Parri (Microsoft) 
---
Yeah, this is in some sense trading the dependency on SWIOTLB for a
dependency on HAS_DMA:

Q1. I'm unable to envision a scenario where SWIOTLB without HAS_DMA
would make sense but I'm also expecting one of the kernel test bots
to try such a nonsensical configuration... should the references to
dma_unencrypted_base in swiotlb.c be protected with HAS_DMA? other?

Q2. Can the #ifdef CONFIG_HAS_DMA in arch/x86/kernel/cpu/mshyperv.c
be removed? can we make HYPERV "depends on HAS_DMA"?

...

 arch/x86/kernel/cpu/mshyperv.c |  6 +++---
 include/linux/dma-direct.h |  2 ++
 include/linux/swiotlb.h|  2 --
 kernel/dma/direct.c|  8 
 kernel/dma/swiotlb.c   | 12 +---
 5 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 831613959a92a..47e9cece86ff8 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -18,7 +18,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
@@ -333,8 +333,8 @@ static void __init ms_hyperv_init_platform(void)
 
if (hv_get_isolation_type() == HV_ISOLATION_TYPE_SNP) {
static_branch_enable(_type_snp);
-#ifdef CONFIG_SWIOTLB
-   swiotlb_unencrypted_base = ms_hyperv.shared_gpa_boundary;
+#ifdef CONFIG_HAS_DMA
+   dma_unencrypted_base = ms_hyperv.shared_gpa_boundary;
 #endif
}
	/* Isolation VMs are unenlightened SEV-based VMs, thus this check: */
diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index 18aade195884d..0b7e4c4b7b34c 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -14,6 +14,8 @@
 
 extern unsigned int zone_dma_bits;
 
+extern phys_addr_t dma_unencrypted_base;
+
 /*
  * Record the mapping of CPU physical to DMA addresses for a given region.
  */
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 7ed35dd3de6e7..fa2e85f21af61 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -190,6 +190,4 @@ static inline bool is_swiotlb_for_alloc(struct device *dev)
 }
 #endif /* CONFIG_DMA_RESTRICTED_POOL */
 
-extern phys_addr_t swiotlb_unencrypted_base;
-
 #endif /* __LINUX_SWIOTLB_H */
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 8d0b68a170422..06b2b901e37a3 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -22,6 +22,14 @@
  */
 unsigned int zone_dma_bits __ro_after_init = 24;
 
+/*
+ * Certain Confidential Computing solutions, such as Hyper-V AMD SEV-SNP
+ * isolated VMs, use dma_unencrypted_base as a watermark: memory addresses
+ * below dma_unencrypted_base are treated as private, while memory above
+ * dma_unencrypted_base is treated as shared.
+ */
+phys_addr_t dma_unencrypted_base;
+
 static inline dma_addr_t phys_to_dma_direct(struct device *dev,
phys_addr_t phys)
 {
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index cb50f8d383606..78d4f5294a56c 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -67,8 +67,6 @@ static bool swiotlb_force_disable;
 
 struct io_tlb_mem io_tlb_default_mem;
 
-phys_addr_t swiotlb_unencrypted_base;
-
 static unsigned long default_nslabs = IO_TLB_DEFAULT_SIZE >> IO_TLB_SHIFT;
 
 static int __init
@@ -142,7 +140,7 @@ static inline unsigned long nr_slots(u64 val)
 
 /*
  * Remap swioltb memory in the unencrypted physical address space
- * when swiotlb_unencrypted_base is set. (e.g. for Hyper-V AMD SEV-SNP
+ * when dma_unencrypted_base is set. (e.g. for Hyper-V AMD SEV-SNP
  * Isolation VMs).
  */
 #ifdef CONFIG_HAS_IOMEM
@@ -150,8 +148,8 @@ static void *swiotlb_mem_remap(struct io_tlb_mem *mem, 
unsigned long bytes)
 {
void *vaddr = NULL;
 
-   if (swiotlb_unencrypted_base) {
-   phys_addr_t paddr = mem->start + swiotlb_unencrypted_base;
+   if (dma_unencrypted_base) {
+   phys_addr_t paddr = mem->start + dma_unencrypted_base;
 
vaddr = memremap(paddr, bytes, MEMREMAP_WB);
if (!vaddr)
@@ -213,10 +211,10 @@ static void swiotlb_init_io_tlb_mem(struct io_tlb_mem 
*mem, phys_addr_t start,
}
 
/*
-* If swiotlb_unencrypted_base is set, the bounce buffer memory will
+* If dma_unencrypted_base is set, the bounce buffer memory will
 * be remapped and cleared in swiotlb_update_mem_attributes.
 */
-   if (swiotlb_unencrypted_base)
+   if (dma_unencrypted_base)
return;
 
memset(vaddr, 0, bytes);
-- 
2.25.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[RFC PATCH 2/2] dma-direct: Fix dma_direct_{alloc,free}() for Hyper-V IVMs

2022-07-06 Thread Andrea Parri (Microsoft)
In Hyper-V AMD SEV-SNP Isolated VMs, the virtual address returned by
dma_direct_alloc() must map above dma_unencrypted_base because the
memory is shared with the hardware device and must not be encrypted.

Modify dma_direct_alloc() to do the necessary remapping.  In
dma_direct_free(), use the (unmodified) DMA address to derive the
original virtual address and re-encrypt the pages.

Suggested-by: Michael Kelley 
Co-developed-by: Dexuan Cui 
Signed-off-by: Dexuan Cui 
Signed-off-by: Andrea Parri (Microsoft) 
---
 kernel/dma/direct.c | 30 +-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 06b2b901e37a3..c4ce277687a49 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include  /* for memremap() */
 #include "direct.h"
 
 /*
@@ -305,6 +306,21 @@ void *dma_direct_alloc(struct device *dev, size_t size,
ret = page_address(page);
if (dma_set_decrypted(dev, ret, size))
goto out_free_pages;
+#ifdef CONFIG_HAS_IOMEM
+   /*
+* Remap the pages in the unencrypted physical address space
+* when dma_unencrypted_base is set (e.g., for Hyper-V AMD
+* SEV-SNP isolated guests).
+*/
+   if (dma_unencrypted_base) {
+   phys_addr_t ret_pa = virt_to_phys(ret);
+
+   ret_pa += dma_unencrypted_base;
+   ret = memremap(ret_pa, size, MEMREMAP_WB);
+   if (!ret)
+   goto out_encrypt_pages;
+   }
+#endif
}
 
memset(ret, 0, size);
@@ -360,11 +376,23 @@ void dma_direct_free(struct device *dev, size_t size,
dma_free_from_pool(dev, cpu_addr, PAGE_ALIGN(size)))
return;
 
-   if (is_vmalloc_addr(cpu_addr)) {
+   /*
+* If dma_unencrypted_base is set, the virtual address returned by
+* dma_direct_alloc() is in the vmalloc address range.
+*/
+   if (!dma_unencrypted_base && is_vmalloc_addr(cpu_addr)) {
vunmap(cpu_addr);
} else {
if (IS_ENABLED(CONFIG_ARCH_HAS_DMA_CLEAR_UNCACHED))
arch_dma_clear_uncached(cpu_addr, size);
+#ifdef CONFIG_HAS_IOMEM
+   if (dma_unencrypted_base) {
+   memunmap(cpu_addr);
+   /* re-encrypt the pages using the original address */
+   cpu_addr = page_address(pfn_to_page(PHYS_PFN(
+   dma_to_phys(dev, dma_addr))));
+   }
+#endif
if (dma_set_encrypted(dev, cpu_addr, size))
return;
}
-- 
2.25.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[RFC PATCH 0/2] dma_direct_{alloc,free}() for Hyper-V IVMs

2022-07-06 Thread Andrea Parri (Microsoft)
Through swiotlb_unencrypted_base.

P.S.  I'm on vacation for the next couple of weeks starting next Monday;
Dexuan/Michael should be able to address review feedback in that period.

Andrea Parri (Microsoft) (2):
  swiotlb,dma-direct: Move swiotlb_unencrypted_base to direct.c
  dma-direct: Fix dma_direct_{alloc,free}() for Hyper-V IVMs

 arch/x86/kernel/cpu/mshyperv.c |  6 +++---
 include/linux/dma-direct.h |  2 ++
 include/linux/swiotlb.h|  2 --
 kernel/dma/direct.c| 38 +-
 kernel/dma/swiotlb.c   | 12 +--
 5 files changed, 47 insertions(+), 13 deletions(-)

-- 
2.25.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v7 20/21] PCI/P2PDMA: Introduce pci_mmap_p2pmem()

2022-07-06 Thread Logan Gunthorpe



On 2022-07-06 01:04, Greg Kroah-Hartman wrote:
> On Wed, Jul 06, 2022 at 08:51:27AM +0200, Christoph Hellwig wrote:
>> On Tue, Jul 05, 2022 at 12:16:45PM -0600, Logan Gunthorpe wrote:
>>> The current version does it through a char device, but that requires
>>> creating a simple_fs and anon_inode for teardown on driver removal, plus
>>> a bunch of hooks through the driver that exposes it (NVMe, in this case)
>>> to set this all up.
>>>
>>> Christoph is suggesting a sysfs interface which could potentially avoid
>>> the anon_inode and all of the extra hooks. It has some significant
>>> benefits and maybe some small downsides, but I wouldn't describe it as
>>> horrid.
>>
>> Yeah, I don't think it is horrible, it fits in with the resource files
>> for the BARs, and solves a lot of problems.  Greg, can you explain
>> what would be so bad about it?
> 
> As you mention, you will have to pass different things down into sysfs
> in order for that to be possible.  If it matches the resource files like
> we currently have today, that might not be that bad, but it still feels
> odd to me.  Let's see an implementation and a Documentation/ABI/ entry
> first though.

I'll work something up in the coming weeks.

Thanks,

Logan
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v1 0/6] iommu/vt-d: Reset DMAR_UNITS_SUPPORTED

2022-07-06 Thread Steve Wahl
On Sat, Jun 25, 2022 at 08:51:58PM +0800, Lu Baolu wrote:
> Hi folks,
> 
> This is a follow-up series of changes proposed by this patch:
> 
> https://lore.kernel.org/linux-iommu/20220615183650.32075-1-steve.w...@hpe.com/
> 
> It removes several static arrays of size DMAR_UNITS_SUPPORTED and sets
> the DMAR_UNITS_SUPPORTED to 1024.
> 

After Kevin Tian's comments, for the whole series:

Reviewed-by: Steve Wahl 

--> Steve

-- 
Steve Wahl, Hewlett Packard Enterprise
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v9 7/8] docs: trace: Add HiSilicon PTT device driver documentation

2022-07-06 Thread Mathieu Poirier
Hi,

I have started looking at this set.

On Mon, Jun 06, 2022 at 07:55:54PM +0800, Yicong Yang wrote:
> Document the introduction and usage of HiSilicon PTT device driver.
> 
> Signed-off-by: Yicong Yang 
> Reviewed-by: Jonathan Cameron 
> ---
>  Documentation/trace/hisi-ptt.rst | 307 +++
>  Documentation/trace/index.rst|   1 +

The "get_maintainer" script clearly indicates that Jonathan Corbet maintains the
Documentation directory and yet he is not CC'ed on this patch, nor is the
linux-doc mailing list.  As such, it would not be possible to merge this
patchset.

>  2 files changed, 308 insertions(+)
>  create mode 100644 Documentation/trace/hisi-ptt.rst
> 
> diff --git a/Documentation/trace/hisi-ptt.rst 
> b/Documentation/trace/hisi-ptt.rst
> new file mode 100644
> index ..0a3112244d40
> --- /dev/null
> +++ b/Documentation/trace/hisi-ptt.rst
> @@ -0,0 +1,307 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +====================================
> +HiSilicon PCIe Tune and Trace device
> +====================================
> +
> +Introduction
> +============
> +
> +HiSilicon PCIe tune and trace device (PTT) is a PCIe Root Complex
> +integrated Endpoint (RCiEP) device, providing the capability
> +to dynamically monitor and tune the PCIe link's events (tune),
> +and trace the TLP headers (trace). The two functions are independent,
> +but is recommended to use them together to analyze and enhance the
> +PCIe link's performance.
> +
> +On Kunpeng 930 SoC, the PCIe Root Complex is composed of several
> +PCIe cores. Each PCIe core includes several Root Ports and a PTT
> +RCiEP, like below. The PTT device is capable of tuning and
> +tracing the links of the PCIe core.
> +::
> +
> +  +--Core 0---+
> +  |   |   [   PTT   ] |
> +  |   |   [Root Port]---[Endpoint]
> +  |   |   [Root Port]---[Endpoint]
> +  |   |   [Root Port]---[Endpoint]
> +Root Complex  |--Core 1---+
> +  |   |   [   PTT   ] |
> +  |   |   [Root Port]---[ Switch ]---[Endpoint]
> +  |   |   [Root Port]---[Endpoint] `-[Endpoint]
> +  |   |   [Root Port]---[Endpoint]
> +  +---+
> +
> +The PTT device driver registers one PMU device for each PTT device.
> +The name of each PTT device is composed of 'hisi_ptt' prefix with
> +the id of the SICL and the Core where it locates. The Kunpeng 930
> +SoC encapsulates multiple CPU dies (SCCL, Super CPU Cluster) and
> +IO dies (SICL, Super I/O Cluster), where there's one PCIe Root
> +Complex for each SICL.
> +::
> +
> +/sys/devices/hisi_ptt_

All entries added to sysfs should have corresponding documentation.  See [1] and
[2] for details and [3] for an example.

[1]. https://elixir.bootlin.com/linux/latest/source/Documentation/ABI/README
[2]. https://elixir.bootlin.com/linux/latest/source/Documentation/ABI/testing
[3]. 
https://elixir.bootlin.com/linux/latest/source/Documentation/ABI/testing/sysfs-bus-coresight-devices-etm4x

> +
> +Tune
> +====
> +
> +PTT tune is designed for monitoring and adjusting PCIe link parameters 
> (events).
> +Currently we support events in 4 classes. The scope of the events
> +covers the PCIe core to which the PTT device belongs.
> +
> +Each event is presented as a file under $(PTT PMU dir)/tune, and
> +a simple open/read/write/close cycle will be used to tune the event.
> +::
> +
> +$ cd /sys/devices/hisi_ptt_/tune
> +$ ls
> +qos_tx_cpl  qos_tx_np  qos_tx_p
> +tx_path_rx_req_alloc_buf_level
> +tx_path_tx_req_alloc_buf_level

These look overly long... How about watermark_rx and watermark_tx?

> +$ cat qos_tx_dp
> +1
> +$ echo 2 > qos_tx_dp
> +$ cat qos_tx_dp
> +2
> +
> +Current value (numerical value) of the event can be simply read
> +from the file, and the desired value written to the file to tune.
> +
> +1. Tx path QoS control
> +----------------------
> +
> +The following files are provided to tune the QoS of the tx path of
> +the PCIe core.
> +
> +- qos_tx_cpl: weight of Tx completion TLPs
> +- qos_tx_np: weight of Tx non-posted TLPs
> +- qos_tx_p: weight of Tx posted TLPs
> +
> +The weight influences the proportion of certain packets on the PCIe link.
> +For example, for the storage scenario, increase the proportion
> +of the completion packets on the link to enhance the performance as
> +more completions are consumed.
> +
> +The available tune data of these events is [0, 1, 2].
> +Writing a negative value will return an error, and out of range
> +values will be converted to 2. Note that the event value just
> +indicates a probable level, but is not precise.
> +
> +2. Tx path buffer control
> +-------------------------
> +
> +Following files are provided to tune the buffer of tx path of the PCIe core.
> +
> +- tx_path_rx_req_alloc_buf_level: watermark of Rx requested
> +- 

Re: [PATCH 1/2] iommu: arm-smmu-impl: Add 8250 display compatible to the client list.

2022-07-06 Thread Will Deacon
On Tue, 14 Jun 2022 16:01:35 -0700, Emma Anholt wrote:
> Required for turning on per-process page tables for the GPU.
> 
> 

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/2] iommu: arm-smmu-impl: Add 8250 display compatible to the client list.
  https://git.kernel.org/will/c/3482c0b73073
[2/2] arm64: dts: qcom: sm8250: Enable per-process page tables.
  (no commit info)

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCHv2] iommu/arm-smmu-qcom: Add debug support for TLB sync timeouts

2022-07-06 Thread Robin Murphy

On 2022-05-26 05:14, Sai Prakash Ranjan wrote:

TLB sync timeouts can be due to various reasons such as TBU power down
or pending TCU/TBU invalidation/sync and so on. Debugging these often
require dumping of some implementation defined registers to know the
status of TBU/TCU operations and some of these registers are not
accessible in non-secure world such as from kernel and requires SMC
calls to read them in the secure world. So, add this debug support
to dump implementation defined registers for TLB sync timeout issues.

Signed-off-by: Sai Prakash Ranjan 
---

Changes in v2:
  * Use scm call consistently so that it works on older chipsets where
some of these regs are secure registers.
  * Add device specific data to get the implementation defined register
offsets.

---
  drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 161 ++---
  drivers/iommu/arm/arm-smmu/arm-smmu.c  |   2 +
  drivers/iommu/arm/arm-smmu/arm-smmu.h  |   1 +
  3 files changed, 146 insertions(+), 18 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index 7820711c4560..bb68aa85b28b 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -5,13 +5,27 @@
  
  #include 

  #include 
+#include 
  #include 
  #include 
  
  #include "arm-smmu.h"
  
+#define QCOM_DUMMY_VAL	-1

+
+enum qcom_smmu_impl_reg_offset {
+   QCOM_SMMU_TBU_PWR_STATUS,
+   QCOM_SMMU_STATS_SYNC_INV_TBU_ACK,
+   QCOM_SMMU_MMU2QSS_AND_SAFE_WAIT_CNTR,
+};
+
+struct qcom_smmu_config {
+   const u32 *reg_offset;
+};
+
  struct qcom_smmu {
struct arm_smmu_device smmu;
+   const struct qcom_smmu_config *cfg;
bool bypass_quirk;
u8 bypass_cbndx;
u32 stall_enabled;
@@ -22,6 +36,56 @@ static struct qcom_smmu *to_qcom_smmu(struct arm_smmu_device 
*smmu)
return container_of(smmu, struct qcom_smmu, smmu);
  }
  
+static void qcom_smmu_tlb_sync(struct arm_smmu_device *smmu, int page,

+   int sync, int status)
+{
+   int ret;
+   unsigned int spin_cnt, delay;
+   u32 reg, tbu_pwr_status, sync_inv_ack, sync_inv_progress;
+   struct qcom_smmu *qsmmu = to_qcom_smmu(smmu);
+   const struct qcom_smmu_config *cfg;
+
+   arm_smmu_writel(smmu, page, sync, QCOM_DUMMY_VAL);
+   for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
+   for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) {
+   reg = arm_smmu_readl(smmu, page, status);
+   if (!(reg & ARM_SMMU_sTLBGSTATUS_GSACTIVE))
+   return;
+   cpu_relax();
+   }
+   udelay(delay);
+   }
+
+   dev_err_ratelimited(smmu->dev,
+   "TLB sync timed out -- SMMU may be deadlocked\n");


Maybe consider a single ratelimit state for the whole function so all 
the output stays together. If things go sufficiently wrong, mixed up 
bits of partial output from different events may be misleadingly 
unhelpful (and at the very least it'll be up to 5x more effective at the 
intent of limiting log spam).
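
Something along these lines, assuming a dedicated dump helper (an
illustrative sketch, not the patch itself):

static void qcom_smmu_tlb_sync_debug(struct arm_smmu_device *smmu)
{
	/* one state for the whole dump, so it is printed (or dropped) as a unit */
	static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,
				      DEFAULT_RATELIMIT_BURST);

	if (!__ratelimit(&rs))
		return;

	dev_err(smmu->dev, "TLB sync timed out -- SMMU may be deadlocked\n");
	/* ... followed by the qcom_scm_io_readl() dumps, via plain dev_err() ... */
}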



+   cfg = qsmmu->cfg;
+   if (!cfg)
+   return;
+
+   ret = qcom_scm_io_readl(smmu->ioaddr + cfg->reg_offset[QCOM_SMMU_TBU_PWR_STATUS],
+   &tbu_pwr_status);
+   if (ret)
+   dev_err_ratelimited(smmu->dev,
+   "Failed to read TBU power status: %d\n", ret);
+
+   ret = qcom_scm_io_readl(smmu->ioaddr + cfg->reg_offset[QCOM_SMMU_STATS_SYNC_INV_TBU_ACK],
+   &sync_inv_ack);
+   if (ret)
+   dev_err_ratelimited(smmu->dev,
+   "Failed to read TBU sync/inv ack status: %d\n", ret);
+
+   ret = qcom_scm_io_readl(smmu->ioaddr + cfg->reg_offset[QCOM_SMMU_MMU2QSS_AND_SAFE_WAIT_CNTR],
+   &sync_inv_progress);
+   if (ret)
+   dev_err_ratelimited(smmu->dev,
+   "Failed to read TCU syn/inv progress: %d\n", ret);
+
+   dev_err_ratelimited(smmu->dev,
+   "TBU: power_status %#x sync_inv_ack %#x sync_inv_progress %#x\n",
+   tbu_pwr_status, sync_inv_ack, sync_inv_progress);
+}
+
  static void qcom_adreno_smmu_write_sctlr(struct arm_smmu_device *smmu, int 
idx,
u32 reg)
  {
@@ -374,6 +438,7 @@ static const struct arm_smmu_impl qcom_smmu_impl = {
.def_domain_type = qcom_smmu_def_domain_type,
.reset = qcom_smmu500_reset,
.write_s2cr = qcom_smmu_write_s2cr,
+   .tlb_sync = qcom_smmu_tlb_sync,
  };
  
  static const struct arm_smmu_impl qcom_adreno_smmu_impl = {

@@ -382,12 +447,84 @@ static const struct arm_smmu_impl qcom_adreno_smmu_impl = 
{
.reset = qcom_smmu500_reset,
.alloc_context_bank = 

Re: [PATCH v1 08/16] arm64: dts: mt8195: Add power domains controller

2022-07-06 Thread Krzysztof Kozlowski
On 06/07/2022 14:00, Tinghan Shen wrote:
> Hi Krzysztof,
> 
> After discussing your message with our power team, 
> we realized that we need your help to ensure we fully understand you.
> 
> On Mon, 2022-07-04 at 14:38 +0200, Krzysztof Kozlowski wrote:
>> On 04/07/2022 12:00, Tinghan Shen wrote:
>>> Add power domains controller node for mt8195.
>>>
>>> Signed-off-by: Weiyi Lu 
>>> Signed-off-by: Tinghan Shen 
>>> ---
>>>  arch/arm64/boot/dts/mediatek/mt8195.dtsi | 327 +++
>>>  1 file changed, 327 insertions(+)
>>>
>>> diff --git a/arch/arm64/boot/dts/mediatek/mt8195.dtsi 
>>> b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
>>> index 8d59a7da3271..d52e140d9271 100644
>>> --- a/arch/arm64/boot/dts/mediatek/mt8195.dtsi
>>> +++ b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
>>> @@ -10,6 +10,7 @@
>>>  #include 
>>>  #include 
>>>  #include 
>>> +#include 
>>>  
>>>  / {
>>> compatible = "mediatek,mt8195";
>>> @@ -338,6 +339,332 @@
>>> #interrupt-cells = <2>;
>>> };
>>>  
>>> +   scpsys: syscon@10006000 {
>>> +   compatible = "syscon", "simple-mfd";
>>
>> These compatibles cannot be alone.
> 
> the scpsys sub node has the compatible of the power domain driver.
> do you suggest that the compatible in the sub node should move to here?

Not necessarily; it depends. What you have here is a device node
representing system registers. They need their own compatibles, just
like everywhere else in the kernel (except the broken cases...).

Whether this should be the compatible of the power-domain driver depends
on what this device node is. I don't know, I don't have your datasheets
or your architecture diagrams...

> 
>>> +   reg = <0 0x10006000 0 0x1000>;
>>> +   #power-domain-cells = <1>;
>>
>> If it is simple MFD, then probably it is not a power domain provider.
>> Decide.
> 
> this MFD device is the power controller on mt8195. 

Then it is not a simple MFD but a power controller. Do not use
"simple-mfd" compatible.

> Some features need
> to do some operations on registers in this node. We think that
> implementing the operations on these registers as an MFD device can
> provide flexibility for future use. We want to clarify if you're saying
> that an MFD device cannot be a power domain provider.

An MFD device is a Linuxism, so it has nothing to do here. I am talking
only about simple-mfd. simple-mfd is a simple device that only
instantiates children and does not provide anything to anyone, not even
to the children. This is the most important part. The children do not
depend on anything from the simple-mfd device. For example, the
simple-mfd device can be shut down (gated) and the children should still
operate. Being a power domain controller usually contradicts this.

Best regards,
Krzysztof
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v1 02/16] dt-bindings: memory: mediatek: Update condition for mt8195 smi node

2022-07-06 Thread Krzysztof Kozlowski
On 06/07/2022 15:48, Matthias Brugger wrote:
> 
> 
> On 04/07/2022 14:36, Krzysztof Kozlowski wrote:
>> On 04/07/2022 12:00, Tinghan Shen wrote:
>>> The max clock items for the dts node with compatible
>>> 'mediatek,mt8195-smi-sub-common' should be 3.
>>>
>>> However, the dtbs_check of such node will get following message,
>>> arch/arm64/boot/dts/mediatek/mt8195-evb.dtb: smi@1401: clock-names: 
>>> ['apb', 'smi', 'gals0'] is too long
>>>   From schema: 
>>> Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
>>>
>>> Remove the last 'else' checking to fix this error.
>>
>> Missing fixes tag.
>>
> 
>  From my understanding, fixes tags are for patches that fix bugs (hw is not 
> working etc) and not a warning message from dtbs_check. So my point of view 
> would be to not add a fixes tag here.

Not conforming to bindings is also a bug. Missing properties or wrong
properties, even if the hardware is working, are still a bug. If such a
bug is not visible in Linux now, it might become visible later, or in a
different OS (DTS are used by other systems and pieces of software, like
bootloaders). Limiting this only to Linux and to the current version
(hardware still works) is OK for Linux drivers, but not for DTS.

Therefore a Fixes tag is applicable in general. Of course, maybe not to
this one; maybe this is too trivial, or whatever, so I do not insist.
But I insist on the principle: reasonable dtbs_check warnings are like
compiler warnings, bugs which have to be fixed.


Best regards,
Krzysztof
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v1 08/16] arm64: dts: mt8195: Add power domains controller

2022-07-06 Thread Krzysztof Kozlowski
On 06/07/2022 15:41, Matthias Brugger wrote:
> 
> 
> On 04/07/2022 14:38, Krzysztof Kozlowski wrote:
>> On 04/07/2022 12:00, Tinghan Shen wrote:
>>> Add power domains controller node for mt8195.
>>>
>>> Signed-off-by: Weiyi Lu 
>>> Signed-off-by: Tinghan Shen 
>>> ---
>>>   arch/arm64/boot/dts/mediatek/mt8195.dtsi | 327 +++
>>>   1 file changed, 327 insertions(+)
>>>
>>> diff --git a/arch/arm64/boot/dts/mediatek/mt8195.dtsi 
>>> b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
>>> index 8d59a7da3271..d52e140d9271 100644
>>> --- a/arch/arm64/boot/dts/mediatek/mt8195.dtsi
>>> +++ b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
>>> @@ -10,6 +10,7 @@
>>>   #include 
>>>   #include 
>>>   #include 
>>> +#include 
>>>   
>>>   / {
>>> compatible = "mediatek,mt8195";
>>> @@ -338,6 +339,332 @@
>>> #interrupt-cells = <2>;
>>> };
>>>   
>>> +   scpsys: syscon@10006000 {
>>> +   compatible = "syscon", "simple-mfd";
>>
>> These compatibles cannot be alone.
>>
> 
> You mean we would need something like "mediatek,scpsys" as dummy compatible 
> that's not bound to any driver?

Yes. syscon (and simple-mfd) must always come with a specific compatible.

> 
>>> +   reg = <0 0x10006000 0 0x1000>;
>>> +   #power-domain-cells = <1>;
>>
>> If it is simple MFD, then probably it is not a power domain provider.
>> Decide.
> 
> The SCPSYS IP block of MediaTek SoCs group several functionality, one is the 
> power domain controller. Others are not yet implemented, but defining the 
> scpsys 
> as a MFD will give us the possibility to do so in the future.

No, quite the opposite. Having simple-mfd prevents you from implementing
it correctly later as a driver, because you cannot remove it. It would
be an ABI break.

It's fine to have one block being a simple MFD with several children,
but then it's not a power controller. A child could be such a power
controller, but not the simple-mfd. Rob explained this several times:
https://lore.kernel.org/all/yxhine00hg6hb...@robh.at.kernel.org/
https://lore.kernel.org/all/20220701000959.ga3588170-r...@kernel.org/


Best regards,
Krzysztof
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/4] iommu/exynos: Add basic support for SysMMU v7

2022-07-06 Thread Sam Protsenko
On Sun, 3 Jul 2022 at 13:47, David Virag  wrote:
>
> On Sun, 2022-07-03 at 00:48 +0300, Sam Protsenko wrote:
> [...]
> > Hi Marek,
> >
> > As I understand, you have some board with SysMMU v7, which is not VM
> > capable (judging from the patches you shared earlier). Could you
> > please somehow verify if this series works fine for you? For example,
> > this testing driver [1] can be helpful.
> >
> > Thanks!
> >
> > [1]
> > https://github.com/joe-skb7/linux/commit/bbadd46fa525fe1fef2ccbdfff81f7d29caf0506
>
> Hi Sam,
>
> Not Marek here, but I wanted to try this on my jackpotlte (Exynos
> 7885). The driver reports its DPU sysmmu as version 7.2, and manually
> reading the capabilities registers it looks like it has the 2nd
> capability register but not the VM capability.
>
> After applying your patches, adding your test driver (with SYSMMU_BASE
> corrected to 7885 value), and adding the sysmmu to dt, I tried to cat
> the test file that it creates in debugfs and I got an SError kernel
> panic.
>
> I tried tracing where the SError happens and it looks like it's this
> line:
> /* Preload for emulation */
> iowrite32(rw | vpn, obj->reg_base + MMU_EMU_PRELOAD);
>
> Trying to read the EMU registers using devmem results in a "Bus error".
>
> Could these emulation registers be missing from my SysMMU? Do you have
> any info on what version should have it? Or maybe some capability bit?
> I'll try testing it with DECON/DPP later and see if it works that way.
>

Hi Janghyuck,

Do you have by chance any info on SysMMU v7.2, which is present e.g.
on Exynos7885? David is trying to use emulation registers there with
no luck, so it would be nice if you can provide some details on
questions above.

Thanks!

> Best regards,
> David
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v1 02/16] dt-bindings: memory: mediatek: Update condition for mt8195 smi node

2022-07-06 Thread Matthias Brugger




On 04/07/2022 14:36, Krzysztof Kozlowski wrote:

On 04/07/2022 12:00, Tinghan Shen wrote:

The max clock items for the dts node with compatible
'mediatek,mt8195-smi-sub-common' should be 3.

However, the dtbs_check of such node will get following message,
arch/arm64/boot/dts/mediatek/mt8195-evb.dtb: smi@1401: clock-names: ['apb', 
'smi', 'gals0'] is too long
  From schema: 
Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml

Remove the last 'else' checking to fix this error.


Missing fixes tag.



From my understanding, fixes tags are for patches that fix bugs (hw is not 
working etc) and not a warning message from dtbs_check. So my point of view 
would be to not add a fixes tag here.


Regards,
Matthias



Signed-off-by: Tinghan Shen 
---
  .../memory-controllers/mediatek,smi-common.yaml| 10 +-
  1 file changed, 9 insertions(+), 1 deletion(-)

diff --git 
a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml 
b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
index a98b359bf909..e5f553e2e12a 100644
--- 
a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
+++ 
b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
@@ -143,7 +143,15 @@ allOf:
  - const: gals0
  - const: gals1
  
-else:  # for gen2 HW that don't have gals

+  - if:  # for gen2 HW that don't have gals
+  properties:
+compatible:
+  enum:
+- mediatek,mt2712-smi-common
+- mediatek,mt8167-smi-common
+- mediatek,mt8173-smi-common
+


Without looking at the code, it's impossible to understand what you are
doing here. The commit msg says one thing, but you are doing something else.

Write a commit msg explaining what you want to achieve and what you are doing.


Best regards,
Krzysztof

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 0/5] DMA mapping changes for SCSI core

2022-07-06 Thread Christoph Hellwig
On Wed, Jul 06, 2022 at 02:40:44PM +0100, John Garry wrote:
> On 30/06/2022 13:08, John Garry wrote:
>
> Hi Christoph,
>
> Can you please consider picking up this series? A few things to note 
> beforehand:
>
> - I changed to only apply the mapping limit to SAS hosts in this version. I 
> would need a fresh ack from Martin for those SCSI parts, but wanted to make 
> sure you were ok with it.

Yes, I've mostly been waiting for an ACK from Martin.

> - Damien had some doubt on updating the shost max_sectors as opposed to the 
> per-request queue default, but I think he's ok with it - see patch 4/5

I'm fine either way.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v1 08/16] arm64: dts: mt8195: Add power domains controller

2022-07-06 Thread Matthias Brugger




On 04/07/2022 14:38, Krzysztof Kozlowski wrote:

On 04/07/2022 12:00, Tinghan Shen wrote:

Add power domains controller node for mt8195.

Signed-off-by: Weiyi Lu 
Signed-off-by: Tinghan Shen 
---
  arch/arm64/boot/dts/mediatek/mt8195.dtsi | 327 +++
  1 file changed, 327 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt8195.dtsi 
b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
index 8d59a7da3271..d52e140d9271 100644
--- a/arch/arm64/boot/dts/mediatek/mt8195.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
@@ -10,6 +10,7 @@
  #include 
  #include 
  #include 
+#include 
  
  / {

compatible = "mediatek,mt8195";
@@ -338,6 +339,332 @@
#interrupt-cells = <2>;
};
  
+		scpsys: syscon@10006000 {

+   compatible = "syscon", "simple-mfd";


These compatibles cannot be alone.



You mean we would need something like "mediatek,scpsys" as dummy compatible 
that's not bound to any driver?



+   reg = <0 0x10006000 0 0x1000>;
+   #power-domain-cells = <1>;


If it is simple MFD, then probably it is not a power domain provider.
Decide.


The SCPSYS IP block of MediaTek SoCs group several functionality, one is the 
power domain controller. Others are not yet implemented, but defining the scpsys 
as a MFD will give us the possibility to do so in the future.


Regards,
Matthias



Best regards,
Krzysztof

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 0/5] DMA mapping changes for SCSI core

2022-07-06 Thread John Garry via iommu

On 30/06/2022 13:08, John Garry wrote:

Hi Christoph,

Can you please consider picking up this series? A few things to note 
beforehand:


- I changed to only apply the mapping limit to SAS hosts in this 
version. I would need a fresh ack from Martin for those SCSI parts, but 
wanted to make sure you were ok with it.
- Damien had some doubt on updating the shost max_sectors as opposed to 
the per-request queue default, but I think he's ok with it - see patch 4/5


Thanks,
John



As reported in [0], DMA mappings whose size exceeds the IOMMU IOVA caching
limit may see a big performance hit.

This series introduces a new DMA mapping API, dma_opt_mapping_size(), so
that drivers may know this limit when performance is a factor in the
mapping.

The SCSI SAS transport code is modified only to use this limit. For now I
did not want to touch other hosts as I have a concern that this change
could cause a performance regression.

I also added a patch for libata-scsi as it does not currently honour the
shost max_sectors limit.

[0] https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leiz...@huawei.com/
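
For readers unfamiliar with the new API, its intended use is roughly as in
the sketch below; the helper name and exact call site are assumptions made
here, the real series wires this up in the SAS transport and libata code:

static void clamp_shost_dma_mapping_size(struct Scsi_Host *shost,
					 struct device *dma_dev)
{
	size_t opt = dma_opt_mapping_size(dma_dev);

	/* keep single requests within the size the IOMMU can map efficiently */
	if (opt)
		shost->max_sectors = min_t(unsigned int, shost->max_sectors,
					   opt >> SECTOR_SHIFT);
}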

Changes since v4:
- tweak libata and other patch titles
- Add Robin's tag (thanks!)
- Clarify description of new DMA mapping API


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: Re: Re: Re: [PATCH v2 1/9] PM: domains: Delete usage of driver_deferred_probe_check_state()

2022-07-06 Thread Alexander Stein
f the irqdomain (might be using the terms incorrectly) like the
> > > gic, you can make it a platform driver. And I was trying to hack up a
> > > patch that's the equivalent of platform_irqchip_probe() (which just
> > > ends up eventually calling the callback you use in IRQCHIP_DECLARE().
> > > I probably made some mistake in the quick hack that I'm sure is
> > > fixable.
> > > 
> > > > > [0.013251] Failed to map interrupt for
> > > > > /soc@0/bus@3040/timer@306a
> > > 
> > > However, this timer driver also uses TIMER_OF_DECLARE() which can't
> > > handle failure to get the IRQ (because it can't -EPROBE_DEFER). So,
> > > this means the timer driver in turn needs to be converted to a
> > > platform driver if it's supposed to work with the IRQCHIP_DECLARE()
> > > being converted to a platform driver.
> > > 
> > > But that's a can of worms not worth opening. But then I remembered
> > > this simpler workaround will work and it is pretty much a variant of
> > > the workaround that's already in the gpc's irqchip driver to allow two
> > > drivers to probe the same device (people really should stop doing
> > > that).
> > > 
> > > Can you drop my previous hack patch and try this instead please? I'm
> > > 99% sure this will work.
> > > 
> > > diff --git a/drivers/irqchip/irq-imx-gpcv2.c
> > > b/drivers/irqchip/irq-imx-gpcv2.c index b9c22f764b4d..8a0e82067924
> > > 100644
> > > --- a/drivers/irqchip/irq-imx-gpcv2.c
> > > +++ b/drivers/irqchip/irq-imx-gpcv2.c
> > > @@ -283,6 +283,7 @@ static int __init imx_gpcv2_irqchip_init(struct
> > > device_node *node,
> > > 
> > >  * later the GPC power domain driver will not be skipped.
> > >  */
> > > 
> > > of_node_clear_flag(node, OF_POPULATED);
> > > 
> > > +   fwnode_dev_initialized(domain->fwnode, false);
> > > 
> > > return 0;
> > >  
> > >  }
> > 
> > Just to be sure here, I tried this patch on top of next-20220701 but
> > unfortunately this doesn't fix the original problem either. The timer
> > errors are gone though.
> 
> To clarify, you had the timer issue only with my "combine drivers" patch,
> right?

That's correct.

> > The probe of imx8m-blk-ctrl got slightly delayed (from 0.74 to 0.90s
> > printk
> > time) but results in the identical error message.
> 
> My guess is that the probe attempt of blk-ctrl is delayed now till gpc
> probes (because of the device links getting created with the
> fwnode_dev_initialized() fix), but by the time gpc probe finishes, the
> power domains aren't registered yet because of the additional level of
> device addition and probing.
> 
> Can you try the attached patch please?

Sure, it needed some small fixes though. But the error still is present.

> And if that doesn't fix the issues, then enable the debug logs in the
> following functions please and share the logs from boot till the
> failure? If you can enable CONFIG_PRINTK_CALLER, that'd help too.
> device_link_add()
> fwnode_link_add()
> fw_devlink_relax_cycle()

I switched fw_devlink_relax_cycle() for fw_devlink_relax_link() as the former 
has no debug output here.

For the record I added the following line to my kernel command line:
> dyndbg="func device_link_add +p; func fwnode_link_add +p; func 
fw_devlink_relax_link +p"

I attached the dmesg until the probe error to this mail. But I noticed the 
following lines which seem interesting:
> [1.466620][T8] imx-pgc imx-pgc-domain.5: Linked as a consumer to
> regulator.8
> [1.466743][T8] imx-pgc imx-pgc-domain.5: imx_pgc_domain_probe: Probe 
succeeded
> [1.474733][T8] imx-pgc imx-pgc-domain.6: Linked as a consumer to 
regulator.9
> [1.474774][T8] imx-pgc imx-pgc-domain.6: imx_pgc_domain_probe: Probe 
succeeded

regulator.8 and regulator.9 are the power sequencer, attached on I2C. This
also makes perfect sense if you look at [1]ff. These power domains are
supplied by specific power supply rails. Several, if not all, imx8mq boards
have this kind of setup.

> Btw, part of the reason I'm trying to make sure we fix it the right
> way is that when we try to enable async boot by default, we don't run
> into issues.

Sounds reasonable.

Best regards,
Alexander

[1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/
arch/arm64/boot/dts/freescale/imx8mq-tqma8mq.dtsi#n84
[0.00][T0] Booting Linux on physical CPU 0x00 [0x410fd034]
[0.00][T0] Linux version 5.19.0-rc5-next-2022

Re: [PATCH RESEND v5 1/5] iommu: Refactor iommu_group_store_type()

2022-07-06 Thread Will Deacon
On Wed, Jul 06, 2022 at 01:03:44PM +0100, John Garry wrote:
> On 06/07/2022 13:00, Will Deacon wrote:
> > On Mon, Apr 04, 2022 at 07:27:10PM +0800, John Garry wrote:
> > > Function iommu_group_store_type() supports changing the default domain
> > > of an IOMMU group.
> > > 
> > > Many conditions need to be satisfied and steps taken for this action to be
> > > successful.
> > > 
> > > Satisfying these conditions and steps will be required for setting other
> > > IOMMU group attributes, so factor into a common part and a part specific
> > > to update the IOMMU group attribute.
> > > 
> > > No functional change intended.
> > > 
> > > Some code comments are tidied up also.
> > > 
> > > Signed-off-by: John Garry
> > > ---
> > >   drivers/iommu/iommu.c | 96 ---
> > >   1 file changed, 62 insertions(+), 34 deletions(-)
> > Acked-by: Will Deacon
> > 
> 
> Thanks, but currently I have no plans to progress this series, in favour of
> this 
> https://lore.kernel.org/linux-iommu/1656590892-42307-1-git-send-email-john.ga...@huawei.com/T/#me0e806913050c95f6e6ba2c7f7d96d51ce191204

heh, then I'll stop reviewing it then :) Shame, I quite liked it so far!

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH RESEND v5 2/5] iova: Allow rcache range upper limit to be flexible

2022-07-06 Thread Will Deacon
On Thu, Apr 07, 2022 at 03:52:53PM +0800, Leizhen (ThunderTown) wrote:
> On 2022/4/4 19:27, John Garry wrote:
> > Some low-level drivers may request DMA mappings whose IOVA length exceeds
> > that of the current rcache upper limit.
> > 
> > This means that allocations for those IOVAs will never be cached, and
> > always must be allocated and freed from the RB tree per DMA mapping cycle.
> > This has a significant effect on performance, more so since commit
> > 4e89dce72521 ("iommu/iova: Retry from last rb tree node if iova search
> > fails"), as discussed at [0].
> > 
> > As a first step towards allowing the rcache range upper limit be
> > configured, hold this value in the IOVA rcache structure, and allocate
> > the rcaches separately.
> > 
> > Delete macro IOVA_RANGE_CACHE_MAX_SIZE in case it's reused by mistake.
> > 
> > [0] 
> > https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leiz...@huawei.com/
> > 
> > Signed-off-by: John Garry 
> > ---
> >  drivers/iommu/iova.c | 20 ++--
> >  include/linux/iova.h |  3 +++
> >  2 files changed, 13 insertions(+), 10 deletions(-)
> > 
> > diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> > index db77aa675145..5c22b9187b79 100644
> > --- a/drivers/iommu/iova.c
> > +++ b/drivers/iommu/iova.c
> > @@ -15,8 +15,6 @@
> >  /* The anchor node sits above the top of the usable address space */
> >  #define IOVA_ANCHOR~0UL
> >  
> > -#define IOVA_RANGE_CACHE_MAX_SIZE 6	/* log of max cached IOVA range size (in pages) */
> > -
> >  static bool iova_rcache_insert(struct iova_domain *iovad,
> >unsigned long pfn,
> >unsigned long size);
> > @@ -443,7 +441,7 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned 
> > long size,
> >  * rounding up anything cacheable to make sure that can't happen. The
> >  * order of the unadjusted size will still match upon freeing.
> >  */
> > -   if (size < (1 << (IOVA_RANGE_CACHE_MAX_SIZE - 1)))
> > +   if (size < (1 << (iovad->rcache_max_size - 1)))
> > size = roundup_pow_of_two(size);
> >  
> > iova_pfn = iova_rcache_get(iovad, size, limit_pfn + 1);
> > @@ -713,13 +711,15 @@ int iova_domain_init_rcaches(struct iova_domain 
> > *iovad)
> > unsigned int cpu;
> > int i, ret;
> >  
> > -   iovad->rcaches = kcalloc(IOVA_RANGE_CACHE_MAX_SIZE,
> > +   iovad->rcache_max_size = 6; /* Arbitrarily high default */
> 
> It would be better to assign this constant value to iovad->rcache_max_size
> in init_iova_domain().

I think it's fine where it is as it's a meaningless number outside of the
rcache code.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH RESEND v5 2/5] iova: Allow rcache range upper limit to be flexible

2022-07-06 Thread Will Deacon
On Mon, Apr 04, 2022 at 07:27:11PM +0800, John Garry wrote:
> Some low-level drivers may request DMA mappings whose IOVA length exceeds
> that of the current rcache upper limit.
> 
> This means that allocations for those IOVAs will never be cached, and
> always must be allocated and freed from the RB tree per DMA mapping cycle.
> This has a significant effect on performance, more so since commit
> 4e89dce72521 ("iommu/iova: Retry from last rb tree node if iova search
> fails"), as discussed at [0].
> 
> As a first step towards allowing the rcache range upper limit be
> configured, hold this value in the IOVA rcache structure, and allocate
> the rcaches separately.
> 
> Delete macro IOVA_RANGE_CACHE_MAX_SIZE in case it's reused by mistake.
> 
> [0] 
> https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leiz...@huawei.com/
> 
> Signed-off-by: John Garry 
> ---
>  drivers/iommu/iova.c | 20 ++--
>  include/linux/iova.h |  3 +++
>  2 files changed, 13 insertions(+), 10 deletions(-)

Acked-by: Will Deacon 

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH RESEND v5 1/5] iommu: Refactor iommu_group_store_type()

2022-07-06 Thread John Garry via iommu

On 06/07/2022 13:00, Will Deacon wrote:

On Mon, Apr 04, 2022 at 07:27:10PM +0800, John Garry wrote:

Function iommu_group_store_type() supports changing the default domain
of an IOMMU group.

Many conditions need to be satisfied and steps taken for this action to be
successful.

Satisfying these conditions and steps will be required for setting other
IOMMU group attributes, so factor into a common part and a part specific
to update the IOMMU group attribute.

No functional change intended.

Some code comments are tidied up also.

Signed-off-by: John Garry
---
  drivers/iommu/iommu.c | 96 ---
  1 file changed, 62 insertions(+), 34 deletions(-)

Acked-by: Will Deacon



Thanks, but currently I have no plans to progress this series, in favour 
of this 
https://lore.kernel.org/linux-iommu/1656590892-42307-1-git-send-email-john.ga...@huawei.com/T/#me0e806913050c95f6e6ba2c7f7d96d51ce191204


cheers

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH RESEND v5 1/5] iommu: Refactor iommu_group_store_type()

2022-07-06 Thread Will Deacon
On Mon, Apr 04, 2022 at 07:27:10PM +0800, John Garry wrote:
> Function iommu_group_store_type() supports changing the default domain
> of an IOMMU group.
> 
> Many conditions need to be satisfied and steps taken for this action to be
> successful.
> 
> Satisfying these conditions and steps will be required for setting other
> IOMMU group attributes, so factor into a common part and a part specific
> to update the IOMMU group attribute.
> 
> No functional change intended.
> 
> Some code comments are tidied up also.
> 
> Signed-off-by: John Garry 
> ---
>  drivers/iommu/iommu.c | 96 ---
>  1 file changed, 62 insertions(+), 34 deletions(-)

Acked-by: Will Deacon 

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v1 08/16] arm64: dts: mt8195: Add power domains controller

2022-07-06 Thread Tinghan Shen via iommu
Hi Krzysztof,

After discussing your message with our power team, 
we realized that we need your help to ensure we fully understand you.

On Mon, 2022-07-04 at 14:38 +0200, Krzysztof Kozlowski wrote:
> On 04/07/2022 12:00, Tinghan Shen wrote:
> > Add power domains controller node for mt8195.
> > 
> > Signed-off-by: Weiyi Lu 
> > Signed-off-by: Tinghan Shen 
> > ---
> >  arch/arm64/boot/dts/mediatek/mt8195.dtsi | 327 +++
> >  1 file changed, 327 insertions(+)
> > 
> > diff --git a/arch/arm64/boot/dts/mediatek/mt8195.dtsi 
> > b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
> > index 8d59a7da3271..d52e140d9271 100644
> > --- a/arch/arm64/boot/dts/mediatek/mt8195.dtsi
> > +++ b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
> > @@ -10,6 +10,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> >  / {
> > compatible = "mediatek,mt8195";
> > @@ -338,6 +339,332 @@
> > #interrupt-cells = <2>;
> > };
> >  
> > +   scpsys: syscon@10006000 {
> > +   compatible = "syscon", "simple-mfd";
> 
> These compatibles cannot be alone.

The scpsys sub-node has the compatible string of the power domain driver.
Are you suggesting that the compatible string in the sub-node should be
moved up here?

> > +   reg = <0 0x10006000 0 0x1000>;
> > +   #power-domain-cells = <1>;
> 
> If it is simple MFD, then probably it is not a power domain provider.
> Decide.

This MFD device is the power controller on mt8195. Some features need to
perform operations on registers in this node, and we think that exposing
those registers through an MFD device provides flexibility for future use.
We would like to clarify whether you are saying that an MFD device cannot
be a power domain provider.



Best regards,
TingHan






Re: [PATCHv2] iommu/arm-smmu-qcom: Add debug support for TLB sync timeouts

2022-07-06 Thread Will Deacon
On Thu, May 26, 2022 at 09:44:03AM +0530, Sai Prakash Ranjan wrote:
> TLB sync timeouts can be due to various reasons such as TBU power down
> or pending TCU/TBU invalidation/sync and so on. Debugging these often
> requires dumping some implementation-defined registers to know the
> status of TBU/TCU operations, and some of these registers are not
> accessible from the non-secure world (i.e. the kernel) and require SMC
> calls to read them in the secure world. So, add debug support to dump
> the implementation-defined registers on TLB sync timeouts.
> 
> Signed-off-by: Sai Prakash Ranjan 
> ---
> 
> Changes in v2:
>  * Use scm call consistently so that it works on older chipsets where
>some of these regs are secure registers.
>  * Add device specific data to get the implementation defined register
>offsets.
> 
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 161 ++---
>  drivers/iommu/arm/arm-smmu/arm-smmu.c  |   2 +
>  drivers/iommu/arm/arm-smmu/arm-smmu.h  |   1 +
>  3 files changed, 146 insertions(+), 18 deletions(-)

If this is useful to you, then I suppose it's something we could support,
however I'm pretty worried about our ability to maintain/scale this stuff
as it is extended to support additional SoCs and other custom debugging
features.

Perhaps you could stick it all in arm-smmu-qcom-debug.c and have a new
config option for that, so at least it's even further out of the way?
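
For illustration, one possible shape of such a split -- the config option
name, the helper name and the register offset below are placeholders, not
existing code:

	/* arm-smmu-qcom.h (sketch) */
	#ifdef CONFIG_ARM_SMMU_QCOM_DEBUG
	void qcom_smmu_tlb_sync_debug(struct arm_smmu_device *smmu);
	#else
	static inline void qcom_smmu_tlb_sync_debug(struct arm_smmu_device *smmu) { }
	#endif

	/* arm-smmu-qcom-debug.c (sketch): dump one implementation-defined
	 * register via an SCM call when a TLB sync times out.
	 */
	void qcom_smmu_tlb_sync_debug(struct arm_smmu_device *smmu)
	{
		u32 val;

		if (qcom_scm_io_readl(smmu->ioaddr + 0x25dc /* hypothetical offset */, &val))
			dev_err(smmu->dev, "impl-def debug register read failed\n");
		else
			dev_err(smmu->dev, "impl-def debug status: 0x%08x\n", val);
	}

That way the core arm-smmu-qcom.c path only ever sees an empty inline stub
when the option is disabled.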

Will


[PATCH v3 RESEND 35/35] iommu/amd: Update amd_iommu_fault structure to include PCI seg ID

2022-07-06 Thread Vasant Hegde via iommu
Rename 'device_id' to 'sbdf' and extend it to 32 bits so that the PCI
segment ID can be passed to ppr_notifier(). Also pass the PCI segment ID
to pci_get_domain_bus_and_slot() instead of the default value.
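
For reference, a short worked example of the packed 32-bit layout, assuming
the PCI_SEG_DEVID_TO_SBDF()/PCI_SBDF_TO_*() macros added by patch 29/35 of
this series:

	/* sbdf layout: bits [31:16] PCI segment ID, bits [15:0] BDF */
	u32 sbdf   = PCI_SEG_DEVID_TO_SBDF(0x0001, 0x00a5);	/* seg 1, dev 00:14.5 */
	u16 seg_id = PCI_SBDF_TO_SEGID(sbdf);			/* 0x0001 */
	u16 devid  = PCI_SBDF_TO_DEVID(sbdf);			/* 0x00a5 */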

Signed-off-by: Vasant Hegde 
---
 drivers/iommu/amd/amd_iommu_types.h | 2 +-
 drivers/iommu/amd/iommu.c   | 2 +-
 drivers/iommu/amd/iommu_v2.c| 9 +
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 1ca54803702a..40f52d02c5b9 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -486,7 +486,7 @@ extern struct kmem_cache *amd_iommu_irq_cache;
 struct amd_iommu_fault {
u64 address;/* IO virtual address of the fault*/
u32 pasid;  /* Address space identifier */
-   u16 device_id;  /* Originating PCI device id */
+   u32 sbdf;   /* Originating PCI device id */
u16 tag;/* PPR tag */
u16 flags;  /* Fault flags */
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 6a1db8f9f453..a56a9ad3273e 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -701,7 +701,7 @@ static void iommu_handle_ppr_entry(struct amd_iommu *iommu, 
u64 *raw)
 
fault.address   = raw[1];
fault.pasid = PPR_PASID(raw[0]);
-   fault.device_id = PPR_DEVID(raw[0]);
+   fault.sbdf  = PCI_SEG_DEVID_TO_SBDF(iommu->pci_seg->id, 
PPR_DEVID(raw[0]));
fault.tag   = PPR_TAG(raw[0]);
fault.flags = PPR_FLAGS(raw[0]);
 
diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
index 40484af2ffc2..696dbe57 100644
--- a/drivers/iommu/amd/iommu_v2.c
+++ b/drivers/iommu/amd/iommu_v2.c
@@ -518,15 +518,16 @@ static int ppr_notifier(struct notifier_block *nb, 
unsigned long e, void *data)
unsigned long flags;
struct fault *fault;
bool finish;
-   u16 tag, devid;
+   u16 tag, devid, seg_id;
int ret;
 
iommu_fault = data;
tag = iommu_fault->tag & 0x1ff;
finish  = (iommu_fault->tag >> 9) & 1;
 
-   devid = iommu_fault->device_id;
-   pdev = pci_get_domain_bus_and_slot(0, PCI_BUS_NUM(devid),
+   seg_id = PCI_SBDF_TO_SEGID(iommu_fault->sbdf);
+   devid = PCI_SBDF_TO_DEVID(iommu_fault->sbdf);
+   pdev = pci_get_domain_bus_and_slot(seg_id, PCI_BUS_NUM(devid),
   devid & 0xff);
if (!pdev)
return -ENODEV;
@@ -540,7 +541,7 @@ static int ppr_notifier(struct notifier_block *nb, unsigned 
long e, void *data)
goto out;
}
 
-   dev_state = get_device_state(iommu_fault->device_id);
+   dev_state = get_device_state(iommu_fault->sbdf);
if (dev_state == NULL)
goto out;
 
-- 
2.31.1



[PATCH v3 RESEND 34/35] iommu/amd: Update device_state structure to include PCI seg ID

2022-07-06 Thread Vasant Hegde via iommu
Rename struct device_state.devid variable to struct device_state.sbdf
and extend it to 32-bit to include the 16-bit PCI segment ID via
the helper function get_pci_sbdf_id().

Co-developed-by: Suravee Suthikulpanit 
Signed-off-by: Suravee Suthikulpanit 
Signed-off-by: Vasant Hegde 
---
 drivers/iommu/amd/iommu_v2.c | 58 +++-
 1 file changed, 24 insertions(+), 34 deletions(-)

diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
index afb3efd565b7..40484af2ffc2 100644
--- a/drivers/iommu/amd/iommu_v2.c
+++ b/drivers/iommu/amd/iommu_v2.c
@@ -51,7 +51,7 @@ struct pasid_state {
 
 struct device_state {
struct list_head list;
-   u16 devid;
+   u32 sbdf;
atomic_t count;
struct pci_dev *pdev;
struct pasid_state **states;
@@ -83,35 +83,25 @@ static struct workqueue_struct *iommu_wq;
 
 static void free_pasid_states(struct device_state *dev_state);
 
-static u16 device_id(struct pci_dev *pdev)
-{
-   u16 devid;
-
-   devid = pdev->bus->number;
-   devid = (devid << 8) | pdev->devfn;
-
-   return devid;
-}
-
-static struct device_state *__get_device_state(u16 devid)
+static struct device_state *__get_device_state(u32 sbdf)
 {
struct device_state *dev_state;
 
	list_for_each_entry(dev_state, &state_list, list) {
-   if (dev_state->devid == devid)
+   if (dev_state->sbdf == sbdf)
return dev_state;
}
 
return NULL;
 }
 
-static struct device_state *get_device_state(u16 devid)
+static struct device_state *get_device_state(u32 sbdf)
 {
struct device_state *dev_state;
unsigned long flags;
 
	spin_lock_irqsave(&state_lock, flags);
-   dev_state = __get_device_state(devid);
+   dev_state = __get_device_state(sbdf);
if (dev_state != NULL)
		atomic_inc(&dev_state->count);
	spin_unlock_irqrestore(&state_lock, flags);
@@ -609,7 +599,7 @@ int amd_iommu_bind_pasid(struct pci_dev *pdev, u32 pasid,
struct pasid_state *pasid_state;
struct device_state *dev_state;
struct mm_struct *mm;
-   u16 devid;
+   u32 sbdf;
int ret;
 
might_sleep();
@@ -617,8 +607,8 @@ int amd_iommu_bind_pasid(struct pci_dev *pdev, u32 pasid,
if (!amd_iommu_v2_supported())
return -ENODEV;
 
-   devid = device_id(pdev);
-   dev_state = get_device_state(devid);
+   sbdf  = get_pci_sbdf_id(pdev);
+   dev_state = get_device_state(sbdf);
 
if (dev_state == NULL)
return -EINVAL;
@@ -692,15 +682,15 @@ void amd_iommu_unbind_pasid(struct pci_dev *pdev, u32 
pasid)
 {
struct pasid_state *pasid_state;
struct device_state *dev_state;
-   u16 devid;
+   u32 sbdf;
 
might_sleep();
 
if (!amd_iommu_v2_supported())
return;
 
-   devid = device_id(pdev);
-   dev_state = get_device_state(devid);
+   sbdf = get_pci_sbdf_id(pdev);
+   dev_state = get_device_state(sbdf);
if (dev_state == NULL)
return;
 
@@ -742,7 +732,7 @@ int amd_iommu_init_device(struct pci_dev *pdev, int pasids)
struct iommu_group *group;
unsigned long flags;
int ret, tmp;
-   u16 devid;
+   u32 sbdf;
 
might_sleep();
 
@@ -759,7 +749,7 @@ int amd_iommu_init_device(struct pci_dev *pdev, int pasids)
if (pasids <= 0 || pasids > (PASID_MASK + 1))
return -EINVAL;
 
-   devid = device_id(pdev);
+   sbdf = get_pci_sbdf_id(pdev);
 
dev_state = kzalloc(sizeof(*dev_state), GFP_KERNEL);
if (dev_state == NULL)
@@ -768,7 +758,7 @@ int amd_iommu_init_device(struct pci_dev *pdev, int pasids)
spin_lock_init(_state->lock);
init_waitqueue_head(_state->wq);
dev_state->pdev  = pdev;
-   dev_state->devid = devid;
+   dev_state->sbdf = sbdf;
 
tmp = pasids;
for (dev_state->pasid_levels = 0; (tmp - 1) & ~0x1ff; tmp >>= 9)
@@ -806,7 +796,7 @@ int amd_iommu_init_device(struct pci_dev *pdev, int pasids)
 
	spin_lock_irqsave(&state_lock, flags);
 
-   if (__get_device_state(devid) != NULL) {
+   if (__get_device_state(sbdf) != NULL) {
		spin_unlock_irqrestore(&state_lock, flags);
ret = -EBUSY;
goto out_free_domain;
@@ -838,16 +828,16 @@ void amd_iommu_free_device(struct pci_dev *pdev)
 {
struct device_state *dev_state;
unsigned long flags;
-   u16 devid;
+   u32 sbdf;
 
if (!amd_iommu_v2_supported())
return;
 
-   devid = device_id(pdev);
+   sbdf = get_pci_sbdf_id(pdev);
 
	spin_lock_irqsave(&state_lock, flags);
 
-   dev_state = __get_device_state(devid);
+   dev_state = __get_device_state(sbdf);
if (dev_state == NULL) {
		spin_unlock_irqrestore(&state_lock, flags);
return;
@@ -867,18 +857,18 @@ int 

[PATCH v3 RESEND 33/35] iommu/amd: Print PCI segment ID in error log messages

2022-07-06 Thread Vasant Hegde via iommu
Print the PCI segment ID along with the BDF. This is useful for debugging.

Co-developed-by: Suravee Suthikulpanit 
Signed-off-by: Suravee Suthikulpanit 
Signed-off-by: Vasant Hegde 
---
 drivers/iommu/amd/init.c  | 10 +-
 drivers/iommu/amd/iommu.c | 36 ++--
 2 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 9b1026fa7283..3c82d9c5f1c0 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -1855,11 +1855,11 @@ static int __init init_iommu_all(struct 
acpi_table_header *table)
h = (struct ivhd_header *)p;
if (*p == amd_iommu_target_ivhd_type) {
 
-   DUMP_printk("device: %02x:%02x.%01x cap: %04x "
-   "seg: %d flags: %01x info %04x\n",
-   PCI_BUS_NUM(h->devid), PCI_SLOT(h->devid),
-   PCI_FUNC(h->devid), h->cap_ptr,
-   h->pci_seg, h->flags, h->info);
+   DUMP_printk("device: %04x:%02x:%02x.%01x cap: %04x "
+   "flags: %01x info %04x\n",
+   h->pci_seg, PCI_BUS_NUM(h->devid),
+   PCI_SLOT(h->devid), PCI_FUNC(h->devid),
+   h->cap_ptr, h->flags, h->info);
DUMP_printk("   mmio-addr: %016llx\n",
h->mmio_phys);
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 2dbe17e49ffc..6a1db8f9f453 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -496,8 +496,8 @@ static void amd_iommu_report_rmp_hw_error(struct amd_iommu 
*iommu, volatile u32
vmg_tag, spa, flags);
}
} else {
-   pr_err_ratelimited("Event logged [RMP_HW_ERROR 
device=%02x:%02x.%x, vmg_tag=0x%04x, spa=0x%llx, flags=0x%04x]\n",
-   PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
+   pr_err_ratelimited("Event logged [RMP_HW_ERROR 
device=%04x:%02x:%02x.%x, vmg_tag=0x%04x, spa=0x%llx, flags=0x%04x]\n",
+   iommu->pci_seg->id, PCI_BUS_NUM(devid), 
PCI_SLOT(devid), PCI_FUNC(devid),
vmg_tag, spa, flags);
}
 
@@ -529,8 +529,8 @@ static void amd_iommu_report_rmp_fault(struct amd_iommu 
*iommu, volatile u32 *ev
vmg_tag, gpa, flags_rmp, flags);
}
} else {
-   pr_err_ratelimited("Event logged [RMP_PAGE_FAULT 
device=%02x:%02x.%x, vmg_tag=0x%04x, gpa=0x%llx, flags_rmp=0x%04x, 
flags=0x%04x]\n",
-   PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
+   pr_err_ratelimited("Event logged [RMP_PAGE_FAULT 
device=%04x:%02x:%02x.%x, vmg_tag=0x%04x, gpa=0x%llx, flags_rmp=0x%04x, 
flags=0x%04x]\n",
+   iommu->pci_seg->id, PCI_BUS_NUM(devid), 
PCI_SLOT(devid), PCI_FUNC(devid),
vmg_tag, gpa, flags_rmp, flags);
}
 
@@ -576,8 +576,8 @@ static void amd_iommu_report_page_fault(struct amd_iommu 
*iommu,
domain_id, address, flags);
}
} else {
-   pr_err_ratelimited("Event logged [IO_PAGE_FAULT 
device=%02x:%02x.%x domain=0x%04x address=0x%llx flags=0x%04x]\n",
-   PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
+   pr_err_ratelimited("Event logged [IO_PAGE_FAULT 
device=%04x:%02x:%02x.%x domain=0x%04x address=0x%llx flags=0x%04x]\n",
+   iommu->pci_seg->id, PCI_BUS_NUM(devid), 
PCI_SLOT(devid), PCI_FUNC(devid),
domain_id, address, flags);
}
 
@@ -620,20 +620,20 @@ static void iommu_print_event(struct amd_iommu *iommu, 
void *__evt)
 
switch (type) {
case EVENT_TYPE_ILL_DEV:
-   dev_err(dev, "Event logged [ILLEGAL_DEV_TABLE_ENTRY 
device=%02x:%02x.%x pasid=0x%05x address=0x%llx flags=0x%04x]\n",
-   PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
+   dev_err(dev, "Event logged [ILLEGAL_DEV_TABLE_ENTRY 
device=%04x:%02x:%02x.%x pasid=0x%05x address=0x%llx flags=0x%04x]\n",
+   iommu->pci_seg->id, PCI_BUS_NUM(devid), 
PCI_SLOT(devid), PCI_FUNC(devid),
pasid, address, flags);
dump_dte_entry(iommu, devid);
break;
case EVENT_TYPE_DEV_TAB_ERR:
-   dev_err(dev, "Event logged [DEV_TAB_HARDWARE_ERROR 
device=%02x:%02x.%x "
+   dev_err(dev, "Event logged [DEV_TAB_HARDWARE_ERROR 
device=%04x:%02x:%02x.%x "
"address=0x%llx flags=0x%04x]\n",
-   PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
+   iommu->pci_seg->id, PCI_BUS_NUM(devid), 

[PATCH v3 RESEND 32/35] iommu/amd: Add PCI segment support for ivrs_[ioapic/hpet/acpihid] commands

2022-07-06 Thread Vasant Hegde via iommu
From: Suravee Suthikulpanit 

By default, the PCI segment is zero and can be omitted. To support systems
with a non-zero PCI segment ID, modify the parsing functions to accept an
optional PCI segment ID.

Co-developed-by: Vasant Hegde 
Signed-off-by: Vasant Hegde 
Signed-off-by: Suravee Suthikulpanit 
---
 .../admin-guide/kernel-parameters.txt | 34 ++
 drivers/iommu/amd/init.c  | 44 ---
 2 files changed, 52 insertions(+), 26 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 2522b11e593f..d45e58328ce6 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2266,23 +2266,39 @@
 
ivrs_ioapic [HW,X86-64]
Provide an override to the IOAPIC-ID<->DEVICE-ID
-   mapping provided in the IVRS ACPI table. For
-   example, to map IOAPIC-ID decimal 10 to
-   PCI device 00:14.0 write the parameter as:
+   mapping provided in the IVRS ACPI table.
+   By default, PCI segment is 0, and can be omitted.
+   For example:
+   * To map IOAPIC-ID decimal 10 to PCI device 00:14.0
+ write the parameter as:
ivrs_ioapic[10]=00:14.0
+   * To map IOAPIC-ID decimal 10 to PCI segment 0x1 and
+ PCI device 00:14.0 write the parameter as:
+   ivrs_ioapic[10]=0001:00:14.0
 
ivrs_hpet   [HW,X86-64]
Provide an override to the HPET-ID<->DEVICE-ID
-   mapping provided in the IVRS ACPI table. For
-   example, to map HPET-ID decimal 0 to
-   PCI device 00:14.0 write the parameter as:
+   mapping provided in the IVRS ACPI table.
+   By default, PCI segment is 0, and can be omitted.
+   For example:
+   * To map HPET-ID decimal 0 to PCI device 00:14.0
+ write the parameter as:
ivrs_hpet[0]=00:14.0
+   * To map HPET-ID decimal 10 to PCI segment 0x1 and
+ PCI device 00:14.0 write the parameter as:
+   ivrs_hpet[10]=0001:00:14.0
 
ivrs_acpihid[HW,X86-64]
Provide an override to the ACPI-HID:UID<->DEVICE-ID
-   mapping provided in the IVRS ACPI table. For
-   example, to map UART-HID:UID AMD0020:0 to
-   PCI device 00:14.5 write the parameter as:
+   mapping provided in the IVRS ACPI table.
+
+   For example, to map UART-HID:UID AMD0020:0 to
+   PCI segment 0x1 and PCI device ID 00:14.5,
+   write the parameter as:
+   ivrs_acpihid[0001:00:14.5]=AMD0020:0
+
+   By default, PCI segment is 0, and can be omitted.
+   For example, PCI device 00:14.5 write the parameter as:
ivrs_acpihid[00:14.5]=AMD0020:0
 
js= [HW,JOY] Analog joystick
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 9693f0b9e07a..9b1026fa7283 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -84,6 +84,10 @@
 #define ACPI_DEVFLAG_ATSDIS 0x1000
 
 #define LOOP_TIMEOUT   200
+
+#define IVRS_GET_SBDF_ID(seg, bus, dev, fn)	(((seg & 0xffff) << 16) | ((bus & 0xff) << 8) \
+						 | ((dev & 0x1f) << 3) | (fn & 0x7))
+
 /*
  * ACPI table definitions
  *
@@ -3288,15 +3292,17 @@ static int __init parse_amd_iommu_options(char *str)
 
 static int __init parse_ivrs_ioapic(char *str)
 {
-   unsigned int bus, dev, fn;
+   u32 seg = 0, bus, dev, fn;
int ret, id, i;
-   u16 devid;
+   u32 devid;
 
	ret = sscanf(str, "[%d]=%x:%x.%x", &id, &bus, &dev, &fn);
-
if (ret != 4) {
-   pr_err("Invalid command line: ivrs_ioapic%s\n", str);
-   return 1;
+		ret = sscanf(str, "[%d]=%x:%x:%x.%x", &id, &seg, &bus, &dev, &fn);
+   if (ret != 5) {
+   pr_err("Invalid command line: ivrs_ioapic%s\n", str);
+   return 1;
+   }
}
 
if (early_ioapic_map_size == EARLY_MAP_SIZE) {
@@ -3305,7 +3311,7 @@ static int __init parse_ivrs_ioapic(char *str)
return 1;
}
 
-   devid = ((bus & 0xff) << 8) | ((dev & 0x1f) << 3) | (fn & 0x7);
+   devid = IVRS_GET_SBDF_ID(seg, bus, dev, fn);
 
cmdline_maps= true;
i   = early_ioapic_map_size++;
@@ -3318,15 

Re: [PATCH v12 0/2] iommu/mediatek: TTBR up to 35bit support

2022-07-06 Thread Will Deacon
On Thu, Jun 30, 2022 at 05:29:24PM +0800, yf.w...@mediatek.com wrote:
> This patchset adds MediaTek TTBR up to 35bit support for single normal zone.
> 
> Changes in v12:
> - Update [PATCH 1/2]: remove GENMASK(31, 7)
> - Update [PATCH 2/2]: remove MMU_PT_ADDR_MASK definition.

For both patches:

Acked-by: Will Deacon 

Joerg -- please can you pick these up for 5.20?

Thanks,

Will


[PATCH v3 RESEND 31/35] iommu/amd: Specify PCI segment ID when getting pci device

2022-07-06 Thread Vasant Hegde via iommu
From: Suravee Suthikulpanit 

Upcoming AMD systems can have multiple PCI segments. Hence pass PCI
segment ID to pci_get_domain_bus_and_slot() instead of '0'.
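
For context, a minimal before/after sketch of the device lookup this patch
changes (devid here is the 16-bit BDF):

	/* before: PCI segment hard-coded to 0 */
	pdev = pci_get_domain_bus_and_slot(0, PCI_BUS_NUM(devid), devid & 0xff);

	/* after: use the segment the reporting IOMMU belongs to */
	pdev = pci_get_domain_bus_and_slot(iommu->pci_seg->id,
					   PCI_BUS_NUM(devid), devid & 0xff);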

Co-developed-by: Vasant Hegde 
Signed-off-by: Vasant Hegde 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c  |  6 --
 drivers/iommu/amd/iommu.c | 19 ++-
 2 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index d35081d84460..9693f0b9e07a 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -1962,7 +1962,8 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)
int cap_ptr = iommu->cap_ptr;
int ret;
 
-   iommu->dev = pci_get_domain_bus_and_slot(0, PCI_BUS_NUM(iommu->devid),
+   iommu->dev = pci_get_domain_bus_and_slot(iommu->pci_seg->id,
+PCI_BUS_NUM(iommu->devid),
 iommu->devid & 0xff);
if (!iommu->dev)
return -ENODEV;
@@ -2025,7 +2026,8 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)
int i, j;
 
iommu->root_pdev =
-   pci_get_domain_bus_and_slot(0, iommu->dev->bus->number,
+   pci_get_domain_bus_and_slot(iommu->pci_seg->id,
+   iommu->dev->bus->number,
PCI_DEVFN(0, 0));
 
/*
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 0751dda04a10..2dbe17e49ffc 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -473,7 +473,7 @@ static void dump_command(unsigned long phys_addr)
pr_err("CMD[%d]: %08x\n", i, cmd->data[i]);
 }
 
-static void amd_iommu_report_rmp_hw_error(volatile u32 *event)
+static void amd_iommu_report_rmp_hw_error(struct amd_iommu *iommu, volatile 
u32 *event)
 {
struct iommu_dev_data *dev_data = NULL;
int devid, vmg_tag, flags;
@@ -485,7 +485,7 @@ static void amd_iommu_report_rmp_hw_error(volatile u32 
*event)
flags   = (event[1] >> EVENT_FLAGS_SHIFT) & EVENT_FLAGS_MASK;
spa = ((u64)event[3] << 32) | (event[2] & 0xFFF8);
 
-   pdev = pci_get_domain_bus_and_slot(0, PCI_BUS_NUM(devid),
+   pdev = pci_get_domain_bus_and_slot(iommu->pci_seg->id, 
PCI_BUS_NUM(devid),
   devid & 0xff);
if (pdev)
		dev_data = dev_iommu_priv_get(&pdev->dev);
@@ -505,7 +505,7 @@ static void amd_iommu_report_rmp_hw_error(volatile u32 
*event)
pci_dev_put(pdev);
 }
 
-static void amd_iommu_report_rmp_fault(volatile u32 *event)
+static void amd_iommu_report_rmp_fault(struct amd_iommu *iommu, volatile u32 
*event)
 {
struct iommu_dev_data *dev_data = NULL;
int devid, flags_rmp, vmg_tag, flags;
@@ -518,7 +518,7 @@ static void amd_iommu_report_rmp_fault(volatile u32 *event)
flags = (event[1] >> EVENT_FLAGS_SHIFT) & EVENT_FLAGS_MASK;
gpa   = ((u64)event[3] << 32) | event[2];
 
-   pdev = pci_get_domain_bus_and_slot(0, PCI_BUS_NUM(devid),
+   pdev = pci_get_domain_bus_and_slot(iommu->pci_seg->id, 
PCI_BUS_NUM(devid),
   devid & 0xff);
if (pdev)
		dev_data = dev_iommu_priv_get(&pdev->dev);
@@ -544,13 +544,14 @@ static void amd_iommu_report_rmp_fault(volatile u32 
*event)
 #define IS_WRITE_REQUEST(flags)\
((flags) & EVENT_FLAG_RW)
 
-static void amd_iommu_report_page_fault(u16 devid, u16 domain_id,
+static void amd_iommu_report_page_fault(struct amd_iommu *iommu,
+   u16 devid, u16 domain_id,
u64 address, int flags)
 {
struct iommu_dev_data *dev_data = NULL;
struct pci_dev *pdev;
 
-   pdev = pci_get_domain_bus_and_slot(0, PCI_BUS_NUM(devid),
+   pdev = pci_get_domain_bus_and_slot(iommu->pci_seg->id, 
PCI_BUS_NUM(devid),
   devid & 0xff);
if (pdev)
		dev_data = dev_iommu_priv_get(&pdev->dev);
@@ -613,7 +614,7 @@ static void iommu_print_event(struct amd_iommu *iommu, void 
*__evt)
}
 
if (type == EVENT_TYPE_IO_FAULT) {
-   amd_iommu_report_page_fault(devid, pasid, address, flags);
+   amd_iommu_report_page_fault(iommu, devid, pasid, address, 
flags);
return;
}
 
@@ -654,10 +655,10 @@ static void iommu_print_event(struct amd_iommu *iommu, 
void *__evt)
pasid, address, flags);
break;
case EVENT_TYPE_RMP_FAULT:
-   amd_iommu_report_rmp_fault(event);
+   amd_iommu_report_rmp_fault(iommu, event);
break;
case EVENT_TYPE_RMP_HW_ERR:
-   amd_iommu_report_rmp_hw_error(event);
+   

[PATCH v3 RESEND 30/35] iommu/amd: Include PCI segment ID when initialize IOMMU

2022-07-06 Thread Vasant Hegde via iommu
From: Suravee Suthikulpanit 

Extend current device ID variables to 32-bit to include the 16-bit
segment ID when parsing device information from IVRS table to initialize
each IOMMU.

Co-developed-by: Vasant Hegde 
Signed-off-by: Vasant Hegde 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h   |  2 +-
 drivers/iommu/amd/amd_iommu_types.h |  6 ++--
 drivers/iommu/amd/init.c| 56 +++--
 drivers/iommu/amd/quirks.c  |  4 +--
 4 files changed, 35 insertions(+), 33 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index e73bd48fc716..9b7092182ca7 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -125,7 +125,7 @@ static inline int get_pci_sbdf_id(struct pci_dev *pdev)
 
 extern bool translation_pre_enabled(struct amd_iommu *iommu);
 extern bool amd_iommu_is_attach_deferred(struct device *dev);
-extern int __init add_special_device(u8 type, u8 id, u16 *devid,
+extern int __init add_special_device(u8 type, u8 id, u32 *devid,
 bool cmd_line);
 
 #ifdef CONFIG_DMI
diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index ea238e8e6c99..1ca54803702a 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -744,8 +744,8 @@ struct acpihid_map_entry {
struct list_head list;
u8 uid[ACPIHID_UID_LEN];
u8 hid[ACPIHID_HID_LEN];
-   u16 devid;
-   u16 root_devid;
+   u32 devid;
+   u32 root_devid;
bool cmd_line;
struct iommu_group *group;
 };
@@ -753,7 +753,7 @@ struct acpihid_map_entry {
 struct devid_map {
struct list_head list;
u8 id;
-   u16 devid;
+   u32 devid;
bool cmd_line;
 };
 
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index df8f4b9d20cd..d35081d84460 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -1148,7 +1148,7 @@ static void __init set_dev_entry_from_acpi(struct 
amd_iommu *iommu,
amd_iommu_set_rlookup_table(iommu, devid);
 }
 
-int __init add_special_device(u8 type, u8 id, u16 *devid, bool cmd_line)
+int __init add_special_device(u8 type, u8 id, u32 *devid, bool cmd_line)
 {
struct devid_map *entry;
struct list_head *list;
@@ -1185,7 +1185,7 @@ int __init add_special_device(u8 type, u8 id, u16 *devid, 
bool cmd_line)
return 0;
 }
 
-static int __init add_acpi_hid_device(u8 *hid, u8 *uid, u16 *devid,
+static int __init add_acpi_hid_device(u8 *hid, u8 *uid, u32 *devid,
  bool cmd_line)
 {
struct acpihid_map_entry *entry;
@@ -1264,7 +1264,7 @@ static int __init init_iommu_from_acpi(struct amd_iommu 
*iommu,
 {
u8 *p = (u8 *)h;
u8 *end = p, flags = 0;
-   u16 devid = 0, devid_start = 0, devid_to = 0;
+   u16 devid = 0, devid_start = 0, devid_to = 0, seg_id;
u32 dev_i, ext_flags = 0;
bool alias = false;
struct ivhd_entry *e;
@@ -1300,6 +1300,8 @@ static int __init init_iommu_from_acpi(struct amd_iommu 
*iommu,
 
while (p < end) {
e = (struct ivhd_entry *)p;
+   seg_id = pci_seg->id;
+
switch (e->type) {
case IVHD_DEV_ALL:
 
@@ -1310,9 +1312,9 @@ static int __init init_iommu_from_acpi(struct amd_iommu 
*iommu,
break;
case IVHD_DEV_SELECT:
 
-   DUMP_printk("  DEV_SELECT\t\t\t devid: %02x:%02x.%x "
+   DUMP_printk("  DEV_SELECT\t\t\t devid: 
%04x:%02x:%02x.%x "
"flags: %02x\n",
-   PCI_BUS_NUM(e->devid),
+   seg_id, PCI_BUS_NUM(e->devid),
PCI_SLOT(e->devid),
PCI_FUNC(e->devid),
e->flags);
@@ -1323,8 +1325,8 @@ static int __init init_iommu_from_acpi(struct amd_iommu 
*iommu,
case IVHD_DEV_SELECT_RANGE_START:
 
DUMP_printk("  DEV_SELECT_RANGE_START\t "
-   "devid: %02x:%02x.%x flags: %02x\n",
-   PCI_BUS_NUM(e->devid),
+   "devid: %04x:%02x:%02x.%x flags: %02x\n",
+   seg_id, PCI_BUS_NUM(e->devid),
PCI_SLOT(e->devid),
PCI_FUNC(e->devid),
e->flags);
@@ -1336,9 +1338,9 @@ static int __init init_iommu_from_acpi(struct amd_iommu 
*iommu,
break;
case IVHD_DEV_ALIAS:
 
-   DUMP_printk("  DEV_ALIAS\t\t\t devid: %02x:%02x.%x "
+   DUMP_printk("  DEV_ALIAS\t\t\t devid: %04x:%02x:%02x.%x 
"
"flags: 

[PATCH v3 RESEND 29/35] iommu/amd: Introduce get_device_sbdf_id() helper function

2022-07-06 Thread Vasant Hegde via iommu
From: Suravee Suthikulpanit 

Current get_device_id() only provides the 16-bit PCI device ID (i.e. BDF).
With multiple PCI segment support, we need to extend the helper function
to include PCI segment ID.

So, introduce a new helper function get_device_sbdf_id() to replace
the current get_pci_device_id().
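
A short usage sketch of the new helper; as with get_device_id(), a negative
return value still means the device has no usable ID:

	int sbdf = get_device_sbdf_id(dev);

	if (sbdf < 0)
		return sbdf;			/* no PCI/ACPIHID ID available */

	seg   = PCI_SBDF_TO_SEGID(sbdf);	/* 16-bit PCI segment */
	devid = PCI_SBDF_TO_DEVID(sbdf);	/* 16-bit BDF */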

Co-developed-by: Vasant Hegde 
Signed-off-by: Vasant Hegde 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h   |  7 
 drivers/iommu/amd/amd_iommu_types.h |  2 +
 drivers/iommu/amd/iommu.c   | 58 ++---
 3 files changed, 38 insertions(+), 29 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 64c954e168d7..e73bd48fc716 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -115,6 +115,13 @@ void amd_iommu_domain_clr_pt_root(struct protection_domain 
*domain)
amd_iommu_domain_set_pt_root(domain, 0);
 }
 
+static inline int get_pci_sbdf_id(struct pci_dev *pdev)
+{
+   int seg = pci_domain_nr(pdev->bus);
+   u16 devid = pci_dev_id(pdev);
+
+   return PCI_SEG_DEVID_TO_SBDF(seg, devid);
+}
 
 extern bool translation_pre_enabled(struct amd_iommu *iommu);
 extern bool amd_iommu_is_attach_deferred(struct device *dev);
diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 65b02e2ae28f..ea238e8e6c99 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -456,6 +456,8 @@ extern struct kmem_cache *amd_iommu_irq_cache;
 
 #define PCI_SBDF_TO_SEGID(sbdf)	(((sbdf) >> 16) & 0xffff)
 #define PCI_SBDF_TO_DEVID(sbdf)	((sbdf) & 0xffff)
+#define PCI_SEG_DEVID_TO_SBDF(seg, devid)	((((u32)(seg) & 0xffff) << 16) | \
+						 ((devid) & 0xffff))
 
 /* Make iterating over all pci segment easier */
 #define for_each_pci_segment(pci_seg) \
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 6914911d4fb6..0751dda04a10 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -92,13 +92,6 @@ static void detach_device(struct device *dev);
  *
  /
 
-static inline u16 get_pci_device_id(struct device *dev)
-{
-   struct pci_dev *pdev = to_pci_dev(dev);
-
-   return pci_dev_id(pdev);
-}
-
 static inline int get_acpihid_device_id(struct device *dev,
struct acpihid_map_entry **entry)
 {
@@ -119,16 +112,16 @@ static inline int get_acpihid_device_id(struct device 
*dev,
return -EINVAL;
 }
 
-static inline int get_device_id(struct device *dev)
+static inline int get_device_sbdf_id(struct device *dev)
 {
-   int devid;
+   int sbdf;
 
if (dev_is_pci(dev))
-   devid = get_pci_device_id(dev);
+   sbdf = get_pci_sbdf_id(to_pci_dev(dev));
else
-   devid = get_acpihid_device_id(dev, NULL);
+   sbdf = get_acpihid_device_id(dev, NULL);
 
-   return devid;
+   return sbdf;
 }
 
 struct dev_table_entry *get_dev_table(struct amd_iommu *iommu)
@@ -182,9 +175,11 @@ static struct amd_iommu *__rlookup_amd_iommu(u16 seg, u16 
devid)
 static struct amd_iommu *rlookup_amd_iommu(struct device *dev)
 {
u16 seg = get_device_segment(dev);
-   u16 devid = get_device_id(dev);
+   int devid = get_device_sbdf_id(dev);
 
-   return __rlookup_amd_iommu(seg, devid);
+   if (devid < 0)
+   return NULL;
+   return __rlookup_amd_iommu(seg, PCI_SBDF_TO_DEVID(devid));
 }
 
 static struct protection_domain *to_pdomain(struct iommu_domain *dom)
@@ -360,14 +355,15 @@ static bool check_device(struct device *dev)
 {
struct amd_iommu_pci_seg *pci_seg;
struct amd_iommu *iommu;
-   int devid;
+   int devid, sbdf;
 
if (!dev)
return false;
 
-   devid = get_device_id(dev);
-   if (devid < 0)
+   sbdf = get_device_sbdf_id(dev);
+   if (sbdf < 0)
return false;
+   devid = PCI_SBDF_TO_DEVID(sbdf);
 
iommu = rlookup_amd_iommu(dev);
if (!iommu)
@@ -375,7 +371,7 @@ static bool check_device(struct device *dev)
 
/* Out of our scope? */
pci_seg = iommu->pci_seg;
-	if ((devid & 0xffff) > pci_seg->last_bdf)
+   if (devid > pci_seg->last_bdf)
return false;
 
return true;
@@ -384,15 +380,16 @@ static bool check_device(struct device *dev)
 static int iommu_init_device(struct amd_iommu *iommu, struct device *dev)
 {
struct iommu_dev_data *dev_data;
-   int devid;
+   int devid, sbdf;
 
if (dev_iommu_priv_get(dev))
return 0;
 
-   devid = get_device_id(dev);
-   if (devid < 0)
-   return devid;
+   sbdf = get_device_sbdf_id(dev);
+   if (sbdf < 0)
+   return sbdf;
 
+   devid = 

[PATCH v3 RESEND 28/35] iommu/amd: Flush upto last_bdf only

2022-07-06 Thread Vasant Hegde via iommu
Fix amd_iommu_flush_dte_all() and amd_iommu_flush_tlb_all() to flush
up to last_bdf only.

Co-developed-by: Suravee Suthikulpanit 
Signed-off-by: Suravee Suthikulpanit 
Signed-off-by: Vasant Hegde 
---
 drivers/iommu/amd/iommu.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 94ebffe15960..6914911d4fb6 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1191,8 +1191,9 @@ static int iommu_flush_dte(struct amd_iommu *iommu, u16 
devid)
 static void amd_iommu_flush_dte_all(struct amd_iommu *iommu)
 {
u32 devid;
+   u16 last_bdf = iommu->pci_seg->last_bdf;
 
-	for (devid = 0; devid <= 0xffff; ++devid)
+   for (devid = 0; devid <= last_bdf; ++devid)
iommu_flush_dte(iommu, devid);
 
iommu_completion_wait(iommu);
@@ -1205,8 +1206,9 @@ static void amd_iommu_flush_dte_all(struct amd_iommu 
*iommu)
 static void amd_iommu_flush_tlb_all(struct amd_iommu *iommu)
 {
u32 dom_id;
+   u16 last_bdf = iommu->pci_seg->last_bdf;
 
-	for (dom_id = 0; dom_id <= 0xffff; ++dom_id) {
+   for (dom_id = 0; dom_id <= last_bdf; ++dom_id) {
struct iommu_cmd cmd;
		build_inv_iommu_pages(&cmd, 0, CMD_INV_IOMMU_ALL_PAGES_ADDRESS,
  dom_id, 1);
@@ -1249,8 +1251,9 @@ static void iommu_flush_irt(struct amd_iommu *iommu, u16 
devid)
 static void amd_iommu_flush_irt_all(struct amd_iommu *iommu)
 {
u32 devid;
+   u16 last_bdf = iommu->pci_seg->last_bdf;
 
-   for (devid = 0; devid <= MAX_DEV_TABLE_ENTRIES; devid++)
+   for (devid = 0; devid <= last_bdf; devid++)
iommu_flush_irt(iommu, devid);
 
iommu_completion_wait(iommu);
-- 
2.31.1



[PATCH v3 RESEND 27/35] iommu/amd: Remove global amd_iommu_[dev_table/alias_table/last_bdf]

2022-07-06 Thread Vasant Hegde via iommu
From: Suravee Suthikulpanit 

Replace them with the per PCI segment device table.
Also remove the dev_table_size, alias_table_size and amd_iommu_last_bdf
variables.
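
Conceptually, every former amd_iommu_dev_table[devid] access now goes
through the owning IOMMU's PCI segment; a simplified sketch of the accessor
used throughout the series:

	struct dev_table_entry *get_dev_table(struct amd_iommu *iommu)
	{
		/* each IOMMU is tied to exactly one PCI segment */
		return iommu->pci_seg->dev_table;
	}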

Co-developed-by: Vasant Hegde 
Signed-off-by: Vasant Hegde 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu_types.h | 15 -
 drivers/iommu/amd/init.c| 89 +
 drivers/iommu/amd/iommu.c   | 18 --
 3 files changed, 27 insertions(+), 95 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index d932c90329e4..65b02e2ae28f 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -834,24 +834,9 @@ struct unity_map_entry {
  * Data structures for device handling
  */
 
-/*
- * Device table used by hardware. Read and write accesses by software are
- * locked with the amd_iommu_pd_table lock.
- */
-extern struct dev_table_entry *amd_iommu_dev_table;
-
-/*
- * Alias table to find requestor ids to device ids. Not locked because only
- * read on runtime.
- */
-extern u16 *amd_iommu_alias_table;
-
 /* size of the dma_ops aperture as power of 2 */
 extern unsigned amd_iommu_aperture_order;
 
-/* largest PCI device id we expect translation requests for */
-extern u16 amd_iommu_last_bdf;
-
 /* allocation bitmap for domain ids */
 extern unsigned long *amd_iommu_pd_alloc_bitmap;
 
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 508959182c7f..df8f4b9d20cd 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -160,9 +160,6 @@ static bool amd_iommu_disabled __initdata;
 static bool amd_iommu_force_enable __initdata;
 static int amd_iommu_target_ivhd_type;
 
-u16 amd_iommu_last_bdf;/* largest PCI device id we have
-  to handle */
-
 LIST_HEAD(amd_iommu_pci_seg_list); /* list of all PCI segments */
 LIST_HEAD(amd_iommu_list); /* list of all AMD IOMMUs in the
   system */
@@ -185,30 +182,12 @@ bool amdr_ivrs_remap_support __read_mostly;
 
 bool amd_iommu_force_isolation __read_mostly;
 
-/*
- * Pointer to the device table which is shared by all AMD IOMMUs
- * it is indexed by the PCI device id or the HT unit id and contains
- * information about the domain the device belongs to as well as the
- * page table root pointer.
- */
-struct dev_table_entry *amd_iommu_dev_table;
-
-/*
- * The alias table is a driver specific data structure which contains the
- * mappings of the PCI device ids to the actual requestor ids on the IOMMU.
- * More than one device can share the same requestor id.
- */
-u16 *amd_iommu_alias_table;
-
 /*
  * AMD IOMMU allows up to 2^16 different protection domains. This is a bitmap
  * to know which ones are already in use.
  */
 unsigned long *amd_iommu_pd_alloc_bitmap;
 
-static u32 dev_table_size; /* size of the device table */
-static u32 alias_table_size;   /* size of the alias table */
-
 enum iommu_init_state {
IOMMU_START_STATE,
IOMMU_IVRS_DETECTED,
@@ -263,16 +242,10 @@ static void init_translation_status(struct amd_iommu 
*iommu)
iommu->flags |= AMD_IOMMU_FLAG_TRANS_PRE_ENABLED;
 }
 
-static inline void update_last_devid(u16 devid)
-{
-   if (devid > amd_iommu_last_bdf)
-   amd_iommu_last_bdf = devid;
-}
-
-static inline unsigned long tbl_size(int entry_size)
+static inline unsigned long tbl_size(int entry_size, int last_bdf)
 {
unsigned shift = PAGE_SHIFT +
-get_order(((int)amd_iommu_last_bdf + 1) * entry_size);
+get_order((last_bdf + 1) * entry_size);
 
return 1UL << shift;
 }
@@ -404,10 +377,11 @@ static void iommu_set_device_table(struct amd_iommu 
*iommu)
 {
u64 entry;
u32 dev_table_size = iommu->pci_seg->dev_table_size;
+   void *dev_table = (void *)get_dev_table(iommu);
 
BUG_ON(iommu->mmio_base == NULL);
 
-   entry = iommu_virt_to_phys(amd_iommu_dev_table);
+   entry = iommu_virt_to_phys(dev_table);
entry |= (dev_table_size >> 12) - 1;
memcpy_toio(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET,
		    &entry, sizeof(entry));
@@ -557,14 +531,12 @@ static int __init find_last_devid_from_ivhd(struct 
ivhd_header *h)
switch (dev->type) {
case IVHD_DEV_ALL:
/* Use maximum BDF value for DEV_ALL */
-		update_last_devid(0xffff);
		return 0xffff;
case IVHD_DEV_SELECT:
case IVHD_DEV_RANGE_END:
case IVHD_DEV_ALIAS:
case IVHD_DEV_EXT_SELECT:
/* all the above subfield types refer to device ids */
-   update_last_devid(dev->devid);
if (dev->devid > last_devid)
last_devid = dev->devid;

[PATCH v3 RESEND 26/35] iommu/amd: Update set_dev_entry_bit() and get_dev_entry_bit()

2022-07-06 Thread Vasant Hegde via iommu
From: Suravee Suthikulpanit 

To include a pointer to the per PCI segment device table.

Also pass struct amd_iommu as one of the function parameters to
amd_iommu_apply_erratum_63(), since it is needed when setting up the DTE.
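
For reference, the 256-bit DTE is stored as four u64 words, and the helpers
index it as follows (worked example for an arbitrary bit number):

	/* bit b lives in word b / 64 at position b % 64 */
	int i    = (bit >> 6) & 0x03;	/* e.g. bit 97 -> data[1] */
	int _bit = bit & 0x3f;		/* e.g. bit 97 -> bit position 33 */

	dev_table[devid].data[i] |= (1UL << _bit);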

Co-developed-by: Vasant Hegde 
Signed-off-by: Vasant Hegde 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h |  2 +-
 drivers/iommu/amd/init.c  | 59 +++
 drivers/iommu/amd/iommu.c |  2 +-
 3 files changed, 41 insertions(+), 22 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 2947239700ce..64c954e168d7 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -13,7 +13,7 @@
 
 extern irqreturn_t amd_iommu_int_thread(int irq, void *data);
 extern irqreturn_t amd_iommu_int_handler(int irq, void *data);
-extern void amd_iommu_apply_erratum_63(u16 devid);
+extern void amd_iommu_apply_erratum_63(struct amd_iommu *iommu, u16 devid);
 extern void amd_iommu_restart_event_logging(struct amd_iommu *iommu);
 extern int amd_iommu_init_devices(void);
 extern void amd_iommu_uninit_devices(void);
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 3024fa9a89d5..508959182c7f 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -989,22 +989,37 @@ static void iommu_enable_gt(struct amd_iommu *iommu)
 }
 
 /* sets a specific bit in the device table entry. */
-static void set_dev_entry_bit(u16 devid, u8 bit)
+static void __set_dev_entry_bit(struct dev_table_entry *dev_table,
+   u16 devid, u8 bit)
 {
int i = (bit >> 6) & 0x03;
int _bit = bit & 0x3f;
 
-   amd_iommu_dev_table[devid].data[i] |= (1UL << _bit);
+   dev_table[devid].data[i] |= (1UL << _bit);
 }
 
-static int get_dev_entry_bit(u16 devid, u8 bit)
+static void set_dev_entry_bit(struct amd_iommu *iommu, u16 devid, u8 bit)
+{
+   struct dev_table_entry *dev_table = get_dev_table(iommu);
+
+   return __set_dev_entry_bit(dev_table, devid, bit);
+}
+
+static int __get_dev_entry_bit(struct dev_table_entry *dev_table,
+  u16 devid, u8 bit)
 {
int i = (bit >> 6) & 0x03;
int _bit = bit & 0x3f;
 
-   return (amd_iommu_dev_table[devid].data[i] & (1UL << _bit)) >> _bit;
+   return (dev_table[devid].data[i] & (1UL << _bit)) >> _bit;
 }
 
+static int get_dev_entry_bit(struct amd_iommu *iommu, u16 devid, u8 bit)
+{
+   struct dev_table_entry *dev_table = get_dev_table(iommu);
+
+   return __get_dev_entry_bit(dev_table, devid, bit);
+}
 
 static bool __copy_device_table(struct amd_iommu *iommu)
 {
@@ -1123,15 +1138,15 @@ static bool copy_device_table(void)
return true;
 }
 
-void amd_iommu_apply_erratum_63(u16 devid)
+void amd_iommu_apply_erratum_63(struct amd_iommu *iommu, u16 devid)
 {
int sysmgt;
 
-   sysmgt = get_dev_entry_bit(devid, DEV_ENTRY_SYSMGT1) |
-(get_dev_entry_bit(devid, DEV_ENTRY_SYSMGT2) << 1);
+   sysmgt = get_dev_entry_bit(iommu, devid, DEV_ENTRY_SYSMGT1) |
+(get_dev_entry_bit(iommu, devid, DEV_ENTRY_SYSMGT2) << 1);
 
if (sysmgt == 0x01)
-   set_dev_entry_bit(devid, DEV_ENTRY_IW);
+   set_dev_entry_bit(iommu, devid, DEV_ENTRY_IW);
 }
 
 /* Writes the specific IOMMU for a device into the rlookup table */
@@ -1148,21 +1163,21 @@ static void __init set_dev_entry_from_acpi(struct 
amd_iommu *iommu,
   u16 devid, u32 flags, u32 ext_flags)
 {
if (flags & ACPI_DEVFLAG_INITPASS)
-   set_dev_entry_bit(devid, DEV_ENTRY_INIT_PASS);
+   set_dev_entry_bit(iommu, devid, DEV_ENTRY_INIT_PASS);
if (flags & ACPI_DEVFLAG_EXTINT)
-   set_dev_entry_bit(devid, DEV_ENTRY_EINT_PASS);
+   set_dev_entry_bit(iommu, devid, DEV_ENTRY_EINT_PASS);
if (flags & ACPI_DEVFLAG_NMI)
-   set_dev_entry_bit(devid, DEV_ENTRY_NMI_PASS);
+   set_dev_entry_bit(iommu, devid, DEV_ENTRY_NMI_PASS);
if (flags & ACPI_DEVFLAG_SYSMGT1)
-   set_dev_entry_bit(devid, DEV_ENTRY_SYSMGT1);
+   set_dev_entry_bit(iommu, devid, DEV_ENTRY_SYSMGT1);
if (flags & ACPI_DEVFLAG_SYSMGT2)
-   set_dev_entry_bit(devid, DEV_ENTRY_SYSMGT2);
+   set_dev_entry_bit(iommu, devid, DEV_ENTRY_SYSMGT2);
if (flags & ACPI_DEVFLAG_LINT0)
-   set_dev_entry_bit(devid, DEV_ENTRY_LINT0_PASS);
+   set_dev_entry_bit(iommu, devid, DEV_ENTRY_LINT0_PASS);
if (flags & ACPI_DEVFLAG_LINT1)
-   set_dev_entry_bit(devid, DEV_ENTRY_LINT1_PASS);
+   set_dev_entry_bit(iommu, devid, DEV_ENTRY_LINT1_PASS);
 
-   amd_iommu_apply_erratum_63(devid);
+   amd_iommu_apply_erratum_63(iommu, devid);
 
set_iommu_for_device(iommu, devid);
 }
@@ -2519,8 +2534,8 @@ static void init_device_table_dma(struct 

[PATCH v3 RESEND 25/35] iommu/amd: Update (un)init_device_table_dma()

2022-07-06 Thread Vasant Hegde via iommu
From: Suravee Suthikulpanit 

Include struct amd_iommu_pci_seg as a function parameter since
we need to access per PCI segment device table.

Co-developed-by: Vasant Hegde 
Signed-off-by: Vasant Hegde 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c | 27 ---
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index b7e54bb5efc5..3024fa9a89d5 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -238,7 +238,7 @@ static enum iommu_init_state init_state = IOMMU_START_STATE;
 
 static int amd_iommu_enable_interrupts(void);
 static int __init iommu_go_to_state(enum iommu_init_state state);
-static void init_device_table_dma(void);
+static void init_device_table_dma(struct amd_iommu_pci_seg *pci_seg);
 
 static bool amd_iommu_pre_enabled = true;
 
@@ -2116,6 +2116,7 @@ static void print_iommu_info(void)
 static int __init amd_iommu_init_pci(void)
 {
struct amd_iommu *iommu;
+   struct amd_iommu_pci_seg *pci_seg;
int ret;
 
for_each_iommu(iommu) {
@@ -2146,7 +2147,8 @@ static int __init amd_iommu_init_pci(void)
goto out;
}
 
-   init_device_table_dma();
+   for_each_pci_segment(pci_seg)
+   init_device_table_dma(pci_seg);
 
for_each_iommu(iommu)
iommu_flush_all_caches(iommu);
@@ -2508,9 +2510,13 @@ static int __init init_memory_definitions(struct 
acpi_table_header *table)
 /*
  * Init the device table to not allow DMA access for devices
  */
-static void init_device_table_dma(void)
+static void init_device_table_dma(struct amd_iommu_pci_seg *pci_seg)
 {
u32 devid;
+   struct dev_table_entry *dev_table = pci_seg->dev_table;
+
+   if (dev_table == NULL)
+   return;
 
for (devid = 0; devid <= amd_iommu_last_bdf; ++devid) {
set_dev_entry_bit(devid, DEV_ENTRY_VALID);
@@ -2518,13 +2524,17 @@ static void init_device_table_dma(void)
}
 }
 
-static void __init uninit_device_table_dma(void)
+static void __init uninit_device_table_dma(struct amd_iommu_pci_seg *pci_seg)
 {
u32 devid;
+   struct dev_table_entry *dev_table = pci_seg->dev_table;
+
+   if (dev_table == NULL)
+   return;
 
for (devid = 0; devid <= amd_iommu_last_bdf; ++devid) {
-   amd_iommu_dev_table[devid].data[0] = 0ULL;
-   amd_iommu_dev_table[devid].data[1] = 0ULL;
+   dev_table[devid].data[0] = 0ULL;
+   dev_table[devid].data[1] = 0ULL;
}
 }
 
@@ -3117,8 +3127,11 @@ static int __init state_next(void)
free_iommu_resources();
} else {
struct amd_iommu *iommu;
+   struct amd_iommu_pci_seg *pci_seg;
+
+   for_each_pci_segment(pci_seg)
+   uninit_device_table_dma(pci_seg);
 
-   uninit_device_table_dma();
for_each_iommu(iommu)
iommu_flush_all_caches(iommu);
}
-- 
2.31.1



[PATCH v3 RESEND 24/35] iommu/amd: Update set_dte_irq_entry

2022-07-06 Thread Vasant Hegde via iommu
From: Suravee Suthikulpanit 

Start using per PCI segment device table instead of global
device table.

Signed-off-by: Suravee Suthikulpanit 
Signed-off-by: Vasant Hegde 
---
 drivers/iommu/amd/iommu.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 790a3449e7b7..f1fab4168101 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2731,18 +2731,20 @@ EXPORT_SYMBOL(amd_iommu_device_info);
 static struct irq_chip amd_ir_chip;
 static DEFINE_SPINLOCK(iommu_table_lock);
 
-static void set_dte_irq_entry(u16 devid, struct irq_remap_table *table)
+static void set_dte_irq_entry(struct amd_iommu *iommu, u16 devid,
+ struct irq_remap_table *table)
 {
u64 dte;
+   struct dev_table_entry *dev_table = get_dev_table(iommu);
 
-   dte = amd_iommu_dev_table[devid].data[2];
+   dte = dev_table[devid].data[2];
dte &= ~DTE_IRQ_PHYS_ADDR_MASK;
dte |= iommu_virt_to_phys(table->table);
dte |= DTE_IRQ_REMAP_INTCTL;
dte |= DTE_INTTABLEN;
dte |= DTE_IRQ_REMAP_ENABLE;
 
-   amd_iommu_dev_table[devid].data[2] = dte;
+   dev_table[devid].data[2] = dte;
 }
 
 static struct irq_remap_table *get_irq_table(struct amd_iommu *iommu, u16 
devid)
@@ -2793,7 +2795,7 @@ static void set_remap_table_entry(struct amd_iommu 
*iommu, u16 devid,
struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg;
 
pci_seg->irq_lookup_table[devid] = table;
-   set_dte_irq_entry(devid, table);
+   set_dte_irq_entry(iommu, devid, table);
iommu_flush_dte(iommu, devid);
 }
 
@@ -2809,8 +2811,7 @@ static int set_remap_table_entry_alias(struct pci_dev 
*pdev, u16 alias,
 
pci_seg = iommu->pci_seg;
pci_seg->irq_lookup_table[alias] = table;
-   set_dte_irq_entry(alias, table);
-
+   set_dte_irq_entry(iommu, alias, table);
iommu_flush_dte(pci_seg->rlookup_table[alias], alias);
 
return 0;
-- 
2.31.1



[PATCH v3 RESEND 23/35] iommu/amd: Update dump_dte_entry

2022-07-06 Thread Vasant Hegde via iommu
From: Suravee Suthikulpanit 

Start using per PCI segment device table instead of global
device table.

Signed-off-by: Suravee Suthikulpanit 
Signed-off-by: Vasant Hegde 
---
 drivers/iommu/amd/iommu.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 90755da7cff0..790a3449e7b7 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -451,13 +451,13 @@ static void amd_iommu_uninit_device(struct device *dev)
  *
  /
 
-static void dump_dte_entry(u16 devid)
+static void dump_dte_entry(struct amd_iommu *iommu, u16 devid)
 {
int i;
+   struct dev_table_entry *dev_table = get_dev_table(iommu);
 
for (i = 0; i < 4; ++i)
-   pr_err("DTE[%d]: %016llx\n", i,
-   amd_iommu_dev_table[devid].data[i]);
+   pr_err("DTE[%d]: %016llx\n", i, dev_table[devid].data[i]);
 }
 
 static void dump_command(unsigned long phys_addr)
@@ -618,7 +618,7 @@ static void iommu_print_event(struct amd_iommu *iommu, void 
*__evt)
dev_err(dev, "Event logged [ILLEGAL_DEV_TABLE_ENTRY 
device=%02x:%02x.%x pasid=0x%05x address=0x%llx flags=0x%04x]\n",
PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
pasid, address, flags);
-   dump_dte_entry(devid);
+   dump_dte_entry(iommu, devid);
break;
case EVENT_TYPE_DEV_TAB_ERR:
dev_err(dev, "Event logged [DEV_TAB_HARDWARE_ERROR 
device=%02x:%02x.%x "
-- 
2.31.1



[PATCH v3 RESEND 22/35] iommu/amd: Update iommu_ignore_device

2022-07-06 Thread Vasant Hegde via iommu
From: Suravee Suthikulpanit 

Start using per PCI segment device table instead of global
device table.

Signed-off-by: Suravee Suthikulpanit 
Signed-off-by: Vasant Hegde 
---
 drivers/iommu/amd/iommu.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 493cda5e0246..90755da7cff0 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -413,15 +413,15 @@ static int iommu_init_device(struct amd_iommu *iommu, 
struct device *dev)
 static void iommu_ignore_device(struct amd_iommu *iommu, struct device *dev)
 {
struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg;
+   struct dev_table_entry *dev_table = get_dev_table(iommu);
int devid;
 
-   devid = get_device_id(dev);
+	devid = (get_device_id(dev)) & 0xffff;
if (devid < 0)
return;
 
-
pci_seg->rlookup_table[devid] = NULL;
-	memset(&amd_iommu_dev_table[devid], 0, sizeof(struct dev_table_entry));
+	memset(&dev_table[devid], 0, sizeof(struct dev_table_entry));
 
setup_aliases(iommu, dev);
 }
-- 
2.31.1



[PATCH v3 RESEND 21/35] iommu/amd: Update set_dte_entry and clear_dte_entry

2022-07-06 Thread Vasant Hegde via iommu
From: Suravee Suthikulpanit 

Start using per PCI segment data structures instead of global data
structures.

Signed-off-by: Suravee Suthikulpanit 
Signed-off-by: Vasant Hegde 
---
 drivers/iommu/amd/iommu.c | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 6e0cd9c4f57c..493cda5e0246 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1537,6 +1537,7 @@ static void set_dte_entry(struct amd_iommu *iommu, u16 
devid,
u64 pte_root = 0;
u64 flags = 0;
u32 old_domid;
+   struct dev_table_entry *dev_table = get_dev_table(iommu);
 
if (domain->iop.mode != PAGE_MODE_NONE)
pte_root = iommu_virt_to_phys(domain->iop.root);
@@ -1545,7 +1546,7 @@ static void set_dte_entry(struct amd_iommu *iommu, u16 
devid,
<< DEV_ENTRY_MODE_SHIFT;
pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV;
 
-   flags = amd_iommu_dev_table[devid].data[1];
+   flags = dev_table[devid].data[1];
 
if (ats)
flags |= DTE_FLAG_IOTLB;
@@ -1584,9 +1585,9 @@ static void set_dte_entry(struct amd_iommu *iommu, u16 
devid,
flags &= ~DEV_DOMID_MASK;
flags |= domain->id;
 
-   old_domid = amd_iommu_dev_table[devid].data[1] & DEV_DOMID_MASK;
-   amd_iommu_dev_table[devid].data[1]  = flags;
-   amd_iommu_dev_table[devid].data[0]  = pte_root;
+   old_domid = dev_table[devid].data[1] & DEV_DOMID_MASK;
+   dev_table[devid].data[1]  = flags;
+   dev_table[devid].data[0]  = pte_root;
 
/*
 * A kdump kernel might be replacing a domain ID that was copied from
@@ -1598,11 +1599,13 @@ static void set_dte_entry(struct amd_iommu *iommu, u16 
devid,
}
 }
 
-static void clear_dte_entry(u16 devid)
+static void clear_dte_entry(struct amd_iommu *iommu, u16 devid)
 {
+   struct dev_table_entry *dev_table = get_dev_table(iommu);
+
/* remove entry from the device table seen by the hardware */
-   amd_iommu_dev_table[devid].data[0]  = DTE_FLAG_V | DTE_FLAG_TV;
-   amd_iommu_dev_table[devid].data[1] &= DTE_FLAG_MASK;
+   dev_table[devid].data[0]  = DTE_FLAG_V | DTE_FLAG_TV;
+   dev_table[devid].data[1] &= DTE_FLAG_MASK;
 
amd_iommu_apply_erratum_63(devid);
 }
@@ -1646,7 +1649,7 @@ static void do_detach(struct iommu_dev_data *dev_data)
/* Update data structures */
dev_data->domain = NULL;
list_del(_data->list);
-   clear_dte_entry(dev_data->devid);
+   clear_dte_entry(iommu, dev_data->devid);
clone_aliases(iommu, dev_data->dev);
 
/* Flush the DTE entry */
-- 
2.31.1



[PATCH v3 RESEND 20/35] iommu/amd: Convert to use per PCI segment rlookup_table

2022-07-06 Thread Vasant Hegde via iommu
Convert to use the per PCI segment rlookup_table, then remove the global
amd_iommu_rlookup_table and rlookup_table_size.
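
A simplified sketch of the per-segment reverse lookup that replaces the
global array:

	/* was: iommu = amd_iommu_rlookup_table[devid]; */
	static struct amd_iommu *__rlookup_amd_iommu(u16 seg, u16 devid)
	{
		struct amd_iommu_pci_seg *pci_seg;

		for_each_pci_segment(pci_seg) {
			if (pci_seg->id == seg)
				return pci_seg->rlookup_table[devid];
		}

		return NULL;
	}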

Co-developed-by: Suravee Suthikulpanit 
Signed-off-by: Suravee Suthikulpanit 
Signed-off-by: Vasant Hegde 
---
 drivers/iommu/amd/amd_iommu_types.h |  5 -
 drivers/iommu/amd/init.c| 23 ++-
 drivers/iommu/amd/iommu.c   | 19 +--
 3 files changed, 11 insertions(+), 36 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 67feb847fc13..d932c90329e4 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -846,11 +846,6 @@ extern struct dev_table_entry *amd_iommu_dev_table;
  */
 extern u16 *amd_iommu_alias_table;
 
-/*
- * Reverse lookup table to find the IOMMU which translates a specific device.
- */
-extern struct amd_iommu **amd_iommu_rlookup_table;
-
 /* size of the dma_ops aperture as power of 2 */
 extern unsigned amd_iommu_aperture_order;
 
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index b7b50345c8a5..b7e54bb5efc5 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -200,12 +200,6 @@ struct dev_table_entry *amd_iommu_dev_table;
  */
 u16 *amd_iommu_alias_table;
 
-/*
- * The rlookup table is used to find the IOMMU which is responsible
- * for a specific device. It is also indexed by the PCI device id.
- */
-struct amd_iommu **amd_iommu_rlookup_table;
-
 /*
  * AMD IOMMU allows up to 2^16 different protection domains. This is a bitmap
  * to know which ones are already in use.
@@ -214,7 +208,6 @@ unsigned long *amd_iommu_pd_alloc_bitmap;
 
 static u32 dev_table_size; /* size of the device table */
 static u32 alias_table_size;   /* size of the alias table */
-static u32 rlookup_table_size; /* size if the rlookup table */
 
 enum iommu_init_state {
IOMMU_START_STATE,
@@ -1144,7 +1137,7 @@ void amd_iommu_apply_erratum_63(u16 devid)
 /* Writes the specific IOMMU for a device into the rlookup table */
 static void __init set_iommu_for_device(struct amd_iommu *iommu, u16 devid)
 {
-   amd_iommu_rlookup_table[devid] = iommu;
+   iommu->pci_seg->rlookup_table[devid] = iommu;
 }
 
 /*
@@ -1826,7 +1819,7 @@ static int __init init_iommu_one(struct amd_iommu *iommu, 
struct ivhd_header *h,
 * Make sure IOMMU is not considered to translate itself. The IVRS
 * table tells us so, but this is a lie!
 */
-   amd_iommu_rlookup_table[iommu->devid] = NULL;
+   pci_seg->rlookup_table[iommu->devid] = NULL;
 
return 0;
 }
@@ -2783,10 +2776,6 @@ static void __init free_iommu_resources(void)
kmem_cache_destroy(amd_iommu_irq_cache);
amd_iommu_irq_cache = NULL;
 
-   free_pages((unsigned long)amd_iommu_rlookup_table,
-  get_order(rlookup_table_size));
-   amd_iommu_rlookup_table = NULL;
-
free_pages((unsigned long)amd_iommu_alias_table,
   get_order(alias_table_size));
amd_iommu_alias_table = NULL;
@@ -2925,7 +2914,6 @@ static int __init early_amd_iommu_init(void)
 
dev_table_size = tbl_size(DEV_TABLE_ENTRY_SIZE);
alias_table_size   = tbl_size(ALIAS_TABLE_ENTRY_SIZE);
-   rlookup_table_size = tbl_size(RLOOKUP_TABLE_ENTRY_SIZE);
 
/* Device table - directly used by all IOMMUs */
ret = -ENOMEM;
@@ -2944,13 +2932,6 @@ static int __init early_amd_iommu_init(void)
if (amd_iommu_alias_table == NULL)
goto out;
 
-   /* IOMMU rlookup table - find the IOMMU for a specific device */
-   amd_iommu_rlookup_table = (void *)__get_free_pages(
-   GFP_KERNEL | __GFP_ZERO,
-   get_order(rlookup_table_size));
-   if (amd_iommu_rlookup_table == NULL)
-   goto out;
-
amd_iommu_pd_alloc_bitmap = (void *)__get_free_pages(
GFP_KERNEL | __GFP_ZERO,
get_order(MAX_DOMAIN_ID/8));
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 5ee1af9a0a54..6e0cd9c4f57c 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -287,10 +287,9 @@ static void setup_aliases(struct amd_iommu *iommu, struct 
device *dev)
clone_aliases(iommu, dev);
 }
 
-static struct iommu_dev_data *find_dev_data(u16 devid)
+static struct iommu_dev_data *find_dev_data(struct amd_iommu *iommu, u16 devid)
 {
struct iommu_dev_data *dev_data;
-   struct amd_iommu *iommu = amd_iommu_rlookup_table[devid];
 
dev_data = search_dev_data(iommu, devid);
 
@@ -388,7 +387,7 @@ static int iommu_init_device(struct amd_iommu *iommu, 
struct device *dev)
if (devid < 0)
return devid;
 
-   dev_data = find_dev_data(devid);
+   dev_data = find_dev_data(iommu, devid);
if (!dev_data)
return -ENOMEM;
 
@@ -403,9 +402,6 @@ static int iommu_init_device(struct amd_iommu *iommu, 
struct 

[PATCH v3 RESEND 19/35] iommu/amd: Update alloc_irq_table and alloc_irq_index

2022-07-06 Thread Vasant Hegde via iommu
From: Suravee Suthikulpanit 

Pass the amd_iommu structure as a parameter to these functions, as it is
needed to retrieve the per-segment tables inside them.

Co-developed-by: Vasant Hegde 
Signed-off-by: Vasant Hegde 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/iommu.c | 26 +-
 1 file changed, 9 insertions(+), 17 deletions(-)
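
As an aside, a minimal user-space sketch of the calling convention this
patch moves to: the caller resolves the IOMMU once and hands it to both
helpers, instead of each helper consulting a global devid-indexed rlookup
array. All names below (iommu_model, alloc_table, alloc_index) are
simplified stand-ins, not the driver's real types.

#include <stdint.h>
#include <stdio.h>

struct iommu_model { int index; };   /* stand-in for struct amd_iommu */

static int alloc_table(struct iommu_model *iommu, uint16_t devid)
{
        printf("IRQ table for devid 0x%04x on IOMMU %d\n", devid, iommu->index);
        return 0;
}

static int alloc_index(struct iommu_model *iommu, uint16_t devid, int count)
{
        printf("%d IRTE slots for devid 0x%04x on IOMMU %d\n",
               count, devid, iommu->index);
        return 0;
}

int main(void)
{
        struct iommu_model iommu0 = { .index = 0 };
        uint16_t devid = 0x00a0;

        /* One lookup by the caller, reused by every helper it calls. */
        return alloc_table(&iommu0, devid) || alloc_index(&iommu0, devid, 2);
}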

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index c4701fa957d0..5ee1af9a0a54 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2814,21 +2814,17 @@ static int set_remap_table_entry_alias(struct pci_dev 
*pdev, u16 alias,
return 0;
 }
 
-static struct irq_remap_table *alloc_irq_table(u16 devid, struct pci_dev *pdev)
+static struct irq_remap_table *alloc_irq_table(struct amd_iommu *iommu,
+  u16 devid, struct pci_dev *pdev)
 {
struct irq_remap_table *table = NULL;
struct irq_remap_table *new_table = NULL;
struct amd_iommu_pci_seg *pci_seg;
-   struct amd_iommu *iommu;
unsigned long flags;
u16 alias;
 
spin_lock_irqsave(_table_lock, flags);
 
-   iommu = amd_iommu_rlookup_table[devid];
-   if (!iommu)
-   goto out_unlock;
-
pci_seg = iommu->pci_seg;
table = pci_seg->irq_lookup_table[devid];
if (table)
@@ -2884,18 +2880,14 @@ static struct irq_remap_table *alloc_irq_table(u16 
devid, struct pci_dev *pdev)
return table;
 }
 
-static int alloc_irq_index(u16 devid, int count, bool align,
-  struct pci_dev *pdev)
+static int alloc_irq_index(struct amd_iommu *iommu, u16 devid, int count,
+  bool align, struct pci_dev *pdev)
 {
struct irq_remap_table *table;
int index, c, alignment = 1;
unsigned long flags;
-   struct amd_iommu *iommu = amd_iommu_rlookup_table[devid];
-
-   if (!iommu)
-   return -ENODEV;
 
-   table = alloc_irq_table(devid, pdev);
+   table = alloc_irq_table(iommu, devid, pdev);
if (!table)
return -ENODEV;
 
@@ -3267,7 +3259,7 @@ static int irq_remapping_alloc(struct irq_domain *domain, 
unsigned int virq,
if (info->type == X86_IRQ_ALLOC_TYPE_IOAPIC) {
struct irq_remap_table *table;
 
-   table = alloc_irq_table(devid, NULL);
+   table = alloc_irq_table(iommu, devid, NULL);
if (table) {
if (!table->min_index) {
/*
@@ -3287,10 +3279,10 @@ static int irq_remapping_alloc(struct irq_domain 
*domain, unsigned int virq,
   info->type == X86_IRQ_ALLOC_TYPE_PCI_MSIX) {
bool align = (info->type == X86_IRQ_ALLOC_TYPE_PCI_MSI);
 
-   index = alloc_irq_index(devid, nr_irqs, align,
+   index = alloc_irq_index(iommu, devid, nr_irqs, align,
msi_desc_to_pci_dev(info->desc));
} else {
-   index = alloc_irq_index(devid, nr_irqs, false, NULL);
+   index = alloc_irq_index(iommu, devid, nr_irqs, false, NULL);
}
 
if (index < 0) {
@@ -3416,8 +3408,8 @@ static int irq_remapping_select(struct irq_domain *d, 
struct irq_fwspec *fwspec,
 
if (devid < 0)
return 0;
+   iommu = __rlookup_amd_iommu((devid >> 16), (devid & 0x));
 
-   iommu = amd_iommu_rlookup_table[devid];
return iommu && iommu->ir_domain == d;
 }
 
-- 
2.31.1



[PATCH v3 RESEND 18/35] iommu/amd: Update amd_irte_ops functions

2022-07-06 Thread Vasant Hegde via iommu
From: Suravee Suthikulpanit 

Pass the amd_iommu structure as a parameter to the amd_irte_ops functions,
since it is needed to activate/deactivate the IRTEs on that IOMMU.

Signed-off-by: Suravee Suthikulpanit 
Signed-off-by: Vasant Hegde 
---
 drivers/iommu/amd/amd_iommu_types.h |  6 ++--
 drivers/iommu/amd/iommu.c   | 51 -
 2 files changed, 24 insertions(+), 33 deletions(-)
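
For illustration, a small user-space sketch of the ops-table change: each
callback now receives the IOMMU explicitly instead of re-deriving it from
a global table. Struct and function names below are simplified stand-ins,
not the driver's definitions.

#include <stdint.h>
#include <stdio.h>

struct iommu_model { int index; };   /* stand-in for struct amd_iommu */

struct irte_ops_model {
        void (*activate)(struct iommu_model *iommu, void *entry,
                         uint16_t devid, uint16_t index);
};

static void activate_model(struct iommu_model *iommu, void *entry,
                           uint16_t devid, uint16_t index)
{
        (void)entry;
        printf("IOMMU %d: activate IRTE %d for devid 0x%04x\n",
               iommu->index, index, devid);
}

int main(void)
{
        struct irte_ops_model ops = { .activate = activate_model };
        struct iommu_model iommu0 = { .index = 0 };

        /* Callers (modify_irte*, irq chip callbacks) already know the
         * IOMMU, so they pass it straight through the ops table. */
        ops.activate(&iommu0, NULL, 0x00a0, 3);
        return 0;
}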

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 693926afdd0f..67feb847fc13 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -1007,9 +1007,9 @@ struct amd_ir_data {
 
 struct amd_irte_ops {
void (*prepare)(void *, u32, bool, u8, u32, int);
-   void (*activate)(void *, u16, u16);
-   void (*deactivate)(void *, u16, u16);
-   void (*set_affinity)(void *, u16, u16, u8, u32);
+   void (*activate)(struct amd_iommu *iommu, void *, u16, u16);
+   void (*deactivate)(struct amd_iommu *iommu, void *, u16, u16);
+   void (*set_affinity)(struct amd_iommu *iommu, void *, u16, u16, u8, 
u32);
void *(*get)(struct irq_remap_table *, int);
void (*set_allocated)(struct irq_remap_table *, int);
bool (*is_allocated)(struct irq_remap_table *, int);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 9f373b164762..c4701fa957d0 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2934,19 +2934,14 @@ static int alloc_irq_index(u16 devid, int count, bool 
align,
return index;
 }
 
-static int modify_irte_ga(u16 devid, int index, struct irte_ga *irte,
- struct amd_ir_data *data)
+static int modify_irte_ga(struct amd_iommu *iommu, u16 devid, int index,
+ struct irte_ga *irte, struct amd_ir_data *data)
 {
bool ret;
struct irq_remap_table *table;
-   struct amd_iommu *iommu;
unsigned long flags;
struct irte_ga *entry;
 
-   iommu = amd_iommu_rlookup_table[devid];
-   if (iommu == NULL)
-   return -EINVAL;
-
table = get_irq_table(iommu, devid);
if (!table)
return -ENOMEM;
@@ -2978,16 +2973,12 @@ static int modify_irte_ga(u16 devid, int index, struct 
irte_ga *irte,
return 0;
 }
 
-static int modify_irte(u16 devid, int index, union irte *irte)
+static int modify_irte(struct amd_iommu *iommu,
+  u16 devid, int index, union irte *irte)
 {
struct irq_remap_table *table;
-   struct amd_iommu *iommu;
unsigned long flags;
 
-   iommu = amd_iommu_rlookup_table[devid];
-   if (iommu == NULL)
-   return -EINVAL;
-
table = get_irq_table(iommu, devid);
if (!table)
return -ENOMEM;
@@ -3049,49 +3040,49 @@ static void irte_ga_prepare(void *entry,
irte->lo.fields_remap.valid   = 1;
 }
 
-static void irte_activate(void *entry, u16 devid, u16 index)
+static void irte_activate(struct amd_iommu *iommu, void *entry, u16 devid, u16 
index)
 {
union irte *irte = (union irte *) entry;
 
irte->fields.valid = 1;
-   modify_irte(devid, index, irte);
+   modify_irte(iommu, devid, index, irte);
 }
 
-static void irte_ga_activate(void *entry, u16 devid, u16 index)
+static void irte_ga_activate(struct amd_iommu *iommu, void *entry, u16 devid, 
u16 index)
 {
struct irte_ga *irte = (struct irte_ga *) entry;
 
irte->lo.fields_remap.valid = 1;
-   modify_irte_ga(devid, index, irte, NULL);
+   modify_irte_ga(iommu, devid, index, irte, NULL);
 }
 
-static void irte_deactivate(void *entry, u16 devid, u16 index)
+static void irte_deactivate(struct amd_iommu *iommu, void *entry, u16 devid, 
u16 index)
 {
union irte *irte = (union irte *) entry;
 
irte->fields.valid = 0;
-   modify_irte(devid, index, irte);
+   modify_irte(iommu, devid, index, irte);
 }
 
-static void irte_ga_deactivate(void *entry, u16 devid, u16 index)
+static void irte_ga_deactivate(struct amd_iommu *iommu, void *entry, u16 
devid, u16 index)
 {
struct irte_ga *irte = (struct irte_ga *) entry;
 
irte->lo.fields_remap.valid = 0;
-   modify_irte_ga(devid, index, irte, NULL);
+   modify_irte_ga(iommu, devid, index, irte, NULL);
 }
 
-static void irte_set_affinity(void *entry, u16 devid, u16 index,
+static void irte_set_affinity(struct amd_iommu *iommu, void *entry, u16 devid, 
u16 index,
  u8 vector, u32 dest_apicid)
 {
union irte *irte = (union irte *) entry;
 
irte->fields.vector = vector;
irte->fields.destination = dest_apicid;
-   modify_irte(devid, index, irte);
+   modify_irte(iommu, devid, index, irte);
 }
 
-static void irte_ga_set_affinity(void *entry, u16 devid, u16 index,
+static void irte_ga_set_affinity(struct amd_iommu *iommu, void *entry, u16 
devid, u16 index,
 u8 vector, u32 

[PATCH v3 RESEND 17/35] iommu/amd: Introduce struct amd_ir_data.iommu

2022-07-06 Thread Vasant Hegde via iommu
From: Suravee Suthikulpanit 

Add a pointer to struct amd_iommu to amd_ir_data structure, which
can be used to correlate interrupt remapping data to a per-PCI-segment
interrupt remapping table.

Co-developed-by: Vasant Hegde 
Signed-off-by: Vasant Hegde 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu_types.h |  1 +
 drivers/iommu/amd/iommu.c   | 34 +
 2 files changed, 16 insertions(+), 19 deletions(-)
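
A minimal sketch of why caching the pointer helps: once the allocation
path stores the owning IOMMU in the per-interrupt data, teardown and
affinity paths no longer need any devid-to-IOMMU lookup. Types below are
simplified stand-ins, not the driver's.

#include <stdint.h>
#include <stdio.h>

struct iommu_model { int index; };   /* stand-in for struct amd_iommu */

struct ir_data_model {
        struct iommu_model *iommu;   /* cached when the IRQ is allocated */
        uint16_t devid;
        uint16_t index;
};

static void free_irte_model(struct ir_data_model *data)
{
        /* No global rlookup needed here any more. */
        printf("IOMMU %d: free IRTE %d of devid 0x%04x\n",
               data->iommu->index, data->index, data->devid);
}

int main(void)
{
        struct iommu_model iommu0 = { .index = 0 };
        struct ir_data_model data = { .iommu = &iommu0, .devid = 0x00a0, .index = 5 };

        free_irte_model(&data);
        return 0;
}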

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index ca1a3d55cc83..693926afdd0f 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -989,6 +989,7 @@ struct irq_2_irte {
 
 struct amd_ir_data {
u32 cached_ga_tag;
+   struct amd_iommu *iommu;
struct irq_2_irte irq_2_irte;
struct msi_msg msi_entry;
void *entry;/* Pointer to union irte or struct irte_ga */
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 5e4648cadff9..9f373b164762 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3002,16 +3002,11 @@ static int modify_irte(u16 devid, int index, union irte 
*irte)
return 0;
 }
 
-static void free_irte(u16 devid, int index)
+static void free_irte(struct amd_iommu *iommu, u16 devid, int index)
 {
struct irq_remap_table *table;
-   struct amd_iommu *iommu;
unsigned long flags;
 
-   iommu = amd_iommu_rlookup_table[devid];
-   if (iommu == NULL)
-   return;
-
table = get_irq_table(iommu, devid);
if (!table)
return;
@@ -3195,7 +3190,7 @@ static void irq_remapping_prepare_irte(struct amd_ir_data 
*data,
   int devid, int index, int sub_handle)
 {
struct irq_2_irte *irte_info = >irq_2_irte;
-   struct amd_iommu *iommu = amd_iommu_rlookup_table[devid];
+   struct amd_iommu *iommu = data->iommu;
 
if (!iommu)
return;
@@ -3336,6 +3331,7 @@ static int irq_remapping_alloc(struct irq_domain *domain, 
unsigned int virq,
goto out_free_data;
}
 
+   data->iommu = iommu;
irq_data->hwirq = (devid << 16) + i;
irq_data->chip_data = data;
irq_data->chip = _ir_chip;
@@ -3352,7 +3348,7 @@ static int irq_remapping_alloc(struct irq_domain *domain, 
unsigned int virq,
kfree(irq_data->chip_data);
}
for (i = 0; i < nr_irqs; i++)
-   free_irte(devid, index + i);
+   free_irte(iommu, devid, index + i);
 out_free_parent:
irq_domain_free_irqs_common(domain, virq, nr_irqs);
return ret;
@@ -3371,7 +3367,7 @@ static void irq_remapping_free(struct irq_domain *domain, 
unsigned int virq,
if (irq_data && irq_data->chip_data) {
data = irq_data->chip_data;
irte_info = >irq_2_irte;
-   free_irte(irte_info->devid, irte_info->index);
+   free_irte(data->iommu, irte_info->devid, 
irte_info->index);
kfree(data->entry);
kfree(data);
}
@@ -3389,7 +3385,7 @@ static int irq_remapping_activate(struct irq_domain 
*domain,
 {
struct amd_ir_data *data = irq_data->chip_data;
struct irq_2_irte *irte_info = >irq_2_irte;
-   struct amd_iommu *iommu = amd_iommu_rlookup_table[irte_info->devid];
+   struct amd_iommu *iommu = data->iommu;
struct irq_cfg *cfg = irqd_cfg(irq_data);
 
if (!iommu)
@@ -3406,7 +3402,7 @@ static void irq_remapping_deactivate(struct irq_domain 
*domain,
 {
struct amd_ir_data *data = irq_data->chip_data;
struct irq_2_irte *irte_info = >irq_2_irte;
-   struct amd_iommu *iommu = amd_iommu_rlookup_table[irte_info->devid];
+   struct amd_iommu *iommu = data->iommu;
 
if (iommu)
iommu->irte_ops->deactivate(data->entry, irte_info->devid,
@@ -3502,12 +3498,16 @@ EXPORT_SYMBOL(amd_iommu_deactivate_guest_mode);
 static int amd_ir_set_vcpu_affinity(struct irq_data *data, void *vcpu_info)
 {
int ret;
-   struct amd_iommu *iommu;
struct amd_iommu_pi_data *pi_data = vcpu_info;
struct vcpu_data *vcpu_pi_info = pi_data->vcpu_data;
struct amd_ir_data *ir_data = data->chip_data;
struct irq_2_irte *irte_info = _data->irq_2_irte;
-   struct iommu_dev_data *dev_data = search_dev_data(NULL, 
irte_info->devid);
+   struct iommu_dev_data *dev_data;
+
+   if (ir_data->iommu == NULL)
+   return -EINVAL;
+
+   dev_data = search_dev_data(ir_data->iommu, irte_info->devid);
 
/* Note:
 * This device has never been set up for guest mode.
@@ -3529,10 +3529,6 @@ static int amd_ir_set_vcpu_affinity(struct irq_data 
*data, void *vcpu_info)
pi_data->is_guest_mode = false;
 

[PATCH v3 RESEND 16/35] iommu/amd: Update irq_remapping_alloc to use IOMMU lookup helper function

2022-07-06 Thread Vasant Hegde via iommu
From: Suravee Suthikulpanit 

Update irq_remapping_alloc() to allow IOMMU rlookup using both the PCI
segment and the device ID.

Co-developed-by: Vasant Hegde 
Signed-off-by: Vasant Hegde 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/iommu.c | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)
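
A quick sketch of the sbdf split used in this patch (matching the bit
layout of the PCI_SBDF_TO_SEGID/PCI_SBDF_TO_DEVID macros introduced
earlier in the series): the upper 16 bits carry the segment, the lower
16 bits the BDF. The macro names here are local to the sketch.

#include <stdint.h>
#include <stdio.h>

#define SBDF_TO_SEGID(sbdf)     (((sbdf) >> 16) & 0xffff)
#define SBDF_TO_DEVID(sbdf)     ((sbdf) & 0xffff)

int main(void)
{
        uint32_t sbdf = (0x0001u << 16) | 0x00a0;   /* segment 1, devid 00:14.0 */

        printf("seg=0x%04x devid=0x%04x\n",
               SBDF_TO_SEGID(sbdf), SBDF_TO_DEVID(sbdf));
        /*
         * irq_remapping_alloc() then does the equivalent of:
         *     iommu = __rlookup_amd_iommu(seg, devid);
         *     if (!iommu)
         *             return -EINVAL;
         * so requests for a device on an unknown segment fail early.
         */
        return 0;
}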

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 19db4d54c337..5e4648cadff9 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3246,8 +3246,9 @@ static int irq_remapping_alloc(struct irq_domain *domain, 
unsigned int virq,
struct irq_alloc_info *info = arg;
struct irq_data *irq_data;
struct amd_ir_data *data = NULL;
+   struct amd_iommu *iommu;
struct irq_cfg *cfg;
-   int i, ret, devid;
+   int i, ret, devid, seg, sbdf;
int index;
 
if (!info)
@@ -3263,8 +3264,14 @@ static int irq_remapping_alloc(struct irq_domain 
*domain, unsigned int virq,
if (info->type == X86_IRQ_ALLOC_TYPE_PCI_MSI)
info->flags &= ~X86_IRQ_ALLOC_CONTIGUOUS_VECTORS;
 
-   devid = get_devid(info);
-   if (devid < 0)
+   sbdf = get_devid(info);
+   if (sbdf < 0)
+   return -EINVAL;
+
+   seg = PCI_SBDF_TO_SEGID(sbdf);
+   devid = PCI_SBDF_TO_DEVID(sbdf);
+   iommu = __rlookup_amd_iommu(seg, devid);
+   if (!iommu)
return -EINVAL;
 
ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, arg);
@@ -3273,7 +3280,6 @@ static int irq_remapping_alloc(struct irq_domain *domain, 
unsigned int virq,
 
if (info->type == X86_IRQ_ALLOC_TYPE_IOAPIC) {
struct irq_remap_table *table;
-   struct amd_iommu *iommu;
 
table = alloc_irq_table(devid, NULL);
if (table) {
@@ -3283,7 +3289,6 @@ static int irq_remapping_alloc(struct irq_domain *domain, 
unsigned int virq,
 * interrupts.
 */
table->min_index = 32;
-   iommu = amd_iommu_rlookup_table[devid];
for (i = 0; i < 32; ++i)
iommu->irte_ops->set_allocated(table, 
i);
}
-- 
2.31.1



[PATCH v3 RESEND 15/35] iommu/amd: Convert to use rlookup_amd_iommu helper function

2022-07-06 Thread Vasant Hegde via iommu
From: Suravee Suthikulpanit 

Use the rlookup_amd_iommu() helper function, which finds the responsible
IOMMU through the per-PCI-segment rlookup_table.

Signed-off-by: Suravee Suthikulpanit 
Signed-off-by: Vasant Hegde 
---
 drivers/iommu/amd/iommu.c | 64 +++
 1 file changed, 38 insertions(+), 26 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index cfecd072e7a6..19db4d54c337 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -229,13 +229,17 @@ static struct iommu_dev_data *search_dev_data(struct 
amd_iommu *iommu, u16 devid
 
 static int clone_alias(struct pci_dev *pdev, u16 alias, void *data)
 {
+   struct amd_iommu *iommu;
u16 devid = pci_dev_id(pdev);
 
if (devid == alias)
return 0;
 
-   amd_iommu_rlookup_table[alias] =
-   amd_iommu_rlookup_table[devid];
+   iommu = rlookup_amd_iommu(>dev);
+   if (!iommu)
+   return 0;
+
+   amd_iommu_set_rlookup_table(iommu, alias);
memcpy(amd_iommu_dev_table[alias].data,
   amd_iommu_dev_table[devid].data,
   sizeof(amd_iommu_dev_table[alias].data));
@@ -366,7 +370,7 @@ static bool check_device(struct device *dev)
if (devid > amd_iommu_last_bdf)
return false;
 
-   if (amd_iommu_rlookup_table[devid] == NULL)
+   if (rlookup_amd_iommu(dev) == NULL)
return false;
 
return true;
@@ -1270,7 +1274,9 @@ static int device_flush_iotlb(struct iommu_dev_data 
*dev_data,
int qdep;
 
qdep = dev_data->ats.qdep;
-   iommu= amd_iommu_rlookup_table[dev_data->devid];
+   iommu= rlookup_amd_iommu(dev_data->dev);
+   if (!iommu)
+   return -EINVAL;
 
build_inv_iotlb_pages(, dev_data->devid, qdep, address, size);
 
@@ -1295,7 +1301,9 @@ static int device_flush_dte(struct iommu_dev_data 
*dev_data)
u16 alias;
int ret;
 
-   iommu = amd_iommu_rlookup_table[dev_data->devid];
+   iommu = rlookup_amd_iommu(dev_data->dev);
+   if (!iommu)
+   return -EINVAL;
 
if (dev_is_pci(dev_data->dev))
pdev = to_pci_dev(dev_data->dev);
@@ -1525,8 +1533,8 @@ static void free_gcr3_table(struct protection_domain 
*domain)
free_page((unsigned long)domain->gcr3_tbl);
 }
 
-static void set_dte_entry(u16 devid, struct protection_domain *domain,
- bool ats, bool ppr)
+static void set_dte_entry(struct amd_iommu *iommu, u16 devid,
+ struct protection_domain *domain, bool ats, bool ppr)
 {
u64 pte_root = 0;
u64 flags = 0;
@@ -1545,8 +1553,6 @@ static void set_dte_entry(u16 devid, struct 
protection_domain *domain,
flags |= DTE_FLAG_IOTLB;
 
if (ppr) {
-   struct amd_iommu *iommu = amd_iommu_rlookup_table[devid];
-
if (iommu_feature(iommu, FEATURE_EPHSUP))
pte_root |= 1ULL << DEV_ENTRY_PPR;
}
@@ -1590,8 +1596,6 @@ static void set_dte_entry(u16 devid, struct 
protection_domain *domain,
 * entries for the old domain ID that is being overwritten
 */
if (old_domid) {
-   struct amd_iommu *iommu = amd_iommu_rlookup_table[devid];
-
amd_iommu_flush_tlb_domid(iommu, old_domid);
}
 }
@@ -1611,7 +1615,9 @@ static void do_attach(struct iommu_dev_data *dev_data,
struct amd_iommu *iommu;
bool ats;
 
-   iommu = amd_iommu_rlookup_table[dev_data->devid];
+   iommu = rlookup_amd_iommu(dev_data->dev);
+   if (!iommu)
+   return;
ats   = dev_data->ats.enabled;
 
/* Update data structures */
@@ -1623,7 +1629,7 @@ static void do_attach(struct iommu_dev_data *dev_data,
domain->dev_cnt += 1;
 
/* Update device table */
-   set_dte_entry(dev_data->devid, domain,
+   set_dte_entry(iommu, dev_data->devid, domain,
  ats, dev_data->iommu_v2);
clone_aliases(iommu, dev_data->dev);
 
@@ -1635,7 +1641,9 @@ static void do_detach(struct iommu_dev_data *dev_data)
struct protection_domain *domain = dev_data->domain;
struct amd_iommu *iommu;
 
-   iommu = amd_iommu_rlookup_table[dev_data->devid];
+   iommu = rlookup_amd_iommu(dev_data->dev);
+   if (!iommu)
+   return;
 
/* Update data structures */
dev_data->domain = NULL;
@@ -1813,13 +1821,14 @@ static struct iommu_device 
*amd_iommu_probe_device(struct device *dev)
 {
struct iommu_device *iommu_dev;
struct amd_iommu *iommu;
-   int ret, devid;
+   int ret;
 
if (!check_device(dev))
return ERR_PTR(-ENODEV);
 
-   devid = get_device_id(dev);
-   iommu = amd_iommu_rlookup_table[devid];
+   iommu = rlookup_amd_iommu(dev);
+   if (!iommu)
+   return ERR_PTR(-ENODEV);
 
if 

[PATCH v3 RESEND 14/35] iommu/amd: Convert to use per PCI segment irq_lookup_table

2022-07-06 Thread Vasant Hegde via iommu
Convert the driver to use the per-PCI-segment irq_lookup_table, then
remove the global irq_lookup_table.

Co-developed-by: Suravee Suthikulpanit 
Signed-off-by: Suravee Suthikulpanit 
Signed-off-by: Vasant Hegde 
---
 drivers/iommu/amd/amd_iommu_types.h |  2 --
 drivers/iommu/amd/init.c| 19 ---
 drivers/iommu/amd/iommu.c   | 36 ++---
 3 files changed, 23 insertions(+), 34 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 8d2d5fbdb57f..ca1a3d55cc83 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -445,8 +445,6 @@ struct irq_remap_table {
u32 *table;
 };
 
-extern struct irq_remap_table **irq_lookup_table;
-
 /* Interrupt remapping feature used? */
 extern bool amd_iommu_irq_remap;
 
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index afe3bff5bce0..b7b50345c8a5 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -206,12 +206,6 @@ u16 *amd_iommu_alias_table;
  */
 struct amd_iommu **amd_iommu_rlookup_table;
 
-/*
- * This table is used to find the irq remapping table for a given device id
- * quickly.
- */
-struct irq_remap_table **irq_lookup_table;
-
 /*
  * AMD IOMMU allows up to 2^16 different protection domains. This is a bitmap
  * to know which ones are already in use.
@@ -2786,11 +2780,6 @@ static struct syscore_ops amd_iommu_syscore_ops = {
 
 static void __init free_iommu_resources(void)
 {
-   kmemleak_free(irq_lookup_table);
-   free_pages((unsigned long)irq_lookup_table,
-  get_order(rlookup_table_size));
-   irq_lookup_table = NULL;
-
kmem_cache_destroy(amd_iommu_irq_cache);
amd_iommu_irq_cache = NULL;
 
@@ -3011,14 +3000,6 @@ static int __init early_amd_iommu_init(void)
if (alloc_irq_lookup_table(pci_seg))
goto out;
}
-
-   irq_lookup_table = (void *)__get_free_pages(
-   GFP_KERNEL | __GFP_ZERO,
-   get_order(rlookup_table_size));
-   kmemleak_alloc(irq_lookup_table, rlookup_table_size,
-  1, GFP_KERNEL);
-   if (!irq_lookup_table)
-   goto out;
}
 
ret = init_memory_definitions(ivrs_base);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 53ccee57a7a0..cfecd072e7a6 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2732,16 +2732,18 @@ static void set_dte_irq_entry(u16 devid, struct 
irq_remap_table *table)
amd_iommu_dev_table[devid].data[2] = dte;
 }
 
-static struct irq_remap_table *get_irq_table(u16 devid)
+static struct irq_remap_table *get_irq_table(struct amd_iommu *iommu, u16 
devid)
 {
struct irq_remap_table *table;
+   struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg;
 
if (WARN_ONCE(!amd_iommu_rlookup_table[devid],
  "%s: no iommu for devid %x\n", __func__, devid))
return NULL;
 
-   table = irq_lookup_table[devid];
-   if (WARN_ONCE(!table, "%s: no table for devid %x\n", __func__, devid))
+   table = pci_seg->irq_lookup_table[devid];
+   if (WARN_ONCE(!table, "%s: no table for devid %x:%x\n",
+ __func__, pci_seg->id, devid))
return NULL;
 
return table;
@@ -2774,7 +2776,9 @@ static struct irq_remap_table *__alloc_irq_table(void)
 static void set_remap_table_entry(struct amd_iommu *iommu, u16 devid,
  struct irq_remap_table *table)
 {
-   irq_lookup_table[devid] = table;
+   struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg;
+
+   pci_seg->irq_lookup_table[devid] = table;
set_dte_irq_entry(devid, table);
iommu_flush_dte(iommu, devid);
 }
@@ -2783,8 +2787,14 @@ static int set_remap_table_entry_alias(struct pci_dev 
*pdev, u16 alias,
   void *data)
 {
struct irq_remap_table *table = data;
+   struct amd_iommu_pci_seg *pci_seg;
+   struct amd_iommu *iommu = rlookup_amd_iommu(>dev);
 
-   irq_lookup_table[alias] = table;
+   if (!iommu)
+   return -EINVAL;
+
+   pci_seg = iommu->pci_seg;
+   pci_seg->irq_lookup_table[alias] = table;
set_dte_irq_entry(alias, table);
 
iommu_flush_dte(amd_iommu_rlookup_table[alias], alias);
@@ -2808,12 +2818,12 @@ static struct irq_remap_table *alloc_irq_table(u16 
devid, struct pci_dev *pdev)
goto out_unlock;
 
pci_seg = iommu->pci_seg;
-   table = irq_lookup_table[devid];
+   table = pci_seg->irq_lookup_table[devid];
if (table)
goto out_unlock;
 
alias = pci_seg->alias_table[devid];
-   table = irq_lookup_table[alias];
+   table = pci_seg->irq_lookup_table[alias];
if (table) {
set_remap_table_entry(iommu, devid, table);
  

[PATCH v3 RESEND 13/35] iommu/amd: Introduce per PCI segment rlookup table size

2022-07-06 Thread Vasant Hegde via iommu
It will replace global "rlookup_table_size" variable.

Co-developed-by: Suravee Suthikulpanit 
Signed-off-by: Suravee Suthikulpanit 
Signed-off-by: Vasant Hegde 
---
 drivers/iommu/amd/amd_iommu_types.h |  3 +++
 drivers/iommu/amd/init.c| 11 ++-
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 8638b1107dd2..8d2d5fbdb57f 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -561,6 +561,9 @@ struct amd_iommu_pci_seg {
/* Size of the alias table */
u32 alias_table_size;
 
+   /* Size of the rlookup table */
+   u32 rlookup_table_size;
+
/*
 * device table virtual address
 *
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 22a632397818..afe3bff5bce0 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -672,7 +672,7 @@ static inline int __init alloc_rlookup_table(struct 
amd_iommu_pci_seg *pci_seg)
 {
pci_seg->rlookup_table = (void *)__get_free_pages(
GFP_KERNEL | __GFP_ZERO,
-   get_order(rlookup_table_size));
+   
get_order(pci_seg->rlookup_table_size));
if (pci_seg->rlookup_table == NULL)
return -ENOMEM;
 
@@ -682,7 +682,7 @@ static inline int __init alloc_rlookup_table(struct 
amd_iommu_pci_seg *pci_seg)
 static inline void free_rlookup_table(struct amd_iommu_pci_seg *pci_seg)
 {
free_pages((unsigned long)pci_seg->rlookup_table,
-  get_order(rlookup_table_size));
+  get_order(pci_seg->rlookup_table_size));
pci_seg->rlookup_table = NULL;
 }
 
@@ -690,9 +690,9 @@ static inline int __init alloc_irq_lookup_table(struct 
amd_iommu_pci_seg *pci_se
 {
pci_seg->irq_lookup_table = (void *)__get_free_pages(
 GFP_KERNEL | __GFP_ZERO,
-get_order(rlookup_table_size));
+
get_order(pci_seg->rlookup_table_size));
kmemleak_alloc(pci_seg->irq_lookup_table,
-  rlookup_table_size, 1, GFP_KERNEL);
+  pci_seg->rlookup_table_size, 1, GFP_KERNEL);
if (pci_seg->irq_lookup_table == NULL)
return -ENOMEM;
 
@@ -703,7 +703,7 @@ static inline void free_irq_lookup_table(struct 
amd_iommu_pci_seg *pci_seg)
 {
kmemleak_free(pci_seg->irq_lookup_table);
free_pages((unsigned long)pci_seg->irq_lookup_table,
-  get_order(rlookup_table_size));
+  get_order(pci_seg->rlookup_table_size));
pci_seg->irq_lookup_table = NULL;
 }
 
@@ -1584,6 +1584,7 @@ static struct amd_iommu_pci_seg *__init 
alloc_pci_segment(u16 id,
DUMP_printk("PCI segment : 0x%0x, last bdf : 0x%04x\n", id, last_bdf);
pci_seg->dev_table_size = tbl_size(DEV_TABLE_ENTRY_SIZE);
pci_seg->alias_table_size   = tbl_size(ALIAS_TABLE_ENTRY_SIZE);
+   pci_seg->rlookup_table_size = tbl_size(RLOOKUP_TABLE_ENTRY_SIZE);
 
pci_seg->id = id;
init_llist_head(_seg->dev_data_list);
-- 
2.31.1



Re: [PATCH v2 1/4] dt-bindings: qcom-iommu: Add Qualcomm MSM8953 compatible

2022-07-06 Thread Will Deacon
On Sun, Jun 12, 2022 at 11:22:13AM +0200, Luca Weiss wrote:
> Document the compatible used for IOMMU on the msm8953 SoC.
> 
> Signed-off-by: Luca Weiss 
> ---
> Changes from v1:
> - new patch
> 
>  Documentation/devicetree/bindings/iommu/qcom,iommu.txt | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt 
> b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
> index 059139abce35..e6cecfd360eb 100644
> --- a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
> +++ b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
> @@ -10,6 +10,7 @@ to non-secure vs secure interrupt line.
>  - compatible   : Should be one of:
>  
>  "qcom,msm8916-iommu"
> +"qcom,msm8953-iommu"

I'm assuming Andy or Bjorn will pick this up.

Will


[PATCH v3 RESEND 12/35] iommu/amd: Introduce per PCI segment alias table size

2022-07-06 Thread Vasant Hegde via iommu
It will replace global "alias_table_size" variable.

Co-developed-by: Suravee Suthikulpanit 
Signed-off-by: Suravee Suthikulpanit 
Signed-off-by: Vasant Hegde 
---
 drivers/iommu/amd/amd_iommu_types.h | 3 +++
 drivers/iommu/amd/init.c| 5 +++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 1dbe9c7f973d..8638b1107dd2 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -558,6 +558,9 @@ struct amd_iommu_pci_seg {
/* Size of the device table */
u32 dev_table_size;
 
+   /* Size of the alias table */
+   u32 alias_table_size;
+
/*
 * device table virtual address
 *
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 4a1807f7a8b9..22a632397818 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -712,7 +712,7 @@ static int __init alloc_alias_table(struct 
amd_iommu_pci_seg *pci_seg)
int i;
 
pci_seg->alias_table = (void *)__get_free_pages(GFP_KERNEL,
-   
get_order(alias_table_size));
+   get_order(pci_seg->alias_table_size));
if (!pci_seg->alias_table)
return -ENOMEM;
 
@@ -728,7 +728,7 @@ static int __init alloc_alias_table(struct 
amd_iommu_pci_seg *pci_seg)
 static void __init free_alias_table(struct amd_iommu_pci_seg *pci_seg)
 {
free_pages((unsigned long)pci_seg->alias_table,
-  get_order(alias_table_size));
+  get_order(pci_seg->alias_table_size));
pci_seg->alias_table = NULL;
 }
 
@@ -1583,6 +1583,7 @@ static struct amd_iommu_pci_seg *__init 
alloc_pci_segment(u16 id,
pci_seg->last_bdf = last_bdf;
DUMP_printk("PCI segment : 0x%0x, last bdf : 0x%04x\n", id, last_bdf);
pci_seg->dev_table_size = tbl_size(DEV_TABLE_ENTRY_SIZE);
+   pci_seg->alias_table_size   = tbl_size(ALIAS_TABLE_ENTRY_SIZE);
 
pci_seg->id = id;
init_llist_head(_seg->dev_data_list);
-- 
2.31.1



[PATCH v3 RESEND 11/35] iommu/amd: Introduce per PCI segment device table size

2022-07-06 Thread Vasant Hegde via iommu
With multiple PCI segment support, the number of BDFs supported by each
segment may differ. Hence introduce a per-segment device table size that
depends on last_bdf. This will replace the global "device_table_size"
variable.

Co-developed-by: Suravee Suthikulpanit 
Signed-off-by: Suravee Suthikulpanit 
Signed-off-by: Vasant Hegde 
---
 drivers/iommu/amd/amd_iommu_types.h |  3 +++
 drivers/iommu/amd/init.c| 18 ++
 2 files changed, 13 insertions(+), 8 deletions(-)
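
To make the sizing concrete, a user-space sketch of the arithmetic: the
table needs (last_bdf + 1) entries of DEV_TABLE_ENTRY_SIZE bytes (a DTE
is 256 bits), rounded up to whole pages. The real driver rounds via
get_order() and __get_free_pages(); the rounding below is simplified.

#include <stdio.h>
#include <stddef.h>

#define PAGE_SIZE               4096u
#define DEV_TABLE_ENTRY_SIZE    32u     /* one 256-bit device table entry */

static size_t dev_table_bytes(unsigned int last_bdf)
{
        size_t bytes = ((size_t)last_bdf + 1) * DEV_TABLE_ENTRY_SIZE;

        /* round up to a whole number of pages */
        return (bytes + PAGE_SIZE - 1) / PAGE_SIZE * PAGE_SIZE;
}

int main(void)
{
        /* A segment whose largest BDF is small needs a much smaller
         * device table than one using the full 16-bit BDF space. */
        printf("last_bdf 0x00ff -> %zu bytes\n", dev_table_bytes(0x00ff));
        printf("last_bdf 0xffff -> %zu bytes\n", dev_table_bytes(0xffff));
        return 0;
}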

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 8be8f3d6b44a..1dbe9c7f973d 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -555,6 +555,9 @@ struct amd_iommu_pci_seg {
/* Largest PCI device id we expect translation requests for */
u16 last_bdf;
 
+   /* Size of the device table */
+   u32 dev_table_size;
+
/*
 * device table virtual address
 *
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 73554ee9c3b3..4a1807f7a8b9 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -416,6 +416,7 @@ static void iommu_set_cwwb_range(struct amd_iommu *iommu)
 static void iommu_set_device_table(struct amd_iommu *iommu)
 {
u64 entry;
+   u32 dev_table_size = iommu->pci_seg->dev_table_size;
 
BUG_ON(iommu->mmio_base == NULL);
 
@@ -652,7 +653,7 @@ static int __init find_last_devid_acpi(struct 
acpi_table_header *table, u16 pci_
 static inline int __init alloc_dev_table(struct amd_iommu_pci_seg *pci_seg)
 {
pci_seg->dev_table = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO | 
GFP_DMA32,
- 
get_order(dev_table_size));
+ 
get_order(pci_seg->dev_table_size));
if (!pci_seg->dev_table)
return -ENOMEM;
 
@@ -662,7 +663,7 @@ static inline int __init alloc_dev_table(struct 
amd_iommu_pci_seg *pci_seg)
 static inline void free_dev_table(struct amd_iommu_pci_seg *pci_seg)
 {
free_pages((unsigned long)pci_seg->dev_table,
-   get_order(dev_table_size));
+   get_order(pci_seg->dev_table_size));
pci_seg->dev_table = NULL;
 }
 
@@ -1035,7 +1036,7 @@ static bool __copy_device_table(struct amd_iommu *iommu)
entry = (((u64) hi) << 32) + lo;
 
old_devtb_size = ((entry & ~PAGE_MASK) + 1) << 12;
-   if (old_devtb_size != dev_table_size) {
+   if (old_devtb_size != pci_seg->dev_table_size) {
pr_err("The device table size of IOMMU:%d is not expected!\n",
iommu->index);
return false;
@@ -1054,15 +1055,15 @@ static bool __copy_device_table(struct amd_iommu *iommu)
}
old_devtb = (cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT) && 
is_kdump_kernel())
? (__force void *)ioremap_encrypted(old_devtb_phys,
-   dev_table_size)
-   : memremap(old_devtb_phys, dev_table_size, MEMREMAP_WB);
+   pci_seg->dev_table_size)
+   : memremap(old_devtb_phys, pci_seg->dev_table_size, 
MEMREMAP_WB);
 
if (!old_devtb)
return false;
 
gfp_flag = GFP_KERNEL | __GFP_ZERO | GFP_DMA32;
pci_seg->old_dev_tbl_cpy = (void *)__get_free_pages(gfp_flag,
-   get_order(dev_table_size));
+   
get_order(pci_seg->dev_table_size));
if (pci_seg->old_dev_tbl_cpy == NULL) {
pr_err("Failed to allocate memory for copying old device 
table!\n");
memunmap(old_devtb);
@@ -1581,6 +1582,7 @@ static struct amd_iommu_pci_seg *__init 
alloc_pci_segment(u16 id,
 
pci_seg->last_bdf = last_bdf;
DUMP_printk("PCI segment : 0x%0x, last bdf : 0x%04x\n", id, last_bdf);
+   pci_seg->dev_table_size = tbl_size(DEV_TABLE_ENTRY_SIZE);
 
pci_seg->id = id;
init_llist_head(_seg->dev_data_list);
@@ -2675,7 +2677,7 @@ static void early_enable_iommus(void)
for_each_pci_segment(pci_seg) {
if (pci_seg->old_dev_tbl_cpy != NULL) {
free_pages((unsigned 
long)pci_seg->old_dev_tbl_cpy,
-   get_order(dev_table_size));
+   
get_order(pci_seg->dev_table_size));
pci_seg->old_dev_tbl_cpy = NULL;
}
}
@@ -2689,7 +2691,7 @@ static void early_enable_iommus(void)
 
for_each_pci_segment(pci_seg) {
free_pages((unsigned long)pci_seg->dev_table,
-  get_order(dev_table_size));
+  

[PATCH v3 RESEND 10/35] iommu/amd: Introduce per PCI segment last_bdf

2022-07-06 Thread Vasant Hegde via iommu
Current code uses the global "amd_iommu_last_bdf" to track the last bdf
supported by the system. This value is used for various memory
allocations, device data flushing, etc.

Introduce a per-PCI-segment last_bdf which will be used to track the last
bdf supported by the given PCI segment and use this value for all
per-segment memory allocations. Eventually it will replace the global
"amd_iommu_last_bdf".

Co-developed-by: Suravee Suthikulpanit 
Signed-off-by: Suravee Suthikulpanit 
Signed-off-by: Vasant Hegde 
---
 drivers/iommu/amd/amd_iommu_types.h |  3 ++
 drivers/iommu/amd/init.c| 69 ++---
 2 files changed, 45 insertions(+), 27 deletions(-)
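
A condensed user-space model of the per-segment scan added here: walk the
IVHD device entries for one segment and remember the largest device id,
with DEV_ALL short-circuiting to the full BDF range and a negative return
signalling "no usable entry". The entry layout and names are simplified
stand-ins for the real ivhd_entry parsing.

#include <stdint.h>
#include <stdio.h>

enum { DEV_ALL, DEV_SELECT, DEV_RANGE_END, DEV_ALIAS, DEV_EXT_SELECT };

struct ivhd_entry_model {
        int type;
        uint16_t devid;
};

static int find_last_devid(const struct ivhd_entry_model *e, size_t n)
{
        int last_devid = -1;
        size_t i;

        for (i = 0; i < n; i++) {
                switch (e[i].type) {
                case DEV_ALL:
                        return 0xffff;  /* DEV_ALL covers the whole BDF space */
                case DEV_SELECT:
                case DEV_RANGE_END:
                case DEV_ALIAS:
                case DEV_EXT_SELECT:
                        if (e[i].devid > last_devid)
                                last_devid = e[i].devid;
                        break;
                default:
                        break;
                }
        }
        return last_devid;      /* negative: no device entry found */
}

int main(void)
{
        struct ivhd_entry_model entries[] = {
                { DEV_SELECT, 0x00a0 },
                { DEV_RANGE_END, 0x3fff },
        };

        printf("last_bdf = 0x%04x\n",
               find_last_devid(entries, sizeof(entries) / sizeof(entries[0])));
        return 0;
}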

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 3099a018cef0..8be8f3d6b44a 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -552,6 +552,9 @@ struct amd_iommu_pci_seg {
/* PCI segment number */
u16 id;
 
+   /* Largest PCI device id we expect translation requests for */
+   u16 last_bdf;
+
/*
 * device table virtual address
 *
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 39d04d4143fb..73554ee9c3b3 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -552,6 +552,7 @@ static int __init find_last_devid_from_ivhd(struct 
ivhd_header *h)
 {
u8 *p = (void *)h, *end = (void *)h;
struct ivhd_entry *dev;
+   int last_devid = -EINVAL;
 
u32 ivhd_size = get_ivhd_header_size(h);
 
@@ -569,13 +570,15 @@ static int __init find_last_devid_from_ivhd(struct 
ivhd_header *h)
case IVHD_DEV_ALL:
/* Use maximum BDF value for DEV_ALL */
update_last_devid(0x);
-   break;
+   return 0x;
case IVHD_DEV_SELECT:
case IVHD_DEV_RANGE_END:
case IVHD_DEV_ALIAS:
case IVHD_DEV_EXT_SELECT:
/* all the above subfield types refer to device ids */
update_last_devid(dev->devid);
+   if (dev->devid > last_devid)
+   last_devid = dev->devid;
break;
default:
break;
@@ -585,7 +588,7 @@ static int __init find_last_devid_from_ivhd(struct 
ivhd_header *h)
 
WARN_ON(p != end);
 
-   return 0;
+   return last_devid;
 }
 
 static int __init check_ivrs_checksum(struct acpi_table_header *table)
@@ -609,27 +612,31 @@ static int __init check_ivrs_checksum(struct 
acpi_table_header *table)
  * id which we need to handle. This is the first of three functions which parse
  * the ACPI table. So we check the checksum here.
  */
-static int __init find_last_devid_acpi(struct acpi_table_header *table)
+static int __init find_last_devid_acpi(struct acpi_table_header *table, u16 
pci_seg)
 {
u8 *p = (u8 *)table, *end = (u8 *)table;
struct ivhd_header *h;
+   int last_devid, last_bdf = 0;
 
p += IVRS_HEADER_LENGTH;
 
end += table->length;
while (p < end) {
h = (struct ivhd_header *)p;
-   if (h->type == amd_iommu_target_ivhd_type) {
-   int ret = find_last_devid_from_ivhd(h);
-
-   if (ret)
-   return ret;
+   if (h->pci_seg == pci_seg &&
+   h->type == amd_iommu_target_ivhd_type) {
+   last_devid = find_last_devid_from_ivhd(h);
+
+   if (last_devid < 0)
+   return -EINVAL;
+   if (last_devid > last_bdf)
+   last_bdf = last_devid;
}
p += h->length;
}
WARN_ON(p != end);
 
-   return 0;
+   return last_bdf;
 }
 
 /
@@ -1553,14 +1560,28 @@ static int __init init_iommu_from_acpi(struct amd_iommu 
*iommu,
 }
 
 /* Allocate PCI segment data structure */
-static struct amd_iommu_pci_seg *__init alloc_pci_segment(u16 id)
+static struct amd_iommu_pci_seg *__init alloc_pci_segment(u16 id,
+ struct acpi_table_header *ivrs_base)
 {
struct amd_iommu_pci_seg *pci_seg;
+   int last_bdf;
+
+   /*
+* First parse ACPI tables to find the largest Bus/Dev/Func we need to
+* handle in this PCI segment. Upon this information the shared data
+* structures for the PCI segments in the system will be allocated.
+*/
+   last_bdf = find_last_devid_acpi(ivrs_base, id);
+   if (last_bdf < 0)
+   return NULL;
 
pci_seg = kzalloc(sizeof(struct amd_iommu_pci_seg), GFP_KERNEL);
if (pci_seg == NULL)
return NULL;
 
+   pci_seg->last_bdf = 

[PATCH v3 RESEND 09/35] iommu/amd: Introduce per PCI segment unity map list

2022-07-06 Thread Vasant Hegde via iommu
Newer AMD systems can support multiple PCI segments. In order to support
them, the IVMD table in the IVRS structure is enhanced to include a PCI
segment id. Update the ivmd_header structure to include "pci_seg".

Also introduce a per-PCI-segment unity map list. It will replace the
global amd_iommu_unity_map list.

Note that the "reserved" field in the IVMD table, which was previously
set to zero, is reused to carry the "pci_seg" id. This takes care of
backward compatibility (a new kernel will work fine on older systems).

Co-developed-by: Suravee Suthikulpanit 
Signed-off-by: Suravee Suthikulpanit 
Signed-off-by: Vasant Hegde 
---
 drivers/iommu/amd/amd_iommu_types.h | 13 +++--
 drivers/iommu/amd/init.c| 30 +++--
 drivers/iommu/amd/iommu.c   |  8 +++-
 3 files changed, 34 insertions(+), 17 deletions(-)
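
To show the backward-compatibility point concretely, a compile-time
sketch of the header layout: splitting the old 8-byte reserved field into
a u16 pci_seg plus 6 reserved bytes leaves every other field at the same
offset and the structure at the same size, so tables written by older
firmware (pci_seg bytes all zero) parse unchanged. The struct name is
local to the sketch; the field order mirrors the patch.

#include <stdint.h>
#include <assert.h>
#include <stdio.h>

struct ivmd_header_model {
        uint8_t  type;
        uint8_t  flags;
        uint16_t length;
        uint16_t devid;
        uint16_t aux;
        uint16_t pci_seg;       /* first two bytes of the old "resv" u64 */
        uint8_t  resv[6];       /* remaining six reserved bytes */
        uint64_t range_start;
        uint64_t range_length;
} __attribute__((packed));

int main(void)
{
        /* 1+1+2+2+2+2+6+8+8 == 32, exactly the old layout's size. */
        static_assert(sizeof(struct ivmd_header_model) == 32,
                      "IVMD header size must not change");
        printf("sizeof(ivmd_header) = %zu\n", sizeof(struct ivmd_header_model));
        return 0;
}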

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index c9dd0ab37475..3099a018cef0 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -587,6 +587,13 @@ struct amd_iommu_pci_seg {
 * More than one device can share the same requestor id.
 */
u16 *alias_table;
+
+   /*
+* A list of required unity mappings we find in ACPI. It is not locked
+* because as runtime it is only read. It is created at ACPI table
+* parsing time.
+*/
+   struct list_head unity_map;
 };
 
 /*
@@ -813,12 +820,6 @@ struct unity_map_entry {
int prot;
 };
 
-/*
- * List of all unity mappings. It is not locked because as runtime it is only
- * read. It is created at ACPI table parsing time.
- */
-extern struct list_head amd_iommu_unity_map;
-
 /*
  * Data structures for device handling
  */
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 80e7eef4260f..39d04d4143fb 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -141,7 +141,8 @@ struct ivmd_header {
u16 length;
u16 devid;
u16 aux;
-   u64 resv;
+   u16 pci_seg;
+   u8  resv[6];
u64 range_start;
u64 range_length;
 } __attribute__((packed));
@@ -161,8 +162,6 @@ static int amd_iommu_target_ivhd_type;
 
 u16 amd_iommu_last_bdf;/* largest PCI device id we have
   to handle */
-LIST_HEAD(amd_iommu_unity_map);/* a list of required unity 
mappings
-  we find in ACPI */
 
 LIST_HEAD(amd_iommu_pci_seg_list); /* list of all PCI segments */
 LIST_HEAD(amd_iommu_list); /* list of all AMD IOMMUs in the
@@ -1564,6 +1563,7 @@ static struct amd_iommu_pci_seg *__init 
alloc_pci_segment(u16 id)
 
pci_seg->id = id;
init_llist_head(_seg->dev_data_list);
+   INIT_LIST_HEAD(_seg->unity_map);
list_add_tail(_seg->list, _iommu_pci_seg_list);
 
if (alloc_dev_table(pci_seg))
@@ -2398,10 +2398,13 @@ static int iommu_init_irq(struct amd_iommu *iommu)
 static void __init free_unity_maps(void)
 {
struct unity_map_entry *entry, *next;
+   struct amd_iommu_pci_seg *p, *pci_seg;
 
-   list_for_each_entry_safe(entry, next, _iommu_unity_map, list) {
-   list_del(>list);
-   kfree(entry);
+   for_each_pci_segment_safe(pci_seg, p) {
+   list_for_each_entry_safe(entry, next, _seg->unity_map, 
list) {
+   list_del(>list);
+   kfree(entry);
+   }
}
 }
 
@@ -2409,8 +2412,13 @@ static void __init free_unity_maps(void)
 static int __init init_unity_map_range(struct ivmd_header *m)
 {
struct unity_map_entry *e = NULL;
+   struct amd_iommu_pci_seg *pci_seg;
char *s;
 
+   pci_seg = get_pci_segment(m->pci_seg);
+   if (pci_seg == NULL)
+   return -ENOMEM;
+
e = kzalloc(sizeof(*e), GFP_KERNEL);
if (e == NULL)
return -ENOMEM;
@@ -2448,14 +2456,16 @@ static int __init init_unity_map_range(struct 
ivmd_header *m)
if (m->flags & IVMD_FLAG_EXCL_RANGE)
e->prot = (IVMD_FLAG_IW | IVMD_FLAG_IR) >> 1;
 
-   DUMP_printk("%s devid_start: %02x:%02x.%x devid_end: %02x:%02x.%x"
-   " range_start: %016llx range_end: %016llx flags: %x\n", s,
+   DUMP_printk("%s devid_start: %04x:%02x:%02x.%x devid_end: "
+   "%04x:%02x:%02x.%x range_start: %016llx range_end: %016llx"
+   " flags: %x\n", s, m->pci_seg,
PCI_BUS_NUM(e->devid_start), PCI_SLOT(e->devid_start),
-   PCI_FUNC(e->devid_start), PCI_BUS_NUM(e->devid_end),
+   PCI_FUNC(e->devid_start), m->pci_seg,
+   PCI_BUS_NUM(e->devid_end),
PCI_SLOT(e->devid_end), PCI_FUNC(e->devid_end),
e->address_start, e->address_end, m->flags);
 
-   list_add_tail(>list, _iommu_unity_map);
+   

[PATCH v3 RESEND 08/35] iommu/amd: Introduce per PCI segment alias_table

2022-07-06 Thread Vasant Hegde via iommu
From: Suravee Suthikulpanit 

This will replace the global alias table (amd_iommu_alias_table).

Co-developed-by: Vasant Hegde 
Signed-off-by: Vasant Hegde 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu_types.h |  7 +
 drivers/iommu/amd/init.c| 41 ++---
 drivers/iommu/amd/iommu.c   | 41 ++---
 3 files changed, 64 insertions(+), 25 deletions(-)
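
A small user-space sketch of the per-segment alias table semantics: entry
i holds the requestor id the IOMMU actually sees for PCI device id i, and
until firmware alias entries say otherwise every id aliases to itself.
Function and variable names are illustrative only.

#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>

static uint16_t *alloc_alias_table(unsigned int last_bdf)
{
        unsigned int i, n = last_bdf + 1;
        uint16_t *alias = malloc(n * sizeof(*alias));

        if (!alias)
                return NULL;
        for (i = 0; i < n; i++)
                alias[i] = (uint16_t)i;         /* identity by default */
        return alias;
}

int main(void)
{
        uint16_t *alias = alloc_alias_table(0x00ff);

        if (!alias)
                return 1;
        alias[0x00a8] = 0x00a0;   /* e.g. a bridge masquerading requests */
        printf("devid 0x00a8 -> requestor 0x%04x\n", alias[0x00a8]);
        free(alias);
        return 0;
}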

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 3ef68d588cc7..c9dd0ab37475 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -580,6 +580,13 @@ struct amd_iommu_pci_seg {
 * will be copied to. It's only be used in kdump kernel.
 */
struct dev_table_entry *old_dev_tbl_cpy;
+
+   /*
+* The alias table is a driver specific data structure which contains 
the
+* mappings of the PCI device ids to the actual requestor ids on the 
IOMMU.
+* More than one device can share the same requestor id.
+*/
+   u16 *alias_table;
 };
 
 /*
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index f188130cc173..80e7eef4260f 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -700,6 +700,31 @@ static inline void free_irq_lookup_table(struct 
amd_iommu_pci_seg *pci_seg)
pci_seg->irq_lookup_table = NULL;
 }
 
+static int __init alloc_alias_table(struct amd_iommu_pci_seg *pci_seg)
+{
+   int i;
+
+   pci_seg->alias_table = (void *)__get_free_pages(GFP_KERNEL,
+   
get_order(alias_table_size));
+   if (!pci_seg->alias_table)
+   return -ENOMEM;
+
+   /*
+* let all alias entries point to itself
+*/
+   for (i = 0; i <= amd_iommu_last_bdf; ++i)
+   pci_seg->alias_table[i] = i;
+
+   return 0;
+}
+
+static void __init free_alias_table(struct amd_iommu_pci_seg *pci_seg)
+{
+   free_pages((unsigned long)pci_seg->alias_table,
+  get_order(alias_table_size));
+   pci_seg->alias_table = NULL;
+}
+
 /*
  * Allocates the command buffer. This buffer is per AMD IOMMU. We can
  * write commands to that buffer later and the IOMMU will execute them
@@ -1268,6 +1293,7 @@ static int __init init_iommu_from_acpi(struct amd_iommu 
*iommu,
u32 dev_i, ext_flags = 0;
bool alias = false;
struct ivhd_entry *e;
+   struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg;
u32 ivhd_size;
int ret;
 
@@ -1349,7 +1375,7 @@ static int __init init_iommu_from_acpi(struct amd_iommu 
*iommu,
devid_to = e->ext >> 8;
set_dev_entry_from_acpi(iommu, devid   , e->flags, 0);
set_dev_entry_from_acpi(iommu, devid_to, e->flags, 0);
-   amd_iommu_alias_table[devid] = devid_to;
+   pci_seg->alias_table[devid] = devid_to;
break;
case IVHD_DEV_ALIAS_RANGE:
 
@@ -1407,7 +1433,7 @@ static int __init init_iommu_from_acpi(struct amd_iommu 
*iommu,
devid = e->devid;
for (dev_i = devid_start; dev_i <= devid; ++dev_i) {
if (alias) {
-   amd_iommu_alias_table[dev_i] = devid_to;
+   pci_seg->alias_table[dev_i] = devid_to;
set_dev_entry_from_acpi(iommu,
devid_to, flags, ext_flags);
}
@@ -1542,6 +1568,8 @@ static struct amd_iommu_pci_seg *__init 
alloc_pci_segment(u16 id)
 
if (alloc_dev_table(pci_seg))
return NULL;
+   if (alloc_alias_table(pci_seg))
+   return NULL;
if (alloc_rlookup_table(pci_seg))
return NULL;
 
@@ -1568,6 +1596,7 @@ static void __init free_pci_segments(void)
list_del(_seg->list);
free_irq_lookup_table(pci_seg);
free_rlookup_table(pci_seg);
+   free_alias_table(pci_seg);
free_dev_table(pci_seg);
kfree(pci_seg);
}
@@ -2839,7 +2868,7 @@ static void __init ivinfo_init(void *ivrs)
 static int __init early_amd_iommu_init(void)
 {
struct acpi_table_header *ivrs_base;
-   int i, remap_cache_sz, ret;
+   int remap_cache_sz, ret;
acpi_status status;
 
if (!amd_iommu_detected)
@@ -2910,12 +2939,6 @@ static int __init early_amd_iommu_init(void)
if (amd_iommu_pd_alloc_bitmap == NULL)
goto out;
 
-   /*
-* let all alias entries point to itself
-*/
-   for (i = 0; i <= amd_iommu_last_bdf; ++i)
-   amd_iommu_alias_table[i] = i;
-
/*
 * never allocate domain 0 because its used as the non-allocated and
   

[PATCH v3 RESEND 07/35] iommu/amd: Introduce per PCI segment old_dev_tbl_cpy

2022-07-06 Thread Vasant Hegde via iommu
From: Suravee Suthikulpanit 

It will remove the global old_dev_tbl_cpy. Also update copy_device_table()
to copy the device table for all PCI segments.

Co-developed-by: Vasant Hegde 
Signed-off-by: Vasant Hegde 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu_types.h |   6 ++
 drivers/iommu/amd/init.c| 109 
 2 files changed, 70 insertions(+), 45 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 5f3cc704f131..3ef68d588cc7 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -574,6 +574,12 @@ struct amd_iommu_pci_seg {
 * device id quickly.
 */
struct irq_remap_table **irq_lookup_table;
+
+   /*
+* Pointer to a device table which the content of old device table
+* will be copied to. It's only be used in kdump kernel.
+*/
+   struct dev_table_entry *old_dev_tbl_cpy;
 };
 
 /*
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 509655f86851..f188130cc173 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -193,11 +193,6 @@ bool amd_iommu_force_isolation __read_mostly;
  * page table root pointer.
  */
 struct dev_table_entry *amd_iommu_dev_table;
-/*
- * Pointer to a device table which the content of old device table
- * will be copied to. It's only be used in kdump kernel.
- */
-static struct dev_table_entry *old_dev_tbl_cpy;
 
 /*
  * The alias table is a driver specific data structure which contains the
@@ -992,39 +987,27 @@ static int get_dev_entry_bit(u16 devid, u8 bit)
 }
 
 
-static bool copy_device_table(void)
+static bool __copy_device_table(struct amd_iommu *iommu)
 {
-   u64 int_ctl, int_tab_len, entry = 0, last_entry = 0;
+   u64 int_ctl, int_tab_len, entry = 0;
+   struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg;
struct dev_table_entry *old_devtb = NULL;
u32 lo, hi, devid, old_devtb_size;
phys_addr_t old_devtb_phys;
-   struct amd_iommu *iommu;
u16 dom_id, dte_v, irq_v;
gfp_t gfp_flag;
u64 tmp;
 
-   if (!amd_iommu_pre_enabled)
-   return false;
-
-   pr_warn("Translation is already enabled - trying to copy translation 
structures\n");
-   for_each_iommu(iommu) {
-   /* All IOMMUs should use the same device table with the same 
size */
-   lo = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET);
-   hi = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET + 4);
-   entry = (((u64) hi) << 32) + lo;
-   if (last_entry && last_entry != entry) {
-   pr_err("IOMMU:%d should use the same dev table as 
others!\n",
-   iommu->index);
-   return false;
-   }
-   last_entry = entry;
+   /* Each IOMMU use separate device table with the same size */
+   lo = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET);
+   hi = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET + 4);
+   entry = (((u64) hi) << 32) + lo;
 
-   old_devtb_size = ((entry & ~PAGE_MASK) + 1) << 12;
-   if (old_devtb_size != dev_table_size) {
-   pr_err("The device table size of IOMMU:%d is not 
expected!\n",
-   iommu->index);
-   return false;
-   }
+   old_devtb_size = ((entry & ~PAGE_MASK) + 1) << 12;
+   if (old_devtb_size != dev_table_size) {
+   pr_err("The device table size of IOMMU:%d is not expected!\n",
+   iommu->index);
+   return false;
}
 
/*
@@ -1047,31 +1030,31 @@ static bool copy_device_table(void)
return false;
 
gfp_flag = GFP_KERNEL | __GFP_ZERO | GFP_DMA32;
-   old_dev_tbl_cpy = (void *)__get_free_pages(gfp_flag,
-   get_order(dev_table_size));
-   if (old_dev_tbl_cpy == NULL) {
+   pci_seg->old_dev_tbl_cpy = (void *)__get_free_pages(gfp_flag,
+   get_order(dev_table_size));
+   if (pci_seg->old_dev_tbl_cpy == NULL) {
pr_err("Failed to allocate memory for copying old device 
table!\n");
memunmap(old_devtb);
return false;
}
 
for (devid = 0; devid <= amd_iommu_last_bdf; ++devid) {
-   old_dev_tbl_cpy[devid] = old_devtb[devid];
+   pci_seg->old_dev_tbl_cpy[devid] = old_devtb[devid];
dom_id = old_devtb[devid].data[1] & DEV_DOMID_MASK;
dte_v = old_devtb[devid].data[0] & DTE_FLAG_V;
 
if (dte_v && dom_id) {
-   old_dev_tbl_cpy[devid].data[0] = 
old_devtb[devid].data[0];
-   old_dev_tbl_cpy[devid].data[1] = 
old_devtb[devid].data[1];
+   

[PATCH v3 RESEND 06/35] iommu/amd: Introduce per PCI segment dev_data_list

2022-07-06 Thread Vasant Hegde via iommu
This will replace the global dev_data_list.

Co-developed-by: Suravee Suthikulpanit 
Signed-off-by: Suravee Suthikulpanit 
Signed-off-by: Vasant Hegde 
---
 drivers/iommu/amd/amd_iommu_types.h |  3 +++
 drivers/iommu/amd/init.c|  1 +
 drivers/iommu/amd/iommu.c   | 21 ++---
 3 files changed, 14 insertions(+), 11 deletions(-)
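
A user-space model of the scoping change: each PCI segment owns its own
dev_data list head, so a search only walks devices of that segment. The
driver uses the lock-free llist API (llist_add/llist_for_each_entry); the
plain singly-linked list below is only for brevity, and all names are
illustrative.

#include <stdint.h>
#include <stdio.h>

struct dev_data_model {
        struct dev_data_model *next;
        uint16_t devid;
};

struct pci_seg_model {
        uint16_t id;
        struct dev_data_model *dev_data_list;   /* per-segment list head */
};

static struct dev_data_model *search_dev_data(struct pci_seg_model *seg,
                                              uint16_t devid)
{
        struct dev_data_model *d;

        for (d = seg->dev_data_list; d; d = d->next)
                if (d->devid == devid)
                        return d;
        return NULL;
}

int main(void)
{
        struct pci_seg_model seg0 = { .id = 0, .dev_data_list = NULL };
        struct dev_data_model dev = { .next = NULL, .devid = 0x00a0 };

        dev.next = seg0.dev_data_list;          /* push front */
        seg0.dev_data_list = &dev;
        printf("devid 0x00a0 on segment %d: %s\n", seg0.id,
               search_dev_data(&seg0, 0x00a0) ? "found" : "missing");
        return 0;
}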

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index cfb5f0e44186..5f3cc704f131 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -546,6 +546,9 @@ struct amd_iommu_pci_seg {
/* List with all PCI segments in the system */
struct list_head list;
 
+   /* List of all available dev_data structures */
+   struct llist_head dev_data_list;
+
/* PCI segment number */
u16 id;
 
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index f6678dd56e28..509655f86851 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -1527,6 +1527,7 @@ static struct amd_iommu_pci_seg *__init 
alloc_pci_segment(u16 id)
return NULL;
 
pci_seg->id = id;
+   init_llist_head(_seg->dev_data_list);
list_add_tail(_seg->list, _iommu_pci_seg_list);
 
if (alloc_dev_table(pci_seg))
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index b0262b2e749d..48275da7fcb0 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -62,9 +62,6 @@
 
 static DEFINE_SPINLOCK(pd_bitmap_lock);
 
-/* List of all available dev_data structures */
-static LLIST_HEAD(dev_data_list);
-
 LIST_HEAD(ioapic_map);
 LIST_HEAD(hpet_map);
 LIST_HEAD(acpihid_map);
@@ -195,9 +192,10 @@ static struct protection_domain *to_pdomain(struct 
iommu_domain *dom)
return container_of(dom, struct protection_domain, domain);
 }
 
-static struct iommu_dev_data *alloc_dev_data(u16 devid)
+static struct iommu_dev_data *alloc_dev_data(struct amd_iommu *iommu, u16 
devid)
 {
struct iommu_dev_data *dev_data;
+   struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg;
 
dev_data = kzalloc(sizeof(*dev_data), GFP_KERNEL);
if (!dev_data)
@@ -207,19 +205,20 @@ static struct iommu_dev_data *alloc_dev_data(u16 devid)
dev_data->devid = devid;
ratelimit_default_init(_data->rs);
 
-   llist_add(_data->dev_data_list, _data_list);
+   llist_add(_data->dev_data_list, _seg->dev_data_list);
return dev_data;
 }
 
-static struct iommu_dev_data *search_dev_data(u16 devid)
+static struct iommu_dev_data *search_dev_data(struct amd_iommu *iommu, u16 
devid)
 {
struct iommu_dev_data *dev_data;
struct llist_node *node;
+   struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg;
 
-   if (llist_empty(_data_list))
+   if (llist_empty(_seg->dev_data_list))
return NULL;
 
-   node = dev_data_list.first;
+   node = pci_seg->dev_data_list.first;
llist_for_each_entry(dev_data, node, dev_data_list) {
if (dev_data->devid == devid)
return dev_data;
@@ -288,10 +287,10 @@ static struct iommu_dev_data *find_dev_data(u16 devid)
struct iommu_dev_data *dev_data;
struct amd_iommu *iommu = amd_iommu_rlookup_table[devid];
 
-   dev_data = search_dev_data(devid);
+   dev_data = search_dev_data(iommu, devid);
 
if (dev_data == NULL) {
-   dev_data = alloc_dev_data(devid);
+   dev_data = alloc_dev_data(iommu, devid);
if (!dev_data)
return NULL;
 
@@ -3466,7 +3465,7 @@ static int amd_ir_set_vcpu_affinity(struct irq_data 
*data, void *vcpu_info)
struct vcpu_data *vcpu_pi_info = pi_data->vcpu_data;
struct amd_ir_data *ir_data = data->chip_data;
struct irq_2_irte *irte_info = _data->irq_2_irte;
-   struct iommu_dev_data *dev_data = search_dev_data(irte_info->devid);
+   struct iommu_dev_data *dev_data = search_dev_data(NULL, 
irte_info->devid);
 
/* Note:
 * This device has never been set up for guest mode.
-- 
2.31.1



[PATCH v3 RESEND 05/35] iommu/amd: Introduce per PCI segment irq_lookup_table

2022-07-06 Thread Vasant Hegde via iommu
This will replace the global irq lookup table (irq_lookup_table).

Co-developed-by: Suravee Suthikulpanit 
Signed-off-by: Suravee Suthikulpanit 
Signed-off-by: Vasant Hegde 
---
 drivers/iommu/amd/amd_iommu_types.h |  6 ++
 drivers/iommu/amd/init.c| 27 +++
 2 files changed, 33 insertions(+)

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index d0ee78c656ff..cfb5f0e44186 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -565,6 +565,12 @@ struct amd_iommu_pci_seg {
 * device id.
 */
struct amd_iommu **rlookup_table;
+
+   /*
+* This table is used to find the irq remapping table for a given
+* device id quickly.
+*/
+   struct irq_remap_table **irq_lookup_table;
 };
 
 /*
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 2fb3e1b82e09..f6678dd56e28 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -684,6 +684,26 @@ static inline void free_rlookup_table(struct 
amd_iommu_pci_seg *pci_seg)
pci_seg->rlookup_table = NULL;
 }
 
+static inline int __init alloc_irq_lookup_table(struct amd_iommu_pci_seg 
*pci_seg)
+{
+   pci_seg->irq_lookup_table = (void *)__get_free_pages(
+GFP_KERNEL | __GFP_ZERO,
+get_order(rlookup_table_size));
+   kmemleak_alloc(pci_seg->irq_lookup_table,
+  rlookup_table_size, 1, GFP_KERNEL);
+   if (pci_seg->irq_lookup_table == NULL)
+   return -ENOMEM;
+
+   return 0;
+}
+
+static inline void free_irq_lookup_table(struct amd_iommu_pci_seg *pci_seg)
+{
+   kmemleak_free(pci_seg->irq_lookup_table);
+   free_pages((unsigned long)pci_seg->irq_lookup_table,
+  get_order(rlookup_table_size));
+   pci_seg->irq_lookup_table = NULL;
+}
 
 /*
  * Allocates the command buffer. This buffer is per AMD IOMMU. We can
@@ -1535,6 +1555,7 @@ static void __init free_pci_segments(void)
 
for_each_pci_segment_safe(pci_seg, next) {
list_del(_seg->list);
+   free_irq_lookup_table(pci_seg);
free_rlookup_table(pci_seg);
free_dev_table(pci_seg);
kfree(pci_seg);
@@ -2897,6 +2918,7 @@ static int __init early_amd_iommu_init(void)
amd_iommu_irq_remap = check_ioapic_information();
 
if (amd_iommu_irq_remap) {
+   struct amd_iommu_pci_seg *pci_seg;
/*
 * Interrupt remapping enabled, create kmem_cache for the
 * remapping tables.
@@ -2913,6 +2935,11 @@ static int __init early_amd_iommu_init(void)
if (!amd_iommu_irq_cache)
goto out;
 
+   for_each_pci_segment(pci_seg) {
+   if (alloc_irq_lookup_table(pci_seg))
+   goto out;
+   }
+
irq_lookup_table = (void *)__get_free_pages(
GFP_KERNEL | __GFP_ZERO,
get_order(rlookup_table_size));
-- 
2.31.1



[PATCH v3 RESEND 04/35] iommu/amd: Introduce per PCI segment rlookup table

2022-07-06 Thread Vasant Hegde via iommu
From: Suravee Suthikulpanit 

This will replace the global rlookup table (amd_iommu_rlookup_table).
Add helper functions to set/get the rlookup table for a given device.
Also add macros to extract the segment id and device id from an sbdf value.

Co-developed-by: Vasant Hegde 
Signed-off-by: Vasant Hegde 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h   |  1 +
 drivers/iommu/amd/amd_iommu_types.h | 11 
 drivers/iommu/amd/init.c| 23 +++
 drivers/iommu/amd/iommu.c   | 44 +
 4 files changed, 79 insertions(+)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 885570cd0d77..2947239700ce 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -19,6 +19,7 @@ extern int amd_iommu_init_devices(void);
 extern void amd_iommu_uninit_devices(void);
 extern void amd_iommu_init_notifier(void);
 extern int amd_iommu_init_api(void);
+extern void amd_iommu_set_rlookup_table(struct amd_iommu *iommu, u16 devid);
 
 #ifdef CONFIG_AMD_IOMMU_DEBUGFS
 void amd_iommu_debugfs_setup(struct amd_iommu *iommu);
diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 422ea87ae4c7..d0ee78c656ff 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -456,6 +456,9 @@ extern bool amdr_ivrs_remap_support;
 /* kmem_cache to get tables with 128 byte alignement */
 extern struct kmem_cache *amd_iommu_irq_cache;
 
+#define PCI_SBDF_TO_SEGID(sbdf)(((sbdf) >> 16) & 0xffff)
+#define PCI_SBDF_TO_DEVID(sbdf)((sbdf) & 0xffff)
+
 /* Make iterating over all pci segment easier */
 #define for_each_pci_segment(pci_seg) \
list_for_each_entry((pci_seg), &amd_iommu_pci_seg_list, list)
@@ -490,6 +493,7 @@ struct amd_iommu_fault {
 };
 
 
+struct amd_iommu;
 struct iommu_domain;
 struct irq_domain;
 struct amd_irte_ops;
@@ -554,6 +558,13 @@ struct amd_iommu_pci_seg {
 * page table root pointer.
 */
struct dev_table_entry *dev_table;
+
+   /*
+* The rlookup iommu table is used to find the IOMMU which is
+* responsible for a specific device. It is indexed by the PCI
+* device id.
+*/
+   struct amd_iommu **rlookup_table;
 };
 
 /*
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 5152243593bf..2fb3e1b82e09 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -665,6 +665,26 @@ static inline void free_dev_table(struct amd_iommu_pci_seg 
*pci_seg)
pci_seg->dev_table = NULL;
 }
 
+/* Allocate per PCI segment IOMMU rlookup table. */
+static inline int __init alloc_rlookup_table(struct amd_iommu_pci_seg *pci_seg)
+{
+   pci_seg->rlookup_table = (void *)__get_free_pages(
+   GFP_KERNEL | __GFP_ZERO,
+   get_order(rlookup_table_size));
+   if (pci_seg->rlookup_table == NULL)
+   return -ENOMEM;
+
+   return 0;
+}
+
+static inline void free_rlookup_table(struct amd_iommu_pci_seg *pci_seg)
+{
+   free_pages((unsigned long)pci_seg->rlookup_table,
+  get_order(rlookup_table_size));
+   pci_seg->rlookup_table = NULL;
+}
+
+
 /*
  * Allocates the command buffer. This buffer is per AMD IOMMU. We can
  * write commands to that buffer later and the IOMMU will execute them
@@ -1491,6 +1511,8 @@ static struct amd_iommu_pci_seg *__init 
alloc_pci_segment(u16 id)
 
if (alloc_dev_table(pci_seg))
return NULL;
+   if (alloc_rlookup_table(pci_seg))
+   return NULL;
 
return pci_seg;
 }
@@ -1513,6 +1535,7 @@ static void __init free_pci_segments(void)
 
for_each_pci_segment_safe(pci_seg, next) {
list_del(&pci_seg->list);
+   free_rlookup_table(pci_seg);
free_dev_table(pci_seg);
kfree(pci_seg);
}
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index ac8f81f527b4..b0262b2e749d 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -146,6 +146,50 @@ struct dev_table_entry *get_dev_table(struct amd_iommu 
*iommu)
return dev_table;
 }
 
+static inline u16 get_device_segment(struct device *dev)
+{
+   u16 seg;
+
+   if (dev_is_pci(dev)) {
+   struct pci_dev *pdev = to_pci_dev(dev);
+
+   seg = pci_domain_nr(pdev->bus);
+   } else {
+   u32 devid = get_acpihid_device_id(dev, NULL);
+
+   seg = PCI_SBDF_TO_SEGID(devid);
+   }
+
+   return seg;
+}
+
+/* Writes the specific IOMMU for a device into the PCI segment rlookup table */
+void amd_iommu_set_rlookup_table(struct amd_iommu *iommu, u16 devid)
+{
+   struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg;
+
+   pci_seg->rlookup_table[devid] = iommu;
+}
+
+static struct amd_iommu *__rlookup_amd_iommu(u16 seg, u16 devid)
+{
+   struct 

[PATCH v3 RESEND 03/35] iommu/amd: Introduce per PCI segment device table

2022-07-06 Thread Vasant Hegde via iommu
From: Suravee Suthikulpanit 

Introduce a per PCI segment device table. All IOMMUs within the segment
will share this device table. This will replace the global device
table, i.e. amd_iommu_dev_table.

Also introduce a helper function to get the device table for a given IOMMU.

Co-developed-by: Vasant Hegde 
Signed-off-by: Vasant Hegde 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h   |  1 +
 drivers/iommu/amd/amd_iommu_types.h | 10 ++
 drivers/iommu/amd/init.c| 26 --
 drivers/iommu/amd/iommu.c   | 12 
 4 files changed, 47 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 1ab31074f5b3..885570cd0d77 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -128,4 +128,5 @@ static inline void amd_iommu_apply_ivrs_quirks(void) { }
 
 extern void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
 u64 *root, int mode);
+extern struct dev_table_entry *get_dev_table(struct amd_iommu *iommu);
 #endif
diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 2243b1a22d78..422ea87ae4c7 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -544,6 +544,16 @@ struct amd_iommu_pci_seg {
 
/* PCI segment number */
u16 id;
+
+   /*
+* device table virtual address
+*
+* Pointer to the per PCI segment device table.
+* It is indexed by the PCI device id or the HT unit id and contains
+* information about the domain the device belongs to as well as the
+* page table root pointer.
+*/
+   struct dev_table_entry *dev_table;
 };
 
 /*
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index c1b5d530dbf3..5152243593bf 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -642,11 +642,29 @@ static int __init find_last_devid_acpi(struct 
acpi_table_header *table)
  *
  * The following functions belong to the code path which parses the ACPI table
  * the second time. In this ACPI parsing iteration we allocate IOMMU specific
- * data structures, initialize the device/alias/rlookup table and also
- * basically initialize the hardware.
+ * data structures, initialize the per PCI segment device/alias/rlookup table
+ * and also basically initialize the hardware.
  *
  /
 
+/* Allocate per PCI segment device table */
+static inline int __init alloc_dev_table(struct amd_iommu_pci_seg *pci_seg)
+{
+   pci_seg->dev_table = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO | 
GFP_DMA32,
+ 
get_order(dev_table_size));
+   if (!pci_seg->dev_table)
+   return -ENOMEM;
+
+   return 0;
+}
+
+static inline void free_dev_table(struct amd_iommu_pci_seg *pci_seg)
+{
+   free_pages((unsigned long)pci_seg->dev_table,
+   get_order(dev_table_size));
+   pci_seg->dev_table = NULL;
+}
+
 /*
  * Allocates the command buffer. This buffer is per AMD IOMMU. We can
  * write commands to that buffer later and the IOMMU will execute them
@@ -1471,6 +1489,9 @@ static struct amd_iommu_pci_seg *__init 
alloc_pci_segment(u16 id)
pci_seg->id = id;
list_add_tail(&pci_seg->list, &amd_iommu_pci_seg_list);
 
+   if (alloc_dev_table(pci_seg))
+   return NULL;
+
return pci_seg;
 }
 
@@ -1492,6 +1513,7 @@ static void __init free_pci_segments(void)
 
for_each_pci_segment_safe(pci_seg, next) {
list_del(&pci_seg->list);
+   free_dev_table(pci_seg);
kfree(pci_seg);
}
 }
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index efa8af5a9419..ac8f81f527b4 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -134,6 +134,18 @@ static inline int get_device_id(struct device *dev)
return devid;
 }
 
+struct dev_table_entry *get_dev_table(struct amd_iommu *iommu)
+{
+   struct dev_table_entry *dev_table;
+   struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg;
+
+   BUG_ON(pci_seg == NULL);
+   dev_table = pci_seg->dev_table;
+   BUG_ON(dev_table == NULL);
+
+   return dev_table;
+}
+
 static struct protection_domain *to_pdomain(struct iommu_domain *dom)
 {
return container_of(dom, struct protection_domain, domain);
-- 
2.31.1



[PATCH v3 RESEND 02/35] iommu/amd: Introduce pci segment structure

2022-07-06 Thread Vasant Hegde via iommu
Newer AMD systems can support multiple PCI segments, where each segment
contains one or more IOMMU instances. However, an IOMMU instance can only
support a single PCI segment.

The current code assumes that the system contains only one PCI segment
(segment 0) and creates global data structures such as the device table,
rlookup table, etc.

Introduce a per PCI segment data structure, which contains the
segment-specific data structures. This will eventually replace the global
data structures.

Also update the `amd_iommu->pci_seg` variable to point to the PCI segment
structure instead of the PCI segment ID.

Co-developed-by: Suravee Suthikulpanit 
Signed-off-by: Suravee Suthikulpanit 
Signed-off-by: Vasant Hegde 
---
 drivers/iommu/amd/amd_iommu_types.h | 24 ++-
 drivers/iommu/amd/init.c| 46 -
 2 files changed, 68 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 9b563f850f1d..2243b1a22d78 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -456,6 +456,11 @@ extern bool amdr_ivrs_remap_support;
 /* kmem_cache to get tables with 128 byte alignement */
 extern struct kmem_cache *amd_iommu_irq_cache;
 
+/* Make iterating over all pci segment easier */
+#define for_each_pci_segment(pci_seg) \
+   list_for_each_entry((pci_seg), &amd_iommu_pci_seg_list, list)
+#define for_each_pci_segment_safe(pci_seg, next) \
+   list_for_each_entry_safe((pci_seg), (next), &amd_iommu_pci_seg_list, list)
 /*
  * Make iterating over all IOMMUs easier
  */
@@ -530,6 +535,17 @@ struct protection_domain {
unsigned dev_iommu[MAX_IOMMUS]; /* per-IOMMU reference count */
 };
 
+/*
+ * This structure contains information about one PCI segment in the system.
+ */
+struct amd_iommu_pci_seg {
+   /* List with all PCI segments in the system */
+   struct list_head list;
+
+   /* PCI segment number */
+   u16 id;
+};
+
 /*
  * Structure where we save information about one hardware AMD IOMMU in the
  * system.
@@ -581,7 +597,7 @@ struct amd_iommu {
u16 cap_ptr;
 
/* pci domain of this IOMMU */
-   u16 pci_seg;
+   struct amd_iommu_pci_seg *pci_seg;
 
/* start of exclusion range of that IOMMU */
u64 exclusion_start;
@@ -709,6 +725,12 @@ extern struct list_head ioapic_map;
 extern struct list_head hpet_map;
 extern struct list_head acpihid_map;
 
+/*
+ * List with all PCI segments in the system. This list is not locked because
+ * it is only written at driver initialization time
+ */
+extern struct list_head amd_iommu_pci_seg_list;
+
 /*
  * List with all IOMMUs in the system. This list is not locked because it is
  * only written and read at driver initialization or suspend time
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 1d08f87e734b..c1b5d530dbf3 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -164,6 +164,7 @@ u16 amd_iommu_last_bdf; /* largest PCI 
device id we have
 LIST_HEAD(amd_iommu_unity_map);/* a list of required unity 
mappings
   we find in ACPI */
 
+LIST_HEAD(amd_iommu_pci_seg_list); /* list of all PCI segments */
 LIST_HEAD(amd_iommu_list); /* list of all AMD IOMMUs in the
   system */
 
@@ -1458,6 +1459,43 @@ static int __init init_iommu_from_acpi(struct amd_iommu 
*iommu,
return 0;
 }
 
+/* Allocate PCI segment data structure */
+static struct amd_iommu_pci_seg *__init alloc_pci_segment(u16 id)
+{
+   struct amd_iommu_pci_seg *pci_seg;
+
+   pci_seg = kzalloc(sizeof(struct amd_iommu_pci_seg), GFP_KERNEL);
+   if (pci_seg == NULL)
+   return NULL;
+
+   pci_seg->id = id;
+   list_add_tail(&pci_seg->list, &amd_iommu_pci_seg_list);
+
+   return pci_seg;
+}
+
+static struct amd_iommu_pci_seg *__init get_pci_segment(u16 id)
+{
+   struct amd_iommu_pci_seg *pci_seg;
+
+   for_each_pci_segment(pci_seg) {
+   if (pci_seg->id == id)
+   return pci_seg;
+   }
+
+   return alloc_pci_segment(id);
+}
+
+static void __init free_pci_segments(void)
+{
+   struct amd_iommu_pci_seg *pci_seg, *next;
+
+   for_each_pci_segment_safe(pci_seg, next) {
+   list_del(&pci_seg->list);
+   kfree(pci_seg);
+   }
+}
+
 static void __init free_iommu_one(struct amd_iommu *iommu)
 {
free_cwwb_sem(iommu);
@@ -1544,8 +1582,14 @@ static void amd_iommu_ats_write_check_workaround(struct 
amd_iommu *iommu)
  */
 static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header 
*h)
 {
+   struct amd_iommu_pci_seg *pci_seg;
int ret;
 
+   pci_seg = get_pci_segment(h->pci_seg);
+   if (pci_seg == NULL)
+   return -ENOMEM;
+   iommu->pci_seg = pci_seg;
+
raw_spin_lock_init(&iommu->lock);
iommu->cmd_sem_val = 0;
 

[PATCH v3 RESEND 01/35] iommu/amd: Update struct iommu_dev_data definition

2022-07-06 Thread Vasant Hegde via iommu
struct iommu_dev_data contains the member "pdev" pointing to a pci_dev. This
is only valid for PCI devices; for other devices it will be NULL. This
causes unnecessary "pdev != NULL" checks in various places.

Replace the "struct pci_dev" member with "struct device" and use to_pci_dev()
to get the PCI device reference as needed. Also adjust the setup_aliases()
and clone_aliases() functions.

No functional change intended.

Co-developed-by: Suravee Suthikulpanit 
Signed-off-by: Suravee Suthikulpanit 
Signed-off-by: Vasant Hegde 
---
 drivers/iommu/amd/amd_iommu_types.h |  2 +-
 drivers/iommu/amd/iommu.c   | 32 +
 2 files changed, 20 insertions(+), 14 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 72d0f5e2f651..9b563f850f1d 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -689,7 +689,7 @@ struct iommu_dev_data {
struct list_head list;/* For domain->dev_list */
struct llist_node dev_data_list;  /* For global dev_data_list */
struct protection_domain *domain; /* Domain the device is bound to */
-   struct pci_dev *pdev;
+   struct device *dev;
u16 devid;/* PCI Device ID */
bool iommu_v2;/* Device can make use of IOMMUv2 */
struct {
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 840831d5d2ad..efa8af5a9419 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -188,10 +188,13 @@ static int clone_alias(struct pci_dev *pdev, u16 alias, 
void *data)
return 0;
 }
 
-static void clone_aliases(struct pci_dev *pdev)
+static void clone_aliases(struct device *dev)
 {
-   if (!pdev)
+   struct pci_dev *pdev;
+
+   if (!dev_is_pci(dev))
return;
+   pdev = to_pci_dev(dev);
 
/*
 * The IVRS alias stored in the alias table may not be
@@ -203,14 +206,14 @@ static void clone_aliases(struct pci_dev *pdev)
pci_for_each_dma_alias(pdev, clone_alias, NULL);
 }
 
-static struct pci_dev *setup_aliases(struct device *dev)
+static void setup_aliases(struct device *dev)
 {
struct pci_dev *pdev = to_pci_dev(dev);
u16 ivrs_alias;
 
/* For ACPI HID devices, there are no aliases */
if (!dev_is_pci(dev))
-   return NULL;
+   return;
 
/*
 * Add the IVRS alias to the pci aliases if it is on the same
@@ -221,9 +224,7 @@ static struct pci_dev *setup_aliases(struct device *dev)
PCI_BUS_NUM(ivrs_alias) == pdev->bus->number)
pci_add_dma_alias(pdev, ivrs_alias & 0xff, 1);
 
-   clone_aliases(pdev);
-
-   return pdev;
+   clone_aliases(dev);
 }
 
 static struct iommu_dev_data *find_dev_data(u16 devid)
@@ -331,7 +332,8 @@ static int iommu_init_device(struct device *dev)
if (!dev_data)
return -ENOMEM;
 
-   dev_data->pdev = setup_aliases(dev);
+   dev_data->dev = dev;
+   setup_aliases(dev);
 
/*
 * By default we use passthrough mode for IOMMUv2 capable device.
@@ -1232,13 +1234,17 @@ static int device_flush_dte_alias(struct pci_dev *pdev, 
u16 alias, void *data)
 static int device_flush_dte(struct iommu_dev_data *dev_data)
 {
struct amd_iommu *iommu;
+   struct pci_dev *pdev = NULL;
u16 alias;
int ret;
 
iommu = amd_iommu_rlookup_table[dev_data->devid];
 
-   if (dev_data->pdev)
-   ret = pci_for_each_dma_alias(dev_data->pdev,
+   if (dev_is_pci(dev_data->dev))
+   pdev = to_pci_dev(dev_data->dev);
+
+   if (pdev)
+   ret = pci_for_each_dma_alias(pdev,
 device_flush_dte_alias, iommu);
else
ret = iommu_flush_dte(iommu, dev_data->devid);
@@ -1561,7 +1567,7 @@ static void do_attach(struct iommu_dev_data *dev_data,
/* Update device table */
set_dte_entry(dev_data->devid, domain,
  ats, dev_data->iommu_v2);
-   clone_aliases(dev_data->pdev);
+   clone_aliases(dev_data->dev);
 
device_flush_dte(dev_data);
 }
@@ -1577,7 +1583,7 @@ static void do_detach(struct iommu_dev_data *dev_data)
dev_data->domain = NULL;
list_del(&dev_data->list);
clear_dte_entry(dev_data->devid);
-   clone_aliases(dev_data->pdev);
+   clone_aliases(dev_data->dev);
 
/* Flush the DTE entry */
device_flush_dte(dev_data);
@@ -1818,7 +1824,7 @@ static void update_device_table(struct protection_domain 
*domain)
list_for_each_entry(dev_data, &domain->dev_list, list) {
set_dte_entry(dev_data->devid, domain,
  dev_data->ats.enabled, dev_data->iommu_v2);
-   clone_aliases(dev_data->pdev);
+   clone_aliases(dev_data->dev);
}
 }
 
-- 
2.31.1


[PATCH v3 RESEND 00/35] iommu/amd: Add multiple PCI segments support

2022-07-06 Thread Vasant Hegde via iommu
Hi Joerg,
   As discussed in the other thread, I have updated the "From:" tag and
   am resending the patchset. There are no changes to the actual patch content.
   This patchset is based on top of the "iommu/x86/amd" branch.
   Base commit : 0d10fe75911787 ("iommu/amd: Use try_cmpxchg64 in ")

Newer AMD systems can support multiple PCI segments, where each segment
contains one or more IOMMU instances. However, an IOMMU instance can only
support a single PCI segment.

The current code assumes a system contains only one PCI segment (segment 0)
and creates global data structures such as the device table, rlookup table,
etc.

This series introduces a per-PCI-segment data structure, which contains
the device table, alias table, etc. For each PCI segment, all IOMMUs
share the same data structure. The series also makes the necessary code
adjustments and logging enhancements. Finally, it removes the global data
structures such as the device table, alias table, etc.

In the case of a system with a single PCI segment (i.e. PCI segment ID
zero), the IOMMU driver allocates one PCI segment data structure, which
is shared by all IOMMUs.

Patch 1 updates struct iommu_dev_data definition.

Patches 2 - 13 introduce the new PCI segment structure, allocate the
per-segment data structures, and introduce the amd_iommu.pci_seg pointer
to point to the corresponding pci_segment structure. They also introduce
a helper function, rlookup_amd_iommu(), to reverse-lookup the IOMMU for
a particular device; a sketch of that lookup flow follows below.
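
For orientation, a minimal sketch of that reverse-lookup flow (illustrative
only; patch 04 introduces the actual __rlookup_amd_iommu() helper, and later
patches build on it):

	/* Sketch, not the exact in-tree code: walk the per-segment list and
	 * index that segment's rlookup table with the 16-bit device id. */
	static struct amd_iommu *__rlookup_amd_iommu(u16 seg, u16 devid)
	{
		struct amd_iommu_pci_seg *pci_seg;

		for_each_pci_segment(pci_seg) {
			if (pci_seg->id == seg)
				return pci_seg->rlookup_table[devid];
		}
		return NULL;
	}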

Patches 14 - 27 adopt the per PCI segment data structures and remove the
global data structures.

Patch 28 fixes the flushing logic to flush up to last_bdf only.

Patches 29 - 35 convert usages of the 16-bit PCI device ID to include the
16-bit segment ID.

v3 patchset: 
https://lore.kernel.org/linux-iommu/20220511072141.15485-1-vasant.he...@amd.com/

Changes from v2 -> v3:
  - Addressed Joerg's review comments
- Fixed typo in patch 1 subject
- Fixed few minor things in patch 2
- Merged patch 27 - 29 into one patch
- Added new macros to get seg and devid from sbdf
  - Patch 32 : Extend devid to 32bit and added new macro.

v2 patchset : 
https://lore.kernel.org/linux-iommu/20220425113415.24087-1-vasant.he...@amd.com/T/#t

Changes from v1 -> v2:
  - Updated patch 1 to include dev_is_pci() check

v1 patchset : 
https://lore.kernel.org/linux-iommu/20220404100023.324645-1-vasant.he...@amd.com/T/#t

Changes from RFC -> v1:
  - Rebased patches on top of iommu/next tree.
  - Update struct iommu_dev_data definition
  - Updated few log message to print segment ID
  - Fix smatch warnings

RFC patchset : 
https://lore.kernel.org/linux-iommu/20220311094854.31595-1-vasant.he...@amd.com/T/#t


Regards,
Vasant

Suravee Suthikulpanit (20):
  iommu/amd: Introduce per PCI segment device table
  iommu/amd: Introduce per PCI segment rlookup table
  iommu/amd: Introduce per PCI segment old_dev_tbl_cpy
  iommu/amd: Introduce per PCI segment alias_table
  iommu/amd: Convert to use rlookup_amd_iommu helper function
  iommu/amd: Update irq_remapping_alloc to use IOMMU lookup helper function
  iommu/amd: Introduce struct amd_ir_data.iommu
  iommu/amd: Update amd_irte_ops functions
  iommu/amd: Update alloc_irq_table and alloc_irq_index
  iommu/amd: Update set_dte_entry and clear_dte_entry
  iommu/amd: Update iommu_ignore_device
  iommu/amd: Update dump_dte_entry
  iommu/amd: Update set_dte_irq_entry
  iommu/amd: Update (un)init_device_table_dma()
  iommu/amd: Update set_dev_entry_bit() and get_dev_entry_bit()
  iommu/amd: Remove global amd_iommu_[dev_table/alias_table/last_bdf]
  iommu/amd: Introduce get_device_sbdf_id() helper function
  iommu/amd: Include PCI segment ID when initialize IOMMU
  iommu/amd: Specify PCI segment ID when getting pci device
  iommu/amd: Add PCI segment support for ivrs_[ioapic/hpet/acpihid] commands

Vasant Hegde (15):
  iommu/amd: Update struct iommu_dev_data definition
  iommu/amd: Introduce pci segment structure
  iommu/amd: Introduce per PCI segment irq_lookup_table
  iommu/amd: Introduce per PCI segment dev_data_list
  iommu/amd: Introduce per PCI segment unity map list
  iommu/amd: Introduce per PCI segment last_bdf
  iommu/amd: Introduce per PCI segment device table size
  iommu/amd: Introduce per PCI segment alias table size
  iommu/amd: Introduce per PCI segment rlookup table size
  iommu/amd: Convert to use per PCI segment irq_lookup_table
  iommu/amd: Convert to use per PCI segment rlookup_table
  iommu/amd: Flush upto last_bdf only
  iommu/amd: Print PCI segment ID in error log messages
  iommu/amd: Update device_state structure to include PCI seg ID
  iommu/amd: Update amd_iommu_fault structure to include PCI seg ID

 .../admin-guide/kernel-parameters.txt |  34 +-
 drivers/iommu/amd/amd_iommu.h |  13 +-
 drivers/iommu/amd/amd_iommu_types.h   | 133 +++-
 drivers/iommu/amd/init.c  | 687 +++---
 drivers/iommu/amd/iommu.c | 563 --
 drivers/iommu/amd/iommu_v2.c  |  67 +-
 

Re: [PATCH] iommu/amd: Handle return of iommu_device_sysfs_add

2022-07-06 Thread Joerg Roedel
On Fri, Jul 01, 2022 at 02:20:08AM -0400, Bo Liu wrote:
> As iommu_device_sysfs_add() can fail, we should check the return value.
> 
> Signed-off-by: Bo Liu 
> ---
>  drivers/iommu/amd/init.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)

Applied, thanks.



Re: [PATCH v2] iommu/iova: change IOVA_MAG_SIZE to 127 to save memory

2022-07-06 Thread Joerg Roedel
On Sun, Jul 03, 2022 at 07:44:50PM +0800, Feng Tang wrote:
> kmalloc will round up the request size to power of 2, and current
> iova_magazine's size is 1032 (1024+8) bytes, so each instance
> allocated will get 2048 bytes from kmalloc, causing around 1KB
> waste.
> 
> Change IOVA_MAG_SIZE from 128 to 127 to make size of 'iova_magazine'
> 1024 bytes so that no memory will be wasted.
> 
> Signed-off-by: Feng Tang 
> Acked-by: Robin Murphy 

Applied, thanks.
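
For reference, the size arithmetic behind the change; the struct layout below
is an assumption mirroring the driver's iova_magazine (an unsigned long count
followed by the pfn array), written as a stand-alone check:

	/* Stand-alone sanity check of the magazine size (userspace C). */
	#include <stdio.h>

	#define IOVA_MAG_SIZE 127	/* was 128 */

	struct iova_magazine {
		unsigned long size;
		unsigned long pfns[IOVA_MAG_SIZE];
	};

	int main(void)
	{
		/* 8 + 127 * 8 = 1024 bytes on a 64-bit build, so the object
		 * fits the kmalloc-1024 slab exactly; with 128 entries it is
		 * 1032 bytes and kmalloc rounds the allocation up to 2048. */
		printf("sizeof(struct iova_magazine) = %zu\n",
		       sizeof(struct iova_magazine));
		return 0;
	}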



Re: [PATCH 0/1] iommu/vt-d: Fixes for v5.19-rc4

2022-07-06 Thread Joerg Roedel
On Sat, Jun 25, 2022 at 09:34:29PM +0800, Lu Baolu wrote:
> Hi Joerg,
> 
> One fix is queued for v5.19. It aims to fix:
> 
> - RID2PASID setup/teardown failures for pci alias devices
> 
> Please consider it for the iommu/fix branch.
> 
> Best regards,
> Lu Baolu
> 
> Lu Baolu (1):
>   iommu/vt-d: Fix RID2PASID setup/teardown failure

Queued, thanks Baolu and sorry for the delay.


Re: [PATCH] iommu/exynos: Make driver independent of the system page size

2022-07-06 Thread Joerg Roedel
On Thu, Jun 23, 2022 at 11:36:29AM +0200, Marek Szyprowski wrote:
>  drivers/iommu/exynos-iommu.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)

Applied, thanks.


Re: [PATCH 0/3] iommu: More internal ops cleanup

2022-07-06 Thread Joerg Roedel
On Tue, Jun 21, 2022 at 04:14:24PM +0100, Robin Murphy wrote:
> Robin Murphy (3):
>   iommu: Use dev_iommu_ops() for probe_finalize
>   iommu: Make .release_device optional
>   iommu: Clean up release_device checks

Applied to core branch, thanks.


Re: [PATCH v2 09/14] iommu/ipmmu-vmsa: Clean up bus_set_iommu()

2022-07-06 Thread Robin Murphy

On 2022-07-06 09:38, Alexey Kardashevskiy wrote:



On 28/04/2022 23:18, Robin Murphy wrote:

Stop calling bus_set_iommu() since it's now unnecessary. This also
leaves the custom initcall effectively doing nothing but register
the driver, which no longer needs to happen early either, so convert
it to builtin_platform_driver().

Signed-off-by: Robin Murphy 
---
  drivers/iommu/ipmmu-vmsa.c | 35 +--
  1 file changed, 1 insertion(+), 34 deletions(-)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 8fdb84b3642b..2549d32f0ddd 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -1090,11 +1090,6 @@ static int ipmmu_probe(struct platform_device 
*pdev)
  ret = iommu_device_register(&mmu->iommu, &ipmmu_ops, &pdev->dev);

  if (ret)
  return ret;
-
-#if defined(CONFIG_IOMMU_DMA)
-    if (!iommu_present(&platform_bus_type))
-    bus_set_iommu(&platform_bus_type, &ipmmu_ops);
-#endif
  }
  /*


The comment which starts here did not make it to the patch but it should 
have as it mentions bus_set_iommu() which is gone by the end of the series.


Heh, busted! In fact I think the whole point of that comment stops being 
true, but I couldn't be bothered to reason about it since one of the 
next steps after this is to start ripping all the arm_iommu_* stuff out 
anyway.


More general question/request - could you please include the exact sha1 
the patchset is based on? It did not apply to any current trees and 
while it was trivial, it was slightly annoying to resolve the conflicts 
:)  Thanks,


v3 is based directly on 5.19-rc3:

https://lore.kernel.org/lkml/cover.1657034827.git.robin.mur...@arm.com/

And if it helps I have it on a branch here as well:

https://gitlab.arm.com/linux-arm/linux-rm/-/tree/bus-set-iommu-v3

Robin.

Re: [PATCH v13 0/9] ACPI/IORT: Support for IORT RMR node

2022-07-06 Thread j...@8bytes.org
On Tue, Jun 28, 2022 at 07:59:39AM +, Shameerali Kolothum Thodi wrote:
> Now that we have all the required acks, could you please pick this series via
> IOMMU tree?

Applied to core branch, thanks.


Re: [PATCH] iommu/vt-d: Fix PCI bus rescan device hot add

2022-07-06 Thread Joerg Roedel
On Fri, Jun 24, 2022 at 02:12:28PM +0800, Baolu Lu wrote:
> It makes sense as far as I am aware. By putting IOMMUs in pass-through
> mode, there will be no run-time costs and things could be simplified a
> lot.
> 
> Besides the refactoring efforts, we still need this quick fix so that
> the fix could be propagated to various stable and vendors' downstream trees.

Patch is applied now for 5.19.



[PATCH] MAINTAINERS: Remove iommu@lists.linux-foundation.org

2022-07-06 Thread Joerg Roedel
From: Joerg Roedel 

The IOMMU mailing list has moved to iommu@lists.linux.dev
and the old list should bounce by now. Remove it from the
MAINTAINERS file.

Signed-off-by: Joerg Roedel 
---
 MAINTAINERS | 11 ---
 1 file changed, 11 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 66bffb24a348..ead381fdfc5a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -426,7 +426,6 @@ F:  drivers/acpi/*thermal*
 ACPI VIOT DRIVER
 M: Jean-Philippe Brucker 
 L: linux-acpi@vger.kernel.org
-L: iommu@lists.linux-foundation.org
 L: iommu@lists.linux.dev
 S: Maintained
 F: drivers/acpi/viot.c
@@ -960,7 +959,6 @@ F:  drivers/video/fbdev/geode/
 AMD IOMMU (AMD-VI)
 M: Joerg Roedel 
 R: Suravee Suthikulpanit 
-L: iommu@lists.linux-foundation.org
 L: iommu@lists.linux.dev
 S: Maintained
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git
@@ -5979,7 +5977,6 @@ DMA MAPPING HELPERS
 M: Christoph Hellwig 
 M: Marek Szyprowski 
 R: Robin Murphy 
-L: iommu@lists.linux-foundation.org
 L: iommu@lists.linux.dev
 S: Supported
 W: http://git.infradead.org/users/hch/dma-mapping.git
@@ -5992,7 +5989,6 @@ F:kernel/dma/
 
 DMA MAPPING BENCHMARK
 M: Xiang Chen 
-L: iommu@lists.linux-foundation.org
 L: iommu@lists.linux.dev
 F: kernel/dma/map_benchmark.c
 F: tools/testing/selftests/dma/
@@ -7577,7 +7573,6 @@ F:drivers/gpu/drm/exynos/exynos_dp*
 
 EXYNOS SYSMMU (IOMMU) driver
 M: Marek Szyprowski 
-L: iommu@lists.linux-foundation.org
 L: iommu@lists.linux.dev
 S: Maintained
 F: drivers/iommu/exynos-iommu.c
@@ -,7 +9994,6 @@ F:drivers/hid/intel-ish-hid/
 INTEL IOMMU (VT-d)
 M: David Woodhouse 
 M: Lu Baolu 
-L: iommu@lists.linux-foundation.org
 L: iommu@lists.linux.dev
 S: Supported
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git
@@ -10379,7 +10373,6 @@ F:  include/linux/iomap.h
 IOMMU DRIVERS
 M: Joerg Roedel 
 M: Will Deacon 
-L: iommu@lists.linux-foundation.org
 L: iommu@lists.linux.dev
 S: Maintained
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git
@@ -12539,7 +12532,6 @@ F:  drivers/i2c/busses/i2c-mt65xx.c
 
 MEDIATEK IOMMU DRIVER
 M: Yong Wu 
-L: iommu@lists.linux-foundation.org
 L: iommu@lists.linux.dev
 L: linux-mediatek@lists.infradead.org (moderated for non-subscribers)
 S: Supported
@@ -16591,7 +16583,6 @@ F:  drivers/i2c/busses/i2c-qcom-cci.c
 
 QUALCOMM IOMMU
 M: Rob Clark 
-L: iommu@lists.linux-foundation.org
 L: iommu@lists.linux.dev
 L: linux-arm-msm@vger.kernel.org
 S: Maintained
@@ -19217,7 +19208,6 @@ F:  arch/x86/boot/video*
 
 SWIOTLB SUBSYSTEM
 M: Christoph Hellwig 
-L: iommu@lists.linux-foundation.org
 L: iommu@lists.linux.dev
 S: Supported
 W: http://git.infradead.org/users/hch/dma-mapping.git
@@ -21893,7 +21883,6 @@ XEN SWIOTLB SUBSYSTEM
 M: Juergen Gross 
 M: Stefano Stabellini 
 L: xen-devel@lists.xenproject.org (moderated for non-subscribers)
-L: iommu@lists.linux-foundation.org
 L: iommu@lists.linux.dev
 S: Supported
 F: arch/x86/xen/*swiotlb*
-- 
2.36.1



Re: [PATCH 2/2] x86/ACPI: Set swiotlb area according to the number of lapic entry in MADT

2022-07-06 Thread Tianyu Lan

On 7/6/2022 5:02 PM, Christoph Hellwig wrote:

On Wed, Jul 06, 2022 at 04:57:33PM +0800, Tianyu Lan wrote:

Swiotlb_init() is called in the mem_init() of different architectures and
memblock free pages are released to the buddy allocator just after
calling swiotlb_init() via memblock_free_all().


Yes.


The mem_init() is called before smp_init().


But why would that matter?  cpu_possible_map is set up from
setup_arch(), which is called before that.


Sorry, I was still focused on the online CPU number, which is only known
after smp_init(). The possible CPU number includes some offline CPUs. I
will give it a try. Thanks for the suggestion.
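
For reference, a rough sketch of what deriving the area count inside
swiotlb_init() might look like once the possible-CPU count is used
(cpu_possible_map is populated in setup_arch(), which runs before
mem_init()/swiotlb_init()); the helper name and the power-of-two rounding
are assumptions, not the in-tree code:

	#include <linux/cpumask.h>	/* num_possible_cpus() */
	#include <linux/log2.h>		/* roundup_pow_of_two() */

	/* Illustrative only: one io_tlb area (and lock) per possible CPU,
	 * rounded up so the slot-to-area mapping stays a cheap mask. */
	static unsigned int __init swiotlb_pick_nareas(void)
	{
		return roundup_pow_of_two(num_possible_cpus());
	}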



Re: [PATCH v1 0/7] iommu/amd: Add Generic IO Page Table Framework Support for v2 Page Table

2022-07-06 Thread Joerg Roedel
On Tue, Jun 28, 2022 at 02:35:51PM +0530, Vasant Hegde wrote:
> Sorry. I didn't get last statement ("device identity maps DMA requests 
> without PASID").
> Can you please elaborate?

When using v1 page-tables, each device supporting ATS/PRI/PASID needs to
be direct-mapped, because the v1 page-tables basically act as a stage-2
page table for the PASID ones.

But when the non-PASID case moves to the pasid==0 page-table, then there
is no stage-2 anymore and a device can be used with ATS/PRI/PASID while
non-PASID requests are translated too, no?

I didn't get how this is handled in the current patch-set.

Regards,

Joerg


Re: [PATCH v1 7/7] iommu/amd: Introduce amd_iommu_pgtable command-line option

2022-07-06 Thread Joerg Roedel
On Tue, Jun 28, 2022 at 01:23:52PM +0530, Vasant Hegde wrote:
> I think it will complicate the parsing logic. We do have `amd_iommu=off` 
> option.
> How are we going to handle `amd_iommu=off,[pgtable_v1/v2]` ? 

In that case everything except 'off' will be ignored. The driver might
set its internal variables, but this has no effect as the driver never
initializes.
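
For illustration, the reason this works: the existing parse_amd_iommu_options()
style scan sets a flag for every token it recognizes, so 'off' takes effect no
matter what else is on the line. A hedged sketch (names and option tokens are
assumptions, not the exact driver code):

	#include <linux/init.h>
	#include <linux/string.h>
	#include <linux/types.h>

	/* Illustrative only: 'off' wins; the page-table choice is recorded
	 * but ignored because the driver never initializes when disabled. */
	static bool amd_iommu_disabled;
	static int  amd_iommu_pgtable;

	static int __init parse_amd_iommu_options(char *str)
	{
		for (; *str; ++str) {
			if (strncmp(str, "off", 3) == 0)
				amd_iommu_disabled = true;
			if (strncmp(str, "pgtbl_v1", 8) == 0)
				amd_iommu_pgtable = 1;
			if (strncmp(str, "pgtbl_v2", 8) == 0)
				amd_iommu_pgtable = 2;
		}
		return 1;
	}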

Regards,

Joerg


Re: [PATCH 2/2] x86/ACPI: Set swiotlb area according to the number of lapic entry in MADT

2022-07-06 Thread Christoph Hellwig
On Wed, Jul 06, 2022 at 04:57:33PM +0800, Tianyu Lan wrote:
> Swiotlb_init() is called in the mem_init() of different architectures and
> memblock free pages are released to the buddy allocator just after
> calling swiotlb_init() via memblock_free_all().

Yes.

> The mem_init() is called before smp_init().

But why would that matter?  cpu_possible_map is set up from
setup_arch(), which is called before that.


Re: [PATCH 2/2] x86/ACPI: Set swiotlb area according to the number of lapic entry in MADT

2022-07-06 Thread Tianyu Lan

On 7/6/2022 4:00 PM, Christoph Hellwig wrote:

On Fri, Jul 01, 2022 at 01:02:21AM +0800, Tianyu Lan wrote:

Can we reorder that initialization?  Because I really hate having
to have an arch hook in every architecture.


How about using "flags" parameter of swiotlb_init() to pass area number
or add new parameter for area number?

I just reposted patch 1 since there is just some coding style issue and area
number may also set via swiotlb kernel parameter. We still need figure out a
good solution to pass area number from architecture code.


What is the problem with calling swiotlb_init after nr_possible_cpus()
works?


swiotlb_init() is called in the mem_init() of the different architectures,
and memblock free pages are released to the buddy allocator just after
calling swiotlb_init() via memblock_free_all().

mem_init() is called before smp_init(). If swiotlb_init() were called
after smp_init(), we could no longer allocate a large chunk of low-end
memory via memblock_alloc() in swiotlb. swiotlb_init() would need to be
reworked to allocate memory from the buddy allocator, just like
swiotlb_init_late() does, and that would limit the bounce buffer size.
Otherwise we would need to do the reordering for all architectures, and
there may be other unknown issues.

The flags parameter of swiotlb_init() seems to be a good place to pass
the area number in the current code. If the area number/flag is not set,
the area number defaults to one, keeping the original behavior of a
single global spinlock protecting the io_tlb data structure.


Re: [PATCH v2 09/14] iommu/ipmmu-vmsa: Clean up bus_set_iommu()

2022-07-06 Thread Alexey Kardashevskiy




On 28/04/2022 23:18, Robin Murphy wrote:

Stop calling bus_set_iommu() since it's now unnecessary. This also
leaves the custom initcall effectively doing nothing but register
the driver, which no longer needs to happen early either, so convert
it to builtin_platform_driver().

Signed-off-by: Robin Murphy 
---
  drivers/iommu/ipmmu-vmsa.c | 35 +--
  1 file changed, 1 insertion(+), 34 deletions(-)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 8fdb84b3642b..2549d32f0ddd 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -1090,11 +1090,6 @@ static int ipmmu_probe(struct platform_device *pdev)
ret = iommu_device_register(&mmu->iommu, &ipmmu_ops, &pdev->dev);
if (ret)
return ret;
-
-#if defined(CONFIG_IOMMU_DMA)
-   if (!iommu_present(&platform_bus_type))
-   bus_set_iommu(&platform_bus_type, &ipmmu_ops);
-#endif
}
  
  	/*


The comment which starts here did not make it to the patch but it should 
have as it mentions bus_set_iommu() which is gone by the end of the series.



More general question/request - could you please include the exact sha1 
the patchset is based on? It did not apply to any current trees and 
while it was trivial, it was slightly annoying to resolve the conflicts 
:)  Thanks,




@@ -1168,32 +1163,4 @@ static struct platform_driver ipmmu_driver = {
.probe = ipmmu_probe,
.remove = ipmmu_remove,
  };
-
-static int __init ipmmu_init(void)
-{
-   struct device_node *np;
-   static bool setup_done;
-   int ret;
-
-   if (setup_done)
-   return 0;
-
-   np = of_find_matching_node(NULL, ipmmu_of_ids);
-   if (!np)
-   return 0;
-
-   of_node_put(np);
-
-   ret = platform_driver_register(&ipmmu_driver);
-   if (ret < 0)
-   return ret;
-
-#if defined(CONFIG_ARM) && !defined(CONFIG_IOMMU_DMA)
-   if (!iommu_present(&platform_bus_type))
-   bus_set_iommu(&platform_bus_type, &ipmmu_ops);
-#endif
-
-   setup_done = true;
-   return 0;
-}
-subsys_initcall(ipmmu_init);
+builtin_platform_driver(ipmmu_driver);


--
Alexey


Re: [PATCH 2/2] x86/ACPI: Set swiotlb area according to the number of lapic entry in MADT

2022-07-06 Thread Christoph Hellwig
On Fri, Jul 01, 2022 at 01:02:21AM +0800, Tianyu Lan wrote:
> > Can we reorder that initialization?  Because I really hate having
> > to have an arch hook in every architecture.
> 
> How about using "flags" parameter of swiotlb_init() to pass area number
> or add new parameter for area number?
> 
> I just reposted patch 1 since there is just some coding style issue and area
> number may also set via swiotlb kernel parameter. We still need figure out a
> good solution to pass area number from architecture code.

What is the problem with calling swiotlb_init after nr_possible_cpus()
works?


Re: [PATCH v7 20/21] PCI/P2PDMA: Introduce pci_mmap_p2pmem()

2022-07-06 Thread Greg Kroah-Hartman
On Wed, Jul 06, 2022 at 08:51:27AM +0200, Christoph Hellwig wrote:
> On Tue, Jul 05, 2022 at 12:16:45PM -0600, Logan Gunthorpe wrote:
> > The current version does it through a char device, but that requires
> > creating a simple_fs and anon_inode for teardown on driver removal, plus
> > a bunch of hooks through the driver that exposes it (NVMe, in this case)
> > to set this all up.
> > 
> > Christoph is suggesting a sysfs interface which could potentially avoid
> > the anon_inode and all of the extra hooks. It has some significant
> > benefits and maybe some small downsides, but I wouldn't describe it as
> > horrid.
> 
> Yeah, I don't think it is horrible, it fits in with the resource files
> for the BARs, and solves a lot of problems.  Greg, can you explain
> what would be so bad about it?

As you mention, you will have to pass different things down into sysfs
in order for that to be possible.  If it matches the resource files like
we currently have today, that might not be that bad, but it still feels
odd to me.  Let's see an implementation and a Documentation/ABI/ entry
first though.

thanks,

greg k-h


Re: [PATCH v7 20/21] PCI/P2PDMA: Introduce pci_mmap_p2pmem()

2022-07-06 Thread Christoph Hellwig
On Tue, Jul 05, 2022 at 12:16:45PM -0600, Logan Gunthorpe wrote:
> The current version does it through a char device, but that requires
> creating a simple_fs and anon_inode for teardown on driver removal, plus
> a bunch of hooks through the driver that exposes it (NVMe, in this case)
> to set this all up.
> 
> Christoph is suggesting a sysfs interface which could potentially avoid
> the anon_inode and all of the extra hooks. It has some significant
> benefits and maybe some small downsides, but I wouldn't describe it as
> horrid.

Yeah, I don't think it is horrible, it fits in with the resource files
for the BARs, and solves a lot of problems.  Greg, can you explain
what would be so bad about it?


Re: [PATCH v1 03/16] dt-bindings: power: mediatek: Refine multiple level power domain nodes

2022-07-06 Thread Tinghan Shen via iommu
On Tue, 2022-07-05 at 14:57 -0600, Rob Herring wrote:
> On Mon, Jul 04, 2022 at 06:00:15PM +0800, Tinghan Shen wrote:
> > Extract duplicated properties and support more levels of power
> > domain nodes.
> > 
> > This change fix following error when do dtbs_check,
> > arch/arm64/boot/dts/mediatek/mt8195-evb.dtb: power-controller: 
> > power-domain@15:
> > power-domain@16:power-domain@18: 'power-domain@19', 'power-domain@20', 
> > 'power-domain@21' do not
> > match any of the regexes: 'pinctrl-[0-9]+'
> >  From schema: 
> > Documentation/devicetree/bindings/power/mediatek,power-controller.yaml
> > 
> > Signed-off-by: Tinghan Shen 
> > ---
> >  .../power/mediatek,power-controller.yaml  | 132 ++
> >  1 file changed, 12 insertions(+), 120 deletions(-)
> > 
> > diff --git 
> > a/Documentation/devicetree/bindings/power/mediatek,power-controller.yaml
> > b/Documentation/devicetree/bindings/power/mediatek,power-controller.yaml
> > index 135c6f722091..09a537a802b8 100644
> > --- a/Documentation/devicetree/bindings/power/mediatek,power-controller.yaml
> > +++ b/Documentation/devicetree/bindings/power/mediatek,power-controller.yaml
> > @@ -39,8 +39,17 @@ properties:
> >'#size-cells':
> >  const: 0
> >  
> > +required:
> > +  - compatible
> > +
> > +additionalProperties: false
> > +
> >  patternProperties:
> >"^power-domain@[0-9a-f]+$":
> > +$ref: "#/$defs/power-domain-node"
> > +
> > +$defs:
> > +  power-domain-node:
> >  type: object
> >  description: |
> >Represents the power domains within the power controller node as 
> > documented
> > @@ -98,127 +107,10 @@ patternProperties:
> >  $ref: /schemas/types.yaml#/definitions/phandle
> >  description: phandle to the device containing the SMI register 
> > range.
> >  
> > -patternProperties:
> > -  "^power-domain@[0-9a-f]+$":
> > -type: object
> > -description: |
> > -  Represents a power domain child within a power domain parent 
> > node.
> > -
> > -properties:
> > -
> > -  '#power-domain-cells':
> > -description:
> > -  Must be 0 for nodes representing a single PM domain and 1 
> > for nodes
> > -  providing multiple PM domains.
> > -
> > -  '#address-cells':
> > -const: 1
> > -
> > -  '#size-cells':
> > -const: 0
> > -
> > -  reg:
> > -maxItems: 1
> > -
> > -  clocks:
> > -description: |
> > -  A number of phandles to clocks that need to be enabled 
> > during domain
> > -  power-up sequencing.
> > -
> > -  clock-names:
> > -description: |
> > -  List of names of clocks, in order to match the power-up 
> > sequencing
> > -  for each power domain we need to group the clocks by name. 
> > BASIC
> > -  clocks need to be enabled before enabling the corresponding 
> > power
> > -  domain, and should not have a '-' in their name (i.e mm, 
> > mfg, venc).
> > -  SUSBYS clocks need to be enabled before releasing the bus 
> > protection,
> > -  and should contain a '-' in their name (i.e mm-0, isp-0, 
> > cam-0).
> > -
> > -  In order to follow properly the power-up sequencing, the 
> > clocks must
> > -  be specified by order, adding first the BASIC clocks 
> > followed by the
> > -  SUSBSYS clocks.
> > -
> > -  domain-supply:
> > -description: domain regulator supply.
> > -
> > -  mediatek,infracfg:
> > -$ref: /schemas/types.yaml#/definitions/phandle
> > -description: phandle to the device containing the INFRACFG 
> > register range.
> > -
> > -  mediatek,smi:
> > -$ref: /schemas/types.yaml#/definitions/phandle
> > -description: phandle to the device containing the SMI register 
> > range.
> > -
> > -patternProperties:
> > -  "^power-domain@[0-9a-f]+$":
> > -type: object
> > -description: |
> > -  Represents a power domain child within a power domain parent 
> > node.
> > -
> > -properties:
> > +  required:
> > +- reg
> >  
> > -  '#power-domain-cells':
> > -description:
> > -  Must be 0 for nodes representing a single PM domain and 
> > 1 for nodes
> > -  providing multiple PM domains.
> > -
> > -  '#address-cells':
> > -const: 1
> > -
> > -  '#size-cells':
> > -const: 0
> > -
> > -  reg:
> > -maxItems: 1
> > -
> > -  clocks:
> > -description: |
> > -  A number of phandles to clocks that need to be enabled 
> > during domain
> > -  power-up sequencing.
> > -
> > -  clock-names:
> > -description: |
> >