Re: [RFC PATCH 2/2] dma-direct: Fix dma_direct_{alloc,free}() for Hyperv-V IVMs
On Wed, Jul 06, 2022 at 09:50:27PM +0200, Andrea Parri (Microsoft) wrote:
> @@ -305,6 +306,21 @@ void *dma_direct_alloc(struct device *dev, size_t size,
>  		ret = page_address(page);
>  		if (dma_set_decrypted(dev, ret, size))
>  			goto out_free_pages;
> +#ifdef CONFIG_HAS_IOMEM
> +		/*
> +		 * Remap the pages in the unencrypted physical address space
> +		 * when dma_unencrypted_base is set (e.g., for Hyper-V AMD
> +		 * SEV-SNP isolated guests).
> +		 */
> +		if (dma_unencrypted_base) {
> +			phys_addr_t ret_pa = virt_to_phys(ret);
> +
> +			ret_pa += dma_unencrypted_base;
> +			ret = memremap(ret_pa, size, MEMREMAP_WB);
> +			if (!ret)
> +				goto out_encrypt_pages;
> +		}
> +#endif

So: this needs to move into dma_set_decrypted, otherwise we don't handle the dma_alloc_pages case (never mind that this is pretty unreadable).

Which then again largely duplicates the code in swiotlb. So I think what we need here is a low-level helper that does the set_memory_decrypted and memremap. I'm not quite sure where it should go, but maybe some of the people involved with memory encryption might have good ideas. unencrypted_base should go with it and then both swiotlb and dma-direct can call it.

> +	/*
> +	 * If dma_unencrypted_base is set, the virtual address returned by
> +	 * dma_direct_alloc() is in the vmalloc address range.
> +	 */
> +	if (!dma_unencrypted_base && is_vmalloc_addr(cpu_addr)) {
>  		vunmap(cpu_addr);
>  	} else {
>  		if (IS_ENABLED(CONFIG_ARCH_HAS_DMA_CLEAR_UNCACHED))
>  			arch_dma_clear_uncached(cpu_addr, size);
> +#ifdef CONFIG_HAS_IOMEM
> +		if (dma_unencrypted_base) {
> +			memunmap(cpu_addr);
> +			/* re-encrypt the pages using the original address */
> +			cpu_addr = page_address(pfn_to_page(PHYS_PFN(
> +					dma_to_phys(dev, dma_addr))));
> +		}
> +#endif
>  		if (dma_set_encrypted(dev, cpu_addr, size))

Same on the unmap side. It might also be worth looking into reordering the checks in some form instead of that raw dma_unencrypted_base check before the unmap.
___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
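For readers following the thread, here is a minimal userspace sketch of the kind of combined helper the reply above asks for: one call that decrypts and then aliases the buffer above the base, and one call that undoes both. It is only a model under stated assumptions: "decrypting" is a counter bump and "remapping" is plain address arithmetic. Every name in it (model_make_shared, model_make_private, unencrypted_base, the remap_fails switch) is hypothetical and not kernel API; a real helper would presumably wrap set_memory_decrypted() and memremap(), and where it should live is still open in the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; none of this is real kernel API. */
typedef uint64_t phys_addr_t;

static phys_addr_t unencrypted_base;	/* models dma_unencrypted_base */
static int decrypted_calls;		/* simulated set_memory_decrypted() calls */
static int encrypted_calls;		/* simulated set_memory_encrypted() calls */

/* Simulated page-attribute changes; they always succeed in this model. */
static int model_set_decrypted(phys_addr_t pa, size_t size)
{
	(void)pa;
	(void)size;
	decrypted_calls++;
	return 0;
}

static int model_set_encrypted(phys_addr_t pa, size_t size)
{
	(void)pa;
	(void)size;
	encrypted_calls++;
	return 0;
}

/*
 * Hypothetical combined helper: decrypt first, then alias the buffer above
 * the base. If the (simulated) remap fails, re-encrypt so the caller sees
 * no state change. remap_fails is a test switch, not a real parameter.
 */
static int model_make_shared(phys_addr_t pa, size_t size, int remap_fails,
			     phys_addr_t *out)
{
	assert(out != NULL);

	if (model_set_decrypted(pa, size))
		return -1;
	if (!unencrypted_base) {
		*out = pa;		/* no base configured: nothing to remap */
		return 0;
	}
	if (remap_fails) {
		model_set_encrypted(pa, size);
		return -1;
	}
	*out = pa + unencrypted_base;
	return 0;
}

/* Inverse operation: drop the alias, then re-encrypt the original range. */
static int model_make_private(phys_addr_t shared_pa, size_t size,
			      phys_addr_t *out)
{
	phys_addr_t pa;

	assert(out != NULL);

	pa = unencrypted_base ? shared_pa - unencrypted_base : shared_pa;
	if (model_set_encrypted(pa, size))
		return -1;
	*out = pa;
	return 0;
}
```

With a single pair like this, both the swiotlb path and the dma-direct alloc/free paths could call the same code, which is the duplication concern raised in the review.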
Re: fully convert arm to use dma-direct v3
On Wed, Jun 29, 2022 at 08:41:32AM +0200, Greg Kroah-Hartman wrote:
> On Wed, Jun 29, 2022 at 08:28:37AM +0200, Christoph Hellwig wrote:
> > Any comments or additional testing? It would be really great to get
> > this off the table.
>
> For the USB bits:
>
> Acked-by: Greg Kroah-Hartman

So given that we're not making any progress on getting anyone interested in the series, I'm tempted to just pull it into the dma-mapping tree this weekend so that we'll finally have all architectures using the common code.

Anyone who has real concerns, please scream now.
RE: [PATCH v4 02/11] iommu/vt-d: Remove clearing translation data in disable_dmar_iommu()
> From: Lu Baolu
> Sent: Wednesday, July 6, 2022 10:55 AM
>
> The disable_dmar_iommu() is called when IOMMU initialization fails or
> the IOMMU is hot-removed from the system. In both cases, there is no
> need to clear the IOMMU translation data structures for devices.
>
> On the initialization path, the device probing only happens after the
> IOMMU is initialized successfully, hence there're no translation data
> structures.
>
> On the hot-remove path, there is no real use case where the IOMMU is
> hot-removed, but the devices that it manages are still alive in the
> system. The translation data structures were torn down during device
> release, hence there's no need to repeat it in IOMMU hot-remove path
> either. This removes the unnecessary code and only leaves a check.
>
> Signed-off-by: Lu Baolu

Reviewed-by: Kevin Tian
Re: [PATCH v2 3/6] iommu/vt-d: Refactor iommu information of each domain
On 2022/7/7 09:01, Tian, Kevin wrote:
>> From: Lu Baolu
>> Sent: Saturday, July 2, 2022 9:56 AM
>>
>> -out_unlock:
>> +	set_bit(num, iommu->domain_ids);
>> +	info->refcnt	= 1;
>> +	info->did	= num;
>> +	info->iommu	= iommu;
>> +	domain->nid	= iommu->node;
>
> One nit. this line should be removed as it's incorrect to blindly update
> domain->nid and we should just leave to domain_update_iommu_cap() to
> decide the right node. Otherwise this causes a policy conflict as here it
> is the last attached device deciding the node which is different from
> domain_update_iommu_cap() which picks the node of the first attached
> device.

Agreed and updated. Thank you!

> Otherwise, Reviewed-by: Kevin Tian

Best regards,
baolu
RE: [PATCH v3 02/11] iommu/vt-d: Remove clearing translation data in disable_dmar_iommu()
> From: Baolu Lu
> Sent: Sunday, July 3, 2022 12:34 PM
>
> On 2022/7/1 15:58, Tian, Kevin wrote:
> >> From: Lu Baolu
> >> Sent: Wednesday, June 29, 2022 3:47 PM
> >>
> >> The disable_dmar_iommu() is called when IOMMU initialization fails
> >> or the IOMMU is hot-removed from the system. In both cases, there
> >> is no need to clear the IOMMU translation data structures for
> >> devices.
> >>
> >> On the initialization path, the device probing only happens after
> >> the IOMMU is initialized successfully, hence there're no
> >> translation data structures.
> >>
> >> On the hot-remove path, there is no real use case where the IOMMU
> >> is hot-removed, but the devices that it manages are still alive in
> >> the system. The translation data structures were torn down during
> >> device release, hence there's no need to repeat it in IOMMU
> >> hot-remove path either. This removes the unnecessary code and only
> >> leaves a check.
> >>
> >> Signed-off-by: Lu Baolu
> >
> > You probably overlooked my last comment on kexec:
> >
> > https://lore.kernel.org/lkml/BL1PR11MB52711A71AD9F11B7AE42694C8CAC9@BL1PR11MB5271.namprd11.prod.outlook.com/
> >
> > I think my question is still not answered.
>
> Sorry! I did overlook that comment. I can see your points now, though it
> seems to be irrelevant to the problems that this series tries to solve.
>
> The failure path of copying table still needs some improvement. At least
> the pages allocated for root/context tables should be freed in the
> failure path. Even worse, the software occupied a bit of page table
> entry which is feasible for the old ECS, but not work for the new
> scalable mode anymore.
>
> All these problems deserve a separate series. We could address your
> concerns there. Does this work for you?

Yes, this makes sense to me.
RE: [PATCH v2 4/6] iommu/vt-d: Remove unnecessary check in intel_iommu_add()
> From: Lu Baolu
> Sent: Saturday, July 2, 2022 9:56 AM
>
> The Intel IOMMU hot-add process starts from dmar_device_hotplug(). It
> uses the global dmar_global_lock to synchronize all the hot-add and
> hot-remove paths. In the hot-add path, the new IOMMU data structures
> are allocated firstly by dmar_parse_one_drhd() and then initialized by
> dmar_hp_add_drhd(). All the IOMMU units are allocated and initialized
> in the same synchronized path. There is no case where any IOMMU unit
> is created and then initialized for multiple times.
>
> This removes the unnecessary check in intel_iommu_add() which is the
> last reference place of the global IOMMU array.
>
> Signed-off-by: Lu Baolu

Reviewed-by: Kevin Tian
RE: [PATCH v2 3/6] iommu/vt-d: Refactor iommu information of each domain
> From: Lu Baolu
> Sent: Saturday, July 2, 2022 9:56 AM
>
> -out_unlock:
> +	set_bit(num, iommu->domain_ids);
> +	info->refcnt	= 1;
> +	info->did	= num;
> +	info->iommu	= iommu;
> +	domain->nid	= iommu->node;

One nit. this line should be removed as it's incorrect to blindly update domain->nid and we should just leave to domain_update_iommu_cap() to decide the right node. Otherwise this causes a policy conflict as here it is the last attached device deciding the node which is different from domain_update_iommu_cap() which picks the node of the first attached device.

Otherwise, Reviewed-by: Kevin Tian
RE: [PATCH v2 2/6] iommu/vt-d: Use IDA interface to manage iommu sequence id
> From: Lu Baolu
> Sent: Saturday, July 2, 2022 9:56 AM
>
> Switch dmar unit sequence id allocation and release from bitmap to IDA
> interface.
>
> Signed-off-by: Lu Baolu

Reviewed-by: Kevin Tian
[RFC PATCH 1/2] swiotlb, dma-direct: Move swiotlb_unencrypted_base to direct.c
The variable will come in handy to enable dma_direct_{alloc,free}() for Hyper-V AMD SEV-SNP Isolated VMs. Rename swiotlb_unencrypted_base to dma_unencrypted_base to indicate that the notion is not restricted to SWIOTLB. No functional change. Suggested-by: Michael Kelley Signed-off-by: Andrea Parri (Microsoft) --- Yeah, this is in some sense trading the dependency on SWIOTLB for a dependency on HAS_DMA: Q1. I'm unable to envision a scenario where SWIOTLB without HAS_DMA would make sense but I'm also expecting one of the kernel test bots to try such a nonsensical configuration... should the references to dma_unencrypted_base in swiotlb.c be protected with HAS_DMA? other? Q2. Can the #ifdef CONFIG_HAS_DMA in arch/x86/kernel/cpu/mshyperv.c be removed? can we make HYPERV "depends on HAS_DMA"? ... arch/x86/kernel/cpu/mshyperv.c | 6 +++--- include/linux/dma-direct.h | 2 ++ include/linux/swiotlb.h| 2 -- kernel/dma/direct.c| 8 kernel/dma/swiotlb.c | 12 +--- 5 files changed, 18 insertions(+), 12 deletions(-) diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c index 831613959a92a..47e9cece86ff8 100644 --- a/arch/x86/kernel/cpu/mshyperv.c +++ b/arch/x86/kernel/cpu/mshyperv.c @@ -18,7 +18,7 @@ #include #include #include -#include +#include #include #include #include @@ -333,8 +333,8 @@ static void __init ms_hyperv_init_platform(void) if (hv_get_isolation_type() == HV_ISOLATION_TYPE_SNP) { static_branch_enable(_type_snp); -#ifdef CONFIG_SWIOTLB - swiotlb_unencrypted_base = ms_hyperv.shared_gpa_boundary; +#ifdef CONFIG_HAS_DMA + dma_unencrypted_base = ms_hyperv.shared_gpa_boundary; #endif } /* Isolation VMs are unenlightened SEV-based VMs, thus this check: */ diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h index 18aade195884d..0b7e4c4b7b34c 100644 --- a/include/linux/dma-direct.h +++ b/include/linux/dma-direct.h @@ -14,6 +14,8 @@ extern unsigned int zone_dma_bits; +extern phys_addr_t dma_unencrypted_base; + /* * Record the mapping of 
CPU physical to DMA addresses for a given region. */ diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index 7ed35dd3de6e7..fa2e85f21af61 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -190,6 +190,4 @@ static inline bool is_swiotlb_for_alloc(struct device *dev) } #endif /* CONFIG_DMA_RESTRICTED_POOL */ -extern phys_addr_t swiotlb_unencrypted_base; - #endif /* __LINUX_SWIOTLB_H */ diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c index 8d0b68a170422..06b2b901e37a3 100644 --- a/kernel/dma/direct.c +++ b/kernel/dma/direct.c @@ -22,6 +22,14 @@ */ unsigned int zone_dma_bits __ro_after_init = 24; +/* + * Certain Confidential Computing solutions, such as Hyper-V AMD SEV-SNP + * isolated VMs, use dma_unencrypted_base as a watermark: memory addresses + * below dma_unencrypted_base are treated as private, while memory above + * dma_unencrypted_base is treated as shared. + */ +phys_addr_t dma_unencrypted_base; + static inline dma_addr_t phys_to_dma_direct(struct device *dev, phys_addr_t phys) { diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index cb50f8d383606..78d4f5294a56c 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -67,8 +67,6 @@ static bool swiotlb_force_disable; struct io_tlb_mem io_tlb_default_mem; -phys_addr_t swiotlb_unencrypted_base; - static unsigned long default_nslabs = IO_TLB_DEFAULT_SIZE >> IO_TLB_SHIFT; static int __init @@ -142,7 +140,7 @@ static inline unsigned long nr_slots(u64 val) /* * Remap swioltb memory in the unencrypted physical address space - * when swiotlb_unencrypted_base is set. (e.g. for Hyper-V AMD SEV-SNP + * when dma_unencrypted_base is set. (e.g. for Hyper-V AMD SEV-SNP * Isolation VMs). 
*/ #ifdef CONFIG_HAS_IOMEM @@ -150,8 +148,8 @@ static void *swiotlb_mem_remap(struct io_tlb_mem *mem, unsigned long bytes) { void *vaddr = NULL; - if (swiotlb_unencrypted_base) { - phys_addr_t paddr = mem->start + swiotlb_unencrypted_base; + if (dma_unencrypted_base) { + phys_addr_t paddr = mem->start + dma_unencrypted_base; vaddr = memremap(paddr, bytes, MEMREMAP_WB); if (!vaddr) @@ -213,10 +211,10 @@ static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start, } /* -* If swiotlb_unencrypted_base is set, the bounce buffer memory will +* If dma_unencrypted_base is set, the bounce buffer memory will * be remapped and cleared in swiotlb_update_mem_attributes. */ - if (swiotlb_unencrypted_base) + if (dma_unencrypted_base) return; memset(vaddr, 0, bytes); -- 2.25.1 ___ iommu
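The watermark comment added to kernel/dma/direct.c in the patch above can be illustrated with a small userspace sketch. It assumes a flat model in which the shared view of a private page is simply its physical address plus the base. The helper names (pa_is_shared, pa_to_shared_alias) are hypothetical, and treating an address exactly equal to the base as shared is an assumption, since the patch comment only says "below" and "above".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;	/* userspace stand-in for the kernel type */

/*
 * Hypothetical helper: true when pa lies at or above the watermark.
 * base == 0 models "no watermark configured", so nothing counts as shared.
 */
static bool pa_is_shared(phys_addr_t base, phys_addr_t pa)
{
	return base != 0 && pa >= base;
}

/*
 * Hypothetical helper: address of the shared alias of pa. Addresses that
 * are already above the watermark are returned unchanged.
 */
static phys_addr_t pa_to_shared_alias(phys_addr_t base, phys_addr_t pa)
{
	if (pa_is_shared(base, pa))
		return pa;
	assert(pa <= UINT64_MAX - base);	/* guard against wraparound */
	return pa + base;
}
```

This is the same arithmetic swiotlb_mem_remap() performs with mem->start + dma_unencrypted_base before calling memremap().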
[RFC PATCH 2/2] dma-direct: Fix dma_direct_{alloc, free}() for Hyperv-V IVMs
In Hyper-V AMD SEV-SNP Isolated VMs, the virtual address returned by dma_direct_alloc() must map above dma_unencrypted_base because the memory is shared with the hardware device and must not be encrypted. Modify dma_direct_alloc() to do the necessary remapping. In dma_direct_free(), use the (unmodified) DMA address to derive the original virtual address and re-encrypt the pages. Suggested-by: Michael Kelley Co-developed-by: Dexuan Cui Signed-off-by: Dexuan Cui Signed-off-by: Andrea Parri (Microsoft) --- kernel/dma/direct.c | 30 +- 1 file changed, 29 insertions(+), 1 deletion(-) diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c index 06b2b901e37a3..c4ce277687a49 100644 --- a/kernel/dma/direct.c +++ b/kernel/dma/direct.c @@ -13,6 +13,7 @@ #include #include #include +#include /* for memremap() */ #include "direct.h" /* @@ -305,6 +306,21 @@ void *dma_direct_alloc(struct device *dev, size_t size, ret = page_address(page); if (dma_set_decrypted(dev, ret, size)) goto out_free_pages; +#ifdef CONFIG_HAS_IOMEM + /* +* Remap the pages in the unencrypted physical address space +* when dma_unencrypted_base is set (e.g., for Hyper-V AMD +* SEV-SNP isolated guests). +*/ + if (dma_unencrypted_base) { + phys_addr_t ret_pa = virt_to_phys(ret); + + ret_pa += dma_unencrypted_base; + ret = memremap(ret_pa, size, MEMREMAP_WB); + if (!ret) + goto out_encrypt_pages; + } +#endif } memset(ret, 0, size); @@ -360,11 +376,23 @@ void dma_direct_free(struct device *dev, size_t size, dma_free_from_pool(dev, cpu_addr, PAGE_ALIGN(size))) return; - if (is_vmalloc_addr(cpu_addr)) { + /* +* If dma_unencrypted_base is set, the virtual address returned by +* dma_direct_alloc() is in the vmalloc address range. 
+	 */
+	if (!dma_unencrypted_base && is_vmalloc_addr(cpu_addr)) {
 		vunmap(cpu_addr);
 	} else {
 		if (IS_ENABLED(CONFIG_ARCH_HAS_DMA_CLEAR_UNCACHED))
 			arch_dma_clear_uncached(cpu_addr, size);
+#ifdef CONFIG_HAS_IOMEM
+		if (dma_unencrypted_base) {
+			memunmap(cpu_addr);
+			/* re-encrypt the pages using the original address */
+			cpu_addr = page_address(pfn_to_page(PHYS_PFN(
+					dma_to_phys(dev, dma_addr))));
+		}
+#endif
 		if (dma_set_encrypted(dev, cpu_addr, size))
 			return;
 	}
--
2.25.1
[RFC PATCH 0/2] dma_direct_{alloc,free}() for Hyper-V IVMs
Through swiotlb_unencrypted_base.

P.S. I'm on vacation for the next couple of weeks starting next Monday; Dexuan/Michael should be able to address review feedback in that period.

Andrea Parri (Microsoft) (2):
  swiotlb,dma-direct: Move swiotlb_unencrypted_base to direct.c
  dma-direct: Fix dma_direct_{alloc,free}() for Hyperv-V IVMs

 arch/x86/kernel/cpu/mshyperv.c |  6 +++---
 include/linux/dma-direct.h     |  2 ++
 include/linux/swiotlb.h        |  2 --
 kernel/dma/direct.c            | 38 +-
 kernel/dma/swiotlb.c           | 12 +--
 5 files changed, 47 insertions(+), 13 deletions(-)

--
2.25.1
Re: [PATCH v7 20/21] PCI/P2PDMA: Introduce pci_mmap_p2pmem()
On 2022-07-06 01:04, Greg Kroah-Hartman wrote:
> On Wed, Jul 06, 2022 at 08:51:27AM +0200, Christoph Hellwig wrote:
>> On Tue, Jul 05, 2022 at 12:16:45PM -0600, Logan Gunthorpe wrote:
>>> The current version does it through a char device, but that requires
>>> creating a simple_fs and anon_inode for teardown on driver removal, plus
>>> a bunch of hooks through the driver that exposes it (NVMe, in this case)
>>> to set this all up.
>>>
>>> Christoph is suggesting a sysfs interface which could potentially avoid
>>> the anon_inode and all of the extra hooks. It has some significant
>>> benefits and maybe some small downsides, but I wouldn't describe it as
>>> horrid.
>>
>> Yeah, I don't think it is horrible, it fits in with the resource files
>> for the BARs, and solves a lot of problems. Greg, can you explain
>> what would be so bad about it?
>
> As you mention, you will have to pass different things down into sysfs
> in order for that to be possible. If it matches the resource files like
> we currently have today, that might not be that bad, but it still feels
> odd to me. Let's see an implementation and a Documentation/ABI/ entry
> first though.

I'll work something up in the coming weeks.

Thanks,

Logan
Re: [PATCH v1 0/6] iommu/vt-d: Reset DMAR_UNITS_SUPPORTED
On Sat, Jun 25, 2022 at 08:51:58PM +0800, Lu Baolu wrote:
> Hi folks,
>
> This is a follow-up series of changes proposed by this patch:
>
> https://lore.kernel.org/linux-iommu/20220615183650.32075-1-steve.w...@hpe.com/
>
> It removes several static arrays of size DMAR_UNITS_SUPPORTED and sets
> the DMAR_UNITS_SUPPORTED to 1024.
>

After Kevin Tian's comments, for the whole series:

Reviewed-by: Steve Wahl

--> Steve

--
Steve Wahl, Hewlett Packard Enterprise
Re: [PATCH v9 7/8] docs: trace: Add HiSilicon PTT device driver documentation
Hi,

I have started looking at this set.

On Mon, Jun 06, 2022 at 07:55:54PM +0800, Yicong Yang wrote:
> Document the introduction and usage of HiSilicon PTT device driver.
>
> Signed-off-by: Yicong Yang
> Reviewed-by: Jonathan Cameron
> ---
>  Documentation/trace/hisi-ptt.rst | 307 +++
>  Documentation/trace/index.rst    |   1 +

The "get_maintainer" script clearly indicates that Jonathan Corbet maintains the Documentation directory and yet he is not CC'ed on this patch, nor is the linux-doc mailing list. As such, it would not be possible to merge this patchset.

>  2 files changed, 308 insertions(+)
>  create mode 100644 Documentation/trace/hisi-ptt.rst
>
> diff --git a/Documentation/trace/hisi-ptt.rst b/Documentation/trace/hisi-ptt.rst
> new file mode 100644
> index ..0a3112244d40
> --- /dev/null
> +++ b/Documentation/trace/hisi-ptt.rst
> @@ -0,0 +1,307 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +==
> +HiSilicon PCIe Tune and Trace device
> +==
> +
> +Introduction
> +
> +
> +HiSilicon PCIe tune and trace device (PTT) is a PCIe Root Complex
> +integrated Endpoint (RCiEP) device, providing the capability
> +to dynamically monitor and tune the PCIe link's events (tune),
> +and trace the TLP headers (trace). The two functions are independent,
> +but is recommended to use them together to analyze and enhance the
> +PCIe link's performance.
> +
> +On Kunpeng 930 SoC, the PCIe Root Complex is composed of several
> +PCIe cores. Each PCIe core includes several Root Ports and a PTT
> +RCiEP, like below. The PTT device is capable of tuning and
> +tracing the links of the PCIe core.
> +::
> +
> + +--Core 0---+
> + | | [ PTT ] |
> + | | [Root Port]---[Endpoint]
> + | | [Root Port]---[Endpoint]
> + | | [Root Port]---[Endpoint]
> +Root Complex |--Core 1---+
> + | | [ PTT ] |
> + | | [Root Port]---[ Switch ]---[Endpoint]
> + | | [Root Port]---[Endpoint] `-[Endpoint]
> + | | [Root Port]---[Endpoint]
> + +---+
> +
> +The PTT device driver registers one PMU device for each PTT device.
> +The name of each PTT device is composed of 'hisi_ptt' prefix with
> +the id of the SICL and the Core where it locates. The Kunpeng 930
> +SoC encapsulates multiple CPU dies (SCCL, Super CPU Cluster) and
> +IO dies (SICL, Super I/O Cluster), where there's one PCIe Root
> +Complex for each SICL.
> +::
> +
> +/sys/devices/hisi_ptt_

All entries added to sysfs should have corresponding documentation. See [1] and [2] for details and [3] for an example.

[1]. https://elixir.bootlin.com/linux/latest/source/Documentation/ABI/README
[2]. https://elixir.bootlin.com/linux/latest/source/Documentation/ABI/testing
[3]. https://elixir.bootlin.com/linux/latest/source/Documentation/ABI/testing/sysfs-bus-coresight-devices-etm4x

> +
> +Tune
> +
> +
> +PTT tune is designed for monitoring and adjusting PCIe link parameters
> (events).
> +Currently we support events in 4 classes. The scope of the events
> +covers the PCIe core to which the PTT device belongs.
> +
> +Each event is presented as a file under $(PTT PMU dir)/tune, and
> +a simple open/read/write/close cycle will be used to tune the event.
> +::
> +
> +$ cd /sys/devices/hisi_ptt_/tune
> +$ ls
> +qos_tx_cpl qos_tx_np qos_tx_p
> +tx_path_rx_req_alloc_buf_level
> +tx_path_tx_req_alloc_buf_level

These look overly long... How about watermark_rx and watermark_tx?

> +$ cat qos_tx_dp
> +1
> +$ echo 2 > qos_tx_dp
> +$ cat qos_tx_dp
> +2
> +
> +Current value (numerical value) of the event can be simply read
> +from the file, and the desired value written to the file to tune.
> +
> +1. Tx path QoS control
> +
> +
> +The following files are provided to tune the QoS of the tx path of
> +the PCIe core.
> +
> +- qos_tx_cpl: weight of Tx completion TLPs
> +- qos_tx_np: weight of Tx non-posted TLPs
> +- qos_tx_p: weight of Tx posted TLPs
> +
> +The weight influences the proportion of certain packets on the PCIe link.
> +For example, for the storage scenario, increase the proportion
> +of the completion packets on the link to enhance the performance as
> +more completions are consumed.
> +
> +The available tune data of these events is [0, 1, 2].
> +Writing a negative value will return an error, and out of range
> +values will be converted to 2. Note that the event value just
> +indicates a probable level, but is not precise.
> +
> +2. Tx path buffer control
> +-
> +
> +Following files are provided to tune the buffer of tx path of the PCIe core.
> +
> +- tx_path_rx_req_alloc_buf_level: watermark of Rx requested
> +-
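The write semantics described in the quoted documentation (valid data 0, 1 and 2, negative values rejected, anything larger converted to 2) can be sketched as below. This is a userspace illustration only, not the driver's code: the function name tune_normalize, the TUNE_MAX macro and the choice of -EINVAL as the error value are all assumptions, since the quoted text only says that a negative write "will return an error".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TUNE_MAX 2	/* per the quoted doc: valid tune data is 0, 1, 2 */

/*
 * Hypothetical helper: validate a value written to a tune file.
 * Negative input is rejected (the -EINVAL code is an assumption) and
 * *out is left untouched; values above TUNE_MAX are converted to TUNE_MAX.
 */
static int tune_normalize(long requested, unsigned int *out)
{
	assert(out != NULL);

	if (requested < 0)
		return -EINVAL;

	*out = requested > TUNE_MAX ? TUNE_MAX : (unsigned int)requested;
	return 0;
}
```

Under this reading, writing 7 to qos_tx_cpl and reading it back would show 2 rather than an error, which is worth stating explicitly in the ABI documentation requested above.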
Re: [PATCH 1/2] iommu: arm-smmu-impl: Add 8250 display compatible to the client list.
On Tue, 14 Jun 2022 16:01:35 -0700, Emma Anholt wrote:
> Required for turning on per-process page tables for the GPU.
>
>

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/2] iommu: arm-smmu-impl: Add 8250 display compatible to the client list.
      https://git.kernel.org/will/c/3482c0b73073
[2/2] arm64: dts: qcom: sm8250: Enable per-process page tables.
      (no commit info)

Cheers,
--
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
Re: [PATCHv2] iommu/arm-smmu-qcom: Add debug support for TLB sync timeouts
On 2022-05-26 05:14, Sai Prakash Ranjan wrote: TLB sync timeouts can be due to various reasons such as TBU power down or pending TCU/TBU invalidation/sync and so on. Debugging these often require dumping of some implementation defined registers to know the status of TBU/TCU operations and some of these registers are not accessible in non-secure world such as from kernel and requires SMC calls to read them in the secure world. So, add this debug support to dump implementation defined registers for TLB sync timeout issues. Signed-off-by: Sai Prakash Ranjan --- Changes in v2: * Use scm call consistently so that it works on older chipsets where some of these regs are secure registers. * Add device specific data to get the implementation defined register offsets. --- drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 161 ++--- drivers/iommu/arm/arm-smmu/arm-smmu.c | 2 + drivers/iommu/arm/arm-smmu/arm-smmu.h | 1 + 3 files changed, 146 insertions(+), 18 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c index 7820711c4560..bb68aa85b28b 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c @@ -5,13 +5,27 @@ #include #include +#include #include #include #include "arm-smmu.h" +#define QCOM_DUMMY_VAL -1 + +enum qcom_smmu_impl_reg_offset { + QCOM_SMMU_TBU_PWR_STATUS, + QCOM_SMMU_STATS_SYNC_INV_TBU_ACK, + QCOM_SMMU_MMU2QSS_AND_SAFE_WAIT_CNTR, +}; + +struct qcom_smmu_config { + const u32 *reg_offset; +}; + struct qcom_smmu { struct arm_smmu_device smmu; + const struct qcom_smmu_config *cfg; bool bypass_quirk; u8 bypass_cbndx; u32 stall_enabled; @@ -22,6 +36,56 @@ static struct qcom_smmu *to_qcom_smmu(struct arm_smmu_device *smmu) return container_of(smmu, struct qcom_smmu, smmu); } +static void qcom_smmu_tlb_sync(struct arm_smmu_device *smmu, int page, + int sync, int status) +{ + int ret; + unsigned int spin_cnt, delay; + u32 reg, tbu_pwr_status, sync_inv_ack, 
sync_inv_progress; + struct qcom_smmu *qsmmu = to_qcom_smmu(smmu); + const struct qcom_smmu_config *cfg; + + arm_smmu_writel(smmu, page, sync, QCOM_DUMMY_VAL); + for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) { + for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) { + reg = arm_smmu_readl(smmu, page, status); + if (!(reg & ARM_SMMU_sTLBGSTATUS_GSACTIVE)) + return; + cpu_relax(); + } + udelay(delay); + } + + dev_err_ratelimited(smmu->dev, + "TLB sync timed out -- SMMU may be deadlocked\n"); Maybe consider a single ratelimit state for the whole function so all the output stays together. If things go sufficiently wrong, mixed up bits of partial output from different events may be misleadingly unhelpful (and at the very least it'll be up to 5x more effective at the intent of limiting log spam). + cfg = qsmmu->cfg; + if (!cfg) + return; + + ret = qcom_scm_io_readl(smmu->ioaddr + cfg->reg_offset[QCOM_SMMU_TBU_PWR_STATUS], + _pwr_status); + if (ret) + dev_err_ratelimited(smmu->dev, + "Failed to read TBU power status: %d\n", ret); + + ret = qcom_scm_io_readl(smmu->ioaddr + cfg->reg_offset[QCOM_SMMU_STATS_SYNC_INV_TBU_ACK], + _inv_ack); + if (ret) + dev_err_ratelimited(smmu->dev, + "Failed to read TBU sync/inv ack status: %d\n", ret); + + ret = qcom_scm_io_readl(smmu->ioaddr + cfg->reg_offset[QCOM_SMMU_MMU2QSS_AND_SAFE_WAIT_CNTR], + _inv_progress); + if (ret) + dev_err_ratelimited(smmu->dev, + "Failed to read TCU syn/inv progress: %d\n", ret); + + dev_err_ratelimited(smmu->dev, + "TBU: power_status %#x sync_inv_ack %#x sync_inv_progress %#x\n", + tbu_pwr_status, sync_inv_ack, sync_inv_progress); +} + static void qcom_adreno_smmu_write_sctlr(struct arm_smmu_device *smmu, int idx, u32 reg) { @@ -374,6 +438,7 @@ static const struct arm_smmu_impl qcom_smmu_impl = { .def_domain_type = qcom_smmu_def_domain_type, .reset = qcom_smmu500_reset, .write_s2cr = qcom_smmu_write_s2cr, + .tlb_sync = qcom_smmu_tlb_sync, }; static const struct arm_smmu_impl qcom_adreno_smmu_impl 
= { @@ -382,12 +447,84 @@ static const struct arm_smmu_impl qcom_adreno_smmu_impl = { .reset = qcom_smmu500_reset, .alloc_context_bank =
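The review comment in this message, about using a single ratelimit state for the whole function, can be modelled in userspace as follows. The sketch assumes a simple interval/burst limiter; rl_state, rl_allow and report_timeout are hypothetical names, and the constant 5 only reflects the five dev_err_ratelimited() calls in the quoted function. In the kernel the equivalent would presumably be built on DEFINE_RATELIMIT_STATE() and __ratelimit(), which this model does not reproduce exactly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace limiter: at most `burst` reports per `interval`. */
struct rl_state {
	long interval;	/* window length, in arbitrary time units */
	int burst;	/* reports allowed per window */
	int printed;	/* reports already allowed in the current window */
	long begin;	/* start time of the current window */
};

static bool rl_allow(struct rl_state *rs, long now)
{
	assert(rs != NULL);

	if (now - rs->begin >= rs->interval) {
		rs->begin = now;	/* open a new window */
		rs->printed = 0;
	}
	if (rs->printed >= rs->burst)
		return false;
	rs->printed++;
	return true;
}

/*
 * One limiter decision gates the whole multi-line report, so the lines of
 * one event are emitted together or not at all. Returns the number of
 * lines that would be printed.
 */
static int report_timeout(struct rl_state *rs, long now)
{
	if (!rl_allow(rs, now))
		return 0;
	/* A driver would print all five messages here, unconditionally. */
	return 5;
}
```

Compared with five independently limited calls, the output of one timeout can no longer interleave with partial output from another, which is the point made in the review.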
Re: [PATCH v1 08/16] arm64: dts: mt8195: Add power domains controller
On 06/07/2022 14:00, Tinghan Shen wrote:
> Hi Krzysztof,
>
> After discussing your message with our power team,
> we realized that we need your help to ensure we fully understand you.
>
> On Mon, 2022-07-04 at 14:38 +0200, Krzysztof Kozlowski wrote:
>> On 04/07/2022 12:00, Tinghan Shen wrote:
>>> Add power domains controller node for mt8195.
>>>
>>> Signed-off-by: Weiyi Lu
>>> Signed-off-by: Tinghan Shen
>>> ---
>>>  arch/arm64/boot/dts/mediatek/mt8195.dtsi | 327 +++
>>>  1 file changed, 327 insertions(+)
>>>
>>> diff --git a/arch/arm64/boot/dts/mediatek/mt8195.dtsi b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
>>> index 8d59a7da3271..d52e140d9271 100644
>>> --- a/arch/arm64/boot/dts/mediatek/mt8195.dtsi
>>> +++ b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
>>> @@ -10,6 +10,7 @@
>>>  #include
>>>  #include
>>>  #include
>>> +#include
>>>
>>>  / {
>>>  	compatible = "mediatek,mt8195";
>>> @@ -338,6 +339,332 @@
>>>  		#interrupt-cells = <2>;
>>>  	};
>>>
>>> +	scpsys: syscon@10006000 {
>>> +		compatible = "syscon", "simple-mfd";
>>
>> These compatibles cannot be alone.
>
> the scpsys sub node has the compatible of the power domain driver.
> do you suggest that the compatible in the sub node should move to here?

Not necessarily, depends. You have here device node representing system registers. They need their own compatibles, just like everywhere in the kernel (except the broken cases...).

Whether this should be compatible of power-domain driver, it depends what this device node is. I don't know, I don't have your datasheets or your architecture diagrams...

>
>>> +		reg = <0 0x10006000 0 0x1000>;
>>> +		#power-domain-cells = <1>;
>>
>> If it is simple MFD, then probably it is not a power domain provider.
>> Decide.
>
> this MFD device is the power controller on mt8195.

Then it is not a simple MFD but a power controller. Do not use "simple-mfd" compatible.

> Some features need
> to do some operations on registers in this node. We think that implement
> the operation of these registers as the MFD device can provide flexibility
> for future use. We want to clarify if you're saying that an MFD device
> cannot be a power domain provider.

MFD device is Linuxism, so it has nothing to do here. I am talking only about simple-mfd. simple-mfd is a simple device only instantiating children and not providing anything to anyone. Neither to children. This is the most important part. The children do not depend on anything from simple-mfd device. For example simple-mfd device can be shut down (gated) and children should still operate. Being a power domain controller, contradicts this usually.

Best regards,
Krzysztof
Re: [PATCH v1 02/16] dt-bindings: memory: mediatek: Update condition for mt8195 smi node
On 06/07/2022 15:48, Matthias Brugger wrote:
>
>
> On 04/07/2022 14:36, Krzysztof Kozlowski wrote:
>> On 04/07/2022 12:00, Tinghan Shen wrote:
>>> The max clock items for the dts node with compatible
>>> 'mediatek,mt8195-smi-sub-common' should be 3.
>>>
>>> However, the dtbs_check of such node will get following message,
>>> arch/arm64/boot/dts/mediatek/mt8195-evb.dtb: smi@1401: clock-names:
>>> ['apb', 'smi', 'gals0'] is too long
>>> From schema:
>>> Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
>>>
>>> Remove the last 'else' checking to fix this error.
>>
>> Missing fixes tag.
>>
>
> From my understanding, fixes tags are for patches that fix bugs (hw is not
> working etc) and not a warning message from dtbs_check. So my point of view
> would be to not add a fixes tag here.

Not conforming to bindings is also a bug. Missing properties or wrong properties, even if hardware is working, is still a bug. If such bug is not visible now in Linux, might be visible later in the future or visible in different OS (DTS are used by other systems and pieces of software like bootloaders). Limiting this only to Linux and to current version (hardware still works) is OK for Linux drivers, but not for DTS.

Therefore Fixes tag in general is applicable. Of course maybe to this one not really, maybe this is too trivial, or whatever, so I do not insist. But I insist on the principle - reasonable dtbs_check warnings are like compiler warnings - bugs which have to be fixed.

Best regards,
Krzysztof
Re: [PATCH v1 08/16] arm64: dts: mt8195: Add power domains controller
On 06/07/2022 15:41, Matthias Brugger wrote:
>
>
> On 04/07/2022 14:38, Krzysztof Kozlowski wrote:
>> On 04/07/2022 12:00, Tinghan Shen wrote:
>>> Add power domains controller node for mt8195.
>>>
>>> Signed-off-by: Weiyi Lu
>>> Signed-off-by: Tinghan Shen
>>> ---
>>> arch/arm64/boot/dts/mediatek/mt8195.dtsi | 327 +++
>>> 1 file changed, 327 insertions(+)
>>>
>>> diff --git a/arch/arm64/boot/dts/mediatek/mt8195.dtsi
>>> b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
>>> index 8d59a7da3271..d52e140d9271 100644
>>> --- a/arch/arm64/boot/dts/mediatek/mt8195.dtsi
>>> +++ b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
>>> @@ -10,6 +10,7 @@
>>> #include
>>> #include
>>> #include
>>> +#include
>>>
>>> / {
>>> compatible = "mediatek,mt8195";
>>> @@ -338,6 +339,332 @@
>>> #interrupt-cells = <2>;
>>> };
>>>
>>> + scpsys: syscon@10006000 {
>>> + compatible = "syscon", "simple-mfd";
>>
>> These compatibles cannot be alone.
>>
>
> You mean we would need something like "mediatek,scpsys" as dummy compatible
> that's not bound to any driver?

Yes. syscon (and simple-mfd) must always come with a specific compatible.

>
>>> + reg = <0 0x10006000 0 0x1000>;
>>> + #power-domain-cells = <1>;
>>
>> If it is simple MFD, then probably it is not a power domain provider.
>> Decide.
>
> The SCPSYS IP block of MediaTek SoCs group several functionality, one is the
> power domain controller. Others are not yet implemented, but defining the
> scpsys as a MFD will give us the possibility to do so in the future.

No, quite the opposite. Having simple-mfd prevents you from implementing it
correctly later as a driver, because you cannot remove it. It would be an
ABI break.

It's fine to have one block being a simple MFD with several children, but
then it's not a power controller. A child could be such a power controller,
but not the simple-mfd itself.

Rob explained this several times:
https://lore.kernel.org/all/yxhine00hg6hb...@robh.at.kernel.org/
https://lore.kernel.org/all/20220701000959.ga3588170-r...@kernel.org/

Best regards,
Krzysztof
Re: [PATCH 0/4] iommu/exynos: Add basic support for SysMMU v7
On Sun, 3 Jul 2022 at 13:47, David Virag wrote:
>
> On Sun, 2022-07-03 at 00:48 +0300, Sam Protsenko wrote:
> [...]
> > Hi Marek,
> >
> > As I understand, you have some board with SysMMU v7, which is not VM
> > capable (judging from the patches you shared earlier). Could you
> > please somehow verify if this series works fine for you? For example,
> > this testing driver [1] can be helpful.
> >
> > Thanks!
> >
> > [1]
> > https://github.com/joe-skb7/linux/commit/bbadd46fa525fe1fef2ccbdfff81f7d29caf0506
>
> Hi Sam,
>
> Not Marek here, but I wanted to try this on my jackpotlte (Exynos
> 7885). The driver reports it's DPU sysmmu as version 7.2, and manually
> reading the capabilities registers it looks like it has the 2nd
> capability register but not the VM capability.
>
> After applying your patches, adding your test driver (with SYSMMU_BASE
> corrected to 7885 value), and adding the sysmmu to dt, I tried to cat
> the test file that it creates in debugfs and I got an SError kernel
> panic.
>
> I tried tracing where the SError happens and it looks like it's this
> line:
> /* Preload for emulation */
> iowrite32(rw | vpn, obj->reg_base + MMU_EMU_PRELOAD);
>
> Trying to read the EMU registers using devmem results in a "Bus error".
>
> Could these emulation registers be missing from my SysMMU? Do you have
> any info on what version should have it? Or maybe some capability bit?
> I'll try testing it with DECON/DPP later and see if it works that way.
>

Hi Janghyuck,

Do you have by chance any info on SysMMU v7.2, which is present e.g. on
Exynos7885? David is trying to use emulation registers there with no luck,
so it would be nice if you could provide some details on the questions
above.

Thanks!

> Best regards,
> David
Re: [PATCH v1 02/16] dt-bindings: memory: mediatek: Update condition for mt8195 smi node
On 04/07/2022 14:36, Krzysztof Kozlowski wrote: On 04/07/2022 12:00, Tinghan Shen wrote: The max clock items for the dts node with compatible 'mediatek,mt8195-smi-sub-common' should be 3. However, the dtbs_check of such node will get following message, arch/arm64/boot/dts/mediatek/mt8195-evb.dtb: smi@1401: clock-names: ['apb', 'smi', 'gals0'] is too long From schema: Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml Remove the last 'else' checking to fix this error. Missing fixes tag. From my understanding, fixes tags are for patches that fix bugs (hw is not working etc) and not a warning message from dtbs_check. So my point of view would be to not add a fixes tag here. Regards, Matthias Signed-off-by: Tinghan Shen --- .../memory-controllers/mediatek,smi-common.yaml| 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml index a98b359bf909..e5f553e2e12a 100644 --- a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml +++ b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml @@ -143,7 +143,15 @@ allOf: - const: gals0 - const: gals1 -else: # for gen2 HW that don't have gals + - if: # for gen2 HW that don't have gals + properties: +compatible: + enum: +- mediatek,mt2712-smi-common +- mediatek,mt8167-smi-common +- mediatek,mt8173-smi-common + Without looking at the code, it's impossible to understand what you are doing here. The commit msg says one, but you are doing something else. Write commit msg explaining what you want to achieve and what you are doing. Best regards, Krzysztof ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v5 0/5] DMA mapping changes for SCSI core
On Wed, Jul 06, 2022 at 02:40:44PM +0100, John Garry wrote:
> On 30/06/2022 13:08, John Garry wrote:
>
> Hi Christoph,
>
> Can you please consider picking up this series? A few things to note
> beforehand:
>
> - I changed to only apply the mapping limit to SAS hosts in this version. I
> would need a fresh ack from Martin for those SCSI parts, but wanted to make
> sure you were ok with it.

Yes, I've mostly been waiting for an ACK from Martin.

> - Damien had some doubt on updating the shost max_sectors as opposed to the
> per-request queue default, but I think he's ok with it - see patch 4/5

I'm fine either way.
Re: [PATCH v1 08/16] arm64: dts: mt8195: Add power domains controller
On 04/07/2022 14:38, Krzysztof Kozlowski wrote: On 04/07/2022 12:00, Tinghan Shen wrote: Add power domains controller node for mt8195. Signed-off-by: Weiyi Lu Signed-off-by: Tinghan Shen --- arch/arm64/boot/dts/mediatek/mt8195.dtsi | 327 +++ 1 file changed, 327 insertions(+) diff --git a/arch/arm64/boot/dts/mediatek/mt8195.dtsi b/arch/arm64/boot/dts/mediatek/mt8195.dtsi index 8d59a7da3271..d52e140d9271 100644 --- a/arch/arm64/boot/dts/mediatek/mt8195.dtsi +++ b/arch/arm64/boot/dts/mediatek/mt8195.dtsi @@ -10,6 +10,7 @@ #include #include #include +#include / { compatible = "mediatek,mt8195"; @@ -338,6 +339,332 @@ #interrupt-cells = <2>; }; + scpsys: syscon@10006000 { + compatible = "syscon", "simple-mfd"; These compatibles cannot be alone. You mean we would need something like "mediatek,scpsys" as dummy compatible that's not bound to any driver? + reg = <0 0x10006000 0 0x1000>; + #power-domain-cells = <1>; If it is simple MFD, then probably it is not a power domain provider. Decide. The SCPSYS IP block of MediaTek SoCs group several functionality, one is the power domain controller. Others are not yet implemented, but defining the scpsys as a MFD will give us the possibility to do so in the future. Regards, Matthias Best regards, Krzysztof ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v5 0/5] DMA mapping changes for SCSI core
On 30/06/2022 13:08, John Garry wrote: Hi Christoph, Can you please consider picking up this series? A few things to note beforehand: - I changed to only apply the mapping limit to SAS hosts in this version. I would need a fresh ack from Martin for those SCSI parts, but wanted to make sure you were ok with it. - Damien had some doubt on updating the shost max_sectors as opposed to the per-request queue default, but I think he's ok with it - see patch 4/5 Thanks, John As reported in [0], DMA mappings whose size exceeds the IOMMU IOVA caching limit may see a big performance hit. This series introduces a new DMA mapping API, dma_opt_mapping_size(), so that drivers may know this limit when performance is a factor in the mapping. The SCSI SAS transport code is modified only to use this limit. For now I did not want to touch other hosts as I have a concern that this change could cause a performance regression. I also added a patch for libata-scsi as it does not currently honour the shost max_sectors limit. [0]https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leiz...@huawei.com/ Changes since v4: - tweak libata and other patch titles - Add Robin's tag (thanks!) - Clarify description of new DMA mapping API ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
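As a rough illustration of how a driver could use such a limit, here is a minimal userspace sketch in C. It assumes that `opt_mapping_bytes` holds whatever value dma_opt_mapping_size() reports for the device; the helper name `clamp_max_sectors` and the local `SECTOR_SHIFT` definition are hypothetical and are not taken from the series itself.

```c
#include <stddef.h>

#define SECTOR_SHIFT 9	/* 512-byte sectors, an assumption of this sketch */

/*
 * Hypothetical helper: clamp a host's max_sectors so that one request
 * never needs a DMA mapping larger than the reported optimal mapping size.
 */
static unsigned int clamp_max_sectors(unsigned int max_sectors,
				      size_t opt_mapping_bytes)
{
	size_t opt_sectors = opt_mapping_bytes >> SECTOR_SHIFT;

	if (opt_sectors < max_sectors)
		return (unsigned int)opt_sectors;
	return max_sectors;
}
```

With a 128 KiB optimal mapping size, a host that advertises 1024 sectors would be limited to 256, while a host that already advertises 128 sectors is left alone. The actual patches may well structure this differently.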
Re: Re: Re: Re: [PATCH v2 1/9] PM: domains: Delete usage of driver_deferred_probe_check_state()
f the irqdomain (might be using the terms incorrectly) like the > > > gic, you can make it a platform driver. And I was trying to hack up a > > > patch that's the equivalent of platform_irqchip_probe() (which just > > > ends up eventually calling the callback you use in IRQCHIP_DECLARE(). > > > I probably made some mistake in the quick hack that I'm sure if > > > fixable. > > > > > > > > [0.013251] Failed to map interrupt for > > > > > /soc@0/bus@3040/timer@306a > > > > > > However, this timer driver also uses TIMER_OF_DECLARE() which can't > > > handle failure to get the IRQ (because it's can't -EPROBE_DEFER). So, > > > this means, the timer driver inturn needs to be converted to a > > > platform driver if it's supposed to work with the IRQCHIP_DECLARE() > > > being converted to a platform driver. > > > > > > But that's a can of worms not worth opening. But then I remembered > > > this simpler workaround will work and it is pretty much a variant of > > > the workaround that's already in the gpc's irqchip driver to allow two > > > drivers to probe the same device (people really should stop doing > > > that). > > > > > > Can you drop my previous hack patch and try this instead please? I'm > > > 99% sure this will work. > > > > > > diff --git a/drivers/irqchip/irq-imx-gpcv2.c > > > b/drivers/irqchip/irq-imx-gpcv2.c index b9c22f764b4d..8a0e82067924 > > > 100644 > > > --- a/drivers/irqchip/irq-imx-gpcv2.c > > > +++ b/drivers/irqchip/irq-imx-gpcv2.c > > > @@ -283,6 +283,7 @@ static int __init imx_gpcv2_irqchip_init(struct > > > device_node *node, > > > > > > * later the GPC power domain driver will not be skipped. > > > */ > > > > > > of_node_clear_flag(node, OF_POPULATED); > > > > > > + fwnode_dev_initialized(domain->fwnode, false); > > > > > > return 0; > > > > > > } > > > > Just to be sure here, I tried this patch on top of next-20220701 but > > unfortunately this doesn't fix the original problem either. The timer > > errors are gone though. 
> > To clarify, you had the timer issue only with my "combine drivers" patch, > right? That's correct. > > The probe of imx8m-blk-ctrl got slightly delayed (from 0.74 to 0.90s > > printk > > time) but results in the identical error message. > > My guess is that the probe attempt of blk-ctrl is delayed now till gpc > probes (because of the device links getting created with the > fwnode_dev_initialized() fix), but by the time gpc probe finishes, the > power domains aren't registered yet because of the additional level of > device addition and probing. > > Can you try the attached patch please? Sure, it needed some small fixes though. But the error still is present. > And if that doesn't fix the issues, then enable the debug logs in the > following functions please and share the logs from boot till the > failure? If you can enable CONFIG_PRINTK_CALLER, that'd help too. > device_link_add() > fwnode_link_add() > fw_devlink_relax_cycle() I switched fw_devlink_relax_cycle() for fw_devlink_relax_link() as the former has no debug output here. For the record I added the following line to my kernel command line: > dyndbg="func device_link_add +p; func fwnode_link_add +p; func fw_devlink_relax_link +p" I attached the dmesg until the probe error to this mail. But I noticed the following lines which seem interesting: > [1.466620][T8] imx-pgc imx-pgc-domain.5: Linked as a consumer to > regulator.8 > [1.466743][T8] imx-pgc imx-pgc-domain.5: imx_pgc_domain_probe: Probe succeeded > [1.474733][T8] imx-pgc imx-pgc-domain.6: Linked as a consumer to regulator.9 > [1.474774][T8] imx-pgc imx-pgc-domain.6: imx_pgc_domain_probe: Probe succeeded regulator.8 and regulator.9 is the power sequencer, attached on I2C. This also makes perfectly sense if you look at [1]ff. These power domains are supplied by specific power supply rails. Several, if not all, imx8mq boards have this kind of setting. 
> Btw, part of the reason I'm trying to make sure we fix it the right
> way is that when we try to enable async boot by default, we don't run
> into issues.

Sounds reasonable.

Best regards,
Alexander

[1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/arch/arm64/boot/dts/freescale/imx8mq-tqma8mq.dtsi#n84

[0.00][T0] Booting Linux on physical CPU 0x00 [0x410fd034]
[0.00][T0] Linux version 5.19.0-rc5-next-2022
Re: [PATCH RESEND v5 1/5] iommu: Refactor iommu_group_store_type()
On Wed, Jul 06, 2022 at 01:03:44PM +0100, John Garry wrote:
> On 06/07/2022 13:00, Will Deacon wrote:
> > On Mon, Apr 04, 2022 at 07:27:10PM +0800, John Garry wrote:
> > > Function iommu_group_store_type() supports changing the default domain
> > > of an IOMMU group.
> > >
> > > Many conditions need to be satisfied and steps taken for this action to be
> > > successful.
> > >
> > > Satisfying these conditions and steps will be required for setting other
> > > IOMMU group attributes, so factor into a common part and a part specific
> > > to update the IOMMU group attribute.
> > >
> > > No functional change intended.
> > >
> > > Some code comments are tidied up also.
> > >
> > > Signed-off-by: John Garry
> > > ---
> > > drivers/iommu/iommu.c | 96 ---
> > > 1 file changed, 62 insertions(+), 34 deletions(-)
> > Acked-by: Will Deacon
> >
>
> Thanks, but currently I have no plans to progress this series, in favour of
> this
> https://lore.kernel.org/linux-iommu/1656590892-42307-1-git-send-email-john.ga...@huawei.com/T/#me0e806913050c95f6e6ba2c7f7d96d51ce191204

heh, then I'll stop reviewing it then :) Shame, I quite liked it so far!

Will
Re: [PATCH RESEND v5 2/5] iova: Allow rcache range upper limit to be flexible
On Thu, Apr 07, 2022 at 03:52:53PM +0800, Leizhen (ThunderTown) wrote: > On 2022/4/4 19:27, John Garry wrote: > > Some low-level drivers may request DMA mappings whose IOVA length exceeds > > that of the current rcache upper limit. > > > > This means that allocations for those IOVAs will never be cached, and > > always must be allocated and freed from the RB tree per DMA mapping cycle. > > This has a significant effect on performance, more so since commit > > 4e89dce72521 ("iommu/iova: Retry from last rb tree node if iova search > > fails"), as discussed at [0]. > > > > As a first step towards allowing the rcache range upper limit be > > configured, hold this value in the IOVA rcache structure, and allocate > > the rcaches separately. > > > > Delete macro IOVA_RANGE_CACHE_MAX_SIZE in case it's reused by mistake. > > > > [0] > > https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leiz...@huawei.com/ > > > > Signed-off-by: John Garry > > --- > > drivers/iommu/iova.c | 20 ++-- > > include/linux/iova.h | 3 +++ > > 2 files changed, 13 insertions(+), 10 deletions(-) > > > > diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c > > index db77aa675145..5c22b9187b79 100644 > > --- a/drivers/iommu/iova.c > > +++ b/drivers/iommu/iova.c > > @@ -15,8 +15,6 @@ > > /* The anchor node sits above the top of the usable address space */ > > #define IOVA_ANCHOR~0UL > > > > -#define IOVA_RANGE_CACHE_MAX_SIZE 6/* log of max cached IOVA range > > size (in pages) */ > > - > > static bool iova_rcache_insert(struct iova_domain *iovad, > >unsigned long pfn, > >unsigned long size); > > @@ -443,7 +441,7 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned > > long size, > > * rounding up anything cacheable to make sure that can't happen. The > > * order of the unadjusted size will still match upon freeing. 
> > */ > > - if (size < (1 << (IOVA_RANGE_CACHE_MAX_SIZE - 1))) > > + if (size < (1 << (iovad->rcache_max_size - 1))) > > size = roundup_pow_of_two(size); > > > > iova_pfn = iova_rcache_get(iovad, size, limit_pfn + 1); > > @@ -713,13 +711,15 @@ int iova_domain_init_rcaches(struct iova_domain > > *iovad) > > unsigned int cpu; > > int i, ret; > > > > - iovad->rcaches = kcalloc(IOVA_RANGE_CACHE_MAX_SIZE, > > + iovad->rcache_max_size = 6; /* Arbitrarily high default */ > > It would be better to assign this constant value to iovad->rcache_max_size > in init_iova_domain(). I think it's fine where it is as it's a meaningless number outside of the rcache code. Will ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
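The size check touched by the quoted alloc_iova_fast() hunk can be sketched in plain C as follows. This is only an illustration under stated assumptions: `roundup_pow_of_two_ul` is a hypothetical userspace stand-in for the kernel's roundup_pow_of_two(), and `rcache_round_size` is a made-up name; only the comparison against `1 << (rcache_max_size - 1)` comes from the diff.

```c
/* Hypothetical stand-in for the kernel's roundup_pow_of_two(); n must be > 0. */
static unsigned long roundup_pow_of_two_ul(unsigned long n)
{
	unsigned long p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Mirrors the check in the quoted hunk: sizes (in pages) below
 * 2^(rcache_max_size - 1) are rounded up to a power of two, larger
 * sizes are passed through unchanged.
 */
static unsigned long rcache_round_size(unsigned long size,
				       unsigned int rcache_max_size)
{
	if (size < (1UL << (rcache_max_size - 1)))
		size = roundup_pow_of_two_ul(size);
	return size;
}
```

With the default of 6 from the patch, a 5-page request is rounded up to 8 pages, while a 33-page request stays at 33 pages. Holding the limit in the domain rather than in a macro is what would later allow that threshold to move.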
Re: [PATCH RESEND v5 2/5] iova: Allow rcache range upper limit to be flexible
On Mon, Apr 04, 2022 at 07:27:11PM +0800, John Garry wrote:
> Some low-level drivers may request DMA mappings whose IOVA length exceeds
> that of the current rcache upper limit.
>
> This means that allocations for those IOVAs will never be cached, and
> always must be allocated and freed from the RB tree per DMA mapping cycle.
> This has a significant effect on performance, more so since commit
> 4e89dce72521 ("iommu/iova: Retry from last rb tree node if iova search
> fails"), as discussed at [0].
>
> As a first step towards allowing the rcache range upper limit be
> configured, hold this value in the IOVA rcache structure, and allocate
> the rcaches separately.
>
> Delete macro IOVA_RANGE_CACHE_MAX_SIZE in case it's reused by mistake.
>
> [0]
> https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leiz...@huawei.com/
>
> Signed-off-by: John Garry
> ---
> drivers/iommu/iova.c | 20 ++--
> include/linux/iova.h | 3 +++
> 2 files changed, 13 insertions(+), 10 deletions(-)

Acked-by: Will Deacon

Will
Re: [PATCH RESEND v5 1/5] iommu: Refactor iommu_group_store_type()
On 06/07/2022 13:00, Will Deacon wrote: On Mon, Apr 04, 2022 at 07:27:10PM +0800, John Garry wrote: Function iommu_group_store_type() supports changing the default domain of an IOMMU group. Many conditions need to be satisfied and steps taken for this action to be successful. Satisfying these conditions and steps will be required for setting other IOMMU group attributes, so factor into a common part and a part specific to update the IOMMU group attribute. No functional change intended. Some code comments are tidied up also. Signed-off-by: John Garry --- drivers/iommu/iommu.c | 96 --- 1 file changed, 62 insertions(+), 34 deletions(-) Acked-by: Will Deacon Thanks, but currently I have no plans to progress this series, in favour of this https://lore.kernel.org/linux-iommu/1656590892-42307-1-git-send-email-john.ga...@huawei.com/T/#me0e806913050c95f6e6ba2c7f7d96d51ce191204 cheers ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH RESEND v5 1/5] iommu: Refactor iommu_group_store_type()
On Mon, Apr 04, 2022 at 07:27:10PM +0800, John Garry wrote:
> Function iommu_group_store_type() supports changing the default domain
> of an IOMMU group.
>
> Many conditions need to be satisfied and steps taken for this action to be
> successful.
>
> Satisfying these conditions and steps will be required for setting other
> IOMMU group attributes, so factor into a common part and a part specific
> to update the IOMMU group attribute.
>
> No functional change intended.
>
> Some code comments are tidied up also.
>
> Signed-off-by: John Garry
> ---
> drivers/iommu/iommu.c | 96 ---
> 1 file changed, 62 insertions(+), 34 deletions(-)

Acked-by: Will Deacon

Will
Re: [PATCH v1 08/16] arm64: dts: mt8195: Add power domains controller
Hi Krzysztof,

After discussing your message with our power team, we realized that we need
your help to ensure we fully understand you.

On Mon, 2022-07-04 at 14:38 +0200, Krzysztof Kozlowski wrote:
> On 04/07/2022 12:00, Tinghan Shen wrote:
> > Add power domains controller node for mt8195.
> >
> > Signed-off-by: Weiyi Lu
> > Signed-off-by: Tinghan Shen
> > ---
> > arch/arm64/boot/dts/mediatek/mt8195.dtsi | 327 +++
> > 1 file changed, 327 insertions(+)
> >
> > diff --git a/arch/arm64/boot/dts/mediatek/mt8195.dtsi
> > b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
> > index 8d59a7da3271..d52e140d9271 100644
> > --- a/arch/arm64/boot/dts/mediatek/mt8195.dtsi
> > +++ b/arch/arm64/boot/dts/mediatek/mt8195.dtsi
> > @@ -10,6 +10,7 @@
> > #include
> > #include
> > #include
> > +#include
> >
> > / {
> > compatible = "mediatek,mt8195";
> > @@ -338,6 +339,332 @@
> > #interrupt-cells = <2>;
> > };
> >
> > + scpsys: syscon@10006000 {
> > + compatible = "syscon", "simple-mfd";
>
> These compatibles cannot be alone.

The scpsys sub node has the compatible of the power domain driver.
Do you suggest that the compatible in the sub node should move here?

> > + reg = <0 0x10006000 0 0x1000>;
> > + #power-domain-cells = <1>;
>
> If it is simple MFD, then probably it is not a power domain provider.
> Decide.

This MFD device is the power controller on mt8195. Some features need
to do some operations on registers in this node. We think that implement
the operation of these registers as the MFD device can provide flexibility
for future use. We want to clarify if you're saying that an MFD device
cannot be a power domain provider.

Best regards,
TingHan
Re: [PATCHv2] iommu/arm-smmu-qcom: Add debug support for TLB sync timeouts
On Thu, May 26, 2022 at 09:44:03AM +0530, Sai Prakash Ranjan wrote:
> TLB sync timeouts can be due to various reasons such as TBU power down
> or pending TCU/TBU invalidation/sync and so on. Debugging these often
> require dumping of some implementation defined registers to know the
> status of TBU/TCU operations and some of these registers are not
> accessible in non-secure world such as from kernel and requires SMC
> calls to read them in the secure world. So, add this debug support
> to dump implementation defined registers for TLB sync timeout issues.
>
> Signed-off-by: Sai Prakash Ranjan
> ---
>
> Changes in v2:
> * Use scm call consistently so that it works on older chipsets where
>   some of these regs are secure registers.
> * Add device specific data to get the implementation defined register
>   offsets.
>
> ---
> drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 161 ++---
> drivers/iommu/arm/arm-smmu/arm-smmu.c | 2 +
> drivers/iommu/arm/arm-smmu/arm-smmu.h | 1 +
> 3 files changed, 146 insertions(+), 18 deletions(-)

If this is useful to you, then I suppose it's something we could support,
however I'm pretty worried about our ability to maintain/scale this stuff
as it is extended to support additional SoCs and other custom debugging
features.

Perhaps you could stick it all in arm-smmu-qcom-debug.c and have a new
config option for that, so at least it's even further out of the way?

Will
[PATCH v3 RESEND 35/35] iommu/amd: Update amd_iommu_fault structure to include PCI seg ID
Rename 'device_id' as 'sbdf' and extend it to 32bit so that we can pass PCI segment ID to ppr_notifier(). Also pass PCI segment ID to pci_get_domain_bus_and_slot() instead of default value. Signed-off-by: Vasant Hegde --- drivers/iommu/amd/amd_iommu_types.h | 2 +- drivers/iommu/amd/iommu.c | 2 +- drivers/iommu/amd/iommu_v2.c| 9 + 3 files changed, 7 insertions(+), 6 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 1ca54803702a..40f52d02c5b9 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -486,7 +486,7 @@ extern struct kmem_cache *amd_iommu_irq_cache; struct amd_iommu_fault { u64 address;/* IO virtual address of the fault*/ u32 pasid; /* Address space identifier */ - u16 device_id; /* Originating PCI device id */ + u32 sbdf; /* Originating PCI device id */ u16 tag;/* PPR tag */ u16 flags; /* Fault flags */ diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 6a1db8f9f453..a56a9ad3273e 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -701,7 +701,7 @@ static void iommu_handle_ppr_entry(struct amd_iommu *iommu, u64 *raw) fault.address = raw[1]; fault.pasid = PPR_PASID(raw[0]); - fault.device_id = PPR_DEVID(raw[0]); + fault.sbdf = PCI_SEG_DEVID_TO_SBDF(iommu->pci_seg->id, PPR_DEVID(raw[0])); fault.tag = PPR_TAG(raw[0]); fault.flags = PPR_FLAGS(raw[0]); diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c index 40484af2ffc2..696dbe57 100644 --- a/drivers/iommu/amd/iommu_v2.c +++ b/drivers/iommu/amd/iommu_v2.c @@ -518,15 +518,16 @@ static int ppr_notifier(struct notifier_block *nb, unsigned long e, void *data) unsigned long flags; struct fault *fault; bool finish; - u16 tag, devid; + u16 tag, devid, seg_id; int ret; iommu_fault = data; tag = iommu_fault->tag & 0x1ff; finish = (iommu_fault->tag >> 9) & 1; - devid = iommu_fault->device_id; - pdev = pci_get_domain_bus_and_slot(0, PCI_BUS_NUM(devid), + seg_id = 
PCI_SBDF_TO_SEGID(iommu_fault->sbdf); + devid = PCI_SBDF_TO_DEVID(iommu_fault->sbdf); + pdev = pci_get_domain_bus_and_slot(seg_id, PCI_BUS_NUM(devid), devid & 0xff); if (!pdev) return -ENODEV; @@ -540,7 +541,7 @@ static int ppr_notifier(struct notifier_block *nb, unsigned long e, void *data) goto out; } - dev_state = get_device_state(iommu_fault->device_id); + dev_state = get_device_state(iommu_fault->sbdf); if (dev_state == NULL) goto out; -- 2.31.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
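The patch above relies on packing a PCI segment ID and a 16-bit device ID into one 32-bit value and splitting it again in ppr_notifier(). The macro definitions are not part of this mail, so the following userspace C sketch assumes a layout with the segment in the upper 16 bits and the bus/devfn pair in the lower 16 bits. The function names `sbdf_pack`, `sbdf_to_seg`, `sbdf_to_devid` and `sbdf_format` are hypothetical and only stand in for the kernel macros.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed layout: segment in bits 31:16, bus in bits 15:8, devfn in bits 7:0. */
static uint32_t sbdf_pack(uint16_t seg, uint16_t devid)
{
	return ((uint32_t)seg << 16) | devid;
}

static uint16_t sbdf_to_seg(uint32_t sbdf)
{
	return (uint16_t)(sbdf >> 16);
}

static uint16_t sbdf_to_devid(uint32_t sbdf)
{
	return (uint16_t)(sbdf & 0xffff);
}

/* Format as ssss:bb:dd.f, i.e. segment, bus, slot and function in hex. */
static int sbdf_format(char *buf, size_t len, uint32_t sbdf)
{
	uint16_t devid = sbdf_to_devid(sbdf);

	return snprintf(buf, len, "%04x:%02x:%02x.%x",
			sbdf_to_seg(sbdf),
			(devid >> 8) & 0xff,	/* bus */
			(devid >> 3) & 0x1f,	/* slot */
			devid & 0x07);		/* function */
}
```

Under that assumed layout, segment 1 with device 00:14.0 packs to 0x000100a0. Widening the field from u16 to u32 is what leaves room for the segment, which is why the lookup can stop hard-coding domain 0.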
[PATCH v3 RESEND 34/35] iommu/amd: Update device_state structure to include PCI seg ID
Rename struct device_state.devid variable to struct device_state.sbdf and extend it to 32-bit to include the 16-bit PCI segment ID via the helper function get_pci_sbdf_id(). Co-developed-by: Suravee Suthikulpanit Signed-off-by: Suravee Suthikulpanit Signed-off-by: Vasant Hegde --- drivers/iommu/amd/iommu_v2.c | 58 +++- 1 file changed, 24 insertions(+), 34 deletions(-) diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c index afb3efd565b7..40484af2ffc2 100644 --- a/drivers/iommu/amd/iommu_v2.c +++ b/drivers/iommu/amd/iommu_v2.c @@ -51,7 +51,7 @@ struct pasid_state { struct device_state { struct list_head list; - u16 devid; + u32 sbdf; atomic_t count; struct pci_dev *pdev; struct pasid_state **states; @@ -83,35 +83,25 @@ static struct workqueue_struct *iommu_wq; static void free_pasid_states(struct device_state *dev_state); -static u16 device_id(struct pci_dev *pdev) -{ - u16 devid; - - devid = pdev->bus->number; - devid = (devid << 8) | pdev->devfn; - - return devid; -} - -static struct device_state *__get_device_state(u16 devid) +static struct device_state *__get_device_state(u32 sbdf) { struct device_state *dev_state; list_for_each_entry(dev_state, _list, list) { - if (dev_state->devid == devid) + if (dev_state->sbdf == sbdf) return dev_state; } return NULL; } -static struct device_state *get_device_state(u16 devid) +static struct device_state *get_device_state(u32 sbdf) { struct device_state *dev_state; unsigned long flags; spin_lock_irqsave(_lock, flags); - dev_state = __get_device_state(devid); + dev_state = __get_device_state(sbdf); if (dev_state != NULL) atomic_inc(_state->count); spin_unlock_irqrestore(_lock, flags); @@ -609,7 +599,7 @@ int amd_iommu_bind_pasid(struct pci_dev *pdev, u32 pasid, struct pasid_state *pasid_state; struct device_state *dev_state; struct mm_struct *mm; - u16 devid; + u32 sbdf; int ret; might_sleep(); @@ -617,8 +607,8 @@ int amd_iommu_bind_pasid(struct pci_dev *pdev, u32 pasid, if (!amd_iommu_v2_supported()) 
return -ENODEV; - devid = device_id(pdev); - dev_state = get_device_state(devid); + sbdf = get_pci_sbdf_id(pdev); + dev_state = get_device_state(sbdf); if (dev_state == NULL) return -EINVAL; @@ -692,15 +682,15 @@ void amd_iommu_unbind_pasid(struct pci_dev *pdev, u32 pasid) { struct pasid_state *pasid_state; struct device_state *dev_state; - u16 devid; + u32 sbdf; might_sleep(); if (!amd_iommu_v2_supported()) return; - devid = device_id(pdev); - dev_state = get_device_state(devid); + sbdf = get_pci_sbdf_id(pdev); + dev_state = get_device_state(sbdf); if (dev_state == NULL) return; @@ -742,7 +732,7 @@ int amd_iommu_init_device(struct pci_dev *pdev, int pasids) struct iommu_group *group; unsigned long flags; int ret, tmp; - u16 devid; + u32 sbdf; might_sleep(); @@ -759,7 +749,7 @@ int amd_iommu_init_device(struct pci_dev *pdev, int pasids) if (pasids <= 0 || pasids > (PASID_MASK + 1)) return -EINVAL; - devid = device_id(pdev); + sbdf = get_pci_sbdf_id(pdev); dev_state = kzalloc(sizeof(*dev_state), GFP_KERNEL); if (dev_state == NULL) @@ -768,7 +758,7 @@ int amd_iommu_init_device(struct pci_dev *pdev, int pasids) spin_lock_init(_state->lock); init_waitqueue_head(_state->wq); dev_state->pdev = pdev; - dev_state->devid = devid; + dev_state->sbdf = sbdf; tmp = pasids; for (dev_state->pasid_levels = 0; (tmp - 1) & ~0x1ff; tmp >>= 9) @@ -806,7 +796,7 @@ int amd_iommu_init_device(struct pci_dev *pdev, int pasids) spin_lock_irqsave(_lock, flags); - if (__get_device_state(devid) != NULL) { + if (__get_device_state(sbdf) != NULL) { spin_unlock_irqrestore(_lock, flags); ret = -EBUSY; goto out_free_domain; @@ -838,16 +828,16 @@ void amd_iommu_free_device(struct pci_dev *pdev) { struct device_state *dev_state; unsigned long flags; - u16 devid; + u32 sbdf; if (!amd_iommu_v2_supported()) return; - devid = device_id(pdev); + sbdf = get_pci_sbdf_id(pdev); spin_lock_irqsave(_lock, flags); - dev_state = __get_device_state(devid); + dev_state = __get_device_state(sbdf); if (dev_state == 
NULL) { spin_unlock_irqrestore(_lock, flags); return; @@ -867,18 +857,18 @@ int
[PATCH v3 RESEND 33/35] iommu/amd: Print PCI segment ID in error log messages
Print pci segment ID along with bdf. Useful for debugging. Co-developed-by: Suravee Suthikulpaint Signed-off-by: Suravee Suthikulpaint Signed-off-by: Vasant Hegde --- drivers/iommu/amd/init.c | 10 +- drivers/iommu/amd/iommu.c | 36 ++-- 2 files changed, 23 insertions(+), 23 deletions(-) diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 9b1026fa7283..3c82d9c5f1c0 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -1855,11 +1855,11 @@ static int __init init_iommu_all(struct acpi_table_header *table) h = (struct ivhd_header *)p; if (*p == amd_iommu_target_ivhd_type) { - DUMP_printk("device: %02x:%02x.%01x cap: %04x " - "seg: %d flags: %01x info %04x\n", - PCI_BUS_NUM(h->devid), PCI_SLOT(h->devid), - PCI_FUNC(h->devid), h->cap_ptr, - h->pci_seg, h->flags, h->info); + DUMP_printk("device: %04x:%02x:%02x.%01x cap: %04x " + "flags: %01x info %04x\n", + h->pci_seg, PCI_BUS_NUM(h->devid), + PCI_SLOT(h->devid), PCI_FUNC(h->devid), + h->cap_ptr, h->flags, h->info); DUMP_printk(" mmio-addr: %016llx\n", h->mmio_phys); diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 2dbe17e49ffc..6a1db8f9f453 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -496,8 +496,8 @@ static void amd_iommu_report_rmp_hw_error(struct amd_iommu *iommu, volatile u32 vmg_tag, spa, flags); } } else { - pr_err_ratelimited("Event logged [RMP_HW_ERROR device=%02x:%02x.%x, vmg_tag=0x%04x, spa=0x%llx, flags=0x%04x]\n", - PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid), + pr_err_ratelimited("Event logged [RMP_HW_ERROR device=%04x:%02x:%02x.%x, vmg_tag=0x%04x, spa=0x%llx, flags=0x%04x]\n", + iommu->pci_seg->id, PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid), vmg_tag, spa, flags); } @@ -529,8 +529,8 @@ static void amd_iommu_report_rmp_fault(struct amd_iommu *iommu, volatile u32 *ev vmg_tag, gpa, flags_rmp, flags); } } else { - pr_err_ratelimited("Event logged [RMP_PAGE_FAULT device=%02x:%02x.%x, vmg_tag=0x%04x, 
gpa=0x%llx, flags_rmp=0x%04x, flags=0x%04x]\n", - PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid), + pr_err_ratelimited("Event logged [RMP_PAGE_FAULT device=%04x:%02x:%02x.%x, vmg_tag=0x%04x, gpa=0x%llx, flags_rmp=0x%04x, flags=0x%04x]\n", + iommu->pci_seg->id, PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid), vmg_tag, gpa, flags_rmp, flags); } @@ -576,8 +576,8 @@ static void amd_iommu_report_page_fault(struct amd_iommu *iommu, domain_id, address, flags); } } else { - pr_err_ratelimited("Event logged [IO_PAGE_FAULT device=%02x:%02x.%x domain=0x%04x address=0x%llx flags=0x%04x]\n", - PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid), + pr_err_ratelimited("Event logged [IO_PAGE_FAULT device=%04x:%02x:%02x.%x domain=0x%04x address=0x%llx flags=0x%04x]\n", + iommu->pci_seg->id, PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid), domain_id, address, flags); } @@ -620,20 +620,20 @@ static void iommu_print_event(struct amd_iommu *iommu, void *__evt) switch (type) { case EVENT_TYPE_ILL_DEV: - dev_err(dev, "Event logged [ILLEGAL_DEV_TABLE_ENTRY device=%02x:%02x.%x pasid=0x%05x address=0x%llx flags=0x%04x]\n", - PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid), + dev_err(dev, "Event logged [ILLEGAL_DEV_TABLE_ENTRY device=%04x:%02x:%02x.%x pasid=0x%05x address=0x%llx flags=0x%04x]\n", + iommu->pci_seg->id, PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid), pasid, address, flags); dump_dte_entry(iommu, devid); break; case EVENT_TYPE_DEV_TAB_ERR: - dev_err(dev, "Event logged [DEV_TAB_HARDWARE_ERROR device=%02x:%02x.%x " + dev_err(dev, "Event logged [DEV_TAB_HARDWARE_ERROR device=%04x:%02x:%02x.%x " "address=0x%llx flags=0x%04x]\n", - PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid), + iommu->pci_seg->id, PCI_BUS_NUM(devid),
[PATCH v3 RESEND 32/35] iommu/amd: Add PCI segment support for ivrs_[ioapic/hpet/acpihid] commands
From: Suravee Suthikulpanit By default, PCI segment is zero and can be omitted. To support system with non-zero PCI segment ID, modify the parsing functions to allow PCI segment ID. Co-developed-by: Vasant Hegde Signed-off-by: Vasant Hegde Signed-off-by: Suravee Suthikulpanit --- .../admin-guide/kernel-parameters.txt | 34 ++ drivers/iommu/amd/init.c | 44 --- 2 files changed, 52 insertions(+), 26 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 2522b11e593f..d45e58328ce6 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2266,23 +2266,39 @@ ivrs_ioapic [HW,X86-64] Provide an override to the IOAPIC-ID<->DEVICE-ID - mapping provided in the IVRS ACPI table. For - example, to map IOAPIC-ID decimal 10 to - PCI device 00:14.0 write the parameter as: + mapping provided in the IVRS ACPI table. + By default, PCI segment is 0, and can be omitted. + For example: + * To map IOAPIC-ID decimal 10 to PCI device 00:14.0 + write the parameter as: ivrs_ioapic[10]=00:14.0 + * To map IOAPIC-ID decimal 10 to PCI segment 0x1 and + PCI device 00:14.0 write the parameter as: + ivrs_ioapic[10]=0001:00:14.0 ivrs_hpet [HW,X86-64] Provide an override to the HPET-ID<->DEVICE-ID - mapping provided in the IVRS ACPI table. For - example, to map HPET-ID decimal 0 to - PCI device 00:14.0 write the parameter as: + mapping provided in the IVRS ACPI table. + By default, PCI segment is 0, and can be omitted. + For example: + * To map HPET-ID decimal 0 to PCI device 00:14.0 + write the parameter as: ivrs_hpet[0]=00:14.0 + * To map HPET-ID decimal 10 to PCI segment 0x1 and + PCI device 00:14.0 write the parameter as: + ivrs_ioapic[10]=0001:00:14.0 ivrs_acpihid[HW,X86-64] Provide an override to the ACPI-HID:UID<->DEVICE-ID - mapping provided in the IVRS ACPI table. 
For - example, to map UART-HID:UID AMD0020:0 to - PCI device 00:14.5 write the parameter as: + mapping provided in the IVRS ACPI table. + + For example, to map UART-HID:UID AMD0020:0 to + PCI segment 0x1 and PCI device ID 00:14.5, + write the parameter as: + ivrs_acpihid[0001:00:14.5]=AMD0020:0 + + By default, PCI segment is 0, and can be omitted. + For example, PCI device 00:14.5 write the parameter as: ivrs_acpihid[00:14.5]=AMD0020:0 js= [HW,JOY] Analog joystick diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 9693f0b9e07a..9b1026fa7283 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -84,6 +84,10 @@ #define ACPI_DEVFLAG_ATSDIS 0x1000 #define LOOP_TIMEOUT 200 + +#define IVRS_GET_SBDF_ID(seg, bus, dev, fd)(((seg & 0x) << 16) | ((bus & 0xff) << 8) \ +| ((dev & 0x1f) << 3) | (fn & 0x7)) + /* * ACPI table definitions * @@ -3288,15 +3292,17 @@ static int __init parse_amd_iommu_options(char *str) static int __init parse_ivrs_ioapic(char *str) { - unsigned int bus, dev, fn; + u32 seg = 0, bus, dev, fn; int ret, id, i; - u16 devid; + u32 devid; ret = sscanf(str, "[%d]=%x:%x.%x", , , , ); - if (ret != 4) { - pr_err("Invalid command line: ivrs_ioapic%s\n", str); - return 1; + ret = sscanf(str, "[%d]=%x:%x:%x.%x", , , , , ); + if (ret != 5) { + pr_err("Invalid command line: ivrs_ioapic%s\n", str); + return 1; + } } if (early_ioapic_map_size == EARLY_MAP_SIZE) { @@ -3305,7 +3311,7 @@ static int __init parse_ivrs_ioapic(char *str) return 1; } - devid = ((bus & 0xff) << 8) | ((dev & 0x1f) << 3) | (fn & 0x7); + devid = IVRS_GET_SBDF_ID(seg, bus, dev, fn); cmdline_maps= true; i = early_ioapic_map_size++; @@ -3318,15
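A minimal user-space sketch of the parsing order this patch describes for ivrs_ioapic: the form without a segment is tried first, and the form with a leading segment is the fallback. This is not the kernel code. The names sbdf_pack() and parse_ioapic_arg() are hypothetical, and the 16-bit segment mask 0xffff is an assumption, because the mask in the IVRS_GET_SBDF_ID text above is truncated.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Pack segment/bus/device/function into one 32-bit value.
 * The bus, device and function masks follow the macro in the patch;
 * the 0xffff segment mask is an assumption.
 */
static uint32_t sbdf_pack(uint32_t seg, uint32_t bus, uint32_t dev, uint32_t fn)
{
	return ((seg & 0xffff) << 16) | ((bus & 0xff) << 8) |
	       ((dev & 0x1f) << 3) | (fn & 0x7);
}

/*
 * Parse "[id]=bus:dev.fn" or "[id]=seg:bus:dev.fn".
 * Returns 0 on success and -1 if neither form matches.
 */
static int parse_ioapic_arg(const char *str, int *id, uint32_t *sbdf)
{
	unsigned int seg = 0, bus, dev, fn;

	assert(str && id && sbdf);

	/* Try the short form first; seg stays 0 if it matches. */
	if (sscanf(str, "[%d]=%x:%x.%x", id, &bus, &dev, &fn) != 4 &&
	    sscanf(str, "[%d]=%x:%x:%x.%x", id, &seg, &bus, &dev, &fn) != 5)
		return -1;

	*sbdf = sbdf_pack(seg, bus, dev, fn);
	return 0;
}
```

With these assumptions, "[10]=00:14.0" and "[10]=0001:00:14.0" differ only in the upper 16 bits of the packed value, which is what lets the segment be omitted when it is zero.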
Re: [PATCH v12 0/2] iommu/mediatek: TTBR up to 35bit support
On Thu, Jun 30, 2022 at 05:29:24PM +0800, yf.w...@mediatek.com wrote: > This patchset adds MediaTek TTBR up to 35bit support for single normal zone. > > Changes in v12: > - Update [PATCH 1/2]: remove GENMASK(31, 7) > - Update [PATCH 2/2]: remove MMU_PT_ADDR_MASK definition. For both patches: Acked-by: Will Deacon Joerg -- please can you pick these up for 5.20? Thanks, Will ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 RESEND 31/35] iommu/amd: Specify PCI segment ID when getting pci device
From: Suravee Suthikulpanit Upcoming AMD systems can have multiple PCI segments. Hence pass PCI segment ID to pci_get_domain_bus_and_slot() instead of '0'. Co-developed-by: Vasant Hegde Signed-off-by: Vasant Hegde Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/init.c | 6 -- drivers/iommu/amd/iommu.c | 19 ++- 2 files changed, 14 insertions(+), 11 deletions(-) diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index d35081d84460..9693f0b9e07a 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -1962,7 +1962,8 @@ static int __init iommu_init_pci(struct amd_iommu *iommu) int cap_ptr = iommu->cap_ptr; int ret; - iommu->dev = pci_get_domain_bus_and_slot(0, PCI_BUS_NUM(iommu->devid), + iommu->dev = pci_get_domain_bus_and_slot(iommu->pci_seg->id, +PCI_BUS_NUM(iommu->devid), iommu->devid & 0xff); if (!iommu->dev) return -ENODEV; @@ -2025,7 +2026,8 @@ static int __init iommu_init_pci(struct amd_iommu *iommu) int i, j; iommu->root_pdev = - pci_get_domain_bus_and_slot(0, iommu->dev->bus->number, + pci_get_domain_bus_and_slot(iommu->pci_seg->id, + iommu->dev->bus->number, PCI_DEVFN(0, 0)); /* diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 0751dda04a10..2dbe17e49ffc 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -473,7 +473,7 @@ static void dump_command(unsigned long phys_addr) pr_err("CMD[%d]: %08x\n", i, cmd->data[i]); } -static void amd_iommu_report_rmp_hw_error(volatile u32 *event) +static void amd_iommu_report_rmp_hw_error(struct amd_iommu *iommu, volatile u32 *event) { struct iommu_dev_data *dev_data = NULL; int devid, vmg_tag, flags; @@ -485,7 +485,7 @@ static void amd_iommu_report_rmp_hw_error(volatile u32 *event) flags = (event[1] >> EVENT_FLAGS_SHIFT) & EVENT_FLAGS_MASK; spa = ((u64)event[3] << 32) | (event[2] & 0xFFF8); - pdev = pci_get_domain_bus_and_slot(0, PCI_BUS_NUM(devid), + pdev = pci_get_domain_bus_and_slot(iommu->pci_seg->id, PCI_BUS_NUM(devid), devid & 0xff); 
if (pdev) dev_data = dev_iommu_priv_get(>dev); @@ -505,7 +505,7 @@ static void amd_iommu_report_rmp_hw_error(volatile u32 *event) pci_dev_put(pdev); } -static void amd_iommu_report_rmp_fault(volatile u32 *event) +static void amd_iommu_report_rmp_fault(struct amd_iommu *iommu, volatile u32 *event) { struct iommu_dev_data *dev_data = NULL; int devid, flags_rmp, vmg_tag, flags; @@ -518,7 +518,7 @@ static void amd_iommu_report_rmp_fault(volatile u32 *event) flags = (event[1] >> EVENT_FLAGS_SHIFT) & EVENT_FLAGS_MASK; gpa = ((u64)event[3] << 32) | event[2]; - pdev = pci_get_domain_bus_and_slot(0, PCI_BUS_NUM(devid), + pdev = pci_get_domain_bus_and_slot(iommu->pci_seg->id, PCI_BUS_NUM(devid), devid & 0xff); if (pdev) dev_data = dev_iommu_priv_get(>dev); @@ -544,13 +544,14 @@ static void amd_iommu_report_rmp_fault(volatile u32 *event) #define IS_WRITE_REQUEST(flags)\ ((flags) & EVENT_FLAG_RW) -static void amd_iommu_report_page_fault(u16 devid, u16 domain_id, +static void amd_iommu_report_page_fault(struct amd_iommu *iommu, + u16 devid, u16 domain_id, u64 address, int flags) { struct iommu_dev_data *dev_data = NULL; struct pci_dev *pdev; - pdev = pci_get_domain_bus_and_slot(0, PCI_BUS_NUM(devid), + pdev = pci_get_domain_bus_and_slot(iommu->pci_seg->id, PCI_BUS_NUM(devid), devid & 0xff); if (pdev) dev_data = dev_iommu_priv_get(>dev); @@ -613,7 +614,7 @@ static void iommu_print_event(struct amd_iommu *iommu, void *__evt) } if (type == EVENT_TYPE_IO_FAULT) { - amd_iommu_report_page_fault(devid, pasid, address, flags); + amd_iommu_report_page_fault(iommu, devid, pasid, address, flags); return; } @@ -654,10 +655,10 @@ static void iommu_print_event(struct amd_iommu *iommu, void *__evt) pasid, address, flags); break; case EVENT_TYPE_RMP_FAULT: - amd_iommu_report_rmp_fault(event); + amd_iommu_report_rmp_fault(iommu, event); break; case EVENT_TYPE_RMP_HW_ERR: - amd_iommu_report_rmp_hw_error(event); +
[PATCH v3 RESEND 30/35] iommu/amd: Include PCI segment ID when initialize IOMMU
From: Suravee Suthikulpanit Extend current device ID variables to 32-bit to include the 16-bit segment ID when parsing device information from IVRS table to initialize each IOMMU. Co-developed-by: Vasant Hegde Signed-off-by: Vasant Hegde Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/amd_iommu.h | 2 +- drivers/iommu/amd/amd_iommu_types.h | 6 ++-- drivers/iommu/amd/init.c| 56 +++-- drivers/iommu/amd/quirks.c | 4 +-- 4 files changed, 35 insertions(+), 33 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h index e73bd48fc716..9b7092182ca7 100644 --- a/drivers/iommu/amd/amd_iommu.h +++ b/drivers/iommu/amd/amd_iommu.h @@ -125,7 +125,7 @@ static inline int get_pci_sbdf_id(struct pci_dev *pdev) extern bool translation_pre_enabled(struct amd_iommu *iommu); extern bool amd_iommu_is_attach_deferred(struct device *dev); -extern int __init add_special_device(u8 type, u8 id, u16 *devid, +extern int __init add_special_device(u8 type, u8 id, u32 *devid, bool cmd_line); #ifdef CONFIG_DMI diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index ea238e8e6c99..1ca54803702a 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -744,8 +744,8 @@ struct acpihid_map_entry { struct list_head list; u8 uid[ACPIHID_UID_LEN]; u8 hid[ACPIHID_HID_LEN]; - u16 devid; - u16 root_devid; + u32 devid; + u32 root_devid; bool cmd_line; struct iommu_group *group; }; @@ -753,7 +753,7 @@ struct acpihid_map_entry { struct devid_map { struct list_head list; u8 id; - u16 devid; + u32 devid; bool cmd_line; }; diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index df8f4b9d20cd..d35081d84460 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -1148,7 +1148,7 @@ static void __init set_dev_entry_from_acpi(struct amd_iommu *iommu, amd_iommu_set_rlookup_table(iommu, devid); } -int __init add_special_device(u8 type, u8 id, u16 *devid, bool cmd_line) +int 
__init add_special_device(u8 type, u8 id, u32 *devid, bool cmd_line) { struct devid_map *entry; struct list_head *list; @@ -1185,7 +1185,7 @@ int __init add_special_device(u8 type, u8 id, u16 *devid, bool cmd_line) return 0; } -static int __init add_acpi_hid_device(u8 *hid, u8 *uid, u16 *devid, +static int __init add_acpi_hid_device(u8 *hid, u8 *uid, u32 *devid, bool cmd_line) { struct acpihid_map_entry *entry; @@ -1264,7 +1264,7 @@ static int __init init_iommu_from_acpi(struct amd_iommu *iommu, { u8 *p = (u8 *)h; u8 *end = p, flags = 0; - u16 devid = 0, devid_start = 0, devid_to = 0; + u16 devid = 0, devid_start = 0, devid_to = 0, seg_id; u32 dev_i, ext_flags = 0; bool alias = false; struct ivhd_entry *e; @@ -1300,6 +1300,8 @@ static int __init init_iommu_from_acpi(struct amd_iommu *iommu, while (p < end) { e = (struct ivhd_entry *)p; + seg_id = pci_seg->id; + switch (e->type) { case IVHD_DEV_ALL: @@ -1310,9 +1312,9 @@ static int __init init_iommu_from_acpi(struct amd_iommu *iommu, break; case IVHD_DEV_SELECT: - DUMP_printk(" DEV_SELECT\t\t\t devid: %02x:%02x.%x " + DUMP_printk(" DEV_SELECT\t\t\t devid: %04x:%02x:%02x.%x " "flags: %02x\n", - PCI_BUS_NUM(e->devid), + seg_id, PCI_BUS_NUM(e->devid), PCI_SLOT(e->devid), PCI_FUNC(e->devid), e->flags); @@ -1323,8 +1325,8 @@ static int __init init_iommu_from_acpi(struct amd_iommu *iommu, case IVHD_DEV_SELECT_RANGE_START: DUMP_printk(" DEV_SELECT_RANGE_START\t " - "devid: %02x:%02x.%x flags: %02x\n", - PCI_BUS_NUM(e->devid), + "devid: %04x:%02x:%02x.%x flags: %02x\n", + seg_id, PCI_BUS_NUM(e->devid), PCI_SLOT(e->devid), PCI_FUNC(e->devid), e->flags); @@ -1336,9 +1338,9 @@ static int __init init_iommu_from_acpi(struct amd_iommu *iommu, break; case IVHD_DEV_ALIAS: - DUMP_printk(" DEV_ALIAS\t\t\t devid: %02x:%02x.%x " + DUMP_printk(" DEV_ALIAS\t\t\t devid: %04x:%02x:%02x.%x " "flags:
[PATCH v3 RESEND 29/35] iommu/amd: Introduce get_device_sbdf_id() helper function
From: Suravee Suthikulpanit Current get_device_id() only provide 16-bit PCI device ID (i.e. BDF). With multiple PCI segment support, we need to extend the helper function to include PCI segment ID. So, introduce a new helper function get_device_sbdf_id() to replace the current get_pci_device_id(). Co-developed-by: Vasant Hegde Signed-off-by: Vasant Hegde Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/amd_iommu.h | 7 drivers/iommu/amd/amd_iommu_types.h | 2 + drivers/iommu/amd/iommu.c | 58 ++--- 3 files changed, 38 insertions(+), 29 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h index 64c954e168d7..e73bd48fc716 100644 --- a/drivers/iommu/amd/amd_iommu.h +++ b/drivers/iommu/amd/amd_iommu.h @@ -115,6 +115,13 @@ void amd_iommu_domain_clr_pt_root(struct protection_domain *domain) amd_iommu_domain_set_pt_root(domain, 0); } +static inline int get_pci_sbdf_id(struct pci_dev *pdev) +{ + int seg = pci_domain_nr(pdev->bus); + u16 devid = pci_dev_id(pdev); + + return PCI_SEG_DEVID_TO_SBDF(seg, devid); +} extern bool translation_pre_enabled(struct amd_iommu *iommu); extern bool amd_iommu_is_attach_deferred(struct device *dev); diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 65b02e2ae28f..ea238e8e6c99 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -456,6 +456,8 @@ extern struct kmem_cache *amd_iommu_irq_cache; #define PCI_SBDF_TO_SEGID(sbdf)(((sbdf) >> 16) & 0x) #define PCI_SBDF_TO_DEVID(sbdf)((sbdf) & 0x) +#define PCI_SEG_DEVID_TO_SBDF(seg, devid) u32)(seg) & 0x) << 16) | \ +((devid) & 0x)) /* Make iterating over all pci segment easier */ #define for_each_pci_segment(pci_seg) \ diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 6914911d4fb6..0751dda04a10 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -92,13 +92,6 @@ static void detach_device(struct device *dev); * / -static inline u16 
get_pci_device_id(struct device *dev) -{ - struct pci_dev *pdev = to_pci_dev(dev); - - return pci_dev_id(pdev); -} - static inline int get_acpihid_device_id(struct device *dev, struct acpihid_map_entry **entry) { @@ -119,16 +112,16 @@ static inline int get_acpihid_device_id(struct device *dev, return -EINVAL; } -static inline int get_device_id(struct device *dev) +static inline int get_device_sbdf_id(struct device *dev) { - int devid; + int sbdf; if (dev_is_pci(dev)) - devid = get_pci_device_id(dev); + sbdf = get_pci_sbdf_id(to_pci_dev(dev)); else - devid = get_acpihid_device_id(dev, NULL); + sbdf = get_acpihid_device_id(dev, NULL); - return devid; + return sbdf; } struct dev_table_entry *get_dev_table(struct amd_iommu *iommu) @@ -182,9 +175,11 @@ static struct amd_iommu *__rlookup_amd_iommu(u16 seg, u16 devid) static struct amd_iommu *rlookup_amd_iommu(struct device *dev) { u16 seg = get_device_segment(dev); - u16 devid = get_device_id(dev); + int devid = get_device_sbdf_id(dev); - return __rlookup_amd_iommu(seg, devid); + if (devid < 0) + return NULL; + return __rlookup_amd_iommu(seg, PCI_SBDF_TO_DEVID(devid)); } static struct protection_domain *to_pdomain(struct iommu_domain *dom) @@ -360,14 +355,15 @@ static bool check_device(struct device *dev) { struct amd_iommu_pci_seg *pci_seg; struct amd_iommu *iommu; - int devid; + int devid, sbdf; if (!dev) return false; - devid = get_device_id(dev); - if (devid < 0) + sbdf = get_device_sbdf_id(dev); + if (sbdf < 0) return false; + devid = PCI_SBDF_TO_DEVID(sbdf); iommu = rlookup_amd_iommu(dev); if (!iommu) @@ -375,7 +371,7 @@ static bool check_device(struct device *dev) /* Out of our scope? 
*/ pci_seg = iommu->pci_seg; - if ((devid & 0x) > pci_seg->last_bdf) + if (devid > pci_seg->last_bdf) return false; return true; @@ -384,15 +380,16 @@ static bool check_device(struct device *dev) static int iommu_init_device(struct amd_iommu *iommu, struct device *dev) { struct iommu_dev_data *dev_data; - int devid; + int devid, sbdf; if (dev_iommu_priv_get(dev)) return 0; - devid = get_device_id(dev); - if (devid < 0) - return devid; + sbdf = get_device_sbdf_id(dev); + if (sbdf < 0) + return sbdf; + devid =
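A short sketch of the segment/devid split that get_device_sbdf_id() relies on, written as plain user-space functions rather than the kernel macros. The function names are hypothetical and the 0xffff masks are assumptions, since the masks in the PCI_SBDF_TO_SEGID/PCI_SBDF_TO_DEVID text above are truncated. Because the helper returns int and callers treat negative values as errors, the sketch also includes a check for whether a packed value stays non-negative as a 32-bit int.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Combine a 16-bit segment and a 16-bit devid (BDF) into one SBDF value. */
static uint32_t seg_devid_to_sbdf(uint16_t seg, uint16_t devid)
{
	return ((uint32_t)seg << 16) | devid;
}

/* Upper 16 bits: PCI segment ID. */
static uint16_t sbdf_to_segid(uint32_t sbdf)
{
	return (sbdf >> 16) & 0xffff;
}

/* Lower 16 bits: bus/device/function. */
static uint16_t sbdf_to_devid(uint32_t sbdf)
{
	return sbdf & 0xffff;
}

/* True if the value can be returned as a non-negative 32-bit int. */
static bool sbdf_fits_in_int(uint32_t sbdf)
{
	return sbdf <= INT32_MAX;
}

/* Convert to int for a "negative means error" return convention. */
static int sbdf_as_int(uint32_t sbdf)
{
	assert(sbdf_fits_in_int(sbdf)); /* caller is expected to check first */
	return (int)sbdf;
}
```

Under this layout a segment ID of 0x8000 or above would set the top bit of the packed value; whether such segment IDs occur in practice is not stated in the patch.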
[PATCH v3 RESEND 28/35] iommu/amd: Flush upto last_bdf only
Fix amd_iommu_flush_dte_all() and amd_iommu_flush_tlb_all() to flush upto last_bdf only. Co-developed-by: Suravee Suthikulpanit Signed-off-by: Suravee Suthikulpanit Signed-off-by: Vasant Hegde --- drivers/iommu/amd/iommu.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 94ebffe15960..6914911d4fb6 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -1191,8 +1191,9 @@ static int iommu_flush_dte(struct amd_iommu *iommu, u16 devid) static void amd_iommu_flush_dte_all(struct amd_iommu *iommu) { u32 devid; + u16 last_bdf = iommu->pci_seg->last_bdf; - for (devid = 0; devid <= 0x; ++devid) + for (devid = 0; devid <= last_bdf; ++devid) iommu_flush_dte(iommu, devid); iommu_completion_wait(iommu); @@ -1205,8 +1206,9 @@ static void amd_iommu_flush_dte_all(struct amd_iommu *iommu) static void amd_iommu_flush_tlb_all(struct amd_iommu *iommu) { u32 dom_id; + u16 last_bdf = iommu->pci_seg->last_bdf; - for (dom_id = 0; dom_id <= 0x; ++dom_id) { + for (dom_id = 0; dom_id <= last_bdf; ++dom_id) { struct iommu_cmd cmd; build_inv_iommu_pages(, 0, CMD_INV_IOMMU_ALL_PAGES_ADDRESS, dom_id, 1); @@ -1249,8 +1251,9 @@ static void iommu_flush_irt(struct amd_iommu *iommu, u16 devid) static void amd_iommu_flush_irt_all(struct amd_iommu *iommu) { u32 devid; + u16 last_bdf = iommu->pci_seg->last_bdf; - for (devid = 0; devid <= MAX_DEV_TABLE_ENTRIES; devid++) + for (devid = 0; devid <= last_bdf; devid++) iommu_flush_irt(iommu, devid); iommu_completion_wait(iommu); -- 2.31.1
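The flush loops in this patch keep a u32 counter while the new bound is a u16. A small user-space sketch of why that pairing matters, assuming last_bdf may reach the largest 16-bit value (0xffff is assumed here; the constant in the diff text is truncated). count_ids() is a hypothetical name, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count the IDs visited by an inclusive loop 0..last_bdf.
 * The counter is 32 bits wide on purpose: with a 16-bit counter and
 * last_bdf == 0xffff, "devid <= last_bdf" would always be true and
 * the loop would wrap around instead of terminating.
 */
static uint32_t count_ids(uint16_t last_bdf)
{
	uint32_t devid, n = 0;

	for (devid = 0; devid <= last_bdf; ++devid)
		n++;

	/* An inclusive range 0..last_bdf has last_bdf + 1 members. */
	assert(n == (uint32_t)last_bdf + 1);
	return n;
}
```

The same reasoning applies to any of the three loops touched by the diff, since each uses the inclusive "<= last_bdf" form.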
[PATCH v3 RESEND 27/35] iommu/amd: Remove global amd_iommu_[dev_table/alias_table/last_bdf]
From: Suravee Suthikulpanit Replace them with per PCI segment device table. Also remove dev_table_size, alias_table_size, amd_iommu_last_bdf variables. Co-developed-by: Vasant Hegde Signed-off-by: Vasant Hegde Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/amd_iommu_types.h | 15 - drivers/iommu/amd/init.c| 89 + drivers/iommu/amd/iommu.c | 18 -- 3 files changed, 27 insertions(+), 95 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index d932c90329e4..65b02e2ae28f 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -834,24 +834,9 @@ struct unity_map_entry { * Data structures for device handling */ -/* - * Device table used by hardware. Read and write accesses by software are - * locked with the amd_iommu_pd_table lock. - */ -extern struct dev_table_entry *amd_iommu_dev_table; - -/* - * Alias table to find requestor ids to device ids. Not locked because only - * read on runtime. - */ -extern u16 *amd_iommu_alias_table; - /* size of the dma_ops aperture as power of 2 */ extern unsigned amd_iommu_aperture_order; -/* largest PCI device id we expect translation requests for */ -extern u16 amd_iommu_last_bdf; - /* allocation bitmap for domain ids */ extern unsigned long *amd_iommu_pd_alloc_bitmap; diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 508959182c7f..df8f4b9d20cd 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -160,9 +160,6 @@ static bool amd_iommu_disabled __initdata; static bool amd_iommu_force_enable __initdata; static int amd_iommu_target_ivhd_type; -u16 amd_iommu_last_bdf;/* largest PCI device id we have - to handle */ - LIST_HEAD(amd_iommu_pci_seg_list); /* list of all PCI segments */ LIST_HEAD(amd_iommu_list); /* list of all AMD IOMMUs in the system */ @@ -185,30 +182,12 @@ bool amdr_ivrs_remap_support __read_mostly; bool amd_iommu_force_isolation __read_mostly; -/* - * Pointer to the device table which is 
shared by all AMD IOMMUs - * it is indexed by the PCI device id or the HT unit id and contains - * information about the domain the device belongs to as well as the - * page table root pointer. - */ -struct dev_table_entry *amd_iommu_dev_table; - -/* - * The alias table is a driver specific data structure which contains the - * mappings of the PCI device ids to the actual requestor ids on the IOMMU. - * More than one device can share the same requestor id. - */ -u16 *amd_iommu_alias_table; - /* * AMD IOMMU allows up to 2^16 different protection domains. This is a bitmap * to know which ones are already in use. */ unsigned long *amd_iommu_pd_alloc_bitmap; -static u32 dev_table_size; /* size of the device table */ -static u32 alias_table_size; /* size of the alias table */ - enum iommu_init_state { IOMMU_START_STATE, IOMMU_IVRS_DETECTED, @@ -263,16 +242,10 @@ static void init_translation_status(struct amd_iommu *iommu) iommu->flags |= AMD_IOMMU_FLAG_TRANS_PRE_ENABLED; } -static inline void update_last_devid(u16 devid) -{ - if (devid > amd_iommu_last_bdf) - amd_iommu_last_bdf = devid; -} - -static inline unsigned long tbl_size(int entry_size) +static inline unsigned long tbl_size(int entry_size, int last_bdf) { unsigned shift = PAGE_SHIFT + -get_order(((int)amd_iommu_last_bdf + 1) * entry_size); +get_order((last_bdf + 1) * entry_size); return 1UL << shift; } @@ -404,10 +377,11 @@ static void iommu_set_device_table(struct amd_iommu *iommu) { u64 entry; u32 dev_table_size = iommu->pci_seg->dev_table_size; + void *dev_table = (void *)get_dev_table(iommu); BUG_ON(iommu->mmio_base == NULL); - entry = iommu_virt_to_phys(amd_iommu_dev_table); + entry = iommu_virt_to_phys(dev_table); entry |= (dev_table_size >> 12) - 1; memcpy_toio(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET, , sizeof(entry)); @@ -557,14 +531,12 @@ static int __init find_last_devid_from_ivhd(struct ivhd_header *h) switch (dev->type) { case IVHD_DEV_ALL: /* Use maximum BDF value for DEV_ALL */ - 
update_last_devid(0x); return 0x; case IVHD_DEV_SELECT: case IVHD_DEV_RANGE_END: case IVHD_DEV_ALIAS: case IVHD_DEV_EXT_SELECT: /* all the above subfield types refer to device ids */ - update_last_devid(dev->devid); if (dev->devid > last_devid) last_devid = dev->devid;
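A user-space sketch of the sizing rule in the reworked tbl_size(): the table size is the byte count for last_bdf + 1 entries, rounded up to a power-of-two number of pages. SKETCH_PAGE_SHIFT (4 KiB pages) and sketch_get_order() are assumptions standing in for the kernel's PAGE_SHIFT and get_order(); the 32-byte entry size used in the note below is inferred from the four 64-bit words printed by dump_dte_entry() elsewhere in this series.

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Smallest order such that (page size << order) >= size. */
static unsigned int sketch_get_order(unsigned long size)
{
	unsigned int order = 0;

	assert(size > 0);
	while ((1UL << (SKETCH_PAGE_SHIFT + order)) < size)
		order++;
	return order;
}

/* Bytes to allocate for a table holding entries 0..last_bdf. */
static unsigned long tbl_size(int entry_size, int last_bdf)
{
	unsigned int shift = SKETCH_PAGE_SHIFT +
		sketch_get_order((unsigned long)(last_bdf + 1) * entry_size);

	return 1UL << shift;
}
```

With 32-byte entries, a segment whose last_bdf is 0x00ff needs 8 KiB under this rule, while a full 16-bit range needs 2 MiB, which is the saving a per-segment last_bdf makes possible.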
[PATCH v3 RESEND 26/35] iommu/amd: Update set_dev_entry_bit() and get_dev_entry_bit()
From: Suravee Suthikulpanit To include a pointer to per PCI segment device table. Also include struct amd_iommu as one of the function parameter to amd_iommu_apply_erratum_63() since it is needed when setting up DTE. Co-developed-by: Vasant Hegde Signed-off-by: Vasant Hegde Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/amd_iommu.h | 2 +- drivers/iommu/amd/init.c | 59 +++ drivers/iommu/amd/iommu.c | 2 +- 3 files changed, 41 insertions(+), 22 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h index 2947239700ce..64c954e168d7 100644 --- a/drivers/iommu/amd/amd_iommu.h +++ b/drivers/iommu/amd/amd_iommu.h @@ -13,7 +13,7 @@ extern irqreturn_t amd_iommu_int_thread(int irq, void *data); extern irqreturn_t amd_iommu_int_handler(int irq, void *data); -extern void amd_iommu_apply_erratum_63(u16 devid); +extern void amd_iommu_apply_erratum_63(struct amd_iommu *iommu, u16 devid); extern void amd_iommu_restart_event_logging(struct amd_iommu *iommu); extern int amd_iommu_init_devices(void); extern void amd_iommu_uninit_devices(void); diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 3024fa9a89d5..508959182c7f 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -989,22 +989,37 @@ static void iommu_enable_gt(struct amd_iommu *iommu) } /* sets a specific bit in the device table entry. 
*/ -static void set_dev_entry_bit(u16 devid, u8 bit) +static void __set_dev_entry_bit(struct dev_table_entry *dev_table, + u16 devid, u8 bit) { int i = (bit >> 6) & 0x03; int _bit = bit & 0x3f; - amd_iommu_dev_table[devid].data[i] |= (1UL << _bit); + dev_table[devid].data[i] |= (1UL << _bit); } -static int get_dev_entry_bit(u16 devid, u8 bit) +static void set_dev_entry_bit(struct amd_iommu *iommu, u16 devid, u8 bit) +{ + struct dev_table_entry *dev_table = get_dev_table(iommu); + + return __set_dev_entry_bit(dev_table, devid, bit); +} + +static int __get_dev_entry_bit(struct dev_table_entry *dev_table, + u16 devid, u8 bit) { int i = (bit >> 6) & 0x03; int _bit = bit & 0x3f; - return (amd_iommu_dev_table[devid].data[i] & (1UL << _bit)) >> _bit; + return (dev_table[devid].data[i] & (1UL << _bit)) >> _bit; } +static int get_dev_entry_bit(struct amd_iommu *iommu, u16 devid, u8 bit) +{ + struct dev_table_entry *dev_table = get_dev_table(iommu); + + return __get_dev_entry_bit(dev_table, devid, bit); +} static bool __copy_device_table(struct amd_iommu *iommu) { @@ -1123,15 +1138,15 @@ static bool copy_device_table(void) return true; } -void amd_iommu_apply_erratum_63(u16 devid) +void amd_iommu_apply_erratum_63(struct amd_iommu *iommu, u16 devid) { int sysmgt; - sysmgt = get_dev_entry_bit(devid, DEV_ENTRY_SYSMGT1) | -(get_dev_entry_bit(devid, DEV_ENTRY_SYSMGT2) << 1); + sysmgt = get_dev_entry_bit(iommu, devid, DEV_ENTRY_SYSMGT1) | +(get_dev_entry_bit(iommu, devid, DEV_ENTRY_SYSMGT2) << 1); if (sysmgt == 0x01) - set_dev_entry_bit(devid, DEV_ENTRY_IW); + set_dev_entry_bit(iommu, devid, DEV_ENTRY_IW); } /* Writes the specific IOMMU for a device into the rlookup table */ @@ -1148,21 +1163,21 @@ static void __init set_dev_entry_from_acpi(struct amd_iommu *iommu, u16 devid, u32 flags, u32 ext_flags) { if (flags & ACPI_DEVFLAG_INITPASS) - set_dev_entry_bit(devid, DEV_ENTRY_INIT_PASS); + set_dev_entry_bit(iommu, devid, DEV_ENTRY_INIT_PASS); if (flags & ACPI_DEVFLAG_EXTINT) - 
set_dev_entry_bit(devid, DEV_ENTRY_EINT_PASS); + set_dev_entry_bit(iommu, devid, DEV_ENTRY_EINT_PASS); if (flags & ACPI_DEVFLAG_NMI) - set_dev_entry_bit(devid, DEV_ENTRY_NMI_PASS); + set_dev_entry_bit(iommu, devid, DEV_ENTRY_NMI_PASS); if (flags & ACPI_DEVFLAG_SYSMGT1) - set_dev_entry_bit(devid, DEV_ENTRY_SYSMGT1); + set_dev_entry_bit(iommu, devid, DEV_ENTRY_SYSMGT1); if (flags & ACPI_DEVFLAG_SYSMGT2) - set_dev_entry_bit(devid, DEV_ENTRY_SYSMGT2); + set_dev_entry_bit(iommu, devid, DEV_ENTRY_SYSMGT2); if (flags & ACPI_DEVFLAG_LINT0) - set_dev_entry_bit(devid, DEV_ENTRY_LINT0_PASS); + set_dev_entry_bit(iommu, devid, DEV_ENTRY_LINT0_PASS); if (flags & ACPI_DEVFLAG_LINT1) - set_dev_entry_bit(devid, DEV_ENTRY_LINT1_PASS); + set_dev_entry_bit(iommu, devid, DEV_ENTRY_LINT1_PASS); - amd_iommu_apply_erratum_63(devid); + amd_iommu_apply_erratum_63(iommu, devid); set_iommu_for_device(iommu, devid); } @@ -2519,8 +2534,8 @@ static void init_device_table_dma(struct
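A user-space sketch of the bit addressing used by __set_dev_entry_bit() and __get_dev_entry_bit(): bits 7:6 of the bit number select one of four 64-bit words and bits 5:0 select the bit inside that word. The struct layout (four 64-bit words per entry) is an assumption based on the data[] accesses in the diff, and the function names are hypothetical. 1ULL is used instead of the patch's 1UL so the shift is 64 bits wide on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: one device table entry is four 64-bit words. */
struct dev_table_entry {
	uint64_t data[4];
};

/* Set bit number `bit` (0..255) in the entry for `devid`. */
static void set_entry_bit(struct dev_table_entry *tbl, uint16_t devid, uint8_t bit)
{
	int i = (bit >> 6) & 0x03; /* which 64-bit word */
	int b = bit & 0x3f;        /* which bit inside that word */

	assert(tbl != NULL);
	tbl[devid].data[i] |= (1ULL << b);
}

/* Return 1 if bit number `bit` is set in the entry for `devid`, else 0. */
static int get_entry_bit(const struct dev_table_entry *tbl, uint16_t devid, uint8_t bit)
{
	int i = (bit >> 6) & 0x03;
	int b = bit & 0x3f;

	assert(tbl != NULL);
	return (int)((tbl[devid].data[i] >> b) & 1);
}
```

Passing the table pointer explicitly, as the patch does through get_dev_table(iommu), is what lets the same helpers work on any segment's table rather than a single global one.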
[PATCH v3 RESEND 25/35] iommu/amd: Update (un)init_device_table_dma()
From: Suravee Suthikulpanit Include struct amd_iommu_pci_seg as a function parameter since we need to access per PCI segment device table. Co-developed-by: Vasant Hegde Signed-off-by: Vasant Hegde Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/init.c | 27 --- 1 file changed, 20 insertions(+), 7 deletions(-) diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index b7e54bb5efc5..3024fa9a89d5 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -238,7 +238,7 @@ static enum iommu_init_state init_state = IOMMU_START_STATE; static int amd_iommu_enable_interrupts(void); static int __init iommu_go_to_state(enum iommu_init_state state); -static void init_device_table_dma(void); +static void init_device_table_dma(struct amd_iommu_pci_seg *pci_seg); static bool amd_iommu_pre_enabled = true; @@ -2116,6 +2116,7 @@ static void print_iommu_info(void) static int __init amd_iommu_init_pci(void) { struct amd_iommu *iommu; + struct amd_iommu_pci_seg *pci_seg; int ret; for_each_iommu(iommu) { @@ -2146,7 +2147,8 @@ static int __init amd_iommu_init_pci(void) goto out; } - init_device_table_dma(); + for_each_pci_segment(pci_seg) + init_device_table_dma(pci_seg); for_each_iommu(iommu) iommu_flush_all_caches(iommu); @@ -2508,9 +2510,13 @@ static int __init init_memory_definitions(struct acpi_table_header *table) /* * Init the device table to not allow DMA access for devices */ -static void init_device_table_dma(void) +static void init_device_table_dma(struct amd_iommu_pci_seg *pci_seg) { u32 devid; + struct dev_table_entry *dev_table = pci_seg->dev_table; + + if (dev_table == NULL) + return; for (devid = 0; devid <= amd_iommu_last_bdf; ++devid) { set_dev_entry_bit(devid, DEV_ENTRY_VALID); @@ -2518,13 +2524,17 @@ static void init_device_table_dma(void) } } -static void __init uninit_device_table_dma(void) +static void __init uninit_device_table_dma(struct amd_iommu_pci_seg *pci_seg) { u32 devid; + struct dev_table_entry *dev_table = 
pci_seg->dev_table; + + if (dev_table == NULL) + return; for (devid = 0; devid <= amd_iommu_last_bdf; ++devid) { - amd_iommu_dev_table[devid].data[0] = 0ULL; - amd_iommu_dev_table[devid].data[1] = 0ULL; + dev_table[devid].data[0] = 0ULL; + dev_table[devid].data[1] = 0ULL; } } @@ -3117,8 +3127,11 @@ static int __init state_next(void) free_iommu_resources(); } else { struct amd_iommu *iommu; + struct amd_iommu_pci_seg *pci_seg; + + for_each_pci_segment(pci_seg) + uninit_device_table_dma(pci_seg); - uninit_device_table_dma(); for_each_iommu(iommu) iommu_flush_all_caches(iommu); } -- 2.31.1
[PATCH v3 RESEND 24/35] iommu/amd: Update set_dte_irq_entry
From: Suravee Suthikulpanit Start using per PCI segment device table instead of global device table. Signed-off-by: Suravee Suthikulpanit Signed-off-by: Vasant Hegde --- drivers/iommu/amd/iommu.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 790a3449e7b7..f1fab4168101 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -2731,18 +2731,20 @@ EXPORT_SYMBOL(amd_iommu_device_info); static struct irq_chip amd_ir_chip; static DEFINE_SPINLOCK(iommu_table_lock); -static void set_dte_irq_entry(u16 devid, struct irq_remap_table *table) +static void set_dte_irq_entry(struct amd_iommu *iommu, u16 devid, + struct irq_remap_table *table) { u64 dte; + struct dev_table_entry *dev_table = get_dev_table(iommu); - dte = amd_iommu_dev_table[devid].data[2]; + dte = dev_table[devid].data[2]; dte &= ~DTE_IRQ_PHYS_ADDR_MASK; dte |= iommu_virt_to_phys(table->table); dte |= DTE_IRQ_REMAP_INTCTL; dte |= DTE_INTTABLEN; dte |= DTE_IRQ_REMAP_ENABLE; - amd_iommu_dev_table[devid].data[2] = dte; + dev_table[devid].data[2] = dte; } static struct irq_remap_table *get_irq_table(struct amd_iommu *iommu, u16 devid) @@ -2793,7 +2795,7 @@ static void set_remap_table_entry(struct amd_iommu *iommu, u16 devid, struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg; pci_seg->irq_lookup_table[devid] = table; - set_dte_irq_entry(devid, table); + set_dte_irq_entry(iommu, devid, table); iommu_flush_dte(iommu, devid); } @@ -2809,8 +2811,7 @@ static int set_remap_table_entry_alias(struct pci_dev *pdev, u16 alias, pci_seg = iommu->pci_seg; pci_seg->irq_lookup_table[alias] = table; - set_dte_irq_entry(alias, table); - + set_dte_irq_entry(iommu, alias, table); iommu_flush_dte(pci_seg->rlookup_table[alias], alias); return 0; -- 2.31.1
[PATCH v3 RESEND 23/35] iommu/amd: Update dump_dte_entry
From: Suravee Suthikulpanit Start using per PCI segment device table instead of global device table. Signed-off-by: Suravee Suthikulpanit Signed-off-by: Vasant Hegde --- drivers/iommu/amd/iommu.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 90755da7cff0..790a3449e7b7 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -451,13 +451,13 @@ static void amd_iommu_uninit_device(struct device *dev) * / -static void dump_dte_entry(u16 devid) +static void dump_dte_entry(struct amd_iommu *iommu, u16 devid) { int i; + struct dev_table_entry *dev_table = get_dev_table(iommu); for (i = 0; i < 4; ++i) - pr_err("DTE[%d]: %016llx\n", i, - amd_iommu_dev_table[devid].data[i]); + pr_err("DTE[%d]: %016llx\n", i, dev_table[devid].data[i]); } static void dump_command(unsigned long phys_addr) @@ -618,7 +618,7 @@ static void iommu_print_event(struct amd_iommu *iommu, void *__evt) dev_err(dev, "Event logged [ILLEGAL_DEV_TABLE_ENTRY device=%02x:%02x.%x pasid=0x%05x address=0x%llx flags=0x%04x]\n", PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid), pasid, address, flags); - dump_dte_entry(devid); + dump_dte_entry(iommu, devid); break; case EVENT_TYPE_DEV_TAB_ERR: dev_err(dev, "Event logged [DEV_TAB_HARDWARE_ERROR device=%02x:%02x.%x " -- 2.31.1
[PATCH v3 RESEND 22/35] iommu/amd: Update iommu_ignore_device
From: Suravee Suthikulpanit Start using per PCI segment device table instead of global device table. Signed-off-by: Suravee Suthikulpanit Signed-off-by: Vasant Hegde --- drivers/iommu/amd/iommu.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 493cda5e0246..90755da7cff0 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -413,15 +413,15 @@ static int iommu_init_device(struct amd_iommu *iommu, struct device *dev) static void iommu_ignore_device(struct amd_iommu *iommu, struct device *dev) { struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg; + struct dev_table_entry *dev_table = get_dev_table(iommu); int devid; - devid = get_device_id(dev); + devid = (get_device_id(dev)) & 0xffff; if (devid < 0) return; - pci_seg->rlookup_table[devid] = NULL; - memset(&amd_iommu_dev_table[devid], 0, sizeof(struct dev_table_entry)); + memset(&dev_table[devid], 0, sizeof(struct dev_table_entry)); setup_aliases(iommu, dev); } -- 2.31.1
[PATCH v3 RESEND 21/35] iommu/amd: Update set_dte_entry and clear_dte_entry
From: Suravee Suthikulpanit Start using per PCI segment data structures instead of global data structures. Signed-off-by: Suravee Suthikulpanit Signed-off-by: Vasant Hegde --- drivers/iommu/amd/iommu.c | 19 +++ 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 6e0cd9c4f57c..493cda5e0246 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -1537,6 +1537,7 @@ static void set_dte_entry(struct amd_iommu *iommu, u16 devid, u64 pte_root = 0; u64 flags = 0; u32 old_domid; + struct dev_table_entry *dev_table = get_dev_table(iommu); if (domain->iop.mode != PAGE_MODE_NONE) pte_root = iommu_virt_to_phys(domain->iop.root); @@ -1545,7 +1546,7 @@ static void set_dte_entry(struct amd_iommu *iommu, u16 devid, << DEV_ENTRY_MODE_SHIFT; pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV; - flags = amd_iommu_dev_table[devid].data[1]; + flags = dev_table[devid].data[1]; if (ats) flags |= DTE_FLAG_IOTLB; @@ -1584,9 +1585,9 @@ static void set_dte_entry(struct amd_iommu *iommu, u16 devid, flags &= ~DEV_DOMID_MASK; flags |= domain->id; - old_domid = amd_iommu_dev_table[devid].data[1] & DEV_DOMID_MASK; - amd_iommu_dev_table[devid].data[1] = flags; - amd_iommu_dev_table[devid].data[0] = pte_root; + old_domid = dev_table[devid].data[1] & DEV_DOMID_MASK; + dev_table[devid].data[1] = flags; + dev_table[devid].data[0] = pte_root; /* * A kdump kernel might be replacing a domain ID that was copied from @@ -1598,11 +1599,13 @@ static void set_dte_entry(struct amd_iommu *iommu, u16 devid, } } -static void clear_dte_entry(u16 devid) +static void clear_dte_entry(struct amd_iommu *iommu, u16 devid) { + struct dev_table_entry *dev_table = get_dev_table(iommu); + /* remove entry from the device table seen by the hardware */ - amd_iommu_dev_table[devid].data[0] = DTE_FLAG_V | DTE_FLAG_TV; - amd_iommu_dev_table[devid].data[1] &= DTE_FLAG_MASK; + dev_table[devid].data[0] = DTE_FLAG_V | DTE_FLAG_TV; + 
dev_table[devid].data[1] &= DTE_FLAG_MASK; amd_iommu_apply_erratum_63(devid); } @@ -1646,7 +1649,7 @@ static void do_detach(struct iommu_dev_data *dev_data) /* Update data structures */ dev_data->domain = NULL; list_del(&dev_data->list); - clear_dte_entry(dev_data->devid); + clear_dte_entry(iommu, dev_data->devid); clone_aliases(iommu, dev_data->dev); /* Flush the DTE entry */ -- 2.31.1
[PATCH v3 RESEND 20/35] iommu/amd: Convert to use per PCI segment rlookup_table
Then, remove the global amd_iommu_rlookup_table and rlookup_table_size. Co-developed-by: Suravee Suthikulpanit Signed-off-by: Suravee Suthikulpanit Signed-off-by: Vasant Hegde --- drivers/iommu/amd/amd_iommu_types.h | 5 - drivers/iommu/amd/init.c| 23 ++- drivers/iommu/amd/iommu.c | 19 +-- 3 files changed, 11 insertions(+), 36 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 67feb847fc13..d932c90329e4 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -846,11 +846,6 @@ extern struct dev_table_entry *amd_iommu_dev_table; */ extern u16 *amd_iommu_alias_table; -/* - * Reverse lookup table to find the IOMMU which translates a specific device. - */ -extern struct amd_iommu **amd_iommu_rlookup_table; - /* size of the dma_ops aperture as power of 2 */ extern unsigned amd_iommu_aperture_order; diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index b7b50345c8a5..b7e54bb5efc5 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -200,12 +200,6 @@ struct dev_table_entry *amd_iommu_dev_table; */ u16 *amd_iommu_alias_table; -/* - * The rlookup table is used to find the IOMMU which is responsible - * for a specific device. It is also indexed by the PCI device id. - */ -struct amd_iommu **amd_iommu_rlookup_table; - /* * AMD IOMMU allows up to 2^16 different protection domains. This is a bitmap * to know which ones are already in use. 
@@ -214,7 +208,6 @@ unsigned long *amd_iommu_pd_alloc_bitmap; static u32 dev_table_size; /* size of the device table */ static u32 alias_table_size; /* size of the alias table */ -static u32 rlookup_table_size; /* size if the rlookup table */ enum iommu_init_state { IOMMU_START_STATE, @@ -1144,7 +1137,7 @@ void amd_iommu_apply_erratum_63(u16 devid) /* Writes the specific IOMMU for a device into the rlookup table */ static void __init set_iommu_for_device(struct amd_iommu *iommu, u16 devid) { - amd_iommu_rlookup_table[devid] = iommu; + iommu->pci_seg->rlookup_table[devid] = iommu; } /* @@ -1826,7 +1819,7 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h, * Make sure IOMMU is not considered to translate itself. The IVRS * table tells us so, but this is a lie! */ - amd_iommu_rlookup_table[iommu->devid] = NULL; + pci_seg->rlookup_table[iommu->devid] = NULL; return 0; } @@ -2783,10 +2776,6 @@ static void __init free_iommu_resources(void) kmem_cache_destroy(amd_iommu_irq_cache); amd_iommu_irq_cache = NULL; - free_pages((unsigned long)amd_iommu_rlookup_table, - get_order(rlookup_table_size)); - amd_iommu_rlookup_table = NULL; - free_pages((unsigned long)amd_iommu_alias_table, get_order(alias_table_size)); amd_iommu_alias_table = NULL; @@ -2925,7 +2914,6 @@ static int __init early_amd_iommu_init(void) dev_table_size = tbl_size(DEV_TABLE_ENTRY_SIZE); alias_table_size = tbl_size(ALIAS_TABLE_ENTRY_SIZE); - rlookup_table_size = tbl_size(RLOOKUP_TABLE_ENTRY_SIZE); /* Device table - directly used by all IOMMUs */ ret = -ENOMEM; @@ -2944,13 +2932,6 @@ static int __init early_amd_iommu_init(void) if (amd_iommu_alias_table == NULL) goto out; - /* IOMMU rlookup table - find the IOMMU for a specific device */ - amd_iommu_rlookup_table = (void *)__get_free_pages( - GFP_KERNEL | __GFP_ZERO, - get_order(rlookup_table_size)); - if (amd_iommu_rlookup_table == NULL) - goto out; - amd_iommu_pd_alloc_bitmap = (void *)__get_free_pages( GFP_KERNEL | __GFP_ZERO, 
get_order(MAX_DOMAIN_ID/8)); diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 5ee1af9a0a54..6e0cd9c4f57c 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -287,10 +287,9 @@ static void setup_aliases(struct amd_iommu *iommu, struct device *dev) clone_aliases(iommu, dev); } -static struct iommu_dev_data *find_dev_data(u16 devid) +static struct iommu_dev_data *find_dev_data(struct amd_iommu *iommu, u16 devid) { struct iommu_dev_data *dev_data; - struct amd_iommu *iommu = amd_iommu_rlookup_table[devid]; dev_data = search_dev_data(iommu, devid); @@ -388,7 +387,7 @@ static int iommu_init_device(struct amd_iommu *iommu, struct device *dev) if (devid < 0) return devid; - dev_data = find_dev_data(devid); + dev_data = find_dev_data(iommu, devid); if (!dev_data) return -ENOMEM; @@ -403,9 +402,6 @@ static int iommu_init_device(struct amd_iommu *iommu, struct
[PATCH v3 RESEND 19/35] iommu/amd: Update alloc_irq_table and alloc_irq_index
From: Suravee Suthikulpanit Pass the amd_iommu structure as a parameter to these functions, since it is needed to retrieve the various tables inside them. Co-developed-by: Vasant Hegde Signed-off-by: Vasant Hegde Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/iommu.c | 26 +- 1 file changed, 9 insertions(+), 17 deletions(-) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index c4701fa957d0..5ee1af9a0a54 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -2814,21 +2814,17 @@ static int set_remap_table_entry_alias(struct pci_dev *pdev, u16 alias, return 0; } -static struct irq_remap_table *alloc_irq_table(u16 devid, struct pci_dev *pdev) +static struct irq_remap_table *alloc_irq_table(struct amd_iommu *iommu, + u16 devid, struct pci_dev *pdev) { struct irq_remap_table *table = NULL; struct irq_remap_table *new_table = NULL; struct amd_iommu_pci_seg *pci_seg; - struct amd_iommu *iommu; unsigned long flags; u16 alias; spin_lock_irqsave(&iommu_table_lock, flags); - iommu = amd_iommu_rlookup_table[devid]; - if (!iommu) - goto out_unlock; - pci_seg = iommu->pci_seg; table = pci_seg->irq_lookup_table[devid]; if (table) @@ -2884,18 +2880,14 @@ static struct irq_remap_table *alloc_irq_table(u16 devid, struct pci_dev *pdev) return table; } -static int alloc_irq_index(u16 devid, int count, bool align, - struct pci_dev *pdev) +static int alloc_irq_index(struct amd_iommu *iommu, u16 devid, int count, + bool align, struct pci_dev *pdev) { struct irq_remap_table *table; int index, c, alignment = 1; unsigned long flags; - struct amd_iommu *iommu = amd_iommu_rlookup_table[devid]; - - if (!iommu) - return -ENODEV; - table = alloc_irq_table(devid, pdev); + table = alloc_irq_table(iommu, devid, pdev); if (!table) return -ENODEV; @@ -3267,7 +3259,7 @@ static int irq_remapping_alloc(struct irq_domain *domain, unsigned int virq, if (info->type == X86_IRQ_ALLOC_TYPE_IOAPIC) { struct irq_remap_table *table; - table = 
alloc_irq_table(devid, NULL); + table = alloc_irq_table(iommu, devid, NULL); if (table) { if (!table->min_index) { /* @@ -3287,10 +3279,10 @@ static int irq_remapping_alloc(struct irq_domain *domain, unsigned int virq, info->type == X86_IRQ_ALLOC_TYPE_PCI_MSIX) { bool align = (info->type == X86_IRQ_ALLOC_TYPE_PCI_MSI); - index = alloc_irq_index(devid, nr_irqs, align, + index = alloc_irq_index(iommu, devid, nr_irqs, align, msi_desc_to_pci_dev(info->desc)); } else { - index = alloc_irq_index(devid, nr_irqs, false, NULL); + index = alloc_irq_index(iommu, devid, nr_irqs, false, NULL); } if (index < 0) { @@ -3416,8 +3408,8 @@ static int irq_remapping_select(struct irq_domain *d, struct irq_fwspec *fwspec, if (devid < 0) return 0; + iommu = __rlookup_amd_iommu((devid >> 16), (devid & 0xffff)); - iommu = amd_iommu_rlookup_table[devid]; return iommu && iommu->ir_domain == d; } -- 2.31.1
[PATCH v3 RESEND 18/35] iommu/amd: Update amd_irte_ops functions
From: Suravee Suthikulpanit Pass the amd_iommu structure as a parameter to the amd_irte_ops functions, since it is needed to activate/deactivate the iommu. Signed-off-by: Suravee Suthikulpanit Signed-off-by: Vasant Hegde --- drivers/iommu/amd/amd_iommu_types.h | 6 ++-- drivers/iommu/amd/iommu.c | 51 - 2 files changed, 24 insertions(+), 33 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 693926afdd0f..67feb847fc13 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -1007,9 +1007,9 @@ struct amd_ir_data { struct amd_irte_ops { void (*prepare)(void *, u32, bool, u8, u32, int); - void (*activate)(void *, u16, u16); - void (*deactivate)(void *, u16, u16); - void (*set_affinity)(void *, u16, u16, u8, u32); + void (*activate)(struct amd_iommu *iommu, void *, u16, u16); + void (*deactivate)(struct amd_iommu *iommu, void *, u16, u16); + void (*set_affinity)(struct amd_iommu *iommu, void *, u16, u16, u8, u32); void *(*get)(struct irq_remap_table *, int); void (*set_allocated)(struct irq_remap_table *, int); bool (*is_allocated)(struct irq_remap_table *, int); diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 9f373b164762..c4701fa957d0 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -2934,19 +2934,14 @@ static int alloc_irq_index(u16 devid, int count, bool align, return index; } -static int modify_irte_ga(u16 devid, int index, struct irte_ga *irte, - struct amd_ir_data *data) +static int modify_irte_ga(struct amd_iommu *iommu, u16 devid, int index, + struct irte_ga *irte, struct amd_ir_data *data) { bool ret; struct irq_remap_table *table; - struct amd_iommu *iommu; unsigned long flags; struct irte_ga *entry; - iommu = amd_iommu_rlookup_table[devid]; - if (iommu == NULL) - return -EINVAL; - table = get_irq_table(iommu, devid); if (!table) return -ENOMEM; @@ -2978,16 +2973,12 @@ static int modify_irte_ga(u16 devid, int index, struct 
irte_ga *irte, return 0; } -static int modify_irte(u16 devid, int index, union irte *irte) +static int modify_irte(struct amd_iommu *iommu, + u16 devid, int index, union irte *irte) { struct irq_remap_table *table; - struct amd_iommu *iommu; unsigned long flags; - iommu = amd_iommu_rlookup_table[devid]; - if (iommu == NULL) - return -EINVAL; - table = get_irq_table(iommu, devid); if (!table) return -ENOMEM; @@ -3049,49 +3040,49 @@ static void irte_ga_prepare(void *entry, irte->lo.fields_remap.valid = 1; } -static void irte_activate(void *entry, u16 devid, u16 index) +static void irte_activate(struct amd_iommu *iommu, void *entry, u16 devid, u16 index) { union irte *irte = (union irte *) entry; irte->fields.valid = 1; - modify_irte(devid, index, irte); + modify_irte(iommu, devid, index, irte); } -static void irte_ga_activate(void *entry, u16 devid, u16 index) +static void irte_ga_activate(struct amd_iommu *iommu, void *entry, u16 devid, u16 index) { struct irte_ga *irte = (struct irte_ga *) entry; irte->lo.fields_remap.valid = 1; - modify_irte_ga(devid, index, irte, NULL); + modify_irte_ga(iommu, devid, index, irte, NULL); } -static void irte_deactivate(void *entry, u16 devid, u16 index) +static void irte_deactivate(struct amd_iommu *iommu, void *entry, u16 devid, u16 index) { union irte *irte = (union irte *) entry; irte->fields.valid = 0; - modify_irte(devid, index, irte); + modify_irte(iommu, devid, index, irte); } -static void irte_ga_deactivate(void *entry, u16 devid, u16 index) +static void irte_ga_deactivate(struct amd_iommu *iommu, void *entry, u16 devid, u16 index) { struct irte_ga *irte = (struct irte_ga *) entry; irte->lo.fields_remap.valid = 0; - modify_irte_ga(devid, index, irte, NULL); + modify_irte_ga(iommu, devid, index, irte, NULL); } -static void irte_set_affinity(void *entry, u16 devid, u16 index, +static void irte_set_affinity(struct amd_iommu *iommu, void *entry, u16 devid, u16 index, u8 vector, u32 dest_apicid) { union irte *irte = (union irte 
*) entry; irte->fields.vector = vector; irte->fields.destination = dest_apicid; - modify_irte(devid, index, irte); + modify_irte(iommu, devid, index, irte); } -static void irte_ga_set_affinity(void *entry, u16 devid, u16 index, +static void irte_ga_set_affinity(struct amd_iommu *iommu, void *entry, u16 devid, u16 index, u8 vector, u32
[PATCH v3 RESEND 17/35] iommu/amd: Introduce struct amd_ir_data.iommu
From: Suravee Suthikulpanit Add a pointer to struct amd_iommu to amd_ir_data structure, which can be used to correlate interrupt remapping data to a per-PCI-segment interrupt remapping table. Co-developed-by: Vasant Hegde Signed-off-by: Vasant Hegde Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/amd_iommu_types.h | 1 + drivers/iommu/amd/iommu.c | 34 + 2 files changed, 16 insertions(+), 19 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index ca1a3d55cc83..693926afdd0f 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -989,6 +989,7 @@ struct irq_2_irte { struct amd_ir_data { u32 cached_ga_tag; + struct amd_iommu *iommu; struct irq_2_irte irq_2_irte; struct msi_msg msi_entry; void *entry;/* Pointer to union irte or struct irte_ga */ diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 5e4648cadff9..9f373b164762 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -3002,16 +3002,11 @@ static int modify_irte(u16 devid, int index, union irte *irte) return 0; } -static void free_irte(u16 devid, int index) +static void free_irte(struct amd_iommu *iommu, u16 devid, int index) { struct irq_remap_table *table; - struct amd_iommu *iommu; unsigned long flags; - iommu = amd_iommu_rlookup_table[devid]; - if (iommu == NULL) - return; - table = get_irq_table(iommu, devid); if (!table) return; @@ -3195,7 +3190,7 @@ static void irq_remapping_prepare_irte(struct amd_ir_data *data, int devid, int index, int sub_handle) { struct irq_2_irte *irte_info = &data->irq_2_irte; - struct amd_iommu *iommu = amd_iommu_rlookup_table[devid]; + struct amd_iommu *iommu = data->iommu; if (!iommu) return; @@ -3336,6 +3331,7 @@ static int irq_remapping_alloc(struct irq_domain *domain, unsigned int virq, goto out_free_data; } + data->iommu = iommu; irq_data->hwirq = (devid << 16) + i; irq_data->chip_data = data; irq_data->chip = &amd_ir_chip; @@ -3352,7 +3348,7 @@ 
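A rough illustration of the idea in this patch, for readers who want it outside the diff: the IOMMU is resolved once when the remapping data is allocated, stored in the data, and later callbacks read it back instead of indexing a global table by devid alone. The types and functions below (all prefixed sketch_) are hypothetical stand-ins, not the kernel's amd_ir_data or irq-domain callbacks, and the error value -1 is a placeholder for whatever the real code returns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an IOMMU instance; only the segment id matters. */
struct sketch_iommu {
	uint16_t seg_id;
};

/* Hypothetical stand-in for per-interrupt remapping data. */
struct sketch_ir_data {
	struct sketch_iommu *iommu;	/* cached at allocation time */
	uint16_t devid;
	int active;
};

/* Allocation path: the caller has already resolved the owning IOMMU. */
static void sketch_ir_alloc(struct sketch_ir_data *data,
			    struct sketch_iommu *iommu, uint16_t devid)
{
	assert(data != NULL);
	data->iommu = iommu;
	data->devid = devid;
	data->active = 0;
}

/* Later callback: use the cached pointer, no devid-indexed global lookup. */
static int sketch_ir_activate(struct sketch_ir_data *data)
{
	if (!data->iommu)
		return -1;
	data->active = 1;
	return 0;
}

/* Report which segment an interrupt belongs to, or -1 if unresolved. */
static int sketch_ir_segment(const struct sketch_ir_data *data)
{
	return data->iommu ? (int)data->iommu->seg_id : -1;
}
```

With the pointer cached, two interrupts that share a devid but sit behind IOMMUs in different segments stay distinguishable.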
static int irq_remapping_alloc(struct irq_domain *domain, unsigned int virq, kfree(irq_data->chip_data); } for (i = 0; i < nr_irqs; i++) - free_irte(devid, index + i); + free_irte(iommu, devid, index + i); out_free_parent: irq_domain_free_irqs_common(domain, virq, nr_irqs); return ret; @@ -3371,7 +3367,7 @@ static void irq_remapping_free(struct irq_domain *domain, unsigned int virq, if (irq_data && irq_data->chip_data) { data = irq_data->chip_data; irte_info = &data->irq_2_irte; - free_irte(irte_info->devid, irte_info->index); + free_irte(data->iommu, irte_info->devid, irte_info->index); kfree(data->entry); kfree(data); } @@ -3389,7 +3385,7 @@ static int irq_remapping_activate(struct irq_domain *domain, { struct amd_ir_data *data = irq_data->chip_data; struct irq_2_irte *irte_info = &data->irq_2_irte; - struct amd_iommu *iommu = amd_iommu_rlookup_table[irte_info->devid]; + struct amd_iommu *iommu = data->iommu; struct irq_cfg *cfg = irqd_cfg(irq_data); if (!iommu) @@ -3406,7 +3402,7 @@ static void irq_remapping_deactivate(struct irq_domain *domain, { struct amd_ir_data *data = irq_data->chip_data; struct irq_2_irte *irte_info = &data->irq_2_irte; - struct amd_iommu *iommu = amd_iommu_rlookup_table[irte_info->devid]; + struct amd_iommu *iommu = data->iommu; if (iommu) iommu->irte_ops->deactivate(data->entry, irte_info->devid, @@ -3502,12 +3498,16 @@ EXPORT_SYMBOL(amd_iommu_deactivate_guest_mode); static int amd_ir_set_vcpu_affinity(struct irq_data *data, void *vcpu_info) { int ret; - struct amd_iommu *iommu; struct amd_iommu_pi_data *pi_data = vcpu_info; struct vcpu_data *vcpu_pi_info = pi_data->vcpu_data; struct amd_ir_data *ir_data = data->chip_data; struct irq_2_irte *irte_info = &ir_data->irq_2_irte; - struct iommu_dev_data *dev_data = search_dev_data(NULL, irte_info->devid); + struct iommu_dev_data *dev_data; + + if (ir_data->iommu == NULL) + return -EINVAL; + + dev_data = search_dev_data(ir_data->iommu, irte_info->devid); /* Note: * This device has never been set up for guest 
mode. @@ -3529,10 +3529,6 @@ static int amd_ir_set_vcpu_affinity(struct irq_data *data, void *vcpu_info) pi_data->is_guest_mode = false;
[PATCH v3 RESEND 16/35] iommu/amd: Update irq_remapping_alloc to use IOMMU lookup helper function
From: Suravee Suthikulpanit To allow IOMMU rlookup using both PCI segment and device ID. Co-developed-by: Vasant Hegde Signed-off-by: Vasant Hegde Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/iommu.c | 15 ++- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 19db4d54c337..5e4648cadff9 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -3246,8 +3246,9 @@ static int irq_remapping_alloc(struct irq_domain *domain, unsigned int virq, struct irq_alloc_info *info = arg; struct irq_data *irq_data; struct amd_ir_data *data = NULL; + struct amd_iommu *iommu; struct irq_cfg *cfg; - int i, ret, devid; + int i, ret, devid, seg, sbdf; int index; if (!info) @@ -3263,8 +3264,14 @@ static int irq_remapping_alloc(struct irq_domain *domain, unsigned int virq, if (info->type == X86_IRQ_ALLOC_TYPE_PCI_MSI) info->flags &= ~X86_IRQ_ALLOC_CONTIGUOUS_VECTORS; - devid = get_devid(info); - if (devid < 0) + sbdf = get_devid(info); + if (sbdf < 0) + return -EINVAL; + + seg = PCI_SBDF_TO_SEGID(sbdf); + devid = PCI_SBDF_TO_DEVID(sbdf); + iommu = __rlookup_amd_iommu(seg, devid); + if (!iommu) return -EINVAL; ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, arg); @@ -3273,7 +3280,6 @@ static int irq_remapping_alloc(struct irq_domain *domain, unsigned int virq, if (info->type == X86_IRQ_ALLOC_TYPE_IOAPIC) { struct irq_remap_table *table; - struct amd_iommu *iommu; table = alloc_irq_table(devid, NULL); if (table) { @@ -3283,7 +3289,6 @@ static int irq_remapping_alloc(struct irq_domain *domain, unsigned int virq, * interrupts. */ table->min_index = 32; - iommu = amd_iommu_rlookup_table[devid]; for (i = 0; i < 32; ++i) iommu->irte_ops->set_allocated(table, i); } -- 2.31.1
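The split of the combined identifier into segment and device ID can be sketched as follows. This assumes the layout implied by the shifts used elsewhere in the series (segment in bits 31:16, bus/device/function in bits 15:0); the helper names are hypothetical and are not the kernel's PCI_SBDF_TO_SEGID()/PCI_SBDF_TO_DEVID() definitions, which this posting does not show.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: segment in bits 31:16, device ID (bus/dev/fn) in bits 15:0. */
static inline uint16_t sketch_sbdf_to_segid(uint32_t sbdf)
{
	return (uint16_t)((sbdf >> 16) & 0xffff);
}

static inline uint16_t sketch_sbdf_to_devid(uint32_t sbdf)
{
	return (uint16_t)(sbdf & 0xffff);
}

/* Inverse helper, useful for checking that the split round-trips. */
static inline uint32_t sketch_make_sbdf(uint16_t seg, uint16_t devid)
{
	uint32_t sbdf = ((uint32_t)seg << 16) | devid;

	assert(sketch_sbdf_to_segid(sbdf) == seg);
	assert(sketch_sbdf_to_devid(sbdf) == devid);
	return sbdf;
}
```

Note that the patch carries the combined value in a signed int so that a negative return can still signal an error before the split is applied.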
[PATCH v3 RESEND 15/35] iommu/amd: Convert to use rlookup_amd_iommu helper function
From: Suravee Suthikulpanit Use rlookup_amd_iommu() helper function which will give per PCI segment rlookup_table. Signed-off-by: Suravee Suthikulpanit Signed-off-by: Vasant Hegde --- drivers/iommu/amd/iommu.c | 64 +++ 1 file changed, 38 insertions(+), 26 deletions(-) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index cfecd072e7a6..19db4d54c337 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -229,13 +229,17 @@ static struct iommu_dev_data *search_dev_data(struct amd_iommu *iommu, u16 devid static int clone_alias(struct pci_dev *pdev, u16 alias, void *data) { + struct amd_iommu *iommu; u16 devid = pci_dev_id(pdev); if (devid == alias) return 0; - amd_iommu_rlookup_table[alias] = - amd_iommu_rlookup_table[devid]; + iommu = rlookup_amd_iommu(&pdev->dev); + if (!iommu) + return 0; + + amd_iommu_set_rlookup_table(iommu, alias); memcpy(amd_iommu_dev_table[alias].data, amd_iommu_dev_table[devid].data, sizeof(amd_iommu_dev_table[alias].data)); @@ -366,7 +370,7 @@ static bool check_device(struct device *dev) if (devid > amd_iommu_last_bdf) return false; - if (amd_iommu_rlookup_table[devid] == NULL) + if (rlookup_amd_iommu(dev) == NULL) return false; return true; @@ -1270,7 +1274,9 @@ static int device_flush_iotlb(struct iommu_dev_data *dev_data, int qdep; qdep = dev_data->ats.qdep; - iommu= amd_iommu_rlookup_table[dev_data->devid]; + iommu= rlookup_amd_iommu(dev_data->dev); + if (!iommu) + return -EINVAL; build_inv_iotlb_pages(&cmd, dev_data->devid, qdep, address, size); @@ -1295,7 +1301,9 @@ static int device_flush_dte(struct iommu_dev_data *dev_data) u16 alias; int ret; - iommu = amd_iommu_rlookup_table[dev_data->devid]; + iommu = rlookup_amd_iommu(dev_data->dev); + if (!iommu) + return -EINVAL; if (dev_is_pci(dev_data->dev)) pdev = to_pci_dev(dev_data->dev); @@ -1525,8 +1533,8 @@ static void free_gcr3_table(struct protection_domain *domain) free_page((unsigned long)domain->gcr3_tbl); } -static void set_dte_entry(u16 devid, struct 
protection_domain *domain, - bool ats, bool ppr) +static void set_dte_entry(struct amd_iommu *iommu, u16 devid, + struct protection_domain *domain, bool ats, bool ppr) { u64 pte_root = 0; u64 flags = 0; @@ -1545,8 +1553,6 @@ static void set_dte_entry(u16 devid, struct protection_domain *domain, flags |= DTE_FLAG_IOTLB; if (ppr) { - struct amd_iommu *iommu = amd_iommu_rlookup_table[devid]; - if (iommu_feature(iommu, FEATURE_EPHSUP)) pte_root |= 1ULL << DEV_ENTRY_PPR; } @@ -1590,8 +1596,6 @@ static void set_dte_entry(u16 devid, struct protection_domain *domain, * entries for the old domain ID that is being overwritten */ if (old_domid) { - struct amd_iommu *iommu = amd_iommu_rlookup_table[devid]; - amd_iommu_flush_tlb_domid(iommu, old_domid); } } @@ -1611,7 +1615,9 @@ static void do_attach(struct iommu_dev_data *dev_data, struct amd_iommu *iommu; bool ats; - iommu = amd_iommu_rlookup_table[dev_data->devid]; + iommu = rlookup_amd_iommu(dev_data->dev); + if (!iommu) + return; ats = dev_data->ats.enabled; /* Update data structures */ @@ -1623,7 +1629,7 @@ static void do_attach(struct iommu_dev_data *dev_data, domain->dev_cnt += 1; /* Update device table */ - set_dte_entry(dev_data->devid, domain, + set_dte_entry(iommu, dev_data->devid, domain, ats, dev_data->iommu_v2); clone_aliases(iommu, dev_data->dev); @@ -1635,7 +1641,9 @@ static void do_detach(struct iommu_dev_data *dev_data) struct protection_domain *domain = dev_data->domain; struct amd_iommu *iommu; - iommu = amd_iommu_rlookup_table[dev_data->devid]; + iommu = rlookup_amd_iommu(dev_data->dev); + if (!iommu) + return; /* Update data structures */ dev_data->domain = NULL; @@ -1813,13 +1821,14 @@ static struct iommu_device *amd_iommu_probe_device(struct device *dev) { struct iommu_device *iommu_dev; struct amd_iommu *iommu; - int ret, devid; + int ret; if (!check_device(dev)) return ERR_PTR(-ENODEV); - devid = get_device_id(dev); - iommu = amd_iommu_rlookup_table[devid]; + iommu = rlookup_amd_iommu(dev); + if 
(!iommu) + return ERR_PTR(-ENODEV); if
[PATCH v3 RESEND 14/35] iommu/amd: Convert to use per PCI segment irq_lookup_table
Then, remove the global irq_lookup_table. Co-developed-by: Suravee Suthikulpanit Signed-off-by: Suravee Suthikulpanit Signed-off-by: Vasant Hegde --- drivers/iommu/amd/amd_iommu_types.h | 2 -- drivers/iommu/amd/init.c| 19 --- drivers/iommu/amd/iommu.c | 36 ++--- 3 files changed, 23 insertions(+), 34 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 8d2d5fbdb57f..ca1a3d55cc83 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -445,8 +445,6 @@ struct irq_remap_table { u32 *table; }; -extern struct irq_remap_table **irq_lookup_table; - /* Interrupt remapping feature used? */ extern bool amd_iommu_irq_remap; diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index afe3bff5bce0..b7b50345c8a5 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -206,12 +206,6 @@ u16 *amd_iommu_alias_table; */ struct amd_iommu **amd_iommu_rlookup_table; -/* - * This table is used to find the irq remapping table for a given device id - * quickly. - */ -struct irq_remap_table **irq_lookup_table; - /* * AMD IOMMU allows up to 2^16 different protection domains. This is a bitmap * to know which ones are already in use. 
@@ -2786,11 +2780,6 @@ static struct syscore_ops amd_iommu_syscore_ops = { static void __init free_iommu_resources(void) { - kmemleak_free(irq_lookup_table); - free_pages((unsigned long)irq_lookup_table, - get_order(rlookup_table_size)); - irq_lookup_table = NULL; - kmem_cache_destroy(amd_iommu_irq_cache); amd_iommu_irq_cache = NULL; @@ -3011,14 +3000,6 @@ static int __init early_amd_iommu_init(void) if (alloc_irq_lookup_table(pci_seg)) goto out; } - - irq_lookup_table = (void *)__get_free_pages( - GFP_KERNEL | __GFP_ZERO, - get_order(rlookup_table_size)); - kmemleak_alloc(irq_lookup_table, rlookup_table_size, - 1, GFP_KERNEL); - if (!irq_lookup_table) - goto out; } ret = init_memory_definitions(ivrs_base); diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 53ccee57a7a0..cfecd072e7a6 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -2732,16 +2732,18 @@ static void set_dte_irq_entry(u16 devid, struct irq_remap_table *table) amd_iommu_dev_table[devid].data[2] = dte; } -static struct irq_remap_table *get_irq_table(u16 devid) +static struct irq_remap_table *get_irq_table(struct amd_iommu *iommu, u16 devid) { struct irq_remap_table *table; + struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg; if (WARN_ONCE(!amd_iommu_rlookup_table[devid], "%s: no iommu for devid %x\n", __func__, devid)) return NULL; - table = irq_lookup_table[devid]; - if (WARN_ONCE(!table, "%s: no table for devid %x\n", __func__, devid)) + table = pci_seg->irq_lookup_table[devid]; + if (WARN_ONCE(!table, "%s: no table for devid %x:%x\n", + __func__, pci_seg->id, devid)) return NULL; return table; @@ -2774,7 +2776,9 @@ static struct irq_remap_table *__alloc_irq_table(void) static void set_remap_table_entry(struct amd_iommu *iommu, u16 devid, struct irq_remap_table *table) { - irq_lookup_table[devid] = table; + struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg; + + pci_seg->irq_lookup_table[devid] = table; set_dte_irq_entry(devid, table); 
iommu_flush_dte(iommu, devid); } @@ -2783,8 +2787,14 @@ static int set_remap_table_entry_alias(struct pci_dev *pdev, u16 alias, void *data) { struct irq_remap_table *table = data; + struct amd_iommu_pci_seg *pci_seg; + struct amd_iommu *iommu = rlookup_amd_iommu(&pdev->dev); - irq_lookup_table[alias] = table; + if (!iommu) + return -EINVAL; + + pci_seg = iommu->pci_seg; + pci_seg->irq_lookup_table[alias] = table; set_dte_irq_entry(alias, table); iommu_flush_dte(amd_iommu_rlookup_table[alias], alias); @@ -2808,12 +2818,12 @@ static struct irq_remap_table *alloc_irq_table(u16 devid, struct pci_dev *pdev) goto out_unlock; pci_seg = iommu->pci_seg; - table = irq_lookup_table[devid]; + table = pci_seg->irq_lookup_table[devid]; if (table) goto out_unlock; alias = pci_seg->alias_table[devid]; - table = irq_lookup_table[alias]; + table = pci_seg->irq_lookup_table[alias]; if (table) { set_remap_table_entry(iommu, devid, table);
[PATCH v3 RESEND 13/35] iommu/amd: Introduce per PCI segment rlookup table size
It will replace global "rlookup_table_size" variable. Co-developed-by: Suravee Suthikulpanit Signed-off-by: Suravee Suthikulpanit Signed-off-by: Vasant Hegde --- drivers/iommu/amd/amd_iommu_types.h | 3 +++ drivers/iommu/amd/init.c| 11 ++- 2 files changed, 9 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 8638b1107dd2..8d2d5fbdb57f 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -561,6 +561,9 @@ struct amd_iommu_pci_seg { /* Size of the alias table */ u32 alias_table_size; + /* Size of the rlookup table */ + u32 rlookup_table_size; + /* * device table virtual address * diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 22a632397818..afe3bff5bce0 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -672,7 +672,7 @@ static inline int __init alloc_rlookup_table(struct amd_iommu_pci_seg *pci_seg) { pci_seg->rlookup_table = (void *)__get_free_pages( GFP_KERNEL | __GFP_ZERO, - get_order(rlookup_table_size)); + get_order(pci_seg->rlookup_table_size)); if (pci_seg->rlookup_table == NULL) return -ENOMEM; @@ -682,7 +682,7 @@ static inline int __init alloc_rlookup_table(struct amd_iommu_pci_seg *pci_seg) static inline void free_rlookup_table(struct amd_iommu_pci_seg *pci_seg) { free_pages((unsigned long)pci_seg->rlookup_table, - get_order(rlookup_table_size)); + get_order(pci_seg->rlookup_table_size)); pci_seg->rlookup_table = NULL; } @@ -690,9 +690,9 @@ static inline int __init alloc_irq_lookup_table(struct amd_iommu_pci_seg *pci_se { pci_seg->irq_lookup_table = (void *)__get_free_pages( GFP_KERNEL | __GFP_ZERO, -get_order(rlookup_table_size)); + get_order(pci_seg->rlookup_table_size)); kmemleak_alloc(pci_seg->irq_lookup_table, - rlookup_table_size, 1, GFP_KERNEL); + pci_seg->rlookup_table_size, 1, GFP_KERNEL); if (pci_seg->irq_lookup_table == NULL) return -ENOMEM; @@ -703,7 +703,7 @@ static inline void 
free_irq_lookup_table(struct amd_iommu_pci_seg *pci_seg) { kmemleak_free(pci_seg->irq_lookup_table); free_pages((unsigned long)pci_seg->irq_lookup_table, - get_order(rlookup_table_size)); + get_order(pci_seg->rlookup_table_size)); pci_seg->irq_lookup_table = NULL; } @@ -1584,6 +1584,7 @@ static struct amd_iommu_pci_seg *__init alloc_pci_segment(u16 id, DUMP_printk("PCI segment : 0x%0x, last bdf : 0x%04x\n", id, last_bdf); pci_seg->dev_table_size = tbl_size(DEV_TABLE_ENTRY_SIZE); pci_seg->alias_table_size = tbl_size(ALIAS_TABLE_ENTRY_SIZE); + pci_seg->rlookup_table_size = tbl_size(RLOOKUP_TABLE_ENTRY_SIZE); pci_seg->id = id; init_llist_head(&pci_seg->dev_data_list); -- 2.31.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
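For readers following the series: each per-segment size field is filled in by tbl_size() and later handed to get_order(). The bodies of those helpers are not quoted in this thread, so the following is only a user-space sketch of how such a size could be derived from a segment's last_bdf. It assumes 4 KiB pages; order_for() and seg_tbl_size() are hypothetical stand-ins, not the kernel functions.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u /* assumption: 4 KiB pages */

/*
 * Hypothetical stand-in for get_order(): smallest order such that
 * (1 << order) pages can hold `bytes` bytes.
 */
static unsigned int order_for(size_t bytes)
{
	size_t page = (size_t)1 << SKETCH_PAGE_SHIFT;
	size_t pages = (bytes + page - 1) >> SKETCH_PAGE_SHIFT;
	unsigned int order = 0;

	while (((size_t)1 << order) < pages)
		order++;
	return order;
}

/*
 * Hypothetical stand-in for tbl_size(): bytes needed for
 * (last_bdf + 1) entries, rounded up to a power-of-two page count.
 */
static size_t seg_tbl_size(uint16_t last_bdf, size_t entry_size)
{
	size_t raw = ((size_t)last_bdf + 1) * entry_size;

	return (size_t)1 << (SKETCH_PAGE_SHIFT + order_for(raw));
}
```

Under these assumptions, and with an 8-byte entry (one pointer on a 64-bit build), a segment whose last_bdf is 0x01ff fits its rlookup table in a single 4 KiB page, while a segment with last_bdf 0xffff needs 512 KiB. That difference is the motivation for keeping the size per segment.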
Re: [PATCH v2 1/4] dt-bindings: qcom-iommu: Add Qualcomm MSM8953 compatible
On Sun, Jun 12, 2022 at 11:22:13AM +0200, Luca Weiss wrote:
> Document the compatible used for IOMMU on the msm8953 SoC.
>
> Signed-off-by: Luca Weiss
> ---
> Changes from v1:
> - new patch
>
> Documentation/devicetree/bindings/iommu/qcom,iommu.txt | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
> b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
> index 059139abce35..e6cecfd360eb 100644
> --- a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
> +++ b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
> @@ -10,6 +10,7 @@ to non-secure vs secure interrupt line.
> - compatible : Should be one of:
>
> "qcom,msm8916-iommu"
> +"qcom,msm8953-iommu"

I'm assuming Andy or Bjorn will pick this up.

Will
[PATCH v3 RESEND 12/35] iommu/amd: Introduce per PCI segment alias table size
It will replace global "alias_table_size" variable. Co-developed-by: Suravee Suthikulpanit Signed-off-by: Suravee Suthikulpanit Signed-off-by: Vasant Hegde --- drivers/iommu/amd/amd_iommu_types.h | 3 +++ drivers/iommu/amd/init.c | 5 +++-- 2 files changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 1dbe9c7f973d..8638b1107dd2 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -558,6 +558,9 @@ struct amd_iommu_pci_seg { /* Size of the device table */ u32 dev_table_size; + /* Size of the alias table */ + u32 alias_table_size; + /* * device table virtual address * diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 4a1807f7a8b9..22a632397818 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -712,7 +712,7 @@ static int __init alloc_alias_table(struct amd_iommu_pci_seg *pci_seg) int i; pci_seg->alias_table = (void *)__get_free_pages(GFP_KERNEL, - get_order(alias_table_size)); + get_order(pci_seg->alias_table_size)); if (!pci_seg->alias_table) return -ENOMEM; @@ -728,7 +728,7 @@ static int __init alloc_alias_table(struct amd_iommu_pci_seg *pci_seg) static void __init free_alias_table(struct amd_iommu_pci_seg *pci_seg) { free_pages((unsigned long)pci_seg->alias_table, - get_order(alias_table_size)); + get_order(pci_seg->alias_table_size)); pci_seg->alias_table = NULL; } @@ -1583,6 +1583,7 @@ static struct amd_iommu_pci_seg *__init alloc_pci_segment(u16 id, pci_seg->last_bdf = last_bdf; DUMP_printk("PCI segment : 0x%0x, last bdf : 0x%04x\n", id, last_bdf); pci_seg->dev_table_size = tbl_size(DEV_TABLE_ENTRY_SIZE); + pci_seg->alias_table_size = tbl_size(ALIAS_TABLE_ENTRY_SIZE); pci_seg->id = id; init_llist_head(&pci_seg->dev_data_list); -- 2.31.1
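As a rough illustration of what a per-segment alias table amounts to, here is a user-space sketch. It assumes the table is eventually sized and initialised from the segment's own last_bdf; this patch only moves the size, and the identity loop shown in 08/35 still runs up to the global amd_iommu_last_bdf. struct seg_sketch and both helpers are hypothetical, and malloc() stands in for __get_free_pages().

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct amd_iommu_pci_seg. */
struct seg_sketch {
	uint16_t last_bdf;     /* largest device id in this segment */
	uint16_t *alias_table; /* devid -> requestor id */
};

/* Allocate the table and let every alias entry point to itself. */
static int seg_alloc_alias_table(struct seg_sketch *seg)
{
	size_t n = (size_t)seg->last_bdf + 1;
	size_t i;

	seg->alias_table = malloc(n * sizeof(*seg->alias_table));
	if (!seg->alias_table)
		return -1;

	/* size_t index: a u16 counter would wrap when last_bdf is 0xffff */
	for (i = 0; i < n; i++)
		seg->alias_table[i] = (uint16_t)i;

	return 0;
}

static void seg_free_alias_table(struct seg_sketch *seg)
{
	free(seg->alias_table);
	seg->alias_table = NULL;
}
```

An IVHD alias entry would then overwrite a single slot, for example alias_table[devid] = devid_to, and leave the table of every other segment untouched.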
[PATCH v3 RESEND 11/35] iommu/amd: Introduce per PCI segment device table size
With multiple pci segment support, number of BDF supported by each segment may differ. Hence introduce per segment device table size which depends on last_bdf. This will replace global "device_table_size" variable. Co-developed-by: Suravee Suthikulpanit Signed-off-by: Suravee Suthikulpanit Signed-off-by: Vasant Hegde --- drivers/iommu/amd/amd_iommu_types.h | 3 +++ drivers/iommu/amd/init.c| 18 ++ 2 files changed, 13 insertions(+), 8 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 8be8f3d6b44a..1dbe9c7f973d 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -555,6 +555,9 @@ struct amd_iommu_pci_seg { /* Largest PCI device id we expect translation requests for */ u16 last_bdf; + /* Size of the device table */ + u32 dev_table_size; + /* * device table virtual address * diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 73554ee9c3b3..4a1807f7a8b9 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -416,6 +416,7 @@ static void iommu_set_cwwb_range(struct amd_iommu *iommu) static void iommu_set_device_table(struct amd_iommu *iommu) { u64 entry; + u32 dev_table_size = iommu->pci_seg->dev_table_size; BUG_ON(iommu->mmio_base == NULL); @@ -652,7 +653,7 @@ static int __init find_last_devid_acpi(struct acpi_table_header *table, u16 pci_ static inline int __init alloc_dev_table(struct amd_iommu_pci_seg *pci_seg) { pci_seg->dev_table = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO | GFP_DMA32, - get_order(dev_table_size)); + get_order(pci_seg->dev_table_size)); if (!pci_seg->dev_table) return -ENOMEM; @@ -662,7 +663,7 @@ static inline int __init alloc_dev_table(struct amd_iommu_pci_seg *pci_seg) static inline void free_dev_table(struct amd_iommu_pci_seg *pci_seg) { free_pages((unsigned long)pci_seg->dev_table, - get_order(dev_table_size)); + get_order(pci_seg->dev_table_size)); pci_seg->dev_table = NULL; } @@ -1035,7 +1036,7 @@ static 
bool __copy_device_table(struct amd_iommu *iommu) entry = (((u64) hi) << 32) + lo; old_devtb_size = ((entry & ~PAGE_MASK) + 1) << 12; - if (old_devtb_size != dev_table_size) { + if (old_devtb_size != pci_seg->dev_table_size) { pr_err("The device table size of IOMMU:%d is not expected!\n", iommu->index); return false; @@ -1054,15 +1055,15 @@ static bool __copy_device_table(struct amd_iommu *iommu) } old_devtb = (cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT) && is_kdump_kernel()) ? (__force void *)ioremap_encrypted(old_devtb_phys, - dev_table_size) - : memremap(old_devtb_phys, dev_table_size, MEMREMAP_WB); + pci_seg->dev_table_size) + : memremap(old_devtb_phys, pci_seg->dev_table_size, MEMREMAP_WB); if (!old_devtb) return false; gfp_flag = GFP_KERNEL | __GFP_ZERO | GFP_DMA32; pci_seg->old_dev_tbl_cpy = (void *)__get_free_pages(gfp_flag, - get_order(dev_table_size)); + get_order(pci_seg->dev_table_size)); if (pci_seg->old_dev_tbl_cpy == NULL) { pr_err("Failed to allocate memory for copying old device table!\n"); memunmap(old_devtb); @@ -1581,6 +1582,7 @@ static struct amd_iommu_pci_seg *__init alloc_pci_segment(u16 id, pci_seg->last_bdf = last_bdf; DUMP_printk("PCI segment : 0x%0x, last bdf : 0x%04x\n", id, last_bdf); + pci_seg->dev_table_size = tbl_size(DEV_TABLE_ENTRY_SIZE); pci_seg->id = id; init_llist_head(_seg->dev_data_list); @@ -2675,7 +2677,7 @@ static void early_enable_iommus(void) for_each_pci_segment(pci_seg) { if (pci_seg->old_dev_tbl_cpy != NULL) { free_pages((unsigned long)pci_seg->old_dev_tbl_cpy, - get_order(dev_table_size)); + get_order(pci_seg->dev_table_size)); pci_seg->old_dev_tbl_cpy = NULL; } } @@ -2689,7 +2691,7 @@ static void early_enable_iommus(void) for_each_pci_segment(pci_seg) { free_pages((unsigned long)pci_seg->dev_table, - get_order(dev_table_size)); +
[PATCH v3 RESEND 10/35] iommu/amd: Introduce per PCI segment last_bdf
Current code uses global "amd_iommu_last_bdf" to track the last bdf supported by the system. This value is used for various memory allocation, device data flushing, etc. Introduce per PCI segment last_bdf which will be used to track last bdf supported by the given PCI segment and use this value for all per segment memory allocations. Eventually it will replace global "amd_iommu_last_bdf". Co-developed-by: Suravee Suthikulpanit Signed-off-by: Suravee Suthikulpanit Signed-off-by: Vasant Hegde --- drivers/iommu/amd/amd_iommu_types.h | 3 ++ drivers/iommu/amd/init.c | 69 ++--- 2 files changed, 45 insertions(+), 27 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 3099a018cef0..8be8f3d6b44a 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -552,6 +552,9 @@ struct amd_iommu_pci_seg { /* PCI segment number */ u16 id; + /* Largest PCI device id we expect translation requests for */ + u16 last_bdf; + /* * device table virtual address * diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 39d04d4143fb..73554ee9c3b3 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -552,6 +552,7 @@ static int __init find_last_devid_from_ivhd(struct ivhd_header *h) { u8 *p = (void *)h, *end = (void *)h; struct ivhd_entry *dev; + int last_devid = -EINVAL; u32 ivhd_size = get_ivhd_header_size(h); @@ -569,13 +570,15 @@ static int __init find_last_devid_from_ivhd(struct ivhd_header *h) case IVHD_DEV_ALL: /* Use maximum BDF value for DEV_ALL */ update_last_devid(0xffff); - break; + return 0xffff; case IVHD_DEV_SELECT: case IVHD_DEV_RANGE_END: case IVHD_DEV_ALIAS: case IVHD_DEV_EXT_SELECT: /* all the above subfield types refer to device ids */ update_last_devid(dev->devid); + if (dev->devid > last_devid) + last_devid = dev->devid; break; default: break; @@ -585,7 +588,7 @@ static int __init find_last_devid_from_ivhd(struct ivhd_header *h) WARN_ON(p != end); - return 0; +
return last_devid; } static int __init check_ivrs_checksum(struct acpi_table_header *table) @@ -609,27 +612,31 @@ static int __init check_ivrs_checksum(struct acpi_table_header *table) * id which we need to handle. This is the first of three functions which parse * the ACPI table. So we check the checksum here. */ -static int __init find_last_devid_acpi(struct acpi_table_header *table) +static int __init find_last_devid_acpi(struct acpi_table_header *table, u16 pci_seg) { u8 *p = (u8 *)table, *end = (u8 *)table; struct ivhd_header *h; + int last_devid, last_bdf = 0; p += IVRS_HEADER_LENGTH; end += table->length; while (p < end) { h = (struct ivhd_header *)p; - if (h->type == amd_iommu_target_ivhd_type) { - int ret = find_last_devid_from_ivhd(h); - - if (ret) - return ret; + if (h->pci_seg == pci_seg && + h->type == amd_iommu_target_ivhd_type) { + last_devid = find_last_devid_from_ivhd(h); + + if (last_devid < 0) + return -EINVAL; + if (last_devid > last_bdf) + last_bdf = last_devid; } p += h->length; } WARN_ON(p != end); - return 0; + return last_bdf; } / @@ -1553,14 +1560,28 @@ static int __init init_iommu_from_acpi(struct amd_iommu *iommu, } /* Allocate PCI segment data structure */ -static struct amd_iommu_pci_seg *__init alloc_pci_segment(u16 id) +static struct amd_iommu_pci_seg *__init alloc_pci_segment(u16 id, + struct acpi_table_header *ivrs_base) { struct amd_iommu_pci_seg *pci_seg; + int last_bdf; + + /* +* First parse ACPI tables to find the largest Bus/Dev/Func we need to +* handle in this PCI segment. Upon this information the shared data +* structures for the PCI segments in the system will be allocated. +*/ + last_bdf = find_last_devid_acpi(ivrs_base, id); + if (last_bdf < 0) + return NULL; pci_seg = kzalloc(sizeof(struct amd_iommu_pci_seg), GFP_KERNEL); if (pci_seg == NULL) return NULL; + pci_seg->last_bdf =
[PATCH v3 RESEND 09/35] iommu/amd: Introduce per PCI segment unity map list
Newer AMD systems can support multiple PCI segments. In order to support multiple PCI segments IVMD table in IVRS structure is enhanced to include pci segment id. Update ivmd_header structure to include "pci_seg". Also introduce per PCI segment unity map list. It will replace global amd_iommu_unity_map list. Note that we have used "reserved" field in IVMD table to include "pci_seg id" which was set to zero. It will take care of backward compatibility (new kernel will work fine on older systems). Co-developed-by: Suravee Suthikulpanit Signed-off-by: Suravee Suthikulpanit Signed-off-by: Vasant Hegde --- drivers/iommu/amd/amd_iommu_types.h | 13 +++-- drivers/iommu/amd/init.c| 30 +++-- drivers/iommu/amd/iommu.c | 8 +++- 3 files changed, 34 insertions(+), 17 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index c9dd0ab37475..3099a018cef0 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -587,6 +587,13 @@ struct amd_iommu_pci_seg { * More than one device can share the same requestor id. */ u16 *alias_table; + + /* +* A list of required unity mappings we find in ACPI. It is not locked +* because as runtime it is only read. It is created at ACPI table +* parsing time. +*/ + struct list_head unity_map; }; /* @@ -813,12 +820,6 @@ struct unity_map_entry { int prot; }; -/* - * List of all unity mappings. It is not locked because as runtime it is only - * read. It is created at ACPI table parsing time. 
- */ -extern struct list_head amd_iommu_unity_map; - /* * Data structures for device handling */ diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 80e7eef4260f..39d04d4143fb 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -141,7 +141,8 @@ struct ivmd_header { u16 length; u16 devid; u16 aux; - u64 resv; + u16 pci_seg; + u8 resv[6]; u64 range_start; u64 range_length; } __attribute__((packed)); @@ -161,8 +162,6 @@ static int amd_iommu_target_ivhd_type; u16 amd_iommu_last_bdf; /* largest PCI device id we have to handle */ -LIST_HEAD(amd_iommu_unity_map); /* a list of required unity mappings - we find in ACPI */ LIST_HEAD(amd_iommu_pci_seg_list); /* list of all PCI segments */ LIST_HEAD(amd_iommu_list); /* list of all AMD IOMMUs in the @@ -1564,6 +1563,7 @@ static struct amd_iommu_pci_seg *__init alloc_pci_segment(u16 id) pci_seg->id = id; init_llist_head(&pci_seg->dev_data_list); + INIT_LIST_HEAD(&pci_seg->unity_map); list_add_tail(&pci_seg->list, &amd_iommu_pci_seg_list); if (alloc_dev_table(pci_seg)) @@ -2398,10 +2398,13 @@ static int iommu_init_irq(struct amd_iommu *iommu) static void __init free_unity_maps(void) { struct unity_map_entry *entry, *next; + struct amd_iommu_pci_seg *p, *pci_seg; - list_for_each_entry_safe(entry, next, &amd_iommu_unity_map, list) { - list_del(&entry->list); - kfree(entry); + for_each_pci_segment_safe(pci_seg, p) { + list_for_each_entry_safe(entry, next, &pci_seg->unity_map, list) { + list_del(&entry->list); + kfree(entry); + } } } @@ -2409,8 +2412,13 @@ static void __init free_unity_maps(void) static int __init init_unity_map_range(struct ivmd_header *m) { struct unity_map_entry *e = NULL; + struct amd_iommu_pci_seg *pci_seg; char *s; + pci_seg = get_pci_segment(m->pci_seg); + if (pci_seg == NULL) + return -ENOMEM; + e = kzalloc(sizeof(*e), GFP_KERNEL); if (e == NULL) return -ENOMEM; @@ -2448,14 +2456,16 @@ static int __init init_unity_map_range(struct ivmd_header *m) if (m->flags & IVMD_FLAG_EXCL_RANGE) e->prot = (IVMD_FLAG_IW | 
IVMD_FLAG_IR) >> 1; - DUMP_printk("%s devid_start: %02x:%02x.%x devid_end: %02x:%02x.%x" - " range_start: %016llx range_end: %016llx flags: %x\n", s, + DUMP_printk("%s devid_start: %04x:%02x:%02x.%x devid_end: " + "%04x:%02x:%02x.%x range_start: %016llx range_end: %016llx" + " flags: %x\n", s, m->pci_seg, PCI_BUS_NUM(e->devid_start), PCI_SLOT(e->devid_start), - PCI_FUNC(e->devid_start), PCI_BUS_NUM(e->devid_end), + PCI_FUNC(e->devid_start), m->pci_seg, + PCI_BUS_NUM(e->devid_end), PCI_SLOT(e->devid_end), PCI_FUNC(e->devid_end), e->address_start, e->address_end, m->flags); - list_add_tail(&e->list, &amd_iommu_unity_map); +
[PATCH v3 RESEND 08/35] iommu/amd: Introduce per PCI segment alias_table
From: Suravee Suthikulpanit This will replace global alias table (amd_iommu_alias_table). Co-developed-by: Vasant Hegde Signed-off-by: Vasant Hegde Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/amd_iommu_types.h | 7 + drivers/iommu/amd/init.c| 41 ++--- drivers/iommu/amd/iommu.c | 41 ++--- 3 files changed, 64 insertions(+), 25 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 3ef68d588cc7..c9dd0ab37475 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -580,6 +580,13 @@ struct amd_iommu_pci_seg { * will be copied to. It's only be used in kdump kernel. */ struct dev_table_entry *old_dev_tbl_cpy; + + /* +* The alias table is a driver specific data structure which contains the +* mappings of the PCI device ids to the actual requestor ids on the IOMMU. +* More than one device can share the same requestor id. +*/ + u16 *alias_table; }; /* diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index f188130cc173..80e7eef4260f 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -700,6 +700,31 @@ static inline void free_irq_lookup_table(struct amd_iommu_pci_seg *pci_seg) pci_seg->irq_lookup_table = NULL; } +static int __init alloc_alias_table(struct amd_iommu_pci_seg *pci_seg) +{ + int i; + + pci_seg->alias_table = (void *)__get_free_pages(GFP_KERNEL, + get_order(alias_table_size)); + if (!pci_seg->alias_table) + return -ENOMEM; + + /* +* let all alias entries point to itself +*/ + for (i = 0; i <= amd_iommu_last_bdf; ++i) + pci_seg->alias_table[i] = i; + + return 0; +} + +static void __init free_alias_table(struct amd_iommu_pci_seg *pci_seg) +{ + free_pages((unsigned long)pci_seg->alias_table, + get_order(alias_table_size)); + pci_seg->alias_table = NULL; +} + /* * Allocates the command buffer. This buffer is per AMD IOMMU. 
We can * write commands to that buffer later and the IOMMU will execute them @@ -1268,6 +1293,7 @@ static int __init init_iommu_from_acpi(struct amd_iommu *iommu, u32 dev_i, ext_flags = 0; bool alias = false; struct ivhd_entry *e; + struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg; u32 ivhd_size; int ret; @@ -1349,7 +1375,7 @@ static int __init init_iommu_from_acpi(struct amd_iommu *iommu, devid_to = e->ext >> 8; set_dev_entry_from_acpi(iommu, devid , e->flags, 0); set_dev_entry_from_acpi(iommu, devid_to, e->flags, 0); - amd_iommu_alias_table[devid] = devid_to; + pci_seg->alias_table[devid] = devid_to; break; case IVHD_DEV_ALIAS_RANGE: @@ -1407,7 +1433,7 @@ static int __init init_iommu_from_acpi(struct amd_iommu *iommu, devid = e->devid; for (dev_i = devid_start; dev_i <= devid; ++dev_i) { if (alias) { - amd_iommu_alias_table[dev_i] = devid_to; + pci_seg->alias_table[dev_i] = devid_to; set_dev_entry_from_acpi(iommu, devid_to, flags, ext_flags); } @@ -1542,6 +1568,8 @@ static struct amd_iommu_pci_seg *__init alloc_pci_segment(u16 id) if (alloc_dev_table(pci_seg)) return NULL; + if (alloc_alias_table(pci_seg)) + return NULL; if (alloc_rlookup_table(pci_seg)) return NULL; @@ -1568,6 +1596,7 @@ static void __init free_pci_segments(void) list_del(_seg->list); free_irq_lookup_table(pci_seg); free_rlookup_table(pci_seg); + free_alias_table(pci_seg); free_dev_table(pci_seg); kfree(pci_seg); } @@ -2839,7 +2868,7 @@ static void __init ivinfo_init(void *ivrs) static int __init early_amd_iommu_init(void) { struct acpi_table_header *ivrs_base; - int i, remap_cache_sz, ret; + int remap_cache_sz, ret; acpi_status status; if (!amd_iommu_detected) @@ -2910,12 +2939,6 @@ static int __init early_amd_iommu_init(void) if (amd_iommu_pd_alloc_bitmap == NULL) goto out; - /* -* let all alias entries point to itself -*/ - for (i = 0; i <= amd_iommu_last_bdf; ++i) - amd_iommu_alias_table[i] = i; - /* * never allocate domain 0 because its used as the non-allocated and
[PATCH v3 RESEND 07/35] iommu/amd: Introduce per PCI segment old_dev_tbl_cpy
From: Suravee Suthikulpanit It will remove global old_dev_tbl_cpy. Also update copy_device_table() copy device table for all PCI segments. Co-developed-by: Vasant Hegde Signed-off-by: Vasant Hegde Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/amd_iommu_types.h | 6 ++ drivers/iommu/amd/init.c| 109 2 files changed, 70 insertions(+), 45 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 5f3cc704f131..3ef68d588cc7 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -574,6 +574,12 @@ struct amd_iommu_pci_seg { * device id quickly. */ struct irq_remap_table **irq_lookup_table; + + /* +* Pointer to a device table which the content of old device table +* will be copied to. It's only be used in kdump kernel. +*/ + struct dev_table_entry *old_dev_tbl_cpy; }; /* diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 509655f86851..f188130cc173 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -193,11 +193,6 @@ bool amd_iommu_force_isolation __read_mostly; * page table root pointer. */ struct dev_table_entry *amd_iommu_dev_table; -/* - * Pointer to a device table which the content of old device table - * will be copied to. It's only be used in kdump kernel. 
- */ -static struct dev_table_entry *old_dev_tbl_cpy; /* * The alias table is a driver specific data structure which contains the @@ -992,39 +987,27 @@ static int get_dev_entry_bit(u16 devid, u8 bit) } -static bool copy_device_table(void) +static bool __copy_device_table(struct amd_iommu *iommu) { - u64 int_ctl, int_tab_len, entry = 0, last_entry = 0; + u64 int_ctl, int_tab_len, entry = 0; + struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg; struct dev_table_entry *old_devtb = NULL; u32 lo, hi, devid, old_devtb_size; phys_addr_t old_devtb_phys; - struct amd_iommu *iommu; u16 dom_id, dte_v, irq_v; gfp_t gfp_flag; u64 tmp; - if (!amd_iommu_pre_enabled) - return false; - - pr_warn("Translation is already enabled - trying to copy translation structures\n"); - for_each_iommu(iommu) { - /* All IOMMUs should use the same device table with the same size */ - lo = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET); - hi = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET + 4); - entry = (((u64) hi) << 32) + lo; - if (last_entry && last_entry != entry) { - pr_err("IOMMU:%d should use the same dev table as others!\n", - iommu->index); - return false; - } - last_entry = entry; + /* Each IOMMU use separate device table with the same size */ + lo = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET); + hi = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET + 4); + entry = (((u64) hi) << 32) + lo; - old_devtb_size = ((entry & ~PAGE_MASK) + 1) << 12; - if (old_devtb_size != dev_table_size) { - pr_err("The device table size of IOMMU:%d is not expected!\n", - iommu->index); - return false; - } + old_devtb_size = ((entry & ~PAGE_MASK) + 1) << 12; + if (old_devtb_size != dev_table_size) { + pr_err("The device table size of IOMMU:%d is not expected!\n", + iommu->index); + return false; } /* @@ -1047,31 +1030,31 @@ static bool copy_device_table(void) return false; gfp_flag = GFP_KERNEL | __GFP_ZERO | GFP_DMA32; - old_dev_tbl_cpy = (void *)__get_free_pages(gfp_flag, - get_order(dev_table_size)); - 
if (old_dev_tbl_cpy == NULL) { + pci_seg->old_dev_tbl_cpy = (void *)__get_free_pages(gfp_flag, + get_order(dev_table_size)); + if (pci_seg->old_dev_tbl_cpy == NULL) { pr_err("Failed to allocate memory for copying old device table!\n"); memunmap(old_devtb); return false; } for (devid = 0; devid <= amd_iommu_last_bdf; ++devid) { - old_dev_tbl_cpy[devid] = old_devtb[devid]; + pci_seg->old_dev_tbl_cpy[devid] = old_devtb[devid]; dom_id = old_devtb[devid].data[1] & DEV_DOMID_MASK; dte_v = old_devtb[devid].data[0] & DTE_FLAG_V; if (dte_v && dom_id) { - old_dev_tbl_cpy[devid].data[0] = old_devtb[devid].data[0]; - old_dev_tbl_cpy[devid].data[1] = old_devtb[devid].data[1]; +
[PATCH v3 RESEND 06/35] iommu/amd: Introduce per PCI segment dev_data_list
This will replace global dev_data_list. Co-developed-by: Suravee Suthikulpanit Signed-off-by: Suravee Suthikulpanit Signed-off-by: Vasant Hegde --- drivers/iommu/amd/amd_iommu_types.h | 3 +++ drivers/iommu/amd/init.c | 1 + drivers/iommu/amd/iommu.c | 21 ++--- 3 files changed, 14 insertions(+), 11 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index cfb5f0e44186..5f3cc704f131 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -546,6 +546,9 @@ struct amd_iommu_pci_seg { /* List with all PCI segments in the system */ struct list_head list; + /* List of all available dev_data structures */ + struct llist_head dev_data_list; + /* PCI segment number */ u16 id; diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index f6678dd56e28..509655f86851 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -1527,6 +1527,7 @@ static struct amd_iommu_pci_seg *__init alloc_pci_segment(u16 id) return NULL; pci_seg->id = id; + init_llist_head(&pci_seg->dev_data_list); list_add_tail(&pci_seg->list, &amd_iommu_pci_seg_list); if (alloc_dev_table(pci_seg)) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index b0262b2e749d..48275da7fcb0 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -62,9 +62,6 @@ static DEFINE_SPINLOCK(pd_bitmap_lock); -/* List of all available dev_data structures */ -static LLIST_HEAD(dev_data_list); - LIST_HEAD(ioapic_map); LIST_HEAD(hpet_map); LIST_HEAD(acpihid_map); @@ -195,9 +192,10 @@ static struct protection_domain *to_pdomain(struct iommu_domain *dom) return container_of(dom, struct protection_domain, domain); } -static struct iommu_dev_data *alloc_dev_data(u16 devid) +static struct iommu_dev_data *alloc_dev_data(struct amd_iommu *iommu, u16 devid) { struct iommu_dev_data *dev_data; + struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg; dev_data = kzalloc(sizeof(*dev_data), GFP_KERNEL); if (!dev_data) @@ -207,19 +205,20 @@ static struct iommu_dev_data *alloc_dev_data(u16 devid) dev_data->devid = devid; ratelimit_default_init(&dev_data->rs); - llist_add(&dev_data->dev_data_list, &dev_data_list); + llist_add(&dev_data->dev_data_list, &pci_seg->dev_data_list); return dev_data; } -static struct iommu_dev_data *search_dev_data(u16 devid) +static struct iommu_dev_data *search_dev_data(struct amd_iommu *iommu, u16 devid) { struct iommu_dev_data *dev_data; struct llist_node *node; + struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg; - if (llist_empty(&dev_data_list)) + if (llist_empty(&pci_seg->dev_data_list)) return NULL; - node = dev_data_list.first; + node = pci_seg->dev_data_list.first; llist_for_each_entry(dev_data, node, dev_data_list) { if (dev_data->devid == devid) return dev_data; @@ -288,10 +287,10 @@ static struct iommu_dev_data *find_dev_data(u16 devid) struct iommu_dev_data *dev_data; struct amd_iommu *iommu = amd_iommu_rlookup_table[devid]; - dev_data = search_dev_data(devid); + dev_data = search_dev_data(iommu, devid); if (dev_data == NULL) { - dev_data = alloc_dev_data(devid); + dev_data = alloc_dev_data(iommu, devid); if (!dev_data) return NULL; @@ -3466,7 +3465,7 @@ static int amd_ir_set_vcpu_affinity(struct irq_data *data, void *vcpu_info) struct vcpu_data *vcpu_pi_info = pi_data->vcpu_data; struct amd_ir_data *ir_data = data->chip_data; struct irq_2_irte *irte_info = &ir_data->irq_2_irte; - struct iommu_dev_data *dev_data = search_dev_data(irte_info->devid); + struct iommu_dev_data *dev_data = search_dev_data(NULL, irte_info->devid); /* Note: * This device has never been set up for guest mode. -- 2.31.1
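To make the effect of the per-segment list concrete, here is a small user-space sketch of the lookup. It assumes a plain singly linked list in place of the kernel's lock-less llist, and every name in it (dev_data_sketch, seg_list_sketch, seg_add_dev_data, seg_search_dev_data) is hypothetical. The point it illustrates is that the same 16-bit devid can exist in two segments and resolve to different entries.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct iommu_dev_data. */
struct dev_data_sketch {
	uint16_t devid;
	struct dev_data_sketch *next;
};

/* Hypothetical, simplified stand-in for struct amd_iommu_pci_seg. */
struct seg_list_sketch {
	uint16_t id;
	struct dev_data_sketch *dev_data_list;
};

/* Push an entry on the head of the segment's list (not lock-free here). */
static void seg_add_dev_data(struct seg_list_sketch *seg,
			     struct dev_data_sketch *dd)
{
	dd->next = seg->dev_data_list;
	seg->dev_data_list = dd;
}

/* Search only the given segment; returns NULL when no segment is given. */
static struct dev_data_sketch *
seg_search_dev_data(const struct seg_list_sketch *seg, uint16_t devid)
{
	struct dev_data_sketch *dd;

	if (!seg)
		return NULL;

	for (dd = seg->dev_data_list; dd; dd = dd->next)
		if (dd->devid == devid)
			return dd;

	return NULL;
}
```

In this sketch a caller that has no segment at hand simply gets NULL back rather than a match from some other segment's list.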
[PATCH v3 RESEND 05/35] iommu/amd: Introduce per PCI segment irq_lookup_table
This will replace global irq lookup table (irq_lookup_table). Co-developed-by: Suravee Suthikulpanit Signed-off-by: Suravee Suthikulpanit Signed-off-by: Vasant Hegde --- drivers/iommu/amd/amd_iommu_types.h | 6 ++ drivers/iommu/amd/init.c| 27 +++ 2 files changed, 33 insertions(+) diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index d0ee78c656ff..cfb5f0e44186 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -565,6 +565,12 @@ struct amd_iommu_pci_seg { * device id. */ struct amd_iommu **rlookup_table; + + /* +* This table is used to find the irq remapping table for a given +* device id quickly. +*/ + struct irq_remap_table **irq_lookup_table; }; /* diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 2fb3e1b82e09..f6678dd56e28 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -684,6 +684,26 @@ static inline void free_rlookup_table(struct amd_iommu_pci_seg *pci_seg) pci_seg->rlookup_table = NULL; } +static inline int __init alloc_irq_lookup_table(struct amd_iommu_pci_seg *pci_seg) +{ + pci_seg->irq_lookup_table = (void *)__get_free_pages( +GFP_KERNEL | __GFP_ZERO, +get_order(rlookup_table_size)); + kmemleak_alloc(pci_seg->irq_lookup_table, + rlookup_table_size, 1, GFP_KERNEL); + if (pci_seg->irq_lookup_table == NULL) + return -ENOMEM; + + return 0; +} + +static inline void free_irq_lookup_table(struct amd_iommu_pci_seg *pci_seg) +{ + kmemleak_free(pci_seg->irq_lookup_table); + free_pages((unsigned long)pci_seg->irq_lookup_table, + get_order(rlookup_table_size)); + pci_seg->irq_lookup_table = NULL; +} /* * Allocates the command buffer. This buffer is per AMD IOMMU. 
We can @@ -1535,6 +1555,7 @@ static void __init free_pci_segments(void) for_each_pci_segment_safe(pci_seg, next) { list_del(&pci_seg->list); + free_irq_lookup_table(pci_seg); free_rlookup_table(pci_seg); free_dev_table(pci_seg); kfree(pci_seg); @@ -2897,6 +2918,7 @@ static int __init early_amd_iommu_init(void) amd_iommu_irq_remap = check_ioapic_information(); if (amd_iommu_irq_remap) { + struct amd_iommu_pci_seg *pci_seg; /* * Interrupt remapping enabled, create kmem_cache for the * remapping tables. @@ -2913,6 +2935,11 @@ static int __init early_amd_iommu_init(void) if (!amd_iommu_irq_cache) goto out; + for_each_pci_segment(pci_seg) { + if (alloc_irq_lookup_table(pci_seg)) + goto out; + } + irq_lookup_table = (void *)__get_free_pages( GFP_KERNEL | __GFP_ZERO, get_order(rlookup_table_size)); -- 2.31.1
[PATCH v3 RESEND 04/35] iommu/amd: Introduce per PCI segment rlookup table
From: Suravee Suthikulpanit This will replace global rlookup table (amd_iommu_rlookup_table). Add helper functions to set/get rlookup table for the given device. Also add macros to get seg/devid from sbdf. Co-developed-by: Vasant Hegde Signed-off-by: Vasant Hegde Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/amd_iommu.h | 1 + drivers/iommu/amd/amd_iommu_types.h | 11 drivers/iommu/amd/init.c | 23 +++ drivers/iommu/amd/iommu.c | 44 + 4 files changed, 79 insertions(+) diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h index 885570cd0d77..2947239700ce 100644 --- a/drivers/iommu/amd/amd_iommu.h +++ b/drivers/iommu/amd/amd_iommu.h @@ -19,6 +19,7 @@ extern int amd_iommu_init_devices(void); extern void amd_iommu_uninit_devices(void); extern void amd_iommu_init_notifier(void); extern int amd_iommu_init_api(void); +extern void amd_iommu_set_rlookup_table(struct amd_iommu *iommu, u16 devid); #ifdef CONFIG_AMD_IOMMU_DEBUGFS void amd_iommu_debugfs_setup(struct amd_iommu *iommu); diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 422ea87ae4c7..d0ee78c656ff 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -456,6 +456,9 @@ extern bool amdr_ivrs_remap_support; /* kmem_cache to get tables with 128 byte alignement */ extern struct kmem_cache *amd_iommu_irq_cache; +#define PCI_SBDF_TO_SEGID(sbdf) (((sbdf) >> 16) & 0xffff) +#define PCI_SBDF_TO_DEVID(sbdf) ((sbdf) & 0xffff) + /* Make iterating over all pci segment easier */ #define for_each_pci_segment(pci_seg) \ list_for_each_entry((pci_seg), &amd_iommu_pci_seg_list, list) @@ -490,6 +493,7 @@ struct amd_iommu_fault { }; +struct amd_iommu; struct iommu_domain; struct irq_domain; struct amd_irte_ops; @@ -554,6 +558,13 @@ struct amd_iommu_pci_seg { * page table root pointer. */ struct dev_table_entry *dev_table; + + /* +* The rlookup iommu table is used to find the IOMMU which is +* responsible for a specific device. 
It is indexed by the PCI +* device id. +*/ + struct amd_iommu **rlookup_table; }; /* diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 5152243593bf..2fb3e1b82e09 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -665,6 +665,26 @@ static inline void free_dev_table(struct amd_iommu_pci_seg *pci_seg) pci_seg->dev_table = NULL; } +/* Allocate per PCI segment IOMMU rlookup table. */ +static inline int __init alloc_rlookup_table(struct amd_iommu_pci_seg *pci_seg) +{ + pci_seg->rlookup_table = (void *)__get_free_pages( + GFP_KERNEL | __GFP_ZERO, + get_order(rlookup_table_size)); + if (pci_seg->rlookup_table == NULL) + return -ENOMEM; + + return 0; +} + +static inline void free_rlookup_table(struct amd_iommu_pci_seg *pci_seg) +{ + free_pages((unsigned long)pci_seg->rlookup_table, + get_order(rlookup_table_size)); + pci_seg->rlookup_table = NULL; +} + + /* * Allocates the command buffer. This buffer is per AMD IOMMU. We can * write commands to that buffer later and the IOMMU will execute them @@ -1491,6 +1511,8 @@ static struct amd_iommu_pci_seg *__init alloc_pci_segment(u16 id) if (alloc_dev_table(pci_seg)) return NULL; + if (alloc_rlookup_table(pci_seg)) + return NULL; return pci_seg; } @@ -1513,6 +1535,7 @@ static void __init free_pci_segments(void) for_each_pci_segment_safe(pci_seg, next) { list_del(&pci_seg->list); + free_rlookup_table(pci_seg); free_dev_table(pci_seg); kfree(pci_seg); } diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index ac8f81f527b4..b0262b2e749d 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -146,6 +146,50 @@ struct dev_table_entry *get_dev_table(struct amd_iommu *iommu) return dev_table; } +static inline u16 get_device_segment(struct device *dev) +{ + u16 seg; + + if (dev_is_pci(dev)) { + struct pci_dev *pdev = to_pci_dev(dev); + + seg = pci_domain_nr(pdev->bus); + } else { + u32 devid = get_acpihid_device_id(dev, NULL); + + seg = PCI_SBDF_TO_SEGID(devid); + } +
+ return seg; +} + +/* Writes the specific IOMMU for a device into the PCI segment rlookup table */ +void amd_iommu_set_rlookup_table(struct amd_iommu *iommu, u16 devid) +{ + struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg; + + pci_seg->rlookup_table[devid] = iommu; +} + +static struct amd_iommu *__rlookup_amd_iommu(u16 seg, u16 devid) +{ + struct
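The segment/device split used by the macros in this patch can be tried out in userspace. The following is a minimal sketch, assuming both fields are 16 bits wide (as the cover letter describes) and therefore masked with 0xffff; make_sbdf() is a hypothetical helper added only for illustration and is not part of the patch.

```c
#include <stdint.h>

/* Upper 16 bits carry the PCI segment, lower 16 bits the device ID. */
#define PCI_SBDF_TO_SEGID(sbdf)	(((sbdf) >> 16) & 0xffff)
#define PCI_SBDF_TO_DEVID(sbdf)	((sbdf) & 0xffff)

/* Hypothetical helper (not in the patch): build an sbdf from its halves. */
static uint32_t make_sbdf(uint16_t seg, uint16_t devid)
{
	return ((uint32_t)seg << 16) | devid;
}
```

With this layout, a 32-bit value such as 0x000100a0 would decode to segment 1 and device ID 0x00a0, which is what allows one rlookup table per segment to be indexed by the 16-bit device ID alone.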
[PATCH v3 RESEND 03/35] iommu/amd: Introduce per PCI segment device table
From: Suravee Suthikulpanit Introduce per PCI segment device table. All IOMMUs within the segment will share this device table. This will replace global device table i.e. amd_iommu_dev_table. Also introduce helper function to get the device table for the given IOMMU. Co-developed-by: Vasant Hegde Signed-off-by: Vasant Hegde Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/amd_iommu.h | 1 + drivers/iommu/amd/amd_iommu_types.h | 10 ++ drivers/iommu/amd/init.c| 26 -- drivers/iommu/amd/iommu.c | 12 4 files changed, 47 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h index 1ab31074f5b3..885570cd0d77 100644 --- a/drivers/iommu/amd/amd_iommu.h +++ b/drivers/iommu/amd/amd_iommu.h @@ -128,4 +128,5 @@ static inline void amd_iommu_apply_ivrs_quirks(void) { } extern void amd_iommu_domain_set_pgtable(struct protection_domain *domain, u64 *root, int mode); +extern struct dev_table_entry *get_dev_table(struct amd_iommu *iommu); #endif diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 2243b1a22d78..422ea87ae4c7 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -544,6 +544,16 @@ struct amd_iommu_pci_seg { /* PCI segment number */ u16 id; + + /* +* device table virtual address +* +* Pointer to the per PCI segment device table. +* It is indexed by the PCI device id or the HT unit id and contains +* information about the domain the device belongs to as well as the +* page table root pointer. +*/ + struct dev_table_entry *dev_table; }; /* diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index c1b5d530dbf3..5152243593bf 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -642,11 +642,29 @@ static int __init find_last_devid_acpi(struct acpi_table_header *table) * * The following functions belong to the code path which parses the ACPI table * the second time. 
In this ACPI parsing iteration we allocate IOMMU specific - * data structures, initialize the device/alias/rlookup table and also - * basically initialize the hardware. + * data structures, initialize the per PCI segment device/alias/rlookup table + * and also basically initialize the hardware. * / +/* Allocate per PCI segment device table */ +static inline int __init alloc_dev_table(struct amd_iommu_pci_seg *pci_seg) +{ + pci_seg->dev_table = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO | GFP_DMA32, + get_order(dev_table_size)); + if (!pci_seg->dev_table) + return -ENOMEM; + + return 0; +} + +static inline void free_dev_table(struct amd_iommu_pci_seg *pci_seg) +{ + free_pages((unsigned long)pci_seg->dev_table, + get_order(dev_table_size)); + pci_seg->dev_table = NULL; +} + /* * Allocates the command buffer. This buffer is per AMD IOMMU. We can * write commands to that buffer later and the IOMMU will execute them @@ -1471,6 +1489,9 @@ static struct amd_iommu_pci_seg *__init alloc_pci_segment(u16 id) pci_seg->id = id; list_add_tail(&pci_seg->list, &amd_iommu_pci_seg_list); + if (alloc_dev_table(pci_seg)) + return NULL; + return pci_seg; } @@ -1492,6 +1513,7 @@ static void __init free_pci_segments(void) for_each_pci_segment_safe(pci_seg, next) { list_del(&pci_seg->list); + free_dev_table(pci_seg); kfree(pci_seg); } } diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index efa8af5a9419..ac8f81f527b4 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -134,6 +134,18 @@ static inline int get_device_id(struct device *dev) return devid; } +struct dev_table_entry *get_dev_table(struct amd_iommu *iommu) +{ + struct dev_table_entry *dev_table; + struct amd_iommu_pci_seg *pci_seg = iommu->pci_seg; + + BUG_ON(pci_seg == NULL); + dev_table = pci_seg->dev_table; + BUG_ON(dev_table == NULL); + + return dev_table; +} + static struct protection_domain *to_pdomain(struct iommu_domain *dom) { return container_of(dom, struct protection_domain, domain);
-- 2.31.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
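The patch above sizes the per-segment table with get_order(), which converts a byte count into a power-of-two number of pages. A rough userspace model of that conversion is sketched below. It assumes 4 KiB pages; SKETCH_PAGE_SIZE, order_for_size() and table_alloc_bytes() are hypothetical names for illustration, not kernel interfaces.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Smallest order such that (SKETCH_PAGE_SIZE << order) >= size. */
static unsigned int order_for_size(size_t size)
{
	unsigned int order = 0;

	while ((SKETCH_PAGE_SIZE << order) < size)
		order++;
	return order;
}

/* Bytes actually reserved when a table of the given size is requested. */
static size_t table_alloc_bytes(size_t size)
{
	return SKETCH_PAGE_SIZE << order_for_size(size);
}
```

Because the same order has to be passed when freeing, the alloc and free helpers in the patch both derive it from the same table-size variable.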
[PATCH v3 RESEND 02/35] iommu/amd: Introduce pci segment structure
Newer AMD systems can support multiple PCI segments, where each segment contains one or more IOMMU instances. However, an IOMMU instance can only support a single PCI segment. Current code assumes that the system contains only one PCI segment (segment 0) and creates global data structures such as the device table, rlookup table, etc. Introduce a per PCI segment data structure, which contains the segment-specific data structures. This will eventually replace the global data structures. Also update the `amd_iommu->pci_seg` variable to point to the PCI segment structure instead of the PCI segment ID. Co-developed-by: Suravee Suthikulpanit Signed-off-by: Suravee Suthikulpanit Signed-off-by: Vasant Hegde --- drivers/iommu/amd/amd_iommu_types.h | 24 ++- drivers/iommu/amd/init.c | 46 - 2 files changed, 68 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 9b563f850f1d..2243b1a22d78 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -456,6 +456,11 @@ extern bool amdr_ivrs_remap_support; /* kmem_cache to get tables with 128 byte alignement */ extern struct kmem_cache *amd_iommu_irq_cache; +/* Make iterating over all pci segment easier */ +#define for_each_pci_segment(pci_seg) \ + list_for_each_entry((pci_seg), &amd_iommu_pci_seg_list, list) +#define for_each_pci_segment_safe(pci_seg, next) \ + list_for_each_entry_safe((pci_seg), (next), &amd_iommu_pci_seg_list, list) /* * Make iterating over all IOMMUs easier */ @@ -530,6 +535,17 @@ struct protection_domain { unsigned dev_iommu[MAX_IOMMUS]; /* per-IOMMU reference count */ }; +/* + * This structure contains information about one PCI segment in the system. + */ +struct amd_iommu_pci_seg { + /* List with all PCI segments in the system */ + struct list_head list; + + /* PCI segment number */ + u16 id; +}; + /* * Structure where we save information about one hardware AMD IOMMU in the * system.
@@ -581,7 +597,7 @@ struct amd_iommu { u16 cap_ptr; /* pci domain of this IOMMU */ - u16 pci_seg; + struct amd_iommu_pci_seg *pci_seg; /* start of exclusion range of that IOMMU */ u64 exclusion_start; @@ -709,6 +725,12 @@ extern struct list_head ioapic_map; extern struct list_head hpet_map; extern struct list_head acpihid_map; +/* + * List with all PCI segments in the system. This list is not locked because + * it is only written at driver initialization time + */ +extern struct list_head amd_iommu_pci_seg_list; + /* * List with all IOMMUs in the system. This list is not locked because it is * only written and read at driver initialization or suspend time diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 1d08f87e734b..c1b5d530dbf3 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -164,6 +164,7 @@ u16 amd_iommu_last_bdf; /* largest PCI device id we have LIST_HEAD(amd_iommu_unity_map);/* a list of required unity mappings we find in ACPI */ +LIST_HEAD(amd_iommu_pci_seg_list); /* list of all PCI segments */ LIST_HEAD(amd_iommu_list); /* list of all AMD IOMMUs in the system */ @@ -1458,6 +1459,43 @@ static int __init init_iommu_from_acpi(struct amd_iommu *iommu, return 0; } +/* Allocate PCI segment data structure */ +static struct amd_iommu_pci_seg *__init alloc_pci_segment(u16 id) +{ + struct amd_iommu_pci_seg *pci_seg; + + pci_seg = kzalloc(sizeof(struct amd_iommu_pci_seg), GFP_KERNEL); + if (pci_seg == NULL) + return NULL; + + pci_seg->id = id; + list_add_tail(&pci_seg->list, &amd_iommu_pci_seg_list); + + return pci_seg; +} + +static struct amd_iommu_pci_seg *__init get_pci_segment(u16 id) +{ + struct amd_iommu_pci_seg *pci_seg; + + for_each_pci_segment(pci_seg) { + if (pci_seg->id == id) + return pci_seg; + } + + return alloc_pci_segment(id); +} + +static void __init free_pci_segments(void) +{ + struct amd_iommu_pci_seg *pci_seg, *next; + + for_each_pci_segment_safe(pci_seg, next) { + list_del(&pci_seg->list); + kfree(pci_seg); + } +}
+ static void __init free_iommu_one(struct amd_iommu *iommu) { free_cwwb_sem(iommu); @@ -1544,8 +1582,14 @@ static void amd_iommu_ats_write_check_workaround(struct amd_iommu *iommu) */ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h) { + struct amd_iommu_pci_seg *pci_seg; int ret; + pci_seg = get_pci_segment(h->pci_seg); + if (pci_seg == NULL) + return -ENOMEM; + iommu->pci_seg = pci_seg; + raw_spin_lock_init(&iommu->lock); iommu->cmd_sem_val = 0;
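The lookup-or-allocate pattern in get_pci_segment() can be sketched outside the kernel as follows. This is only an illustration under stated assumptions: struct pci_seg is a simplified, hypothetical stand-in for struct amd_iommu_pci_seg, a plain singly linked list replaces the kernel's list_head, and new entries are prepended rather than appended with list_add_tail().

```c
#include <stdint.h>
#include <stdlib.h>

/* Simplified, hypothetical stand-in for struct amd_iommu_pci_seg. */
struct pci_seg {
	struct pci_seg *next;
	uint16_t id;
};

static struct pci_seg *seg_list;	/* head of the list of known segments */

/* Return the segment with the given ID, allocating it on first use. */
static struct pci_seg *get_pci_segment(uint16_t id)
{
	struct pci_seg *seg;

	for (seg = seg_list; seg; seg = seg->next) {
		if (seg->id == id)
			return seg;
	}

	seg = calloc(1, sizeof(*seg));
	if (!seg)
		return NULL;

	seg->id = id;
	seg->next = seg_list;
	seg_list = seg;
	return seg;
}

/* Free every segment; the next pointer is saved before each free. */
static void free_pci_segments(void)
{
	struct pci_seg *seg = seg_list, *next;

	while (seg) {
		next = seg->next;
		free(seg);
		seg = next;
	}
	seg_list = NULL;
}
```

Every IOMMU that reports the same segment number ends up pointing at the same structure, which is why the per-segment tables added later in the series can be shared by all IOMMUs in a segment.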
[PATCH v3 RESEND 01/35] iommu/amd: Update struct iommu_dev_data definition
struct iommu_dev_data contains member "pdev" to point to pci_dev. This is valid for only PCI devices and for other devices this will be NULL. This causes unnecessary "pdev != NULL" check at various places. Replace "struct pci_dev" member with "struct device" and use to_pci_dev() to get pci device reference as needed. Also adjust setup_aliases() and clone_aliases() function. No functional change intended. Co-developed-by: Suravee Suthikulpanit Signed-off-by: Suravee Suthikulpanit Signed-off-by: Vasant Hegde --- drivers/iommu/amd/amd_iommu_types.h | 2 +- drivers/iommu/amd/iommu.c | 32 + 2 files changed, 20 insertions(+), 14 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 72d0f5e2f651..9b563f850f1d 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -689,7 +689,7 @@ struct iommu_dev_data { struct list_head list;/* For domain->dev_list */ struct llist_node dev_data_list; /* For global dev_data_list */ struct protection_domain *domain; /* Domain the device is bound to */ - struct pci_dev *pdev; + struct device *dev; u16 devid;/* PCI Device ID */ bool iommu_v2;/* Device can make use of IOMMUv2 */ struct { diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 840831d5d2ad..efa8af5a9419 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -188,10 +188,13 @@ static int clone_alias(struct pci_dev *pdev, u16 alias, void *data) return 0; } -static void clone_aliases(struct pci_dev *pdev) +static void clone_aliases(struct device *dev) { - if (!pdev) + struct pci_dev *pdev; + + if (!dev_is_pci(dev)) return; + pdev = to_pci_dev(dev); /* * The IVRS alias stored in the alias table may not be @@ -203,14 +206,14 @@ static void clone_aliases(struct pci_dev *pdev) pci_for_each_dma_alias(pdev, clone_alias, NULL); } -static struct pci_dev *setup_aliases(struct device *dev) +static void setup_aliases(struct device *dev) { struct pci_dev *pdev = 
to_pci_dev(dev); u16 ivrs_alias; /* For ACPI HID devices, there are no aliases */ if (!dev_is_pci(dev)) - return NULL; + return; /* * Add the IVRS alias to the pci aliases if it is on the same @@ -221,9 +224,7 @@ static struct pci_dev *setup_aliases(struct device *dev) PCI_BUS_NUM(ivrs_alias) == pdev->bus->number) pci_add_dma_alias(pdev, ivrs_alias & 0xff, 1); - clone_aliases(pdev); - - return pdev; + clone_aliases(dev); } static struct iommu_dev_data *find_dev_data(u16 devid) @@ -331,7 +332,8 @@ static int iommu_init_device(struct device *dev) if (!dev_data) return -ENOMEM; - dev_data->pdev = setup_aliases(dev); + dev_data->dev = dev; + setup_aliases(dev); /* * By default we use passthrough mode for IOMMUv2 capable device. @@ -1232,13 +1234,17 @@ static int device_flush_dte_alias(struct pci_dev *pdev, u16 alias, void *data) static int device_flush_dte(struct iommu_dev_data *dev_data) { struct amd_iommu *iommu; + struct pci_dev *pdev = NULL; u16 alias; int ret; iommu = amd_iommu_rlookup_table[dev_data->devid]; - if (dev_data->pdev) - ret = pci_for_each_dma_alias(dev_data->pdev, + if (dev_is_pci(dev_data->dev)) + pdev = to_pci_dev(dev_data->dev); + + if (pdev) + ret = pci_for_each_dma_alias(pdev, device_flush_dte_alias, iommu); else ret = iommu_flush_dte(iommu, dev_data->devid); @@ -1561,7 +1567,7 @@ static void do_attach(struct iommu_dev_data *dev_data, /* Update device table */ set_dte_entry(dev_data->devid, domain, ats, dev_data->iommu_v2); - clone_aliases(dev_data->pdev); + clone_aliases(dev_data->dev); device_flush_dte(dev_data); } @@ -1577,7 +1583,7 @@ static void do_detach(struct iommu_dev_data *dev_data) dev_data->domain = NULL; list_del(&dev_data->list); clear_dte_entry(dev_data->devid); - clone_aliases(dev_data->pdev); + clone_aliases(dev_data->dev); /* Flush the DTE entry */ device_flush_dte(dev_data); @@ -1818,7 +1824,7 @@ static void update_device_table(struct protection_domain *domain) list_for_each_entry(dev_data, &domain->dev_list, list) {
set_dte_entry(dev_data->devid, domain, dev_data->ats.enabled, dev_data->iommu_v2); - clone_aliases(dev_data->pdev); + clone_aliases(dev_data->dev); } } -- 2.31.1
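The idea behind this patch, storing the generic device and converting to the PCI type only after a type check, can be modelled in a few lines of userspace C. The sketch below is an assumption-laden illustration: the structures are stripped-down, hypothetical versions of the kernel ones, dev_is_pci() and to_pci_dev() are reimplemented locally, and pci_devfn_or_negative() is an invented example caller.

```c
#include <stddef.h>

enum bus_kind { BUS_PCI, BUS_PLATFORM };

/* Hypothetical, minimal generic device. */
struct device {
	enum bus_kind bus;
};

/* Hypothetical PCI device embedding the generic device. */
struct pci_dev {
	unsigned int devfn;
	struct device dev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static int dev_is_pci(const struct device *dev)
{
	return dev->bus == BUS_PCI;
}

/* Only meaningful when dev_is_pci(dev) is true. */
static struct pci_dev *to_pci_dev(struct device *dev)
{
	return container_of(dev, struct pci_dev, dev);
}

/* Invented caller: devfn for PCI devices, -1 for anything else. */
static int pci_devfn_or_negative(struct device *dev)
{
	if (!dev_is_pci(dev))
		return -1;
	return (int)to_pci_dev(dev)->devfn;
}
```

Callers keep a single non-NULL struct device pointer and ask whether it is a PCI device at the point of use, instead of carrying a pci_dev pointer that is NULL for ACPI HID devices.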
[PATCH v3 RESEND 00/35] iommu/amd: Add multiple PCI segments support
Hi Joerg, As discussed in the other thread, I have updated the "From:" tag and am resending the patchset. No changes in the actual patch content. This patchset is based on top of the "iommu/x86/amd" branch. Base commit : 0d10fe75911787 ("iommu/amd: Use try_cmpxchg64 in ") Newer AMD systems can support multiple PCI segments, where each segment contains one or more IOMMU instances. However, an IOMMU instance can only support a single PCI segment. Current code assumes a system contains only one PCI segment (segment 0) and creates global data structures such as device table, rlookup table, etc. This series introduces a per-PCI-segment data structure, which contains the device table, alias table, etc. For each PCI segment, all IOMMUs share the same data structure. The series also makes the necessary code adjustments and logging enhancements. Finally it removes global data structures like the device table, alias table, etc. In case of a system with a single PCI segment (e.g. PCI segment ID is zero), the IOMMU driver allocates one PCI segment data structure, which will be shared by all IOMMUs. Patch 1 updates the struct iommu_dev_data definition. Patches 2 - 13 introduce the new PCI segment structure, allocate the per-segment data structures, and introduce the amd_iommu.pci_seg pointer to point to the corresponding pci_segment structure. Also, we have introduced a helper function rlookup_amd_iommu() to reverse-lookup each iommu for a particular device. Patches 14 - 27 adapt to the per PCI segment data structure and remove the global data structures. Patch 28 fixes the flushing logic to flush up to last_bdf. Patches 29 - 35 convert usages of the 16-bit PCI device ID to include the 16-bit segment ID. v3 patchset: https://lore.kernel.org/linux-iommu/20220511072141.15485-1-vasant.he...@amd.com/ Changes from v2 -> v3: - Addressed Joerg's review comments - Fixed typo in patch 1 subject - Fixed a few minor things in patch 2 - Merged patches 27 - 29 into one patch - Added new macros to get seg and devid from sbdf - Patch 32 : Extended devid to 32bit and added new macro.
v2 patchset : https://lore.kernel.org/linux-iommu/20220425113415.24087-1-vasant.he...@amd.com/T/#t Changes from v1 -> v2: - Updated patch 1 to include dev_is_pci() check v1 patchset : https://lore.kernel.org/linux-iommu/20220404100023.324645-1-vasant.he...@amd.com/T/#t Changes from RFC -> v1: - Rebased patches on top of iommu/next tree. - Update struct iommu_dev_data definition - Updated few log message to print segment ID - Fix smatch warnings RFC patchset : https://lore.kernel.org/linux-iommu/20220311094854.31595-1-vasant.he...@amd.com/T/#t Regards, Vasant Suravee Suthikulpanit (20): iommu/amd: Introduce per PCI segment device table iommu/amd: Introduce per PCI segment rlookup table iommu/amd: Introduce per PCI segment old_dev_tbl_cpy iommu/amd: Introduce per PCI segment alias_table iommu/amd: Convert to use rlookup_amd_iommu helper function iommu/amd: Update irq_remapping_alloc to use IOMMU lookup helper function iommu/amd: Introduce struct amd_ir_data.iommu iommu/amd: Update amd_irte_ops functions iommu/amd: Update alloc_irq_table and alloc_irq_index iommu/amd: Update set_dte_entry and clear_dte_entry iommu/amd: Update iommu_ignore_device iommu/amd: Update dump_dte_entry iommu/amd: Update set_dte_irq_entry iommu/amd: Update (un)init_device_table_dma() iommu/amd: Update set_dev_entry_bit() and get_dev_entry_bit() iommu/amd: Remove global amd_iommu_[dev_table/alias_table/last_bdf] iommu/amd: Introduce get_device_sbdf_id() helper function iommu/amd: Include PCI segment ID when initialize IOMMU iommu/amd: Specify PCI segment ID when getting pci device iommu/amd: Add PCI segment support for ivrs_[ioapic/hpet/acpihid] commands Vasant Hegde (15): iommu/amd: Update struct iommu_dev_data definition iommu/amd: Introduce pci segment structure iommu/amd: Introduce per PCI segment irq_lookup_table iommu/amd: Introduce per PCI segment dev_data_list iommu/amd: Introduce per PCI segment unity map list iommu/amd: Introduce per PCI segment last_bdf iommu/amd: Introduce per PCI 
segment device table size iommu/amd: Introduce per PCI segment alias table size iommu/amd: Introduce per PCI segment rlookup table size iommu/amd: Convert to use per PCI segment irq_lookup_table iommu/amd: Convert to use per PCI segment rlookup_table iommu/amd: Flush upto last_bdf only iommu/amd: Print PCI segment ID in error log messages iommu/amd: Update device_state structure to include PCI seg ID iommu/amd: Update amd_iommu_fault structure to include PCI seg ID .../admin-guide/kernel-parameters.txt | 34 +- drivers/iommu/amd/amd_iommu.h | 13 +- drivers/iommu/amd/amd_iommu_types.h | 133 +++- drivers/iommu/amd/init.c | 687 +++--- drivers/iommu/amd/iommu.c | 563 -- drivers/iommu/amd/iommu_v2.c | 67 +-
Re: [PATCH] iommu/amd: Handle return of iommu_device_sysfs_add
On Fri, Jul 01, 2022 at 02:20:08AM -0400, Bo Liu wrote: > As iommu_device_sysfs_add() can fail, we should check the return value. > > Signed-off-by: Bo Liu > --- > drivers/iommu/amd/init.c | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) Applied, thanks.
Re: [PATCH v2] iommu/iova: change IOVA_MAG_SIZE to 127 to save memory
On Sun, Jul 03, 2022 at 07:44:50PM +0800, Feng Tang wrote: > kmalloc will round up the request size to power of 2, and current > iova_magazine's size is 1032 (1024+8) bytes, so each instance > allocated will get 2048 bytes from kmalloc, causing around 1KB > waste. > > Change IOVA_MAG_SIZE from 128 to 127 to make size of 'iova_magazine' > 1024 bytes so that no memory will be wasted. > > Signed-off-by: Feng Tang > Acked-by: Robin Murphy Applied, thanks.
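The arithmetic in the quoted commit message can be checked with a small userspace model. This is a sketch, not the kernel code: it assumes a magazine laid out as one size field followed by the PFN array, models the allocator's rounding for requests of this size as "next power of two", and uses hypothetical names (struct mag128, struct mag127, round_up_pow2(), wasted_bytes()). The byte counts quoted in the message correspond to 8-byte unsigned long.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the magazine, one per IOVA_MAG_SIZE value. */
struct mag128 {
	unsigned long size;
	unsigned long pfns[128];
};

struct mag127 {
	unsigned long size;
	unsigned long pfns[127];
};

/* Model of the rounding described above: next power of two >= n. */
static size_t round_up_pow2(size_t n)
{
	size_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Bytes lost per allocation when a request of n bytes is rounded up. */
static size_t wasted_bytes(size_t n)
{
	return round_up_pow2(n) - n;
}
```

Under these assumptions a 1032-byte request lands in a 2048-byte slot and loses 1016 bytes, while dropping one array entry brings the structure to exactly 1024 bytes and the loss to zero.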
Re: [PATCH 0/1] iommu/vt-d: Fixes for v5.19-rc4
On Sat, Jun 25, 2022 at 09:34:29PM +0800, Lu Baolu wrote: > Hi Joerg, > > One fix is queued for v5.19. It aims to fix: > > - RID2PASID setup/teardown failures for pci alias devices > > Please consider it for the iommu/fix branch. > > Best regards, > Lu Baolu > > Lu Baolu (1): > iommu/vt-d: Fix RID2PASID setup/teardown failure Queued, thanks Baolu and sorry for the delay.
Re: [PATCH] iommu/exynos: Make driver independent of the system page size
On Thu, Jun 23, 2022 at 11:36:29AM +0200, Marek Szyprowski wrote: > drivers/iommu/exynos-iommu.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) Applied, thanks.
Re: [PATCH 0/3] iommu: More internal ops cleanup
On Tue, Jun 21, 2022 at 04:14:24PM +0100, Robin Murphy wrote: > Robin Murphy (3): > iommu: Use dev_iommu_ops() for probe_finalize > iommu: Make .release_device optional > iommu: Clean up release_device checks Applied to core branch, thanks.
Re: [PATCH v2 09/14] iommu/ipmmu-vmsa: Clean up bus_set_iommu()
On 2022-07-06 09:38, Alexey Kardashevskiy wrote: > On 28/04/2022 23:18, Robin Murphy wrote: >> Stop calling bus_set_iommu() since it's now unnecessary. This also leaves the custom initcall effectively doing nothing but register the driver, which no longer needs to happen early either, so convert it to builtin_platform_driver(). >> >> Signed-off-by: Robin Murphy >> --- >> drivers/iommu/ipmmu-vmsa.c | 35 +-- >> 1 file changed, 1 insertion(+), 34 deletions(-) >> >> diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c >> index 8fdb84b3642b..2549d32f0ddd 100644 >> --- a/drivers/iommu/ipmmu-vmsa.c >> +++ b/drivers/iommu/ipmmu-vmsa.c >> @@ -1090,11 +1090,6 @@ static int ipmmu_probe(struct platform_device *pdev) >> ret = iommu_device_register(&mmu->iommu, &ipmmu_ops, &pdev->dev); >> if (ret) >> return ret; >> - >> -#if defined(CONFIG_IOMMU_DMA) >> - if (!iommu_present(&platform_bus_type)) >> - bus_set_iommu(&platform_bus_type, &ipmmu_ops); >> -#endif >> } >> /* > The comment which starts here did not make it to the patch but it should have as it mentions bus_set_iommu() which is gone by the end of the series. Heh, busted! In fact I think the whole point of that comment stops being true, but I couldn't be bothered to reason about it since one of the next steps after this is to start ripping all the arm_iommu_* stuff out anyway. > More general question/request - could you please include the exact sha1 the patchset is based on? It did not apply to any current trees and while it was trivial, it was slightly annoying to resolve the conflicts :) > > Thanks, v3 is based directly on 5.19-rc3: https://lore.kernel.org/lkml/cover.1657034827.git.robin.mur...@arm.com/ And if it helps I have it on a branch here as well: https://gitlab.arm.com/linux-arm/linux-rm/-/tree/bus-set-iommu-v3 Robin.
Re: [PATCH v13 0/9] ACPI/IORT: Support for IORT RMR node
On Tue, Jun 28, 2022 at 07:59:39AM +0000, Shameerali Kolothum Thodi wrote: > Now that we have all the required acks, could you please pick this series via > IOMMU tree? Applied to core branch, thanks.
Re: [PATCH] iommu/vt-d: Fix PCI bus rescan device hot add
On Fri, Jun 24, 2022 at 02:12:28PM +0800, Baolu Lu wrote: > It makes sense as far as I am aware. By putting IOMMUs in pass-through > mode, there will be no run-time costs and things could be simplified a > lot. > > Besides the refactoring efforts, we still need this quick fix so that > the fix could be propagated to various stable and vendors' downstream trees. Patch is applied now for 5.19.
[PATCH] MAINTAINERS: Remove iommu@lists.linux-foundation.org
From: Joerg Roedel The IOMMU mailing list has moved to io...@lists.linux.dev and the old list should bounce by now. Remove it from the MAINTAINERS file. Signed-off-by: Joerg Roedel --- MAINTAINERS | 11 --- 1 file changed, 11 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index 66bffb24a348..ead381fdfc5a 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -426,7 +426,6 @@ F: drivers/acpi/*thermal* ACPI VIOT DRIVER M: Jean-Philippe Brucker L: linux-a...@vger.kernel.org -L: iommu@lists.linux-foundation.org L: io...@lists.linux.dev S: Maintained F: drivers/acpi/viot.c @@ -960,7 +959,6 @@ F: drivers/video/fbdev/geode/ AMD IOMMU (AMD-VI) M: Joerg Roedel R: Suravee Suthikulpanit -L: iommu@lists.linux-foundation.org L: io...@lists.linux.dev S: Maintained T: git git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git @@ -5979,7 +5977,6 @@ DMA MAPPING HELPERS M: Christoph Hellwig M: Marek Szyprowski R: Robin Murphy -L: iommu@lists.linux-foundation.org L: io...@lists.linux.dev S: Supported W: http://git.infradead.org/users/hch/dma-mapping.git @@ -5992,7 +5989,6 @@ F: kernel/dma/ DMA MAPPING BENCHMARK M: Xiang Chen -L: iommu@lists.linux-foundation.org L: io...@lists.linux.dev F: kernel/dma/map_benchmark.c F: tools/testing/selftests/dma/ @@ -7577,7 +7573,6 @@ F: drivers/gpu/drm/exynos/exynos_dp* EXYNOS SYSMMU (IOMMU) driver M: Marek Szyprowski -L: iommu@lists.linux-foundation.org L: io...@lists.linux.dev S: Maintained F: drivers/iommu/exynos-iommu.c @@ -9999,7 +9994,6 @@ F: drivers/hid/intel-ish-hid/ INTEL IOMMU (VT-d) M: David Woodhouse M: Lu Baolu -L: iommu@lists.linux-foundation.org L: io...@lists.linux.dev S: Supported T: git git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git @@ -10379,7 +10373,6 @@ F: include/linux/iomap.h IOMMU DRIVERS M: Joerg Roedel M: Will Deacon -L: iommu@lists.linux-foundation.org L: io...@lists.linux.dev S: Maintained T: git git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git @@ -12539,7 +12532,6 @@ F: drivers/i2c/busses/i2c-mt65xx.c
MEDIATEK IOMMU DRIVER M: Yong Wu -L: iommu@lists.linux-foundation.org L: io...@lists.linux.dev L: linux-media...@lists.infradead.org (moderated for non-subscribers) S: Supported @@ -16591,7 +16583,6 @@ F: drivers/i2c/busses/i2c-qcom-cci.c QUALCOMM IOMMU M: Rob Clark -L: iommu@lists.linux-foundation.org L: io...@lists.linux.dev L: linux-arm-...@vger.kernel.org S: Maintained @@ -19217,7 +19208,6 @@ F: arch/x86/boot/video* SWIOTLB SUBSYSTEM M: Christoph Hellwig -L: iommu@lists.linux-foundation.org L: io...@lists.linux.dev S: Supported W: http://git.infradead.org/users/hch/dma-mapping.git @@ -21893,7 +21883,6 @@ XEN SWIOTLB SUBSYSTEM M: Juergen Gross M: Stefano Stabellini L: xen-de...@lists.xenproject.org (moderated for non-subscribers) -L: iommu@lists.linux-foundation.org L: io...@lists.linux.dev S: Supported F: arch/x86/xen/*swiotlb* -- 2.36.1
Re: [PATCH 2/2] x86/ACPI: Set swiotlb area according to the number of lapic entry in MADT
On 7/6/2022 5:02 PM, Christoph Hellwig wrote: > On Wed, Jul 06, 2022 at 04:57:33PM +0800, Tianyu Lan wrote: >> swiotlb_init() is called in the mem_init() of different architectures and memblock free pages are released to the buddy allocator just after calling swiotlb_init() via memblock_free_all(). > Yes. >> The mem_init() is called before smp_init(). > But why would that matter? cpu_possible_map is set up from setup_arch(), which is called before that. Sorry, I was still focused on the number of online CPUs, which is only known after smp_init(). The number of possible CPUs includes some offline CPUs. I will give it a try. Thanks for the suggestion.
Re: [PATCH v1 0/7] iommu/amd: Add Generic IO Page Table Framework Support for v2 Page Table
On Tue, Jun 28, 2022 at 02:35:51PM +0530, Vasant Hegde wrote: > Sorry. I didn't get last statement ("device identity maps DMA requests > without PASID"). > Can you please elaborate? When using v1 page-tables, each device supporting ATS/PRI/PASID needs to be direct-mapped, because the v1 page-tables basically act as a stage-2 page table for the PASID ones. But when the non-PASID case moves to the pasid==0 page-table, there is no stage-2 anymore, and a device can be used with ATS/PRI/PASID while non-PASID requests are translated too, no? I didn't get how this is handled in the current patch-set. Regards, Joerg
Re: [PATCH v1 7/7] iommu/amd: Introduce amd_iommu_pgtable command-line option
On Tue, Jun 28, 2022 at 01:23:52PM +0530, Vasant Hegde wrote: > I think it will complicate the parsing logic. We do have `amd_iommu=off` > option. > How are we going to handle `amd_iommu=off,[pgtable_v1/v2]` ? In that case everything except 'off' will be ignored. The driver might set its internal variables, but this has no effect as the driver never initializes. Regards, Joerg
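The behaviour described in this reply, where every token is parsed but "off" makes the rest irrelevant, can be sketched as a small comma-separated option parser. This is a hypothetical userspace illustration, not the driver's parser: struct iommu_opts, parse_opts() and effective_pgtable() are invented names, the token spellings follow the `amd_iommu=off,[pgtable_v1/v2]` form quoted above, and defaulting to version 1 is an assumption.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical option state; names are for illustration only. */
struct iommu_opts {
	bool disabled;
	int pgtable_version;	/* 1 or 2 */
};

/* Parse a comma-separated option string such as "off,pgtable_v2". */
static void parse_opts(const char *arg, struct iommu_opts *opts)
{
	opts->disabled = false;
	opts->pgtable_version = 1;	/* assumed default */

	while (*arg) {
		size_t len = strcspn(arg, ",");

		if (len == 3 && strncmp(arg, "off", len) == 0)
			opts->disabled = true;
		else if (len == 10 && strncmp(arg, "pgtable_v1", len) == 0)
			opts->pgtable_version = 1;
		else if (len == 10 && strncmp(arg, "pgtable_v2", len) == 0)
			opts->pgtable_version = 2;

		arg += len;
		if (*arg == ',')
			arg++;
	}
}

/* The page-table choice only matters if the driver initializes at all. */
static int effective_pgtable(const struct iommu_opts *opts)
{
	return opts->disabled ? 0 : opts->pgtable_version;
}
```

The parser stays simple because it never has to special-case combinations: the internal variable is set either way, and whether it has any effect is decided later by whether the driver runs.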
Re: [PATCH 2/2] x86/ACPI: Set swiotlb area according to the number of lapic entry in MADT
On Wed, Jul 06, 2022 at 04:57:33PM +0800, Tianyu Lan wrote: > Swiotlb_init() is called in the mem_init() of different architects and > memblock free pages are released to the buddy allocator just after > calling swiotlb_init() via memblock_free_all(). Yes. > The mem_init() is called before smp_init(). But why would that matter? cpu_possible_map is set up from setup_arch(), which is called before that.
Re: [PATCH 2/2] x86/ACPI: Set swiotlb area according to the number of lapic entry in MADT
On 7/6/2022 4:00 PM, Christoph Hellwig wrote: > On Fri, Jul 01, 2022 at 01:02:21AM +0800, Tianyu Lan wrote: >>> Can we reorder that initialization? Because I really hate having to have an arch hook in every architecture. >> How about using "flags" parameter of swiotlb_init() to pass area number or add new parameter for area number? I just reposted patch 1 since there is just some coding style issue and area number may also set via swiotlb kernel parameter. We still need figure out a good solution to pass area number from architecture code. > What is the problem with calling swiotlb_init after nr_possible_cpus() works? swiotlb_init() is called in the mem_init() of the different architectures, and memblock free pages are released to the buddy allocator just after calling swiotlb_init(), via memblock_free_all(). mem_init() is called before smp_init(). If swiotlb_init() is called after smp_init(), we can no longer allocate a large chunk of low-end memory via memblock_alloc() in swiotlb. swiotlb_init() would need to be reworked to allocate memory from the buddy allocator, just like swiotlb_init_late() does. This will limit the bounce buffer size. Otherwise we need to do the reordering for all architectures, and there may be other unknown issues. The flags parameter of swiotlb_init() seems to be a good place to pass the area number in the current code. If the area number/flag is not set, the area number will be one, keeping the original behavior of a single global spinlock protecting the io tlb data structure.
Re: [PATCH v2 09/14] iommu/ipmmu-vmsa: Clean up bus_set_iommu()
On 28/04/2022 23:18, Robin Murphy wrote:
> Stop calling bus_set_iommu() since it's now unnecessary. This also
> leaves the custom initcall effectively doing nothing but register
> the driver, which no longer needs to happen early either, so convert
> it to builtin_platform_driver().
>
> Signed-off-by: Robin Murphy
> ---
>  drivers/iommu/ipmmu-vmsa.c | 35 +--
>  1 file changed, 1 insertion(+), 34 deletions(-)
>
> diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
> index 8fdb84b3642b..2549d32f0ddd 100644
> --- a/drivers/iommu/ipmmu-vmsa.c
> +++ b/drivers/iommu/ipmmu-vmsa.c
> @@ -1090,11 +1090,6 @@ static int ipmmu_probe(struct platform_device *pdev)
>  		ret = iommu_device_register(&mmu->iommu, &ipmmu_ops, &pdev->dev);
>  		if (ret)
>  			return ret;
> -
> -#if defined(CONFIG_IOMMU_DMA)
> -		if (!iommu_present(&platform_bus_type))
> -			bus_set_iommu(&platform_bus_type, &ipmmu_ops);
> -#endif
>  	}
>
>  	/*

The comment which starts here did not make it to the patch but it should
have as it mentions bus_set_iommu() which is gone by the end of the series.

More general question/request - could you please include the exact sha1
the patchset is based on? It did not apply to any current trees and while
it was trivial, it was slightly annoying to resolve the conflicts :)
Thanks,

> @@ -1168,32 +1163,4 @@ static struct platform_driver ipmmu_driver = {
>  	.probe = ipmmu_probe,
>  	.remove = ipmmu_remove,
>  };
> -
> -static int __init ipmmu_init(void)
> -{
> -	struct device_node *np;
> -	static bool setup_done;
> -	int ret;
> -
> -	if (setup_done)
> -		return 0;
> -
> -	np = of_find_matching_node(NULL, ipmmu_of_ids);
> -	if (!np)
> -		return 0;
> -
> -	of_node_put(np);
> -
> -	ret = platform_driver_register(&ipmmu_driver);
> -	if (ret < 0)
> -		return ret;
> -
> -#if defined(CONFIG_ARM) && !defined(CONFIG_IOMMU_DMA)
> -	if (!iommu_present(&platform_bus_type))
> -		bus_set_iommu(&platform_bus_type, &ipmmu_ops);
> -#endif
> -
> -	setup_done = true;
> -	return 0;
> -}
> -subsys_initcall(ipmmu_init);
> +builtin_platform_driver(ipmmu_driver);

--
Alexey
Re: [PATCH 2/2] x86/ACPI: Set swiotlb area according to the number of lapic entry in MADT
On Fri, Jul 01, 2022 at 01:02:21AM +0800, Tianyu Lan wrote:
> > Can we reorder that initialization? Because I really hate having
> > to have an arch hook in every architecture.
>
> How about using "flags" parameter of swiotlb_init() to pass area number
> or add new parameter for area number?
>
> I just reposted patch 1 since there is just some coding style issue and area
> number may also set via swiotlb kernel parameter. We still need figure out a
> good solution to pass area number from architecture code.

What is the problem with calling swiotlb_init after nr_possible_cpus()
works?
Re: [PATCH v7 20/21] PCI/P2PDMA: Introduce pci_mmap_p2pmem()
On Wed, Jul 06, 2022 at 08:51:27AM +0200, Christoph Hellwig wrote:
> On Tue, Jul 05, 2022 at 12:16:45PM -0600, Logan Gunthorpe wrote:
> > The current version does it through a char device, but that requires
> > creating a simple_fs and anon_inode for teardown on driver removal, plus
> > a bunch of hooks through the driver that exposes it (NVMe, in this case)
> > to set this all up.
> >
> > Christoph is suggesting a sysfs interface which could potentially avoid
> > the anon_inode and all of the extra hooks. It has some significant
> > benefits and maybe some small downsides, but I wouldn't describe it as
> > horrid.
>
> Yeah, I don't think it is horrible, it fits in with the resource files
> for the BARs, and solves a lot of problems. Greg, can you explain
> what would be so bad about it?

As you mention, you will have to pass different things down into sysfs
in order for that to be possible. If it matches the resource files like
we currently have today, that might not be that bad, but it still feels
odd to me. Let's see an implementation and a Documentation/ABI/ entry
first though.

thanks,

greg k-h
Re: [PATCH v7 20/21] PCI/P2PDMA: Introduce pci_mmap_p2pmem()
On Tue, Jul 05, 2022 at 12:16:45PM -0600, Logan Gunthorpe wrote:
> The current version does it through a char device, but that requires
> creating a simple_fs and anon_inode for teardown on driver removal, plus
> a bunch of hooks through the driver that exposes it (NVMe, in this case)
> to set this all up.
>
> Christoph is suggesting a sysfs interface which could potentially avoid
> the anon_inode and all of the extra hooks. It has some significant
> benefits and maybe some small downsides, but I wouldn't describe it as
> horrid.

Yeah, I don't think it is horrible, it fits in with the resource files
for the BARs, and solves a lot of problems. Greg, can you explain
what would be so bad about it?
Re: [PATCH v1 03/16] dt-bindings: power: mediatek: Refine multiple level power domain nodes
On Tue, 2022-07-05 at 14:57 -0600, Rob Herring wrote:
> On Mon, Jul 04, 2022 at 06:00:15PM +0800, Tinghan Shen wrote:
> > Extract duplicated properties and support more levels of power
> > domain nodes.
> >
> > This change fix following error when do dtbs_check,
> > arch/arm64/boot/dts/mediatek/mt8195-evb.dtb: power-controller:
> > power-domain@15:
> > power-domain@16:power-domain@18: 'power-domain@19', 'power-domain@20',
> > 'power-domain@21' do not
> > match any of the regexes: 'pinctrl-[0-9]+'
> > From schema:
> > Documentation/devicetree/bindings/power/mediatek,power-controller.yaml
> >
> > Signed-off-by: Tinghan Shen
> > ---
> > .../power/mediatek,power-controller.yaml | 132 ++
> > 1 file changed, 12 insertions(+), 120 deletions(-)
> >
> > diff --git
> > a/Documentation/devicetree/bindings/power/mediatek,power-controller.yaml
> > b/Documentation/devicetree/bindings/power/mediatek,power-controller.yaml
> > index 135c6f722091..09a537a802b8 100644
> > --- a/Documentation/devicetree/bindings/power/mediatek,power-controller.yaml
> > +++ b/Documentation/devicetree/bindings/power/mediatek,power-controller.yaml
> > @@ -39,8 +39,17 @@ properties:
> > '#size-cells':
> > const: 0
> >
> > +required:
> > + - compatible
> > +
> > +additionalProperties: false
> > +
> > patternProperties:
> > "^power-domain@[0-9a-f]+$":
> > +$ref: "#/$defs/power-domain-node"
> > +
> > +$defs:
> > + power-domain-node:
> > type: object
> > description: |
> > Represents the power domains within the power controller node as
> > documented
> > @@ -98,127 +107,10 @@ patternProperties:
> > $ref: /schemas/types.yaml#/definitions/phandle
> > description: phandle to the device containing the SMI register
> > range.
> >
> > -patternProperties:
> > - "^power-domain@[0-9a-f]+$":
> > -type: object
> > -description: |
> > - Represents a power domain child within a power domain parent
> > node.
> > -
> > -properties:
> > -
> > - '#power-domain-cells':
> > -description:
> > - Must be 0 for nodes representing a single PM domain and 1
> > for nodes
> > - providing multiple PM domains.
> > -
> > - '#address-cells':
> > -const: 1
> > -
> > - '#size-cells':
> > -const: 0
> > -
> > - reg:
> > -maxItems: 1
> > -
> > - clocks:
> > -description: |
> > - A number of phandles to clocks that need to be enabled
> > during domain
> > - power-up sequencing.
> > -
> > - clock-names:
> > -description: |
> > - List of names of clocks, in order to match the power-up
> > sequencing
> > - for each power domain we need to group the clocks by name.
> > BASIC
> > - clocks need to be enabled before enabling the corresponding
> > power
> > - domain, and should not have a '-' in their name (i.e mm,
> > mfg, venc).
> > - SUSBYS clocks need to be enabled before releasing the bus
> > protection,
> > - and should contain a '-' in their name (i.e mm-0, isp-0,
> > cam-0).
> > -
> > - In order to follow properly the power-up sequencing, the
> > clocks must
> > - be specified by order, adding first the BASIC clocks
> > followed by the
> > - SUSBSYS clocks.
> > -
> > - domain-supply:
> > -description: domain regulator supply.
> > -
> > - mediatek,infracfg:
> > -$ref: /schemas/types.yaml#/definitions/phandle
> > -description: phandle to the device containing the INFRACFG
> > register range.
> > -
> > - mediatek,smi:
> > -$ref: /schemas/types.yaml#/definitions/phandle
> > -description: phandle to the device containing the SMI register
> > range.
> > -
> > -patternProperties:
> > - "^power-domain@[0-9a-f]+$":
> > -type: object
> > -description: |
> > - Represents a power domain child within a power domain parent
> > node.
> > -
> > -properties:
> > + required:
> > +- reg
> >
> > - '#power-domain-cells':
> > -description:
> > - Must be 0 for nodes representing a single PM domain and
> > 1 for nodes
> > - providing multiple PM domains.
> > -
> > - '#address-cells':
> > -const: 1
> > -
> > - '#size-cells':
> > -const: 0
> > -
> > - reg:
> > -maxItems: 1
> > -
> > - clocks:
> > -description: |
> > - A number of phandles to clocks that need to be enabled
> > during domain
> > - power-up sequencing.
> > -
> > - clock-names:
> > -description: |
> >