Re: [PATCH] iommu/io-pgtable-arm: Optimize partial walk flush for large scatter-gather list
Hi Krishna, On 2021-06-11 22:19, Krishna Reddy wrote: Hi Sai, >> > No, the unmap latency is not just in some test case written, the >> > issue is very real and we have workloads where camera is reporting >> > frame drops because of this unmap latency in the order of 100s of milliseconds. Not exactly, this issue is not specific to camera. If you look at the numbers in the commit text, even for the test device its the same observation. It depends on the buffer size we are unmapping which affects the number of TLBIs issue. I am not aware of any such HW side bw issues for camera specifically on QCOM devices. It is clear that reducing number of TLBIs reduces the umap API latency. But, It is at the expense of throwing away valid tlb entries. Quantifying the impact of arbitrary invalidation of valid tlb entries at context level is not straight forward and use case dependent. The side-effects might be rare or won't be known until they are noticed. Right but we won't know until we profile the specific usecases or try them in generic workload to see if they affect the performance. Sure, over invalidation is a concern where multiple buffers can be mapped to same context and the cache is not usable at the time for lookup and such but we don't do it for small buffers and only for large buffers which means thousands of TLB entry mappings in which case TLBIASID is preferred (note: I mentioned the HW team recommendation to use it for anything greater than 128 TLB entries) in my earlier reply. And also note that we do this only for partial walk flush, we are not arbitrarily changing all the TLBIs to ASID based. Can you provide more details on How the unmap latency is causing camera to drop frames? Is unmap performed in the perf path? I am no camera expert but from what the camera team mentioned is that there is a thread which frees memory(large unused memory buffers) periodically which ends up taking around 100+ms and causing some camera test failures with frame drops. Parallel efforts are already being made to optimize this usage of thread but as I mentioned previously, this is *not a camera specific*, lets say someone else invokes such large unmaps, it's going to face the same issue. If unmap is queued and performed on a back ground thread, would it resolve the frame drops? Not sure I understand what you mean by queuing on background thread but with that or not, we still do the same number of TLBIs and hop through iommu->io-pgtable->arm-smmu to perform the the unmap, so how will that help? Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v12 5/5] iommu: Remove mode argument from iommu_set_dma_strict()
On 2021/6/11 20:20, John Garry wrote: diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index ccbd5d4c1a50..146cb71c7441 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -350,10 +350,9 @@ static int __init iommu_dma_setup(char *str) } early_param("iommu.strict", iommu_dma_setup); -void iommu_set_dma_strict(bool strict) +void iommu_set_dma_strict(void) { - if (strict || !(iommu_cmd_line & IOMMU_CMD_LINE_STRICT)) - iommu_dma_strict = strict; + iommu_dma_strict = true; } Will this change break the functionality of iommu.strict? Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v12 3/5] iommu/vt-d: Add support for IOMMU default DMA mode build options
On 2021/6/11 20:20, John Garry wrote: @@ -453,8 +452,7 @@ static int __init intel_iommu_setup(char *str) pr_warn("intel_iommu=forcedac deprecated; use iommu.forcedac instead\n"); iommu_dma_forcedac = true; } else if (!strncmp(str, "strict", 6)) { - pr_info("Disable batched IOTLB flush\n"); - intel_iommu_strict = 1; + iommu_set_dma_strict(true); I would like to deprecate this command line and ask users to use iommu.strict instead. --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -436,7 +436,7 @@ static int __init intel_iommu_setup(char *str) pr_warn("intel_iommu=forcedac deprecated; use iommu.forcedac instead\n"); iommu_dma_forcedac = true; } else if (!strncmp(str, "strict", 6)) { - pr_info("Disable batched IOTLB flush\n"); + pr_warn("intel_iommu=strict deprecated; use iommu.strict instead\n"); intel_iommu_strict = 1; Also update Documentation/admin-guide/kernel-parameters.txt accordingly. Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v12 3/5] iommu/vt-d: Add support for IOMMU default DMA mode build options
On 2021/6/11 20:20, John Garry wrote: diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index 2a71347611d4..4467353f981b 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -94,6 +94,7 @@ choice prompt "IOMMU default DMA mode" depends on IOMMU_DMA + default IOMMU_DEFAULT_LAZY if INTEL_IOMMU default IOMMU_DEFAULT_STRICT If two default values are different. Which one will be overridden? Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v12 2/5] iommu: Enhance IOMMU default DMA mode build options
On 2021/6/11 20:20, John Garry wrote: +choice + prompt "IOMMU default DMA mode" This is not explicit. How about "IOMMU DMA default cache invalidation policy" ? Best regards, baolu + depends on IOMMU_DMA + + default IOMMU_DEFAULT_STRICT + help + This option allows an IOMMU DMA mode to be chosen at build time, to + override the default DMA mode of each ARCH, removing the need to + pass in kernel parameters through command line. It is still possible + to provide ARCH-specific or common boot options to override this + option. + + If unsure, keep the default. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: Plan for /dev/ioasid RFC v2
On Fri, Jun 11, 2021 at 01:38:28PM -0600, Alex Williamson wrote: > That's fine for a serial port, but not a device that can do DMA. > The entire point of vfio is to try to provide secure, DMA capable > userspace drivers. If we relax enforcement of that isolation we've > failed. I don't understand why the IOASID matters at all in this. Can you explain? What is the breach of isolation? A userspace process can create many IOASIDs, it can create an IOASID that can touch any memory the process can touch. It can create two IOASID's that are identical copies of each other. How does restricting a device from attaching to an IOASID create security if I can just make a copy of that IOASID and attach to that? Is there some quirk of the IOMMU I'm missing? My understanding of isolation has been that two different security contexts cannot have access to devices in the same group because that can leak access across a security bounday, eg because device A can do DMA to device B and take control of it. Isolation means that the control of the devices in a group is not inadventantly spread between two security contexts, like two processes. > I don't see how this provides isolation. If a user only needs to > attach their devicefd to an ioasidfd to have full access to their > device, not even bound by attaching to an ioasid context, then we've > failed. That is not quite what I tried to explain. The first ioasid any device in the group attaches to becomes the only ioasid that any device in the group attaches to. It is an ownership model unique to the iommu_fd. It directly prevents process A and process B from opening devices in the same group and trying to operate them independently. A and B will not posses the same iommu fd so only one of them can activate a device. The other device remains unusable. Iin this model, I consider the iommu_fd to be the security domain. The meaning of isolation is that only devices explicitly joined to an iommu_fd can access IOASIDs in that iommu_fd. Userspace has choices how to use this Placing all devices in the same iommu_fd userspace is telling the kernel that they are in the same security domain. This means userspace says is safe for them to all share IOASIDs without isolation. If userspace wants tigher security domains then userspace can create additional iommu_fds, up to a unique iommu_fd per group. This would duplicate the security model that the vfio groups force today. The kernel security feature is to prevent un-isolated devices from being joined to different iommu_fd security contexts. Jason ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v12 2/5] iommu: Enhance IOMMU default DMA mode build options
On 2021/6/11 20:20, John Garry wrote: +config IOMMU_DEFAULT_LAZY + bool "lazy" + help + Support lazy mode, where for every IOMMU DMA unmap operation, the + flush operation of IOTLB and the free operation of IOVA are deferred. + They are only guaranteed to be done before the related IOVA will be + reused. + + The isolation provided in this mode is not as secure as STRICT mode, + such that a vulnerable time window may be created between the DMA + unmap and the mapping finally being torn down in the IOMMU, where the + device can still access the system memory. However this mode may " ... and the mappings cached in the IOMMU IOTLB or device TLB finally being invalidated, where the device probably can still access the memory which has already been unmapped by the device driver." Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: Plan for /dev/ioasid RFC v2
On Fri, 11 Jun 2021 00:58:35 + "Tian, Kevin" wrote: > Hi, Alex, > > > From: Alex Williamson > > Sent: Thursday, June 10, 2021 11:39 PM > > > > On Wed, 9 Jun 2021 15:49:40 -0300 > > Jason Gunthorpe wrote: > > > > > On Wed, Jun 09, 2021 at 10:27:22AM -0600, Alex Williamson wrote: > > > > > > > > > It is a kernel decision, because a fundamental task of the kernel > > > > > > is to > > > > > > ensure isolation between user-space tasks as good as it can. And if > > > > > > a > > > > > > device assigned to one task can interfer with a device of another > > > > > > task > > > > > > (e.g. by sending P2P messages), then the promise of isolation is > > broken. > > > > > > > > > > AIUI, the IOASID model will still enforce IOMMU groups, but it's not > > > > > an > > > > > explicit part of the interface like it is for vfio. For example the > > > > > IOASID model allows attaching individual devices such that we have > > > > > granularity to create per device IOASIDs, but all devices within an > > > > > IOMMU group are required to be attached to an IOASID before they can > > be > > > > > used. > > > > > > Yes, thanks Alex > > > > > > > > It's not entirely clear to me yet how that last bit gets > > > > > implemented though, ie. what barrier is in place to prevent device > > > > > usage prior to reaching this viable state. > > > > > > The major security checkpoint for the group is on the VFIO side. We > > > must require the group before userspace can be allowed access to any > > > device registers. Obtaining the device_fd from the group_fd does this > > > today as the group_fd is the security proof. > > > > > > Actually, thinking about this some more.. If the only way to get a > > > working device_fd in the first place is to get it from the group_fd > > > and thus pass a group-based security check, why do we need to do > > > anything at the ioasid level? > > > > > > The security concept of isolation was satisfied as soon as userspace > > > opened the group_fd. What do more checks in the kernel accomplish? > > > > Opening the group is not the extent of the security check currently > > required, the group must be added to a container and an IOMMU model > > configured for the container *before* the user can get a devicefd. > > Each devicefd creates a reference to this security context, therefore > > access to a device does not exist without such a context. > > IIUC each device has a default domain when it's probed by iommu driver > at boot time. This domain includes an empty page table, implying that > device is already in a security context before it's probed by device driver. The default domain could be passthrough though, right? > Now when this device is added to vfio, vfio creates another security > context through above sequence. This sequence requires the device to > switch from default security context to this new one, before it can be > accessed by user. This is true currently, we use group semantics with the type1 IOMMU backend to attach all devices in the group to a secure context, regardless of the default domain. > Then I wonder whether it's really necessary. As long as a device is in > a security context at any time, access to a device can be allowed. The > user itself should ensure that the access happens only after the device > creates a reference to the new security context that is desired by this > user. > > Then what does group really bring to us? By definition an IOMMU group is the smallest set of devices that we can consider isolated from all other devices. Therefore devices in a group are not necessarily isolated from each other. Therefore if any device within a group is not isolated, the group is not isolated. VFIO needs to know when it's safe to provide userspace access to the device, but the device isolation is dependent on the group isolation. The group is therefore part of this picture whether implicit or explicit. > With this new proposal we just need to make sure that a device cannot > be attached to any IOASID before all devices in its group are bound to > the IOASIDfd. If we want to start with a vfio-like policy, then all devices > in the group must be attached to the same IOASID. Or as Jason suggests, > they can attach to different IOASIDs (if in the group due to !ACS) if the > user wants, or have some devices attached while others detached since > both are in a security context anyway. But if it's the device attachment to the IOASID that provides the isolation and the user might attach a device to multiple IOASIDs within the same IOASIDfd, and presumably make changes to the mapping of device to IOASID dynamically, are we interrupting user access around each of those changes? How would vfio be able to track this, and not only track it per device, but for all devices in the group. Suggesting a user needs to explicitly attach every device in the group is also a semantic change versus existing vfio, where other devices in the group must only be
Re: Plan for /dev/ioasid RFC v2
On Fri, 11 Jun 2021 13:45:29 -0300 Jason Gunthorpe wrote: > On Thu, Jun 10, 2021 at 09:38:42AM -0600, Alex Williamson wrote: > > > Opening the group is not the extent of the security check currently > > required, the group must be added to a container and an IOMMU model > > configured for the container *before* the user can get a devicefd. > > Each devicefd creates a reference to this security context, therefore > > access to a device does not exist without such a context. > > Okay, I missed that detail in the organization.. > > So, if we have an independent vfio device fd then it needs to be > kept disable until the user joins it to an ioasid that provides the > security proof to allow it to work? Yes, the user would effectively get a dummy fd with no device access until not only that device, but every device in the IOMMU group is attached to a secure context. Then we get into questions about whether devices can be moved between contexts/ioasids within the same ioasidfd and what that implies to both the device and all other devices within the group as a device is transitioned and the system is potentially exposed. > > What happens on detach? As we've discussed elsewhere in this thread, > > revoking access is more difficult than holding a reference to the > > secure context, but I'm under the impression that moving a device > > between IOASIDs could be standard practice in this new model. A device > > that's detached from a secure context, even temporarily, is a > > problem. > > This is why I think the single iommu FD is critical, it is the FD, not > the IOASID that has to authorize the security. You shouldn't move > devices between FDs, but you can move them between IOASIDs inside the > same FD. Right, but that doesn't solve the issue. Removing a device from one isolated context, even if to move it to another isolated context within the same ioasidfd exposes the device and has implications for all devices within the group. > > How to label a device seems like a relatively mundane issue relative to > > ownership and isolated contexts of groups and devices. The label is > > essentially just creating an identifier to device mapping, where the > > identifier (label) will be used in the IOASID interface, right? > > It looks that way > > > As I note above, that makes it difficult for vfio to maintain that a > > user only accesses a device in a secure context. This is exactly > > why vfio has the model of getting a devicefd from a groupfd only > > when that group is in a secure context and maintaining references to > > that secure context for each device. Split ownership of the secure > > context in IOASID vs device access in vfio and exposing devicefds > > outside the group is still a big question mark for me. Thanks, > > I think the protection model becomes different once we allow > individual devices inside a group to be attached to different > IOASID's. > > Now we just want some general authorization that the user is allowed > to operate the device_fd. That's fine for a serial port, but not a device that can do DMA. The entire point of vfio is to try to provide secure, DMA capable userspace drivers. If we relax enforcement of that isolation we've failed. > To keep a fairly similar model to the way vfio does things today.. > > - The device_fd is single open, so only one fd exists globally > > - Upon first joining the iommu_fd the group is obtained inside >the iommu_fd. This is only possible if no other iommu_fd has >obtained the group vfio_groups have an ownership model, iommu_groups do not. > - If the group can not be obtained then the device_fd is left >inoperable and cannot control the device > > - If multiple devices in the same group are joined then they all >refcount the group > > It is simple, and gives semantics similar to VFIO with the notable > difference that process can obtain a device FD, it is just inoperable > until the iommu_fd is attached. > > Removal is OK as if you remove the device_fd from the iommu_fd (only > allowed by closing it) then a newly opened FD is inoperable. I don't see how this provides isolation. If a user only needs to attach their devicefd to an ioasidfd to have full access to their device, not even bound by attaching to an ioasid context, then we've failed. All devices in a group must be bound to a secure context for the extent of the time that any device in the group is operated by a user. That seems non-negotiable to me. Thanks, Alex ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH 1/2] iommu: Fix race condition during default domain allocation
> > + mutex_lock(&group->mutex); > > iommu_alloc_default_domain(group, dev); > > + mutex_unlock(&group->mutex); > > It feels wrong to serialise this for everybody just to cater for systems with > aliasing SIDs between devices. Serialization is limited to devices in the same group. Unless devices share SID, they wouldn't be in same group. > Can you provide some more information about exactly what the h/w > configuration is, and the callstack which exhibits the race, please? The failure is an after effect and is a page fault. Don't have a failure call stack here. Ashish has traced it through print messages and he can provide them. >From the prints messages, The following was observed in page fault case: Device1: iommu_probe_device() --> iommu_alloc_default_domain() --> iommu_group_alloc_default_domain() --> __iommu_attach_device(group->default_domain) Device2: iommu_probe_device() --> iommu_alloc_default_domain() --> iommu_group_alloc_default_domain() --> __iommu_attach_device(group->default_domain) Both devices(with same SID) are entering into iommu_group_alloc_default_domain() function and each one getting attached to a different group->default_domain as the second one overwrites group->default_domain after the first one attaches to group->default_domain it has created. SMMU would be setup to use first domain for the context page table. Whereas all the dma map/unamp requests from second device would be performed on a domain that is not used by SMMU for context translations and IOVA (not mapped in first domain) accesses from second device lead to page faults. -KR ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] swiotlb-xen: override common mmap and get_sgtable dma ops
This commit is dedicated to fix incorrect convertion from cpu_addr to page address in cases when we get virtual address which allocated throught xen_swiotlb_alloc_coherent() and can be mapped in the vmalloc range. As the result, virt_to_page() cannot convert this address properly and return incorrect page address. Need to detect such cases and obtains the page address using vmalloc_to_page() instead. The reference code was copied from kernel/dma/ops_helpers.c and modified to provide additional detections as described above. Signed-off-by: Roman Skakun Reviewed-by: Andrii Anisov --- Also, I have observed that the original common code didn't make additional checks about contiguity of the memory region represented by cpu_addr and size May be, this means that these functions can get only physically contiguous memory. Is this correct? Cheers! --- drivers/xen/swiotlb-xen.c | 51 +-- 1 file changed, 49 insertions(+), 2 deletions(-) diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c index 2b385c1b4a99..f99c98472927 100644 --- a/drivers/xen/swiotlb-xen.c +++ b/drivers/xen/swiotlb-xen.c @@ -563,6 +563,53 @@ xen_swiotlb_dma_supported(struct device *hwdev, u64 mask) return xen_virt_to_bus(hwdev, xen_io_tlb_end - 1) <= mask; } +static int +xen_swiotlb_dma_mmap(struct device *dev, struct vm_area_struct *vma, + void *cpu_addr, dma_addr_t dma_addr, size_t size, + unsigned long attrs) +{ + unsigned long user_count = vma_pages(vma); + unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT; + unsigned long off = vma->vm_pgoff; + struct page *page; + + if (is_vmalloc_addr(cpu_addr)) + page = vmalloc_to_page(cpu_addr); + else + page = virt_to_page(cpu_addr); + + vma->vm_page_prot = dma_pgprot(dev, vma->vm_page_prot, attrs); + + if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret)) + return -ENXIO; + + if (off >= count || user_count > count - off) + return -ENXIO; + + return remap_pfn_range(vma, vma->vm_start, + page_to_pfn(page) + vma->vm_pgoff, + user_count << PAGE_SHIFT, vma->vm_page_prot); +} + +static int +xen_swiotlb_dma_get_sgtable(struct device *dev, struct sg_table *sgt, +void *cpu_addr, dma_addr_t dma_addr, size_t size, +unsigned long attrs) +{ + struct page *page; + int ret; + + if (is_vmalloc_addr(cpu_addr)) + page = vmalloc_to_page(cpu_addr); + else + page = virt_to_page(cpu_addr); + + ret = sg_alloc_table(sgt, 1, GFP_KERNEL); + if (!ret) + sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0); + return ret; +} + const struct dma_map_ops xen_swiotlb_dma_ops = { .alloc = xen_swiotlb_alloc_coherent, .free = xen_swiotlb_free_coherent, @@ -575,8 +622,8 @@ const struct dma_map_ops xen_swiotlb_dma_ops = { .map_page = xen_swiotlb_map_page, .unmap_page = xen_swiotlb_unmap_page, .dma_supported = xen_swiotlb_dma_supported, - .mmap = dma_common_mmap, - .get_sgtable = dma_common_get_sgtable, + .mmap = xen_swiotlb_dma_mmap, + .get_sgtable = xen_swiotlb_dma_get_sgtable, .alloc_pages = dma_common_alloc_pages, .free_pages = dma_common_free_pages, }; -- 2.27.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 4/6] iommu: Factor iommu_iotlb_gather_is_disjoint() out
> On Jun 11, 2021, at 6:57 AM, Will Deacon wrote: > > On Mon, Jun 07, 2021 at 11:25:39AM -0700, Nadav Amit wrote: >> From: Nadav Amit >> >> Refactor iommu_iotlb_gather_add_page() and factor out the logic that >> detects whether IOTLB gather range and a new range are disjoint. To be >> used by the next patch that implements different gathering logic for >> AMD. >> >> Cc: Joerg Roedel >> Cc: Will Deacon >> Cc: Jiajun Cao >> Cc: Robin Murphy >> Cc: Lu Baolu >> Cc: iommu@lists.linux-foundation.org >> Cc: linux-ker...@vger.kernel.org> >> Signed-off-by: Nadav Amit >> --- >> include/linux/iommu.h | 41 + >> 1 file changed, 33 insertions(+), 8 deletions(-) > > [...] > >> diff --git a/include/linux/iommu.h b/include/linux/iommu.h >> index f254c62f3720..b5a2bfc68fb0 100644 >> --- a/include/linux/iommu.h >> +++ b/include/linux/iommu.h >> @@ -497,6 +497,28 @@ static inline void iommu_iotlb_sync(struct iommu_domain >> *domain, >> iommu_iotlb_gather_init(iotlb_gather); >> } >> >> +/** >> + * iommu_iotlb_gather_is_disjoint - Checks whether a new range is disjoint >> + * >> + * @gather: TLB gather data >> + * @iova: start of page to invalidate >> + * @size: size of page to invalidate >> + * >> + * Helper for IOMMU drivers to check whether a new range is and the gathered >> + * range are disjoint. > > I can't quite parse this. Delete the "is"? Indeed. Will do (I mean I will do ;-) ) > >>For many IOMMUs, flushing the IOMMU in this case is >> + * better than merging the two, which might lead to unnecessary >> invalidations. >> + */ >> +static inline >> +bool iommu_iotlb_gather_is_disjoint(struct iommu_iotlb_gather *gather, >> +unsigned long iova, size_t size) >> +{ >> +unsigned long start = iova, end = start + size - 1; >> + >> +return gather->end != 0 && >> +(end + 1 < gather->start || start > gather->end + 1); >> +} >> + >> + >> /** >> * iommu_iotlb_gather_add_range - Gather for address-based TLB invalidation >> * @gather: TLB gather data >> @@ -533,20 +555,16 @@ static inline void iommu_iotlb_gather_add_page(struct >> iommu_domain *domain, >> struct iommu_iotlb_gather >> *gather, >> unsigned long iova, size_t size) >> { >> -unsigned long start = iova, end = start + size - 1; >> - >> /* >> * If the new page is disjoint from the current range or is mapped at >> * a different granularity, then sync the TLB so that the gather >> * structure can be rewritten. >> */ >> -if (gather->pgsize != size || >> -end + 1 < gather->start || start > gather->end + 1) { >> -if (gather->pgsize) >> -iommu_iotlb_sync(domain, gather); >> -gather->pgsize = size; >> -} >> +if ((gather->pgsize && gather->pgsize != size) || >> +iommu_iotlb_gather_is_disjoint(gather, iova, size)) >> +iommu_iotlb_sync(domain, gather); >> >> +gather->pgsize = size; > > Why have you made this unconditional? I think it's ok, but just not sure > if it's necessary or not. In regard to gather->pgsize, this function had (and has) an invariant, in which gather->pgsize always represents the flushing granularity of its range. Arguably, “size" should never be zero, but lets assume for the matter of discussion that it might. If “size” equals to “gather->pgsize”, then the assignment in question has no impact. Otherwise, if “size” is non-zero, then iommu_iotlb_sync() would initialize the size and range (see iommu_iotlb_gather_init()), and the invariant is kept. Otherwise, “size” is zero, and “gather” already holds a range, so gather->pgsize is non-zero and (gather->pgsize && gather->pgsize != size) is true. Therefore, again, iommu_iotlb_sync() would be called and initialize the size. I think that this change makes the code much simpler to read. It probably has no performance impact as “gather” is probably cached and anyhow accessed shortly after. If anything, I can add a VM_BUG_ON() to check “size” is non-zero, although this code seems correct regardless of that. signature.asc Description: Message signed with OpenPGP ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH] iommu/io-pgtable-arm: Optimize partial walk flush for large scatter-gather list
Hi Sai, > >> > No, the unmap latency is not just in some test case written, the > >> > issue is very real and we have workloads where camera is reporting > >> > frame drops because of this unmap latency in the order of 100s of > milliseconds. > Not exactly, this issue is not specific to camera. If you look at the numbers > in the > commit text, even for the test device its the same observation. It depends on > the buffer size we are unmapping which affects the number of TLBIs issue. I am > not aware of any such HW side bw issues for camera specifically on QCOM > devices. It is clear that reducing number of TLBIs reduces the umap API latency. But, It is at the expense of throwing away valid tlb entries. Quantifying the impact of arbitrary invalidation of valid tlb entries at context level is not straight forward and use case dependent. The side-effects might be rare or won't be known until they are noticed. Can you provide more details on How the unmap latency is causing camera to drop frames? Is unmap performed in the perf path? If unmap is queued and performed on a back ground thread, would it resolve the frame drops? -KR ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: Plan for /dev/ioasid RFC v2
On Thu, Jun 10, 2021 at 09:38:42AM -0600, Alex Williamson wrote: > Opening the group is not the extent of the security check currently > required, the group must be added to a container and an IOMMU model > configured for the container *before* the user can get a devicefd. > Each devicefd creates a reference to this security context, therefore > access to a device does not exist without such a context. Okay, I missed that detail in the organization.. So, if we have an independent vfio device fd then it needs to be kept disable until the user joins it to an ioasid that provides the security proof to allow it to work? > What happens on detach? As we've discussed elsewhere in this thread, > revoking access is more difficult than holding a reference to the > secure context, but I'm under the impression that moving a device > between IOASIDs could be standard practice in this new model. A device > that's detached from a secure context, even temporarily, is a > problem. This is why I think the single iommu FD is critical, it is the FD, not the IOASID that has to authorize the security. You shouldn't move devices between FDs, but you can move them between IOASIDs inside the same FD. > How to label a device seems like a relatively mundane issue relative to > ownership and isolated contexts of groups and devices. The label is > essentially just creating an identifier to device mapping, where the > identifier (label) will be used in the IOASID interface, right? It looks that way > As I note above, that makes it difficult for vfio to maintain that a > user only accesses a device in a secure context. This is exactly > why vfio has the model of getting a devicefd from a groupfd only > when that group is in a secure context and maintaining references to > that secure context for each device. Split ownership of the secure > context in IOASID vs device access in vfio and exposing devicefds > outside the group is still a big question mark for me. Thanks, I think the protection model becomes different once we allow individual devices inside a group to be attached to different IOASID's. Now we just want some general authorization that the user is allowed to operate the device_fd. To keep a fairly similar model to the way vfio does things today.. - The device_fd is single open, so only one fd exists globally - Upon first joining the iommu_fd the group is obtained inside the iommu_fd. This is only possible if no other iommu_fd has obtained the group - If the group can not be obtained then the device_fd is left inoperable and cannot control the device - If multiple devices in the same group are joined then they all refcount the group It is simple, and gives semantics similar to VFIO with the notable difference that process can obtain a device FD, it is just inoperable until the iommu_fd is attached. Removal is OK as if you remove the device_fd from the iommu_fd (only allowed by closing it) then a newly opened FD is inoperable. Jason ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: swiotlb/caamjr regression (Was: [GIT PULL] (swiotlb) stable/for-linus-5.12)
On Fri, Jun 11, 2021 at 3:35 AM Konrad Rzeszutek Wilk wrote: > > Linus, > > Would you be terribly offended if I took your code (s/unsigned > long/unsigned int), and used Chanho's description of the problem (see below)? No offense to that at all - that looks like the right solution. See my answer to Christoph: I do think my patch does the right one, but I can't test it and my knowledge of the swiotlb code is not complete enough to really do anything else than "this looks right". And adding my sign-off to the patch is fine, but I don't necessarily need the authorship credit - mine was a throw-away patch just looking at what the bisection report said. All the real effort was by the reporters (and for the commit message, Bumyong Lee & co). Finally - looking at the two places that do have that swiotlb_align_offset(), my reaction is "I don't understand what that code is doing". The index does that index = find_slots(dev, orig_addr, alloc_size + offset); so that offset does seem to depend on how the find_slots code works. Which in turn does use the same dma_get_min_align_mask() thing that swiotlb_align_offset() uses. So the offsets do seem to match, but find_slots(dev() does a lot of stuff that I don't know. so... IOW, it does reinforce my "I don't know this code AT ALL". Which just makes me more convinced that I shouldn't get authorship of the patch because if something goes wrong with it, I can't help. So at most maybe a "Suggested-by:". My patch really was based on very little context and "this is the calculation that makes sense given the other calculations in the function". Linus ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2] iommu/arm-smmu: Fix arm_smmu_device refcount leak in address translation
On Thu, 10 Jun 2021 10:49:20 +0800, Xiyu Yang wrote: > The reference counting issue happens in several exception handling paths > of arm_smmu_iova_to_phys_hard(). When those error scenarios occur, the > function forgets to decrease the refcount of "smmu" increased by > arm_smmu_rpm_get(), causing a refcount leak. > > Fix this issue by jumping to "out" label when those error scenarios > occur. Applied to will (for-joerg/arm-smmu/updates), thanks! [1/1] iommu/arm-smmu: Fix arm_smmu_device refcount leak in address translation https://git.kernel.org/will/c/7c8f176d6a3f Cheers, -- Will https://fixes.arm64.dev https://next.arm64.dev https://will.arm64.dev ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2] iommu/arm-smmu: Fix arm_smmu_device refcount leak when arm_smmu_rpm_get fails
On Thu, 10 Jun 2021 10:54:29 +0800, Xiyu Yang wrote: > arm_smmu_rpm_get() invokes pm_runtime_get_sync(), which increases the > refcount of the "smmu" even though the return value is less than 0. > > The reference counting issue happens in some error handling paths of > arm_smmu_rpm_get() in its caller functions. When arm_smmu_rpm_get() > fails, the caller functions forget to decrease the refcount of "smmu" > increased by arm_smmu_rpm_get(), causing a refcount leak. > > [...] Applied to will (for-joerg/arm-smmu/updates), thanks! [1/1] iommu/arm-smmu: Fix arm_smmu_device refcount leak when arm_smmu_rpm_get fails https://git.kernel.org/will/c/1adf30f198c2 Cheers, -- Will https://fixes.arm64.dev https://next.arm64.dev https://will.arm64.dev ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: swiotlb/caamjr regression (Was: [GIT PULL] (swiotlb) stable/for-linus-5.12)
On Thu, Jun 10, 2021 at 11:21 PM Christoph Hellwig wrote: > > FYI, there has been a patch on the list that should have fixed this > for about a month: > > https://lore.kernel.org/linux-iommu/20210510091816.ga2...@lst.de/T/#m0d0df6490350a08dcc24c9086c8edc165b402d6f Honestly, that patch is all kinds of strange. This expression: tlb_offset = (tlb_addr & (IO_TLB_SIZE - 1)) - swiotlb_align_offset(dev, orig_addr); makes no sense to me. Maybe it happens to work, but I think it does so by mistake rather than by design. What my patch used was: unsigned long offset = (tlb_addr - mem->start) & (IO_TLB_SIZE - 1); which actually pairs with - and makes sense with - the index calculation: int index = (tlb_addr - mem->start) >> IO_TLB_SHIFT; so that offset truly is the offset within that index. Look at how that 'index' calculation calculates the high bits of the difference, and the 'offset' calculation now literally is the low bits of the same thing that got dropped on the floor by the 'index' calculation? So those two calculations actually make sense. The swiotlb_align_offset() one doesn't. It's possible that that swiotlb_align_offset() function ends up giving the right answer just almost by mistake (because of how tlb_addr and orig_addr end up being related - the swiotlb_align_offset() expression might just end up being the same thing - I didn't look deeper), but even if the result is the same, it's not _sensible_ code, It's also possible that the swiotlb_align_offset() function ends up giving the right answer very much by design and because of how orig_addr works - because maybe the remapping is doing odd things and using that swiotlb_align_offset() function in ways that make the *obvious* and natural offset calculation not actually work. So it's at least in theory possible that my "natural offset" calculation that matches how the index is calculated doesn't actually work. But that means that the swiotlb remapping is doing some really odd things, and then I think the patch would need a lot more commentary on exactly what those very odd things are. Linus ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 00/15] Restricted DMA
v9 here: https://lore.kernel.org/patchwork/cover/1445081/ On Mon, Jun 7, 2021 at 11:28 AM Claire Chang wrote: > > On Sat, Jun 5, 2021 at 1:48 AM Will Deacon wrote: > > > > Hi Claire, > > > > On Thu, May 27, 2021 at 08:58:30PM +0800, Claire Chang wrote: > > > This series implements mitigations for lack of DMA access control on > > > systems without an IOMMU, which could result in the DMA accessing the > > > system memory at unexpected times and/or unexpected addresses, possibly > > > leading to data leakage or corruption. > > > > > > For example, we plan to use the PCI-e bus for Wi-Fi and that PCI-e bus is > > > not behind an IOMMU. As PCI-e, by design, gives the device full access to > > > system memory, a vulnerability in the Wi-Fi firmware could easily escalate > > > to a full system exploit (remote wifi exploits: [1a], [1b] that shows a > > > full chain of exploits; [2], [3]). > > > > > > To mitigate the security concerns, we introduce restricted DMA. Restricted > > > DMA utilizes the existing swiotlb to bounce streaming DMA in and out of a > > > specially allocated region and does memory allocation from the same > > > region. > > > The feature on its own provides a basic level of protection against the > > > DMA > > > overwriting buffer contents at unexpected times. However, to protect > > > against general data leakage and system memory corruption, the system > > > needs > > > to provide a way to restrict the DMA to a predefined memory region (this > > > is > > > usually done at firmware level, e.g. MPU in ATF on some ARM platforms > > > [4]). > > > > > > [1a] > > > https://googleprojectzero.blogspot.com/2017/04/over-air-exploiting-broadcoms-wi-fi_4.html > > > [1b] > > > https://googleprojectzero.blogspot.com/2017/04/over-air-exploiting-broadcoms-wi-fi_11.html > > > [2] https://blade.tencent.com/en/advisories/qualpwn/ > > > [3] > > > https://www.bleepingcomputer.com/news/security/vulnerabilities-found-in-highly-popular-firmware-for-wifi-chips/ > > > [4] > > > https://github.com/ARM-software/arm-trusted-firmware/blob/master/plat/mediatek/mt8183/drivers/emi_mpu/emi_mpu.c#L132 > > > > > > v8: > > > - Fix reserved-memory.txt and add the reg property in example. > > > - Fix sizeof for of_property_count_elems_of_size in > > > drivers/of/address.c#of_dma_set_restricted_buffer. > > > - Apply Will's suggestion to try the OF node having DMA configuration in > > > drivers/of/address.c#of_dma_set_restricted_buffer. > > > - Fix typo in the comment of > > > drivers/of/address.c#of_dma_set_restricted_buffer. > > > - Add error message for PageHighMem in > > > kernel/dma/swiotlb.c#rmem_swiotlb_device_init and move it to > > > rmem_swiotlb_setup. > > > - Fix the message string in rmem_swiotlb_setup. > > > > Thanks for the v8. It works for me out of the box on arm64 under KVM, so: > > > > Tested-by: Will Deacon > > > > Note that something seems to have gone wrong with the mail threading, so > > the last 5 patches ended up as a separate thread for me. Probably worth > > posting again with all the patches in one place, if you can. > > Thanks for testing. > > Christoph also added some comments in v7, so I'll prepare v9. > > > > > Cheers, > > > > Will ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v9 06/14] swiotlb: Update is_swiotlb_active to add a struct device argument
I don't have the HW to verify the change. Hopefully I use the right device struct for is_swiotlb_active. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v9 03/14] swiotlb: Set dev->dma_io_tlb_mem to the swiotlb pool used
I'm not sure if this would break arch/x86/pci/sta2x11-fixup.c swiotlb_late_init_with_default_size is called here https://elixir.bootlin.com/linux/v5.13-rc5/source/arch/x86/pci/sta2x11-fixup.c#L60 On Fri, Jun 11, 2021 at 11:27 PM Claire Chang wrote: > > Always have the pointer to the swiotlb pool used in struct device. This > could help simplify the code for other pools. > > Signed-off-by: Claire Chang > --- > drivers/of/device.c | 3 +++ > include/linux/device.h | 4 > include/linux/swiotlb.h | 8 > kernel/dma/swiotlb.c| 8 > 4 files changed, 19 insertions(+), 4 deletions(-) > > diff --git a/drivers/of/device.c b/drivers/of/device.c > index c5a9473a5fb1..1defdf15ba95 100644 > --- a/drivers/of/device.c > +++ b/drivers/of/device.c > @@ -165,6 +165,9 @@ int of_dma_configure_id(struct device *dev, struct > device_node *np, > > arch_setup_dma_ops(dev, dma_start, size, iommu, coherent); > > + if (IS_ENABLED(CONFIG_SWIOTLB)) > + swiotlb_set_io_tlb_default_mem(dev); > + > return 0; > } > EXPORT_SYMBOL_GPL(of_dma_configure_id); > diff --git a/include/linux/device.h b/include/linux/device.h > index 4443e12238a0..2e9a378c9100 100644 > --- a/include/linux/device.h > +++ b/include/linux/device.h > @@ -432,6 +432,7 @@ struct dev_links_info { > * @dma_pools: Dma pools (if dma'ble device). > * @dma_mem: Internal for coherent mem override. > * @cma_area: Contiguous memory area for dma allocations > + * @dma_io_tlb_mem: Pointer to the swiotlb pool used. Not for driver use. > * @archdata: For arch-specific additions. > * @of_node: Associated device tree node. > * @fwnode:Associated device node supplied by platform firmware. > @@ -540,6 +541,9 @@ struct device { > #ifdef CONFIG_DMA_CMA > struct cma *cma_area; /* contiguous memory area for dma >allocations */ > +#endif > +#ifdef CONFIG_SWIOTLB > + struct io_tlb_mem *dma_io_tlb_mem; > #endif > /* arch specific additions */ > struct dev_archdata archdata; > diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h > index 216854a5e513..008125ccd509 100644 > --- a/include/linux/swiotlb.h > +++ b/include/linux/swiotlb.h > @@ -108,6 +108,11 @@ static inline bool is_swiotlb_buffer(phys_addr_t paddr) > return mem && paddr >= mem->start && paddr < mem->end; > } > > +static inline void swiotlb_set_io_tlb_default_mem(struct device *dev) > +{ > + dev->dma_io_tlb_mem = io_tlb_default_mem; > +} > + > void __init swiotlb_exit(void); > unsigned int swiotlb_max_segment(void); > size_t swiotlb_max_mapping_size(struct device *dev); > @@ -119,6 +124,9 @@ static inline bool is_swiotlb_buffer(phys_addr_t paddr) > { > return false; > } > +static inline void swiotlb_set_io_tlb_default_mem(struct device *dev) > +{ > +} > static inline void swiotlb_exit(void) > { > } > diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c > index 8a3e2b3b246d..29b950ab1351 100644 > --- a/kernel/dma/swiotlb.c > +++ b/kernel/dma/swiotlb.c > @@ -344,7 +344,7 @@ void __init swiotlb_exit(void) > static void swiotlb_bounce(struct device *dev, phys_addr_t tlb_addr, size_t > size, >enum dma_data_direction dir) > { > - struct io_tlb_mem *mem = io_tlb_default_mem; > + struct io_tlb_mem *mem = dev->dma_io_tlb_mem; > int index = (tlb_addr - mem->start) >> IO_TLB_SHIFT; > phys_addr_t orig_addr = mem->slots[index].orig_addr; > size_t alloc_size = mem->slots[index].alloc_size; > @@ -426,7 +426,7 @@ static unsigned int wrap_index(struct io_tlb_mem *mem, > unsigned int index) > static int find_slots(struct device *dev, phys_addr_t orig_addr, > size_t alloc_size) > { > - struct io_tlb_mem *mem = io_tlb_default_mem; > + struct io_tlb_mem *mem = dev->dma_io_tlb_mem; > unsigned long boundary_mask = dma_get_seg_boundary(dev); > dma_addr_t tbl_dma_addr = > phys_to_dma_unencrypted(dev, mem->start) & boundary_mask; > @@ -503,7 +503,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, > phys_addr_t orig_addr, > size_t mapping_size, size_t alloc_size, > enum dma_data_direction dir, unsigned long attrs) > { > - struct io_tlb_mem *mem = io_tlb_default_mem; > + struct io_tlb_mem *mem = dev->dma_io_tlb_mem; > unsigned int offset = swiotlb_align_offset(dev, orig_addr); > unsigned int i; > int index; > @@ -554,7 +554,7 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, > phys_addr_t tlb_addr, > size_t mapping_size, enum dma_data_direction > dir, > unsigned long attrs) > { > - struct io_tlb_mem *mem = io_tlb_default_mem; > + struct io_tlb_mem *mem = hwdev->dma_io_tlb_mem; > unsigned long flags; > unsigned int offset = swiotlb_align_
[PATCH v9 14/14] of: Add plumbing for restricted DMA pool
If a device is not behind an IOMMU, we look up the device node and set up the restricted DMA when the restricted-dma-pool is presented. Signed-off-by: Claire Chang --- drivers/of/address.c| 33 + drivers/of/device.c | 3 +++ drivers/of/of_private.h | 6 ++ 3 files changed, 42 insertions(+) diff --git a/drivers/of/address.c b/drivers/of/address.c index 3b2acca7e363..c8066d95ff0e 100644 --- a/drivers/of/address.c +++ b/drivers/of/address.c @@ -8,6 +8,7 @@ #include #include #include +#include #include #include #include @@ -1001,6 +1002,38 @@ int of_dma_get_range(struct device_node *np, const struct bus_dma_region **map) of_node_put(node); return ret; } + +int of_dma_set_restricted_buffer(struct device *dev, struct device_node *np) +{ + struct device_node *node, *of_node = dev->of_node; + int count, i; + + count = of_property_count_elems_of_size(of_node, "memory-region", + sizeof(u32)); + /* +* If dev->of_node doesn't exist or doesn't contain memory-region, try +* the OF node having DMA configuration. +*/ + if (count <= 0) { + of_node = np; + count = of_property_count_elems_of_size( + of_node, "memory-region", sizeof(u32)); + } + + for (i = 0; i < count; i++) { + node = of_parse_phandle(of_node, "memory-region", i); + /* +* There might be multiple memory regions, but only one +* restricted-dma-pool region is allowed. +*/ + if (of_device_is_compatible(node, "restricted-dma-pool") && + of_device_is_available(node)) + return of_reserved_mem_device_init_by_idx(dev, of_node, + i); + } + + return 0; +} #endif /* CONFIG_HAS_DMA */ /** diff --git a/drivers/of/device.c b/drivers/of/device.c index 1defdf15ba95..ba4656e77502 100644 --- a/drivers/of/device.c +++ b/drivers/of/device.c @@ -168,6 +168,9 @@ int of_dma_configure_id(struct device *dev, struct device_node *np, if (IS_ENABLED(CONFIG_SWIOTLB)) swiotlb_set_io_tlb_default_mem(dev); + if (!iommu) + return of_dma_set_restricted_buffer(dev, np); + return 0; } EXPORT_SYMBOL_GPL(of_dma_configure_id); diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h index 631489f7f8c0..376462798f7e 100644 --- a/drivers/of/of_private.h +++ b/drivers/of/of_private.h @@ -163,12 +163,18 @@ struct bus_dma_region; #if defined(CONFIG_OF_ADDRESS) && defined(CONFIG_HAS_DMA) int of_dma_get_range(struct device_node *np, const struct bus_dma_region **map); +int of_dma_set_restricted_buffer(struct device *dev, struct device_node *np); #else static inline int of_dma_get_range(struct device_node *np, const struct bus_dma_region **map) { return -ENODEV; } +static inline int of_dma_set_restricted_buffer(struct device *dev, + struct device_node *np) +{ + return -ENODEV; +} #endif void fdt_init_reserved_mem(void); -- 2.32.0.272.g935e593368-goog ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v9 13/14] dt-bindings: of: Add restricted DMA pool
Introduce the new compatible string, restricted-dma-pool, for restricted DMA. One can specify the address and length of the restricted DMA memory region by restricted-dma-pool in the reserved-memory node. Signed-off-by: Claire Chang --- .../reserved-memory/reserved-memory.txt | 36 +-- 1 file changed, 33 insertions(+), 3 deletions(-) diff --git a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt index e8d3096d922c..46804f24df05 100644 --- a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt +++ b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt @@ -51,6 +51,23 @@ compatible (optional) - standard definition used as a shared pool of DMA buffers for a set of devices. It can be used by an operating system to instantiate the necessary pool management subsystem if necessary. +- restricted-dma-pool: This indicates a region of memory meant to be + used as a pool of restricted DMA buffers for a set of devices. The + memory region would be the only region accessible to those devices. + When using this, the no-map and reusable properties must not be set, + so the operating system can create a virtual mapping that will be used + for synchronization. The main purpose for restricted DMA is to + mitigate the lack of DMA access control on systems without an IOMMU, + which could result in the DMA accessing the system memory at + unexpected times and/or unexpected addresses, possibly leading to data + leakage or corruption. The feature on its own provides a basic level + of protection against the DMA overwriting buffer contents at + unexpected times. However, to protect against general data leakage and + system memory corruption, the system needs to provide way to lock down + the memory access, e.g., MPU. Note that since coherent allocation + needs remapping, one must set up another device coherent pool by + shared-dma-pool and use dma_alloc_from_dev_coherent instead for atomic + coherent allocation. - vendor specific string in the form ,[-] no-map (optional) - empty property - Indicates the operating system must not create a virtual mapping @@ -85,10 +102,11 @@ memory-region-names (optional) - a list of names, one for each corresponding Example --- -This example defines 3 contiguous regions are defined for Linux kernel: +This example defines 4 contiguous regions for Linux kernel: one default of all device drivers (named linux,cma@7200 and 64MiB in size), -one dedicated to the framebuffer device (named framebuffer@7800, 8MiB), and -one for multimedia processing (named multimedia-memory@7700, 64MiB). +one dedicated to the framebuffer device (named framebuffer@7800, 8MiB), +one for multimedia processing (named multimedia-memory@7700, 64MiB), and +one for restricted dma pool (named restricted_dma_reserved@0x5000, 64MiB). / { #address-cells = <1>; @@ -120,6 +138,11 @@ one for multimedia processing (named multimedia-memory@7700, 64MiB). compatible = "acme,multimedia-memory"; reg = <0x7700 0x400>; }; + + restricted_dma_reserved: restricted_dma_reserved { + compatible = "restricted-dma-pool"; + reg = <0x5000 0x400>; + }; }; /* ... */ @@ -138,4 +161,11 @@ one for multimedia processing (named multimedia-memory@7700, 64MiB). memory-region = <&multimedia_reserved>; /* ... */ }; + + pcie_device: pcie_device@0,0 { + reg = <0x8301 0x0 0x 0x0 0x0010 + 0x8301 0x0 0x0010 0x0 0x0010>; + memory-region = <&restricted_dma_mem_reserved>; + /* ... */ + }; }; -- 2.32.0.272.g935e593368-goog ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v9 12/14] dma-direct: Allocate memory from restricted DMA pool if available
The restricted DMA pool is preferred if available. The restricted DMA pools provide a basic level of protection against the DMA overwriting buffer contents at unexpected times. However, to protect against general data leakage and system memory corruption, the system needs to provide a way to lock down the memory access, e.g., MPU. Note that since coherent allocation needs remapping, one must set up another device coherent pool by shared-dma-pool and use dma_alloc_from_dev_coherent instead for atomic coherent allocation. Signed-off-by: Claire Chang --- kernel/dma/direct.c | 37 - 1 file changed, 28 insertions(+), 9 deletions(-) diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c index eb4098323bbc..73fc4c659ba7 100644 --- a/kernel/dma/direct.c +++ b/kernel/dma/direct.c @@ -78,6 +78,9 @@ static bool dma_coherent_ok(struct device *dev, phys_addr_t phys, size_t size) static void __dma_direct_free_pages(struct device *dev, struct page *page, size_t size) { + if (IS_ENABLED(CONFIG_DMA_RESTRICTED_POOL) && + swiotlb_free(dev, page, size)) + return; dma_free_contiguous(dev, page, size); } @@ -92,7 +95,17 @@ static struct page *__dma_direct_alloc_pages(struct device *dev, size_t size, gfp |= dma_direct_optimal_gfp_mask(dev, dev->coherent_dma_mask, &phys_limit); - page = dma_alloc_contiguous(dev, size, gfp); + if (IS_ENABLED(CONFIG_DMA_RESTRICTED_POOL)) { + page = swiotlb_alloc(dev, size); + if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) { + __dma_direct_free_pages(dev, page, size); + page = NULL; + } + return page; + } + + if (!page) + page = dma_alloc_contiguous(dev, size, gfp); if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) { dma_free_contiguous(dev, page, size); page = NULL; @@ -148,7 +161,7 @@ void *dma_direct_alloc(struct device *dev, size_t size, gfp |= __GFP_NOWARN; if ((attrs & DMA_ATTR_NO_KERNEL_MAPPING) && - !force_dma_unencrypted(dev)) { + !force_dma_unencrypted(dev) && !is_dev_swiotlb_force(dev)) { page = __dma_direct_alloc_pages(dev, size, gfp & ~__GFP_ZERO); if (!page) return NULL; @@ -161,18 +174,23 @@ void *dma_direct_alloc(struct device *dev, size_t size, } if (!IS_ENABLED(CONFIG_ARCH_HAS_DMA_SET_UNCACHED) && - !IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) && - !dev_is_dma_coherent(dev)) + !IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) && !dev_is_dma_coherent(dev) && + !is_dev_swiotlb_force(dev)) return arch_dma_alloc(dev, size, dma_handle, gfp, attrs); /* * Remapping or decrypting memory may block. If either is required and * we can't block, allocate the memory from the atomic pools. +* If restricted DMA (i.e., is_dev_swiotlb_force) is required, one must +* set up another device coherent pool by shared-dma-pool and use +* dma_alloc_from_dev_coherent instead. */ if (IS_ENABLED(CONFIG_DMA_COHERENT_POOL) && !gfpflags_allow_blocking(gfp) && (force_dma_unencrypted(dev) || -(IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) && !dev_is_dma_coherent(dev +(IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) && + !dev_is_dma_coherent(dev))) && + !is_dev_swiotlb_force(dev)) return dma_direct_alloc_from_pool(dev, size, dma_handle, gfp); /* we always manually zero the memory once we are done */ @@ -253,15 +271,15 @@ void dma_direct_free(struct device *dev, size_t size, unsigned int page_order = get_order(size); if ((attrs & DMA_ATTR_NO_KERNEL_MAPPING) && - !force_dma_unencrypted(dev)) { + !force_dma_unencrypted(dev) && !is_dev_swiotlb_force(dev)) { /* cpu_addr is a struct page cookie, not a kernel address */ dma_free_contiguous(dev, cpu_addr, size); return; } if (!IS_ENABLED(CONFIG_ARCH_HAS_DMA_SET_UNCACHED) && - !IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) && - !dev_is_dma_coherent(dev)) { + !IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) && !dev_is_dma_coherent(dev) && + !is_dev_swiotlb_force(dev)) { arch_dma_free(dev, size, cpu_addr, dma_addr, attrs); return; } @@ -289,7 +307,8 @@ struct page *dma_direct_alloc_pages(struct device *dev, size_t size, void *ret; if (IS_ENABLED(CONFIG_DMA_COHERENT_POOL) && - force_dma_unencrypted(dev) && !gfpflags_allow_blocking(gfp)) + force_dma_unencrypted(dev) && !gfpflags_allow_blocking(gfp) && +
[PATCH v9 11/14] swiotlb: Add restricted DMA alloc/free support.
Add the functions, swiotlb_{alloc,free} to support the memory allocation from restricted DMA pool. Signed-off-by: Claire Chang --- include/linux/swiotlb.h | 15 +++ kernel/dma/swiotlb.c| 35 +-- 2 files changed, 48 insertions(+), 2 deletions(-) diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index 8200c100fe10..d3374497a4f8 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -162,4 +162,19 @@ static inline void swiotlb_adjust_size(unsigned long size) extern void swiotlb_print_info(void); extern void swiotlb_set_max_segment(unsigned int); +#ifdef CONFIG_DMA_RESTRICTED_POOL +struct page *swiotlb_alloc(struct device *dev, size_t size); +bool swiotlb_free(struct device *dev, struct page *page, size_t size); +#else +static inline struct page *swiotlb_alloc(struct device *dev, size_t size) +{ + return NULL; +} +static inline bool swiotlb_free(struct device *dev, struct page *page, + size_t size) +{ + return false; +} +#endif /* CONFIG_DMA_RESTRICTED_POOL */ + #endif /* __LINUX_SWIOTLB_H */ diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index a6562573f090..0a19858da5b8 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -461,8 +461,9 @@ static int find_slots(struct device *dev, phys_addr_t orig_addr, index = wrap = wrap_index(mem, ALIGN(mem->index, stride)); do { - if ((slot_addr(tbl_dma_addr, index) & iotlb_align_mask) != - (orig_addr & iotlb_align_mask)) { + if (orig_addr && + (slot_addr(tbl_dma_addr, index) & iotlb_align_mask) != + (orig_addr & iotlb_align_mask)) { index = wrap_index(mem, index + 1); continue; } @@ -702,6 +703,36 @@ late_initcall(swiotlb_create_default_debugfs); #endif #ifdef CONFIG_DMA_RESTRICTED_POOL +struct page *swiotlb_alloc(struct device *dev, size_t size) +{ + struct io_tlb_mem *mem = dev->dma_io_tlb_mem; + phys_addr_t tlb_addr; + int index; + + if (!mem) + return NULL; + + index = find_slots(dev, 0, size); + if (index == -1) + return NULL; + + tlb_addr = slot_addr(mem->start, index); + + return pfn_to_page(PFN_DOWN(tlb_addr)); +} + +bool swiotlb_free(struct device *dev, struct page *page, size_t size) +{ + phys_addr_t tlb_addr = page_to_phys(page); + + if (!is_swiotlb_buffer(dev, tlb_addr)) + return false; + + release_slots(dev, tlb_addr); + + return true; +} + static int rmem_swiotlb_device_init(struct reserved_mem *rmem, struct device *dev) { -- 2.32.0.272.g935e593368-goog ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v9 10/14] dma-direct: Add a new wrapper __dma_direct_free_pages()
Add a new wrapper __dma_direct_free_pages() that will be useful later for swiotlb_free(). Signed-off-by: Claire Chang --- kernel/dma/direct.c | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c index 078f7087e466..eb4098323bbc 100644 --- a/kernel/dma/direct.c +++ b/kernel/dma/direct.c @@ -75,6 +75,12 @@ static bool dma_coherent_ok(struct device *dev, phys_addr_t phys, size_t size) min_not_zero(dev->coherent_dma_mask, dev->bus_dma_limit); } +static void __dma_direct_free_pages(struct device *dev, struct page *page, + size_t size) +{ + dma_free_contiguous(dev, page, size); +} + static struct page *__dma_direct_alloc_pages(struct device *dev, size_t size, gfp_t gfp) { @@ -237,7 +243,7 @@ void *dma_direct_alloc(struct device *dev, size_t size, return NULL; } out_free_pages: - dma_free_contiguous(dev, page, size); + __dma_direct_free_pages(dev, page, size); return NULL; } @@ -273,7 +279,7 @@ void dma_direct_free(struct device *dev, size_t size, else if (IS_ENABLED(CONFIG_ARCH_HAS_DMA_CLEAR_UNCACHED)) arch_dma_clear_uncached(cpu_addr, size); - dma_free_contiguous(dev, dma_direct_to_page(dev, dma_addr), size); + __dma_direct_free_pages(dev, dma_direct_to_page(dev, dma_addr), size); } struct page *dma_direct_alloc_pages(struct device *dev, size_t size, @@ -310,7 +316,7 @@ struct page *dma_direct_alloc_pages(struct device *dev, size_t size, *dma_handle = phys_to_dma_direct(dev, page_to_phys(page)); return page; out_free_pages: - dma_free_contiguous(dev, page, size); + __dma_direct_free_pages(dev, page, size); return NULL; } @@ -329,7 +335,7 @@ void dma_direct_free_pages(struct device *dev, size_t size, if (force_dma_unencrypted(dev)) set_memory_encrypted((unsigned long)vaddr, 1 << page_order); - dma_free_contiguous(dev, page, size); + __dma_direct_free_pages(dev, page, size); } #if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) || \ -- 2.32.0.272.g935e593368-goog ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v9 09/14] swiotlb: Refactor swiotlb_tbl_unmap_single
Add a new function, release_slots, to make the code reusable for supporting different bounce buffer pools, e.g. restricted DMA pool. Signed-off-by: Claire Chang --- kernel/dma/swiotlb.c | 35 --- 1 file changed, 20 insertions(+), 15 deletions(-) diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 364c6c822063..a6562573f090 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -554,27 +554,15 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr, return tlb_addr; } -/* - * tlb_addr is the physical address of the bounce buffer to unmap. - */ -void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr, - size_t mapping_size, enum dma_data_direction dir, - unsigned long attrs) +static void release_slots(struct device *dev, phys_addr_t tlb_addr) { - struct io_tlb_mem *mem = hwdev->dma_io_tlb_mem; + struct io_tlb_mem *mem = dev->dma_io_tlb_mem; unsigned long flags; - unsigned int offset = swiotlb_align_offset(hwdev, tlb_addr); + unsigned int offset = swiotlb_align_offset(dev, tlb_addr); int index = (tlb_addr - offset - mem->start) >> IO_TLB_SHIFT; int nslots = nr_slots(mem->slots[index].alloc_size + offset); int count, i; - /* -* First, sync the memory before unmapping the entry -*/ - if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) && - (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)) - swiotlb_bounce(hwdev, tlb_addr, mapping_size, DMA_FROM_DEVICE); - /* * Return the buffer to the free list by setting the corresponding * entries to indicate the number of contiguous entries available. @@ -609,6 +597,23 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr, spin_unlock_irqrestore(&mem->lock, flags); } +/* + * tlb_addr is the physical address of the bounce buffer to unmap. + */ +void swiotlb_tbl_unmap_single(struct device *dev, phys_addr_t tlb_addr, + size_t mapping_size, enum dma_data_direction dir, + unsigned long attrs) +{ + /* +* First, sync the memory before unmapping the entry +*/ + if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) && + (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)) + swiotlb_bounce(dev, tlb_addr, mapping_size, DMA_FROM_DEVICE); + + release_slots(dev, tlb_addr); +} + void swiotlb_sync_single_for_device(struct device *dev, phys_addr_t tlb_addr, size_t size, enum dma_data_direction dir) { -- 2.32.0.272.g935e593368-goog ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v9 08/14] swiotlb: Move alloc_size to find_slots
Move the maintenance of alloc_size to find_slots for better code reusability later. Signed-off-by: Claire Chang --- kernel/dma/swiotlb.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index e5ccc198d0a7..364c6c822063 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -486,8 +486,11 @@ static int find_slots(struct device *dev, phys_addr_t orig_addr, return -1; found: - for (i = index; i < index + nslots; i++) + for (i = index; i < index + nslots; i++) { mem->slots[i].list = 0; + mem->slots[i].alloc_size = + alloc_size - ((i - index) << IO_TLB_SHIFT); + } for (i = index - 1; io_tlb_offset(i) != IO_TLB_SEGSIZE - 1 && mem->slots[i].list; i--) @@ -542,11 +545,8 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr, * This is needed when we sync the memory. Then we sync the buffer if * needed. */ - for (i = 0; i < nr_slots(alloc_size + offset); i++) { + for (i = 0; i < nr_slots(alloc_size + offset); i++) mem->slots[index + i].orig_addr = slot_addr(orig_addr, i); - mem->slots[index + i].alloc_size = - alloc_size - (i << IO_TLB_SHIFT); - } tlb_addr = slot_addr(mem->start, index) + offset; if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) && (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)) -- 2.32.0.272.g935e593368-goog ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v9 07/14] swiotlb: Bounce data from/to restricted DMA pool if available
Regardless of swiotlb setting, the restricted DMA pool is preferred if available. The restricted DMA pools provide a basic level of protection against the DMA overwriting buffer contents at unexpected times. However, to protect against general data leakage and system memory corruption, the system needs to provide a way to lock down the memory access, e.g., MPU. Note that is_dev_swiotlb_force doesn't check if swiotlb_force == SWIOTLB_FORCE. Otherwise the memory allocation behavior with default swiotlb will be changed by the following patche ("dma-direct: Allocate memory from restricted DMA pool if available"). Signed-off-by: Claire Chang --- include/linux/swiotlb.h | 10 +- kernel/dma/direct.c | 3 ++- kernel/dma/direct.h | 3 ++- kernel/dma/swiotlb.c| 1 + 4 files changed, 14 insertions(+), 3 deletions(-) diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index 06cf17a80f5c..8200c100fe10 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -85,6 +85,7 @@ extern enum swiotlb_force swiotlb_force; * unmap calls. * @debugfs: The dentry to debugfs. * @late_alloc:%true if allocated using the page allocator + * @force_swiotlb: %true if swiotlb is forced */ struct io_tlb_mem { phys_addr_t start; @@ -95,6 +96,7 @@ struct io_tlb_mem { spinlock_t lock; struct dentry *debugfs; bool late_alloc; + bool force_swiotlb; struct io_tlb_slot { phys_addr_t orig_addr; size_t alloc_size; @@ -115,6 +117,11 @@ static inline void swiotlb_set_io_tlb_default_mem(struct device *dev) dev->dma_io_tlb_mem = io_tlb_default_mem; } +static inline bool is_dev_swiotlb_force(struct device *dev) +{ + return dev->dma_io_tlb_mem->force_swiotlb; +} + void __init swiotlb_exit(void); unsigned int swiotlb_max_segment(void); size_t swiotlb_max_mapping_size(struct device *dev); @@ -126,8 +133,9 @@ static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr) { return false; } -static inline void swiotlb_set_io_tlb_default_mem(struct device *dev) +static inline bool is_dev_swiotlb_force(struct device *dev) { + return false; } static inline void swiotlb_exit(void) { diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c index 7a88c34d0867..078f7087e466 100644 --- a/kernel/dma/direct.c +++ b/kernel/dma/direct.c @@ -496,7 +496,8 @@ size_t dma_direct_max_mapping_size(struct device *dev) { /* If SWIOTLB is active, use its maximum mapping size */ if (is_swiotlb_active(dev) && - (dma_addressing_limited(dev) || swiotlb_force == SWIOTLB_FORCE)) + (dma_addressing_limited(dev) || swiotlb_force == SWIOTLB_FORCE || +is_dev_swiotlb_force(dev))) return swiotlb_max_mapping_size(dev); return SIZE_MAX; } diff --git a/kernel/dma/direct.h b/kernel/dma/direct.h index 13e9e7158d94..f94813674e23 100644 --- a/kernel/dma/direct.h +++ b/kernel/dma/direct.h @@ -87,7 +87,8 @@ static inline dma_addr_t dma_direct_map_page(struct device *dev, phys_addr_t phys = page_to_phys(page) + offset; dma_addr_t dma_addr = phys_to_dma(dev, phys); - if (unlikely(swiotlb_force == SWIOTLB_FORCE)) + if (unlikely(swiotlb_force == SWIOTLB_FORCE) || + is_dev_swiotlb_force(dev)) return swiotlb_map(dev, phys, size, dir, attrs); if (unlikely(!dma_capable(dev, dma_addr, size, true))) { diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 21e99907edd6..e5ccc198d0a7 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -714,6 +714,7 @@ static int rmem_swiotlb_device_init(struct reserved_mem *rmem, return -ENOMEM; swiotlb_init_io_tlb_mem(mem, rmem->base, nslabs, false, true); + mem->force_swiotlb = true; rmem->priv = mem; -- 2.32.0.272.g935e593368-goog ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v9 06/14] swiotlb: Update is_swiotlb_active to add a struct device argument
Update is_swiotlb_active to add a struct device argument. This will be useful later to allow for restricted DMA pool. Signed-off-by: Claire Chang --- drivers/gpu/drm/i915/gem/i915_gem_internal.c | 2 +- drivers/gpu/drm/nouveau/nouveau_ttm.c| 2 +- drivers/pci/xen-pcifront.c | 2 +- include/linux/swiotlb.h | 4 ++-- kernel/dma/direct.c | 2 +- kernel/dma/swiotlb.c | 4 ++-- 6 files changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_internal.c b/drivers/gpu/drm/i915/gem/i915_gem_internal.c index ce6b664b10aa..89a894354263 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_internal.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_internal.c @@ -42,7 +42,7 @@ static int i915_gem_object_get_pages_internal(struct drm_i915_gem_object *obj) max_order = MAX_ORDER; #ifdef CONFIG_SWIOTLB - if (is_swiotlb_active()) { + if (is_swiotlb_active(obj->base.dev->dev)) { unsigned int max_segment; max_segment = swiotlb_max_segment(); diff --git a/drivers/gpu/drm/nouveau/nouveau_ttm.c b/drivers/gpu/drm/nouveau/nouveau_ttm.c index f4c2e46b6fe1..2ca9d9a9e5d5 100644 --- a/drivers/gpu/drm/nouveau/nouveau_ttm.c +++ b/drivers/gpu/drm/nouveau/nouveau_ttm.c @@ -276,7 +276,7 @@ nouveau_ttm_init(struct nouveau_drm *drm) } #if IS_ENABLED(CONFIG_SWIOTLB) && IS_ENABLED(CONFIG_X86) - need_swiotlb = is_swiotlb_active(); + need_swiotlb = is_swiotlb_active(dev->dev); #endif ret = ttm_device_init(&drm->ttm.bdev, &nouveau_bo_driver, drm->dev->dev, diff --git a/drivers/pci/xen-pcifront.c b/drivers/pci/xen-pcifront.c index b7a8f3a1921f..0d56985bfe81 100644 --- a/drivers/pci/xen-pcifront.c +++ b/drivers/pci/xen-pcifront.c @@ -693,7 +693,7 @@ static int pcifront_connect_and_init_dma(struct pcifront_device *pdev) spin_unlock(&pcifront_dev_lock); - if (!err && !is_swiotlb_active()) { + if (!err && !is_swiotlb_active(&pdev->xdev->dev)) { err = pci_xen_swiotlb_init_late(); if (err) dev_err(&pdev->xdev->dev, "Could not setup SWIOTLB!\n"); diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index 921b469c6ad2..06cf17a80f5c 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -118,7 +118,7 @@ static inline void swiotlb_set_io_tlb_default_mem(struct device *dev) void __init swiotlb_exit(void); unsigned int swiotlb_max_segment(void); size_t swiotlb_max_mapping_size(struct device *dev); -bool is_swiotlb_active(void); +bool is_swiotlb_active(struct device *dev); void __init swiotlb_adjust_size(unsigned long size); #else #define swiotlb_force SWIOTLB_NO_FORCE @@ -141,7 +141,7 @@ static inline size_t swiotlb_max_mapping_size(struct device *dev) return SIZE_MAX; } -static inline bool is_swiotlb_active(void) +static inline bool is_swiotlb_active(struct device *dev) { return false; } diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c index 84c9feb5474a..7a88c34d0867 100644 --- a/kernel/dma/direct.c +++ b/kernel/dma/direct.c @@ -495,7 +495,7 @@ int dma_direct_supported(struct device *dev, u64 mask) size_t dma_direct_max_mapping_size(struct device *dev) { /* If SWIOTLB is active, use its maximum mapping size */ - if (is_swiotlb_active() && + if (is_swiotlb_active(dev) && (dma_addressing_limited(dev) || swiotlb_force == SWIOTLB_FORCE)) return swiotlb_max_mapping_size(dev); return SIZE_MAX; diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index c4a071d6a63f..21e99907edd6 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -666,9 +666,9 @@ size_t swiotlb_max_mapping_size(struct device *dev) return ((size_t)IO_TLB_SIZE) * IO_TLB_SEGSIZE; } -bool is_swiotlb_active(void) +bool is_swiotlb_active(struct device *dev) { - return io_tlb_default_mem != NULL; + return dev->dma_io_tlb_mem != NULL; } EXPORT_SYMBOL_GPL(is_swiotlb_active); -- 2.32.0.272.g935e593368-goog ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v9 05/14] swiotlb: Update is_swiotlb_buffer to add a struct device argument
Update is_swiotlb_buffer to add a struct device argument. This will be useful later to allow for restricted DMA pool. Signed-off-by: Claire Chang --- drivers/iommu/dma-iommu.c | 12 ++-- drivers/xen/swiotlb-xen.c | 2 +- include/linux/swiotlb.h | 7 --- kernel/dma/direct.c | 6 +++--- kernel/dma/direct.h | 6 +++--- 5 files changed, 17 insertions(+), 16 deletions(-) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 5d96fcc45fec..1a6a08908245 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -506,7 +506,7 @@ static void __iommu_dma_unmap_swiotlb(struct device *dev, dma_addr_t dma_addr, __iommu_dma_unmap(dev, dma_addr, size); - if (unlikely(is_swiotlb_buffer(phys))) + if (unlikely(is_swiotlb_buffer(dev, phys))) swiotlb_tbl_unmap_single(dev, phys, size, dir, attrs); } @@ -577,7 +577,7 @@ static dma_addr_t __iommu_dma_map_swiotlb(struct device *dev, phys_addr_t phys, } iova = __iommu_dma_map(dev, phys, aligned_size, prot, dma_mask); - if (iova == DMA_MAPPING_ERROR && is_swiotlb_buffer(phys)) + if (iova == DMA_MAPPING_ERROR && is_swiotlb_buffer(dev, phys)) swiotlb_tbl_unmap_single(dev, phys, org_size, dir, attrs); return iova; } @@ -783,7 +783,7 @@ static void iommu_dma_sync_single_for_cpu(struct device *dev, if (!dev_is_dma_coherent(dev)) arch_sync_dma_for_cpu(phys, size, dir); - if (is_swiotlb_buffer(phys)) + if (is_swiotlb_buffer(dev, phys)) swiotlb_sync_single_for_cpu(dev, phys, size, dir); } @@ -796,7 +796,7 @@ static void iommu_dma_sync_single_for_device(struct device *dev, return; phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle); - if (is_swiotlb_buffer(phys)) + if (is_swiotlb_buffer(dev, phys)) swiotlb_sync_single_for_device(dev, phys, size, dir); if (!dev_is_dma_coherent(dev)) @@ -817,7 +817,7 @@ static void iommu_dma_sync_sg_for_cpu(struct device *dev, if (!dev_is_dma_coherent(dev)) arch_sync_dma_for_cpu(sg_phys(sg), sg->length, dir); - if (is_swiotlb_buffer(sg_phys(sg))) + if (is_swiotlb_buffer(dev, sg_phys(sg))) swiotlb_sync_single_for_cpu(dev, sg_phys(sg), sg->length, dir); } @@ -834,7 +834,7 @@ static void iommu_dma_sync_sg_for_device(struct device *dev, return; for_each_sg(sgl, sg, nelems, i) { - if (is_swiotlb_buffer(sg_phys(sg))) + if (is_swiotlb_buffer(dev, sg_phys(sg))) swiotlb_sync_single_for_device(dev, sg_phys(sg), sg->length, dir); diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c index 24d11861ac7d..0c4fb34f11ab 100644 --- a/drivers/xen/swiotlb-xen.c +++ b/drivers/xen/swiotlb-xen.c @@ -100,7 +100,7 @@ static int is_xen_swiotlb_buffer(struct device *dev, dma_addr_t dma_addr) * in our domain. Therefore _only_ check address within our domain. */ if (pfn_valid(PFN_DOWN(paddr))) - return is_swiotlb_buffer(paddr); + return is_swiotlb_buffer(dev, paddr); return 0; } diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index ec0c01796c8a..921b469c6ad2 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -2,6 +2,7 @@ #ifndef __LINUX_SWIOTLB_H #define __LINUX_SWIOTLB_H +#include #include #include #include @@ -102,9 +103,9 @@ struct io_tlb_mem { }; extern struct io_tlb_mem *io_tlb_default_mem; -static inline bool is_swiotlb_buffer(phys_addr_t paddr) +static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr) { - struct io_tlb_mem *mem = io_tlb_default_mem; + struct io_tlb_mem *mem = dev->dma_io_tlb_mem; return mem && paddr >= mem->start && paddr < mem->end; } @@ -121,7 +122,7 @@ bool is_swiotlb_active(void); void __init swiotlb_adjust_size(unsigned long size); #else #define swiotlb_force SWIOTLB_NO_FORCE -static inline bool is_swiotlb_buffer(phys_addr_t paddr) +static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr) { return false; } diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c index f737e3347059..84c9feb5474a 100644 --- a/kernel/dma/direct.c +++ b/kernel/dma/direct.c @@ -343,7 +343,7 @@ void dma_direct_sync_sg_for_device(struct device *dev, for_each_sg(sgl, sg, nents, i) { phys_addr_t paddr = dma_to_phys(dev, sg_dma_address(sg)); - if (unlikely(is_swiotlb_buffer(paddr))) + if (unlikely(is_swiotlb_buffer(dev, paddr))) swiotlb_sync_single_for_device(dev, paddr, sg->length,
[PATCH v9 04/14] swiotlb: Add restricted DMA pool initialization
Add the initialization function to create restricted DMA pools from matching reserved-memory nodes. Signed-off-by: Claire Chang --- include/linux/swiotlb.h | 3 +- kernel/dma/Kconfig | 14 kernel/dma/swiotlb.c| 75 + 3 files changed, 91 insertions(+), 1 deletion(-) diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index 008125ccd509..ec0c01796c8a 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -72,7 +72,8 @@ extern enum swiotlb_force swiotlb_force; * range check to see if the memory was in fact allocated by this * API. * @nslabs:The number of IO TLB blocks (in groups of 64) between @start and - * @end. This is command line adjustable via setup_io_tlb_npages. + * @end. For default swiotlb, this is command line adjustable via + * setup_io_tlb_npages. * @used: The number of used IO TLB block. * @list: The free list describing the number of free entries available * from each index. diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig index 77b405508743..3e961dc39634 100644 --- a/kernel/dma/Kconfig +++ b/kernel/dma/Kconfig @@ -80,6 +80,20 @@ config SWIOTLB bool select NEED_DMA_MAP_STATE +config DMA_RESTRICTED_POOL + bool "DMA Restricted Pool" + depends on OF && OF_RESERVED_MEM + select SWIOTLB + help + This enables support for restricted DMA pools which provide a level of + DMA memory protection on systems with limited hardware protection + capabilities, such as those lacking an IOMMU. + + For more information see + + and . + If unsure, say "n". + # # Should be selected if we can mmap non-coherent mappings to userspace. # The only thing that is really required is a way to set an uncached bit diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 29b950ab1351..c4a071d6a63f 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -39,6 +39,13 @@ #ifdef CONFIG_DEBUG_FS #include #endif +#ifdef CONFIG_DMA_RESTRICTED_POOL +#include +#include +#include +#include +#include +#endif #include #include @@ -688,3 +695,71 @@ static int __init swiotlb_create_default_debugfs(void) late_initcall(swiotlb_create_default_debugfs); #endif + +#ifdef CONFIG_DMA_RESTRICTED_POOL +static int rmem_swiotlb_device_init(struct reserved_mem *rmem, + struct device *dev) +{ + struct io_tlb_mem *mem = rmem->priv; + unsigned long nslabs = rmem->size >> IO_TLB_SHIFT; + + /* +* Since multiple devices can share the same pool, the private data, +* io_tlb_mem struct, will be initialized by the first device attached +* to it. +*/ + if (!mem) { + mem = kzalloc(struct_size(mem, slots, nslabs), GFP_KERNEL); + if (!mem) + return -ENOMEM; + + swiotlb_init_io_tlb_mem(mem, rmem->base, nslabs, false, true); + + rmem->priv = mem; + + if (IS_ENABLED(CONFIG_DEBUG_FS)) { + mem->debugfs = + debugfs_create_dir(rmem->name, debugfs_dir); + swiotlb_create_debugfs_files(mem); + } + } + + dev->dma_io_tlb_mem = mem; + + return 0; +} + +static void rmem_swiotlb_device_release(struct reserved_mem *rmem, + struct device *dev) +{ + dev->dma_io_tlb_mem = io_tlb_default_mem; +} + +static const struct reserved_mem_ops rmem_swiotlb_ops = { + .device_init = rmem_swiotlb_device_init, + .device_release = rmem_swiotlb_device_release, +}; + +static int __init rmem_swiotlb_setup(struct reserved_mem *rmem) +{ + unsigned long node = rmem->fdt_node; + + if (of_get_flat_dt_prop(node, "reusable", NULL) || + of_get_flat_dt_prop(node, "linux,cma-default", NULL) || + of_get_flat_dt_prop(node, "linux,dma-default", NULL) || + of_get_flat_dt_prop(node, "no-map", NULL)) + return -EINVAL; + + if (PageHighMem(pfn_to_page(PHYS_PFN(rmem->base { + pr_err("Restricted DMA pool must be accessible within the linear mapping."); + return -EINVAL; + } + + rmem->ops = &rmem_swiotlb_ops; + pr_info("Reserved memory: created restricted DMA pool at %pa, size %ld MiB\n", + &rmem->base, (unsigned long)rmem->size / SZ_1M); + return 0; +} + +RESERVEDMEM_OF_DECLARE(dma, "restricted-dma-pool", rmem_swiotlb_setup); +#endif /* CONFIG_DMA_RESTRICTED_POOL */ -- 2.32.0.272.g935e593368-goog ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v9 03/14] swiotlb: Set dev->dma_io_tlb_mem to the swiotlb pool used
Always have the pointer to the swiotlb pool used in struct device. This could help simplify the code for other pools. Signed-off-by: Claire Chang --- drivers/of/device.c | 3 +++ include/linux/device.h | 4 include/linux/swiotlb.h | 8 kernel/dma/swiotlb.c| 8 4 files changed, 19 insertions(+), 4 deletions(-) diff --git a/drivers/of/device.c b/drivers/of/device.c index c5a9473a5fb1..1defdf15ba95 100644 --- a/drivers/of/device.c +++ b/drivers/of/device.c @@ -165,6 +165,9 @@ int of_dma_configure_id(struct device *dev, struct device_node *np, arch_setup_dma_ops(dev, dma_start, size, iommu, coherent); + if (IS_ENABLED(CONFIG_SWIOTLB)) + swiotlb_set_io_tlb_default_mem(dev); + return 0; } EXPORT_SYMBOL_GPL(of_dma_configure_id); diff --git a/include/linux/device.h b/include/linux/device.h index 4443e12238a0..2e9a378c9100 100644 --- a/include/linux/device.h +++ b/include/linux/device.h @@ -432,6 +432,7 @@ struct dev_links_info { * @dma_pools: Dma pools (if dma'ble device). * @dma_mem: Internal for coherent mem override. * @cma_area: Contiguous memory area for dma allocations + * @dma_io_tlb_mem: Pointer to the swiotlb pool used. Not for driver use. * @archdata: For arch-specific additions. * @of_node: Associated device tree node. * @fwnode:Associated device node supplied by platform firmware. @@ -540,6 +541,9 @@ struct device { #ifdef CONFIG_DMA_CMA struct cma *cma_area; /* contiguous memory area for dma allocations */ +#endif +#ifdef CONFIG_SWIOTLB + struct io_tlb_mem *dma_io_tlb_mem; #endif /* arch specific additions */ struct dev_archdata archdata; diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index 216854a5e513..008125ccd509 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -108,6 +108,11 @@ static inline bool is_swiotlb_buffer(phys_addr_t paddr) return mem && paddr >= mem->start && paddr < mem->end; } +static inline void swiotlb_set_io_tlb_default_mem(struct device *dev) +{ + dev->dma_io_tlb_mem = io_tlb_default_mem; +} + void __init swiotlb_exit(void); unsigned int swiotlb_max_segment(void); size_t swiotlb_max_mapping_size(struct device *dev); @@ -119,6 +124,9 @@ static inline bool is_swiotlb_buffer(phys_addr_t paddr) { return false; } +static inline void swiotlb_set_io_tlb_default_mem(struct device *dev) +{ +} static inline void swiotlb_exit(void) { } diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 8a3e2b3b246d..29b950ab1351 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -344,7 +344,7 @@ void __init swiotlb_exit(void) static void swiotlb_bounce(struct device *dev, phys_addr_t tlb_addr, size_t size, enum dma_data_direction dir) { - struct io_tlb_mem *mem = io_tlb_default_mem; + struct io_tlb_mem *mem = dev->dma_io_tlb_mem; int index = (tlb_addr - mem->start) >> IO_TLB_SHIFT; phys_addr_t orig_addr = mem->slots[index].orig_addr; size_t alloc_size = mem->slots[index].alloc_size; @@ -426,7 +426,7 @@ static unsigned int wrap_index(struct io_tlb_mem *mem, unsigned int index) static int find_slots(struct device *dev, phys_addr_t orig_addr, size_t alloc_size) { - struct io_tlb_mem *mem = io_tlb_default_mem; + struct io_tlb_mem *mem = dev->dma_io_tlb_mem; unsigned long boundary_mask = dma_get_seg_boundary(dev); dma_addr_t tbl_dma_addr = phys_to_dma_unencrypted(dev, mem->start) & boundary_mask; @@ -503,7 +503,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr, size_t mapping_size, size_t alloc_size, enum dma_data_direction dir, unsigned long attrs) { - struct io_tlb_mem *mem = io_tlb_default_mem; + struct io_tlb_mem *mem = dev->dma_io_tlb_mem; unsigned int offset = swiotlb_align_offset(dev, orig_addr); unsigned int i; int index; @@ -554,7 +554,7 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr, size_t mapping_size, enum dma_data_direction dir, unsigned long attrs) { - struct io_tlb_mem *mem = io_tlb_default_mem; + struct io_tlb_mem *mem = hwdev->dma_io_tlb_mem; unsigned long flags; unsigned int offset = swiotlb_align_offset(hwdev, tlb_addr); int index = (tlb_addr - offset - mem->start) >> IO_TLB_SHIFT; -- 2.32.0.272.g935e593368-goog ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v9 02/14] swiotlb: Refactor swiotlb_create_debugfs
Split the debugfs creation to make the code reusable for supporting different bounce buffer pools, e.g. restricted DMA pool. Signed-off-by: Claire Chang --- kernel/dma/swiotlb.c | 23 --- 1 file changed, 16 insertions(+), 7 deletions(-) diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 1a1208c81e85..8a3e2b3b246d 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -64,6 +64,9 @@ enum swiotlb_force swiotlb_force; struct io_tlb_mem *io_tlb_default_mem; +#ifdef CONFIG_DEBUG_FS +static struct dentry *debugfs_dir; +#endif /* * Max segment that we can provide which (if pages are contingous) will @@ -664,18 +667,24 @@ EXPORT_SYMBOL_GPL(is_swiotlb_active); #ifdef CONFIG_DEBUG_FS -static int __init swiotlb_create_debugfs(void) +static void swiotlb_create_debugfs_files(struct io_tlb_mem *mem) { - struct io_tlb_mem *mem = io_tlb_default_mem; - - if (!mem) - return 0; - mem->debugfs = debugfs_create_dir("swiotlb", NULL); debugfs_create_ulong("io_tlb_nslabs", 0400, mem->debugfs, &mem->nslabs); debugfs_create_ulong("io_tlb_used", 0400, mem->debugfs, &mem->used); +} + +static int __init swiotlb_create_default_debugfs(void) +{ + struct io_tlb_mem *mem = io_tlb_default_mem; + + debugfs_dir = debugfs_create_dir("swiotlb", NULL); + if (mem) { + mem->debugfs = debugfs_dir; + swiotlb_create_debugfs_files(mem); + } return 0; } -late_initcall(swiotlb_create_debugfs); +late_initcall(swiotlb_create_default_debugfs); #endif -- 2.32.0.272.g935e593368-goog ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v9 01/14] swiotlb: Refactor swiotlb init functions
Add a new function, swiotlb_init_io_tlb_mem, for the io_tlb_mem struct initialization to make the code reusable. Signed-off-by: Claire Chang --- kernel/dma/swiotlb.c | 53 ++-- 1 file changed, 27 insertions(+), 26 deletions(-) diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 8ca7d505d61c..1a1208c81e85 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -168,9 +168,32 @@ void __init swiotlb_update_mem_attributes(void) memset(vaddr, 0, bytes); } -int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose) +static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start, + unsigned long nslabs, bool late_alloc, + bool memory_decrypted) { + void *vaddr = phys_to_virt(start); unsigned long bytes = nslabs << IO_TLB_SHIFT, i; + + mem->nslabs = nslabs; + mem->start = start; + mem->end = mem->start + bytes; + mem->index = 0; + mem->late_alloc = late_alloc; + spin_lock_init(&mem->lock); + for (i = 0; i < mem->nslabs; i++) { + mem->slots[i].list = IO_TLB_SEGSIZE - io_tlb_offset(i); + mem->slots[i].orig_addr = INVALID_PHYS_ADDR; + mem->slots[i].alloc_size = 0; + } + + if (memory_decrypted) + set_memory_decrypted((unsigned long)vaddr, bytes >> PAGE_SHIFT); + memset(vaddr, 0, bytes); +} + +int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose) +{ struct io_tlb_mem *mem; size_t alloc_size; @@ -186,16 +209,8 @@ int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose) if (!mem) panic("%s: Failed to allocate %zu bytes align=0x%lx\n", __func__, alloc_size, PAGE_SIZE); - mem->nslabs = nslabs; - mem->start = __pa(tlb); - mem->end = mem->start + bytes; - mem->index = 0; - spin_lock_init(&mem->lock); - for (i = 0; i < mem->nslabs; i++) { - mem->slots[i].list = IO_TLB_SEGSIZE - io_tlb_offset(i); - mem->slots[i].orig_addr = INVALID_PHYS_ADDR; - mem->slots[i].alloc_size = 0; - } + + swiotlb_init_io_tlb_mem(mem, __pa(tlb), nslabs, false, false); io_tlb_default_mem = mem; if (verbose) @@ -282,7 +297,6 @@ swiotlb_late_init_with_default_size(size_t default_size) int swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs) { - unsigned long bytes = nslabs << IO_TLB_SHIFT, i; struct io_tlb_mem *mem; if (swiotlb_force == SWIOTLB_NO_FORCE) @@ -297,20 +311,7 @@ swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs) if (!mem) return -ENOMEM; - mem->nslabs = nslabs; - mem->start = virt_to_phys(tlb); - mem->end = mem->start + bytes; - mem->index = 0; - mem->late_alloc = 1; - spin_lock_init(&mem->lock); - for (i = 0; i < mem->nslabs; i++) { - mem->slots[i].list = IO_TLB_SEGSIZE - io_tlb_offset(i); - mem->slots[i].orig_addr = INVALID_PHYS_ADDR; - mem->slots[i].alloc_size = 0; - } - - set_memory_decrypted((unsigned long)tlb, bytes >> PAGE_SHIFT); - memset(tlb, 0, bytes); + swiotlb_init_io_tlb_mem(mem, virt_to_phys(tlb), nslabs, true, true); io_tlb_default_mem = mem; swiotlb_print_info(); -- 2.32.0.272.g935e593368-goog ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v9 00/14] Restricted DMA
This series implements mitigations for lack of DMA access control on systems without an IOMMU, which could result in the DMA accessing the system memory at unexpected times and/or unexpected addresses, possibly leading to data leakage or corruption. For example, we plan to use the PCI-e bus for Wi-Fi and that PCI-e bus is not behind an IOMMU. As PCI-e, by design, gives the device full access to system memory, a vulnerability in the Wi-Fi firmware could easily escalate to a full system exploit (remote wifi exploits: [1a], [1b] that shows a full chain of exploits; [2], [3]). To mitigate the security concerns, we introduce restricted DMA. Restricted DMA utilizes the existing swiotlb to bounce streaming DMA in and out of a specially allocated region and does memory allocation from the same region. The feature on its own provides a basic level of protection against the DMA overwriting buffer contents at unexpected times. However, to protect against general data leakage and system memory corruption, the system needs to provide a way to restrict the DMA to a predefined memory region (this is usually done at firmware level, e.g. MPU in ATF on some ARM platforms [4]). [1a] https://googleprojectzero.blogspot.com/2017/04/over-air-exploiting-broadcoms-wi-fi_4.html [1b] https://googleprojectzero.blogspot.com/2017/04/over-air-exploiting-broadcoms-wi-fi_11.html [2] https://blade.tencent.com/en/advisories/qualpwn/ [3] https://www.bleepingcomputer.com/news/security/vulnerabilities-found-in-highly-popular-firmware-for-wifi-chips/ [4] https://github.com/ARM-software/arm-trusted-firmware/blob/master/plat/mediatek/mt8183/drivers/emi_mpu/emi_mpu.c#L132 v9: Address the comments in v7 to - set swiotlb active pool to dev->dma_io_tlb_mem - get rid of get_io_tlb_mem - dig out the device struct for is_swiotlb_active - move debugfs_create_dir out of swiotlb_create_debugfs - do set_memory_decrypted conditionally in swiotlb_init_io_tlb_mem - use IS_ENABLED in kernel/dma/direct.c - fix redefinition of 'of_dma_set_restricted_buffer' v8: - Fix reserved-memory.txt and add the reg property in example. - Fix sizeof for of_property_count_elems_of_size in drivers/of/address.c#of_dma_set_restricted_buffer. - Apply Will's suggestion to try the OF node having DMA configuration in drivers/of/address.c#of_dma_set_restricted_buffer. - Fix typo in the comment of drivers/of/address.c#of_dma_set_restricted_buffer. - Add error message for PageHighMem in kernel/dma/swiotlb.c#rmem_swiotlb_device_init and move it to rmem_swiotlb_setup. - Fix the message string in rmem_swiotlb_setup. https://lore.kernel.org/patchwork/cover/1437112/ v7: Fix debugfs, PageHighMem and comment style in rmem_swiotlb_device_init https://lore.kernel.org/patchwork/cover/1431031/ v6: Address the comments in v5 https://lore.kernel.org/patchwork/cover/1423201/ v5: Rebase on latest linux-next https://lore.kernel.org/patchwork/cover/1416899/ v4: - Fix spinlock bad magic - Use rmem->name for debugfs entry - Address the comments in v3 https://lore.kernel.org/patchwork/cover/1378113/ v3: Using only one reserved memory region for both streaming DMA and memory allocation. https://lore.kernel.org/patchwork/cover/1360992/ v2: Building on top of swiotlb. https://lore.kernel.org/patchwork/cover/1280705/ v1: Using dma_map_ops. https://lore.kernel.org/patchwork/cover/1271660/ Claire Chang (14): swiotlb: Refactor swiotlb init functions swiotlb: Refactor swiotlb_create_debugfs swiotlb: Set dev->dma_io_tlb_mem to the swiotlb pool used swiotlb: Add restricted DMA pool initialization swiotlb: Update is_swiotlb_buffer to add a struct device argument swiotlb: Update is_swiotlb_active to add a struct device argument swiotlb: Bounce data from/to restricted DMA pool if available swiotlb: Move alloc_size to find_slots swiotlb: Refactor swiotlb_tbl_unmap_single dma-direct: Add a new wrapper __dma_direct_free_pages() swiotlb: Add restricted DMA alloc/free support. dma-direct: Allocate memory from restricted DMA pool if available dt-bindings: of: Add restricted DMA pool of: Add plumbing for restricted DMA pool .../reserved-memory/reserved-memory.txt | 36 ++- drivers/gpu/drm/i915/gem/i915_gem_internal.c | 2 +- drivers/gpu/drm/nouveau/nouveau_ttm.c | 2 +- drivers/iommu/dma-iommu.c | 12 +- drivers/of/address.c | 33 +++ drivers/of/device.c | 6 + drivers/of/of_private.h | 6 + drivers/pci/xen-pcifront.c| 2 +- drivers/xen/swiotlb-xen.c | 2 +- include/linux/device.h| 4 + include/linux/swiotlb.h | 45 +++- kernel/dma/Kconfig| 14 + kernel/dma/direct.c | 62 +++-- kernel/dma/direct.h | 9 +- kernel/dma/swiotlb.c | 242 +++
Re: [PATCH] swiotlb-xen: override common mmap and get_sgtable dma ops
On 6/11/21 5:55 AM, Roman Skakun wrote: > > +static int > +xen_swiotlb_dma_mmap(struct device *dev, struct vm_area_struct *vma, > + void *cpu_addr, dma_addr_t dma_addr, size_t size, > + unsigned long attrs) > +{ > + unsigned long user_count = vma_pages(vma); > + unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT; > + unsigned long off = vma->vm_pgoff; > + struct page *page; > + > + if (is_vmalloc_addr(cpu_addr)) > + page = vmalloc_to_page(cpu_addr); > + else > + page = virt_to_page(cpu_addr); > + > + vma->vm_page_prot = dma_pgprot(dev, vma->vm_page_prot, attrs); > + > + if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret)) > + return -ENXIO; > + > + if (off >= count || user_count > count - off) > + return -ENXIO; > + > + return remap_pfn_range(vma, vma->vm_start, > + page_to_pfn(page) + vma->vm_pgoff, > + user_count << PAGE_SHIFT, vma->vm_page_prot); > +} I suggest you create a helper for computing page value and then revert 922659ea771b3fd728149262c5ea15608fab9719 and pass result of the helper instead of cpu_addr. Here and in xen_swiotlb_dma_get_sgtable(). And use this new helper in xen_swiotlb_free_coherent() too. I am curious though why this was not a problem when Stefano was looking at the problem that introduced this vmalloc check (i.e. 8b1e868f66076490189a36d984fcce286cdd6295). Stefano? -boris ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 4/6] iommu: Factor iommu_iotlb_gather_is_disjoint() out
On Mon, Jun 07, 2021 at 11:25:39AM -0700, Nadav Amit wrote: > From: Nadav Amit > > Refactor iommu_iotlb_gather_add_page() and factor out the logic that > detects whether IOTLB gather range and a new range are disjoint. To be > used by the next patch that implements different gathering logic for > AMD. > > Cc: Joerg Roedel > Cc: Will Deacon > Cc: Jiajun Cao > Cc: Robin Murphy > Cc: Lu Baolu > Cc: iommu@lists.linux-foundation.org > Cc: linux-ker...@vger.kernel.org> > Signed-off-by: Nadav Amit > --- > include/linux/iommu.h | 41 + > 1 file changed, 33 insertions(+), 8 deletions(-) [...] > diff --git a/include/linux/iommu.h b/include/linux/iommu.h > index f254c62f3720..b5a2bfc68fb0 100644 > --- a/include/linux/iommu.h > +++ b/include/linux/iommu.h > @@ -497,6 +497,28 @@ static inline void iommu_iotlb_sync(struct iommu_domain > *domain, > iommu_iotlb_gather_init(iotlb_gather); > } > > +/** > + * iommu_iotlb_gather_is_disjoint - Checks whether a new range is disjoint > + * > + * @gather: TLB gather data > + * @iova: start of page to invalidate > + * @size: size of page to invalidate > + * > + * Helper for IOMMU drivers to check whether a new range is and the gathered > + * range are disjoint. I can't quite parse this. Delete the "is"? > For many IOMMUs, flushing the IOMMU in this case is > + * better than merging the two, which might lead to unnecessary > invalidations. > + */ > +static inline > +bool iommu_iotlb_gather_is_disjoint(struct iommu_iotlb_gather *gather, > + unsigned long iova, size_t size) > +{ > + unsigned long start = iova, end = start + size - 1; > + > + return gather->end != 0 && > + (end + 1 < gather->start || start > gather->end + 1); > +} > + > + > /** > * iommu_iotlb_gather_add_range - Gather for address-based TLB invalidation > * @gather: TLB gather data > @@ -533,20 +555,16 @@ static inline void iommu_iotlb_gather_add_page(struct > iommu_domain *domain, > struct iommu_iotlb_gather > *gather, > unsigned long iova, size_t size) > { > - unsigned long start = iova, end = start + size - 1; > - > /* >* If the new page is disjoint from the current range or is mapped at >* a different granularity, then sync the TLB so that the gather >* structure can be rewritten. >*/ > - if (gather->pgsize != size || > - end + 1 < gather->start || start > gather->end + 1) { > - if (gather->pgsize) > - iommu_iotlb_sync(domain, gather); > - gather->pgsize = size; > - } > + if ((gather->pgsize && gather->pgsize != size) || > + iommu_iotlb_gather_is_disjoint(gather, iova, size)) > + iommu_iotlb_sync(domain, gather); > > + gather->pgsize = size; Why have you made this unconditional? I think it's ok, but just not sure if it's necessary or not. Will ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 3/6] iommu: Improve iommu_iotlb_gather helpers
On Mon, Jun 07, 2021 at 11:25:38AM -0700, Nadav Amit wrote: > From: Robin Murphy > > The Mediatek driver is not the only one which might want a basic > address-based gathering behaviour, so although it's arguably simple > enough to open-code, let's factor it out for the sake of cleanliness. > Let's also take this opportunity to document the intent of these > helpers for clarity. > > Cc: Joerg Roedel > Cc: Will Deacon > Cc: Jiajun Cao > Cc: Robin Murphy > Cc: Lu Baolu > Cc: iommu@lists.linux-foundation.org > Cc: linux-ker...@vger.kernel.org > Signed-off-by: Robin Murphy > > --- > > Changes from Robin's version: > * Added iommu_iotlb_gather_add_range() stub !CONFIG_IOMMU_API > * Use iommu_iotlb_gather_add_range() in iommu_iotlb_gather_add_page() > --- > drivers/iommu/mtk_iommu.c | 5 + > include/linux/iommu.h | 43 ++- > 2 files changed, 39 insertions(+), 9 deletions(-) Acked-by: Will Deacon Will ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH][next] iommu/vt-d: Fix dereference of pointer info before it is null checked
From: Colin Ian King The assignment of iommu from info->iommu occurs before info is null checked hence leading to a potential null pointer dereference issue. Fix this by assigning iommu and checking if iommu is null after null checking info. Addresses-Coverity: ("Dereference before null check") Fixes: 4c82b88696ac ("iommu/vt-d: Allocate/register iopf queue for sva devices") Signed-off-by: Colin Ian King --- drivers/iommu/intel/iommu.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index bd93c7ec879e..76a58b8ad6c3 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -5329,10 +5329,14 @@ static int intel_iommu_disable_auxd(struct device *dev) static int intel_iommu_enable_sva(struct device *dev) { struct device_domain_info *info = get_domain_info(dev); - struct intel_iommu *iommu = info->iommu; + struct intel_iommu *iommu; int ret; - if (!info || !iommu || dmar_disabled) + if (!info || dmar_disabled) + return -EINVAL; + + iommu = info->iommu; + if (!iommu) return -EINVAL; if (!(iommu->flags & VTD_FLAG_SVM_CAPABLE)) -- 2.31.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v5 5/5] drm/msm: devcoredump iommu fault support
On Thu, Jun 10, 2021 at 02:44:13PM -0700, Rob Clark wrote: > From: Rob Clark > > Wire up support to stall the SMMU on iova fault, and collect a devcore- > dump snapshot for easier debugging of faults. > > Currently this is a6xx-only, but mostly only because so far it is the > only one using adreno-smmu-priv. Acked-by: Jordan Crouse > Signed-off-by: Rob Clark > --- > drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 19 +++- > drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 38 +++- > drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 42 ++ > drivers/gpu/drm/msm/adreno/adreno_gpu.c | 15 +++ > drivers/gpu/drm/msm/msm_gem.h | 1 + > drivers/gpu/drm/msm/msm_gem_submit.c| 1 + > drivers/gpu/drm/msm/msm_gpu.c | 48 + > drivers/gpu/drm/msm/msm_gpu.h | 17 > drivers/gpu/drm/msm/msm_gpummu.c| 5 +++ > drivers/gpu/drm/msm/msm_iommu.c | 11 + > drivers/gpu/drm/msm/msm_mmu.h | 1 + > 11 files changed, 186 insertions(+), 12 deletions(-) > > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c > b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c > index eb030b00bff4..7a271de9a212 100644 > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c > @@ -1200,6 +1200,15 @@ static void a5xx_fault_detect_irq(struct msm_gpu *gpu) > struct drm_device *dev = gpu->dev; > struct msm_ringbuffer *ring = gpu->funcs->active_ring(gpu); > > + /* > + * If stalled on SMMU fault, we could trip the GPU's hang detection, > + * but the fault handler will trigger the devcore dump, and we want > + * to otherwise resume normally rather than killing the submit, so > + * just bail. > + */ > + if (gpu_read(gpu, REG_A5XX_RBBM_STATUS3) & BIT(24)) > + return; > + > DRM_DEV_ERROR(dev->dev, "gpu fault ring %d fence %x status %8.8X rb > %4.4x/%4.4x ib1 %16.16llX/%4.4x ib2 %16.16llX/%4.4x\n", > ring ? ring->id : -1, ring ? ring->seqno : 0, > gpu_read(gpu, REG_A5XX_RBBM_STATUS), > @@ -1523,6 +1532,7 @@ static struct msm_gpu_state *a5xx_gpu_state_get(struct > msm_gpu *gpu) > { > struct a5xx_gpu_state *a5xx_state = kzalloc(sizeof(*a5xx_state), > GFP_KERNEL); > + bool stalled = !!(gpu_read(gpu, REG_A5XX_RBBM_STATUS3) & BIT(24)); > > if (!a5xx_state) > return ERR_PTR(-ENOMEM); > @@ -1535,8 +1545,13 @@ static struct msm_gpu_state *a5xx_gpu_state_get(struct > msm_gpu *gpu) > > a5xx_state->base.rbbm_status = gpu_read(gpu, REG_A5XX_RBBM_STATUS); > > - /* Get the HLSQ regs with the help of the crashdumper */ > - a5xx_gpu_state_get_hlsq_regs(gpu, a5xx_state); > + /* > + * Get the HLSQ regs with the help of the crashdumper, but only if > + * we are not stalled in an iommu fault (in which case the crashdumper > + * would not have access to memory) > + */ > + if (!stalled) > + a5xx_gpu_state_get_hlsq_regs(gpu, a5xx_state); > > a5xx_set_hwcg(gpu, true); > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c > b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c > index fc19db10bff1..c3699408bd1f 100644 > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c > @@ -1081,6 +1081,16 @@ static int a6xx_fault_handler(void *arg, unsigned long > iova, int flags, void *da > struct msm_gpu *gpu = arg; > struct adreno_smmu_fault_info *info = data; > const char *type = "UNKNOWN"; > + const char *block; > + bool do_devcoredump = info && !READ_ONCE(gpu->crashstate); > + > + /* > + * If we aren't going to be resuming later from fault_worker, then do > + * it now. > + */ > + if (!do_devcoredump) { > + gpu->aspace->mmu->funcs->resume_translation(gpu->aspace->mmu); > + } > > /* >* Print a default message if we couldn't get the data from the > @@ -1104,15 +1114,30 @@ static int a6xx_fault_handler(void *arg, unsigned > long iova, int flags, void *da > else if (info->fsr & ARM_SMMU_FSR_EF) > type = "EXTERNAL"; > > + block = a6xx_fault_block(gpu, info->fsynr1 & 0xff); > + > pr_warn_ratelimited("*** gpu fault: ttbr0=%.16llx iova=%.16lx dir=%s > type=%s source=%s (%u,%u,%u,%u)\n", > info->ttbr0, iova, > - flags & IOMMU_FAULT_WRITE ? "WRITE" : "READ", type, > - a6xx_fault_block(gpu, info->fsynr1 & 0xff), > + flags & IOMMU_FAULT_WRITE ? "WRITE" : "READ", > + type, block, > gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(4)), > gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(5)), > gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(6)), > gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(7))); > > + if (do_devcor
Re: [PATCH v5 4/5] iommu/arm-smmu-qcom: Add stall support
On Thu, Jun 10, 2021 at 02:44:12PM -0700, Rob Clark wrote: > From: Rob Clark > > Add, via the adreno-smmu-priv interface, a way for the GPU to request > the SMMU to stall translation on faults, and then later resume the > translation, either retrying or terminating the current translation. > > This will be used on the GPU side to "freeze" the GPU while we snapshot > useful state for devcoredump. > Acked-by: Jordan Crouse > Signed-off-by: Rob Clark > --- > drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 33 ++ > include/linux/adreno-smmu-priv.h | 7 + > 2 files changed, 40 insertions(+) > > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c > b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c > index b2e31ea84128..61fc645c1325 100644 > --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c > @@ -13,6 +13,7 @@ struct qcom_smmu { > struct arm_smmu_device smmu; > bool bypass_quirk; > u8 bypass_cbndx; > + u32 stall_enabled; > }; > > static struct qcom_smmu *to_qcom_smmu(struct arm_smmu_device *smmu) > @@ -23,12 +24,17 @@ static struct qcom_smmu *to_qcom_smmu(struct > arm_smmu_device *smmu) > static void qcom_adreno_smmu_write_sctlr(struct arm_smmu_device *smmu, int > idx, > u32 reg) > { > + struct qcom_smmu *qsmmu = to_qcom_smmu(smmu); > + > /* >* On the GPU device we want to process subsequent transactions after a >* fault to keep the GPU from hanging >*/ > reg |= ARM_SMMU_SCTLR_HUPCF; > > + if (qsmmu->stall_enabled & BIT(idx)) > + reg |= ARM_SMMU_SCTLR_CFCFG; > + > arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, reg); > } > > @@ -48,6 +54,31 @@ static void qcom_adreno_smmu_get_fault_info(const void > *cookie, > info->contextidr = arm_smmu_cb_read(smmu, cfg->cbndx, > ARM_SMMU_CB_CONTEXTIDR); > } > > +static void qcom_adreno_smmu_set_stall(const void *cookie, bool enabled) > +{ > + struct arm_smmu_domain *smmu_domain = (void *)cookie; > + struct arm_smmu_cfg *cfg = &smmu_domain->cfg; > + struct qcom_smmu *qsmmu = to_qcom_smmu(smmu_domain->smmu); > + > + if (enabled) > + qsmmu->stall_enabled |= BIT(cfg->cbndx); > + else > + qsmmu->stall_enabled &= ~BIT(cfg->cbndx); > +} > + > +static void qcom_adreno_smmu_resume_translation(const void *cookie, bool > terminate) > +{ > + struct arm_smmu_domain *smmu_domain = (void *)cookie; > + struct arm_smmu_cfg *cfg = &smmu_domain->cfg; > + struct arm_smmu_device *smmu = smmu_domain->smmu; > + u32 reg = 0; > + > + if (terminate) > + reg |= ARM_SMMU_RESUME_TERMINATE; > + > + arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_RESUME, reg); > +} > + > #define QCOM_ADRENO_SMMU_GPU_SID 0 > > static bool qcom_adreno_smmu_is_gpu_device(struct device *dev) > @@ -173,6 +204,8 @@ static int qcom_adreno_smmu_init_context(struct > arm_smmu_domain *smmu_domain, > priv->get_ttbr1_cfg = qcom_adreno_smmu_get_ttbr1_cfg; > priv->set_ttbr0_cfg = qcom_adreno_smmu_set_ttbr0_cfg; > priv->get_fault_info = qcom_adreno_smmu_get_fault_info; > + priv->set_stall = qcom_adreno_smmu_set_stall; > + priv->resume_translation = qcom_adreno_smmu_resume_translation; > > return 0; > } > diff --git a/include/linux/adreno-smmu-priv.h > b/include/linux/adreno-smmu-priv.h > index 53fe32fb9214..c637e0997f6d 100644 > --- a/include/linux/adreno-smmu-priv.h > +++ b/include/linux/adreno-smmu-priv.h > @@ -45,6 +45,11 @@ struct adreno_smmu_fault_info { > * TTBR0 translation is enabled with the specified cfg > * @get_fault_info: Called by the GPU fault handler to get information about > * the fault > + * @set_stall: Configure whether stall on fault (CFCFG) is enabled. Call > + * before set_ttbr0_cfg(). If stalling on fault is enabled, > + * the GPU driver must call resume_translation() > + * @resume_translation: Resume translation after a fault > + * > * > * The GPU driver (drm/msm) and adreno-smmu work together for controlling > * the GPU's SMMU instance. This is by necessity, as the GPU is directly > @@ -60,6 +65,8 @@ struct adreno_smmu_priv { > const struct io_pgtable_cfg *(*get_ttbr1_cfg)(const void *cookie); > int (*set_ttbr0_cfg)(const void *cookie, const struct io_pgtable_cfg > *cfg); > void (*get_fault_info)(const void *cookie, struct adreno_smmu_fault_info > *info); > +void (*set_stall)(const void *cookie, bool enabled); > +void (*resume_translation)(const void *cookie, bool terminate); > }; > > #endif /* __ADRENO_SMMU_PRIV_H */ > -- > 2.31.1 > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 1/1] dma: coherent: check no-map property for arm64
Coherent dma on ARM64 also can't work with mapped system ram, that means 'no-map' property must be specified in dts. Add the missing check for ARM64 platforms as well. Besides 'no-map' checking, 'linux,dma-default' feature is also enabled for ARM64 along with this patch. Cc: Christoph Hellwig Cc: Marek Szyprowski Cc: Robin Murphy Signed-off-by: Dong Aisheng --- kernel/dma/coherent.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/dma/coherent.c b/kernel/dma/coherent.c index 5b5b6c7ec7f2..d1831da7afba 100644 --- a/kernel/dma/coherent.c +++ b/kernel/dma/coherent.c @@ -356,7 +356,7 @@ static int __init rmem_dma_setup(struct reserved_mem *rmem) if (of_get_flat_dt_prop(node, "reusable", NULL)) return -EINVAL; -#ifdef CONFIG_ARM +#if defined(CONFIG_ARM) || defined(CONFIG_ARM64) if (!of_get_flat_dt_prop(node, "no-map", NULL)) { pr_err("Reserved memory: regions without no-map are not yet supported\n"); return -EINVAL; -- 2.25.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/2] iommu: Fix race condition during default domain allocation
On 2021-06-11 11:45, Will Deacon wrote: On Thu, Jun 10, 2021 at 09:46:53AM +0530, Ashish Mhetre wrote: Domain is getting created more than once during asynchronous multiple display heads(devices) probe. All the display heads share same SID and are expected to be in same domain. As iommu_alloc_default_domain() call is not protected, the group->default_domain and group->domain are ending up with different domains and leading to subsequent IOMMU faults. Fix this by protecting iommu_alloc_default_domain() call with group->mutex. Can you provide some more information about exactly what the h/w configuration is, and the callstack which exhibits the race, please? It'll be basically the same as the issue reported long ago with PCI groups in the absence of ACS not being constructed correctly. Triggering the iommu_probe_device() replay in of_iommu_configure() off the back of driver probe is way too late and allows calls to happen in the wrong order, or indeed race in parallel as here. Fixing that is still on my radar, but will not be simple, and will probably go hand-in-hand with phasing out the bus ops (for the multiple-driver-coexistence problem). Signed-off-by: Ashish Mhetre --- drivers/iommu/iommu.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 808ab70..2700500 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -273,7 +273,9 @@ int iommu_probe_device(struct device *dev) * support default domains, so the return value is not yet * checked. */ + mutex_lock(&group->mutex); iommu_alloc_default_domain(group, dev); + mutex_unlock(&group->mutex); It feels wrong to serialise this for everybody just to cater for systems with aliasing SIDs between devices. If two or more devices are racing at this point then they're already going to be serialised by at least iommu_group_add_device(), so I doubt there would be much impact - only the first device through here will hold the mutex for any appreciable length of time. Every other path which modifies group->domain does so with the mutex held (note the "expected" default domain allocation flow in bus_iommu_probe() in particular), so not holding it here does seem like a straightforward oversight. Robin. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v12 5/5] iommu: Remove mode argument from iommu_set_dma_strict()
We only ever now set strict mode enabled in iommu_set_dma_strict(), so just remove the argument. Signed-off-by: John Garry --- drivers/iommu/amd/init.c| 2 +- drivers/iommu/intel/iommu.c | 6 +++--- drivers/iommu/iommu.c | 5 ++--- include/linux/iommu.h | 2 +- 4 files changed, 7 insertions(+), 8 deletions(-) diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 0e6ae6d68f14..27e9677ec303 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -3098,7 +3098,7 @@ static int __init parse_amd_iommu_options(char *str) { for (; *str; ++str) { if (strncmp(str, "fullflush", 9) == 0) - iommu_set_dma_strict(true); + iommu_set_dma_strict(); if (strncmp(str, "force_enable", 12) == 0) amd_iommu_force_enable = true; if (strncmp(str, "off", 3) == 0) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 6763e516362c..e77b8b6e7838 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -452,7 +452,7 @@ static int __init intel_iommu_setup(char *str) pr_warn("intel_iommu=forcedac deprecated; use iommu.forcedac instead\n"); iommu_dma_forcedac = true; } else if (!strncmp(str, "strict", 6)) { - iommu_set_dma_strict(true); + iommu_set_dma_strict(); } else if (!strncmp(str, "sp_off", 6)) { pr_info("Disable supported super page\n"); intel_iommu_superpage = 0; @@ -4392,7 +4392,7 @@ int __init intel_iommu_init(void) */ if (cap_caching_mode(iommu->cap)) { pr_warn("IOMMU batching disallowed due to virtualization\n"); - iommu_set_dma_strict(true); + iommu_set_dma_strict(); } iommu_device_sysfs_add(&iommu->iommu, NULL, intel_iommu_groups, @@ -5663,7 +5663,7 @@ static void quirk_calpella_no_shadow_gtt(struct pci_dev *dev) } else if (dmar_map_gfx) { /* we have to ensure the gfx device is idle before we flush */ pci_info(dev, "Disabling batched IOTLB flush on Ironlake\n"); - iommu_set_dma_strict(true); + iommu_set_dma_strict(); } } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0040, quirk_calpella_no_shadow_gtt); diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index ccbd5d4c1a50..146cb71c7441 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -350,10 +350,9 @@ static int __init iommu_dma_setup(char *str) } early_param("iommu.strict", iommu_dma_setup); -void iommu_set_dma_strict(bool strict) +void iommu_set_dma_strict(void) { - if (strict || !(iommu_cmd_line & IOMMU_CMD_LINE_STRICT)) - iommu_dma_strict = strict; + iommu_dma_strict = true; } bool iommu_get_dma_strict(struct iommu_domain *domain) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 32d448050bf7..754f67d6dd90 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -476,7 +476,7 @@ int iommu_enable_nesting(struct iommu_domain *domain); int iommu_set_pgtable_quirks(struct iommu_domain *domain, unsigned long quirks); -void iommu_set_dma_strict(bool val); +void iommu_set_dma_strict(void); bool iommu_get_dma_strict(struct iommu_domain *domain); extern int report_iommu_fault(struct iommu_domain *domain, struct device *dev, -- 2.26.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v12 4/5] iommu/amd: Add support for IOMMU default DMA mode build options
From: Zhen Lei Make IOMMU_DEFAULT_LAZY default for when AMD_IOMMU config is set, which matches current behaviour. For "fullflush" param, just call iommu_set_dma_strict(true) directly. Since this is called from __setup(), it should occur prior to iommu_subsys_init(). As such, since we get a strict vs lazy mode print there, drop the prints in amd_iommu_init_dma_ops(). Also drop global flag amd_iommu_unmap_flush, as it has no purpose any longer. Note that we should not conflict with iommu.strict param when we set strict mode, as that param is not for x86. [jpg: Rebase for relocated file an drop amd_iommu_unmap_flush] Signed-off-by: Zhen Lei Signed-off-by: John Garry --- drivers/iommu/Kconfig | 2 +- drivers/iommu/amd/amd_iommu_types.h | 6 -- drivers/iommu/amd/init.c| 3 +-- drivers/iommu/amd/iommu.c | 6 -- 4 files changed, 2 insertions(+), 15 deletions(-) diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index 4467353f981b..8c1e0f2bba0d 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -94,7 +94,7 @@ choice prompt "IOMMU default DMA mode" depends on IOMMU_DMA - default IOMMU_DEFAULT_LAZY if INTEL_IOMMU + default IOMMU_DEFAULT_LAZY if (AMD_IOMMU || INTEL_IOMMU) default IOMMU_DEFAULT_STRICT help This option allows an IOMMU DMA mode to be chosen at build time, to diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 94c1a7a9876d..8dbe61e2b3c1 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -779,12 +779,6 @@ extern u16 amd_iommu_last_bdf; /* allocation bitmap for domain ids */ extern unsigned long *amd_iommu_pd_alloc_bitmap; -/* - * If true, the addresses will be flushed on unmap time, not when - * they are reused - */ -extern bool amd_iommu_unmap_flush; - /* Smallest max PASID supported by any IOMMU in the system */ extern u32 amd_iommu_max_pasid; diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 46280e6e1535..0e6ae6d68f14 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -161,7 +161,6 @@ u16 amd_iommu_last_bdf; /* largest PCI device id we have to handle */ LIST_HEAD(amd_iommu_unity_map);/* a list of required unity mappings we find in ACPI */ -bool amd_iommu_unmap_flush;/* if true, flush on every unmap */ LIST_HEAD(amd_iommu_list); /* list of all AMD IOMMUs in the system */ @@ -3099,7 +3098,7 @@ static int __init parse_amd_iommu_options(char *str) { for (; *str; ++str) { if (strncmp(str, "fullflush", 9) == 0) - amd_iommu_unmap_flush = true; + iommu_set_dma_strict(true); if (strncmp(str, "force_enable", 12) == 0) amd_iommu_force_enable = true; if (strncmp(str, "off", 3) == 0) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index b1fbf2c83df5..32b541ee2e11 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -1775,12 +1775,6 @@ void amd_iommu_domain_update(struct protection_domain *domain) static void __init amd_iommu_init_dma_ops(void) { swiotlb = (iommu_default_passthrough() || sme_me_mask) ? 1 : 0; - - if (amd_iommu_unmap_flush) - pr_info("IO/TLB flush on unmap enabled\n"); - else - pr_info("Lazy IO/TLB flushing enabled\n"); - iommu_set_dma_strict(amd_iommu_unmap_flush); } int __init amd_iommu_init_api(void) -- 2.26.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v12 3/5] iommu/vt-d: Add support for IOMMU default DMA mode build options
From: Zhen Lei Make IOMMU_DEFAULT_LAZY default for when INTEL_IOMMU config is set, as is current behaviour. Also delete global flag intel_iommu_strict: - In intel_iommu_setup(), call iommu_set_dma_strict(true) directly. The accompanying print is removed, as it replicates the print in iommu_subsys_init(), and, since called from __setup(), should before iommu_subsys_init() (so desired ordering). - For cap_caching_mode() set in intel_iommu_setup(), call iommu_set_dma_strict(true) directly, and reword the accompanying print and add the missing '\n'. - For Ironlake GPU, again call iommu_set_dma_strict(true) directly and keep the accompanying print. Note that we should not conflict with iommu.strict param conflicting when we set strict mode, as that param is not for x86. [jpg: Remove intel_iommu_strict] Signed-off-by: Zhen Lei Signed-off-by: John Garry --- drivers/iommu/Kconfig | 1 + drivers/iommu/intel/iommu.c | 15 ++- 2 files changed, 7 insertions(+), 9 deletions(-) diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index 2a71347611d4..4467353f981b 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -94,6 +94,7 @@ choice prompt "IOMMU default DMA mode" depends on IOMMU_DMA + default IOMMU_DEFAULT_LAZY if INTEL_IOMMU default IOMMU_DEFAULT_STRICT help This option allows an IOMMU DMA mode to be chosen at build time, to diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index be35284a2016..6763e516362c 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -360,7 +360,6 @@ int intel_iommu_enabled = 0; EXPORT_SYMBOL_GPL(intel_iommu_enabled); static int dmar_map_gfx = 1; -static int intel_iommu_strict; static int intel_iommu_superpage = 1; static int iommu_identity_mapping; static int iommu_skip_te_disable; @@ -453,8 +452,7 @@ static int __init intel_iommu_setup(char *str) pr_warn("intel_iommu=forcedac deprecated; use iommu.forcedac instead\n"); iommu_dma_forcedac = true; } else if (!strncmp(str, "strict", 6)) { - pr_info("Disable batched IOTLB flush\n"); - intel_iommu_strict = 1; + iommu_set_dma_strict(true); } else if (!strncmp(str, "sp_off", 6)) { pr_info("Disable supported super page\n"); intel_iommu_superpage = 0; @@ -4392,9 +4390,9 @@ int __init intel_iommu_init(void) * is likely to be much lower than the overhead of synchronizing * the virtual and physical IOMMU page-tables. */ - if (!intel_iommu_strict && cap_caching_mode(iommu->cap)) { - pr_warn("IOMMU batching is disabled due to virtualization"); - intel_iommu_strict = 1; + if (cap_caching_mode(iommu->cap)) { + pr_warn("IOMMU batching disallowed due to virtualization\n"); + iommu_set_dma_strict(true); } iommu_device_sysfs_add(&iommu->iommu, NULL, intel_iommu_groups, @@ -4403,7 +4401,6 @@ int __init intel_iommu_init(void) } up_read(&dmar_global_lock); - iommu_set_dma_strict(intel_iommu_strict); bus_set_iommu(&pci_bus_type, &intel_iommu_ops); if (si_domain && !hw_pass_through) register_memory_notifier(&intel_iommu_memory_nb); @@ -5666,8 +5663,8 @@ static void quirk_calpella_no_shadow_gtt(struct pci_dev *dev) } else if (dmar_map_gfx) { /* we have to ensure the gfx device is idle before we flush */ pci_info(dev, "Disabling batched IOTLB flush on Ironlake\n"); - intel_iommu_strict = 1; - } + iommu_set_dma_strict(true); + } } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0040, quirk_calpella_no_shadow_gtt); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0044, quirk_calpella_no_shadow_gtt); -- 2.26.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v12 2/5] iommu: Enhance IOMMU default DMA mode build options
From: Zhen Lei First, add build options IOMMU_DEFAULT_{LAZY|STRICT}, so that we have the opportunity to set {lazy|strict} mode as default at build time. Then put the two config options in an choice, as they are mutually exclusive. [jpg: Make choice between strict and lazy only (and not passthrough)] Signed-off-by: Zhen Lei Signed-off-by: John Garry --- drivers/iommu/Kconfig | 38 ++ drivers/iommu/iommu.c | 3 ++- 2 files changed, 40 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index 1f111b399bca..2a71347611d4 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -90,6 +90,44 @@ config IOMMU_DEFAULT_PASSTHROUGH If unsure, say N here. +choice + prompt "IOMMU default DMA mode" + depends on IOMMU_DMA + + default IOMMU_DEFAULT_STRICT + help + This option allows an IOMMU DMA mode to be chosen at build time, to + override the default DMA mode of each ARCH, removing the need to + pass in kernel parameters through command line. It is still possible + to provide ARCH-specific or common boot options to override this + option. + + If unsure, keep the default. + +config IOMMU_DEFAULT_STRICT + bool "strict" + help + For every IOMMU DMA unmap operation, the flush operation of IOTLB and + the free operation of IOVA are guaranteed to be done in the unmap + function. + +config IOMMU_DEFAULT_LAZY + bool "lazy" + help + Support lazy mode, where for every IOMMU DMA unmap operation, the + flush operation of IOTLB and the free operation of IOVA are deferred. + They are only guaranteed to be done before the related IOVA will be + reused. + + The isolation provided in this mode is not as secure as STRICT mode, + such that a vulnerable time window may be created between the DMA + unmap and the mapping finally being torn down in the IOMMU, where the + device can still access the system memory. However this mode may + provide better performance in high throughput scenarios, and is still + considerably more secure than passthrough mode or no IOMMU. + +endchoice + config OF_IOMMU def_bool y depends on OF && IOMMU_API diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index cf58949cc2f3..ccbd5d4c1a50 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -29,7 +29,8 @@ static struct kset *iommu_group_kset; static DEFINE_IDA(iommu_group_ida); static unsigned int iommu_def_domain_type __read_mostly; -static bool iommu_dma_strict __read_mostly = true; +static bool iommu_dma_strict __read_mostly = + IS_ENABLED(CONFIG_IOMMU_DEFAULT_STRICT); static u32 iommu_cmd_line __read_mostly; struct iommu_group { -- 2.26.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v12 1/5] iommu: Print strict or lazy mode at init time
As well as the default domain type, it's useful to know whether strict or lazy for DMA domains, so add this info in a separate print. The (stict/lazy) mode may be also set via iommu.strict earlyparm, but this will be processed prior to iommu_subsys_init(), so that print will be accurate for drivers which don't set the mode via custom means. For the drivers which do set the mode via custom means - the AMD and Intel drivers - they still maintain prints to notify the change in policy. Signed-off-by: John Garry --- drivers/iommu/iommu.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 5419c4b9f27a..cf58949cc2f3 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -138,6 +138,11 @@ static int __init iommu_subsys_init(void) (iommu_cmd_line & IOMMU_CMD_LINE_DMA_API) ? "(set via kernel command line)" : ""); + pr_info("DMA domain TLB invalidation policy: %s mode %s\n", + iommu_dma_strict ? "strict" : "lazy", + (iommu_cmd_line & IOMMU_CMD_LINE_STRICT) ? + "(set via kernel command line)" : ""); + return 0; } subsys_initcall(iommu_subsys_init); -- 2.26.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v12 0/5] Enhance IOMMU default DMA mode build options
This is a reboot of Zhen Lei's series from a couple of years ago, which never made it across the line. I still think that it has some value, so taking up the mantle. Motivation: Allow lazy mode be default mode for DMA domains for all ARCHs, and not only those who hardcode it (to be lazy). For ARM64, currently we must use a kernel command line parameter to use lazy mode, which is less than ideal. I have now included the print for strict/lazy mode, which I originally sent in: https://lore.kernel.org/linux-iommu/72eb3de9-1d1c-ae46-c5a9-95f26525d...@huawei.com/ There was some concern there about drivers and their custom prints conflicting with the print in that patch, but I think that it should be ok. Difference to v11: - Rebase to next-20210610 - Drop strict mode globals in Intel and AMD drivers - Include patch to print strict vs lazy mode - Include patch to remove argument from iommu_set_dma_strict() Differences to v10: - Rebase to v5.13-rc4 - Correct comment and typo in Kconfig (Randy) - Make Kconfig choice depend on relevant architectures Differences to v9: https://lore.kernel.org/linux-iommu/20190613084240.16768-1-thunder.leiz...@huawei.com/#t - Rebase to v5.13-rc2 - Remove CONFIG_IOMMU_DEFAULT_PASSTHROUGH from choice: Since we can dynamically change default domain of group, lazy or strict and passthrough are not mutually exclusive - Drop ia64 patch, which I don't think was ever required - Drop "x86/dma: use IS_ENABLED() to simplify the code", which is no longer required - Drop s390/pci patch, as this arch does not use CONFIG_IOMMU_API or even already honour CONFIG_IOMMU_DEFAULT_PASSTHROUGH https://lore.kernel.org/linux-iommu/20190613084240.16768-4-thunder.leiz...@huawei.com/ - Drop powernv/iommu patch, as I no longer think that it is relevant https://lore.kernel.org/linux-iommu/20190613084240.16768-5-thunder.leiz...@huawei.com/ - Some tidying John Garry (2): iommu: Print strict or lazy mode at init time iommu: Remove mode argument from iommu_set_dma_strict() Zhen Lei (3): iommu: Enhance IOMMU default DMA mode build options iommu/vt-d: Add support for IOMMU default DMA mode build options iommu/amd: Add support for IOMMU default DMA mode build options drivers/iommu/Kconfig | 39 + drivers/iommu/amd/amd_iommu_types.h | 6 - drivers/iommu/amd/init.c| 3 +-- drivers/iommu/amd/iommu.c | 6 - drivers/iommu/intel/iommu.c | 15 +-- drivers/iommu/iommu.c | 13 +++--- include/linux/iommu.h | 2 +- 7 files changed, 56 insertions(+), 28 deletions(-) -- 2.26.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 0/9] arm64: tegra: Prevent early SMMU faults
On Fri, Jun 11, 2021 at 08:48:00AM +0200, Krzysztof Kozlowski wrote: > On Thu, 3 Jun 2021 18:46:23 +0200, Thierry Reding wrote: > > this is a set of patches that is the result of earlier discussions > > regarding early identity mappings that are needed to avoid SMMU faults > > during early boot. > > > > The goal here is to avoid early identity mappings altogether and instead > > postpone the need for the identity mappings to when devices are attached > > to the SMMU. This works by making the SMMU driver coordinate with the > > memory controller driver on when to start enforcing SMMU translations. > > This makes Tegra behave in a more standard way and pushes the code to > > deal with the Tegra-specific programming into the NVIDIA SMMU > > implementation. > > > > [...] > > Applied, thanks! > > [1/9] memory: tegra: Implement SID override programming > (no commit info) > [2/9] dt-bindings: arm-smmu: Add Tegra186 compatible string > commit: 4287861dca9d77490ee50de42aa3ada92da86c9d > > [3/9] - skipped > > [4/9] iommu/arm-smmu: tegra: Detect number of instances at runtime > commit: 7ecbf253f8d64c08de28d16a66e3abbe873f6c9f > [5/9] iommu/arm-smmu: tegra: Implement SID override programming > commit: 8eb68595475ac5fcaaa3718a173283df48cb4ef1 > [6/9] iommu/arm-smmu: Use Tegra implementation on Tegra186 > commit: 2c1bc371268862a991a6498e1dddc8971b9076b8 I've applied patches 7-9 to the Tegra tree. Thanks Krzysztof and Will for your help in getting this over the finish line! Thierry signature.asc Description: PGP signature ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: swiotlb/caamjr regression (Was: [GIT PULL] (swiotlb) stable/for-linus-5.12)
On 6/11/2021 1:35 PM, Konrad Rzeszutek Wilk wrote: > On Fri, Jun 11, 2021 at 08:21:53AM +0200, Christoph Hellwig wrote: >> On Thu, Jun 10, 2021 at 05:52:07PM +0300, Horia Geantă wrote: >>> I've noticed the failure also in v5.10 and v5.11 stable kernels, >>> since the patch set has been backported. >> >> FYI, there has been a patch on the list that should have fixed this >> for about a month: >> >> https://lore.kernel.org/linux-iommu/20210510091816.ga2...@lst.de/T/#m0d0df6490350a08dcc24c9086c8edc165b402d6f >> >> but it seems like it never got picked up. > > Yikes! > > Dominique, > > Would you be up to testing the attached (and inline) patch please? > > Linus, > > Would you be terribly offended if I took your code (s/unsigned > long/unsigned int), and used Chanho's description of the problem (see below)? > Both patches work for my case. However, there's yet another, possibly significant, difference b/w the two: offset = (tlb_addr - mem->start) & (IO_TLB_SIZE - 1); vs. offset = (tlb_addr & (IO_TLB_SIZE - 1)) - swiotlb_align_offset(dev, orig_addr); I think accounting for the alignment offset (swiotlb_align_offset()) has to be kept. Horia ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v5 13/16] media: mtk-vcodec: Get rid of mtk_smi_larb_get/put
On Fri, Jun 11, 2021 at 12:07:24PM +0200, Matthias Brugger wrote: > That's a good question. I think the media tree would be a good > candidate, as it has the biggest bunch of patches. But that would mean > that Joerg is fine that. The DTS part could still go through my tree. IOMMU changes are only a minor part of this, so it should not go through the IOMMU tree. When Matthias has reviewed the IOMMU changes, feel free to add my Acked-by: Joerg Roedel to them. Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/2] iommu: Fix race condition during default domain allocation
On Thu, Jun 10, 2021 at 09:46:53AM +0530, Ashish Mhetre wrote: > Domain is getting created more than once during asynchronous multiple > display heads(devices) probe. All the display heads share same SID and > are expected to be in same domain. As iommu_alloc_default_domain() call > is not protected, the group->default_domain and group->domain are ending > up with different domains and leading to subsequent IOMMU faults. > Fix this by protecting iommu_alloc_default_domain() call with group->mutex. Can you provide some more information about exactly what the h/w configuration is, and the callstack which exhibits the race, please? > Signed-off-by: Ashish Mhetre > --- > drivers/iommu/iommu.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c > index 808ab70..2700500 100644 > --- a/drivers/iommu/iommu.c > +++ b/drivers/iommu/iommu.c > @@ -273,7 +273,9 @@ int iommu_probe_device(struct device *dev) >* support default domains, so the return value is not yet >* checked. >*/ > + mutex_lock(&group->mutex); > iommu_alloc_default_domain(group, dev); > + mutex_unlock(&group->mutex); It feels wrong to serialise this for everybody just to cater for systems with aliasing SIDs between devices. Will ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: swiotlb/caamjr regression (Was: [GIT PULL] (swiotlb) stable/for-linus-5.12)
On Fri, Jun 11, 2021 at 08:21:53AM +0200, Christoph Hellwig wrote: > On Thu, Jun 10, 2021 at 05:52:07PM +0300, Horia Geantă wrote: > > I've noticed the failure also in v5.10 and v5.11 stable kernels, > > since the patch set has been backported. > > FYI, there has been a patch on the list that should have fixed this > for about a month: > > https://lore.kernel.org/linux-iommu/20210510091816.ga2...@lst.de/T/#m0d0df6490350a08dcc24c9086c8edc165b402d6f > > but it seems like it never got picked up. Yikes! Dominique, Would you be up to testing the attached (and inline) patch please? Linus, Would you be terribly offended if I took your code (s/unsigned long/unsigned int), and used Chanho's description of the problem (see below)? Christoph, I took the liberty of putting your Reviewed-by on the patch, you OK with that? Jianxiong, Would you be up for testing this patch on your NVMe rig please? I don't forsee a problem.. but just in case >From f06da55596675383fbf2563fe2919b2a8f68901b Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Thu, 10 Jun 2021 12:41:26 -0700 Subject: [PATCH] swiotlb: manipulate orig_addr when tlb_addr has offset MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit in case of driver wants to sync part of ranges with offset, swiotlb_tbl_sync_single() copies from orig_addr base to tlb_addr with offset and ends up with data mismatch. It was removed from "swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single", but said logic has to be added back in. [From Linus's email: That commit which the removed the offset calculation entirely, because the old (unsigned long)tlb_addr & (IO_TLB_SIZE - 1) was wrong, but instead of removing it, I think it should have just fixed it to be (tlb_addr - mem->start) & (IO_TLB_SIZE - 1); instead. That way the slot offset always matches the slot index calculation.] The use-case that drivers are hitting is as follow: 1. Get dma_addr_t from dma_map_single() dma_addr_t tlb_addr = dma_map_single(dev, vaddr, vsize, DMA_TO_DEVICE); |<---vsize->| +---+ | | original buffer +---+ vaddr swiotlb_align_offset |<->|<---vsize->| +---+---+ | | | swiotlb buffer +---+---+ tlb_addr 2. Do something 3. Sync dma_addr_t through dma_sync_single_for_device(..) dma_sync_single_for_device(dev, tlb_addr + offset, size, DMA_TO_DEVICE); Error case. Copy data to original buffer but it is from base addr (instead of base addr + offset) in original buffer: swiotlb_align_offset |<->|<- offset ->|<- size ->| +---+---+ | ||##| | swiotlb buffer +---+---+ tlb_addr |<- size ->| +---+ |##|| original buffer +---+ vaddr The fix is to copy the data to the original buffer and take into account the offset, like so: swiotlb_align_offset |<->|<- offset ->|<- size ->| +---+---+ | ||##| | swiotlb buffer +---+---+ tlb_addr |<- offset ->|<- size ->| +---+ ||##| | original buffer +---+ vaddr [This patch text is from Bumyong's email; and his solution was very close to Linus's, so incorporating his text] Fixes: 16fc3cef33a0 ("swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single") Signed-off-by: Bumyong Lee Signed-off-by: Chanho Park Reviewed-by: Christoph Hellwig Reported-by: Dominique MARTINET Reported-by: Horia Geantă Tested-by: Horia Geantă Signed-off-by: Linus Torvalds CC: sta...@vger.kernel.org Signed-off-by: Konrad Rzeszutek Wilk --- kernel/dma/swiotlb.c | 9 + 1 file changed, 9 insertions(+) diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 8ca7d50..dc438b5 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -342,6 +342,7 @@ static void swiotlb_bounce(struct device *dev, phys_addr_t tlb_addr, size_t size { struct io_tlb_mem *mem = io_tlb_default_mem; int index = (tlb_addr - mem->start) >> IO_TLB_SHIFT; + unsigned int offset = (tlb_addr - mem->start) & (IO_TLB_SIZE - 1); phys_addr_t orig_addr = mem->slots[index].orig_addr; size_t alloc_size = mem->slots[index].alloc_size; unsigned long pfn = PFN_DOWN(orig_addr); @@ -350,6 +351,14 @@ static void swiotlb_bounce(struct device *dev, phys_addr_t tlb_addr, size_t size
Re: [PATCH 1/1] iommu/arm-smmu-v3: remove unnecessary oom message
On Wed, Jun 09, 2021 at 08:54:38PM +0800, Zhen Lei wrote: > Fixes scripts/checkpatch.pl warning: > WARNING: Possible unnecessary 'out of memory' message > > Remove it can help us save a bit of memory. > > Signed-off-by: Zhen Lei > --- > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 8 ++-- > 1 file changed, 2 insertions(+), 6 deletions(-) > > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c > b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c > index 2ddc3cd5a7d1..fd7c55b44881 100644 > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c > @@ -2787,10 +2787,8 @@ static int arm_smmu_init_l1_strtab(struct > arm_smmu_device *smmu) > void *strtab = smmu->strtab_cfg.strtab; > > cfg->l1_desc = devm_kzalloc(smmu->dev, size, GFP_KERNEL); > - if (!cfg->l1_desc) { > - dev_err(smmu->dev, "failed to allocate l1 stream table desc\n"); > + if (!cfg->l1_desc) What error do you get if devm_kzalloc() fails? I'd like to make sure it's easy to track down _which_ allocation failed in that case -- does it give you a line number, for example? Will ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v5 13/16] media: mtk-vcodec: Get rid of mtk_smi_larb_get/put
On 10/06/2021 14:02, Yong Wu wrote: > On Thu, 2021-06-10 at 09:53 +0200, Matthias Brugger wrote: >> Hi Yong, >> >> On 12/05/2021 14:29, Yong Wu wrote: >>> On Wed, 2021-05-12 at 17:20 +0800, Hsin-Yi Wang wrote: On Sat, Apr 10, 2021 at 5:14 PM Yong Wu wrote: > > MediaTek IOMMU has already added the device_link between the consumer > and smi-larb device. If the vcodec device call the pm_runtime_get_sync, > the smi-larb's pm_runtime_get_sync also be called automatically. > > CC: Tiffany Lin > CC: Irui Wang > Signed-off-by: Yong Wu > Reviewed-by: Evan Green > Acked-by: Tiffany Lin > --- > .../platform/mtk-vcodec/mtk_vcodec_dec_pm.c | 37 ++- > .../platform/mtk-vcodec/mtk_vcodec_drv.h | 3 -- > .../platform/mtk-vcodec/mtk_vcodec_enc.c | 1 - > .../platform/mtk-vcodec/mtk_vcodec_enc_pm.c | 46 ++- > 4 files changed, 10 insertions(+), 77 deletions(-) >>> >>> [...] >>> > @@ -108,13 +80,6 @@ void mtk_vcodec_enc_clock_on(struct mtk_vcodec_pm *pm) > } > } > > - ret = mtk_smi_larb_get(pm->larbvenc); > - if (ret) { > - mtk_v4l2_err("mtk_smi_larb_get larb3 fail %d", ret); > - goto clkerr; > - } > - return; You can't delete the return; here, otherwise vcodec_clk will be turned off immediately after they are turned on. >>> >>> Thanks very much for your review. >>> >>> Sorry for this. You are quite right. >>> >>> I checked this, it was introduced in v4 when I rebase the code. I will >>> fix it in next time. >>> >> >> Please also make sure that you add all maintainers. I realized that at least >> for >> the media/platform drivers we miss the maintainer and the corresponding >> mailing >> list. >> This is especially important in this series, as it spans several subsystems. > > Thanks for hint. I only added the file maintainer here. I will add > linux-media in next version. > > By the way, this patchset cross several trees, then which tree should it > go through. Do you have some suggestion? > That's a good question. I think the media tree would be a good candidate, as it has the biggest bunch of patches. But that would mean that Joerg is fine that. The DTS part could still go through my tree. Regards, Matthias ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu