Re: [RFC v2] /dev/iommu uAPI proposal
On Tue, Aug 03, 2021 at 03:19:26AM +, Tian, Kevin wrote: > > From: David Gibson > > Sent: Tuesday, August 3, 2021 9:51 AM > > > > On Wed, Jul 28, 2021 at 04:04:24AM +, Tian, Kevin wrote: > > > Hi, David, > > > > > > > From: David Gibson > > > > Sent: Monday, July 26, 2021 12:51 PM > > > > > > > > On Fri, Jul 09, 2021 at 07:48:44AM +, Tian, Kevin wrote: > > > > > /dev/iommu provides an unified interface for managing I/O page tables > > for > > > > > devices assigned to userspace. Device passthrough frameworks (VFIO, > > > > vDPA, > > > > > etc.) are expected to use this interface instead of creating their own > > logic to > > > > > isolate untrusted device DMAs initiated by userspace. > > > > > > > > > > This proposal describes the uAPI of /dev/iommu and also sample > > > > sequences > > > > > with VFIO as example in typical usages. The driver-facing kernel API > > > > provided > > > > > by the iommu layer is still TBD, which can be discussed after > > > > > consensus > > is > > > > > made on this uAPI. > > > > > > > > > > It's based on a lengthy discussion starting from here: > > > > > https://lore.kernel.org/linux- > > > > iommu/20210330132830.go2356...@nvidia.com/ > > > > > > > > > > v1 can be found here: > > > > > https://lore.kernel.org/linux- > > > > > > iommu/PH0PR12MB54811863B392C644E5365446DC3E9@PH0PR12MB5481.n > > > > amprd12.prod.outlook.com/T/ > > > > > > > > > > This doc is also tracked on github, though it's not very useful for > > > > > v1->v2 > > > > > given dramatic refactoring: > > > > > https://github.com/luxis1999/dev_iommu_uapi > > > > > > > > Thanks for all your work on this, Kevin. Apart from the actual > > > > semantic improvements, I'm finding v2 significantly easier to read and > > > > understand than v1. > > > > > > > > [snip] > > > > > 1.2. Attach Device to I/O address space > > > > > +++ > > > > > > > > > > Device attach/bind is initiated through passthrough framework uAPI. 
> > > > > > > > > > Device attaching is allowed only after a device is successfully bound > > > > > to > > > > > the IOMMU fd. User should provide a device cookie when binding the > > > > > device through VFIO uAPI. This cookie is used when the user queries > > > > > device capability/format, issues per-device iotlb invalidation and > > > > > receives per-device I/O page fault data via IOMMU fd. > > > > > > > > > > Successful binding puts the device into a security context which > > > > > isolates > > > > > its DMA from the rest system. VFIO should not allow user to access the > > > > > device before binding is completed. Similarly, VFIO should prevent the > > > > > user from unbinding the device before user access is withdrawn. > > > > > > > > > > When a device is in an iommu group which contains multiple devices, > > > > > all devices within the group must enter/exit the security context > > > > > together. Please check {1.3} for more info about group isolation via > > > > > this device-centric design. > > > > > > > > > > Successful attaching activates an I/O address space in the IOMMU, > > > > > if the device is not purely software mediated. VFIO must provide > > > > > device > > > > > specific routing information for where to install the I/O page table > > > > > in > > > > > the IOMMU for this device. VFIO must also guarantee that the attached > > > > > device is configured to compose DMAs with the routing information > > that > > > > > is provided in the attaching call. When handling DMA requests, IOMMU > > > > > identifies the target I/O address space according to the routing > > > > > information carried in the request. Misconfiguration breaks DMA > > > > > isolation thus could lead to severe security vulnerability. > > > > > > > > > > Routing information is per-device and bus specific. For PCI, it is > > > > > Requester ID (RID) identifying the device plus optional Process > > > > > Address > > > > > Space ID (PASID). 
For ARM, it is Stream ID (SID) plus optional Sub- > > Stream > > > > > ID (SSID). PASID or SSID is used when multiple I/O address spaces are > > > > > enabled on a single device. For simplicity and continuity reason the > > > > > following context uses RID+PASID though SID+SSID may sound a clearer > > > > > naming from device p.o.v. We can decide the actual naming when > > coding. > > > > > > > > > > Because one I/O address space can be attached by multiple devices, > > > > > per-device routing information (plus device cookie) is tracked under > > > > > each IOASID and is used respectively when activating the I/O address > > > > > space in the IOMMU for each attached device. > > > > > > > > > > The device in the /dev/iommu context always refers to a physical one > > > > > (pdev) which is identifiable via RID. Physically each pdev can support > > > > > one default I/O address space (routed via RID) and optionally multiple > > > > > non-default I/O address spaces (via RID+PASID). > > > > > > > > > > The device in
Re: [RFC v2] /dev/iommu uAPI proposal
On Wed, Aug 04, 2021 at 11:04:47AM -0300, Jason Gunthorpe wrote:
> On Mon, Aug 02, 2021 at 02:49:44AM +0000, Tian, Kevin wrote:
>
> > Can you elaborate? IMO the user only cares about the label (device cookie
> > plus optional vPASID) which is generated by itself when doing the attaching
> > call, and expects this virtual label being used in various spots
> > (invalidation, page fault, etc.). How the system labels the traffic (the
> > physical RID or RID+PASID) should be completely invisible to userspace.
>
> I don't think that is true if the vIOMMU driver is also emulating
> PASID. Presumably the same is true for other PASID-like schemes.

Right. The idea for an SVA capable vIOMMU in my scheme is that the
hypervisor would set up an IOAS of address type "PASID+address" with the
mappings made by the guest according to its vIOMMU semantics. Then SVA
capable devices would be plugged into that IOAS by using "PASID+address"
type endpoints from those devices.

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you. NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC v2] /dev/iommu uAPI proposal
On Wed, Aug 04, 2021 at 11:07:42AM -0300, Jason Gunthorpe wrote:
> On Tue, Aug 03, 2021 at 11:58:54AM +1000, David Gibson wrote:
>
> > > I'd rather deduce the endpoint from a collection of devices than the
> > > other way around...
> >
> > Which I think is confusing, and in any case doesn't cover the case of
> > one "device" with multiple endpoints.
>
> Well they are both confusing, and I'd prefer to focus on the common
> case without extra mandatory steps. Exposing optional endpoint sharing
> information seems more in line with where everything is going than
> making endpoint sharing a first class object.
>
> AFAIK a device with multiple endpoints where those endpoints are
> shared with other devices doesn't really exist/or is useful? Eg PASID
> has multiple RIDs by they are not shared.

No, I can't think of a (non-contrived) example where a device would
have *both* multiple endpoints and those endpoints are shared amongst
multiple devices. I can easily think of examples where a device has
multiple (non shared) endpoints and where multiple devices share a
single endpoint.

The point is that making endpoints explicit separates the various
options here from the logic of the IOMMU layer itself. New device types
with new possibilities here means new interfaces *on those devices*,
but not new interfaces on /dev/iommu.

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you. NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
Re: [PATCH v3 09/25] iommu/sprd: Drop IOVA cookie management
On Thu, Aug 5, 2021 at 1:18 AM Robin Murphy wrote:
>
> The core code bakes its own cookies now.
>
> CC: Chunyan Zhang
> Signed-off-by: Robin Murphy

Thank you for the patch!

Acked-by: Chunyan Zhang

> ---
> v3: Also remove unneeded include
> ---
>  drivers/iommu/sprd-iommu.c | 7 -------
>  1 file changed, 7 deletions(-)
>
> diff --git a/drivers/iommu/sprd-iommu.c b/drivers/iommu/sprd-iommu.c
> index 73dfd9946312..27ac818b0354 100644
> --- a/drivers/iommu/sprd-iommu.c
> +++ b/drivers/iommu/sprd-iommu.c
> @@ -8,7 +8,6 @@
>
>  #include
>  #include
> -#include
>  #include
>  #include
>  #include
> @@ -144,11 +143,6 @@ static struct iommu_domain *sprd_iommu_domain_alloc(unsigned int domain_type)
>         if (!dom)
>                 return NULL;
>
> -       if (iommu_get_dma_cookie(&dom->domain)) {
> -               kfree(dom);
> -               return NULL;
> -       }
> -
>         spin_lock_init(&dom->pgtlock);
>
>         dom->domain.geometry.aperture_start = 0;
> @@ -161,7 +155,6 @@ static void sprd_iommu_domain_free(struct iommu_domain *domain)
>  {
>         struct sprd_iommu_domain *dom = to_sprd_domain(domain);
>
> -       iommu_put_dma_cookie(domain);
>         kfree(dom);
>  }
>
> --
> 2.25.1
RE: [RFC v2] /dev/iommu uAPI proposal
> From: Jason Gunthorpe
> Sent: Thursday, August 5, 2021 7:27 PM
>
> On Wed, Aug 04, 2021 at 10:59:21PM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe
> > > Sent: Wednesday, August 4, 2021 10:05 PM
> > >
> > > On Mon, Aug 02, 2021 at 02:49:44AM +0000, Tian, Kevin wrote:
> > >
> > > > Can you elaborate? IMO the user only cares about the label (device
> > > > cookie plus optional vPASID) which is generated by itself when doing
> > > > the attaching call, and expects this virtual label being used in
> > > > various spots (invalidation, page fault, etc.). How the system labels
> > > > the traffic (the physical RID or RID+PASID) should be completely
> > > > invisible to userspace.
> > >
> > > I don't think that is true if the vIOMMU driver is also emulating
> > > PASID. Presumably the same is true for other PASID-like schemes.
> >
> > I'm getting even more confused with this comment. Isn't it the
> > consensus from day one that physical PASID should not be exposed
> > to userspace as doing so breaks live migration?
>
> Uh, no?
>
> > with PASID emulation vIOMMU only cares about vPASID instead of
> > pPASID, and the uAPI only requires user to register vPASID instead
> > of reporting pPASID back to userspace...
>
> vPASID is only a feature of one device in existence, so we can't make
> vPASID mandatory.

Sure. My point is just that if vPASID is being emulated there is no
need to expose pPASID to user space. Can you give a concrete example
where pPASID must be exposed and how the user wants to use this
information?

Thanks,
Kevin
Re: [PATCH v7 2/9] ACPI/IORT: Add support for RMR node parsing
On 8/5/2021 7:03 PM, Lorenzo Pieralisi wrote: > On Thu, Aug 05, 2021 at 09:07:17AM +0100, Shameer Kolothum wrote: > > [...] > >> +static void __init iort_node_get_rmr_info(struct acpi_iort_node *iort_node) >> +{ >> +struct acpi_iort_node *smmu; >> +struct acpi_iort_rmr *rmr; >> +struct acpi_iort_rmr_desc *rmr_desc; >> +u32 map_count = iort_node->mapping_count; >> +u32 sid; >> +int i; >> + >> +if (!iort_node->mapping_offset || map_count != 1) { >> +pr_err(FW_BUG "Invalid ID mapping, skipping RMR node %p\n", >> + iort_node); >> +return; >> +} >> + >> +/* Retrieve associated smmu and stream id */ >> +smmu = iort_node_get_id(iort_node, , 0); >> +if (!smmu) { >> +pr_err(FW_BUG "Invalid SMMU reference, skipping RMR node %p\n", >> + iort_node); >> +return; >> +} >> + >> +/* Retrieve RMR data */ >> +rmr = (struct acpi_iort_rmr *)iort_node->node_data; >> +if (!rmr->rmr_offset || !rmr->rmr_count) { >> +pr_err(FW_BUG "Invalid RMR descriptor array, skipping RMR node >> %p\n", >> + iort_node); >> +return; >> +} >> + >> +rmr_desc = ACPI_ADD_PTR(struct acpi_iort_rmr_desc, iort_node, >> +rmr->rmr_offset); >> + >> +iort_rmr_desc_check_overlap(rmr_desc, rmr->rmr_count); >> + >> +for (i = 0; i < rmr->rmr_count; i++, rmr_desc++) { >> +struct iommu_resv_region *region; >> +enum iommu_resv_type type; >> +int prot = IOMMU_READ | IOMMU_WRITE; >> +u64 addr = rmr_desc->base_address, size = rmr_desc->length; >> + >> +if (!IS_ALIGNED(addr, SZ_64K) || !IS_ALIGNED(size, SZ_64K)) { >> +/* PAGE align base addr and size */ >> +addr &= PAGE_MASK; >> +size = PAGE_ALIGN(size + >> offset_in_page(rmr_desc->base_address)); >> + >> +pr_err(FW_BUG "RMR descriptor[0x%llx - 0x%llx] not >> aligned to 64K, continue with [0x%llx - 0x%llx]\n", >> + rmr_desc->base_address, >> + rmr_desc->base_address + rmr_desc->length - 1, >> + addr, addr + size - 1); >> +} >> +if (rmr->flags & IOMMU_RMR_REMAP_PERMITTED) { >> +type = IOMMU_RESV_DIRECT_RELAXABLE; >> +/* >> + * Set IOMMU_CACHE as IOMMU_RESV_DIRECT_RELAXABLE is >> 
+ * normally used for allocated system memory that is >> + * then used for device specific reserved regions. >> + */ >> +prot |= IOMMU_CACHE; >> +} else { >> +type = IOMMU_RESV_DIRECT; >> +/* >> + * Set IOMMU_MMIO as IOMMU_RESV_DIRECT is normally used >> + * for device memory like MSI doorbell. >> + */ >> +prot |= IOMMU_MMIO; >> +} > > On the prot value assignment based on the remapping flag, I'd like to > hear Robin/Joerg's opinion, I'd avoid being in a situation where > "normally" this would work but then we have to quirk it. > > Is this a valid assumption _always_ ? I think we enable quite a bit of platforms with this assumption, so IMHO it's a fair compromise for now. As per Jon's comment and oob discussions, in the long run the spec should probably be updated to include a way of explicitly specifying memory attributes. --- Thanks & Best Regards, Laurentiu > >> + >> +region = iommu_alloc_resv_region(addr, size, prot, type); >> +if (region) { >> +region->fw_data.rmr.flags = rmr->flags; >> +region->fw_data.rmr.sid = sid; >> +region->fw_data.rmr.smmu = smmu; >> +list_add_tail(>list, _rmr_list); >> +} >> +} >> +} >> + >> +static void __init iort_parse_rmr(void) >> +{ >> +struct acpi_iort_node *iort_node, *iort_end; >> +struct acpi_table_iort *iort; >> +int i; >> + >> +if (iort_table->revision < 3) >> +return; >> + >> +iort = (struct acpi_table_iort *)iort_table; >> + >> +iort_node = ACPI_ADD_PTR(struct acpi_iort_node, iort, >> + iort->node_offset); >> +iort_end = ACPI_ADD_PTR(struct acpi_iort_node, iort, >> +iort_table->length); >> + >> +for (i = 0; i < iort->node_count; i++) { >> +if (WARN_TAINT(iort_node >= iort_end, TAINT_FIRMWARE_WORKAROUND, >> + "IORT node pointer overflows, bad table!\n")) >> +return; >> + >> +if (iort_node->type == ACPI_IORT_NODE_RMR) >> +iort_node_get_rmr_info(iort_node); >> + >> +iort_node = ACPI_ADD_PTR(struct acpi_iort_node, iort_node, >> +
Re: [PATCH] iommu/arm-smmu-v3: Remove some unneeded init in arm_smmu_cmdq_issue_cmdlist()
On 2021-08-05 16:16, John Garry wrote:
> On 05/08/2021 15:41, Robin Murphy wrote:
> > > I suppose they could be combined into a smaller sub-struct and loaded
> > > in a single operation, but it looks messy, and prob without much gain.
> >
> > Indeed I wouldn't say that saving memory is the primary concern here,
> > and any more convoluted code is hardly going to help performance. Plus
> > it still wouldn't help the other cases where we're just copying the
> > size into a fake queue to do some prod arithmetic - I hadn't fully
> > clocked what was going on there when I skimmed through things earlier.
> >
> > Disregarding the bogus layout change, though, do you reckon the rest
> > of my idea makes sense?
>
> I tried the similar change to avoid zero-init the padding in
> arm_smmu_cmdq_write_entries() and the _arm_smmu_cmdq_poll_set_valid_map(),
> but the disassembly was the same. So the compiler must have got smart
> there.

Yeah, in my build __arm_smmu_cmdq_poll_set_valid_map() only uses 32
bytes of stack, so clearly it's managed to see through the macro magic
once queue_inc_prod_n() is inlined and elide the whole struct.
arm_smmu_cmdq_write_entries() is inlined already, but logically must be
the same deal since it's a similarly inlined queue_inc_prod_n().
However, that may all change if different compiler flags or a different
compiler lead to different inlining decisions, so I'd argue that if
this can matter anywhere then it's worth treating consistently
everywhere.

> But for the original change in this patch, it did make a difference.
> It's nice to remove what was a memcpy:
>
>     1770: a9077eff  stp xzr, xzr, [x23, #112]
>   }, head = llq;
>     1774: 9400      bl  0
>
> And performance was very fractionally better.

Heh, mine was this beauty:

  struct arm_smmu_ll_queue llq = {
    17d4: a9017f7f  stp xzr, xzr, [x27, #16]
    17d8: a9027f7f  stp xzr, xzr, [x27, #32]
    17dc: a9037f7f  stp xzr, xzr, [x27, #48]
    17e0: a9047f7f  stp xzr, xzr, [x27, #64]
  }, head = llq;
    17e4: b900c340  str w0, [x26, #192]
  {
    17e8: 290d0be1  stp w1, w2, [sp, #104]
  }, head = llq;
    17ec: a9440f62  ldp x2, x3, [x27, #64]
    17f0: a9007f5f  stp xzr, xzr, [x26]
    17f4: a9017f5f  stp xzr, xzr, [x26, #16]
    17f8: a9027f5f  stp xzr, xzr, [x26, #32]
    17fc: a9037f5f  stp xzr, xzr, [x26, #48]
    1800: a9040f42  stp x2, x3, [x26, #64]
  struct arm_smmu_ll_queue llq = {
    1804: a9057f7f  stp xzr, xzr, [x27, #80]
  }, head = llq;
    1808: a9057f5f  stp xzr, xzr, [x26, #80]
  struct arm_smmu_ll_queue llq = {
    180c: a9067f7f  stp xzr, xzr, [x27, #96]
  }, head = llq;
    1810: a9067f5f  stp xzr, xzr, [x26, #96]
  struct arm_smmu_ll_queue llq = {
    1814: a9077f7f  stp xzr, xzr, [x27, #112]
  }, head = llq;
    1818: a9077f5f  stp xzr, xzr, [x26, #112]
  struct arm_smmu_ll_queue llq = {
    181c: a9087f5f  stp xzr, xzr, [x26, #128]

> As for pre-evaluating "nents", I'm not sure how much that can help,
> but I am not too optimistic. I can try some testing when I get a
> chance. Having said that, I would need to check the disassembly also.

It'll just turn MOV,LDR,LSL sequences into plain LDRs - a small saving
but with no real downside, and a third of it is in the place where
doing less work matters most:

add/remove: 0/0 grow/shrink: 0/8 up/down: 0/-100 (-100)
Function                                     old    new   delta
arm_smmu_priq_thread                         532    528      -4
arm_smmu_evtq_thread                         368    364      -4
arm_smmu_device_probe                       4564   4556      -8
__arm_smmu_cmdq_poll_set_valid_map.isra      316    308      -8
arm_smmu_init_one_queue.isra                 320    308     -12
queue_remove_raw                             192    176     -16
arm_smmu_gerror_handler                      752    736     -16
arm_smmu_cmdq_issue_cmdlist                 1812   1780     -32
Total: Before=23776, After=23676, chg -0.42%

Robin.
Re: [PATCH v7 2/9] ACPI/IORT: Add support for RMR node parsing
On Thu, Aug 5, 2021 at 6:03 PM Lorenzo Pieralisi wrote: > > On Thu, Aug 05, 2021 at 09:07:17AM +0100, Shameer Kolothum wrote: > > [...] > > > +static void __init iort_node_get_rmr_info(struct acpi_iort_node *iort_node) > > +{ > > + struct acpi_iort_node *smmu; > > + struct acpi_iort_rmr *rmr; > > + struct acpi_iort_rmr_desc *rmr_desc; > > + u32 map_count = iort_node->mapping_count; > > + u32 sid; > > + int i; > > + > > + if (!iort_node->mapping_offset || map_count != 1) { > > + pr_err(FW_BUG "Invalid ID mapping, skipping RMR node %p\n", > > +iort_node); > > + return; > > + } > > + > > + /* Retrieve associated smmu and stream id */ > > + smmu = iort_node_get_id(iort_node, , 0); > > + if (!smmu) { > > + pr_err(FW_BUG "Invalid SMMU reference, skipping RMR node > > %p\n", > > +iort_node); > > + return; > > + } > > + > > + /* Retrieve RMR data */ > > + rmr = (struct acpi_iort_rmr *)iort_node->node_data; > > + if (!rmr->rmr_offset || !rmr->rmr_count) { > > + pr_err(FW_BUG "Invalid RMR descriptor array, skipping RMR > > node %p\n", > > +iort_node); > > + return; > > + } > > + > > + rmr_desc = ACPI_ADD_PTR(struct acpi_iort_rmr_desc, iort_node, > > + rmr->rmr_offset); > > + > > + iort_rmr_desc_check_overlap(rmr_desc, rmr->rmr_count); > > + > > + for (i = 0; i < rmr->rmr_count; i++, rmr_desc++) { > > + struct iommu_resv_region *region; > > + enum iommu_resv_type type; > > + int prot = IOMMU_READ | IOMMU_WRITE; > > + u64 addr = rmr_desc->base_address, size = rmr_desc->length; > > + > > + if (!IS_ALIGNED(addr, SZ_64K) || !IS_ALIGNED(size, SZ_64K)) { > > + /* PAGE align base addr and size */ > > + addr &= PAGE_MASK; > > + size = PAGE_ALIGN(size + > > offset_in_page(rmr_desc->base_address)); > > + > > + pr_err(FW_BUG "RMR descriptor[0x%llx - 0x%llx] not > > aligned to 64K, continue with [0x%llx - 0x%llx]\n", > > +rmr_desc->base_address, > > +rmr_desc->base_address + rmr_desc->length - 1, > > +addr, addr + size - 1); > > + } > > + if (rmr->flags & IOMMU_RMR_REMAP_PERMITTED) { > > 
+ type = IOMMU_RESV_DIRECT_RELAXABLE; > > + /* > > + * Set IOMMU_CACHE as IOMMU_RESV_DIRECT_RELAXABLE is > > + * normally used for allocated system memory that is > > + * then used for device specific reserved regions. > > + */ > > + prot |= IOMMU_CACHE; > > + } else { > > + type = IOMMU_RESV_DIRECT; > > + /* > > + * Set IOMMU_MMIO as IOMMU_RESV_DIRECT is normally > > used > > + * for device memory like MSI doorbell. > > + */ > > + prot |= IOMMU_MMIO; > > + } > > On the prot value assignment based on the remapping flag, I'd like to > hear Robin/Joerg's opinion, I'd avoid being in a situation where > "normally" this would work but then we have to quirk it. > > Is this a valid assumption _always_ ? These assumptions were made based on the historic use cases I could find reading the history. There aren't many known examples "in the wild" because so far we haven't had a mechanism other than quirks based around device-tree implementations. Ultimately I believe the proper solution will need to be another flag in the RMR table that specifies the type of memory an RMR Node describes, not just the base and length. -Jon > > Thanks, > Lorenzo > > > + > > + region = iommu_alloc_resv_region(addr, size, prot, type); > > + if (region) { > > + region->fw_data.rmr.flags = rmr->flags; > > + region->fw_data.rmr.sid = sid; > > + region->fw_data.rmr.smmu = smmu; > > + list_add_tail(>list, _rmr_list); > > + } > > + } > > +} > > + > > +static void __init iort_parse_rmr(void) > > +{ > > + struct acpi_iort_node *iort_node, *iort_end; > > + struct acpi_table_iort *iort; > > + int i; > > + > > + if (iort_table->revision < 3) > > + return; > > + > > + iort = (struct acpi_table_iort *)iort_table; > > + > > + iort_node = ACPI_ADD_PTR(struct acpi_iort_node, iort, > > + iort->node_offset); > > + iort_end = ACPI_ADD_PTR(struct acpi_iort_node, iort, > > + iort_table->length); > > + > > + for (i = 0; i < iort->node_count; i++) { > > + if (WARN_TAINT(iort_node >= iort_end, > >
Re: [PATCH v3] iommu/amd: Use report_iommu_fault()
Lennert,

FYI: I have made some comments in the V2 thread, specifically around
the new changes that we discussed in that thread.

Thanks,
Suravee
Re: [PATCH v2] iommu/amd: Use report_iommu_fault()
Lennert,

On 7/29/2021 9:32 PM, Lennert Buytenhek wrote:
> We have three cases to handle:
>
> - EVENT_FLAG_I set: IRQ remapping fault, don't call
>   report_iommu_fault()
>
> - EVENT_FLAG_I unset, but the request was a translation request
>   (EVENT_FLAG_TR set) or the target page was not present
>   (EVENT_FLAG_PR unset): call report_iommu_fault(), but the RW bit
>   will be invalid, so don't try to map it to a
>   IOMMU_FAULT_{READ,WRITE} code

So, why do we need to call report_iommu_fault() for this case? My
understanding is we only have IOMMU_FAULT_[READ|WRITE]. So, if we
cannot identify whether the DMA is a read or a write, we should not
need to call report_iommu_fault(), should we?

> - EVENT_FLAG_I unset, the request is a transaction request
>   (EVENT_FLAG_TR unset) and the target page was present
>   (EVENT_FLAG_PR set): call report_iommu_fault(), and use the RW bit
>   to set IOMMU_FAULT_{READ,WRITE}
>
> So I don't think we can merge the test for EVENT_FLAG_I with the test
> for EVENT_FLAG_TR/EVENT_FLAG_PR.

The only condition under which we would call report_iommu_fault() is
I=0, TR=0, PR=1, isn't it? So we should be able to just check if PR=1.

> We could do something like this, if you'd prefer:
>
> #define IS_IOMMU_MEM_TRANSACTION(flags) \
>         (((flags) & EVENT_FLAG_I) == 0)
>
> #define IS_RW_FLAG_VALID(flags) \
>         (((flags) & (EVENT_FLAG_TR | EVENT_FLAG_PR)) == EVENT_FLAG_PR)
>
> #define IS_WRITE_REQUEST(flags) \
>         (IS_RW_FLAG_VALID(flags) && (flags & EVENT_FLAG_RW))
>
> And then do something like:
>
>         if (dev_data && IS_IOMMU_MEM_TRANSACTION(flags)) {
>                 if (!report_iommu_fault(&dev_data->domain->domain,
>                                         &pdev->dev, address,
>                                         IS_WRITE_REQUEST(flags) ?
>                                                 IOMMU_FAULT_WRITE :
>                                                 IOMMU_FAULT_READ))

Actually, IS_WRITE_REQUEST() == 0 could mean:

- I=0, TR=0, PR=1 and RW=0: this is fine.
- I=0, (TR=1 or PR=0): we should not be calling report_iommu_fault()
  here, since we cannot specify READ/WRITE.

Thanks,
Suravee

>                         goto out;
>         }
Re: [PATCH v7 2/9] ACPI/IORT: Add support for RMR node parsing
On Thu, Aug 05, 2021 at 09:07:17AM +0100, Shameer Kolothum wrote:

[...]

> +static void __init iort_node_get_rmr_info(struct acpi_iort_node *iort_node)
> +{
> +        struct acpi_iort_node *smmu;
> +        struct acpi_iort_rmr *rmr;
> +        struct acpi_iort_rmr_desc *rmr_desc;
> +        u32 map_count = iort_node->mapping_count;
> +        u32 sid;
> +        int i;
> +
> +        if (!iort_node->mapping_offset || map_count != 1) {
> +                pr_err(FW_BUG "Invalid ID mapping, skipping RMR node %p\n",
> +                       iort_node);
> +                return;
> +        }
> +
> +        /* Retrieve associated smmu and stream id */
> +        smmu = iort_node_get_id(iort_node, &sid, 0);
> +        if (!smmu) {
> +                pr_err(FW_BUG "Invalid SMMU reference, skipping RMR node %p\n",
> +                       iort_node);
> +                return;
> +        }
> +
> +        /* Retrieve RMR data */
> +        rmr = (struct acpi_iort_rmr *)iort_node->node_data;
> +        if (!rmr->rmr_offset || !rmr->rmr_count) {
> +                pr_err(FW_BUG "Invalid RMR descriptor array, skipping RMR node %p\n",
> +                       iort_node);
> +                return;
> +        }
> +
> +        rmr_desc = ACPI_ADD_PTR(struct acpi_iort_rmr_desc, iort_node,
> +                                rmr->rmr_offset);
> +
> +        iort_rmr_desc_check_overlap(rmr_desc, rmr->rmr_count);
> +
> +        for (i = 0; i < rmr->rmr_count; i++, rmr_desc++) {
> +                struct iommu_resv_region *region;
> +                enum iommu_resv_type type;
> +                int prot = IOMMU_READ | IOMMU_WRITE;
> +                u64 addr = rmr_desc->base_address, size = rmr_desc->length;
> +
> +                if (!IS_ALIGNED(addr, SZ_64K) || !IS_ALIGNED(size, SZ_64K)) {
> +                        /* PAGE align base addr and size */
> +                        addr &= PAGE_MASK;
> +                        size = PAGE_ALIGN(size + offset_in_page(rmr_desc->base_address));
> +
> +                        pr_err(FW_BUG "RMR descriptor[0x%llx - 0x%llx] not aligned to 64K, continue with [0x%llx - 0x%llx]\n",
> +                               rmr_desc->base_address,
> +                               rmr_desc->base_address + rmr_desc->length - 1,
> +                               addr, addr + size - 1);
> +                }
> +                if (rmr->flags & IOMMU_RMR_REMAP_PERMITTED) {
> +                        type = IOMMU_RESV_DIRECT_RELAXABLE;
> +                        /*
> +                         * Set IOMMU_CACHE as IOMMU_RESV_DIRECT_RELAXABLE is
> +                         * normally used for allocated system memory that is
> +                         * then used for device specific reserved regions.
> +                         */
> +                        prot |= IOMMU_CACHE;
> +                } else {
> +                        type = IOMMU_RESV_DIRECT;
> +                        /*
> +                         * Set IOMMU_MMIO as IOMMU_RESV_DIRECT is normally used
> +                         * for device memory like MSI doorbell.
> +                         */
> +                        prot |= IOMMU_MMIO;
> +                }

On the prot value assignment based on the remapping flag, I'd like to
hear Robin/Joerg's opinion, I'd avoid being in a situation where
"normally" this would work but then we have to quirk it.

Is this a valid assumption _always_ ?

Thanks,
Lorenzo

> +
> +                region = iommu_alloc_resv_region(addr, size, prot, type);
> +                if (region) {
> +                        region->fw_data.rmr.flags = rmr->flags;
> +                        region->fw_data.rmr.sid = sid;
> +                        region->fw_data.rmr.smmu = smmu;
> +                        list_add_tail(&region->list, &iort_rmr_list);
> +                }
> +        }
> +}
> +
> +static void __init iort_parse_rmr(void)
> +{
> +        struct acpi_iort_node *iort_node, *iort_end;
> +        struct acpi_table_iort *iort;
> +        int i;
> +
> +        if (iort_table->revision < 3)
> +                return;
> +
> +        iort = (struct acpi_table_iort *)iort_table;
> +
> +        iort_node = ACPI_ADD_PTR(struct acpi_iort_node, iort,
> +                                 iort->node_offset);
> +        iort_end = ACPI_ADD_PTR(struct acpi_iort_node, iort,
> +                                iort_table->length);
> +
> +        for (i = 0; i < iort->node_count; i++) {
> +                if (WARN_TAINT(iort_node >= iort_end, TAINT_FIRMWARE_WORKAROUND,
> +                               "IORT node pointer overflows, bad table!\n"))
> +                        return;
> +
> +                if (iort_node->type == ACPI_IORT_NODE_RMR)
> +                        iort_node_get_rmr_info(iort_node);
> +
> +                iort_node = ACPI_ADD_PTR(struct acpi_iort_node, iort_node,
> +                                         iort_node->length);
> +        }
> +}
>
>  static void __init iort_init_platform_devices(void)
>  {
> @@ -1636,6 +1767,7 @@ void __init acpi_iort_init(void)
>  	}
>
>  	iort_init_platform_devices();
> +	iort_parse_rmr();
>  }
>
>  #ifdef CONFIG_ZONE_DMA
> --
> 2.17.1
Re: [PATCH V2 11/14] x86/Swiotlb: Add Swiotlb bounce buffer remap function for HV IVM
Hi Konrad: Could you have a look at this new version? The change since v1 is to make swiotlb_init_io_tlb_mem() return an error code when dma_map_decrypted() fails, according to your previous comment. If this change is ok, could you give your ack? This series needs to be merged via the Hyper-V next tree. Thanks.

On 8/5/2021 2:45 AM, Tianyu Lan wrote:
From: Tianyu Lan

In an Isolation VM with AMD SEV, the bounce buffer needs to be accessed via an extra address space above shared_gpa_boundary (e.g. the 39-bit address line) reported by the Hyper-V CPUID ISOLATION_CONFIG leaf. The accessed physical address will be the original physical address + shared_gpa_boundary. In the AMD SEV-SNP spec, shared_gpa_boundary is called the virtual top of memory (vTOM). Memory addresses below vTOM are automatically treated as private, while memory above vTOM is treated as shared.

Use dma_map_decrypted() in the swiotlb code, store the remap address returned, and use that remap address to copy data from/to the swiotlb bounce buffer.

Signed-off-by: Tianyu Lan
---
Change since v1:
 * Make swiotlb_init_io_tlb_mem() return an error code and return an
   error when dma_map_decrypted() fails.

Signed-off-by: Tianyu Lan
---
 include/linux/swiotlb.h |  4
 kernel/dma/swiotlb.c    | 32
 2 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index f507e3eacbea..584560ecaa8e 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -72,6 +72,9 @@ extern enum swiotlb_force swiotlb_force;
  * @end: The end address of the swiotlb memory pool. Used to do a quick
  *	range check to see if the memory was in fact allocated by this
  *	API.
+ * @vaddr: The vaddr of the swiotlb memory pool. The swiotlb
+ *	memory pool may be remapped in the memory encrypted case and store
+ *	virtual address for bounce buffer operation.
  * @nslabs: The number of IO TLB blocks (in groups of 64) between @start and
  *	@end. For default swiotlb, this is command line adjustable via
  *	setup_io_tlb_npages.
@@ -89,6 +92,7 @@ extern enum swiotlb_force swiotlb_force;
 struct io_tlb_mem {
 	phys_addr_t start;
 	phys_addr_t end;
+	void *vaddr;
 	unsigned long nslabs;
 	unsigned long used;
 	unsigned int index;
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 1fa81c096c1d..29b6d888ef3b 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -176,7 +176,7 @@ void __init swiotlb_update_mem_attributes(void)
 	memset(vaddr, 0, bytes);
 }

-static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
+static int swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
 				    unsigned long nslabs, bool late_alloc)
 {
 	void *vaddr = phys_to_virt(start);
@@ -194,14 +194,21 @@ static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
 		mem->slots[i].alloc_size = 0;
 	}

-	set_memory_decrypted((unsigned long)vaddr, bytes >> PAGE_SHIFT);
-	memset(vaddr, 0, bytes);
+	mem->vaddr = dma_map_decrypted(vaddr, bytes);
+	if (!mem->vaddr) {
+		pr_err("Failed to decrypt memory.\n");
+		return -ENOMEM;
+	}
+
+	memset(mem->vaddr, 0, bytes);
+	return 0;
 }

 int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)
 {
 	struct io_tlb_mem *mem;
 	size_t alloc_size;
+	int ret;

 	if (swiotlb_force == SWIOTLB_NO_FORCE)
 		return 0;
@@ -216,7 +223,11 @@ int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)
 		panic("%s: Failed to allocate %zu bytes align=0x%lx\n",
 		      __func__, alloc_size, PAGE_SIZE);

-	swiotlb_init_io_tlb_mem(mem, __pa(tlb), nslabs, false);
+	ret = swiotlb_init_io_tlb_mem(mem, __pa(tlb), nslabs, false);
+	if (ret) {
+		memblock_free(__pa(mem), alloc_size);
+		return ret;
+	}

 	io_tlb_default_mem = mem;
 	if (verbose)
@@ -304,6 +315,8 @@ int
 swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
 {
 	struct io_tlb_mem *mem;
+	int size = get_order(struct_size(mem, slots, nslabs));
+	int ret;

 	if (swiotlb_force == SWIOTLB_NO_FORCE)
 		return 0;
@@ -312,12 +325,15 @@ swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
 	if (WARN_ON_ONCE(io_tlb_default_mem))
 		return -ENOMEM;

-	mem = (void *)__get_free_pages(GFP_KERNEL,
-				       get_order(struct_size(mem, slots, nslabs)));
+	mem = (void *)__get_free_pages(GFP_KERNEL, size);
 	if (!mem)
 		return -ENOMEM;

-	swiotlb_init_io_tlb_mem(mem, virt_to_phys(tlb), nslabs, true);
+	ret = swiotlb_init_io_tlb_mem(mem, virt_to_phys(tlb), nslabs,
Re: [PATCH V2 10/14] DMA: Add dma_map_decrypted/dma_unmap_encrypted() function
Hi Christoph: Could you have a look at this patch? It adds a new API, dma_map_decrypted(), to decrypt and remap memory. It will be used in the swiotlb code. Thanks.

On 8/5/2021 2:45 AM, Tianyu Lan wrote:
From: Tianyu Lan

In a Hyper-V Isolation VM with AMD SEV, the swiotlb bounce buffer needs to be mapped into the address space above vTOM, so introduce dma_map_decrypted()/dma_unmap_encrypted() to map/unmap bounce buffer memory. The platform can populate the map/unmap callbacks in the dma memory decrypted ops.

Signed-off-by: Tianyu Lan
---
 include/linux/dma-map-ops.h |  9 +
 kernel/dma/mapping.c        | 22 ++
 2 files changed, 31 insertions(+)

diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index 0d53a96a3d64..01d60a024e45 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -71,6 +71,11 @@ struct dma_map_ops {
 	unsigned long (*get_merge_boundary)(struct device *dev);
 };

+struct dma_memory_decrypted_ops {
+	void *(*map)(void *addr, unsigned long size);
+	void (*unmap)(void *addr);
+};
+
 #ifdef CONFIG_DMA_OPS
 #include
@@ -374,6 +379,10 @@ static inline void debug_dma_dump_mappings(struct device *dev)
 }
 #endif /* CONFIG_DMA_API_DEBUG */

+void *dma_map_decrypted(void *addr, unsigned long size);
+int dma_unmap_decrypted(void *addr, unsigned long size);
+
 extern const struct dma_map_ops dma_dummy_ops;
+extern struct dma_memory_decrypted_ops dma_memory_generic_decrypted_ops;
 #endif /* _LINUX_DMA_MAP_OPS_H */
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 2b06a809d0b9..6fb150dc1750 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -13,11 +13,13 @@
 #include
 #include
 #include
+#include
 #include "debug.h"
 #include "direct.h"

 bool dma_default_coherent;
+struct dma_memory_decrypted_ops dma_memory_generic_decrypted_ops;

 /*
  * Managed DMA API
  */
@@ -736,3 +738,23 @@ unsigned long dma_get_merge_boundary(struct device *dev)
 	return ops->get_merge_boundary(dev);
 }
 EXPORT_SYMBOL_GPL(dma_get_merge_boundary);
+
+void *dma_map_decrypted(void *addr, unsigned long size)
+{
+	if (set_memory_decrypted((unsigned long)addr,
+				 size / PAGE_SIZE))
+		return NULL;
+
+	if (dma_memory_generic_decrypted_ops.map)
+		return dma_memory_generic_decrypted_ops.map(addr, size);
+	else
+		return addr;
+}
+
+int dma_unmap_encrypted(void *addr, unsigned long size)
+{
+	if (dma_memory_generic_decrypted_ops.unmap)
+		dma_memory_generic_decrypted_ops.unmap(addr);
+
+	return set_memory_encrypted((unsigned long)addr, size / PAGE_SIZE);
+}
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH V2 03/14] x86/set_memory: Add x86_set_memory_enc static call support
On 8/5/2021 10:29 PM, Dave Hansen wrote:
On 8/5/21 7:23 AM, Peter Zijlstra wrote:
This is assuming any of this is actually performance critical, based off of this using static_call() to begin with.

This code is not performance critical. I think I sent folks off on a wild goose chase when I asked that we make an effort to optimize code that does:

	if (some_hyperv_check())
		foo();

	if (some_amd_feature_check())
		bar();

with checks that will actually compile away when Hyper-V or some_amd_feature() is disabled. That's less about performance and just about good hygiene. I *wanted* to see cpu_feature_enabled(X86_FEATURE...) checks. Someone suggested using static calls, and off we went... Could we please just use cpu_feature_enabled()?

Yes, cpu_feature_enabled() works. The goal is just to run the platform-specific code after a platform check. I will update this in the next version.
Re: [PATCH v7 4/9] ACPI/IORT: Add a helper to retrieve RMR memory regions
On Thu, Aug 05, 2021 at 09:07:19AM +0100, Shameer Kolothum wrote: > Add a helper function (iort_iommu_get_rmrs()) that retrieves RMR > memory descriptors associated with a given IOMMU. This will be used > by IOMMU drivers to setup necessary mappings. > > Invoke it from the generic helper iommu_dma_get_rmrs(). > > Signed-off-by: Shameer Kolothum > --- > drivers/acpi/arm64/iort.c | 38 ++ > drivers/iommu/dma-iommu.c | 4 > include/linux/acpi_iort.h | 7 +++ > 3 files changed, 49 insertions(+) Reviewed-by: Lorenzo Pieralisi > diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c > index d76ba46ebe67..3c32d62e63b6 100644 > --- a/drivers/acpi/arm64/iort.c > +++ b/drivers/acpi/arm64/iort.c > @@ -809,6 +809,42 @@ static struct acpi_iort_node > *iort_get_msi_resv_iommu(struct device *dev) > return NULL; > } > > +/** > + * iort_iommu_get_rmrs() - Helper to retrieve RMR info associated with IOMMU > + * @iommu_fwnode: fwnode for the IOMMU > + * @head: RMR list head to be populated > + * > + * Returns: 0 on success, <0 failure. Please note, we will keep the already > + * allocated RMR reserve regions in case of a kmemdup() > + * failure. > + */ > +int iort_iommu_get_rmrs(struct fwnode_handle *iommu_fwnode, > + struct list_head *head) > +{ > + struct iommu_resv_region *e; > + struct acpi_iort_node *iommu; > + int rmrs = 0; > + > + iommu = iort_get_iort_node(iommu_fwnode); > + if (!iommu || list_empty(_rmr_list)) > + return -ENODEV; > + > + list_for_each_entry(e, _rmr_list, list) { > + struct iommu_resv_region *region; > + > + if (e->fw_data.rmr.smmu != iommu) > + continue; > + > + region = kmemdup(e, sizeof(*region), GFP_KERNEL); > + if (region) { > + list_add_tail(>list, head); > + rmrs++; > + } > + } > + > + return (rmrs == 0) ? 
-ENODEV : 0; > +} > + > /** > * iort_iommu_msi_get_resv_regions - Reserved region driver helper > * @dev: Device from iommu_get_resv_regions() > @@ -1041,6 +1077,8 @@ int iort_iommu_msi_get_resv_regions(struct device *dev, > struct list_head *head) > { return 0; } > int iort_iommu_configure_id(struct device *dev, const u32 *input_id) > { return -ENODEV; } > +int iort_iommu_get_rmrs(struct fwnode_handle *fwnode, struct list_head *head) > +{ return -ENODEV; } > #endif > > static int nc_dma_get_range(struct device *dev, u64 *size) > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c > index 2fa2445e9070..1b6e27475279 100644 > --- a/drivers/iommu/dma-iommu.c > +++ b/drivers/iommu/dma-iommu.c > @@ -185,6 +185,9 @@ EXPORT_SYMBOL(iommu_put_dma_cookie); > int iommu_dma_get_rmrs(struct fwnode_handle *iommu_fwnode, > struct list_head *list) > { > + if (!is_of_node(iommu_fwnode)) > + return iort_iommu_get_rmrs(iommu_fwnode, list); > + > return -EINVAL; > } > EXPORT_SYMBOL(iommu_dma_get_rmrs); > @@ -200,6 +203,7 @@ EXPORT_SYMBOL(iommu_dma_get_rmrs); > void iommu_dma_put_rmrs(struct fwnode_handle *iommu_fwnode, > struct list_head *list) > { > + generic_iommu_put_resv_regions(iommu_fwnode->dev, list); > } > EXPORT_SYMBOL(iommu_dma_put_rmrs); > > diff --git a/include/linux/acpi_iort.h b/include/linux/acpi_iort.h > index f1f0842a2cb2..d8c030c103f5 100644 > --- a/include/linux/acpi_iort.h > +++ b/include/linux/acpi_iort.h > @@ -38,6 +38,8 @@ int iort_dma_get_ranges(struct device *dev, u64 *size); > int iort_iommu_configure_id(struct device *dev, const u32 *id_in); > int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head > *head); > phys_addr_t acpi_iort_dma_get_max_cpu_address(void); > +int iort_iommu_get_rmrs(struct fwnode_handle *iommu_fwnode, > + struct list_head *list); > #else > static inline void acpi_iort_init(void) { } > static inline u32 iort_msi_map_id(struct device *dev, u32 id) > @@ -57,6 +59,11 @@ int 
iort_iommu_msi_get_resv_regions(struct device *dev, > struct list_head *head) > > static inline phys_addr_t acpi_iort_dma_get_max_cpu_address(void) > { return PHYS_ADDR_MAX; } > + > +static inline > +int iort_iommu_get_rmrs(struct fwnode_handle *iommu_fwnode, > + struct list_head *list) > +{ return -ENODEV; } > #endif > > #endif /* __ACPI_IORT_H__ */ > -- > 2.17.1 > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/arm-smmu-v3: Remove some unneeded init in arm_smmu_cmdq_issue_cmdlist()
On 05/08/2021 15:41, Robin Murphy wrote:
I suppose they could be combined into a smaller sub-struct and loaded in a single operation, but it looks messy, and prob without much gain.

Indeed I wouldn't say that saving memory is the primary concern here, and any more convoluted code is hardly going to help performance. Plus it still wouldn't help the other cases where we're just copying the size into a fake queue to do some prod arithmetic - I hadn't fully clocked what was going on there when I skimmed through things earlier. Disregarding the bogus layout change, though, do you reckon the rest of my idea makes sense?

I tried the similar change to avoid zero-initing the padding in arm_smmu_cmdq_write_entries() and _arm_smmu_cmdq_poll_set_valid_map(), but the disassembly was the same. So the compiler must have got smart there. But for the original change in this patch, it did make a difference. It's nice to remove what was a memcpy:

	1770: a9077eff	stp	xzr, xzr, [x23, #112]
	}, head = llq;
	1774: 9400	bl	0

And performance was very fractionally better.

As for pre-evaluating "nents", I'm not sure how much that can help, but I am not too optimistic. I can try some testing when I get a chance. Having said that, I would need to check the disassembly also.

Thanks,
John
Re: [PATCH] iommu/arm-smmu-v3: Remove some unneeded init in arm_smmu_cmdq_issue_cmdlist()
On 2021-08-05 14:40, John Garry wrote:
On 05/08/2021 12:24, Robin Murphy wrote:
On 2021-06-21 17:36, John Garry wrote:
Members of struct "llq" will be zero-inited, apart from member max_n_shift. But we write llq.val straight after the init, so it was pointless to zero init those other members. As such, separately init member max_n_shift only.

In addition, struct "head" is initialised to "llq" only so that member max_n_shift is set. But that member is never referenced for "head", so remove any init there.

Removing these initializations is seen as a small performance optimisation, as this code is (very) hot path.

I looked at this and immediately thought "surely the compiler can see that all the prod/cons/val fields are written anyway and elide the initialisation?", so I dumped the before and after disassembly, and... oh. You should probably clarify that it's zero-initialising all the cacheline padding which is both pointless and painful. With that,

Reviewed-by: Robin Murphy

However, having looked this closely I'm now tangentially wondering why max_n_shift isn't inside the padded union? It's read at the same time as both prod and cons by queue_has_space(), and never updated, so there doesn't appear to be any benefit to it being in a separate cacheline all by itself, and llq is already twice as big as it needs to be.

I think that the problem is if the prod+cons 64b value and the shift are on the same cacheline, then we have a chance of accessing a stale cacheline twice:

static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
				       u64 *cmds, int n, bool sync)
{
	u64 cmd_sync[CMDQ_ENT_DWORDS];
	u32 prod;
	unsigned long flags;
	bool owner;
	struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
	struct arm_smmu_ll_queue llq = {
		.max_n_shift = cmdq->q.llq.max_n_shift,	// here
	}, head = llq;
	int ret = 0;

	/* 1. Allocate some space in the queue */
	local_irq_save(flags);
	llq.val = READ_ONCE(cmdq->q.llq.val);	// and again here

since cmdq->q.llq is per-SMMU. If max_n_shift is on a separate cacheline, then it should never be stale.

Ah, right, even though the accesses are always going to be close together, I suppose it could still technically cause some false sharing if someone else is trying to update prod at exactly the right time. I guess that might be why we need the explicit padding there in the first place, it's just a shame that it ends up wasting even more space with implicit padding at the end too (and I have a vague memory that trying to force member alignment and structure packing at the same time doesn't work well). Oh well.

I suppose they could be combined into a smaller sub-struct and loaded in a single operation, but it looks messy, and prob without much gain.

Indeed I wouldn't say that saving memory is the primary concern here, and any more convoluted code is hardly going to help performance. Plus it still wouldn't help the other cases where we're just copying the size into a fake queue to do some prod arithmetic - I hadn't fully clocked what was going on there when I skimmed through things earlier. Disregarding the bogus layout change, though, do you reckon the rest of my idea makes sense?

Cheers,
Robin.
Re: [PATCH V2 03/14] x86/set_memory: Add x86_set_memory_enc static call support
On 8/5/21 7:23 AM, Peter Zijlstra wrote:
> This is assuming any of this is actually performance critical, based off
> of this using static_call() to begin with.

This code is not performance critical. I think I sent folks off on a wild goose chase when I asked that we make an effort to optimize code that does:

	if (some_hyperv_check())
		foo();

	if (some_amd_feature_check())
		bar();

with checks that will actually compile away when Hyper-V or some_amd_feature() is disabled. That's less about performance and just about good hygiene. I *wanted* to see cpu_feature_enabled(X86_FEATURE...) checks. Someone suggested using static calls, and off we went... Could we please just use cpu_feature_enabled()?
Re: [PATCH V2 03/14] x86/set_memory: Add x86_set_memory_enc static call support
On Thu, Aug 05, 2021 at 10:05:17PM +0800, Tianyu Lan wrote:
> static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
> {
> +	return static_call(x86_set_memory_enc)(addr, numpages, enc);
> }

Hurpmh... So with a bit of 'luck' you get code-gen like:

__set_memory_enc_dec:
	jmp __SCT_x86_set_memory_enc;

set_memory_encrypted:
	mov $1, %rdx
	jmp __set_memory_enc_dec

set_memory_decrypted:
	mov $0, %rdx
	jmp __set_memory_enc_dec

Which, to me, seems exceedingly daft. Best to make all 3 of those inlines and use EXPORT_STATIC_CALL_TRAMP_GPL(x86_set_memory_enc) or something.

This is assuming any of this is actually performance critical, based off of this using static_call() to begin with.
Re: [PATCH V2 03/14] x86/set_memory: Add x86_set_memory_enc static call support
On Thu, Aug 05, 2021 at 10:05:17PM +0800, Tianyu Lan wrote:
> +static int default_set_memory_enc(unsigned long addr, int numpages, bool enc)
> +{
> +	return 0;
> +}
> +
> +DEFINE_STATIC_CALL(x86_set_memory_enc, default_set_memory_enc);

That's spelled:

	DEFINE_STATIC_CALL_RET0(x86_set_memory_enc, __set_memory_enc_dec);

And then you can remove the default_set_memory_enc() thing.
Re: [PATCH v7 0/9] ACPI/IORT: Support for IORT RMR node
On Thu, 5 Aug 2021 at 15:35, Shameerali Kolothum Thodi wrote: > > > > > -Original Message- > > From: Ard Biesheuvel [mailto:a...@kernel.org] > > Sent: 05 August 2021 14:23 > > To: Shameerali Kolothum Thodi > > Cc: Linux ARM ; ACPI Devel Maling List > > ; Linux IOMMU > > ; Linuxarm ; > > Lorenzo Pieralisi ; Joerg Roedel > > ; Robin Murphy ; Will Deacon > > ; wanghuiqiang ; Guohanjun > > (Hanjun Guo) ; Steven Price > > ; Sami Mujawar ; Jon > > Nettleton ; Eric Auger ; > > yangyicong > > Subject: Re: [PATCH v7 0/9] ACPI/IORT: Support for IORT RMR node > > > > On Thu, 5 Aug 2021 at 10:10, Shameer Kolothum > > wrote: > > > > > > Hi, > > > > > > The series adds support to IORT RMR nodes specified in IORT > > > Revision E.b -ARM DEN 0049E[0]. RMR nodes are used to describe > > > memory ranges that are used by endpoints and require a unity > > > mapping in SMMU. > > > > > > We have faced issues with 3408iMR RAID controller cards which > > > fail to boot when SMMU is enabled. This is because these > > > controllers make use of host memory for various caching related > > > purposes and when SMMU is enabled the iMR firmware fails to > > > access these memory regions as there is no mapping for them. > > > IORT RMR provides a way for UEFI to describe and report these > > > memory regions so that the kernel can make a unity mapping for > > > these in SMMU. > > > > > > > Does this mean we are ignoring the RMR memory ranges, and exposing the > > entire physical address space to devices using the stream IDs in > > question? > > Nope. RMR node is used to describe the memory ranges used by end points > behind SMMU. And this information is used to create 1 : 1 mappings for those > ranges in SMMU. Anything outside those ranges will result in translation > fault(if there are no other dynamic DMA mappings). > Excellent! It was not obvious to me from looking at the patches, so I had to ask. Thanks, Ard. 
> > > > > > Change History: > > > > > > v6 --> v7 > > > > > > The only change from v6 is the fix pointed out by Steve to > > > the SMMUv2 SMR bypass install in patch #8. > > > > > > Thanks to the Tested-by tags by Laurentiu with SMMUv2 and > > > Hanjun/Huiqiang with SMMUv3 for v6. I haven't added the tags > > > yet as the series still needs more review[1]. > > > > > > Feedback and tests on this series is very much appreciated. > > > > > > v5 --> v6 > > > - Addressed comments from Robin & Lorenzo. > > > : Moved iort_parse_rmr() to acpi_iort_init() from > > > iort_init_platform_devices(). > > > : Removed use of struct iort_rmr_entry during the initial > > > parse. Using struct iommu_resv_region instead. > > > : Report RMR address alignment and overlap errors, but continue. > > > : Reworked arm_smmu_init_bypass_stes() (patch # 6). > > > - Updated SMMUv2 bypass SMR code. Thanks to Jon N (patch #8). > > > - Set IOMMU protection flags(IOMMU_CACHE, IOMMU_MMIO) based > > > on Type of RMR region. Suggested by Jon N. > > > > > > Thanks, > > > Shameer > > > [0] https://developer.arm.com/documentation/den0049/latest/ > > > [1] > > https://lore.kernel.org/linux-acpi/20210716083442.1708-1-shameerali.koloth > > um.th...@huawei.com/T/#m043c95b869973a834b2fd57f3e1ed0325c84f3b7 > > > -- > > > v4 --> v5 > > > -Added a fw_data union to struct iommu_resv_region and removed > > > struct iommu_rmr (Based on comments from Joerg/Robin). > > > -Added iommu_put_rmrs() to release mem. > > > -Thanks to Steve for verifying on SMMUv2, but not added the Tested-by > > > yet because of the above changes. > > > > > > v3 -->v4 > > > -Included the SMMUv2 SMR bypass install changes suggested by > > > Steve(patch #7) > > > -As per Robin's comments, RMR reserve implementation is now > > > more generic (patch #8) and dropped v3 patches 8 and 10. > > > -Rebase to 5.13-rc1 > > > > > > RFC v2 --> v3 > > > -Dropped RFC tag as the ACPICA header changes are now ready to be > > > part of 5.13[0]. 
But this series still has a dependency on that patch. > > > -Added IORT E.b related changes(node flags, _DSM function 5 checks for > > > PCIe). > > > -Changed RMR to stream id mapping from M:N to M:1 as per the spec and > > > discussion here[1]. > > > -Last two patches add support for SMMUv2(Thanks to Jon Nettleton!) > > > -- > > > > > > Jon Nettleton (1): > > > iommu/arm-smmu: Get associated RMR info and install bypass SMR > > > > > > Shameer Kolothum (8): > > > iommu: Introduce a union to struct iommu_resv_region > > > ACPI/IORT: Add support for RMR node parsing > > > iommu/dma: Introduce generic helper to retrieve RMR info > > > ACPI/IORT: Add a helper to retrieve RMR memory regions > > > iommu/arm-smmu-v3: Introduce strtab init helper > > > iommu/arm-smmu-v3: Refactor arm_smmu_init_bypass_stes() to force > > > bypass > > > iommu/arm-smmu-v3: Get associated RMR info and install bypass STE > > > iommu/dma: Reserve any RMR regions associated with a dev > > > > > > drivers/acpi/arm64/iort.c
Re: [PATCH V2 03/14] x86/set_memory: Add x86_set_memory_enc static call support
Hi Dave: Thanks for the review.

On 8/5/2021 3:27 AM, Dave Hansen wrote:
On 8/4/21 11:44 AM, Tianyu Lan wrote:
+static int default_set_memory_enc(unsigned long addr, int numpages, bool enc);
+DEFINE_STATIC_CALL(x86_set_memory_enc, default_set_memory_enc);
+
 #define CPA_FLUSHTLB 1
 #define CPA_ARRAY 2
 #define CPA_PAGES_ARRAY 4
@@ -1981,6 +1985,11 @@ int set_memory_global(unsigned long addr, int numpages)
 }

 static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
+{
+	return static_call(x86_set_memory_enc)(addr, numpages, enc);
+}
+
+static int default_set_memory_enc(unsigned long addr, int numpages, bool enc)
 {
 	struct cpa_data cpa;
 	int ret;

It doesn't make a lot of difference to add this infrastructure and then ignore it for the existing in-tree user:

static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
{
	struct cpa_data cpa;
	int ret;

	/* Nothing to do if memory encryption is not active */
	if (!mem_encrypt_active())
		return 0;

Shouldn't the default be to just "return 0"? Then on mem_encrypt_active() systems, do the bulk of what is in __set_memory_enc_dec() today.

OK. I tried moving the code in __set_memory_enc_dec() to the SEV file mem_encrypt.c, and this requires exposing the cpa functions and structure. Please have a look. Tom, Joerg and Brijesh, could you review the SEV code change? Thanks.
diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h
index 43fa081a1adb..991366612deb 100644
--- a/arch/x86/include/asm/set_memory.h
+++ b/arch/x86/include/asm/set_memory.h
@@ -4,6 +4,25 @@
 #include
 #include
+#include
+
+/*
+ * The current flushing context - we pass it instead of 5 arguments:
+ */
+struct cpa_data {
+	unsigned long	*vaddr;
+	pgd_t		*pgd;
+	pgprot_t	mask_set;
+	pgprot_t	mask_clr;
+	unsigned long	numpages;
+	unsigned long	curpage;
+	unsigned long	pfn;
+	unsigned int	flags;
+	unsigned int	force_split	: 1,
+			force_static_prot : 1,
+			force_flush_all : 1;
+	struct page	**pages;
+};

 /*
  * The set_memory_* API can be used to change various attributes of a virtual
@@ -83,6 +102,11 @@ int set_pages_rw(struct page *page, int numpages);
 int set_direct_map_invalid_noflush(struct page *page);
 int set_direct_map_default_noflush(struct page *page);
 bool kernel_page_present(struct page *page);
+int __change_page_attr_set_clr(struct cpa_data *cpa, int checkalias);
+void cpa_flush(struct cpa_data *data, int cache);
+
+int dummy_set_memory_enc(unsigned long addr, int numpages, bool enc);
+DECLARE_STATIC_CALL(x86_set_memory_enc, dummy_set_memory_enc);

 extern int kernel_set_to_readonly;
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index ff08dc463634..49e957c4191f 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -20,6 +20,8 @@
 #include
 #include
 #include
+#include
+#include

 #include
 #include
@@ -178,6 +180,45 @@ void __init sme_map_bootdata(char *real_mode_data)
 	__sme_early_map_unmap_mem(__va(cmdline_paddr), COMMAND_LINE_SIZE, true);
 }

+static int sev_set_memory_enc(unsigned long addr, int numpages, bool enc)
+{
+	struct cpa_data cpa;
+	int ret;
+
+	/* Should not be working on unaligned addresses */
+	if (WARN_ONCE(addr & ~PAGE_MASK, "misaligned address: %#lx\n", addr))
+		addr &= PAGE_MASK;
+
+	memset(&cpa, 0, sizeof(cpa));
+	cpa.vaddr = &addr;
+	cpa.numpages = numpages;
+	cpa.mask_set = enc ? __pgprot(_PAGE_ENC) : __pgprot(0);
+	cpa.mask_clr = enc ? __pgprot(0) : __pgprot(_PAGE_ENC);
+	cpa.pgd = init_mm.pgd;
+
+	/* Must avoid aliasing mappings in the highmem code */
+	kmap_flush_unused();
+	vm_unmap_aliases();
+
+	/*
+	 * Before changing the encryption attribute, we need to flush caches.
+	 */
+	cpa_flush(&cpa, !this_cpu_has(X86_FEATURE_SME_COHERENT));
+
+	ret = __change_page_attr_set_clr(&cpa, 1);
+
+	/*
+	 * After changing the encryption attribute, we need to flush TLBs again
+	 * in case any speculative TLB caching occurred (but no need to flush
+	 * caches again). We could just use cpa_flush_all(), but in case TLB
+	 * flushing gets optimized in the cpa_flush() path use the same logic
+	 * as above.
+	 */
+	cpa_flush(&cpa, 0);
+
+	return ret;
+}
+
 void __init sme_early_init(void)
 {
 	unsigned int i;
@@ -185,6 +226,8 @@ void __init sme_early_init(void)
 	if (!sme_me_mask)
 		return;

+	static_call_update(x86_set_memory_enc, sev_set_memory_enc);
+
 	early_pmd_flags = __sme_set(early_pmd_flags);
 	__supported_pte_mask = __sme_set(__supported_pte_mask);
diff --git a/arch/x86/mm/pat/set_memory.c
Re: [PATCH] iommu/arm-smmu-v3: Remove some unneeded init in arm_smmu_cmdq_issue_cmdlist()
On 05/08/2021 12:24, Robin Murphy wrote:
On 2021-06-21 17:36, John Garry wrote:
Members of struct "llq" will be zero-inited, apart from member max_n_shift. But we write llq.val straight after the init, so it was pointless to zero init those other members. As such, separately init member max_n_shift only.

In addition, struct "head" is initialised to "llq" only so that member max_n_shift is set. But that member is never referenced for "head", so remove any init there.

Removing these initializations is seen as a small performance optimisation, as this code is (very) hot path.

I looked at this and immediately thought "surely the compiler can see that all the prod/cons/val fields are written anyway and elide the initialisation?", so I dumped the before and after disassembly, and... oh. You should probably clarify that it's zero-initialising all the cacheline padding which is both pointless and painful. With that,

Reviewed-by: Robin Murphy

However, having looked this closely I'm now tangentially wondering why max_n_shift isn't inside the padded union? It's read at the same time as both prod and cons by queue_has_space(), and never updated, so there doesn't appear to be any benefit to it being in a separate cacheline all by itself, and llq is already twice as big as it needs to be.

I think that the problem is if the prod+cons 64b value and the shift are on the same cacheline, then we have a chance of accessing a stale cacheline twice:

static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
				       u64 *cmds, int n, bool sync)
{
	u64 cmd_sync[CMDQ_ENT_DWORDS];
	u32 prod;
	unsigned long flags;
	bool owner;
	struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
	struct arm_smmu_ll_queue llq = {
		.max_n_shift = cmdq->q.llq.max_n_shift,	// here
	}, head = llq;
	int ret = 0;

	/* 1. Allocate some space in the queue */
	local_irq_save(flags);
	llq.val = READ_ONCE(cmdq->q.llq.val);	// and again here

since cmdq->q.llq is per-SMMU. If max_n_shift is on a separate cacheline, then it should never be stale.

I suppose they could be combined into a smaller sub-struct and loaded in a single operation, but it looks messy, and prob without much gain.

Thanks,
John

Sorting that would also be a good opportunity to store the value of interest in its appropriate form so we're not needlessly recalculating 1 << shift every flippin' time...

Robin.

Signed-off-by: John Garry

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 54b2f27b81d4..8a8ad49bb7fd 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -727,11 +727,11 @@ static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
 	unsigned long flags;
 	bool owner;
 	struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
-	struct arm_smmu_ll_queue llq = {
-		.max_n_shift = cmdq->q.llq.max_n_shift,
-	}, head = llq;
+	struct arm_smmu_ll_queue llq, head;
 	int ret = 0;

+	llq.max_n_shift = cmdq->q.llq.max_n_shift;
+
 	/* 1. Allocate some space in the queue */
 	local_irq_save(flags);
 	llq.val = READ_ONCE(cmdq->q.llq.val);
...
RE: [PATCH v7 0/9] ACPI/IORT: Support for IORT RMR node
> -Original Message- > From: Ard Biesheuvel [mailto:a...@kernel.org] > Sent: 05 August 2021 14:23 > To: Shameerali Kolothum Thodi > Cc: Linux ARM ; ACPI Devel Maling List > ; Linux IOMMU > ; Linuxarm ; > Lorenzo Pieralisi ; Joerg Roedel > ; Robin Murphy ; Will Deacon > ; wanghuiqiang ; Guohanjun > (Hanjun Guo) ; Steven Price > ; Sami Mujawar ; Jon > Nettleton ; Eric Auger ; > yangyicong > Subject: Re: [PATCH v7 0/9] ACPI/IORT: Support for IORT RMR node > > On Thu, 5 Aug 2021 at 10:10, Shameer Kolothum > wrote: > > > > Hi, > > > > The series adds support to IORT RMR nodes specified in IORT > > Revision E.b -ARM DEN 0049E[0]. RMR nodes are used to describe > > memory ranges that are used by endpoints and require a unity > > mapping in SMMU. > > > > We have faced issues with 3408iMR RAID controller cards which > > fail to boot when SMMU is enabled. This is because these > > controllers make use of host memory for various caching related > > purposes and when SMMU is enabled the iMR firmware fails to > > access these memory regions as there is no mapping for them. > > IORT RMR provides a way for UEFI to describe and report these > > memory regions so that the kernel can make a unity mapping for > > these in SMMU. > > > > Does this mean we are ignoring the RMR memory ranges, and exposing the > entire physical address space to devices using the stream IDs in > question? Nope. RMR node is used to describe the memory ranges used by end points behind SMMU. And this information is used to create 1 : 1 mappings for those ranges in SMMU. Anything outside those ranges will result in translation fault(if there are no other dynamic DMA mappings). Thanks, Shameer > > > Change History: > > > > v6 --> v7 > > > > The only change from v6 is the fix pointed out by Steve to > > the SMMUv2 SMR bypass install in patch #8. > > > > Thanks to the Tested-by tags by Laurentiu with SMMUv2 and > > Hanjun/Huiqiang with SMMUv3 for v6. 
I haven't added the tags > > yet as the series still needs more review[1]. > > > > Feedback and tests on this series is very much appreciated. > > > > v5 --> v6 > > - Addressed comments from Robin & Lorenzo. > > : Moved iort_parse_rmr() to acpi_iort_init() from > > iort_init_platform_devices(). > > : Removed use of struct iort_rmr_entry during the initial > > parse. Using struct iommu_resv_region instead. > > : Report RMR address alignment and overlap errors, but continue. > > : Reworked arm_smmu_init_bypass_stes() (patch # 6). > > - Updated SMMUv2 bypass SMR code. Thanks to Jon N (patch #8). > > - Set IOMMU protection flags(IOMMU_CACHE, IOMMU_MMIO) based > > on Type of RMR region. Suggested by Jon N. > > > > Thanks, > > Shameer > > [0] https://developer.arm.com/documentation/den0049/latest/ > > [1] > https://lore.kernel.org/linux-acpi/20210716083442.1708-1-shameerali.koloth > um.th...@huawei.com/T/#m043c95b869973a834b2fd57f3e1ed0325c84f3b7 > > -- > > v4 --> v5 > > -Added a fw_data union to struct iommu_resv_region and removed > > struct iommu_rmr (Based on comments from Joerg/Robin). > > -Added iommu_put_rmrs() to release mem. > > -Thanks to Steve for verifying on SMMUv2, but not added the Tested-by > > yet because of the above changes. > > > > v3 -->v4 > > -Included the SMMUv2 SMR bypass install changes suggested by > > Steve(patch #7) > > -As per Robin's comments, RMR reserve implementation is now > > more generic (patch #8) and dropped v3 patches 8 and 10. > > -Rebase to 5.13-rc1 > > > > RFC v2 --> v3 > > -Dropped RFC tag as the ACPICA header changes are now ready to be > > part of 5.13[0]. But this series still has a dependency on that patch. > > -Added IORT E.b related changes(node flags, _DSM function 5 checks for > > PCIe). > > -Changed RMR to stream id mapping from M:N to M:1 as per the spec and > > discussion here[1]. > > -Last two patches add support for SMMUv2(Thanks to Jon Nettleton!) 
> > -- > > > > Jon Nettleton (1): > > iommu/arm-smmu: Get associated RMR info and install bypass SMR > > > > Shameer Kolothum (8): > > iommu: Introduce a union to struct iommu_resv_region > > ACPI/IORT: Add support for RMR node parsing > > iommu/dma: Introduce generic helper to retrieve RMR info > > ACPI/IORT: Add a helper to retrieve RMR memory regions > > iommu/arm-smmu-v3: Introduce strtab init helper > > iommu/arm-smmu-v3: Refactor arm_smmu_init_bypass_stes() to force > > bypass > > iommu/arm-smmu-v3: Get associated RMR info and install bypass STE > > iommu/dma: Reserve any RMR regions associated with a dev > > > > drivers/acpi/arm64/iort.c | 172 > +++- > > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 76 +++-- > > drivers/iommu/arm/arm-smmu/arm-smmu.c | 48 ++ > > drivers/iommu/dma-iommu.c | 89 +- > > include/linux/acpi_iort.h | 7 + > > include/linux/dma-iommu.h | 13 ++ > > include/linux/iommu.h
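[Editor's note] For readers unfamiliar with what an RMR "unity mapping" means for translation behaviour, here is a minimal, self-contained C model of the semantics Shameer describes. This is illustrative only, not driver code: the region table, addresses, and function names are made up. IOVAs inside a described region pass through 1:1; anything outside all regions models a translation fault.

```c
#include <stdint.h>
#include <stddef.h>

struct rmr_region { uint64_t base; uint64_t size; };

/* One hypothetical RMR node: 1MB of host memory at 0x80000000 that the
 * endpoint firmware uses for its caching purposes. */
static const struct rmr_region rmrs[] = {
    { 0x80000000ULL, 0x100000ULL },
};

/* Returns the output address, or UINT64_MAX to model a translation fault. */
static uint64_t smmu_translate(uint64_t iova)
{
    for (size_t i = 0; i < sizeof(rmrs) / sizeof(rmrs[0]); i++)
        if (iova >= rmrs[i].base && iova - rmrs[i].base < rmrs[i].size)
            return iova;  /* unity (1:1) mapping */
    return UINT64_MAX;    /* outside all RMRs: translation fault */
}
```

So, rather than bypassing the SMMU for the whole physical address space, only the described ranges are reachable by the stream IDs in question.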
Re: [PATCH v10 01/17] iova: Export alloc_iova_fast() and free_iova_fast()
On 2021/8/5 8:34 PM, Yongji Xie wrote: My main point, though, is that if you've already got something else keeping track of the actual addresses, then the way you're using an iova_domain appears to be something you could do with a trivial bitmap allocator. That's why I don't buy the efficiency argument. The main design points of the IOVA allocator are to manage large address spaces while trying to maximise spatial locality to minimise the underlying pagetable usage, and allocating with a flexible limit to support multiple devices with different addressing capabilities in the same address space. If none of those aspects are relevant to the use-case - which AFAICS appears to be true here - then as a general-purpose resource allocator it's rubbish and has an unreasonably massive memory overhead and there are many, many better choices. OK, I get your point. Actually we used the genpool allocator in the early version. Maybe we can fall back to using it. I think maybe you can share some perf numbers to see how much alloc_iova_fast() can help. Thanks ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
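[Editor's note] Robin's "trivial bitmap allocator" remark can be made concrete with a few lines of self-contained C. This is a hypothetical sketch, not VDUSE code: it shows how little machinery fixed-size slot allocation needs when spatial locality and per-device address limits don't matter.

```c
#include <stdint.h>

/* 64 fixed-size slots tracked in one word: the "trivial bitmap
 * allocator" alternative to a full iova_domain. A bit set means the
 * slot is in use. */
static uint64_t slot_map;

static int slot_alloc(void)
{
    for (int i = 0; i < 64; i++) {
        if (!(slot_map & (1ULL << i))) {
            slot_map |= 1ULL << i;
            return i;   /* slot index; caller scales by slot size */
        }
    }
    return -1;          /* no free slot */
}

static void slot_free(int i)
{
    slot_map &= ~(1ULL << i);
}
```

The memory overhead here is one word, versus the rbtree nodes and per-CPU caches an iova_domain carries.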
Re: [PATCH v7 0/9] ACPI/IORT: Support for IORT RMR node
On Thu, 5 Aug 2021 at 10:10, Shameer Kolothum wrote: > > Hi, > > The series adds support to IORT RMR nodes specified in IORT > Revision E.b -ARM DEN 0049E[0]. RMR nodes are used to describe > memory ranges that are used by endpoints and require a unity > mapping in SMMU. > > We have faced issues with 3408iMR RAID controller cards which > fail to boot when SMMU is enabled. This is because these > controllers make use of host memory for various caching related > purposes and when SMMU is enabled the iMR firmware fails to > access these memory regions as there is no mapping for them. > IORT RMR provides a way for UEFI to describe and report these > memory regions so that the kernel can make a unity mapping for > these in SMMU. > Does this mean we are ignoring the RMR memory ranges, and exposing the entire physical address space to devices using the stream IDs in question? > Change History: > > v6 --> v7 > > The only change from v6 is the fix pointed out by Steve to > the SMMUv2 SMR bypass install in patch #8. > > Thanks to the Tested-by tags by Laurentiu with SMMUv2 and > Hanjun/Huiqiang with SMMUv3 for v6. I haven't added the tags > yet as the series still needs more review[1]. > > Feedback and tests on this series is very much appreciated. > > v5 --> v6 > - Addressed comments from Robin & Lorenzo. > : Moved iort_parse_rmr() to acpi_iort_init() from > iort_init_platform_devices(). > : Removed use of struct iort_rmr_entry during the initial > parse. Using struct iommu_resv_region instead. > : Report RMR address alignment and overlap errors, but continue. > : Reworked arm_smmu_init_bypass_stes() (patch # 6). > - Updated SMMUv2 bypass SMR code. Thanks to Jon N (patch #8). > - Set IOMMU protection flags(IOMMU_CACHE, IOMMU_MMIO) based > on Type of RMR region. Suggested by Jon N. 
> > Thanks, > Shameer > [0] https://developer.arm.com/documentation/den0049/latest/ > [1] > https://lore.kernel.org/linux-acpi/20210716083442.1708-1-shameerali.kolothum.th...@huawei.com/T/#m043c95b869973a834b2fd57f3e1ed0325c84f3b7 > -- > v4 --> v5 > -Added a fw_data union to struct iommu_resv_region and removed > struct iommu_rmr (Based on comments from Joerg/Robin). > -Added iommu_put_rmrs() to release mem. > -Thanks to Steve for verifying on SMMUv2, but not added the Tested-by > yet because of the above changes. > > v3 -->v4 > -Included the SMMUv2 SMR bypass install changes suggested by > Steve(patch #7) > -As per Robin's comments, RMR reserve implementation is now > more generic (patch #8) and dropped v3 patches 8 and 10. > -Rebase to 5.13-rc1 > > RFC v2 --> v3 > -Dropped RFC tag as the ACPICA header changes are now ready to be > part of 5.13[0]. But this series still has a dependency on that patch. > -Added IORT E.b related changes(node flags, _DSM function 5 checks for > PCIe). > -Changed RMR to stream id mapping from M:N to M:1 as per the spec and > discussion here[1]. > -Last two patches add support for SMMUv2(Thanks to Jon Nettleton!) 
> -- > > Jon Nettleton (1): > iommu/arm-smmu: Get associated RMR info and install bypass SMR > > Shameer Kolothum (8): > iommu: Introduce a union to struct iommu_resv_region > ACPI/IORT: Add support for RMR node parsing > iommu/dma: Introduce generic helper to retrieve RMR info > ACPI/IORT: Add a helper to retrieve RMR memory regions > iommu/arm-smmu-v3: Introduce strtab init helper > iommu/arm-smmu-v3: Refactor arm_smmu_init_bypass_stes() to force > bypass > iommu/arm-smmu-v3: Get associated RMR info and install bypass STE > iommu/dma: Reserve any RMR regions associated with a dev > > drivers/acpi/arm64/iort.c | 172 +++- > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 76 +++-- > drivers/iommu/arm/arm-smmu/arm-smmu.c | 48 ++ > drivers/iommu/dma-iommu.c | 89 +- > include/linux/acpi_iort.h | 7 + > include/linux/dma-iommu.h | 13 ++ > include/linux/iommu.h | 11 ++ > 7 files changed, 393 insertions(+), 23 deletions(-) > > -- > 2.17.1 > > > ___ > linux-arm-kernel mailing list > linux-arm-ker...@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Re: [PATCH v7 04/12] iommu/mediatek: Add device_link between the consumer and the larb devices
On 30.07.21 04:52, Yong Wu wrote: The MediaTek IOMMU-SMI diagram is like below: all the consumers connect with a smi-larb, which then connects with smi-common.

      M4U
       |
  smi-common
       |
   +---+---+ ...
   |       |
 larb1   larb2
   |       |
 vdec    venc

When a consumer works, it should enable the smi-larb's power, which in turn requires the smi-common's power to be enabled first. Thus, first use a device link to connect the consumer and the smi-larbs, then add a device link between the smi-larb and smi-common. This patch adds a device_link between the consumer and the larbs. When calling device_link_add(), I add the flag DL_FLAG_STATELESS to avoid calling pm_runtime_xx, to keep the original status of the clocks. It avoids two issues: 1) Display HW shows the fastlogo abnormally, reported in [1]. At the beginning, all the clocks are enabled before entering the kernel, but the clocks for the display HW (always in larb0) will be gated after clk_enable and clk_disable are called from device_link_add(->pm_runtime_resume) and rpm_idle. The clock operation happens before the display driver probes, and at that time the display HW will be abnormal. 2) A deadlock issue reported in [2]. Use DL_FLAG_STATELESS to skip pm_runtime_xx to avoid the deadlock. Correspondingly, DL_FLAG_AUTOREMOVE_CONSUMER can't be added, so device_link_removed should be called explicitly. [1] https://lore.kernel.org/linux-mediatek/1564213888.22908.4.camel@mhfsdcap03/ [2] https://lore.kernel.org/patchwork/patch/1086569/ Suggested-by: Tomasz Figa Signed-off-by: Yong Wu Tested-by: Dafna Hirschfeld # on mt8173 Hi, unfortunately, I have to take back the Tested-by tag. I am now testing the mtk-vcodec with the latest kernel + patches sent from the mailing list: https://gitlab.collabora.com/eballetbo/linux/-/commits/topic/chromeos/chromeos-5.14 which includes this patchset. On chromeos I open a video conference with Google Meet, which causes the mtk-vcodec vp8 encoder to run. 
If I kill it with `killall -9 chrome` I get some page fault messages from the iommu: [ 837.255952] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read [ 837.265696] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read [ 837.282367] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read [ 837.299028] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read [ 837.315683] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read [ 837.332345] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read [ 837.349004] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read [ 837.365665] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read [ 837.382329] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read [ 837.42] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read In addition, running the encoder tests from the shell: sudo --user=#1000 /usr/local/libexec/chrome-binary-tests/video_encode_accelerator_tests --gtest_filter=VideoEncoderTest.FlushAtEndOfStream_Multiple* --codec=vp8 /usr/local/share/tast/data/chromiumos/tast/local/bundles/cros/video/data/tulip2-320x180.yuv --disable_validator At some point it fails with the error [ 5472.161821] [MTK_V4L2][ERROR] mtk_vcodec_wait_for_done_ctx:32: [290] ctx->type=1, cmd=1, wait_event_interruptible_timeout time=1000ms out 0 0! [ 5472.174678] [MTK_VCODEC][ERROR][290]: vp8_enc_encode_frame() irq_status=0 failed [ 5472.182687] [MTK_V4L2][ERROR] mtk_venc_worker:1239: venc_if_encode failed=-5 If you have any idea of what might be the problem or how to debug? 
Thanks, Dafna --- drivers/iommu/mtk_iommu.c| 22 ++ drivers/iommu/mtk_iommu_v1.c | 20 +++- 2 files changed, 41 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index a02dde094788..ee742900cf4b 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -571,22 +571,44 @@ static struct iommu_device *mtk_iommu_probe_device(struct device *dev) { struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); struct mtk_iommu_data *data; + struct device_link *link; + struct device *larbdev; + unsigned int larbid; if (!fwspec || fwspec->ops != &mtk_iommu_ops) return ERR_PTR(-ENODEV); /* Not an iommu client device */ data = dev_iommu_priv_get(dev); + /* +* Link the consumer device with the smi-larb device (supplier). +* The device in each larb is an independent HW, thus only link +* one larb here. +*/ + larbid = MTK_M4U_TO_LARB(fwspec->ids[0]); + larbdev = data->larb_imu[larbid].dev; +
Re: [PATCH v10 01/17] iova: Export alloc_iova_fast() and free_iova_fast()
On Wed, Aug 4, 2021 at 11:43 PM Robin Murphy wrote: > > On 2021-08-04 06:02, Yongji Xie wrote: > > On Tue, Aug 3, 2021 at 6:54 PM Robin Murphy wrote: > >> > >> On 2021-08-03 09:54, Yongji Xie wrote: > >>> On Tue, Aug 3, 2021 at 3:41 PM Jason Wang wrote: > > > 在 2021/7/29 下午3:34, Xie Yongji 写道: > > Export alloc_iova_fast() and free_iova_fast() so that > > some modules can use it to improve iova allocation efficiency. > > > It's better to explain why alloc_iova() is not sufficient here. > > >>> > >>> Fine. > >> > >> What I fail to understand from the later patches is what the IOVA domain > >> actually represents. If the "device" is a userspace process then > >> logically the "IOVA" would be the userspace address, so presumably > >> somewhere you're having to translate between this arbitrary address > >> space and actual usable addresses - if you're worried about efficiency > >> surely it would be even better to not do that? > >> > > > > Yes, userspace daemon needs to translate the "IOVA" in a DMA > > descriptor to the VA (from mmap(2)). But this actually doesn't affect > > performance since it's an identical mapping in most cases. > > I'm not familiar with the vhost_iotlb stuff, but it looks suspiciously > like you're walking yet another tree to make those translations. Even if > the buffer can be mapped all at once with a fixed offset such that each > DMA mapping call doesn't need a lookup for each individual "IOVA" - that > might be what's happening already, but it's a bit hard to follow just > reading the patches in my mail client - vhost_iotlb_add_range() doesn't > look like it's super-cheap to call, and you're serialising on a lock for > that. > Yes, that's true. Since the software IOTLB is not used in the VM case, we need a unified way (vhost_iotlb) to manage the IOVA mapping for both VM and Container cases. 
> My main point, though, is that if you've already got something else > keeping track of the actual addresses, then the way you're using an > iova_domain appears to be something you could do with a trivial bitmap > allocator. That's why I don't buy the efficiency argument. The main > design points of the IOVA allocator are to manage large address spaces > while trying to maximise spatial locality to minimise the underlying > pagetable usage, and allocating with a flexible limit to support > multiple devices with different addressing capabilities in the same > address space. If none of those aspects are relevant to the use-case - > which AFAICS appears to be true here - then as a general-purpose > resource allocator it's rubbish and has an unreasonably massive memory > overhead and there are many, many better choices. > OK, I get your point. Actually we used the genpool allocator in the early version. Maybe we can fall back to using it. > FWIW I've recently started thinking about moving all the caching stuff > out of iova_domain and into the iommu-dma layer since it's now a giant > waste of space for all the other current IOVA users. > > >> Presumably userspace doesn't have any concern about alignment and the > >> things we have to worry about for the DMA API in general, so it's pretty > >> much just allocating slots in a buffer, and there are far more effective > >> ways to do that than a full-blown address space manager. > > > > Considering iova allocation efficiency, I think the iova allocator is > > better here. In most cases, we don't even need to hold a spin lock > > during iova allocation. > > > >> If you're going > >> to reuse any infrastructure I'd have expected it to be SWIOTLB rather > >> than the IOVA allocator. Because, y'know, you're *literally implementing > >> a software I/O TLB* ;) > >> > > > > But actually what we can reuse in SWIOTLB is the IOVA allocator. > > Huh? 
Those are completely unrelated and orthogonal things - SWIOTLB does > not use an external allocator (see find_slots()). By SWIOTLB I mean > specifically the library itself, not dma-direct or any of the other > users built around it. The functionality for managing slots in a buffer > and bouncing data in and out can absolutely be reused - that's why users > like the Xen and iommu-dma code *are* reusing it instead of open-coding > their own versions. > I see. Actually the slots management in SWIOTLB is what I mean by IOVA allocator. > > And > > the IOVA management in SWIOTLB is not what we want. For example, > > SWIOTLB allocates and uses contiguous memory for bouncing, which is > > not necessary in VDUSE case. > > alloc_iova() allocates a contiguous (in IOVA address) region of space. > In vduse_domain_map_page() you use it to allocate a contiguous region of > space from your bounce buffer. Can you clarify how that is fundamentally > different from allocating a contiguous region of space from a bounce > buffer? Nobody's saying the underlying implementation details of where > the buffer itself comes from can't be tweaked. > I mean physically contiguous memory here. We
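[Editor's note] The find_slots()-style behaviour under discussion — locating a contiguous run of free slots in a bounce buffer — can be modelled with a self-contained first-fit search in plain C. This is a toy sketch only; SWIOTLB's real implementation additionally handles alignment constraints, wrap-around, and locking, and the names here are invented.

```c
/* First-fit search for n contiguous free slots in a 16-slot bounce
 * buffer, roughly the job SWIOTLB's slot management performs. */
#define NSLOTS 16
static unsigned char used[NSLOTS];  /* nonzero = slot in use */

static int slots_alloc(int n)
{
    for (int i = 0; i + n <= NSLOTS; i++) {
        int j;
        /* Check whether slots i..i+n-1 are all free. */
        for (j = 0; j < n && !used[i + j]; j++)
            ;
        if (j == n) {
            for (j = 0; j < n; j++)
                used[i + j] = 1;
            return i;   /* first slot of the run */
        }
    }
    return -1;          /* no contiguous run available */
}
```

The point being argued is that this contiguous-run search is the only "allocator" the bounce-buffer case actually needs, independent of where the buffer's backing memory comes from.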
Re: [PATCH] iommu/arm-smmu-v3: Remove some unneeded init in arm_smmu_cmdq_issue_cmdlist()
On 2021-08-05 12:24, Robin Murphy wrote: On 2021-06-21 17:36, John Garry wrote: Members of struct "llq" will be zero-inited, apart from member max_n_shift. But we write llq.val straight after the init, so it was pointless to zero init those other members. As such, separately init member max_n_shift only. In addition, struct "head" is initialised to "llq" only so that member max_n_shift is set. But that member is never referenced for "head", so remove any init there. Removing these initializations is seen as a small performance optimisation, as this code is (very) hot path. I looked at this and immediately thought "surely the compiler can see that all the prod/cons/val fields are written anyway and elide the initialisation?", so I dumped the before and after disassembly, and... oh. You should probably clarify that it's zero-initialising all the cacheline padding which is both pointless and painful. With that, Reviewed-by: Robin Murphy However, having looked this closely I'm now tangentially wondering why max_n_shift isn't inside the padded union? It's read at the same time as both prod and cons by queue_has_space(), and never updated, so there doesn't appear to be any benefit to it being in a separate cacheline all by itself, and llq is already twice as big as it needs to be. Sorting that would also be a good opportunity to store the value of interest in its appropriate form so we're not needlessly recalculating 1 << shift every flippin' time... ...on which note, how about something like this on top? (untested since I don't have any SMMUv3 hardware to hand) Robin. ->8- Subject: [PATCH] iommu/arm-smmu-v3: Improve arm_smmu_ll_queue efficiency Once initialised, max_n_shift is only ever read at the same time as accessing prod or cons, thus should not have any impact on contention to justify keeping it in its own separate cacheline. Move it inside the padding union to halve the size of struct arm_smmu_ll_queue. 
Even then, though, there are a couple more spots in the command issuing path where we could do without the overhead of zeroing even one cache line worth of padding, so avoid implicit initialisation of those temporary structures as was done at the top level in arm_smmu_cmdq_issue_cmdlist(). Furthermore, the shift value is only directly relevant for initially setting up the relevant queue base register; all we care about after that is the number of entries, so store that value instead once a queue is initialised and avoid needlessly recalculating it everywhere. Signed-off-by: Robin Murphy --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 35 +++-- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 29 ++--- 2 files changed, 37 insertions(+), 27 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 47610dc5d920..bc55217d6d61 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -111,7 +111,7 @@ static bool queue_has_space(struct arm_smmu_ll_queue *q, u32 n) cons = Q_IDX(q, q->cons); if (Q_WRP(q, q->prod) == Q_WRP(q, q->cons)) - space = (1 << q->max_n_shift) - (prod - cons); + space = q->nents - (prod - cons); else space = cons - prod; @@ -517,10 +517,11 @@ static void __arm_smmu_cmdq_poll_set_valid_map(struct arm_smmu_cmdq *cmdq, u32 sprod, u32 eprod, bool set) { u32 swidx, sbidx, ewidx, ebidx; - struct arm_smmu_ll_queue llq = { - .max_n_shift = cmdq->q.llq.max_n_shift, - .prod = sprod, - }; + struct arm_smmu_ll_queue llq; + + /* Avoid zero-initialising all the padding */ + llq.nents = cmdq->q.llq.nents; + llq.prod = sprod; ewidx = BIT_WORD(Q_IDX(&llq, eprod)); ebidx = Q_IDX(&llq, eprod) % BITS_PER_LONG; @@ -696,10 +697,11 @@ static void arm_smmu_cmdq_write_entries(struct arm_smmu_cmdq *cmdq, u64 *cmds, u32 prod, int n) { int i; - struct arm_smmu_ll_queue llq = { - .max_n_shift = cmdq->q.llq.max_n_shift, - .prod = prod, - }; + struct arm_smmu_ll_queue llq; + + 
/* Avoid zero-initialising all the padding */ + llq.nents = cmdq->q.llq.nents; + llq.prod = prod; for (i = 0; i < n; ++i) { u64 *cmd = &cmds[i * CMDQ_ENT_DWORDS]; @@ -736,7 +738,7 @@ static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu, struct arm_smmu_ll_queue llq, head; int ret = 0; - llq.max_n_shift = cmdq->q.llq.max_n_shift; + llq.nents = cmdq->q.llq.nents; /* 1. Allocate some space in the queue */ local_irq_save(flags); @@ -2845,16 +2847,18 @@ static int arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
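[Editor's note] The wrap-bit arithmetic behind queue_has_space() is easy to get wrong, so here is a standalone C model of it. NENTS and the IDX/WRP macros are simplifications of the driver's Q_IDX/Q_WRP (which take shift parameters from the queue structure), not the actual definitions; the computation matches the patched form `space = q->nents - (prod - cons)`.

```c
#include <stdint.h>

#define NENTS  8u                     /* queue entries (power of two) */
#define IDX(x) ((x) & (NENTS - 1))    /* index bits */
#define WRP(x) ((x) & NENTS)          /* wrap bit, one above the index */

/* Free entries in a circular queue whose prod/cons carry a wrap bit. */
static uint32_t queue_space(uint32_t prod, uint32_t cons)
{
    if (WRP(prod) == WRP(cons))
        /* Same lap: used = prod - cons. */
        return NENTS - (IDX(prod) - IDX(cons));
    /* prod has wrapped once more than cons. */
    return IDX(cons) - IDX(prod);
}
```

An empty queue (prod == cons, same wrap) reports NENTS free; a full one (indices equal, wrap bits differing) reports zero.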
Re: [RFC v2] /dev/iommu uAPI proposal
On Wed, Aug 04, 2021 at 10:59:21PM +, Tian, Kevin wrote: > > From: Jason Gunthorpe > > Sent: Wednesday, August 4, 2021 10:05 PM > > > > On Mon, Aug 02, 2021 at 02:49:44AM +, Tian, Kevin wrote: > > > > > Can you elaborate? IMO the user only cares about the label (device cookie > > > plus optional vPASID) which is generated by itself when doing the > > > attaching > > > call, and expects this virtual label being used in various spots > > > (invalidation, > > > page fault, etc.). How the system labels the traffic (the physical RID or > > > RID+ > > > PASID) should be completely invisible to userspace. > > > > I don't think that is true if the vIOMMU driver is also emulating > > PASID. Presumably the same is true for other PASID-like schemes. > > > > I'm getting even more confused with this comment. Isn't it the > consensus from day one that physical PASID should not be exposed > to userspace as doing so breaks live migration? Uh, no? > with PASID emulation vIOMMU only cares about vPASID instead of > pPASID, and the uAPI only requires user to register vPASID instead > of reporting pPASID back to userspace... vPASID is only a feature of one device in existence, so we can't make vPASID mandatory. Jason
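[Editor's note] As background for the vPASID/pPASID exchange: the reason physical PASIDs can stay invisible to userspace is that the kernel can keep a per-device translation table between the guest-visible vPASID and whatever physical PASID it actually programs. The following is a toy, self-contained C sketch of that indirection — hypothetical names and numbers, not the proposed uAPI.

```c
/* Hypothetical per-device table mapping guest-visible vPASIDs to the
 * physical PASIDs programmed into the IOMMU. Userspace registers
 * vPASIDs only; pPASIDs never leave the kernel, so they can change
 * across live migration without the guest noticing. */
#define NR_VPASIDS 4

static int ppasid_of[NR_VPASIDS] = { -1, -1, -1, -1 }; /* -1 = unattached */
static int next_ppasid = 100;  /* stand-in for a real pPASID allocator */

static int vpasid_attach(int vpasid)
{
    if (vpasid < 0 || vpasid >= NR_VPASIDS || ppasid_of[vpasid] >= 0)
        return -1;  /* out of range or already attached */
    ppasid_of[vpasid] = next_ppasid++;
    return 0;
}

/* Lookup used when e.g. an invalidation arrives labelled with a vPASID. */
static int vpasid_to_ppasid(int vpasid)
{
    return (vpasid >= 0 && vpasid < NR_VPASIDS) ? ppasid_of[vpasid] : -1;
}
```

Jason's caveat still applies: this indirection only exists for devices that have PASID-like schemes at all, so the uAPI cannot make the vPASID field mandatory.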
Re: [PATCH] iommu/arm-smmu-v3: Remove some unneeded init in arm_smmu_cmdq_issue_cmdlist()
On 2021-06-21 17:36, John Garry wrote: Members of struct "llq" will be zero-inited, apart from member max_n_shift. But we write llq.val straight after the init, so it was pointless to zero init those other members. As such, separately init member max_n_shift only. In addition, struct "head" is initialised to "llq" only so that member max_n_shift is set. But that member is never referenced for "head", so remove any init there. Removing these initializations is seen as a small performance optimisation, as this code is (very) hot path. I looked at this and immediately thought "surely the compiler can see that all the prod/cons/val fields are written anyway and elide the initialisation?", so I dumped the before and after disassembly, and... oh. You should probably clarify that it's zero-initialising all the cacheline padding which is both pointless and painful. With that, Reviewed-by: Robin Murphy However, having looked this closely I'm now tangentially wondering why max_n_shift isn't inside the padded union? It's read at the same time as both prod and cons by queue_has_space(), and never updated, so there doesn't appear to be any benefit to it being in a separate cacheline all by itself, and llq is already twice as big as it needs to be. Sorting that would also be a good opportunity to store the value of interest in its appropriate form so we're not needlessly recalculating 1 << shift every flippin' time... Robin. 
Signed-off-by: John Garry diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 54b2f27b81d4..8a8ad49bb7fd 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -727,11 +727,11 @@ static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu, unsigned long flags; bool owner; struct arm_smmu_cmdq *cmdq = &smmu->cmdq; - struct arm_smmu_ll_queue llq = { - .max_n_shift = cmdq->q.llq.max_n_shift, - }, head = llq; + struct arm_smmu_ll_queue llq, head; int ret = 0; + llq.max_n_shift = cmdq->q.llq.max_n_shift; + /* 1. Allocate some space in the queue */ local_irq_save(flags); llq.val = READ_ONCE(cmdq->q.llq.val);
Re: [PATCH] iommu/arm-smmu-v3: Remove some unneeded init in arm_smmu_cmdq_issue_cmdlist()
On Thu, Aug 05, 2021 at 11:22:15AM +0100, John Garry wrote: > On 21/06/2021 17:36, John Garry wrote: > > Members of struct "llq" will be zero-inited, apart from member max_n_shift. > > But we write llq.val straight after the init, so it was pointless to zero > > init those other members. As such, separately init member max_n_shift > > only. > > > > In addition, struct "head" is initialised to "llq" only so that member > > max_n_shift is set. But that member is never referenced for "head", so > > remove any init there. > > > > Removing these initializations is seen as a small performance optimisation, > > as this code is (very) hot path. > > > > Hi Will, > > Any chance you can pick up this small optimisation? Yup! I've actually queued it locally, but I may end up asking Joerg to take it directly depending on what else I queue for 5.15. So far, most of the SMMU stuff is all part of wider refactorings. Cheers, Will
Re: [PATCH] of: restricted dma: Don't fail device probe on rmem init failure
On 2021-08-05 10:47, Will Deacon wrote: If CONFIG_DMA_RESTRICTED_POOL=n then probing a device with a reference to a "restricted-dma-pool" will fail with a reasonably cryptic error: | pci-host-generic: probe of 1.pci failed with error -22 Print a more helpful message in this case and try to continue probing the device as we do if the kernel doesn't have the restricted DMA patches applied or either CONFIG_OF_ADDRESS or CONFIG_HAS_DMA =n. Makes sense to me; Reviewed-by: Robin Murphy Although if we allow probe to succeed when a pool really was there for a reason, it may end up being much more fatal if the driver then tries to do a DMA transfer to any old memory and the device access causes an SError, or the VM to be killed, or whatever. That's not quite the same as the stubbed cases where the respective platforms couldn't have a genuine pool to parse either way, but as you say it is what could happen already if the user tried to use an older kernel, and I think the chance of of_reserved_mem_device_init_by_idx() failing without something being terminally wrong anyway - invalid DT, not enough RAM, etc. - is low enough that it's probably not a major concern. Plus I'd hope that the memory protection schemes people do actually implement don't take such a zero-tolerance approach anyway - allowing a malicious or malfunctioning device to take down the system because it tried to make a rogue access which *was* already contained seems a bit silly. Robin. 
Cc: Claire Chang Cc: Konrad Rzeszutek Wilk Cc: Robin Murphy Cc: Christoph Hellwig Cc: Rob Herring Signed-off-by: Will Deacon --- drivers/of/address.c| 8 drivers/of/device.c | 2 +- drivers/of/of_private.h | 8 +++- 3 files changed, 8 insertions(+), 10 deletions(-) diff --git a/drivers/of/address.c b/drivers/of/address.c index 973257434398..f6bf4b423c2a 100644 --- a/drivers/of/address.c +++ b/drivers/of/address.c @@ -997,7 +997,7 @@ int of_dma_get_range(struct device_node *np, const struct bus_dma_region **map) return ret; } -int of_dma_set_restricted_buffer(struct device *dev, struct device_node *np) +void of_dma_set_restricted_buffer(struct device *dev, struct device_node *np) { struct device_node *node, *of_node = dev->of_node; int count, i; @@ -1022,11 +1022,11 @@ int of_dma_set_restricted_buffer(struct device *dev, struct device_node *np) */ if (of_device_is_compatible(node, "restricted-dma-pool") && of_device_is_available(node)) - return of_reserved_mem_device_init_by_idx(dev, of_node, - i); + break; } - return 0; + if (i != count && of_reserved_mem_device_init_by_idx(dev, of_node, i)) + dev_warn(dev, "failed to initialise \"restricted-dma-pool\" memory node\n"); } #endif /* CONFIG_HAS_DMA */ diff --git a/drivers/of/device.c b/drivers/of/device.c index 2defdca418ec..258a2b099410 100644 --- a/drivers/of/device.c +++ b/drivers/of/device.c @@ -166,7 +166,7 @@ int of_dma_configure_id(struct device *dev, struct device_node *np, arch_setup_dma_ops(dev, dma_start, size, iommu, coherent); if (!iommu) - return of_dma_set_restricted_buffer(dev, np); + of_dma_set_restricted_buffer(dev, np); return 0; } diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h index f557bd22b0cf..bc883f69496b 100644 --- a/drivers/of/of_private.h +++ b/drivers/of/of_private.h @@ -163,18 +163,16 @@ struct bus_dma_region; #if defined(CONFIG_OF_ADDRESS) && defined(CONFIG_HAS_DMA) int of_dma_get_range(struct device_node *np, const struct bus_dma_region **map); -int 
of_dma_set_restricted_buffer(struct device *dev, struct device_node *np); +void of_dma_set_restricted_buffer(struct device *dev, struct device_node *np); #else static inline int of_dma_get_range(struct device_node *np, const struct bus_dma_region **map) { return -ENODEV; } -static inline int of_dma_set_restricted_buffer(struct device *dev, - struct device_node *np) +static inline void of_dma_set_restricted_buffer(struct device *dev, + struct device_node *np) { - /* Do nothing, successfully. */ - return 0; } #endif
Re: [PATCH] iommu/arm-smmu-v3: Remove some unneeded init in arm_smmu_cmdq_issue_cmdlist()
On 21/06/2021 17:36, John Garry wrote: Members of struct "llq" will be zero-inited, apart from member max_n_shift. But we write llq.val straight after the init, so it was pointless to zero init those other members. As such, separately init member max_n_shift only. In addition, struct "head" is initialised to "llq" only so that member max_n_shift is set. But that member is never referenced for "head", so remove any init there. Removing these initializations is seen as a small performance optimisation, as this code is (very) hot path. Hi Will, Any chance you can pick up this small optimisation? Cheers Signed-off-by: John Garry diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 54b2f27b81d4..8a8ad49bb7fd 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -727,11 +727,11 @@ static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu, unsigned long flags; bool owner; struct arm_smmu_cmdq *cmdq = &smmu->cmdq; - struct arm_smmu_ll_queue llq = { - .max_n_shift = cmdq->q.llq.max_n_shift, - }, head = llq; + struct arm_smmu_ll_queue llq, head; int ret = 0; + llq.max_n_shift = cmdq->q.llq.max_n_shift; + /* 1. Allocate some space in the queue */ local_irq_save(flags); llq.val = READ_ONCE(cmdq->q.llq.val);
[PATCH] of: restricted dma: Don't fail device probe on rmem init failure
If CONFIG_DMA_RESTRICTED_POOL=n then probing a device with a reference
to a "restricted-dma-pool" will fail with a reasonably cryptic error:

  | pci-host-generic: probe of 1.pci failed with error -22

Print a more helpful message in this case and try to continue probing
the device as we do if the kernel doesn't have the restricted DMA
patches applied or either CONFIG_OF_ADDRESS or CONFIG_HAS_DMA =n.

Cc: Claire Chang
Cc: Konrad Rzeszutek Wilk
Cc: Robin Murphy
Cc: Christoph Hellwig
Cc: Rob Herring
Signed-off-by: Will Deacon
---
 drivers/of/address.c    | 8 ++++----
 drivers/of/device.c     | 2 +-
 drivers/of/of_private.h | 8 +++-----
 3 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 973257434398..f6bf4b423c2a 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -997,7 +997,7 @@ int of_dma_get_range(struct device_node *np, const struct bus_dma_region **map)
 	return ret;
 }
 
-int of_dma_set_restricted_buffer(struct device *dev, struct device_node *np)
+void of_dma_set_restricted_buffer(struct device *dev, struct device_node *np)
 {
 	struct device_node *node, *of_node = dev->of_node;
 	int count, i;
@@ -1022,11 +1022,11 @@ int of_dma_set_restricted_buffer(struct device *dev, struct device_node *np)
 		 */
 		if (of_device_is_compatible(node, "restricted-dma-pool") &&
 		    of_device_is_available(node))
-			return of_reserved_mem_device_init_by_idx(dev, of_node,
-								  i);
+			break;
 	}
 
-	return 0;
+	if (i != count && of_reserved_mem_device_init_by_idx(dev, of_node, i))
+		dev_warn(dev, "failed to initialise \"restricted-dma-pool\" memory node\n");
 }
 #endif /* CONFIG_HAS_DMA */
 
diff --git a/drivers/of/device.c b/drivers/of/device.c
index 2defdca418ec..258a2b099410 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -166,7 +166,7 @@ int of_dma_configure_id(struct device *dev, struct device_node *np,
 	arch_setup_dma_ops(dev, dma_start, size, iommu, coherent);
 
 	if (!iommu)
-		return of_dma_set_restricted_buffer(dev, np);
+		of_dma_set_restricted_buffer(dev, np);
 
 	return 0;
 }
diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
index f557bd22b0cf..bc883f69496b 100644
--- a/drivers/of/of_private.h
+++ b/drivers/of/of_private.h
@@ -163,18 +163,16 @@ struct bus_dma_region;
 #if defined(CONFIG_OF_ADDRESS) && defined(CONFIG_HAS_DMA)
 int of_dma_get_range(struct device_node *np,
 		const struct bus_dma_region **map);
-int of_dma_set_restricted_buffer(struct device *dev, struct device_node *np);
+void of_dma_set_restricted_buffer(struct device *dev, struct device_node *np);
 #else
 static inline int of_dma_get_range(struct device_node *np,
 		const struct bus_dma_region **map)
 {
 	return -ENODEV;
 }
-static inline int of_dma_set_restricted_buffer(struct device *dev,
-					       struct device_node *np)
+static inline void of_dma_set_restricted_buffer(struct device *dev,
+						struct device_node *np)
 {
-	/* Do nothing, successfully. */
-	return 0;
 }
 #endif
-- 
2.32.0.605.g8dce9f2422-goog
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 02/25] iommu/amd: Drop IOVA cookie management
On 2021-08-04 18:15, Robin Murphy wrote:
> The core code bakes its own cookies now.
>
> Signed-off-by: Robin Murphy
> ---
>
> v3: Also remove unneeded include
> ---
>  drivers/iommu/amd/iommu.c | 13 -------------
>  1 file changed, 13 deletions(-)
>
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index 52fe2326042a..92f7cbe3d14a 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -20,7 +20,6 @@
>  #include
>  #include
>  #include
> -#include

Oh dear, how embarrassing... I went through all the drivers making that
decision based on iommu_dma* references but totally forgot about
iommu_setup_dma_ops() here. And then of course fell into the trap of
"such a minor change I don't need to re-test it" hubris... sigh, roll
back to v2 for this one.

Apologies,
Robin.

>  #include
>  #include
>  #include
> @@ -1918,16 +1917,7 @@ static struct iommu_domain *amd_iommu_domain_alloc(unsigned type)
>  	domain->domain.geometry.aperture_end = ~0ULL;
>  	domain->domain.geometry.force_aperture = true;
>
> -	if (type == IOMMU_DOMAIN_DMA &&
> -	    iommu_get_dma_cookie(&domain->domain) == -ENOMEM)
> -		goto free_domain;
> -
>  	return &domain->domain;
> -
> -free_domain:
> -	protection_domain_free(domain);
> -
> -	return NULL;
>  }
>
>  static void amd_iommu_domain_free(struct iommu_domain *dom)
> @@ -1944,9 +1934,6 @@ static void amd_iommu_domain_free(struct iommu_domain *dom)
>  	if (!dom)
>  		return;
>
> -	if (dom->type == IOMMU_DOMAIN_DMA)
> -		iommu_put_dma_cookie(&domain->domain);
> -
>  	if (domain->flags & PD_IOMMUV2_MASK)
>  		free_gcr3_table(domain);
RE: [PATCH v3 06/25] iommu/ipmmu-vmsa: Drop IOVA cookie management
Hi Robin,

> From: Robin Murphy, Sent: Thursday, August 5, 2021 2:16 AM
>
> The core code bakes its own cookies now.
>
> CC: Yoshihiro Shimoda
> CC: Geert Uytterhoeven
> Signed-off-by: Robin Murphy

Thank you for the patch! I tested on my environment (r8a77951-salvator-xs),
and I didn't observe any regression. So,

Reviewed-by: Yoshihiro Shimoda
Tested-by: Yoshihiro Shimoda

Best regards,
Yoshihiro Shimoda
RE: [PATCH v3 01/25] iommu: Pull IOVA cookie management into the core
Hi Robin,

> From: Robin Murphy, Sent: Thursday, August 5, 2021 2:15 AM
>
> Now that everyone has converged on iommu-dma for IOMMU_DOMAIN_DMA
> support, we can abandon the notion of drivers being responsible for the
> cookie type, and consolidate all the management into the core code.
>
> CC: Marek Szyprowski
> CC: Yoshihiro Shimoda
> CC: Geert Uytterhoeven
> CC: Yong Wu
> CC: Heiko Stuebner
> CC: Chunyan Zhang
> CC: Maxime Ripard
> Reviewed-by: Jean-Philippe Brucker
> Reviewed-by: Lu Baolu
> Signed-off-by: Robin Murphy

Thank you for the patch! I tested on my environment (r8a77951-salvator-xs),
and I didn't observe any regression. So,

Tested-by: Yoshihiro Shimoda

Best regards,
Yoshihiro Shimoda
[PATCH v7 9/9] iommu/dma: Reserve any RMR regions associated with a dev
Get ACPI IORT RMR regions associated with a dev reserved so that there
is a unity mapping for them in SMMU.

Signed-off-by: Shameer Kolothum
---
 drivers/iommu/dma-iommu.c | 56 ++++++++++++++++++++++++++++++++++-----
 1 file changed, 51 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 1b6e27475279..c1ae0c3d4b33 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -207,22 +207,68 @@ void iommu_dma_put_rmrs(struct fwnode_handle *iommu_fwnode,
 }
 EXPORT_SYMBOL(iommu_dma_put_rmrs);
 
+static bool iommu_dma_dev_has_rmr(struct iommu_fwspec *fwspec,
+				  struct iommu_resv_region *e)
+{
+	int i;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		if (e->fw_data.rmr.sid == fwspec->ids[i])
+			return true;
+	}
+
+	return false;
+}
+
+static void iommu_dma_get_rmr_resv_regions(struct device *dev,
+					   struct list_head *list)
+{
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+	struct list_head rmr_list;
+	struct iommu_resv_region *rmr, *tmp;
+
+	INIT_LIST_HEAD(&rmr_list);
+	if (iommu_dma_get_rmrs(fwspec->iommu_fwnode, &rmr_list))
+		return;
+
+	if (dev_is_pci(dev)) {
+		struct pci_dev *pdev = to_pci_dev(dev);
+		struct pci_host_bridge *host = pci_find_host_bridge(pdev->bus);
+
+		if (!host->preserve_config)
+			return;
+	}
+
+	list_for_each_entry_safe(rmr, tmp, &rmr_list, list) {
+		if (!iommu_dma_dev_has_rmr(fwspec, rmr))
+			continue;
+
+		/* Remove from iommu RMR list and add to dev resv_regions */
+		list_del_init(&rmr->list);
+		list_add_tail(&rmr->list, list);
+	}
+
+	iommu_dma_put_rmrs(fwspec->iommu_fwnode, &rmr_list);
+}
+
 /**
  * iommu_dma_get_resv_regions - Reserved region driver helper
  * @dev: Device from iommu_get_resv_regions()
  * @list: Reserved region list from iommu_get_resv_regions()
  *
  * IOMMU drivers can use this to implement their .get_resv_regions callback
- * for general non-IOMMU-specific reservations. Currently, this covers GICv3
- * ITS region reservation on ACPI based ARM platforms that may require HW MSI
- * reservation.
+ * for general non-IOMMU-specific reservations. Currently this covers,
+ * -GICv3 ITS region reservation on ACPI based ARM platforms that may
+ *  require HW MSI reservation.
+ * -Any ACPI IORT RMR memory range reservations (IORT spec rev E.b)
  */
 void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list)
 {
-	if (!is_of_node(dev_iommu_fwspec_get(dev)->iommu_fwnode))
+	if (!is_of_node(dev_iommu_fwspec_get(dev)->iommu_fwnode)) {
 		iort_iommu_msi_get_resv_regions(dev, list);
-
+		iommu_dma_get_rmr_resv_regions(dev, list);
+	}
 }
 EXPORT_SYMBOL(iommu_dma_get_resv_regions);
-- 
2.17.1
[PATCH v7 8/9] iommu/arm-smmu: Get associated RMR info and install bypass SMR
From: Jon Nettleton

Check if there is any RMR info associated with the devices behind
the SMMU and if any, install bypass SMRs for them. This is to keep
any ongoing traffic associated with these devices alive when we
enable/reset SMMU during probe().

Signed-off-by: Jon Nettleton
Signed-off-by: Steven Price
Signed-off-by: Shameer Kolothum
---
 drivers/iommu/arm/arm-smmu/arm-smmu.c | 48 +++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index f22dbeb1e510..315feab9e85b 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -2063,6 +2063,50 @@ err_reset_platform_ops: __maybe_unused;
 	return err;
 }
 
+static void arm_smmu_rmr_install_bypass_smr(struct arm_smmu_device *smmu)
+{
+	struct list_head rmr_list;
+	struct iommu_resv_region *e;
+	int i, cnt = 0;
+	u32 reg;
+
+	INIT_LIST_HEAD(&rmr_list);
+	if (iommu_dma_get_rmrs(dev_fwnode(smmu->dev), &rmr_list))
+		return;
+
+	/*
+	 * Rather than trying to look at existing mappings that
+	 * are setup by the firmware and then invalidate the ones
+	 * that do no have matching RMR entries, just disable the
+	 * SMMU until it gets enabled again in the reset routine.
+	 */
+	reg = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_sCR0);
+	reg |= ARM_SMMU_sCR0_CLIENTPD;
+	arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_sCR0, reg);
+
+	list_for_each_entry(e, &rmr_list, list) {
+		u32 sid = e->fw_data.rmr.sid;
+
+		i = arm_smmu_find_sme(smmu, sid, ~0);
+		if (i < 0)
+			continue;
+		if (smmu->s2crs[i].count == 0) {
+			smmu->smrs[i].id = sid;
+			smmu->smrs[i].mask = 0;
+			smmu->smrs[i].valid = true;
+		}
+		smmu->s2crs[i].count++;
+		smmu->s2crs[i].type = S2CR_TYPE_BYPASS;
+		smmu->s2crs[i].privcfg = S2CR_PRIVCFG_DEFAULT;
+
+		cnt++;
+	}
+
+	dev_notice(smmu->dev, "\tpreserved %d boot mapping%s\n", cnt,
+		   cnt == 1 ? "" : "s");
+	iommu_dma_put_rmrs(dev_fwnode(smmu->dev), &rmr_list);
+}
+
 static int arm_smmu_device_probe(struct platform_device *pdev)
 {
 	struct resource *res;
@@ -2189,6 +2233,10 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
 	}
 
 	platform_set_drvdata(pdev, smmu);
+
+	/* Check for RMRs and install bypass SMRs if any */
+	arm_smmu_rmr_install_bypass_smr(smmu);
+
 	arm_smmu_device_reset(smmu);
 	arm_smmu_test_smr_masks(smmu);
-- 
2.17.1
[PATCH v7 7/9] iommu/arm-smmu-v3: Get associated RMR info and install bypass STE
Check if there is any RMR info associated with the devices behind
the SMMUv3 and if any, install bypass STEs for them. This is to
keep any ongoing traffic associated with these devices alive when
we enable/reset SMMUv3 during probe().

Signed-off-by: Shameer Kolothum
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 31 +++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 85f6f1925a36..1165605d6f7a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3760,6 +3760,34 @@ static void __iomem *arm_smmu_ioremap(struct device *dev, resource_size_t start,
 	return devm_ioremap_resource(dev, &res);
 }
 
+static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu)
+{
+	struct list_head rmr_list;
+	struct iommu_resv_region *e;
+	int ret;
+
+	INIT_LIST_HEAD(&rmr_list);
+	if (iommu_dma_get_rmrs(dev_fwnode(smmu->dev), &rmr_list))
+		return;
+
+	list_for_each_entry(e, &rmr_list, list) {
+		__le64 *step;
+		u32 sid = e->fw_data.rmr.sid;
+
+		ret = arm_smmu_init_sid_strtab(smmu, sid);
+		if (ret) {
+			dev_err(smmu->dev, "RMR SID(0x%x) bypass failed\n",
+				sid);
+			continue;
+		}
+
+		step = arm_smmu_get_step_for_sid(smmu, sid);
+		arm_smmu_init_bypass_stes(step, 1, true);
+	}
+
+	iommu_dma_put_rmrs(dev_fwnode(smmu->dev), &rmr_list);
+}
+
 static int arm_smmu_device_probe(struct platform_device *pdev)
 {
 	int irq, ret;
@@ -3841,6 +3869,9 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
 	/* Record our private device structure */
 	platform_set_drvdata(pdev, smmu);
 
+	/* Check for RMRs and install bypass STEs if any */
+	arm_smmu_rmr_install_bypass_ste(smmu);
+
 	/* Reset the device */
 	ret = arm_smmu_device_reset(smmu, bypass);
 	if (ret)
-- 
2.17.1
[PATCH v7 6/9] iommu/arm-smmu-v3: Refactor arm_smmu_init_bypass_stes() to force bypass
By default, disable_bypass flag is set and any dev without an iommu
domain installs STE with CFG_ABORT during arm_smmu_init_bypass_stes().
Introduce a "force" flag and move the STE update logic to
arm_smmu_init_bypass_stes() so that we can force it to install
CFG_BYPASS STE for specific SIDs. This will be useful in follow-up
patch to install bypass for IORT RMR SIDs.

Signed-off-by: Shameer Kolothum
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 31940e53c675..85f6f1925a36 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1357,12 +1357,21 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 	arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd);
 }
 
-static void arm_smmu_init_bypass_stes(__le64 *strtab, unsigned int nent)
+static void arm_smmu_init_bypass_stes(__le64 *strtab, unsigned int nent, bool force)
 {
 	unsigned int i;
+	u64 val = STRTAB_STE_0_V;
+
+	if (disable_bypass && !force)
+		val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT);
+	else
+		val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
 
 	for (i = 0; i < nent; ++i) {
-		arm_smmu_write_strtab_ent(NULL, -1, strtab);
+		strtab[0] = cpu_to_le64(val);
+		strtab[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
+						   STRTAB_STE_1_SHCFG_INCOMING));
+		strtab[2] = 0;
 		strtab += STRTAB_STE_DWORDS;
 	}
 }
@@ -1390,7 +1399,7 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
 		return -ENOMEM;
 	}
 
-	arm_smmu_init_bypass_stes(desc->l2ptr, 1 << STRTAB_SPLIT);
+	arm_smmu_init_bypass_stes(desc->l2ptr, 1 << STRTAB_SPLIT, false);
 	arm_smmu_write_strtab_l1_desc(strtab, desc);
 	return 0;
 }
@@ -3042,7 +3051,7 @@ static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu)
 	reg |= FIELD_PREP(STRTAB_BASE_CFG_LOG2SIZE, smmu->sid_bits);
 	cfg->strtab_base_cfg = reg;
 
-	arm_smmu_init_bypass_stes(strtab, cfg->num_l1_ents);
+	arm_smmu_init_bypass_stes(strtab, cfg->num_l1_ents, false);
 	return 0;
 }
-- 
2.17.1
[PATCH v7 5/9] iommu/arm-smmu-v3: Introduce strtab init helper
Introduce a helper to check the sid range and to init the l2 strtab entries(bypass). This will be useful when we have to initialize the l2 strtab with bypass for RMR SIDs. Signed-off-by: Shameer Kolothum --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 28 +++-- 1 file changed, 15 insertions(+), 13 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 235f9bdaeaf2..31940e53c675 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2518,6 +2518,19 @@ static bool arm_smmu_sid_in_range(struct arm_smmu_device *smmu, u32 sid) return sid < limit; } +static int arm_smmu_init_sid_strtab(struct arm_smmu_device *smmu, u32 sid) +{ + /* Check the SIDs are in range of the SMMU and our stream table */ + if (!arm_smmu_sid_in_range(smmu, sid)) + return -ERANGE; + + /* Ensure l2 strtab is initialised */ + if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) + return arm_smmu_init_l2_strtab(smmu, sid); + + return 0; +} + static int arm_smmu_insert_master(struct arm_smmu_device *smmu, struct arm_smmu_master *master) { @@ -2541,20 +2554,9 @@ static int arm_smmu_insert_master(struct arm_smmu_device *smmu, new_stream->id = sid; new_stream->master = master; - /* -* Check the SIDs are in range of the SMMU and our stream table -*/ - if (!arm_smmu_sid_in_range(smmu, sid)) { - ret = -ERANGE; + ret = arm_smmu_init_sid_strtab(smmu, sid); + if (ret) break; - } - - /* Ensure l2 strtab is initialised */ - if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) { - ret = arm_smmu_init_l2_strtab(smmu, sid); - if (ret) - break; - } /* Insert into SID tree */ new_node = &(smmu->streams.rb_node); -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v7 4/9] ACPI/IORT: Add a helper to retrieve RMR memory regions
Add a helper function (iort_iommu_get_rmrs()) that retrieves RMR memory
descriptors associated with a given IOMMU. This will be used by IOMMU
drivers to setup necessary mappings.

Invoke it from the generic helper iommu_dma_get_rmrs().

Signed-off-by: Shameer Kolothum
---
 drivers/acpi/arm64/iort.c | 38 ++++++++++++++++++++++++++++++++++++++
 drivers/iommu/dma-iommu.c |  4 ++++
 include/linux/acpi_iort.h |  7 +++++++
 3 files changed, 49 insertions(+)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index d76ba46ebe67..3c32d62e63b6 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -809,6 +809,42 @@ static struct acpi_iort_node *iort_get_msi_resv_iommu(struct device *dev)
 	return NULL;
 }
 
+/**
+ * iort_iommu_get_rmrs() - Helper to retrieve RMR info associated with IOMMU
+ * @iommu_fwnode: fwnode for the IOMMU
+ * @head: RMR list head to be populated
+ *
+ * Returns: 0 on success, <0 failure. Please note, we will keep the already
+ *          allocated RMR reserve regions in case of a kmemdup() failure.
+ */
+int iort_iommu_get_rmrs(struct fwnode_handle *iommu_fwnode,
+			struct list_head *head)
+{
+	struct iommu_resv_region *e;
+	struct acpi_iort_node *iommu;
+	int rmrs = 0;
+
+	iommu = iort_get_iort_node(iommu_fwnode);
+	if (!iommu || list_empty(&iort_rmr_list))
+		return -ENODEV;
+
+	list_for_each_entry(e, &iort_rmr_list, list) {
+		struct iommu_resv_region *region;
+
+		if (e->fw_data.rmr.smmu != iommu)
+			continue;
+
+		region = kmemdup(e, sizeof(*region), GFP_KERNEL);
+		if (region) {
+			list_add_tail(&region->list, head);
+			rmrs++;
+		}
+	}
+
+	return (rmrs == 0) ? -ENODEV : 0;
+}
+
 /**
  * iort_iommu_msi_get_resv_regions - Reserved region driver helper
  * @dev: Device from iommu_get_resv_regions()
@@ -1041,6 +1077,8 @@ int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
 { return 0; }
 int iort_iommu_configure_id(struct device *dev, const u32 *input_id)
 { return -ENODEV; }
+int iort_iommu_get_rmrs(struct fwnode_handle *fwnode, struct list_head *head)
+{ return -ENODEV; }
 #endif
 
 static int nc_dma_get_range(struct device *dev, u64 *size)
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 2fa2445e9070..1b6e27475279 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -185,6 +185,9 @@ EXPORT_SYMBOL(iommu_put_dma_cookie);
 int iommu_dma_get_rmrs(struct fwnode_handle *iommu_fwnode,
 		       struct list_head *list)
 {
+	if (!is_of_node(iommu_fwnode))
+		return iort_iommu_get_rmrs(iommu_fwnode, list);
+
 	return -EINVAL;
 }
 EXPORT_SYMBOL(iommu_dma_get_rmrs);
@@ -200,6 +203,7 @@ EXPORT_SYMBOL(iommu_dma_get_rmrs);
 void iommu_dma_put_rmrs(struct fwnode_handle *iommu_fwnode,
 			struct list_head *list)
 {
+	generic_iommu_put_resv_regions(iommu_fwnode->dev, list);
 }
 EXPORT_SYMBOL(iommu_dma_put_rmrs);
 
diff --git a/include/linux/acpi_iort.h b/include/linux/acpi_iort.h
index f1f0842a2cb2..d8c030c103f5 100644
--- a/include/linux/acpi_iort.h
+++ b/include/linux/acpi_iort.h
@@ -38,6 +38,8 @@ int iort_dma_get_ranges(struct device *dev, u64 *size);
 int iort_iommu_configure_id(struct device *dev, const u32 *id_in);
 int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head);
 phys_addr_t acpi_iort_dma_get_max_cpu_address(void);
+int iort_iommu_get_rmrs(struct fwnode_handle *iommu_fwnode,
+			struct list_head *list);
 #else
 static inline void acpi_iort_init(void) { }
 static inline u32 iort_msi_map_id(struct device *dev, u32 id)
@@ -57,6 +59,11 @@ int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
 static inline phys_addr_t acpi_iort_dma_get_max_cpu_address(void)
 { return PHYS_ADDR_MAX; }
+
+static inline
+int iort_iommu_get_rmrs(struct fwnode_handle *iommu_fwnode,
+			struct list_head *list)
+{ return -ENODEV; }
 #endif
 
 #endif /* __ACPI_IORT_H__ */
-- 
2.17.1
[PATCH v7 3/9] iommu/dma: Introduce generic helper to retrieve RMR info
Reserved Memory Regions(RMR) associated with an IOMMU can be described through ACPI IORT tables in systems with devices that require a unity mapping or bypass for those regions. Introduce a generic interface so that IOMMU drivers can retrieve and set up necessary mappings. Signed-off-by: Shameer Kolothum --- drivers/iommu/dma-iommu.c | 29 + include/linux/dma-iommu.h | 13 + 2 files changed, 42 insertions(+) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 98ba927aee1a..2fa2445e9070 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -174,6 +174,35 @@ void iommu_put_dma_cookie(struct iommu_domain *domain) } EXPORT_SYMBOL(iommu_put_dma_cookie); +/** + * + * iommu_dma_get_rmrs - Retrieve Reserved Memory Regions(RMRs) associated + * with a given IOMMU + * @iommu_fwnode: fwnode associated with IOMMU + * @list: RMR list to be populated + * + */ +int iommu_dma_get_rmrs(struct fwnode_handle *iommu_fwnode, + struct list_head *list) +{ + return -EINVAL; +} +EXPORT_SYMBOL(iommu_dma_get_rmrs); + +/** + * + * iommu_dma_put_rmrs - Release Reserved Memory Regions(RMRs) associated + * with a given IOMMU + * @iommu_fwnode: fwnode associated with IOMMU + * @list: RMR list + * + */ +void iommu_dma_put_rmrs(struct fwnode_handle *iommu_fwnode, + struct list_head *list) +{ +} +EXPORT_SYMBOL(iommu_dma_put_rmrs); + /** * iommu_dma_get_resv_regions - Reserved region driver helper * @dev: Device from iommu_get_resv_regions() diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h index 758ca4694257..3b7b2d096c6e 100644 --- a/include/linux/dma-iommu.h +++ b/include/linux/dma-iommu.h @@ -42,12 +42,16 @@ void iommu_dma_free_cpu_cached_iovas(unsigned int cpu, extern bool iommu_dma_forcedac; +int iommu_dma_get_rmrs(struct fwnode_handle *iommu, struct list_head *list); +void iommu_dma_put_rmrs(struct fwnode_handle *iommu, struct list_head *list); + #else /* CONFIG_IOMMU_DMA */ struct iommu_domain; struct msi_desc; struct msi_msg; struct 
device; +struct fwnode_handle; static inline void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 dma_limit) @@ -83,5 +87,14 @@ static inline void iommu_dma_get_resv_regions(struct device *dev, struct list_he { } +static int iommu_dma_get_rmrs(struct fwnode_handle *iommu, struct list_head *list) +{ + return -ENODEV; +} + +static void iommu_dma_put_rmrs(struct fwnode_handle *iommu, struct list_head *list) +{ +} + #endif /* CONFIG_IOMMU_DMA */ #endif /* __DMA_IOMMU_H */ -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v7 2/9] ACPI/IORT: Add support for RMR node parsing
Add support for parsing RMR node information from ACPI.

Find the associated streamid and smmu node info from the
RMR node and populate a linked list with RMR memory descriptors.

Signed-off-by: Shameer Kolothum
---
 drivers/acpi/arm64/iort.c | 134 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 133 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 3b23fb775ac4..d76ba46ebe67 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -40,6 +40,8 @@ struct iort_fwnode {
 static LIST_HEAD(iort_fwnode_list);
 static DEFINE_SPINLOCK(iort_fwnode_lock);
 
+static LIST_HEAD(iort_rmr_list);	/* list of RMR regions from ACPI */
+
 /**
  * iort_set_fwnode() - Create iort_fwnode and use it to register
  *		       iommu data in the iort_fwnode_list
@@ -393,7 +395,8 @@ static struct acpi_iort_node *iort_node_get_id(struct acpi_iort_node *node,
 	if (node->type == ACPI_IORT_NODE_NAMED_COMPONENT ||
 	    node->type == ACPI_IORT_NODE_PCI_ROOT_COMPLEX ||
 	    node->type == ACPI_IORT_NODE_SMMU_V3 ||
-	    node->type == ACPI_IORT_NODE_PMCG) {
+	    node->type == ACPI_IORT_NODE_PMCG ||
+	    node->type == ACPI_IORT_NODE_RMR) {
 		*id_out = map->output_base;
 		return parent;
 	}
@@ -1566,6 +1569,134 @@ static void __init iort_enable_acs(struct acpi_iort_node *iort_node)
 #else
 static inline void iort_enable_acs(struct acpi_iort_node *iort_node) { }
 #endif
+
+static void iort_rmr_desc_check_overlap(struct acpi_iort_rmr_desc *desc, u32 count)
+{
+	int i, j;
+
+	for (i = 0; i < count; i++) {
+		u64 end, start = desc[i].base_address, length = desc[i].length;
+
+		end = start + length - 1;
+
+		/* Check for address overlap */
+		for (j = i + 1; j < count; j++) {
+			u64 e_start = desc[j].base_address;
+			u64 e_end = e_start + desc[j].length - 1;
+
+			if (start <= e_end && end >= e_start)
+				pr_err(FW_BUG "RMR descriptor[0x%llx - 0x%llx] overlaps, continue anyway\n",
+				       start, end);
+		}
+	}
+}
+
+static void __init iort_node_get_rmr_info(struct acpi_iort_node *iort_node)
+{
+	struct acpi_iort_node *smmu;
+	struct acpi_iort_rmr *rmr;
+	struct acpi_iort_rmr_desc *rmr_desc;
+	u32 map_count = iort_node->mapping_count;
+	u32 sid;
+	int i;
+
+	if (!iort_node->mapping_offset || map_count != 1) {
+		pr_err(FW_BUG "Invalid ID mapping, skipping RMR node %p\n",
+		       iort_node);
+		return;
+	}
+
+	/* Retrieve associated smmu and stream id */
+	smmu = iort_node_get_id(iort_node, &sid, 0);
+	if (!smmu) {
+		pr_err(FW_BUG "Invalid SMMU reference, skipping RMR node %p\n",
+		       iort_node);
+		return;
+	}
+
+	/* Retrieve RMR data */
+	rmr = (struct acpi_iort_rmr *)iort_node->node_data;
+	if (!rmr->rmr_offset || !rmr->rmr_count) {
+		pr_err(FW_BUG "Invalid RMR descriptor array, skipping RMR node %p\n",
+		       iort_node);
+		return;
+	}
+
+	rmr_desc = ACPI_ADD_PTR(struct acpi_iort_rmr_desc, iort_node,
+				rmr->rmr_offset);
+
+	iort_rmr_desc_check_overlap(rmr_desc, rmr->rmr_count);
+
+	for (i = 0; i < rmr->rmr_count; i++, rmr_desc++) {
+		struct iommu_resv_region *region;
+		enum iommu_resv_type type;
+		int prot = IOMMU_READ | IOMMU_WRITE;
+		u64 addr = rmr_desc->base_address, size = rmr_desc->length;
+
+		if (!IS_ALIGNED(addr, SZ_64K) || !IS_ALIGNED(size, SZ_64K)) {
+			/* PAGE align base addr and size */
+			addr &= PAGE_MASK;
+			size = PAGE_ALIGN(size + offset_in_page(rmr_desc->base_address));
+
+			pr_err(FW_BUG "RMR descriptor[0x%llx - 0x%llx] not aligned to 64K, continue with [0x%llx - 0x%llx]\n",
+			       rmr_desc->base_address,
+			       rmr_desc->base_address + rmr_desc->length - 1,
+			       addr, addr + size - 1);
+		}
+		if (rmr->flags & IOMMU_RMR_REMAP_PERMITTED) {
+			type = IOMMU_RESV_DIRECT_RELAXABLE;
+			/*
+			 * Set IOMMU_CACHE as IOMMU_RESV_DIRECT_RELAXABLE is
+			 * normally used for allocated system memory that is
+			 * then used for device specific reserved regions.
+			 */
+			prot |= IOMMU_CACHE;
+		} else {
+			type =
[PATCH v7 1/9] iommu: Introduce a union to struct iommu_resv_region
A union is introduced to struct iommu_resv_region to hold any firmware specific data. This is in preparation to add support for IORT RMR reserve regions and the union now holds the RMR specific information. Signed-off-by: Shameer Kolothum --- include/linux/iommu.h | 11 +++ 1 file changed, 11 insertions(+) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 32d448050bf7..bd0e4641c569 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -114,6 +114,13 @@ enum iommu_resv_type { IOMMU_RESV_SW_MSI, }; +struct iommu_iort_rmr_data { +#define IOMMU_RMR_REMAP_PERMITTED (1 << 0) + u32 flags; + u32 sid;/* Stream Id associated with RMR entry */ + void *smmu; /* Associated IORT SMMU node pointer */ +}; + /** * struct iommu_resv_region - descriptor for a reserved memory region * @list: Linked list pointers @@ -121,6 +128,7 @@ enum iommu_resv_type { * @length: Length of the region in bytes * @prot: IOMMU Protection flags (READ/WRITE/...) * @type: Type of the reserved region + * @rmr: ACPI IORT RMR specific data */ struct iommu_resv_region { struct list_headlist; @@ -128,6 +136,9 @@ struct iommu_resv_region { size_t length; int prot; enum iommu_resv_typetype; + union { + struct iommu_iort_rmr_data rmr; + } fw_data; }; /** -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v7 0/9] ACPI/IORT: Support for IORT RMR node
Hi,

The series adds support to IORT RMR nodes specified in IORT
Revision E.b -ARM DEN 0049E[0]. RMR nodes are used to describe
memory ranges that are used by endpoints and require a unity
mapping in SMMU.

We have faced issues with 3408iMR RAID controller cards which
fail to boot when SMMU is enabled. This is because these
controllers make use of host memory for various caching related
purposes and when SMMU is enabled the iMR firmware fails to
access these memory regions as there is no mapping for them.
IORT RMR provides a way for UEFI to describe and report these
memory regions so that the kernel can make a unity mapping for
these in SMMU.

Change History:

v6 --> v7
 The only change from v6 is the fix pointed out by Steve to
 the SMMUv2 SMR bypass install in patch #8.

 Thanks to the Tested-by tags by Laurentiu with SMMUv2 and
 Hanjun/Huiqiang with SMMUv3 for v6. I haven't added the tags
 yet as the series still needs more review[1].

 Feedback and tests on this series is very much appreciated.

v5 --> v6
- Addressed comments from Robin & Lorenzo.
  : Moved iort_parse_rmr() to acpi_iort_init() from
    iort_init_platform_devices().
  : Removed use of struct iort_rmr_entry during the initial
    parse. Using struct iommu_resv_region instead.
  : Report RMR address alignment and overlap errors, but continue.
  : Reworked arm_smmu_init_bypass_stes() (patch # 6).
- Updated SMMUv2 bypass SMR code. Thanks to Jon N (patch #8).
- Set IOMMU protection flags(IOMMU_CACHE, IOMMU_MMIO) based on
  Type of RMR region. Suggested by Jon N.

Thanks,
Shameer

[0] https://developer.arm.com/documentation/den0049/latest/
[1] https://lore.kernel.org/linux-acpi/20210716083442.1708-1-shameerali.kolothum.th...@huawei.com/T/#m043c95b869973a834b2fd57f3e1ed0325c84f3b7
--
v4 --> v5
 -Added a fw_data union to struct iommu_resv_region and removed
  struct iommu_rmr (Based on comments from Joerg/Robin).
 -Added iommu_put_rmrs() to release mem.
 -Thanks to Steve for verifying on SMMUv2, but not added the
  Tested-by yet because of the above changes.

v3 -->v4
 -Included the SMMUv2 SMR bypass install changes suggested by
  Steve(patch #7)
 -As per Robin's comments, RMR reserve implementation is now
  more generic (patch #8) and dropped v3 patches 8 and 10.
 -Rebase to 5.13-rc1

RFC v2 --> v3
 -Dropped RFC tag as the ACPICA header changes are now ready to be
  part of 5.13[0]. But this series still has a dependency on that
  patch.
 -Added IORT E.b related changes(node flags, _DSM function 5
  checks for PCIe).
 -Changed RMR to stream id mapping from M:N to M:1 as per the
  spec and discussion here[1].
 -Last two patches add support for SMMUv2(Thanks to Jon Nettleton!)
--

Jon Nettleton (1):
  iommu/arm-smmu: Get associated RMR info and install bypass SMR

Shameer Kolothum (8):
  iommu: Introduce a union to struct iommu_resv_region
  ACPI/IORT: Add support for RMR node parsing
  iommu/dma: Introduce generic helper to retrieve RMR info
  ACPI/IORT: Add a helper to retrieve RMR memory regions
  iommu/arm-smmu-v3: Introduce strtab init helper
  iommu/arm-smmu-v3: Refactor arm_smmu_init_bypass_stes() to force
    bypass
  iommu/arm-smmu-v3: Get associated RMR info and install bypass STE
  iommu/dma: Reserve any RMR regions associated with a dev

 drivers/acpi/arm64/iort.c                   | 172 +++++++++++++++++++-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  76 ++++++---
 drivers/iommu/arm/arm-smmu/arm-smmu.c       |  48 ++++++
 drivers/iommu/dma-iommu.c                   |  89 +++++++++-
 include/linux/acpi_iort.h                   |   7 +
 include/linux/dma-iommu.h                   |  13 ++
 include/linux/iommu.h                       |  11 ++
 7 files changed, 393 insertions(+), 23 deletions(-)

-- 
2.17.1
Re: [PATCH v3 02/25] iommu/amd: Drop IOVA cookie management
Hi Robin,

I love your patch! Yet something to improve:

[auto build test ERROR on iommu/next]
[also build test ERROR on rockchip/for-next sunxi/sunxi/for-next linus/master v5.14-rc4 next-20210804]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Robin-Murphy/iommu-Refactor-DMA-domain-strictness/20210805-011913
base:   https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git next
config: x86_64-allyesconfig (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
        # https://github.com/0day-ci/linux/commit/33b83e4adc16220361ed42c229e8cd37f8a2a3aa
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Robin-Murphy/iommu-Refactor-DMA-domain-strictness/20210805-011913
        git checkout 33b83e4adc16220361ed42c229e8cd37f8a2a3aa
        # save the attached .config to linux build tree
        make W=1 ARCH=x86_64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot

All errors (new ones prefixed by >>):

   drivers/iommu/amd/iommu.c: In function 'amd_iommu_probe_finalize':
>> drivers/iommu/amd/iommu.c:1757:3: error: implicit declaration of function 'iommu_setup_dma_ops'; did you mean 'arch_setup_dma_ops'? [-Werror=implicit-function-declaration]
    1757 |   iommu_setup_dma_ops(dev, 0, U64_MAX);
         |   ^~~
         |   arch_setup_dma_ops
   cc1: some warnings being treated as errors


vim +1757 drivers/iommu/amd/iommu.c

1ac4cbbc5eb56d arch/x86/kernel/amd_iommu.c Joerg Roedel          2008-12-10  1749
dce8d6964ebdb3 drivers/iommu/amd_iommu.c   Joerg Roedel          2020-04-29  1750  static void amd_iommu_probe_finalize(struct device *dev)
dce8d6964ebdb3 drivers/iommu/amd_iommu.c   Joerg Roedel          2020-04-29  1751  {
dce8d6964ebdb3 drivers/iommu/amd_iommu.c   Joerg Roedel          2020-04-29  1752  	struct iommu_domain *domain;
ac1534a55d1e87 drivers/iommu/amd_iommu.c   Joerg Roedel          2012-06-21  1753
07ee86948c9111 drivers/iommu/amd_iommu.c   Joerg Roedel          2015-05-28  1754  	/* Domains are initialized for this device - have a look what we ended up with */
07ee86948c9111 drivers/iommu/amd_iommu.c   Joerg Roedel          2015-05-28  1755  	domain = iommu_get_domain_for_dev(dev);
57f9842e488406 drivers/iommu/amd_iommu.c   Joerg Roedel          2020-04-29  1756  	if (domain->type == IOMMU_DOMAIN_DMA)
ac6d704679d343 drivers/iommu/amd/iommu.c   Jean-Philippe Brucker 2021-06-18 @1757  		iommu_setup_dma_ops(dev, 0, U64_MAX);
d6177a6556f853 drivers/iommu/amd/iommu.c   Jean-Philippe Brucker 2021-04-22  1758  	else
d6177a6556f853 drivers/iommu/amd/iommu.c   Jean-Philippe Brucker 2021-04-22  1759  		set_dma_ops(dev, NULL);
e275a2a0fc9e21 arch/x86/kernel/amd_iommu.c Joerg Roedel          2008-12-10  1760  }
e275a2a0fc9e21 arch/x86/kernel/amd_iommu.c Joerg Roedel          2008-12-10  1761

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
Re: [PATCH v3 05/25] iommu/exynos: Drop IOVA cookie management
On 04.08.2021 19:15, Robin Murphy wrote:
> The core code bakes its own cookies now.
>
> CC: Marek Szyprowski
> Signed-off-by: Robin Murphy

Acked-by: Marek Szyprowski
Tested-by: Marek Szyprowski

> ---
>
> v3: Also remove unneeded include
> ---
>  drivers/iommu/exynos-iommu.c | 19 ---
>  1 file changed, 4 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
> index d0fbf1d10e18..939ffa768986 100644
> --- a/drivers/iommu/exynos-iommu.c
> +++ b/drivers/iommu/exynos-iommu.c
> @@ -21,7 +21,6 @@
>  #include
>  #include
>  #include
> -#include <linux/dma-iommu.h>
>
>  typedef u32 sysmmu_iova_t;
>  typedef u32 sysmmu_pte_t;
> @@ -735,20 +734,16 @@ static struct iommu_domain *exynos_iommu_domain_alloc(unsigned type)
>  	/* Check if correct PTE offsets are initialized */
>  	BUG_ON(PG_ENT_SHIFT < 0 || !dma_dev);
>
> +	if (type != IOMMU_DOMAIN_DMA && type != IOMMU_DOMAIN_UNMANAGED)
> +		return NULL;
> +
>  	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
>  	if (!domain)
>  		return NULL;
>
> -	if (type == IOMMU_DOMAIN_DMA) {
> -		if (iommu_get_dma_cookie(&domain->domain) != 0)
> -			goto err_pgtable;
> -	} else if (type != IOMMU_DOMAIN_UNMANAGED) {
> -		goto err_pgtable;
> -	}
> -
>  	domain->pgtable = (sysmmu_pte_t *)__get_free_pages(GFP_KERNEL, 2);
>  	if (!domain->pgtable)
> -		goto err_dma_cookie;
> +		goto err_pgtable;
>
>  	domain->lv2entcnt = (short *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
>  	if (!domain->lv2entcnt)
> @@ -779,9 +774,6 @@ static struct iommu_domain *exynos_iommu_domain_alloc(unsigned type)
>  	free_pages((unsigned long)domain->lv2entcnt, 1);
>  err_counter:
>  	free_pages((unsigned long)domain->pgtable, 2);
> -err_dma_cookie:
> -	if (type == IOMMU_DOMAIN_DMA)
> -		iommu_put_dma_cookie(&domain->domain);
>  err_pgtable:
>  	kfree(domain);
>  	return NULL;
> @@ -809,9 +801,6 @@ static void exynos_iommu_domain_free(struct iommu_domain *iommu_domain)
>
>  	spin_unlock_irqrestore(&domain->lock, flags);
>
> -	if (iommu_domain->type == IOMMU_DOMAIN_DMA)
> -		iommu_put_dma_cookie(iommu_domain);
> -
>  	dma_unmap_single(dma_dev, virt_to_phys(domain->pgtable), LV1TABLE_SIZE,
>  			 DMA_TO_DEVICE);
>

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
Re: [PATCH v3 01/25] iommu: Pull IOVA cookie management into the core
On 04.08.2021 19:15, Robin Murphy wrote:
> Now that everyone has converged on iommu-dma for IOMMU_DOMAIN_DMA
> support, we can abandon the notion of drivers being responsible for the
> cookie type, and consolidate all the management into the core code.
>
> CC: Marek Szyprowski
> CC: Yoshihiro Shimoda
> CC: Geert Uytterhoeven
> CC: Yong Wu
> CC: Heiko Stuebner
> CC: Chunyan Zhang
> CC: Maxime Ripard
> Reviewed-by: Jean-Philippe Brucker
> Reviewed-by: Lu Baolu
> Signed-off-by: Robin Murphy

Tested-by: Marek Szyprowski

> ---
>
> v3: Use a simpler temporary check instead of trying to be clever with
> the error code
> ---
>  drivers/iommu/iommu.c | 7 +++
>  include/linux/iommu.h | 3 ++-
>  2 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index f2cda9950bd5..b65fcc66ffa4 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -7,6 +7,7 @@
>  #define pr_fmt(fmt)	"iommu: " fmt
>
>  #include
> +#include <linux/dma-iommu.h>
>  #include
>  #include
>  #include
> @@ -1946,6 +1947,11 @@ static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
>  	/* Assume all sizes by default; the driver may override this later */
>  	domain->pgsize_bitmap = bus->iommu_ops->pgsize_bitmap;
>
> +	/* Temporarily avoid -EEXIST while drivers still get their own cookies */
> +	if (type == IOMMU_DOMAIN_DMA && !domain->iova_cookie &&
> +	    iommu_get_dma_cookie(domain)) {
> +		iommu_domain_free(domain);
> +		domain = NULL;
> +	}
>  	return domain;
>  }
>
> @@ -1957,6 +1963,7 @@ EXPORT_SYMBOL_GPL(iommu_domain_alloc);
>
>  void iommu_domain_free(struct iommu_domain *domain)
>  {
> +	iommu_put_dma_cookie(domain);
>  	domain->ops->domain_free(domain);
>  }
>  EXPORT_SYMBOL_GPL(iommu_domain_free);
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 4997c78e2670..141779d76035 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -40,6 +40,7 @@ struct iommu_domain;
>  struct notifier_block;
>  struct iommu_sva;
>  struct iommu_fault_event;
> +struct iommu_dma_cookie;
>
>  /* iommu fault flags */
>  #define IOMMU_FAULT_READ	0x0
> @@ -86,7 +87,7 @@ struct iommu_domain {
>  	iommu_fault_handler_t handler;
>  	void *handler_token;
>  	struct iommu_domain_geometry geometry;
> -	void *iova_cookie;
> +	struct iommu_dma_cookie *iova_cookie;
>  };
>
>  enum iommu_cap {

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
Re: [PATCH v10 10/17] virtio: Handle device reset failure in register_virtio_device()
On 2021/8/4 5:07 PM, Yongji Xie wrote:
> On Wed, Aug 4, 2021 at 4:54 PM Jason Wang wrote:
>> On 2021/8/4 4:50 PM, Yongji Xie wrote:
>>> On Wed, Aug 4, 2021 at 4:32 PM Jason Wang wrote:
>>>> On 2021/8/3 5:38 PM, Yongji Xie wrote:
>>>>> On Tue, Aug 3, 2021 at 4:09 PM Jason Wang wrote:
>>>>>> On 2021/7/29 3:34 PM, Xie Yongji wrote:
>>>>>>> The device reset may fail in the virtio-vdpa case now, so add checks
>>>>>>> to its return value and fail register_virtio_device().
>>>>>>
>>>>>> So would the reset() be called by the driver during remove as well, or
>>>>>> is it sufficient to deal only with the reset during probe?
>>>>>
>>>>> Actually there is no way to handle failure during removal. And it
>>>>> should be safe with the protection of the software IOTLB even if the
>>>>> reset() fails.
>>>>>
>>>>> Thanks,
>>>>> Yongji
>>>>
>>>> If this is true, does it mean we don't even need to care about reset
>>>> failure?
>>>
>>> But we need to handle the failure in the vhost-vdpa case, don't we?
>>
>> Yes, but:
>>
>> - This patch is for virtio, not for vhost; if we don't care about
>>   virtio, we can avoid the changes
>> - For vhost, there could be two ways probably:
>>   1) let set_status report the error
>>   2) require userspace to re-read the status
>>
>> It looks to me you want to go with 1) and I'm not sure whether or not
>> it's too late to go with 2).
>
> Looks like 2) can't work if the reset failure happens in
> vhost_vdpa_release() and vhost_vdpa_open().
>
> Thanks,
> Yongji

Yes, you're right.

Thanks
Re: [PATCH RFC 0/2] dma-pool: allow user to disable atomic pool
On 06/24/21 at 11:47am, Robin Murphy wrote:
> On 2021-06-24 10:29, Baoquan He wrote:
>> On 06/24/21 at 08:40am, Christoph Hellwig wrote:
>>> So reduce the amount allocated. But the pool is needed for proper
>>> operation on systems with memory encryption. And please add the right
>>> maintainer or at least mailing list for the code you're touching next
>>> time.
>>
>> Oh, I thought it was a memory issue only; I should have run
>> ./scripts/get_maintainer.pl. Sorry.
>>
>> About reducing the amount allocated, it may not help. Because on x86_64,
>> the kdump kernel doesn't put any page of memory into the buddy allocator
>> of the DMA zone, it will definitely OOM during atomic_pool_dma
>> initialization.
>>
>> Wondering in which case or on which device the atomic pool is needed on
>> an AMD system with memory encryption enabled. As we can see, the OOM
>> will happen too in the kdump kernel on an Intel system, even though it's
>> not necessary.

Sorry for the very late response, and thanks to both of you for your comments.

> Hmm, I think the Kconfig reshuffle has actually left a slight wrinkle here.
> For DMA_DIRECT_REMAP=y we can assume an atomic pool is always needed, since
> that was the original behaviour anyway. However the implications of
> AMD_MEM_ENCRYPT=y are different - even if support is enabled, it still
> should only be relevant if mem_encrypt_active(), so it probably does make
> sense to have an additional runtime gate on that.
>
> From a quick scan, use of dma_alloc_from_pool() already depends on
> force_dma_unencrypted() so that's probably fine already, but I think we'd
> need a bit of extra protection around dma_free_from_pool() to prevent
> gen_pool_has_addr() dereferencing NULL if the pools are uninitialised, even
> with your proposed patch as it is. Presumably nothing actually called
> dma_direct_free() when you tested this?

Yes, enforcing the conditional check of force_dma_unencrypted() around
dma_free_from_pool() sounds reasonable, just as we have done in
dma_alloc_from_pool().

I have tested this patchset on normal x86_64 systems and on one AMD system
with SME support; disabling the atomic pool fixes the issue that, with no
managed pages in the DMA zone, requesting a page from the DMA zone causes
an allocation failure. And even disabling the atomic pool in the 1st kernel
didn't cause any problem on one AMD EPYC system which supports SME. I am
not an expert in the DMA area, and wonder what the atomic pool is supposed
to do on an SME/SEV system.

Besides, even though the atomic pool is disabled, the slub page for
dma-kmalloc allocations also triggers a page allocation failure. So I have
changed to another way to fix these; please check the v2 post. Disabling
the atomic pool can still be a good-to-have change.

Thanks
Baoquan