Re: [RFC v2] /dev/iommu uAPI proposal

2021-08-05 Thread David Gibson
On Tue, Aug 03, 2021 at 03:19:26AM +, Tian, Kevin wrote:
> > From: David Gibson 
> > Sent: Tuesday, August 3, 2021 9:51 AM
> > 
> > On Wed, Jul 28, 2021 at 04:04:24AM +, Tian, Kevin wrote:
> > > Hi, David,
> > >
> > > > From: David Gibson 
> > > > Sent: Monday, July 26, 2021 12:51 PM
> > > >
> > > > On Fri, Jul 09, 2021 at 07:48:44AM +, Tian, Kevin wrote:
> > > > > /dev/iommu provides a unified interface for managing I/O page tables
> > > > > for devices assigned to userspace. Device passthrough frameworks
> > > > > (VFIO, vDPA, etc.) are expected to use this interface instead of
> > > > > creating their own logic to isolate untrusted device DMAs initiated
> > > > > by userspace.
> > > > >
> > > > > This proposal describes the uAPI of /dev/iommu and also sample
> > > > > sequences with VFIO as example in typical usages. The driver-facing
> > > > > kernel API provided by the iommu layer is still TBD, which can be
> > > > > discussed after consensus is made on this uAPI.
> > > > >
> > > > > It's based on a lengthy discussion starting from here:
> > > > >   https://lore.kernel.org/linux-iommu/20210330132830.go2356...@nvidia.com/
> > > > >
> > > > > v1 can be found here:
> > > > >   https://lore.kernel.org/linux-iommu/PH0PR12MB54811863B392C644E5365446DC3E9@PH0PR12MB5481.namprd12.prod.outlook.com/T/
> > > > >
> > > > > This doc is also tracked on github, though it's not very useful for
> > > > > v1->v2 given dramatic refactoring:
> > > > >   https://github.com/luxis1999/dev_iommu_uapi
> > > >
> > > > Thanks for all your work on this, Kevin.  Apart from the actual
> > > > semantic improvements, I'm finding v2 significantly easier to read and
> > > > understand than v1.
> > > >
> > > > [snip]
> > > > > 1.2. Attach Device to I/O address space
> > > > > +++++++++++++++++++++++++++++++++++++++
> > > > >
> > > > > Device attach/bind is initiated through passthrough framework uAPI.
> > > > >
> > > > > Device attaching is allowed only after a device is successfully bound
> > > > > to the IOMMU fd. User should provide a device cookie when binding the
> > > > > device through VFIO uAPI. This cookie is used when the user queries
> > > > > device capability/format, issues per-device iotlb invalidation and
> > > > > receives per-device I/O page fault data via IOMMU fd.
> > > > >
> > > > > Successful binding puts the device into a security context which
> > > > > isolates its DMA from the rest of the system. VFIO should not allow
> > > > > the user to access the device before binding is completed. Similarly,
> > > > > VFIO should prevent the user from unbinding the device before user
> > > > > access is withdrawn.
> > > > >
> > > > > When a device is in an iommu group which contains multiple devices,
> > > > > all devices within the group must enter/exit the security context
> > > > > together. Please check {1.3} for more info about group isolation via
> > > > > this device-centric design.
> > > > >
> > > > > Successful attaching activates an I/O address space in the IOMMU,
> > > > > if the device is not purely software mediated. VFIO must provide
> > > > > device-specific routing information for where to install the I/O
> > > > > page table in the IOMMU for this device. VFIO must also guarantee
> > > > > that the attached device is configured to compose DMAs with the
> > > > > routing information that is provided in the attaching call. When
> > > > > handling DMA requests, the IOMMU identifies the target I/O address
> > > > > space according to the routing information carried in the request.
> > > > > Misconfiguration breaks DMA isolation and thus could lead to severe
> > > > > security vulnerabilities.
> > > > >
> > > > > Routing information is per-device and bus specific. For PCI, it is
> > > > > the Requester ID (RID) identifying the device plus an optional
> > > > > Process Address Space ID (PASID). For ARM, it is the Stream ID (SID)
> > > > > plus an optional Sub-Stream ID (SSID). PASID or SSID is used when
> > > > > multiple I/O address spaces are enabled on a single device. For
> > > > > simplicity and continuity reasons the following context uses
> > > > > RID+PASID, though SID+SSID may sound a clearer naming from the
> > > > > device's p.o.v. We can decide the actual naming when coding.
> > > > >
> > > > > Because one I/O address space can be attached by multiple devices,
> > > > > per-device routing information (plus device cookie) is tracked under
> > > > > each IOASID and is used respectively when activating the I/O address
> > > > > space in the IOMMU for each attached device.
> > > > >
> > > > > The device in the /dev/iommu context always refers to a physical one
> > > > > (pdev) which is identifiable via RID. Physically each pdev can support
> > > > > one default I/O address space (routed via RID) and optionally multiple
> > > > > non-default I/O address spaces (via RID+PASID).
> > > > >
> > > > > The device in 
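
To make the bind/attach flow concrete, here is a minimal user-space
sketch. All ioctl and structure names below are purely illustrative,
since the RFC deliberately leaves the concrete definitions TBD:

	/* Hypothetical sequence; not a committed uAPI. */
	int iommu_fd = open("/dev/iommu", O_RDWR);

	/* Bind the VFIO device to the IOMMU fd, supplying a user-chosen
	 * cookie that later labels faults and invalidations. */
	struct vfio_device_bind_iommufd bind = {
		.iommu_fd   = iommu_fd,
		.dev_cookie = 0x1,
	};
	ioctl(vfio_device_fd, VFIO_DEVICE_BIND_IOMMUFD, &bind);

	/* Allocate an I/O address space, then attach the device to it.
	 * The routing information (RID, optional PASID) is supplied by
	 * VFIO internally, not by the user. */
	int ioasid = ioctl(iommu_fd, IOMMU_IOASID_ALLOC);
	struct vfio_device_attach_ioasid attach = {
		.iommu_fd = iommu_fd,
		.ioasid   = ioasid,
	};
	ioctl(vfio_device_fd, VFIO_DEVICE_ATTACH_IOASID, &attach);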

Re: [RFC v2] /dev/iommu uAPI proposal

2021-08-05 Thread David Gibson
On Wed, Aug 04, 2021 at 11:04:47AM -0300, Jason Gunthorpe wrote:
> On Mon, Aug 02, 2021 at 02:49:44AM +, Tian, Kevin wrote:
> 
> > Can you elaborate? IMO the user only cares about the label (device cookie
> > plus optional vPASID) which is generated by itself when doing the attaching
> > call, and expects this virtual label being used in various spots
> > (invalidation, page fault, etc.). How the system labels the traffic (the
> > physical RID or RID+PASID) should be completely invisible to userspace.
> 
> I don't think that is true if the vIOMMU driver is also emulating
> PASID. Presumably the same is true for other PASID-like schemes.

Right.  The idea for an SVA capable vIOMMU in my scheme is that the
hypervisor would set up an IOAS of address type "PASID+address" with
the mappings made by the guest according to its vIOMMU semantics.
Then SVA capable devices would be plugged into that IOAS by using
"PASID+address" type endpoints from those devices.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [RFC v2] /dev/iommu uAPI proposal

2021-08-05 Thread David Gibson
On Wed, Aug 04, 2021 at 11:07:42AM -0300, Jason Gunthorpe wrote:
> On Tue, Aug 03, 2021 at 11:58:54AM +1000, David Gibson wrote:
> > > I'd rather deduce the endpoint from a collection of devices than the
> > > other way around...
> > 
> > Which I think is confusing, and in any case doesn't cover the case of
> > one "device" with multiple endpoints.
> 
> Well they are both confusing, and I'd prefer to focus on the common
> case without extra mandatory steps. Exposing optional endpoint sharing
> information seems more in line with where everything is going than
> making endpoint sharing a first class object.
> 
> AFAIK a device with multiple endpoints where those endpoints are
> shared with other devices doesn't really exist or isn't useful? Eg
> PASID has multiple RIDs but they are not shared.

No, I can't think of a (non-contrived) example where a device would
have *both* multiple endpoints and those endpoints are shared amongst
multiple devices.  I can easily think of examples where a device has
multiple (non shared) endpoints and where multiple devices share a
single endpoint.

The point is that making endpoints explicit separates the various
options here from the logic of the IOMMU layer itself.  New device
types with new possibilities here means new interfaces *on those
devices*, but not new interfaces on /dev/iommu.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [PATCH v3 09/25] iommu/sprd: Drop IOVA cookie management

2021-08-05 Thread Chunyan Zhang
On Thu, Aug 5, 2021 at 1:18 AM Robin Murphy  wrote:
>
> The core code bakes its own cookies now.
>
> CC: Chunyan Zhang 
> Signed-off-by: Robin Murphy 

Thank you for the patch!

Acked-by: Chunyan Zhang 

>
> ---
>
> v3: Also remove unneeded include
> ---
>  drivers/iommu/sprd-iommu.c | 7 ---
>  1 file changed, 7 deletions(-)
>
> diff --git a/drivers/iommu/sprd-iommu.c b/drivers/iommu/sprd-iommu.c
> index 73dfd9946312..27ac818b0354 100644
> --- a/drivers/iommu/sprd-iommu.c
> +++ b/drivers/iommu/sprd-iommu.c
> @@ -8,7 +8,6 @@
>
>  #include 
>  #include 
> -#include <linux/dma-iommu.h>
>  #include 
>  #include 
>  #include 
> @@ -144,11 +143,6 @@ static struct iommu_domain *sprd_iommu_domain_alloc(unsigned int domain_type)
> if (!dom)
> return NULL;
>
> -   if (iommu_get_dma_cookie(&dom->domain)) {
> -   kfree(dom);
> -   return NULL;
> -   }
> -
> spin_lock_init(&dom->pgtlock);
>
> dom->domain.geometry.aperture_start = 0;
> @@ -161,7 +155,6 @@ static void sprd_iommu_domain_free(struct iommu_domain *domain)
>  {
> struct sprd_iommu_domain *dom = to_sprd_domain(domain);
>
> -   iommu_put_dma_cookie(domain);
> kfree(dom);
>  }
>
> --
> 2.25.1
>


RE: [RFC v2] /dev/iommu uAPI proposal

2021-08-05 Thread Tian, Kevin
> From: Jason Gunthorpe 
> Sent: Thursday, August 5, 2021 7:27 PM
> 
> On Wed, Aug 04, 2021 at 10:59:21PM +, Tian, Kevin wrote:
> > > From: Jason Gunthorpe 
> > > Sent: Wednesday, August 4, 2021 10:05 PM
> > >
> > > On Mon, Aug 02, 2021 at 02:49:44AM +, Tian, Kevin wrote:
> > >
> > > > Can you elaborate? IMO the user only cares about the label (device
> > > > cookie plus optional vPASID) which is generated by itself when doing
> > > > the attaching call, and expects this virtual label being used in
> > > > various spots (invalidation, page fault, etc.). How the system labels
> > > > the traffic (the physical RID or RID+PASID) should be completely
> > > > invisible to userspace.
> > >
> > > I don't think that is true if the vIOMMU driver is also emulating
> > > PASID. Presumably the same is true for other PASID-like schemes.
> > >
> >
> > I'm getting even more confused with this comment. Isn't it the
> > consensus from day one that physical PASID should not be exposed
> > to userspace as doing so breaks live migration?
> 
> Uh, no?
> 
> > with PASID emulation vIOMMU only cares about vPASID instead of
> > pPASID, and the uAPI only requires user to register vPASID instead
> > of reporting pPASID back to userspace...
> 
> vPASID is only a feature of one device in existence, so we can't make
> vPASID mandatory.
> 

Sure. My point is just that if vPASID is being emulated there is no need
to expose pPASID to user space. Can you give a concrete example of where
pPASID must be exposed and how the user wants to use this
information?

Thanks
Kevin



Re: [PATCH v7 2/9] ACPI/IORT: Add support for RMR node parsing

2021-08-05 Thread Laurentiu Tudor



On 8/5/2021 7:03 PM, Lorenzo Pieralisi wrote:
> On Thu, Aug 05, 2021 at 09:07:17AM +0100, Shameer Kolothum wrote:
> 
> [...]
> 
>> +static void __init iort_node_get_rmr_info(struct acpi_iort_node *iort_node)
>> +{
>> +struct acpi_iort_node *smmu;
>> +struct acpi_iort_rmr *rmr;
>> +struct acpi_iort_rmr_desc *rmr_desc;
>> +u32 map_count = iort_node->mapping_count;
>> +u32 sid;
>> +int i;
>> +
>> +if (!iort_node->mapping_offset || map_count != 1) {
>> +pr_err(FW_BUG "Invalid ID mapping, skipping RMR node %p\n",
>> +   iort_node);
>> +return;
>> +}
>> +
>> +/* Retrieve associated smmu and stream id */
>> +smmu = iort_node_get_id(iort_node, &sid, 0);
>> +if (!smmu) {
>> +pr_err(FW_BUG "Invalid SMMU reference, skipping RMR node %p\n",
>> +   iort_node);
>> +return;
>> +}
>> +
>> +/* Retrieve RMR data */
>> +rmr = (struct acpi_iort_rmr *)iort_node->node_data;
>> +if (!rmr->rmr_offset || !rmr->rmr_count) {
>> +pr_err(FW_BUG "Invalid RMR descriptor array, skipping RMR node %p\n",
>> +   iort_node);
>> +return;
>> +}
>> +
>> +rmr_desc = ACPI_ADD_PTR(struct acpi_iort_rmr_desc, iort_node,
>> +rmr->rmr_offset);
>> +
>> +iort_rmr_desc_check_overlap(rmr_desc, rmr->rmr_count);
>> +
>> +for (i = 0; i < rmr->rmr_count; i++, rmr_desc++) {
>> +struct iommu_resv_region *region;
>> +enum iommu_resv_type type;
>> +int prot = IOMMU_READ | IOMMU_WRITE;
>> +u64 addr = rmr_desc->base_address, size = rmr_desc->length;
>> +
>> +if (!IS_ALIGNED(addr, SZ_64K) || !IS_ALIGNED(size, SZ_64K)) {
>> +/* PAGE align base addr and size */
>> +addr &= PAGE_MASK;
>> +size = PAGE_ALIGN(size + offset_in_page(rmr_desc->base_address));
>> +
>> +pr_err(FW_BUG "RMR descriptor[0x%llx - 0x%llx] not aligned to 64K, continue with [0x%llx - 0x%llx]\n",
>> +   rmr_desc->base_address,
>> +   rmr_desc->base_address + rmr_desc->length - 1,
>> +   addr, addr + size - 1);
>> +}
>> +if (rmr->flags & IOMMU_RMR_REMAP_PERMITTED) {
>> +type = IOMMU_RESV_DIRECT_RELAXABLE;
>> +/*
>> + * Set IOMMU_CACHE as IOMMU_RESV_DIRECT_RELAXABLE is
>> + * normally used for allocated system memory that is
>> + * then used for device specific reserved regions.
>> + */
>> +prot |= IOMMU_CACHE;
>> +} else {
>> +type = IOMMU_RESV_DIRECT;
>> +/*
>> + * Set IOMMU_MMIO as IOMMU_RESV_DIRECT is normally used
>> + * for device memory like MSI doorbell.
>> + */
>> +prot |= IOMMU_MMIO;
>> +}
> 
> On the prot value assignment based on the remapping flag, I'd like to
> hear Robin/Joerg's opinion, I'd avoid being in a situation where
> "normally" this would work but then we have to quirk it.
> 
> Is this a valid assumption _always_ ?

I think we enable quite a few platforms with this assumption, so IMHO
it's a fair compromise for now. As per Jon's comment and out-of-band
discussions, in the long run the spec should probably be updated to
include a way of explicitly specifying memory attributes.

---
Thanks & Best Regards, Laurentiu

> 
>> +
>> +region = iommu_alloc_resv_region(addr, size, prot, type);
>> +if (region) {
>> +region->fw_data.rmr.flags = rmr->flags;
>> +region->fw_data.rmr.sid = sid;
>> +region->fw_data.rmr.smmu = smmu;
>> +list_add_tail(&region->list, &iort_rmr_list);
>> +}
>> +}
>> +}
>> +
>> +static void __init iort_parse_rmr(void)
>> +{
>> +struct acpi_iort_node *iort_node, *iort_end;
>> +struct acpi_table_iort *iort;
>> +int i;
>> +
>> +if (iort_table->revision < 3)
>> +return;
>> +
>> +iort = (struct acpi_table_iort *)iort_table;
>> +
>> +iort_node = ACPI_ADD_PTR(struct acpi_iort_node, iort,
>> + iort->node_offset);
>> +iort_end = ACPI_ADD_PTR(struct acpi_iort_node, iort,
>> +iort_table->length);
>> +
>> +for (i = 0; i < iort->node_count; i++) {
>> +if (WARN_TAINT(iort_node >= iort_end, TAINT_FIRMWARE_WORKAROUND,
>> +   "IORT node pointer overflows, bad table!\n"))
>> +return;
>> +
>> +if (iort_node->type == ACPI_IORT_NODE_RMR)
>> +iort_node_get_rmr_info(iort_node);
>> +
>> +iort_node = ACPI_ADD_PTR(struct acpi_iort_node, iort_node,
>> +   

Re: [PATCH] iommu/arm-smmu-v3: Remove some unneeded init in arm_smmu_cmdq_issue_cmdlist()

2021-08-05 Thread Robin Murphy

On 2021-08-05 16:16, John Garry wrote:

On 05/08/2021 15:41, Robin Murphy wrote:
I suppose they could be combined into a smaller sub-struct and loaded 
in a single operation, but it looks messy, and prob without much gain.


Indeed I wouldn't say that saving memory is the primary concern here, 
and any more convoluted code is hardly going to help performance. Plus 
it still wouldn't help the other cases where we're just copying the 
size into a fake queue to do some prod arithmetic - I hadn't fully 
clocked what was going on there when I skimmed through things earlier.


Disregarding the bogus layout change, though, do you reckon the rest 
of my idea makes sense?


I tried the similar change to avoid zero-init the padding in 
arm_smmu_cmdq_write_entries() and the 
_arm_smmu_cmdq_poll_set_valid_map(), but the disassembly was the same. 
So the compiler must have got smart there.


Yeah, in my build __arm_smmu_cmdq_poll_set_valid_map() only uses 32 
bytes of stack, so clearly it's managed to see through the macro magic 
once queue_inc_prod_n() is inlined and elide the whole struct. 
arm_smmu_cmdq_write_entries() is inlined already, but logically must be 
the same deal since it's a similarly inlined queue_inc_prod_n().


However, that may all change if different compiler flags or a different 
compiler lead to different inlining decisions, so I'd argue that if this 
can matter anywhere then it's worth treating consistently everywhere.


But for the original change in this patch, it did make a difference. 
It's nice to remove what was a memcpy:


     1770: a9077eff stp xzr, xzr, [x23, #112]
}, head = llq;
     1774: 9400 bl 0 

And performance was very fractionally better.


Heh, mine was this beauty:

struct arm_smmu_ll_queue llq = {
17d4:	a9017f7f	stp	xzr, xzr, [x27, #16]
17d8:	a9027f7f	stp	xzr, xzr, [x27, #32]
17dc:	a9037f7f	stp	xzr, xzr, [x27, #48]
17e0:	a9047f7f	stp	xzr, xzr, [x27, #64]
}, head = llq;
17e4:	b900c340	str	w0, [x26, #192]
{
17e8:	290d0be1	stp	w1, w2, [sp, #104]
}, head = llq;
17ec:	a9440f62	ldp	x2, x3, [x27, #64]
17f0:	a9007f5f	stp	xzr, xzr, [x26]
17f4:	a9017f5f	stp	xzr, xzr, [x26, #16]
17f8:	a9027f5f	stp	xzr, xzr, [x26, #32]
17fc:	a9037f5f	stp	xzr, xzr, [x26, #48]
1800:	a9040f42	stp	x2, x3, [x26, #64]
struct arm_smmu_ll_queue llq = {
1804:	a9057f7f	stp	xzr, xzr, [x27, #80]
}, head = llq;
1808:	a9057f5f	stp	xzr, xzr, [x26, #80]
struct arm_smmu_ll_queue llq = {
180c:	a9067f7f	stp	xzr, xzr, [x27, #96]
}, head = llq;
1810:	a9067f5f	stp	xzr, xzr, [x26, #96]
struct arm_smmu_ll_queue llq = {
1814:	a9077f7f	stp	xzr, xzr, [x27, #112]
}, head = llq;
1818:	a9077f5f	stp	xzr, xzr, [x26, #112]
struct arm_smmu_ll_queue llq = {
181c:	a9087f5f	stp	xzr, xzr, [x26, #128]

As for pre-evaluating "nents", I'm not sure how much that can help, but 
I am not too optimistic. I can try some testing when I get a chance. 
Having said that, I would need to check the disassembly also.


It'll just turn MOV,LDR,LSL sequences into plain LDRs - a small saving 
but with no real downside, and a third of it is in the place where doing 
less work matters most:


add/remove: 0/0 grow/shrink: 0/8 up/down: 0/-100 (-100)
Function                                  old     new   delta
arm_smmu_priq_thread                      532     528      -4
arm_smmu_evtq_thread                      368     364      -4
arm_smmu_device_probe                    4564    4556      -8
__arm_smmu_cmdq_poll_set_valid_map.isra   316     308      -8
arm_smmu_init_one_queue.isra              320     308     -12
queue_remove_raw                          192     176     -16
arm_smmu_gerror_handler                   752     736     -16
arm_smmu_cmdq_issue_cmdlist              1812    1780     -32
Total: Before=23776, After=23676, chg -0.42%


Robin.

Re: [PATCH v7 2/9] ACPI/IORT: Add support for RMR node parsing

2021-08-05 Thread Jon Nettleton
On Thu, Aug 5, 2021 at 6:03 PM Lorenzo Pieralisi
 wrote:
>
> On Thu, Aug 05, 2021 at 09:07:17AM +0100, Shameer Kolothum wrote:
>
> [...]
>
> > +static void __init iort_node_get_rmr_info(struct acpi_iort_node *iort_node)
> > +{
> > + struct acpi_iort_node *smmu;
> > + struct acpi_iort_rmr *rmr;
> > + struct acpi_iort_rmr_desc *rmr_desc;
> > + u32 map_count = iort_node->mapping_count;
> > + u32 sid;
> > + int i;
> > +
> > + if (!iort_node->mapping_offset || map_count != 1) {
> > + pr_err(FW_BUG "Invalid ID mapping, skipping RMR node %p\n",
> > +iort_node);
> > + return;
> > + }
> > +
> > + /* Retrieve associated smmu and stream id */
> > + smmu = iort_node_get_id(iort_node, &sid, 0);
> > + if (!smmu) {
> > + pr_err(FW_BUG "Invalid SMMU reference, skipping RMR node %p\n",
> > +iort_node);
> > + return;
> > + }
> > +
> > + /* Retrieve RMR data */
> > + rmr = (struct acpi_iort_rmr *)iort_node->node_data;
> > + if (!rmr->rmr_offset || !rmr->rmr_count) {
> > + pr_err(FW_BUG "Invalid RMR descriptor array, skipping RMR node %p\n",
> > +iort_node);
> > + return;
> > + }
> > +
> > + rmr_desc = ACPI_ADD_PTR(struct acpi_iort_rmr_desc, iort_node,
> > + rmr->rmr_offset);
> > +
> > + iort_rmr_desc_check_overlap(rmr_desc, rmr->rmr_count);
> > +
> > + for (i = 0; i < rmr->rmr_count; i++, rmr_desc++) {
> > + struct iommu_resv_region *region;
> > + enum iommu_resv_type type;
> > + int prot = IOMMU_READ | IOMMU_WRITE;
> > + u64 addr = rmr_desc->base_address, size = rmr_desc->length;
> > +
> > + if (!IS_ALIGNED(addr, SZ_64K) || !IS_ALIGNED(size, SZ_64K)) {
> > + /* PAGE align base addr and size */
> > + addr &= PAGE_MASK;
> > + size = PAGE_ALIGN(size + offset_in_page(rmr_desc->base_address));
> > +
> > + pr_err(FW_BUG "RMR descriptor[0x%llx - 0x%llx] not aligned to 64K, continue with [0x%llx - 0x%llx]\n",
> > +rmr_desc->base_address,
> > +rmr_desc->base_address + rmr_desc->length - 1,
> > +addr, addr + size - 1);
> > + }
> > + if (rmr->flags & IOMMU_RMR_REMAP_PERMITTED) {
> > + type = IOMMU_RESV_DIRECT_RELAXABLE;
> > + /*
> > +  * Set IOMMU_CACHE as IOMMU_RESV_DIRECT_RELAXABLE is
> > +  * normally used for allocated system memory that is
> > +  * then used for device specific reserved regions.
> > +  */
> > + prot |= IOMMU_CACHE;
> > + } else {
> > + type = IOMMU_RESV_DIRECT;
> > + /*
> > +  * Set IOMMU_MMIO as IOMMU_RESV_DIRECT is normally 
> > used
> > +  * for device memory like MSI doorbell.
> > +  */
> > + prot |= IOMMU_MMIO;
> > + }
>
> On the prot value assignment based on the remapping flag, I'd like to
> hear Robin/Joerg's opinion, I'd avoid being in a situation where
> "normally" this would work but then we have to quirk it.
>
> Is this a valid assumption _always_ ?

These assumptions were made based on the historic use cases I could
find reading the history.  There aren't many known examples "in the wild"
because so far we haven't had a mechanism other than quirks based
around device-tree implementations.

Ultimately I believe the proper solution will need to be another flag
in the RMR table that specifies the type of memory an RMR Node
describes, not just the base and length.

-Jon

>
> Thanks,
> Lorenzo
>
> > +
> > + region = iommu_alloc_resv_region(addr, size, prot, type);
> > + if (region) {
> > + region->fw_data.rmr.flags = rmr->flags;
> > + region->fw_data.rmr.sid = sid;
> > + region->fw_data.rmr.smmu = smmu;
> > + list_add_tail(&region->list, &iort_rmr_list);
> > + }
> > + }
> > +}
> > +
> > +static void __init iort_parse_rmr(void)
> > +{
> > + struct acpi_iort_node *iort_node, *iort_end;
> > + struct acpi_table_iort *iort;
> > + int i;
> > +
> > + if (iort_table->revision < 3)
> > + return;
> > +
> > + iort = (struct acpi_table_iort *)iort_table;
> > +
> > + iort_node = ACPI_ADD_PTR(struct acpi_iort_node, iort,
> > +  iort->node_offset);
> > + iort_end = ACPI_ADD_PTR(struct acpi_iort_node, iort,
> > + iort_table->length);
> > +
> > + for (i = 0; i < iort->node_count; i++) {
> > + if (WARN_TAINT(iort_node >= iort_end, 
> > 

Re: [PATCH v3] iommu/amd: Use report_iommu_fault()

2021-08-05 Thread Suthikulpanit, Suravee via iommu

Lennert,

FYI: I have made some comments in V2 thread specifically around the new changes 
that we discussed in that thread.

Thanks,
Suravee


Re: [PATCH v2] iommu/amd: Use report_iommu_fault()

2021-08-05 Thread Suthikulpanit, Suravee via iommu

Lennert,

On 7/29/2021 9:32 PM, Lennert Buytenhek wrote:

We have three cases to handle:

- EVENT_FLAG_I set: IRQ remapping fault, don't call report_iommu_fault()

- EVENT_FLAG_I unset, but the request was a translation request
   (EVENT_FLAG_TR set) or the target page was not present (EVENT_FLAG_PR
   unset): call report_iommu_fault(), but the RW bit will be invalid, so
   don't try to map it to a IOMMU_FAULT_{READ,WRITE} code


So, why do we need to call report_iommu_fault() for this case?
My understanding is we only have IOMMU_FAULT_[READ|WRITE].
So, if we can't identify whether the DMA is a read or a write,
we should not need to call report_iommu_fault(), should we?


- EVENT_FLAG_I unset, the request is a transaction request (EVENT_FLAG_TR
   unset) and the target page was present (EVENT_FLAG_PR set): call
   report_iommu_fault(), and use the RW bit to set IOMMU_FAULT_{READ,WRITE}

So I don't think we can merge the test for EVENT_FLAG_I with the
test for EVENT_FLAG_TR/EVENT_FLAG_PR.


The only condition under which we would call report_iommu_fault() is
I=0, TR=0, PR=1, isn't it? So we should be able to just check if PR=1.



We could do something like this, if you'd prefer:

#define IS_IOMMU_MEM_TRANSACTION(flags) \
(((flags) & EVENT_FLAG_I) == 0)

#define IS_RW_FLAG_VALID(flags) \
(((flags) & (EVENT_FLAG_TR | EVENT_FLAG_PR)) == EVENT_FLAG_PR)

#define IS_WRITE_REQUEST(flags) \
(IS_RW_FLAG_VALID(flags) && (flags & EVENT_FLAG_RW))

And then do something like:

	if (dev_data && IS_IOMMU_MEM_TRANSACTION(flags)) {
		if (!report_iommu_fault(&dev_data->domain->domain, &pdev->dev,
					address,
					IS_WRITE_REQUEST(flags) ?
					IOMMU_FAULT_WRITE : IOMMU_FAULT_READ))


Actually, IS_WRITE_REQUEST() == 0 could mean:
- I=0, TR=0, PR=1 and RW=0: This is fine.
- I=0, (TR=1 or PR=0), and we should not be calling report_iommu_fault() here
  since we cannot specify READ/WRITE here.

Thanks,
Suravee


goto out;
}
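
Collapsing Suravee's suggestion into code would look roughly like the
sketch below (untested, and the &pdev->dev device pointer is a guess
based on the snippet above; the final macro names were still under
discussion in this thread):

	/* Report a fault only when I=0, TR=0 and PR=1, i.e. only when
	 * the RW bit is actually meaningful. */
	#define IS_REPORTABLE_FAULT(flags) \
		(((flags) & (EVENT_FLAG_I | EVENT_FLAG_TR | EVENT_FLAG_PR)) == \
		 EVENT_FLAG_PR)

	if (dev_data && IS_REPORTABLE_FAULT(flags)) {
		if (!report_iommu_fault(&dev_data->domain->domain, &pdev->dev,
					address,
					(flags & EVENT_FLAG_RW) ?
					IOMMU_FAULT_WRITE : IOMMU_FAULT_READ))
			goto out;
	}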




Re: [PATCH v7 2/9] ACPI/IORT: Add support for RMR node parsing

2021-08-05 Thread Lorenzo Pieralisi
On Thu, Aug 05, 2021 at 09:07:17AM +0100, Shameer Kolothum wrote:

[...]

> +static void __init iort_node_get_rmr_info(struct acpi_iort_node *iort_node)
> +{
> + struct acpi_iort_node *smmu;
> + struct acpi_iort_rmr *rmr;
> + struct acpi_iort_rmr_desc *rmr_desc;
> + u32 map_count = iort_node->mapping_count;
> + u32 sid;
> + int i;
> +
> + if (!iort_node->mapping_offset || map_count != 1) {
> + pr_err(FW_BUG "Invalid ID mapping, skipping RMR node %p\n",
> +iort_node);
> + return;
> + }
> +
> + /* Retrieve associated smmu and stream id */
> + smmu = iort_node_get_id(iort_node, &sid, 0);
> + if (!smmu) {
> + pr_err(FW_BUG "Invalid SMMU reference, skipping RMR node %p\n",
> +iort_node);
> + return;
> + }
> +
> + /* Retrieve RMR data */
> + rmr = (struct acpi_iort_rmr *)iort_node->node_data;
> + if (!rmr->rmr_offset || !rmr->rmr_count) {
> + pr_err(FW_BUG "Invalid RMR descriptor array, skipping RMR node %p\n",
> +iort_node);
> + return;
> + }
> +
> + rmr_desc = ACPI_ADD_PTR(struct acpi_iort_rmr_desc, iort_node,
> + rmr->rmr_offset);
> +
> + iort_rmr_desc_check_overlap(rmr_desc, rmr->rmr_count);
> +
> + for (i = 0; i < rmr->rmr_count; i++, rmr_desc++) {
> + struct iommu_resv_region *region;
> + enum iommu_resv_type type;
> + int prot = IOMMU_READ | IOMMU_WRITE;
> + u64 addr = rmr_desc->base_address, size = rmr_desc->length;
> +
> + if (!IS_ALIGNED(addr, SZ_64K) || !IS_ALIGNED(size, SZ_64K)) {
> + /* PAGE align base addr and size */
> + addr &= PAGE_MASK;
> + size = PAGE_ALIGN(size + offset_in_page(rmr_desc->base_address));
> +
> + pr_err(FW_BUG "RMR descriptor[0x%llx - 0x%llx] not aligned to 64K, continue with [0x%llx - 0x%llx]\n",
> +rmr_desc->base_address,
> +rmr_desc->base_address + rmr_desc->length - 1,
> +addr, addr + size - 1);
> + }
> + if (rmr->flags & IOMMU_RMR_REMAP_PERMITTED) {
> + type = IOMMU_RESV_DIRECT_RELAXABLE;
> + /*
> +  * Set IOMMU_CACHE as IOMMU_RESV_DIRECT_RELAXABLE is
> +  * normally used for allocated system memory that is
> +  * then used for device specific reserved regions.
> +  */
> + prot |= IOMMU_CACHE;
> + } else {
> + type = IOMMU_RESV_DIRECT;
> + /*
> +  * Set IOMMU_MMIO as IOMMU_RESV_DIRECT is normally used
> +  * for device memory like MSI doorbell.
> +  */
> + prot |= IOMMU_MMIO;
> + }

On the prot value assignment based on the remapping flag, I'd like to
hear Robin/Joerg's opinion, I'd avoid being in a situation where
"normally" this would work but then we have to quirk it.

Is this a valid assumption _always_ ?

Thanks,
Lorenzo

> +
> + region = iommu_alloc_resv_region(addr, size, prot, type);
> + if (region) {
> + region->fw_data.rmr.flags = rmr->flags;
> + region->fw_data.rmr.sid = sid;
> + region->fw_data.rmr.smmu = smmu;
> + list_add_tail(&region->list, &iort_rmr_list);
> + }
> + }
> +}
> +
> +static void __init iort_parse_rmr(void)
> +{
> + struct acpi_iort_node *iort_node, *iort_end;
> + struct acpi_table_iort *iort;
> + int i;
> +
> + if (iort_table->revision < 3)
> + return;
> +
> + iort = (struct acpi_table_iort *)iort_table;
> +
> + iort_node = ACPI_ADD_PTR(struct acpi_iort_node, iort,
> +  iort->node_offset);
> + iort_end = ACPI_ADD_PTR(struct acpi_iort_node, iort,
> + iort_table->length);
> +
> + for (i = 0; i < iort->node_count; i++) {
> + if (WARN_TAINT(iort_node >= iort_end, TAINT_FIRMWARE_WORKAROUND,
> +"IORT node pointer overflows, bad table!\n"))
> + return;
> +
> + if (iort_node->type == ACPI_IORT_NODE_RMR)
> + iort_node_get_rmr_info(iort_node);
> +
> + iort_node = ACPI_ADD_PTR(struct acpi_iort_node, iort_node,
> +  iort_node->length);
> + }
> +}
>  
>  static void __init iort_init_platform_devices(void)
>  {
> @@ -1636,6 +1767,7 @@ void __init acpi_iort_init(void)
>   }
>  
>   iort_init_platform_devices();
> + iort_parse_rmr();
>  }
>  
>  #ifdef CONFIG_ZONE_DMA
> -- 
> 2.17.1
> 

Re: [PATCH V2 11/14] x86/Swiotlb: Add Swiotlb bounce buffer remap function for HV IVM

2021-08-05 Thread Tianyu Lan

Hi Konrad:
 Could you have a look at this new version? The change since v1 is
to make swiotlb_init_io_tlb_mem() return an error code when
dma_map_decrypted() fails, according to your previous comment. If this
change is ok, could you give your ack? This series needs to be merged
via the Hyper-V next tree.


Thanks.

On 8/5/2021 2:45 AM, Tianyu Lan wrote:

From: Tianyu Lan 

In an Isolation VM with AMD SEV, the bounce buffer needs to be accessed
via an extra address space which is above shared_gpa_boundary
(e.g. the 39-bit address line) reported by the Hyper-V CPUID leaf
ISOLATION_CONFIG. The physical address used for access will be the
original physical address + shared_gpa_boundary. The shared_gpa_boundary
is called the virtual top of memory (vTOM) in the AMD SEV-SNP spec.
Memory addresses below vTOM are automatically treated as private while
memory above vTOM is treated as shared.

Use dma_map_decrypted() in the swiotlb code, store remap address returned
and use the remap address to copy data from/to swiotlb bounce buffer.

Signed-off-by: Tianyu Lan 
---
Change since v1:
* Make swiotlb_init_io_tlb_mem() return error code and return
  error when dma_map_decrypted() fails.

Signed-off-by: Tianyu Lan 
---
  include/linux/swiotlb.h |  4 ++++
  kernel/dma/swiotlb.c    | 32 ++++++++++++++++++++++++--------
  2 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index f507e3eacbea..584560ecaa8e 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -72,6 +72,9 @@ extern enum swiotlb_force swiotlb_force;
   * @end:  The end address of the swiotlb memory pool. Used to do a quick
   *range check to see if the memory was in fact allocated by this
   *API.
+ * @vaddr:	The vaddr of the swiotlb memory pool. The swiotlb memory pool
+ *		may be remapped in the memory encrypted case and store the
+ *		virtual address for bounce buffer operation.
   * @nslabs:   The number of IO TLB blocks (in groups of 64) between @start and
   *@end. For default swiotlb, this is command line adjustable via
   *setup_io_tlb_npages.
@@ -89,6 +92,7 @@ extern enum swiotlb_force swiotlb_force;
  struct io_tlb_mem {
phys_addr_t start;
phys_addr_t end;
+   void *vaddr;
unsigned long nslabs;
unsigned long used;
unsigned int index;
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 1fa81c096c1d..29b6d888ef3b 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -176,7 +176,7 @@ void __init swiotlb_update_mem_attributes(void)
memset(vaddr, 0, bytes);
  }
  
-static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
+static int swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
				unsigned long nslabs, bool late_alloc)
  {
void *vaddr = phys_to_virt(start);
@@ -194,14 +194,21 @@ static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
mem->slots[i].alloc_size = 0;
}
  
-	set_memory_decrypted((unsigned long)vaddr, bytes >> PAGE_SHIFT);
-	memset(vaddr, 0, bytes);
+   mem->vaddr = dma_map_decrypted(vaddr, bytes);
+   if (!mem->vaddr) {
+   pr_err("Failed to decrypt memory.\n");
+   return -ENOMEM;
+   }
+
+   memset(mem->vaddr, 0, bytes);
+   return 0;
  }
  
  int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)

  {
struct io_tlb_mem *mem;
size_t alloc_size;
+   int ret;
  
  	if (swiotlb_force == SWIOTLB_NO_FORCE)

return 0;
@@ -216,7 +223,11 @@ int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)
panic("%s: Failed to allocate %zu bytes align=0x%lx\n",
  __func__, alloc_size, PAGE_SIZE);
  
-	swiotlb_init_io_tlb_mem(mem, __pa(tlb), nslabs, false);
+	ret = swiotlb_init_io_tlb_mem(mem, __pa(tlb), nslabs, false);
+   if (ret) {
+   memblock_free(__pa(mem), alloc_size);
+   return ret;
+   }
  
  	io_tlb_default_mem = mem;

if (verbose)
@@ -304,6 +315,8 @@ int
  swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
  {
struct io_tlb_mem *mem;
+   int size = get_order(struct_size(mem, slots, nslabs));
+   int ret;
  
  	if (swiotlb_force == SWIOTLB_NO_FORCE)

return 0;
@@ -312,12 +325,15 @@ swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
if (WARN_ON_ONCE(io_tlb_default_mem))
return -ENOMEM;
  
-	mem = (void *)__get_free_pages(GFP_KERNEL,
-				       get_order(struct_size(mem, slots, nslabs)));
+	mem = (void *)__get_free_pages(GFP_KERNEL, size);
if (!mem)
return -ENOMEM;
  
-	swiotlb_init_io_tlb_mem(mem, virt_to_phys(tlb), nslabs, true);

+   ret = swiotlb_init_io_tlb_mem(mem, virt_to_phys(tlb), nslabs, 

Re: [PATCH V2 10/14] DMA: Add dma_map_decrypted/dma_unmap_encrypted() function

2021-08-05 Thread Tianyu Lan

Hi Christoph:
 Could you have a look at this patch? It adds a new API,
dma_map_decrypted(), to decrypt and remap memory. It will
be used in the swiotlb code.

Thanks.

On 8/5/2021 2:45 AM, Tianyu Lan wrote:

From: Tianyu Lan 

In a Hyper-V Isolation VM with AMD SEV, the swiotlb bounce buffer
needs to be mapped into the address space above vTOM, so
introduce dma_map_decrypted()/dma_unmap_encrypted() to map/unmap
bounce buffer memory. The platform can populate the map/unmap callbacks
in the dma memory decrypted ops.

Signed-off-by: Tianyu Lan 
---
  include/linux/dma-map-ops.h |  9 +++++++++
  kernel/dma/mapping.c        | 22 ++++++++++++++++++++++
  2 files changed, 31 insertions(+)

diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index 0d53a96a3d64..01d60a024e45 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -71,6 +71,11 @@ struct dma_map_ops {
unsigned long (*get_merge_boundary)(struct device *dev);
  };
  
+struct dma_memory_decrypted_ops {

+   void *(*map)(void *addr, unsigned long size);
+   void (*unmap)(void *addr);
+};
+
  #ifdef CONFIG_DMA_OPS
  #include 
  
@@ -374,6 +379,10 @@ static inline void debug_dma_dump_mappings(struct device *dev)

  }
  #endif /* CONFIG_DMA_API_DEBUG */
  
+void *dma_map_decrypted(void *addr, unsigned long size);
+int dma_unmap_encrypted(void *addr, unsigned long size);
+
  extern const struct dma_map_ops dma_dummy_ops;
+extern struct dma_memory_decrypted_ops dma_memory_generic_decrypted_ops;
  
  #endif /* _LINUX_DMA_MAP_OPS_H */

diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 2b06a809d0b9..6fb150dc1750 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -13,11 +13,13 @@
  #include 
  #include 
  #include 
+#include 
  #include "debug.h"
  #include "direct.h"
  
  bool dma_default_coherent;
  
+struct dma_memory_decrypted_ops dma_memory_generic_decrypted_ops;

  /*
   * Managed DMA API
   */
@@ -736,3 +738,23 @@ unsigned long dma_get_merge_boundary(struct device *dev)
return ops->get_merge_boundary(dev);
  }
  EXPORT_SYMBOL_GPL(dma_get_merge_boundary);
+
+void *dma_map_decrypted(void *addr, unsigned long size)
+{
+   if (set_memory_decrypted((unsigned long)addr,
+size / PAGE_SIZE))
+   return NULL;
+
+   if (dma_memory_generic_decrypted_ops.map)
+   return dma_memory_generic_decrypted_ops.map(addr, size);
+   else
+   return addr;
+}
+
+int dma_unmap_encrypted(void *addr, unsigned long size)
+{
+   if (dma_memory_generic_decrypted_ops.unmap)
+   dma_memory_generic_decrypted_ops.unmap(addr);
+
+   return set_memory_encrypted((unsigned long)addr, size / PAGE_SIZE);
+}
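
To see how the pieces are meant to fit together: a platform (Hyper-V in
this series) would populate the ops at boot, after which swiotlb's call
to dma_map_decrypted() transparently picks up the remapping. A
simplified sketch follows; the two hyperv_* helper names are invented
here for illustration, and shared_gpa_boundary is assumed to be the
ms_hyperv field this series adds:

	/* Platform boot code (sketch): hook the decrypted-memory ops. */
	static void *hyperv_map_decrypted_pages(void *addr, unsigned long size)
	{
		/* Remap the buffer above shared_gpa_boundary (vTOM). */
		return memremap(virt_to_phys(addr) + ms_hyperv.shared_gpa_boundary,
				size, MEMREMAP_WB);
	}

	static void hyperv_unmap_decrypted_pages(void *addr)
	{
		memunmap(addr);
	}

	dma_memory_generic_decrypted_ops.map = hyperv_map_decrypted_pages;
	dma_memory_generic_decrypted_ops.unmap = hyperv_unmap_decrypted_pages;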




Re: [PATCH V2 03/14] x86/set_memory: Add x86_set_memory_enc static call support

2021-08-05 Thread Tianyu Lan

On 8/5/2021 10:29 PM, Dave Hansen wrote:

On 8/5/21 7:23 AM, Peter Zijlstra wrote:

This is assuming any of this is actually performance critical, based off
of this using static_call() to begin with.


This code is not performance critical.

I think I sent folks off on a wild goose chase when I asked that we make
an effort to optimize code that does:

if (some_hyperv_check())
foo();

if (some_amd_feature_check())
bar();

with checks that will actually compile away when Hyper-V or
some_amd_feature() is disabled.  That's less about performance and just
about good hygiene.  I *wanted* to see
cpu_feature_enabled(X86_FEATURE...) checks.

Someone suggested using static calls, and off we went...

Could we please just use cpu_feature_enabled()?



Yes, cpu_feature_enabled() works. The goal is just to run platform
code after a platform check. I will update this in the next version.



Re: [PATCH v7 4/9] ACPI/IORT: Add a helper to retrieve RMR memory regions

2021-08-05 Thread Lorenzo Pieralisi
On Thu, Aug 05, 2021 at 09:07:19AM +0100, Shameer Kolothum wrote:
> Add a helper function (iort_iommu_get_rmrs()) that retrieves RMR
> memory descriptors associated with a given IOMMU. This will be used
> by IOMMU drivers to setup necessary mappings.
> 
> Invoke it from the generic helper iommu_dma_get_rmrs().
> 
> Signed-off-by: Shameer Kolothum 
> ---
>  drivers/acpi/arm64/iort.c | 38 ++
>  drivers/iommu/dma-iommu.c |  4 
>  include/linux/acpi_iort.h |  7 +++
>  3 files changed, 49 insertions(+)

Reviewed-by: Lorenzo Pieralisi 

> diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> index d76ba46ebe67..3c32d62e63b6 100644
> --- a/drivers/acpi/arm64/iort.c
> +++ b/drivers/acpi/arm64/iort.c
> @@ -809,6 +809,42 @@ static struct acpi_iort_node 
> *iort_get_msi_resv_iommu(struct device *dev)
>   return NULL;
>  }
>  
> +/**
> + * iort_iommu_get_rmrs() - Helper to retrieve RMR info associated with IOMMU
> + * @iommu_fwnode: fwnode for the IOMMU
> + * @head: RMR list head to be populated
> + *
> + * Returns: 0 on success, <0 failure. Please note, we will keep the already
> + *  allocated RMR reserve regions in case of a kmemdup()
> + *  failure.
> + */
> +int iort_iommu_get_rmrs(struct fwnode_handle *iommu_fwnode,
> + struct list_head *head)
> +{
> + struct iommu_resv_region *e;
> + struct acpi_iort_node *iommu;
> + int rmrs = 0;
> +
> + iommu = iort_get_iort_node(iommu_fwnode);
> + if (!iommu || list_empty(_rmr_list))
> + return -ENODEV;
> +
> + list_for_each_entry(e, _rmr_list, list) {
> + struct iommu_resv_region *region;
> +
> + if (e->fw_data.rmr.smmu != iommu)
> + continue;
> +
> + region = kmemdup(e, sizeof(*region), GFP_KERNEL);
> + if (region) {
> + list_add_tail(>list, head);
> + rmrs++;
> + }
> + }
> +
> + return (rmrs == 0) ? -ENODEV : 0;
> +}
> +
>  /**
>   * iort_iommu_msi_get_resv_regions - Reserved region driver helper
>   * @dev: Device from iommu_get_resv_regions()
> @@ -1041,6 +1077,8 @@ int iort_iommu_msi_get_resv_regions(struct device *dev, 
> struct list_head *head)
>  { return 0; }
>  int iort_iommu_configure_id(struct device *dev, const u32 *input_id)
>  { return -ENODEV; }
> +int iort_iommu_get_rmrs(struct fwnode_handle *fwnode, struct list_head *head)
> +{ return -ENODEV; }
>  #endif
>  
>  static int nc_dma_get_range(struct device *dev, u64 *size)
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 2fa2445e9070..1b6e27475279 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -185,6 +185,9 @@ EXPORT_SYMBOL(iommu_put_dma_cookie);
>  int iommu_dma_get_rmrs(struct fwnode_handle *iommu_fwnode,
>  struct list_head *list)
>  {
> + if (!is_of_node(iommu_fwnode))
> + return iort_iommu_get_rmrs(iommu_fwnode, list);
> +
>   return -EINVAL;
>  }
>  EXPORT_SYMBOL(iommu_dma_get_rmrs);
> @@ -200,6 +203,7 @@ EXPORT_SYMBOL(iommu_dma_get_rmrs);
>  void iommu_dma_put_rmrs(struct fwnode_handle *iommu_fwnode,
>   struct list_head *list)
>  {
> + generic_iommu_put_resv_regions(iommu_fwnode->dev, list);
>  }
>  EXPORT_SYMBOL(iommu_dma_put_rmrs);
>  
> diff --git a/include/linux/acpi_iort.h b/include/linux/acpi_iort.h
> index f1f0842a2cb2..d8c030c103f5 100644
> --- a/include/linux/acpi_iort.h
> +++ b/include/linux/acpi_iort.h
> @@ -38,6 +38,8 @@ int iort_dma_get_ranges(struct device *dev, u64 *size);
>  int iort_iommu_configure_id(struct device *dev, const u32 *id_in);
>  int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head 
> *head);
>  phys_addr_t acpi_iort_dma_get_max_cpu_address(void);
> +int iort_iommu_get_rmrs(struct fwnode_handle *iommu_fwnode,
> + struct list_head *list);
>  #else
>  static inline void acpi_iort_init(void) { }
>  static inline u32 iort_msi_map_id(struct device *dev, u32 id)
> @@ -57,6 +59,11 @@ int iort_iommu_msi_get_resv_regions(struct device *dev, 
> struct list_head *head)
>  
>  static inline phys_addr_t acpi_iort_dma_get_max_cpu_address(void)
>  { return PHYS_ADDR_MAX; }
> +
> +static inline
> +int iort_iommu_get_rmrs(struct fwnode_handle *iommu_fwnode,
> + struct list_head *list)
> +{ return -ENODEV; }
>  #endif
>  
>  #endif /* __ACPI_IORT_H__ */
> -- 
> 2.17.1
> 


Re: [PATCH] iommu/arm-smmu-v3: Remove some unneeded init in arm_smmu_cmdq_issue_cmdlist()

2021-08-05 Thread John Garry

On 05/08/2021 15:41, Robin Murphy wrote:
I suppose they could be combined into a smaller sub-struct and loaded 
in a single operation, but it looks messy, and prob without much gain.


Indeed I wouldn't say that saving memory is the primary concern here, 
and any more convoluted code is hardly going to help performance. Plus 
it still wouldn't help the other cases where we're just copying the size 
into a fake queue to do some prod arithmetic - I hadn't fully clocked 
what was going on there when I skimmed through things earlier.


Disregarding the bogus layout change, though, do you reckon the rest of 
my idea makes sense?


I tried the similar change to avoid zero-init the padding in 
arm_smmu_cmdq_write_entries() and the 
_arm_smmu_cmdq_poll_set_valid_map(), but the disassembly was the same. 
So the compiler must have got smart there.


But for the original change in this patch, it did make a difference. 
It's nice to remove what was a memcpy:


1770: a9077eff stp xzr, xzr, [x23, #112]
}, head = llq;
1774: 9400 bl 0 

And performance was very fractionally better.

As for pre-evaluating "nents", I'm not sure how much that can help, but 
I am not too optimistic. I can try some testing when I get a chance. 
Having said that, I would need to check the disassembly also.


Thanks,
John


Re: [PATCH] iommu/arm-smmu-v3: Remove some unneeded init in arm_smmu_cmdq_issue_cmdlist()

2021-08-05 Thread Robin Murphy

On 2021-08-05 14:40, John Garry wrote:

On 05/08/2021 12:24, Robin Murphy wrote:

On 2021-06-21 17:36, John Garry wrote:
Members of struct "llq" will be zero-inited, apart from member
max_n_shift. But we write llq.val straight after the init, so it was
pointless to zero init those other members. As such, separately init
member max_n_shift only.

In addition, struct "head" is initialised to "llq" only so that member
max_n_shift is set. But that member is never referenced for "head", so
remove any init there.

Removing these initializations is seen as a small performance
optimisation, as this code is (very) hot path.


I looked at this and immediately thought "surely the compiler can see 
that all the prod/cons/val fields are written anyway and elide the 
initialisation?", so I dumped the before and after disassembly, and... 
oh.


You should probably clarify that it's zero-initialising all the 
cacheline padding which is both pointless and painful. With that,


Reviewed-by: Robin Murphy 

However, having looked this closely I'm now tangentially wondering why 
max_n_shift isn't inside the padded union? It's read at the same time 
as both prod and cons by queue_has_space(), and never updated, so 
there doesn't appear to be any benefit to it being in a separate 
cacheline all by itself, and llq is already twice as big as it needs 
to be.


I think that the problem is if the prod+cons 64b value and the shift are 
on the same cacheline, then we have a chance of accessing a stale 
cacheline twice:


static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
    u64 *cmds, int n, bool sync)
{
 u64 cmd_sync[CMDQ_ENT_DWORDS];
 u32 prod;
 unsigned long flags;
 bool owner;
 struct arm_smmu_cmdq *cmdq = >cmdq;
 struct arm_smmu_ll_queue llq = {
     .max_n_shift = cmdq->q.llq.max_n_shift,    // here
 }, head = llq;
 int ret = 0;

 /* 1. Allocate some space in the queue */
 local_irq_save(flags);
 llq.val = READ_ONCE(cmdq->q.llq.val);    // and again here


since cmdq->q.llq is per-SMMU. If max_n_shift is on a separate 
cacheline, then it should never be stale.


Ah, right, even though the accesses are always going to be close 
together, I suppose it could still technically cause some false sharing 
if someone else is trying to update prod at exactly the right time. I 
guess that might be why we need the explicit padding there in the first 
place, it's just a shame that it ends up wasting even more space with 
implicit padding at the end too (and I have a vague memory that trying 
to force member alignment and structure packing at the same time doesn't 
work well). Oh well.
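
For reference, the layout under discussion looks roughly like this
(from arm-smmu-v3.h): prod/cons/val share one cacheline-sized,
cacheline-aligned union while max_n_shift sits outside it, which is
what provides the false-sharing protection and also creates the
implicit tail padding mentioned above:

	struct arm_smmu_ll_queue {
		union {
			u64			val;
			struct {
				u32		prod;
				u32		cons;
			};
			struct {
				atomic_t	prod;
				atomic_t	cons;
			} atomic;
			u8			__pad[SMP_CACHE_BYTES];
		} ____cacheline_aligned_in_smp;
		u32				max_n_shift;
	};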


I suppose they could be combined into a smaller sub-struct and loaded in 
a single operation, but it looks messy, and prob without much gain.


Indeed I wouldn't say that saving memory is the primary concern here, 
and any more convoluted code is hardly going to help performance. Plus 
it still wouldn't help the other cases where we're just copying the size 
into a fake queue to do some prod arithmetic - I hadn't fully clocked 
what was going on there when I skimmed through things earlier.


Disregarding the bogus layout change, though, do you reckon the rest of 
my idea makes sense?


Cheers,
Robin.

Re: [PATCH V2 03/14] x86/set_memory: Add x86_set_memory_enc static call support

2021-08-05 Thread Dave Hansen
On 8/5/21 7:23 AM, Peter Zijlstra wrote:
> This is assuming any of this is actually performance critical, based off
> of this using static_call() to begin with.

This code is not performance critical.

I think I sent folks off on a wild goose chase when I asked that we make
an effort to optimize code that does:

if (some_hyperv_check())
foo();

if (some_amd_feature_check())
bar();

with checks that will actually compile away when Hyper-V or
some_amd_feature() is disabled.  That's less about performance and just
about good hygiene.  I *wanted* to see
cpu_feature_enabled(X86_FEATURE...) checks.

Someone suggested using static calls, and off we went...

Could we please just use cpu_feature_enabled()?
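
Concretely, the shape being asked for is something like the sketch
below (hand-written illustration, not code from the patch; the two
helper names are made up). When the corresponding feature is compiled
out, cpu_feature_enabled() folds to a compile-time constant and the
branch disappears entirely:

	static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
	{
		if (cpu_feature_enabled(X86_FEATURE_HYPERVISOR))
			return hv_set_memory_enc(addr, numpages, enc);

		if (cpu_feature_enabled(X86_FEATURE_SME))
			return sme_set_memory_enc(addr, numpages, enc);

		return 0;
	}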


Re: [PATCH V2 03/14] x86/set_memory: Add x86_set_memory_enc static call support

2021-08-05 Thread Peter Zijlstra
On Thu, Aug 05, 2021 at 10:05:17PM +0800, Tianyu Lan wrote:
>  static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
>  {
> + return static_call(x86_set_memory_enc)(addr, numpages, enc);
>  }

Hurpmh... So with a bit of 'luck' you get code-gen like:

__set_memory_enc_dec:
jmp __SCT_x86_set_memory_enc;

set_memory_encrypted:
mov $1, %rdx
jmp __set_memory_enc_dec

set_memory_decrypted:
mov $0, %rdx
jmp __set_memory_enc_dec


Which, to me, seems exceedingly daft. Best to make all 3 of those
inlines and use EXPORT_STATIC_CALL_TRAMP_GPL(x86_set_memory_enc) or
something.

This is assuming any of this is actually performance critical, based off
of this using static_call() to begin with.
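
Rough shape of that suggestion (illustrative only): keep one static
call and make the user-facing wrappers inlines, so every call site
jumps straight to the trampoline:

	DECLARE_STATIC_CALL(x86_set_memory_enc, dummy_set_memory_enc);

	static inline int set_memory_encrypted(unsigned long addr, int numpages)
	{
		return static_call(x86_set_memory_enc)(addr, numpages, true);
	}

	static inline int set_memory_decrypted(unsigned long addr, int numpages)
	{
		return static_call(x86_set_memory_enc)(addr, numpages, false);
	}

	/* plus EXPORT_STATIC_CALL_TRAMP_GPL(x86_set_memory_enc) so that
	 * modules can use the inline wrappers */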


Re: [PATCH V2 03/14] x86/set_memory: Add x86_set_memory_enc static call support

2021-08-05 Thread Peter Zijlstra
On Thu, Aug 05, 2021 at 10:05:17PM +0800, Tianyu Lan wrote:
> +static int default_set_memory_enc(unsigned long addr, int numpages, bool
> enc)
> +{
> + return 0;
> +}
> +
> +DEFINE_STATIC_CALL(x86_set_memory_enc, default_set_memory_enc);

That's spelled:

DEFINE_STATIC_CALL_RET0(x86_set_memory_enc, __set_memory_enc_dec);

And then you can remove the default_set_memory_enc() thing.


Re: [PATCH v7 0/9] ACPI/IORT: Support for IORT RMR node

2021-08-05 Thread Ard Biesheuvel
On Thu, 5 Aug 2021 at 15:35, Shameerali Kolothum Thodi
 wrote:
>
>
>
> > -Original Message-
> > From: Ard Biesheuvel [mailto:a...@kernel.org]
> > Sent: 05 August 2021 14:23
> > To: Shameerali Kolothum Thodi 
> > Cc: Linux ARM ; ACPI Devel Maling List
> > ; Linux IOMMU
> > ; Linuxarm ;
> > Lorenzo Pieralisi ; Joerg Roedel
> > ; Robin Murphy ; Will Deacon
> > ; wanghuiqiang ; Guohanjun
> > (Hanjun Guo) ; Steven Price
> > ; Sami Mujawar ; Jon
> > Nettleton ; Eric Auger ;
> > yangyicong 
> > Subject: Re: [PATCH v7 0/9] ACPI/IORT: Support for IORT RMR node
> >
> > On Thu, 5 Aug 2021 at 10:10, Shameer Kolothum
> >  wrote:
> > >
> > > Hi,
> > >
> > > The series adds support to IORT RMR nodes specified in IORT
> > > Revision E.b -ARM DEN 0049E[0]. RMR nodes are used to describe
> > > memory ranges that are used by endpoints and require a unity
> > > mapping in SMMU.
> > >
> > > We have faced issues with 3408iMR RAID controller cards which
> > > fail to boot when SMMU is enabled. This is because these
> > > controllers make use of host memory for various caching related
> > > purposes and when SMMU is enabled the iMR firmware fails to
> > > access these memory regions as there is no mapping for them.
> > > IORT RMR provides a way for UEFI to describe and report these
> > > memory regions so that the kernel can make a unity mapping for
> > > these in SMMU.
> > >
> >
> > Does this mean we are ignoring the RMR memory ranges, and exposing the
> > entire physical address space to devices using the stream IDs in
> > question?
>
> Nope. RMR node is used to describe the memory ranges used by end points
> behind SMMU. And this information is used to create 1 : 1 mappings for those
> ranges in SMMU. Anything outside those ranges will result in translation
> fault(if there are no other dynamic DMA mappings).
>

Excellent! It was not obvious to me from looking at the patches, so I
had to ask.

Thanks,
Ard.

>
> >
> > > Change History:
> > >
> > > v6 --> v7
> > >
> > > The only change from v6 is the fix pointed out by Steve to
> > > the SMMUv2 SMR bypass install in patch #8.
> > >
> > > Thanks to the Tested-by tags by Laurentiu with SMMUv2 and
> > > Hanjun/Huiqiang with SMMUv3 for v6. I haven't added the tags
> > > yet as the series still needs more review[1].
> > >
> > > Feedback and tests on this series is very much appreciated.
> > >
> > > v5 --> v6
> > > - Addressed comments from Robin & Lorenzo.
> > >   : Moved iort_parse_rmr() to acpi_iort_init() from
> > > iort_init_platform_devices().
> > >   : Removed use of struct iort_rmr_entry during the initial
> > > parse. Using struct iommu_resv_region instead.
> > >   : Report RMR address alignment and overlap errors, but continue.
> > >   : Reworked arm_smmu_init_bypass_stes() (patch # 6).
> > > - Updated SMMUv2 bypass SMR code. Thanks to Jon N (patch #8).
> > > - Set IOMMU protection flags(IOMMU_CACHE, IOMMU_MMIO) based
> > >   on Type of RMR region. Suggested by Jon N.
> > >
> > > Thanks,
> > > Shameer
> > > [0] https://developer.arm.com/documentation/den0049/latest/
> > > [1]
> > https://lore.kernel.org/linux-acpi/20210716083442.1708-1-shameerali.koloth
> > um.th...@huawei.com/T/#m043c95b869973a834b2fd57f3e1ed0325c84f3b7
> > > --
> > > v4 --> v5
> > >  -Added a fw_data union to struct iommu_resv_region and removed
> > >   struct iommu_rmr (Based on comments from Joerg/Robin).
> > >  -Added iommu_put_rmrs() to release mem.
> > >  -Thanks to Steve for verifying on SMMUv2, but not added the Tested-by
> > >   yet because of the above changes.
> > >
> > > v3 -->v4
> > > -Included the SMMUv2 SMR bypass install changes suggested by
> > >  Steve(patch #7)
> > > -As per Robin's comments, RMR reserve implementation is now
> > >  more generic  (patch #8) and dropped v3 patches 8 and 10.
> > > -Rebase to 5.13-rc1
> > >
> > > RFC v2 --> v3
> > >  -Dropped RFC tag as the ACPICA header changes are now ready to be
> > >   part of 5.13[0]. But this series still has a dependency on that patch.
> > >  -Added IORT E.b related changes(node flags, _DSM function 5 checks for
> > >   PCIe).
> > >  -Changed RMR to stream id mapping from M:N to M:1 as per the spec and
> > >   discussion here[1].
> > >  -Last two patches add support for SMMUv2(Thanks to Jon Nettleton!)
> > > --
> > >
> > > Jon Nettleton (1):
> > >   iommu/arm-smmu: Get associated RMR info and install bypass SMR
> > >
> > > Shameer Kolothum (8):
> > >   iommu: Introduce a union to struct iommu_resv_region
> > >   ACPI/IORT: Add support for RMR node parsing
> > >   iommu/dma: Introduce generic helper to retrieve RMR info
> > >   ACPI/IORT: Add a helper to retrieve RMR memory regions
> > >   iommu/arm-smmu-v3: Introduce strtab init helper
> > >   iommu/arm-smmu-v3: Refactor arm_smmu_init_bypass_stes() to force
> > > bypass
> > >   iommu/arm-smmu-v3: Get associated RMR info and install bypass STE
> > >   iommu/dma: Reserve any RMR regions associated with a dev
> > >
> > >  drivers/acpi/arm64/iort.c 

Re: [PATCH V2 03/14] x86/set_memory: Add x86_set_memory_enc static call support

2021-08-05 Thread Tianyu Lan

Hi Dave:
Thanks for review.

On 8/5/2021 3:27 AM, Dave Hansen wrote:

On 8/4/21 11:44 AM, Tianyu Lan wrote:

+static int default_set_memory_enc(unsigned long addr, int numpages, bool enc);
+DEFINE_STATIC_CALL(x86_set_memory_enc, default_set_memory_enc);
+
  #define CPA_FLUSHTLB 1
  #define CPA_ARRAY 2
  #define CPA_PAGES_ARRAY 4
@@ -1981,6 +1985,11 @@ int set_memory_global(unsigned long addr, int numpages)
  }
  
  static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)

+{
+   return static_call(x86_set_memory_enc)(addr, numpages, enc);
+}
+
+static int default_set_memory_enc(unsigned long addr, int numpages, bool enc)
  {
struct cpa_data cpa;
int ret;


It doesn't make a lot of difference to add this infrastructure and then
ignore it for the existing in-tree user:


static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
{
 struct cpa_data cpa;
 int ret;

 /* Nothing to do if memory encryption is not active */
 if (!mem_encrypt_active())
 return 0;


Shouldn't the default be to just "return 0"?  Then on
mem_encrypt_active() systems, do the bulk of what is in
__set_memory_enc_dec() today.



OK. I tried moving the code in __set_memory_enc_dec() to the SEV file
mem_encrypt.c, and this requires exposing the cpa functions and structure.

Please have a look.

Tom, Joerg and Brijesh, Could you review at sev code change?
Thanks.



diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h
index 43fa081a1adb..991366612deb 100644
--- a/arch/x86/include/asm/set_memory.h
+++ b/arch/x86/include/asm/set_memory.h
@@ -4,6 +4,25 @@

 #include 
 #include 
+#include 
+
+/*
+ * The current flushing context - we pass it instead of 5 arguments:
+ */
+struct cpa_data {
+   unsigned long   *vaddr;
+   pgd_t           *pgd;
+   pgprot_t        mask_set;
+   pgprot_t        mask_clr;
+   unsigned long   numpages;
+   unsigned long   curpage;
+   unsigned long   pfn;
+   unsigned int    flags;
+   unsigned int    force_split         : 1,
+                   force_static_prot   : 1,
+                   force_flush_all     : 1;
+   struct page     **pages;
+};

 /*
  * The set_memory_* API can be used to change various attributes of a virtual

@@ -83,6 +102,11 @@ int set_pages_rw(struct page *page, int numpages);
 int set_direct_map_invalid_noflush(struct page *page);
 int set_direct_map_default_noflush(struct page *page);
 bool kernel_page_present(struct page *page);
+int __change_page_attr_set_clr(struct cpa_data *cpa, int checkalias);
+void cpa_flush(struct cpa_data *data, int cache);
+
+int dummy_set_memory_enc(unsigned long addr, int numpages, bool enc);
+DECLARE_STATIC_CALL(x86_set_memory_enc, dummy_set_memory_enc);

 extern int kernel_set_to_readonly;

diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index ff08dc463634..49e957c4191f 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -20,6 +20,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 

 #include 
 #include 
@@ -178,6 +180,45 @@ void __init sme_map_bootdata(char *real_mode_data)
__sme_early_map_unmap_mem(__va(cmdline_paddr), COMMAND_LINE_SIZE, true);
 }

+static int sev_set_memory_enc(unsigned long addr, int numpages, bool enc)
+{
+   struct cpa_data cpa;
+   int ret;
+
+   /* Should not be working on unaligned addresses */
+   if (WARN_ONCE(addr & ~PAGE_MASK, "misaligned address: %#lx\n", addr))
+   addr &= PAGE_MASK;
+
+   memset(&cpa, 0, sizeof(cpa));
+   cpa.vaddr = &addr;
+   cpa.numpages = numpages;
+   cpa.mask_set = enc ? __pgprot(_PAGE_ENC) : __pgprot(0);
+   cpa.mask_clr = enc ? __pgprot(0) : __pgprot(_PAGE_ENC);
+   cpa.pgd = init_mm.pgd;
+
+   /* Must avoid aliasing mappings in the highmem code */
+   kmap_flush_unused();
+   vm_unmap_aliases();
+
+   /*
+* Before changing the encryption attribute, we need to flush caches.
+*/
+   cpa_flush(&cpa, !this_cpu_has(X86_FEATURE_SME_COHERENT));
+
+   ret = __change_page_attr_set_clr(&cpa, 1);
+
+   /*
+* After changing the encryption attribute, we need to flush TLBs again
+* in case any speculative TLB caching occurred (but no need to flush
+* caches again).  We could just use cpa_flush_all(), but in case TLB
+* flushing gets optimized in the cpa_flush() path use the same logic
+* as above.
+*/
+   cpa_flush(&cpa, 0);
+
+   return ret;
+}
+
 void __init sme_early_init(void)
 {
unsigned int i;
@@ -185,6 +226,8 @@ void __init sme_early_init(void)
if (!sme_me_mask)
return;

+   static_call_update(x86_set_memory_enc, sev_set_memory_enc);
+
early_pmd_flags = __sme_set(early_pmd_flags);

__supported_pte_mask = __sme_set(__supported_pte_mask);
diff --git a/arch/x86/mm/pat/set_memory.c 

Re: [PATCH] iommu/arm-smmu-v3: Remove some unneeded init in arm_smmu_cmdq_issue_cmdlist()

2021-08-05 Thread John Garry

On 05/08/2021 12:24, Robin Murphy wrote:

On 2021-06-21 17:36, John Garry wrote:
Members of struct "llq" will be zero-inited, apart from member 
max_n_shift.

But we write llq.val straight after the init, so it was pointless to zero
init those other members. As such, separately init member max_n_shift
only.

In addition, struct "head" is initialised to "llq" only so that member
max_n_shift is set. But that member is never referenced for "head", so
remove any init there.

Removing these initializations is seen as a small performance optimisation,
as this code is (very) hot path.


I looked at this and immediately thought "surely the compiler can see 
that all the prod/cons/val fields are written anyway and elide the 
initialisation?", so I dumped the before and after disassembly, and... oh.


You should probably clarify that it's zero-initialising all the 
cacheline padding which is both pointless and painful. With that,


Reviewed-by: Robin Murphy 

However, having looked this closely I'm now tangentially wondering why 
max_n_shift isn't inside the padded union? It's read at the same time as 
both prod and cons by queue_has_space(), and never updated, so there 
doesn't appear to be any benefit to it being in a separate cacheline all 
by itself, and llq is already twice as big as it needs to be.


I think the problem is that if the prod+cons 64b value and the shift are 
on the same cacheline, then we have a chance of accessing a stale 
cacheline twice:


static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
   u64 *cmds, int n, bool sync)
{
u64 cmd_sync[CMDQ_ENT_DWORDS];
u32 prod;
unsigned long flags;
bool owner;
struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
struct arm_smmu_ll_queue llq = {
.max_n_shift = cmdq->q.llq.max_n_shift,  // here
}, head = llq;
int ret = 0;

/* 1. Allocate some space in the queue */
local_irq_save(flags);
llq.val = READ_ONCE(cmdq->q.llq.val);// and again here


since cmdq->q.llq is per-SMMU. If max_n_shift is on a separate 
cacheline, then it should never be stale.
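
For reference, the layout in question is roughly this (the arm_smmu_ll_queue
definition from arm-smmu-v3.h of that era, reproduced here from memory, so
check the header for the exact form):

struct arm_smmu_ll_queue {
	union {
		u64			val;
		struct {
			u32		prod;
			u32		cons;
		};
		struct {
			atomic_t	prod;
			atomic_t	cons;
		} atomic;
		u8			__pad[SMP_CACHE_BYTES];
	} ____cacheline_aligned_in_smp;
	u32				max_n_shift;
};

i.e. prod/cons/val share one padded cacheline while max_n_shift sits on the
next one.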


I suppose they could be combined into a smaller sub-struct and loaded in 
a single operation, but it looks messy, and probably without much gain.


Thanks,
John

Sorting 
that would also be a good opportunity to store the value of interest in 
its appropriate form so we're not needlessly recalculating 1 << shift 
every flippin' time...


Robin.


Signed-off-by: John Garry 

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c

index 54b2f27b81d4..8a8ad49bb7fd 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -727,11 +727,11 @@ static int arm_smmu_cmdq_issue_cmdlist(struct 
arm_smmu_device *smmu,

  unsigned long flags;
  bool owner;
   struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
-    struct arm_smmu_ll_queue llq = {
-    .max_n_shift = cmdq->q.llq.max_n_shift,
-    }, head = llq;
+    struct arm_smmu_ll_queue llq, head;
  int ret = 0;
+    llq.max_n_shift = cmdq->q.llq.max_n_shift;
+
  /* 1. Allocate some space in the queue */
  local_irq_save(flags);
  llq.val = READ_ONCE(cmdq->q.llq.val);


.



RE: [PATCH v7 0/9] ACPI/IORT: Support for IORT RMR node

2021-08-05 Thread Shameerali Kolothum Thodi



> -Original Message-
> From: Ard Biesheuvel [mailto:a...@kernel.org]
> Sent: 05 August 2021 14:23
> To: Shameerali Kolothum Thodi 
> Cc: Linux ARM ; ACPI Devel Maling List
> ; Linux IOMMU
> ; Linuxarm ;
> Lorenzo Pieralisi ; Joerg Roedel
> ; Robin Murphy ; Will Deacon
> ; wanghuiqiang ; Guohanjun
> (Hanjun Guo) ; Steven Price
> ; Sami Mujawar ; Jon
> Nettleton ; Eric Auger ;
> yangyicong 
> Subject: Re: [PATCH v7 0/9] ACPI/IORT: Support for IORT RMR node
> 
> On Thu, 5 Aug 2021 at 10:10, Shameer Kolothum
>  wrote:
> >
> > Hi,
> >
> > The series adds support to IORT RMR nodes specified in IORT
> > Revision E.b -ARM DEN 0049E[0]. RMR nodes are used to describe
> > memory ranges that are used by endpoints and require a unity
> > mapping in SMMU.
> >
> > We have faced issues with 3408iMR RAID controller cards which
> > fail to boot when SMMU is enabled. This is because these
> > controllers make use of host memory for various caching related
> > purposes and when SMMU is enabled the iMR firmware fails to
> > access these memory regions as there is no mapping for them.
> > IORT RMR provides a way for UEFI to describe and report these
> > memory regions so that the kernel can make a unity mapping for
> > these in SMMU.
> >
> 
> Does this mean we are ignoring the RMR memory ranges, and exposing the
> entire physical address space to devices using the stream IDs in
> question?

Nope. The RMR node is used to describe the memory ranges used by endpoints
behind the SMMU, and this information is used to create 1:1 mappings for
those ranges in the SMMU. Anything outside those ranges will result in a
translation fault (if there are no other dynamic DMA mappings).
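
For anyone unfamiliar with the mechanics: direct-type reserved regions are
identity-mapped by the IOMMU core when a device attaches to its default
domain, roughly along these lines (a simplified sketch of the generic
iommu_create_device_direct_mappings() logic, not code from this series):

	list_for_each_entry(entry, &mappings, list) {
		dma_addr_t start, end, addr;

		if (entry->type != IOMMU_RESV_DIRECT)
			continue;

		start = ALIGN(entry->start, pg_size);
		end = ALIGN(entry->start + entry->length, pg_size);
		for (addr = start; addr < end; addr += pg_size)
			iommu_map(domain, addr, addr, pg_size, entry->prot);
	}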

Thanks,
Shameer

> 
> > Change History:
> >
> > v6 --> v7
> >
> > The only change from v6 is the fix pointed out by Steve to
> > the SMMUv2 SMR bypass install in patch #8.
> >
> > Thanks to the Tested-by tags by Laurentiu with SMMUv2 and
> > Hanjun/Huiqiang with SMMUv3 for v6. I haven't added the tags
> > yet as the series still needs more review[1].
> >
> > Feedback and tests on this series is very much appreciated.
> >
> > v5 --> v6
> > - Addressed comments from Robin & Lorenzo.
> >   : Moved iort_parse_rmr() to acpi_iort_init() from
> > iort_init_platform_devices().
> >   : Removed use of struct iort_rmr_entry during the initial
> > parse. Using struct iommu_resv_region instead.
> >   : Report RMR address alignment and overlap errors, but continue.
> >   : Reworked arm_smmu_init_bypass_stes() (patch # 6).
> > - Updated SMMUv2 bypass SMR code. Thanks to Jon N (patch #8).
> > - Set IOMMU protection flags(IOMMU_CACHE, IOMMU_MMIO) based
> >   on Type of RMR region. Suggested by Jon N.
> >
> > Thanks,
> > Shameer
> > [0] https://developer.arm.com/documentation/den0049/latest/
> > [1]
> https://lore.kernel.org/linux-acpi/20210716083442.1708-1-shameerali.kolothum.th...@huawei.com/T/#m043c95b869973a834b2fd57f3e1ed0325c84f3b7
> > --
> > v4 --> v5
> >  -Added a fw_data union to struct iommu_resv_region and removed
> >   struct iommu_rmr (Based on comments from Joerg/Robin).
> >  -Added iommu_put_rmrs() to release mem.
> >  -Thanks to Steve for verifying on SMMUv2, but not added the Tested-by
> >   yet because of the above changes.
> >
> > v3 -->v4
> > -Included the SMMUv2 SMR bypass install changes suggested by
> >  Steve(patch #7)
> > -As per Robin's comments, RMR reserve implementation is now
> >  more generic  (patch #8) and dropped v3 patches 8 and 10.
> > -Rebase to 5.13-rc1
> >
> > RFC v2 --> v3
> >  -Dropped RFC tag as the ACPICA header changes are now ready to be
> >   part of 5.13[0]. But this series still has a dependency on that patch.
> >  -Added IORT E.b related changes(node flags, _DSM function 5 checks for
> >   PCIe).
> >  -Changed RMR to stream id mapping from M:N to M:1 as per the spec and
> >   discussion here[1].
> >  -Last two patches add support for SMMUv2(Thanks to Jon Nettleton!)
> > --
> >
> > Jon Nettleton (1):
> >   iommu/arm-smmu: Get associated RMR info and install bypass SMR
> >
> > Shameer Kolothum (8):
> >   iommu: Introduce a union to struct iommu_resv_region
> >   ACPI/IORT: Add support for RMR node parsing
> >   iommu/dma: Introduce generic helper to retrieve RMR info
> >   ACPI/IORT: Add a helper to retrieve RMR memory regions
> >   iommu/arm-smmu-v3: Introduce strtab init helper
> >   iommu/arm-smmu-v3: Refactor arm_smmu_init_bypass_stes() to force
> > bypass
> >   iommu/arm-smmu-v3: Get associated RMR info and install bypass STE
> >   iommu/dma: Reserve any RMR regions associated with a dev
> >
> >  drivers/acpi/arm64/iort.c   | 172 +++-
> >  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  76 +++--
> >  drivers/iommu/arm/arm-smmu/arm-smmu.c   |  48 ++
> >  drivers/iommu/dma-iommu.c   |  89 +-
> >  include/linux/acpi_iort.h   |   7 +
> >  include/linux/dma-iommu.h   |  13 ++
> >  include/linux/iommu.h   |  11 ++
> >  7 files changed, 393 insertions(+), 23 deletions(-)

Re: [PATCH v10 01/17] iova: Export alloc_iova_fast() and free_iova_fast()

2021-08-05 Thread Jason Wang


On 2021/8/5 8:34 PM, Yongji Xie wrote:

My main point, though, is that if you've already got something else
keeping track of the actual addresses, then the way you're using an
iova_domain appears to be something you could do with a trivial bitmap
allocator. That's why I don't buy the efficiency argument. The main
design points of the IOVA allocator are to manage large address spaces
while trying to maximise spatial locality to minimise the underlying
pagetable usage, and allocating with a flexible limit to support
multiple devices with different addressing capabilities in the same
address space. If none of those aspects are relevant to the use-case -
which AFAICS appears to be true here - then as a general-purpose
resource allocator it's rubbish and has an unreasonably massive memory
overhead and there are many, many better choices.


OK, I get your point. Actually we used the genpool allocator in the
early version. Maybe we can fall back to using it.



I think maybe you can share some perf numbers to see how much 
alloc_iova_fast() can help.


Thanks







Re: [PATCH v7 0/9] ACPI/IORT: Support for IORT RMR node

2021-08-05 Thread Ard Biesheuvel
On Thu, 5 Aug 2021 at 10:10, Shameer Kolothum
 wrote:
>
> Hi,
>
> The series adds support to IORT RMR nodes specified in IORT
> Revision E.b -ARM DEN 0049E[0]. RMR nodes are used to describe
> memory ranges that are used by endpoints and require a unity
> mapping in SMMU.
>
> We have faced issues with 3408iMR RAID controller cards which
> fail to boot when SMMU is enabled. This is because these
> controllers make use of host memory for various caching related
> purposes and when SMMU is enabled the iMR firmware fails to
> access these memory regions as there is no mapping for them.
> IORT RMR provides a way for UEFI to describe and report these
> memory regions so that the kernel can make a unity mapping for
> these in SMMU.
>

Does this mean we are ignoring the RMR memory ranges, and exposing the
entire physical address space to devices using the stream IDs in
question?


> Change History:
>
> v6 --> v7
>
> The only change from v6 is the fix pointed out by Steve to
> the SMMUv2 SMR bypass install in patch #8.
>
> Thanks to the Tested-by tags by Laurentiu with SMMUv2 and
> Hanjun/Huiqiang with SMMUv3 for v6. I haven't added the tags
> yet as the series still needs more review[1].
>
> Feedback and tests on this series is very much appreciated.
>
> v5 --> v6
> - Addressed comments from Robin & Lorenzo.
>   : Moved iort_parse_rmr() to acpi_iort_init() from
> iort_init_platform_devices().
>   : Removed use of struct iort_rmr_entry during the initial
> parse. Using struct iommu_resv_region instead.
>   : Report RMR address alignment and overlap errors, but continue.
>   : Reworked arm_smmu_init_bypass_stes() (patch # 6).
> - Updated SMMUv2 bypass SMR code. Thanks to Jon N (patch #8).
> - Set IOMMU protection flags(IOMMU_CACHE, IOMMU_MMIO) based
>   on Type of RMR region. Suggested by Jon N.
>
> Thanks,
> Shameer
> [0] https://developer.arm.com/documentation/den0049/latest/
> [1] 
> https://lore.kernel.org/linux-acpi/20210716083442.1708-1-shameerali.kolothum.th...@huawei.com/T/#m043c95b869973a834b2fd57f3e1ed0325c84f3b7
> --
> v4 --> v5
>  -Added a fw_data union to struct iommu_resv_region and removed
>   struct iommu_rmr (Based on comments from Joerg/Robin).
>  -Added iommu_put_rmrs() to release mem.
>  -Thanks to Steve for verifying on SMMUv2, but not added the Tested-by
>   yet because of the above changes.
>
> v3 -->v4
> -Included the SMMUv2 SMR bypass install changes suggested by
>  Steve(patch #7)
> -As per Robin's comments, RMR reserve implementation is now
>  more generic  (patch #8) and dropped v3 patches 8 and 10.
> -Rebase to 5.13-rc1
>
> RFC v2 --> v3
>  -Dropped RFC tag as the ACPICA header changes are now ready to be
>   part of 5.13[0]. But this series still has a dependency on that patch.
>  -Added IORT E.b related changes(node flags, _DSM function 5 checks for
>   PCIe).
>  -Changed RMR to stream id mapping from M:N to M:1 as per the spec and
>   discussion here[1].
>  -Last two patches add support for SMMUv2(Thanks to Jon Nettleton!)
> --
>
> Jon Nettleton (1):
>   iommu/arm-smmu: Get associated RMR info and install bypass SMR
>
> Shameer Kolothum (8):
>   iommu: Introduce a union to struct iommu_resv_region
>   ACPI/IORT: Add support for RMR node parsing
>   iommu/dma: Introduce generic helper to retrieve RMR info
>   ACPI/IORT: Add a helper to retrieve RMR memory regions
>   iommu/arm-smmu-v3: Introduce strtab init helper
>   iommu/arm-smmu-v3: Refactor arm_smmu_init_bypass_stes() to force
> bypass
>   iommu/arm-smmu-v3: Get associated RMR info and install bypass STE
>   iommu/dma: Reserve any RMR regions associated with a dev
>
>  drivers/acpi/arm64/iort.c   | 172 +++-
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  76 +++--
>  drivers/iommu/arm/arm-smmu/arm-smmu.c   |  48 ++
>  drivers/iommu/dma-iommu.c   |  89 +-
>  include/linux/acpi_iort.h   |   7 +
>  include/linux/dma-iommu.h   |  13 ++
>  include/linux/iommu.h   |  11 ++
>  7 files changed, 393 insertions(+), 23 deletions(-)
>
> --
> 2.17.1
>
>


Re: [PATCH v7 04/12] iommu/mediatek: Add device_link between the consumer and the larb devices

2021-08-05 Thread Dafna Hirschfeld




On 30.07.21 04:52, Yong Wu wrote:

The MediaTek IOMMU-SMI diagram is like below: all the consumers connect with
a smi-larb, which then connects with smi-common.

                M4U
                 |
            smi-common
                 |
         -----------------
         |               | ...
         |               |
       larb1           larb2
         |               |
       vdec            venc

When a consumer works, it should enable the smi-larb's power, which in turn
requires enabling the smi-common's power first.

Thus, first of all, use a device link to connect the consumer and the
smi-larb, then add a device link between the smi-larb and smi-common.

This patch adds device_link between the consumer and the larbs.

When calling device_link_add(), I add the flag DL_FLAG_STATELESS to avoid
calling pm_runtime_xx, keeping the original status of the clocks. This avoids
two issues:
1) Display HW shows the fastlogo abnormally, as reported in [1]. At the
beginning, all the clocks are enabled before entering the kernel, but the
clocks for the display HW (always in larb0) will be gated after clk_enable
and clk_disable are called from device_link_add(->pm_runtime_resume) and
rpm_idle. The clock operation happens before the display driver probes, and
at that time the display HW will be abnormal.

2) A deadlock issue reported in [2]. Use DL_FLAG_STATELESS to skip
pm_runtime_xx to avoid the deadlock.

Correspondingly, DL_FLAG_AUTOREMOVE_CONSUMER can't be added, so
device_link_remove() should be called explicitly.

[1] https://lore.kernel.org/linux-mediatek/1564213888.22908.4.camel@mhfsdcap03/
[2] https://lore.kernel.org/patchwork/patch/1086569/

Suggested-by: Tomasz Figa 
Signed-off-by: Yong Wu 
Tested-by: Dafna Hirschfeld  # on mt8173


Hi, unfortunately, I have to take back the Tested-by tag.
I am now testing the mtk-vcodec with the latest kernel plus patches sent from
the mailing list:
https://gitlab.collabora.com/eballetbo/linux/-/commits/topic/chromeos/chromeos-5.14
which includes this patchset.

On ChromeOS I open a video conference with Google Meet, which causes the
mtk-vcodec vp8 encoder to run.
If I kill it with `killall -9 chrome`, I get some page fault messages from the
iommu:

[  837.255952] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 
larb=0 port=0 layer=1 read
[  837.265696] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 
larb=0 port=0 layer=1 read
[  837.282367] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 
larb=0 port=0 layer=1 read
[  837.299028] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 
larb=0 port=0 layer=1 read
[  837.315683] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 
larb=0 port=0 layer=1 read
[  837.332345] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 
larb=0 port=0 layer=1 read
[  837.349004] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 
larb=0 port=0 layer=1 read
[  837.365665] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 
larb=0 port=0 layer=1 read
[  837.382329] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 
larb=0 port=0 layer=1 read
[  837.42] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 
larb=0 port=0 layer=1 read

In addition, running the encoder tests from the shell:

sudo --user=#1000 
/usr/local/libexec/chrome-binary-tests/video_encode_accelerator_tests 
--gtest_filter=VideoEncoderTest.FlushAtEndOfStream_Multiple*  --codec=vp8 
/usr/local/share/tast/data/chromiumos/tast/local/bundles/cros/video/data/tulip2-320x180.yuv
 --disable_validator

At some point it fails with the error

[ 5472.161821] [MTK_V4L2][ERROR] mtk_vcodec_wait_for_done_ctx:32: [290] 
ctx->type=1, cmd=1, wait_event_interruptible_timeout time=1000ms out 0 0!
[ 5472.174678] [MTK_VCODEC][ERROR][290]: vp8_enc_encode_frame() irq_status=0 
failed
[ 5472.182687] [MTK_V4L2][ERROR] mtk_venc_worker:1239: venc_if_encode failed=-5


Do you have any idea what might be the problem or how to debug it?

Thanks,
Dafna


---
  drivers/iommu/mtk_iommu.c| 22 ++
  drivers/iommu/mtk_iommu_v1.c | 20 +++-
  2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index a02dde094788..ee742900cf4b 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -571,22 +571,44 @@ static struct iommu_device *mtk_iommu_probe_device(struct 
device *dev)
  {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct mtk_iommu_data *data;
+   struct device_link *link;
+   struct device *larbdev;
+   unsigned int larbid;
  
  	if (!fwspec || fwspec->ops != &mtk_iommu_ops)

return ERR_PTR(-ENODEV); /* Not a iommu client device */
  
  	data = dev_iommu_priv_get(dev);
  
+	/*

+* Link the consumer device with the smi-larb device (supplier).
+* The device in each larb is an independent HW, thus only link
+* one larb here.
+*/
+   larbid = MTK_M4U_TO_LARB(fwspec->ids[0]);
+   larbdev = data->larb_imu[larbid].dev;
+
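
For context, the link creation this hunk leads into has this shape (a hedged
sketch of the stateless consumer/supplier link described in the commit
message; the exact flags and error handling in the full patch may differ):

	link = device_link_add(dev, larbdev,
			       DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
	if (!link)
		dev_err(dev, "Unable to link %s\n", dev_name(larbdev));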

Re: [PATCH v10 01/17] iova: Export alloc_iova_fast() and free_iova_fast()

2021-08-05 Thread Yongji Xie
On Wed, Aug 4, 2021 at 11:43 PM Robin Murphy  wrote:
>
> On 2021-08-04 06:02, Yongji Xie wrote:
> > On Tue, Aug 3, 2021 at 6:54 PM Robin Murphy  wrote:
> >>
> >> On 2021-08-03 09:54, Yongji Xie wrote:
> >>> On Tue, Aug 3, 2021 at 3:41 PM Jason Wang  wrote:
> 
> 
On 2021/7/29 3:34 PM, Xie Yongji wrote:
> > Export alloc_iova_fast() and free_iova_fast() so that
> > some modules can use it to improve iova allocation efficiency.
> 
> 
>  It's better to explain why alloc_iova() is not sufficient here.
> 
> >>>
> >>> Fine.
> >>
> >> What I fail to understand from the later patches is what the IOVA domain
> >> actually represents. If the "device" is a userspace process then
> >> logically the "IOVA" would be the userspace address, so presumably
> >> somewhere you're having to translate between this arbitrary address
> >> space and actual usable addresses - if you're worried about efficiency
> >> surely it would be even better to not do that?
> >>
> >
> > Yes, userspace daemon needs to translate the "IOVA" in a DMA
> > descriptor to the VA (from mmap(2)). But this actually doesn't affect
> > performance since it's an identical mapping in most cases.
>
> I'm not familiar with the vhost_iotlb stuff, but it looks suspiciously
> like you're walking yet another tree to make those translations. Even if
> the buffer can be mapped all at once with a fixed offset such that each
> DMA mapping call doesn't need a lookup for each individual "IOVA" - that
> might be what's happening already, but it's a bit hard to follow just
> reading the patches in my mail client - vhost_iotlb_add_range() doesn't
> look like it's super-cheap to call, and you're serialising on a lock for
> that.
>

Yes, that's true. Since the software IOTLB is not used in the VM case,
we need a unified way (vhost_iotlb) to manage the IOVA mapping for
both VM and Container cases.

> My main point, though, is that if you've already got something else
> keeping track of the actual addresses, then the way you're using an
> iova_domain appears to be something you could do with a trivial bitmap
> allocator. That's why I don't buy the efficiency argument. The main
> design points of the IOVA allocator are to manage large address spaces
> while trying to maximise spatial locality to minimise the underlying
> pagetable usage, and allocating with a flexible limit to support
> multiple devices with different addressing capabilities in the same
> address space. If none of those aspects are relevant to the use-case -
> which AFAICS appears to be true here - then as a general-purpose
> resource allocator it's rubbish and has an unreasonably massive memory
> overhead and there are many, many better choices.
>

OK, I get your point. Actually we used the genpool allocator in the
early version. Maybe we can fall back to using it.
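
For illustration, a minimal bounce-range allocator on top of genpool could
look like this (a sketch only; bounce_base, bounce_size and len are
placeholder names, not code from this series):

#include <linux/genalloc.h>

struct gen_pool *pool;
unsigned long addr;

pool = gen_pool_create(PAGE_SHIFT, -1);		/* PAGE_SIZE granularity */
if (!pool || gen_pool_add(pool, bounce_base, bounce_size, -1))
	return -ENOMEM;

addr = gen_pool_alloc(pool, len);		/* grab a slot */
if (addr)
	gen_pool_free(pool, addr, len);		/* and release it */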

> FWIW I've recently started thinking about moving all the caching stuff
> out of iova_domain and into the iommu-dma layer since it's now a giant
> waste of space for all the other current IOVA users.
>
> >> Presumably userspace doesn't have any concern about alignment and the
> >> things we have to worry about for the DMA API in general, so it's pretty
> >> much just allocating slots in a buffer, and there are far more effective
> >> ways to do that than a full-blown address space manager.
> >
> > Considering iova allocation efficiency, I think the iova allocator is
> > better here. In most cases, we don't even need to hold a spin lock
> > during iova allocation.
> >
> >> If you're going
> >> to reuse any infrastructure I'd have expected it to be SWIOTLB rather
> >> than the IOVA allocator. Because, y'know, you're *literally implementing
> >> a software I/O TLB* ;)
> >>
> >
> > But actually what we can reuse in SWIOTLB is the IOVA allocator.
>
> Huh? Those are completely unrelated and orthogonal things - SWIOTLB does
> not use an external allocator (see find_slots()). By SWIOTLB I mean
> specifically the library itself, not dma-direct or any of the other
> users built around it. The functionality for managing slots in a buffer
> and bouncing data in and out can absolutely be reused - that's why users
> like the Xen and iommu-dma code *are* reusing it instead of open-coding
> their own versions.
>

I see. Actually, the slot management in SWIOTLB is what I meant by the IOVA
allocator.

> > And
> > the IOVA management in SWIOTLB is not what we want. For example,
> > SWIOTLB allocates and uses contiguous memory for bouncing, which is
> > not necessary in VDUSE case.
>
> alloc_iova() allocates a contiguous (in IOVA address) region of space.
> In vduse_domain_map_page() you use it to allocate a contiguous region of
> space from your bounce buffer. Can you clarify how that is fundamentally
> different from allocating a contiguous region of space from a bounce
> buffer? Nobody's saying the underlying implementation details of where
> the buffer itself comes from can't be tweaked.
>

I mean physically contiguous memory here. We 

Re: [PATCH] iommu/arm-smmu-v3: Remove some unneeded init in arm_smmu_cmdq_issue_cmdlist()

2021-08-05 Thread Robin Murphy

On 2021-08-05 12:24, Robin Murphy wrote:

On 2021-06-21 17:36, John Garry wrote:
Members of struct "llq" will be zero-inited, apart from member 
max_n_shift.

But we write llq.val straight after the init, so it was pointless to zero
init those other members. As such, separately init member max_n_shift
only.

In addition, struct "head" is initialised to "llq" only so that member
max_n_shift is set. But that member is never referenced for "head", so
remove any init there.

Removing these initializations is seen as a small performance optimisation,
as this code is (very) hot path.


I looked at this and immediately thought "surely the compiler can see 
that all the prod/cons/val fields are written anyway and elide the 
initialisation?", so I dumped the before and after disassembly, and... oh.


You should probably clarify that it's zero-initialising all the 
cacheline padding which is both pointless and painful. With that,


Reviewed-by: Robin Murphy 

However, having looked this closely I'm now tangentially wondering why 
max_n_shift isn't inside the padded union? It's read at the same time as 
both prod and cons by queue_has_space(), and never updated, so there 
doesn't appear to be any benefit to it being in a separate cacheline all 
by itself, and llq is already twice as big as it needs to be. Sorting 
that would also be a good opportunity to store the value of interest in 
its appropriate form so we're not needlessly recalculating 1 << shift 
every flippin' time...


...on which note, how about something like this on top?

(untested since I don't have any SMMUv3 hardware to hand)

Robin.

->8-
Subject: [PATCH] iommu/arm-smmu-v3: Improve arm_smmu_ll_queue efficiency

Once initialised, max_n_shift is only ever read at the same time as
accessing prod or cons, thus should not have any impact on contention
to justify keeping it in its own separate cacheline. Move it inside the
padding union to halve the size of struct arm_smmu_ll_queue. Even then,
though, there are a couple more spots in the command issuing path where
we could do without the overhead of zeroing even one cache line worth of
padding, so avoid implicit initialisation of those temporary structures
as was done at the top level in arm_smmu_cmdq_issue_cmdlist().

Furthermore, the shift value is only directly relevant for initially
setting up the relevant queue base register; all we care about after
that is the number of entries, so store that value instead once a
queue is initialised and avoid needlessly recalculating it everywhere.

Signed-off-by: Robin Murphy 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 35 +++--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 29 ++---
 2 files changed, 37 insertions(+), 27 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c

index 47610dc5d920..bc55217d6d61 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -111,7 +111,7 @@ static bool queue_has_space(struct arm_smmu_ll_queue 
*q, u32 n)

cons = Q_IDX(q, q->cons);

if (Q_WRP(q, q->prod) == Q_WRP(q, q->cons))
-   space = (1 << q->max_n_shift) - (prod - cons);
+   space = q->nents - (prod - cons);
else
space = cons - prod;

@@ -517,10 +517,11 @@ static void 
__arm_smmu_cmdq_poll_set_valid_map(struct arm_smmu_cmdq *cmdq,

   u32 sprod, u32 eprod, bool set)
 {
u32 swidx, sbidx, ewidx, ebidx;
-   struct arm_smmu_ll_queue llq = {
-   .max_n_shift= cmdq->q.llq.max_n_shift,
-   .prod   = sprod,
-   };
+   struct arm_smmu_ll_queue llq;
+
+   /* Avoid zero-initialising all the padding */
+   llq.nents = cmdq->q.llq.nents;
+   llq.prod = sprod;

	ewidx = BIT_WORD(Q_IDX(&llq, eprod));
	ebidx = Q_IDX(&llq, eprod) % BITS_PER_LONG;
@@ -696,10 +697,11 @@ static void arm_smmu_cmdq_write_entries(struct 
arm_smmu_cmdq *cmdq, u64 *cmds,

u32 prod, int n)
 {
int i;
-   struct arm_smmu_ll_queue llq = {
-   .max_n_shift= cmdq->q.llq.max_n_shift,
-   .prod   = prod,
-   };
+   struct arm_smmu_ll_queue llq;
+
+   /* Avoid zero-initialising all the padding */
+   llq.nents = cmdq->q.llq.nents;
+   llq.prod = prod;

for (i = 0; i < n; ++i) {
		u64 *cmd = &cmds[i * CMDQ_ENT_DWORDS];
@@ -736,7 +738,7 @@ static int arm_smmu_cmdq_issue_cmdlist(struct 
arm_smmu_device *smmu,

struct arm_smmu_ll_queue llq, head;
int ret = 0;

-   llq.max_n_shift = cmdq->q.llq.max_n_shift;
+   llq.nents = cmdq->q.llq.nents;

/* 1. Allocate some space in the queue */
local_irq_save(flags);
@@ -2845,16 +2847,18 @@ static int arm_smmu_init_one_queue(struct 
arm_smmu_device *smmu,

   

Re: [RFC v2] /dev/iommu uAPI proposal

2021-08-05 Thread Jason Gunthorpe via iommu
On Wed, Aug 04, 2021 at 10:59:21PM +, Tian, Kevin wrote:
> > From: Jason Gunthorpe 
> > Sent: Wednesday, August 4, 2021 10:05 PM
> > 
> > On Mon, Aug 02, 2021 at 02:49:44AM +, Tian, Kevin wrote:
> > 
> > > Can you elaborate? IMO the user only cares about the label (device cookie
> > > plus optional vPASID) which is generated by itself when doing the 
> > > attaching
> > > call, and expects this virtual label being used in various spots 
> > > (invalidation,
> > > page fault, etc.). How the system labels the traffic (the physical RID or 
> > > RID+
> > > PASID) should be completely invisible to userspace.
> > 
> > I don't think that is true if the vIOMMU driver is also emulating
> > PASID. Presumably the same is true for other PASID-like schemes.
> > 
> 
> I'm getting even more confused with this comment. Isn't it the
> consensus from day one that physical PASID should not be exposed
> to userspace as doing so breaks live migration? 

Uh, no?

> with PASID emulation vIOMMU only cares about vPASID instead of
> pPASID, and the uAPI only requires user to register vPASID instead
> of reporting pPASID back to userspace...

vPASID is only a feature of one device in existence, so we can't make
vPASID mandatory.

Jason


Re: [PATCH] iommu/arm-smmu-v3: Remove some unneeded init in arm_smmu_cmdq_issue_cmdlist()

2021-08-05 Thread Robin Murphy

On 2021-06-21 17:36, John Garry wrote:

Members of struct "llq" will be zero-inited, apart from member max_n_shift.
But we write llq.val straight after the init, so it was pointless to zero
init those other members. As such, separately init member max_n_shift
only.

In addition, struct "head" is initialised to "llq" only so that member
max_n_shift is set. But that member is never referenced for "head", so
remove any init there.

Removing these initializations is seen as a small performance optimisation,
as this code is (very) hot path.


I looked at this and immediately thought "surely the compiler can see 
that all the prod/cons/val fields are written anyway and elide the 
initialisation?", so I dumped the before and after disassembly, and... oh.


You should probably clarify that it's zero-initialising all the 
cacheline padding which is both pointless and painful. With that,


Reviewed-by: Robin Murphy 

However, having looked this closely I'm now tangentially wondering why 
max_n_shift isn't inside the padded union? It's read at the same time as 
both prod and cons by queue_has_space(), and never updated, so there 
doesn't appear to be any benefit to it being in a separate cacheline all 
by itself, and llq is already twice as big as it needs to be. Sorting 
that would also be a good opportunity to store the value of interest in 
its appropriate form so we're not needlessly recalculating 1 << shift 
every flippin' time...


Robin.


Signed-off-by: John Garry 

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 54b2f27b81d4..8a8ad49bb7fd 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -727,11 +727,11 @@ static int arm_smmu_cmdq_issue_cmdlist(struct 
arm_smmu_device *smmu,
unsigned long flags;
bool owner;
	struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
-   struct arm_smmu_ll_queue llq = {
-   .max_n_shift = cmdq->q.llq.max_n_shift,
-   }, head = llq;
+   struct arm_smmu_ll_queue llq, head;
int ret = 0;
  
+	llq.max_n_shift = cmdq->q.llq.max_n_shift;

+
/* 1. Allocate some space in the queue */
local_irq_save(flags);
llq.val = READ_ONCE(cmdq->q.llq.val);




Re: [PATCH] iommu/arm-smmu-v3: Remove some unneeded init in arm_smmu_cmdq_issue_cmdlist()

2021-08-05 Thread Will Deacon
On Thu, Aug 05, 2021 at 11:22:15AM +0100, John Garry wrote:
> On 21/06/2021 17:36, John Garry wrote:
> > Members of struct "llq" will be zero-inited, apart from member max_n_shift.
> > But we write llq.val straight after the init, so it was pointless to zero
> > init those other members. As such, separately init member max_n_shift
> > only.
> > 
> > In addition, struct "head" is initialised to "llq" only so that member
> > max_n_shift is set. But that member is never referenced for "head", so
> > remove any init there.
> > 
> > Removing these initializations is seen as a small performance optimisation,
> > as this code is (very) hot path.
> > 
> 
> Hi Will,
> 
> Any chance you can pick up this small optimisation?

Yup! I've actually queued it locally, but I may end up asking Joerg to take
it directly depending on what else I queue for 5.15. So far, most of the
SMMU stuff is all part of wider refactorings.

Cheers,

Will


Re: [PATCH] of: restricted dma: Don't fail device probe on rmem init failure

2021-08-05 Thread Robin Murphy

On 2021-08-05 10:47, Will Deacon wrote:

If CONFIG_DMA_RESTRICTED_POOL=n then probing a device with a reference
to a "restricted-dma-pool" will fail with a reasonably cryptic error:

   | pci-host-generic: probe of 1.pci failed with error -22

Print a more helpful message in this case and try to continue probing
the device as we do if the kernel doesn't have the restricted DMA patches
applied, or if either CONFIG_OF_ADDRESS or CONFIG_HAS_DMA is =n.


Makes sense to me;

Reviewed-by: Robin Murphy 

Although if we allow probe to succeed when a pool really was there for a 
reason, it may end up being much more fatal if the driver then tries to 
do a DMA transfer to any old memory and the device access causes an 
SError, or the VM to be killed, or whatever. That's not quite the same 
as the stubbed cases where the respective platforms couldn't have a 
genuine pool to parse either way, but as you say it is what could happen 
already if the user tried to use an older kernel, and I think the chance 
of of_reserved_mem_device_init_by_idx() failing without something being 
terminally wrong anyway - invalid DT, not enough RAM, etc. - is low 
enough that it's probably not a major concern. Plus I'd hope that the 
memory protection schemes people do actually implement don't take such 
such a zero-tolerance approach anyway - allowing a malicious or 
malfunctioning device to take down the system because it tried to make a 
rogue access which *was* already contained seems a bit silly.


Robin.


Cc: Claire Chang 
Cc: Konrad Rzeszutek Wilk 
Cc: Robin Murphy 
Cc: Christoph Hellwig 
Cc: Rob Herring 
Signed-off-by: Will Deacon 
---
  drivers/of/address.c| 8 
  drivers/of/device.c | 2 +-
  drivers/of/of_private.h | 8 +++-
  3 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 973257434398..f6bf4b423c2a 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -997,7 +997,7 @@ int of_dma_get_range(struct device_node *np, const struct 
bus_dma_region **map)
return ret;
  }
  
-int of_dma_set_restricted_buffer(struct device *dev, struct device_node *np)

+void of_dma_set_restricted_buffer(struct device *dev, struct device_node *np)
  {
struct device_node *node, *of_node = dev->of_node;
int count, i;
@@ -1022,11 +1022,11 @@ int of_dma_set_restricted_buffer(struct device *dev, 
struct device_node *np)
 */
if (of_device_is_compatible(node, "restricted-dma-pool") &&
of_device_is_available(node))
-   return of_reserved_mem_device_init_by_idx(dev, of_node,
- i);
+   break;
}
  
-	return 0;

+   if (i != count && of_reserved_mem_device_init_by_idx(dev, of_node, i))
+   dev_warn(dev, "failed to initialise \"restricted-dma-pool\" memory 
node\n");
  }
  #endif /* CONFIG_HAS_DMA */
  
diff --git a/drivers/of/device.c b/drivers/of/device.c

index 2defdca418ec..258a2b099410 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -166,7 +166,7 @@ int of_dma_configure_id(struct device *dev, struct 
device_node *np,
arch_setup_dma_ops(dev, dma_start, size, iommu, coherent);
  
  	if (!iommu)

-   return of_dma_set_restricted_buffer(dev, np);
+   of_dma_set_restricted_buffer(dev, np);
  
  	return 0;

  }
diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
index f557bd22b0cf..bc883f69496b 100644
--- a/drivers/of/of_private.h
+++ b/drivers/of/of_private.h
@@ -163,18 +163,16 @@ struct bus_dma_region;
  #if defined(CONFIG_OF_ADDRESS) && defined(CONFIG_HAS_DMA)
  int of_dma_get_range(struct device_node *np,
const struct bus_dma_region **map);
-int of_dma_set_restricted_buffer(struct device *dev, struct device_node *np);
+void of_dma_set_restricted_buffer(struct device *dev, struct device_node *np);
  #else
  static inline int of_dma_get_range(struct device_node *np,
const struct bus_dma_region **map)
  {
return -ENODEV;
  }
-static inline int of_dma_set_restricted_buffer(struct device *dev,
-  struct device_node *np)
+static inline void of_dma_set_restricted_buffer(struct device *dev,
+   struct device_node *np)
  {
-   /* Do nothing, successfully. */
-   return 0;
  }
  #endif
  




Re: [PATCH] iommu/arm-smmu-v3: Remove some unneeded init in arm_smmu_cmdq_issue_cmdlist()

2021-08-05 Thread John Garry

On 21/06/2021 17:36, John Garry wrote:

Members of struct "llq" will be zero-inited, apart from member max_n_shift.
But we write llq.val straight after the init, so it was pointless to zero
init those other members. As such, separately init member max_n_shift
only.

In addition, struct "head" is initialised to "llq" only so that member
max_n_shift is set. But that member is never referenced for "head", so
remove any init there.

Removing these initializations is seen as a small performance optimisation,
as this code is (very) hot path.



Hi Will,

Any chance you can pick up this small optimisation?

Cheers


Signed-off-by: John Garry 

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 54b2f27b81d4..8a8ad49bb7fd 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -727,11 +727,11 @@ static int arm_smmu_cmdq_issue_cmdlist(struct 
arm_smmu_device *smmu,
unsigned long flags;
bool owner;
	struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
-   struct arm_smmu_ll_queue llq = {
-   .max_n_shift = cmdq->q.llq.max_n_shift,
-   }, head = llq;
+   struct arm_smmu_ll_queue llq, head;
int ret = 0;
  
+	llq.max_n_shift = cmdq->q.llq.max_n_shift;

+
/* 1. Allocate some space in the queue */
local_irq_save(flags);
llq.val = READ_ONCE(cmdq->q.llq.val);





[PATCH] of: restricted dma: Don't fail device probe on rmem init failure

2021-08-05 Thread Will Deacon
If CONFIG_DMA_RESTRICTED_POOL=n then probing a device with a reference
to a "restricted-dma-pool" will fail with a reasonably cryptic error:

  | pci-host-generic: probe of 1.pci failed with error -22

Print a more helpful message in this case and try to continue probing
the device as we do if the kernel doesn't have the restricted DMA patches
applied, or if either CONFIG_OF_ADDRESS or CONFIG_HAS_DMA is =n.

Cc: Claire Chang 
Cc: Konrad Rzeszutek Wilk 
Cc: Robin Murphy 
Cc: Christoph Hellwig 
Cc: Rob Herring 
Signed-off-by: Will Deacon 
---
 drivers/of/address.c| 8 
 drivers/of/device.c | 2 +-
 drivers/of/of_private.h | 8 +++-
 3 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 973257434398..f6bf4b423c2a 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -997,7 +997,7 @@ int of_dma_get_range(struct device_node *np, const struct 
bus_dma_region **map)
return ret;
 }
 
-int of_dma_set_restricted_buffer(struct device *dev, struct device_node *np)
+void of_dma_set_restricted_buffer(struct device *dev, struct device_node *np)
 {
struct device_node *node, *of_node = dev->of_node;
int count, i;
@@ -1022,11 +1022,11 @@ int of_dma_set_restricted_buffer(struct device *dev, 
struct device_node *np)
 */
if (of_device_is_compatible(node, "restricted-dma-pool") &&
of_device_is_available(node))
-   return of_reserved_mem_device_init_by_idx(dev, of_node,
- i);
+   break;
}
 
-   return 0;
+   if (i != count && of_reserved_mem_device_init_by_idx(dev, of_node, i))
+   dev_warn(dev, "failed to initialise \"restricted-dma-pool\" 
memory node\n");
 }
 #endif /* CONFIG_HAS_DMA */
 
diff --git a/drivers/of/device.c b/drivers/of/device.c
index 2defdca418ec..258a2b099410 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -166,7 +166,7 @@ int of_dma_configure_id(struct device *dev, struct 
device_node *np,
arch_setup_dma_ops(dev, dma_start, size, iommu, coherent);
 
if (!iommu)
-   return of_dma_set_restricted_buffer(dev, np);
+   of_dma_set_restricted_buffer(dev, np);
 
return 0;
 }
diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
index f557bd22b0cf..bc883f69496b 100644
--- a/drivers/of/of_private.h
+++ b/drivers/of/of_private.h
@@ -163,18 +163,16 @@ struct bus_dma_region;
 #if defined(CONFIG_OF_ADDRESS) && defined(CONFIG_HAS_DMA)
 int of_dma_get_range(struct device_node *np,
const struct bus_dma_region **map);
-int of_dma_set_restricted_buffer(struct device *dev, struct device_node *np);
+void of_dma_set_restricted_buffer(struct device *dev, struct device_node *np);
 #else
 static inline int of_dma_get_range(struct device_node *np,
const struct bus_dma_region **map)
 {
return -ENODEV;
 }
-static inline int of_dma_set_restricted_buffer(struct device *dev,
-  struct device_node *np)
+static inline void of_dma_set_restricted_buffer(struct device *dev,
+   struct device_node *np)
 {
-   /* Do nothing, successfully. */
-   return 0;
 }
 #endif
 
-- 
2.32.0.605.g8dce9f2422-goog



Re: [PATCH v3 02/25] iommu/amd: Drop IOVA cookie management

2021-08-05 Thread Robin Murphy

On 2021-08-04 18:15, Robin Murphy wrote:

The core code bakes its own cookies now.

Signed-off-by: Robin Murphy 

---

v3: Also remove unneeded include
---
  drivers/iommu/amd/iommu.c | 13 -
  1 file changed, 13 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 52fe2326042a..92f7cbe3d14a 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -20,7 +20,6 @@
  #include 
  #include 
  #include 
-#include 


Oh dear, how embarrassing... I went through all the drivers making that 
decision based on iommu_dma* references but totally forgot about 
iommu_setup_dma_ops() here. And then of course fell into the trap of 
"such a minor change I don't need to re-rest it" hubris... sigh, roll 
back to v2 for this one.


Apologies,
Robin.


  #include 
  #include 
  #include 
@@ -1918,16 +1917,7 @@ static struct iommu_domain 
*amd_iommu_domain_alloc(unsigned type)
domain->domain.geometry.aperture_end   = ~0ULL;
domain->domain.geometry.force_aperture = true;
  
-	if (type == IOMMU_DOMAIN_DMA &&

-   iommu_get_dma_cookie(&domain->domain) == -ENOMEM)
-   goto free_domain;
-
	return &domain->domain;
-
-free_domain:
-   protection_domain_free(domain);
-
-   return NULL;
  }
  
  static void amd_iommu_domain_free(struct iommu_domain *dom)

@@ -1944,9 +1934,6 @@ static void amd_iommu_domain_free(struct iommu_domain 
*dom)
if (!dom)
return;
  
-	if (dom->type == IOMMU_DOMAIN_DMA)

-   iommu_put_dma_cookie(&domain->domain);
-
if (domain->flags & PD_IOMMUV2_MASK)
free_gcr3_table(domain);
  




RE: [PATCH v3 06/25] iommu/ipmmu-vmsa: Drop IOVA cookie management

2021-08-05 Thread Yoshihiro Shimoda
Hi Robin,

> From: Robin Murphy, Sent: Thursday, August 5, 2021 2:16 AM
> 
> The core code bakes its own cookies now.
> 
> CC: Yoshihiro Shimoda 
> CC: Geert Uytterhoeven 
> Signed-off-by: Robin Murphy 

Thank you for the patch!
I tested on my environment (r8a77951-salvator-xs),
and I didn't observe any regression. So,

Reviewed-by: Yoshihiro Shimoda 
Tested-by: Yoshihiro Shimoda 

Best regards,
Yoshihiro Shimoda



RE: [PATCH v3 01/25] iommu: Pull IOVA cookie management into the core

2021-08-05 Thread Yoshihiro Shimoda
Hi Robin,

> From: Robin Murphy, Sent: Thursday, August 5, 2021 2:15 AM
> 
> Now that everyone has converged on iommu-dma for IOMMU_DOMAIN_DMA
> support, we can abandon the notion of drivers being responsible for the
> cookie type, and consolidate all the management into the core code.
> 
> CC: Marek Szyprowski 
> CC: Yoshihiro Shimoda 
> CC: Geert Uytterhoeven 
> CC: Yong Wu 
> CC: Heiko Stuebner 
> CC: Chunyan Zhang 
> CC: Maxime Ripard 
> Reviewed-by: Jean-Philippe Brucker 
> Reviewed-by: Lu Baolu 
> Signed-off-by: Robin Murphy 

Thank you for the patch!
I tested on my environment (r8a77951-salvator-xs),
and I didn't observe any regression. So,

Tested-by: Yoshihiro Shimoda 

Best regards,
Yoshihiro Shimoda



[PATCH v7 9/9] iommu/dma: Reserve any RMR regions associated with a dev

2021-08-05 Thread Shameer Kolothum
Get ACPI IORT RMR regions associated with a dev reserved
so that there is a unity mapping for them in SMMU.

Signed-off-by: Shameer Kolothum 
---
 drivers/iommu/dma-iommu.c | 56 +++
 1 file changed, 51 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 1b6e27475279..c1ae0c3d4b33 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -207,22 +207,68 @@ void iommu_dma_put_rmrs(struct fwnode_handle 
*iommu_fwnode,
 }
 EXPORT_SYMBOL(iommu_dma_put_rmrs);
 
+static bool iommu_dma_dev_has_rmr(struct iommu_fwspec *fwspec,
+ struct iommu_resv_region *e)
+{
+   int i;
+
+   for (i = 0; i < fwspec->num_ids; i++) {
+   if (e->fw_data.rmr.sid == fwspec->ids[i])
+   return true;
+   }
+
+   return false;
+}
+
+static void iommu_dma_get_rmr_resv_regions(struct device *dev,
+  struct list_head *list)
+{
+   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+   struct list_head rmr_list;
+   struct iommu_resv_region *rmr, *tmp;
+
+   INIT_LIST_HEAD(&rmr_list);
+   if (iommu_dma_get_rmrs(fwspec->iommu_fwnode, &rmr_list))
+   return;
+
+   if (dev_is_pci(dev)) {
+   struct pci_dev *pdev = to_pci_dev(dev);
+   struct pci_host_bridge *host = pci_find_host_bridge(pdev->bus);
+
+   if (!host->preserve_config)
+   return;
+   }
+
+   list_for_each_entry_safe(rmr, tmp, &rmr_list, list) {
+   if (!iommu_dma_dev_has_rmr(fwspec, rmr))
+   continue;
+
+   /* Remove from iommu RMR list and add to dev resv_regions */
+   list_del_init(&rmr->list);
+   list_add_tail(&rmr->list, list);
+   }
+
+   iommu_dma_put_rmrs(fwspec->iommu_fwnode, &rmr_list);
+}
+
 /**
  * iommu_dma_get_resv_regions - Reserved region driver helper
  * @dev: Device from iommu_get_resv_regions()
  * @list: Reserved region list from iommu_get_resv_regions()
  *
  * IOMMU drivers can use this to implement their .get_resv_regions callback
- * for general non-IOMMU-specific reservations. Currently, this covers GICv3
- * ITS region reservation on ACPI based ARM platforms that may require HW MSI
- * reservation.
+ * for general non-IOMMU-specific reservations. Currently this covers,
+ *  -GICv3 ITS region reservation on ACPI based ARM platforms that may
+ *   require HW MSI reservation.
+ *  -Any ACPI IORT RMR memory range reservations (IORT spec rev E.b)
  */
 void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list)
 {
 
-   if (!is_of_node(dev_iommu_fwspec_get(dev)->iommu_fwnode))
+   if (!is_of_node(dev_iommu_fwspec_get(dev)->iommu_fwnode)) {
iort_iommu_msi_get_resv_regions(dev, list);
-
+   iommu_dma_get_rmr_resv_regions(dev, list);
+   }
 }
 EXPORT_SYMBOL(iommu_dma_get_resv_regions);
 
-- 
2.17.1



[PATCH v7 8/9] iommu/arm-smmu: Get associated RMR info and install bypass SMR

2021-08-05 Thread Shameer Kolothum
From: Jon Nettleton 

Check if there is any RMR info associated with the devices behind
the SMMU and if any, install bypass SMRs for them. This is to
keep any ongoing traffic associated with these devices alive
when we enable/reset SMMU during probe().

Signed-off-by: Jon Nettleton 
Signed-off-by: Steven Price 
Signed-off-by: Shameer Kolothum 
---
 drivers/iommu/arm/arm-smmu/arm-smmu.c | 48 +++
 1 file changed, 48 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index f22dbeb1e510..315feab9e85b 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -2063,6 +2063,50 @@ err_reset_platform_ops: __maybe_unused;
return err;
 }
 
+static void arm_smmu_rmr_install_bypass_smr(struct arm_smmu_device *smmu)
+{
+   struct list_head rmr_list;
+   struct iommu_resv_region *e;
+   int i, cnt = 0;
+   u32 reg;
+
+   INIT_LIST_HEAD(&rmr_list);
+   if (iommu_dma_get_rmrs(dev_fwnode(smmu->dev), &rmr_list))
+   return;
+
+   /*
+* Rather than trying to look at existing mappings that
+* are setup by the firmware and then invalidate the ones
+* that do not have matching RMR entries, just disable the
+* SMMU until it gets enabled again in the reset routine.
+*/
+   reg = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_sCR0);
+   reg |= ARM_SMMU_sCR0_CLIENTPD;
+   arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_sCR0, reg);
+
+   list_for_each_entry(e, &rmr_list, list) {
+   u32 sid = e->fw_data.rmr.sid;
+
+   i = arm_smmu_find_sme(smmu, sid, ~0);
+   if (i < 0)
+   continue;
+   if (smmu->s2crs[i].count == 0) {
+   smmu->smrs[i].id = sid;
+   smmu->smrs[i].mask = 0;
+   smmu->smrs[i].valid = true;
+   }
+   smmu->s2crs[i].count++;
+   smmu->s2crs[i].type = S2CR_TYPE_BYPASS;
+   smmu->s2crs[i].privcfg = S2CR_PRIVCFG_DEFAULT;
+
+   cnt++;
+   }
+
+   dev_notice(smmu->dev, "\tpreserved %d boot mapping%s\n", cnt,
+  cnt == 1 ? "" : "s");
+   iommu_dma_put_rmrs(dev_fwnode(smmu->dev), &rmr_list);
+}
+
 static int arm_smmu_device_probe(struct platform_device *pdev)
 {
struct resource *res;
@@ -2189,6 +2233,10 @@ static int arm_smmu_device_probe(struct platform_device 
*pdev)
}
 
platform_set_drvdata(pdev, smmu);
+
+   /* Check for RMRs and install bypass SMRs if any */
+   arm_smmu_rmr_install_bypass_smr(smmu);
+
arm_smmu_device_reset(smmu);
arm_smmu_test_smr_masks(smmu);
 
-- 
2.17.1



[PATCH v7 7/9] iommu/arm-smmu-v3: Get associated RMR info and install bypass STE

2021-08-05 Thread Shameer Kolothum
Check if there is any RMR info associated with the devices behind
the SMMUv3 and if any, install bypass STEs for them. This is to
keep any ongoing traffic associated with these devices alive
when we enable/reset SMMUv3 during probe().

Signed-off-by: Shameer Kolothum 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 31 +
 1 file changed, 31 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 85f6f1925a36..1165605d6f7a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3760,6 +3760,34 @@ static void __iomem *arm_smmu_ioremap(struct device 
*dev, resource_size_t start,
	return devm_ioremap_resource(dev, &res);
 }
 
+static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu)
+{
+   struct list_head rmr_list;
+   struct iommu_resv_region *e;
+   int ret;
+
+   INIT_LIST_HEAD(&rmr_list);
+   if (iommu_dma_get_rmrs(dev_fwnode(smmu->dev), &rmr_list))
+   return;
+
+   list_for_each_entry(e, &rmr_list, list) {
+   __le64 *step;
+   u32 sid = e->fw_data.rmr.sid;
+
+   ret = arm_smmu_init_sid_strtab(smmu, sid);
+   if (ret) {
+   dev_err(smmu->dev, "RMR SID(0x%x) bypass failed\n",
+   sid);
+   continue;
+   }
+
+   step = arm_smmu_get_step_for_sid(smmu, sid);
+   arm_smmu_init_bypass_stes(step, 1, true);
+   }
+
+   iommu_dma_put_rmrs(dev_fwnode(smmu->dev), &rmr_list);
+}
+
 static int arm_smmu_device_probe(struct platform_device *pdev)
 {
int irq, ret;
@@ -3841,6 +3869,9 @@ static int arm_smmu_device_probe(struct platform_device 
*pdev)
/* Record our private device structure */
platform_set_drvdata(pdev, smmu);
 
+   /* Check for RMRs and install bypass STEs if any */
+   arm_smmu_rmr_install_bypass_ste(smmu);
+
/* Reset the device */
ret = arm_smmu_device_reset(smmu, bypass);
if (ret)
-- 
2.17.1


[PATCH v7 6/9] iommu/arm-smmu-v3: Refactor arm_smmu_init_bypass_stes() to force bypass

2021-08-05 Thread Shameer Kolothum
By default, the disable_bypass flag is set and any dev without
an iommu domain installs an STE with CFG_ABORT during
arm_smmu_init_bypass_stes(). Introduce a "force" flag and
move the STE update logic to arm_smmu_init_bypass_stes()
so that we can force it to install CFG_BYPASS STE for specific
SIDs.

This will be useful in follow-up patch to install bypass
for IORT RMR SIDs.

Signed-off-by: Shameer Kolothum 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 31940e53c675..85f6f1925a36 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1357,12 +1357,21 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd);
 }
 
-static void arm_smmu_init_bypass_stes(__le64 *strtab, unsigned int nent)
+static void arm_smmu_init_bypass_stes(__le64 *strtab, unsigned int nent, bool force)
 {
unsigned int i;
+   u64 val = STRTAB_STE_0_V;
+
+   if (disable_bypass && !force)
+   val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT);
+   else
+   val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
 
for (i = 0; i < nent; ++i) {
-   arm_smmu_write_strtab_ent(NULL, -1, strtab);
+   strtab[0] = cpu_to_le64(val);
+   strtab[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
+  STRTAB_STE_1_SHCFG_INCOMING));
+   strtab[2] = 0;
strtab += STRTAB_STE_DWORDS;
}
 }
@@ -1390,7 +1399,7 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
return -ENOMEM;
}
 
-   arm_smmu_init_bypass_stes(desc->l2ptr, 1 << STRTAB_SPLIT);
+   arm_smmu_init_bypass_stes(desc->l2ptr, 1 << STRTAB_SPLIT, false);
arm_smmu_write_strtab_l1_desc(strtab, desc);
return 0;
 }
@@ -3042,7 +3051,7 @@ static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu)
reg |= FIELD_PREP(STRTAB_BASE_CFG_LOG2SIZE, smmu->sid_bits);
cfg->strtab_base_cfg = reg;
 
-   arm_smmu_init_bypass_stes(strtab, cfg->num_l1_ents);
+   arm_smmu_init_bypass_stes(strtab, cfg->num_l1_ents, false);
return 0;
 }
 
-- 
2.17.1
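
For illustration, the two call patterns the new "force" flag enables look
like this (a condensed sketch based on the diffs in this patch and in
patch 7/9 above; "step" is the STE pointer returned by
arm_smmu_get_step_for_sid()):

	/* Probe/strtab init: honour disable_bypass, so unmatched streams
	 * get CFG_ABORT STEs unless bypass is explicitly permitted. */
	arm_smmu_init_bypass_stes(desc->l2ptr, 1 << STRTAB_SPLIT, false);

	/* RMR path: force CFG_BYPASS for a specific SID regardless of
	 * disable_bypass, keeping firmware-described traffic alive. */
	step = arm_smmu_get_step_for_sid(smmu, sid);
	arm_smmu_init_bypass_stes(step, 1, true);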


[PATCH v7 5/9] iommu/arm-smmu-v3: Introduce strtab init helper

2021-08-05 Thread Shameer Kolothum
Introduce a helper to check the SID range and to initialise the l2
strtab entries (bypass). This will be useful when we have to initialise
the l2 strtab with bypass entries for the RMR SIDs.

Signed-off-by: Shameer Kolothum 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 28 +++--
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 235f9bdaeaf2..31940e53c675 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2518,6 +2518,19 @@ static bool arm_smmu_sid_in_range(struct arm_smmu_device *smmu, u32 sid)
return sid < limit;
 }
 
+static int arm_smmu_init_sid_strtab(struct arm_smmu_device *smmu, u32 sid)
+{
+   /* Check the SIDs are in range of the SMMU and our stream table */
+   if (!arm_smmu_sid_in_range(smmu, sid))
+   return -ERANGE;
+
+   /* Ensure l2 strtab is initialised */
+   if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB)
+   return arm_smmu_init_l2_strtab(smmu, sid);
+
+   return 0;
+}
+
 static int arm_smmu_insert_master(struct arm_smmu_device *smmu,
  struct arm_smmu_master *master)
 {
@@ -2541,20 +2554,9 @@ static int arm_smmu_insert_master(struct arm_smmu_device *smmu,
new_stream->id = sid;
new_stream->master = master;
 
-   /*
-* Check the SIDs are in range of the SMMU and our stream table
-*/
-   if (!arm_smmu_sid_in_range(smmu, sid)) {
-   ret = -ERANGE;
+   ret = arm_smmu_init_sid_strtab(smmu, sid);
+   if (ret)
break;
-   }
-
-   /* Ensure l2 strtab is initialised */
-   if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
-   ret = arm_smmu_init_l2_strtab(smmu, sid);
-   if (ret)
-   break;
-   }
 
/* Insert into SID tree */
new_node = &(smmu->streams.rb_node);
-- 
2.17.1



[PATCH v7 4/9] ACPI/IORT: Add a helper to retrieve RMR memory regions

2021-08-05 Thread Shameer Kolothum
Add a helper function (iort_iommu_get_rmrs()) that retrieves the RMR
memory descriptors associated with a given IOMMU. This will be used
by IOMMU drivers to set up the necessary mappings.

Invoke it from the generic helper iommu_dma_get_rmrs().

Signed-off-by: Shameer Kolothum 
---
 drivers/acpi/arm64/iort.c | 38 ++
 drivers/iommu/dma-iommu.c |  4 
 include/linux/acpi_iort.h |  7 +++
 3 files changed, 49 insertions(+)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index d76ba46ebe67..3c32d62e63b6 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -809,6 +809,42 @@ static struct acpi_iort_node *iort_get_msi_resv_iommu(struct device *dev)
return NULL;
 }
 
+/**
+ * iort_iommu_get_rmrs() - Helper to retrieve RMR info associated with IOMMU
+ * @iommu_fwnode: fwnode for the IOMMU
+ * @head: RMR list head to be populated
+ *
+ * Returns: 0 on success, <0 on failure. Note that we keep any already
+ * allocated RMR reserve regions on the list in case of a kmemdup()
+ * failure.
+ */
+int iort_iommu_get_rmrs(struct fwnode_handle *iommu_fwnode,
+   struct list_head *head)
+{
+   struct iommu_resv_region *e;
+   struct acpi_iort_node *iommu;
+   int rmrs = 0;
+
+   iommu = iort_get_iort_node(iommu_fwnode);
+   if (!iommu || list_empty(&iort_rmr_list))
+   return -ENODEV;
+
+   list_for_each_entry(e, &iort_rmr_list, list) {
+   struct iommu_resv_region *region;
+
+   if (e->fw_data.rmr.smmu != iommu)
+   continue;
+
+   region = kmemdup(e, sizeof(*region), GFP_KERNEL);
+   if (region) {
+   list_add_tail(&region->list, head);
+   rmrs++;
+   }
+   }
+
+   return (rmrs == 0) ? -ENODEV : 0;
+}
+
 /**
  * iort_iommu_msi_get_resv_regions - Reserved region driver helper
  * @dev: Device from iommu_get_resv_regions()
@@ -1041,6 +1077,8 @@ int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
 { return 0; }
 int iort_iommu_configure_id(struct device *dev, const u32 *input_id)
 { return -ENODEV; }
+int iort_iommu_get_rmrs(struct fwnode_handle *fwnode, struct list_head *head)
+{ return -ENODEV; }
 #endif
 
 static int nc_dma_get_range(struct device *dev, u64 *size)
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 2fa2445e9070..1b6e27475279 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -185,6 +185,9 @@ EXPORT_SYMBOL(iommu_put_dma_cookie);
 int iommu_dma_get_rmrs(struct fwnode_handle *iommu_fwnode,
   struct list_head *list)
 {
+   if (!is_of_node(iommu_fwnode))
+   return iort_iommu_get_rmrs(iommu_fwnode, list);
+
return -EINVAL;
 }
 EXPORT_SYMBOL(iommu_dma_get_rmrs);
@@ -200,6 +203,7 @@ EXPORT_SYMBOL(iommu_dma_get_rmrs);
 void iommu_dma_put_rmrs(struct fwnode_handle *iommu_fwnode,
struct list_head *list)
 {
+   generic_iommu_put_resv_regions(iommu_fwnode->dev, list);
 }
 EXPORT_SYMBOL(iommu_dma_put_rmrs);
 
diff --git a/include/linux/acpi_iort.h b/include/linux/acpi_iort.h
index f1f0842a2cb2..d8c030c103f5 100644
--- a/include/linux/acpi_iort.h
+++ b/include/linux/acpi_iort.h
@@ -38,6 +38,8 @@ int iort_dma_get_ranges(struct device *dev, u64 *size);
 int iort_iommu_configure_id(struct device *dev, const u32 *id_in);
int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head);
 phys_addr_t acpi_iort_dma_get_max_cpu_address(void);
+int iort_iommu_get_rmrs(struct fwnode_handle *iommu_fwnode,
+   struct list_head *list);
 #else
 static inline void acpi_iort_init(void) { }
 static inline u32 iort_msi_map_id(struct device *dev, u32 id)
@@ -57,6 +59,11 @@ int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
 
 static inline phys_addr_t acpi_iort_dma_get_max_cpu_address(void)
 { return PHYS_ADDR_MAX; }
+
+static inline
+int iort_iommu_get_rmrs(struct fwnode_handle *iommu_fwnode,
+   struct list_head *list)
+{ return -ENODEV; }
 #endif
 
 #endif /* __ACPI_IORT_H__ */
-- 
2.17.1



[PATCH v7 3/9] iommu/dma: Introduce generic helper to retrieve RMR info

2021-08-05 Thread Shameer Kolothum
Reserved Memory Regions (RMRs) associated with an IOMMU can be
described through ACPI IORT tables in systems with devices
that require a unity mapping or bypass for those
regions.

Introduce a generic interface so that IOMMU drivers can retrieve
and set up necessary mappings.

Signed-off-by: Shameer Kolothum 
---
 drivers/iommu/dma-iommu.c | 29 +
 include/linux/dma-iommu.h | 13 +
 2 files changed, 42 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 98ba927aee1a..2fa2445e9070 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -174,6 +174,35 @@ void iommu_put_dma_cookie(struct iommu_domain *domain)
 }
 EXPORT_SYMBOL(iommu_put_dma_cookie);
 
+/**
+ * iommu_dma_get_rmrs - Retrieve Reserved Memory Regions (RMRs) associated
+ *  with a given IOMMU
+ * @iommu_fwnode: fwnode associated with the IOMMU
+ * @list: RMR list to be populated
+ */
+int iommu_dma_get_rmrs(struct fwnode_handle *iommu_fwnode,
+  struct list_head *list)
+{
+   return -EINVAL;
+}
+EXPORT_SYMBOL(iommu_dma_get_rmrs);
+
+/**
+ * iommu_dma_put_rmrs - Release Reserved Memory Regions (RMRs) associated
+ *  with a given IOMMU
+ * @iommu_fwnode: fwnode associated with the IOMMU
+ * @list: RMR list
+ */
+void iommu_dma_put_rmrs(struct fwnode_handle *iommu_fwnode,
+   struct list_head *list)
+{
+}
+EXPORT_SYMBOL(iommu_dma_put_rmrs);
+
 /**
  * iommu_dma_get_resv_regions - Reserved region driver helper
  * @dev: Device from iommu_get_resv_regions()
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 758ca4694257..3b7b2d096c6e 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -42,12 +42,16 @@ void iommu_dma_free_cpu_cached_iovas(unsigned int cpu,
 
 extern bool iommu_dma_forcedac;
 
+int iommu_dma_get_rmrs(struct fwnode_handle *iommu, struct list_head *list);
+void iommu_dma_put_rmrs(struct fwnode_handle *iommu, struct list_head *list);
+
 #else /* CONFIG_IOMMU_DMA */
 
 struct iommu_domain;
 struct msi_desc;
 struct msi_msg;
 struct device;
+struct fwnode_handle;
 
 static inline void iommu_setup_dma_ops(struct device *dev, u64 dma_base,
   u64 dma_limit)
@@ -83,5 +87,14 @@ static inline void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list)
 {
 }
 
+static inline int iommu_dma_get_rmrs(struct fwnode_handle *iommu, struct list_head *list)
+{
+   return -ENODEV;
+}
+
+static inline void iommu_dma_put_rmrs(struct fwnode_handle *iommu, struct list_head *list)
+{
+}
+
 #endif /* CONFIG_IOMMU_DMA */
 #endif /* __DMA_IOMMU_H */
-- 
2.17.1
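
The fwnode-based dispatch is what lets firmware-specific backends slot in
behind this single interface. Once the IORT backend is wired up in patch
4/9, the helper takes roughly this shape (the devicetree branch below is
hypothetical, shown only to mark where such a backend would hook in):

	int iommu_dma_get_rmrs(struct fwnode_handle *iommu_fwnode,
			       struct list_head *list)
	{
		/* ACPI systems: delegate to the IORT parser */
		if (!is_of_node(iommu_fwnode))
			return iort_iommu_get_rmrs(iommu_fwnode, list);

		/* A DT backend could be dispatched here, e.g.
		 * (hypothetical): of_iommu_get_rmrs(iommu_fwnode, list);
		 */
		return -EINVAL;
	}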



[PATCH v7 2/9] ACPI/IORT: Add support for RMR node parsing

2021-08-05 Thread Shameer Kolothum
Add support for parsing RMR node information from ACPI.

Find the associated stream ID and SMMU node info from the
RMR node and populate a linked list with the RMR memory
descriptors.

Signed-off-by: Shameer Kolothum 
---
 drivers/acpi/arm64/iort.c | 134 +-
 1 file changed, 133 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 3b23fb775ac4..d76ba46ebe67 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -40,6 +40,8 @@ struct iort_fwnode {
 static LIST_HEAD(iort_fwnode_list);
 static DEFINE_SPINLOCK(iort_fwnode_lock);
 
+static LIST_HEAD(iort_rmr_list);   /* list of RMR regions from ACPI */
+
 /**
  * iort_set_fwnode() - Create iort_fwnode and use it to register
  *iommu data in the iort_fwnode_list
@@ -393,7 +395,8 @@ static struct acpi_iort_node *iort_node_get_id(struct acpi_iort_node *node,
if (node->type == ACPI_IORT_NODE_NAMED_COMPONENT ||
node->type == ACPI_IORT_NODE_PCI_ROOT_COMPLEX ||
node->type == ACPI_IORT_NODE_SMMU_V3 ||
-   node->type == ACPI_IORT_NODE_PMCG) {
+   node->type == ACPI_IORT_NODE_PMCG ||
+   node->type == ACPI_IORT_NODE_RMR) {
*id_out = map->output_base;
return parent;
}
@@ -1566,6 +1569,134 @@ static void __init iort_enable_acs(struct acpi_iort_node *iort_node)
 #else
 static inline void iort_enable_acs(struct acpi_iort_node *iort_node) { }
 #endif
+static void iort_rmr_desc_check_overlap(struct acpi_iort_rmr_desc *desc, u32 count)
+{
+   int i, j;
+
+   for (i = 0; i < count; i++) {
+   u64 end, start = desc[i].base_address, length = desc[i].length;
+
+   end = start + length - 1;
+
+   /* Check for address overlap */
+   for (j = i + 1; j < count; j++) {
+   u64 e_start = desc[j].base_address;
+   u64 e_end = e_start + desc[j].length - 1;
+
+   if (start <= e_end && end >= e_start)
+   pr_err(FW_BUG "RMR descriptor[0x%llx - 0x%llx] overlaps, continue anyway\n",
+  start, end);
+   }
+   }
+}
+
+static void __init iort_node_get_rmr_info(struct acpi_iort_node *iort_node)
+{
+   struct acpi_iort_node *smmu;
+   struct acpi_iort_rmr *rmr;
+   struct acpi_iort_rmr_desc *rmr_desc;
+   u32 map_count = iort_node->mapping_count;
+   u32 sid;
+   int i;
+
+   if (!iort_node->mapping_offset || map_count != 1) {
+   pr_err(FW_BUG "Invalid ID mapping, skipping RMR node %p\n",
+  iort_node);
+   return;
+   }
+
+   /* Retrieve associated smmu and stream id */
+   smmu = iort_node_get_id(iort_node, &sid, 0);
+   if (!smmu) {
+   pr_err(FW_BUG "Invalid SMMU reference, skipping RMR node %p\n",
+  iort_node);
+   return;
+   }
+
+   /* Retrieve RMR data */
+   rmr = (struct acpi_iort_rmr *)iort_node->node_data;
+   if (!rmr->rmr_offset || !rmr->rmr_count) {
+   pr_err(FW_BUG "Invalid RMR descriptor array, skipping RMR node %p\n",
+  iort_node);
+   return;
+   }
+
+   rmr_desc = ACPI_ADD_PTR(struct acpi_iort_rmr_desc, iort_node,
+   rmr->rmr_offset);
+
+   iort_rmr_desc_check_overlap(rmr_desc, rmr->rmr_count);
+
+   for (i = 0; i < rmr->rmr_count; i++, rmr_desc++) {
+   struct iommu_resv_region *region;
+   enum iommu_resv_type type;
+   int prot = IOMMU_READ | IOMMU_WRITE;
+   u64 addr = rmr_desc->base_address, size = rmr_desc->length;
+
+   if (!IS_ALIGNED(addr, SZ_64K) || !IS_ALIGNED(size, SZ_64K)) {
+   /* PAGE align base addr and size */
+   addr &= PAGE_MASK;
+   size = PAGE_ALIGN(size + offset_in_page(rmr_desc->base_address));
+
+   pr_err(FW_BUG "RMR descriptor[0x%llx - 0x%llx] not aligned to 64K, continue with [0x%llx - 0x%llx]\n",
+  rmr_desc->base_address,
+  rmr_desc->base_address + rmr_desc->length - 1,
+  addr, addr + size - 1);
+   }
+   if (rmr->flags & IOMMU_RMR_REMAP_PERMITTED) {
+   type = IOMMU_RESV_DIRECT_RELAXABLE;
+   /*
+* Set IOMMU_CACHE as IOMMU_RESV_DIRECT_RELAXABLE is
+* normally used for allocated system memory that is
+* then used for device specific reserved regions.
+*/
+   prot |= IOMMU_CACHE;
+   } else {
+   type = IOMMU_RESV_DIRECT;
+   prot |= IOMMU_MMIO;
+   }

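A quick worked example of the overlap predicate in
iort_rmr_desc_check_overlap() above (values invented for illustration):

	/*
	 * desc[0]: base 0x10000, length 0x10000  =>  [0x10000, 0x1ffff]
	 * desc[1]: base 0x18000, length 0x10000  =>  [0x18000, 0x27fff]
	 *
	 * start (0x10000) <= e_end (0x27fff) and end (0x1ffff) >= e_start
	 * (0x18000), so the FW_BUG overlap error is printed, but parsing
	 * deliberately continues.
	 */
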
[PATCH v7 1/9] iommu: Introduce a union to struct iommu_resv_region

2021-08-05 Thread Shameer Kolothum
A union is introduced to struct iommu_resv_region to hold
any firmware-specific data. This is in preparation for adding
support for IORT RMR reserve regions; for now the union holds
only the RMR-specific information.

Signed-off-by: Shameer Kolothum 
---
 include/linux/iommu.h | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 32d448050bf7..bd0e4641c569 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -114,6 +114,13 @@ enum iommu_resv_type {
IOMMU_RESV_SW_MSI,
 };
 
+struct iommu_iort_rmr_data {
+#define IOMMU_RMR_REMAP_PERMITTED  (1 << 0)
+   u32 flags;
+   u32 sid;/* Stream Id associated with RMR entry */
+   void *smmu; /* Associated IORT SMMU node pointer */
+};
+
 /**
  * struct iommu_resv_region - descriptor for a reserved memory region
  * @list: Linked list pointers
@@ -121,6 +128,7 @@ enum iommu_resv_type {
  * @length: Length of the region in bytes
  * @prot: IOMMU Protection flags (READ/WRITE/...)
  * @type: Type of the reserved region
+ * @rmr: ACPI IORT RMR specific data
  */
 struct iommu_resv_region {
struct list_headlist;
@@ -128,6 +136,9 @@ struct iommu_resv_region {
size_t  length;
int prot;
enum iommu_resv_typetype;
+   union {
+   struct iommu_iort_rmr_data rmr;
+   } fw_data;
 };
 
 /**
-- 
2.17.1
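
As a sketch of how the new union gets populated (condensed from the IORT
parsing patch earlier in this digest; allocation failure handling elided):

	struct iommu_resv_region *region;

	region = iommu_alloc_resv_region(addr, size, prot, type);
	if (!region)
		return;

	/* Stash the firmware-specific routing data alongside the region */
	region->fw_data.rmr.flags = rmr->flags;
	region->fw_data.rmr.sid = sid;
	region->fw_data.rmr.smmu = smmu;
	list_add_tail(&region->list, &iort_rmr_list);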



[PATCH v7 0/9] ACPI/IORT: Support for IORT RMR node

2021-08-05 Thread Shameer Kolothum
Hi,

The series adds support for IORT RMR nodes specified in IORT
Revision E.b - ARM DEN 0049E[0]. RMR nodes are used to describe
memory ranges that are used by endpoints and require a unity
mapping in the SMMU.

We have faced issues with 3408iMR RAID controller cards which
fail to boot when the SMMU is enabled. This is because these
controllers make use of host memory for various caching-related
purposes, and when the SMMU is enabled the iMR firmware fails to
access these memory regions as there is no mapping for them.
IORT RMR provides a way for UEFI to describe and report these
memory regions so that the kernel can create a unity mapping for
them in the SMMU.

Change History: 

v6 --> v7

The only change from v6 is the fix pointed out by Steve to
the SMMUv2 SMR bypass install in patch #8.

Thanks to the Tested-by tags by Laurentiu with SMMUv2 and
Hanjun/Huiqiang with SMMUv3 for v6. I haven't added the tags
yet as the series still needs more review[1].

Feedback and tests on this series are very much appreciated.

v5 --> v6
- Addressed comments from Robin & Lorenzo.
  : Moved iort_parse_rmr() to acpi_iort_init() from
iort_init_platform_devices().
  : Removed use of struct iort_rmr_entry during the initial
parse. Using struct iommu_resv_region instead.
  : Report RMR address alignment and overlap errors, but continue.
  : Reworked arm_smmu_init_bypass_stes() (patch # 6).
- Updated SMMUv2 bypass SMR code. Thanks to Jon N (patch #8).
- Set IOMMU protection flags(IOMMU_CACHE, IOMMU_MMIO) based
  on Type of RMR region. Suggested by Jon N.

Thanks,
Shameer
[0] https://developer.arm.com/documentation/den0049/latest/
[1] 
https://lore.kernel.org/linux-acpi/20210716083442.1708-1-shameerali.kolothum.th...@huawei.com/T/#m043c95b869973a834b2fd57f3e1ed0325c84f3b7
--
v4 --> v5
 -Added a fw_data union to struct iommu_resv_region and removed
  struct iommu_rmr (Based on comments from Joerg/Robin).
 -Added iommu_put_rmrs() to release mem.
 -Thanks to Steve for verifying on SMMUv2, but not added the Tested-by
  yet because of the above changes.

v3 -->v4
-Included the SMMUv2 SMR bypass install changes suggested by
 Steve (patch #7)
-As per Robin's comments, the RMR reserve implementation is now
 more generic (patch #8), and dropped v3 patches 8 and 10.
-Rebase to 5.13-rc1

RFC v2 --> v3
 -Dropped RFC tag as the ACPICA header changes are now ready to be
  part of 5.13[0]. But this series still has a dependency on that patch.
 -Added IORT E.b related changes(node flags, _DSM function 5 checks for
  PCIe).
 -Changed RMR to stream id mapping from M:N to M:1 as per the spec and
  discussion here[1].
 -Last two patches add support for SMMUv2(Thanks to Jon Nettleton!)
--

Jon Nettleton (1):
  iommu/arm-smmu: Get associated RMR info and install bypass SMR

Shameer Kolothum (8):
  iommu: Introduce a union to struct iommu_resv_region
  ACPI/IORT: Add support for RMR node parsing
  iommu/dma: Introduce generic helper to retrieve RMR info
  ACPI/IORT: Add a helper to retrieve RMR memory regions
  iommu/arm-smmu-v3: Introduce strtab init helper
  iommu/arm-smmu-v3: Refactor arm_smmu_init_bypass_stes() to force
bypass
  iommu/arm-smmu-v3: Get associated RMR info and install bypass STE
  iommu/dma: Reserve any RMR regions associated with a dev

 drivers/acpi/arm64/iort.c   | 172 +++-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  76 +++--
 drivers/iommu/arm/arm-smmu/arm-smmu.c   |  48 ++
 drivers/iommu/dma-iommu.c   |  89 +-
 include/linux/acpi_iort.h   |   7 +
 include/linux/dma-iommu.h   |  13 ++
 include/linux/iommu.h   |  11 ++
 7 files changed, 393 insertions(+), 23 deletions(-)

-- 
2.17.1


Re: [PATCH v3 02/25] iommu/amd: Drop IOVA cookie management

2021-08-05 Thread kernel test robot
Hi Robin,

I love your patch! Yet something to improve:

[auto build test ERROR on iommu/next]
[also build test ERROR on rockchip/for-next sunxi/sunxi/for-next linus/master v5.14-rc4 next-20210804]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Robin-Murphy/iommu-Refactor-DMA-domain-strictness/20210805-011913
base:   https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git next
config: x86_64-allyesconfig (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
# https://github.com/0day-ci/linux/commit/33b83e4adc16220361ed42c229e8cd37f8a2a3aa
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Robin-Murphy/iommu-Refactor-DMA-domain-strictness/20210805-011913
git checkout 33b83e4adc16220361ed42c229e8cd37f8a2a3aa
# save the attached .config to linux build tree
make W=1 ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   drivers/iommu/amd/iommu.c: In function 'amd_iommu_probe_finalize':
>> drivers/iommu/amd/iommu.c:1757:3: error: implicit declaration of function 
>> 'iommu_setup_dma_ops'; did you mean 'arch_setup_dma_ops'? 
>> [-Werror=implicit-function-declaration]
1757 |   iommu_setup_dma_ops(dev, 0, U64_MAX);
 |   ^~~
 |   arch_setup_dma_ops
   cc1: some warnings being treated as errors


vim +1757 drivers/iommu/amd/iommu.c

1ac4cbbc5eb56d arch/x86/kernel/amd_iommu.c Joerg Roedel          2008-12-10  1749  
dce8d6964ebdb3 drivers/iommu/amd_iommu.c   Joerg Roedel          2020-04-29  1750  static void amd_iommu_probe_finalize(struct device *dev)
dce8d6964ebdb3 drivers/iommu/amd_iommu.c   Joerg Roedel          2020-04-29  1751  {
dce8d6964ebdb3 drivers/iommu/amd_iommu.c   Joerg Roedel          2020-04-29  1752  	struct iommu_domain *domain;
ac1534a55d1e87 drivers/iommu/amd_iommu.c   Joerg Roedel          2012-06-21  1753  
07ee86948c9111 drivers/iommu/amd_iommu.c   Joerg Roedel          2015-05-28  1754  	/* Domains are initialized for this device - have a look what we ended up with */
07ee86948c9111 drivers/iommu/amd_iommu.c   Joerg Roedel          2015-05-28  1755  	domain = iommu_get_domain_for_dev(dev);
57f9842e488406 drivers/iommu/amd_iommu.c   Joerg Roedel          2020-04-29  1756  	if (domain->type == IOMMU_DOMAIN_DMA)
ac6d704679d343 drivers/iommu/amd/iommu.c   Jean-Philippe Brucker 2021-06-18 @1757  		iommu_setup_dma_ops(dev, 0, U64_MAX);
d6177a6556f853 drivers/iommu/amd/iommu.c   Jean-Philippe Brucker 2021-04-22  1758  	else
d6177a6556f853 drivers/iommu/amd/iommu.c   Jean-Philippe Brucker 2021-04-22  1759  		set_dma_ops(dev, NULL);
e275a2a0fc9e21 arch/x86/kernel/amd_iommu.c Joerg Roedel          2008-12-10  1760  }
e275a2a0fc9e21 arch/x86/kernel/amd_iommu.c Joerg Roedel          2008-12-10  1761  
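
The error means no prototype for iommu_setup_dma_ops() is visible to
drivers/iommu/amd/iommu.c in this configuration. A minimal sketch of one
plausible fix (an assumption for illustration, not necessarily the
resolution that was merged) is to keep the header that carries the
prototype included while the call is still present:

	/* drivers/iommu/amd/iommu.c */
	#include <linux/dma-iommu.h>

	static void amd_iommu_probe_finalize(struct device *dev)
	{
		struct iommu_domain *domain = iommu_get_domain_for_dev(dev);

		if (domain->type == IOMMU_DOMAIN_DMA)
			iommu_setup_dma_ops(dev, 0, U64_MAX);
		else
			set_dma_ops(dev, NULL);
	}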

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org



Re: [PATCH v3 05/25] iommu/exynos: Drop IOVA cookie management

2021-08-05 Thread Marek Szyprowski


On 04.08.2021 19:15, Robin Murphy wrote:
> The core code bakes its own cookies now.
>
> CC: Marek Szyprowski 
> Signed-off-by: Robin Murphy 

Acked-by: Marek Szyprowski 

Tested-by: Marek Szyprowski 

> ---
>
> v3: Also remove unneeded include
> ---
>   drivers/iommu/exynos-iommu.c | 19 ---
>   1 file changed, 4 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
> index d0fbf1d10e18..939ffa768986 100644
> --- a/drivers/iommu/exynos-iommu.c
> +++ b/drivers/iommu/exynos-iommu.c
> @@ -21,7 +21,6 @@
>   #include 
>   #include 
>   #include 
> -#include 
>   
>   typedef u32 sysmmu_iova_t;
>   typedef u32 sysmmu_pte_t;
> @@ -735,20 +734,16 @@ static struct iommu_domain *exynos_iommu_domain_alloc(unsigned type)
>   /* Check if correct PTE offsets are initialized */
>   BUG_ON(PG_ENT_SHIFT < 0 || !dma_dev);
>   
> + if (type != IOMMU_DOMAIN_DMA && type != IOMMU_DOMAIN_UNMANAGED)
> + return NULL;
> +
>   domain = kzalloc(sizeof(*domain), GFP_KERNEL);
>   if (!domain)
>   return NULL;
>   
> - if (type == IOMMU_DOMAIN_DMA) {
> - if (iommu_get_dma_cookie(&domain->domain) != 0)
> - goto err_pgtable;
> - } else if (type != IOMMU_DOMAIN_UNMANAGED) {
> - goto err_pgtable;
> - }
> -
>   domain->pgtable = (sysmmu_pte_t *)__get_free_pages(GFP_KERNEL, 2);
>   if (!domain->pgtable)
> - goto err_dma_cookie;
> + goto err_pgtable;
>   
> domain->lv2entcnt = (short *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
>   if (!domain->lv2entcnt)
> @@ -779,9 +774,6 @@ static struct iommu_domain *exynos_iommu_domain_alloc(unsigned type)
>   free_pages((unsigned long)domain->lv2entcnt, 1);
>   err_counter:
>   free_pages((unsigned long)domain->pgtable, 2);
> -err_dma_cookie:
> - if (type == IOMMU_DOMAIN_DMA)
> - iommu_put_dma_cookie(&domain->domain);
>   err_pgtable:
>   kfree(domain);
>   return NULL;
> @@ -809,9 +801,6 @@ static void exynos_iommu_domain_free(struct iommu_domain *iommu_domain)
>   
> spin_unlock_irqrestore(&domain->lock, flags);
>   
> - if (iommu_domain->type == IOMMU_DOMAIN_DMA)
> - iommu_put_dma_cookie(iommu_domain);
> -
>   dma_unmap_single(dma_dev, virt_to_phys(domain->pgtable), LV1TABLE_SIZE,
>DMA_TO_DEVICE);
>   

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland



Re: [PATCH v3 01/25] iommu: Pull IOVA cookie management into the core

2021-08-05 Thread Marek Szyprowski


On 04.08.2021 19:15, Robin Murphy wrote:
> Now that everyone has converged on iommu-dma for IOMMU_DOMAIN_DMA
> support, we can abandon the notion of drivers being responsible for the
> cookie type, and consolidate all the management into the core code.
>
> CC: Marek Szyprowski 
> CC: Yoshihiro Shimoda 
> CC: Geert Uytterhoeven 
> CC: Yong Wu 
> CC: Heiko Stuebner 
> CC: Chunyan Zhang 
> CC: Maxime Ripard 
> Reviewed-by: Jean-Philippe Brucker 
> Reviewed-by: Lu Baolu 
> Signed-off-by: Robin Murphy 
Tested-by: Marek Szyprowski 
>
> ---
>
> v3: Use a simpler temporary check instead of trying to be clever with
>  the error code
> ---
>   drivers/iommu/iommu.c | 7 +++
>   include/linux/iommu.h | 3 ++-
>   2 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index f2cda9950bd5..b65fcc66ffa4 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -7,6 +7,7 @@
>   #define pr_fmt(fmt)"iommu: " fmt
>   
>   #include 
> +#include 
>   #include 
>   #include 
>   #include 
> @@ -1946,6 +1947,11 @@ static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
>   /* Assume all sizes by default; the driver may override this later */
>   domain->pgsize_bitmap  = bus->iommu_ops->pgsize_bitmap;
>   
> + /* Temporarily avoid -EEXIST while drivers still get their own cookies */
> + if (type == IOMMU_DOMAIN_DMA && !domain->iova_cookie && iommu_get_dma_cookie(domain)) {
> + iommu_domain_free(domain);
> + domain = NULL;
> + }
>   return domain;
>   }
>   
> @@ -1957,6 +1963,7 @@ EXPORT_SYMBOL_GPL(iommu_domain_alloc);
>   
>   void iommu_domain_free(struct iommu_domain *domain)
>   {
> + iommu_put_dma_cookie(domain);
>   domain->ops->domain_free(domain);
>   }
>   EXPORT_SYMBOL_GPL(iommu_domain_free);
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 4997c78e2670..141779d76035 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -40,6 +40,7 @@ struct iommu_domain;
>   struct notifier_block;
>   struct iommu_sva;
>   struct iommu_fault_event;
> +struct iommu_dma_cookie;
>   
>   /* iommu fault flags */
>   #define IOMMU_FAULT_READ0x0
> @@ -86,7 +87,7 @@ struct iommu_domain {
>   iommu_fault_handler_t handler;
>   void *handler_token;
>   struct iommu_domain_geometry geometry;
> - void *iova_cookie;
> + struct iommu_dma_cookie *iova_cookie;
>   };
>   
>   enum iommu_cap {

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland



Re: [PATCH v10 10/17] virtio: Handle device reset failure in register_virtio_device()

2021-08-05 Thread Jason Wang


On 2021/8/4 5:07 PM, Yongji Xie wrote:

On Wed, Aug 4, 2021 at 4:54 PM Jason Wang  wrote:


On 2021/8/4 4:50 PM, Yongji Xie wrote:

On Wed, Aug 4, 2021 at 4:32 PM Jason Wang  wrote:

On 2021/8/3 5:38 PM, Yongji Xie wrote:

On Tue, Aug 3, 2021 at 4:09 PM Jason Wang  wrote:

On 2021/7/29 3:34 PM, Xie Yongji wrote:

The device reset may fail in the virtio-vdpa case now, so add checks on
its return value and fail register_virtio_device() accordingly.

So the reset() would be called by the driver during remove as well, or
is it sufficient to deal only with the reset during probe?


Actually there is no way to handle failure during removal. And it
should be safe with the protection of software IOTLB even if the
reset() fails.

Thanks,
Yongji

If this is true, does it mean we don't even need to care about reset
failure?


But we need to handle the failure in the vhost-vdpa case, isn't it?


Yes, but:

- This patch is for virtio, not vhost; if we don't care about virtio, we
can avoid the changes
- For vhost, there could be two ways probably:

1) let the set_status to report error
2) require userspace to re-read for status

It looks to me you want to go with 1) and I'm not sure whether or not
it's too late to go with 2).


Looks like 2) can't work if reset failure happens in
vhost_vdpa_release() and vhost_vdpa_open().



Yes, you're right.

Thanks




Thanks,
Yongji
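
For reference, option 2) above amounts to verifying a status transition by
reading the status back. A rough kernel-side sketch of the same idea (the
helper name and placement are assumptions, not from this series):

	/* Issue a status write, then re-read it to detect failure, since
	 * a vDPA device may not accept the transition. */
	static int vdpa_set_status_checked(struct vdpa_device *vdev, u8 status)
	{
		const struct vdpa_config_ops *ops = vdev->config;

		ops->set_status(vdev, status);
		if (ops->get_status(vdev) != status)
			return -EIO;	/* caller decides how to unwind */

		return 0;
	}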




Re: [PATCH RFC 0/2] dma-pool: allow user to disable atomic pool

2021-08-05 Thread Baoquan He
On 06/24/21 at 11:47am, Robin Murphy wrote:
> On 2021-06-24 10:29, Baoquan He wrote:
> > On 06/24/21 at 08:40am, Christoph Hellwig wrote:
> > > So reduce the amount allocated.  But the pool is needed for proper
> > > operation on systems with memory encryption.  And please add the right
> > > maintainer or at least mailing list for the code you're touching next
> > > time.
> > 
> > Oh, I thought it was a memory issue only; I should have run
> > ./scripts/get_maintainer.pl. Sorry.
> > 
> > About reducing the amount allocated, it may not help. Because on x86_64,
> > the kdump kernel doesn't put any page of memory into the buddy allocator
> > of the DMA zone, it will definitely OOM during atomic_pool_dma
> > initialization.
> > 
> > Wondering in which case or on which device the atomic pool is needed on
> > an AMD system with memory encryption enabled. As we can see, the OOM
> > will happen too in the kdump kernel on Intel systems, even though it's
> > not necessary there.

Sorry for very late response, and thank both for your comments.

> 
> Hmm, I think the Kconfig reshuffle has actually left a slight wrinkle here.
> For DMA_DIRECT_REMAP=y we can assume an atomic pool is always needed, since
> that was the original behaviour anyway. However the implications of
> AMD_MEM_ENCRYPT=y are different - even if support is enabled, it still
> should only be relevant if mem_encrypt_active(), so it probably does make
> sense to have an additional runtime gate on that.

> 
> From a quick scan, use of dma_alloc_from_pool() already depends on
> force_dma_unencrypted() so that's probably fine already, but I think we'd
> need a bit of extra protection around dma_free_from_pool() to prevent
> gen_pool_has_addr() dereferencing NULL if the pools are uninitialised, even
> with your proposed patch as it is. Presumably nothing actually called
> dma_direct_free() when you tested this?

Yes, enforcing the conditional check of force_dma_unencrypted() around
dma_free_from_pool() sounds reasonable, just as we have done in
dma_alloc_from_pool().
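
A minimal sketch of that extra protection, assuming the 5.14-era
kernel/dma/pool.c internals where atomic_pool_kernel, atomic_pool_dma and
atomic_pool_dma32 may all be left NULL when pool setup was skipped:

	bool dma_free_from_pool(struct device *dev, void *start, size_t size)
	{
		struct gen_pool *pool = NULL;

		/* Assumed guard: bail out before gen_pool_has_addr() can
		 * be handed a NULL pool when no atomic pool was set up. */
		if (!atomic_pool_kernel && !atomic_pool_dma && !atomic_pool_dma32)
			return false;

		while ((pool = dma_guess_pool(pool, 0))) {
			if (!gen_pool_has_addr(pool, (unsigned long)start, size))
				continue;
			gen_pool_free(pool, (unsigned long)start, size);
			return true;
		}

		return false;
	}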

I have tested this patchset on normal x86_64 systems and on one AMD system
with SME support. Disabling the atomic pool fixes the issue where the DMA
zone has no managed pages, so any page request from the DMA zone causes an
allocation failure. Even disabling the atomic pool in the 1st kernel didn't
cause any problem on an AMD EPYC system which supports SME. I am no expert
in the DMA area, and am wondering what the atomic pool is supposed to do
on an SME/SEV system.

Besides, even though the atomic pool is disabled, slub page allocation for
dma-kmalloc also triggers page allocation failures. So I took another
approach to fix them; please check the v2 post. Disabling the atomic pool
can still be a good-to-have change.

Thanks
Baoquan
