[PATCH] iommu/qcom: Replace zero-length array with flexible-array member

2020-02-12 Thread Gustavo A. R. Silva
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these is a flexible array member[1][2],
introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kinds of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]
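
To illustrate with a minimal userspace sketch (not from the patch):
sizeof() now covers only the fixed part of the struct, so storage for
the array is added explicitly at allocation time.

	#include <stdio.h>
	#include <stdlib.h>

	struct foo {
		int stuff;
		int array[];	/* flexible array member, must be last */
	};

	int main(void)
	{
		/* sizeof(struct foo) excludes the flexible array */
		printf("sizeof(struct foo) = %zu\n", sizeof(struct foo));

		/* allocate the fixed part plus room for 4 elements */
		struct foo *p = malloc(sizeof(*p) + 4 * sizeof(p->array[0]));

		if (!p)
			return 1;
		p->array[3] = 42;
		free(p);
		return 0;
	}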

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

Signed-off-by: Gustavo A. R. Silva 
---
 drivers/iommu/qcom_iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
index 39759db4f003..f1e175ca5e4a 100644
--- a/drivers/iommu/qcom_iommu.c
+++ b/drivers/iommu/qcom_iommu.c
@@ -48,7 +48,7 @@ struct qcom_iommu_dev {
 	void __iomem		*local_base;
 	u32			 sec_id;
 	u8			 num_ctxs;
-	struct qcom_iommu_ctx	*ctxs[0];	/* indexed by asid-1 */
+	struct qcom_iommu_ctx	*ctxs[];	/* indexed by asid-1 */
 };
 
 struct qcom_iommu_ctx {
-- 
2.23.0



Re: [PATCH] iommu/arm-smmu: fix the module name to be consistent with older kernel

2020-02-12 Thread Li Yang
On Wed, Feb 12, 2020 at 4:50 AM Will Deacon  wrote:
>
> On Tue, Feb 11, 2020 at 06:37:20PM -0600, Li Yang wrote:
> > Commit cd221bd24ff5 ("iommu/arm-smmu: Allow building as a module")
> > introduced a side effect that changed the module name from arm-smmu to
> > arm-smmu-mod.  This breaks the users of kernel parameters for the driver
> > (e.g. arm-smmu.disable_bypass).  This patch changes the module name back
> > to be consistent.
> >
> > Signed-off-by: Li Yang 
> > ---
> >  drivers/iommu/Makefile  | 4 ++--
> >  drivers/iommu/{arm-smmu.c => arm-smmu-common.c} | 0
> >  2 files changed, 2 insertions(+), 2 deletions(-)
> >  rename drivers/iommu/{arm-smmu.c => arm-smmu-common.c} (100%)
>
> Can't we just override MODULE_PARAM_PREFIX instead of renaming the file?

I can do that.  But on the other hand, the "mod" in the module name
arm-smmu-mod.ko seems to be redundant and looks a little bit weird.
Wouldn't it be cleaner to make it just arm-smmu.ko?
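
For reference, the override would look something like this (rough
sketch, untested; the exact prefix string and its placement in
arm-smmu.c are assumptions):

	#include <linux/module.h>

	/* Keep the historical "arm-smmu." parameter prefix even though
	 * the module is now built as arm-smmu-mod.ko. moduleparam.h
	 * derives the default prefix from KBUILD_MODNAME, so it has to
	 * be redefined before any module_param() use.
	 */
	#undef MODULE_PARAM_PREFIX
	#define MODULE_PARAM_PREFIX "arm-smmu."

	static bool disable_bypass = true;
	module_param(disable_bypass, bool, 0444);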

Regards,
Leo


Re: [PATCH v2] xprtrdma: Fix DMA scatter-gather list mapping imbalance

2020-02-12 Thread Chuck Lever



> On Feb 12, 2020, at 11:03 AM, Andre Tomt  wrote:
> 
> On 12.02.2020 14:48, Chuck Lever wrote:
>>> On Feb 12, 2020, at 8:43 AM, Chuck Lever  wrote:
>>> 
>>> The @nents value that was passed to ib_dma_map_sg() has to be passed
>>> to the matching ib_dma_unmap_sg() call. If ib_dma_map_sg() chooses to
>>> concatenate sg entries, it will return a different nents value than
>>> it was passed.
>>> 
>>> The bug was exposed by recent changes to the AMD IOMMU driver, which
>>> enabled sg entry concatenation.
>>> 
>>> Looking all the way back to 4143f34e01e9 ("xprtrdma: Port to new
>>> memory registration API") and reviewing other kernel ULPs, it's not
>>> clear that the frwr_map() logic was ever correct for this case.
>>> 
>>> Reported-by: Andre Tomt 
>>> Suggested-by: Robin Murphy 
>>> Signed-off-by: Chuck Lever 
>>> ---
>>> include/trace/events/rpcrdma.h |    6 ++++--
>>> net/sunrpc/xprtrdma/frwr_ops.c |   13 +++++++------
>>> 2 files changed, 11 insertions(+), 8 deletions(-)
>>> 
>>> Hi Andre, here's take 2, based on the trace data you sent me.
>>> Please let me know if this one fares any better.
>>> 
>>> Changes since v1:
>>> - Ensure the correct nents value is passed to ib_map_mr_sg
>>> - Record the mr_nents value in the MR trace points
> Verified working (with the patch correction) in my environment, with some 
> quick testing (mount + some random and bulk I/O)
> 
> client, 5.5.3 + patch + amd iommu on = OK
> client, 5.5.3 + patch + amd iommu off = OK
> client, 5.6-rc1 + patch + amd iommu on = OK
> 
> server, 5.5.3 + patch + intel iommu on = OK

Very good! I'll submit the fix through the NFS tree ASAP, and request backport 
to v5.5 stable.

--
Chuck Lever





Re: [PATCH v2] xprtrdma: Fix DMA scatter-gather list mapping imbalance

2020-02-12 Thread Andre Tomt

On 12.02.2020 14:48, Chuck Lever wrote:

> On Feb 12, 2020, at 8:43 AM, Chuck Lever  wrote:
> 
> The @nents value that was passed to ib_dma_map_sg() has to be passed
> to the matching ib_dma_unmap_sg() call. If ib_dma_map_sg() chooses to
> concatenate sg entries, it will return a different nents value than
> it was passed.
> 
> The bug was exposed by recent changes to the AMD IOMMU driver, which
> enabled sg entry concatenation.
> 
> Looking all the way back to 4143f34e01e9 ("xprtrdma: Port to new
> memory registration API") and reviewing other kernel ULPs, it's not
> clear that the frwr_map() logic was ever correct for this case.
> 
> Reported-by: Andre Tomt 
> Suggested-by: Robin Murphy 
> Signed-off-by: Chuck Lever 
> ---
> include/trace/events/rpcrdma.h |    6 ++++--
> net/sunrpc/xprtrdma/frwr_ops.c |   13 +++++++------
> 2 files changed, 11 insertions(+), 8 deletions(-)
> 
> Hi Andre, here's take 2, based on the trace data you sent me.
> Please let me know if this one fares any better.
> 
> Changes since v1:
> - Ensure the correct nents value is passed to ib_map_mr_sg
> - Record the mr_nents value in the MR trace points
Verified working (with the patch correction) in my environment, with 
some quick testing (mount + some random and bulk I/O)


client, 5.5.3 + patch + amd iommu on = OK
client, 5.5.3 + patch + amd iommu off = OK
client, 5.6-rc1 + patch + amd iommu on = OK

server, 5.5.3 + patch + intel iommu on = OK

Thanks!



Re: [PATCH] iommu/vt-d: consider real PCI device when checking if mapping is needed

2020-02-12 Thread Robin Murphy

On 12/02/2020 3:17 am, Daniel Drake wrote:

> On Wed, Feb 12, 2020 at 12:03 AM Derrick, Jonathan  wrote:
>> On Tue, 2020-02-11 at 17:13 +0800, Daniel Drake wrote:
>>> The PCI devices handled by intel-iommu may have a DMA requester on
>>> another bus, such as VMD subdevices needing to use the VMD endpoint.
>>>
>>> The real DMA device is now used for the DMA mapping, but one case was
>>> missed earlier, when allocating memory through (e.g.) intel_map_page().
>>> Confusion ensues if the iommu domain type for the subdevice does not match
>>> the iommu domain type for the real DMA device.
>>
>> Is there a way to force this situation for my testing?
> 
> I think you could hack device_def_domain_type() to return
> IOMMU_DOMAIN_IDENTITY for the real device, and IOMMU_DOMAIN_DMA for
> the subdevice.


As far as I'm aware that should only be possible if the subdevice has a 
distinct physical requester ID from the real device, which given the 
nomenclature I assume is not the case.


(well, technically you *can* start routing different logical streams 
from a single requester ID to multiple domains if you have PASIDs, but 
that's a whole other ball game)


Robin.


> But I got curious as to why my subdevice might be IOMMU_DOMAIN_DMA, so
> I checked, and found out that my assumptions weren't quite correct.
> The subdevice has no iommu domain recorded at all. Before applying any
> patches here, what's actually happening is:
> 
> 1. Real DMA device gets registered with the iommu as
> IOMMU_DOMAIN_IDENTITY using si_domain.
> 2. When the subdevice gets registered, the relevant code flow is
> inside dmar_insert_one_dev_info():
>   - it creates a new device_domain_info and domain->domain.type == IDENTITY, but
>   - it then calls find_domain(dev) which successfully defers to the
> real DMA device and returns the real DMA device's dmar_domain
>   - since found != NULL (dmar_domain was found for this device) the
> function bails out before setting dev->archdata.iommu
> 
> The results at this point are that the real DMA device is fully
> registered as IOMMU_DOMAIN_IDENTITY using si_domain, but all of the
> subdevices will always have dev->archdata.iommu == NULL.
> 
> Then when intel_map_page() is reached for the subdevice, it calls
> iommu_need_mapping() for the subdevice.
> This calls identity_mapping() on the subdevice, but that will always
> return 0 because dev->archdata.iommu == NULL.
> Following on from there, iommu_need_mapping() will then *always*
> return true (mapping needed) for subdevices.
> 
> That will then lead to the situation described in my last mail, where
> later down the allocation chain the request for creating a mapping
> will be handed towards the real DMA dev, but that will then fail
> because the real DMA dev is using IOMMU_DOMAIN_IDENTITY where no
> mapping is needed.
> 
> Unless I missed anything that seems pretty clear to me now, and I
> guess the only reason why you may not have already faced this in the
> vmd case is if the real DMA device is not using IOMMU_DOMAIN_IDENTITY.
> (To check this, you could log the value of the real dev
> domain->domain.type in dmar_insert_one_dev_info(), and/or observe the
> return value of identity_mapping() in iommu_need_mapping for the real
> dev).
> 
> In any case it seems increasingly clear to me that
> iommu_need_mapping() should be consulting the real DMA device in the
> identity_mapping check, and your patch likewise solves the problem
> faced here.
> 
> Thanks
> Daniel
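
For reference, a greatly simplified sketch of the check Daniel is
describing (not the actual intel-iommu code; treating
pci_real_dma_dev() as the lookup helper is an assumption, untested):

	static bool iommu_need_mapping(struct device *dev)
	{
		if (iommu_dummy(dev))
			return false;

		/* VMD subdevices DMA via their parent endpoint, so
		 * consult the real DMA device before the identity-map
		 * test; otherwise the subdevice's NULL archdata.iommu
		 * makes identity_mapping() always return 0.
		 */
		if (dev_is_pci(dev))
			dev = &pci_real_dma_dev(to_pci_dev(dev))->dev;

		return !identity_mapping(dev);
	}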


Re: [PATCH v2] xprtrdma: Fix DMA scatter-gather list mapping imbalance

2020-02-12 Thread Chuck Lever



> On Feb 12, 2020, at 8:43 AM, Chuck Lever  wrote:
> 
> The @nents value that was passed to ib_dma_map_sg() has to be passed
> to the matching ib_dma_unmap_sg() call. If ib_dma_map_sg() chooses to
> concatenate sg entries, it will return a different nents value than
> it was passed.
> 
> The bug was exposed by recent changes to the AMD IOMMU driver, which
> enabled sg entry concatenation.
> 
> Looking all the way back to 4143f34e01e9 ("xprtrdma: Port to new
> memory registration API") and reviewing other kernel ULPs, it's not
> clear that the frwr_map() logic was ever correct for this case.
> 
> Reported-by: Andre Tomt 
> Suggested-by: Robin Murphy 
> Signed-off-by: Chuck Lever 
> ---
> include/trace/events/rpcrdma.h |    6 ++++--
> net/sunrpc/xprtrdma/frwr_ops.c |   13 +++++++------
> 2 files changed, 11 insertions(+), 8 deletions(-)
> 
> Hi Andre, here's take 2, based on the trace data you sent me.
> Please let me know if this one fares any better.
> 
> Changes since v1:
> - Ensure the correct nents value is passed to ib_map_mr_sg
> - Record the mr_nents value in the MR trace points
> 
> diff --git a/include/trace/events/rpcrdma.h b/include/trace/events/rpcrdma.h
> index c0e4c93324f5..023c5da45999 100644
> --- a/include/trace/events/rpcrdma.h
> +++ b/include/trace/events/rpcrdma.h
> @@ -275,6 +275,7 @@ DECLARE_EVENT_CLASS(xprtrdma_mr,
> 
>   TP_STRUCT__entry(
>   __field(const void *, mr)
> + __field(unsigned int, nents)
>   __field(u32, handle)
>   __field(u32, length)
>   __field(u64, offset)
> @@ -283,14 +284,15 @@ DECLARE_EVENT_CLASS(xprtrdma_mr,
> 
>   TP_fast_assign(
>   __entry->mr = mr;
> + __entry->nents = mr->mr_nents;
>   __entry->handle = mr->mr_handle;
>   __entry->length = mr->mr_length;
>   __entry->offset = mr->mr_offset;
>   __entry->dir= mr->mr_dir;
>   ),
> 
> - TP_printk("mr=%p %u@0x%016llx:0x%08x (%s)",
> - __entry->mr, __entry->length,
> + TP_printk("mr=%p %d %u@0x%016llx:0x%08x (%s)",
> + __entry->mr, __entry->mr_nents, __entry->length,

This should be:
__entry->mr, __entry->nents, __entry->length,

Sorry about that.


>   (unsigned long long)__entry->offset, __entry->handle,
>   xprtrdma_show_direction(__entry->dir)
>   )
> diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
> index 095be887753e..75617646702b 100644
> --- a/net/sunrpc/xprtrdma/frwr_ops.c
> +++ b/net/sunrpc/xprtrdma/frwr_ops.c
> @@ -288,8 +288,8 @@ struct rpcrdma_mr_seg *frwr_map(struct rpcrdma_xprt *r_xprt,
> {
>   struct rpcrdma_ia *ia = &r_xprt->rx_ia;
>   struct ib_reg_wr *reg_wr;
> + int i, n, dma_nents;
>   struct ib_mr *ibmr;
> - int i, n;
>   u8 key;
> 
>   if (nsegs > ia->ri_max_frwr_depth)
> @@ -313,15 +313,16 @@ struct rpcrdma_mr_seg *frwr_map(struct rpcrdma_xprt *r_xprt,
>   break;
>   }
>   mr->mr_dir = rpcrdma_data_dir(writing);
> + mr->mr_nents = i;
> 
> - mr->mr_nents =
> - ib_dma_map_sg(ia->ri_id->device, mr->mr_sg, i, mr->mr_dir);
> - if (!mr->mr_nents)
> + dma_nents = ib_dma_map_sg(ia->ri_id->device, mr->mr_sg,
> +   mr->mr_nents, mr->mr_dir);
> + if (!dma_nents)
>   goto out_dmamap_err;
> 
>   ibmr = mr->frwr.fr_mr;
> - n = ib_map_mr_sg(ibmr, mr->mr_sg, mr->mr_nents, NULL, PAGE_SIZE);
> - if (unlikely(n != mr->mr_nents))
> + n = ib_map_mr_sg(ibmr, mr->mr_sg, dma_nents, NULL, PAGE_SIZE);
> + if (n != dma_nents)
>   goto out_mapmr_err;
> 
>   ibmr->iova &= 0x00000000ffffffff;
> 
> 

--
Chuck Lever





[PATCH v2] xprtrdma: Fix DMA scatter-gather list mapping imbalance

2020-02-12 Thread Chuck Lever
The @nents value that was passed to ib_dma_map_sg() has to be passed
to the matching ib_dma_unmap_sg() call. If ib_dma_map_sg() chooses to
concatenate sg entries, it will return a different nents value than
it was passed.

The bug was exposed by recent changes to the AMD IOMMU driver, which
enabled sg entry concatenation.

Looking all the way back to 4143f34e01e9 ("xprtrdma: Port to new
memory registration API") and reviewing other kernel ULPs, it's not
clear that the frwr_map() logic was ever correct for this case.
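
Stated as code, the rule being enforced here is (an illustrative
sketch with generic names, not the xprtrdma source):

	/* The nents handed to ib_dma_map_sg() must be saved and handed
	 * back to ib_dma_unmap_sg(); only the possibly-smaller return
	 * value describes what the device sees, and that is what gets
	 * passed along to ib_map_mr_sg().
	 */
	dma_nents = ib_dma_map_sg(device, sgl, nents, dir);
	if (!dma_nents)
		goto out_err;

	n = ib_map_mr_sg(mr, sgl, dma_nents, NULL, PAGE_SIZE);
	...
	ib_dma_unmap_sg(device, sgl, nents, dir);	/* original nents */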

Reported-by: Andre Tomt 
Suggested-by: Robin Murphy 
Signed-off-by: Chuck Lever 
---
 include/trace/events/rpcrdma.h |    6 ++++--
 net/sunrpc/xprtrdma/frwr_ops.c |   13 +++++++------
 2 files changed, 11 insertions(+), 8 deletions(-)

Hi Andre, here's take 2, based on the trace data you sent me.
Please let me know if this one fares any better.

Changes since v1:
- Ensure the correct nents value is passed to ib_map_mr_sg
- Record the mr_nents value in the MR trace points

diff --git a/include/trace/events/rpcrdma.h b/include/trace/events/rpcrdma.h
index c0e4c93324f5..023c5da45999 100644
--- a/include/trace/events/rpcrdma.h
+++ b/include/trace/events/rpcrdma.h
@@ -275,6 +275,7 @@ DECLARE_EVENT_CLASS(xprtrdma_mr,
 
TP_STRUCT__entry(
__field(const void *, mr)
+   __field(unsigned int, nents)
__field(u32, handle)
__field(u32, length)
__field(u64, offset)
@@ -283,14 +284,15 @@ DECLARE_EVENT_CLASS(xprtrdma_mr,
 
TP_fast_assign(
__entry->mr = mr;
+   __entry->nents = mr->mr_nents;
__entry->handle = mr->mr_handle;
__entry->length = mr->mr_length;
__entry->offset = mr->mr_offset;
__entry->dir= mr->mr_dir;
),
 
-   TP_printk("mr=%p %u@0x%016llx:0x%08x (%s)",
-   __entry->mr, __entry->length,
+   TP_printk("mr=%p %d %u@0x%016llx:0x%08x (%s)",
+   __entry->mr, __entry->mr_nents, __entry->length,
(unsigned long long)__entry->offset, __entry->handle,
xprtrdma_show_direction(__entry->dir)
)
diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
index 095be887753e..75617646702b 100644
--- a/net/sunrpc/xprtrdma/frwr_ops.c
+++ b/net/sunrpc/xprtrdma/frwr_ops.c
@@ -288,8 +288,8 @@ struct rpcrdma_mr_seg *frwr_map(struct rpcrdma_xprt *r_xprt,
 {
struct rpcrdma_ia *ia = &r_xprt->rx_ia;
struct ib_reg_wr *reg_wr;
+   int i, n, dma_nents;
struct ib_mr *ibmr;
-   int i, n;
u8 key;
 
if (nsegs > ia->ri_max_frwr_depth)
@@ -313,15 +313,16 @@ struct rpcrdma_mr_seg *frwr_map(struct rpcrdma_xprt *r_xprt,
break;
}
mr->mr_dir = rpcrdma_data_dir(writing);
+   mr->mr_nents = i;
 
-   mr->mr_nents =
-   ib_dma_map_sg(ia->ri_id->device, mr->mr_sg, i, mr->mr_dir);
-   if (!mr->mr_nents)
+   dma_nents = ib_dma_map_sg(ia->ri_id->device, mr->mr_sg,
+ mr->mr_nents, mr->mr_dir);
+   if (!dma_nents)
goto out_dmamap_err;
 
ibmr = mr->frwr.fr_mr;
-   n = ib_map_mr_sg(ibmr, mr->mr_sg, mr->mr_nents, NULL, PAGE_SIZE);
-   if (unlikely(n != mr->mr_nents))
+   n = ib_map_mr_sg(ibmr, mr->mr_sg, dma_nents, NULL, PAGE_SIZE);
+   if (n != dma_nents)
goto out_mapmr_err;
 
 	ibmr->iova &= 0x00000000ffffffff;




Re: [PATCH V9 06/10] iommu/vt-d: Add svm/sva invalidate function

2020-02-12 Thread Auger Eric
Hi Jacob,

On 1/29/20 7:01 AM, Jacob Pan wrote:
> When Shared Virtual Address (SVA) is enabled for a guest OS via
> vIOMMU, we need to provide invalidation support at IOMMU API and driver
> level. This patch adds Intel VT-d specific function to implement
> iommu passdown invalidate API for shared virtual address.
> 
> The use case is for supporting caching structure invalidation
> of assigned SVM capable devices. Emulated IOMMU exposes queue
> invalidation capability and passes down all descriptors from the guest
> to the physical IOMMU.
> 
> The assumption is that guest to host device ID mapping should be
> resolved prior to calling IOMMU driver. Based on the device handle,
> host IOMMU driver can replace certain fields before submit to the
> invalidation queue.
> 
> Signed-off-by: Jacob Pan 
> Signed-off-by: Ashok Raj 
> Signed-off-by: Liu, Yi L 

I sent comments on the v7 in https://lkml.org/lkml/2019/11/12/266
I don't see any of them taken into account, and if I am not wrong we
did not discuss their (un)relevance on the ML ;-)

I'll let you have a look at them, then.

Thanks

Eric
> ---
>  drivers/iommu/intel-iommu.c | 173 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 173 insertions(+)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 8a4136e805ac..b8aa6479b87f 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -5605,6 +5605,178 @@ static void intel_iommu_aux_detach_device(struct 
> iommu_domain *domain,
>   aux_domain_remove_dev(to_dmar_domain(domain), dev);
>  }
>  
> +/*
> + * 2D array for converting and sanitizing IOMMU generic TLB granularity to
> + * VT-d granularity. Invalidation is typically included in the unmap operation
> + * as a result of DMA or VFIO unmap. However, for an assigned device the
> + * guest could own the first level page tables without being shadowed by
> + * QEMU. In this case there is no pass down unmap to the host IOMMU as a
> + * result of unmap in the guest. Only invalidations are trapped and passed down.
> + * In all cases, only first level TLB invalidation (request with PASID) can be
> + * passed down, therefore we do not include IOTLB granularity for request
> + * without PASID (second level).
> + *
> + * For example, to find the VT-d granularity encoding for IOTLB
> + * type and page selective granularity within PASID:
> + * X: indexed by iommu cache type
> + * Y: indexed by enum iommu_inv_granularity
> + * [IOMMU_CACHE_INV_TYPE_IOTLB][IOMMU_INV_GRANU_ADDR]
> + *
> + * Granu_map array indicates validity of the table. 1: valid, 0: invalid
> + *
> + */
> +const static int inv_type_granu_map[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRANU_NR] = {
> + /* PASID based IOTLB, support PASID selective and page selective */
> + {0, 1, 1},
> + /* PASID based dev TLBs, only support all PASIDs or single PASID */
> + {1, 1, 0},
> + /* PASID cache */
> + {1, 1, 0}
> +};
> +
> +const static u64 inv_type_granu_table[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRANU_NR] = {
> + /* PASID based IOTLB */
> + {0, QI_GRAN_NONG_PASID, QI_GRAN_PSI_PASID},
> + /* PASID based dev TLBs */
> + {QI_DEV_IOTLB_GRAN_ALL, QI_DEV_IOTLB_GRAN_PASID_SEL, 0},
> + /* PASID cache */
> + {QI_PC_ALL_PASIDS, QI_PC_PASID_SEL, 0},
> +};
> +
> +static inline int to_vtd_granularity(int type, int granu, u64 *vtd_granu)
> +{
> + if (type >= IOMMU_CACHE_INV_TYPE_NR || granu >= IOMMU_INV_GRANU_NR ||
> + !inv_type_granu_map[type][granu])
> + return -EINVAL;
> +
> + *vtd_granu = inv_type_granu_table[type][granu];
> +
> + return 0;
> +}
> +
> +static inline u64 to_vtd_size(u64 granu_size, u64 nr_granules)
> +{
> + u64 nr_pages = (granu_size * nr_granules) >> VTD_PAGE_SHIFT;
> +
> + /* VT-d size is encoded as 2^size of 4K pages, 0 for 4k, 9 for 2MB, etc.
> +  * IOMMU cache invalidate API passes granu_size in bytes, and number of
> +  * granu size in contiguous memory.
> +  */
> + return order_base_2(nr_pages);
> +}
> +
> +#ifdef CONFIG_INTEL_IOMMU_SVM
> +static int intel_iommu_sva_invalidate(struct iommu_domain *domain,
> + struct device *dev, struct iommu_cache_invalidate_info *inv_info)
> +{
> + struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> + struct device_domain_info *info;
> + struct intel_iommu *iommu;
> + unsigned long flags;
> + int cache_type;
> + u8 bus, devfn;
> + u16 did, sid;
> + int ret = 0;
> + u64 size;
> +
> + if (!inv_info || !dmar_domain ||
> + inv_info->version != IOMMU_CACHE_INVALIDATE_INFO_VERSION_1)
> + return -EINVAL;
> +
> + if (!dev || !dev_is_pci(dev))
> + return -ENODEV;
> +
> + iommu = device_to_iommu(dev, &bus, &devfn);
> + if (!iommu)
> + return -ENODEV;
> +
> + spin_lock_irqsave(&device_domain_lock, flags);
> + spin_lock(&iommu->lock);
> + info = iommu_support_

Re: [PATCH V9 07/10] iommu/vt-d: Cache virtual command capability register

2020-02-12 Thread Auger Eric
Hi Jacob,

On 1/29/20 7:01 AM, Jacob Pan wrote:
> Virtual command registers are used in the guest only. To avoid the
> vmexit cost, we cache the capability and store it during initialization.
> 
> Signed-off-by: Jacob Pan 
I sent my R-b in https://lkml.org/lkml/2019/11/8/228
Baolu did too. Did I miss any change?

Again, a history log would help.

Thanks

Eric
> ---
>  drivers/iommu/dmar.c        | 1 +
>  include/linux/intel-iommu.h | 5 +++++
>  2 files changed, 6 insertions(+)
> 
> diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
> index 206733ec8140..fd35e625f315 100644
> --- a/drivers/iommu/dmar.c
> +++ b/drivers/iommu/dmar.c
> @@ -953,6 +953,7 @@ static int map_iommu(struct intel_iommu *iommu, u64 phys_addr)
>   warn_invalid_dmar(phys_addr, " returns all ones");
>   goto unmap;
>   }
> + iommu->vccap = dmar_readq(iommu->reg + DMAR_VCCAP_REG);
>  
>   /* the registers might be more than one page */
>   map_size = max_t(int, ecap_max_iotlb_offset(iommu->ecap),
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index dd9fa61689bc..4a8bf3b11de6 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -192,6 +192,9 @@
>  #define ecap_max_handle_mask(e) ((e >> 20) & 0xf)
>  #define ecap_sc_support(e)   ((e >> 7) & 0x1) /* Snooping Control */
>  
> +/* Virtual command interface capabilities */
> +#define vccap_pasid(v)	((v & DMA_VCS_PAS)) /* PASID allocation */
> +
>  /* IOTLB_REG */
>  #define DMA_TLB_FLUSH_GRANU_OFFSET  60
>  #define DMA_TLB_GLOBAL_FLUSH (((u64)1) << 60)
> @@ -285,6 +288,7 @@
>  
>  /* PRS_REG */
>  #define DMA_PRS_PPR  ((u32)1)
> +#define DMA_VCS_PAS  ((u64)1)
>  
>  #define IOMMU_WAIT_OP(iommu, offset, op, cond, sts)  \
>  do { \
> @@ -535,6 +539,7 @@ struct intel_iommu {
>   u64 reg_size; /* size of hw register set */
>   u64 cap;
>   u64 ecap;
> + u64 vccap;
>   u32 gcmd; /* Holds TE, EAFL. Don't need SRTP, SFL, WBF */
>   raw_spinlock_t  register_lock; /* protect register handling */
>   int seq_id; /* sequence id of the iommu */
> 



Re: [PATCH V9 05/10] iommu/vt-d: Support flushing more translation cache types

2020-02-12 Thread Auger Eric
Hi Jacob,

On 1/29/20 7:01 AM, Jacob Pan wrote:
> When Shared Virtual Memory is exposed to a guest via vIOMMU, scalable
> IOTLB invalidation may be passed down from outside IOMMU subsystems.
> This patch adds invalidation functions that can be used for additional
> translation cache types.
> 
> Signed-off-by: Jacob Pan 
> ---
>  drivers/iommu/dmar.c        | 33 +++++++++++++++++++++++++++++++++
>  drivers/iommu/intel-pasid.c |  3 ++-
>  include/linux/intel-iommu.h | 20 ++++++++++++++++----
>  3 files changed, 51 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
> index 071bb42bbbc5..206733ec8140 100644
> --- a/drivers/iommu/dmar.c
> +++ b/drivers/iommu/dmar.c
> @@ -1411,6 +1411,39 @@ void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, u32 pasid, u64 addr,
>   qi_submit_sync(&desc, iommu);
>  }
>  
> +/* PASID-based device IOTLB Invalidate */
> +void qi_flush_dev_iotlb_pasid(struct intel_iommu *iommu, u16 sid, u16 pfsid,
> + u32 pasid,  u16 qdep, u64 addr, unsigned size_order, u64 granu)
> +{
> + struct qi_desc desc = {.qw2 = 0, .qw3 = 0};
> +
> + desc.qw0 = QI_DEV_EIOTLB_PASID(pasid) | QI_DEV_EIOTLB_SID(sid) |
> + QI_DEV_EIOTLB_QDEP(qdep) | QI_DEIOTLB_TYPE |
> + QI_DEV_IOTLB_PFSID(pfsid);
> + desc.qw1 = QI_DEV_EIOTLB_GLOB(granu);
> +
> + /* If S bit is 0, we only flush a single page. If S bit is set,
> +  * the least significant zero bit indicates the invalidation address
> +  * range. VT-d spec 6.5.2.6.
> +  * e.g. address bit 12[0] indicates 8KB, 13[0] indicates 16KB.
> +  */
> + if (!size_order) {
> + desc.qw0 |= QI_DEV_EIOTLB_ADDR(addr) & ~QI_DEV_EIOTLB_SIZE;
> + } else {
> + unsigned long mask = 1UL << (VTD_PAGE_SHIFT + size_order);
> + desc.qw1 |= QI_DEV_EIOTLB_ADDR(addr & ~mask) | QI_DEV_EIOTLB_SIZE;
> + }
> + qi_submit_sync(&desc, iommu);
I made some comments in
https://lkml.org/lkml/2019/8/14/1311
that do not seem to have been taken into account. Or do I miss something?

More generally having an individual history log would be useful and
speed up the review.

Thanks

Eric
> +}
> +
> +void qi_flush_pasid_cache(struct intel_iommu *iommu, u16 did, u64 granu, int pasid)
> +{
> + struct qi_desc desc = {.qw1 = 0, .qw2 = 0, .qw3 = 0};
> +
> + desc.qw0 = QI_PC_PASID(pasid) | QI_PC_DID(did) | QI_PC_GRAN(granu) | QI_PC_TYPE;
> + qi_submit_sync(&desc, iommu);
> +}
> +
>  /*
>   * Disable Queued Invalidation interface.
>   */
> diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
> index bd067af4d20b..b100f51407f9 100644
> --- a/drivers/iommu/intel-pasid.c
> +++ b/drivers/iommu/intel-pasid.c
> @@ -435,7 +435,8 @@ pasid_cache_invalidation_with_pasid(struct intel_iommu *iommu,
>  {
>   struct qi_desc desc;
>  
> - desc.qw0 = QI_PC_DID(did) | QI_PC_PASID_SEL | QI_PC_PASID(pasid);
> + desc.qw0 = QI_PC_DID(did) | QI_PC_GRAN(QI_PC_PASID_SEL) |
> + QI_PC_PASID(pasid) | QI_PC_TYPE;
>   desc.qw1 = 0;
>   desc.qw2 = 0;
>   desc.qw3 = 0;
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index b0ffecbc0dfc..dd9fa61689bc 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -332,7 +332,7 @@ enum {
>  #define QI_IOTLB_GRAN(gran)  (((u64)gran) >> (DMA_TLB_FLUSH_GRANU_OFFSET-4))
>  #define QI_IOTLB_ADDR(addr)  (((u64)addr) & VTD_PAGE_MASK)
>  #define QI_IOTLB_IH(ih)  (((u64)ih) << 6)
> -#define QI_IOTLB_AM(am)  (((u8)am))
> +#define QI_IOTLB_AM(am)  (((u8)am) & 0x3f)
>  
>  #define QI_CC_FM(fm) (((u64)fm) << 48)
>  #define QI_CC_SID(sid)   (((u64)sid) << 32)
> @@ -351,16 +351,21 @@ enum {
>  #define QI_PC_DID(did)   (((u64)did) << 16)
>  #define QI_PC_GRAN(gran) (((u64)gran) << 4)
>  
> -#define QI_PC_ALL_PASIDS (QI_PC_TYPE | QI_PC_GRAN(0))
> -#define QI_PC_PASID_SEL  (QI_PC_TYPE | QI_PC_GRAN(1))
> +/* PASID cache invalidation granu */
> +#define QI_PC_ALL_PASIDS 0
> +#define QI_PC_PASID_SEL  1
>  
>  #define QI_EIOTLB_ADDR(addr) ((u64)(addr) & VTD_PAGE_MASK)
>  #define QI_EIOTLB_IH(ih) (((u64)ih) << 6)
> -#define QI_EIOTLB_AM(am) (((u64)am))
> +#define QI_EIOTLB_AM(am) (((u64)am) & 0x3f)
>  #define QI_EIOTLB_PASID(pasid)   (((u64)pasid) << 32)
>  #define QI_EIOTLB_DID(did)   (((u64)did) << 16)
>  #define QI_EIOTLB_GRAN(gran) (((u64)gran) << 4)
>  
> +/* QI Dev-IOTLB inv granu */
> +#define QI_DEV_IOTLB_GRAN_ALL		1
> +#define QI_DEV_IOTLB_GRAN_PASID_SEL	0
> +
>  #define QI_DEV_EIOTLB_ADDR(a)	((u64)(a) & VTD_PAGE_MASK)
>  #define QI_DEV_EIOTLB_SIZE	(((u64)1) << 11)
>  #define QI_DEV_EIOTLB_GLOB(g)	((u64)g)
> @@ -660,8 +665,15 @@ extern void qi_flush_iotlb(struct intel_iommu *iommu, u16 did, u64 addr,
> 			   unsigned int size_order, u64 type);
>  e

Re: [PATCH V9 03/10] iommu/vt-d: Add nested translation helper function

2020-02-12 Thread Auger Eric
Hi Jacob,
On 1/29/20 7:01 AM, Jacob Pan wrote:
> Nested translation mode is supported in VT-d 3.0 spec, chapter 3.8.
> With PASID granular translation type set to 0x11b, translation
> result from the first level(FL) also subject to a second level(SL)
> page table translation. This mode is used for SVA virtualization,
> where FL performs guest virtual to guest physical translation and
> SL performs guest physical to host physical translation.
You may also describe what the patch brings. The above text does not.
> 
> Signed-off-by: Jacob Pan 
> Signed-off-by: Liu, Yi L 
> ---
>  drivers/iommu/intel-pasid.c | 225 ++++++++++++++++++++++++++++++++++++
>  drivers/iommu/intel-pasid.h |  12 +++
>  include/linux/intel-iommu.h |   3 +
>  3 files changed, 240 insertions(+)
> 
> diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
> index 22b30f10b396..bd067af4d20b 100644
> --- a/drivers/iommu/intel-pasid.c
> +++ b/drivers/iommu/intel-pasid.c
> @@ -359,6 +359,76 @@ pasid_set_flpm(struct pasid_entry *pe, u64 value)
>   pasid_set_bits(&pe->val[2], GENMASK_ULL(3, 2), value << 2);
>  }
>  
> +/*
> + * Setup the Extended Memory Type (EMT) field (Bits 91-93)
> + * of a scalable mode PASID entry.
> + */
> +static inline void
> +pasid_set_emt(struct pasid_entry *pe, u64 value)
> +{
> + pasid_set_bits(&pe->val[1], GENMASK_ULL(29, 27), value << 27);
> +}
> +
> +/*
> + * Setup the Page Attribute Table (PAT) field (Bits 96-127)
> + * of a scalable mode PASID entry.
> + */
> +static inline void
> +pasid_set_pat(struct pasid_entry *pe, u64 value)
pat is 32b
> +{
> + pasid_set_bits(&pe->val[1], GENMASK_ULL(63, 32), value << 32);
> +}
> +
> +/*
> + * Setup the Cache Disable (CD) field (Bit 89)
> + * of a scalable mode PASID entry.
> + */
> +static inline void
> +pasid_set_cd(struct pasid_entry *pe)
> +{
> + pasid_set_bits(&pe->val[1], 1 << 25, 1 << 25);
BIT_ULL() here and below?
My preference would have been to a have a helper converting absolute
field boundaries (as the spec, Fig 9-36 describes those with absolute
values) into the right val and offset. I think it would be less error
prone globally.
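Something along these lines (rough sketch, untested):

	/* hypothetical helper: set a PASID-entry field given the
	 * absolute bit positions from Fig 9-36; the val[] word and
	 * in-word offset are derived instead of hand-computed.
	 */
	static inline void
	pasid_set_field(struct pasid_entry *pe, int msb, int lsb, u64 value)
	{
		int word = lsb / 64;
		int off = lsb % 64;

		pasid_set_bits(&pe->val[word],
			       GENMASK_ULL(msb - word * 64, off),
			       value << off);
	}

	/* e.g. CD (bit 89):     pasid_set_field(pe, 89, 89, 1);
	 *      EMT (bits 91-93): pasid_set_field(pe, 93, 91, value);
	 */
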
> +}
> +
> +/*
> + * Setup the Extended Memory Type Enable (EMTE) field (Bit 90)
> + * of a scalable mode PASID entry.
> + */
> +static inline void
> +pasid_set_emte(struct pasid_entry *pe)
> +{
> + pasid_set_bits(&pe->val[1], 1 << 26, 1 << 26);
> +}
> +
> +/*
> + * Setup the Extended Access Flag Enable (EAFE) field (Bit 135)
> + * of a scalable mode PASID entry.
> + */
> +static inline void
> +pasid_set_eafe(struct pasid_entry *pe)
> +{
> + pasid_set_bits(&pe->val[2], 1 << 7, 1 << 7);
> +}
> +
> +/*
> + * Setup the Page-level Cache Disable (PCD) field (Bit 95)
> + * of a scalable mode PASID entry.
> + */
> +static inline void
> +pasid_set_pcd(struct pasid_entry *pe)
> +{
> + pasid_set_bits(&pe->val[1], 1 << 31, 1 << 31);
> +}
> +
> +/*
> + * Setup the Page-level Write-Through (PWT)) field (Bit 94)
))
> + * of a scalable mode PASID entry.
> + */
> +static inline void
> +pasid_set_pwt(struct pasid_entry *pe)
> +{
> + pasid_set_bits(&pe->val[1], 1 << 30, 1 << 30);
> +}
> +
>  static void
>  pasid_cache_invalidation_with_pasid(struct intel_iommu *iommu,
>   u16 did, int pasid)
> @@ -596,3 +666,158 @@ int intel_pasid_setup_pass_through(struct intel_iommu 
> *iommu,
>  
>   return 0;
>  }
> +
> +static int intel_pasid_setup_bind_data(struct intel_iommu *iommu,
> + struct pasid_entry *pte,
> + struct iommu_gpasid_bind_data_vtd *pasid_data)
> +{
> + /*
> +  * Not all guest PASID table entry fields are passed down during bind,
> +  * here we only set up the ones that are dependent on guest settings.
> +  * Execution related bits such as NXE, SMEP are not meaningful to IOMMU,
> +  * therefore not set. Other fields, such as snoop related, are set based
> +  * on host needs regardless of  guest settings.
s/of  /of /
> +  */
> + if (pasid_data->flags & IOMMU_SVA_VTD_GPASID_SRE) {
> + if (!ecap_srs(iommu->ecap)) {
> + pr_err("No supervisor request support on %s\n",
> +iommu->name);
> + return -EINVAL;
> + }
> + pasid_set_sre(pte);
> + }
> +
> + if (pasid_data->flags & IOMMU_SVA_VTD_GPASID_EAFE) {
> + if (!ecap_eafs(iommu->ecap)) {
> + pr_err("No extended access flag support on %s\n",
> + iommu->name);
> + return -EINVAL;
> + }
> + pasid_set_eafe(pte);
> + }
> +
> + /*
> +  * Memory type is only applicable to devices inside processor coherent
> +  * domain. PCIe devices are not included. We can skip the rest of the
> +  * flags if IOMMU does not support MTS.
> +  */
> + if (ecap_mts(iommu->ecap)) {
> + if (pasid_data->flags & I

Re: [PATCH V9 02/10] iommu/uapi: Define a mask for bind data

2020-02-12 Thread Auger Eric
Hi Jacob,

On 1/29/20 7:01 AM, Jacob Pan wrote:
> Memory type related guest PASID bind data can be grouped together for
> one simple check.
Those are flags related to memory type.
> Link: https://lore.kernel.org/linux-iommu/20200109095123.17ed5e6b@jacob-builder/
not sure the link is really helpful.
> 
> Signed-off-by: Jacob Pan 
> ---
>  include/uapi/linux/iommu.h | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> index 4ad3496e5c43..fcafb6401430 100644
> --- a/include/uapi/linux/iommu.h
> +++ b/include/uapi/linux/iommu.h
> @@ -284,7 +284,10 @@ struct iommu_gpasid_bind_data_vtd {
>   __u32 pat;
>   __u32 emt;
>  };
> -
> +#define IOMMU_SVA_VTD_GPASID_EMT_MASK	(IOMMU_SVA_VTD_GPASID_CD | \
> +  IOMMU_SVA_VTD_GPASID_EMTE | \
> +  IOMMU_SVA_VTD_GPASID_PCD |  \
> +  IOMMU_SVA_VTD_GPASID_PWT)
Why EMT rather than MT or MTS?
the spec says:
Those fields are treated as Reserved(0) for implementations not
supporting Memory Type (MTS=0 in Extended Capability Register).

>  /**
>   * struct iommu_gpasid_bind_data - Information about device and guest PASID 
> binding
>   * @version: Version of this data structure
> 

Thanks

Eric



Re: [PATCH] iommu/arm-smmu: fix the module name to be consistent with older kernel

2020-02-12 Thread Will Deacon
On Tue, Feb 11, 2020 at 06:37:20PM -0600, Li Yang wrote:
> Commit cd221bd24ff5 ("iommu/arm-smmu: Allow building as a module")
> introduced a side effect that changed the module name from arm-smmu to
> arm-smmu-mod.  This breaks the users of kernel parameters for the driver
> (e.g. arm-smmu.disable_bypass).  This patch changes the module name back
> to be consistent.
> 
> Signed-off-by: Li Yang 
> ---
>  drivers/iommu/Makefile  | 4 ++--
>  drivers/iommu/{arm-smmu.c => arm-smmu-common.c} | 0
>  2 files changed, 2 insertions(+), 2 deletions(-)
>  rename drivers/iommu/{arm-smmu.c => arm-smmu-common.c} (100%)

Can't we just override MODULE_PARAM_PREFIX instead of renaming the file?

Will