Re: [PATCH v2] swiotlb: Adjust SWIOTLB bounce buffer size for SEV guests.

2020-06-23 Thread Ashish Kalra
Hello Konrad,

On Tue, Jun 23, 2020 at 09:38:43AM -0400, Konrad Rzeszutek Wilk wrote:
> On Mon, Apr 27, 2020 at 06:53:18PM +, Ashish Kalra wrote:
> > Hello Konrad,
> > 
> > On Mon, Mar 30, 2020 at 10:25:51PM +, Ashish Kalra wrote:
> > > Hello Konrad,
> > > 
> > > On Tue, Mar 03, 2020 at 12:03:53PM -0500, Konrad Rzeszutek Wilk wrote:
> > > > On Tue, Feb 04, 2020 at 07:35:00PM +, Ashish Kalra wrote:
> > > > > Hello Konrad,
> > > > > 
> > > > > Looking fwd. to your feedback regarding support of other memory
> > > > > encryption architectures such as Power, S390, etc.
> > > > > 
> > > > > Thanks,
> > > > > Ashish
> > > > > 
> > > > > On Fri, Jan 24, 2020 at 11:00:08PM +, Ashish Kalra wrote:
> > > > > > On Tue, Jan 21, 2020 at 03:54:03PM -0500, Konrad Rzeszutek Wilk 
> > > > > > wrote:
> > > > > > > > 
> > > > > > > > Additional memory calculations based on # of PCI devices and
> > > > > > > > their memory ranges will make it more complicated with so
> > > > > > > > many other permutations and combinations to explore, it is
> > > > > > > > essential to keep this patch as simple as possible by 
> > > > > > > > adjusting the bounce buffer size simply by determining it
> > > > > > > > from the amount of provisioned guest memory.
> > > > > > >> 
> > > > > > >> Please rework the patch to:
> > > > > > >> 
> > > > > > >>  - Use a log solution instead of the multiplication.
> > > > > > >>Feel free to cap it at a sensible value.
> > > > > > 
> > > > > > Ok.
> > > > > > 
> > > > > > >> 
> > > > > > >>  - Also the code depends on SWIOTLB calling in to the
> > > > > > >>adjust_swiotlb_default_size which looks wrong.
> > > > > > >> 
> > > > > > >>You should not adjust io_tlb_nslabs from 
> > > > > > >> swiotlb_size_or_default.
> > > > > > 
> > > > > > >>That function's purpose is to report a value.
> > > > > > >> 
> > > > > > >>  - Make io_tlb_nslabs be visible outside of the SWIOTLB code.
> > > > > > >> 
> > > > > > >>  - Can you utilize the IOMMU_INIT APIs and have your own
> > > > > > >>detect which would modify the io_tlb_nslabs (and set
> > > > > > >>swiotlb=1?).
> > > > > > 
> > > > > > This seems to be a nice option, but then IOMMU_INIT APIs are
> > > > > > x86-specific and this swiotlb buffer size adjustment is also needed
> > > > > > for other memory encryption architectures like Power, S390, etc.
> > > > 
> > > > Oh dear. That I hadn't considered.
> > > > > > 
> > > > > > >> 
> > > > > > >>Actually you seem to be piggybacking on 
> > > > > > >> pci_swiotlb_detect_4gb - so
> > > > > > >>perhaps add in this code? Albeit it really should be in its
> > > > > > >>own file, not in arch/x86/kernel/pci-swiotlb.c
> > > > > > 
> > > > > > Actually, we piggyback on pci_swiotlb_detect_override, which sets
> > > > > > swiotlb=1, as x86_64_start_kernel() and the invocation of
> > > > > > sme_early_init() force swiotlb on; but again, this is all x86
> > > > > > architecture specific.
> > > > 
> > > > Then it looks like the best bet is to do it from within swiotlb_init?
> > > > We really can't do it from swiotlb_size_or_default - that function
> > > > should just return a value and nothing else.
> > > > 
> > > 
> > > Actually, we need to do it in swiotlb_size_or_default(), as this gets
> > > called by reserve_crashkernel_low() in arch/x86/kernel/setup.c and is
> > > used to reserve low crashkernel memory. If we adjust the swiotlb size
> > > later, in swiotlb_init(), which gets called after
> > > reserve_crashkernel_low(), then any swiotlb size changes/expansion
> > > will conflict/overlap with the low memory reserved for crashkernel.
> > > 
> > and will also potentially cause SWIOTLB buffer allocation failures.
> > 
> > Do you have any feedback, comments on the above ?
> 
> 
> The init boot chain looks like this:
> 
> initmem_init
>   pci_iommu_alloc
>   -> pci_swiotlb_detect_4gb
>   -> swiotlb_init
> 
> reserve_crashkernel
>   reserve_crashkernel_low
>   -> swiotlb_size_or_default
>   ..
> 
> 
> (rootfs code):
>   pci_iommu_init
>   -> a bunch of the other IOMMU late_init code gets called..
>   ->  pci_swiotlb_late_init 
> 
> I have to say I am lost as to how your patch fixes "If we adjust swiotlb
> size later .. then any swiotlb size .. will overlap with the low memory
> reserved for crashkernel"?
> 

Actually, as per the boot flow:

setup_arch() calls reserve_crashkernel(), and pci_iommu_alloc() is
invoked through mm_init()/mem_init(), not via initmem_init().

start_kernel:
...
setup_arch()
reserve_crashkernel
reserve_crashkernel_low
-> swiotlb_size_or_default

...
...
mm_init()
mem_init()
pci_iommu_alloc
-> pci_swiotlb_detect_4gb
-> swiotlb_init

So as per the above boot flow, reserve_crashkernel() can get called
before swiotlb detect/init, and hence, if we don't adjust the SWIOTLB
size in swiotlb_size_or_default(), the crashkernel low memory
reservation can conflict/overlap with a later SWIOTLB size expansion.

Re: [PATCH 5/7] iommu/vt-d: Fix devTLB flush for vSVA

2020-06-23 Thread Jacob Pan
On Tue, 23 Jun 2020 08:43:14 -0700
Jacob Pan  wrote:

> From: Liu Yi L 
> 
> For guest SVA usage, in order to optimize for less VMEXIT, guest
> request of IOTLB flush also includes device TLB.
> 
> On the host side, IOMMU driver performs IOTLB and implicit devTLB
> invalidation. When PASID-selective granularity is requested by the
> guest we need to derive the equivalent address range for devTLB
> instead of using the address information in the UAPI data. The reason
> for that is, unlike IOTLB flush, devTLB flush does not support
> PASID-selective granularity. This is to say, we need to set the
> following in the PASID based devTLB invalidation descriptor:
> - entire 64 bit range in address ~(0x1 << 63)
> - S bit = 1 (VT-d CH 6.5.2.6).
> 
> Without this fix, device TLB flush range is not set properly for PASID
> selective granularity. This patch also merged devTLB flush code for
> both implicit and explicit cases.
> 
> Fixes: 6ee1b77ba3ac ("iommu/vt-d: Add svm/sva invalidate function")
> Signed-off-by: Liu Yi L 
> Signed-off-by: Jacob Pan 
> ---
>  drivers/iommu/intel/iommu.c | 25 +
>  1 file changed, 17 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 96340da57075..5ea5732d5ec4 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -5408,7 +5408,7 @@ intel_iommu_sva_invalidate(struct iommu_domain
> *domain, struct device *dev, sid = PCI_DEVID(bus, devfn);
>  
>   /* Size is only valid in address selective invalidation */
> - if (inv_info->granularity != IOMMU_INV_GRANU_PASID)
> + if (inv_info->granularity == IOMMU_INV_GRANU_ADDR)
>   size = to_vtd_size(inv_info->addr_info.granule_size,
>  inv_info->addr_info.nb_granules);
>  
> @@ -5417,6 +5417,7 @@ intel_iommu_sva_invalidate(struct iommu_domain
> *domain, struct device *dev, IOMMU_CACHE_INV_TYPE_NR) {
>   int granu = 0;
>   u64 pasid = 0;
> + u64 addr = 0;
>  
>   granu = to_vtd_granularity(cache_type,
> inv_info->granularity); if (granu == -EINVAL) {
> @@ -5456,19 +5457,27 @@ intel_iommu_sva_invalidate(struct
> iommu_domain *domain, struct device *dev, (granu ==
> QI_GRAN_NONG_PASID) ? -1 : 1 << size, inv_info->addr_info.flags &
> IOMMU_INV_ADDR_FLAGS_LEAF); 
> + if (!info->ats_enabled)
> + break;
>   /*
>* Always flush device IOTLB if ATS is
> enabled. vIOMMU
>* in the guest may assume IOTLB flush is
> inclusive,
>* which is more efficient.
>*/
> - if (info->ats_enabled)
> - qi_flush_dev_iotlb_pasid(iommu, sid,
> - info->pfsid, pasid,
> - info->ats_qdep,
> -
> inv_info->addr_info.addr,
> - size);
> - break;
> + fallthrough;
>   case IOMMU_CACHE_INV_TYPE_DEV_IOTLB:
> + /*
> +  * There is no PASID selective flush for
> device TLB, so
> +  * the equivalent of that is we set the size
> to be the
> +  * entire range of 64 bit. User only
> provides PASID info
> +  * without address info. So we set addr to 0.
> +  */
> + if (inv_info->granularity ==
> IOMMU_INV_GRANU_PASID) {
> + size = 64 - VTD_PAGE_SHIFT;
> + addr = 0;
> + } else if (inv_info->granularity ==
> IOMMU_INV_GRANU_ADDR)
> + addr = inv_info->addr_info.addr;
> +
>   if (info->ats_enabled)
>   qi_flush_dev_iotlb_pasid(iommu, sid,
>   info->pfsid, pasid,
addr should be used here. will fix in the next version. Baolu has
pointed out this before but missed it here.

Jacob
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 5/7] iommu/vt-d: Fix devTLB flush for vSVA

2020-06-23 Thread kernel test robot
Hi Jacob,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on iommu/next]
[also build test WARNING on linux/master linus/master v5.8-rc2 next-20200623]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use  as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Jacob-Pan/iommu-vt-d-Misc-tweaks-and-fixes-for-vSVA/20200623-233905
base:   https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git next
config: i386-allyesconfig (attached as .config)
compiler: gcc-9 (Debian 9.3.0-13) 9.3.0
reproduce (this is a W=1 build):
# save the attached .config to linux build tree
make W=1 ARCH=i386 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

   drivers/iommu/intel/iommu.c: In function 'intel_iommu_sva_invalidate':
>> drivers/iommu/intel/iommu.c:5420:7: warning: variable 'addr' set but not 
>> used [-Wunused-but-set-variable]
5420 |   u64 addr = 0;
 |   ^~~~

vim +/addr +5420 drivers/iommu/intel/iommu.c

  5370  
  5371  #ifdef CONFIG_INTEL_IOMMU_SVM
  5372  static int
  5373  intel_iommu_sva_invalidate(struct iommu_domain *domain, struct device 
*dev,
  5374 struct iommu_cache_invalidate_info *inv_info)
  5375  {
  5376  struct dmar_domain *dmar_domain = to_dmar_domain(domain);
  5377  struct device_domain_info *info;
  5378  struct intel_iommu *iommu;
  5379  unsigned long flags;
  5380  int cache_type;
  5381  u8 bus, devfn;
  5382  u16 did, sid;
  5383  int ret = 0;
  5384  u64 size = 0;
  5385  
  5386  if (!inv_info || !dmar_domain ||
  5387  inv_info->version != IOMMU_CACHE_INVALIDATE_INFO_VERSION_1)
  5388  return -EINVAL;
  5389  
  5390  if (!dev || !dev_is_pci(dev))
  5391  return -ENODEV;
  5392  
  5393  iommu = device_to_iommu(dev, &bus, &devfn);
  5394  if (!iommu)
  5395  return -ENODEV;
  5396  
  5397  if (!(dmar_domain->flags & DOMAIN_FLAG_NESTING_MODE))
  5398  return -EINVAL;
  5399  
  5400  spin_lock_irqsave(&device_domain_lock, flags);
  5401  spin_lock(&iommu->lock);
  5402  info = get_domain_info(dev);
  5403  if (!info) {
  5404  ret = -EINVAL;
  5405  goto out_unlock;
  5406  }
  5407  did = dmar_domain->iommu_did[iommu->seq_id];
  5408  sid = PCI_DEVID(bus, devfn);
  5409  
  5410  /* Size is only valid in address selective invalidation */
  5411  if (inv_info->granularity == IOMMU_INV_GRANU_ADDR)
  5412  size = to_vtd_size(inv_info->addr_info.granule_size,
  5413 inv_info->addr_info.nb_granules);
  5414  
  5415  for_each_set_bit(cache_type,
  5416   (unsigned long *)&inv_info->cache,
  5417   IOMMU_CACHE_INV_TYPE_NR) {
  5418  int granu = 0;
  5419  u64 pasid = 0;
> 5420  u64 addr = 0;
  5421  
  5422  granu = to_vtd_granularity(cache_type, 
inv_info->granularity);
  5423  if (granu == -EINVAL) {
  5424  pr_err_ratelimited("Invalid cache type and 
granu combination %d/%d\n",
  5425 cache_type, 
inv_info->granularity);
  5426  break;
  5427  }
  5428  
  5429  /*
  5430   * PASID is stored in different locations based on the
  5431   * granularity.
  5432   */
  5433  if (inv_info->granularity == IOMMU_INV_GRANU_PASID &&
  5434  (inv_info->pasid_info.flags & 
IOMMU_INV_PASID_FLAGS_PASID))
  5435  pasid = inv_info->pasid_info.pasid;
  5436  else if (inv_info->granularity == IOMMU_INV_GRANU_ADDR 
&&
  5437   (inv_info->addr_info.flags & 
IOMMU_INV_ADDR_FLAGS_PASID))
  5438  pasid = inv_info->addr_info.pasid;
  5439  
  5440  switch (BIT(cache_type)) {
  5441  case IOMMU_CACHE_INV_TYPE_IOTLB:
  5442  if (inv_info->granularity == 
IOMMU_INV_GRANU_ADDR &&
  5443  size &&
  5444  (inv_info->addr_info.addr & 
((BIT(VTD_PAGE_SHIFT + size)) - 1))) {
  5445  pr_err_ratelimited("Address out of 
range, 0x%llx, size order %llu\n",
  5446

[PATCH v3 2/5] iommu/uapi: Add argsz for user filled data

2020-06-23 Thread Jacob Pan
As the IOMMU UAPI gets extended, user data size may increase. To support
backward compatibility, this patch introduces a size field in each UAPI
data structure. It is *always* the user's responsibility to fill in
the correct size.

Specific scenarios for user data handling are documented in:
Documentation/userspace-api/iommu.rst

Signed-off-by: Liu Yi L 
Signed-off-by: Jacob Pan 
---
 include/uapi/linux/iommu.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
index e907b7091a46..303f148a5cd7 100644
--- a/include/uapi/linux/iommu.h
+++ b/include/uapi/linux/iommu.h
@@ -135,6 +135,7 @@ enum iommu_page_response_code {
 
 /**
  * struct iommu_page_response - Generic page response information
+ * @argsz: User filled size of this data
  * @version: API version of this structure
  * @flags: encodes whether the corresponding fields are valid
  * (IOMMU_FAULT_PAGE_RESPONSE_* values)
@@ -143,6 +144,7 @@ enum iommu_page_response_code {
  * @code: response code from  iommu_page_response_code
  */
 struct iommu_page_response {
+   __u32   argsz;
 #define IOMMU_PAGE_RESP_VERSION_1  1
__u32   version;
 #define IOMMU_PAGE_RESP_PASID_VALID(1 << 0)
@@ -218,6 +220,7 @@ struct iommu_inv_pasid_info {
 /**
  * struct iommu_cache_invalidate_info - First level/stage invalidation
  * information
+ * @argsz: User filled size of this data
  * @version: API version of this structure
  * @cache: bitfield that allows to select which caches to invalidate
  * @granularity: defines the lowest granularity used for the invalidation:
@@ -246,6 +249,7 @@ struct iommu_inv_pasid_info {
  * must support the used granularity.
  */
 struct iommu_cache_invalidate_info {
+   __u32   argsz;
 #define IOMMU_CACHE_INVALIDATE_INFO_VERSION_1 1
__u32   version;
 /* IOMMU paging structure cache */
@@ -292,6 +296,7 @@ struct iommu_gpasid_bind_data_vtd {
 
 /**
  * struct iommu_gpasid_bind_data - Information about device and guest PASID 
binding
+ * @argsz: User filled size of this data
  * @version:   Version of this data structure
  * @format:PASID table entry format
  * @flags: Additional information on guest bind request
@@ -309,6 +314,7 @@ struct iommu_gpasid_bind_data_vtd {
  * PASID to host PASID based on this bind data.
  */
 struct iommu_gpasid_bind_data {
+   __u32 argsz;
 #define IOMMU_GPASID_BIND_VERSION_11
__u32 version;
 #define IOMMU_PASID_FORMAT_INTEL_VTD   1
-- 
2.7.4



[PATCH v3 5/5] iommu/uapi: Support both kernel and user unbind guest PASID

2020-06-23 Thread Jacob Pan
Guest SVA unbind data can come from either kernel or user space. If a
user pointer is passed in, the IOMMU driver must copy the data from
user space. If the unbind data is assembled in the kernel, the data can
be trusted and used directly. This patch creates a wrapper for unbind
gpasid such that a user pointer can be parsed and sanitized before
calling into the kernel unbind function. Common user data copy code is
also consolidated.

Signed-off-by: Liu Yi L 
Signed-off-by: Jacob Pan 
---
 drivers/iommu/iommu.c | 70 ++-
 include/linux/iommu.h | 13 --
 2 files changed, 64 insertions(+), 19 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 4a025c429b41..595527e4c6b7 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2010,19 +2010,15 @@ int iommu_cache_invalidate(struct iommu_domain *domain, 
struct device *dev,
 }
 EXPORT_SYMBOL_GPL(iommu_cache_invalidate);
 
-int iommu_sva_bind_gpasid(struct iommu_domain *domain, struct device *dev,
-   void __user *udata)
-{
 
-   struct iommu_gpasid_bind_data data;
+static int iommu_sva_prepare_bind_data(void __user *udata, bool bind,
+  struct iommu_gpasid_bind_data *data)
+{
unsigned long minsz, maxsz;
 
-   if (unlikely(!domain->ops->sva_bind_gpasid))
-   return -ENODEV;
-
/* Current kernel data size is the max to be copied from user */
maxsz = sizeof(struct iommu_gpasid_bind_data);
-   memset((void *)&data, 0, maxsz);
+   memset((void *)data, 0, maxsz);
 
/*
 * No new spaces can be added before the variable sized union, the
@@ -2031,11 +2027,11 @@ int iommu_sva_bind_gpasid(struct iommu_domain *domain, 
struct device *dev,
minsz = offsetof(struct iommu_gpasid_bind_data, vendor);
 
/* Copy minsz from user to get flags and argsz */
-   if (copy_from_user(&data, udata, minsz))
+   if (copy_from_user(data, udata, minsz))
return -EFAULT;
 
/* Fields before variable size union is mandatory */
-   if (data.argsz < minsz)
+   if (data->argsz < minsz)
return -EINVAL;
/*
 * User might be using a newer UAPI header, we shall let IOMMU vendor
@@ -2043,26 +2039,66 @@ int iommu_sva_bind_gpasid(struct iommu_domain *domain, 
struct device *dev,
 * can be vendor specific, larger argsz could be the result of extension
 * for one vendor but it should not affect another vendor.
 */
-   if (data.argsz > maxsz)
-   data.argsz = maxsz;
+   if (data->argsz > maxsz)
+   data->argsz = maxsz;
+
+   /*
+* For unbind, we don't need any extra data, host PASID is included in
+* the minsz and that is all we need.
+*/
+   if (!bind)
+   return 0;
 
/* Copy the remaining user data _after_ minsz */
-   if (copy_from_user((void *)&data + minsz, udata + minsz,
-   data.argsz - minsz))
+   if (copy_from_user((void *)data + minsz, udata + minsz,
+   data->argsz - minsz))
return -EFAULT;
 
+   return 0;
+}
+
+int iommu_sva_bind_gpasid(struct iommu_domain *domain, struct device *dev,
+   void __user *udata)
+{
+
+   struct iommu_gpasid_bind_data data;
+   int ret;
+
+   if (unlikely(!domain->ops->sva_bind_gpasid))
+   return -ENODEV;
+
+   ret = iommu_sva_prepare_bind_data(udata, true, &data);
+   if (ret)
+   return ret;
 
    return domain->ops->sva_bind_gpasid(domain, dev, &data);
 }
 EXPORT_SYMBOL_GPL(iommu_sva_bind_gpasid);
 
-int iommu_sva_unbind_gpasid(struct iommu_domain *domain, struct device *dev,
-ioasid_t pasid)
+int __iommu_sva_unbind_gpasid(struct iommu_domain *domain, struct device *dev,
+   struct iommu_gpasid_bind_data *data)
 {
if (unlikely(!domain->ops->sva_unbind_gpasid))
return -ENODEV;
 
-   return domain->ops->sva_unbind_gpasid(dev, pasid);
+   return domain->ops->sva_unbind_gpasid(dev, data->hpasid);
+}
+EXPORT_SYMBOL_GPL(__iommu_sva_unbind_gpasid);
+
+int iommu_sva_unbind_gpasid(struct iommu_domain *domain, struct device *dev,
+   void __user *udata)
+{
+   struct iommu_gpasid_bind_data data;
+   int ret;
+
+   if (unlikely(!domain->ops->sva_bind_gpasid))
+   return -ENODEV;
+
+   ret = iommu_sva_prepare_bind_data(udata, false, &data);
+   if (ret)
+   return ret;
+
+   return __iommu_sva_unbind_gpasid(domain, dev, &data);
 }
 EXPORT_SYMBOL_GPL(iommu_sva_unbind_gpasid);
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index a688fea42ae5..2567c33dc4e8 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -437,7 +437,9 @@ extern int iommu_cache_invalidate(struct 

[PATCH v3 3/5] iommu/uapi: Use named union for user data

2020-06-23 Thread Jacob Pan
IOMMU UAPI data size is filled in by user space and must be validated
by the kernel. To ensure backward compatibility, user data can only be
extended by either re-purposing padding bytes or extending the variable
sized union at the end. No size change is allowed before the union.
Therefore, the minimum size is the offset of the union.

To use offsetof() on the union, we must make it named.

Link: https://lkml.org/lkml/2020/6/11/834
Signed-off-by: Jacob Pan 
---
 drivers/iommu/intel/iommu.c | 24 
 drivers/iommu/intel/svm.c   |  2 +-
 include/uapi/linux/iommu.h  |  4 ++--
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 50fc62413a35..59cba214ef13 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5409,8 +5409,8 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, 
struct device *dev,
 
/* Size is only valid in address selective invalidation */
if (inv_info->granularity == IOMMU_INV_GRANU_ADDR)
-   size = to_vtd_size(inv_info->addr_info.granule_size,
-  inv_info->addr_info.nb_granules);
+   size = to_vtd_size(inv_info->granu.addr_info.granule_size,
+  inv_info->granu.addr_info.nb_granules);
 
for_each_set_bit(cache_type,
 (unsigned long *)&inv_info->cache,
@@ -5431,20 +5431,20 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, 
struct device *dev,
 * granularity.
 */
if (inv_info->granularity == IOMMU_INV_GRANU_PASID &&
-   (inv_info->pasid_info.flags & IOMMU_INV_PASID_FLAGS_PASID))
-   pasid = inv_info->pasid_info.pasid;
+   (inv_info->granu.pasid_info.flags & 
IOMMU_INV_PASID_FLAGS_PASID))
+   pasid = inv_info->granu.pasid_info.pasid;
else if (inv_info->granularity == IOMMU_INV_GRANU_ADDR &&
-(inv_info->addr_info.flags & 
IOMMU_INV_ADDR_FLAGS_PASID))
-   pasid = inv_info->addr_info.pasid;
+(inv_info->granu.addr_info.flags & 
IOMMU_INV_ADDR_FLAGS_PASID))
+   pasid = inv_info->granu.addr_info.pasid;
 
switch (BIT(cache_type)) {
case IOMMU_CACHE_INV_TYPE_IOTLB:
/* HW will ignore LSB bits based on address mask */
if (inv_info->granularity == IOMMU_INV_GRANU_ADDR &&
size &&
-   (inv_info->addr_info.addr & ((BIT(VTD_PAGE_SHIFT + 
size)) - 1))) {
+   (inv_info->granu.addr_info.addr & 
((BIT(VTD_PAGE_SHIFT + size)) - 1))) {
WARN_ONCE(1, "Address out of range, 0x%llx, 
size order %llu\n",
- inv_info->addr_info.addr, size);
+ inv_info->granu.addr_info.addr, size);
}
 
/*
@@ -5452,9 +5452,9 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, 
struct device *dev,
 * We use npages = -1 to indicate that.
 */
qi_flush_piotlb(iommu, did, pasid,
-   mm_to_dma_pfn(inv_info->addr_info.addr),
+   
mm_to_dma_pfn(inv_info->granu.addr_info.addr),
(granu == QI_GRAN_NONG_PASID) ? -1 : 1 
<< size,
-   inv_info->addr_info.flags & 
IOMMU_INV_ADDR_FLAGS_LEAF);
+   inv_info->granu.addr_info.flags & 
IOMMU_INV_ADDR_FLAGS_LEAF);
 
if (!info->ats_enabled)
break;
@@ -5475,13 +5475,13 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, 
struct device *dev,
size = 64 - VTD_PAGE_SHIFT;
addr = 0;
} else if (inv_info->granularity == 
IOMMU_INV_GRANU_ADDR)
-   addr = inv_info->addr_info.addr;
+   addr = inv_info->granu.addr_info.addr;
 
if (info->ats_enabled)
qi_flush_dev_iotlb_pasid(iommu, sid,
info->pfsid, pasid,
info->ats_qdep,
-   inv_info->addr_info.addr,
+   inv_info->granu.addr_info.addr,
size);
else
pr_warn_ratelimited("Passdown device IOTLB 
flush w/o ATS!\n");
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 

[PATCH v3 1/5] docs: IOMMU user API

2020-06-23 Thread Jacob Pan
The IOMMU UAPI is newly introduced to support communications between a
guest virtual IOMMU and the host IOMMU. There have been lots of
discussions on how it should work with the VFIO UAPI and userspace in
general.

This document is intended to clarify the UAPI design and usage. The
mechanics of how future extensions should be achieved are also covered
in this documentation.

Signed-off-by: Liu Yi L 
Signed-off-by: Jacob Pan 
---
 Documentation/userspace-api/iommu.rst | 244 ++
 1 file changed, 244 insertions(+)
 create mode 100644 Documentation/userspace-api/iommu.rst

diff --git a/Documentation/userspace-api/iommu.rst 
b/Documentation/userspace-api/iommu.rst
new file mode 100644
index ..f9e4ed90a413
--- /dev/null
+++ b/Documentation/userspace-api/iommu.rst
@@ -0,0 +1,244 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. iommu:
+
+=
+IOMMU Userspace API
+=
+
+IOMMU UAPI is used for virtualization cases where communications are
+needed between physical and virtual IOMMU drivers. For native
+usage, IOMMU is a system device which does not need to communicate
+with user space directly.
+
+The primary use cases are guest Shared Virtual Address (SVA) and
+guest IO virtual address (IOVA), wherein a virtual IOMMU (vIOMMU) is
+required to communicate with the physical IOMMU in the host.
+
+.. contents:: :local:
+
+Functionalities
+===
+Communications of user and kernel involve both directions. The
+supported user-kernel APIs are as follows:
+
+1. Alloc/Free PASID
+2. Bind/unbind guest PASID (e.g. Intel VT-d)
+3. Bind/unbind guest PASID table (e.g. ARM sMMU)
+4. Invalidate IOMMU caches
+5. Service page requests
+
+Requirements
+
+The IOMMU UAPIs are generic and extensible to meet the following
+requirements:
+
+1. Emulated and para-virtualised vIOMMUs
+2. Multiple vendors (Intel VT-d, ARM sMMU, etc.)
+3. Extensions to the UAPI shall not break existing user space
+
+Interfaces
+==
+Although the data structures defined in the IOMMU UAPI are
+self-contained, no user API functions are introduced. Instead, the
+IOMMU UAPI is designed to work with existing user driver frameworks
+such as VFIO.
+
+Extension Rules & Precautions
+-
+When IOMMU UAPI gets extended, the data structures can *only* be
+modified in two ways:
+
+1. Adding new fields by re-purposing the padding[] field. No size change.
+2. Adding new union members at the end. May increase in size.
+
+No new fields can be added *after* the variable sized union, since that
+would break backward compatibility when the offset moves. In both
+cases, a new flag must accompany each new field so that the IOMMU
+driver can process the data based on the new flag. The version field is
+reserved only for the unlikely event of upgrading the UAPI in its
+entirety.
+
+It is *always* the caller's responsibility to indicate the size of the
+structure passed, by setting argsz appropriately. At the same time,
+argsz is user provided data and is not trusted: the argsz field lets
+the user indicate how much data they are providing, but it is still
+the kernel's responsibility to validate whether it is correct and
+sufficient for the requested operation.
+
+Compatibility Checking
+--
+When an IOMMU UAPI extension results in a size increase, a user such as
+VFIO has to handle the following cases:
+
+1. User and kernel have an exact size match
+2. An older user with an older kernel header (smaller UAPI size)
+   running on a newer kernel (larger UAPI size)
+3. A newer user with a newer kernel header (larger UAPI size) running
+   on an older kernel
+4. A malicious/misbehaving user passing an illegal/invalid size that is
+   still within range; the data may contain garbage
+
+Feature Checking
+
+While launching a guest with a vIOMMU, it is important to ensure that
+the host can support the UAPI data structures to be used for
+vIOMMU-pIOMMU communications. Without upfront compatibility checking,
+future faults are difficult to report even under normal conditions. For
+example, TLB invalidations should always succeed. There is no
+architectural way to report back to the vIOMMU if the UAPI data is
+incompatible. If that happens, in order to protect the IOMMU isolation
+guarantee, we have to resort to not giving completion status in the
+vIOMMU. This may result in a VM hang.
+
+For this reason the following IOMMU UAPIs cannot fail:
+
+1. Free PASID
+2. Unbind guest PASID
+3. Unbind guest PASID table (SMMU)
+4. Cache invalidate
+
+User applications such as QEMU are expected to import kernel UAPI
+headers. Backward compatibility is supported per feature flags.
+For example, an older QEMU (with an older kernel header) can run on a
+newer kernel. A newer QEMU (with a new kernel header) may refuse to
+initialize on an older kernel if new feature flags are not supported by
+the older kernel. Simply recompiling existing code with a newer kernel
+header should not be an issue in that 

[PATCH v3 0/5] IOMMU user API enhancement

2020-06-23 Thread Jacob Pan
IOMMU user API header was introduced to support nested DMA translation and
related fault handling. The current UAPI data structures consist of three
areas that cover the interactions between host kernel and guest:
 - fault handling
 - cache invalidation
 - bind guest page tables, i.e. guest PASID

Future extensions are likely to support more architectures and vIOMMU features.

In previous discussion, using user-filled data size and feature flags
was made the preferred approach over a unified version number.
https://lkml.org/lkml/2020/1/29/45

In addition to introducing an argsz field to the data structures, this
patchset also tries to document the UAPI design, usage, and extension
rules. VT-d driver changes to utilize the new argsz field are included;
VFIO usage is to follow.

Thanks,

Jacob


Changelog:
v3:
- Rewrote backward compatibility rule to support existing code
  re-compiled with newer kernel UAPI header that runs on older
  kernel. Based on review comment from Alex W.
  https://lore.kernel.org/linux-iommu/20200611094741.6d118...@w520.home/
- Take user pointer directly in UAPI functions. Perform argsz check
  and copy_from_user() in IOMMU driver. Eliminate the need for
  VFIO or other upper layer to parse IOMMU data.
- Create wrapper function for in-kernel users of UAPI functions
v2:
- Removed unified API version and helper
- Introduced argsz for each UAPI data
- Introduced UAPI doc

Jacob Pan (5):
  docs: IOMMU user API
  iommu/uapi: Add argsz for user filled data
  iommu/uapi: Use named union for user data
  iommu/uapi: Handle data and argsz filled by users
  iommu/uapi: Support both kernel and user unbind guest PASID

 Documentation/userspace-api/iommu.rst | 244 ++
 drivers/iommu/intel/iommu.c   |  24 ++--
 drivers/iommu/intel/svm.c |   5 +-
 drivers/iommu/iommu.c | 138 +--
 include/linux/iommu.h |  20 ++-
 include/uapi/linux/iommu.h|  10 +-
 6 files changed, 413 insertions(+), 28 deletions(-)
 create mode 100644 Documentation/userspace-api/iommu.rst

-- 
2.7.4



[PATCH v3 4/5] iommu/uapi: Handle data and argsz filled by users

2020-06-23 Thread Jacob Pan
IOMMU UAPI data has a user filled argsz field which indicates the data
length that comes with the API call. User data is not trusted, so argsz
must be validated based on the current kernel data size, mandatory data
size, and feature flags.

User data may also be extended, resulting in a possible argsz increase.
Backward compatibility is ensured based on size and flags checking.
Details are documented in Documentation/userspace-api/iommu.rst.

This patch adds sanity checks in both the IOMMU layer and vendor code,
where VT-d is the only user for now.

Signed-off-by: Liu Yi L 
Signed-off-by: Jacob Pan 
---
 drivers/iommu/intel/svm.c |  3 ++
 drivers/iommu/iommu.c | 96 ---
 include/linux/iommu.h |  7 ++--
 3 files changed, 98 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 713b3a218483..237db56878c0 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -244,6 +244,9 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, 
struct device *dev,
data->format != IOMMU_PASID_FORMAT_INTEL_VTD)
return -EINVAL;
 
+   if (data->argsz != offsetofend(struct iommu_gpasid_bind_data, 
vendor.vtd))
+   return -EINVAL;
+
if (!dev_is_pci(dev))
return -ENOTSUPP;
 
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index d43120eb1dc5..4a025c429b41 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1951,22 +1951,108 @@ int iommu_attach_device(struct iommu_domain *domain, 
struct device *dev)
 EXPORT_SYMBOL_GPL(iommu_attach_device);
 
 int iommu_cache_invalidate(struct iommu_domain *domain, struct device *dev,
-  struct iommu_cache_invalidate_info *inv_info)
+   void __user *uinfo)
 {
+   struct iommu_cache_invalidate_info inv_info;
+   unsigned long minsz, maxsz;
+
if (unlikely(!domain->ops->cache_invalidate))
return -ENODEV;
 
-   return domain->ops->cache_invalidate(domain, dev, inv_info);
+   /* Current kernel data size is the max to be copied from user */
+   maxsz = sizeof(struct iommu_cache_invalidate_info);
+   memset((void *)&inv_info, 0, maxsz);
+
+   /*
+* No new spaces can be added before the variable sized union, the
+* minimum size is the offset to the union.
+*/
+   minsz = offsetof(struct iommu_cache_invalidate_info, granu);
+
+   /* Copy minsz from user to get flags and argsz */
+   if (copy_from_user(&inv_info, uinfo, minsz))
+   return -EFAULT;
+
+   /* Fields before the variable-size union are mandatory */
+   if (inv_info.argsz < minsz)
+   return -EINVAL;
+   /*
+* User might be using a newer UAPI header, we shall let the IOMMU
+* vendor driver decide on what size it needs. Since the UAPI data
+* extension can be vendor specific, a larger argsz could be the
+* result of an extension for one vendor but should not affect
+* another. We shall support the existing flags within the current
+* kernel data size.
+*/
+   if (inv_info.argsz > maxsz)
+   inv_info.argsz = maxsz;
+
+   /* Checking the exact argsz based on generic flags */
+   if (inv_info.granularity == IOMMU_INV_GRANU_ADDR &&
+   inv_info.argsz != offsetofend(struct iommu_cache_invalidate_info,
+ granu.addr_info))
+   return -EINVAL;
+
+   if (inv_info.granularity == IOMMU_INV_GRANU_PASID &&
+   inv_info.argsz != offsetofend(struct iommu_cache_invalidate_info,
+ granu.pasid_info))
+   return -EINVAL;
+
+   /* Copy the remaining user data _after_ minsz */
+   if (copy_from_user((void *)&inv_info + minsz, uinfo + minsz,
+  inv_info.argsz - minsz))
+   return -EFAULT;
+
+   return domain->ops->cache_invalidate(domain, dev, &inv_info);
 }
 EXPORT_SYMBOL_GPL(iommu_cache_invalidate);
 
-int iommu_sva_bind_gpasid(struct iommu_domain *domain,
-  struct device *dev, struct iommu_gpasid_bind_data *data)
+int iommu_sva_bind_gpasid(struct iommu_domain *domain, struct device *dev,
+   void __user *udata)
 {
+
+   struct iommu_gpasid_bind_data data;
+   unsigned long minsz, maxsz;
+
if (unlikely(!domain->ops->sva_bind_gpasid))
return -ENODEV;
 
-   return domain->ops->sva_bind_gpasid(domain, dev, data);
+   /* Current kernel data size is the max to be copied from user */
+   maxsz = sizeof(struct iommu_gpasid_bind_data);
+   memset((void *)&data, 0, maxsz);
+
+   /*
+* No new spaces can be added before the variable sized union, the
+* minimum size is the offset to the union.
+*/

Re: [PATCH 4/4] iommu/arm-smmu-v3: Remove cmpxchg() in arm_smmu_cmdq_issue_cmdlist()

2020-06-23 Thread Robin Murphy
On 2020-06-23 10:21, John Garry wrote:
> On 23/06/2020 02:07, kernel test robot wrote:
> 
> + Rikard, as the GENMASK compile-time sanity checks were added recently
> 
>> Hi John,
>>
>> I love your patch! Perhaps something to improve:
>>
>> [auto build test WARNING on iommu/next]
>> [If your patch is applied to the wrong git tree, kindly drop us a note.
>> And when submitting patch, we suggest to use  as documented in
>> https://git-scm.com/docs/git-format-patch]
>>
>> url:    
>> https://github.com/0day-ci/linux/commits/John-Garry/iommu-arm-smmu-v3-Improve-cmdq-lock-efficiency/20200623-013438
>>  
>>
>> base:   https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git 
>> next
>> config: arm64-randconfig-c024-20200622 (attached as .config)
>> compiler: aarch64-linux-gcc (GCC) 9.3.0
>>
>> If you fix the issue, kindly add following tag as appropriate
>> Reported-by: kernel test robot 
>>
>> All warnings (new ones prefixed by >>, old ones prefixed by <<):
>>
>> In file included from include/linux/bits.h:23,
>> from include/linux/ioport.h:15,
>> from include/linux/acpi.h:12,
>> from drivers/iommu/arm-smmu-v3.c:12:
>> drivers/iommu/arm-smmu-v3.c: In function 'arm_smmu_cmdq_issue_cmdlist':
>> include/linux/bits.h:26:28: warning: comparison of unsigned expression 
>> < 0 is always false [-Wtype-limits]
>> 26 |   __builtin_constant_p((l) > (h)), (l) > (h), 0)))
> 
> I'd say that GENMASK_INPUT_CHECK() should be able to handle l=0 and an
> unsigned h value, so I doubt this warning is valid.
> 
> Using GENMASK((int)cmdq->q.llq.max_n_shift, 0) resolves it, but it looks 
> like GENMASK_INPUT_CHECK() could be improved.

That said, I think this particular case might be even better off dodging
GENMASK() entirely, by doing something like this first. Untested...

Robin.

->8-
Subject: [PATCH] iommu/arm-smmu-v3: Streamline queue calculations

Beyond the initial queue setup based on the log2 values from ID
registers, the log2 queue size is only ever used in the form of
(1 << max_n_shift) to repeatedly recalculate the number of queue
elements. Simply storing it in that form leads to slightly more
efficient code, particularly in the low-level queue accessors
where it counts most:

add/remove: 0/0 grow/shrink: 1/7 up/down: 4/-120 (-116)
Function old new   delta
arm_smmu_init_one_queue  360 364  +4
arm_smmu_priq_thread 512 508  -4
arm_smmu_evtq_thread 300 292  -8
__arm_smmu_cmdq_poll_set_valid_map.isra  296 288  -8
queue_remove_raw 180 164 -16
arm_smmu_gerror_handler  732 716 -16
arm_smmu_device_probe   43124284 -28
arm_smmu_cmdq_issue_cmdlist 18921852 -40
Total: Before=20135, After=20019, chg -0.58%
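The effect of storing the element count directly can be sketched as a standalone model; names mirror the patch below, but this is illustrative code, not the driver:

```c
/*
 * Model of the rewritten queue helpers, assuming max_n is a power of
 * two (i.e. 1 << max_n_shift). Not the driver code itself.
 */
#include <assert.h>
#include <stdint.h>

struct demo_llq {
	uint32_t max_n;         /* number of queue elements, power of two */
	uint32_t prod, cons;    /* indices with the wrap bit at max_n */
};

#define Q_IDX(llq, p)	((p) & ((llq)->max_n - 1))
#define Q_WRP(llq, p)	((p) & (llq)->max_n)

static uint32_t demo_space(struct demo_llq *q)
{
	uint32_t prod = Q_IDX(q, q->prod), cons = Q_IDX(q, q->cons);

	if (Q_WRP(q, q->prod) == Q_WRP(q, q->cons))
		return q->max_n - (prod - cons);  /* same lap */
	return cons - prod;                       /* producer wrapped */
}
```

Note how every former `1 << q->max_n_shift` becomes a plain load of `max_n`, which is where the code-size savings in the bloat-o-meter table come from.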

Signed-off-by: Robin Murphy 
---
 drivers/iommu/arm-smmu-v3.c | 46 ++---
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index f578677a5c41..407cb9451a7a 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -185,8 +185,8 @@
 #define ARM_SMMU_MEMATTR_DEVICE_nGnRE  0x1
 #define ARM_SMMU_MEMATTR_OIWB  0xf
 
-#define Q_IDX(llq, p)  ((p) & ((1 << (llq)->max_n_shift) - 1))
-#define Q_WRP(llq, p)  ((p) & (1 << (llq)->max_n_shift))
+#define Q_IDX(llq, p)  ((p) & ((llq)->max_n - 1))
+#define Q_WRP(llq, p)  ((p) & (llq)->max_n)
 #define Q_OVERFLOW_FLAG(1U << 31)
 #define Q_OVF(p)   ((p) & Q_OVERFLOW_FLAG)
 #define Q_ENT(q, p)((q)->base +\
@@ -531,7 +531,7 @@ struct arm_smmu_ll_queue {
} atomic;
u8  __pad[SMP_CACHE_BYTES];
} cacheline_aligned_in_smp;
-   u32 max_n_shift;
+   u32 max_n;
 };
 
 struct arm_smmu_queue {
@@ -771,7 +771,7 @@ static bool queue_has_space(struct arm_smmu_ll_queue *q, 
u32 n)
cons = Q_IDX(q, q->cons);
 
if (Q_WRP(q, q->prod) == Q_WRP(q, q->cons))
-   space = (1 << q->max_n_shift) - (prod - cons);
+   space = q->max_n - (prod - cons);
else
space = cons - prod;
 
@@ -1164,8 +1164,8 @@ static void __arm_smmu_cmdq_poll_set_valid_map(struct 
arm_smmu_cmdq *cmdq,
 {
u32 swidx, sbidx, ewidx, ebidx;
struct arm_smmu_ll_queue llq = {
-   .max_n_shift

[kbuild] Re: [PATCH v7 33/36] rapidio: fix common struct sg_table related issues

2020-06-23 Thread Dan Carpenter
Hi Marek,

url:
https://github.com/0day-ci/linux/commits/Marek-Szyprowski/DRM-fix-struct-sg_table-nents-vs-orig_nents-misuse/20200619-184302
base:ce2cc8efd7a40cbd17841add878cb691d0ce0bba
config: i386-randconfig-m021-20200623 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-13) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 
Reported-by: Dan Carpenter 

smatch warnings:
drivers/rapidio/devices/rio_mport_cdev.c:939 rio_dma_transfer() error: 
uninitialized symbol 'nents'.

# 
https://github.com/0day-ci/linux/commit/c99597eab54307f46248273962da0c23a9a88b76
git remote add linux-review https://github.com/0day-ci/linux
git remote update linux-review
git checkout c99597eab54307f46248273962da0c23a9a88b76
vim +/nents +939 drivers/rapidio/devices/rio_mport_cdev.c

e8de370188d098 Alexandre Bounine  2016-03-22  804  static int
4e1016dac1ccce Alexandre Bounine  2016-05-05  805  rio_dma_transfer(struct file 
*filp, u32 transfer_mode,
e8de370188d098 Alexandre Bounine  2016-03-22  806enum 
rio_transfer_sync sync, enum dma_data_direction dir,
e8de370188d098 Alexandre Bounine  2016-03-22  807struct 
rio_transfer_io *xfer)
e8de370188d098 Alexandre Bounine  2016-03-22  808  {
e8de370188d098 Alexandre Bounine  2016-03-22  809   struct mport_cdev_priv 
*priv = filp->private_data;
e8de370188d098 Alexandre Bounine  2016-03-22  810   unsigned long nr_pages 
= 0;
e8de370188d098 Alexandre Bounine  2016-03-22  811   struct page **page_list 
= NULL;
e8de370188d098 Alexandre Bounine  2016-03-22  812   struct mport_dma_req 
*req;
e8de370188d098 Alexandre Bounine  2016-03-22  813   struct mport_dev *md = 
priv->md;
e8de370188d098 Alexandre Bounine  2016-03-22  814   struct dma_chan *chan;
67446283d89467 John Hubbard   2020-06-04  815   int ret;
e8de370188d098 Alexandre Bounine  2016-03-22  816   int nents;
^

e8de370188d098 Alexandre Bounine  2016-03-22  817  
e8de370188d098 Alexandre Bounine  2016-03-22  818   if (xfer->length == 0)
e8de370188d098 Alexandre Bounine  2016-03-22  819   return -EINVAL;
e8de370188d098 Alexandre Bounine  2016-03-22  820   req = 
kzalloc(sizeof(*req), GFP_KERNEL);
e8de370188d098 Alexandre Bounine  2016-03-22  821   if (!req)
e8de370188d098 Alexandre Bounine  2016-03-22  822   return -ENOMEM;
e8de370188d098 Alexandre Bounine  2016-03-22  823  
e8de370188d098 Alexandre Bounine  2016-03-22  824   ret = 
get_dma_channel(priv);
e8de370188d098 Alexandre Bounine  2016-03-22  825   if (ret) {
e8de370188d098 Alexandre Bounine  2016-03-22  826   kfree(req);
e8de370188d098 Alexandre Bounine  2016-03-22  827   return ret;
e8de370188d098 Alexandre Bounine  2016-03-22  828   }
c5157b76869ba9 Ioan Nicu  2018-04-20  829   chan = priv->dmach;
c5157b76869ba9 Ioan Nicu  2018-04-20  830  
c5157b76869ba9 Ioan Nicu  2018-04-20  831   kref_init(&req->refcount);
c5157b76869ba9 Ioan Nicu  2018-04-20  832   init_completion(&req->req_comp);
c5157b76869ba9 Ioan Nicu  2018-04-20  833   req->dir = dir;
c5157b76869ba9 Ioan Nicu  2018-04-20  834   req->filp = filp;
c5157b76869ba9 Ioan Nicu  2018-04-20  835   req->priv = priv;
c5157b76869ba9 Ioan Nicu  2018-04-20  836   req->dmach = chan;
c5157b76869ba9 Ioan Nicu  2018-04-20  837   req->sync = sync;
e8de370188d098 Alexandre Bounine  2016-03-22  838  
e8de370188d098 Alexandre Bounine  2016-03-22  839   /*
e8de370188d098 Alexandre Bounine  2016-03-22  840* If parameter 
loc_addr != NULL, we are transferring data from/to
e8de370188d098 Alexandre Bounine  2016-03-22  841* data buffer 
allocated in user-space: lock in memory user-space
e8de370188d098 Alexandre Bounine  2016-03-22  842* buffer pages and 
build an SG table for DMA transfer request
e8de370188d098 Alexandre Bounine  2016-03-22  843*
e8de370188d098 Alexandre Bounine  2016-03-22  844* Otherwise (loc_addr 
== NULL) contiguous kernel-space buffer is
e8de370188d098 Alexandre Bounine  2016-03-22  845* used for DMA data 
transfers: build single entry SG table using
e8de370188d098 Alexandre Bounine  2016-03-22  846* offset within the 
internal buffer specified by handle parameter.
e8de370188d098 Alexandre Bounine  2016-03-22  847*/
e8de370188d098 Alexandre Bounine  2016-03-22  848   if (xfer->loc_addr) {
c4860ad6056483 Tvrtko Ursulin 2017-07-31  849   unsigned int 
offset;
e8de370188d098 Alexandre Bounine  2016-03-22  850   long pinned;
e8de370188d098 Alexandre Bounine  2016-03-22  851  
c4860ad6056483 Tvrtko Ursulin 2017-07-31  852   offset = 
lower_32_bits(offset_in_page(xfer->loc_addr));
e8de370188d098 Alexandre Bounine 

[PATCH 2/7] iommu/vt-d: Remove global page support in devTLB flush

2020-06-23 Thread Jacob Pan
Global pages support is removed from VT-d spec 3.0 for dev TLB
invalidation. This patch is to remove the bits for vSVA. Similar change
already made for the native SVA. See the link below.

Link: https://lkml.org/lkml/2019/8/26/651
Signed-off-by: Jacob Pan 
---
 drivers/iommu/intel/dmar.c  | 4 +---
 drivers/iommu/intel/iommu.c | 4 ++--
 include/linux/intel-iommu.h | 3 +--
 3 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index cc46dff98fa0..d9f973fa1190 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -1437,8 +1437,7 @@ void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, 
u32 pasid, u64 addr,
 
 /* PASID-based device IOTLB Invalidate */
 void qi_flush_dev_iotlb_pasid(struct intel_iommu *iommu, u16 sid, u16 pfsid,
- u32 pasid,  u16 qdep, u64 addr,
- unsigned int size_order, u64 granu)
+ u32 pasid,  u16 qdep, u64 addr, unsigned int size_order)
 {
unsigned long mask = 1UL << (VTD_PAGE_SHIFT + size_order - 1);
struct qi_desc desc = {.qw1 = 0, .qw2 = 0, .qw3 = 0};
@@ -1446,7 +1445,6 @@ void qi_flush_dev_iotlb_pasid(struct intel_iommu *iommu, 
u16 sid, u16 pfsid,
desc.qw0 = QI_DEV_EIOTLB_PASID(pasid) | QI_DEV_EIOTLB_SID(sid) |
QI_DEV_EIOTLB_QDEP(qdep) | QI_DEIOTLB_TYPE |
QI_DEV_IOTLB_PFSID(pfsid);
-   desc.qw1 = QI_DEV_EIOTLB_GLOB(granu);
 
/*
 * If S bit is 0, we only flush a single page. If S bit is set,
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 9129663a7406..96340da57075 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5466,7 +5466,7 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, 
struct device *dev,
info->pfsid, pasid,
info->ats_qdep,
inv_info->addr_info.addr,
-   size, granu);
+   size);
break;
case IOMMU_CACHE_INV_TYPE_DEV_IOTLB:
if (info->ats_enabled)
@@ -5474,7 +5474,7 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, 
struct device *dev,
info->pfsid, pasid,
info->ats_qdep,
inv_info->addr_info.addr,
-   size, granu);
+   size);
else
pr_warn_ratelimited("Passdown device IOTLB flush w/o ATS!\n");
break;
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 729386ca8122..9a6614880773 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -380,7 +380,6 @@ enum {
 
 #define QI_DEV_EIOTLB_ADDR(a)  ((u64)(a) & VTD_PAGE_MASK)
 #define QI_DEV_EIOTLB_SIZE (((u64)1) << 11)
-#define QI_DEV_EIOTLB_GLOB(g)  ((u64)(g) & 0x1)
 #define QI_DEV_EIOTLB_PASID(p) ((u64)((p) & 0xf) << 32)
 #define QI_DEV_EIOTLB_SID(sid) ((u64)((sid) & 0x) << 16)
 #define QI_DEV_EIOTLB_QDEP(qd) ((u64)((qd) & 0x1f) << 4)
@@ -704,7 +703,7 @@ void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, 
u32 pasid, u64 addr,
 
 void qi_flush_dev_iotlb_pasid(struct intel_iommu *iommu, u16 sid, u16 pfsid,
  u32 pasid, u16 qdep, u64 addr,
- unsigned int size_order, u64 granu);
+ unsigned int size_order);
 void qi_flush_pasid_cache(struct intel_iommu *iommu, u16 did, u64 granu,
  int pasid);
 
-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 5/7] iommu/vt-d: Fix devTLB flush for vSVA

2020-06-23 Thread Jacob Pan
From: Liu Yi L 

For guest SVA usage, in order to optimize for fewer VM exits, a guest
request for IOTLB flush also covers the device TLB.

On the host side, the IOMMU driver performs IOTLB and implicit devTLB
invalidation. When PASID-selective granularity is requested by the
guest, we need to derive the equivalent address range for the devTLB
instead of using the address information in the UAPI data, because,
unlike IOTLB flush, devTLB flush does not support PASID-selective
granularity. That is, we need to set the following in the PASID-based
devTLB invalidation descriptor:
- entire 64 bit range in address ~(0x1 << 63)
- S bit = 1 (VT-d CH 6.5.2.6).

Without this fix, the device TLB flush range is not set properly for
PASID-selective granularity. This patch also merges the devTLB flush
code for the implicit and explicit cases.
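The range derivation described above can be modeled in isolation. VTD_PAGE_SHIFT is assumed to be 12 as in the kernel; the function name and structure are illustrative:

```c
/*
 * Sketch of deriving the devTLB flush range when the guest asked for
 * PASID-selective invalidation: there is no PASID-selective devTLB
 * flush, so cover the entire 64-bit address space instead.
 */
#include <assert.h>
#include <stdint.h>

#define DEMO_VTD_PAGE_SHIFT 12

static uint64_t demo_devtlb_range(int pasid_selective, uint64_t addr,
				  unsigned int *size_order)
{
	if (pasid_selective) {
		/* whole 64-bit range: 2^(64-12) pages, address ignored */
		*size_order = 64 - DEMO_VTD_PAGE_SHIFT;
		return 0;
	}
	/* address-selective: use the guest-provided address as-is */
	return addr;
}
```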

Fixes: 6ee1b77ba3ac ("iommu/vt-d: Add svm/sva invalidate function")
Signed-off-by: Liu Yi L 
Signed-off-by: Jacob Pan 
---
 drivers/iommu/intel/iommu.c | 25 +
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 96340da57075..5ea5732d5ec4 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5408,7 +5408,7 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, 
struct device *dev,
sid = PCI_DEVID(bus, devfn);
 
/* Size is only valid in address selective invalidation */
-   if (inv_info->granularity != IOMMU_INV_GRANU_PASID)
+   if (inv_info->granularity == IOMMU_INV_GRANU_ADDR)
size = to_vtd_size(inv_info->addr_info.granule_size,
   inv_info->addr_info.nb_granules);
 
@@ -5417,6 +5417,7 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, 
struct device *dev,
 IOMMU_CACHE_INV_TYPE_NR) {
int granu = 0;
u64 pasid = 0;
+   u64 addr = 0;
 
granu = to_vtd_granularity(cache_type, inv_info->granularity);
if (granu == -EINVAL) {
@@ -5456,19 +5457,27 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, 
struct device *dev,
(granu == QI_GRAN_NONG_PASID) ? -1 : 1 << size,
inv_info->addr_info.flags & IOMMU_INV_ADDR_FLAGS_LEAF);
 
+   if (!info->ats_enabled)
+   break;
/*
 * Always flush device IOTLB if ATS is enabled. vIOMMU
 * in the guest may assume IOTLB flush is inclusive,
 * which is more efficient.
 */
-   if (info->ats_enabled)
-   qi_flush_dev_iotlb_pasid(iommu, sid,
-   info->pfsid, pasid,
-   info->ats_qdep,
-   inv_info->addr_info.addr,
-   size);
-   break;
+   fallthrough;
case IOMMU_CACHE_INV_TYPE_DEV_IOTLB:
+   /*
+* There is no PASID selective flush for device TLB, so
+* the equivalent of that is we set the size to be the
+* entire range of 64 bit. User only provides PASID info
+* without address info. So we set addr to 0.
+*/
+   if (inv_info->granularity == IOMMU_INV_GRANU_PASID) {
+   size = 64 - VTD_PAGE_SHIFT;
+   addr = 0;
+   } else if (inv_info->granularity == IOMMU_INV_GRANU_ADDR)
+   addr = inv_info->addr_info.addr;
+
if (info->ats_enabled)
qi_flush_dev_iotlb_pasid(iommu, sid,
info->pfsid, pasid,
-- 
2.7.4



[PATCH 1/7] iommu/vt-d: Enforce PASID devTLB field mask

2020-06-23 Thread Jacob Pan
From: Liu Yi L 

Set proper masks to avoid invalid input spillover to reserved bits.

Signed-off-by: Liu Yi L 
Signed-off-by: Jacob Pan 
---
 include/linux/intel-iommu.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 4100bd224f5c..729386ca8122 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -380,8 +380,8 @@ enum {
 
 #define QI_DEV_EIOTLB_ADDR(a)  ((u64)(a) & VTD_PAGE_MASK)
 #define QI_DEV_EIOTLB_SIZE (((u64)1) << 11)
-#define QI_DEV_EIOTLB_GLOB(g)  ((u64)g)
-#define QI_DEV_EIOTLB_PASID(p) (((u64)p) << 32)
+#define QI_DEV_EIOTLB_GLOB(g)  ((u64)(g) & 0x1)
+#define QI_DEV_EIOTLB_PASID(p) ((u64)((p) & 0xf) << 32)
 #define QI_DEV_EIOTLB_SID(sid) ((u64)((sid) & 0x) << 16)
 #define QI_DEV_EIOTLB_QDEP(qd) ((u64)((qd) & 0x1f) << 4)
 #define QI_DEV_EIOTLB_PFSID(pfsid) (((u64)(pfsid & 0xf) << 12) | \
-- 
2.7.4



[PATCH 0/7] iommu/vt-d: Misc tweaks and fixes for vSVA

2020-06-23 Thread Jacob Pan
Hi Baolu and all,

This a series to address some of the issues we found in vSVA support.
Most of the patches deal with exception handling, we also removed some bits
that are not currently supported.

Many thanks to Kevin Tian's review.

Jacob & Yi

Jacob Pan (4):
  iommu/vt-d: Remove global page support in devTLB flush
  iommu/vt-d: Fix PASID devTLB invalidation
  iommu/vt-d: Warn on out-of-range invalidation address
  iommu/vt-d: Disable multiple GPASID-dev bind

Liu Yi L (3):
  iommu/vt-d: Enforce PASID devTLB field mask
  iommu/vt-d: Handle non-page aligned address
  iommu/vt-d: Fix devTLB flush for vSVA

 drivers/iommu/intel/dmar.c  | 23 ++-
 drivers/iommu/intel/iommu.c | 34 +-
 drivers/iommu/intel/pasid.c | 11 ++-
 drivers/iommu/intel/svm.c   | 22 +-
 include/linux/intel-iommu.h |  5 ++---
 5 files changed, 60 insertions(+), 35 deletions(-)

-- 
2.7.4



[PATCH 4/7] iommu/vt-d: Handle non-page aligned address

2020-06-23 Thread Jacob Pan
From: Liu Yi L 

Address information for device TLB invalidation comes from userspace
when a device is directly assigned to a guest with vIOMMU support.
VT-d requires the address to be page aligned. This patch checks and
enforces page alignment; otherwise reserved bits could be set in the
invalidation descriptor, and an unrecoverable fault would be reported
due to the non-zero value in the reserved bits.
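The size-encoded address used here (bits below the size bit forced to 1, the size bit itself forced to 0, per the VT-d descriptor format) can be modeled standalone. This is a sketch of the encoding rule only, with an illustrative name, a page shift of 12, and the assumption that size_order >= 1:

```c
/*
 * Sketch of the VT-d size-encoded address for flushing 2^size_order
 * pages: low bits below the size bit are set to 1, the size bit is 0.
 * Assumes size_order >= 1; not the driver code itself.
 */
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12ULL

static uint64_t demo_encode_addr(uint64_t addr, unsigned int size_order)
{
	uint64_t size_bit = 1ULL << (DEMO_PAGE_SHIFT + size_order - 1);
	uint64_t low_ones = size_bit - (1ULL << DEMO_PAGE_SHIFT);

	addr &= ~((1ULL << DEMO_PAGE_SHIFT) - 1);  /* page-align first */
	addr |= low_ones;   /* fill the bits below the size bit with 1s */
	addr &= ~size_bit;  /* the size bit itself must be 0 */
	return addr;
}
```

Feeding a non-page-aligned address into this encoding is exactly how reserved bits end up set, which is why the patch warns on misaligned input before encoding.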

Signed-off-by: Liu Yi L 
Signed-off-by: Jacob Pan 
---
 drivers/iommu/intel/dmar.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index d9f973fa1190..53f4e5003620 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -1455,9 +1455,24 @@ void qi_flush_dev_iotlb_pasid(struct intel_iommu *iommu, 
u16 sid, u16 pfsid,
 * Max Invs Pending (MIP) is set to 0 for now until we have DIT in
 * ECAP.
 */
-   desc.qw1 |= addr & ~mask;
-   if (size_order)
+   if (addr & ~VTD_PAGE_MASK)
+   pr_warn_ratelimited("Invalidate non-page aligned address %llx\n", addr);
+
+   if (size_order) {
+   /* Take page address */
+   desc.qw1 |= QI_DEV_EIOTLB_ADDR(addr);
+   /*
+* Existing 0s in address below size_order may be the least
+* significant bit, we must set them to 1s to avoid having
+* smaller size than desired.
+*/
+   desc.qw1 |= GENMASK_ULL(size_order + VTD_PAGE_SHIFT,
+   VTD_PAGE_SHIFT);
+   /* Clear size_order bit to indicate size */
+   desc.qw1 &= ~mask;
+   /* Set the S bit to indicate flushing more than 1 page */
desc.qw1 |= QI_DEV_EIOTLB_SIZE;
+   }
 
qi_submit_sync(iommu, &desc, 1, 0);
 }
-- 
2.7.4



[PATCH 3/7] iommu/vt-d: Fix PASID devTLB invalidation

2020-06-23 Thread Jacob Pan
DevTLB flush can be used for DMA requests both with and without PASIDs.
Requests without a PASID use PASID#0 (RID2PASID), while SVA usage
employs non-zero PASIDs.

This patch adds a check of the PASID value such that devTLB flush with
PASID is used for the SVA case. This is more efficient because multiple
PASIDs can be used by a single device; when tearing down a PASID entry
we shall flush only the devTLB specific to that PASID.

Fixes: 6f7db75e1c46 ("iommu/vt-d: Add second level page table")
Signed-off-by: Jacob Pan 
---
 drivers/iommu/intel/pasid.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index c81f0f17c6ba..3991a24539a1 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -486,7 +486,16 @@ devtlb_invalidation_with_pasid(struct intel_iommu *iommu,
qdep = info->ats_qdep;
pfsid = info->pfsid;
 
-   qi_flush_dev_iotlb(iommu, sid, pfsid, qdep, 0, 64 - VTD_PAGE_SHIFT);
+   /*
+* When PASID 0 is used, it indicates RID2PASID(DMA request w/o PASID),
+* devTLB flush w/o PASID should be used. For non-zero PASID under
+* SVA usage, device could do DMA with multiple PASIDs. It is more
+* efficient to flush devTLB specific to the PASID.
+*/
+   if (pasid)
+   qi_flush_dev_iotlb_pasid(iommu, sid, pfsid, pasid, qdep, 0,
+64 - VTD_PAGE_SHIFT);
+   else
+   qi_flush_dev_iotlb(iommu, sid, pfsid, qdep, 0, 64 - VTD_PAGE_SHIFT);
 }
 
 void intel_pasid_tear_down_entry(struct intel_iommu *iommu, struct device *dev,
-- 
2.7.4



[PATCH 6/7] iommu/vt-d: Warn on out-of-range invalidation address

2020-06-23 Thread Jacob Pan
For guest-requested IOTLB invalidation, the address and mask are
provided as part of the invalidation data. VT-d hardware silently
ignores any address bits below the mask. Software shall also allow such
a case, but warn if the address does not align with the mask. This
patch relaxes the fault handling from an error to a warning and
proceeds with the invalidation request using the given mask.
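The alignment check itself can be modeled as follows; this is only a sketch with an illustrative name, again assuming a page shift of 12:

```c
/*
 * Sketch of the check above: hardware ignores address bits below the
 * invalidation mask, so a set bit there merits a warning, not -ERANGE.
 */
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12

/* Nonzero if addr has bits set that the size mask will ignore. */
static int demo_addr_misaligned(uint64_t addr, unsigned int size_order)
{
	uint64_t ignored = (1ULL << (DEMO_PAGE_SHIFT + size_order)) - 1;

	return size_order && (addr & ignored) != 0;
}
```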

Signed-off-by: Jacob Pan 
---
 drivers/iommu/intel/iommu.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 5ea5732d5ec4..50fc62413a35 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5439,13 +5439,12 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, 
struct device *dev,
 
switch (BIT(cache_type)) {
case IOMMU_CACHE_INV_TYPE_IOTLB:
+   /* HW will ignore LSB bits based on address mask */
if (inv_info->granularity == IOMMU_INV_GRANU_ADDR &&
size &&
(inv_info->addr_info.addr & ((BIT(VTD_PAGE_SHIFT + 
size)) - 1))) {
-   pr_err_ratelimited("Address out of range, 
0x%llx, size order %llu\n",
-  inv_info->addr_info.addr, 
size);
-   ret = -ERANGE;
-   goto out_unlock;
+   WARN_ONCE(1, "Address out of range, 0x%llx, 
size order %llu\n",
+ inv_info->addr_info.addr, size);
}
 
/*
-- 
2.7.4



[PATCH 7/7] iommu/vt-d: Disable multiple GPASID-dev bind

2020-06-23 Thread Jacob Pan
For the unlikely use case where multiple aux domains from the same pdev
are attached to a single guest and then bound to a single process
(thus the same PASID) within that guest, we cannot easily support this
by refcounting the number of users: there is only one second-level (SL)
page table per PASID, while multiple aux domains would imply multiple
SL page tables for the same PASID.

Extra unbinding of a guest PASID can happen due to a race between the
normal and exception paths. Termination of one aux domain may affect
the others unless we actively track and switch aux domains to ensure
the validity of the SL page tables and TLB state in the shared PASID
entry.

Support for sharing second-level PGDs across domains could reduce the
complexity, but this is not available due to the limitations of the
VFIO container architecture. We can revisit this decision once PGD
sharing is available.

Overall, the complexity and potential for glitches do not justify
supporting this unlikely use case, so this patch removes it.

Cc: Kevin Tian 
Cc: Lu Baolu 
Signed-off-by: Liu Yi L 
Signed-off-by: Jacob Pan 
---
 drivers/iommu/intel/svm.c | 22 +-
 1 file changed, 9 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 6c87c807a0ab..d386853121a2 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -277,20 +277,16 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, 
struct device *dev,
goto out;
}
 
+   /*
+* Do not allow multiple bindings of the same device-PASID since
+* there is only one SL page tables per PASID. We may revisit
+* once sharing PGD across domains are supported.
+*/
for_each_svm_dev(sdev, svm, dev) {
-   /*
-* For devices with aux domains, we should allow
-* multiple bind calls with the same PASID and pdev.
-*/
-   if (iommu_dev_feature_enabled(dev,
- IOMMU_DEV_FEAT_AUX)) {
-   sdev->users++;
-   } else {
-   dev_warn_ratelimited(dev,
-"Already bound with PASID %u\n",
-svm->pasid);
-   ret = -EBUSY;
-   }
+   dev_warn_ratelimited(dev,
+"Already bound with PASID %u\n",
+svm->pasid);
+   ret = -EBUSY;
goto out;
}
} else {
-- 
2.7.4



Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

2020-06-23 Thread Bjorn Helgaas
On Fri, Jun 19, 2020 at 10:26:54AM +0800, Zhangfei Gao wrote:
> Have studied _DSM method, two issues we met comparing using quirk.
> 
> 1. We need to change the definition of either pci_host_bridge or pci_dev,
> e.g. by adding a member can_stall,
> while the PCI system does not know about stall at present.
> 
> a, PCI devices do not have a UUID: the UUID needs to be described in the
> DSDT, while PCI devices are not defined in the DSDT,
>     so we have to use the host bridge.

PCI devices *can* be described in the DSDT.  IIUC these particular
devices are hardwired (not plug-in cards), so platform firmware can
know about them and could describe them in the DSDT.

> b, Parsing the DSDT is done in the PCI subsystem.
> Like drivers/acpi/pci_root.c:
>    obj = acpi_evaluate_dsm(ACPI_HANDLE(bus->bridge), &pci_acpi_dsm_guid, 1,
> IGNORE_PCI_BOOT_CONFIG_DSM, NULL);
> 
> After parsing the _DSM in PCI, we need to record this info.
> Currently, the can_stall info is recorded in iommu_fwspec,
> which is allocated in iommu_fwspec_init, called from iort_iommu_configure
> for UEFI.

You can look for a _DSM wherever it is convenient for you.  It could
be in an AMBA shim layer.

> 2. The guest kernel also needs to support SVA.
> Using a quirk, the guest can boot with SVA enabled, since the quirk is
> self-contained in the kernel.
> If using _DSM, a specific UEFI or DTB has to be provided;
> currently we can use QEMU_EFI.fd from "apt install qemu-efi".

I don't quite understand what this means, but as I mentioned before, a
quirk for a *limited* number of devices is OK, as long as there is a
plan that removes the need for a quirk for future devices.

E.g., if the next platform version ships with a DTB or firmware with a
_DSM or other mechanism that enables the kernel to discover this
information without a kernel change, it's fine to use a quirk to cover
the early platform.

The principles are:

  - I don't want to have to update a quirk for every new Device ID
that needs this.

  - I don't really want to have to manage non-PCI information in the
struct pci_dev.  If this is AMBA- or IOMMU-related, it should be
stored in a structure related to AMBA or the IOMMU.


Re: [PATCH 4/4] iommu/arm-smmu-v3: Remove cmpxchg() in arm_smmu_cmdq_issue_cmdlist()

2020-06-23 Thread Rikard Falkeborn
On Tue, 23 Jun 2020 at 12:21, John Garry wrote:

> On 23/06/2020 10:35, Rikard Falkeborn wrote:
> >
> > I'd say that GENMASK_INPUT_CHECK() should be able to handle a l=0 and
> > h=unsigned value, so I doubt this warn.
> >
> > Using GENMASK((int)cmdq->q.llq.max_n_shift, 0) resolves it, but it
> > looks
> > like GENMASK_INPUT_CHECK() could be improved.
> >
> >
> > Indeed it could, it is fixed in -next.
>
> ok, thanks for the pointer, but I still see this on today's -next with
> this patch:
>
> make W=1 drivers/iommu/arm-smmu-v3.o
>
>
Oh, OK, thanks for reporting. I guess different gcc versions have different
behaviour. I guess we'll have to change the comparison to
(!((h) == (l) || (h) > (l))) instead (not sure I got all the parentheses and
logic correct, but you get the idea).

I'm travelling and won't have time to look at this until next week though.

Rikard
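For reference, the suggested comparison can be exercised standalone; the macro name below is illustrative and the BUILD_BUG_ON_ZERO plumbing is omitted:

```c
/*
 * Sketch of the rework suggested above: express "l > h" as
 * "not (h == l or h > l)" so gcc does not emit the -Wtype-limits
 * "unsigned < 0 is always false" warning when l is 0 and h is an
 * unsigned expression. DEMO_BAD_RANGE is a hypothetical name.
 */
#include <assert.h>

#define DEMO_BAD_RANGE(h, l)	(!((h) == (l) || (h) > (l)))
```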



In file included from ./include/linux/bits.h:23:0,
>  from ./include/linux/ioport.h:15,
>  from ./include/linux/acpi.h:12,
>  from drivers/iommu/arm-smmu-v3.c:12:
> drivers/iommu/arm-smmu-v3.c: In function ‘arm_smmu_cmdq_issue_cmdlist’:
> ./include/linux/bits.h:27:7: warning: comparison of unsigned expression
> < 0 is always false [-Wtype-limits]
>(l) > (h), 0)))
>^
> ./include/linux/build_bug.h:16:62: note: in definition of macro
> ‘BUILD_BUG_ON_ZERO’
> #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
>   ^
> ./include/linux/bits.h:40:3: note: in expansion of macro
> ‘GENMASK_INPUT_CHECK’
>   (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
>^~~
> drivers/iommu/arm-smmu-v3.c:1404:18: note: in expansion of macro ‘GENMASK’
>   u32 prod_mask = GENMASK(cmdq->q.llq.max_n_shift, 0);
>
> That's gcc 7.5.0 .
>
> Cheers,
> John
>
>

Re: [PATCH v2] swiotlb: Adjust SWIOTBL bounce buffer size for SEV guests.

2020-06-23 Thread Konrad Rzeszutek Wilk
On Mon, Apr 27, 2020 at 06:53:18PM +, Ashish Kalra wrote:
> Hello Konrad,
> 
> On Mon, Mar 30, 2020 at 10:25:51PM +, Ashish Kalra wrote:
> > Hello Konrad,
> > 
> > On Tue, Mar 03, 2020 at 12:03:53PM -0500, Konrad Rzeszutek Wilk wrote:
> > > On Tue, Feb 04, 2020 at 07:35:00PM +, Ashish Kalra wrote:
> > > > Hello Konrad,
> > > > 
> > > > Looking fwd. to your feedback regarding support of other memory
> > > > encryption architectures such as Power, S390, etc.
> > > > 
> > > > Thanks,
> > > > Ashish
> > > > 
> > > > On Fri, Jan 24, 2020 at 11:00:08PM +, Ashish Kalra wrote:
> > > > > On Tue, Jan 21, 2020 at 03:54:03PM -0500, Konrad Rzeszutek Wilk wrote:
> > > > > > > 
> > > > > > > Additional memory calculations based on # of PCI devices and
> > > > > > > their memory ranges will make it more complicated with so
> > > > > > > many other permutations and combinations to explore, it is
> > > > > > > essential to keep this patch as simple as possible by 
> > > > > > > adjusting the bounce buffer size simply by determining it
> > > > > > > from the amount of provisioned guest memory.
> > > > > >> 
> > > > > >> Please rework the patch to:
> > > > > >> 
> > > > > >>  - Use a log solution instead of the multiplication.
> > > > > >>Feel free to cap it at a sensible value.
> > > > > 
> > > > > Ok.
> > > > > 
> > > > > >> 
> > > > > >>  - Also the code depends on SWIOTLB calling in to the
> > > > > >>adjust_swiotlb_default_size which looks wrong.
> > > > > >> 
> > > > > >>You should not adjust io_tlb_nslabs from 
> > > > > >> swiotlb_size_or_default.
> > > > > 
> > > > > >>That function's purpose is to report a value.
> > > > > >> 
> > > > > >>  - Make io_tlb_nslabs be visible outside of the SWIOTLB code.
> > > > > >> 
> > > > > >>  - Can you utilize the IOMMU_INIT APIs and have your own detect 
> > > > > >> which would
> > > > > >>modify the io_tlb_nslabs (and set swiotbl=1?).
> > > > > 
> > > > > This seems to be a nice option, but then IOMMU_INIT APIs are
> > > > > x86-specific and this swiotlb buffer size adjustment is also needed
> > > > > for other memory encryption architectures like Power, S390, etc.
> > > 
> > > Oh dear. That I hadn't considered.
> > > > > 
> > > > > >> 
> > > > > >>Actually you seem to be piggybacking on pci_swiotlb_detect_4gb 
> > > > > >> - so
> > > > > >>perhaps add in this code ? Albeit it really should be in it's 
> > > > > >> own
> > > > > >>file, not in arch/x86/kernel/pci-swiotlb.c
> > > > > 
> > > > > Actually, we piggyback on pci_swiotlb_detect_override which sets
> > > > > swiotlb=1 as x86_64_start_kernel() and invocation of sme_early_init()
> > > > > forces swiotlb on, but again this is all x86 architecture specific.
> > > 
> > > Then it looks like the best bet is to do it from within swiotlb_init?
> > > We really can't do it from swiotlb_size_or_default - that function
> > > should just return a value and nothing else.
> > > 
> > 
> > Actually, we need to do it in swiotlb_size_or_default() as this gets called by
> > reserve_crashkernel_low() in arch/x86/kernel/setup.c and used to
> > reserve low crashkernel memory. If we adjust swiotlb size later in
> > swiotlb_init() which gets called later than reserve_crashkernel_low(),
> > then any swiotlb size changes/expansion will conflict/overlap with the
> > low memory reserved for crashkernel.
> > 
> and will also potentially cause SWIOTLB buffer allocation failures.
> 
> Do you have any feedback, comments on the above ?


The init boot chain looks like this:

initmem_init
pci_iommu_alloc
-> pci_swiotlb_detect_4gb
-> swiotlb_init

reserve_crashkernel
reserve_crashkernel_low
-> swiotlb_size_or_default
..


(rootfs code):
pci_iommu_init
-> a bunch of the other IOMMU late_init code gets called..
->  pci_swiotlb_late_init 

I have to say I am lost as to how your patch fixes "If we adjust swiotlb
size later .. then any swiotlb size .. will overlap with the low memory
reserved for crashkernel"?

Or are you saying that 'reserve_crashkernel_low' is the _culprit_ and it
is the one changing the size? And hence having it modify the swiotlb size
will fix this problem? Aka _before_ all the other IOMMU code gets its
hands on it?

If so, why not create an
IOMMU_INIT(crashkernel_adjust_swiotlb,pci_swiotlb_detect_override,
NULL, NULL);

And crashkernel_adjust_swiotlb would change the size of the swiotlb
buffer if conditions are found to require it.

You may also want to put a #define DEBUG in arch/x86/kernel/pci-iommu_table.c
to check whether the tree structure of IOMMU entries is correct.



But still I am lost - if, say, the AMD one does decide for some unknown
reason to expand the SWIOTLB, you are still stuck with the 'overlap with
the low memory reserved' problem.

Perhaps add a late_init that gets called as the last one to validate
this? And maybe if the swiotlb gets turned off you also take proper

Re: [PATCH v6 1/4] iommu/arm-smmu: add NVIDIA implementation for dual ARM MMU-500 usage

2020-06-23 Thread Thierry Reding
On Tue, Jun 23, 2020 at 12:16:55PM +0100, Robin Murphy wrote:
> On 2020-06-23 11:29, Thierry Reding wrote:
> [...]
> > > diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
> > > index c75b9d957b702..52c84c30f83e4 100644
> > > --- a/drivers/iommu/arm-smmu-impl.c
> > > +++ b/drivers/iommu/arm-smmu-impl.c
> > > @@ -160,6 +160,9 @@ struct arm_smmu_device *arm_smmu_impl_init(struct 
> > > arm_smmu_device *smmu)
> > >*/
> > >   switch (smmu->model) {
> > >   case ARM_MMU500:
> > > + if (of_device_is_compatible(smmu->dev->of_node,
> > > + "nvidia,tegra194-smmu-500"))
> > > + return nvidia_smmu_impl_init(smmu);
> > 
> > Should NVIDIA_TEGRA194_SMMU be a separate value for smmu->model,
> > perhaps? That way we avoid this somewhat odd check here.
> 
> No, this is simply in the wrong place. The design here is that we pick up
> anything related to the basic SMMU IP (model) first, then make any
> platform-specific integration checks. That way a platform-specific init
> function can see the model impl set and subclass it if necessary (although
> nobody's actually done that yet). The setup for Cavium is just a short-cut
> since their model is unique to their integration, so the lines get a bit
> blurred and there's little benefit to trying to separate it out.
> 
> In short, put this down below with the other of_device_is_compatible()
> checks.
> 
> > >   smmu->impl = &arm_mmu500_impl;
> > >   break;
> > >   case CAVIUM_SMMUV2:
> > > diff --git a/drivers/iommu/arm-smmu-nvidia.c 
> > > b/drivers/iommu/arm-smmu-nvidia.c
> > 
> > I wonder if it would be better to name this arm-smmu-tegra.c to make it
> > clearer that this is for a Tegra chip. We do have regular expressions in
> > MAINTAINERS that catch anything with "tegra" in it to make this easier.
> 
> There was a notion that these would be grouped by vendor, but if there's a
> strong preference for all NVIDIA-SoC-related stuff to be named "Tegra" then
> I'm not going to complain too much.

Maybe I was being overly cautious. I was just trying to avoid adding
something called nvidia-arm-smmu which might eventually turn out to be
ambiguous if there was ever a non-Tegra chip and the ARM SMMU
implementation was not compatible with the one instantiated on Tegra.

Note that I have no knowledge of such a chip being designed, so this may
never actually become an issue.

In either case, the compatible string already identifies this as Tegra-
specific, so we could always change the driver name later on if we have
to.

> > > new file mode 100644
> > > index 0..dafc293a45217
> > > --- /dev/null
> > > +++ b/drivers/iommu/arm-smmu-nvidia.c
> > > @@ -0,0 +1,161 @@
> > > +// SPDX-License-Identifier: GPL-2.0-only
> > > +// Nvidia ARM SMMU v2 implementation quirks
> > 
> > s/Nvidia/NVIDIA/
> > 
> > > +// Copyright (C) 2019 NVIDIA CORPORATION.  All rights reserved.
> > 
> > I suppose this should now also include 2020.
> > 
> > > +
> > > +#define pr_fmt(fmt) "nvidia-smmu: " fmt
> > 
> > Same here. Might be worth making this "tegra-smmu: " for consistency.
> 
> On the other hand, a log prefix that is literally the name of a completely
> unrelated driver seems more confusing to users than useful. Same for the
> function naming - the tegra_smmu_* namespace is already owned by that
> driver.

The ARM SMMU replaced the Tegra SMMU on Tegra186 and later, so both
drivers are never going to run concurrently. The only "problem" might be
that both drivers have symbols with the same prefix, but since they
don't export any of those symbols I don't see how that would become a
real issue.

But then again, the Tegra SMMU is also technically an NVIDIA SMMU, so
sticking with the current name might also be confusing.

Perhaps a good compromise would be to use a "tegra194{-,_}smmu" prefix
instead, which would make this fairly specific to just Tegra194 (and
compatible chips). That's a fairly common pattern we've been following
on Tegra, as you can for example see in drivers/gpio/gpio-tegra186.c,
drivers/dma/tegra210-adma.c, drivers/memory/tegra/tegra186-emc.c, etc.

Thierry



Re: [PATCH v6 3/4] iommu/arm-smmu: Add global/context fault implementation hooks

2020-06-23 Thread Thierry Reding
On Tue, Jun 23, 2020 at 12:30:16PM +0100, Robin Murphy wrote:
> On 2020-06-23 09:36, Thierry Reding wrote:
> [...]
> > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > > index 243bc4cb2705b..d720e1e191176 100644
> > > --- a/drivers/iommu/arm-smmu.c
> > > +++ b/drivers/iommu/arm-smmu.c
> > > @@ -673,6 +673,7 @@ static int arm_smmu_init_domain_context(struct 
> > > iommu_domain *domain,
> > >   enum io_pgtable_fmt fmt;
> > >   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> > > struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> > > + irqreturn_t (*context_fault)(int irq, void *dev);
> > >   mutex_lock(&smmu_domain->init_mutex);
> > >   if (smmu_domain->smmu)
> > > @@ -835,7 +836,9 @@ static int arm_smmu_init_domain_context(struct 
> > > iommu_domain *domain,
> > >* handler seeing a half-initialised domain state.
> > >*/
> > >   irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
> > > - ret = devm_request_irq(smmu->dev, irq, arm_smmu_context_fault,
> > > + context_fault = (smmu->impl && smmu->impl->context_fault) ?
> > > +  smmu->impl->context_fault : arm_smmu_context_fault;
> > 
> > A simpler way might have been to assign arm_smmu_context_fault to all
> > implementations. That way we wouldn't have to perform this check here
> > and instead just always using smmu->impl->context_fault.
> 
> But smmu->impl can still be NULL...
> 
> Everything in impl, including the presence of impl itself, is optional, so
> the notion of overriding a default with the same default doesn't really make
> much sense, and would go against the pattern everywhere else.

True. I had assumed that every implementation would set smmu->impl
anyway, in which case there'd be little reason to use these default
fallbacks since each implementation could simply directly refer to the
exact implementation that it wants.

Perhaps the above could be made a bit more palatable by using a standard
if/else rather than the ternary operator? That would also more closely
match the pattern elsewhere.

Thierry



Re: [PATCH v2] dma-remap: Align the size in dma_common_*_remap()

2020-06-23 Thread Christoph Hellwig
On Tue, Jun 23, 2020 at 02:07:55PM +0200, Eric Auger wrote:
> Running a guest with a virtio-iommu protecting virtio devices
> is broken since commit 515e5b6d90d4 ("dma-mapping: use vmap insted
> of reimplementing it"). Before the conversion, the size was
> page aligned in __get_vm_area_node(). Doing so fixes the
> regression.
> 
> Fixes: 515e5b6d90d4 ("dma-mapping: use vmap insted of reimplementing it")
> Signed-off-by: Eric Auger 

Thanks,

I've applied this locally, waiting for git.infradead.org to be back
up.


[PATCH v2] dma-remap: Align the size in dma_common_*_remap()

2020-06-23 Thread Eric Auger
Running a guest with a virtio-iommu protecting virtio devices
is broken since commit 515e5b6d90d4 ("dma-mapping: use vmap insted
of reimplementing it"). Before the conversion, the size was
page aligned in __get_vm_area_node(). Doing so fixes the
regression.

Fixes: 515e5b6d90d4 ("dma-mapping: use vmap insted of reimplementing it")
Signed-off-by: Eric Auger 

---

v1 -> v2:
- fix line over 80 characters
---
 kernel/dma/remap.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/dma/remap.c b/kernel/dma/remap.c
index e739a6eea6e7..78b23f089cf1 100644
--- a/kernel/dma/remap.c
+++ b/kernel/dma/remap.c
@@ -24,7 +24,8 @@ void *dma_common_pages_remap(struct page **pages, size_t size,
 {
void *vaddr;
 
-   vaddr = vmap(pages, size >> PAGE_SHIFT, VM_DMA_COHERENT, prot);
+   vaddr = vmap(pages, PAGE_ALIGN(size) >> PAGE_SHIFT,
+VM_DMA_COHERENT, prot);
if (vaddr)
find_vm_area(vaddr)->pages = pages;
return vaddr;
@@ -37,7 +38,7 @@ void *dma_common_pages_remap(struct page **pages, size_t size,
 void *dma_common_contiguous_remap(struct page *page, size_t size,
pgprot_t prot, const void *caller)
 {
-   int count = size >> PAGE_SHIFT;
+   int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
struct page **pages;
void *vaddr;
int i;
-- 
2.20.1



Re: [PATCH] dma-remap: Align the size in dma_common_*_remap()

2020-06-23 Thread Christoph Hellwig
On Mon, Jun 22, 2020 at 10:23:20PM +0200, Eric Auger wrote:
> Running a guest with a virtio-iommu protecting virtio devices
> is broken since commit 515e5b6d90d4 ("dma-mapping: use vmap insted
> of reimplementing it"). Before the conversion, the size was
> page aligned in __get_vm_area_node(). Doing so fixes the
> regression.
> 
> Fixes: 515e5b6d90d4 ("dma-mapping: use vmap insted of reimplementing it")
> Signed-off-by: Eric Auger 
> ---
>  kernel/dma/remap.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/dma/remap.c b/kernel/dma/remap.c
> index e739a6eea6e7..a3151a9b0c08 100644
> --- a/kernel/dma/remap.c
> +++ b/kernel/dma/remap.c
> @@ -24,7 +24,7 @@ void *dma_common_pages_remap(struct page **pages, size_t 
> size,
>  {
>   void *vaddr;
>  
> - vaddr = vmap(pages, size >> PAGE_SHIFT, VM_DMA_COHERENT, prot);
> + vaddr = vmap(pages, PAGE_ALIGN(size) >> PAGE_SHIFT, VM_DMA_COHERENT, 
> prot);

Please respin without going over 80 characters.


Re: [PATCH v6 3/4] iommu/arm-smmu: Add global/context fault implementation hooks

2020-06-23 Thread Robin Murphy

On 2020-06-23 09:36, Thierry Reding wrote:
[...]

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 243bc4cb2705b..d720e1e191176 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -673,6 +673,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain 
*domain,
enum io_pgtable_fmt fmt;
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+   irqreturn_t (*context_fault)(int irq, void *dev);
  
	mutex_lock(&smmu_domain->init_mutex);

if (smmu_domain->smmu)
@@ -835,7 +836,9 @@ static int arm_smmu_init_domain_context(struct iommu_domain 
*domain,
 * handler seeing a half-initialised domain state.
 */
irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
-   ret = devm_request_irq(smmu->dev, irq, arm_smmu_context_fault,
+   context_fault = (smmu->impl && smmu->impl->context_fault) ?
+smmu->impl->context_fault : arm_smmu_context_fault;


A simpler way might have been to assign arm_smmu_context_fault to all
implementations. That way we wouldn't have to perform this check here
and instead just always using smmu->impl->context_fault.


But smmu->impl can still be NULL...

Everything in impl, including the presence of impl itself, is optional, 
so the notion of overriding a default with the same default doesn't 
really make much sense, and would go against the pattern everywhere else.


Robin.


+   ret = devm_request_irq(smmu->dev, irq, context_fault,
   IRQF_SHARED, "arm-smmu-context-fault", domain);
if (ret < 0) {
dev_err(smmu->dev, "failed to request context IRQ %d (%u)\n",
@@ -2107,6 +2110,7 @@ static int arm_smmu_device_probe(struct platform_device 
*pdev)
struct arm_smmu_device *smmu;
struct device *dev = &pdev->dev;
int num_irqs, i, err;
+   irqreturn_t (*global_fault)(int irq, void *dev);
  
  	smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);

if (!smmu) {
@@ -2193,9 +2197,12 @@ static int arm_smmu_device_probe(struct platform_device 
*pdev)
smmu->num_context_irqs = smmu->num_context_banks;
}
  
+	global_fault = (smmu->impl && smmu->impl->global_fault) ?

+   smmu->impl->global_fault : arm_smmu_global_fault;
+


Same as above.

Thierry




Re: [PATCH v6 1/4] iommu/arm-smmu: add NVIDIA implementation for dual ARM MMU-500 usage

2020-06-23 Thread Robin Murphy

On 2020-06-23 11:29, Thierry Reding wrote:
[...]

diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
index c75b9d957b702..52c84c30f83e4 100644
--- a/drivers/iommu/arm-smmu-impl.c
+++ b/drivers/iommu/arm-smmu-impl.c
@@ -160,6 +160,9 @@ struct arm_smmu_device *arm_smmu_impl_init(struct 
arm_smmu_device *smmu)
 */
switch (smmu->model) {
case ARM_MMU500:
+   if (of_device_is_compatible(smmu->dev->of_node,
+   "nvidia,tegra194-smmu-500"))
+   return nvidia_smmu_impl_init(smmu);


Should NVIDIA_TEGRA194_SMMU be a separate value for smmu->model,
perhaps? That way we avoid this somewhat odd check here.


No, this is simply in the wrong place. The design here is that we pick 
up anything related to the basic SMMU IP (model) first, then make any 
platform-specific integration checks. That way a platform-specific init 
function can see the model impl set and subclass it if necessary 
(although nobody's actually done that yet). The setup for Cavium is just 
a short-cut since their model is unique to their integration, so the 
lines get a bit blurred and there's little benefit to trying to separate 
it out.


In short, put this down below with the other of_device_is_compatible() 
checks.



smmu->impl = &arm_mmu500_impl;
break;
case CAVIUM_SMMUV2:
diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c


I wonder if it would be better to name this arm-smmu-tegra.c to make it
clearer that this is for a Tegra chip. We do have regular expressions in
MAINTAINERS that catch anything with "tegra" in it to make this easier.


There was a notion that these would be grouped by vendor, but if there's 
a strong preference for all NVIDIA-SoC-related stuff to be named "Tegra" 
then I'm not going to complain too much.



new file mode 100644
index 0..dafc293a45217
--- /dev/null
+++ b/drivers/iommu/arm-smmu-nvidia.c
@@ -0,0 +1,161 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Nvidia ARM SMMU v2 implementation quirks


s/Nvidia/NVIDIA/


+// Copyright (C) 2019 NVIDIA CORPORATION.  All rights reserved.


I suppose this should now also include 2020.


+
+#define pr_fmt(fmt) "nvidia-smmu: " fmt


Same here. Might be worth making this "tegra-smmu: " for consistency.


On the other hand, a log prefix that is literally the name of a 
completely unrelated driver seems more confusing to users than useful. 
Same for the function naming - the tegra_smmu_* namespace is already 
owned by that driver.


Robin.


Re: [PATCH v6 1/4] iommu/arm-smmu: add NVIDIA implementation for dual ARM MMU-500 usage

2020-06-23 Thread Thierry Reding
On Thu, Jun 04, 2020 at 04:44:11PM -0700, Krishna Reddy wrote:
> NVIDIA's Tegra194 soc uses two ARM MMU-500s together to interleave

s/soc/SoC/

> IOVA accesses across them.
> Add NVIDIA implementation for dual ARM MMU-500s and add new compatible
> string for Tegra194 soc.

Same here.

> 
> Signed-off-by: Krishna Reddy 
> ---
>  MAINTAINERS |   2 +
>  drivers/iommu/Makefile  |   2 +-
>  drivers/iommu/arm-smmu-impl.c   |   3 +
>  drivers/iommu/arm-smmu-nvidia.c | 161 
>  drivers/iommu/arm-smmu.h|   1 +
>  5 files changed, 168 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/iommu/arm-smmu-nvidia.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 50659d76976b7..118da0893c964 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -16572,9 +16572,11 @@ F:   drivers/i2c/busses/i2c-tegra.c
>  
>  TEGRA IOMMU DRIVERS
>  M:   Thierry Reding 
> +R:   Krishna Reddy 
>  L:   linux-te...@vger.kernel.org
>  S:   Supported
>  F:   drivers/iommu/tegra*
> +F:   drivers/iommu/arm-smmu-nvidia.c
>  
>  TEGRA KBC DRIVER
>  M:   Laxman Dewangan 
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index 57cf4ba5e27cb..35542df00da72 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -15,7 +15,7 @@ obj-$(CONFIG_AMD_IOMMU) += amd_iommu.o amd_iommu_init.o 
> amd_iommu_quirks.o
>  obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += amd_iommu_debugfs.o
>  obj-$(CONFIG_AMD_IOMMU_V2) += amd_iommu_v2.o
>  obj-$(CONFIG_ARM_SMMU) += arm_smmu.o
> -arm_smmu-objs += arm-smmu.o arm-smmu-impl.o arm-smmu-qcom.o
> +arm_smmu-objs += arm-smmu.o arm-smmu-impl.o arm-smmu-qcom.o arm-smmu-nvidia.o
>  obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o
>  obj-$(CONFIG_DMAR_TABLE) += dmar.o
>  obj-$(CONFIG_INTEL_IOMMU) += intel-iommu.o intel-pasid.o
> diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
> index c75b9d957b702..52c84c30f83e4 100644
> --- a/drivers/iommu/arm-smmu-impl.c
> +++ b/drivers/iommu/arm-smmu-impl.c
> @@ -160,6 +160,9 @@ struct arm_smmu_device *arm_smmu_impl_init(struct 
> arm_smmu_device *smmu)
>*/
>   switch (smmu->model) {
>   case ARM_MMU500:
> + if (of_device_is_compatible(smmu->dev->of_node,
> + "nvidia,tegra194-smmu-500"))
> + return nvidia_smmu_impl_init(smmu);

Should NVIDIA_TEGRA194_SMMU be a separate value for smmu->model,
perhaps? That way we avoid this somewhat odd check here.

>   smmu->impl = &arm_mmu500_impl;
>   break;
>   case CAVIUM_SMMUV2:
> diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c

I wonder if it would be better to name this arm-smmu-tegra.c to make it
clearer that this is for a Tegra chip. We do have regular expressions in
MAINTAINERS that catch anything with "tegra" in it to make this easier.

> new file mode 100644
> index 0..dafc293a45217
> --- /dev/null
> +++ b/drivers/iommu/arm-smmu-nvidia.c
> @@ -0,0 +1,161 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Nvidia ARM SMMU v2 implementation quirks

s/Nvidia/NVIDIA/

> +// Copyright (C) 2019 NVIDIA CORPORATION.  All rights reserved.

I suppose this should now also include 2020.

> +
> +#define pr_fmt(fmt) "nvidia-smmu: " fmt

Same here. Might be worth making this "tegra-smmu: " for consistency.

> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "arm-smmu.h"
> +
> +/* Tegra194 has three ARM MMU-500 Instances.
> + * Two of them are used together for Interleaved IOVA accesses and
> + * used by Non-Isochronous Hw devices for SMMU translations.

"non-isochronous", s/Hw/HW/

> + * Third one is used for SMMU translations from Isochronous HW devices.

"isochronous"

> + * It is possible to use this Implementation to program either

"implementation"

> + * all three or two of the instances identically as desired through
> + * DT node.
> + *
> + * Programming all the three instances identically comes with redundant tlb

s/tlb/TLB/

> + * invalidations as all three never need to be tlb invalidated for a HW 
> device.

Same here.

> + *
> + * When Linux Kernel supports multiple SMMU devices, The SMMU device used for

"kernel" and "..., the SMMU device"

> + * Isochornous HW devices should be added as a separate ARM MMU-500 device

"isochronous"

> + * in DT and be programmed independently for efficient tlb invalidates.

"TLB"

> + *
> + */
> +#define MAX_SMMU_INSTANCES 3
> +
> +#define TLB_LOOP_TIMEOUT 100 /* 1s! */

USEC_PER_SEC?

> +#define TLB_SPIN_COUNT   10
> +
> +struct nvidia_smmu {
> + struct arm_smmu_device  smmu;
> + unsigned intnum_inst;
> + void __iomem*bases[MAX_SMMU_INSTANCES];
> +};
> +
> +#define to_nvidia_smmu(s) container_of(s, struct nvidia_smmu, smmu)

Making this static inline can make error messages more readable.

> +
> +#define nsmmu_page(smmu, inst, page) \
> + 

Re: [PATCH 4/4] iommu/arm-smmu-v3: Remove cmpxchg() in arm_smmu_cmdq_issue_cmdlist()

2020-06-23 Thread John Garry

On 23/06/2020 10:35, Rikard Falkeborn wrote:


I'd say that GENMASK_INPUT_CHECK() should be able to handle l=0 with an
unsigned h, so I doubt this warning is valid.

Using GENMASK((int)cmdq->q.llq.max_n_shift, 0) resolves it, but it looks
like GENMASK_INPUT_CHECK() could be improved.


Indeed it could, it is fixed in -next.


ok, thanks for the pointer, but I still see this on today's -next with 
this patch:


make W=1 drivers/iommu/arm-smmu-v3.o

In file included from ./include/linux/bits.h:23:0,
from ./include/linux/ioport.h:15,
from ./include/linux/acpi.h:12,
from drivers/iommu/arm-smmu-v3.c:12:
drivers/iommu/arm-smmu-v3.c: In function ‘arm_smmu_cmdq_issue_cmdlist’:
./include/linux/bits.h:27:7: warning: comparison of unsigned expression 
< 0 is always false [-Wtype-limits]

  (l) > (h), 0)))
  ^
./include/linux/build_bug.h:16:62: note: in definition of macro 
‘BUILD_BUG_ON_ZERO’

#define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
 ^
./include/linux/bits.h:40:3: note: in expansion of macro 
‘GENMASK_INPUT_CHECK’

 (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
  ^~~
drivers/iommu/arm-smmu-v3.c:1404:18: note: in expansion of macro ‘GENMASK’
 u32 prod_mask = GENMASK(cmdq->q.llq.max_n_shift, 0);

That's gcc 7.5.0 .

Cheers,
John


Re: [PATCH 4/4] iommu/arm-smmu-v3: Remove cmpxchg() in arm_smmu_cmdq_issue_cmdlist()

2020-06-23 Thread Rikard Falkeborn
Hi

On Tue, 23 Jun 2020 at 11:22, John Garry wrote:

> On 23/06/2020 02:07, kernel test robot wrote:
>
> + Rikard, as the GENMASK compile-time sanity checks were added recently
>
> > Hi John,
> >
> > I love your patch! Perhaps something to improve:
> >
> > [auto build test WARNING on iommu/next]
> > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > And when submitting patch, we suggest to use  as documented in
> > https://git-scm.com/docs/git-format-patch]
> >
> > url:
> https://github.com/0day-ci/linux/commits/John-Garry/iommu-arm-smmu-v3-Improve-cmdq-lock-efficiency/20200623-013438
> > base:   https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git
> next
> > config: arm64-randconfig-c024-20200622 (attached as .config)
> > compiler: aarch64-linux-gcc (GCC) 9.3.0
> >
> > If you fix the issue, kindly add following tag as appropriate
> > Reported-by: kernel test robot 
> >
> > All warnings (new ones prefixed by >>, old ones prefixed by <<):
> >
> > In file included from include/linux/bits.h:23,
> > from include/linux/ioport.h:15,
> > from include/linux/acpi.h:12,
> > from drivers/iommu/arm-smmu-v3.c:12:
> > drivers/iommu/arm-smmu-v3.c: In function 'arm_smmu_cmdq_issue_cmdlist':
> > include/linux/bits.h:26:28: warning: comparison of unsigned expression <
> 0 is always false [-Wtype-limits]
> > 26 |   __builtin_constant_p((l) > (h)), (l) > (h), 0)))
>
> I'd say that GENMASK_INPUT_CHECK() should be able to handle l=0 with an
> unsigned h, so I doubt this warning is valid.
>
> Using GENMASK((int)cmdq->q.llq.max_n_shift, 0) resolves it, but it looks
> like GENMASK_INPUT_CHECK() could be improved.
>

Indeed it could, it is fixed in -next.

Rikard

> |^
> > include/linux/build_bug.h:16:62: note: in definition of macro
> 'BUILD_BUG_ON_ZERO'
> > 16 | #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e));
> })))
> > |  ^
> > include/linux/bits.h:39:3: note: in expansion of macro
> 'GENMASK_INPUT_CHECK'
> > 39 |  (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
> > |   ^~~
> >>> drivers/iommu/arm-smmu-v3.c:1404:18: note: in expansion of macro
> 'GENMASK'
> > 1404 |  u32 prod_mask = GENMASK(cmdq->q.llq.max_n_shift, 0);
> > |  ^~~
> > include/linux/bits.h:26:40: warning: comparison of unsigned expression <
> 0 is always false [-Wtype-limits]
> > 26 |   __builtin_constant_p((l) > (h)), (l) > (h), 0)))
> > |^
> > include/linux/build_bug.h:16:62: note: in definition of macro
> 'BUILD_BUG_ON_ZERO'
> > 16 | #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e));
> })))
> > |  ^
> > include/linux/bits.h:39:3: note: in expansion of macro
> 'GENMASK_INPUT_CHECK'
> > 39 |  (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
> > |   ^~~
> >>> drivers/iommu/arm-smmu-v3.c:1404:18: note: in expansion of macro
> 'GENMASK'
> > 1404 |  u32 prod_mask = GENMASK(cmdq->q.llq.max_n_shift, 0);
> > |  ^~~
> >
> > vim +/GENMASK +1404 drivers/iommu/arm-smmu-v3.c
>
>
>

Re: [PATCH 4/4] iommu/arm-smmu-v3: Remove cmpxchg() in arm_smmu_cmdq_issue_cmdlist()

2020-06-23 Thread John Garry

On 23/06/2020 02:07, kernel test robot wrote:

+ Rikard, as the GENMASK compile-time sanity checks were added recently


Hi John,

I love your patch! Perhaps something to improve:

[auto build test WARNING on iommu/next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use  as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/John-Garry/iommu-arm-smmu-v3-Improve-cmdq-lock-efficiency/20200623-013438
base:   https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git next
config: arm64-randconfig-c024-20200622 (attached as .config)
compiler: aarch64-linux-gcc (GCC) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>, old ones prefixed by <<):

In file included from include/linux/bits.h:23,
from include/linux/ioport.h:15,
from include/linux/acpi.h:12,
from drivers/iommu/arm-smmu-v3.c:12:
drivers/iommu/arm-smmu-v3.c: In function 'arm_smmu_cmdq_issue_cmdlist':
include/linux/bits.h:26:28: warning: comparison of unsigned expression < 0 is 
always false [-Wtype-limits]
26 |   __builtin_constant_p((l) > (h)), (l) > (h), 0)))


I'd say that GENMASK_INPUT_CHECK() should be able to handle l=0 with an
unsigned h, so I doubt this warning is valid.


Using GENMASK((int)cmdq->q.llq.max_n_shift, 0) resolves it, but it looks 
like GENMASK_INPUT_CHECK() could be improved.



|^
include/linux/build_bug.h:16:62: note: in definition of macro 
'BUILD_BUG_ON_ZERO'
16 | #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
|  ^
include/linux/bits.h:39:3: note: in expansion of macro 'GENMASK_INPUT_CHECK'
39 |  (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
|   ^~~

drivers/iommu/arm-smmu-v3.c:1404:18: note: in expansion of macro 'GENMASK'

1404 |  u32 prod_mask = GENMASK(cmdq->q.llq.max_n_shift, 0);
|  ^~~
include/linux/bits.h:26:40: warning: comparison of unsigned expression < 0 is 
always false [-Wtype-limits]
26 |   __builtin_constant_p((l) > (h)), (l) > (h), 0)))
|^
include/linux/build_bug.h:16:62: note: in definition of macro 
'BUILD_BUG_ON_ZERO'
16 | #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
|  ^
include/linux/bits.h:39:3: note: in expansion of macro 'GENMASK_INPUT_CHECK'
39 |  (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
|   ^~~

drivers/iommu/arm-smmu-v3.c:1404:18: note: in expansion of macro 'GENMASK'

1404 |  u32 prod_mask = GENMASK(cmdq->q.llq.max_n_shift, 0);
|  ^~~

vim +/GENMASK +1404 drivers/iommu/arm-smmu-v3.c





Re: [PATCH v6 2/4] dt-bindings: arm-smmu: Add binding for Tegra194 SMMU

2020-06-23 Thread Thierry Reding
On Thu, Jun 04, 2020 at 04:44:12PM -0700, Krishna Reddy wrote:
> Add binding for NVIDIA's Tegra194 Soc SMMU that is based
> on ARM MMU-500.
> 
> Signed-off-by: Krishna Reddy 
> ---
>  Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml 
> b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> index e3ef1c69d1326..8f7ffd248f303 100644
> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> @@ -37,6 +37,11 @@ properties:
>- qcom,sc7180-smmu-500
>- qcom,sdm845-smmu-500
>- const: arm,mmu-500
> +  - description: NVIDIA SoCs that use more than one "arm,mmu-500"
> +items:
> +  - enum:
> +  - nvdia,tegra194-smmu-500

The -500 suffix here seems a bit redundant since there's no other type
of SMMU in Tegra194, correct?

Thierry



Re: [PATCH v6 3/4] iommu/arm-smmu: Add global/context fault implementation hooks

2020-06-23 Thread Thierry Reding
On Thu, Jun 04, 2020 at 04:44:13PM -0700, Krishna Reddy wrote:
> Add global/context fault hooks to allow NVIDIA SMMU implementation
> handle faults across multiple SMMUs.
> 
> Signed-off-by: Krishna Reddy 
> ---
>  drivers/iommu/arm-smmu-nvidia.c | 100 
>  drivers/iommu/arm-smmu.c|  11 +++-
>  drivers/iommu/arm-smmu.h|   3 +
>  3 files changed, 112 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c
> index dafc293a45217..5999b6a770992 100644
> --- a/drivers/iommu/arm-smmu-nvidia.c
> +++ b/drivers/iommu/arm-smmu-nvidia.c
> @@ -117,6 +117,104 @@ static int nsmmu_reset(struct arm_smmu_device *smmu)
>   return 0;
>  }
>  
> +static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
> +{
> + return container_of(dom, struct arm_smmu_domain, domain);
> +}
> +
> +static irqreturn_t nsmmu_global_fault_inst(int irq,
> +struct arm_smmu_device *smmu,
> +int inst)
> +{
> + u32 gfsr, gfsynr0, gfsynr1, gfsynr2;
> +
> + gfsr = readl_relaxed(nsmmu_page(smmu, inst, 0) + ARM_SMMU_GR0_sGFSR);
> + gfsynr0 = readl_relaxed(nsmmu_page(smmu, inst, 0) +
> + ARM_SMMU_GR0_sGFSYNR0);
> + gfsynr1 = readl_relaxed(nsmmu_page(smmu, inst, 0) +
> + ARM_SMMU_GR0_sGFSYNR1);
> + gfsynr2 = readl_relaxed(nsmmu_page(smmu, inst, 0) +
> + ARM_SMMU_GR0_sGFSYNR2);
> +
> + if (!gfsr)
> + return IRQ_NONE;
> +
> + dev_err_ratelimited(smmu->dev,
> + "Unexpected global fault, this could be serious\n");
> + dev_err_ratelimited(smmu->dev,
> + "\tGFSR 0x%08x, GFSYNR0 0x%08x, GFSYNR1 0x%08x, GFSYNR2 0x%08x\n",
> + gfsr, gfsynr0, gfsynr1, gfsynr2);
> +
> + writel_relaxed(gfsr, nsmmu_page(smmu, inst, 0) + ARM_SMMU_GR0_sGFSR);
> + return IRQ_HANDLED;
> +}
> +
> +static irqreturn_t nsmmu_global_fault(int irq, void *dev)
> +{
> + int inst;
> + irqreturn_t irq_ret = IRQ_NONE;
> + struct arm_smmu_device *smmu = dev;
> +
> + for (inst = 0; inst < to_nvidia_smmu(smmu)->num_inst; inst++) {
> + irq_ret = nsmmu_global_fault_inst(irq, smmu, inst);
> + if (irq_ret == IRQ_HANDLED)
> + return irq_ret;
> + }
> +
> + return irq_ret;
> +}
> +
> +static irqreturn_t nsmmu_context_fault_bank(int irq,
> + struct arm_smmu_device *smmu,
> + int idx, int inst)
> +{
> + u32 fsr, fsynr, cbfrsynra;
> + unsigned long iova;
> +
> + fsr = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_FSR);
> + if (!(fsr & ARM_SMMU_FSR_FAULT))
> + return IRQ_NONE;
> +
> + fsynr = readl_relaxed(nsmmu_page(smmu, inst, smmu->numpage + idx) +
> +   ARM_SMMU_CB_FSYNR0);
> + iova = readq_relaxed(nsmmu_page(smmu, inst, smmu->numpage + idx) +
> +  ARM_SMMU_CB_FAR);
> + cbfrsynra = readl_relaxed(nsmmu_page(smmu, inst, 1) +
> +   ARM_SMMU_GR1_CBFRSYNRA(idx));
> +
> + dev_err_ratelimited(smmu->dev,
> + "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cbfrsynra=0x%x, cb=%d\n",
> + fsr, iova, fsynr, cbfrsynra, idx);
> +
> + writel_relaxed(fsr, nsmmu_page(smmu, inst, smmu->numpage + idx) +
> + ARM_SMMU_CB_FSR);
> + return IRQ_HANDLED;
> +}
> +
> +static irqreturn_t nsmmu_context_fault(int irq, void *dev)
> +{
> + int inst, idx;
> + irqreturn_t irq_ret = IRQ_NONE;
> + struct iommu_domain *domain = dev;
> + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> + struct arm_smmu_device *smmu = smmu_domain->smmu;
> +
> + for (inst = 0; inst < to_nvidia_smmu(smmu)->num_inst; inst++) {
> + /* Interrupt line shared between all context faults.
> +  * Check for faults across all contexts.
> +  */
> + for (idx = 0; idx < smmu->num_context_banks; idx++) {
> + irq_ret = nsmmu_context_fault_bank(irq, smmu,
> +idx, inst);
> +
> + if (irq_ret == IRQ_HANDLED)
> + return irq_ret;
> + }
> + }
> +
> + return irq_ret;
> +}
> +
>  static const struct arm_smmu_impl nvidia_smmu_impl = {
>   .read_reg = nsmmu_read_reg,
>   .write_reg = nsmmu_write_reg,
> @@ -124,6 +222,8 @@ static const struct arm_smmu_impl nvidia_smmu_impl = {
>   .write_reg64 = nsmmu_write_reg64,
>   .reset = nsmmu_reset,
>   .tlb_sync = nsmmu_tlb_sync,
> + .global_fault = nsmmu_global_fault,
> + .context_fault = nsmmu_context_fault,
>  };
>  
>  struct arm_smmu_device *nvidia_smmu_impl_init(struct 

Re: [PATCH v6 4/4] iommu/arm-smmu-nvidia: fix the warning reported by kbuild test robot

2020-06-23 Thread Thierry Reding
On Thu, Jun 04, 2020 at 04:44:14PM -0700, Krishna Reddy wrote:
> >> drivers/iommu/arm-smmu-nvidia.c:151:33: sparse: sparse: cast removes address space '' of expression
> 
> Reported-by: kbuild test robot 
> Signed-off-by: Krishna Reddy 
> ---
>  drivers/iommu/arm-smmu-nvidia.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

This should be folded into the patch that introduced this error.

Thierry



Re: [PATCH 0/6] [PULL REQUEST] iommu/vt-d: fixes for v5.8

2020-06-23 Thread Joerg Roedel
On Tue, Jun 23, 2020 at 07:13:39AM +0800, Lu Baolu wrote:
> Hi Joerg,
> 
> Below fix patches have been piled up for v5.8. Please consider them for
> your fix branch.
> 
> Best regards,
> Lu Baolu
> 
> Lu Baolu (5):
>   iommu/vt-d: Make Intel SVM code 64-bit only
>   iommu/vt-d: Set U/S bit in first level page table by default
>   iommu/vt-d: Enable PCI ACS for platform opt in hint
>   iommu/vt-d: Update scalable mode paging structure coherency
>   iommu/vt-d: Fix misuse of iommu_domain_identity_map()
> 
> Rajat Jain (1):
>   iommu/vt-d: Don't apply gfx quirks to untrusted devices
> 
>  drivers/iommu/Kconfig   |  2 +-
>  drivers/iommu/intel/dmar.c  |  3 +-
>  drivers/iommu/intel/iommu.c | 59 -
>  include/linux/intel-iommu.h |  1 +
>  4 files changed, 56 insertions(+), 9 deletions(-)

Applied to iommu/fixes, thanks Baolu.


Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

2020-06-23 Thread Zhangfei Gao

Hi, Joerg

On 2020/6/22 7:55 PM, Joerg Roedel wrote:

On Thu, Jun 04, 2020 at 09:33:07PM +0800, Zhangfei Gao wrote:

+++ b/drivers/iommu/iommu.c
@@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct
fwnode_handle *iommu_fwnode,
     fwspec->iommu_fwnode = iommu_fwnode;
     fwspec->ops = ops;
     dev_iommu_fwspec_set(dev, fwspec);
+
+   if (dev_is_pci(dev))
+   pci_fixup_device(pci_fixup_final, to_pci_dev(dev));
+

That's not going to fly, I don't think we should run the fixups twice,
and they should not be run from IOMMU code. Is the only reason for this
second pass that iommu_fwspec is not yet allocated when it runs the
first time? I ask because it might be easier to just allocate the struct
earlier then.

Thanks for looking into this.

Yes, the only reason for calling the fixups a second time is that
iommu_fwspec has not yet been allocated when they run the first time.

The first pci_fixup_final pass happens very early, in pci_bus_add_device.
If we allocate iommu_fwspec earlier, it could be done in pci_alloc_dev,
with the ops still assigned in iommu_fwspec_init. I have tested this and
it works; not sure whether that is acceptable?

Alternatively, adding a can_stall flag to struct pci_dev would be simple,
but it is ugly too, since PCI itself knows nothing about stalling today.

Thanks




Re: [PATCH v2 14/15] docs: fix references for DMA*.txt files

2020-06-23 Thread Christoph Hellwig
On Tue, Jun 23, 2020 at 09:09:10AM +0200, Mauro Carvalho Chehab wrote:
> As we moved those files to core-api, fix references to point
> to their newer locations.

Can we please just revert the RST conversion that I didn't ACK?


[PATCH v2 14/15] docs: fix references for DMA*.txt files

2020-06-23 Thread Mauro Carvalho Chehab
As we moved those files to core-api, fix references to point
to their newer locations.

Signed-off-by: Mauro Carvalho Chehab 
---
 Documentation/PCI/pci.rst  |  6 +++---
 Documentation/block/biodoc.rst |  2 +-
 Documentation/bus-virt-phys-mapping.txt|  2 +-
 Documentation/core-api/dma-api.rst |  6 +++---
 Documentation/core-api/dma-isa-lpc.rst |  2 +-
 Documentation/driver-api/usb/dma.rst   |  6 +++---
 .../translations/ko_KR/memory-barriers.txt |  6 +++---
 arch/ia64/hp/common/sba_iommu.c| 12 ++--
 arch/parisc/kernel/pci-dma.c   |  2 +-
 arch/x86/include/asm/dma-mapping.h |  4 ++--
 arch/x86/kernel/amd_gart_64.c  |  2 +-
 drivers/parisc/sba_iommu.c | 14 +++---
 include/linux/dma-mapping.h|  2 +-
 include/media/videobuf-dma-sg.h|  2 +-
 kernel/dma/debug.c |  2 +-
 15 files changed, 35 insertions(+), 35 deletions(-)

diff --git a/Documentation/PCI/pci.rst b/Documentation/PCI/pci.rst
index 8c016d8c9862..d10d3fe604c5 100644
--- a/Documentation/PCI/pci.rst
+++ b/Documentation/PCI/pci.rst
@@ -265,7 +265,7 @@ Set the DMA mask size
 -
 .. note::
If anything below doesn't make sense, please refer to
-   Documentation/DMA-API.txt. This section is just a reminder that
+   :doc:`/core-api/dma-api`. This section is just a reminder that
drivers need to indicate DMA capabilities of the device and is not
an authoritative source for DMA interfaces.
 
@@ -291,7 +291,7 @@ Many 64-bit "PCI" devices (before PCI-X) and some PCI-X 
devices are
 Setup shared control data
 -
 Once the DMA masks are set, the driver can allocate "consistent" (a.k.a. 
shared)
-memory.  See Documentation/DMA-API.txt for a full description of
+memory.  See :doc:`/core-api/dma-api` for a full description of
 the DMA APIs. This section is just a reminder that it needs to be done
 before enabling DMA on the device.
 
@@ -421,7 +421,7 @@ owners if there is one.
 
 Then clean up "consistent" buffers which contain the control data.
 
-See Documentation/DMA-API.txt for details on unmapping interfaces.
+See :doc:`/core-api/dma-api` for details on unmapping interfaces.
 
 
 Unregister from other subsystems
diff --git a/Documentation/block/biodoc.rst b/Documentation/block/biodoc.rst
index b964796ec9c7..ba7f45d0271c 100644
--- a/Documentation/block/biodoc.rst
+++ b/Documentation/block/biodoc.rst
@@ -196,7 +196,7 @@ a virtual address mapping (unlike the earlier scheme of 
virtual address
 do not have a corresponding kernel virtual address space mapping) and
 low-memory pages.
 
-Note: Please refer to Documentation/DMA-API-HOWTO.txt for a discussion
+Note: Please refer to :doc:`/core-api/dma-api-howto` for a discussion
 on PCI high mem DMA aspects and mapping of scatter gather lists, and support
 for 64 bit PCI.
 
diff --git a/Documentation/bus-virt-phys-mapping.txt 
b/Documentation/bus-virt-phys-mapping.txt
index 4bb07c2f3e7d..c7bc99cd2e21 100644
--- a/Documentation/bus-virt-phys-mapping.txt
+++ b/Documentation/bus-virt-phys-mapping.txt
@@ -8,7 +8,7 @@ How to access I/O mapped memory from within device drivers
 
The virt_to_bus() and bus_to_virt() functions have been
superseded by the functionality provided by the PCI DMA interface
-   (see Documentation/DMA-API-HOWTO.txt).  They continue
+   (see :doc:`/core-api/dma-api-howto`).  They continue
to be documented below for historical purposes, but new code
must not use them. --davidm 00/12/12
 
diff --git a/Documentation/core-api/dma-api.rst 
b/Documentation/core-api/dma-api.rst
index 2d8d2fed7317..63b4a2f20867 100644
--- a/Documentation/core-api/dma-api.rst
+++ b/Documentation/core-api/dma-api.rst
@@ -5,7 +5,7 @@ Dynamic DMA mapping using the generic device
 :Author: James E.J. Bottomley 
 
 This document describes the DMA API.  For a more gentle introduction
-of the API (and actual examples), see Documentation/DMA-API-HOWTO.txt.
+of the API (and actual examples), see :doc:`/core-api/dma-api-howto`.
 
 This API is split into two pieces.  Part I describes the basic API.
 Part II describes extensions for supporting non-consistent memory
@@ -471,7 +471,7 @@ without the _attrs suffixes, except that they pass an 
optional
 dma_attrs.
 
 The interpretation of DMA attributes is architecture-specific, and
-each attribute should be documented in Documentation/DMA-attributes.txt.
+each attribute should be documented in :doc:`/core-api/dma-attributes`.
 
 If dma_attrs are 0, the semantics of each of these functions
 is identical to those of the corresponding function
@@ -484,7 +484,7 @@ for DMA::
 
#include 
/* DMA_ATTR_FOO should be defined in linux/dma-mapping.h and
-   * documented in Documentation/DMA-attributes.txt 

[PATCH v2 00/15] Documentation fixes

2020-06-23 Thread Mauro Carvalho Chehab
Hi Jon,

As requested, this is a rebase of a previous series posted on Jan 15.

Since then, several patches got merged via other trees or became
obsolete. There were also two patches that fit better in the ReST
conversion patchset, so I'll send those in another series together
with the remaining ReST conversions.

I also added the reviews/acks received.

So, the series has been reduced from 29 to 15 patches.

Let's hope b4 will be able to properly handle this one.

Regards,
Mauro

Mauro Carvalho Chehab (15):
  mm: vmalloc.c: remove a kernel-doc annotation from a removed parameter
  net: dev: add a missing kernel-doc annotation
  net: netdevice.h: add a description for napi_defer_hard_irqs
  scripts/kernel-doc: parse __ETHTOOL_DECLARE_LINK_MODE_MASK
  net: pylink.h: add kernel-doc descriptions for new fields at
phylink_config
  scripts/kernel-doc: handle function pointer prototypes
  fs: fs.h: fix a kernel-doc parameter description
  kcsan: fix a kernel-doc warning
  selftests/vm/keys: fix a broken reference at protection_keys.c
  docs: hugetlbpage.rst: fix some warnings
  docs: powerpc: fix some issues at vas-api.rst
  docs: driver-model: remove a duplicated markup at driver.rst
  docs: ABI: fix a typo when pointing to w1-generic.rst
  docs: fix references for DMA*.txt files
  docs: fs: proc.rst: convert a new chapter to ReST

 .../ABI/testing/sysfs-driver-w1_therm |  2 +-
 Documentation/PCI/pci.rst |  6 +--
 Documentation/admin-guide/mm/hugetlbpage.rst  | 23 +++---
 Documentation/block/biodoc.rst|  2 +-
 Documentation/bus-virt-phys-mapping.txt   |  2 +-
 Documentation/core-api/dma-api.rst|  6 +--
 Documentation/core-api/dma-isa-lpc.rst|  2 +-
 .../driver-api/driver-model/driver.rst|  2 -
 Documentation/driver-api/usb/dma.rst  |  6 +--
 Documentation/filesystems/proc.rst| 44 +--
 Documentation/powerpc/vas-api.rst | 23 +++---
 .../translations/ko_KR/memory-barriers.txt|  6 +--
 arch/ia64/hp/common/sba_iommu.c   | 12 ++---
 arch/parisc/kernel/pci-dma.c  |  2 +-
 arch/x86/include/asm/dma-mapping.h|  4 +-
 arch/x86/kernel/amd_gart_64.c |  2 +-
 drivers/parisc/sba_iommu.c| 14 +++---
 include/linux/dma-mapping.h   |  2 +-
 include/linux/fs.h|  2 +-
 include/linux/kcsan-checks.h  | 10 +++--
 include/linux/netdevice.h |  2 +
 include/linux/phylink.h   |  4 ++
 include/media/videobuf-dma-sg.h   |  2 +-
 kernel/dma/debug.c|  2 +-
 mm/vmalloc.c  |  1 -
 net/core/dev.c|  1 +
 scripts/kernel-doc|  7 +++
 tools/testing/selftests/vm/protection_keys.c  |  2 +-
 28 files changed, 114 insertions(+), 79 deletions(-)

-- 
2.26.2




RE: [PATCH v2 14/15] vfio: Document dual stage control

2020-06-23 Thread Liu, Yi L
> From: Stefan Hajnoczi 
> Sent: Monday, June 22, 2020 8:51 PM
> 
> On Wed, Jun 17, 2020 at 06:27:27AM +, Liu, Yi L wrote:
> > > From: Stefan Hajnoczi 
> > > Sent: Monday, June 15, 2020 5:41 PM
> > > On Thu, Jun 11, 2020 at 05:15:33AM -0700, Liu Yi L wrote:
> > >
> > > > From: Eric Auger 
> > > >
> > > > The VFIO API was enhanced to support nested stage control: a bunch of
> > > > new iotcls and usage guideline.
> > > >
> > > > Let's document the process to follow to set up nested mode.
> > > >
> > > > Cc: Kevin Tian 
> > > > CC: Jacob Pan 
> > > > Cc: Alex Williamson 
> > > > Cc: Eric Auger 
> > > > Cc: Jean-Philippe Brucker 
> > > > Cc: Joerg Roedel 
> > > > Cc: Lu Baolu 
> > > > Signed-off-by: Eric Auger 
> > > > Signed-off-by: Liu Yi L 
> > > > ---
> > > > v1 -> v2:
> > > > *) new in v2, compared with Eric's original version, pasid table bind
> > > >and fault reporting is removed as this series doesn't cover them.
> > > >Original version from Eric.
> > > >https://lkml.org/lkml/2020/3/20/700
> > > >
> > > >  Documentation/driver-api/vfio.rst | 64
> > > > +++
> > > >  1 file changed, 64 insertions(+)
> > > >
> > > > diff --git a/Documentation/driver-api/vfio.rst
> > > > b/Documentation/driver-api/vfio.rst
> > > > index f1a4d3c..06224bd 100644
> > > > --- a/Documentation/driver-api/vfio.rst
> > > > +++ b/Documentation/driver-api/vfio.rst
> > > > @@ -239,6 +239,70 @@ group and can access them as follows::
> > > > /* Gratuitous device reset and go... */
> > > > ioctl(device, VFIO_DEVICE_RESET);
> > > >
> > > > +IOMMU Dual Stage Control
> > > > +
> > > > +
> > > > +Some IOMMUs support 2 stages/levels of translation. Stage corresponds
> > > > +to the ARM terminology while level corresponds to Intel's VTD 
> > > > terminology.
> > > > +In the following text we use either without distinction.
> > > > +
> > > > +This is useful when the guest is exposed with a virtual IOMMU and
> > > > +some devices are assigned to the guest through VFIO. Then the guest
> > > > +OS can use stage 1 (GIOVA -> GPA or GVA->GPA), while the hypervisor
> > > > +uses stage 2 for VM isolation (GPA -> HPA).
> > > > +
> > > > +Under dual stage translation, the guest gets ownership of the stage 1
> > > > +page tables and also owns stage 1 configuration structures. The
> > > > +hypervisor owns the root configuration structure (for security
> > > > +reason), including stage 2 configuration. This works as long
> > > > +configuration structures and page table
> > >
> > > s/as long configuration/as long as configuration/
> >
> > got it.
> >
> > >
> > > > +format are compatible between the virtual IOMMU and the physical IOMMU.
> > >
> > > s/format/formats/
> >
> > I see.
> >
> > > > +
> > > > +Assuming the HW supports it, this nested mode is selected by choosing
> > > > +the VFIO_TYPE1_NESTING_IOMMU type through:
> > > > +
> > > > +ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_NESTING_IOMMU);
> > > > +
> > > > +This forces the hypervisor to use the stage 2, leaving stage 1
> > > > +available for guest usage. The guest stage 1 format depends on IOMMU
> > > > +vendor, and it is the same with the nesting configuration method.
> > > > +User space should check the format and configuration method after
> > > > +setting nesting type by
> > > > +using:
> > > > +
> > > > +ioctl(container->fd, VFIO_IOMMU_GET_INFO, _info);
> > > > +
> > > > +Details can be found in Documentation/userspace-api/iommu.rst. For
> > > > +Intel VT-d, each stage 1 page table is bound to host by:
> > > > +
> > > > +nesting_op->flags = VFIO_IOMMU_NESTING_OP_BIND_PGTBL;
> > > > +memcpy(_op->data, _data, sizeof(bind_data));
> > > > +ioctl(container->fd, VFIO_IOMMU_NESTING_OP, nesting_op);
> > > > +
> > > > +As mentioned above, guest OS may use stage 1 for GIOVA->GPA or 
> > > > GVA->GPA.
> > > > +GVA->GPA page tables are available when PASID (Process Address Space
> > > > +GVA->ID)
> > > > +is exposed to guest. e.g. guest with PASID-capable devices assigned.
> > > > +For such page table binding, the bind_data should include PASID info,
> > > > +which is allocated by guest itself or by host. This depends on
> > > > +hardware vendor e.g. Intel VT-d requires to allocate PASID from host.
> > > > +This requirement is available by VFIO_IOMMU_GET_INFO. User space
> > > > +could allocate PASID from host by:
> > > > +
> > > > +req.flags = VFIO_IOMMU_ALLOC_PASID;
> > > > +ioctl(container, VFIO_IOMMU_PASID_REQUEST, );
> > >
> > > It is not clear how the userspace application determines whether PASIDs 
> > > must be
> > > allocated from the host via VFIO_IOMMU_PASID_REQUEST or if the guest 
> > > itself
> can
> > > allocate PASIDs. The text mentions VFIO_IOMMU_GET_INFO but what exactly
> > > should the userspace application check?
> >
> > For VT-d, spec 3.0 introduced the Virtual Command interface for PASID
> > allocation; the guest requests PASIDs from the host if it detects that
> > interface.