Re: [PATCH RFC v1 4/7] swiotlb: to implement io_tlb_high_mem

2022-06-10 Thread Dongli Zhang
Hi Christoph,

On 6/8/22 10:05 PM, Christoph Hellwig wrote:
> All this really needs to be hidden under the hood.
> 

Since this patch has 200+ lines, could you please clarify what 'this'
refers to?

The idea of this patch:

1. Convert the swiotlb initialization functions into callees. A callee accepts
'true' or 'false' as an argument to indicate whether it is for the default or
the new swiotlb buffer, e.g., swiotlb_init_remap() becomes __swiotlb_init_remap().

2. At the caller side, decide whether to call the callee to create the
extra buffer.

Do you mean the callee is still *too high level* and all the
decision/allocation/initialization should be done in *lower-level functions*?

E.g., if I rework "struct io_tlb_mem" to make the 64-bit buffer the 2nd
array, as in io_tlb_mem->slots[32_or_64][index], the user will only see the
default 'io_tlb_default_mem' and will not be able to tell whether an
allocation comes from the 32-bit or the 64-bit buffer, as sketched below.
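
Roughly, the layout I have in mind would look like this (for illustration
only; the enum and field names here are hypothetical, not actual swiotlb
code):

enum io_tlb_buf_type {
	IO_TLB_BUF_32BIT,	/* default buffer, below 4 GiB */
	IO_TLB_BUF_64BIT,	/* extra buffer without the 32-bit limit */
	IO_TLB_BUF_NUM,
};

struct io_tlb_mem {
	/* ... existing fields (start, end, nslabs, ...) ... */
	/* slots[32_or_64][index], indexed by enum io_tlb_buf_type */
	struct io_tlb_slot *slots[IO_TLB_BUF_NUM];
};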

Thank you very much for the feedback.

Dongli Zhang


Re: [PATCH v3 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits

2022-06-10 Thread John Garry via iommu

On 09/06/2022 21:34, Bart Van Assche wrote:

On 6/9/22 10:54, John Garry wrote:
ok, but do you have a system where the UFS host controller is behind 
an IOMMU? I had the impression that UFS controllers would be mostly 
found in embedded systems and IOMMUs are not as common there.


Modern phones have an IOMMU. Below one can find an example from a Pixel 
6 phone. The UFS storage controller is not controlled by the IOMMU as 
far as I can see, but I wouldn't be surprised if the security team would 
ask us one day to enable the IOMMU for the UFS controller.


OK, then unfortunately it seems that you have no method to test. I might 
be able to test USB MSC, but I am not sure I can even get DMA 
mappings whose length exceeds the IOVA rcache limit there.


Thanks,
John


[PATCH] uacce: fix concurrency of fops_open and uacce_remove

2022-06-10 Thread Zhangfei Gao
The uacce parent's module can be removed while uacce is working,
which may cause trouble.

If rmmod/uacce_remove happens just after fops_open: bind_queue,
uacce_remove cannot remove the bound queue since it has not been added
to the queue list yet, which blocks uacce_disable_sva.

Extend the queues_lock coverage to make sure the bound queue is added to
the list and can therefore be found by uacce_remove.

Also, uacce->parent->driver is checked right after taking the lock, in
case rmmod is happening concurrently.

Also the parent driver must always stop DMA before calling
uacce_remove.

Signed-off-by: Yang Shen 
Signed-off-by: Zhangfei Gao 
---
 drivers/misc/uacce/uacce.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
index 281c54003edc..b6219c6bfb48 100644
--- a/drivers/misc/uacce/uacce.c
+++ b/drivers/misc/uacce/uacce.c
@@ -136,9 +136,16 @@ static int uacce_fops_open(struct inode *inode, struct file *filep)
if (!q)
return -ENOMEM;
 
+   mutex_lock(&uacce->queues_lock);
+
+   if (!uacce->parent->driver) {
+   ret = -ENODEV;
+   goto out_with_lock;
+   }
+
ret = uacce_bind_queue(uacce, q);
if (ret)
-   goto out_with_mem;
+   goto out_with_lock;
 
q->uacce = uacce;
 
@@ -153,7 +160,6 @@ static int uacce_fops_open(struct inode *inode, struct file *filep)
uacce->inode = inode;
q->state = UACCE_Q_INIT;
 
-   mutex_lock(&uacce->queues_lock);
	list_add(&q->list, &uacce->queues);
	mutex_unlock(&uacce->queues_lock);
 
@@ -161,7 +167,8 @@ static int uacce_fops_open(struct inode *inode, struct file *filep)
 
 out_with_bond:
uacce_unbind_queue(q);
-out_with_mem:
+out_with_lock:
+   mutex_unlock(&uacce->queues_lock);
kfree(q);
return ret;
 }
@@ -171,10 +178,10 @@ static int uacce_fops_release(struct inode *inode, struct file *filep)
struct uacce_queue *q = filep->private_data;
 
	mutex_lock(&q->uacce->queues_lock);
-   list_del(&q->list);
-   mutex_unlock(&q->uacce->queues_lock);
	uacce_put_queue(q);
	uacce_unbind_queue(q);
+   list_del(&q->list);
+   mutex_unlock(&q->uacce->queues_lock);
kfree(q);
 
return 0;
@@ -513,10 +520,10 @@ void uacce_remove(struct uacce_device *uacce)
uacce_put_queue(q);
uacce_unbind_queue(q);
}
-   mutex_unlock(&uacce->queues_lock);
 
/* disable sva now since no opened queues */
uacce_disable_sva(uacce);
+   mutex_unlock(&uacce->queues_lock);
 
if (uacce->cdev)
		cdev_device_del(uacce->cdev, &uacce->dev);
-- 
2.36.1



[PATCH v2] dt-bindings: iommu: renesas, ipmmu-vmsa: R-Car V3U is R-Car Gen4

2022-06-10 Thread Geert Uytterhoeven
Despite the name, R-Car V3U is the first member of the R-Car Gen4
family.  Hence move its compatible value to the R-Car Gen4 section.

Signed-off-by: Geert Uytterhoeven 
Acked-by: Krzysztof Kozlowski 
Acked-by: Joerg Roedel 
Reviewed-by: Wolfram Sang 
---
v2:
  - Add Acked-by, Reviewed-by,
  - Add blank lines to improve readability.
---
 .../devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml   | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml b/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml
index 8854569ca3a6c949..26d0a5121f02a153 100644
--- a/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml
+++ b/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml
@@ -29,6 +29,7 @@ properties:
   - renesas,ipmmu-r8a7793  # R-Car M2-N
   - renesas,ipmmu-r8a7794  # R-Car E2
   - const: renesas,ipmmu-vmsa  # R-Mobile APE6 or R-Car Gen2 or RZ/G1
+
   - items:
   - enum:
   - renesas,ipmmu-r8a774a1 # RZ/G2M
@@ -43,10 +44,11 @@ properties:
   - renesas,ipmmu-r8a77980 # R-Car V3H
   - renesas,ipmmu-r8a77990 # R-Car E3
   - renesas,ipmmu-r8a77995 # R-Car D3
-  - renesas,ipmmu-r8a779a0 # R-Car V3U
+
   - items:
   - enum:
-  - renesas,ipmmu-r8a779f0 # R-Car S4-8
+  - renesas,ipmmu-r8a779a0   # R-Car V3U
+  - renesas,ipmmu-r8a779f0   # R-Car S4-8
   - const: renesas,rcar-gen4-ipmmu-vmsa  # R-Car Gen4
 
   reg:
-- 
2.25.1



Re: [PATCH v8 02/11] iommu: Add max_pasids field in struct dev_iommu

2022-06-10 Thread Baolu Lu

On 2022/6/10 17:01, Tian, Kevin wrote:

From: Baolu Lu 
Sent: Friday, June 10, 2022 2:47 PM

On 2022/6/10 03:01, Raj, Ashok wrote:

On Tue, Jun 07, 2022 at 09:49:33AM +0800, Lu Baolu wrote:

@@ -218,6 +219,30 @@ static void dev_iommu_free(struct device *dev)
kfree(param);
   }

+static u32 dev_iommu_get_max_pasids(struct device *dev)
+{
+   u32 max_pasids = dev->iommu->iommu_dev->max_pasids;
+   u32 num_bits;
+   int ret;
+
+   if (!max_pasids)
+   return 0;
+
+   if (dev_is_pci(dev)) {
+   ret = pci_max_pasids(to_pci_dev(dev));
+   if (ret < 0)
+   return 0;
+
+   return min_t(u32, max_pasids, ret);


Ah.. that answers my other question to consider device pasid-max. I guess
if we need any enforcement of restricting devices that aren't supporting
the full PASID, that will be done by some higher layer?


The mm->pasid style of SVA is explicitly enabled through
iommu_dev_enable_feature(IOMMU_DEV_FEAT_SVA). The IOMMU driver
specific
restriction might be put there?



too many returns in this function; maybe consolidating all returns at the
end of the function would be more elegant?


I didn't find cleanup room after a quick scan of the code. But sure, let
me go through code again offline.



If we do care:

+static u32 dev_iommu_get_max_pasids(struct device *dev)
+{
+   u32 max_pasids = 0;
+   u32 num_bits = 0;
+   int ret;
+
+   if (dev_is_pci(dev)) {
+   ret = pci_max_pasids(to_pci_dev(dev));
+   if (ret > 0)
+   max_pasids = ret;
+   } else {
+   ret = device_property_read_u32(dev, "pasid-num-bits", &num_bits);
+   if (!ret)
+   max_pasids = 1UL << num_bits;
+   }
+
+   return min_t(u32, max_pasids, dev->iommu->iommu_dev->max_pasids);
+}


Great! Cleaner and more compact than mine. Thank you!

Best regards,
baolu


RE: [PATCH v8 02/11] iommu: Add max_pasids field in struct dev_iommu

2022-06-10 Thread Tian, Kevin
> From: Baolu Lu 
> Sent: Friday, June 10, 2022 2:47 PM
> 
> On 2022/6/10 03:01, Raj, Ashok wrote:
> > On Tue, Jun 07, 2022 at 09:49:33AM +0800, Lu Baolu wrote:
> >> @@ -218,6 +219,30 @@ static void dev_iommu_free(struct device *dev)
> >>kfree(param);
> >>   }
> >>
> >> +static u32 dev_iommu_get_max_pasids(struct device *dev)
> >> +{
> >> +  u32 max_pasids = dev->iommu->iommu_dev->max_pasids;
> >> +  u32 num_bits;
> >> +  int ret;
> >> +
> >> +  if (!max_pasids)
> >> +  return 0;
> >> +
> >> +  if (dev_is_pci(dev)) {
> >> +  ret = pci_max_pasids(to_pci_dev(dev));
> >> +  if (ret < 0)
> >> +  return 0;
> >> +
> >> +  return min_t(u32, max_pasids, ret);
> >
> > Ah.. that answers my other question to consider device pasid-max. I guess
> > if we need any enforcement of restricting devices that aren't supporting
> > the full PASID, that will be done by some higher layer?
> 
> The mm->pasid style of SVA is explicitly enabled through
> iommu_dev_enable_feature(IOMMU_DEV_FEAT_SVA). The IOMMU driver
> specific
> restriction might be put there?
> 
> >
> > too many returns in this function; maybe consolidating all returns at the
> > end of the function would be more elegant?
> 
> I didn't find cleanup room after a quick scan of the code. But sure, let
> me go through code again offline.
>

If we do care:

+static u32 dev_iommu_get_max_pasids(struct device *dev)
+{
+   u32 max_pasids = 0;
+   u32 num_bits = 0;
+   int ret;
+
+   if (dev_is_pci(dev)) {
+   ret = pci_max_pasids(to_pci_dev(dev));
+   if (ret > 0)
+   max_pasids = ret;
+   } else {
+   ret = device_property_read_u32(dev, "pasid-num-bits", &num_bits);
+   if (!ret)
+   max_pasids = 1UL << num_bits;
+   }
+
+   return min_t(u32, max_pasids, dev->iommu->iommu_dev->max_pasids);
+}

 


RE: [PATCH v8 01/11] iommu: Add max_pasids field in struct iommu_device

2022-06-10 Thread Tian, Kevin
> From: Jason Gunthorpe 
> Sent: Friday, June 10, 2022 7:53 AM
> 
> On Thu, Jun 09, 2022 at 05:25:42PM +, Raj, Ashok wrote:
> >
> > On Tue, Jun 07, 2022 at 09:49:32AM +0800, Lu Baolu wrote:
> > > Use this field to keep the number of supported PASIDs that an IOMMU
> > > hardware is able to support. This is a generic attribute of an IOMMU
> > > and lifting it into the per-IOMMU device structure makes it possible
> >
> > There is also a per-device attribute that tells what width is supported on
> > each device. When a device enables SVM, for simplicity we were proposing
> to
> > disable SVM on devices that don't support the full width, since it adds
> > additional complexity on the allocation interface. Is that factored into
> > this?
> 
> I would like to see the concept of a 'global PASID' and this is the
> only place we'd union all the per-device limits together. SVM can
> optionally use a global PASID and ENQCMD requires it, but I don't want
> to see the core code assuming everything is ENQCMD.
> 

Agree, and I think this is what this v8 is leaning toward. The core
code simply populates the pasid entry of the target device w/o
assuming the pasid is 'local' or 'global'. Then the sva helpers
actually decide how the pasid is allocated.

Currently only global pasids are supported, which is how sva has
worked so far. We don't plan to change that in this series.

In parallel Jacob is working on per-device local pasids which will
then be used by his DMA API pasid work and also iommufd.

Thanks
Kevin


[PATCH] dma-direct: take dma-ranges/offsets into account in resource mapping

2022-06-10 Thread Serge Semin via iommu
A basic device-specific linear memory mapping was introduced back in
commit ("dma: Take into account dma_pfn_offset") as a single-valued offset
preserved in the device.dma_pfn_offset field, which was initialized for
instance by means of the "dma-ranges" DT property. Afterwards the
functionality was extended to support more than one device-specific region
defined in the device.dma_range_map list of maps. But all of these
improvements concerned the single-pointer, page and sg DMA-mapping
methods, while the system resource mapping function turned out to miss
the corresponding modification. Thus the dma_direct_map_resource() method
currently just casts the CPU physical address to the device DMA address
with no dma-ranges-based mapping taken into account, which is obviously
wrong. Let's fix it by using the phys_to_dma_direct() method to get the
device-specific bus address from the passed memory resource for the case
of directly mapped DMA.
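
For illustration, the dma-ranges translation the resource path should
honour works conceptually as follows (a simplified sketch only; the
struct mirrors the idea of the kernel's bus_dma_region, not the exact
kernel helpers):

/* Sketch: translate a CPU physical address through one "dma-ranges"
 * window; addresses outside the window cannot be mapped directly.
 */
struct dma_range_sketch {
	phys_addr_t cpu_start;	/* CPU physical base of the window */
	dma_addr_t  dma_start;	/* device-visible base of the window */
	u64         size;	/* window size */
};

static dma_addr_t sketch_translate(const struct dma_range_sketch *m,
				   phys_addr_t paddr)
{
	if (paddr < m->cpu_start || paddr - m->cpu_start >= m->size)
		return DMA_MAPPING_ERROR;	/* not covered by this entry */

	return (dma_addr_t)(paddr - m->cpu_start) + m->dma_start;
}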

Fixes: 25f1e1887088 ("dma: Take into account dma_pfn_offset")
Signed-off-by: Serge Semin 

---

After a long discussion with Christoph and Robin regarding this patch
here:
https://lore.kernel.org/lkml/20220324014836.19149-4-sergey.se...@baikalelectronics.ru
and here
https://lore.kernel.org/linux-pci/20220503225104.12108-2-sergey.se...@baikalelectronics.ru/
It was decided to consult a wider audience of maintainers on whether it's
ok to accept the change as-is or whether a more sophisticated solution
needs to be found for the non-linear direct MMIO mapping.

Cc: Christoph Hellwig 
Cc: Robin Murphy 
Cc: Manivannan Sadhasivam 

file: arch/arm/mach-orion5x/board-dt.c
Cc: Andrew Lunn 
Cc: Sebastian Hesselbarth 
Cc: Gregory Clement 
Cc: linux-arm-ker...@lists.infradead.org

file: drivers/crypto/marvell/cesa/cesa.c
Cc: Srujana Challa 
Cc: Arnaud Ebalard 
Cc: Boris Brezillon 
Cc: linux-cry...@vger.kernel.org

file: drivers/dma/{fsl-edma-common.c,pl330.c,sh/rcar-dmac.c}
Cc: Vinod Koul 
Cc: dmaeng...@vger.kernel.org

file: arch/arm/boot/dts/{vfxxx.dtsi,ls1021a.dtsi,imx7ulp.dtsi,fsl-ls1043a.dtsi}
Cc: Shawn Guo 
Cc: Sascha Hauer 
Cc: Li Yang 
Cc: linux-arm-ker...@lists.infradead.org

file: arch/arm/boot/dts/r8a77*.dtsi, arch/arm64/boot/dts/renesas/r8a77*.dtsi
Cc: Geert Uytterhoeven 
Cc: Magnus Damm 
Cc: linux-renesas-...@vger.kernel.org

file: drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
Cc: Alex Deucher 
Cc: "Christian König" 
Cc: "Pan, Xinhui" 

file: drivers/gpu/drm/virtio/virtgpu_vram.c
Cc: David Airlie 
Cc: Gerd Hoffmann 

file: drivers/media/common/videobuf2/videobuf2-dma-contig.c
Cc: Tomasz Figa 
Cc: Marek Szyprowski 

file: drivers/misc/habanalabs/common/memory.c
Cc: Oded Gabbay 
Cc: Arnd Bergmann 
Cc: Greg Kroah-Hartman 

file: drivers/mtd/nand/raw/qcom_nandc.c
Cc: Manivannan Sadhasivam 

file: arch/arm64/boot/dts/qcom/{ipq8074.dtsi,ipq6018.dtsi,qcom-sdx55.dtsi,qcom-ipq4019.dtsi,qcom-ipq8064.dtsi}
Cc: Andy Gross 
Cc: Bjorn Andersson 
Cc: linux-arm-...@vger.kernel.org

file: drivers/net/ethernet/marvell/octeontx2/af/rvu.c
Cc: Sunil Goutham 
Cc: Linu Cherian 
Cc: Geetha sowjanya 

file: drivers/ntb/ntb_transport.c
Cc: Jon Mason 
Cc: Dave Jiang 
Cc: n...@lists.linux.dev
---
 kernel/dma/direct.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 9743c6ccce1a..bc06db74dfdb 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -497,7 +497,7 @@ int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
 dma_addr_t dma_direct_map_resource(struct device *dev, phys_addr_t paddr,
size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
-   dma_addr_t dma_addr = paddr;
+   dma_addr_t dma_addr = phys_to_dma_direct(dev, paddr);
 
if (unlikely(!dma_capable(dev, dma_addr, size, false))) {
dev_err_once(dev,
-- 
2.35.1


Re: [PATCH v8 04/11] iommu: Add sva iommu_domain support

2022-06-10 Thread Baolu Lu

On 2022/6/10 04:25, Raj, Ashok wrote:

Hi Baolu


Hi Ashok,



some minor nits.


Thanks for your comments.



On Tue, Jun 07, 2022 at 09:49:35AM +0800, Lu Baolu wrote:

The sva iommu_domain represents a hardware pagetable that the IOMMU
hardware could use for SVA translation. This adds some infrastructure
to support SVA domain in the iommu common layer. It includes:

- Extend the iommu_domain to support a new IOMMU_DOMAIN_SVA domain
   type. The IOMMU drivers that support SVA should provide the sva
   domain specific iommu_domain_ops.
- Add a helper to allocate an SVA domain. The iommu_domain_free()
   is still used to free an SVA domain.
- Add helpers to attach an SVA domain to a device and the reverse
   operation.

Some buses, like PCI, route packets without considering the PASID value.
Thus a DMA target address with PASID might be treated as P2P if the
address falls into the MMIO BAR of other devices in the group. To make
things simple, the attach/detach interfaces only apply to devices
belonging to the singleton groups, and the singleton is immutable in
fabric i.e. not affected by hotplug.

The iommu_attach/detach_device_pasid() can be used for other purposes,
such as kernel DMA with pasid, mediation device, etc.
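
A rough usage sketch of the new helpers (a sketch only, with error
handling abbreviated; helper names follow the description above and may
not match the series exactly):

static int example_sva_bind(struct device *dev, struct mm_struct *mm,
			    ioasid_t pasid)
{
	struct iommu_domain *domain;
	int ret;

	/* Allocate an SVA domain sharing the CPU page table of @mm. */
	domain = iommu_sva_domain_alloc(dev, mm);
	if (!domain)
		return -ENOMEM;

	ret = iommu_attach_device_pasid(domain, dev, pasid);
	if (ret) {
		iommu_domain_free(domain);
		return ret;
	}

	/* ... the device may now issue PASID-tagged DMA on @mm ... */

	iommu_detach_device_pasid(domain, dev, pasid);
	iommu_domain_free(domain);
	return 0;
}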

Suggested-by: Jean-Philippe Brucker 
Suggested-by: Jason Gunthorpe 
Signed-off-by: Lu Baolu 
Reviewed-by: Jean-Philippe Brucker 
---
  include/linux/iommu.h | 45 -
  drivers/iommu/iommu.c | 93 +++
  2 files changed, 136 insertions(+), 2 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 3fbad42c0bf8..9173c5741447 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -64,6 +64,9 @@ struct iommu_domain_geometry {
  #define __IOMMU_DOMAIN_PT (1U << 2)  /* Domain is identity mapped   */
  #define __IOMMU_DOMAIN_DMA_FQ (1U << 3)  /* DMA-API uses flush queue*/
  
+#define __IOMMU_DOMAIN_SHARED	(1U << 4)  /* Page table shared from CPU  */


s/from CPU/with CPU


Sure.




+#define __IOMMU_DOMAIN_HOST_VA (1U << 5)  /* Host CPU virtual address */


Do you mean general CPU VA? or Host CPU VA, I'm reading the latter as 2nd
stage?


Host CPU VA. In the near future, we will add another flag _GUEST_VA, so
that the shared page table types are distinguished (see the sketch
below).
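
For example, the distinction could eventually look like this (purely
illustrative; the flag name and bit position are hypothetical):

#define __IOMMU_DOMAIN_GUEST_VA	(1U << 6)  /* Guest CPU virtual address */

#define IOMMU_DOMAIN_SVA_GUEST	(__IOMMU_DOMAIN_SHARED |	\
				 __IOMMU_DOMAIN_GUEST_VA)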




+
  /*
   * This are the possible domain-types
   *
@@ -86,15 +89,24 @@ struct iommu_domain_geometry {
  #define IOMMU_DOMAIN_DMA_FQ   (__IOMMU_DOMAIN_PAGING |\
 __IOMMU_DOMAIN_DMA_API |   \
 __IOMMU_DOMAIN_DMA_FQ)
+#define IOMMU_DOMAIN_SVA   (__IOMMU_DOMAIN_SHARED |\
+__IOMMU_DOMAIN_HOST_VA)


Doesn't shared automatically mean CPU VA? Do we need another flag?


Yes. Shared means CPU VA, but there are many types. Besides the above
two, we also see the shared KVM/EPT page tables.



  
  struct iommu_domain {

unsigned type;
const struct iommu_domain_ops *ops;
unsigned long pgsize_bitmap;/* Bitmap of page sizes in use */
-   iommu_fault_handler_t handler;
-   void *handler_token;
struct iommu_domain_geometry geometry;
struct iommu_dma_cookie *iova_cookie;
+   union {
+   struct {/* IOMMU_DOMAIN_DMA */
+   iommu_fault_handler_t handler;
+   void *handler_token;
+   };
+   struct {/* IOMMU_DOMAIN_SVA */
+   struct mm_struct *mm;
+   };
+   };
  };
  
  static inline bool iommu_is_dma_domain(struct iommu_domain *domain)

@@ -262,6 +274,8 @@ struct iommu_ops {
   * struct iommu_domain_ops - domain specific operations
   * @attach_dev: attach an iommu domain to a device
   * @detach_dev: detach an iommu domain from a device
+ * @set_dev_pasid: set an iommu domain to a pasid of device
+ * @block_dev_pasid: block pasid of device from using iommu domain
   * @map: map a physically contiguous memory region to an iommu domain
   * @map_pages: map a physically contiguous set of pages of the same size to
   * an iommu domain.
@@ -282,6 +296,10 @@ struct iommu_ops {
  struct iommu_domain_ops {
int (*attach_dev)(struct iommu_domain *domain, struct device *dev);
void (*detach_dev)(struct iommu_domain *domain, struct device *dev);
+   int (*set_dev_pasid)(struct iommu_domain *domain, struct device *dev,
+ioasid_t pasid);
+   void (*block_dev_pasid)(struct iommu_domain *domain, struct device *dev,
+   ioasid_t pasid);
  
  	int (*map)(struct iommu_domain *domain, unsigned long iova,

   phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
@@ -679,6 +697,12 @@ int iommu_group_claim_dma_owner(struct iommu_group *group, void *owner);
  void iommu_group_release_dma_owner(struct iommu_group *group);
  bool iommu_group_dma_owner_claimed(struct iommu_group *group);
  
+struct 

Re: [PATCH v8 02/11] iommu: Add max_pasids field in struct dev_iommu

2022-06-10 Thread Baolu Lu

On 2022/6/10 03:01, Raj, Ashok wrote:

On Tue, Jun 07, 2022 at 09:49:33AM +0800, Lu Baolu wrote:

Use this field to save the number of PASIDs that a device is able to
consume. It is a generic attribute of a device and lifting it into the
per-device dev_iommu struct could help to avoid the boilerplate code
in various IOMMU drivers.

Signed-off-by: Lu Baolu 
---
  include/linux/iommu.h |  2 ++
   drivers/iommu/iommu.c | 26 ++++++++++++++++++++++++++
  2 files changed, 28 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 03fbb1b71536..d50afb2c9a09 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -364,6 +364,7 @@ struct iommu_fault_param {
   * @fwspec:IOMMU fwspec data
   * @iommu_dev: IOMMU device this device is linked to
   * @priv:  IOMMU Driver private data
+ * @max_pasids:  number of PASIDs device can consume
   *
   * TODO: migrate other per device data pointers under iommu_dev_data, e.g.
   *struct iommu_group  *iommu_group;
@@ -375,6 +376,7 @@ struct dev_iommu {
struct iommu_fwspec *fwspec;
struct iommu_device *iommu_dev;
void*priv;
+   u32 max_pasids;
  };
  
  int iommu_device_register(struct iommu_device *iommu,

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 847ad47a2dfd..adac85ccde73 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -20,6 +20,7 @@
  #include 
  #include 
  #include 
+#include <linux/pci.h>


Is this needed for this patch?


Yes. It's for pci_max_pasids().




  #include 
  #include 
  #include 
@@ -218,6 +219,30 @@ static void dev_iommu_free(struct device *dev)
kfree(param);
  }
  
+static u32 dev_iommu_get_max_pasids(struct device *dev)

+{
+   u32 max_pasids = dev->iommu->iommu_dev->max_pasids;
+   u32 num_bits;
+   int ret;
+
+   if (!max_pasids)
+   return 0;
+
+   if (dev_is_pci(dev)) {
+   ret = pci_max_pasids(to_pci_dev(dev));
+   if (ret < 0)
+   return 0;
+
+   return min_t(u32, max_pasids, ret);


Ah.. that answers my other question to consider device pasid-max. I guess
if we need any enforcement of restricting devices that aren't supporting
the full PASID, that will be done by some higher layer?


The mm->pasid style of SVA is explicitly enabled through
iommu_dev_enable_feature(IOMMU_DEV_FEAT_SVA). The IOMMU driver specific
restriction might be put there?



too many returns in this function; maybe consolidating all returns at the
end of the function would be more elegant?


I didn't find cleanup room after a quick scan of the code. But sure, let
me go through code again offline.




+   }
+
+   ret = device_property_read_u32(dev, "pasid-num-bits", &num_bits);
+   if (ret)
+   return 0;
+
+   return min_t(u32, max_pasids, 1UL << num_bits);
+}
+
  static int __iommu_probe_device(struct device *dev, struct list_head *group_list)
  {
const struct iommu_ops *ops = dev->bus->iommu_ops;
@@ -243,6 +268,7 @@ static int __iommu_probe_device(struct device *dev, struct list_head *group_list
}
  
  	dev->iommu->iommu_dev = iommu_dev;

+   dev->iommu->max_pasids = dev_iommu_get_max_pasids(dev);
  
  	group = iommu_group_get_for_dev(dev);

if (IS_ERR(group)) {
--
2.25.1



Best regards,
Baolu


Re: [PATCH v8 01/11] iommu: Add max_pasids field in struct iommu_device

2022-06-10 Thread Baolu Lu

On 2022/6/10 01:25, Raj, Ashok wrote:

diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 4f29139bbfc3..e065cbe3c857 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -479,7 +479,6 @@ enum {
  #define VTD_FLAG_IRQ_REMAP_PRE_ENABLED(1 << 1)
  #define VTD_FLAG_SVM_CAPABLE  (1 << 2)
  
-extern int intel_iommu_sm;

  extern spinlock_t device_domain_lock;
  
  #define sm_supported(iommu)	(intel_iommu_sm && ecap_smts((iommu)->ecap))

  extern struct context_entry *iommu_context_addr(struct intel_iommu *iommu, u8 bus,
  extern const struct iommu_ops intel_iommu_ops;
  
  #ifdef CONFIG_INTEL_IOMMU

+extern int intel_iommu_sm;
  extern int iommu_calculate_agaw(struct intel_iommu *iommu);
  extern int iommu_calculate_max_sagaw(struct intel_iommu *iommu);
  extern int dmar_disabled;
@@ -802,6 +802,7 @@ static inline int iommu_calculate_max_sagaw(struct intel_iommu *iommu)
  }
  #define dmar_disabled (1)
  #define intel_iommu_enabled (0)
+#define intel_iommu_sm (0)

Is the above part of this patch? Or should be moved up somewhere?


This is to make pasid_supported() usable in dmar.c. It's only needed by
the change in this patch. I should make this clear in the commit
message. :-)

Best regards,
baolu


Re: [RFC PATCHES 1/2] iommu: Add RCU-protected page free support

2022-06-10 Thread Baolu Lu

On 2022/6/10 01:06, Raj, Ashok wrote:

On Thu, Jun 09, 2022 at 03:08:10PM +0800, Lu Baolu wrote:

The IOMMU page tables are updated using iommu_map/unmap() interfaces.
Currently, there is no mandatory requirement for drivers to use locks
to ensure concurrent updates to page tables, because it's assumed that
overlapping IOVA ranges do not have concurrent updates. Therefore the
IOMMU drivers only need to take care of concurrent updates to level
page table entries.


The last part doesn't read well..
s/updates to level page table entries/ updates to page-table entries at the
same level



But enabling new features challenges this assumption. For example, the
hardware assisted dirty page tracking feature requires scanning page
tables in interfaces other than mapping and unmapping. This might result
in a use-after-free scenario in which a level page table has been freed
by the unmap() interface, while another thread is scanning the next level
page table.

This adds RCU-protected page free support so that the pages are really
freed and reused after a RCU grace period. Hence, the page tables are
safe for scanning within a rcu_read_lock critical region. Considering
that scanning the page table is a rare case, this also adds a domain
flag and the RCU-protected page free is only used when this flat is set.


s/flat/flag


Above updated. Thank you!



Signed-off-by: Lu Baolu 
---
  include/linux/iommu.h |  9 +++++++++
  drivers/iommu/iommu.c | 23 +++++++++++++++++++++++
  2 files changed, 32 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 5e1afe169549..6f68eabb8567 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -95,6 +95,7 @@ struct iommu_domain {
void *handler_token;
struct iommu_domain_geometry geometry;
struct iommu_dma_cookie *iova_cookie;
+   unsigned long concurrent_traversal:1;


Does this need to be a bitfield? Even though you need just one bit
now, you could probably make it mask bits?



As discussed in another reply, I am about to drop the driver opt-in and
wrap the helper around the iommu_iotlb_gather.




  };
  
  static inline bool iommu_is_dma_domain(struct iommu_domain *domain)

@@ -657,6 +658,12 @@ static inline void dev_iommu_priv_set(struct device *dev, void *priv)
dev->iommu->priv = priv;
  }
  
+static inline void domain_set_concurrent_traversal(struct iommu_domain *domain,

+  bool value)
+{
+   domain->concurrent_traversal = value;
+}
+
  int iommu_probe_device(struct device *dev);
  void iommu_release_device(struct device *dev);
  
@@ -677,6 +684,8 @@ int iommu_group_claim_dma_owner(struct iommu_group *group, void *owner);

  void iommu_group_release_dma_owner(struct iommu_group *group);
  bool iommu_group_dma_owner_claimed(struct iommu_group *group);
  
+void iommu_free_pgtbl_pages(struct iommu_domain *domain,

+   struct list_head *pages);
  #else /* CONFIG_IOMMU_API */
  
  struct iommu_ops {};

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 847ad47a2dfd..ceeb97ebe3e2 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -3252,3 +3252,26 @@ bool iommu_group_dma_owner_claimed(struct iommu_group *group)
return user;
  }
  EXPORT_SYMBOL_GPL(iommu_group_dma_owner_claimed);
+
+static void pgtble_page_free_rcu(struct rcu_head *rcu)


maybe the names can be consistent? pgtble_ vs pgtbl below.

vote to drop the 'e' :-)


Updated.




+{
+   struct page *page = container_of(rcu, struct page, rcu_head);
+
+   __free_pages(page, 0);
+}
+
+void iommu_free_pgtbl_pages(struct iommu_domain *domain,
+   struct list_head *pages)
+{
+   struct page *page, *next;
+
+   if (!domain->concurrent_traversal) {
+   put_pages_list(pages);
+   return;
+   }
+
+   list_for_each_entry_safe(page, next, pages, lru) {
+   list_del(&page->lru);
+   call_rcu(&page->rcu_head, pgtble_page_free_rcu);
+   }
+}
--
2.25.1



Best regards,
baolu


Re: [PATCH v2] iommu/dma: Add config for PCI SAC address trick

2022-06-10 Thread Christoph Hellwig
On Thu, Jun 09, 2022 at 04:12:10PM +0100, Robin Murphy wrote:
> +   If you have a modern PCI Express based system, this feature mostly just

Overly long line here.

Otherwise looks good:

Reviewed-by: Christoph Hellwig 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCHES 1/2] iommu: Add RCU-protected page free support

2022-06-10 Thread Baolu Lu

On 2022/6/9 21:32, Jason Gunthorpe wrote:

On Thu, Jun 09, 2022 at 02:19:06PM +0100, Robin Murphy wrote:


Is there a significant benefit to keeping both paths, or could we get away
with just always using RCU? Realistically, pagetable pages aren't likely to
be freed all that frequently, except perhaps at domain teardown, but that
shouldn't really be performance-critical, and I guess we could stick an RCU
sync point in iommu_domain_free() if we're really worried about releasing
larger quantities of pages back to the allocator ASAP?


I think you are right, anything that uses the iommu_iotlb_gather may
as well use RCU too.

IIRC the allocators already know that RCU is often sitting on
freed memory and have some contingency to flush it out before OOMing,
so nothing special should be needed.


Fair enough. How about the code below?

static void pgtble_page_free_rcu(struct rcu_head *rcu)
{
struct page *page = container_of(rcu, struct page, rcu_head);

__free_pages(page, 0);
}

/*
 * Free pages gathered in the freelist of iommu_iotlb_gather. Free them via
 * RCU so that lock-free page table walks remain safe.
 */
void iommu_free_iotlb_gather_pages(struct iommu_iotlb_gather *iotlb_gather)
{
struct page *page, *next;

	list_for_each_entry_safe(page, next, &iotlb_gather->freelist, lru) {
		list_del(&page->lru);
		call_rcu(&page->rcu_head, pgtble_page_free_rcu);
	}
}

Best regards,
baolu