Re: [PATCH 1/4] ARM/omap1: switch to use dma_direct_set_offset for lbus DMA offsets

2020-09-17 Thread Tony Lindgren
* Christoph Hellwig  [200917 17:37]:
> Switch the omap1510 platform ohci device to use dma_direct_set_offset
> to set the DMA offset instead of using direct hooks into the DMA
> mapping code and remove the now unused hooks.

Looks nice to me :) I probably can't test this for a few more weeks
though, but hopefully Aaro or Janusz (added to Cc) can test it.

Regards,

Tony

> Signed-off-by: Christoph Hellwig 
> ---
>  arch/arm/include/asm/dma-direct.h | 18 -
>  arch/arm/mach-omap1/include/mach/memory.h | 31 ---
>  arch/arm/mach-omap1/usb.c | 22 
>  3 files changed, 22 insertions(+), 49 deletions(-)
> 
> diff --git a/arch/arm/include/asm/dma-direct.h 
> b/arch/arm/include/asm/dma-direct.h
> index 436544aeb83405..77fcb7ee5ec907 100644
> --- a/arch/arm/include/asm/dma-direct.h
> +++ b/arch/arm/include/asm/dma-direct.h
> @@ -9,7 +9,6 @@
>   * functions used internally by the DMA-mapping API to provide DMA
>   * addresses. They must not be used by drivers.
>   */
> -#ifndef __arch_pfn_to_dma
>  static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
>  {
>   if (dev && dev->dma_range_map)
> @@ -34,23 +33,6 @@ static inline dma_addr_t virt_to_dma(struct device *dev, 
> void *addr)
>   return (dma_addr_t)__virt_to_bus((unsigned long)(addr));
>  }
>  
> -#else
> -static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
> -{
> - return __arch_pfn_to_dma(dev, pfn);
> -}
> -
> -static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
> -{
> - return __arch_dma_to_pfn(dev, addr);
> -}
> -
> -static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
> -{
> - return __arch_virt_to_dma(dev, addr);
> -}
> -#endif
> -
>  static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
>  {
>   unsigned int offset = paddr & ~PAGE_MASK;
> diff --git a/arch/arm/mach-omap1/include/mach/memory.h 
> b/arch/arm/mach-omap1/include/mach/memory.h
> index 1142560e0078f5..36bccb6ab8 100644
> --- a/arch/arm/mach-omap1/include/mach/memory.h
> +++ b/arch/arm/mach-omap1/include/mach/memory.h
> @@ -14,42 +14,11 @@
>   * OMAP-1510 bus address is translated into a Local Bus address if the
>   * OMAP bus type is lbus. We do the address translation based on the
>   * device overriding the defaults used in the dma-mapping API.
> - * Note that the is_lbus_device() test is not very efficient on 1510
> - * because of the strncmp().
>   */
> -#if defined(CONFIG_ARCH_OMAP15XX) && !defined(__ASSEMBLER__)
>  
>  /*
>   * OMAP-1510 Local Bus address offset
>   */
>  #define OMAP1510_LB_OFFSET   UL(0x30000000)
>  
> -#define virt_to_lbus(x)  ((x) - PAGE_OFFSET + OMAP1510_LB_OFFSET)
> -#define lbus_to_virt(x)  ((x) - OMAP1510_LB_OFFSET + PAGE_OFFSET)
> -#define is_lbus_device(dev)  (cpu_is_omap15xx() && dev && 
> (strncmp(dev_name(dev), "ohci", 4) == 0))
> -
> -#define __arch_pfn_to_dma(dev, pfn)  \
> - ({ dma_addr_t __dma = __pfn_to_phys(pfn); \
> -if (is_lbus_device(dev)) \
> - __dma = __dma - PHYS_OFFSET + OMAP1510_LB_OFFSET; \
> -__dma; })
> -
> -#define __arch_dma_to_pfn(dev, addr) \
> - ({ dma_addr_t __dma = addr; \
> -if (is_lbus_device(dev)) \
> - __dma += PHYS_OFFSET - OMAP1510_LB_OFFSET;  \
> -__phys_to_pfn(__dma);\
> - })
> -
> -#define __arch_dma_to_virt(dev, addr)({ (void *) 
> (is_lbus_device(dev) ? \
> - lbus_to_virt(addr) : \
> - __phys_to_virt(addr)); })
> -
> -#define __arch_virt_to_dma(dev, addr)({ unsigned long __addr = 
> (unsigned long)(addr); \
> -(dma_addr_t) (is_lbus_device(dev) ? \
> - virt_to_lbus(__addr) : \
> - __virt_to_phys(__addr)); })
> -
> -#endif   /* CONFIG_ARCH_OMAP15XX */
> -
>  #endif
> diff --git a/arch/arm/mach-omap1/usb.c b/arch/arm/mach-omap1/usb.c
> index d8e9bbda8f7bdd..ba8566204ea9f4 100644
> --- a/arch/arm/mach-omap1/usb.c
> +++ b/arch/arm/mach-omap1/usb.c
> @@ -9,6 +9,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  #include 
> @@ -542,6 +543,25 @@ static u32 __init omap1_usb2_init(unsigned nwires, 
> unsigned alt_pingroup)
>  /* ULPD_APLL_CTRL */
>  #define APLL_NDPLL_SWITCH(1 << 0)
>  
> +static int omap_1510_usb_ohci_notifier(struct notifier_block *nb,
> + unsigned long event, void *data)
> +{
> + struct device *dev = data;
> +
> + if (event != BUS_NOTIFY_ADD_DEVICE)
> + return NOTIFY_DONE;
> +
> + if (strncmp(dev_name(dev), "ohci", 4) == 0 &&
> + dma_direct_set_offset(dev, PHYS_OFFSET, OMAP1510_LB_OFFSET,
> + (u64)-1))
> 

Re: [PATCH 3/4] ARM/dma-mapping: don't handle NULL devices in dma-direct.h

2020-09-17 Thread Christoph Hellwig
On Thu, Sep 17, 2020 at 07:50:10PM +0100, Russell King - ARM Linux admin wrote:
> On Thu, Sep 17, 2020 at 07:32:28PM +0200, Christoph Hellwig wrote:
> > The DMA API removed support for not passing in a device a long time
> > ago, so remove the NULL checks.
> 
> What happens with ISA devices?

For actual drivers they've been switched to struct isa_driver, which
provides a struct device.  For some of the special case like the
arch/arm/kernel/dma-isa.c we now use static struct device instances.
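
(For illustration, the isa_driver shape that provides that struct device --
a minimal sketch with made-up "foo" names; the match/probe signatures are
the ones from include/linux/isa.h:)

	#include <linux/isa.h>
	#include <linux/module.h>

	/* each of the ndev possible devices gets a real struct device, so
	 * DMA API calls in the driver always have a non-NULL dev */
	static int foo_match(struct device *dev, unsigned int id)
	{
		return 1;	/* detect the hardware here */
	}

	static int foo_probe(struct device *dev, unsigned int id)
	{
		/* dev can be handed to dma_map_single() etc. from here on */
		return 0;
	}

	static struct isa_driver foo_isa_driver = {
		.match	= foo_match,
		.probe	= foo_probe,
		.driver	= { .name = "foo" },
	};

	module_isa_driver(foo_isa_driver, 1);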


Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs

2020-09-17 Thread Jason Wang


On 2020/9/18 2:17 AM, Jacob Pan (Jun) wrote:

Hi Jason,
On Thu, 17 Sep 2020 11:53:49 +0800, Jason Wang 
wrote:


On 2020/9/17 7:09 AM, Jacob Pan (Jun) wrote:

Hi Jason,
On Wed, 16 Sep 2020 15:38:41 -0300, Jason Gunthorpe 
wrote:
  

On Wed, Sep 16, 2020 at 11:21:10AM -0700, Jacob Pan (Jun) wrote:

Hi Jason,
On Wed, 16 Sep 2020 14:01:13 -0300, Jason Gunthorpe
 wrote:
  

On Wed, Sep 16, 2020 at 09:33:43AM -0700, Raj, Ashok wrote:

On Wed, Sep 16, 2020 at 12:07:54PM -0300, Jason Gunthorpe
wrote:

On Tue, Sep 15, 2020 at 05:22:26PM -0700, Jacob Pan (Jun)
wrote:

If user space wants to bind page tables, create the PASID
with /dev/sva, use ioctls there to setup the page table
the way it wants, then pass the now configured PASID to a
driver that can use it.

Are we talking about bare metal SVA?

What a weird term.

Glad you noticed it at v7 :-)

Any suggestions on something less weird than
Shared Virtual Addressing? There is a reason why we moved from
SVM to SVA.

SVA is fine, what is "bare metal" supposed to mean?
  

What I meant here is sharing virtual address between DMA and host
process. This requires devices perform DMA request with PASID and
use IOMMU first level/stage 1 page tables.
This can be further divided into 1) user SVA 2) supervisor SVA
(sharing init_mm)
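
(A minimal sketch of the user-SVA bind path as it exists in-tree around
v5.9 -- the iommu_sva_* calls are real, the foo_* wrapper is illustrative:)

	#include <linux/iommu.h>
	#include <linux/sched.h>

	static int foo_enable_sva(struct device *dev)
	{
		struct iommu_sva *handle;
		u32 pasid;

		/* share current->mm with the device: DMA tagged with the
		 * returned PASID walks the CPU (first level) page tables */
		handle = iommu_sva_bind_device(dev, current->mm, NULL);
		if (IS_ERR(handle))
			return PTR_ERR(handle);

		pasid = iommu_sva_get_pasid(handle);
		if (pasid == IOMMU_PASID_INVALID) {
			iommu_sva_unbind_device(handle);
			return -ENODEV;
		}
		/* program pasid into the device, then start DMA */
		return 0;
	}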

My point is that /dev/sva is not useful here since the driver can
perform PASID allocation while doing SVA bind.

No, you are thinking too small.

Look at VDPA, it has a SVA uAPI. Some HW might use PASID for the
SVA.

Could you point to me the SVA UAPI? I couldn't find it in the
mainline. Seems VDPA uses VHOST interface?


It's the vhost_iotlb_msg defined in uapi/linux/vhost_types.h.


Thanks for the pointer, for complete vSVA functionality we would need
1 TLB flush (IOTLB and PASID cache etc.)
2 PASID alloc/free
3 bind/unbind page tables or PASID tables
4 Page request service

Seems vhost_iotlb_msg can be used for #1 partially. And the
proposal is to pluck out the rest into /dev/sva? Seems awkward as Alex
pointed out earlier for similar situation in VFIO.



Considering it doesn't have any PASID support yet, my understanding is that 
if we go with /dev/sva:


- vhost uAPI will still keep the uAPI for associating an ASID to a 
specific virtqueue

- except for this, we can use /dev/sva for all the rest (P)ASID operations




  

When VDPA is used by DPDK it makes sense that the PASID will be SVA
and 1:1 with the mm_struct.
  

I still don't see why bare metal DPDK needs to get a handle of the
PASID.


My understanding is that it may:

- have a unified uAPI with vSVA: alloc, bind, unbind, free

Got your point, but vSVA needs more than these



Yes it's just a subset of what vSVA required.





- leave the binding policy to userspace instead of using an
implied one in the kernel


Only if necessary.



Yes, I think it's all about visibility (flexibility) and manageability.

Consider a device that has queues A, B, C. We dedicate queues A and B to 
one PASID (for vSVA) and C to another PASID (for SVA). It looks to me 
like the current sva_bind() API doesn't support this. We still need an API 
for allocating a PASID for SVA and assigning it to the (mediated) device.  
This case is pretty common for implementing a shadow queue for a guest.






Perhaps the SVA patch would explain. Or are you talking about
vDPA DPDK process that is used to support virtio-net-pmd in the
guest?

When VDPA is used by qemu it makes sense that the PASID will be an
arbitary IOVA map constructed to be 1:1 with the guest vCPU
physical map. /dev/sva allows a single uAPI to do this kind of
setup, and qemu can support it while supporting a range of SVA
kernel drivers. VDPA and vfio-mdev are obvious initial targets.

*BOTH* are needed.

In general any uAPI for PASID should have the option to use either
the mm_struct SVA PASID *OR* a PASID from /dev/sva. It costs
virtually nothing to implement this in the driver as PASID is just
a number, and gives so much more flexibility.
  

Not really nothing in terms of PASID life cycles. For example, if
user uses uacce interface to open an accelerator, it gets an
FD_acc. Then it opens /dev/sva to allocate PASID then get another
FD_pasid. Then we pass FD_pasid to the driver to bind page tables,
perhaps multiple drivers. Now we have to worry about if FD_pasid
gets closed before FD_acc(s) are closed and all these race conditions.


I'm not sure I understand this. But this demonstrates the flexibility
of a unified uAPI. E.g. it allows vDPA and VFIO devices to use the
same PASID which can be shared with a process in the guest.


This is for user DMA not for vSVA. I was contending that /dev/sva
creates unnecessary steps for such usage.



A question here is where the PASID management is expected to be done. 
I'm not quite sure the silent 1:1 binding done in intel_svm_bind_mm() 
can satisfy the requirements of a management layer.





For vSVA, I think vDPA and VFIO can potentially share but I am not
seeing convincing benefits.

If a guest process wants 

[PATCH] iommu/qcom: add missing put_device() call in qcom_iommu_of_xlate()

2020-09-17 Thread Yu Kuai
If of_find_device_by_node() succeeds, qcom_iommu_of_xlate() doesn't have
a corresponding put_device(). Thus add put_device() to fix the exception
handling for this function implementation.

Fixes: e86d1aa8b60f ("iommu/arm-smmu: Move Arm SMMU drivers into their own 
subdirectory")
Signed-off-by: Yu Kuai 
---
 drivers/iommu/arm/arm-smmu/qcom_iommu.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c 
b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
index 9535a6af7553..7a9594d221e0 100644
--- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
@@ -584,8 +584,10 @@ static int qcom_iommu_of_xlate(struct device *dev, struct 
of_phandle_args *args)
 * index into qcom_iommu->ctxs:
 */
if (WARN_ON(asid < 1) ||
-   WARN_ON(asid > qcom_iommu->num_ctxs))
+   WARN_ON(asid > qcom_iommu->num_ctxs)) {
+   put_device(&iommu_pdev->dev);
return -EINVAL;
+   }
 
if (!dev_iommu_priv_get(dev)) {
dev_iommu_priv_set(dev, qcom_iommu);
@@ -595,6 +597,7 @@ static int qcom_iommu_of_xlate(struct device *dev, struct 
of_phandle_args *args)
 * banks are ok, but multiple devices are not:
 */
if (WARN_ON(qcom_iommu != dev_iommu_priv_get(dev)))
+   put_device(&iommu_pdev->dev);
return -EINVAL;
}
 
-- 
2.25.4
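
(The rule enforced here and in the exynos patch below, sketched:
of_find_device_by_node() takes a reference on the embedded struct device,
so every error path after a successful lookup has to drop it. A minimal
illustration -- the foo_attach() wrapper and its validation are made up:)

	#include <linux/of_platform.h>

	static int foo_attach(struct device_node *np)
	{
		struct platform_device *pdev = of_find_device_by_node(np);

		if (!pdev)
			return -ENODEV;

		if (!platform_get_drvdata(pdev)) {
			/* drop the reference of_find_device_by_node() took */
			put_device(&pdev->dev);
			return -ENODEV;
		}
		/* keep the reference for as long as pdev is in use */
		return 0;
	}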



[PATCH] iommu/exynos: add missing put_device() call in exynos_iommu_of_xlate()

2020-09-17 Thread Yu Kuai
If of_find_device_by_node() succeeds, exynos_iommu_of_xlate() doesn't have
a corresponding put_device(). Thus add put_device() to fix the exception
handling for this function implementation.

Fixes: aa759fd376fb ("iommu/exynos: Add callback for initializing devices from 
device tree")
Signed-off-by: Yu Kuai 
---
 drivers/iommu/exynos-iommu.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
index bad3c0ce10cb..de324b4eedfe 100644
--- a/drivers/iommu/exynos-iommu.c
+++ b/drivers/iommu/exynos-iommu.c
@@ -1295,13 +1295,17 @@ static int exynos_iommu_of_xlate(struct device *dev,
return -ENODEV;
 
data = platform_get_drvdata(sysmmu);
-   if (!data)
+   if (!data) {
+   put_device(&sysmmu->dev);
return -ENODEV;
+   }
 
if (!owner) {
owner = kzalloc(sizeof(*owner), GFP_KERNEL);
-   if (!owner)
+   if (!owner) {
+   put_device(&sysmmu->dev);
return -ENOMEM;
+   }
 
INIT_LIST_HEAD(&owner->controllers);
mutex_init(&owner->rpm_lock);
-- 
2.25.4



Re: [PATCH v8 0/9] x86: tag application address space for devices

2020-09-17 Thread Fenghua Yu
Hi, Joerg,

On Wed, Sep 16, 2020 at 10:06:02AM +0200, Joerg Roedel wrote:
> On Tue, Sep 15, 2020 at 09:30:04AM -0700, Fenghua Yu wrote:
> > Ashok Raj (1):
> >   Documentation/x86: Add documentation for SVA (Shared Virtual
> > Addressing)
> > 
> > Fenghua Yu (7):
> >   drm, iommu: Change type of pasid to u32
> >   iommu/vt-d: Change flags type to unsigned int in binding mm
> >   x86/cpufeatures: Enumerate ENQCMD and ENQCMDS instructions
> >   x86/msr-index: Define IA32_PASID MSR
> >   mm: Define pasid in mm
> >   x86/cpufeatures: Mark ENQCMD as disabled when configured out
> >   x86/mmu: Allocate/free PASID
> > 
> > Yu-cheng Yu (1):
> >   x86/fpu/xstate: Add supervisor PASID state for ENQCMD feature
> 
> For the IOMMU bits:
> 
> Acked-by: Joerg Roedel 

Thank you!

-Fenghua


Re: [trivial PATCH] treewide: Convert switch/case fallthrough; to break;

2020-09-17 Thread Jacob Keller



On 9/9/2020 1:55 PM, Keith Busch wrote:
> On Wed, Sep 09, 2020 at 01:06:39PM -0700, Joe Perches wrote:
>> diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c
>> index eea0f453cfb6..8aac5bc60f4c 100644
>> --- a/crypto/tcrypt.c
>> +++ b/crypto/tcrypt.c
>> @@ -2464,7 +2464,7 @@ static int do_test(const char *alg, u32 type, u32 
>> mask, int m, u32 num_mb)
>>  test_hash_speed("streebog512", sec,
>>  generic_hash_speed_template);
>>  if (mode > 300 && mode < 400) break;
>> -fallthrough;
>> +break;
>>  case 399:
>>  break;
> 
> Just imho, this change makes the preceding 'if' look even more
> pointless. Maybe the fallthrough was a deliberate choice? Not that my
> opinion matters here as I don't know this module, but it looked a bit
> odd to me.
> 

Yeah, this does look very odd...


Re: [PATCH 2/4] ARM/footbridge: switch to use dma_direct_set_offset for lbus DMA offsets

2020-09-17 Thread Russell King - ARM Linux admin
On Thu, Sep 17, 2020 at 07:32:27PM +0200, Christoph Hellwig wrote:
>  static int __init cats_pci_init(void)
>  {
> - if (machine_is_cats())
> - pci_common_init(&cats_pci);
> + if (!machine_is_cats())
> + return 0;
> + bus_register_notifier(&pci_bus_type, &footbridge_pci_dma_nb);
> + pci_common_init(&cats_pci);

I'd prefer these things to retain a positive-logic construct, so:

if (machine_is_cats()) {
	bus_register_notifier(&pci_bus_type, &footbridge_pci_dma_nb);
	pci_common_init(&cats_pci);
}

It's the same number of lines.

Otherwise, I think it's fine. I'll try to find some spare time to give
it a go on a Netwinder.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH 3/4] ARM/dma-mapping: don't handle NULL devices in dma-direct.h

2020-09-17 Thread Russell King - ARM Linux admin
On Thu, Sep 17, 2020 at 07:32:28PM +0200, Christoph Hellwig wrote:
> The DMA API removed support for not passing in a device a long time
> ago, so remove the NULL checks.

What happens with ISA devices?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs

2020-09-17 Thread Jacob Pan (Jun)
Hi Jason,
On Thu, 17 Sep 2020 11:53:49 +0800, Jason Wang 
wrote:

> On 2020/9/17 7:09 AM, Jacob Pan (Jun) wrote:
> > Hi Jason,
> > On Wed, 16 Sep 2020 15:38:41 -0300, Jason Gunthorpe 
> > wrote:
> >  
> >> On Wed, Sep 16, 2020 at 11:21:10AM -0700, Jacob Pan (Jun) wrote:  
> >>> Hi Jason,
> >>> On Wed, 16 Sep 2020 14:01:13 -0300, Jason Gunthorpe
> >>>  wrote:
> >>>  
>  On Wed, Sep 16, 2020 at 09:33:43AM -0700, Raj, Ashok wrote:  
> > On Wed, Sep 16, 2020 at 12:07:54PM -0300, Jason Gunthorpe
> > wrote:  
> >> On Tue, Sep 15, 2020 at 05:22:26PM -0700, Jacob Pan (Jun)
> >> wrote:  
>  If user space wants to bind page tables, create the PASID
>  with /dev/sva, use ioctls there to setup the page table
>  the way it wants, then pass the now configured PASID to a
>  driver that can use it.  
> >>> Are we talking about bare metal SVA?  
> >> What a weird term.  
> > Glad you noticed it at v7 :-)
> >
> > Any suggestions on something less weird than
> > Shared Virtual Addressing? There is a reason why we moved from
> > SVM to SVA.  
>  SVA is fine, what is "bare metal" supposed to mean?
>   
> >>> What I meant here is sharing virtual address between DMA and host
> >>> process. This requires devices perform DMA request with PASID and
> >>> use IOMMU first level/stage 1 page tables.
> >>> This can be further divided into 1) user SVA 2) supervisor SVA
> >>> (sharing init_mm)
> >>>
> >>> My point is that /dev/sva is not useful here since the driver can
> >>> perform PASID allocation while doing SVA bind.  
> >> No, you are thinking too small.
> >>
> >> Look at VDPA, it has a SVA uAPI. Some HW might use PASID for the
> >> SVA. 
> > Could you point to me the SVA UAPI? I couldn't find it in the
> > mainline. Seems VDPA uses VHOST interface?  
> 
> 
> It's the vhost_iotlb_msg defined in uapi/linux/vhost_types.h.
> 
Thanks for the pointer, for complete vSVA functionality we would need
1 TLB flush (IOTLB and PASID cache etc.)
2 PASID alloc/free
3 bind/unbind page tables or PASID tables
4 Page request service

Seems vhost_iotlb_msg can be used for #1 partially. And the
proposal is to pluck out the rest into /dev/sva? Seems awkward as Alex
pointed out earlier for similar situation in VFIO.

> 
> >  
> >> When VDPA is used by DPDK it makes sense that the PASID will be SVA
> >> and 1:1 with the mm_struct.
> >>  
> > I still don't see why bare metal DPDK needs to get a handle of the
> > PASID.  
> 
> 
> My understanding is that it may:
> 
> - have a unified uAPI with vSVA: alloc, bind, unbind, free
Got your point, but vSVA needs more than these

> - leave the binding policy to userspace instead of using an
> implied one in the kernel
> 
Only if necessary.

> 
> > Perhaps the SVA patch would explain. Or are you talking about
> > vDPA DPDK process that is used to support virtio-net-pmd in the
> > guest? 
> >> When VDPA is used by qemu it makes sense that the PASID will be an
> >> arbitary IOVA map constructed to be 1:1 with the guest vCPU
> >> physical map. /dev/sva allows a single uAPI to do this kind of
> >> setup, and qemu can support it while supporting a range of SVA
> >> kernel drivers. VDPA and vfio-mdev are obvious initial targets.
> >>
> >> *BOTH* are needed.
> >>
> >> In general any uAPI for PASID should have the option to use either
> >> the mm_struct SVA PASID *OR* a PASID from /dev/sva. It costs
> >> virtually nothing to implement this in the driver as PASID is just
> >> a number, and gives so much more flexibility.
> >>  
> > Not really nothing in terms of PASID life cycles. For example, if
> > user uses uacce interface to open an accelerator, it gets an
> > FD_acc. Then it opens /dev/sva to allocate PASID then get another
> > FD_pasid. Then we pass FD_pasid to the driver to bind page tables,
> > perhaps multiple drivers. Now we have to worry about if FD_pasid
> > gets closed before FD_acc(s) are closed and all these race conditions.
> 
> 
> I'm not sure I understand this. But this demonstrates the flexibility
> of a unified uAPI. E.g. it allows vDPA and VFIO devices to use the
> same PASID which can be shared with a process in the guest.
> 
This is for user DMA not for vSVA. I was contending that /dev/sva
creates unnecessary steps for such usage.

For vSVA, I think vDPA and VFIO can potentially share but I am not
seeing convincing benefits.

If a guest process wants to do SVA with a VFIO assigned device and a
vDPA-backed virtio-net at the same time, it might be a limitation if
PASID is not managed via a common interface. But I am not sure how vDPA
SVA support will look like, does it support gIOVA? need virtio IOMMU?

> For the race condition, it could be probably solved with refcnt.
> 
Agreed but the best solution might be not to have the problem in the
first place :)

> Thanks
> 
> 
> >
> > If we do not expose FD_pasid to the user, the teardown is much
> > simpler and streamlined. Following each FD_acc close, PASID unbind

Re: [PATCH] iommu/amd: Fix event counter availability check

2020-09-17 Thread Alexander Monakov
On Tue, 16 Jun 2020, Suravee Suthikulpanit wrote:

> > > Instead of blindly moving the code around to a spot that would just work,
> > > I am trying to understand what might be required here. In this case,
> > > the init_device_table_dma() should not be needed. I suspect it's the IOMMU
> > > invalidate all command that's also needed here.
> > > 
> > > I'm also checking with the HW and BIOS team. Meanwhile, could you please
> > > give
> > > the following change a try:
> > Hello. Can you give any update please?
> > 
> > Alexander
> > 
> 
> Sorry for late reply. I have a reproducer and working with the HW team to
> understand the issue.
> I should be able to provide update with solution by the end of this week.

Hello, hope you are doing well. Has this investigation found anything?

Thanks.
Alexander


[PATCH 4/4] ARM/dma-mapping: remove the arm specific phys to dma translation helpers

2020-09-17 Thread Christoph Hellwig
Now that the two remaining architectures that hooked into the DMA
translation directly use the range map instead, there is no need to
override phys_to_dma and dma_to_phys.  Remove asm/dma-direct.h after
switching the remaining callsites to the phys addr based generic
helpers instead of the PFN or virt addr based arm ones.

Signed-off-by: Christoph Hellwig 
---
 arch/arm/Kconfig  |  1 -
 arch/arm/common/dmabounce.c   | 14 +-
 arch/arm/include/asm/dma-direct.h | 45 ---
 arch/arm/mm/dma-mapping.c | 20 +++---
 4 files changed, 17 insertions(+), 63 deletions(-)
 delete mode 100644 arch/arm/include/asm/dma-direct.h

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index e00d94b1665876..2c4d608fa5fecf 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -14,7 +14,6 @@ config ARM
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
select ARCH_HAS_PTE_SPECIAL if ARM_LPAE
-   select ARCH_HAS_PHYS_TO_DMA
select ARCH_HAS_SETUP_DMA_OPS
select ARCH_HAS_SET_MEMORY
select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL
diff --git a/arch/arm/common/dmabounce.c b/arch/arm/common/dmabounce.c
index d3e00ea9208834..561d6d06c6b6aa 100644
--- a/arch/arm/common/dmabounce.c
+++ b/arch/arm/common/dmabounce.c
@@ -258,7 +258,7 @@ static inline dma_addr_t map_single(struct device *dev, 
void *ptr, size_t size,
}
 
dev_dbg(dev, "%s: unsafe buffer %p (dma=%#x) mapped to %p (dma=%#x)\n",
-   __func__, buf->ptr, virt_to_dma(dev, buf->ptr),
+   __func__, buf->ptr, phys_to_dma(dev, virt_to_phys(buf->ptr)),
buf->safe, buf->safe_dma_addr);
 
if ((dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL) &&
@@ -279,7 +279,7 @@ static inline void unmap_single(struct device *dev, struct 
safe_buffer *buf,
BUG_ON(buf->direction != dir);
 
dev_dbg(dev, "%s: unsafe buffer %p (dma=%#x) mapped to %p (dma=%#x)\n",
-   __func__, buf->ptr, virt_to_dma(dev, buf->ptr),
+   __func__, buf->ptr, phys_to_dma(dev, virt_to_phys(buf->ptr)),
buf->safe, buf->safe_dma_addr);
 
DO_STATS(dev->archdata.dmabounce->bounce_count++);
@@ -320,7 +320,7 @@ static dma_addr_t dmabounce_map_page(struct device *dev, 
struct page *page,
dev_dbg(dev, "%s(page=%p,off=%#lx,size=%zx,dir=%x)\n",
__func__, page, offset, size, dir);
 
-   dma_addr = pfn_to_dma(dev, page_to_pfn(page)) + offset;
+   dma_addr = phys_to_dma(dev, page_to_phys(page)) + offset;
 
ret = needs_bounce(dev, dma_addr, size);
if (ret < 0)
@@ -380,8 +380,8 @@ static int __dmabounce_sync_for_cpu(struct device *dev, 
dma_addr_t addr,
BUG_ON(buf->direction != dir);
 
dev_dbg(dev, "%s: unsafe buffer %p (dma=%#x off=%#lx) mapped to %p 
(dma=%#x)\n",
-   __func__, buf->ptr, virt_to_dma(dev, buf->ptr), off,
-   buf->safe, buf->safe_dma_addr);
+   __func__, buf->ptr, phys_to_dma(dev, virt_to_phys(buf->ptr)),
+   off, buf->safe, buf->safe_dma_addr);
 
DO_STATS(dev->archdata.dmabounce->bounce_count++);
 
@@ -420,8 +420,8 @@ static int __dmabounce_sync_for_device(struct device *dev, 
dma_addr_t addr,
BUG_ON(buf->direction != dir);
 
dev_dbg(dev, "%s: unsafe buffer %p (dma=%#x off=%#lx) mapped to %p 
(dma=%#x)\n",
-   __func__, buf->ptr, virt_to_dma(dev, buf->ptr), off,
-   buf->safe, buf->safe_dma_addr);
+   __func__, buf->ptr, phys_to_dma(dev, virt_to_phys(buf->ptr)),
+   off, buf->safe, buf->safe_dma_addr);
 
DO_STATS(dev->archdata.dmabounce->bounce_count++);
 
diff --git a/arch/arm/include/asm/dma-direct.h 
b/arch/arm/include/asm/dma-direct.h
deleted file mode 100644
index 84cb4e30658891..00
--- a/arch/arm/include/asm/dma-direct.h
+++ /dev/null
@@ -1,45 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef ASM_ARM_DMA_DIRECT_H
-#define ASM_ARM_DMA_DIRECT_H 1
-
-#include 
-
-/*
- * dma_to_pfn/pfn_to_dma/virt_to_dma are architecture private
- * functions used internally by the DMA-mapping API to provide DMA
- * addresses. They must not be used by drivers.
- */
-static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
-{
-   if (dev->dma_range_map)
-   pfn = PFN_DOWN(translate_phys_to_dma(dev, PFN_PHYS(pfn)));
-   return (dma_addr_t)__pfn_to_phys(pfn);
-}
-
-static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
-{
-   unsigned long pfn = __phys_to_pfn(addr);
-
-   if (dev->dma_range_map)
-   pfn = PFN_DOWN(translate_dma_to_phys(dev, PFN_PHYS(pfn)));
-   return pfn;
-}
-
-static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
-{
-   return (dma_addr_t)__virt_to_bus((unsigned long)(addr));
-}
-
-static inline dma_addr_t phys_to_dma(struct device *dev, 

[PATCH 3/4] ARM/dma-mapping: don't handle NULL devices in dma-direct.h

2020-09-17 Thread Christoph Hellwig
The DMA API removed support for not passing in a device a long time
ago, so remove the NULL checks.

Signed-off-by: Christoph Hellwig 
---
 arch/arm/include/asm/dma-direct.h | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/arm/include/asm/dma-direct.h 
b/arch/arm/include/asm/dma-direct.h
index 1f04a5e1c615de..84cb4e30658891 100644
--- a/arch/arm/include/asm/dma-direct.h
+++ b/arch/arm/include/asm/dma-direct.h
@@ -11,7 +11,7 @@
  */
 static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
 {
-   if (dev && dev->dma_range_map)
+   if (dev->dma_range_map)
pfn = PFN_DOWN(translate_phys_to_dma(dev, PFN_PHYS(pfn)));
return (dma_addr_t)__pfn_to_phys(pfn);
 }
@@ -20,16 +20,13 @@ static inline unsigned long dma_to_pfn(struct device *dev, 
dma_addr_t addr)
 {
unsigned long pfn = __phys_to_pfn(addr);
 
-   if (dev && dev->dma_range_map)
+   if (dev->dma_range_map)
pfn = PFN_DOWN(translate_dma_to_phys(dev, PFN_PHYS(pfn)));
return pfn;
 }
 
 static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
 {
-   if (dev)
-   return pfn_to_dma(dev, virt_to_pfn(addr));
-
return (dma_addr_t)__virt_to_bus((unsigned long)(addr));
 }
 
-- 
2.28.0



[PATCH 2/4] ARM/footbridge: switch to use dma_direct_set_offset for lbus DMA offsets

2020-09-17 Thread Christoph Hellwig
Switch the footbridge PCI devices to use dma_direct_set_offset to set the
DMA offset instead of using direct hooks into the DMA mapping code and
remove the now unused hooks.

Signed-off-by: Christoph Hellwig 
---
 arch/arm/include/asm/dma-direct.h |  4 +-
 arch/arm/include/asm/memory.h |  2 -
 arch/arm/mach-footbridge/cats-pci.c   |  7 +++-
 arch/arm/mach-footbridge/common.c | 40 ++-
 arch/arm/mach-footbridge/common.h |  3 ++
 arch/arm/mach-footbridge/ebsa285-pci.c|  7 +++-
 .../arm/mach-footbridge/include/mach/memory.h |  4 --
 arch/arm/mach-footbridge/netwinder-pci.c  |  7 +++-
 arch/arm/mach-footbridge/personal-pci.c   |  7 +++-
 9 files changed, 54 insertions(+), 27 deletions(-)

diff --git a/arch/arm/include/asm/dma-direct.h 
b/arch/arm/include/asm/dma-direct.h
index 77fcb7ee5ec907..1f04a5e1c615de 100644
--- a/arch/arm/include/asm/dma-direct.h
+++ b/arch/arm/include/asm/dma-direct.h
@@ -13,12 +13,12 @@ static inline dma_addr_t pfn_to_dma(struct device *dev, 
unsigned long pfn)
 {
if (dev && dev->dma_range_map)
pfn = PFN_DOWN(translate_phys_to_dma(dev, PFN_PHYS(pfn)));
-   return (dma_addr_t)__pfn_to_bus(pfn);
+   return (dma_addr_t)__pfn_to_phys(pfn);
 }
 
 static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
 {
-   unsigned long pfn = __bus_to_pfn(addr);
+   unsigned long pfn = __phys_to_pfn(addr);
 
if (dev && dev->dma_range_map)
pfn = PFN_DOWN(translate_dma_to_phys(dev, PFN_PHYS(pfn)));
diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 99035b5891ef44..af612606136ff2 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -346,8 +346,6 @@ static inline unsigned long __virt_to_idmap(unsigned long x)
 #ifndef __virt_to_bus
 #define __virt_to_bus  __virt_to_phys
 #define __bus_to_virt  __phys_to_virt
-#define __pfn_to_bus(x)__pfn_to_phys(x)
-#define __bus_to_pfn(x)__phys_to_pfn(x)
 #endif
 
 /*
diff --git a/arch/arm/mach-footbridge/cats-pci.c 
b/arch/arm/mach-footbridge/cats-pci.c
index 0b2fd7e2e9b429..257cb068ac0c5b 100644
--- a/arch/arm/mach-footbridge/cats-pci.c
+++ b/arch/arm/mach-footbridge/cats-pci.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include "common.h"
 
 /* cats host-specific stuff */
 static int irqmap_cats[] __initdata = { IRQ_PCI, IRQ_IN0, IRQ_IN1, IRQ_IN3 };
@@ -56,8 +57,10 @@ static struct hw_pci cats_pci __initdata = {
 
 static int __init cats_pci_init(void)
 {
-   if (machine_is_cats())
> -   pci_common_init(&cats_pci);
+   if (!machine_is_cats())
+   return 0;
> +   bus_register_notifier(&pci_bus_type, &footbridge_pci_dma_nb);
> +   pci_common_init(&cats_pci);
return 0;
 }
 
diff --git a/arch/arm/mach-footbridge/common.c 
b/arch/arm/mach-footbridge/common.c
index eee095f0e2f6c2..dc14d72ca7bb3f 100644
--- a/arch/arm/mach-footbridge/common.c
+++ b/arch/arm/mach-footbridge/common.c
@@ -12,6 +12,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 
 #include 
@@ -219,8 +221,17 @@ void footbridge_restart(enum reboot_mode mode, const char 
*cmd)
}
 }
 
-#ifdef CONFIG_FOOTBRIDGE_ADDIN
-
+#ifdef CONFIG_FOOTBRIDGE_HOST
+/*
> + * The footbridge is programmed to expose the system RAM at 0xe0000000.
+ * The requirement is that the RAM isn't placed at bus address 0, which
+ * would clash with VGA cards.
+ */
+static inline unsigned long fb_bus_sdram_offset(void)
+{
> +   return 0xe0000000;
+}
+#elif defined(CONFIG_FOOTBRIDGE_ADDIN)
 static inline unsigned long fb_bus_sdram_offset(void)
 {
> return *CSR_PCISDRAMBASE & 0xfffffff0;
@@ -248,17 +259,24 @@ unsigned long __bus_to_virt(unsigned long res)
return res;
 }
 EXPORT_SYMBOL(__bus_to_virt);
+#else
+#error "Undefined footbridge mode"
+#endif
 
-unsigned long __pfn_to_bus(unsigned long pfn)
+static int footbridge_pci_dma_notifier(struct notifier_block *nb,
+   unsigned long event, void *data)
 {
-   return __pfn_to_phys(pfn) + (fb_bus_sdram_offset() - PHYS_OFFSET);
-}
-EXPORT_SYMBOL(__pfn_to_bus);
+   struct device *dev = data;
 
-unsigned long __bus_to_pfn(unsigned long bus)
-{
-   return __phys_to_pfn(bus - (fb_bus_sdram_offset() - PHYS_OFFSET));
+   if (event != BUS_NOTIFY_ADD_DEVICE)
+   return NOTIFY_DONE;
+
+   if (dma_direct_set_offset(dev, PAGE_OFFSET, fb_bus_sdram_offset(),
+   (u64)-1))
+   WARN_ONCE(1, "failed to set DMA offset\n");
+   return NOTIFY_OK;
 }
-EXPORT_SYMBOL(__bus_to_pfn);
 
-#endif
+struct notifier_block footbridge_pci_dma_nb = {
+   .notifier_call  = footbridge_pci_dma_notifier,
+};
diff --git a/arch/arm/mach-footbridge/common.h 
b/arch/arm/mach-footbridge/common.h
index e12587db59c4c8..1773a4183b580c 100644
--- a/arch/arm/mach-footbridge/common.h
+++ b/arch/arm/mach-footbridge/common.h
@@ -1,4 +1,5 @@
 /* 

[PATCH 1/4] ARM/omap1: switch to use dma_direct_set_offset for lbus DMA offsets

2020-09-17 Thread Christoph Hellwig
Switch the omap1510 platform ohci device to use dma_direct_set_offset
to set the DMA offset instead of using direct hooks into the DMA
mapping code and remove the now unused hooks.

Signed-off-by: Christoph Hellwig 
---
 arch/arm/include/asm/dma-direct.h | 18 -
 arch/arm/mach-omap1/include/mach/memory.h | 31 ---
 arch/arm/mach-omap1/usb.c | 22 
 3 files changed, 22 insertions(+), 49 deletions(-)

diff --git a/arch/arm/include/asm/dma-direct.h 
b/arch/arm/include/asm/dma-direct.h
index 436544aeb83405..77fcb7ee5ec907 100644
--- a/arch/arm/include/asm/dma-direct.h
+++ b/arch/arm/include/asm/dma-direct.h
@@ -9,7 +9,6 @@
  * functions used internally by the DMA-mapping API to provide DMA
  * addresses. They must not be used by drivers.
  */
-#ifndef __arch_pfn_to_dma
 static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
 {
if (dev && dev->dma_range_map)
@@ -34,23 +33,6 @@ static inline dma_addr_t virt_to_dma(struct device *dev, 
void *addr)
return (dma_addr_t)__virt_to_bus((unsigned long)(addr));
 }
 
-#else
-static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
-{
-   return __arch_pfn_to_dma(dev, pfn);
-}
-
-static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
-{
-   return __arch_dma_to_pfn(dev, addr);
-}
-
-static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
-{
-   return __arch_virt_to_dma(dev, addr);
-}
-#endif
-
 static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
 {
unsigned int offset = paddr & ~PAGE_MASK;
diff --git a/arch/arm/mach-omap1/include/mach/memory.h 
b/arch/arm/mach-omap1/include/mach/memory.h
index 1142560e0078f5..36bccb6ab8 100644
--- a/arch/arm/mach-omap1/include/mach/memory.h
+++ b/arch/arm/mach-omap1/include/mach/memory.h
@@ -14,42 +14,11 @@
  * OMAP-1510 bus address is translated into a Local Bus address if the
  * OMAP bus type is lbus. We do the address translation based on the
  * device overriding the defaults used in the dma-mapping API.
- * Note that the is_lbus_device() test is not very efficient on 1510
- * because of the strncmp().
  */
-#if defined(CONFIG_ARCH_OMAP15XX) && !defined(__ASSEMBLER__)
 
 /*
  * OMAP-1510 Local Bus address offset
  */
 #define OMAP1510_LB_OFFSET UL(0x30000000)
 
-#define virt_to_lbus(x)((x) - PAGE_OFFSET + OMAP1510_LB_OFFSET)
-#define lbus_to_virt(x)((x) - OMAP1510_LB_OFFSET + PAGE_OFFSET)
-#define is_lbus_device(dev)(cpu_is_omap15xx() && dev && 
(strncmp(dev_name(dev), "ohci", 4) == 0))
-
-#define __arch_pfn_to_dma(dev, pfn)\
-   ({ dma_addr_t __dma = __pfn_to_phys(pfn); \
-  if (is_lbus_device(dev)) \
-   __dma = __dma - PHYS_OFFSET + OMAP1510_LB_OFFSET; \
-  __dma; })
-
-#define __arch_dma_to_pfn(dev, addr)   \
-   ({ dma_addr_t __dma = addr; \
-  if (is_lbus_device(dev)) \
-   __dma += PHYS_OFFSET - OMAP1510_LB_OFFSET;  \
-  __phys_to_pfn(__dma);\
-   })
-
-#define __arch_dma_to_virt(dev, addr)  ({ (void *) (is_lbus_device(dev) ? \
-   lbus_to_virt(addr) : \
-   __phys_to_virt(addr)); })
-
-#define __arch_virt_to_dma(dev, addr)  ({ unsigned long __addr = (unsigned 
long)(addr); \
-  (dma_addr_t) (is_lbus_device(dev) ? \
-   virt_to_lbus(__addr) : \
-   __virt_to_phys(__addr)); })
-
-#endif /* CONFIG_ARCH_OMAP15XX */
-
 #endif
diff --git a/arch/arm/mach-omap1/usb.c b/arch/arm/mach-omap1/usb.c
index d8e9bbda8f7bdd..ba8566204ea9f4 100644
--- a/arch/arm/mach-omap1/usb.c
+++ b/arch/arm/mach-omap1/usb.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -542,6 +543,25 @@ static u32 __init omap1_usb2_init(unsigned nwires, 
unsigned alt_pingroup)
 /* ULPD_APLL_CTRL */
 #define APLL_NDPLL_SWITCH  (1 << 0)
 
+static int omap_1510_usb_ohci_notifier(struct notifier_block *nb,
+   unsigned long event, void *data)
+{
+   struct device *dev = data;
+
+   if (event != BUS_NOTIFY_ADD_DEVICE)
+   return NOTIFY_DONE;
+
+   if (strncmp(dev_name(dev), "ohci", 4) == 0 &&
+   dma_direct_set_offset(dev, PHYS_OFFSET, OMAP1510_LB_OFFSET,
+   (u64)-1))
+   WARN_ONCE(1, "failed to set DMA offset\n");
+   return NOTIFY_OK;
+}
+
+static struct notifier_block omap_1510_usb_ohci_nb = {
+   .notifier_call  = omap_1510_usb_ohci_notifier,
+};
+
 static void __init omap_1510_usb_init(struct omap_usb_config *config)
 {
unsigned int val;
@@ -600,6 +620,8 @@ static void __init omap_1510_usb_init(struct 

RFC: remove the need for <asm/dma-direct.h> on ARM

2020-09-17 Thread Christoph Hellwig
Hi Russell,

as Robin pointed out there is not much need for the ARM specific
routines to translated to and from a dma_addr_t given that we have
the dma offset (and now offset range) functionality.  This series
converts ARM over to the generic helpers.  This has only been tested
on qemu, and specifically not on omap1 and footbridge given that I
do not have the hardware.
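
(The conversion pattern used by patches 1 and 2, in one place: a bus
notifier installs a single-entry offset map when a matching device appears,
and the generic phys_to_dma()/dma_to_phys() apply it from then on. A sketch
only -- the foo_* names, the bus offset and the 256MB size are placeholders:)

	#include <linux/notifier.h>

	static int foo_dma_notifier(struct notifier_block *nb,
				    unsigned long event, void *data)
	{
		struct device *dev = data;

		if (event != BUS_NOTIFY_ADD_DEVICE)
			return NOTIFY_DONE;

		/* CPU physical PHYS_OFFSET shows up at FOO_BUS_OFFSET
		 * in this device's DMA address space */
		if (dma_direct_set_offset(dev, PHYS_OFFSET, FOO_BUS_OFFSET,
					  SZ_256M))
			WARN_ONCE(1, "failed to set DMA offset\n");
		return NOTIFY_OK;
	}

	static struct notifier_block foo_dma_nb = {
		.notifier_call = foo_dma_notifier,
	};

	/* at init: bus_register_notifier(&platform_bus_type, &foo_dma_nb); */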

The patches are on top of the dma-mapping for-next tree, to make
review and testing easier a git tree is also available here:

git://git.infradead.org/users/hch/misc.git arm-dma-direct-cleanups

Gitweb:


http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/arm-dma-direct-cleanups


Diffstat:
 arch/arm/include/asm/dma-direct.h|   66 ---
 b/arch/arm/Kconfig   |1 
 b/arch/arm/common/dmabounce.c|   14 ++--
 b/arch/arm/include/asm/memory.h  |2 
 b/arch/arm/mach-footbridge/cats-pci.c|7 +-
 b/arch/arm/mach-footbridge/common.c  |   40 ++---
 b/arch/arm/mach-footbridge/common.h  |3 +
 b/arch/arm/mach-footbridge/ebsa285-pci.c |7 +-
 b/arch/arm/mach-footbridge/include/mach/memory.h |4 -
 b/arch/arm/mach-footbridge/netwinder-pci.c   |7 +-
 b/arch/arm/mach-footbridge/personal-pci.c|7 +-
 b/arch/arm/mach-omap1/include/mach/memory.h  |   31 --
 b/arch/arm/mach-omap1/usb.c  |   22 +++
 b/arch/arm/mm/dma-mapping.c  |   20 +++---
 14 files changed, 91 insertions(+), 140 deletions(-)


Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs

2020-09-17 Thread Jason Gunthorpe
On Thu, Sep 17, 2020 at 11:53:49AM +0800, Jason Wang wrote:
> > > When VDPA is used by qemu it makes sense that the PASID will be an
> > > arbitary IOVA map constructed to be 1:1 with the guest vCPU physical
> > > map. /dev/sva allows a single uAPI to do this kind of setup, and qemu
> > > can support it while supporting a range of SVA kernel drivers. VDPA
> > > and vfio-mdev are obvious initial targets.
> > > 
> > > *BOTH* are needed.
> > > 
> > > In general any uAPI for PASID should have the option to use either the
> > > mm_struct SVA PASID *OR* a PASID from /dev/sva. It costs virtually
> > > nothing to implement this in the driver as PASID is just a number, and
> > > gives so much more flexability.
> > > 
> > Not really nothing in terms of PASID life cycles. For example, if user
> > uses uacce interface to open an accelerator, it gets an FD_acc. Then it
> > opens /dev/sva to allocate PASID then get another FD_pasid. Then we
> > pass FD_pasid to the driver to bind page tables, perhaps multiple
> > drivers. Now we have to worry about if FD_pasid gets closed before
> > FD_acc(s) are closed and all these race conditions.
> 
> 
> I'm not sure I understand this. But this demonstrates the flexibility of a
> unified uAPI. E.g. it allows vDPA and VFIO devices to use the same PASID which
> can be shared with a process in the guest.
> 
> For the race condition, it could be probably solved with refcnt.

Yep

Jason
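
(The refcnt idea, sketched with kref -- purely illustrative, not from any
posted patch: the PASID is only freed once FD_pasid and every FD_acc
holding it are gone, so close ordering stops mattering:)

	#include <linux/ioasid.h>
	#include <linux/kref.h>
	#include <linux/slab.h>

	struct sva_pasid {
		struct kref	refs;
		u32		pasid;
	};

	static void sva_pasid_release(struct kref *kref)
	{
		struct sva_pasid *p = container_of(kref, struct sva_pasid, refs);

		ioasid_free(p->pasid);	/* last reference dropped */
		kfree(p);
	}

	/* each fd that hands out the PASID takes a reference ... */
	static void sva_pasid_get(struct sva_pasid *p)
	{
		kref_get(&p->refs);
	}

	/* ... and each fd's ->release() drops one, in any order */
	static void sva_pasid_put(struct sva_pasid *p)
	{
		kref_put(&p->refs, sva_pasid_release);
	}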


Re: [PATCH v8 3/9] Documentation/x86: Add documentation for SVA (Shared Virtual Addressing)

2020-09-17 Thread Borislav Petkov
On Thu, Sep 17, 2020 at 10:22:39AM -0700, Raj, Ashok wrote:
> s/translation again/translation

Ok, last one. Now stop looking at that text because you'll find more.

:-)))

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH v8 3/9] Documentation/x86: Add documentation for SVA (Shared Virtual Addressing)

2020-09-17 Thread Raj, Ashok
Thanks Boris.

multiple "again" makes it funny again :-)


On Thu, Sep 17, 2020 at 07:18:49PM +0200, Borislav Petkov wrote:
> On Thu, Sep 17, 2020 at 07:56:09AM -0700, Raj, Ashok wrote:
> > Just tweaked it a bit: 
> > 
> > "When ATS lookup fails for a virtual address, device should use PRI in
> > order to request the virtual address to be paged into the CPU page tables.
> > The device must use ATS again in order the fetch the translation again

s/translation again/translation

> > before use"
> 
> Thanks, amended.
> 




Re: [PATCH v8 3/9] Documentation/x86: Add documentation for SVA (Shared Virtual Addressing)

2020-09-17 Thread Borislav Petkov
On Thu, Sep 17, 2020 at 07:56:09AM -0700, Raj, Ashok wrote:
> Just tweaked it a bit: 
> 
> "When ATS lookup fails for a virtual address, device should use PRI in
> order to request the virtual address to be paged into the CPU page tables.
> The device must use ATS again in order the fetch the translation again
> before use"

Thanks, amended.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH V2] dma-direct: Fix potential NULL pointer dereference

2020-09-17 Thread Thomas Tai




On 2020-09-17 12:05 p.m., Konrad Rzeszutek Wilk wrote:

On Wed, Sep 16, 2020 at 02:51:06PM -0600, Thomas Tai wrote:

When booting the kernel v5.9-rc4 on a VM, the kernel would panic when
printing a warning message in swiotlb_map(). The dev->dma_mask must not
be a NULL pointer when calling the dma mapping layer. A NULL pointer check
can potentially avoid the panic.

[drm] Initialized virtio_gpu 0.1.0 0 for virtio0 on minor 0
  BUG: kernel NULL pointer dereference, address: 
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x) - not-present page
  PGD 0 P4D 0
  Oops:  [#1] SMP PTI
  CPU: 1 PID: 331 Comm: systemd-udevd Not tainted 5.9.0-rc4 #1
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
  BIOS 1.13.0-1ubuntu1 04/01/2014
  RIP: 0010:swiotlb_map+0x1ac/0x200
  Code: e8 d9 fc ff ff 80 3d 92 ee 4c 01 00 75 51 49 8b 84 24 48 02 00 00
  4d 8b 6c 24 50 c6 05 7c ee 4c 01 01 4d 8b bc 24 58 02 00 00 <4c> 8b 30
  4d 85 ed 75 04 4d 8b 2c 24 4c 89 e7 e8 10 6b 4f 00 4d 89
  RSP: 0018:9f96801af6f8 EFLAGS: 00010246
  RAX:  RBX: 1000 RCX: 0080
  RDX: 007f RSI: 0202 RDI: 0202
  RBP: 9f96801af748 R08:  R09: 0020
  R10:  R11: 8fabfffa3000 R12: 8faad02c7810
  R13:  R14: 0020 R15: 
  FS:  7fabc63588c0() GS:8fabf7c8()
  knlGS:
  CS:  0010 DS:  ES:  CR0: 80050033
  CR2:  CR3: 000151496005 CR4: 00370ee0
  DR0:  DR1:  DR2: 
  DR3:  DR6: fffe0ff0 DR7: 0400
  Call Trace:
   dma_direct_map_sg+0x124/0x210
   dma_map_sg_attrs+0x32/0x50
   drm_gem_shmem_get_pages_sgt+0x6a/0x90 [drm]
   virtio_gpu_object_create+0x140/0x2f0 [virtio_gpu]
   ? ww_mutex_unlock+0x26/0x30
   virtio_gpu_mode_dumb_create+0xab/0x160 [virtio_gpu]
   drm_mode_create_dumb+0x82/0x90 [drm]
   drm_client_framebuffer_create+0xaa/0x200 [drm]
   drm_fb_helper_generic_probe+0x59/0x150 [drm_kms_helper]
   drm_fb_helper_single_fb_probe+0x29e/0x3e0 [drm_kms_helper]
   __drm_fb_helper_initial_config_and_unlock+0x41/0xd0 [drm_kms_helper]
   drm_fbdev_client_hotplug+0xe6/0x1a0 [drm_kms_helper]
   drm_fbdev_generic_setup+0xaf/0x170 [drm_kms_helper]
   virtio_gpu_probe+0xea/0x100 [virtio_gpu]
   virtio_dev_probe+0x14b/0x1e0 [virtio]
   really_probe+0x1db/0x440
   driver_probe_device+0xe9/0x160
   device_driver_attach+0x5d/0x70
   __driver_attach+0x8f/0x150
   ? device_driver_attach+0x70/0x70
   bus_for_each_dev+0x7e/0xc0
   driver_attach+0x1e/0x20
   bus_add_driver+0x152/0x1f0
   driver_register+0x74/0xd0
   ? 0xc0529000
   register_virtio_driver+0x20/0x30 [virtio]
   virtio_gpu_driver_init+0x15/0x1000 [virtio_gpu]
   do_one_initcall+0x4a/0x1fa
   ? _cond_resched+0x19/0x30
   ? kmem_cache_alloc_trace+0x16b/0x2e0
   do_init_module+0x62/0x240
   load_module+0xe0e/0x1100
   ? security_kernel_post_read_file+0x5c/0x70
   __do_sys_finit_module+0xbe/0x120
   ? __do_sys_finit_module+0xbe/0x120
   __x64_sys_finit_module+0x1a/0x20
   do_syscall_64+0x38/0x50
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: Thomas Tai 


Reviewed-by: Konrad Rzeszutek Wilk 


Thank you, Konrad for reviewing the fix.

Thomas



Thank you!

---
  include/linux/dma-direct.h |  3 ---
  kernel/dma/mapping.c   | 11 +++
  2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index 6e87225..0648708 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -62,9 +62,6 @@ static inline bool dma_capable(struct device *dev, dma_addr_t 
addr, size_t size,
  {
dma_addr_t end = addr + size - 1;
  
-	if (!dev->dma_mask)
-		return false;
-
if (is_ram && !IS_ENABLED(CONFIG_ARCH_DMA_ADDR_T_64BIT) &&
min(addr, end) < phys_to_dma(dev, PFN_PHYS(min_low_pfn)))
return false;
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 0d12942..7133d5c 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -144,6 +144,10 @@ dma_addr_t dma_map_page_attrs(struct device *dev, struct 
page *page,
dma_addr_t addr;
  
  	BUG_ON(!valid_dma_direction(dir));
+
+   if (WARN_ON_ONCE(!dev->dma_mask))
+   return DMA_MAPPING_ERROR;
+
if (dma_map_direct(dev, ops))
addr = dma_direct_map_page(dev, page, offset, size, dir, attrs);
else
@@ -179,6 +183,10 @@ int dma_map_sg_attrs(struct device *dev, struct 
scatterlist *sg, int nents,
int ents;
  
  	BUG_ON(!valid_dma_direction(dir));
+
+   if (WARN_ON_ONCE(!dev->dma_mask))
+   return 0;
+
if (dma_map_direct(dev, ops))
ents = dma_direct_map_sg(dev, sg, nents, dir, attrs);
else
@@ -213,6 +221,9 @@ dma_addr_t 

Re: [PATCH V2] dma-direct: Fix potential NULL pointer dereference

2020-09-17 Thread Thomas Tai




On 2020-09-17 12:44 p.m., Christoph Hellwig wrote:

Thanks,

applied to the dma-mapping for-next tree.


Thank you, Christoph for suggesting and applying the fix.

Thomas






Re: support range based offsets in dma-direct v3

2020-09-17 Thread Christoph Hellwig
I've pulled this into the dma-mapping for-next tree.  Thanks Jim and
everyone helping out!


Re: [PATCH V2] dma-direct: Fix potential NULL pointer dereference

2020-09-17 Thread Christoph Hellwig
Thanks,

applied to the dma-mapping for-next tree.


Re: [PATCH V2] dma-direct: Fix potential NULL pointer dereference

2020-09-17 Thread Konrad Rzeszutek Wilk
On Wed, Sep 16, 2020 at 02:51:06PM -0600, Thomas Tai wrote:
> When booting the kernel v5.9-rc4 on a VM, the kernel would panic when
> printing a warning message in swiotlb_map(). The dev->dma_mask must not
> be a NULL pointer when calling the dma mapping layer. A NULL pointer check
> can potentially avoid the panic.
> 
> [drm] Initialized virtio_gpu 0.1.0 0 for virtio0 on minor 0
>  BUG: kernel NULL pointer dereference, address: 
>  #PF: supervisor read access in kernel mode
>  #PF: error_code(0x) - not-present page
>  PGD 0 P4D 0
>  Oops:  [#1] SMP PTI
>  CPU: 1 PID: 331 Comm: systemd-udevd Not tainted 5.9.0-rc4 #1
>  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
>  BIOS 1.13.0-1ubuntu1 04/01/2014
>  RIP: 0010:swiotlb_map+0x1ac/0x200
>  Code: e8 d9 fc ff ff 80 3d 92 ee 4c 01 00 75 51 49 8b 84 24 48 02 00 00
>  4d 8b 6c 24 50 c6 05 7c ee 4c 01 01 4d 8b bc 24 58 02 00 00 <4c> 8b 30
>  4d 85 ed 75 04 4d 8b 2c 24 4c 89 e7 e8 10 6b 4f 00 4d 89
>  RSP: 0018:9f96801af6f8 EFLAGS: 00010246
>  RAX:  RBX: 1000 RCX: 0080
>  RDX: 007f RSI: 0202 RDI: 0202
>  RBP: 9f96801af748 R08:  R09: 0020
>  R10:  R11: 8fabfffa3000 R12: 8faad02c7810
>  R13:  R14: 0020 R15: 
>  FS:  7fabc63588c0() GS:8fabf7c8()
>  knlGS:
>  CS:  0010 DS:  ES:  CR0: 80050033
>  CR2:  CR3: 000151496005 CR4: 00370ee0
>  DR0:  DR1:  DR2: 
>  DR3:  DR6: fffe0ff0 DR7: 0400
>  Call Trace:
>   dma_direct_map_sg+0x124/0x210
>   dma_map_sg_attrs+0x32/0x50
>   drm_gem_shmem_get_pages_sgt+0x6a/0x90 [drm]
>   virtio_gpu_object_create+0x140/0x2f0 [virtio_gpu]
>   ? ww_mutex_unlock+0x26/0x30
>   virtio_gpu_mode_dumb_create+0xab/0x160 [virtio_gpu]
>   drm_mode_create_dumb+0x82/0x90 [drm]
>   drm_client_framebuffer_create+0xaa/0x200 [drm]
>   drm_fb_helper_generic_probe+0x59/0x150 [drm_kms_helper]
>   drm_fb_helper_single_fb_probe+0x29e/0x3e0 [drm_kms_helper]
>   __drm_fb_helper_initial_config_and_unlock+0x41/0xd0 [drm_kms_helper]
>   drm_fbdev_client_hotplug+0xe6/0x1a0 [drm_kms_helper]
>   drm_fbdev_generic_setup+0xaf/0x170 [drm_kms_helper]
>   virtio_gpu_probe+0xea/0x100 [virtio_gpu]
>   virtio_dev_probe+0x14b/0x1e0 [virtio]
>   really_probe+0x1db/0x440
>   driver_probe_device+0xe9/0x160
>   device_driver_attach+0x5d/0x70
>   __driver_attach+0x8f/0x150
>   ? device_driver_attach+0x70/0x70
>   bus_for_each_dev+0x7e/0xc0
>   driver_attach+0x1e/0x20
>   bus_add_driver+0x152/0x1f0
>   driver_register+0x74/0xd0
>   ? 0xc0529000
>   register_virtio_driver+0x20/0x30 [virtio]
>   virtio_gpu_driver_init+0x15/0x1000 [virtio_gpu]
>   do_one_initcall+0x4a/0x1fa
>   ? _cond_resched+0x19/0x30
>   ? kmem_cache_alloc_trace+0x16b/0x2e0
>   do_init_module+0x62/0x240
>   load_module+0xe0e/0x1100
>   ? security_kernel_post_read_file+0x5c/0x70
>   __do_sys_finit_module+0xbe/0x120
>   ? __do_sys_finit_module+0xbe/0x120
>   __x64_sys_finit_module+0x1a/0x20
>   do_syscall_64+0x38/0x50
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> Signed-off-by: Thomas Tai 

Reviewed-by: Konrad Rzeszutek Wilk 

Thank you!
> ---
>  include/linux/dma-direct.h |  3 ---
>  kernel/dma/mapping.c   | 11 +++
>  2 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
> index 6e87225..0648708 100644
> --- a/include/linux/dma-direct.h
> +++ b/include/linux/dma-direct.h
> @@ -62,9 +62,6 @@ static inline bool dma_capable(struct device *dev, 
> dma_addr_t addr, size_t size,
>  {
>   dma_addr_t end = addr + size - 1;
>  
> - if (!dev->dma_mask)
> - return false;
> -
>   if (is_ram && !IS_ENABLED(CONFIG_ARCH_DMA_ADDR_T_64BIT) &&
>   min(addr, end) < phys_to_dma(dev, PFN_PHYS(min_low_pfn)))
>   return false;
> diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
> index 0d12942..7133d5c 100644
> --- a/kernel/dma/mapping.c
> +++ b/kernel/dma/mapping.c
> @@ -144,6 +144,10 @@ dma_addr_t dma_map_page_attrs(struct device *dev, struct 
> page *page,
>   dma_addr_t addr;
>  
>   BUG_ON(!valid_dma_direction(dir));
> +
> + if (WARN_ON_ONCE(!dev->dma_mask))
> + return DMA_MAPPING_ERROR;
> +
>   if (dma_map_direct(dev, ops))
>   addr = dma_direct_map_page(dev, page, offset, size, dir, attrs);
>   else
> @@ -179,6 +183,10 @@ int dma_map_sg_attrs(struct device *dev, struct 
> scatterlist *sg, int nents,
>   int ents;
>  
>   BUG_ON(!valid_dma_direction(dir));
> +
> + if (WARN_ON_ONCE(!dev->dma_mask))
> + return 0;
> +
>   if (dma_map_direct(dev, ops))
>   ents = dma_direct_map_sg(dev, sg, nents, dir, attrs);
>   else
> @@ -213,6 +221,9 @@ 
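
(Background on the invariant the patch defends: dev->dma_mask is normally
pointed at storage by the bus code and then sized by the driver at probe
time, so the mapping paths can assume it. A sketch of the usual driver
side -- foo_probe() is illustrative:)

	#include <linux/dma-mapping.h>

	static int foo_probe(struct device *dev)
	{
		int ret;

		/* bus setup usually points dma_mask somewhere first, e.g.
		 *   dev->dma_mask = &dev->coherent_dma_mask;
		 * dma_set_mask_and_coherent() then sizes both masks */
		ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64));
		if (ret)
			return ret;

		/* dma_map_page()/dma_map_sg() are safe to call from here on */
		return 0;
	}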

Re: [PATCH v8 3/9] Documentation/x86: Add documentation for SVA (Shared Virtual Addressing)

2020-09-17 Thread Raj, Ashok
Hi Boris,

On Thu, Sep 17, 2020 at 09:53:38AM +0200, Borislav Petkov wrote:
> On Tue, Sep 15, 2020 at 09:30:07AM -0700, Fenghua Yu wrote:
> > +Background
> > +==
> > +
> > +Shared Virtual Addressing (SVA) allows the processor and device to use the
> > +same virtual addresses avoiding the need for software to translate virtual
> > +addresses to physical addresses. SVA is what PCIe calls Shared Virtual
> > +Memory (SVM).
> > +
> > +In addition to the convenience of using application virtual addresses
> > +by the device, it also doesn't require pinning pages for DMA.
> > +PCIe Address Translation Services (ATS) along with Page Request Interface
> > +(PRI) allow devices to function much the same way as the CPU handling
> > +application page-faults. For more information please refer to the PCIe
> > +specification Chapter 10: ATS Specification.
> > +
> > +Use of SVA requires IOMMU support in the platform. IOMMU also is required
> > +to support PCIe features ATS and PRI. ATS allows devices to cache
> > +translations for virtual addresses. The IOMMU driver uses the 
> > mmu_notifier()
> > +support to keep the device TLB cache and the CPU cache in sync. PRI allows
> > +the device to request paging the virtual address by using the CPU page 
> > tables
> > +before accessing the address.
> 
> That still reads funny, the "the device to request paging the virtual
> address" part. Do you mean that per chance here:
> 
> "Before the device can access that address, the device uses the PRI in
> order to request the virtual address to be paged in into the CPU page
> tables."
> 
Agree, this reads a bit funny.

Just tweaked it a bit: 

"When ATS lookup fails for a virtual address, device should use PRI in
order to request the virtual address to be paged into the CPU page tables.
The device must use ATS again in order the fetch the translation again
before use"

Cheers,
Ashok
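
(The mmu_notifier hookup mentioned in the quoted text, sketched -- the
foo_* names are illustrative, the callback signature is the one from
linux/mmu_notifier.h at this point in time:)

	#include <linux/mmu_notifier.h>

	struct foo_bond {
		struct mmu_notifier mn;	/* embedded, one per device<->mm bond */
	};

	static void foo_invalidate_range(struct mmu_notifier *mn,
					 struct mm_struct *mm,
					 unsigned long start, unsigned long end)
	{
		struct foo_bond *bond = container_of(mn, struct foo_bond, mn);

		/* the CPU page tables changed for [start, end): shoot down
		 * the matching device TLB (ATS) entries for this bond */
		foo_flush_dev_tlb(bond, start, end);	/* placeholder */
	}

	static const struct mmu_notifier_ops foo_mn_ops = {
		.invalidate_range = foo_invalidate_range,
	};

	/* at bind: bond->mn.ops = &foo_mn_ops;
	 *          mmu_notifier_register(&bond->mn, mm); */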


Re: [oss-drivers] [trivial PATCH] treewide: Convert switch/case fallthrough; to break;

2020-09-17 Thread Simon Horman
On Wed, Sep 09, 2020 at 01:06:39PM -0700, Joe Perches wrote:
> fallthrough to a separate case/default label break; isn't very readable.
> 
> Convert pseudo-keyword fallthrough; statements to a simple break; when
> the next label is case or default and the only statement in the next
> label block is break;
> 
> Found using:
> 
> $ grep-2.5.4 -rP --include=*.[ch] -n 
> "fallthrough;(\s*(case\s+\w+|default)\s*:\s*){1,7}break;" *
> 
> Miscellanea:
> 
> o Move or coalesce a couple label blocks above a default: block.
> 
> Signed-off-by: Joe Perches 

...

> diff --git a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c 
> b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c
> index 252fe06f58aa..1d5b87079104 100644
> --- a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c
> +++ b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c
> @@ -345,7 +345,7 @@ static int matching_bar(struct nfp_bar *bar, u32 tgt, u32 
> act, u32 tok,
>   baract = NFP_CPP_ACTION_RW;
>   if (act == 0)
>   act = NFP_CPP_ACTION_RW;
> - fallthrough;
> + break;
>   case NFP_PCIE_BAR_PCIE2CPP_MapType_FIXED:
>   break;
>   default:

This is a cascading fall-through handling all map types.
I don't think this change improves readability.

...
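
(An illustrative before/after of the conversion itself, not taken from the
patch -- frob() is a stand-in:)

	/* before: falls through to a label whose only statement is break */
	case 1:
		frob();
		fallthrough;
	case 2:
		break;

	/* after: same control flow, stated directly */
	case 1:
		frob();
		break;
	case 2:
		break;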


Re: [PATCH RESEND v9 11/13] iommu/arm-smmu-v3: Add SVA device feature

2020-09-17 Thread Jean-Philippe Brucker
On Tue, Sep 08, 2020 at 11:46:05AM +0200, Auger Eric wrote:
> Hi Jean,
> 
> On 8/17/20 7:15 PM, Jean-Philippe Brucker wrote:
> > Implement the IOMMU device feature callbacks to support the SVA feature.
> > At the moment dev_has_feat() returns false since I/O Page Faults aren't
> > yet implemented.
> and because we don't advertise BTM, right?

Right, adding it to the commit log

> Besides
> 
> Reviewed-by: Eric Auger 

Thanks for the reviews!

Jean


Re: [PATCH RESEND v9 10/13] iommu/arm-smmu-v3: Check for SVA features

2020-09-17 Thread Jean-Philippe Brucker
On Tue, Sep 08, 2020 at 11:38:31AM +0200, Auger Eric wrote:
> Hi Jean,
> On 8/17/20 7:15 PM, Jean-Philippe Brucker wrote:
> > Aggregate all sanity-checks for sharing CPU page tables with the SMMU
> > under a single ARM_SMMU_FEAT_SVA bit. For PCIe SVA, users also need to
> > check FEAT_ATS and FEAT_PRI. For platform SVA, they will have to check
> > FEAT_STALLS.
> > 
> > Introduce ARM_SMMU_FEAT_BTM (Broadcast TLB Maintenance), but don't
> > enable it at the moment. Since the entire VMID space is shared with the
> > CPU, enabling DVM (by clearing SMMU_CR2.PTM) could result in
> > over-invalidation and affect performance of stage-2 mappings.
> In which series do you plan to enable it?

In the third part, after the PRI+stall series. I still haven't had time to
look at solving the stage-2 DVM problem (pinning VMIDs through KVM), so it
might be a while.

[...]
> > +   /*
> > +* See max_pinned_asids in arch/arm64/mm/context.c. The following is
> > +* generally the maximum number of bindable processes.
> > +*/
> > +   if (IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0))
> Out of curiosity, what is the rationale behind using
> arm64_kernel_unmapped_at_el0() versus
> IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0)?
> CPU caps being finalized?

I'm not sure. The caps are finalized at this point. I'll change it.

> Is that why you say "generally" here?

I said "generally" because having less PASIDs than ASIDs is in theory
possible, but hardware will normally support 20-bit PASIDs.

> > +   asid_bits--;
> > +   dev_dbg(smmu->dev, "%d shared contexts\n", (1 << asid_bits) -
> > +   num_possible_cpus() - 2);
> nit: s/shared/bindable?

I find "shared" clearer, with regard to contexts

Thanks,
Jean
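
(Editor's note: a hedged back-of-the-envelope for the number printed by that
dev_dbg(), assuming 16 ASID bits and the KPTI halving from the quoted hunk;
the figures are illustrative only.)

    int asid_bits = 16;

    /* Kernel unmapping at EL0 splits the ASID space in half: */
    if (IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0))
        asid_bits--;                    /* 15 usable bits */

    /* e.g. 8 possible CPUs: (1 << 15) - 8 - 2 = 32758 shared contexts */
    int shared = (1 << asid_bits) - num_possible_cpus() - 2;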


Re: [PATCH RESEND v9 09/13] iommu/arm-smmu-v3: Seize private ASID

2020-09-17 Thread Jean-Philippe Brucker
On Mon, Sep 07, 2020 at 06:41:11PM +0200, Auger Eric wrote:
> > +/*
> > + * Try to reserve this ASID in the SMMU. If it is in use, try to steal it from
> > + * the private entry. Careful here, we may be modifying the context tables of
> > + * another SMMU!
> Not sure I got what you meant by this comment.

That comment does need refreshing:

/*
 * Check if the CPU ASID is available on the SMMU side. If a private context
 * descriptor is using it, try to replace it.
 */

> > + */
> >  static struct arm_smmu_ctx_desc *
> >  arm_smmu_share_asid(struct mm_struct *mm, u16 asid)
> >  {
> > +   int ret;
> > +   u32 new_asid;
> > struct arm_smmu_ctx_desc *cd;
> > +   struct arm_smmu_device *smmu;
> > +   struct arm_smmu_domain *smmu_domain;
> >  
> > cd = xa_load(&arm_smmu_asid_xa, asid);
> > if (!cd)
> > @@ -27,8 +36,31 @@ arm_smmu_share_asid(struct mm_struct *mm, u16 asid)
> > return cd;
> > }
> >  
> > -   /* Ouch, ASID is already in use for a private cd. */
> > -   return ERR_PTR(-EBUSY);
> > +   smmu_domain = container_of(cd, struct arm_smmu_domain, s1_cfg.cd);
> > +   smmu = smmu_domain->smmu;
> > +
> > +   ret = xa_alloc(&arm_smmu_asid_xa, &new_asid, cd,
> > +  XA_LIMIT(1, 1 << smmu->asid_bits), GFP_KERNEL);
> XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL)

Good catch

Thanks,
Jean
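
(Editor's note: the off-by-one matters because XA_LIMIT bounds in
<linux/xarray.h> are inclusive; a hedged illustration with 16 ASID bits.)

    /* XA_LIMIT(min, max) describes an inclusive range: */
    XA_LIMIT(1, 1 << 16);        /* can hand out 0x10000, one past the
                                  * largest valid 16-bit ASID */
    XA_LIMIT(1, (1 << 16) - 1);  /* tops out at 0xffff, as intended */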



Re: [PATCH RESEND v9 00/13] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part)

2020-09-17 Thread Jean-Philippe Brucker
On Fri, Sep 04, 2020 at 11:08:34AM +0200, Joerg Roedel wrote:
> On Mon, Aug 17, 2020 at 07:15:46PM +0200, Jean-Philippe Brucker wrote:
> > Jean-Philippe Brucker (12):
> >   iommu/ioasid: Add ioasid references
> >   iommu/sva: Add PASID helpers
> >   arm64: mm: Pin down ASIDs for sharing mm with devices
> >   iommu/io-pgtable-arm: Move some definitions to a header
> >   arm64: cpufeature: Export symbol read_sanitised_ftr_reg()
> >   iommu/arm-smmu-v3: Move definitions to a header
> >   iommu/arm-smmu-v3: Share process page tables
> >   iommu/arm-smmu-v3: Seize private ASID
> >   iommu/arm-smmu-v3: Check for SVA features
> >   iommu/arm-smmu-v3: Add SVA device feature
> >   iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind()
> >   iommu/arm-smmu-v3: Hook up ATC invalidation to mm ops
> 
> Looks like the mm parts still need Acks/Reviews?
> 

Yes, I was hoping that the original [1] would get an ack or review.
Hopefully it will get picked up for 5.10, but I'll also Cc the mm list in
my v10.

Thanks,
Jean

[1] 
https://lore.kernel.org/linux-iommu/1600187413-163670-8-git-send-email-fenghua...@intel.com/T/#u


Re: [PATCH RESEND v9 08/13] iommu/arm-smmu-v3: Share process page tables

2020-09-17 Thread Jean-Philippe Brucker
On Tue, Sep 08, 2020 at 09:41:20AM +0200, Auger Eric wrote:
> > +static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm)
> > +{
> > +   u16 asid;
> > +   int err = 0;
> > +   u64 tcr, par, reg;
> > +   struct arm_smmu_ctx_desc *cd;
> > +   struct arm_smmu_ctx_desc *ret = NULL;
> > +
> > +   asid = arm64_mm_context_get(mm);
> > +   if (!asid)
> > +   return ERR_PTR(-ESRCH);
> > +
> > +   cd = kzalloc(sizeof(*cd), GFP_KERNEL);
> > +   if (!cd) {
> > +   err = -ENOMEM;
> > +   goto out_put_context;
> > +   }
> > +
> > +   refcount_set(&cd->refs, 1);
> > +
> > +   mutex_lock(&arm_smmu_asid_lock);
> > +   ret = arm_smmu_share_asid(mm, asid);
> > +   if (ret) {
> > +   mutex_unlock(&arm_smmu_asid_lock);
> > +   goto out_free_cd;
> > +   }
> > +
> > +   err = xa_insert(&arm_smmu_asid_xa, asid, cd, GFP_KERNEL);
> > +   mutex_unlock(&arm_smmu_asid_lock);
> I am not clear about the locking scope. Can't we release the lock
> earlier, since, if I understand correctly, xa_insert/xa_erase takes the
> xa_lock?

The mutex prevents conflicts with the private ASID allocation:

CPU 1   CPU 2
arm_smmu_alloc_shared_cd()  arm_smmu_attach_dev()
 arm_smmu_share_asid(mm, 1)  arm_smmu_domain_finalise_s1()
  xa_load(&asid_xa, 1) -> NULL, ok
  xa_alloc(&asid_xa) -> ASID #1
 xa_insert(&asid_xa, 1) -> error

> > +
> > +   if (err)
> > +   goto out_free_asid;
> > +
> > +   tcr = FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, 64ULL - VA_BITS) |
> Wondering whether an additional check is needed to verify that T0SZ is
> valid, as documented in the 5.4 Context Descriptor T0SZ description.

Indeed, this code probably predates 52-bit VA support. Now patch 10 should
check the VA limits, and we should use vabits_actual rather than VA_BITS.
I'll leave out IDR3.STT for now because arch/arm64/ doesn't support it.

Thanks,
Jean
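
(Editor's note: a hedged sketch of the vabits_actual variant being discussed;
treat it as illustrative, not the eventual patch.)

    /* T0SZ encodes the VA size as 64 minus the number of VA bits:
     * 48-bit VAs -> T0SZ = 16, 52-bit VAs -> T0SZ = 12. Using the
     * runtime vabits_actual avoids baking in a compile-time VA_BITS
     * of 52 on hardware that only implements 48 bits. */
    tcr = FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, 64ULL - vabits_actual);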


Re: [PATCH RESEND v9 03/13] iommu/sva: Add PASID helpers

2020-09-17 Thread Jean-Philippe Brucker
On Tue, Sep 08, 2020 at 09:45:02AM +0200, Auger Eric wrote:
> > +int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max)
> > +{
> > +   int ret = 0;
> > +   ioasid_t pasid;
> > +
> > +   if (min == INVALID_IOASID || max == INVALID_IOASID ||
> > +   min == 0 || max < min)
> you may add a comment explaining why min == 0 is forbidden.

Right, I'll add to the function doc "@min must be greater than 0 because
0 indicates an unused mm->pasid."

> > +   return -EINVAL;
> > +
> > +   mutex_lock(&iommu_sva_lock);
> > +   if (mm->pasid) {
> > +   if (mm->pasid >= min && mm->pasid <= max)
> > +   ioasid_get(mm->pasid);
> > +   else
> > +   ret = -EOVERFLOW;
> > +   } else {
> > +   pasid = ioasid_alloc(&iommu_sva_pasid, min, max, mm);
> > +   if (pasid == INVALID_IOASID)
> > +   ret = -ENOMEM;
> > +   else
> > +   mm->pasid = pasid;
> > +   }
> > +   mutex_unlock(&iommu_sva_lock);
> > +   return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(iommu_sva_alloc_pasid);
> > +
> > +/**
> > + * iommu_sva_free_pasid - Release the mm's PASID
> > + * @mm: the mm.
> > + *
> > + * Drop one reference to a PASID allocated with iommu_sva_alloc_pasid()
> > + */
> > +void iommu_sva_free_pasid(struct mm_struct *mm)
> > +{
> > +   mutex_lock(&iommu_sva_lock);
> > +   if (ioasid_put(mm->pasid))
> > +   mm->pasid = 0;
> ditto: 0 versus INVALID_IOASID
> > +   mutex_unlock(&iommu_sva_lock);
> > +}
> > +EXPORT_SYMBOL_GPL(iommu_sva_free_pasid);
> > +
> > +/* ioasid wants a void * argument */
> shouldn't it be:
> ioasid_find getter() requires a void *arg?

Ok

Thanks,
Jean

> > +static bool __mmget_not_zero(void *mm)
> > +{
> > +   return mmget_not_zero(mm);
> > +}
> > +
> > +/**
> > + * iommu_sva_find() - Find mm associated to the given PASID
> > + * @pasid: Process Address Space ID assigned to the mm
> > + *
> > + * On success a reference to the mm is taken, and must be released with 
> > mmput().
> > + *
> > + * Returns the mm corresponding to this PASID, or an error if not found.
> > + */
> > +struct mm_struct *iommu_sva_find(ioasid_t pasid)
> > +{
> > +   return ioasid_find(&iommu_sva_pasid, pasid, __mmget_not_zero);
> > +}
> > +EXPORT_SYMBOL_GPL(iommu_sva_find);
> > 
> Thanks
> 
> Eric
> 
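
(Editor's note: a hedged usage sketch of the helpers quoted above;
example_bind()/example_unbind() and the 20-bit PASID ceiling are illustrative
assumptions, not part of the patch.)

    /* Hypothetical caller, e.g. an IOMMU driver binding an mm: */
    static int example_bind(struct mm_struct *mm)
    {
        int ret;

        /* min is 1 because PASID 0 means "unused mm->pasid" */
        ret = iommu_sva_alloc_pasid(mm, 1, (1 << 20) - 1);
        if (ret)
            return ret;

        /* ... program mm->pasid into the device's context table ... */
        return 0;
    }

    static void example_unbind(struct mm_struct *mm)
    {
        /* drops the reference; clears mm->pasid on the last put */
        iommu_sva_free_pasid(mm);
    }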


Re: [PATCH 0/3] drm: panfrost: Coherency support

2020-09-17 Thread Robin Murphy

On 2020-09-16 18:46, Rob Herring wrote:
> On Wed, Sep 16, 2020 at 11:04 AM Alyssa Rosenzweig wrote:
>>> So I get a performance regression with the dma-coherent approach,
>>> even if it's clearly the cleaner.
>>
>> That's bizarre -- this should really be the faster of the two.
>
> Coherency may not be free. CortexA9 had something like 4x slower
> memcpy if SMP was enabled as an example. I don't know if there's
> anything going on like that specifically here. If there's never any
> CPU accesses mixed in with kmscube, then there would be no benefit to
> coherency.


There will still be CPU benefits in terms of not having to spend time 
cache-cleaning every BO upon allocation, and less overhead on writing 
out descriptors in the first place (due to cacheable vs. non-cacheable).


I haven't tried the NSh hack on Juno, but I don't see any notable 
performance issue as-is - kmscube hits a solid 60FPS from the off (now 
that it works without spewing faults). Given that the hardware on Juno 
can be generously described as "less good", it would certainly be 
interesting to figure out what difference is at play here...


The usual argument against I/O coherency is that it adds latency to 
every access, but if you already have a coherent interconnect anyway 
then the sensible answer to that is implementing decent snoop filters, 
rather than making software more complicated.


Robin.


Re: [PATCH 0/3] drm: panfrost: Coherency support

2020-09-17 Thread Alyssa Rosenzweig
> The DDK blob has the ability to mark only certain areas of memory as
> coherent for performance reasons. For simple things like kmscube I would
> expect that it's basically write-only from the CPU and almost all memory the
> GPU touches isn't touched by the CPU. I.e. coherency isn't helping and the
> coherency traffic is probably expensive. Whether the complexity is worth it
> for "real" content I don't know - it may just be silly benchmarks that
> benefit.

Right, Panfrost userspace specifically assumes GPU reads to be expensive
and treats GPU memory as write-only *except* for a few special cases
(compute-like workloads, glReadPixels, some blits, etc).

The vast majority of the GPU memory - everything used in kmscube - will be
write-only to the CPU and fed directly into the display zero-copy (or
read by the GPU later as a dmabuf).



[RESEND][PATCH 0/2] iommu/tegra-smmu: Fix TLB line for Tegra210

2020-09-17 Thread Nicolin Chen
These two patches fix the ACTIVE_TLB_LINES field setting in the
tegra-smmu driver for Tegra210 platforms.

This resend groups into one series two previously separate changes
that are correlated, as pointed out by Thierry. Also adding his
Acked-by.

Nicolin Chen (2):
  iommu/tegra-smmu: Fix tlb_mask
  memory: tegra: Correct num_tlb_lines for tegra210

 drivers/iommu/tegra-smmu.c  | 2 +-
 drivers/memory/tegra/tegra210.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

-- 
2.17.1



Re: [PATCH v3] powerpc/pseries/svm: Allocate SWIOTLB buffer anywhere in memory

2020-09-17 Thread Michael Ellerman
On Tue, 18 Aug 2020 19:11:26 -0300, Thiago Jung Bauermann wrote:
> POWER secure guests (i.e., guests which use the Protected Execution
> Facility) need to use SWIOTLB to be able to do I/O with the hypervisor, but
> they don't need the SWIOTLB memory to be in low addresses since the
> hypervisor doesn't have any addressing limitation.
> 
> This solves a SWIOTLB initialization problem we are seeing in secure guests
> with 128 GB of RAM: they are configured with 4 GB of crashkernel reserved
> memory, which leaves no space for SWIOTLB in low addresses.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/pseries/svm: Allocate SWIOTLB buffer anywhere in memory
  https://git.kernel.org/powerpc/c/eae9eec476d13fad9af6da1f44a054ee02b7b161

cheers


[RESEND][PATCH 2/2] memory: tegra: Correct num_tlb_lines for tegra210

2020-09-17 Thread Nicolin Chen
According to Tegra210 TRM, the default value of TLB_ACTIVE_LINES
field of register MC_SMMU_TLB_CONFIG_0 is 0x30. So num_tlb_lines
should be 48 (0x30) rather than 32 (0x20).

Signed-off-by: Nicolin Chen 
Acked-by: Thierry Reding 
---
 drivers/memory/tegra/tegra210.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/memory/tegra/tegra210.c b/drivers/memory/tegra/tegra210.c
index 51b537cfa5a7..4fbf8cbc 100644
--- a/drivers/memory/tegra/tegra210.c
+++ b/drivers/memory/tegra/tegra210.c
@@ -1074,7 +1074,7 @@ static const struct tegra_smmu_soc tegra210_smmu_soc = {
.num_groups = ARRAY_SIZE(tegra210_groups),
.supports_round_robin_arbitration = true,
.supports_request_limit = true,
-   .num_tlb_lines = 32,
+   .num_tlb_lines = 48,
.num_asids = 128,
 };
 
-- 
2.17.1



[RESEND][PATCH 1/2] iommu/tegra-smmu: Fix tlb_mask

2020-09-17 Thread Nicolin Chen
The "num_tlb_lines" might not be a power-of-2 value, being 48 on
Tegra210 for example. So the current way of calculating tlb_mask
using the num_tlb_lines is not correct: tlb_mask=0x5f in case of
num_tlb_lines=48, which will trim a setting of 0x30 (48) to 0x10.

Signed-off-by: Nicolin Chen 
Acked-by: Thierry Reding 
---
 drivers/iommu/tegra-smmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 84fdee473873..0becdbfea306 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -1120,7 +1120,7 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
BIT_MASK(mc->soc->num_address_bits - SMMU_PTE_SHIFT) - 1;
dev_dbg(dev, "address bits: %u, PFN mask: %#lx\n",
mc->soc->num_address_bits, smmu->pfn_mask);
-   smmu->tlb_mask = (smmu->soc->num_tlb_lines << 1) - 1;
+   smmu->tlb_mask = (1 << fls(smmu->soc->num_tlb_lines)) - 1;
dev_dbg(dev, "TLB lines: %u, mask: %#lx\n", smmu->soc->num_tlb_lines,
smmu->tlb_mask);
 
-- 
2.17.1
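
(Editor's note: a quick standalone check of the commit-message arithmetic;
fls_demo() is a local stand-in for the kernel's fls(), everything else is
illustrative.)

    #include <stdio.h>

    /* 1-based index of the highest set bit, 0 for an input of 0 --
     * mirrors the kernel's fls(). */
    static int fls_demo(unsigned int x)
    {
        return x ? 32 - __builtin_clz(x) : 0;
    }

    int main(void)
    {
        unsigned int lines = 48;                        /* Tegra210 */
        unsigned int old_mask = (lines << 1) - 1;       /* 0x5f, bit 5 missing */
        unsigned int new_mask = (1u << fls_demo(lines)) - 1; /* 0x3f */

        /* old: 0x30 & 0x5f = 0x10 (setting trimmed)
         * new: 0x30 & 0x3f = 0x30 (setting preserved) */
        printf("old 0x%02x -> 0x%02x, new 0x%02x -> 0x%02x\n",
               old_mask, 0x30 & old_mask, new_mask, 0x30 & new_mask);
        return 0;
    }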



Re: [PATCH 0/3] drm: panfrost: Coherency support

2020-09-17 Thread Steven Price

On 17/09/2020 11:51, Tomeu Vizoso wrote:

On 9/17/20 12:38 PM, Steven Price wrote:

On 16/09/2020 18:46, Rob Herring wrote:

On Wed, Sep 16, 2020 at 11:04 AM Alyssa Rosenzweig
 wrote:


So I get a performance regression with the dma-coherent approach, 
even if it's

clearly the cleaner.


That's bizarre -- this should really be the faster of the two.


Coherency may not be free. CortexA9 had something like 4x slower
memcpy if SMP was enabled as an example. I don't know if there's
anything going on like that specifically here. If there's never any
CPU accesses mixed in with kmscube, then there would be no benefit to
coherency.


The DDK blob has the ability to mark only certain areas of memory as 
coherent for performance reasons. For simple things like kmscube I 
would expect that it's basically write-only from the CPU and almost 
all memory the GPU touches isn't touched by the CPU. I.e. coherency 
isn't helping and the coherency traffic is probably expensive. Whether 
the complexity is worth it for "real" content I don't know - it may 
just be silly benchmarks that benefit.


Or maybe it's only a problem for applications that do silly things? I 
don't think kmscube was ever optimized for performance.


Well doing silly things is almost the definition of a benchmark ;) A lot 
of the mobile graphics benchmarks suffer from not being very intelligent 
in how they render (e.g. many have meshes that are far too detailed so 
the triangles are smaller than the pixels).


Of course there are also applications that get things wrong too.

Steve


Re: [PATCH 0/3] drm: panfrost: Coherency support

2020-09-17 Thread Tomeu Vizoso

On 9/17/20 12:38 PM, Steven Price wrote:
> On 16/09/2020 18:46, Rob Herring wrote:
>> On Wed, Sep 16, 2020 at 11:04 AM Alyssa Rosenzweig wrote:
>>>> So I get a performance regression with the dma-coherent approach,
>>>> even if it's clearly the cleaner.
>>>
>>> That's bizarre -- this should really be the faster of the two.
>>
>> Coherency may not be free. CortexA9 had something like 4x slower
>> memcpy if SMP was enabled as an example. I don't know if there's
>> anything going on like that specifically here. If there's never any
>> CPU accesses mixed in with kmscube, then there would be no benefit to
>> coherency.
>
> The DDK blob has the ability to mark only certain areas of memory as
> coherent for performance reasons. For simple things like kmscube I would
> expect that it's basically write-only from the CPU and almost all memory
> the GPU touches isn't touched by the CPU. I.e. coherency isn't helping
> and the coherency traffic is probably expensive. Whether the complexity
> is worth it for "real" content I don't know - it may just be silly
> benchmarks that benefit.


Or maybe it's only a problem for applications that do silly things? I 
don't think kmscube was ever optimized for performance.


Regards,

Tomeu


Re: [PATCH] iommu/tegra-smmu: Fix tlb_mask

2020-09-17 Thread Thierry Reding
On Tue, Sep 15, 2020 at 05:23:59PM -0700, Nicolin Chen wrote:
> The "num_tlb_lines" might not be a power-of-2 value, being 48 on
> Tegra210 for example. So the current way of calculating tlb_mask
> using the num_tlb_lines is not correct: tlb_mask=0x5f in case of
> num_tlb_lines=48, which will trim a setting of 0x30 (48) to 0x10.
> 
> Signed-off-by: Nicolin Chen 
> ---
>  drivers/iommu/tegra-smmu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

This is technically a prerequisite for this patch you sent out earlier:


https://patchwork.ozlabs.org/project/linux-tegra/patch/20200915232803.26163-1-nicoleots...@gmail.com/

You should send both of those out as one series and add maintainers for
both subsystems to both patches so that they can work out who will be
applying them.

For this pair it's probably best for Joerg to pick up both patches
because this primarily concerns the Tegra SMMU, whereas the above patch
only provides the per-SoC data update for the SMMU. Obviously if Joerg
prefers for Krzysztof to pick up both patches that's fine with me too.

In either case, please send this out as a series so that both Joerg and
Krzysztof (Cc'ed for visibility) are aware of both patches. From the
Tegra side:

Acked-by: Thierry Reding 

> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
> index 84fdee473873..0becdbfea306 100644
> --- a/drivers/iommu/tegra-smmu.c
> +++ b/drivers/iommu/tegra-smmu.c
> @@ -1120,7 +1120,7 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
>   BIT_MASK(mc->soc->num_address_bits - SMMU_PTE_SHIFT) - 1;
>   dev_dbg(dev, "address bits: %u, PFN mask: %#lx\n",
>   mc->soc->num_address_bits, smmu->pfn_mask);
> - smmu->tlb_mask = (smmu->soc->num_tlb_lines << 1) - 1;
> + smmu->tlb_mask = (1 << fls(smmu->soc->num_tlb_lines)) - 1;
>   dev_dbg(dev, "TLB lines: %u, mask: %#lx\n", smmu->soc->num_tlb_lines,
>   smmu->tlb_mask);
>  
> -- 
> 2.17.1
> 



Re: [PATCH 0/3] drm: panfrost: Coherency support

2020-09-17 Thread Steven Price

On 16/09/2020 18:46, Rob Herring wrote:
> On Wed, Sep 16, 2020 at 11:04 AM Alyssa Rosenzweig wrote:
>>> So I get a performance regression with the dma-coherent approach,
>>> even if it's clearly the cleaner.
>>
>> That's bizarre -- this should really be the faster of the two.
>
> Coherency may not be free. CortexA9 had something like 4x slower
> memcpy if SMP was enabled as an example. I don't know if there's
> anything going on like that specifically here. If there's never any
> CPU accesses mixed in with kmscube, then there would be no benefit to
> coherency.


The DDK blob has the ability to mark only certain areas of memory as 
coherent for performance reasons. For simple things like kmscube I would 
expect that it's basically write-only from the CPU and almost all memory 
the GPU touches isn't touched by the CPU. I.e. coherency isn't helping 
and the coherency traffic is probably expensive. Whether the complexity 
is worth it for "real" content I don't know - it may just be silly 
benchmarks that benefit.


Steve


Re: [PATCH 2/3] drm/panfrost: Support cache-coherent integrations

2020-09-17 Thread Steven Price

On 16/09/2020 00:51, Robin Murphy wrote:

When the GPU's ACE-Lite interface is fully wired up and capable of
snooping CPU caches, it may be described as "dma-coherent" in
devicetree, which will already inform the DMA layer not to perform
unnecessary cache maintenance. However, we still need to ensure that
the GPU uses the appropriate cacheable outer-shareable attributes in
order to generate the requisite snoop signals, and that CPU mappings
don't create a mismatch by using a non-cacheable type either.

Signed-off-by: Robin Murphy 


LGTM

Reviewed-by: Steven Price 


---
  drivers/gpu/drm/panfrost/panfrost_device.h | 1 +
  drivers/gpu/drm/panfrost/panfrost_drv.c| 2 ++
  drivers/gpu/drm/panfrost/panfrost_gem.c| 2 ++
  drivers/gpu/drm/panfrost/panfrost_mmu.c| 1 +
  4 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h 
b/drivers/gpu/drm/panfrost/panfrost_device.h
index c30c719a8059..b31f45315e96 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.h
+++ b/drivers/gpu/drm/panfrost/panfrost_device.h
@@ -84,6 +84,7 @@ struct panfrost_device {
/* pm_domains for devices with more than one. */
struct device *pm_domain_devs[MAX_PM_DOMAINS];
struct device_link *pm_domain_links[MAX_PM_DOMAINS];
+   bool coherent;
  
  	struct panfrost_features features;

const struct panfrost_compatible *comp;
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
b/drivers/gpu/drm/panfrost/panfrost_drv.c
index ada51df9a7a3..2a6f2f716b2f 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -588,6 +588,8 @@ static int panfrost_probe(struct platform_device *pdev)
if (!pfdev->comp)
return -ENODEV;
  
+	pfdev->coherent = device_get_dma_attr(&pdev->dev) == DEV_DMA_COHERENT;

+
/* Allocate and initialze the DRM device. */
	ddev = drm_dev_alloc(&panfrost_drm_driver, &pdev->dev);
if (IS_ERR(ddev))
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c 
b/drivers/gpu/drm/panfrost/panfrost_gem.c
index 33355dd302f1..cdf1a8754eba 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
@@ -220,6 +220,7 @@ static const struct drm_gem_object_funcs panfrost_gem_funcs 
= {
   */
  struct drm_gem_object *panfrost_gem_create_object(struct drm_device *dev, 
size_t size)
  {
+   struct panfrost_device *pfdev = dev->dev_private;
struct panfrost_gem_object *obj;
  
  	obj = kzalloc(sizeof(*obj), GFP_KERNEL);

@@ -229,6 +230,7 @@ struct drm_gem_object *panfrost_gem_create_object(struct 
drm_device *dev, size_t
	INIT_LIST_HEAD(&obj->mappings.list);
	mutex_init(&obj->mappings.lock);
obj->base.base.funcs = _gem_funcs;
+   obj->base.map_cached = pfdev->coherent;
  
	return &obj->base.base;

  }
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c 
b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index e8f7b11352d2..8852fd378f7a 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -371,6 +371,7 @@ int panfrost_mmu_pgtable_alloc(struct panfrost_file_priv 
*priv)
.pgsize_bitmap  = SZ_4K | SZ_2M,
.ias= FIELD_GET(0xff, pfdev->features.mmu_features),
.oas= FIELD_GET(0xff00, 
pfdev->features.mmu_features),
+   .coherent_walk  = pfdev->coherent,
.tlb= _tlb_ops,
.iommu_dev  = pfdev->dev,
};





Re: [PATCH v8 3/9] Documentation/x86: Add documentation for SVA (Shared Virtual Addressing)

2020-09-17 Thread Borislav Petkov
On Tue, Sep 15, 2020 at 09:30:07AM -0700, Fenghua Yu wrote:
> +Background
> +==========
> +
> +Shared Virtual Addressing (SVA) allows the processor and device to use the
> +same virtual addresses avoiding the need for software to translate virtual
> +addresses to physical addresses. SVA is what PCIe calls Shared Virtual
> +Memory (SVM).
> +
> +In addition to the convenience of the device using application virtual
> +addresses, SVA also removes the need to pin pages for DMA.
> +PCIe Address Translation Services (ATS) along with Page Request Interface
> +(PRI) allow devices to function much the same way as the CPU handling
> +application page-faults. For more information please refer to the PCIe
> +specification Chapter 10: ATS Specification.
> +
> +Use of SVA requires IOMMU support in the platform. IOMMU also is required
> +to support PCIe features ATS and PRI. ATS allows devices to cache
> +translations for virtual addresses. The IOMMU driver uses the mmu_notifier()
> +support to keep the device TLB cache and the CPU cache in sync. PRI allows
> +the device to request paging the virtual address by using the CPU page tables
> +before accessing the address.

That still reads funny, the "the device to request paging the virtual
address" part. Do you mean that per chance here:

"Before the device can access that address, the device uses the PRI in
order to request the virtual address to be paged into the CPU page
tables."

?

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


RE: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs

2020-09-17 Thread Tian, Kevin
> From: Jason Gunthorpe 
> Sent: Wednesday, September 16, 2020 10:45 PM
> 
> On Wed, Sep 16, 2020 at 01:19:18AM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe 
> > > Sent: Tuesday, September 15, 2020 10:29 PM
> > >
> > > > Do they need a device at all?  It's not clear to me why RID based
> > > > IOMMU management fits within vfio's scope, but PASID based does not.
> > >
> > > In RID mode vfio-pci completely owns the PCI function, so it is more
> > > natural that VFIO, as the sole device owner, would own the DMA mapping
> > > machinery. Further, the RID IOMMU mode is rarely used outside of VFIO
> > > so there is not much reason to try and disaggregate the API.
> >
> > It is also used by vDPA.
> 
> A driver in VDPA, not VDPA itself.

What is the difference? It is still an example of using RID IOMMU mode
outside of VFIO (and it just implies that vDPA doesn't even do a good
abstraction internally).

> 
> > > PASID on the other hand, is shared. vfio-mdev drivers will share the
> > > device with other kernel drivers. PASID and DMA will be concurrent
> > > with VFIO and other kernel drivers/etc.
> >
> > Looks you are equating PASID to host-side sharing, while ignoring
> > another valid usage that a PASID-capable device is passed through
> > to the guest through vfio-pci and then PASID is used by the guest
> > for guest-side sharing. In such case, it is an exclusive usage in host
> > side and then what is the problem for VFIO to manage PASID given
> > that vfio-pci completely owns the function?
> 
> This is no different than vfio-pci being yet another client to
> /dev/sva
> 

My comment was to echo Alex's question about "why RID based
IOMMU management fits within vfio's scope, but PASID based
does not". And when talking about generalization we should look
bigger, beyond SVA. What really matters here is the iommu_domain,
which covers everything related to DMA mapping. The domain
associated with a passthrough device is marked as "unmanaged" in
the kernel, and userspace manages the DMA mapping of the device
through a set of iommu_ops (see the sketch after this list):

- alloc/free domain;
- attach/detach device/subdevice;
- map/unmap a memory region;
- bind/unbind page table and invalidate iommu cache;
- ... (and lots of other callbacks)
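
(Editor's note: an abbreviated sketch of the kind of callbacks meant here; the
field names approximate the struct iommu_ops of this era and the trimmed
signatures are illustrative, not authoritative.)

    struct iommu_ops {
        struct iommu_domain *(*domain_alloc)(unsigned int type);
        void (*domain_free)(struct iommu_domain *domain);
        int (*attach_dev)(struct iommu_domain *domain, struct device *dev);
        void (*detach_dev)(struct iommu_domain *domain, struct device *dev);
        int (*map)(struct iommu_domain *domain, unsigned long iova,
                   phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
        size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
                        size_t size, struct iommu_iotlb_gather *gather);
        /* ... page-table bind/unbind, cache invalidation, and more ... */
    };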

map/unmap and bind/unbind are just different ways of managing
DMAs in an iommu domain. The passthrough framework (VFIO
or vDPA) has so far provided its uAPI to manage every aspect of
the iommu_domain, and SVA is just a natural extension following
this design. If we really want to generalize something, it needs to
be /dev/iommu as a unified interface for managing every aspect
of the iommu_domain. Asking for an SVA abstraction alone just
causes unnecessary mess for both the kernel (syncing domain/device
association between /dev/vfio and /dev/sva) and userspace (talking
to two interfaces even for the same vfio-pci device). It then sounds
more like a bandaid to save development effort in vDPA, which should
instead have proposed /dev/iommu when it was invented, rather than
reinventing its own bits until that effort became unaffordable and
then asking for a partial abstraction to fill the gap.

Thanks
Kevin