RE: (a design open) RE: [PATCH v1 6/8] vfio/type1: Bind guest page tables to host

2020-05-15 Thread Tian, Kevin
> From: Raj, Ashok 
> Sent: Friday, May 15, 2020 11:20 PM
> 
> On Fri, May 15, 2020 at 12:39:14AM -0700, Tian, Kevin wrote:
> > Hi, Alex,
> >
> > While working on an updated version, Yi and I found an open design
> > question which needs your guidance.
> >
> > Conceptually, nested translation can be realized as one GPA->HPA page
> > table and multiple GVA->GPA page tables per VM. This means one container
> > is sufficient to include all SVA-capable devices assigned to the same guest,
> > as there is just one IOVA space (GPA->HPA) to be managed by the vfio iommu
> > map/unmap API. GVA->GPA page tables are bound through the new API
> > introduced in this patch. This differs from legacy shadow translation,
> > which merges GIOVA->GPA->HPA into GIOVA->HPA, so each device requires
> > its own IOVA space and must be in a separate container.
> >
> > However, supporting multiple SVA-capable devices in one container
> > imposes a challenge. Currently bind_guest_pgtbl is implemented as an
> > iommu type1 API. By the definition of iommu type1, any operation
> > should be applied to all devices within the same container, just like
> > DMA map/unmap. However, this philosophy is wrong for page table
> > binding. We must follow the guest's binding between its page tables
> > and assigned devices; otherwise every bound address space suddenly
> > becomes accessible to all assigned devices, creating security holes.
> 
> The above 2 paragraphs are a bit confusing :-) but what really matters
> is the below:
> >
> > Do you think it's possible to extend the iommu API to accept
> > device-specific commands? If not, moving it to vfio-pci or vfio-mdev
> > also sounds problematic, as PASIDs and page tables are IOMMU things
> > which are not touched by vfio device drivers today.
> 
> All you are referring to is that when the admin groups multiple devices
> in a single container, you can't give them isolation for SVA purposes.
> This is logically equivalent to a single group with multiple devices:
> 
>   - such as devices behind a P2P bridge
>   - MFDs, or devices behind switches or hierarchies without ACS
> support, for instance.
> 
> By limitation you mean disabling PASID on those devices in a single
> container?

The limitation means disabling the nesting capability in such a container,
and yes, it implies not exposing the PASID capability to userspace either.

> 
> what about ATS?

ATS is possibly fine. VFIO exposes it to userspace even today.

Thanks
Kevin

> 
> >
> > Alternatively, can we impose the limitation that nesting APIs can
> > only be enabled for singleton containers which contain only one device?
> > This basically falls back to the state of the legacy shadow-translation
> > vIOMMU, and our current QEMU vIOMMU implementation actually
> > works this way regardless of whether nesting is used. Do you think
> > such a tradeoff is acceptable as a starting point?
> >
> > Thanks
> > Kevin
> >
> > > -Original Message-
> > > From: Liu, Yi L 
> > > Sent: Sunday, March 22, 2020 8:32 PM
> > > To: alex.william...@redhat.com; eric.au...@redhat.com
> > > Cc: Tian, Kevin ; jacob.jun@linux.intel.com;
> > > j...@8bytes.org; Raj, Ashok ; Liu, Yi L
> > > ; Tian, Jun J ; Sun, Yi Y
> > > ; jean-phili...@linaro.org; pet...@redhat.com;
> > > iommu@lists.linux-foundation.org; k...@vger.kernel.org; linux-
> > > ker...@vger.kernel.org; Wu, Hao 
> > > Subject: [PATCH v1 6/8] vfio/type1: Bind guest page tables to host
> > >
> > > From: Liu Yi L 
> > >
> > > VFIO_TYPE1_NESTING_IOMMU is an IOMMU type which is backed by
> > > hardware IOMMUs that have nesting DMA translation (a.k.a. dual-stage
> > > address translation). For such hardware IOMMUs, there are two
> > > stages/levels of address translation, and software may let userspace/the
> > > VM own the first-level/stage-1 translation structures. An example of such
> > > usage is vSVA (virtual Shared Virtual Addressing). The VM owns the
> > > first-level/stage-1 translation structures and binds the structures to
> > > the host; the hardware IOMMU then uses nesting translation when doing DMA
> > > translation for the devices behind it.
> > >
> > > This patch adds vfio support for binding guest translation (a.k.a.
> > > stage-1) structures to the host iommu. For VFIO_TYPE1_NESTING_IOMMU, not
> > > only is binding the guest page table needed, an interface must also be
> > > exposed to the guest for iommu cache invalidation when the guest modifies
> > > the first-level/stage-1 translation structures, since hardware needs to
> > > be notified to flush stale iotlbs. This is introduced in the next patch.
> > >
> > > In this patch, guest page table bind and unbind are done using the
> > > flags VFIO_IOMMU_BIND_GUEST_PGTBL and VFIO_IOMMU_UNBIND_GUEST_PGTBL under
> > > the VFIO_IOMMU_BIND ioctl; the bind/unbind data are conveyed by
> > > struct iommu_gpasid_bind_data. Before binding a guest page table to the
> > > host, the VM should have got a PASID allocated by the host via
> > > 

RE: (a design open) RE: [PATCH v1 6/8] vfio/type1: Bind guest page tables to host

2020-05-15 Thread Tian, Kevin
> From: Alex Williamson 
> Sent: Saturday, May 16, 2020 1:59 AM
> 
> On Fri, 15 May 2020 07:39:14 +
> "Tian, Kevin"  wrote:
> 
> > Hi, Alex,
> >
> > While working on an updated version, Yi and I found an open design
> > question which needs your guidance.
> >
> > Conceptually, nested translation can be realized as one GPA->HPA page
> > table and multiple GVA->GPA page tables per VM. This means one container
> > is sufficient to include all SVA-capable devices assigned to the same guest,
> > as there is just one IOVA space (GPA->HPA) to be managed by the vfio iommu
> > map/unmap API. GVA->GPA page tables are bound through the new API
> > introduced in this patch. This differs from legacy shadow translation,
> > which merges GIOVA->GPA->HPA into GIOVA->HPA, so each device requires
> > its own IOVA space and must be in a separate container.
> 
> I think you've redefined the container as specifically IOVA context,
> but I think a container is actually more of a shared IOMMU context
> between groups and devices within those groups.  Userspace is under no
> obligation to share a container between groups and the kernel is under
> no obligation to let groups share a container.

Oh, sorry, I didn't redefine anything here. I was just describing usage
examples of containers when a vIOMMU is used. Before vSVA there is just one
address space per IOMMU context, which is what the current vfio iommu API is
designed for. When extending it to support vSVA, there are two layers of
multiple address spaces per IOMMU context, and the sharing possibilities of
those address spaces differ. This open question is about whether we want to
allow partial sharing of IOMMU context within a container, i.e. sharing the
2nd level while allowing binding to different 1st levels.

> 
> > However, supporting multiple SVA-capable devices in one container
> > imposes a challenge. Currently bind_guest_pgtbl is implemented as an
> > iommu type1 API. By the definition of iommu type1, any operation
> > should be applied to all devices within the same container, just like
> > DMA map/unmap. However, this philosophy is wrong for page table
> > binding. We must follow the guest's binding between its page tables
> > and assigned devices; otherwise every bound address space suddenly
> > becomes accessible to all assigned devices, creating security holes.
> 
> Is that a security hole, or is that simply the nature of a container?
> Containers are meant to allow a user to share an IOMMU context between
> groups of devices.  Sharing that context necessarily implies that the
> user is granting the devices access to those address spaces.

It's the nature of the container. I was just trying to state that the current
API is insufficient if we want to allow partial sharing within a container.

> 
> > Do you think it's possible to extend the iommu API to accept
> > device-specific commands? If not, moving it to vfio-pci or vfio-mdev
> > also sounds problematic, as PASIDs and page tables are IOMMU things
> > which are not touched by vfio device drivers today.
> 
> Is this really that different from a group holding multiple devices,
> each of which has a unique requester ID as seen by the IOMMU?  The
> IOMMU can support separate contexts per device, but the topology
> restricts us that those contexts are not fully isolated.  So we've
> imposed the group restriction on ourselves to reflect that.  If a user
> wants to share an IOMMU context between devices, maybe that precludes
> the user from making use of nested translation.

Agree.

> 
> > Alternatively, can we impose the limitation that nesting APIs can
> > only be enabled for singleton containers which contain only one device?
> > This basically falls back to the state of the legacy shadow-translation
> > vIOMMU, and our current QEMU vIOMMU implementation actually
> > works this way regardless of whether nesting is used. Do you think
> > such a tradeoff is acceptable as a starting point?
> 
> I think it's an acceptable starting point.  It seems what you're
> describing is separating a monolithic notion of IOMMU context into
> multiple layers, so we can share them at different points, ex. share a
> GPA->HPA context among multiple different GVA->GPA contexts.  That
> seems to imply something like the sub/super-container idea that's been
> tossed around before, but never been fully defined or explored.  Thanks,

Yes, that'd be a future extension; we'll start with the singleton
limitation for now.
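
For reference, a minimal sketch of what that singleton restriction could look
like on the vfio type1 side. The helper below and the place it would be called
from (the nesting/bind paths) are assumptions for illustration only, not code
from this series; the struct and field names are taken from today's
vfio_iommu_type1.c, but treat the whole thing as a sketch:

static bool vfio_iommu_is_singleton(struct vfio_iommu *iommu)
{
	struct vfio_domain *domain;
	struct vfio_group *group;
	int groups = 0;

	/* count the iommu groups attached to this container */
	list_for_each_entry(domain, &iommu->domain_list, next)
		list_for_each_entry(group, &domain->group_list, next)
			groups++;

	/*
	 * A complete check would also verify that the single group holds a
	 * single device, e.g. by counting with iommu_group_for_each_dev().
	 */
	return groups == 1;
}

The nesting/bind ioctl paths would then return an error (e.g. -EINVAL) whenever
this check fails, so guest page table binding is only ever exposed on singleton
containers.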

Thanks
Kevin

> 
> > > -Original Message-
> > > From: Liu, Yi L 
> > > Sent: Sunday, March 22, 2020 8:32 PM
> > > To: alex.william...@redhat.com; eric.au...@redhat.com
> > > Cc: Tian, Kevin ; jacob.jun@linux.intel.com;
> > > j...@8bytes.org; Raj, Ashok ; Liu, Yi L
> > > ; Tian, Jun J ; Sun, Yi Y
> > > ; jean-phili...@linaro.org; pet...@redhat.com;
> > > iommu@lists.linux-foundation.org; k...@vger.kernel.org; linux-
> > > ker...@vger.kernel.org; Wu, Hao 
> > > Subject: [PATCH v1 6/8] vfio/type1: Bind guest page tables 

RE: Constantly map and unmap of streaming DMA buffers with IOMMU backend might cause serious performance problem

2020-05-15 Thread Song Bao Hua
> Subject: Re: Constantly map and unmap of streaming DMA buffers with
> IOMMU backend might cause serious performance problem
> 
> On 2020-05-15 09:19, Song Bao Hua wrote:
> [ snip... nice analysis, but ultimately it's still "doing stuff has more
> overhead than not doing stuff" ]
> 
> > I am thinking about several possible ways of decreasing or removing the
> > latency of DMA map/unmap for every single DMA transfer. Meanwhile, since
> > "non-strict" is an existing option with possible safety issues, I won't
> > discuss it in this mail.
> 
> But passthrough and non-strict mode *specifically exist* for the cases where
> performance is the most important concern - streaming DMA with an IOMMU
> in the middle has an unavoidable tradeoff between performance and isolation,
> so dismissing that out of hand is not a good way to start making this
> argument.

I do understand there is a tradeoff between performance and isolation.
However, users might ask for performance while still requiring isolation.
In passthrough mode, the whole of memory may be accessible by DMA. In
non-strict mode, a buffer could still be mapped in the IOMMU after its user
has returned it to the buddy allocator, and the buffer may even have been
reallocated to another user.

> 
> > 1. Provide bounce coherent buffers for streaming buffers.
> > As the coherent buffers stay mapped, we can remove the overhead of map and
> > unmap for each single DMA operation. However, this solution requires a
> > memory copy between streaming buffers and bounce buffers, so it will only
> > pay off if the copy is faster than map/unmap. Meanwhile, it will consume
> > much more memory bandwidth.
> 
> I'm struggling to understand how that would work, can you explain it in more
> detail?

Lower-layer drivers maintain some reusable coherent buffers.
For the TX path, the driver copies the streaming buffer into a coherent buffer and then does DMA;
for the RX path, the driver does DMA into a coherent buffer and then copies it into the streaming buffer.
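
A rough sketch of that idea, with invented driver names; hw_submit_tx() stands
in for the device-specific submission path and is not a real API. The coherent
buffer is allocated and mapped once, so per-request map/unmap is replaced by a
copy:

#include <linux/device.h>
#include <linux/dma-mapping.h>
#include <linux/string.h>

struct bounce_buf {
	void		*vaddr;	/* CPU address of the coherent buffer */
	dma_addr_t	dma;	/* device address, valid for the buffer's lifetime */
	size_t		size;
};

int hw_submit_tx(struct device *dev, dma_addr_t addr, size_t len); /* placeholder */

static int bounce_buf_init(struct device *dev, struct bounce_buf *bb, size_t size)
{
	bb->vaddr = dma_alloc_coherent(dev, size, &bb->dma, GFP_KERNEL);
	if (!bb->vaddr)
		return -ENOMEM;
	bb->size = size;
	return 0;
}

/* TX: copy the streaming buffer and hand the pre-mapped address to hardware */
static int bounce_tx(struct device *dev, struct bounce_buf *bb,
		     const void *src, size_t len)
{
	if (len > bb->size)
		return -EINVAL;
	memcpy(bb->vaddr, src, len);
	return hw_submit_tx(dev, bb->dma, len);
}

static void bounce_buf_free(struct device *dev, struct bounce_buf *bb)
{
	dma_free_coherent(dev, bb->size, bb->vaddr, bb->dma);
}

Whether this wins depends entirely on the copy cost versus the map/unmap cost,
as noted above.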

> 
> > 2. Make upper-layer kernel components aware of the pain of iommu
> > map/unmap. Upper-layer fs, mm and networking code could somehow let the
> > lower-layer drivers know the end of the life cycle of sg buffers. In the
> > zswap case, I have seen zswap always use the same two pages as the
> > destination buffers for saving compressed pages, but the compressor driver
> > still has to constantly map and unmap those same two pages for every single
> > compression, since the zswap and zip drivers work in two completely
> > different software layers.
> >
> > I am thinking of something like the below; upper-layer kernel code can call:
> > sg_init_table();
> > sg_mark_reusable();
> >  /* use the buffer many times */
> > 
> > sg_mark_stop_reuse();
> >
> > After that, if a low-level driver sees the "reusable" flag, it knows the
> > buffer can be used multiple times and will not do map/unmap every time. It
> > means upper-layer components will keep using the buffers, and the same
> > buffers will probably be given to lower-layer drivers for new DMA transfers
> > later. When upper-layer code sets "stop_reuse", the lower-layer driver
> > unmaps the sg buffers, possibly via an unmap callback provided to the
> > upper-layer components. For the zswap case, I have seen the same buffers
> > always being re-used while the zip driver maps and unmaps them again and
> > again. Shortly after a buffer is unmapped, it is mapped again for the next
> > transmission, almost without any time gap between unmap and map. If zswap
> > could set the "reusable" flag, the zip driver would save a lot of time.
> > Meanwhile, for the safety of the buffers, lower-layer drivers need to make
> > certain the buffers have already been unmapped in the iommu before those
> > buffers go back to the buddy allocator for other users.
> 
> That sounds like it would only have benefit in a very small set of specific
> circumstances, and would be very difficult to generalise to buffers that are
> mapped via dma_map_page() or dma_map_single().

Yes, indeed. Hopefully the small set of specific circumstances will encourage
more upper-layer consumers to reuse buffers; then the "reusable" flag can be
extended to more common cases, such as dma_map_page() and dma_map_single()
buffers.
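
To make the proposed flag concrete, here is a sketch of how a lower-layer
driver could honour it. sg_is_reusable() and the stop-reuse callback do not
exist today; they are stand-ins for the proposed sg_mark_reusable() /
sg_mark_stop_reuse() API, and a non-coherent device would additionally need
dma_sync_sg_*() calls around each transfer while the mapping is being reused:

#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

/* one cached long-lived mapping per driver context (hypothetical) */
struct sg_reuse_cache {
	struct scatterlist	*sgl;	/* table we already mapped */
	int			nents;	/* number of mapped entries */
	enum dma_data_direction	dir;
};

static int drv_map_request(struct device *dev, struct sg_reuse_cache *cache,
			   struct scatterlist *sgl, int nents,
			   enum dma_data_direction dir)
{
	/* sg_is_reusable() is the proposed, non-existent flag check */
	if (sg_is_reusable(sgl) && cache->sgl == sgl && cache->dir == dir)
		return cache->nents;	/* reuse the mapping, skip dma_map_sg() */

	nents = dma_map_sg(dev, sgl, nents, dir);
	if (nents && sg_is_reusable(sgl) && !cache->sgl) {
		cache->sgl = sgl;
		cache->nents = nents;
		cache->dir = dir;
	}
	return nents;
}

/* called back when the upper layer does the proposed sg_mark_stop_reuse() */
static void drv_stop_reuse(struct device *dev, struct sg_reuse_cache *cache)
{
	if (cache->sgl)
		dma_unmap_sg(dev, cache->sgl, cache->nents, cache->dir);
	cache->sgl = NULL;
}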

> Furthermore, a high-level API that affects a low-level driver's
> interpretation of mid-layer API calls without the mid-layer's knowledge
> sounds like a hideous abomination of anti-design. If a mid-layer API lends
> itself to inefficiency at the lower level, it would seem a lot cleaner and
> more robust to extend *that* API for stateful buffer reuse.

I absolutely agree. I didn't say the method is elegant. For the moment, maybe
"reuse" could get started with a small case like zswap. After a while, more
users might be encouraged to optimize for buffer reuse, understanding the
suffering of lower-layer drivers; then those performance problems could be
solved case by case.
On the other hand, it is always up to the upper-layer code whether to indicate
"reuse" or not. If it doesn't say anything about reuse, lower-layer drivers
can simply map and unmap as today.

> Failing that, it 

Re: Constantly map and unmap of streaming DMA buffers with IOMMU backend might cause serious performance problem

2020-05-15 Thread Robin Murphy

On 2020-05-15 22:33, Song Bao Hua wrote:

Subject: Re: Constantly map and unmap of streaming DMA buffers with
IOMMU backend might cause serious performance problem

On Fri, May 15, 2020 at 01:10:21PM +0100, Robin Murphy wrote:

Meanwhile, for the safety of buffers, lower-layer drivers need to make

certain the buffers have already been unmapped in iommu before those
buffers go back to buddy for other users.


That sounds like it would only have benefit in a very small set of specific
circumstances, and would be very difficult to generalise to buffers that
are mapped via dma_map_page() or dma_map_single(). Furthermore, a
high-level API that affects a low-level driver's interpretation of
mid-layer API calls without the mid-layer's knowledge sounds like a hideous
abomination of anti-design. If a mid-layer API lends itself to inefficiency
at the lower level, it would seem a lot cleaner and more robust to extend
*that* API for stateful buffer reuse. Failing that, it might possibly be
appropriate to approach this at the driver level - many of the cleverer
network drivers already implement buffer pools to recycle mapped SKBs
internally, couldn't the "zip driver" simply try doing something like that
for itself?


Exactly.  If your upper consumer of the DMA API keeps reusing the same
pages just map them once and use dma_sync_* to transfer ownership as
needed.


The problem is that the lower-layer drivers don't know if the upper consumer
keeps reusing the same pages. They are running in different software layers.
For example, the consumer is here in mm/zswap.c:
static int zswap_frontswap_store(unsigned type, pgoff_t offset,
struct page *page)
{
...
/* compress */
dst = get_cpu_var(zswap_dstmem);
...
ret = crypto_comp_compress(tfm, src, PAGE_SIZE, dst, &dlen);
...
}

But the lower-layer driver is in drivers/crypto/...

Meanwhile, the lower-layer driver can't cache the buffer pointers coming from
consumers to detect whether the upper layer is reusing the same page, because
the same page might come from different users, or from different stages of the
same user with different permissions.


Indeed the driver can't cache arbitrary pointers, but if typical buffers 
are small enough it can copy the data into its own already-mapped page, 
dma_sync it, and perform the DMA operation from there. That might even 
be more or less what your first suggestion was, but I'm still not quite 
sure.
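
As a sketch of that approach (names invented; hw_queue_compress() stands in
for the real submission path): the driver maps one page of its own once, and
each request then only costs a memcpy plus a dma_sync on the long-lived
mapping:

#include <linux/dma-mapping.h>
#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/string.h>

struct drv_staging {
	struct page	*page;
	void		*vaddr;
	dma_addr_t	dma;	/* mapped once, reused for every request */
};

int hw_queue_compress(struct device *dev, dma_addr_t src, size_t len); /* placeholder */

static int drv_staging_init(struct device *dev, struct drv_staging *st)
{
	st->page = alloc_page(GFP_KERNEL);
	if (!st->page)
		return -ENOMEM;
	st->vaddr = page_address(st->page);
	st->dma = dma_map_page(dev, st->page, 0, PAGE_SIZE, DMA_TO_DEVICE);
	if (dma_mapping_error(dev, st->dma)) {
		__free_page(st->page);
		return -ENOMEM;
	}
	return 0;
}

static int drv_submit(struct device *dev, struct drv_staging *st,
		      const void *src, size_t len)
{
	if (len > PAGE_SIZE)
		return -EINVAL;
	/* take the buffer back from the device, fill it, hand it back */
	dma_sync_single_for_cpu(dev, st->dma, len, DMA_TO_DEVICE);
	memcpy(st->vaddr, src, len);
	dma_sync_single_for_device(dev, st->dma, len, DMA_TO_DEVICE);
	return hw_queue_compress(dev, st->dma, len);
}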



For example, consumer A uses the buffer as a destination and then returns it
to the buddy allocator, but consumer B gets the same buffer and uses it as a
source.

Another possibility: consumer A uses the buffer, returns it to the buddy
allocator, and after some time allocates a buffer again and happens to get the
same buffer back.

For the safety of the buffer, the lower-layer driver must guarantee the buffer
is unmapped when it returns to the buddy allocator.

I think only the upper-layer consumer knows whether it is reusing the buffer.


Right, and if reusing buffers is common in crypto callers, then there's 
an argument for "set up reusable buffer", "process updated buffer" and 
"clean up buffer" operations to be added to the crypto API itself, such 
that the underlying drivers can then optimise for DMA usage in a robust 
and obvious way if they want to (or just implement the setup and 
teardown as no-ops and still do a full map/unmap in each "process" call 
if they don't).


Robin.


RE: Constantly map and unmap of streaming DMA buffers with IOMMU backend might cause serious performance problem

2020-05-15 Thread Song Bao Hua
> Subject: Re: Constantly map and unmap of streaming DMA buffers with
> IOMMU backend might cause serious performance problem
> 
> On Fri, May 15, 2020 at 01:10:21PM +0100, Robin Murphy wrote:
> > Meanwhile, for the safety of buffers, lower-layer drivers need to make
> > certain the buffers have already been unmapped in iommu before those
> > buffers go back to buddy for other users.
> >
> > That sounds like it would only have benefit in a very small set of specific
> > circumstances, and would be very difficult to generalise to buffers that
> > are mapped via dma_map_page() or dma_map_single(). Furthermore, a
> > high-level API that affects a low-level driver's interpretation of
> > mid-layer API calls without the mid-layer's knowledge sounds like a hideous
> > abomination of anti-design. If a mid-layer API lends itself to inefficiency
> > at the lower level, it would seem a lot cleaner and more robust to extend
> > *that* API for stateful buffer reuse. Failing that, it might possibly be
> > appropriate to approach this at the driver level - many of the cleverer
> > network drivers already implement buffer pools to recycle mapped SKBs
> > internally, couldn't the "zip driver" simply try doing something like that
> > for itself?
> 
> Exactly.  If your upper consumer of the DMA API keeps reusing the same
> pages just map them once and use dma_sync_* to transfer ownership as
> needed.

The problem is that the lower-layer drivers don't know if the upper consumer
keeps reusing the same pages. They are running in different software layers.
For example, the consumer is here in mm/zswap.c:
static int zswap_frontswap_store(unsigned type, pgoff_t offset,
struct page *page)
{
...
/* compress */
dst = get_cpu_var(zswap_dstmem);
...
ret = crypto_comp_compress(tfm, src, PAGE_SIZE, dst, &dlen);
...
}

But the lower-layer driver is in drivers/crypto/...

Meanwhile, the lower-layer driver can't cache the buffer pointers coming from
consumers to detect whether the upper layer is reusing the same page, because
the same page might come from different users, or from different stages of the
same user with different permissions.
 
For example, consumer A uses the buffer as a destination and then returns it
to the buddy allocator, but consumer B gets the same buffer and uses it as a
source.

Another possibility: consumer A uses the buffer, returns it to the buddy
allocator, and after some time allocates a buffer again and happens to get the
same buffer back.

For the safety of the buffer, the lower-layer driver must guarantee the buffer
is unmapped when it returns to the buddy allocator.

I think only the upper-layer consumer knows whether it is reusing the buffer.

Thanks
Barry




Re: [PATCH 1/4] PCI/ATS: Only enable ATS for trusted devices

2020-05-15 Thread Bjorn Helgaas
On Fri, May 15, 2020 at 12:43:59PM +0200, Jean-Philippe Brucker wrote:
> Add pci_ats_supported(), which checks whether a device has an ATS
> capability, and whether it is trusted.  A device is untrusted if it is
> plugged into an external-facing port such as Thunderbolt and could
> spoof an existing device to exploit weaknesses in the IOMMU
> configuration.  PCIe ATS is one such weakness, since it allows
> endpoints to cache IOMMU translations and emit transactions with
> 'Translated' Address Type (10b) that partially bypass the IOMMU
> translation.
> 
> The SMMUv3 and VT-d IOMMU drivers already disallow ATS and transactions
> with 'Translated' Address Type for untrusted devices.  Add the check to
> pci_enable_ats() to let other drivers (AMD IOMMU for now) benefit from
> it.
> 
> By checking ats_cap, the pci_ats_supported() helper also returns whether
> ATS was globally disabled with pci=noats, and could later include more
> things, for example whether the whole PCIe hierarchy down to the
> endpoint supports ATS.
> 
> Signed-off-by: Jean-Philippe Brucker 

Acked-by: Bjorn Helgaas 

> ---
>  include/linux/pci-ats.h |  3 +++
>  drivers/pci/ats.c   | 18 +++++++++++++++++-
>  2 files changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/pci-ats.h b/include/linux/pci-ats.h
> index d08f0869f1213e..f75c307f346de9 100644
> --- a/include/linux/pci-ats.h
> +++ b/include/linux/pci-ats.h
> @@ -6,11 +6,14 @@
>  
>  #ifdef CONFIG_PCI_ATS
>  /* Address Translation Service */
> +bool pci_ats_supported(struct pci_dev *dev);
>  int pci_enable_ats(struct pci_dev *dev, int ps);
>  void pci_disable_ats(struct pci_dev *dev);
>  int pci_ats_queue_depth(struct pci_dev *dev);
>  int pci_ats_page_aligned(struct pci_dev *dev);
>  #else /* CONFIG_PCI_ATS */
> +static inline bool pci_ats_supported(struct pci_dev *d)
> +{ return false; }
>  static inline int pci_enable_ats(struct pci_dev *d, int ps)
>  { return -ENODEV; }
>  static inline void pci_disable_ats(struct pci_dev *d) { }
> diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
> index 390e92f2d8d1fc..15fa0c37fd8e44 100644
> --- a/drivers/pci/ats.c
> +++ b/drivers/pci/ats.c
> @@ -30,6 +30,22 @@ void pci_ats_init(struct pci_dev *dev)
>   dev->ats_cap = pos;
>  }
>  
> +/**
> + * pci_ats_supported - check if the device can use ATS
> + * @dev: the PCI device
> + *
> + * Returns true if the device supports ATS and is allowed to use it, false
> + * otherwise.
> + */
> +bool pci_ats_supported(struct pci_dev *dev)
> +{
> + if (!dev->ats_cap)
> + return false;
> +
> + return !dev->untrusted;
> +}
> +EXPORT_SYMBOL_GPL(pci_ats_supported);
> +
>  /**
>   * pci_enable_ats - enable the ATS capability
>   * @dev: the PCI device
> @@ -42,7 +58,7 @@ int pci_enable_ats(struct pci_dev *dev, int ps)
>   u16 ctrl;
>   struct pci_dev *pdev;
>  
> - if (!dev->ats_cap)
> + if (!pci_ats_supported(dev))
>   return -EINVAL;
>  
>   if (WARN_ON(dev->ats_enabled))
> -- 
> 2.26.2
> 
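
For context, the caller-side effect is that IOMMU drivers no longer need to
open-code the untrusted/capability checks before enabling ATS. A hedged sketch
of what a driver's ATS setup can reduce to (the surrounding function is
invented for illustration):

#include <linux/pci.h>
#include <linux/pci-ats.h>

static void example_iommu_enable_ats(struct pci_dev *pdev)
{
	/*
	 * pci_ats_supported() covers "has an ATS capability", "not globally
	 * disabled with pci=noats" and "not an untrusted/external device",
	 * so no separate pdev->untrusted check is needed here.
	 */
	if (!pci_ats_supported(pdev))
		return;

	if (pci_enable_ats(pdev, PAGE_SHIFT))
		dev_info(&pdev->dev, "failed to enable ATS\n");
}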


Re: [PATCH] iommu: Implement deferred domain attachment

2020-05-15 Thread Robin Murphy

On 2020-05-15 19:26, Joerg Roedel wrote:

On Fri, May 15, 2020 at 05:28:53PM +0100, Robin Murphy wrote:

On 2020-05-15 17:14, Joerg Roedel wrote:

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index ba128d1cdaee..403fda04ea98 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -362,8 +362,8 @@ static int iommu_dma_deferred_attach(struct device *dev,
return 0;
if (unlikely(ops->is_attach_deferred &&
-   ops->is_attach_deferred(domain, dev)))
-   return iommu_attach_device(domain, dev);
+ops->is_attach_deferred(domain, dev)))
+   return iommu_attach_device_no_defer(domain, dev);


Wouldn't it be simpler to just invoke ops->attach_dev directly and avoid
having to formalise a public interface that nobody else should ever use
anyway?


That would omit the ops->attach_dev != NULL check and the trace-point on
device attach. Besides that, it would be a layering violation. But the
function is of course entirely internal to the iommu subsystem and is a
good candidate to be moved to a header file in drivers/iommu.


Sure, checking the pointer before calling was implied, but the 
tracepoint is a good argument, I'd forgotten about that :)



@@ -746,8 +747,11 @@ int iommu_group_add_device(struct iommu_group *group,
struct device *dev)

 mutex_lock(&group->mutex);
 list_add_tail(&device->list, &group->devices);
-   if (group->domain)
-   ret = __iommu_attach_device(group->domain, dev);
+   domain = group->domain;
+   if (domain && (!domain->ops->is_attach_deferred ||
+  !domain->ops->is_attach_deferred(domain, dev)))
+   ret = __iommu_attach_device(domain, dev);
+   }
 mutex_unlock(&group->mutex);
 if (ret)
 goto err_put_group;


No, doing this in iommu_group_add_device() doesn't solve the problem.
The attach must not happen before a device driver took control of the
device and silenced any DMA initiated by the old kernel. At probe time
this isn't guaranteed.


But that's not what this is; this is (supposed to be) the exact same 
"don't actually perform the attach yet" logic as before, just 
restricting it to default domains in the one place that it actually 
needs to be, so as not to fundamentally bugger up iommu_attach_device() 
in a way that prevents it from working as expected at the correct point 
later.


Thinking a bit more, consider if the driver resets the device then 
attaches it straight to its own unmanaged domain rather than calling any 
DMA ops (e.g. VFIO?) - it looks like that would also be totally broken 
right now, and no amount of bodges in iommu-dma is going to help there.


Robin.


Re: [PATCH 2/2] dt-bindings: arm-smmu: Add sc7180 compatible string

2020-05-15 Thread Doug Anderson
Hi,

On Fri, May 1, 2020 at 3:30 AM Sharat Masetty  wrote:
>
> This patch simply adds a new compatible string for SC7180 platform.
>
> Signed-off-by: Sharat Masetty 
> ---
>  Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml 
> b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> index 6515dbe..986098b 100644
> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> @@ -28,6 +28,7 @@ properties:
>- enum:
>- qcom,msm8996-smmu-v2
>- qcom,msm8998-smmu-v2
> +  - qcom,sc7180-smmu-v2
>- qcom,sdm845-smmu-v2
>- const: qcom,smmu-v2

Is anything blocking this patch from landing now?

-Doug


RE: [PATCH] iommu: Remove functions that support private domain

2020-05-15 Thread Prakhya, Sai Praneeth
Hi Joerg,

> -Original Message-
> From: Joerg Roedel 
> Sent: Friday, May 15, 2020 2:59 AM
> To: Prakhya, Sai Praneeth 
> Cc: iommu@lists.linux-foundation.org; Lu Baolu 
> Subject: Re: [PATCH] iommu: Remove functions that support private domain
> 
> On Thu, May 14, 2020 at 11:12:52PM +, Prakhya, Sai Praneeth wrote:
> > +static int is_driver_bound(struct device *dev, void *not_used) {
> > +   int ret = 0;
> > +
> > +   device_lock(dev);
> > +   if (device_is_bound(dev))
> > +   ret = 1;
> > +   device_unlock(dev);
> > +   return ret;
> > +}
> 
> This locks only one device, so without lock-conversion there could be a driver
> probe after the device_unlock(), while we are probing the other devices of the
> group.
> 
> > [SNIP]
> >
> > +   /*
> > +* Check if any device in the group still has a driver binded to it.
> > +* This might race with device driver probing code and unfortunately
> > +* there is no clean way out of that either, locking all devices in the
> > +* group and then do the re-attach will introduce a lock-inversion with
> > +* group->mutex - Joerg.
> > +*/
> > +   if (iommu_group_for_each_dev(group, NULL, is_driver_bound)) {
> > +   pr_err("Active drivers exist for devices in the group\n");
> > +   return -EBUSY;
> > +   }
> 
> The lock inversion comes into the picture when this code is called from
> device(-driver) core through the bus-notifiers. The notifiers are called
> with the device already locked.

Makes sense. I will look through that code.

> > Another question I have is: if it's racy, then it should be racy even
> > for single-device iommu groups, right? Why would it be racy only with
> > multi-device iommu groups?
> 
> Valid point. So the device needs to be locked _while_ the default domain
> change happens. If triggered by sysfs there should be no locking problems,
> I guess. But you'd better try it out.

I will try this out and will update you.

Regards,
Sai
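
A rough sketch of the sysfs-triggered flow being discussed, assuming a
single-device group; iommu_change_default_domain() is a placeholder name for
whatever helper ends up performing the actual switch:

static int try_change_default_domain(struct iommu_group *group,
				     struct device *dev, unsigned int type)
{
	int ret = -EBUSY;

	/*
	 * Hold the device lock across the change so a driver cannot bind
	 * concurrently. This is only safe from the sysfs path; the
	 * bus-notifier path already holds the device lock, as discussed above.
	 */
	device_lock(dev);
	if (!device_is_bound(dev))
		ret = iommu_change_default_domain(group, type); /* placeholder */
	device_unlock(dev);

	return ret;
}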


Re: [PATCH] iommu: Implement deferred domain attachment

2020-05-15 Thread Joerg Roedel
On Fri, May 15, 2020 at 05:28:53PM +0100, Robin Murphy wrote:
> On 2020-05-15 17:14, Joerg Roedel wrote:
> > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> > index ba128d1cdaee..403fda04ea98 100644
> > --- a/drivers/iommu/dma-iommu.c
> > +++ b/drivers/iommu/dma-iommu.c
> > @@ -362,8 +362,8 @@ static int iommu_dma_deferred_attach(struct device *dev,
> > return 0;
> > if (unlikely(ops->is_attach_deferred &&
> > -   ops->is_attach_deferred(domain, dev)))
> > -   return iommu_attach_device(domain, dev);
> > +ops->is_attach_deferred(domain, dev)))
> > +   return iommu_attach_device_no_defer(domain, dev);
> 
> Wouldn't it be simpler to just invoke ops->attach_dev directly and avoid
> having to formalise a public interface that nobody else should ever use
> anyway?

That would omit the ops->attach_dev != NULL check and the trace-point on
device attach. Besides that, it would be a layering violation. But the
function is of course entirely internal to the iommu subsystem and is a
good candidate to be moved to a header file in drivers/iommu.

> @@ -746,8 +747,11 @@ int iommu_group_add_device(struct iommu_group *group,
> struct device *dev)
> 
> mutex_lock(&group->mutex);
> list_add_tail(&device->list, &group->devices);
> -   if (group->domain)
> -   ret = __iommu_attach_device(group->domain, dev);
> +   domain = group->domain;
> +   if (domain && (!domain->ops->is_attach_deferred ||
> +  !domain->ops->is_attach_deferred(domain, dev)))
> +   ret = __iommu_attach_device(domain, dev);
> +   }
> mutex_unlock(&group->mutex);
> if (ret)
> goto err_put_group;

No, doing this in iommu_group_add_device() doesn't solve the problem.
The attach must not happen before a device driver took control of the
device and silenced any DMA initiated by the old kernel. At probe time
this isn't guaranteed.


Joerg



Re: (a design open) RE: [PATCH v1 6/8] vfio/type1: Bind guest page tables to host

2020-05-15 Thread Alex Williamson
On Fri, 15 May 2020 07:39:14 +
"Tian, Kevin"  wrote:

> Hi, Alex,
> 
> While working on an updated version, Yi and I found an open design
> question which needs your guidance.
> 
> Conceptually, nested translation can be realized as one GPA->HPA page
> table and multiple GVA->GPA page tables per VM. This means one container
> is sufficient to include all SVA-capable devices assigned to the same guest,
> as there is just one IOVA space (GPA->HPA) to be managed by the vfio iommu
> map/unmap API. GVA->GPA page tables are bound through the new API
> introduced in this patch. This differs from legacy shadow translation,
> which merges GIOVA->GPA->HPA into GIOVA->HPA, so each device requires
> its own IOVA space and must be in a separate container.

I think you've redefined the container as specifically IOVA context,
but I think a container is actually more of a shared IOMMU context
between groups and devices within those groups.  Userspace is under no
obligation to share a container between groups and the kernel is under
no obligation to let groups share a container.

> However, supporting multiple SVA-capable devices in one container
> imposes a challenge. Currently bind_guest_pgtbl is implemented as an
> iommu type1 API. By the definition of iommu type1, any operation
> should be applied to all devices within the same container, just like
> DMA map/unmap. However, this philosophy is wrong for page table
> binding. We must follow the guest's binding between its page tables
> and assigned devices; otherwise every bound address space suddenly
> becomes accessible to all assigned devices, creating security holes.

Is that a security hole, or is that simply the nature of a container?
Containers are meant to allow a user to share an IOMMU context between
groups of devices.  Sharing that context necessarily implies that the
user is granting the devices access to those address spaces.

> Do you think it's possible to extend the iommu API to accept
> device-specific commands? If not, moving it to vfio-pci or vfio-mdev
> also sounds problematic, as PASIDs and page tables are IOMMU things
> which are not touched by vfio device drivers today.

Is this really that different from a group holding multiple devices,
each of which has a unique requester ID as seen by the IOMMU?  The
IOMMU can support separate contexts per device, but the topology
restricts us that those contexts are not fully isolated.  So we've
imposed the group restriction on ourselves to reflect that.  If a user
wants to share an IOMMU context between devices, maybe that precludes
the user from making use of nested translation.
 
> Alternatively, can we impose the limitation that nesting APIs can
> only be enabled for singleton containers which contain only one device?
> This basically falls back to the state of the legacy shadow-translation
> vIOMMU, and our current QEMU vIOMMU implementation actually
> works this way regardless of whether nesting is used. Do you think
> such a tradeoff is acceptable as a starting point?

I think it's an acceptable starting point.  It seems what you're
describing is separating a monolithic notion of IOMMU context into
multiple layers, so we can share them at different points, ex. share a
GPA->HPA context among multiple different GVA->GPA contexts.  That
seems to imply something like the sub/super-container idea that's been
tossed around before, but never been fully defined or explored.  Thanks,

Alex

> > -Original Message-
> > From: Liu, Yi L 
> > Sent: Sunday, March 22, 2020 8:32 PM
> > To: alex.william...@redhat.com; eric.au...@redhat.com
> > Cc: Tian, Kevin ; jacob.jun@linux.intel.com;
> > j...@8bytes.org; Raj, Ashok ; Liu, Yi L
> > ; Tian, Jun J ; Sun, Yi Y
> > ; jean-phili...@linaro.org; pet...@redhat.com;
> > iommu@lists.linux-foundation.org; k...@vger.kernel.org; linux-
> > ker...@vger.kernel.org; Wu, Hao 
> > Subject: [PATCH v1 6/8] vfio/type1: Bind guest page tables to host
> > 
> > From: Liu Yi L 
> > 
> > VFIO_TYPE1_NESTING_IOMMU is an IOMMU type which is backed by
> > hardware IOMMUs that have nesting DMA translation (a.k.a. dual-stage
> > address translation). For such hardware IOMMUs, there are two
> > stages/levels of address translation, and software may let userspace/the
> > VM own the first-level/stage-1 translation structures. An example of such
> > usage is vSVA (virtual Shared Virtual Addressing). The VM owns the
> > first-level/stage-1 translation structures and binds the structures to
> > the host; the hardware IOMMU then uses nesting translation when doing DMA
> > translation for the devices behind it.
> > 
> > This patch adds vfio support for binding guest translation (a.k.a.
> > stage-1) structures to the host iommu. For VFIO_TYPE1_NESTING_IOMMU, not
> > only is binding the guest page table needed, an interface must also be
> > exposed to the guest for iommu cache invalidation when the guest modifies
> > the first-level/stage-1 translation structures, since 

Re: [PATCH 03/14] docs: fix references for DMA*.txt files

2020-05-15 Thread Jonathan Corbet
On Fri,  1 May 2020 17:37:47 +0200
Mauro Carvalho Chehab  wrote:

> As we moved those files to core-api, fix references to point
> to their newer locations.
> 
> Signed-off-by: Mauro Carvalho Chehab 
> ---
>  Documentation/PCI/pci.rst  |  6 +++---
>  Documentation/block/biodoc.rst |  2 +-
>  Documentation/core-api/bus-virt-phys-mapping.rst   |  2 +-
>  Documentation/core-api/dma-api.rst |  6 +++---
>  Documentation/core-api/dma-isa-lpc.rst |  2 +-
>  Documentation/driver-api/usb/dma.rst   |  6 +++---
>  Documentation/memory-barriers.txt  |  6 +++---
>  .../translations/ko_KR/memory-barriers.txt |  6 +++---
>  arch/ia64/hp/common/sba_iommu.c| 12 ++--
>  arch/parisc/kernel/pci-dma.c   |  2 +-
>  arch/x86/include/asm/dma-mapping.h |  4 ++--
>  arch/x86/kernel/amd_gart_64.c  |  2 +-
>  drivers/parisc/sba_iommu.c | 14 +++---
>  include/linux/dma-mapping.h|  2 +-
>  include/media/videobuf-dma-sg.h|  2 +-
>  kernel/dma/debug.c |  2 +-
>  16 files changed, 38 insertions(+), 38 deletions(-)

...of course, not applying patch 2 causes this one to not apply, so
holding off for now...

jon


Re: [PATCH 0/4] PCI, iommu: Factor 'untrusted' check for ATS enablement

2020-05-15 Thread Will Deacon
Hi,

On Fri, May 15, 2020 at 10:19:49AM -0700, Raj, Ashok wrote:
> On Fri, May 15, 2020 at 08:43:51AM -0700, Christoph Hellwig wrote:
> > Can you please lift the untrusted flag into struct device?  It really
> > isn't a PCI specific concept, and we should not have code poking into
> > pci_dev all over the iommu code.
> 
> Just for clarification: all IOMMUs today mostly pertain only to PCI devices,
> and devices that aren't PCI, like HPET for instance, are given a PCI
> identifier. Facilities like ATS, for instance, require the platform to have
> an IOMMU.
> 
> what additional problems does moving this to struct device solve?

ATS is PCI specific, but IOMMUs certainly aren't! The vast majority of
IOMMUs deployed in arm/arm64 SoCs are /not/ using any sort of PCI.

Will


Re: [PATCH 0/4] PCI, iommu: Factor 'untrusted' check for ATS enablement

2020-05-15 Thread Raj, Ashok
Hi Christoph

On Fri, May 15, 2020 at 08:43:51AM -0700, Christoph Hellwig wrote:
> Can you please lift the untrusted flag into struct device?  It really
> isn't a PCI specific concept, and we should not have code poking into
> pci_dev all over the iommu code.

Just for clarification: all IOMMUs today mostly pertain only to PCI devices,
and devices that aren't PCI, like HPET for instance, are given a PCI
identifier. Facilities like ATS, for instance, require the platform to have
an IOMMU.

what additional problems does moving this to struct device solve?

Cheers,
Ashok



Re: [PATCH] iommu: Implement deferred domain attachment

2020-05-15 Thread Robin Murphy

On 2020-05-15 17:14, Joerg Roedel wrote:

On Fri, May 15, 2020 at 04:42:23PM +0100, Robin Murphy wrote:

   struct iommu_domain *iommu_get_dma_domain(struct device *dev)
   {
-   return dev->iommu_group->default_domain;
+   struct iommu_domain *domain = dev->iommu_group->default_domain;
+
+   if (__iommu_is_attach_deferred(domain, dev))
+   __iommu_attach_device_no_defer(domain, dev);


This raises a red flag, since iommu-dma already has explicit deferred attach
handling where it should need it, immediately after this is called to
retrieve the domain. The whole thing smells to me like we should have an
explicit special-case in iommu_probe_device() rather than hooking
__iommu_attach_device() in general then having to bodge around the fallout
elsewhere.


Good point, I missed that. But it didn't work for its only user, the
AMD IOMMU driver; the reason is that it calls iommu_attach_device(),
which in its code path checks for deferred attaching again and bails
out without doing the real attachment.

But below updated fix should work. Jerry, could you please test it
again?

 From 4e262dedcd36c7572312c65e66416da74fc78047 Mon Sep 17 00:00:00 2001
From: Joerg Roedel 
Date: Fri, 15 May 2020 11:25:03 +0200
Subject: [PATCH] iommu: Fix deferred domain attachment

The IOMMU core code has support for deferring the attachment of a domain
to a device. This is needed in kdump kernels where the new domain must
not be attached to a device before the device driver takes it over.

When the AMD IOMMU driver got converted to use the dma-iommu
implementation, the deferred attaching got lost. The code in
dma-iommu.c has support for deferred attaching, but it calls into
iommu_attach_device() to actually do it. But iommu_attach_device()
will check if the device should be deferred in its code path and do
nothing, breaking deferred attachment.

Provide a function in IOMMU core code to reliably attach a device to a
domain without any deferred checks and also without other safe-guards.

Cc: Jerry Snitselaar 
Cc: Tom Murphy 
Reported-by: Jerry Snitselaar 
Fixes: 7959b6f8 ("iommu/dma-iommu: Handle deferred devices")
Signed-off-by: Joerg Roedel 
---
  drivers/iommu/dma-iommu.c |  4 ++--
  drivers/iommu/iommu.c | 37 -
  include/linux/iommu.h |  2 ++
  3 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index ba128d1cdaee..403fda04ea98 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -362,8 +362,8 @@ static int iommu_dma_deferred_attach(struct device *dev,
return 0;
  
  	if (unlikely(ops->is_attach_deferred &&

-   ops->is_attach_deferred(domain, dev)))
-   return iommu_attach_device(domain, dev);
+ops->is_attach_deferred(domain, dev)))
+   return iommu_attach_device_no_defer(domain, dev);


Wouldn't it be simpler to just invoke ops->attach_dev directly and avoid 
having to formalise a public interface that nobody else should ever use 
anyway?


That said, unless I've entirely misunderstood the situation I still 
think that something like the below makes more sense (apologies for 
broken whitespace).


Robin.

->8-
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 2b471419e26c..1a52e530774c 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -704,6 +704,7 @@ int iommu_group_add_device(struct iommu_group 
*group, struct device *dev)

 {
int ret, i = 0;
struct group_device *device;
+   struct iommu_domain *domain;

device = kzalloc(sizeof(*device), GFP_KERNEL);
if (!device)
@@ -746,8 +747,11 @@ int iommu_group_add_device(struct iommu_group 
*group, struct device *dev)


 mutex_lock(&group->mutex);
 list_add_tail(&device->list, &group->devices);
-   if (group->domain)
-   ret = __iommu_attach_device(group->domain, dev);
+   domain = group->domain;
+   if (domain && (!domain->ops->is_attach_deferred ||
+  !domain->ops->is_attach_deferred(domain, dev)))
+   ret = __iommu_attach_device(domain, dev);
+   }
 mutex_unlock(&group->mutex);
if (ret)
goto err_put_group;
@@ -1652,9 +1656,6 @@ static int __iommu_attach_device(struct 
iommu_domain *domain,

 struct device *dev)
 {
int ret;
-   if ((domain->ops->is_attach_deferred != NULL) &&
-   domain->ops->is_attach_deferred(domain, dev))
-   return 0;

if (unlikely(domain->ops->attach_dev == NULL))
return -ENODEV;


Re: [PATCH] iommu: Implement deferred domain attachment

2020-05-15 Thread Joerg Roedel
On Fri, May 15, 2020 at 04:42:23PM +0100, Robin Murphy wrote:
> >   struct iommu_domain *iommu_get_dma_domain(struct device *dev)
> >   {
> > -   return dev->iommu_group->default_domain;
> > +   struct iommu_domain *domain = dev->iommu_group->default_domain;
> > +
> > +   if (__iommu_is_attach_deferred(domain, dev))
> > +   __iommu_attach_device_no_defer(domain, dev);
> 
> This raises a red flag, since iommu-dma already has explicit deferred attach
> handling where it should need it, immediately after this is called to
> retrieve the domain. The whole thing smells to me like we should have an
> explicit special-case in iommu_probe_device() rather than hooking
> __iommu_attach_device() in general then having to bodge around the fallout
> elsewhere.

Good point, I missed that. But it didn't work for its only user, the
AMD IOMMU driver; the reason is that it calls iommu_attach_device(),
which in its code path checks for deferred attaching again and bails
out without doing the real attachment.

But below updated fix should work. Jerry, could you please test it
again?

From 4e262dedcd36c7572312c65e66416da74fc78047 Mon Sep 17 00:00:00 2001
From: Joerg Roedel 
Date: Fri, 15 May 2020 11:25:03 +0200
Subject: [PATCH] iommu: Fix deferred domain attachment

The IOMMU core code has support for deferring the attachment of a domain
to a device. This is needed in kdump kernels where the new domain must
not be attached to a device before the device driver takes it over.

When the AMD IOMMU driver got converted to use the dma-iommu
implementation, the deferred attaching got lost. The code in
dma-iommu.c has support for deferred attaching, but it calls into
iommu_attach_device() to actually do it. But iommu_attach_device()
will check if the device should be deferred in its code path and do
nothing, breaking deferred attachment.

Provide a function in IOMMU core code to reliably attach a device to a
domain without any deferred checks and also without other safe-guards.

Cc: Jerry Snitselaar 
Cc: Tom Murphy 
Reported-by: Jerry Snitselaar 
Fixes: 7959b6f8 ("iommu/dma-iommu: Handle deferred devices")
Signed-off-by: Joerg Roedel 
---
 drivers/iommu/dma-iommu.c |  4 ++--
 drivers/iommu/iommu.c | 37 -
 include/linux/iommu.h |  2 ++
 3 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index ba128d1cdaee..403fda04ea98 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -362,8 +362,8 @@ static int iommu_dma_deferred_attach(struct device *dev,
return 0;
 
if (unlikely(ops->is_attach_deferred &&
-   ops->is_attach_deferred(domain, dev)))
-   return iommu_attach_device(domain, dev);
+ops->is_attach_deferred(domain, dev)))
+   return iommu_attach_device_no_defer(domain, dev);
 
return 0;
 }
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 4050569188be..91dbdbc6d640 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 static struct kset *iommu_group_kset;
@@ -1889,13 +1890,19 @@ void iommu_domain_free(struct iommu_domain *domain)
 }
 EXPORT_SYMBOL_GPL(iommu_domain_free);
 
-static int __iommu_attach_device(struct iommu_domain *domain,
-struct device *dev)
+static bool __iommu_is_attach_deferred(struct iommu_domain *domain,
+  struct device *dev)
+{
+   if (!domain->ops->is_attach_deferred)
+   return false;
+
+   return domain->ops->is_attach_deferred(domain, dev);
+}
+
+static int __iommu_attach_device_no_defer(struct iommu_domain *domain,
+ struct device *dev)
 {
int ret;
-   if ((domain->ops->is_attach_deferred != NULL) &&
-   domain->ops->is_attach_deferred(domain, dev))
-   return 0;
 
if (unlikely(domain->ops->attach_dev == NULL))
return -ENODEV;
@@ -1903,9 +1910,29 @@ static int __iommu_attach_device(struct iommu_domain 
*domain,
ret = domain->ops->attach_dev(domain, dev);
if (!ret)
trace_attach_device_to_domain(dev);
+
return ret;
 }
 
+static int __iommu_attach_device(struct iommu_domain *domain,
+struct device *dev)
+{
+   if (__iommu_is_attach_deferred(domain, dev))
+   return 0;
+
+   return __iommu_attach_device_no_defer(domain, dev);
+}
+
+int iommu_attach_device_no_defer(struct iommu_domain *domain,
+struct device *dev)
+{
+   /* Safe-Guard to only call this when needed */
+   if (!is_kdump_kernel())
+   return -ENODEV;
+
+   return __iommu_attach_device_no_defer(domain, dev);
+}
+
 int iommu_attach_device(struct iommu_domain *domain, struct device *dev)
 {
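
For background, the driver-side half of this mechanism is the
is_attach_deferred() callback. A simplified sketch of what such a callback
typically looks like (the per-device data type, lookup helper and field name
are invented here; the real AMD and VT-d implementations differ in detail):

struct example_dev_data {
	bool defer_attach;	/* set at probe time when running in a kdump kernel */
};

struct example_dev_data *example_get_dev_data(struct device *dev); /* driver-private lookup */

/*
 * Report whether attaching @dev to @domain should be postponed, so the old
 * kernel's ongoing DMA keeps working until the new device driver has taken
 * the device over.
 */
static bool example_is_attach_deferred(struct iommu_domain *domain,
				       struct device *dev)
{
	struct example_dev_data *dev_data = example_get_dev_data(dev);

	return dev_data && dev_data->defer_attach;
}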

Re: [PATCH] iommu: Remove functions that support private domain

2020-05-15 Thread Joerg Roedel
On Fri, May 15, 2020 at 08:55:42PM +0800, Lu Baolu wrote:
> It seems that we can do like this:
> 
> [1] mutex_lock(&group->lock)
> [2] for_each_group_device(device_lock())
> [3] if (for_each_group_device(!device_is_bound()))
>   change_default_domain()
> [4] for_each_group_device_reverse(device_unlock())
> [5] mutex_unlock(&group->lock)

The problem here is that I am pretty sure we also have:

device_lock() /* from device/driver core code */
-> bus_notifier()
  -> iommu_bus_notifier()
-> ...
  -> mutex_lock(&group->lock)

Which would cause lock-inversion with the above code.
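
Spelled out, the two orderings that collide look roughly like this (simplified
illustration only; the function bodies just stand in for the paths described
above, using the iommu_group's mutex):

/* Path A: proposed default-domain change */
static void path_a(struct iommu_group *group, struct device *dev)
{
	mutex_lock(&group->mutex);	/* taken first */
	device_lock(dev);		/* taken second */
	/* ... check bindings / change the default domain ... */
	device_unlock(dev);
	mutex_unlock(&group->mutex);
}

/* Path B: driver core -> bus notifier -> IOMMU core */
static void path_b(struct iommu_group *group, struct device *dev)
{
	/* device_lock(dev) is already held by the driver core here */
	mutex_lock(&group->mutex);	/* opposite order => possible deadlock */
	/* ... iommu_bus_notifier() work ... */
	mutex_unlock(&group->mutex);
}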


Joerg


Re: [PATCH 0/4] PCI, iommu: Factor 'untrusted' check for ATS enablement

2020-05-15 Thread Christoph Hellwig
Can you please lift the untrusted flag into struct device?  It really
isn't a PCI specific concept, and we should not have code poking into
pci_dev all over the iommu code.


Re: [PATCH] iommu: Implement deferred domain attachment

2020-05-15 Thread Robin Murphy

On 2020-05-15 10:45, Joerg Roedel wrote:

From: Joerg Roedel 

The IOMMU core code has support for deferring the attachment of a domain
to a device. This is needed in kdump kernels where the new domain must
not be attached to a device before the device driver takes it over.

But this needs support from the dma-ops code too, to actually do the
late attachment when there are DMA-API calls for the device. This got
lost in the AMD IOMMU driver after converting it to the dma-iommu code.

Do the late attachment in the dma-iommu code-path to fix the issue.

Cc: Jerry Snitselaar 
Cc: Tom Murphy 
Reported-by: Jerry Snitselaar 
Tested-by: Jerry Snitselaar 
Fixes: be62dbf554c5 ("iommu/amd: Convert AMD iommu driver to the dma-iommu api")
Signed-off-by: Joerg Roedel 
---
  drivers/iommu/iommu.c | 33 +++--
  1 file changed, 27 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 4050569188be..f54ebb964271 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1889,13 +1889,19 @@ void iommu_domain_free(struct iommu_domain *domain)
  }
  EXPORT_SYMBOL_GPL(iommu_domain_free);
  
-static int __iommu_attach_device(struct iommu_domain *domain,

-struct device *dev)
+static bool __iommu_is_attach_deferred(struct iommu_domain *domain,
+  struct device *dev)
+{
+   if (!domain->ops->is_attach_deferred)
+   return false;
+
+   return domain->ops->is_attach_deferred(domain, dev);
+}
+
+static int __iommu_attach_device_no_defer(struct iommu_domain *domain,
+ struct device *dev)
  {
int ret;
-   if ((domain->ops->is_attach_deferred != NULL) &&
-   domain->ops->is_attach_deferred(domain, dev))
-   return 0;
  
  	if (unlikely(domain->ops->attach_dev == NULL))

return -ENODEV;
@@ -1903,9 +1909,19 @@ static int __iommu_attach_device(struct iommu_domain 
*domain,
ret = domain->ops->attach_dev(domain, dev);
if (!ret)
trace_attach_device_to_domain(dev);
+
return ret;
  }
  
+static int __iommu_attach_device(struct iommu_domain *domain,

+struct device *dev)
+{
+   if (__iommu_is_attach_deferred(domain, dev))
+   return 0;
+
+   return __iommu_attach_device_no_defer(domain, dev);
+}
+
  int iommu_attach_device(struct iommu_domain *domain, struct device *dev)
  {
struct iommu_group *group;
@@ -2023,7 +2039,12 @@ EXPORT_SYMBOL_GPL(iommu_get_domain_for_dev);
   */
  struct iommu_domain *iommu_get_dma_domain(struct device *dev)
  {
-   return dev->iommu_group->default_domain;
+   struct iommu_domain *domain = dev->iommu_group->default_domain;
+
+   if (__iommu_is_attach_deferred(domain, dev))
+   __iommu_attach_device_no_defer(domain, dev);


This raises a red flag, since iommu-dma already has explicit deferred 
attach handling where it should need it, immediately after this is 
called to retrieve the domain. The whole thing smells to me like we 
should have an explicit special-case in iommu_probe_device() rather than 
hooking __iommu_attach_device() in general then having to bodge around 
the fallout elsewhere.


Robin.


+
+   return domain;
  }
  
  /*





Re: (a design open) RE: [PATCH v1 6/8] vfio/type1: Bind guest page tables to host

2020-05-15 Thread Raj, Ashok
On Fri, May 15, 2020 at 12:39:14AM -0700, Tian, Kevin wrote:
> Hi, Alex,
> 
> While working on an updated version, Yi and I found an open design
> question which needs your guidance.
> 
> Conceptually, nested translation can be realized as one GPA->HPA page
> table and multiple GVA->GPA page tables per VM. This means one container
> is sufficient to include all SVA-capable devices assigned to the same guest,
> as there is just one IOVA space (GPA->HPA) to be managed by the vfio iommu
> map/unmap API. GVA->GPA page tables are bound through the new API
> introduced in this patch. This differs from legacy shadow translation,
> which merges GIOVA->GPA->HPA into GIOVA->HPA, so each device requires
> its own IOVA space and must be in a separate container.
> 
> However, supporting multiple SVA-capable devices in one container
> imposes a challenge. Currently bind_guest_pgtbl is implemented as an
> iommu type1 API. By the definition of iommu type1, any operation
> should be applied to all devices within the same container, just like
> DMA map/unmap. However, this philosophy is wrong for page table
> binding. We must follow the guest's binding between its page tables
> and assigned devices; otherwise every bound address space suddenly
> becomes accessible to all assigned devices, creating security holes.

The above 2 paragraphs are a bit confusing :-) but what really matters
is the below:
> 
> Do you think it's possible to extend the iommu API to accept
> device-specific commands? If not, moving it to vfio-pci or vfio-mdev
> also sounds problematic, as PASIDs and page tables are IOMMU things
> which are not touched by vfio device drivers today.

All you are referring to is that when the admin groups multiple devices in a
single container, you can't give them isolation for SVA purposes. This is
logically equivalent to a single group with multiple devices:

- such as devices behind a P2P bridge
- MFDs, or devices behind switches or hierarchies without ACS
  support, for instance.

By limitation you mean disabling PASID on those devices in a single
container?

what about ATS? 

> 
> Alternatively, can we impose the limitation that nesting APIs can
> only be enabled for singleton containers which contain only one device?
> This basically falls back to the state of the legacy shadow-translation
> vIOMMU, and our current QEMU vIOMMU implementation actually
> works this way regardless of whether nesting is used. Do you think
> such a tradeoff is acceptable as a starting point?
> 
> Thanks
> Kevin
> 
> > -Original Message-
> > From: Liu, Yi L 
> > Sent: Sunday, March 22, 2020 8:32 PM
> > To: alex.william...@redhat.com; eric.au...@redhat.com
> > Cc: Tian, Kevin ; jacob.jun@linux.intel.com;
> > j...@8bytes.org; Raj, Ashok ; Liu, Yi L
> > ; Tian, Jun J ; Sun, Yi Y
> > ; jean-phili...@linaro.org; pet...@redhat.com;
> > iommu@lists.linux-foundation.org; k...@vger.kernel.org; linux-
> > ker...@vger.kernel.org; Wu, Hao 
> > Subject: [PATCH v1 6/8] vfio/type1: Bind guest page tables to host
> >
> > From: Liu Yi L 
> >
> > VFIO_TYPE1_NESTING_IOMMU is an IOMMU type which is backed by
> > hardware
> > IOMMUs that have nesting DMA translation (a.k.a dual stage address
> > translation). For such hardware IOMMUs, there are two stages/levels of
> > address translation, and software may let userspace/VM to own the first-
> > level/stage-1 translation structures. Example of such usage is vSVA (
> > virtual Shared Virtual Addressing). VM owns the first-level/stage-1
> > translation structures and bind the structures to host, then hardware
> > IOMMU would utilize nesting translation when doing DMA translation fo
> > the devices behind such hardware IOMMU.
> >
> > This patch adds vfio support for binding guest translation (a.k.a stage 1)
> > structure to host iommu. And for VFIO_TYPE1_NESTING_IOMMU, not only
> > bind
> > guest page table is needed, it also requires to expose interface to guest
> > for iommu cache invalidation when guest modified the first-level/stage-1
> > translation structures since hardware needs to be notified to flush stale
> > iotlbs. This would be introduced in next patch.
> >
> > In this patch, guest page table bind and unbind are done by using flags
> > VFIO_IOMMU_BIND_GUEST_PGTBL and
> > VFIO_IOMMU_UNBIND_GUEST_PGTBL under IOCTL
> > VFIO_IOMMU_BIND, the bind/unbind data are conveyed by
> > struct iommu_gpasid_bind_data. Before binding guest page table to host,
> > VM should have got a PASID allocated by host via
> > VFIO_IOMMU_PASID_REQUEST.
> >
> > Bind guest translation structures (here is guest page table) to host
> > are the first step to setup vSVA (Virtual Shared Virtual Addressing).
> >
> > Cc: Kevin Tian 
> > CC: Jacob Pan 
> > Cc: Alex Williamson 
> > Cc: Eric Auger 
> > Cc: Jean-Philippe Brucker 
> > Signed-off-by: Jean-Philippe Brucker 
> > Signed-off-by: Liu Yi L 
> > Signed-off-by: Jacob Pan 
> > ---
> >  drivers/vfio/vfio_iommu_type1.c | 158
> > 

Re: Constantly map and unmap of streaming DMA buffers with IOMMU backend might cause serious performance problem

2020-05-15 Thread h...@lst.de
On Fri, May 15, 2020 at 01:10:21PM +0100, Robin Murphy wrote:
>> Meanwhile, for the safety of buffers, lower-layer drivers need to make 
>> certain the buffers have already been unmapped in iommu before those buffers 
>> go back to buddy for other users.
>
> That sounds like it would only have benefit in a very small set of specific 
> circumstances, and would be very difficult to generalise to buffers that 
> are mapped via dma_map_page() or dma_map_single(). Furthermore, a 
> high-level API that affects a low-level driver's interpretation of 
> mid-layer API calls without the mid-layer's knowledge sounds like a hideous 
> abomination of anti-design. If a mid-layer API lends itself to inefficiency 
> at the lower level, it would seem a lot cleaner and more robust to extend 
> *that* API for stateful buffer reuse. Failing that, it might possibly be 
> appropriate to approach this at the driver level - many of the cleverer 
> network drivers already implement buffer pools to recycle mapped SKBs 
> internally, couldn't the "zip driver" simply try doing something like that 
> for itself?

Exactly.  If your upper consumer of the DMA API keeps reusing the same
pages, just map them once and use dma_sync_* to transfer ownership as
needed.
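
A minimal sketch of that pattern, assuming a driver that keeps reusing a
single page for its transfers (all names here are made up for illustration;
error unwinding and locking are omitted):

#include <linux/dma-mapping.h>

/* Map the reused page once, at setup time. */
static int reuse_buf_setup(struct device *dev, struct page *page,
			   dma_addr_t *dma)
{
	*dma = dma_map_page(dev, page, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
	if (dma_mapping_error(dev, *dma))
		return -ENOMEM;
	return 0;
}

/* Per transfer: only transfer ownership, no iova allocation or freeing. */
static void reuse_buf_one_transfer(struct device *dev, dma_addr_t dma)
{
	dma_sync_single_for_device(dev, dma, PAGE_SIZE, DMA_BIDIRECTIONAL);
	/* ... submit the DMA and wait for completion ... */
	dma_sync_single_for_cpu(dev, dma, PAGE_SIZE, DMA_BIDIRECTIONAL);
}

/* Unmap once, when the buffer is finally retired. */
static void reuse_buf_teardown(struct device *dev, dma_addr_t dma)
{
	dma_unmap_page(dev, dma, PAGE_SIZE, DMA_BIDIRECTIONAL);
}

The iova allocation and IOMMU mapping cost is then paid once per buffer
rather than once per transfer, which is the coherent-API-like behaviour
discussed in this thread.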


Re: [PATCH] iommu: Implement deferred domain attachment

2020-05-15 Thread Joerg Roedel
On Fri, May 15, 2020 at 09:51:03PM +0800, Lu Baolu wrote:
> On 2020/5/15 17:45, Joerg Roedel wrote:
> >   struct iommu_domain *iommu_get_dma_domain(struct device *dev)
> >   {
> > -   return dev->iommu_group->default_domain;
> > +   struct iommu_domain *domain = dev->iommu_group->default_domain;
> > +
> > +   if (__iommu_is_attach_deferred(domain, dev))
> > +   __iommu_attach_device_no_defer(domain, dev);
> 
> It seems that the return value needs to be checked. The default domain
> is invalid if attach() failed.

True, I looked at that, but the callers can't handle a NULL return here, so
I kept it this way for now. The outcome is that DMA will fail, but
otherwise we'd see a NULL-ptr dereference really quickly after returning
from that function.

Bottom line: This needs to be cleaned up separately.

Regards,

Joerg
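
For illustration only, the kind of separate cleanup hinted at here might let
iommu_get_dma_domain() report a failed deferred attach and make its dma-iommu
callers check for NULL; a rough sketch, not part of the applied patch:

struct iommu_domain *iommu_get_dma_domain(struct device *dev)
{
	struct iommu_domain *domain = dev->iommu_group->default_domain;

	/* Hypothetical follow-up: propagate a failed deferred attach. */
	if (__iommu_is_attach_deferred(domain, dev) &&
	    __iommu_attach_device_no_defer(domain, dev))
		return NULL;

	return domain;
}

Each caller in the dma-iommu map paths would then have to bail out (for
example by returning DMA_MAPPING_ERROR) when it gets NULL back.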


Re: [PATCH] iommu: Implement deferred domain attachment

2020-05-15 Thread Lu Baolu

Hi Joerg,

On 2020/5/15 17:45, Joerg Roedel wrote:

From: Joerg Roedel 

The IOMMU core code has support for deferring the attachment of a domain
to a device. This is needed in kdump kernels where the new domain must
not be attached to a device before the device driver takes it over.

But this needs support from the dma-ops code too, to actually do the
late attachment when there are DMA-API calls for the device. This got
lost in the AMD IOMMU driver after converting it to the dma-iommu code.

Do the late attachment in the dma-iommu code-path to fix the issue.

Cc: Jerry Snitselaar 
Cc: Tom Murphy 
Reported-by: Jerry Snitselaar 
Tested-by: Jerry Snitselaar 
Fixes: be62dbf554c5 ("iommu/amd: Convert AMD iommu driver to the dma-iommu api")
Signed-off-by: Joerg Roedel 
---
  drivers/iommu/iommu.c | 33 +++--
  1 file changed, 27 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 4050569188be..f54ebb964271 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1889,13 +1889,19 @@ void iommu_domain_free(struct iommu_domain *domain)
  }
  EXPORT_SYMBOL_GPL(iommu_domain_free);
  
-static int __iommu_attach_device(struct iommu_domain *domain,

-struct device *dev)
+static bool __iommu_is_attach_deferred(struct iommu_domain *domain,
+  struct device *dev)
+{
+   if (!domain->ops->is_attach_deferred)
+   return false;
+
+   return domain->ops->is_attach_deferred(domain, dev);
+}
+
+static int __iommu_attach_device_no_defer(struct iommu_domain *domain,
+ struct device *dev)
  {
int ret;
-   if ((domain->ops->is_attach_deferred != NULL) &&
-   domain->ops->is_attach_deferred(domain, dev))
-   return 0;
  
  	if (unlikely(domain->ops->attach_dev == NULL))

return -ENODEV;
@@ -1903,9 +1909,19 @@ static int __iommu_attach_device(struct iommu_domain 
*domain,
ret = domain->ops->attach_dev(domain, dev);
if (!ret)
trace_attach_device_to_domain(dev);
+
return ret;
  }
  
+static int __iommu_attach_device(struct iommu_domain *domain,

+struct device *dev)
+{
+   if (__iommu_is_attach_deferred(domain, dev))
+   return 0;
+
+   return __iommu_attach_device_no_defer(domain, dev);
+}
+
  int iommu_attach_device(struct iommu_domain *domain, struct device *dev)
  {
struct iommu_group *group;
@@ -2023,7 +2039,12 @@ EXPORT_SYMBOL_GPL(iommu_get_domain_for_dev);
   */
  struct iommu_domain *iommu_get_dma_domain(struct device *dev)
  {
-   return dev->iommu_group->default_domain;
+   struct iommu_domain *domain = dev->iommu_group->default_domain;
+
+   if (__iommu_is_attach_deferred(domain, dev))
+   __iommu_attach_device_no_defer(domain, dev);


It seems that the return value needs to be checked. The default domain
is invalid if attach() failed.

Best regards,
baolu


Re: [PATCH] iommu: Remove functions that support private domain

2020-05-15 Thread Lu Baolu

Hi,

On 2020/5/15 17:59, Joerg Roedel wrote:

On Thu, May 14, 2020 at 11:12:52PM +, Prakhya, Sai Praneeth wrote:

+static int is_driver_bound(struct device *dev, void *not_used)
+{
+   int ret = 0;
+
+   device_lock(dev);
+   if (device_is_bound(dev))
+   ret = 1;
+   device_unlock(dev);
+   return ret;
+}


This locks only one device, so without lock-conversion there could be a
driver probe after the device_unlock(), while we are probing the other
devices of the group.


[SNIP]

+   /*
+* Check if any device in the group still has a driver binded to it.
+* This might race with device driver probing code and unfortunately
+* there is no clean way out of that either, locking all devices in the
+* group and then do the re-attach will introduce a lock-inversion with
+* group->mutex - Joerg.
+*/
+   if (iommu_group_for_each_dev(group, NULL, is_driver_bound)) {
+   pr_err("Active drivers exist for devices in the group\n");
+   return -EBUSY;
+   }


The lock inversion comes into the picture when this code is called from
device(-driver) core through the bus-notifiers. The notifiers are called
with the device already locked.


Another question I have is.. if it's racy then it should be racy even
for one device iommu groups.. right? Why would it be racy only with
multiple devices iommu group?


Valid point. So the device needs to be locked _while_ the default domain
change happens. If triggered by sysfs there should be no locking
problems, I guess. But you better try it out.


It seems that we can do like this:

[1] mutex_lock(&group->mutex)
[2] for_each_group_device(device_lock())
[3] if (for_each_group_device(!device_is_bound()))
        change_default_domain()
[4] for_each_group_device_reverse(device_unlock())
[5] mutex_unlock(&group->mutex)

A possible problem exists at step 2 when another thread is trying to
lock devices in the reverse order at the same time.

Best regards,
baolu
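
For reference, that proposed sequence expressed as code might look roughly
like the sketch below. for_each_group_device(), for_each_group_device_reverse(),
group_has_bound_driver() and change_default_domain() are hypothetical helpers
standing in for a walk of the group's device list; the point is only the
intended lock ordering, and the step-2 race described above still applies:

static int iommu_group_try_change_default_domain(struct iommu_group *group)
{
	struct device *dev;
	int ret = -EBUSY;

	mutex_lock(&group->mutex);			/* [1] */

	for_each_group_device(group, dev)		/* [2] */
		device_lock(dev);

	if (!group_has_bound_driver(group))		/* [3] */
		ret = change_default_domain(group);

	for_each_group_device_reverse(group, dev)	/* [4] */
		device_unlock(dev);

	mutex_unlock(&group->mutex);			/* [5] */

	return ret;
}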


Re: [PATCH 2/4] iommu/amd: Use pci_ats_supported()

2020-05-15 Thread Joerg Roedel
On Fri, May 15, 2020 at 02:11:24PM +0200, Jean-Philippe Brucker wrote:
> On Fri, May 15, 2020 at 02:01:50PM +0200, Joerg Roedel wrote:
> > Hmm, but per patch 1, pci_ats_supported() does not check
> > pci_ats_disabled(), or do I miss something?
> 
> The commit message isn't clear. pci_ats_init() sets dev->ats_cap only if
> !pci_ats_disabled(), so checking dev->ats_cap in pci_ats_supported()
> takes pci_ats_disabled() into account.

Right, so the patch is fine:

Reviewed-by: Joerg Roedel 


Re: [PATCH 2/4] iommu/amd: Use pci_ats_supported()

2020-05-15 Thread Jean-Philippe Brucker
On Fri, May 15, 2020 at 02:01:50PM +0200, Joerg Roedel wrote:
> On Fri, May 15, 2020 at 12:44:00PM +0200, Jean-Philippe Brucker wrote:
> > The pci_ats_supported() function checks if a device supports ATS and is
> > allowed to use it. In addition to checking that the device has an ATS
> > capability and that the global pci=noats is not set
> > (pci_ats_disabled()), it also checks if a device is untrusted.
> 
> Hmm, but per patch 1, pci_ats_supported() does not check
> pci_ats_disabled(), or do I miss something?

The commit message isn't clear. pci_ats_init() sets dev->ats_cap only if
!pci_ats_disabled(), so checking dev->ats_cap in pci_ats_supported()
takes pci_ats_disabled() into account.

Thanks,
Jean
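
For reference, the ordering being relied on is roughly the following
(simplified from drivers/pci/ats.c as of this thread):

void pci_ats_init(struct pci_dev *dev)
{
	int pos;

	if (pci_ats_disabled())
		return;		/* pci=noats: ats_cap stays zero */

	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ATS);
	if (!pos)
		return;

	dev->ats_cap = pos;
}

So the "if (!dev->ats_cap) return false;" check in pci_ats_supported() covers
the pci=noats case as well.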


Re: Constantly map and unmap of streaming DMA buffers with IOMMU backend might cause serious performance problem

2020-05-15 Thread Robin Murphy

On 2020-05-15 09:19, Song Bao Hua wrote:
[ snip... nice analysis, but ultimately it's still "doing stuff has more 
overhead than not doing stuff" ]



I am thinking several possible ways on decreasing or removing the latency of DMA 
map/unmap for every single DMA transfer. Meanwhile, "non-strict" as an existing 
option with possible safety issues, I won't discuss it in this mail.


But passthrough and non-strict mode *specifically exist* for the cases 
where performance is the most important concern - streaming DMA with an 
IOMMU in the middle has an unavoidable tradeoff between performance and 
isolation, so dismissing that out of hand is not a good way to start 
making this argument.



1. provide bounce coherent buffers for streaming buffers.
As the coherent buffers keep the status of mapping, we can remove the overhead 
of map and unmap for each single DMA operations. However, this solution 
requires memory copy between stream buffers and bounce buffers. Thus it will 
work only if copy is faster than map/unmap. Meanwhile, it will consume much 
more memory bandwidth.


I'm struggling to understand how that would work, can you explain it in 
more detail?



2.make upper-layer kernel components aware of the pain of iommu map/unmap
upper-layer fs, mm, networks can somehow let the lower-layer drivers know the 
end of the life cycle of sg buffers. In zswap case, I have seen zswap always 
use the same 2 pages as the destination buffers to save compressed page, but 
the compressor driver still has to constantly map and unmap those same two 
pages for every single compression since zswap and zip drivers are working in 
two completely different software layers.

I am thinking some way as below, upper-layer kernel code can call:
sg_init_table();
sg_mark_reusable();
 /* use the buffer many times */

sg_mark_stop_reuse();

After that, if low level drivers see "reusable" flag, it will realize the buffer can be used 
multiple times and will not do map/unmap every time. it means upper-layer components will further use the 
buffers and the same buffers will probably be given to lower-layer drivers for new DMA transfer later. When 
upper-layer code sets " stop_reuse", lower-layer driver will unmap the sg buffers, possibly by 
providing a unmap-callback to upper-layer components. For zswap case, I have seen the same buffers are always 
re-used and zip driver maps and unmaps it again and again. Shortly after the buffer is unmapped, it will be 
mapped in the next transmission, almost without any time gap between unmap and map. In case zswap can set the 
"reusable" flag, zip driver will save a lot of time.
Meanwhile, for the safety of buffers, lower-layer drivers need to make certain 
the buffers have already been unmapped in iommu before those buffers go back to 
buddy for other users.


That sounds like it would only have benefit in a very small set of 
specific circumstances, and would be very difficult to generalise to 
buffers that are mapped via dma_map_page() or dma_map_single(). 
Furthermore, a high-level API that affects a low-level driver's 
interpretation of mid-layer API calls without the mid-layer's knowledge 
sounds like a hideous abomination of anti-design. If a mid-layer API 
lends itself to inefficiency at the lower level, it would seem a lot 
cleaner and more robust to extend *that* API for stateful buffer reuse. 
Failing that, it might possibly be appropriate to approach this at the 
driver level - many of the cleverer network drivers already implement 
buffer pools to recycle mapped SKBs internally, couldn't the "zip 
driver" simply try doing something like that for itself?


Robin.
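
A very rough sketch of the driver-level recycling suggested above, keeping a
small pool of already-mapped buffers so that the per-request cost is a sync
rather than a map/unmap (all names are invented for illustration; locking and
error unwinding are omitted):

#include <linux/dma-mapping.h>
#include <linux/slab.h>

#define POOL_SIZE	8
#define POOL_BUF_SIZE	PAGE_SIZE

struct pool_buf {
	void		*cpu;
	dma_addr_t	dma;
	bool		in_use;
};

static struct pool_buf pool[POOL_SIZE];

/* Allocate and map every pool buffer once, at probe time. */
static int pool_init(struct device *dev)
{
	int i;

	for (i = 0; i < POOL_SIZE; i++) {
		pool[i].cpu = kmalloc(POOL_BUF_SIZE, GFP_KERNEL);
		if (!pool[i].cpu)
			return -ENOMEM;
		pool[i].dma = dma_map_single(dev, pool[i].cpu, POOL_BUF_SIZE,
					     DMA_BIDIRECTIONAL);
		if (dma_mapping_error(dev, pool[i].dma))
			return -ENOMEM;
	}
	return 0;
}

/* Per request: take a pre-mapped buffer and just hand it to the device. */
static struct pool_buf *pool_get(struct device *dev)
{
	int i;

	for (i = 0; i < POOL_SIZE; i++) {
		if (!pool[i].in_use) {
			pool[i].in_use = true;
			dma_sync_single_for_device(dev, pool[i].dma,
						   POOL_BUF_SIZE,
						   DMA_BIDIRECTIONAL);
			return &pool[i];
		}
	}
	return NULL;	/* pool empty: fall back to a plain dma_map_single() */
}

static void pool_put(struct device *dev, struct pool_buf *buf)
{
	dma_sync_single_for_cpu(dev, buf->dma, POOL_BUF_SIZE,
				DMA_BIDIRECTIONAL);
	buf->in_use = false;
}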


Re: [PATCH 2/4] iommu/amd: Use pci_ats_supported()

2020-05-15 Thread Joerg Roedel
On Fri, May 15, 2020 at 12:44:00PM +0200, Jean-Philippe Brucker wrote:
> The pci_ats_supported() function checks if a device supports ATS and is
> allowed to use it. In addition to checking that the device has an ATS
> capability and that the global pci=noats is not set
> (pci_ats_disabled()), it also checks if a device is untrusted.

Hmm, but per patch 1, pci_ats_supported() does not check
pci_ats_disabled(), or do I miss something?


Joerg


Re: [PATCH 1/4] PCI/ATS: Only enable ATS for trusted devices

2020-05-15 Thread Joerg Roedel
Hi Jean-Philippe,

thanks for doing this!

On Fri, May 15, 2020 at 12:43:59PM +0200, Jean-Philippe Brucker wrote:
> Add pci_ats_supported(), which checks whether a device has an ATS
> capability, and whether it is trusted.  A device is untrusted if it is
> plugged into an external-facing port such as Thunderbolt and could be
> spoof an existing device to exploit weaknesses in the IOMMU
> configuration.  PCIe ATS is one such weaknesses since it allows
> endpoints to cache IOMMU translations and emit transactions with
> 'Translated' Address Type (10b) that partially bypass the IOMMU
> translation.
> 
> The SMMUv3 and VT-d IOMMU drivers already disallow ATS and transactions
> with 'Translated' Address Type for untrusted devices.  Add the check to
> pci_enable_ats() to let other drivers (AMD IOMMU for now) benefit from
> it.
> 
> By checking ats_cap, the pci_ats_supported() helper also returns whether
> ATS was globally disabled with pci=noats, and could later include more
> things, for example whether the whole PCIe hierarchy down to the
> endpoint supports ATS.
> 
> Signed-off-by: Jean-Philippe Brucker 
> ---
>  include/linux/pci-ats.h |  3 +++
>  drivers/pci/ats.c   | 18 +-
>  2 files changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/pci-ats.h b/include/linux/pci-ats.h
> index d08f0869f1213e..f75c307f346de9 100644
> --- a/include/linux/pci-ats.h
> +++ b/include/linux/pci-ats.h
> @@ -6,11 +6,14 @@
>  
>  #ifdef CONFIG_PCI_ATS
>  /* Address Translation Service */
> +bool pci_ats_supported(struct pci_dev *dev);
>  int pci_enable_ats(struct pci_dev *dev, int ps);
>  void pci_disable_ats(struct pci_dev *dev);
>  int pci_ats_queue_depth(struct pci_dev *dev);
>  int pci_ats_page_aligned(struct pci_dev *dev);
>  #else /* CONFIG_PCI_ATS */
> +static inline bool pci_ats_supported(struct pci_dev *d)
> +{ return false; }
>  static inline int pci_enable_ats(struct pci_dev *d, int ps)
>  { return -ENODEV; }
>  static inline void pci_disable_ats(struct pci_dev *d) { }
> diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
> index 390e92f2d8d1fc..15fa0c37fd8e44 100644
> --- a/drivers/pci/ats.c
> +++ b/drivers/pci/ats.c
> @@ -30,6 +30,22 @@ void pci_ats_init(struct pci_dev *dev)
>   dev->ats_cap = pos;
>  }
>  
> +/**
> + * pci_ats_supported - check if the device can use ATS
> + * @dev: the PCI device
> + *
> + * Returns true if the device supports ATS and is allowed to use it, false
> + * otherwise.
> + */
> +bool pci_ats_supported(struct pci_dev *dev)
> +{
> + if (!dev->ats_cap)
> + return false;
> +
> + return !dev->untrusted;

dev->untrusted is an 'unsigned int :1', so while this works I would
prefer 'return (dev->untrusted == 0);' here, to be more type-safe.

With that changed:

Reviewed-by: Joerg Roedel 

> +}
> +EXPORT_SYMBOL_GPL(pci_ats_supported);
> +
>  /**
>   * pci_enable_ats - enable the ATS capability
>   * @dev: the PCI device
> @@ -42,7 +58,7 @@ int pci_enable_ats(struct pci_dev *dev, int ps)
>   u16 ctrl;
>   struct pci_dev *pdev;
>  
> - if (!dev->ats_cap)
> + if (!pci_ats_supported(dev))
>   return -EINVAL;
>  
>   if (WARN_ON(dev->ats_enabled))
> -- 
> 2.26.2


[PATCH 4/4] iommu/vt-d: Use pci_ats_supported()

2020-05-15 Thread Jean-Philippe Brucker
The pci_ats_supported() helper checks if a device supports ATS and is
allowed to use it. By checking the ATS capability it also integrates the
pci_ats_disabled() check from pci_ats_init(). Simplify the vt-d checks.

Acked-by: Lu Baolu 
Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/intel-iommu.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 0182cff2c7ac75..ed21ce6d123810 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1454,8 +1454,7 @@ static void iommu_enable_dev_iotlb(struct 
device_domain_info *info)
!pci_reset_pri(pdev) && !pci_enable_pri(pdev, 32))
info->pri_enabled = 1;
 #endif
-   if (!pdev->untrusted && info->ats_supported &&
-   pci_ats_page_aligned(pdev) &&
+   if (info->ats_supported && pci_ats_page_aligned(pdev) &&
!pci_enable_ats(pdev, VTD_PAGE_SHIFT)) {
info->ats_enabled = 1;
domain_update_iotlb(info->domain);
@@ -2611,10 +2610,8 @@ static struct dmar_domain 
*dmar_insert_one_dev_info(struct intel_iommu *iommu,
if (dev && dev_is_pci(dev)) {
struct pci_dev *pdev = to_pci_dev(info->dev);
 
-   if (!pdev->untrusted &&
-   !pci_ats_disabled() &&
-   ecap_dev_iotlb_support(iommu->ecap) &&
-   pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ATS) &&
+   if (ecap_dev_iotlb_support(iommu->ecap) &&
+   pci_ats_supported(pdev) &&
dmar_find_matched_atsr_unit(pdev))
info->ats_supported = 1;
 
-- 
2.26.2



[PATCH 1/4] PCI/ATS: Only enable ATS for trusted devices

2020-05-15 Thread Jean-Philippe Brucker
Add pci_ats_supported(), which checks whether a device has an ATS
capability, and whether it is trusted.  A device is untrusted if it is
plugged into an external-facing port such as Thunderbolt and could
spoof an existing device to exploit weaknesses in the IOMMU
configuration.  PCIe ATS is one such weakness since it allows
endpoints to cache IOMMU translations and emit transactions with
'Translated' Address Type (10b) that partially bypass the IOMMU
translation.

The SMMUv3 and VT-d IOMMU drivers already disallow ATS and transactions
with 'Translated' Address Type for untrusted devices.  Add the check to
pci_enable_ats() to let other drivers (AMD IOMMU for now) benefit from
it.

By checking ats_cap, the pci_ats_supported() helper also returns whether
ATS was globally disabled with pci=noats, and could later include more
things, for example whether the whole PCIe hierarchy down to the
endpoint supports ATS.

Signed-off-by: Jean-Philippe Brucker 
---
 include/linux/pci-ats.h |  3 +++
 drivers/pci/ats.c   | 18 +-
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/include/linux/pci-ats.h b/include/linux/pci-ats.h
index d08f0869f1213e..f75c307f346de9 100644
--- a/include/linux/pci-ats.h
+++ b/include/linux/pci-ats.h
@@ -6,11 +6,14 @@
 
 #ifdef CONFIG_PCI_ATS
 /* Address Translation Service */
+bool pci_ats_supported(struct pci_dev *dev);
 int pci_enable_ats(struct pci_dev *dev, int ps);
 void pci_disable_ats(struct pci_dev *dev);
 int pci_ats_queue_depth(struct pci_dev *dev);
 int pci_ats_page_aligned(struct pci_dev *dev);
 #else /* CONFIG_PCI_ATS */
+static inline bool pci_ats_supported(struct pci_dev *d)
+{ return false; }
 static inline int pci_enable_ats(struct pci_dev *d, int ps)
 { return -ENODEV; }
 static inline void pci_disable_ats(struct pci_dev *d) { }
diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index 390e92f2d8d1fc..15fa0c37fd8e44 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -30,6 +30,22 @@ void pci_ats_init(struct pci_dev *dev)
dev->ats_cap = pos;
 }
 
+/**
+ * pci_ats_supported - check if the device can use ATS
+ * @dev: the PCI device
+ *
+ * Returns true if the device supports ATS and is allowed to use it, false
+ * otherwise.
+ */
+bool pci_ats_supported(struct pci_dev *dev)
+{
+   if (!dev->ats_cap)
+   return false;
+
+   return !dev->untrusted;
+}
+EXPORT_SYMBOL_GPL(pci_ats_supported);
+
 /**
  * pci_enable_ats - enable the ATS capability
  * @dev: the PCI device
@@ -42,7 +58,7 @@ int pci_enable_ats(struct pci_dev *dev, int ps)
u16 ctrl;
struct pci_dev *pdev;
 
-   if (!dev->ats_cap)
+   if (!pci_ats_supported(dev))
return -EINVAL;
 
if (WARN_ON(dev->ats_enabled))
-- 
2.26.2



[PATCH 3/4] iommu/arm-smmu-v3: Use pci_ats_supported()

2020-05-15 Thread Jean-Philippe Brucker
The new pci_ats_supported() function checks if a device supports ATS and
is allowed to use it.

Signed-off-by: Jean-Philippe Brucker 
---
I dropped the Ack because I slightly changed the patch to keep the
fwspec check, since last version:
https://lore.kernel.org/linux-iommu/20200311124506.208376-8-jean-phili...@linaro.org/
---
 drivers/iommu/arm-smmu-v3.c | 20 +---
 1 file changed, 5 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 82508730feb7a1..39b935e86ab203 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2652,26 +2652,16 @@ static void arm_smmu_install_ste_for_dev(struct 
arm_smmu_master *master)
}
 }
 
-#ifdef CONFIG_PCI_ATS
 static bool arm_smmu_ats_supported(struct arm_smmu_master *master)
 {
-   struct pci_dev *pdev;
+   struct device *dev = master->dev;
struct arm_smmu_device *smmu = master->smmu;
-   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(master->dev);
-
-   if (!(smmu->features & ARM_SMMU_FEAT_ATS) || !dev_is_pci(master->dev) ||
-   !(fwspec->flags & IOMMU_FWSPEC_PCI_RC_ATS) || pci_ats_disabled())
-   return false;
+   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 
-   pdev = to_pci_dev(master->dev);
-   return !pdev->untrusted && pdev->ats_cap;
+   return (smmu->features & ARM_SMMU_FEAT_ATS) &&
+   !(fwspec->flags & IOMMU_FWSPEC_PCI_RC_ATS) &&
+   dev_is_pci(dev) && pci_ats_supported(to_pci_dev(dev));
 }
-#else
-static bool arm_smmu_ats_supported(struct arm_smmu_master *master)
-{
-   return false;
-}
-#endif
 
 static void arm_smmu_enable_ats(struct arm_smmu_master *master)
 {
-- 
2.26.2



[PATCH 0/4] PCI, iommu: Factor 'untrusted' check for ATS enablement

2020-05-15 Thread Jean-Philippe Brucker
I sent these in March as part of ATS enablement for device-tree [1], but
haven't found the time to address the largest comment on that series
about consolidating the root bridge ATS support between the different
ACPI tables.

I'm resending only the bits that consolidate the 'untrusted' check for
ATS, since there have been more discussions about this [2]. Patch 1
moves the 'untrusted' check to drivers/pci/ats.c and patches 2-4 modify
the ATS-capable IOMMU drivers.

The only functional change should be to the AMD IOMMU driver. With this
change all IOMMU drivers block 'Translated' PCIe transactions and
Translation Requests from untrusted devices.

[1] 
https://lore.kernel.org/linux-iommu/20200311124506.208376-1-jean-phili...@linaro.org/
[2] 
https://lore.kernel.org/linux-pci/20200513151929.GA38418@bjorn-Precision-5520/

Jean-Philippe Brucker (4):
  PCI/ATS: Only enable ATS for trusted devices
  iommu/amd: Use pci_ats_supported()
  iommu/arm-smmu-v3: Use pci_ats_supported()
  iommu/vt-d: Use pci_ats_supported()

 include/linux/pci-ats.h |  3 +++
 drivers/iommu/amd_iommu.c   | 12 
 drivers/iommu/arm-smmu-v3.c | 20 +---
 drivers/iommu/intel-iommu.c |  9 +++--
 drivers/pci/ats.c   | 18 +-
 5 files changed, 32 insertions(+), 30 deletions(-)

-- 
2.26.2



[PATCH 2/4] iommu/amd: Use pci_ats_supported()

2020-05-15 Thread Jean-Philippe Brucker
The pci_ats_supported() function checks if a device supports ATS and is
allowed to use it. In addition to checking that the device has an ATS
capability and that the global pci=noats is not set
(pci_ats_disabled()), it also checks if a device is untrusted.

A device is untrusted if it is plugged into an external-facing port such
as Thunderbolt and could be spoofing an existing device to exploit
weaknesses in the IOMMU configuration. By calling pci_ats_supported() we
keep DTE[I]=0 for untrusted devices and abort transactions with
Pretranslated Addresses.

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/amd_iommu.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 1dc3718560d0e8..8b7a9e811d33a6 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -313,16 +313,15 @@ static struct iommu_group *acpihid_device_group(struct 
device *dev)
 static bool pci_iommuv2_capable(struct pci_dev *pdev)
 {
static const int caps[] = {
-   PCI_EXT_CAP_ID_ATS,
PCI_EXT_CAP_ID_PRI,
PCI_EXT_CAP_ID_PASID,
};
int i, pos;
 
-   if (pci_ats_disabled())
+   if (!pci_ats_supported(pdev))
return false;
 
-   for (i = 0; i < 3; ++i) {
+   for (i = 0; i < 2; ++i) {
pos = pci_find_ext_capability(pdev, caps[i]);
if (pos == 0)
return false;
@@ -3150,11 +3149,8 @@ int amd_iommu_device_info(struct pci_dev *pdev,
 
memset(info, 0, sizeof(*info));
 
-   if (!pci_ats_disabled()) {
-   pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ATS);
-   if (pos)
-   info->flags |= AMD_IOMMU_DEVICE_FLAG_ATS_SUP;
-   }
+   if (pci_ats_supported(pdev))
+   info->flags |= AMD_IOMMU_DEVICE_FLAG_ATS_SUP;
 
pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_PRI);
if (pos)
-- 
2.26.2



Re: [PATCH v5 30/38] dmabuf: fix common struct sg_table related issues

2020-05-15 Thread Gerd Hoffmann
> diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
> index acb26c6..89e293b 100644
> --- a/drivers/dma-buf/udmabuf.c
> +++ b/drivers/dma-buf/udmabuf.c
> @@ -63,10 +63,9 @@ static struct sg_table *get_sg_table(struct device *dev, 
> struct dma_buf *buf,
>   GFP_KERNEL);
>   if (ret < 0)
>   goto err;
> - if (!dma_map_sg(dev, sg->sgl, sg->nents, direction)) {
> - ret = -EINVAL;
> + ret = dma_map_sgtable(dev, sg, direction, 0);
> + if (ret < 0)
>   goto err;
> - }
>   return sg;
>  
>  err:
> @@ -78,7 +77,7 @@ static struct sg_table *get_sg_table(struct device *dev, 
> struct dma_buf *buf,
>  static void put_sg_table(struct device *dev, struct sg_table *sg,
>enum dma_data_direction direction)
>  {
> - dma_unmap_sg(dev, sg->sgl, sg->nents, direction);
> + dma_unmap_sgtable(dev, sg, direction, 0);
>   sg_free_table(sg);
>   kfree(sg);
>  }

Easy straightforward conversion.

Acked-by: Gerd Hoffmann 

take care,
  Gerd



Re: [PATCH v5 25/38] drm: virtio: fix common struct sg_table related issues

2020-05-15 Thread Gerd Hoffmann
On Wed, May 13, 2020 at 03:32:32PM +0200, Marek Szyprowski wrote:
> The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
> returns the number of the created entries in the DMA address space.
> However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
> dma_unmap_sg must be called with the original number of the entries
> passed to the dma_map_sg().
> 
> struct sg_table is a common structure used for describing a non-contiguous
> memory buffer, used commonly in the DRM and graphics subsystems. It
> consists of a scatterlist with memory pages and DMA addresses (sgl entry),
> as well as the number of scatterlist entries: CPU pages (orig_nents entry)
> and DMA mapped pages (nents entry).
> 
> It turned out that it was a common mistake to misuse nents and orig_nents
> entries, calling DMA-mapping functions with a wrong number of entries or
> ignoring the number of mapped entries returned by the dma_map_sg()
> function.
> 
> To avoid such issues, lets use a common dma-mapping wrappers operating
> directly on the struct sg_table objects and use scatterlist page
> iterators where possible. This, almost always, hides references to the
> nents and orig_nents entries, making the code robust, easier to follow
> and copy/paste safe.

Looks all sane.

Acked-by: Gerd Hoffmann 

take care,
  Gerd



Re: [PATCH v2 23/33] iommu/mediatek-v1 Convert to probe/release_device() call-backs

2020-05-15 Thread Joerg Roedel
Hi,

On Fri, May 15, 2020 at 03:44:59PM +0800, Yong Wu wrote:
> On Tue, 2020-04-14 at 15:15 +0200, Joerg Roedel wrote:
> > -   return iommu_device_link(>iommu, dev);
> > +   err = arm_iommu_attach_device(dev, mtk_mapping);
> > +   if (err)
> > +   dev_err(dev, "Can't create IOMMU mapping - DMA-OPS will not 
> > work\n");
> 
> 
> Hi Joerg,
> 
>  Thanks very much for this patch.
> 
>  This arm_iommu_attach_device is called just as we expected.
> 
>  But it will fail in this callstack as the group->mutex was tried to
> be re-locked...
> 
> [] (iommu_attach_device) from []
> (__arm_iommu_attach_device+0x34/0x90)
> [] (__arm_iommu_attach_device) from []
> (arm_iommu_attach_device+0xc/0x20)
> [] (arm_iommu_attach_device) from []
> (mtk_iommu_probe_finalize+0x34/0x50)
> [] (mtk_iommu_probe_finalize) from []
> (bus_iommu_probe+0x2a8/0x2c4)
> [] (bus_iommu_probe) from [] (bus_set_iommu
> +0x88/0xd4)
> [] (bus_set_iommu) from [] (mtk_iommu_probe
> +0x2f8/0x364)

Thanks for the report. Is

https://lore.kernel.org/lkml/1589530123-30240-1-git-send-email-yong...@mediatek.com/

the fix for this issue, or is something else required?


Thanks,

Joerg


Re: [PATCH] iommu: Remove functions that support private domain

2020-05-15 Thread Joerg Roedel
On Wed, May 13, 2020 at 03:47:21PM -0700, Sai Praneeth Prakhya wrote:
> After moving iommu_group setup to iommu core code [1][2] and removing
> private domain support in vt-d [3], there are no users for functions such
> as iommu_request_dm_for_dev(), iommu_request_dma_domain_for_dev() and
> request_default_domain_for_dev(). So, remove these functions.
> 
> [1] commit dce8d6964ebd ("iommu/amd: Convert to probe/release_device()
> call-backs")
> [2] commit e5d1841f18b2 ("iommu/vt-d: Convert to probe/release_device()
> call-backs")
> [3] commit 327d5b2fee91 ("iommu/vt-d: Allow 32bit devices to uses DMA
> domain")
> 
> Cc: Joerg Roedel 
> Cc: Lu Baolu 
> Signed-off-by: Sai Praneeth Prakhya 
> ---
>  drivers/iommu/iommu.c | 65 ---
>  include/linux/iommu.h | 12 
>  2 files changed, 77 deletions(-)

Applied that patch to the x86/vt-d branch.



Re: [PATCH] iommu: Remove functions that support private domain

2020-05-15 Thread Joerg Roedel
On Thu, May 14, 2020 at 11:12:52PM +, Prakhya, Sai Praneeth wrote:
> +static int is_driver_bound(struct device *dev, void *not_used)
> +{
> + int ret = 0;
> +
> + device_lock(dev);
> + if (device_is_bound(dev))
> + ret = 1;
> + device_unlock(dev);
> + return ret;
> +}

This locks only one device, so without lock-conversion there could be a
driver probe after the device_unlock(), while we are probing the other
devices of the group.

> [SNIP]
> 
> + /*
> +  * Check if any device in the group still has a driver binded to it.
> +  * This might race with device driver probing code and unfortunately
> +  * there is no clean way out of that either, locking all devices in the
> +  * group and then do the re-attach will introduce a lock-inversion with
> +  * group->mutex - Joerg.
> +  */
> + if (iommu_group_for_each_dev(group, NULL, is_driver_bound)) {
> + pr_err("Active drivers exist for devices in the group\n");
> + return -EBUSY;
> + }

The lock inversion comes into the picture when this code is called from
device(-driver) core through the bus-notifiers. The notifiers are called
with the device already locked.

> Another question I have is.. if it's racy then it should be racy even
> for one device iommu groups.. right? Why would it be racy only with
> multiple devices iommu group?

Valid point. So the device needs to be locked _while_ the default domain
change happens. If triggered by sysfs there should be no locking
problems, I guess. But you better try it out.

Regards,

Joerg


Re: [PATCH v4 5/5] drm/sun4i: mixer: Call of_dma_configure if there's an IOMMU

2020-05-15 Thread Paul Kocialkowski
Hi,

On Wed 13 May 20, 16:07, Maxime Ripard wrote:
> The main DRM device is actually a virtual device so it doesn't have the
> iommus property, which is instead on the DMA masters, in this case the
> mixers.
> 
> Add a call to of_dma_configure with the mixers DT node but on the DRM
> virtual device to configure it in the same way than the mixers.

Although I'm not very familiar with the DMA API, this looks legit to me and
matches what's already done in sun4i_backend for the interconnect. So:

Reviewed-by: Paul Kocialkowski 

Cheers,

Paul

> Signed-off-by: Maxime Ripard 
> ---
>  drivers/gpu/drm/sun4i/sun8i_mixer.c | 13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/drivers/gpu/drm/sun4i/sun8i_mixer.c 
> b/drivers/gpu/drm/sun4i/sun8i_mixer.c
> index 56cc037fd312..cc4fb916318f 100644
> --- a/drivers/gpu/drm/sun4i/sun8i_mixer.c
> +++ b/drivers/gpu/drm/sun4i/sun8i_mixer.c
> @@ -363,6 +363,19 @@ static int sun8i_mixer_bind(struct device *dev, struct 
> device *master,
>   mixer->engine.ops = _engine_ops;
>   mixer->engine.node = dev->of_node;
>  
> + if (of_find_property(dev->of_node, "iommus", NULL)) {
> + /*
> +  * This assume we have the same DMA constraints for
> +  * all our the mixers in our pipeline. This sounds
> +  * bad, but it has always been the case for us, and
> +  * DRM doesn't do per-device allocation either, so we
> +  * would need to fix DRM first...
> +  */
> + ret = of_dma_configure(drm->dev, dev->of_node, true);
> + if (ret)
> + return ret;
> + }
> +
>   /*
>* While this function can fail, we shouldn't do anything
>* if this happens. Some early DE2 DT entries don't provide
> -- 
> git-series 0.9.1
> 

-- 
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com



[PATCH] iommu: Implement deferred domain attachment

2020-05-15 Thread Joerg Roedel
From: Joerg Roedel 

The IOMMU core code has support for deferring the attachment of a domain
to a device. This is needed in kdump kernels where the new domain must
not be attached to a device before the device driver takes it over.

But this needs support from the dma-ops code too, to actually do the
late attachment when there are DMA-API calls for the device. This got
lost in the AMD IOMMU driver after converting it to the dma-iommu code.

Do the late attachment in the dma-iommu code-path to fix the issue.

Cc: Jerry Snitselaar 
Cc: Tom Murphy 
Reported-by: Jerry Snitselaar 
Tested-by: Jerry Snitselaar 
Fixes: be62dbf554c5 ("iommu/amd: Convert AMD iommu driver to the dma-iommu api")
Signed-off-by: Joerg Roedel 
---
 drivers/iommu/iommu.c | 33 +++--
 1 file changed, 27 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 4050569188be..f54ebb964271 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1889,13 +1889,19 @@ void iommu_domain_free(struct iommu_domain *domain)
 }
 EXPORT_SYMBOL_GPL(iommu_domain_free);
 
-static int __iommu_attach_device(struct iommu_domain *domain,
-struct device *dev)
+static bool __iommu_is_attach_deferred(struct iommu_domain *domain,
+  struct device *dev)
+{
+   if (!domain->ops->is_attach_deferred)
+   return false;
+
+   return domain->ops->is_attach_deferred(domain, dev);
+}
+
+static int __iommu_attach_device_no_defer(struct iommu_domain *domain,
+ struct device *dev)
 {
int ret;
-   if ((domain->ops->is_attach_deferred != NULL) &&
-   domain->ops->is_attach_deferred(domain, dev))
-   return 0;
 
if (unlikely(domain->ops->attach_dev == NULL))
return -ENODEV;
@@ -1903,9 +1909,19 @@ static int __iommu_attach_device(struct iommu_domain 
*domain,
ret = domain->ops->attach_dev(domain, dev);
if (!ret)
trace_attach_device_to_domain(dev);
+
return ret;
 }
 
+static int __iommu_attach_device(struct iommu_domain *domain,
+struct device *dev)
+{
+   if (__iommu_is_attach_deferred(domain, dev))
+   return 0;
+
+   return __iommu_attach_device_no_defer(domain, dev);
+}
+
 int iommu_attach_device(struct iommu_domain *domain, struct device *dev)
 {
struct iommu_group *group;
@@ -2023,7 +2039,12 @@ EXPORT_SYMBOL_GPL(iommu_get_domain_for_dev);
  */
 struct iommu_domain *iommu_get_dma_domain(struct device *dev)
 {
-   return dev->iommu_group->default_domain;
+   struct iommu_domain *domain = dev->iommu_group->default_domain;
+
+   if (__iommu_is_attach_deferred(domain, dev))
+   __iommu_attach_device_no_defer(domain, dev);
+
+   return domain;
 }
 
 /*
-- 
2.25.1



Constantly map and unmap of streaming DMA buffers with IOMMU backend might cause serious performance problem

2020-05-15 Thread Song Bao Hua
Hi Russell & All,

In many streaming DMA map/unmap use cases, lower-layer device drivers have
no idea how and when single/sg buffers are allocated and freed by the
upper-layer filesystem, network protocol, memory management code etc. So the
only thing device drivers can do is map the buffer before each DMA transfer
begins and unmap it when the transfer is done.

This dramatically increases the latency of dma_map_single/sg and
dma_unmap_single/sg when these APIs are backed by an IOMMU: for each map, the
iommu driver needs to allocate an iova and install the mapping in the IOMMU,
and for each unmap it needs to free the iova and tear the mapping down in the
iommu hardware. When the devices performing DMA are very fast, for example on
100GbE networks, the streaming map/unmap latency can become a critical system
bottleneck.

In comparison to DMA streaming APIs, DMA consistent APIs using IOMMU backend 
may show much better performance as the map is done when the buffer is 
allocated and unmap is done when the buffer is freed. DMA can be done multiple 
times before the buffers are freed by dma_free_coherent(). There is no such map 
and unmap overhead for each separate DMA transfer as streaming APIs. The 
typical work flow is like
dma_alloc_coherent-> 
doing DMA -> 
doing DMA ->
doing DMA ->
 /* DMA many times */
dma_free_coherent

However, the typical work flow for streaming DMA is like
dma_map_sg -> doing DMA -> dma_unmap_sg -> 
dma_map_sg -> doing DMA -> dma_unmap_sg ->  
dma_map_sg -> doing DMA -> dma_unmap_sg ->  
 /* map, DMA transfer, unmap many times */

Even though upper-layer software might use the same buffers multiple times, for 
each single DMA transmission, map and unmap still need to be done by 
lower-level drivers as lower-layer drivers don't know this fact.

A possible routine to improve the performance of stream APIs is like:
dma_map_sg -> 
dma_sync_sg_for_device -> doing DMA -> 
dma_sync_sg_for_device -> doing DMA -> 
dma_sync_sg_for_device -> doing DMA -> 
... ->/* sync between DMA and CPU many times */
dma_unmap_sg

For every single DMA, software only needs to do sync operations, which are much
lighter than map and unmap. But this case is often not applicable to device
drivers as the buffers usually come from the upper-layer filesystem, network 
protocol, mm management system etc. Device drivers have to work with the 
assumption that the buffer will be freed immediately after DMA is done. 
However, for those device drivers which are able to allocate and free the DMA 
stream buffers by themselves, they will get benefits of reusing the same 
buffers for doing DMA multiple times without map/unmap overhead.
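
A short sketch of that reuse pattern with the existing streaming API (names
are illustrative; note that the sync and unmap calls take the original nents
passed to dma_map_sg(), not its return value):

static int run_many_jobs(struct device *dev, struct sg_table *sgt, int jobs)
{
	int i;

	/* Map once, up front. */
	if (!dma_map_sg(dev, sgt->sgl, sgt->orig_nents, DMA_BIDIRECTIONAL))
		return -ENOMEM;

	for (i = 0; i < jobs; i++) {
		dma_sync_sg_for_device(dev, sgt->sgl, sgt->orig_nents,
				       DMA_BIDIRECTIONAL);
		/* ... submit one job to the device and wait for it ... */
		dma_sync_sg_for_cpu(dev, sgt->sgl, sgt->orig_nents,
				    DMA_BIDIRECTIONAL);
	}

	/* Unmap once, at the end of the buffer's life. */
	dma_unmap_sg(dev, sgt->sgl, sgt->orig_nents, DMA_BIDIRECTIONAL);
	return 0;
}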

I collected some latency data for iommu_dma_map_sg and iommu_dma_unmap_sg. In 
the test case, zswap is calling acomp APIs to compress/decompress pages, and 
comp/decomp is done by lower-level hardware ZIP driver.
root@ubuntu:/usr/share/bcc/tools# ./funclatency iommu_dma_map_sg
Tracing 1 functions for "iommu_dma_map_sg"... Hit Ctrl-C to end.
^C
 nsecs   : count distribution
 0 -> 1  : 0||
 2 -> 3  : 0||
 4 -> 7  : 0||
 8 -> 15 : 0||
16 -> 31 : 0||
32 -> 63 : 0||
64 -> 127: 0||
   128 -> 255: 0||
   256 -> 511: 0||
   512 -> 1023   : 0||
  1024 -> 2047   : 2274570  |*** |
  2048 -> 4095   : 3896310  ||
  4096 -> 8191   : 74499||
  8192 -> 16383  : 4475 ||
 16384 -> 32767  : 1519 ||
 32768 -> 65535  : 480  ||
 65536 -> 131071 : 286  ||
131072 -> 262143 : 18   ||
262144 -> 524287 : 2||

root@ubuntu:/usr/share/bcc/tools# ./funclatency iommu_dma_unmap_sg
Tracing 1 functions for "iommu_dma_unmap_sg"... Hit Ctrl-C to end.
^C
 nsecs   : count distribution
 0 -> 1  : 0||
 2 -> 3  : 0||
 4 -> 7  : 0|  

Re: [PATCH v6] iommu/arm-smmu-qcom: Request direct mapping for modem device

2020-05-15 Thread Sibi Sankar

Hey Will,

On 2020-05-11 23:25, Sibi Sankar wrote:

The modem remote processor has two access paths to DDR. One path is
directly connected to DDR and another path goes through an SMMU. The
SMMU path is configured to be a direct mapping because it's used by
various peripherals in the modem subsystem. Typically this direct
mapping is configured statically at EL2 by QHEE (Qualcomm's Hypervisor
Execution Environment) before the kernel is entered.

In certain firmware configurations, especially when the kernel is already
in full control of the SMMU, defer programming the modem SIDs to the
kernel. Let's add compatibles here so that we can have the kernel
program the SIDs for the modem in these cases.

Signed-off-by: Sibi Sankar 
---


Now that the patch is reworded, can
you please pick it up, since it's the
only pending patch from the series?



V6
 * Rebased on Will's for-joerg/arm-smmu/updates
 * Reword commit message and add more details [Stephen]

 drivers/iommu/arm-smmu-qcom.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/arm-smmu-qcom.c 
b/drivers/iommu/arm-smmu-qcom.c

index 5bedf21587a56..cf01d0215a397 100644
--- a/drivers/iommu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm-smmu-qcom.c
@@ -17,7 +17,9 @@ static const struct of_device_id
qcom_smmu_client_of_match[] = {
{ .compatible = "qcom,mdp4" },
{ .compatible = "qcom,mdss" },
{ .compatible = "qcom,sc7180-mdss" },
+   { .compatible = "qcom,sc7180-mss-pil" },
{ .compatible = "qcom,sdm845-mdss" },
+   { .compatible = "qcom,sdm845-mss-pil" },
{ }
 };


--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project.


[PATCH] iommu/mediatek-v1: Add def_domain_type

2020-05-15 Thread Yong Wu
The MediaTek V1 IOMMU is an arm32 IOMMU whose default domain type is
IOMMU_DOMAIN_UNMANAGED. Add this so that bus_iommu_probe can
enter "probe_finalize".

The iommu framework will create an iommu domain for each device.
But all the devices share one iommu domain here, thus we skip all the
other domains in "attach_device" except the domain we create
internally with arm_iommu_create_mapping.

Also a minor change: in attach_device, "data" is never NULL, so
remove "if (!data) return".

Signed-off-by: Yong Wu 
---
a. rebase on linux-next.
b. After this patch, and with the mutex issue fixed (locally I only move
   mutex_unlock(&group->mutex) before __iommu_group_dma_attach(group)),
   mtk_iommu_v1.c works normally.
---
 drivers/iommu/mtk_iommu_v1.c | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
index 7bdd74c..f353b07 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -265,10 +265,13 @@ static int mtk_iommu_attach_device(struct iommu_domain 
*domain,
 {
struct mtk_iommu_data *data = dev_iommu_priv_get(dev);
struct mtk_iommu_domain *dom = to_mtk_domain(domain);
+   struct dma_iommu_mapping *mtk_mapping;
int ret;
 
-   if (!data)
-   return -ENODEV;
+   /* Only allow the domain created internally. */
+   mtk_mapping = data->dev->archdata.iommu;
+   if (mtk_mapping->domain != domain)
+   return 0;
 
if (!data->m4u_dom) {
data->m4u_dom = dom;
@@ -288,9 +291,6 @@ static void mtk_iommu_detach_device(struct iommu_domain 
*domain,
 {
struct mtk_iommu_data *data = dev_iommu_priv_get(dev);
 
-   if (!data)
-   return;
-
mtk_iommu_config(data, dev, false);
 }
 
@@ -416,6 +416,11 @@ static int mtk_iommu_create_mapping(struct device *dev,
return 0;
 }
 
+static int mtk_iommu_def_domain_type(struct device *dev)
+{
+   return IOMMU_DOMAIN_UNMANAGED;
+}
+
 static struct iommu_device *mtk_iommu_probe_device(struct device *dev)
 {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
@@ -525,6 +530,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data 
*data)
.probe_device   = mtk_iommu_probe_device,
.probe_finalize = mtk_iommu_probe_finalize,
.release_device = mtk_iommu_release_device,
+   .def_domain_type = mtk_iommu_def_domain_type,
.device_group   = generic_device_group,
.pgsize_bitmap  = ~0UL << MT2701_IOMMU_PAGE_SHIFT,
 };
-- 
1.9.1


Re: [PATCH v2 23/33] iommu/mediatek-v1 Convert to probe/release_device() call-backs

2020-05-15 Thread Yong Wu
On Tue, 2020-04-14 at 15:15 +0200, Joerg Roedel wrote:
> From: Joerg Roedel 
> 
> Convert the Mediatek-v1 IOMMU driver to use the probe_device() and
> release_device() call-backs of iommu_ops, so that the iommu core code
> does the group and sysfs setup.
> 
> Signed-off-by: Joerg Roedel 
> ---
>  drivers/iommu/mtk_iommu_v1.c | 50 +++-
>  1 file changed, 20 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
> index a31be05601c9..7bdd74c7cb9f 100644
> --- a/drivers/iommu/mtk_iommu_v1.c
> +++ b/drivers/iommu/mtk_iommu_v1.c
> @@ -416,14 +416,12 @@ static int mtk_iommu_create_mapping(struct device *dev,
>   return 0;
>  }
>  
> -static int mtk_iommu_add_device(struct device *dev)
> +static struct iommu_device *mtk_iommu_probe_device(struct device *dev)
>  {
>   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> - struct dma_iommu_mapping *mtk_mapping;
>   struct of_phandle_args iommu_spec;
>   struct of_phandle_iterator it;
>   struct mtk_iommu_data *data;
> - struct iommu_group *group;
>   int err;
>  
>   of_for_each_phandle(, err, dev->of_node, "iommus",
> @@ -442,35 +440,28 @@ static int mtk_iommu_add_device(struct device *dev)
>   }
>  
>   if (!fwspec || fwspec->ops != _iommu_ops)
> - return -ENODEV; /* Not a iommu client device */
> + return ERR_PTR(-ENODEV); /* Not a iommu client device */
>  
> - /*
> -  * This is a short-term bodge because the ARM DMA code doesn't
> -  * understand multi-device groups, but we have to call into it
> -  * successfully (and not just rely on a normal IOMMU API attach
> -  * here) in order to set the correct DMA API ops on @dev.
> -  */
> - group = iommu_group_alloc();
> - if (IS_ERR(group))
> - return PTR_ERR(group);
> + data = dev_iommu_priv_get(dev);
>  
> - err = iommu_group_add_device(group, dev);
> - iommu_group_put(group);
> - if (err)
> - return err;
> + return >iommu;
> +}
>  
> - data = dev_iommu_priv_get(dev);
> +static void mtk_iommu_probe_finalize(struct device *dev)
> +{
> + struct dma_iommu_mapping *mtk_mapping;
> + struct mtk_iommu_data *data;
> + int err;
> +
> + data= dev_iommu_priv_get(dev);
>   mtk_mapping = data->dev->archdata.iommu;
> - err = arm_iommu_attach_device(dev, mtk_mapping);
> - if (err) {
> - iommu_group_remove_device(dev);
> - return err;
> - }
>  
> - return iommu_device_link(>iommu, dev);
> + err = arm_iommu_attach_device(dev, mtk_mapping);
> + if (err)
> + dev_err(dev, "Can't create IOMMU mapping - DMA-OPS will not 
> work\n");


Hi Joerg,

 Thanks very much for this patch.

 This arm_iommu_attach_device is called just as we expected.

 But it will fail in this call stack, as group->mutex ends up being
re-locked...

[] (iommu_attach_device) from []
(__arm_iommu_attach_device+0x34/0x90)
[] (__arm_iommu_attach_device) from []
(arm_iommu_attach_device+0xc/0x20)
[] (arm_iommu_attach_device) from []
(mtk_iommu_probe_finalize+0x34/0x50)
[] (mtk_iommu_probe_finalize) from []
(bus_iommu_probe+0x2a8/0x2c4)
[] (bus_iommu_probe) from [] (bus_set_iommu
+0x88/0xd4)
[] (bus_set_iommu) from [] (mtk_iommu_probe
+0x2f8/0x364)


>  }
>  
> -static void mtk_iommu_remove_device(struct device *dev)
> +static void mtk_iommu_release_device(struct device *dev)
>  {
>   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
>   struct mtk_iommu_data *data;
> @@ -479,9 +470,6 @@ static void mtk_iommu_remove_device(struct device *dev)
>   return;
>  
>   data = dev_iommu_priv_get(dev);
> - iommu_device_unlink(>iommu, dev);
> -
> - iommu_group_remove_device(dev);
>   iommu_fwspec_free(dev);
>  }
>  
> @@ -534,8 +522,10 @@ static const struct iommu_ops mtk_iommu_ops = {
>   .map= mtk_iommu_map,
>   .unmap  = mtk_iommu_unmap,
>   .iova_to_phys   = mtk_iommu_iova_to_phys,
> - .add_device = mtk_iommu_add_device,
> - .remove_device  = mtk_iommu_remove_device,
> + .probe_device   = mtk_iommu_probe_device,
> + .probe_finalize = mtk_iommu_probe_finalize,
> + .release_device = mtk_iommu_release_device,
> + .device_group   = generic_device_group,
>   .pgsize_bitmap  = ~0UL << MT2701_IOMMU_PAGE_SHIFT,
>  };
>  



(a design open) RE: [PATCH v1 6/8] vfio/type1: Bind guest page tables to host

2020-05-15 Thread Tian, Kevin
Hi, Alex,

When working on an updated version Yi and I found an design open
which needs your guidance.

In concept nested translation can be incarnated as one GPA->HPA page
table and multiple GVA->GPA page tables per VM. It means one container
is sufficient to include all SVA-capable devices assigned to the same guest,
as there is just one iova space (GPA->HPA) to be managed by vfio iommu
map/unmap api. GVA->GPA page tables are bound through the new api
introduced in this patch. It is different from legacy shadow translation 
which merges GIOVA->GPA->HPA into GIOVA->HPA thus each device requires 
its own iova space and must be in a separate container.

However supporting multiple SVA-capable devices in one container 
imposes one challenge. Now the bind_guest_pgtbl is implemented as
iommu type1 api. From the definition of iommu type 1 any operation
should be applied to all devices within the same container, just like
dma map/unmap. However this philosophy is incorrect regarding to
page table binding. We must follow the guest binding requirements 
between its page tables and assigned devices, otherwise every bound
address space is suddenly accessible by all assigned devices thus creating
security holes.

Do you think whether it's possible to extend iommu api to accept
device specific cmd? If not, moving it to vfio-pci or vfio-mdev sounds
also problematic, as PASID and page tables are IOMMU things which
are not touched by vfio device drivers today.

Alternatively can we impose the limitation that nesting APIs can be
only enabled for singleton containers which contains only one device?
This basically falls back to the state of legacy shadow translation 
vIOMMU. and our current Qemu vIOMMU implementation actually 
does this way regardless of whether nesting is used. Do you think 
whether such tradeoff is acceptable as a starting point?

Thanks
Kevin

> -Original Message-
> From: Liu, Yi L 
> Sent: Sunday, March 22, 2020 8:32 PM
> To: alex.william...@redhat.com; eric.au...@redhat.com
> Cc: Tian, Kevin ; jacob.jun@linux.intel.com;
> j...@8bytes.org; Raj, Ashok ; Liu, Yi L
> ; Tian, Jun J ; Sun, Yi Y
> ; jean-phili...@linaro.org; pet...@redhat.com;
> iommu@lists.linux-foundation.org; k...@vger.kernel.org; linux-
> ker...@vger.kernel.org; Wu, Hao 
> Subject: [PATCH v1 6/8] vfio/type1: Bind guest page tables to host
> 
> From: Liu Yi L 
> 
> VFIO_TYPE1_NESTING_IOMMU is an IOMMU type which is backed by
> hardware
> IOMMUs that have nesting DMA translation (a.k.a dual stage address
> translation). For such hardware IOMMUs, there are two stages/levels of
> address translation, and software may let userspace/VM to own the first-
> level/stage-1 translation structures. Example of such usage is vSVA (
> virtual Shared Virtual Addressing). VM owns the first-level/stage-1
> translation structures and bind the structures to host, then hardware
> IOMMU would utilize nesting translation when doing DMA translation fo
> the devices behind such hardware IOMMU.
> 
> This patch adds vfio support for binding guest translation (a.k.a stage 1)
> structure to host iommu. And for VFIO_TYPE1_NESTING_IOMMU, not only
> bind
> guest page table is needed, it also requires to expose interface to guest
> for iommu cache invalidation when guest modified the first-level/stage-1
> translation structures since hardware needs to be notified to flush stale
> iotlbs. This would be introduced in next patch.
> 
> In this patch, guest page table bind and unbind are done by using flags
> VFIO_IOMMU_BIND_GUEST_PGTBL and
> VFIO_IOMMU_UNBIND_GUEST_PGTBL under IOCTL
> VFIO_IOMMU_BIND, the bind/unbind data are conveyed by
> struct iommu_gpasid_bind_data. Before binding guest page table to host,
> VM should have got a PASID allocated by host via
> VFIO_IOMMU_PASID_REQUEST.
> 
> Bind guest translation structures (here is guest page table) to host
> are the first step to setup vSVA (Virtual Shared Virtual Addressing).
> 
> Cc: Kevin Tian 
> CC: Jacob Pan 
> Cc: Alex Williamson 
> Cc: Eric Auger 
> Cc: Jean-Philippe Brucker 
> Signed-off-by: Jean-Philippe Brucker 
> Signed-off-by: Liu Yi L 
> Signed-off-by: Jacob Pan 
> ---
>  drivers/vfio/vfio_iommu_type1.c | 158
> 
>  include/uapi/linux/vfio.h   |  46 
>  2 files changed, 204 insertions(+)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c
> b/drivers/vfio/vfio_iommu_type1.c
> index 82a9e0b..a877747 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -130,6 +130,33 @@ struct vfio_regions {
>  #define IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)  \
>   (!list_empty(>domain_list))
> 
> +struct domain_capsule {
> + struct iommu_domain *domain;
> + void *data;
> +};
> +
> +/* iommu->lock must be held */
> +static int vfio_iommu_for_each_dev(struct vfio_iommu *iommu,
> +   int (*fn)(struct device *dev, void *data),
> +   void *data)
>