Re: [PATCH] iommu: Relax ACS requirement for RCiEP devices.

2020-05-04 Thread Raj, Ashok
Hi Alex

+ Joerg, accidentally missed in the Cc.

On Mon, May 04, 2020 at 11:19:36PM -0600, Alex Williamson wrote:
> On Mon,  4 May 2020 21:42:16 -0700
> Ashok Raj  wrote:
> 
> > The PCIe spec permits relaxing the ACS requirement for RCiEP devices.
> > 
> > PCIe 5.0 Specification.
> > 6.12 Access Control Services (ACS)
> > Implementation of ACS in RCiEPs is permitted but not required. It is
> > explicitly permitted that, within a single Root Complex, some RCiEPs
> > implement ACS and some do not. It is strongly recommended that Root Complex
> > implementations ensure that all accesses originating from RCiEPs
> > (PFs and VFs) without ACS capability are first subjected to processing by
> > the Translation Agent (TA) in the Root Complex before further decoding and
> > processing. The details of such Root Complex handling are outside the scope
> > of this specification.
> > 
> > Since Linux does not give special treatment to this exception, certain
> > RCiEP MFD devices get grouped into a single IOMMU group. This prevents
> > assigning an individual device to a guest, for instance.
> > 
> > In one vendor system, the 14.x devices were grouped into a single IOMMU group.
> > 
> > /sys/kernel/iommu_groups/5/devices/:00:14.0
> > /sys/kernel/iommu_groups/5/devices/:00:14.2
> > /sys/kernel/iommu_groups/5/devices/:00:14.3
> > 
> > After the patch:
> > /sys/kernel/iommu_groups/5/devices/:00:14.0
> > /sys/kernel/iommu_groups/5/devices/:00:14.2
> > /sys/kernel/iommu_groups/6/devices/:00:14.3 <<< new group
> > 
> > 14.0 and 14.2 are integrated devices, but legacy endpoints, whereas 14.3
> > is a PCIe-compliant RCiEP.
> > 
> > 00:14.3 Network controller: Intel Corporation Device 9df0 (rev 30)
> > Capabilities: [40] Express (v2) Root Complex Integrated Endpoint, MSI 00
> > 
> > This permits assigning this device to a guest VM.
> > 
> > Fixes: f096c061f552 ("iommu: Rework iommu_group_get_for_pci_dev()")
> > Signed-off-by: Ashok Raj 
> > To: Joerg Roedel 
> > To: Bjorn Helgaas 
> > Cc: linux-ker...@vger.kernel.org
> > Cc: iommu@lists.linux-foundation.org
> > Cc: Lu Baolu 
> > Cc: Alex Williamson 
> > Cc: Darrel Goeddel 
> > Cc: Mark Scott ,
> > Cc: Romil Sharma 
> > Cc: Ashok Raj 
> > ---
> >  drivers/iommu/iommu.c | 15 ++-
> >  1 file changed, 14 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > index 2b471419e26c..5744bd65f3e2 100644
> > --- a/drivers/iommu/iommu.c
> > +++ b/drivers/iommu/iommu.c
> > @@ -1187,7 +1187,20 @@ static struct iommu_group 
> > *get_pci_function_alias_group(struct pci_dev *pdev,
> > struct pci_dev *tmp = NULL;
> > struct iommu_group *group;
> >  
> > -   if (!pdev->multifunction || pci_acs_enabled(pdev, REQ_ACS_FLAGS))
> > +   /*
> > +* PCI Spec 5.0, Section 6.12 Access Control Service
> > +* Implementation of ACS in RCiEPs is permitted but not required.
> > +* It is explicitly permitted that, within a single Root
> > +* Complex, some RCiEPs implement ACS and some do not. It is
> > +* strongly recommended that Root Complex implementations ensure
> > +* that all accesses originating from RCiEPs (PFs and VFs) without
> > +* ACS capability are first subjected to processing by the Translation
> > +* Agent (TA) in the Root Complex before further decoding and
> > +* processing.
> > +*/
> 
> Is the language here really strong enough to make this change?  ACS is
> an optional feature, so being permitted but not required is rather
> meaningless.  The spec is also specifically avoiding the words "must"
> or "shall" and even when emphasized with "strongly", we still only have
> a recommendation that may or may not be honored.  This seems like a
> weak basis for assuming that RCiEPs universally honor this
> recommendation.  Thanks,
> 

We are speaking about the PCIe spec, which people write about 5 years ahead,
and every vendor tries to massage their product behavior with vague
words like this..  :)

But honestly, for any RCiEP, or even integrated endpoints, there
is no way to route their requests except up north. These aren't behind an RP.

I did check with a couple of folks who are part of the SIG, and they seem to
agree that ACS treatment for RCiEPs doesn't mean much.

I understand the language isn't strong, but it doesn't seem like ACS should
be a hard requirement for RCiEPs, and it seems reasonable to relax it.

What are your thoughts? 

Cheers,
Ashok
> 
> > +   if (!pdev->multifunction ||
> > +   (pci_pcie_type(pdev) == PCI_EXP_TYPE_RC_END) ||
> > +pci_acs_enabled(pdev, REQ_ACS_FLAGS))
> > return NULL;
> >  
> > for_each_pci_dev(tmp) {
> 


Re: [PATCH] iommu: Relax ACS requirement for RCiEP devices.

2020-05-04 Thread Alex Williamson
On Mon,  4 May 2020 21:42:16 -0700
Ashok Raj  wrote:

> The PCIe spec permits relaxing the ACS requirement for RCiEP devices.
> 
> PCIe 5.0 Specification.
> 6.12 Access Control Services (ACS)
> Implementation of ACS in RCiEPs is permitted but not required. It is
> explicitly permitted that, within a single Root Complex, some RCiEPs
> implement ACS and some do not. It is strongly recommended that Root Complex
> implementations ensure that all accesses originating from RCiEPs
> (PFs and VFs) without ACS capability are first subjected to processing by
> the Translation Agent (TA) in the Root Complex before further decoding and
> processing. The details of such Root Complex handling are outside the scope
> of this specification.
> 
> Since Linux does not give special treatment to this exception, certain
> RCiEP MFD devices get grouped into a single IOMMU group. This prevents
> assigning an individual device to a guest, for instance.
> 
> In one vendor system, the 14.x devices were grouped into a single IOMMU group.
> 
> /sys/kernel/iommu_groups/5/devices/:00:14.0
> /sys/kernel/iommu_groups/5/devices/:00:14.2
> /sys/kernel/iommu_groups/5/devices/:00:14.3
> 
> After the patch:
> /sys/kernel/iommu_groups/5/devices/:00:14.0
> /sys/kernel/iommu_groups/5/devices/:00:14.2
> /sys/kernel/iommu_groups/6/devices/:00:14.3 <<< new group
> 
> 14.0 and 14.2 are integrated devices, but legacy endpoints, whereas 14.3
> is a PCIe-compliant RCiEP.
> 
> 00:14.3 Network controller: Intel Corporation Device 9df0 (rev 30)
> Capabilities: [40] Express (v2) Root Complex Integrated Endpoint, MSI 00
> 
> This permits assigning this device to a guest VM.
> 
> Fixes: f096c061f552 ("iommu: Rework iommu_group_get_for_pci_dev()")
> Signed-off-by: Ashok Raj 
> To: Joerg Roedel 
> To: Bjorn Helgaas 
> Cc: linux-ker...@vger.kernel.org
> Cc: iommu@lists.linux-foundation.org
> Cc: Lu Baolu 
> Cc: Alex Williamson 
> Cc: Darrel Goeddel 
> Cc: Mark Scott ,
> Cc: Romil Sharma 
> Cc: Ashok Raj 
> ---
>  drivers/iommu/iommu.c | 15 ++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 2b471419e26c..5744bd65f3e2 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1187,7 +1187,20 @@ static struct iommu_group 
> *get_pci_function_alias_group(struct pci_dev *pdev,
>   struct pci_dev *tmp = NULL;
>   struct iommu_group *group;
>  
> - if (!pdev->multifunction || pci_acs_enabled(pdev, REQ_ACS_FLAGS))
> + /*
> +  * PCI Spec 5.0, Section 6.12 Access Control Service
> +  * Implementation of ACS in RCiEPs is permitted but not required.
> +  * It is explicitly permitted that, within a single Root
> +  * Complex, some RCiEPs implement ACS and some do not. It is
> +  * strongly recommended that Root Complex implementations ensure
> +  * that all accesses originating from RCiEPs (PFs and VFs) without
> +  * ACS capability are first subjected to processing by the Translation
> +  * Agent (TA) in the Root Complex before further decoding and
> +  * processing.
> +  */

Is the language here really strong enough to make this change?  ACS is
an optional feature, so being permitted but not required is rather
meaningless.  The spec is also specifically avoiding the words "must"
or "shall" and even when emphasized with "strongly", we still only have
a recommendation that may or may not be honored.  This seems like a
weak basis for assuming that RCiEPs universally honor this
recommendation.  Thanks,

Alex

> + if (!pdev->multifunction ||
> + (pci_pcie_type(pdev) == PCI_EXP_TYPE_RC_END) ||
> +  pci_acs_enabled(pdev, REQ_ACS_FLAGS))
>   return NULL;
>  
>   for_each_pci_dev(tmp) {



[PATCH] iommu: Relax ACS requirement for RCiEP devices.

2020-05-04 Thread Ashok Raj
The PCIe spec permits relaxing the ACS requirement for RCiEP devices.

PCIe 5.0 Specification.
6.12 Access Control Services (ACS)
Implementation of ACS in RCiEPs is permitted but not required. It is
explicitly permitted that, within a single Root Complex, some RCiEPs
implement ACS and some do not. It is strongly recommended that Root Complex
implementations ensure that all accesses originating from RCiEPs
(PFs and VFs) without ACS capability are first subjected to processing by
the Translation Agent (TA) in the Root Complex before further decoding and
processing. The details of such Root Complex handling are outside the scope
of this specification.

Since Linux does not give special treatment to this exception, certain
RCiEP MFD devices get grouped into a single IOMMU group. This prevents
assigning an individual device to a guest, for instance.

In one vendor system, the 14.x devices were grouped into a single IOMMU group.

/sys/kernel/iommu_groups/5/devices/:00:14.0
/sys/kernel/iommu_groups/5/devices/:00:14.2
/sys/kernel/iommu_groups/5/devices/:00:14.3

After the patch:
/sys/kernel/iommu_groups/5/devices/:00:14.0
/sys/kernel/iommu_groups/5/devices/:00:14.2
/sys/kernel/iommu_groups/6/devices/:00:14.3 <<< new group

14.0 and 14.2 are integrated devices, but legacy endpoints, whereas 14.3
is a PCIe-compliant RCiEP.

00:14.3 Network controller: Intel Corporation Device 9df0 (rev 30)
Capabilities: [40] Express (v2) Root Complex Integrated Endpoint, MSI 00

This permits assigning this device to a guest VM.

Fixes: f096c061f552 ("iommu: Rework iommu_group_get_for_pci_dev()")
Signed-off-by: Ashok Raj 
To: Joerg Roedel 
To: Bjorn Helgaas 
Cc: linux-ker...@vger.kernel.org
Cc: iommu@lists.linux-foundation.org
Cc: Lu Baolu 
Cc: Alex Williamson 
Cc: Darrel Goeddel 
Cc: Mark Scott ,
Cc: Romil Sharma 
Cc: Ashok Raj 
---
 drivers/iommu/iommu.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 2b471419e26c..5744bd65f3e2 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1187,7 +1187,20 @@ static struct iommu_group 
*get_pci_function_alias_group(struct pci_dev *pdev,
struct pci_dev *tmp = NULL;
struct iommu_group *group;
 
-   if (!pdev->multifunction || pci_acs_enabled(pdev, REQ_ACS_FLAGS))
+   /*
+* PCI Spec 5.0, Section 6.12 Access Control Service
+* Implementation of ACS in RCiEPs is permitted but not required.
+* It is explicitly permitted that, within a single Root
+* Complex, some RCiEPs implement ACS and some do not. It is
+* strongly recommended that Root Complex implementations ensure
+* that all accesses originating from RCiEPs (PFs and VFs) without
+* ACS capability are first subjected to processing by the Translation
+* Agent (TA) in the Root Complex before further decoding and
+* processing.
+*/
+   if (!pdev->multifunction ||
+   (pci_pcie_type(pdev) == PCI_EXP_TYPE_RC_END) ||
+pci_acs_enabled(pdev, REQ_ACS_FLAGS))
return NULL;
 
for_each_pci_dev(tmp) {
-- 
2.7.4



Re: [PATCH 1/4] dma-mapping: move the remaining DMA API calls out of line

2020-05-04 Thread Alexey Kardashevskiy



On 17/04/2020 17:58, Christoph Hellwig wrote:
> On Wed, Apr 15, 2020 at 09:21:37PM +1000, Alexey Kardashevskiy wrote:
>> And the fact they were exported leaves possibility that there is a
>> driver somewhere relying on these symbols or distro kernel won't build
>> because the symbol disappeared from exports (I do not know what KABI
>> guarantees or if mainline kernel cares).
> 
> We absolutely do not care.  In fact for abuses of APIs that drivers
> should not use we almost care to make them private and break people
> abusing them.

ok :)

>> I do not care in particular but
>> some might, a line separated with empty lines in the commit log would do.
> 
> I'll add a blurb for the next version.


Has it gone anywhere? Thanks,


-- 
Alexey


Re: [PATCH v6 17/25] iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind()

2020-05-04 Thread Jacob Pan
On Mon, 4 May 2020 18:43:51 +0200
Jean-Philippe Brucker  wrote:

> On Thu, Apr 30, 2020 at 02:16:17PM -0700, Jacob Pan wrote:
> > > +static void arm_smmu_mm_invalidate_range(struct mmu_notifier *mn,
> > > +  struct mm_struct *mm,
> > > +  unsigned long start,
> > > unsigned long end) +{
> > > + /* TODO: invalidate ATS */
> > > +}
> > > +
> > > +static void arm_smmu_mm_release(struct mmu_notifier *mn, struct
> > > mm_struct *mm) +{
> > > + struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn);
> > > + struct arm_smmu_domain *smmu_domain;
> > > +
> > > + mutex_lock(&arm_smmu_sva_lock);
> > > + if (smmu_mn->cleared) {
> > > + mutex_unlock(&arm_smmu_sva_lock);
> > > + return;
> > > + }
> > > +
> > > + smmu_domain = smmu_mn->domain;
> > > +
> > > + /*
> > > +  * DMA may still be running. Keep the cd valid but
> > > disable
> > > +  * translation, so that new events will still result in
> > > stall.
> > > +  */  
> > Does "disable translation" also disable translated requests?  
> 
> No it doesn't disable translated requests, it only prevents the SMMU
> from accessing the pgd.
> 
OK, same as VT-d.

> > I guess
> > release is called after tlb invalidate range, so assuming no more
> > devTLB left to generate translated request?  
> 
> I'm counting on the invalidate below (here a TODO, implemented in next
> patch) to drop all devTLB entries. After that invalidate, the device:
> * issues a Translation Request, returns with R=W=0 because we disabled
>   translation (and it isn't present in the SMMU TLB).
> * issues a Page Request, returns with InvalidRequest because
>   mmget_not_zero() fails.
> 
Same flow. Thanks for the explanation.

> >   
> > > + arm_smmu_write_ctx_desc(smmu_domain, mm->pasid,
> > > &invalid_cd); +
> > > + arm_smmu_tlb_inv_asid(smmu_domain->smmu,
> > > smmu_mn->cd->asid);
> > > + /* TODO: invalidate ATS */
> > > +  
> > If mm release is called after tlb invalidate range, is it still
> > necessary to invalidate again?  
> 
> No, provided all mappings from the address space are unmapped and
> invalidated. I'll double check, but in my tests invalidate range
> didn't seem to be called for all mappings on mm exit, so I believe we
> do need this.
> 
I think it is safe to invalidate again. There was a concern that mm
release may remove the IOMMU driver from the notification list and miss a
TLB invalidate range. I had a hard time confirming that with ftrace while
killing a process; many events were lost.


> Thanks,
> Jean
> 

[Jacob Pan]


Re: [PATCH 5.6 61/73] iommu/vt-d: Use right Kconfig option name

2020-05-04 Thread Joe Perches
On Mon, 2020-05-04 at 19:58 +0200, Greg Kroah-Hartman wrote:
> From: Lu Baolu 
> 
> commit ba61c3da00f4a5bf8805aeca1ba5ac3c9bd82e96 upstream.
> 
> The CONFIG_ prefix should be added in the code.
> 
> Fixes: 046182525db61 ("iommu/vt-d: Add Kconfig option to enable/disable 
> scalable mode")
> Reported-and-tested-by: Kumar, Sanjay K 
> Signed-off-by: Lu Baolu 
> Cc: Ashok Raj 
> Link: 
> https://lore.kernel.org/r/20200501072427.14265-1-baolu...@linux.intel.com
> Signed-off-by: Joerg Roedel 
> Signed-off-by: Greg Kroah-Hartman 
> 
> ---
>  drivers/iommu/intel-iommu.c |4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -371,11 +371,11 @@ int dmar_disabled = 0;
>  int dmar_disabled = 1;
>  #endif /* CONFIG_INTEL_IOMMU_DEFAULT_ON */
>  
> -#ifdef INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON
> +#ifdef CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON
>  int intel_iommu_sm = 1;
>  #else
>  int intel_iommu_sm;
> -#endif /* INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON */
> +#endif /* CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON */

Perhaps simpler as

int intel_iommu_sm = IS_BUILTIN(CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON);

So perhaps:
---
 drivers/iommu/intel-iommu.c | 13 ++---
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 0182cff2c7ac..ab8552c48391 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -365,17 +365,8 @@ static int intel_iommu_attach_device(struct iommu_domain 
*domain,
 static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
dma_addr_t iova);
 
-#ifdef CONFIG_INTEL_IOMMU_DEFAULT_ON
-int dmar_disabled = 0;
-#else
-int dmar_disabled = 1;
-#endif /* CONFIG_INTEL_IOMMU_DEFAULT_ON */
-
-#ifdef CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON
-int intel_iommu_sm = 1;
-#else
-int intel_iommu_sm;
-#endif /* CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON */
+int dmar_disabled = !IS_BUILTIN(CONFIG_INTEL_IOMMU_DEFAULT_ON);
+int intel_iommu_sm = IS_BUILTIN(CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON);
 
 int intel_iommu_enabled = 0;
 EXPORT_SYMBOL_GPL(intel_iommu_enabled);



[RFC PATCH] drivers: iommu: reset cached node if dma_mask is changed

2020-05-04 Thread Ajay Kumar
The current IOVA allocation code stores a cached copy of the
first allocated IOVA address node, and all subsequent allocations
have no way to get past (higher than) the first allocated IOVA range.

This causes an issue when the dma_mask for the master device is changed.
Though the DMA window is increased, the allocation code, unaware of
the change, goes ahead allocating IOVA addresses lower than the
first allocated IOVA address.

This patch adds a check for a dma_mask change in the IOVA allocation
function and resets the cached IOVA node to the anchor node every time
a dma_mask change is observed.

NOTE:
 This patch is needed to address the issue discussed in below thread:
 https://www.spinics.net/lists/iommu/msg43586.html
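
For illustration, a hypothetical driver sequence (the names and sizes here
are made up for this example and are not part of this patch) that runs into
the problem described above:

	/* Illustrative only: streaming mappings around a dma_mask change */
	dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
	iova1 = dma_map_single(dev, buf1, SZ_4K, DMA_TO_DEVICE);
	/* the allocator now caches a node just below the 32-bit limit */

	dma_set_mask_and_coherent(dev, DMA_BIT_MASK(36));
	iova2 = dma_map_single(dev, buf2, SZ_4K, DMA_TO_DEVICE);
	/*
	 * Without this patch, the second allocation still starts from the
	 * cached node and lands below iova1, instead of using the wider
	 * window allowed by the new mask.
	 */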

Signed-off-by: Ajay Kumar 
Signed-off-by: Sathyam Panda 
---
 drivers/iommu/iova.c | 17 -
 include/linux/iova.h |  1 +
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 41c605b0058f..0e99975036ae 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -44,6 +44,7 @@ init_iova_domain(struct iova_domain *iovad, unsigned long 
granule,
iovad->granule = granule;
iovad->start_pfn = start_pfn;
iovad->dma_32bit_pfn = 1UL << (32 - iova_shift(iovad));
+   iovad->curr_limit_pfn = iovad->dma_32bit_pfn;
iovad->max32_alloc_size = iovad->dma_32bit_pfn;
iovad->flush_cb = NULL;
iovad->fq = NULL;
@@ -116,9 +117,20 @@ EXPORT_SYMBOL_GPL(init_iova_flush_queue);
 static struct rb_node *
 __get_cached_rbnode(struct iova_domain *iovad, unsigned long limit_pfn)
 {
-   if (limit_pfn <= iovad->dma_32bit_pfn)
+   if (limit_pfn <= iovad->dma_32bit_pfn) {
+   /* re-init cached node if DMA limit has changed */
+   if (limit_pfn != iovad->curr_limit_pfn) {
+   iovad->cached32_node = &iovad->anchor.node;
+   iovad->curr_limit_pfn = limit_pfn;
+   }
return iovad->cached32_node;
+   }
 
+   /* re-init cached node if DMA limit has changed */
+   if (limit_pfn != iovad->curr_limit_pfn) {
+   iovad->cached_node = &iovad->anchor.node;
+   iovad->curr_limit_pfn = limit_pfn;
+   }
return iovad->cached_node;
 }
 
@@ -190,6 +202,9 @@ static int __alloc_and_insert_iova_range(struct iova_domain 
*iovad,
if (size_aligned)
align_mask <<= fls_long(size - 1);
 
+   if (limit_pfn != iovad->curr_limit_pfn)
+   iovad->max32_alloc_size = iovad->dma_32bit_pfn;
+
/* Walk the tree backwards */
spin_lock_irqsave(&iovad->iova_rbtree_lock, flags);
if (limit_pfn <= iovad->dma_32bit_pfn &&
diff --git a/include/linux/iova.h b/include/linux/iova.h
index a0637abffee8..be2220c096ef 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -73,6 +73,7 @@ struct iova_domain {
unsigned long   granule;/* pfn granularity for this domain */
unsigned long   start_pfn;  /* Lower limit for this domain */
unsigned long   dma_32bit_pfn;
+   unsigned long   curr_limit_pfn; /* Current max limit for this domain */
unsigned long   max32_alloc_size; /* Size of last failed allocation */
struct iova_fq __percpu *fq;/* Flush Queue */
 
-- 
2.17.1



Re: [PATCH v6 17/25] iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind()

2020-05-04 Thread Jean-Philippe Brucker
On Thu, Apr 30, 2020 at 02:16:17PM -0700, Jacob Pan wrote:
> > +static void arm_smmu_mm_invalidate_range(struct mmu_notifier *mn,
> > +struct mm_struct *mm,
> > +unsigned long start,
> > unsigned long end) +{
> > +   /* TODO: invalidate ATS */
> > +}
> > +
> > +static void arm_smmu_mm_release(struct mmu_notifier *mn, struct
> > mm_struct *mm) +{
> > +   struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn);
> > +   struct arm_smmu_domain *smmu_domain;
> > +
> > +   mutex_lock(&arm_smmu_sva_lock);
> > +   if (smmu_mn->cleared) {
> > +   mutex_unlock(&arm_smmu_sva_lock);
> > +   return;
> > +   }
> > +
> > +   smmu_domain = smmu_mn->domain;
> > +
> > +   /*
> > +* DMA may still be running. Keep the cd valid but disable
> > +* translation, so that new events will still result in
> > stall.
> > +*/
> Does "disable translation" also disable translated requests?

No it doesn't disable translated requests, it only prevents the SMMU from
accessing the pgd.

> I guess
> release is called after tlb invalidate range, so assuming no more
> devTLB left to generate translated request?

I'm counting on the invalidate below (here a TODO, implemented in next
patch) to drop all devTLB entries. After that invalidate, the device:
* issues a Translation Request, returns with R=W=0 because we disabled
  translation (and it isn't present in the SMMU TLB).
* issues a Page Request, returns with InvalidRequest because
  mmget_not_zero() fails.

> 
> > +   arm_smmu_write_ctx_desc(smmu_domain, mm->pasid, &invalid_cd);
> > +
> > +   arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_mn->cd->asid);
> > +   /* TODO: invalidate ATS */
> > +
> If mm release is called after tlb invalidate range, is it still
> necessary to invalidate again?

No, provided all mappings from the address space are unmapped and
invalidated. I'll double check, but in my tests invalidate range didn't
seem to be called for all mappings on mm exit, so I believe we do need
this.

Thanks,
Jean



Re: [PATCH v6 19/25] iommu/arm-smmu-v3: Add support for Hardware Translation Table Update

2020-05-04 Thread Jean-Philippe Brucker
Hi,

On Mon, May 04, 2020 at 07:54:03PM +0530, Prabhakar Kushwaha wrote:
> Dear Jean,
> 
> On Thu, Apr 30, 2020 at 8:11 PM Jean-Philippe Brucker
>  wrote:
> >
> > If the SMMU supports it and the kernel was built with HTTU support, enable
> 
> is there any framework/config for HTTU which must be enabled to use this 
> patch?
> 
> 
> > We can enable HTTU even if CPUs don't support it, because the kernel
> > always checks for HW dirty bit and updates the PTE flags atomically.
> >
> I believe this statement is valid in the context of this patch-set only.
> 
> One cannot use this code to test HTTU because the existing
> io-pgtable-arm.c driver doesn't have the framework to leverage HTTU
> benefits. By default it sets AF=1 and does not set DBM.

Right, this patch only sets the hardware access and dirty flags for SVA
(page tables shared with the CPU through iommu_bind*()), it doesn't enable
anything for iommu_map/unmap(). Although I remember discussing it for VM
migration, I don't know of any effort to use hardware access/dirty bits
outside of SVA.

Thanks,
Jean


Re: [PATCH v6 01/25] mm: Add a PASID field to mm_struct

2020-05-04 Thread Jean-Philippe Brucker
On Mon, May 04, 2020 at 09:52:44AM +0800, Xu Zaibo wrote:
> 
> On 2020/4/30 22:34, Jean-Philippe Brucker wrote:
> > Some devices can tag their DMA requests with a 20-bit Process Address
> > Space ID (PASID), allowing them to access multiple address spaces. In
> > combination with recoverable I/O page faults (for example PCIe PRI),
> > PASID allows the IOMMU to share page tables with the MMU.
> > 
> > To make sure that a single PASID is allocated for each address space, as
> > required by Intel ENQCMD, store the PASID in the mm_struct. The IOMMU
> > driver is in charge of serializing modifications to the PASID field.
> > 
> > Signed-off-by: Jean-Philippe Brucker 
> > ---
> > For the field's validity I'm thinking invalid PASID = 0. In ioasid.h we
> > define INVALID_IOASID as ~0U, but I think we can now change it to 0,
> > since Intel is now also reserving PASID #0 for Transactions without
> > PASID and AMD IOMMU uses GIoV for this too.
> > ---
> >   include/linux/mm_types.h | 4 
> >   1 file changed, 4 insertions(+)
> > 
> > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> > index 4aba6c0c2ba80..8db6472758175 100644
> > --- a/include/linux/mm_types.h
> > +++ b/include/linux/mm_types.h
> > @@ -534,6 +534,10 @@ struct mm_struct {
> > atomic_long_t hugetlb_usage;
> >   #endif
> > struct work_struct async_put_work;
> > +#ifdef CONFIG_IOMMU_SUPPORT
> > +   /* Address space ID used by device DMA */
> > +   unsigned int pasid;
> > +#endif
> Maybe '#ifdef CONFIG_IOMMU_SVA ... #endif' is more reasonable?

CONFIG_IOMMU_SVA enables a few helpers but IOMMU drivers don't have to use
them, so I think IOMMU_SUPPORT is more appropriate.

Thanks,
Jean


Re: [PATCH v6 05/25] iommu/iopf: Handle mm faults

2020-05-04 Thread Jean-Philippe Brucker
On Sun, May 03, 2020 at 01:54:36PM +0800, Lu Baolu wrote:
> On 2020/4/30 22:34, Jean-Philippe Brucker wrote:
> > When a recoverable page fault is handled by the fault workqueue, find the
> > associated mm and call handle_mm_fault.
> > 
> > Signed-off-by: Jean-Philippe Brucker 
> > ---
> > v5->v6: select CONFIG_IOMMU_SVA
> > ---
> >   drivers/iommu/Kconfig  |  1 +
> >   drivers/iommu/io-pgfault.c | 79 +-
> >   2 files changed, 78 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> > index 4f33e489f0726..1e64ee6592e16 100644
> > --- a/drivers/iommu/Kconfig
> > +++ b/drivers/iommu/Kconfig
> > @@ -109,6 +109,7 @@ config IOMMU_SVA
> >   config IOMMU_PAGE_FAULT
> > bool
> > +   select IOMMU_SVA
> 
> It would be better to move this to the previous patch.
> 
[...]
> > @@ -104,6 +156,29 @@ static void iopf_handle_group(struct work_struct *work)
> >*
> >* Add a fault to the device workqueue, to be handled by mm.
> >*
> > + * This module doesn't handle PCI PASID Stop Marker; IOMMU drivers must 
> > discard
> > + * them before reporting faults. A PASID Stop Marker (LRW = 0b100) doesn't
> > + * expect a response. It may be generated when disabling a PASID (issuing a
> > + * PASID stop request) by some PCI devices.
> > + *
> > + * The PASID stop request is issued by the device driver before unbind(). 
> > Once
> > + * it completes, no page request is generated for this PASID anymore and
> > + * outstanding ones have been pushed to the IOMMU (as per PCIe 4.0r1.0 - 
> > 6.20.1
> > + * and 10.4.1.2 - Managing PASID TLP Prefix Usage). Some PCI devices will 
> > wait
> > + * for all outstanding page requests to come back with a response before
> > + * completing the PASID stop request. Others do not wait for page 
> > responses, and
> > + * instead issue this Stop Marker that tells us when the PASID can be
> > + * reallocated.
> > + *
> > + * It is safe to discard the Stop Marker because it is an optimization.
> > + * a. Page requests, which are posted requests, have been flushed to the 
> > IOMMU
> > + *when the stop request completes.
> > + * b. We flush all fault queues on unbind() before freeing the PASID.
> > + *
> > + * So even though the Stop Marker might be issued by the device *after* 
> > the stop
> > + * request completes, outstanding faults will have been dealt with by the 
> > time
> > + * we free the PASID.
> > + *
> >* Return: 0 on success and <0 on error.
> >*/
> >   int iommu_queue_iopf(struct iommu_fault *fault, void *cookie)
> > 
> 
> The same for the comments.

I think I'll squash both patches, probably doesn't make it harder to
review.

Thanks,
Jean



Re: [PATCH v6 04/25] iommu: Add a page fault handler

2020-05-04 Thread Jean-Philippe Brucker
On Sun, May 03, 2020 at 01:49:01PM +0800, Lu Baolu wrote:
> > +static void iopf_handle_group(struct work_struct *work)
> > +{
> > +   struct iopf_group *group;
> > +   struct iopf_fault *iopf, *next;
> > +   enum iommu_page_response_code status = IOMMU_PAGE_RESP_SUCCESS;
> > +
> > +   group = container_of(work, struct iopf_group, work);
> > +
> > +   list_for_each_entry_safe(iopf, next, &group->faults, head) {
> > +   /*
> > +* For the moment, errors are sticky: don't handle subsequent
> > +* faults in the group if there is an error.
> > +*/
> > +   if (status == IOMMU_PAGE_RESP_SUCCESS)
> > +   status = iopf_handle_single(iopf);
> > +
> > +   if (!(iopf->fault.prm.flags &
> > + IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE))
> > +   kfree(iopf);
> 
> The iopf is freed, but not removed from the list. This will cause a wild
> pointer in the code.

We free the list with the group below, so this one is fine.

> 
> > +   }
> > +
> > +   iopf_complete_group(group->dev, &group->last_fault, status);
> > +   kfree(group);
> > +}
> > +
> 
> [...]
> 
> > +/**
> > + * iopf_queue_flush_dev - Ensure that all queued faults have been processed
> > + * @dev: the endpoint whose faults need to be flushed.
> > + * @pasid: the PASID affected by this flush
> > + *
> > + * The IOMMU driver calls this before releasing a PASID, to ensure that all
> > + * pending faults for this PASID have been handled, and won't hit the 
> > address
> > + * space of the next process that uses this PASID. The driver must make 
> > sure
> > + * that no new fault is added to the queue. In particular it must flush its
> > + * low-level queue before calling this function.
> > + *
> > + * Return: 0 on success and <0 on error.
> > + */
> > +int iopf_queue_flush_dev(struct device *dev, int pasid)
> > +{
> > +   int ret = 0;
> > +   struct iopf_device_param *iopf_param;
> > +   struct dev_iommu *param = dev->iommu;
> > +
> > +   if (!param)
> > +   return -ENODEV;
> > +
> > +   mutex_lock(¶m->lock);
> > +   iopf_param = param->iopf_param;
> > +   if (iopf_param)
> > +   flush_workqueue(iopf_param->queue->wq);
> 
> There may be iopfs for other PASIDs in the workqueue. Flushing all tasks in
> the workqueue will hurt other PASIDs. I might be missing some context.

Granted this isn't optimal because we don't take the PASID argument into
account (I think I'll remove it, don't know how to use it). But I don't
think it affects other PASIDs, because all flush_workqueue() does is wait
until all faults currently in the workqueue are processed. So it only
blocks the current thread, but nothing is lost.

> 
> > +   else
> > +   ret = -ENODEV;
> > +   mutex_unlock(¶m->lock);
> > +
> > +   return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(iopf_queue_flush_dev);
> > +
> > +/**
> > + * iopf_queue_discard_partial - Remove all pending partial fault
> > + * @queue: the queue whose partial faults need to be discarded
> > + *
> > + * When the hardware queue overflows, last page faults in a group may have 
> > been
> > + * lost and the IOMMU driver calls this to discard all partial faults. The
> > + * driver shouldn't be adding new faults to this queue concurrently.
> > + *
> > + * Return: 0 on success and <0 on error.
> > + */
> > +int iopf_queue_discard_partial(struct iopf_queue *queue)
> > +{
> > +   struct iopf_fault *iopf, *next;
> > +   struct iopf_device_param *iopf_param;
> > +
> > +   if (!queue)
> > +   return -EINVAL;
> > +
> > +   mutex_lock(&queue->lock);
> > +   list_for_each_entry(iopf_param, &queue->devices, queue_list) {
> > +   list_for_each_entry_safe(iopf, next, &iopf_param->partial, head)
> > +   kfree(iopf);
> 
> iopf is freed but not removed from the list.

Ouch yes this is wrong, will fix.
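
Presumably something along these lines (just a sketch of what I have in
mind, not tested):

	list_for_each_entry_safe(iopf, next, &iopf_param->partial, head) {
		list_del(&iopf->head);
		kfree(iopf);
	}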

> 
> > +   }
> > +   mutex_unlock(&queue->lock);
> > +   return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(iopf_queue_discard_partial);
> > +
> > +/**
> > + * iopf_queue_add_device - Add producer to the fault queue
> > + * @queue: IOPF queue
> > + * @dev: device to add
> > + *
> > + * Return: 0 on success and <0 on error.
> > + */
> > +int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev)
> > +{
> > +   int ret = -EBUSY;
> > +   struct iopf_device_param *iopf_param;
> > +   struct dev_iommu *param = dev->iommu;
> > +
> > +   if (!param)
> > +   return -ENODEV;
> > +
> > +   iopf_param = kzalloc(sizeof(*iopf_param), GFP_KERNEL);
> > +   if (!iopf_param)
> > +   return -ENOMEM;
> > +
> > +   INIT_LIST_HEAD(&iopf_param->partial);
> > +   iopf_param->queue = queue;
> > +   iopf_param->dev = dev;
> > +
> > +   mutex_lock(&queue->lock);
> > +   mutex_lock(¶m->lock);
> > +   if (!param->iopf_param) {
> > +   list_add(&iopf_param->queue_list, &queue->devices);
> > +   param->iopf_param = iopf_param;
> > +   ret = 0;
> > +   }
> > +   mutex_unlock(¶m->lock);
> > +   mutex_unlock(&queue->lock);
> > +
> > +   if

Re: [PATCH v6 17/25] iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind()

2020-05-04 Thread Jean-Philippe Brucker
On Fri, May 01, 2020 at 09:55:13AM -0300, Jason Gunthorpe wrote:
> On Fri, May 01, 2020 at 05:15:52AM -0700, Christoph Hellwig wrote:
> > > @@ -432,6 +432,7 @@ config ARM_SMMU_V3
> > >   tristate "ARM Ltd. System MMU Version 3 (SMMUv3) Support"
> > >   depends on ARM64
> > >   select IOMMU_API
> > > + select IOMMU_SVA
> > >   select IOMMU_IO_PGTABLE_LPAE
> > >   select GENERIC_MSI_IRQ_DOMAIN
> > 
> > Doesn't this need to select MMU_NOTIFIER now?
> > 
> > > + struct mmu_notifier_ops mn_ops;
> > 
> > Note: not a pointer.
> > 
> > > + /* If bind() was already called for this (dev, mm) pair, reuse it. */
> > > + list_for_each_entry(bond, &master->bonds, list) {
> > > + if (bond->mm == mm) {
> > > + refcount_inc(&bond->refs);
> > > + return &bond->sva;
> > > + }
> > > + }
> 
> I also would like it if searching for mms in linked lists was not
> necessary, this is kind of the point of 'get'
> 
> Is this a side effect of the earlier remark to get rid of the linked
> list inside the notifier?
> 
> > Or we could enhance the mmu_notifier_get to pass a private
> > opaque instance ID pointer, which is checked in addition to the ops,
> > and you could probably kill off the bonds list and lookup.
> 
> This might be the best option if it can absorb the above search..

It wouldn't; the above search is separate. I currently register the MMU
notifier on (IOMMU domain, mm). The (dev, mm) search above is to follow
the iommu_sva_bind_device() API doc, which states that we should return the
same handle for a given (dev, mm) pair.

Thanks,
Jean



Re: [PATCH v6 17/25] iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind()

2020-05-04 Thread Jean-Philippe Brucker
On Fri, May 01, 2020 at 05:15:52AM -0700, Christoph Hellwig wrote:
> > @@ -432,6 +432,7 @@ config ARM_SMMU_V3
> > tristate "ARM Ltd. System MMU Version 3 (SMMUv3) Support"
> > depends on ARM64
> > select IOMMU_API
> > +   select IOMMU_SVA
> > select IOMMU_IO_PGTABLE_LPAE
> > select GENERIC_MSI_IRQ_DOMAIN
> 
> Doesn't this need to select MMU_NOTIFIER now?

Yes, will fix

> > +   struct mmu_notifier_ops mn_ops;
> 
> Note: not a pointer.
> 
> > +   /* If bind() was already called for this (dev, mm) pair, reuse it. */
> > +   list_for_each_entry(bond, &master->bonds, list) {
> > +   if (bond->mm == mm) {
> > +   refcount_inc(&bond->refs);
> > +   return &bond->sva;
> > +   }
> > +   }
> > +
> > +   mn = mmu_notifier_get(&smmu_domain->mn_ops, mm);
> > +   if (IS_ERR(mn))
> > +   return ERR_CAST(mn);
> 
> Which seems to be to avoid mmu_notifier_get reusing notifiers registered
> by other arm_smmu_master instance right?

Yes, although I'm registering a single mmu notifier per (domain, mm) pair,
not (master, mm), because the SMMU driver keeps one set of PASID tables
per IOMMU domain.

> Either you could just use plain old mmu_notifier_register to avoid
> the reuse.  Or we could enhance the mmu_notifier_get to pass a private
> opaque instance ID pointer, which is checked in addition to the ops,
> and you could probably kill off the bonds list and lookup.

Going back to mmu_notifier_register() seems better for now. I don't want
to change the core APIs just for this driver, because it's likely to
change again when more hardware starts appearing and we optimize it.

Thanks,
Jean



Re: [PATCH v6 02/25] iommu/ioasid: Add ioasid references

2020-05-04 Thread Jacob Pan
On Mon, 4 May 2020 16:25:48 +0200
Jean-Philippe Brucker  wrote:

> On Thu, Apr 30, 2020 at 11:39:31AM -0700, Jacob Pan wrote:
> > > +/**
> > > + * ioasid_get - obtain a reference to the IOASID
> > > + */
> > > +void ioasid_get(ioasid_t ioasid)  
> > why void? what if the ioasid is not valid.  
> 
> My intended use was for the caller to get an additional reference when
> they're already holding one. So this should always succeed and I'd
> prefer a WARN_ON if the ioasid isn't valid rather than returning an
> error. But if you intend to add a state to ioasids between dropping
> refcount and free, then a return value makes sense.
> 
Yes, a WARN_ON will do. No need for return value for now.

> Thanks,
> Jean
> 
> >   
> > > +{
> > > + struct ioasid_data *ioasid_data;
> > > +
> > > + spin_lock(&ioasid_allocator_lock);
> > > + ioasid_data = xa_load(&active_allocator->xa, ioasid);
> > > + if (ioasid_data)
> > > + refcount_inc(&ioasid_data->refs);
> > > + spin_unlock(&ioasid_allocator_lock);
> > > +}
> > > +EXPORT_SYMBOL_GPL(ioasid_get);
> > > +
> > >  /**
> > >   * ioasid_free - Free an IOASID
> > >   * @ioasid: the ID to remove
> > > + *
> > > + * Put a reference to the IOASID, free it when the number of
> > > references drops to
> > > + * zero.
> > > + *
> > > + * Return: %true if the IOASID was freed, %false otherwise.
> > >   */
> > > -void ioasid_free(ioasid_t ioasid)
> > > +bool ioasid_free(ioasid_t ioasid)
> > >  {
> > > + bool free = false;
> > >   struct ioasid_data *ioasid_data;
> > >  
> > >   spin_lock(&ioasid_allocator_lock);
> > > @@ -360,6 +383,10 @@ void ioasid_free(ioasid_t ioasid)
> > >   goto exit_unlock;
> > >   }
> > >  
> > > + free = refcount_dec_and_test(&ioasid_data->refs);
> > > + if (!free)
> > > + goto exit_unlock;
> > > +  
> > Just FYI, we may need to add states for the IOASID, e.g. mark the
> > IOASID inactive after free. And prohibit ioasid_get() after freed.
> > For VT-d, this is useful when KVM queries the IOASID.
> >   
> > >   active_allocator->ops->free(ioasid,
> > > active_allocator->ops->pdata); /* Custom allocator needs
> > > additional steps to free the xa element */ if
> > > (active_allocator->flags & IOASID_ALLOCATOR_CUSTOM) { @@ -369,6
> > > +396,7 @@ void ioasid_free(ioasid_t ioasid) 
> > >  exit_unlock:
> > >   spin_unlock(&ioasid_allocator_lock);
> > > + return free;
> > >  }
> > >  EXPORT_SYMBOL_GPL(ioasid_free);
> > >
> > 
> > [Jacob Pan]  

[Jacob Pan]


Re: [PATCH v6 02/25] iommu/ioasid: Add ioasid references

2020-05-04 Thread Jacob Pan
On Mon, 4 May 2020 16:39:32 +0200
Jean-Philippe Brucker  wrote:

> On Thu, Apr 30, 2020 at 01:48:42PM -0700, Jacob Pan wrote:
> > On Thu, 30 Apr 2020 11:39:31 -0700
> > Jacob Pan  wrote:
> >   
> > > > -void ioasid_free(ioasid_t ioasid)
> > > > +bool ioasid_free(ioasid_t ioasid)
> > > >  {  
> > Sorry I missed this in the last reply.
> > 
> > I think free needs to be unconditional since there is not a good
> > way to fail it.
> > 
> > Also can we have more symmetric APIs, seems we don't have
> > ioasid_put() in this patchset.  
> 
> Yes I was thinking of renaming ioasid_free() to ioasid_put() but got
> lazy. 
> 
> > How about?
> > ioasid_alloc()
> > ioasid_free(); //drop reference, mark inactive, but not reclaimed if
> > refcount is not zero.
> > ioasid_get() // returns err if the ioasid is marked inactive by
> > ioasid_free()  
> 
> How does the caller know that the ioasid is in active/inactive state,
> and not freed/reallocated?
> 
In the inactive state, calls to ioasid_find() and ioasid_get() would all
fail. Only ioasid_put() can still operate on it.

In the freed state (i.e. not allocated), it is the same as above, with
the exception that ioasid_put() has no effect.

> > ioasid_put();// drop reference, reclaim if refcount is 0.  
> 
> I'll add ioasid_put() for now. I'd like to avoid introducing the
> inactive state in this patch,
Sounds good. I just wanted to consult with you about the above APIs. I
will introduce the state when we have a real use.

> so shall I change the calls in the
> Intel driver to ioasid_put(), and not introduce a new ioasid_free()
> for the moment?
> 
Sounds good. 

> Thanks,
> Jean
> 

[Jacob Pan]


Re: [PATCH v6 00/25] iommu: Shared Virtual Addressing for SMMUv3

2020-05-04 Thread Jean-Philippe Brucker
On Thu, Apr 30, 2020 at 02:18:16PM -0700, Jacob Pan wrote:
> On Thu, 30 Apr 2020 16:33:59 +0200
> Jean-Philippe Brucker  wrote:
> 
> > Shared Virtual Addressing (SVA) allows to share process page tables
> > with devices using the IOMMU, PASIDs and I/O page faults. Add SVA
> > support to the Arm SMMUv3 driver.
> > 
> > Since v5 [1]:
> > 
> > * Added patches 1-3. Patch 1 adds a PASID field to mm_struct as
> >   discussed in [1] and [2]. This is also needed for Intel ENQCMD.
> > Patch 2 adds refcounts to IOASID and patch 3 adds a couple of helpers
> > to allocate the PASID.
> > 
> > * Dropped most of iommu-sva.c. After getting rid of io_mm following
> >   review of v5, there wasn't enough generic code left to justify the
> >   indirect branch overhead of io_mm_ops in the MMU notifiers. I ended
> > up with more glue than useful code, and couldn't find an easy way to
> > deal with domains in the SMMU driver (we keep PASID tables per domain,
> >   while x86 keeps them per device). The direct approach in patch 17 is
> >   nicer and a little easier to read. The SMMU driver only gained 160
> >   lines, while iommu-sva lost 470 lines.
> > 
> >   As a result I dropped the MMU notifier patch.
> > 
> >   Jacob, one upside of this rework is that we now free ioasids in
> >   blocking context, which might help with your addition of notifiers
> > to ioasid.c
> > 
> Thanks for the note. It does make notifier much easier, plus the
> refcount can alleviate the constraint on ordering.
> 
> I guess we don't share mmu notifier code for now :)

I think it's more efficient for each IOMMU driver to at least implement
its own invalidate_range() callback and avoid indirect branches. For the
rest I couldn't find a lot of code to share; most of it is writing PASID
tables and invalidating. We can revisit this later; as long as we agree on
the bind() API, the implementations should be similar enough.

Thanks,
Jean


Re: [PATCH v6 02/25] iommu/ioasid: Add ioasid references

2020-05-04 Thread Jean-Philippe Brucker
On Thu, Apr 30, 2020 at 01:48:42PM -0700, Jacob Pan wrote:
> On Thu, 30 Apr 2020 11:39:31 -0700
> Jacob Pan  wrote:
> 
> > > -void ioasid_free(ioasid_t ioasid)
> > > +bool ioasid_free(ioasid_t ioasid)
> > >  {
> Sorry I missed this in the last reply.
> 
> I think free needs to be unconditional since there is not a good way to
> fail it.
> 
> Also can we have more symmetric APIs, seems we don't have ioasid_put()
> in this patchset.

Yes I was thinking of renaming ioasid_free() to ioasid_put() but got lazy. 

> How about?
> ioasid_alloc()
> ioasid_free(); //drop reference, mark inactive, but not reclaimed if
>   refcount is not zero.
> ioasid_get() // returns err if the ioasid is marked inactive by
>   ioasid_free()

How does the caller know that the ioasid is in active/inactive state, and
not freed/reallocated?

> ioasid_put();// drop reference, reclaim if refcount is 0.

I'll add ioasid_put() for now. I'd like to avoid introducing the inactive
state in this patch, so shall I change the calls in the Intel driver to
ioasid_put(), and not introduce a new ioasid_free() for the moment?

Thanks,
Jean



Re: [PATCH v6 11/25] iommu/arm-smmu-v3: Share process page tables

2020-05-04 Thread Suzuki K Poulose

On 05/04/2020 03:11 PM, Jean-Philippe Brucker wrote:

On Thu, Apr 30, 2020 at 04:39:53PM +0100, Suzuki K Poulose wrote:

On 04/30/2020 03:34 PM, Jean-Philippe Brucker wrote:

With Shared Virtual Addressing (SVA), we need to mirror CPU TTBR, TCR,
MAIR and ASIDs in SMMU contexts. Each SMMU has a single ASID space split
into two sets, shared and private. Shared ASIDs correspond to those
obtained from the arch ASID allocator, and private ASIDs are used for
"classic" map/unmap DMA.

Cc: Suzuki K Poulose 
Signed-off-by: Jean-Philippe Brucker 
---



+
+   tcr = FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, 64ULL - VA_BITS) |
+ FIELD_PREP(CTXDESC_CD_0_TCR_IRGN0, ARM_LPAE_TCR_RGN_WBWA) |
+ FIELD_PREP(CTXDESC_CD_0_TCR_ORGN0, ARM_LPAE_TCR_RGN_WBWA) |
+ FIELD_PREP(CTXDESC_CD_0_TCR_SH0, ARM_LPAE_TCR_SH_IS) |
+ CTXDESC_CD_0_TCR_EPD1 | CTXDESC_CD_0_AA64;
+
+   switch (PAGE_SIZE) {
+   case SZ_4K:
+   tcr |= FIELD_PREP(CTXDESC_CD_0_TCR_TG0, ARM_LPAE_TCR_TG0_4K);
+   break;
+   case SZ_16K:
+   tcr |= FIELD_PREP(CTXDESC_CD_0_TCR_TG0, ARM_LPAE_TCR_TG0_16K);
+   break;
+   case SZ_64K:
+   tcr |= FIELD_PREP(CTXDESC_CD_0_TCR_TG0, ARM_LPAE_TCR_TG0_64K);
+   break;
+   default:
+   WARN_ON(1);
+   ret = -EINVAL;
+   goto err_free_asid;
+   }
+
+   reg = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
+   par = cpuid_feature_extract_unsigned_field(reg, 
ID_AA64MMFR0_PARANGE_SHIFT);
+   tcr |= FIELD_PREP(CTXDESC_CD_0_TCR_IPS, par);
+
+   cd->ttbr = virt_to_phys(mm->pgd);


Does the TTBR follow the same layout as TTBR_ELx for 52bit IPA ? i.e,
TTBR[5:2] = BADDR[51:48] ? Are you covered for that ?


Good point, I don't remember checking this. The SMMU TTBR doesn't have the
same layout as the CPU's, and we don't need to swizzle the bits. For the
lower bits, the alignment requirements on the pgd are identical to the
MMU.


Ok, if that is the case:

Acked-by: Suzuki K Poulose 


Re: [PATCH v6 02/25] iommu/ioasid: Add ioasid references

2020-05-04 Thread Jean-Philippe Brucker
On Thu, Apr 30, 2020 at 11:39:31AM -0700, Jacob Pan wrote:
> > +/**
> > + * ioasid_get - obtain a reference to the IOASID
> > + */
> > +void ioasid_get(ioasid_t ioasid)
> why void? what if the ioasid is not valid.

My intended use was for the caller to get an additional reference when
they're already holding one. So this should always succeed and I'd prefer
a WARN_ON if the ioasid isn't valid rather than returning an error. But if
you intend to add a state to ioasids between dropping refcount and free,
then a return value makes sense.
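
Roughly like this, just to illustrate what I mean (not the code in the
patch below):

	void ioasid_get(ioasid_t ioasid)
	{
		struct ioasid_data *data;

		spin_lock(&ioasid_allocator_lock);
		data = xa_load(&active_allocator->xa, ioasid);
		if (!WARN_ON(!data))
			refcount_inc(&data->refs);
		spin_unlock(&ioasid_allocator_lock);
	}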

Thanks,
Jean

> 
> > +{
> > +   struct ioasid_data *ioasid_data;
> > +
> > +   spin_lock(&ioasid_allocator_lock);
> > +   ioasid_data = xa_load(&active_allocator->xa, ioasid);
> > +   if (ioasid_data)
> > +   refcount_inc(&ioasid_data->refs);
> > +   spin_unlock(&ioasid_allocator_lock);
> > +}
> > +EXPORT_SYMBOL_GPL(ioasid_get);
> > +
> >  /**
> >   * ioasid_free - Free an IOASID
> >   * @ioasid: the ID to remove
> > + *
> > + * Put a reference to the IOASID, free it when the number of
> > references drops to
> > + * zero.
> > + *
> > + * Return: %true if the IOASID was freed, %false otherwise.
> >   */
> > -void ioasid_free(ioasid_t ioasid)
> > +bool ioasid_free(ioasid_t ioasid)
> >  {
> > +   bool free = false;
> > struct ioasid_data *ioasid_data;
> >  
> > spin_lock(&ioasid_allocator_lock);
> > @@ -360,6 +383,10 @@ void ioasid_free(ioasid_t ioasid)
> > goto exit_unlock;
> > }
> >  
> > +   free = refcount_dec_and_test(&ioasid_data->refs);
> > +   if (!free)
> > +   goto exit_unlock;
> > +
> Just FYI, we may need to add states for the IOASID, e.g. mark the IOASID
> inactive after free. And prohibit ioasid_get() after freed. For VT-d,
> this is useful when KVM queries the IOASID.
> 
> > active_allocator->ops->free(ioasid,
> > active_allocator->ops->pdata); /* Custom allocator needs additional
> > steps to free the xa element */ if (active_allocator->flags &
> > IOASID_ALLOCATOR_CUSTOM) { @@ -369,6 +396,7 @@ void
> > ioasid_free(ioasid_t ioasid) 
> >  exit_unlock:
> > spin_unlock(&ioasid_allocator_lock);
> > +   return free;
> >  }
> >  EXPORT_SYMBOL_GPL(ioasid_free);
> >  
> 
> [Jacob Pan]


Re: [PATCH v6 19/25] iommu/arm-smmu-v3: Add support for Hardware Translation Table Update

2020-05-04 Thread Prabhakar Kushwaha
Dear Jean,

On Thu, Apr 30, 2020 at 8:11 PM Jean-Philippe Brucker
 wrote:
>
> If the SMMU supports it and the kernel was built with HTTU support, enable

is there any framework/config for HTTU which must be enabled to use this patch?


> We can enable HTTU even if CPUs don't support it, because the kernel
> always checks for HW dirty bit and updates the PTE flags atomically.
>
I believe this statement is valid in the context of this patch-set only.

One cannot use this code to test HTTU because the existing
io-pgtable-arm.c driver doesn't have the framework to leverage HTTU
benefits. By default it sets AF=1 and does not set DBM.

Thanks

--pk


Re: [PATCH v6 11/25] iommu/arm-smmu-v3: Share process page tables

2020-05-04 Thread Jean-Philippe Brucker
On Thu, Apr 30, 2020 at 04:39:53PM +0100, Suzuki K Poulose wrote:
> On 04/30/2020 03:34 PM, Jean-Philippe Brucker wrote:
> > With Shared Virtual Addressing (SVA), we need to mirror CPU TTBR, TCR,
> > MAIR and ASIDs in SMMU contexts. Each SMMU has a single ASID space split
> > into two sets, shared and private. Shared ASIDs correspond to those
> > obtained from the arch ASID allocator, and private ASIDs are used for
> > "classic" map/unmap DMA.
> > 
> > Cc: Suzuki K Poulose 
> > Signed-off-by: Jean-Philippe Brucker 
> > ---
> 
> > +
> > +   tcr = FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, 64ULL - VA_BITS) |
> > + FIELD_PREP(CTXDESC_CD_0_TCR_IRGN0, ARM_LPAE_TCR_RGN_WBWA) |
> > + FIELD_PREP(CTXDESC_CD_0_TCR_ORGN0, ARM_LPAE_TCR_RGN_WBWA) |
> > + FIELD_PREP(CTXDESC_CD_0_TCR_SH0, ARM_LPAE_TCR_SH_IS) |
> > + CTXDESC_CD_0_TCR_EPD1 | CTXDESC_CD_0_AA64;
> > +
> > +   switch (PAGE_SIZE) {
> > +   case SZ_4K:
> > +   tcr |= FIELD_PREP(CTXDESC_CD_0_TCR_TG0, ARM_LPAE_TCR_TG0_4K);
> > +   break;
> > +   case SZ_16K:
> > +   tcr |= FIELD_PREP(CTXDESC_CD_0_TCR_TG0, ARM_LPAE_TCR_TG0_16K);
> > +   break;
> > +   case SZ_64K:
> > +   tcr |= FIELD_PREP(CTXDESC_CD_0_TCR_TG0, ARM_LPAE_TCR_TG0_64K);
> > +   break;
> > +   default:
> > +   WARN_ON(1);
> > +   ret = -EINVAL;
> > +   goto err_free_asid;
> > +   }
> > +
> > +   reg = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
> > +   par = cpuid_feature_extract_unsigned_field(reg, 
> > ID_AA64MMFR0_PARANGE_SHIFT);
> > +   tcr |= FIELD_PREP(CTXDESC_CD_0_TCR_IPS, par);
> > +
> > +   cd->ttbr = virt_to_phys(mm->pgd);
> 
> Does the TTBR follow the same layout as TTBR_ELx for 52bit IPA ? i.e,
> TTBR[5:2] = BADDR[51:48] ? Are you covered for that ?

Good point, I don't remember checking this. The SMMU TTBR doesn't have the
same layout as the CPU's, and we don't need to swizzle the bits. For the
lower bits, the alignment requirements on the pgd are identical to the
MMU.

Thanks,
Jean



Re: [PATCH v2 09/21] drm: panfrost: fix sg_table nents vs. orig_nents misuse

2020-05-04 Thread Steven Price

On 04/05/2020 13:53, Marek Szyprowski wrote:

The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of the created entries in the DMA address space. However the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.


I find this commit message a bit confusing, but AFAICT the problem with 
the Panfrost code is really in mmu_map_sg() where we don't have the 
return value from dma_map_sg() and the for_each_sg() loop could (in 
theory) run off the end of the list.
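
For reference, the pattern the DMA API documentation describes (a generic
sketch, not the actual panfrost code) is:

	int count;

	count = dma_map_sg(dev, sgt->sgl, sgt->orig_nents, DMA_BIDIRECTIONAL);
	if (!count)
		return -EINVAL;
	sgt->nents = count;	/* entries actually created in the DMA address space */

	/* program the device/IOMMU using only the 'count' coalesced entries */

	dma_sync_sg_for_cpu(dev, sgt->sgl, sgt->orig_nents, DMA_BIDIRECTIONAL);
	dma_unmap_sg(dev, sgt->sgl, sgt->orig_nents, DMA_BIDIRECTIONAL);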


The fix seems correct - store the return where it's meant to be (nents) 
and make sure when unmapping we use the original (orig_nents). So you 
might also consider adding:


Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver")

Even better would be the wrappers you mention in the cover letter! ;)

Reviewed-by: Steven Price 



Signed-off-by: Marek Szyprowski 
---
For more information, see '[PATCH v2 00/21] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/4/373
---
  drivers/gpu/drm/panfrost/panfrost_gem.c | 3 ++-
  drivers/gpu/drm/panfrost/panfrost_mmu.c | 4 +++-
  2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c 
b/drivers/gpu/drm/panfrost/panfrost_gem.c
index 17b654e..22fec7c 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
@@ -42,7 +42,8 @@ static void panfrost_gem_free_object(struct drm_gem_object 
*obj)
for (i = 0; i < n_sgt; i++) {
if (bo->sgts[i].sgl) {
dma_unmap_sg(pfdev->dev, bo->sgts[i].sgl,
-bo->sgts[i].nents, 
DMA_BIDIRECTIONAL);
+bo->sgts[i].orig_nents,
+DMA_BIDIRECTIONAL);
sg_free_table(&bo->sgts[i]);
}
}
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c 
b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index ed28aeb..2d9b1f9 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -517,7 +517,9 @@ static int panfrost_mmu_map_fault_addr(struct 
panfrost_device *pfdev, int as,
if (ret)
goto err_pages;
  
-	if (!dma_map_sg(pfdev->dev, sgt->sgl, sgt->nents, DMA_BIDIRECTIONAL)) {

+   sgt->nents = dma_map_sg(pfdev->dev, sgt->sgl, sgt->orig_nents,
+   DMA_BIDIRECTIONAL);
+   if (!sgt->nents) {
ret = -EINVAL;
goto err_map;
}





Re: [PATCH v2 00/21] DRM: fix struct sg_table nents vs. orig_nents misuse

2020-05-04 Thread Christoph Hellwig
On Mon, May 04, 2020 at 03:05:30PM +0200, Marek Szyprowski wrote:
> Hi Christoph,
> 
> On 04.05.2020 14:52, Christoph Hellwig wrote:
> > On Mon, May 04, 2020 at 02:50:17PM +0200, Marek Szyprowski wrote:
> >> v2:
> >> - dropped most of the changes to drm/i915
> >> - added fixes for rcar-du, xen, media and ion
> >> - fixed a few issues pointed by kbuild test robot
> >> - added wide cc: list for each patch
> > Didn't you plan to add dma_map_sgbuf and co helper?
> 
> Yes, I have follow-up patches for that, but I wanted to fix the
> existing code in the first step. Then I wanted to send the wrappers and
> their application. Do you think I should do everything at once, in one
> patchset?

That would be my preference.  The helpers should be mostly trivial
wrappers, so they can easily be backported, and they force passing of the
correct parameters.  So I don't really see a need to fix up all the 20+
places first just to convert them to the proper API a little later.
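
For what it's worth, a minimal sketch of what such helpers could look like
(dma_map_sgtable()/dma_unmap_sgtable() are illustrative names here, not an
existing API at the time of writing; error handling is simplified and the
usual <linux/dma-mapping.h>/<linux/scatterlist.h> definitions are assumed):

	/* Hypothetical wrappers -- names and signatures are illustrative only. */
	static inline int dma_map_sgtable(struct device *dev, struct sg_table *sgt,
					  enum dma_data_direction dir,
					  unsigned long attrs)
	{
		int nents;

		nents = dma_map_sg_attrs(dev, sgt->sgl, sgt->orig_nents, dir, attrs);
		if (nents <= 0)
			return -EINVAL;
		sgt->nents = nents;	/* record how many entries got mapped */
		return 0;
	}

	static inline void dma_unmap_sgtable(struct device *dev, struct sg_table *sgt,
					     enum dma_data_direction dir,
					     unsigned long attrs)
	{
		dma_unmap_sg_attrs(dev, sgt->sgl, sgt->orig_nents, dir, attrs);
	}

With something like this, callers never touch the raw counts at all, so the
nents vs. orig_nents confusion cannot reappear.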


Re: [PATCH v2 00/21] DRM: fix struct sg_table nents vs. orig_nents misuse

2020-05-04 Thread Marek Szyprowski
Hi Christoph,

On 04.05.2020 14:52, Christoph Hellwig wrote:
> On Mon, May 04, 2020 at 02:50:17PM +0200, Marek Szyprowski wrote:
>> v2:
>> - dropped most of the changes to drm/i915
>> - added fixes for rcar-du, xen, media and ion
>> - fixed a few issues pointed by kbuild test robot
>> - added wide cc: list for each patch
> Didn't you plan to add dma_map_sgbuf and co helper?

Yes, I have follow-up patches for that, but I wanted to fix the
existing code in the first step. Then I wanted to send the wrappers and
the patches that apply them. Do you think I should do everything at once,
in one patchset?

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland



[PATCH v2 07/21] drm: lima: fix sg_table nents vs. orig_nents misuse

2020-05-04 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of created entries in the DMA address space. However, the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski 
---
For more information, see '[PATCH v2 00/21] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/4/373
---
 drivers/gpu/drm/lima/lima_gem.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index 5404e0d..3edd2ff 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -70,7 +70,7 @@ int lima_heap_alloc(struct lima_bo *bo, struct lima_vm *vm)
 
if (bo->base.sgt) {
dma_unmap_sg(dev, bo->base.sgt->sgl,
-bo->base.sgt->nents, DMA_BIDIRECTIONAL);
+bo->base.sgt->orig_nents, DMA_BIDIRECTIONAL);
sg_free_table(bo->base.sgt);
} else {
bo->base.sgt = kmalloc(sizeof(*bo->base.sgt), GFP_KERNEL);
@@ -80,7 +80,7 @@ int lima_heap_alloc(struct lima_bo *bo, struct lima_vm *vm)
}
}
 
-   dma_map_sg(dev, sgt.sgl, sgt.nents, DMA_BIDIRECTIONAL);
+   sgt.nents = dma_map_sg(dev, sgt.sgl, sgt.orig_nents, DMA_BIDIRECTIONAL);
 
*bo->base.sgt = sgt;
 
-- 
1.9.1



[PATCH v2 17/21] drm: rcar-du: fix sg_table nents vs. orig_nents misuse

2020-05-04 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of created entries in the DMA address space. However, the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski 
---
For more information, see '[PATCH v2 00/21] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/4/373
---
 drivers/media/platform/vsp1/vsp1_drm.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/media/platform/vsp1/vsp1_drm.c 
b/drivers/media/platform/vsp1/vsp1_drm.c
index a4a45d6..b54a30f 100644
--- a/drivers/media/platform/vsp1/vsp1_drm.c
+++ b/drivers/media/platform/vsp1/vsp1_drm.c
@@ -912,8 +912,9 @@ int vsp1_du_map_sg(struct device *dev, struct sg_table *sgt)
 * skip cache sync. This will need to be revisited when support for
 * non-coherent buffers will be added to the DU driver.
 */
-   return dma_map_sg_attrs(vsp1->bus_master, sgt->sgl, sgt->nents,
-   DMA_TO_DEVICE, DMA_ATTR_SKIP_CPU_SYNC);
+   sgt->nents = dma_map_sg_attrs(vsp1->bus_master, sgt->sgl,
+   sgt->orig_nents, DMA_TO_DEVICE, DMA_ATTR_SKIP_CPU_SYNC);
+   return sgt->nents;
 }
 EXPORT_SYMBOL_GPL(vsp1_du_map_sg);
 
@@ -921,7 +922,7 @@ void vsp1_du_unmap_sg(struct device *dev, struct sg_table 
*sgt)
 {
struct vsp1_device *vsp1 = dev_get_drvdata(dev);
 
-   dma_unmap_sg_attrs(vsp1->bus_master, sgt->sgl, sgt->nents,
+   dma_unmap_sg_attrs(vsp1->bus_master, sgt->sgl, sgt->orig_nents,
   DMA_TO_DEVICE, DMA_ATTR_SKIP_CPU_SYNC);
 }
 EXPORT_SYMBOL_GPL(vsp1_du_unmap_sg);
-- 
1.9.1



[PATCH v2 13/21] drm: virtio: fix sg_table nents vs. orig_nents misuse

2020-05-04 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of created entries in the DMA address space. However, the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski 
---
For more information, see '[PATCH v2 00/21] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/4/373
---
 drivers/gpu/drm/virtio/virtgpu_object.c | 11 ++-
 drivers/gpu/drm/virtio/virtgpu_vq.c |  8 
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_object.c 
b/drivers/gpu/drm/virtio/virtgpu_object.c
index 6ccbd01..12f6bee 100644
--- a/drivers/gpu/drm/virtio/virtgpu_object.c
+++ b/drivers/gpu/drm/virtio/virtgpu_object.c
@@ -73,7 +73,8 @@ void virtio_gpu_cleanup_object(struct virtio_gpu_object *bo)
if (shmem->pages) {
if (shmem->mapped) {
dma_unmap_sg(vgdev->vdev->dev.parent,
-shmem->pages->sgl, shmem->mapped,
+shmem->pages->sgl,
+shmem->pages->orig_nents,
 DMA_TO_DEVICE);
shmem->mapped = 0;
}
@@ -157,13 +158,13 @@ static int virtio_gpu_object_shmem_init(struct 
virtio_gpu_device *vgdev,
}
 
if (use_dma_api) {
-   shmem->mapped = dma_map_sg(vgdev->vdev->dev.parent,
+   shmem->pages->nents = dma_map_sg(vgdev->vdev->dev.parent,
   shmem->pages->sgl,
-  shmem->pages->nents,
+  shmem->pages->orig_nents,
   DMA_TO_DEVICE);
-   *nents = shmem->mapped;
+   *nents = shmem->mapped = shmem->pages->nents;
} else {
-   *nents = shmem->pages->nents;
+   *nents = shmem->pages->orig_nents;
}
 
*ents = kmalloc_array(*nents, sizeof(struct virtio_gpu_mem_entry),
diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c 
b/drivers/gpu/drm/virtio/virtgpu_vq.c
index 9e663a5..661325b 100644
--- a/drivers/gpu/drm/virtio/virtgpu_vq.c
+++ b/drivers/gpu/drm/virtio/virtgpu_vq.c
@@ -604,8 +604,8 @@ void virtio_gpu_cmd_transfer_to_host_2d(struct 
virtio_gpu_device *vgdev,
 
if (use_dma_api)
dma_sync_sg_for_device(vgdev->vdev->dev.parent,
-  shmem->pages->sgl, shmem->pages->nents,
-  DMA_TO_DEVICE);
+  shmem->pages->sgl,
+  shmem->pages->orig_nents, DMA_TO_DEVICE);
 
cmd_p = virtio_gpu_alloc_cmd(vgdev, &vbuf, sizeof(*cmd_p));
memset(cmd_p, 0, sizeof(*cmd_p));
@@ -1020,8 +1020,8 @@ void virtio_gpu_cmd_transfer_to_host_3d(struct 
virtio_gpu_device *vgdev,
 
if (use_dma_api)
dma_sync_sg_for_device(vgdev->vdev->dev.parent,
-  shmem->pages->sgl, shmem->pages->nents,
-  DMA_TO_DEVICE);
+  shmem->pages->sgl,
+  shmem->pages->orig_nents, DMA_TO_DEVICE);
 
cmd_p = virtio_gpu_alloc_cmd(vgdev, &vbuf, sizeof(*cmd_p));
memset(cmd_p, 0, sizeof(*cmd_p));
-- 
1.9.1



[PATCH v2 20/21] media: pci: fix common ALSA DMA-mapping related code

2020-05-04 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of created entries in the DMA address space. However, the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski 
---
For more information, see '[PATCH v2 00/21] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/4/373
---
 drivers/media/pci/cx23885/cx23885-alsa.c | 2 +-
 drivers/media/pci/cx25821/cx25821-alsa.c | 2 +-
 drivers/media/pci/cx88/cx88-alsa.c   | 2 +-
 drivers/media/pci/saa7134/saa7134-alsa.c | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/media/pci/cx23885/cx23885-alsa.c 
b/drivers/media/pci/cx23885/cx23885-alsa.c
index df44ed7..3f366e4 100644
--- a/drivers/media/pci/cx23885/cx23885-alsa.c
+++ b/drivers/media/pci/cx23885/cx23885-alsa.c
@@ -129,7 +129,7 @@ static int cx23885_alsa_dma_unmap(struct cx23885_audio_dev 
*dev)
if (!buf->sglen)
return 0;
 
-   dma_unmap_sg(&dev->pci->dev, buf->sglist, buf->sglen, 
PCI_DMA_FROMDEVICE);
+   dma_unmap_sg(&dev->pci->dev, buf->sglist, buf->nr_pages, 
PCI_DMA_FROMDEVICE);
buf->sglen = 0;
return 0;
 }
diff --git a/drivers/media/pci/cx25821/cx25821-alsa.c 
b/drivers/media/pci/cx25821/cx25821-alsa.c
index 3016164..c40304d 100644
--- a/drivers/media/pci/cx25821/cx25821-alsa.c
+++ b/drivers/media/pci/cx25821/cx25821-alsa.c
@@ -193,7 +193,7 @@ static int cx25821_alsa_dma_unmap(struct cx25821_audio_dev 
*dev)
if (!buf->sglen)
return 0;
 
-   dma_unmap_sg(&dev->pci->dev, buf->sglist, buf->sglen, 
PCI_DMA_FROMDEVICE);
+   dma_unmap_sg(&dev->pci->dev, buf->sglist, buf->nr_pages, 
PCI_DMA_FROMDEVICE);
buf->sglen = 0;
return 0;
 }
diff --git a/drivers/media/pci/cx88/cx88-alsa.c 
b/drivers/media/pci/cx88/cx88-alsa.c
index 7d7acee..3c6fe6c 100644
--- a/drivers/media/pci/cx88/cx88-alsa.c
+++ b/drivers/media/pci/cx88/cx88-alsa.c
@@ -332,7 +332,7 @@ static int cx88_alsa_dma_unmap(struct cx88_audio_dev *dev)
if (!buf->sglen)
return 0;
 
-   dma_unmap_sg(&dev->pci->dev, buf->sglist, buf->sglen,
+   dma_unmap_sg(&dev->pci->dev, buf->sglist, buf->nr_pages,
 PCI_DMA_FROMDEVICE);
buf->sglen = 0;
return 0;
diff --git a/drivers/media/pci/saa7134/saa7134-alsa.c 
b/drivers/media/pci/saa7134/saa7134-alsa.c
index 544ca57..398c47f 100644
--- a/drivers/media/pci/saa7134/saa7134-alsa.c
+++ b/drivers/media/pci/saa7134/saa7134-alsa.c
@@ -313,7 +313,7 @@ static int saa7134_alsa_dma_unmap(struct saa7134_dev *dev)
if (!dma->sglen)
return 0;
 
-   dma_unmap_sg(&dev->pci->dev, dma->sglist, dma->sglen, 
PCI_DMA_FROMDEVICE);
+   dma_unmap_sg(&dev->pci->dev, dma->sglist, dma->nr_pages, 
PCI_DMA_FROMDEVICE);
dma->sglen = 0;
return 0;
 }
-- 
1.9.1



[PATCH 1/5] iommu/amd: Fix race in increase_address_space()/fetch_pte()

2020-05-04 Thread Joerg Roedel
From: Joerg Roedel 

The 'pt_root' and 'mode' struct members of 'struct protection_domain'
need to be read and updated atomically, otherwise the domain's
page-table can get corrupted.

Merge the fields into one atomic64_t struct member which can be
read and updated atomically.

Fixes: 92d420ec028d ("iommu/amd: Relax locking in dma_ops path")
Reported-by: Qian Cai 
Tested-by: Qian Cai 
Signed-off-by: Joerg Roedel 
---
 drivers/iommu/amd_iommu.c   | 140 
 drivers/iommu/amd_iommu_types.h |   9 +-
 2 files changed, 112 insertions(+), 37 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 20cce366e951..28229a38af4d 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -151,6 +151,26 @@ static struct protection_domain *to_pdomain(struct 
iommu_domain *dom)
return container_of(dom, struct protection_domain, domain);
 }
 
+static void amd_iommu_domain_get_pgtable(struct protection_domain *domain,
+struct domain_pgtable *pgtable)
+{
+   u64 pt_root = atomic64_read(&domain->pt_root);
+
+   pgtable->root = (u64 *)(pt_root & PAGE_MASK);
+   pgtable->mode = pt_root & 7; /* lowest 3 bits encode pgtable mode */
+}
+
+static u64 amd_iommu_domain_encode_pgtable(u64 *root, int mode)
+{
+   u64 pt_root;
+
+   /* lowest 3 bits encode pgtable mode */
+   pt_root = mode & 7;
+   pt_root |= (u64)root;
+
+   return pt_root;
+}
+
 static struct iommu_dev_data *alloc_dev_data(u16 devid)
 {
struct iommu_dev_data *dev_data;
@@ -1397,13 +1417,18 @@ static struct page *free_sub_pt(unsigned long root, int 
mode,
 
 static void free_pagetable(struct protection_domain *domain)
 {
-   unsigned long root = (unsigned long)domain->pt_root;
+   struct domain_pgtable pgtable;
struct page *freelist = NULL;
+   unsigned long root;
+
+   amd_iommu_domain_get_pgtable(domain, &pgtable);
+   atomic64_set(&domain->pt_root, 0);
 
-   BUG_ON(domain->mode < PAGE_MODE_NONE ||
-  domain->mode > PAGE_MODE_6_LEVEL);
+   BUG_ON(pgtable.mode < PAGE_MODE_NONE ||
+  pgtable.mode > PAGE_MODE_6_LEVEL);
 
-   freelist = free_sub_pt(root, domain->mode, freelist);
+   root = (unsigned long)pgtable.root;
+   freelist = free_sub_pt(root, pgtable.mode, freelist);
 
free_page_list(freelist);
 }
@@ -1417,24 +1442,28 @@ static bool increase_address_space(struct 
protection_domain *domain,
   unsigned long address,
   gfp_t gfp)
 {
+   struct domain_pgtable pgtable;
unsigned long flags;
bool ret = false;
-   u64 *pte;
+   u64 *pte, root;
 
spin_lock_irqsave(&domain->lock, flags);
 
-   if (address <= PM_LEVEL_SIZE(domain->mode) ||
-   WARN_ON_ONCE(domain->mode == PAGE_MODE_6_LEVEL))
+   amd_iommu_domain_get_pgtable(domain, &pgtable);
+
+   if (address <= PM_LEVEL_SIZE(pgtable.mode) ||
+   WARN_ON_ONCE(pgtable.mode == PAGE_MODE_6_LEVEL))
goto out;
 
pte = (void *)get_zeroed_page(gfp);
if (!pte)
goto out;
 
-   *pte = PM_LEVEL_PDE(domain->mode,
-   iommu_virt_to_phys(domain->pt_root));
-   domain->pt_root  = pte;
-   domain->mode+= 1;
+   *pte = PM_LEVEL_PDE(pgtable.mode, iommu_virt_to_phys(pgtable.root));
+
+   root = amd_iommu_domain_encode_pgtable(pte, pgtable.mode + 1);
+
+   atomic64_set(&domain->pt_root, root);
 
ret = true;
 
@@ -1451,16 +1480,22 @@ static u64 *alloc_pte(struct protection_domain *domain,
  gfp_t gfp,
  bool *updated)
 {
+   struct domain_pgtable pgtable;
int level, end_lvl;
u64 *pte, *page;
 
BUG_ON(!is_power_of_2(page_size));
 
-   while (address > PM_LEVEL_SIZE(domain->mode))
+   amd_iommu_domain_get_pgtable(domain, &pgtable);
+
+   while (address > PM_LEVEL_SIZE(pgtable.mode)) {
*updated = increase_address_space(domain, address, gfp) || 
*updated;
+   amd_iommu_domain_get_pgtable(domain, &pgtable);
+   }
+
 
-   level   = domain->mode - 1;
-   pte = &domain->pt_root[PM_LEVEL_INDEX(level, address)];
+   level   = pgtable.mode - 1;
+   pte = &pgtable.root[PM_LEVEL_INDEX(level, address)];
address = PAGE_SIZE_ALIGN(address, page_size);
end_lvl = PAGE_SIZE_LEVEL(page_size);
 
@@ -1536,16 +1571,19 @@ static u64 *fetch_pte(struct protection_domain *domain,
  unsigned long address,
  unsigned long *page_size)
 {
+   struct domain_pgtable pgtable;
int level;
u64 *pte;
 
*page_size = 0;
 
-   if (address > PM_LEVEL_SIZE(domain->mode))
+   amd_iommu_domain_get_pgtable(domain, &pgtable);
+
+   if (address > PM_LEVEL_SIZE(pgtable.mode))
 

[PATCH v2 02/21] drm: amdgpu: fix sg_table nents vs. orig_nents misuse

2020-05-04 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of created entries in the DMA address space. However, the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski 
Reviewed-by: Christian König 
---
For more information, see '[PATCH v2 00/21] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/4/373
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 7 ---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 8 
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index 43d8ed7..4df813e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -307,8 +307,9 @@ static struct sg_table *amdgpu_dma_buf_map(struct 
dma_buf_attachment *attach,
if (IS_ERR(sgt))
return sgt;
 
-   if (!dma_map_sg_attrs(attach->dev, sgt->sgl, sgt->nents, dir,
- DMA_ATTR_SKIP_CPU_SYNC))
+   sgt->nents = dma_map_sg_attrs(attach->dev, sgt->sgl, 
sgt->orig_nents,
+ dir, DMA_ATTR_SKIP_CPU_SYNC);
+   if (!sgt->nents)
goto error_free;
break;
 
@@ -349,7 +350,7 @@ static void amdgpu_dma_buf_unmap(struct dma_buf_attachment 
*attach,
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
 
if (sgt->sgl->page_link) {
-   dma_unmap_sg(attach->dev, sgt->sgl, sgt->nents, dir);
+   dma_unmap_sg(attach->dev, sgt->sgl, sgt->orig_nents, dir);
sg_free_table(sgt);
kfree(sgt);
} else {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index eff1f73..1f8c507 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1043,7 +1043,6 @@ static int amdgpu_ttm_tt_pin_userptr(struct ttm_tt *ttm)
 {
struct amdgpu_device *adev = amdgpu_ttm_adev(ttm->bdev);
struct amdgpu_ttm_tt *gtt = (void *)ttm;
-   unsigned nents;
int r;
 
int write = !(gtt->userflags & AMDGPU_GEM_USERPTR_READONLY);
@@ -1059,8 +1058,9 @@ static int amdgpu_ttm_tt_pin_userptr(struct ttm_tt *ttm)
 
/* Map SG to device */
r = -ENOMEM;
-   nents = dma_map_sg(adev->dev, ttm->sg->sgl, ttm->sg->nents, direction);
-   if (nents == 0)
+   ttm->sg->nents = dma_map_sg(adev->dev, ttm->sg->sgl,
+   ttm->sg->orig_nents, direction);
+   if (ttm->sg->nents == 0)
goto release_sg;
 
/* convert SG to linear array of pages and dma addresses */
@@ -1091,7 +1091,7 @@ static void amdgpu_ttm_tt_unpin_userptr(struct ttm_tt 
*ttm)
return;
 
/* unmap the pages mapped to the device */
-   dma_unmap_sg(adev->dev, ttm->sg->sgl, ttm->sg->nents, direction);
+   dma_unmap_sg(adev->dev, ttm->sg->sgl, ttm->sg->orig_nents, direction);
 
sg_free_table(ttm->sg);
 
-- 
1.9.1


[PATCH v2 12/21] drm: tegra: fix sg_table nents vs. orig_nents misuse

2020-05-04 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of created entries in the DMA address space. However, the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski 
---
For more information, see '[PATCH v2 00/21] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/4/373
---
 drivers/gpu/drm/tegra/gem.c   | 25 +
 drivers/gpu/drm/tegra/plane.c | 13 +++--
 2 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
index 6237681..5710ab4 100644
--- a/drivers/gpu/drm/tegra/gem.c
+++ b/drivers/gpu/drm/tegra/gem.c
@@ -98,8 +98,8 @@ static struct sg_table *tegra_bo_pin(struct device *dev, 
struct host1x_bo *bo,
 * the SG table needs to be copied to avoid overwriting any
 * other potential users of the original SG table.
 */
-   err = sg_alloc_table_from_sg(sgt, obj->sgt->sgl, 
obj->sgt->nents,
-GFP_KERNEL);
+   err = sg_alloc_table_from_sg(sgt, obj->sgt->sgl,
+obj->sgt->orig_nents, GFP_KERNEL);
if (err < 0)
goto free;
} else {
@@ -197,7 +197,7 @@ static int tegra_bo_iommu_map(struct tegra_drm *tegra, 
struct tegra_bo *bo)
bo->iova = bo->mm->start;
 
bo->size = iommu_map_sg(tegra->domain, bo->iova, bo->sgt->sgl,
-   bo->sgt->nents, prot);
+   bo->sgt->orig_nents, prot);
if (!bo->size) {
dev_err(tegra->drm->dev, "failed to map buffer\n");
err = -ENOMEM;
@@ -264,7 +264,7 @@ static struct tegra_bo *tegra_bo_alloc_object(struct 
drm_device *drm,
 static void tegra_bo_free(struct drm_device *drm, struct tegra_bo *bo)
 {
if (bo->pages) {
-   dma_unmap_sg(drm->dev, bo->sgt->sgl, bo->sgt->nents,
+   dma_unmap_sg(drm->dev, bo->sgt->sgl, bo->sgt->orig_nents,
 DMA_FROM_DEVICE);
drm_gem_put_pages(&bo->gem, bo->pages, true, true);
sg_free_table(bo->sgt);
@@ -290,9 +290,9 @@ static int tegra_bo_get_pages(struct drm_device *drm, 
struct tegra_bo *bo)
goto put_pages;
}
 
-   err = dma_map_sg(drm->dev, bo->sgt->sgl, bo->sgt->nents,
-DMA_FROM_DEVICE);
-   if (err == 0) {
+   bo->sgt->nents = dma_map_sg(drm->dev, bo->sgt->sgl, bo->sgt->orig_nents,
+   DMA_FROM_DEVICE);
+   if (bo->sgt->nents == 0) {
err = -EFAULT;
goto free_sgt;
}
@@ -571,7 +571,8 @@ int tegra_drm_mmap(struct file *file, struct vm_area_struct 
*vma)
goto free;
}
 
-   if (dma_map_sg(attach->dev, sgt->sgl, sgt->nents, dir) == 0)
+   sgt->nents = dma_map_sg(attach->dev, sgt->sgl, sgt->orig_nents, dir);
+   if (sgt->nents == 0)
goto free;
 
return sgt;
@@ -590,7 +591,7 @@ static void tegra_gem_prime_unmap_dma_buf(struct 
dma_buf_attachment *attach,
struct tegra_bo *bo = to_tegra_bo(gem);
 
if (bo->pages)
-   dma_unmap_sg(attach->dev, sgt->sgl, sgt->nents, dir);
+   dma_unmap_sg(attach->dev, sgt->sgl, sgt->orig_nents, dir);
 
sg_free_table(sgt);
kfree(sgt);
@@ -609,7 +610,7 @@ static int tegra_gem_prime_begin_cpu_access(struct dma_buf 
*buf,
struct drm_device *drm = gem->dev;
 
if (bo->pages)
-   dma_sync_sg_for_cpu(drm->dev, bo->sgt->sgl, bo->sgt->nents,
+   dma_sync_sg_for_cpu(drm->dev, bo->sgt->sgl, bo->sgt->orig_nents,
DMA_FROM_DEVICE);
 
return 0;
@@ -623,8 +624,8 @@ static int tegra_gem_prime_end_cpu_access(struct dma_buf 
*buf,
struct drm_device *drm = gem->dev;
 
if (bo->pages)
-   dma_sync_sg_for_device(drm->dev, bo->sgt->sgl, bo->sgt->nents,
-  DMA_TO_DEVICE);
+   dma_sync_sg_for_device(drm->dev, bo->sgt->sgl,
+  bo->sgt->orig_nents, DMA_TO_DEVICE);
 
return 0;
 }
diff --git a/drivers/gpu/drm/tegra/plane.c b/drivers/gpu/drm/tegra/plane.c
index 9ccfb56..3982bf8 100644
--- a/drivers/gpu/drm/tegra/plane.c
+++ b/drivers/gpu/drm/tegra/plane.c
@@ -130,9 +130,10 @@ static int tegra_dc_pin(struct tegra_dc *dc, struct 
tegra_plane_state *state)
}
 
if (sgt) {
-   err = dma_map_sg(dc->dev, sgt->sgl, sgt->nents,
-  

[PATCH v2 18/21] xen: gntdev: fix sg_table nents vs. orig_nents misuse

2020-05-04 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of created entries in the DMA address space. However, the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski 
---
For more information, see '[PATCH v2 00/21] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/4/373
---
 drivers/xen/gntdev-dmabuf.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/gntdev-dmabuf.c b/drivers/xen/gntdev-dmabuf.c
index 75d3bb9..ed749fd 100644
--- a/drivers/xen/gntdev-dmabuf.c
+++ b/drivers/xen/gntdev-dmabuf.c
@@ -248,7 +248,7 @@ static void dmabuf_exp_ops_detach(struct dma_buf *dma_buf,
if (sgt) {
if (gntdev_dmabuf_attach->dir != DMA_NONE)
dma_unmap_sg_attrs(attach->dev, sgt->sgl,
-  sgt->nents,
+  sgt->orig_nents,
   gntdev_dmabuf_attach->dir,
   DMA_ATTR_SKIP_CPU_SYNC);
sg_free_table(sgt);
@@ -288,8 +288,10 @@ static void dmabuf_exp_ops_detach(struct dma_buf *dma_buf,
sgt = dmabuf_pages_to_sgt(gntdev_dmabuf->pages,
  gntdev_dmabuf->nr_pages);
if (!IS_ERR(sgt)) {
-   if (!dma_map_sg_attrs(attach->dev, sgt->sgl, sgt->nents, dir,
- DMA_ATTR_SKIP_CPU_SYNC)) {
+   sgt->nents = dma_map_sg_attrs(attach->dev, sgt->sgl,
+ sgt->orig_nents, dir,
+ DMA_ATTR_SKIP_CPU_SYNC);
+   if (!sgt->nents) {
sg_free_table(sgt);
kfree(sgt);
sgt = ERR_PTR(-ENOMEM);
@@ -625,7 +627,7 @@ static struct gntdev_dmabuf *dmabuf_imp_alloc_storage(int 
count)
 
/* Now convert sgt to array of pages and check for page validity. */
i = 0;
-   for_each_sg_page(sgt->sgl, &sg_iter, sgt->nents, 0) {
+   for_each_sg_page(sgt->sgl, &sg_iter, sgt->orig_nents, 0) {
struct page *page = sg_page_iter_page(&sg_iter);
/*
 * Check if page is valid: this can happen if we are given
-- 
1.9.1



[PATCH v2 05/21] drm: exynos: fix sg_table nents vs. orig_nents misuse

2020-05-04 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of created entries in the DMA address space. However, the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski 
---
For more information, see '[PATCH v2 00/21] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/4/373
---
 drivers/gpu/drm/exynos/exynos_drm_g2d.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c 
b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
index fcee33a..f995b0c 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
@@ -396,7 +396,7 @@ static void g2d_userptr_put_dma_addr(struct g2d_data *g2d,
 
 out:
dma_unmap_sg(to_dma_dev(g2d->drm_dev), g2d_userptr->sgt->sgl,
-   g2d_userptr->sgt->nents, DMA_BIDIRECTIONAL);
+g2d_userptr->sgt->orig_nents, DMA_BIDIRECTIONAL);
 
pages = frame_vector_pages(g2d_userptr->vec);
if (!IS_ERR(pages)) {
@@ -511,8 +511,9 @@ static dma_addr_t *g2d_userptr_get_dma_addr(struct g2d_data 
*g2d,
 
g2d_userptr->sgt = sgt;
 
-   if (!dma_map_sg(to_dma_dev(g2d->drm_dev), sgt->sgl, sgt->nents,
-   DMA_BIDIRECTIONAL)) {
+   sgt->nents = dma_map_sg(to_dma_dev(g2d->drm_dev), sgt->sgl,
+   sgt->orig_nents, DMA_BIDIRECTIONAL);
+   if (!sgt->nents) {
DRM_DEV_ERROR(g2d->dev, "failed to map sgt with dma region.\n");
ret = -ENOMEM;
goto err_sg_free_table;
-- 
1.9.1



[PATCH 4/5] iommu/amd: Update Device Table in increase_address_space()

2020-05-04 Thread Joerg Roedel
From: Joerg Roedel 

The Device Table needs to be updated before the new page-table root
can be published in domain->pt_root. Otherwise a concurrent call to
fetch_pte might fetch a PTE which is not reachable through the Device
Table Entry.

Fixes: 92d420ec028d ("iommu/amd: Relax locking in dma_ops path")
Reported-by: Qian Cai 
Tested-by: Qian Cai 
Signed-off-by: Joerg Roedel 
---
 drivers/iommu/amd_iommu.c | 49 ---
 1 file changed, 36 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index d2499c86d395..2ae1daac888a 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -101,6 +101,8 @@ struct kmem_cache *amd_iommu_irq_cache;
 static void update_domain(struct protection_domain *domain);
 static int protection_domain_init(struct protection_domain *domain);
 static void detach_device(struct device *dev);
+static void update_and_flush_device_table(struct protection_domain *domain,
+ struct domain_pgtable *pgtable);
 
 /
  *
@@ -1461,8 +1463,16 @@ static bool increase_address_space(struct 
protection_domain *domain,
 
*pte = PM_LEVEL_PDE(pgtable.mode, iommu_virt_to_phys(pgtable.root));
 
-   root = amd_iommu_domain_encode_pgtable(pte, pgtable.mode + 1);
+   pgtable.root  = pte;
+   pgtable.mode += 1;
+   update_and_flush_device_table(domain, &pgtable);
+   domain_flush_complete(domain);
 
+   /*
+* Device Table needs to be updated and flushed before the new root can
+* be published.
+*/
+   root = amd_iommu_domain_encode_pgtable(pte, pgtable.mode);
atomic64_set(&domain->pt_root, root);
 
ret = true;
@@ -1893,19 +1903,17 @@ static bool dma_ops_domain(struct protection_domain 
*domain)
 }
 
 static void set_dte_entry(u16 devid, struct protection_domain *domain,
+ struct domain_pgtable *pgtable,
  bool ats, bool ppr)
 {
-   struct domain_pgtable pgtable;
u64 pte_root = 0;
u64 flags = 0;
u32 old_domid;
 
-   amd_iommu_domain_get_pgtable(domain, &pgtable);
+   if (pgtable->mode != PAGE_MODE_NONE)
+   pte_root = iommu_virt_to_phys(pgtable->root);
 
-   if (pgtable.mode != PAGE_MODE_NONE)
-   pte_root = iommu_virt_to_phys(pgtable.root);
-
-   pte_root |= (pgtable.mode & DEV_ENTRY_MODE_MASK)
+   pte_root |= (pgtable->mode & DEV_ENTRY_MODE_MASK)
<< DEV_ENTRY_MODE_SHIFT;
pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV;
 
@@ -1978,6 +1986,7 @@ static void clear_dte_entry(u16 devid)
 static void do_attach(struct iommu_dev_data *dev_data,
  struct protection_domain *domain)
 {
+   struct domain_pgtable pgtable;
struct amd_iommu *iommu;
bool ats;
 
@@ -1993,7 +2002,9 @@ static void do_attach(struct iommu_dev_data *dev_data,
domain->dev_cnt += 1;
 
/* Update device table */
-   set_dte_entry(dev_data->devid, domain, ats, dev_data->iommu_v2);
+   amd_iommu_domain_get_pgtable(domain, &pgtable);
+   set_dte_entry(dev_data->devid, domain, &pgtable,
+ ats, dev_data->iommu_v2);
clone_aliases(dev_data->pdev);
 
device_flush_dte(dev_data);
@@ -2304,22 +2315,34 @@ static int amd_iommu_domain_get_attr(struct 
iommu_domain *domain,
  *
  */
 
-static void update_device_table(struct protection_domain *domain)
+static void update_device_table(struct protection_domain *domain,
+   struct domain_pgtable *pgtable)
 {
struct iommu_dev_data *dev_data;
 
list_for_each_entry(dev_data, &domain->dev_list, list) {
-   set_dte_entry(dev_data->devid, domain, dev_data->ats.enabled,
- dev_data->iommu_v2);
+   set_dte_entry(dev_data->devid, domain, pgtable,
+ dev_data->ats.enabled, dev_data->iommu_v2);
clone_aliases(dev_data->pdev);
}
 }
 
+static void update_and_flush_device_table(struct protection_domain *domain,
+ struct domain_pgtable *pgtable)
+{
+   update_device_table(domain, pgtable);
+   domain_flush_devices(domain);
+}
+
 static void update_domain(struct protection_domain *domain)
 {
-   update_device_table(domain);
+   struct domain_pgtable pgtable;
 
-   domain_flush_devices(domain);
+   /* Update device table */
+   amd_iommu_domain_get_pgtable(domain, &pgtable);
+   update_and_flush_device_table(domain, &pgtable);
+
+   /* Flush domain TLB(s) and wait for completion */
domain_flush_tlb_pde(domain);
domain_flush_complete(domain);
 }
-- 
2.17.1


[PATCH 5/5] iommu/amd: Do not flush Device Table in iommu_map_page()

2020-05-04 Thread Joerg Roedel
From: Joerg Roedel 

The flush of the Device Table Entries for the domain has already
happened in increase_address_space(), if necessary. Do not flush them
again in iommu_map_page().

Tested-by: Qian Cai 
Signed-off-by: Joerg Roedel 
---
 drivers/iommu/amd_iommu.c | 27 ---
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 2ae1daac888a..1dc3718560d0 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1446,15 +1446,18 @@ static bool increase_address_space(struct 
protection_domain *domain,
 {
struct domain_pgtable pgtable;
unsigned long flags;
-   bool ret = false;
+   bool ret = true;
u64 *pte, root;
 
spin_lock_irqsave(&domain->lock, flags);
 
amd_iommu_domain_get_pgtable(domain, &pgtable);
 
-   if (address <= PM_LEVEL_SIZE(pgtable.mode) ||
-   WARN_ON_ONCE(pgtable.mode == PAGE_MODE_6_LEVEL))
+   if (address <= PM_LEVEL_SIZE(pgtable.mode))
+   goto out;
+
+   ret = false;
+   if (WARN_ON_ONCE(pgtable.mode == PAGE_MODE_6_LEVEL))
goto out;
 
pte = (void *)get_zeroed_page(gfp);
@@ -1499,19 +1502,15 @@ static u64 *alloc_pte(struct protection_domain *domain,
amd_iommu_domain_get_pgtable(domain, &pgtable);
 
while (address > PM_LEVEL_SIZE(pgtable.mode)) {
-   bool upd = increase_address_space(domain, address, gfp);
-
-   /* Read new values to check if update was successful */
-   amd_iommu_domain_get_pgtable(domain, &pgtable);
-
/*
 * Return an error if there is no memory to update the
 * page-table.
 */
-   if (!upd && (address > PM_LEVEL_SIZE(pgtable.mode)))
+   if (!increase_address_space(domain, address, gfp))
return NULL;
 
-   *updated = *updated || upd;
+   /* Read new values to check if update was successful */
+   amd_iommu_domain_get_pgtable(domain, &pgtable);
}
 
 
@@ -1719,7 +1718,13 @@ static int iommu_map_page(struct protection_domain *dom,
unsigned long flags;
 
spin_lock_irqsave(&dom->lock, flags);
-   update_domain(dom);
+   /*
+* Flush domain TLB(s) and wait for completion. Any Device-Table
+* Updates and flushing already happened in
+* increase_address_space().
+*/
+   domain_flush_tlb_pde(dom);
+   domain_flush_complete(dom);
spin_unlock_irqrestore(&dom->lock, flags);
}
 
-- 
2.17.1



[PATCH v2 15/21] drm: xen: fix sg_table nents vs. orig_nents misuse

2020-05-04 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of created entries in the DMA address space. However, the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski 
---
For more information, see '[PATCH v2 00/21] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/4/373
---
 drivers/gpu/drm/xen/xen_drm_front_gem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xen/xen_drm_front_gem.c 
b/drivers/gpu/drm/xen/xen_drm_front_gem.c
index f0b85e0..ba4bdc5 100644
--- a/drivers/gpu/drm/xen/xen_drm_front_gem.c
+++ b/drivers/gpu/drm/xen/xen_drm_front_gem.c
@@ -215,7 +215,7 @@ struct drm_gem_object *
return ERR_PTR(ret);
 
DRM_DEBUG("Imported buffer of size %zu with nents %u\n",
- size, sgt->nents);
+ size, sgt->orig_nents);
 
return &xen_obj->base;
 }
-- 
1.9.1



[PATCH v2 06/21] drm: i915: fix sg_table nents vs. orig_nents misuse for dmabuf objects

2020-05-04 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of created entries in the DMA address space. However, the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h.

This driver creatively uses sg_table->orig_nents to store the size of the
allocated scatterlist and ignores the number of entries returned by the
dma_map_sg function. The sg_table->orig_nents is (mis)used to properly
free the (over)allocated scatterlist.

This patch only fixes the sg_table->nents entries in the sg_table objects
exported by the dmabuf related functions, so that other drivers, which
might share buffers with i915, can rely on the nents and orig_nents
values.
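
For illustration, an importer relies on the two counts roughly as follows
(program_hw_segment() and handle_page() are placeholders, not i915 code):

	struct scatterlist *sg;
	unsigned int i;

	/* DMA side: walk only the entries that dma_map_sg() actually produced */
	for_each_sg(sgt->sgl, sg, sgt->nents, i)
		program_hw_segment(sg_dma_address(sg), sg_dma_len(sg));

	/* CPU/page side: walk every entry originally allocated in the table */
	for_each_sg(sgt->sgl, sg, sgt->orig_nents, i)
		handle_page(sg_page(sg));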

Signed-off-by: Marek Szyprowski 
---
For more information, see '[PATCH v2 00/21] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/4/373
---
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c   | 9 +
 drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c | 5 +++--
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index 7db5a79..98159df 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -48,9 +48,10 @@ static struct sg_table *i915_gem_map_dma_buf(struct 
dma_buf_attachment *attachme
src = sg_next(src);
}
 
-   if (!dma_map_sg_attrs(attachment->dev,
- st->sgl, st->nents, dir,
- DMA_ATTR_SKIP_CPU_SYNC)) {
+   st->nents = dma_map_sg_attrs(attachment->dev,
+st->sgl, st->orig_nents, dir,
+DMA_ATTR_SKIP_CPU_SYNC);
+   if (!st->nents) {
ret = -ENOMEM;
goto err_free_sg;
}
@@ -74,7 +75,7 @@ static void i915_gem_unmap_dma_buf(struct dma_buf_attachment 
*attachment,
struct drm_i915_gem_object *obj = dma_buf_to_obj(attachment->dmabuf);
 
dma_unmap_sg_attrs(attachment->dev,
-  sg->sgl, sg->nents, dir,
+  sg->sgl, sg->orig_nents, dir,
   DMA_ATTR_SKIP_CPU_SYNC);
sg_free_table(sg);
kfree(sg);
diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c 
b/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
index debaf7b..5723525 100644
--- a/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
@@ -28,7 +28,8 @@ static struct sg_table *mock_map_dma_buf(struct 
dma_buf_attachment *attachment,
sg = sg_next(sg);
}
 
-   if (!dma_map_sg(attachment->dev, st->sgl, st->nents, dir)) {
+   st->nents = dma_map_sg(attachment->dev, st->sgl, st->orig_nents, dir);
+   if (!st->nents) {
err = -ENOMEM;
goto err_st;
}
@@ -46,7 +47,7 @@ static void mock_unmap_dma_buf(struct dma_buf_attachment 
*attachment,
   struct sg_table *st,
   enum dma_data_direction dir)
 {
-   dma_unmap_sg(attachment->dev, st->sgl, st->nents, dir);
+   dma_unmap_sg(attachment->dev, st->sgl, st->orig_nents, dir);
sg_free_table(st);
kfree(st);
 }
-- 
1.9.1



[PATCH v2 19/21] dmabuf: fix sg_table nents vs. orig_nents misuse

2020-05-04 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of created entries in the DMA address space. However, the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski 
---
For more information, see '[PATCH v2 00/21] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/4/373
---
 drivers/dma-buf/heaps/heap-helpers.c | 7 ---
 drivers/dma-buf/udmabuf.c| 5 +++--
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/dma-buf/heaps/heap-helpers.c 
b/drivers/dma-buf/heaps/heap-helpers.c
index 9f964ca..b923863 100644
--- a/drivers/dma-buf/heaps/heap-helpers.c
+++ b/drivers/dma-buf/heaps/heap-helpers.c
@@ -144,8 +144,9 @@ struct sg_table *dma_heap_map_dma_buf(struct 
dma_buf_attachment *attachment,
 
table = &a->table;
 
-   if (!dma_map_sg(attachment->dev, table->sgl, table->nents,
-   direction))
+   table->nents = dma_map_sg(attachment->dev, table->sgl,
+ table->orig_nents, direction);
+   if (!table->nents)
table = ERR_PTR(-ENOMEM);
return table;
 }
@@ -154,7 +155,7 @@ static void dma_heap_unmap_dma_buf(struct 
dma_buf_attachment *attachment,
   struct sg_table *table,
   enum dma_data_direction direction)
 {
-   dma_unmap_sg(attachment->dev, table->sgl, table->nents, direction);
+   dma_unmap_sg(attachment->dev, table->sgl, table->orig_nents, direction);
 }
 
 static vm_fault_t dma_heap_vm_fault(struct vm_fault *vmf)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index acb26c6..ea0cf71 100644
--- a/drivers/dma-buf/udmabuf.c
+++ b/drivers/dma-buf/udmabuf.c
@@ -63,7 +63,8 @@ static struct sg_table *get_sg_table(struct device *dev, 
struct dma_buf *buf,
GFP_KERNEL);
if (ret < 0)
goto err;
-   if (!dma_map_sg(dev, sg->sgl, sg->nents, direction)) {
+   sg->nents = dma_map_sg(dev, sg->sgl, sg->orig_nents, direction);
+   if (!sg->nents) {
ret = -EINVAL;
goto err;
}
@@ -78,7 +79,7 @@ static struct sg_table *get_sg_table(struct device *dev, 
struct dma_buf *buf,
 static void put_sg_table(struct device *dev, struct sg_table *sg,
 enum dma_data_direction direction)
 {
-   dma_unmap_sg(dev, sg->sgl, sg->nents, direction);
+   dma_unmap_sg(dev, sg->sgl, sg->orig_nents, direction);
sg_free_table(sg);
kfree(sg);
 }
-- 
1.9.1



[PATCH v2 03/21] drm: armada: fix sg_table nents vs. orig_nents misuse

2020-05-04 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of created entries in the DMA address space. However, the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski 
---
For more information, see '[PATCH v2 00/21] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/4/373
---
 drivers/gpu/drm/armada/armada_gem.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/armada/armada_gem.c 
b/drivers/gpu/drm/armada/armada_gem.c
index 976685f..749647f 100644
--- a/drivers/gpu/drm/armada/armada_gem.c
+++ b/drivers/gpu/drm/armada/armada_gem.c
@@ -407,8 +407,10 @@ int armada_gem_pwrite_ioctl(struct drm_device *dev, void 
*data,
sg_set_page(sg, page, PAGE_SIZE, 0);
}
 
-   if (dma_map_sg(attach->dev, sgt->sgl, sgt->nents, dir) == 0) {
-   num = sgt->nents;
+   sgt->nents = dma_map_sg(attach->dev, sgt->sgl, sgt->orig_nents,
+   dir);
+   if (sgt->nents == 0) {
+   num = sgt->orig_nents;
goto release;
}
} else if (dobj->page) {
@@ -418,7 +420,9 @@ int armada_gem_pwrite_ioctl(struct drm_device *dev, void 
*data,
 
sg_set_page(sgt->sgl, dobj->page, dobj->obj.size, 0);
 
-   if (dma_map_sg(attach->dev, sgt->sgl, sgt->nents, dir) == 0)
+   sgt->nents = dma_map_sg(attach->dev, sgt->sgl, sgt->orig_nents,
+   dir);
+   if (sgt->nents == 0)
goto free_table;
} else if (dobj->linear) {
/* Single contiguous physical region - no struct page */
@@ -449,11 +453,11 @@ static void armada_gem_prime_unmap_dma_buf(struct 
dma_buf_attachment *attach,
int i;
 
if (!dobj->linear)
-   dma_unmap_sg(attach->dev, sgt->sgl, sgt->nents, dir);
+   dma_unmap_sg(attach->dev, sgt->sgl, sgt->orig_nents, dir);
 
if (dobj->obj.filp) {
struct scatterlist *sg;
-   for_each_sg(sgt->sgl, sg, sgt->nents, i)
+   for_each_sg(sgt->sgl, sg, sgt->orig_nents, i)
put_page(sg_page(sg));
}
 
-- 
1.9.1



[PATCH v2 09/21] drm: panfrost: fix sg_table nents vs. orig_nents misuse

2020-05-04 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of created entries in the DMA address space. However, the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski 
---
For more information, see '[PATCH v2 00/21] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/4/373
---
 drivers/gpu/drm/panfrost/panfrost_gem.c | 3 ++-
 drivers/gpu/drm/panfrost/panfrost_mmu.c | 4 +++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c 
b/drivers/gpu/drm/panfrost/panfrost_gem.c
index 17b654e..22fec7c 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
@@ -42,7 +42,8 @@ static void panfrost_gem_free_object(struct drm_gem_object 
*obj)
for (i = 0; i < n_sgt; i++) {
if (bo->sgts[i].sgl) {
dma_unmap_sg(pfdev->dev, bo->sgts[i].sgl,
-bo->sgts[i].nents, 
DMA_BIDIRECTIONAL);
+bo->sgts[i].orig_nents,
+DMA_BIDIRECTIONAL);
sg_free_table(&bo->sgts[i]);
}
}
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c 
b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index ed28aeb..2d9b1f9 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -517,7 +517,9 @@ static int panfrost_mmu_map_fault_addr(struct 
panfrost_device *pfdev, int as,
if (ret)
goto err_pages;
 
-   if (!dma_map_sg(pfdev->dev, sgt->sgl, sgt->nents, DMA_BIDIRECTIONAL)) {
+   sgt->nents = dma_map_sg(pfdev->dev, sgt->sgl, sgt->orig_nents,
+   DMA_BIDIRECTIONAL);
+   if (!sgt->nents) {
ret = -EINVAL;
goto err_map;
}
-- 
1.9.1



[PATCH v2 04/21] drm: etnaviv: fix sg_table nents vs. orig_nents misuse

2020-05-04 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of created entries in the DMA address space. However, the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski 
---
For more information, see '[PATCH v2 00/21] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/4/373
---
 drivers/gpu/drm/etnaviv/etnaviv_gem.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c 
b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
index dc9ef30..a224a97 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
@@ -27,7 +27,8 @@ static void etnaviv_gem_scatter_map(struct etnaviv_gem_object 
*etnaviv_obj)
 * because display controller, GPU, etc. are not coherent.
 */
if (etnaviv_obj->flags & ETNA_BO_CACHE_MASK)
-   dma_map_sg(dev->dev, sgt->sgl, sgt->nents, DMA_BIDIRECTIONAL);
+   sgt->nents = dma_map_sg(dev->dev, sgt->sgl, sgt->orig_nents,
+   DMA_BIDIRECTIONAL);
 }
 
 static void etnaviv_gem_scatterlist_unmap(struct etnaviv_gem_object 
*etnaviv_obj)
@@ -51,7 +52,8 @@ static void etnaviv_gem_scatterlist_unmap(struct 
etnaviv_gem_object *etnaviv_obj
 * discard those writes.
 */
if (etnaviv_obj->flags & ETNA_BO_CACHE_MASK)
-   dma_unmap_sg(dev->dev, sgt->sgl, sgt->nents, DMA_BIDIRECTIONAL);
+   dma_unmap_sg(dev->dev, sgt->sgl, sgt->orig_nents,
+DMA_BIDIRECTIONAL);
 }
 
 /* called with etnaviv_obj->lock held */
@@ -405,7 +407,7 @@ int etnaviv_gem_cpu_prep(struct drm_gem_object *obj, u32 op,
 
if (etnaviv_obj->flags & ETNA_BO_CACHED) {
dma_sync_sg_for_cpu(dev->dev, etnaviv_obj->sgt->sgl,
-   etnaviv_obj->sgt->nents,
+   etnaviv_obj->sgt->orig_nents,
etnaviv_op_to_dma_dir(op));
etnaviv_obj->last_cpu_prep_op = op;
}
@@ -422,7 +424,7 @@ int etnaviv_gem_cpu_fini(struct drm_gem_object *obj)
/* fini without a prep is almost certainly a userspace error */
WARN_ON(etnaviv_obj->last_cpu_prep_op == 0);
dma_sync_sg_for_device(dev->dev, etnaviv_obj->sgt->sgl,
-   etnaviv_obj->sgt->nents,
+   etnaviv_obj->sgt->orig_nents,
etnaviv_op_to_dma_dir(etnaviv_obj->last_cpu_prep_op));
etnaviv_obj->last_cpu_prep_op = 0;
}
-- 
1.9.1



[PATCH v2 16/21] drm: host1x: fix sg_table nents vs. orig_nents misuse

2020-05-04 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of created entries in the DMA address space. However, the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski 
---
For more information, see '[PATCH v2 00/21] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/4/373
---
 drivers/gpu/host1x/job.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
index a10643a..3ea185e 100644
--- a/drivers/gpu/host1x/job.c
+++ b/drivers/gpu/host1x/job.c
@@ -166,8 +166,9 @@ static unsigned int pin_job(struct host1x *host, struct 
host1x_job *job)
goto unpin;
}
 
-   err = dma_map_sg(dev, sgt->sgl, sgt->nents, dir);
-   if (!err) {
+   sgt->nents = dma_map_sg(dev, sgt->sgl, sgt->orig_nents,
+   dir);
+   if (!sgt->nents) {
err = -ENOMEM;
goto unpin;
}
@@ -217,7 +218,7 @@ static unsigned int pin_job(struct host1x *host, struct 
host1x_job *job)
}
 
if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) && host->domain) {
-   for_each_sg(sgt->sgl, sg, sgt->nents, j)
+   for_each_sg(sgt->sgl, sg, sgt->orig_nents, j)
gather_size += sg->length;
gather_size = iova_align(&host->iova, gather_size);
 
@@ -231,7 +232,7 @@ static unsigned int pin_job(struct host1x *host, struct 
host1x_job *job)
 
err = iommu_map_sg(host->domain,
iova_dma_addr(&host->iova, alloc),
-   sgt->sgl, sgt->nents, IOMMU_READ);
+   sgt->sgl, sgt->orig_nents, IOMMU_READ);
if (err == 0) {
__free_iova(&host->iova, alloc);
err = -EINVAL;
@@ -241,9 +242,9 @@ static unsigned int pin_job(struct host1x *host, struct 
host1x_job *job)
job->unpins[job->num_unpins].size = gather_size;
phys_addr = iova_dma_addr(&host->iova, alloc);
} else if (sgt) {
-   err = dma_map_sg(host->dev, sgt->sgl, sgt->nents,
-DMA_TO_DEVICE);
-   if (!err) {
+   sgt->nents = dma_map_sg(host->dev, sgt->sgl,
+   sgt->orig_nents, DMA_TO_DEVICE);
+   if (!sgt->nents) {
err = -ENOMEM;
goto unpin;
}
@@ -647,7 +648,7 @@ void host1x_job_unpin(struct host1x_job *job)
}
 
if (unpin->dev && sgt)
-   dma_unmap_sg(unpin->dev, sgt->sgl, sgt->nents,
+   dma_unmap_sg(unpin->dev, sgt->sgl, sgt->orig_nents,
 unpin->dir);
 
host1x_bo_unpin(dev, unpin->bo, sgt);
-- 
1.9.1



[PATCH v2 08/21] drm: msm: fix sg_table nents vs. orig_nents misuse

2020-05-04 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of created entries in the DMA address space. However, the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski 
---
For more information, see '[PATCH v2 00/21] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/4/373
---
 drivers/gpu/drm/msm/msm_gem.c   | 8 
 drivers/gpu/drm/msm/msm_iommu.c | 3 ++-
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 5a6a79f..54c3bbb 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -54,10 +54,10 @@ static void sync_for_device(struct msm_gem_object *msm_obj)
 
if (get_dma_ops(dev) && IS_ENABLED(CONFIG_ARM64)) {
dma_sync_sg_for_device(dev, msm_obj->sgt->sgl,
-   msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
+   msm_obj->sgt->orig_nents, DMA_BIDIRECTIONAL);
} else {
dma_map_sg(dev, msm_obj->sgt->sgl,
-   msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
+   msm_obj->sgt->orig_nents, DMA_BIDIRECTIONAL);
}
 }
 
@@ -67,10 +67,10 @@ static void sync_for_cpu(struct msm_gem_object *msm_obj)
 
if (get_dma_ops(dev) && IS_ENABLED(CONFIG_ARM64)) {
dma_sync_sg_for_cpu(dev, msm_obj->sgt->sgl,
-   msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
+   msm_obj->sgt->orig_nents, DMA_BIDIRECTIONAL);
} else {
dma_unmap_sg(dev, msm_obj->sgt->sgl,
-   msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
+   msm_obj->sgt->orig_nents, DMA_BIDIRECTIONAL);
}
 }
 
diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index ad58cfe..b0ca084 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -43,7 +43,8 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova,
struct msm_iommu *iommu = to_msm_iommu(mmu);
size_t ret;
 
-   ret = iommu_map_sg(iommu->domain, iova, sgt->sgl, sgt->nents, prot);
+   ret = iommu_map_sg(iommu->domain, iova, sgt->sgl, sgt->orig_nents,
+  prot);
WARN_ON(!ret);
 
return (ret == len) ? 0 : -EINVAL;
-- 
1.9.1



[PATCH v2 21/21] staging: ion: fix sg_table nents vs. orig_nents misuse

2020-05-04 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of created entries in the DMA address space. However, the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski 
---
For more information, see '[PATCH v2 00/21] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/4/373
---
 drivers/staging/android/ion/ion.c | 17 +
 drivers/staging/android/ion/ion_heap.c|  6 +++---
 drivers/staging/android/ion/ion_system_heap.c |  2 +-
 3 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/staging/android/ion/ion.c 
b/drivers/staging/android/ion/ion.c
index 38b51ea..b14170c 100644
--- a/drivers/staging/android/ion/ion.c
+++ b/drivers/staging/android/ion/ion.c
@@ -147,14 +147,14 @@ static struct sg_table *dup_sg_table(struct sg_table 
*table)
if (!new_table)
return ERR_PTR(-ENOMEM);
 
-   ret = sg_alloc_table(new_table, table->nents, GFP_KERNEL);
+   ret = sg_alloc_table(new_table, table->orig_nents, GFP_KERNEL);
if (ret) {
kfree(new_table);
return ERR_PTR(-ENOMEM);
}
 
new_sg = new_table->sgl;
-   for_each_sg(table->sgl, sg, table->nents, i) {
+   for_each_sg(table->sgl, sg, table->orig_nents, i) {
memcpy(new_sg, sg, sizeof(*sg));
new_sg->dma_address = 0;
new_sg = sg_next(new_sg);
@@ -227,8 +227,9 @@ static struct sg_table *ion_map_dma_buf(struct 
dma_buf_attachment *attachment,
 
table = a->table;
 
-   if (!dma_map_sg(attachment->dev, table->sgl, table->nents,
-   direction))
+   table->nents = dma_map_sg(attachment->dev, table->sgl,
+ table->orig_nents, direction);
+   if (!table->nents)
return ERR_PTR(-ENOMEM);
 
return table;
@@ -238,7 +239,7 @@ static void ion_unmap_dma_buf(struct dma_buf_attachment 
*attachment,
  struct sg_table *table,
  enum dma_data_direction direction)
 {
-   dma_unmap_sg(attachment->dev, table->sgl, table->nents, direction);
+   dma_unmap_sg(attachment->dev, table->sgl, table->orig_nents, direction);
 }
 
 static int ion_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
@@ -297,7 +298,7 @@ static int ion_dma_buf_begin_cpu_access(struct dma_buf 
*dmabuf,
 
mutex_lock(&buffer->lock);
list_for_each_entry(a, &buffer->attachments, list) {
-   dma_sync_sg_for_cpu(a->dev, a->table->sgl, a->table->nents,
+   dma_sync_sg_for_cpu(a->dev, a->table->sgl, a->table->orig_nents,
direction);
}
 
@@ -320,8 +321,8 @@ static int ion_dma_buf_end_cpu_access(struct dma_buf 
*dmabuf,
 
mutex_lock(&buffer->lock);
list_for_each_entry(a, &buffer->attachments, list) {
-   dma_sync_sg_for_device(a->dev, a->table->sgl, a->table->nents,
-  direction);
+   dma_sync_sg_for_device(a->dev, a->table->sgl,
+  a->table->orig_nents, direction);
}
mutex_unlock(&buffer->lock);
 
diff --git a/drivers/staging/android/ion/ion_heap.c 
b/drivers/staging/android/ion/ion_heap.c
index 0755b11..f2f7ca7 100644
--- a/drivers/staging/android/ion/ion_heap.c
+++ b/drivers/staging/android/ion/ion_heap.c
@@ -38,7 +38,7 @@ void *ion_heap_map_kernel(struct ion_heap *heap,
else
pgprot = pgprot_writecombine(PAGE_KERNEL);
 
-   for_each_sg(table->sgl, sg, table->nents, i) {
+   for_each_sg(table->sgl, sg, table->orig_nents, i) {
int npages_this_entry = PAGE_ALIGN(sg->length) / PAGE_SIZE;
struct page *page = sg_page(sg);
 
@@ -71,7 +71,7 @@ int ion_heap_map_user(struct ion_heap *heap, struct ion_buffer *buffer,
int i;
int ret;
 
-   for_each_sg(table->sgl, sg, table->nents, i) {
+   for_each_sg(table->sgl, sg, table->orig_nents, i) {
struct page *page = sg_page(sg);
unsigned long remainder = vma->vm_end - addr;
unsigned long len = sg->length;
@@ -142,7 +142,7 @@ int ion_heap_buffer_zero(struct ion_buffer *buffer)
else
pgprot = pgprot_writecombine(PAGE_KERNEL);
 
-   return ion_heap_sglist_zero(table->sgl, table->nents, pgprot);
+   return ion_heap_sglist_zero(table->sgl, table->orig_nents, pgprot);
 }
 
 int ion_heap_pages_zero(struct page *page, size_t size, pgprot_t pgprot)
diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index b83a1d1..

[PATCH v2 01/21] drm: core: fix sg_table nents vs. orig_nents misuse

2020-05-04 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of entries created in the DMA address space. However, the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
made with the original number of entries passed to dma_map_sg. The
sg_table->nents field, in turn, holds the result of the dma_map_sg call, as
stated in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski 
---
 drivers/gpu/drm/drm_cache.c| 2 +-
 drivers/gpu/drm/drm_gem_shmem_helper.c | 7 ---
 drivers/gpu/drm/drm_prime.c| 9 +
 3 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index 03e01b0..63bd497 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -127,7 +127,7 @@ static void drm_cache_flush_clflush(struct page *pages[],
struct sg_page_iter sg_iter;
 
mb(); /*CLFLUSH is ordered only by using memory barriers*/
-   for_each_sg_page(st->sgl, &sg_iter, st->nents, 0)
+   for_each_sg_page(st->sgl, &sg_iter, st->orig_nents, 0)
drm_clflush_page(sg_page_iter_page(&sg_iter));
mb(); /*Make sure that all cache line entry is flushed*/
 
diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index df31e57..f47caa7 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -118,7 +118,7 @@ void drm_gem_shmem_free_object(struct drm_gem_object *obj)
} else {
if (shmem->sgt) {
dma_unmap_sg(obj->dev->dev, shmem->sgt->sgl,
-shmem->sgt->nents, DMA_BIDIRECTIONAL);
+shmem->sgt->orig_nents, DMA_BIDIRECTIONAL);
sg_free_table(shmem->sgt);
kfree(shmem->sgt);
}
@@ -396,7 +396,7 @@ void drm_gem_shmem_purge_locked(struct drm_gem_object *obj)
WARN_ON(!drm_gem_shmem_is_purgeable(shmem));
 
dma_unmap_sg(obj->dev->dev, shmem->sgt->sgl,
-shmem->sgt->nents, DMA_BIDIRECTIONAL);
+shmem->sgt->orig_nents, DMA_BIDIRECTIONAL);
sg_free_table(shmem->sgt);
kfree(shmem->sgt);
shmem->sgt = NULL;
@@ -623,7 +623,8 @@ struct sg_table *drm_gem_shmem_get_pages_sgt(struct drm_gem_object *obj)
goto err_put_pages;
}
/* Map the pages for use by the h/w. */
-   dma_map_sg(obj->dev->dev, sgt->sgl, sgt->nents, DMA_BIDIRECTIONAL);
+   sgt->nents = dma_map_sg(obj->dev->dev, sgt->sgl, sgt->orig_nents,
+   DMA_BIDIRECTIONAL);
 
shmem->sgt = sgt;
 
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 282774e..f3e2d2a 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -626,8 +626,9 @@ struct sg_table *drm_gem_map_dma_buf(struct dma_buf_attachment *attach,
else
sgt = obj->dev->driver->gem_prime_get_sg_table(obj);
 
-   if (!dma_map_sg_attrs(attach->dev, sgt->sgl, sgt->nents, dir,
- DMA_ATTR_SKIP_CPU_SYNC)) {
+   sgt->nents = dma_map_sg_attrs(attach->dev, sgt->sgl, sgt->orig_nents,
+ dir, DMA_ATTR_SKIP_CPU_SYNC);
+   if (!sgt->nents) {
sg_free_table(sgt);
kfree(sgt);
sgt = ERR_PTR(-ENOMEM);
@@ -652,7 +653,7 @@ void drm_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
if (!sgt)
return;
 
-   dma_unmap_sg_attrs(attach->dev, sgt->sgl, sgt->nents, dir,
+   dma_unmap_sg_attrs(attach->dev, sgt->sgl, sgt->orig_nents, dir,
   DMA_ATTR_SKIP_CPU_SYNC);
sg_free_table(sgt);
kfree(sgt);
@@ -975,7 +976,7 @@ int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, struct page **pages,
 */
page_index = 0;
dma_index = 0;
-   for_each_sg(sgt->sgl, sg, sgt->nents, count) {
+   for_each_sg(sgt->sgl, sg, sgt->orig_nents, count) {
page_len = sg->length;
page = sg_page(sg);
dma_len = sg_dma_len(sg);
-- 
1.9.1



[PATCH v2 14/21] drm: vmwgfx: fix sg_table nents vs. orig_nents misuse

2020-05-04 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of entries created in the DMA address space. However, the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
made with the original number of entries passed to dma_map_sg. The
sg_table->nents field, in turn, holds the result of the dma_map_sg call, as
stated in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski 
---
For more information, see '[PATCH v2 00/21] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/4/373
---
 drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c b/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c
index bf0bc46..a5fd128 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c
@@ -362,7 +362,7 @@ static void vmw_ttm_unmap_from_dma(struct vmw_ttm_tt *vmw_tt)
 {
struct device *dev = vmw_tt->dev_priv->dev->dev;
 
-   dma_unmap_sg(dev, vmw_tt->sgt.sgl, vmw_tt->sgt.nents,
+   dma_unmap_sg(dev, vmw_tt->sgt.sgl, vmw_tt->sgt.orig_nents,
DMA_BIDIRECTIONAL);
vmw_tt->sgt.nents = vmw_tt->sgt.orig_nents;
 }
@@ -449,10 +449,10 @@ static int vmw_ttm_map_dma(struct vmw_ttm_tt *vmw_tt)
if (unlikely(ret != 0))
goto out_sg_alloc_fail;
 
-   if (vsgt->num_pages > vmw_tt->sgt.nents) {
+   if (vsgt->num_pages > vmw_tt->sgt.orig_nents) {
uint64_t over_alloc =
sgl_size * (vsgt->num_pages -
-   vmw_tt->sgt.nents);
+   vmw_tt->sgt.orig_nents);
 
ttm_mem_global_free(glob, over_alloc);
vmw_tt->sg_alloc_size -= over_alloc;
-- 
1.9.1



[PATCH v2 11/21] drm: rockchip: fix sg_table nents vs. orig_nents misuse

2020-05-04 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of entries created in the DMA address space. However, the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
made with the original number of entries passed to dma_map_sg. The
sg_table->nents field, in turn, holds the result of the dma_map_sg call, as
stated in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski 
---
For more information, see '[PATCH v2 00/21] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/4/373
---
 drivers/gpu/drm/rockchip/rockchip_drm_gem.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
index 0d18846..a024c71 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
@@ -37,7 +37,7 @@ static int rockchip_gem_iommu_map(struct rockchip_gem_object *rk_obj)
rk_obj->dma_addr = rk_obj->mm.start;
 
ret = iommu_map_sg(private->domain, rk_obj->dma_addr, rk_obj->sgt->sgl,
-  rk_obj->sgt->nents, prot);
+  rk_obj->sgt->orig_nents, prot);
if (ret < rk_obj->base.size) {
DRM_ERROR("failed to map buffer: size=%zd request_size=%zd\n",
  ret, rk_obj->base.size);
@@ -98,11 +98,11 @@ static int rockchip_gem_get_pages(struct rockchip_gem_object *rk_obj)
 * TODO: Replace this by drm_clflush_sg() once it can be implemented
 * without relying on symbols that are not exported.
 */
-   for_each_sg(rk_obj->sgt->sgl, s, rk_obj->sgt->nents, i)
+   for_each_sg(rk_obj->sgt->sgl, s, rk_obj->sgt->orig_nents, i)
sg_dma_address(s) = sg_phys(s);
 
-   dma_sync_sg_for_device(drm->dev, rk_obj->sgt->sgl, rk_obj->sgt->nents,
-  DMA_TO_DEVICE);
+   dma_sync_sg_for_device(drm->dev, rk_obj->sgt->sgl,
+  rk_obj->sgt->orig_nents, DMA_TO_DEVICE);
 
return 0;
 
@@ -351,7 +351,8 @@ void rockchip_gem_free_object(struct drm_gem_object *obj)
rockchip_gem_iommu_unmap(rk_obj);
} else {
dma_unmap_sg(drm->dev, rk_obj->sgt->sgl,
-rk_obj->sgt->nents, DMA_BIDIRECTIONAL);
+rk_obj->sgt->orig_nents,
+DMA_BIDIRECTIONAL);
}
drm_prime_gem_destroy(obj, rk_obj->sgt);
} else {
@@ -493,14 +494,14 @@ static unsigned long rockchip_sg_get_contiguous_size(struct sg_table *sgt,
struct sg_table *sg,
struct rockchip_gem_object *rk_obj)
 {
-   int count = dma_map_sg(drm->dev, sg->sgl, sg->nents,
+   int count = dma_map_sg(drm->dev, sg->sgl, sg->orig_nents,
   DMA_BIDIRECTIONAL);
if (!count)
return -EINVAL;
 
if (rockchip_sg_get_contiguous_size(sg, count) < attach->dmabuf->size) {
DRM_ERROR("failed to map sg_table to contiguous linear address.\n");
-   dma_unmap_sg(drm->dev, sg->sgl, sg->nents,
+   dma_unmap_sg(drm->dev, sg->sgl, sg->orig_nents,
 DMA_BIDIRECTIONAL);
return -EINVAL;
}
-- 
1.9.1



[PATCH 2/5] iommu/amd: Do not loop forever when trying to increase address space

2020-05-04 Thread Joerg Roedel
From: Joerg Roedel 

When increase_address_space() fails to allocate memory, alloc_pte()
will call it again until it succeeds. Do not loop forever trying to
increase the address space; just return an error instead.

Tested-by: Qian Cai 
Signed-off-by: Joerg Roedel 
---
 drivers/iommu/amd_iommu.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 28229a38af4d..68da484a69dd 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1489,8 +1489,19 @@ static u64 *alloc_pte(struct protection_domain *domain,
amd_iommu_domain_get_pgtable(domain, &pgtable);
 
while (address > PM_LEVEL_SIZE(pgtable.mode)) {
-   *updated = increase_address_space(domain, address, gfp) || *updated;
+   bool upd = increase_address_space(domain, address, gfp);
+
+   /* Read new values to check if update was successful */
amd_iommu_domain_get_pgtable(domain, &pgtable);
+
+   /*
+* Return an error if there is no memory to update the
+* page-table.
+*/
+   if (!upd && (address > PM_LEVEL_SIZE(pgtable.mode)))
+   return NULL;
+
+   *updated = *updated || upd;
}
 
 
-- 
2.17.1



[PATCH 3/5] iommu/amd: Call domain_flush_complete() in update_domain()

2020-05-04 Thread Joerg Roedel
From: Joerg Roedel 

The update_domain() function is expected to also inform the hardware
about domain changes. This needs a COMPLETION_WAIT command to be sent
to all IOMMUs which use the domain.

Tested-by: Qian Cai 
Signed-off-by: Joerg Roedel 
---
 drivers/iommu/amd_iommu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 68da484a69dd..d2499c86d395 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2321,6 +2321,7 @@ static void update_domain(struct protection_domain *domain)
 
domain_flush_devices(domain);
domain_flush_tlb_pde(domain);
+   domain_flush_complete(domain);
 }
 
 int __init amd_iommu_init_api(void)
-- 
2.17.1



[PATCH v2 10/21] drm: radeon: fix sg_table nents vs. orig_nents misuse

2020-05-04 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of entries created in the DMA address space. However, the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
made with the original number of entries passed to dma_map_sg. The
sg_table->nents field, in turn, holds the result of the dma_map_sg call, as
stated in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski 
Reviewed-by: Christian König 
---
For more information, see '[PATCH v2 00/21] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/4/373
---
 drivers/gpu/drm/radeon/radeon_ttm.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index 5d50c9e..f8275c8 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -481,7 +481,7 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm)
 {
struct radeon_device *rdev = radeon_get_rdev(ttm->bdev);
struct radeon_ttm_tt *gtt = (void *)ttm;
-   unsigned pinned = 0, nents;
+   unsigned pinned = 0;
int r;
 
int write = !(gtt->userflags & RADEON_GEM_USERPTR_READONLY);
@@ -522,8 +522,9 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm)
goto release_sg;
 
r = -ENOMEM;
-   nents = dma_map_sg(rdev->dev, ttm->sg->sgl, ttm->sg->nents, direction);
-   if (nents == 0)
+   ttm->sg->nents = dma_map_sg(rdev->dev, ttm->sg->sgl,
+   ttm->sg->orig_nents, direction);
+   if (ttm->sg->nents == 0)
goto release_sg;
 
drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
@@ -554,9 +555,9 @@ static void radeon_ttm_tt_unpin_userptr(struct ttm_tt *ttm)
return;
 
/* free the sg table and pages again */
-   dma_unmap_sg(rdev->dev, ttm->sg->sgl, ttm->sg->nents, direction);
+   dma_unmap_sg(rdev->dev, ttm->sg->sgl, ttm->sg->orig_nents, direction);
 
-   for_each_sg_page(ttm->sg->sgl, &sg_iter, ttm->sg->nents, 0) {
+   for_each_sg_page(ttm->sg->sgl, &sg_iter, ttm->sg->orig_nents, 0) {
struct page *page = sg_page_iter_page(&sg_iter);
if (!(gtt->userflags & RADEON_GEM_USERPTR_READONLY))
set_page_dirty(page);
-- 
1.9.1


[PATCH 0/5] iommu/amd: Fix race conditions around increase_address_space()

2020-05-04 Thread Joerg Roedel
Hi,

Qian reported more race conditions around the dma-mapping
code path in the AMD IOMMU driver. The race conditions are
between increase_address_space() and fetch_pte(); there were
two of them:

1) the 'pt_root' and 'mode' fields of 'struct
   protection_domain' relate to each other so that
   they must be updated together atomically.

2) The increase_address_space() function publishes
   the updated page-table before it has been written
   to the DTE. This might cause PTEs to be mapped
   and addresses handed to the device which are not
   yet reachable through the DTE entry, causing IO
   page-faults.

This patch-set fixes these issues, as tested by Qian Cai.
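
For the first race, the general idea (a minimal sketch with made-up
names; the actual patches may implement it differently) is to pack the
page-table root pointer and the mode into a single 64-bit value, so
that readers and writers always observe a consistent pair:

#include <linux/atomic.h>
#include <linux/mm.h>		/* PAGE_MASK */
#include <linux/types.h>

/* the root pointer is page aligned, so its low bits can carry the mode */
static void example_set_pt_root(atomic64_t *pt_root, u64 *root, int mode)
{
	atomic64_set(pt_root, (u64)(unsigned long)root | (mode & 0x7));
}

static void example_get_pt_root(atomic64_t *pt_root, u64 **root, int *mode)
{
	u64 val = atomic64_read(pt_root);

	*root = (u64 *)(unsigned long)(val & PAGE_MASK);
	*mode = val & 0x7;
}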

Thanks a lot again for reporting these issues and testing
the fixes!

Regards,

Joerg


Joerg Roedel (5):
  iommu/amd: Fix race in increase_address_space()/fetch_pte()
  iommu/amd: Do not loop forever when trying to increase address space
  iommu/amd: Call domain_flush_complete() in update_domain()
  iommu/amd: Update Device Table in increase_address_space()
  iommu/amd: Do not flush Device Table in iommu_map_page()

 drivers/iommu/amd_iommu.c   | 198 +---
 drivers/iommu/amd_iommu_types.h |   9 +-
 2 files changed, 161 insertions(+), 46 deletions(-)

-- 
2.17.1



Re: [PATCH v2 00/21] DRM: fix struct sg_table nents vs. orig_nents misuse

2020-05-04 Thread Christoph Hellwig
On Mon, May 04, 2020 at 02:50:17PM +0200, Marek Szyprowski wrote:
> v2:
> - dropped most of the changes to drm/i915
> - added fixes for rcar-du, xen, media and ion
> - fixed a few issues pointed by kbuild test robot
> - added wide cc: list for each patch

Didn't you plan to add dma_map_sgbuf and co helper?



[PATCH v2 00/21] DRM: fix struct sg_table nents vs. orig_nents misuse

2020-05-04 Thread Marek Szyprowski
Dear All,

During the Exynos DRM GEM rework, and while fixing the issues in the
drm_prime_sg_to_page_addr_arrays() function [1], I noticed that most
drivers in the DRM framework incorrectly use the nents and orig_nents
entries of struct sg_table.

For most DMA-mapping implementations, exchanging those two entries or
using nents for all loops over the scatterlist is harmless, because both
hold the same value. There exist, however, DMA-mapping implementations
for which such incorrect usage breaks things. The nents returned by
dma_map_sg() might be lower than the nents passed as its parameter, and
this is perfectly fine. The DMA framework or IOMMU is allowed to join
consecutive chunks while mapping if such an operation is supported by
the underlying HW (bus, bridge, IOMMU, etc). An example of a case where
dma_map_sg() might return 1 'DMA' chunk for 4 'physical' pages is
described here [2].
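
To make the two views of the same sg_table concrete, the sketch below
(illustrative only, not taken from any of the patches) walks the
CPU-side entries and the DMA-side entries with the appropriate counts
and accessors:

#include <linux/dma-mapping.h>
#include <linux/printk.h>
#include <linux/scatterlist.h>

static void example_walk(struct sg_table *sgt)
{
	struct scatterlist *sg;
	int i;

	/* CPU view: orig_nents entries, one per physical chunk */
	for_each_sg(sgt->sgl, sg, sgt->orig_nents, i)
		pr_info("page %p, len %u\n", sg_page(sg), sg->length);

	/* DMA view: nents entries, possibly fewer after IOMMU merging */
	for_each_sg(sgt->sgl, sg, sgt->nents, i)
		pr_info("dma %pad, len %u\n", &sg_dma_address(sg),
			sg_dma_len(sg));
}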

The DMA-mapping framework documentation [3] states that dma_map_sg()
returns the number of entries created in the DMA address space.
However, the subsequent calls to dma_sync_sg_for_{device,cpu} and
dma_unmap_sg must be made with the original number of entries passed to
dma_map_sg(). The common pattern in DRM drivers was to assign the
dma_map_sg() return value to sg_table->nents and use that value for
the subsequent calls to the dma_sync_sg_* or dma_unmap_sg functions. The
code also iterated nents times to access the pages stored in the
processed scatterlist, while it should use orig_nents as the number of
page entries.

I've tried to identify all such incorrect usage of sg_table->nents, and
this is the result of my research. It looks like the incorrect pattern
has been copied across many drivers, mainly in the DRM subsystem.
Unfortunately, in most cases it even worked correctly, as long as the
system used a simple, linear DMA-mapping implementation for which
swapping nents and orig_nents makes no difference.

The biggest TODO is the DRM/i915 driver, and I don't feel brave enough to
fix it fully. The driver creatively uses sg_table->orig_nents to store the
size of the allocated scatterlist and ignores the number of entries
returned by the dma_map_sg function. In this patchset I only fixed the
sg_table objects exported by the dmabuf-related functions. I hope that I
didn't break anything there.

As a follow-up to this patchset I will prepare common
dma_{map,sync,unmap}_sgtable() wrappers, which will use the proper
sg_table entries and call the respective DMA-mapping functions. I hope
this will help to avoid issues like this in the future.
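
A rough sketch of what such a wrapper might look like, hiding the choice
of count from drivers (the names and signatures here are hypothetical;
the final helpers may differ):

#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

/* hypothetical wrapper: callers never pick a count themselves */
static int example_dma_map_sgtable(struct device *dev, struct sg_table *sgt,
				   enum dma_data_direction dir)
{
	int nents = dma_map_sg(dev, sgt->sgl, sgt->orig_nents, dir);

	if (!nents)
		return -ENOMEM;
	sgt->nents = nents;	/* DMA entry count recorded in one place */
	return 0;
}

static void example_dma_sync_sgtable_for_device(struct device *dev,
						struct sg_table *sgt,
						enum dma_data_direction dir)
{
	dma_sync_sg_for_device(dev, sgt->sgl, sgt->orig_nents, dir);
}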

Patches are based on top of Linux next-20200504.

Best regards,
Marek Szyprowski


References:

[1] https://lkml.org/lkml/2020/3/27/555 
[2] https://lkml.org/lkml/2020/3/29/65
[3] Documentation/DMA-API-HOWTO.txt


Changelog:

v2:
- dropped most of the changes to drm/i915
- added fixes for rcar-du, xen, media and ion
- fixed a few issues pointed by kbuild test robot
- added wide cc: list for each patch

v1: 
https://lore.kernel.org/linux-iommu/c01c9766-9778-fd1f-f36e-2dc7bd376...@arm.com/T/#m879a727e4e46b5479ef8603994b1a006fb2aa337
- initial version


Patch summary:

Marek Szyprowski (21):
  drm: core: fix sg_table nents vs. orig_nents misuse
  drm: amdgpu: fix sg_table nents vs. orig_nents misuse
  drm: armada: fix sg_table nents vs. orig_nents misuse
  drm: etnaviv: fix sg_table nents vs. orig_nents misuse
  drm: exynos: fix sg_table nents vs. orig_nents misuse
  drm: i915: fix sg_table nents vs. orig_nents misuse for dmabuf objects
  drm: lima: fix sg_table nents vs. orig_nents misuse
  drm: msm: fix sg_table nents vs. orig_nents misuse
  drm: panfrost: fix sg_table nents vs. orig_nents misuse
  drm: radeon: fix sg_table nents vs. orig_nents misuse
  drm: rockchip: fix sg_table nents vs. orig_nents misuse
  drm: tegra: fix sg_table nents vs. orig_nents misuse
  drm: virtio: fix sg_table nents vs. orig_nents misuse
  drm: vmwgfx: fix sg_table nents vs. orig_nents misuse
  drm: xen: fix sg_table nents vs. orig_nents misuse
  drm: host1x: fix sg_table nents vs. orig_nents misuse
  drm: rcar-du: fix sg_table nents vs. orig_nents misuse
  xen: gntdev: fix sg_table nents vs. orig_nents misuse
  dmabuf: fix sg_table nents vs. orig_nents misuse
  media: pci: fix common ALSA DMA-mapping related code
  staging: ion: fix sg_table nents vs. orig_nents misuse

 drivers/dma-buf/heaps/heap-helpers.c |  7 ---
 drivers/dma-buf/udmabuf.c|  5 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c  |  7 ---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c  |  8 
 drivers/gpu/drm/armada/armada_gem.c  | 14 -
 drivers/gpu/drm/drm_cache.c  |  2 +-
 drivers/gpu/drm/drm_gem_shmem_helper.c   |  7 ---
 drivers/gpu/drm/drm_prime.c  |  9 +
 drivers/gpu/drm/etnaviv/etnaviv_gem.c| 10 ++
 drivers/gpu/drm/exynos/exynos_drm