date:20170608

On Wed, Jun 07, 2017 at 02:17:32PM -0500, Tom Lendacky wrote:
> Add warnings to let the user know when bounce buffers are being used for
> DMA when SME is active.  Since the bounce buffers are not in encrypted
> memory, these notifications are to allow the user to determine some
> appropriate action - if necessary.

And what would the action be?  Do we need a boot or other option to
disallow this fallback for people who care deeply?
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFCv2 2/2] iommu/arm-smmu-v3:Enable ACPI based HiSilicon erratum 161010801

On Tue, Jun 06, 2017 at 03:01:36PM +, Shameerali Kolothum Thodi wrote:

[...]

> > > + irq_dom = pci_msi_get_device_domain(to_pci_dev(dev));
> > > + if (irq_dom) {
> > > + int ret;
> > > + u32 rid;
> > > +
> > > + rid = pci_msi_domain_get_msi_rid(irq_dom,
> > to_pci_dev(dev));
> > > + ret = iort_dev_find_its_base(dev, rid, 0, &base);
> > 
> > Well, here we use ITS id 0 which is fine as long as code in IORT uses the 
> > same
> > policy for getting the irq_domain (ie we want to reserve the ITS address
> > space that is actually used by the device to send IRQs not a a different 
> > one) it
> > is just a heads-up because I find this confusing.
> 
> Ok. Just to make it clear, 0 is the index into the ITS identifier
> list.  I noted that iort_get_device_domain() uses index 0 while
> retrieving the ITS identifier.  May be use the same approach here as
> well? ie, remove the index from function call?
> 
> I am not sure, how we can get the index info  though theoretically It
> is possible for the ITS group node having multiple ITSs.

Actually I think it would make sense to reserve ALL ITS regions a device
may be mapped to instead of just index 0 (ie in your case it is
equivalent); this leaves us some leeway as to choose which ITS the
device will be actually mapped to and this code does not have to care.

Lorenzo

>  
> > > + if (!ret) {
> > > + dev_info(dev, "SMMUv3:HW MSI resv addr
> > 0x%pa\n", &base);
> > > + region = iommu_alloc_resv_region(base, SZ_128K,
> > > +  prot,
> > IOMMU_RESV_MSI);
> > > + return region;
> > > + }
> > > + }
> > > +
> > > + return NULL;
> > > +}
> > > +#else
> > > +static struct iommu_resv_region *arm_smmu_acpi_alloc_hw_msi(struct
> > > +device *dev) {
> > > + return NULL;
> > > +}
> > > +#endif
> > > +
> > >  static int arm_smmu_add_device(struct device *dev)  {
> > >   int i, ret;
> > > @@ -1903,11 +1936,20 @@ static int arm_smmu_of_xlate(struct device
> > > *dev, struct of_phandle_args *args)  static void
> > arm_smmu_get_resv_regions(struct device *dev,
> > > struct list_head *head)
> > >  {
> > > - struct iommu_resv_region *region;
> > > + struct iommu_fwspec *fwspec = dev->iommu_fwspec;
> > > + struct iommu_resv_region *region = NULL;
> > >   int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> > > + struct arm_smmu_device *smmu;
> > > +
> > > + smmu = arm_smmu_get_by_fwnode(fwspec->iommu_fwnode);
> > >
> > > - region = iommu_alloc_resv_region(MSI_IOVA_BASE,
> > MSI_IOVA_LENGTH,
> > > -  prot, IOMMU_RESV_SW_MSI);
> > > + if (smmu && (smmu->options & ARM_SMMU_OPT_RESV_HW_MSI)
> > &&
> > > +   dev_is_pci(dev))
> > > + region = arm_smmu_acpi_alloc_hw_msi(dev);
> > 
> > Is it safe to carry on if arm_smmu_acpi_alloc_hw_msi() returns NULL here ?
> 
> It is just that PCIe devices won't be functional on this platforms as the 
> endpoint will 
> be configured with ITS IOVA address. May be I should add some dev_warn() here.
> 
> Thanks,
> Shameer
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v7 1/3] ACPI/IORT: Fixup SMMUv3 resource size for Cavium ThunderX2 SMMUv3 model

On Tue, May 30, 2017 at 05:33:39PM +0530, Geetha sowjanya wrote:
> From: Linu Cherian 
> 
> Cavium ThunderX2 implementation doesn't support second page in SMMU
> register space. Hence, resource size is set as 64k for this model.
> 
> Signed-off-by: Linu Cherian 
> Signed-off-by: Geetha Sowjanya 
> ---
>  drivers/acpi/arm64/iort.c |   10 +-
>  1 files changed, 9 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> index c5fecf9..bba2b59 100644
> --- a/drivers/acpi/arm64/iort.c
> +++ b/drivers/acpi/arm64/iort.c
> @@ -833,12 +833,20 @@ static void __init arm_smmu_v3_init_resources(struct 
> resource *res,
>  {
>   struct acpi_iort_smmu_v3 *smmu;
>   int num_res = 0;
> + unsigned long size = SZ_128K;
>  
>   /* Retrieve SMMUv3 specific data */
>   smmu = (struct acpi_iort_smmu_v3 *)node->node_data;
>  
> + /*
> +  * Override the size, for Cavium ThunderX2 implementation
> +  * which doesn't support the page 1 SMMU register space.
> +  */
> + if (smmu->model == ACPI_IORT_SMMU_CAVIUM_CN99XX)
> + size = SZ_64K;

Nit: add a function, say arm_smmu_v3_resource_size() with the logic
above that by default returns SZ_128K, I do not like this switch
in the middle of this function.

Lorenzo

> +
>   res[num_res].start = smmu->base_address;
> - res[num_res].end = smmu->base_address + SZ_128K - 1;
> + res[num_res].end = smmu->base_address + size - 1;
>   res[num_res].flags = IORESOURCE_MEM;
>  
>   num_res++;
> -- 
> 1.7.1
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 0/2] acpi/iort, numa: Add numa node mapping for smmuv3 devices

2017-06-08 Thread John Garry


On 08/06/2017 05:44, Ganapatrao Kulkarni wrote:

ARM IORT specification(rev. C) has added  provision to define proximity
domain in SMMUv3 IORT table. Adding required code to parse Proximity
domain and set numa_node of smmv3 platform devices.

v3:
  - Addressed Lorenzo Pieralisi comment.

v2:
  - Changed as per Lorenzo Pieralisi and Hanjun Guo suggestions.

v1:
  - Initial patch

Ganapatrao Kulkarni (2):
  acpica: iort: Update SMMUv3 header for proximity domain mapping
  acpi/iort: numa: Add numa node mapping for smmuv3 devices

 drivers/acpi/arm64/iort.c | 28 ++--
 include/acpi/actbl2.h |  4 
 2 files changed, 30 insertions(+), 2 deletions(-)



We'll try and test this in the next day or so.

John

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

RE: [RFCv2 2/2] iommu/arm-smmu-v3:Enable ACPI based HiSilicon erratum 161010801

2017-06-08 Thread Shameerali Kolothum Thodi



> -Original Message-
> From: Lorenzo Pieralisi [mailto:lorenzo.pieral...@arm.com]
> Sent: Thursday, June 08, 2017 9:49 AM
> To: Shameerali Kolothum Thodi
> Cc: marc.zyng...@arm.com; sudeep.ho...@arm.com; will.dea...@arm.com;
> robin.mur...@arm.com; hanjun@linaro.org; Gabriele Paoloni; John
> Garry; iommu@lists.linux-foundation.org; linux-arm-
> ker...@lists.infradead.org; linux-a...@vger.kernel.org; de...@acpica.org;
> Linuxarm; Wangzhou (B); Guohanjun (Hanjun Guo)
> Subject: Re: [RFCv2 2/2] iommu/arm-smmu-v3:Enable ACPI based HiSilicon
> erratum 161010801
> 
> On Tue, Jun 06, 2017 at 03:01:36PM +, Shameerali Kolothum Thodi
> wrote:
> 
> [...]
> 
> > > > +   irq_dom = pci_msi_get_device_domain(to_pci_dev(dev));
> > > > +   if (irq_dom) {
> > > > +   int ret;
> > > > +   u32 rid;
> > > > +
> > > > +   rid = pci_msi_domain_get_msi_rid(irq_dom,
> > > to_pci_dev(dev));
> > > > +   ret = iort_dev_find_its_base(dev, rid, 0, &base);
> > >
> > > Well, here we use ITS id 0 which is fine as long as code in IORT
> > > uses the same policy for getting the irq_domain (ie we want to
> > > reserve the ITS address space that is actually used by the device to
> > > send IRQs not a a different one) it is just a heads-up because I find this
> confusing.
> >
> > Ok. Just to make it clear, 0 is the index into the ITS identifier
> > list.  I noted that iort_get_device_domain() uses index 0 while
> > retrieving the ITS identifier.  May be use the same approach here as
> > well? ie, remove the index from function call?
> >
> > I am not sure, how we can get the index info  though theoretically It
> > is possible for the ITS group node having multiple ITSs.
> 
> Actually I think it would make sense to reserve ALL ITS regions a device may
> be mapped to instead of just index 0 (ie in your case it is equivalent); this
> leaves us some leeway as to choose which ITS the device will be actually
> mapped to and this code does not have to care.

Ok. That make sense. Just a quick one, is it ok to add another helper function 
in
iort code to retrieve the its->its_count then? 

Thanks,
Shameer
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

RE: [RFCv2 2/2] iommu/arm-smmu-v3:Enable ACPI based HiSilicon erratum 161010801

2017-06-08 Thread Shameerali Kolothum Thodi



> -Original Message-
> From: Lorenzo Pieralisi [mailto:lorenzo.pieral...@arm.com]
> Sent: Wednesday, June 07, 2017 6:16 PM
> To: Shameerali Kolothum Thodi
> Cc: marc.zyng...@arm.com; sudeep.ho...@arm.com; will.dea...@arm.com;
> robin.mur...@arm.com; hanjun@linaro.org; Gabriele Paoloni; John
> Garry; iommu@lists.linux-foundation.org; linux-arm-
> ker...@lists.infradead.org; linux-a...@vger.kernel.org; de...@acpica.org;
> Linuxarm; Wangzhou (B); Guohanjun (Hanjun Guo)
> Subject: Re: [RFCv2 2/2] iommu/arm-smmu-v3:Enable ACPI based HiSilicon
> erratum 161010801
> 
> On Tue, Jun 06, 2017 at 03:01:36PM +, Shameerali Kolothum Thodi
> wrote:
> > Hi Lorenzo,
> >
> > > -Original Message-
> > > From: Lorenzo Pieralisi [mailto:lorenzo.pieral...@arm.com]
> > > Sent: Tuesday, June 06, 2017 2:56 PM
> > > To: Shameerali Kolothum Thodi
> > > Cc: marc.zyng...@arm.com; sudeep.ho...@arm.com;
> will.dea...@arm.com;
> > > robin.mur...@arm.com; hanjun@linaro.org; Gabriele Paoloni; John
> > > Garry; iommu@lists.linux-foundation.org; linux-arm-
> > > ker...@lists.infradead.org; linux-a...@vger.kernel.org;
> de...@acpica.org;
> > > Linuxarm; Wangzhou (B); Guohanjun (Hanjun Guo)
> > > Subject: Re: [RFCv2 2/2] iommu/arm-smmu-v3:Enable ACPI based
> HiSilicon
> > > erratum 161010801
> > >
> > > On Wed, May 31, 2017 at 03:32:13PM +0100, shameer wrote:
> > > > The HiSilicon erratum 161010801 describes the limitation of HiSilicon
> > > > platforms Hip06/Hip07 to support the SMMU mappings for MSI
> > > transactions.
> > > >
> > > > On these platforms GICv3 ITS translator is presented with the deviceID
> > > > by extending the MSI payload data to 64 bits to include the deviceID.
> > > > Hence, the PCIe controller on this platforms has to differentiate the
> > > > MSI payload against other DMA payload and has to modify the MSI
> > > payload.
> > > > This basically makes it difficult for this platforms to have a SMMU
> > > > translation for MSI.
> > > >
> > > > This patch implements a ACPI table based quirk to reserve the hw msi
> > > > regions in the smmu-v3 driver which means these address regions will
> > > > not be translated and will be excluded from iova allocations.
> > > >
> > > > The HW ITS address region associated with the dev is retrieved using a
> > > > new helper function added in the IORT code.
> > >
> > > Remove or rephrase last paragraph, it reads as if you are adding an IORT
> > > helper function in this patch but you actually aren't.
> >
> > Thanks for going through this patch series. I will remove this in next
> version.
> >
> > > > Signed-off-by: shameer 
> > > > ---
> > > >  drivers/iommu/arm-smmu-v3.c | 49
> > > > ++---
> > > >  1 file changed, 46 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-
> smmu-
> > > v3.c
> > > > index abe4b88..3767526 100644
> > > > --- a/drivers/iommu/arm-smmu-v3.c
> > > > +++ b/drivers/iommu/arm-smmu-v3.c
> > > > @@ -597,6 +597,7 @@ struct arm_smmu_device {
> > > > u32 features;
> > > >
> > > >  #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0)
> > > > +#define ARM_SMMU_OPT_RESV_HW_MSI   (1 << 1)
> > > > u32 options;
> > > >
> > > > struct arm_smmu_cmdqcmdq;
> > > > @@ -1755,6 +1756,38 @@ static bool arm_smmu_sid_in_range(struct
> > > > arm_smmu_device *smmu, u32 sid)
> > > >
> > > >  static struct iommu_ops arm_smmu_ops;
> > > >
> > > > +#ifdef CONFIG_ACPI
> > > > +static struct iommu_resv_region
> *arm_smmu_acpi_alloc_hw_msi(struct
> > > > +device *dev) {
> > > > +   struct iommu_resv_region *region;
> > > > +   struct  irq_domain *irq_dom;
> > > > +   int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> > > > +   u64 base;
> > >
> > > phys_addr_t
> >
> > Ok.
> >
> > > > +   irq_dom = pci_msi_get_device_domain(to_pci_dev(dev));
> > > > +   if (irq_dom) {
> > > > +   int ret;
> > > > +   u32 rid;
> > > > +
> > > > +   rid = pci_msi_domain_get_msi_rid(irq_dom,
> > > to_pci_dev(dev));
> > > > +   ret = iort_dev_find_its_base(dev, rid, 0, &base);
> > >
> > > Well, here we use ITS id 0 which is fine as long as code in IORT uses the
> same
> > > policy for getting the irq_domain (ie we want to reserve the ITS address
> > > space that is actually used by the device to send IRQs not a a different
> one) it
> > > is just a heads-up because I find this confusing.
> >
> > Ok. Just to make it clear, 0 is the index into the ITS identifier list.
> > I noted that iort_get_device_domain() uses index 0 while retrieving the ITS
> identifier.
> > May be use the same approach here as well? ie, remove the index from
> function call?
> >
> > I am not sure, how we can get the index info  though theoretically It is
> possible for
> > the ITS group node having multiple ITSs.
> 
> Yes, it would be ideal to avoid

Re: [RFCv2 2/2] iommu/arm-smmu-v3:Enable ACPI based HiSilicon erratum 161010801

On Thu, Jun 08, 2017 at 09:09:28AM +, Shameerali Kolothum Thodi wrote:
> 
> 
> > -Original Message-
> > From: Lorenzo Pieralisi [mailto:lorenzo.pieral...@arm.com]
> > Sent: Thursday, June 08, 2017 9:49 AM
> > To: Shameerali Kolothum Thodi
> > Cc: marc.zyng...@arm.com; sudeep.ho...@arm.com; will.dea...@arm.com;
> > robin.mur...@arm.com; hanjun@linaro.org; Gabriele Paoloni; John
> > Garry; iommu@lists.linux-foundation.org; linux-arm-
> > ker...@lists.infradead.org; linux-a...@vger.kernel.org; de...@acpica.org;
> > Linuxarm; Wangzhou (B); Guohanjun (Hanjun Guo)
> > Subject: Re: [RFCv2 2/2] iommu/arm-smmu-v3:Enable ACPI based HiSilicon
> > erratum 161010801
> > 
> > On Tue, Jun 06, 2017 at 03:01:36PM +, Shameerali Kolothum Thodi
> > wrote:
> > 
> > [...]
> > 
> > > > > + irq_dom = pci_msi_get_device_domain(to_pci_dev(dev));
> > > > > + if (irq_dom) {
> > > > > + int ret;
> > > > > + u32 rid;
> > > > > +
> > > > > + rid = pci_msi_domain_get_msi_rid(irq_dom,
> > > > to_pci_dev(dev));
> > > > > + ret = iort_dev_find_its_base(dev, rid, 0, &base);
> > > >
> > > > Well, here we use ITS id 0 which is fine as long as code in IORT
> > > > uses the same policy for getting the irq_domain (ie we want to
> > > > reserve the ITS address space that is actually used by the device to
> > > > send IRQs not a a different one) it is just a heads-up because I find 
> > > > this
> > confusing.
> > >
> > > Ok. Just to make it clear, 0 is the index into the ITS identifier
> > > list.  I noted that iort_get_device_domain() uses index 0 while
> > > retrieving the ITS identifier.  May be use the same approach here as
> > > well? ie, remove the index from function call?
> > >
> > > I am not sure, how we can get the index info  though theoretically It
> > > is possible for the ITS group node having multiple ITSs.
> > 
> > Actually I think it would make sense to reserve ALL ITS regions a device may
> > be mapped to instead of just index 0 (ie in your case it is equivalent); 
> > this
> > leaves us some leeway as to choose which ITS the device will be actually
> > mapped to and this code does not have to care.
> 
> Ok. That make sense. Just a quick one, is it ok to add another helper 
> function in
> iort code to retrieve the its->its_count then? 

While at it, given that the pci API code to retrieve domain and rid falls
back to IORT anyway, I would add the whole reservation to IORT (mind,
it depends on IOMMU_API) as one function instead of fiddling about with
indexes.

Side note: why Hilisicon dts upstream (eg hip07.dtsi) report ITS size
as 256K ? I was just checking whether the ITS reg map size is system
dependent and I bumped into them, I suspect there may be some dts
patching needed here.

Lorenzo
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

RE: [RFCv2 2/2] iommu/arm-smmu-v3:Enable ACPI based HiSilicon erratum 161010801

2017-06-08 Thread Shameerali Kolothum Thodi



> -Original Message-
> From: Lorenzo Pieralisi [mailto:lorenzo.pieral...@arm.com]
> Sent: Thursday, June 08, 2017 11:15 AM
> To: Shameerali Kolothum Thodi
> Cc: marc.zyng...@arm.com; sudeep.ho...@arm.com; will.dea...@arm.com;
> robin.mur...@arm.com; hanjun@linaro.org; Gabriele Paoloni; John
> Garry; iommu@lists.linux-foundation.org; linux-arm-
> ker...@lists.infradead.org; linux-a...@vger.kernel.org; de...@acpica.org;
> Linuxarm; Wangzhou (B); Guohanjun (Hanjun Guo)
> Subject: Re: [RFCv2 2/2] iommu/arm-smmu-v3:Enable ACPI based HiSilicon
> erratum 161010801
> 
> On Thu, Jun 08, 2017 at 09:09:28AM +, Shameerali Kolothum Thodi
> wrote:
> >
> >
> > > -Original Message-
> > > From: Lorenzo Pieralisi [mailto:lorenzo.pieral...@arm.com]
> > > Sent: Thursday, June 08, 2017 9:49 AM
> > > To: Shameerali Kolothum Thodi
> > > Cc: marc.zyng...@arm.com; sudeep.ho...@arm.com;
> will.dea...@arm.com;
> > > robin.mur...@arm.com; hanjun@linaro.org; Gabriele Paoloni; John
> > > Garry; iommu@lists.linux-foundation.org; linux-arm-
> > > ker...@lists.infradead.org; linux-a...@vger.kernel.org;
> de...@acpica.org;
> > > Linuxarm; Wangzhou (B); Guohanjun (Hanjun Guo)
> > > Subject: Re: [RFCv2 2/2] iommu/arm-smmu-v3:Enable ACPI based
> HiSilicon
> > > erratum 161010801
> > >
> > > On Tue, Jun 06, 2017 at 03:01:36PM +, Shameerali Kolothum Thodi
> > > wrote:
> > >
> > > [...]
> > >
> > > > > > +   irq_dom = pci_msi_get_device_domain(to_pci_dev(dev));
> > > > > > +   if (irq_dom) {
> > > > > > +   int ret;
> > > > > > +   u32 rid;
> > > > > > +
> > > > > > +   rid = pci_msi_domain_get_msi_rid(irq_dom,
> > > > > to_pci_dev(dev));
> > > > > > +   ret = iort_dev_find_its_base(dev, rid, 0, &base);
> > > > >
> > > > > Well, here we use ITS id 0 which is fine as long as code in IORT
> > > > > uses the same policy for getting the irq_domain (ie we want to
> > > > > reserve the ITS address space that is actually used by the device to
> > > > > send IRQs not a a different one) it is just a heads-up because I find
> this
> > > confusing.
> > > >
> > > > Ok. Just to make it clear, 0 is the index into the ITS identifier
> > > > list.  I noted that iort_get_device_domain() uses index 0 while
> > > > retrieving the ITS identifier.  May be use the same approach here as
> > > > well? ie, remove the index from function call?
> > > >
> > > > I am not sure, how we can get the index info  though theoretically It
> > > > is possible for the ITS group node having multiple ITSs.
> > >
> > > Actually I think it would make sense to reserve ALL ITS regions a device
> may
> > > be mapped to instead of just index 0 (ie in your case it is equivalent); 
> > > this
> > > leaves us some leeway as to choose which ITS the device will be actually
> > > mapped to and this code does not have to care.
> >
> > Ok. That make sense. Just a quick one, is it ok to add another helper
> function in
> > iort code to retrieve the its->its_count then?
> 
> While at it, given that the pci API code to retrieve domain and rid falls
> back to IORT anyway, I would add the whole reservation to IORT (mind,
> it depends on IOMMU_API) as one function instead of fiddling about with
> indexes.

Ok. I will take a look at this. 

> Side note: why Hilisicon dts upstream (eg hip07.dtsi) report ITS size
> as 256K ? I was just checking whether the ITS reg map size is system
> dependent and I bumped into them, I suspect there may be some dts
> patching needed here.
> 

Thanks for catching this. I will check with the team and update.

Shameer

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 3/8] iommu/io-pgtable-arm-v7s: Refactor split_blk_unmap

Whilst the short-descriptor format's split_blk_unmap implementation has
no need to be recursive, it followed the pattern of the LPAE version
anyway for the sake of consistency. With the latter now reworked for
both efficiency and future scalability improvements, tweak the former
similarly, not least to make it less obtuse.

Signed-off-by: Robin Murphy 
---
 drivers/iommu/io-pgtable-arm-v7s.c | 85 --
 1 file changed, 45 insertions(+), 40 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index a490db032c51..3ee8e61eeb18 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -281,6 +281,13 @@ static arm_v7s_iopte arm_v7s_prot_to_pte(int prot, int lvl,
else if (prot & IOMMU_CACHE)
pte |= ARM_V7S_ATTR_B | ARM_V7S_ATTR_C;
 
+   pte |= ARM_V7S_PTE_TYPE_PAGE;
+   if (lvl == 1 && (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS))
+   pte |= ARM_V7S_ATTR_NS_SECTION;
+
+   if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB)
+   pte |= ARM_V7S_ATTR_MTK_4GB;
+
return pte;
 }
 
@@ -353,7 +360,7 @@ static int arm_v7s_init_pte(struct arm_v7s_io_pgtable *data,
int lvl, int num_entries, arm_v7s_iopte *ptep)
 {
struct io_pgtable_cfg *cfg = &data->iop.cfg;
-   arm_v7s_iopte pte = arm_v7s_prot_to_pte(prot, lvl, cfg);
+   arm_v7s_iopte pte;
int i;
 
for (i = 0; i < num_entries; i++)
@@ -375,13 +382,7 @@ static int arm_v7s_init_pte(struct arm_v7s_io_pgtable 
*data,
return -EEXIST;
}
 
-   pte |= ARM_V7S_PTE_TYPE_PAGE;
-   if (lvl == 1 && (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS))
-   pte |= ARM_V7S_ATTR_NS_SECTION;
-
-   if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB)
-   pte |= ARM_V7S_ATTR_MTK_4GB;
-
+   pte = arm_v7s_prot_to_pte(prot, lvl, cfg);
if (num_entries > 1)
pte = arm_v7s_pte_to_cont(pte, lvl);
 
@@ -391,6 +392,20 @@ static int arm_v7s_init_pte(struct arm_v7s_io_pgtable 
*data,
return 0;
 }
 
+static arm_v7s_iopte arm_v7s_install_table(arm_v7s_iopte *table,
+  arm_v7s_iopte *ptep,
+  struct io_pgtable_cfg *cfg)
+{
+   arm_v7s_iopte new;
+
+   new = virt_to_phys(table) | ARM_V7S_PTE_TYPE_TABLE;
+   if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS)
+   new |= ARM_V7S_ATTR_NS_TABLE;
+
+   __arm_v7s_set_pte(ptep, new, 1, cfg);
+   return new;
+}
+
 static int __arm_v7s_map(struct arm_v7s_io_pgtable *data, unsigned long iova,
 phys_addr_t paddr, size_t size, int prot,
 int lvl, arm_v7s_iopte *ptep)
@@ -418,11 +433,7 @@ static int __arm_v7s_map(struct arm_v7s_io_pgtable *data, 
unsigned long iova,
if (!cptep)
return -ENOMEM;
 
-   pte = virt_to_phys(cptep) | ARM_V7S_PTE_TYPE_TABLE;
-   if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS)
-   pte |= ARM_V7S_ATTR_NS_TABLE;
-
-   __arm_v7s_set_pte(ptep, pte, 1, cfg);
+   arm_v7s_install_table(cptep, ptep, cfg);
} else if (ARM_V7S_PTE_IS_TABLE(pte, lvl)) {
cptep = iopte_deref(pte, lvl);
} else {
@@ -503,41 +514,35 @@ static void arm_v7s_split_cont(struct arm_v7s_io_pgtable 
*data,
 
 static int arm_v7s_split_blk_unmap(struct arm_v7s_io_pgtable *data,
   unsigned long iova, size_t size,
-  arm_v7s_iopte *ptep)
+  arm_v7s_iopte blk_pte, arm_v7s_iopte *ptep)
 {
-   unsigned long blk_start, blk_end, blk_size;
-   phys_addr_t blk_paddr;
-   arm_v7s_iopte table = 0;
-   int prot = arm_v7s_pte_to_prot(*ptep, 1);
+   struct io_pgtable_cfg *cfg = &data->iop.cfg;
+   arm_v7s_iopte pte, *tablep;
+   int i, unmap_idx, num_entries, num_ptes;
 
-   blk_size = ARM_V7S_BLOCK_SIZE(1);
-   blk_start = iova & ARM_V7S_LVL_MASK(1);
-   blk_end = blk_start + ARM_V7S_BLOCK_SIZE(1);
-   blk_paddr = *ptep & ARM_V7S_LVL_MASK(1);
+   tablep = __arm_v7s_alloc_table(2, GFP_ATOMIC, data);
+   if (!tablep)
+   return 0; /* Bytes unmapped */
 
-   for (; blk_start < blk_end; blk_start += size, blk_paddr += size) {
-   arm_v7s_iopte *tablep;
+   num_ptes = ARM_V7S_PTES_PER_LVL(2);
+   num_entries = size >> ARM_V7S_LVL_SHIFT(2);
+   unmap_idx = ARM_V7S_LVL_IDX(iova, 2);
 
+   pte = arm_v7s_prot_to_pte(arm_v7s_pte_to_prot(blk_pte, 1), 2, cfg);
+   if (num_entries > 1)
+   pte = arm_v7s_pte_to_cont(pte, 2);
+
+   for (i = 0; i < num_ptes; pte += size, i += num_entries) {
/* Unmap! */
-   if (blk_start == iova)
+   if (i == unmap_idx)

[PATCH 0/8] io-pgtable lock removal

Hi all,

Here's the cleaned up nominally-final version of the patches everybody's
keen to see. #1 is just a non-critical thing-I-spotted-in-passing fix,
#2-#4 do some preparatory work (and bid farewell to everyone's least
favourite bit of code, hooray!), and #5-#8 do the dirty deed itself.

The branch I've previously shared has been updated too:

  git://linux-arm.org/linux-rm  iommu/pgtable

All feedback welcome, as I'd really like to land this for 4.13.

Robin.


Robin Murphy (8):
  iommu/io-pgtable-arm-v7s: Check table PTEs more precisely
  iommu/io-pgtable-arm: Improve split_blk_unmap
  iommu/io-pgtable-arm-v7s: Refactor split_blk_unmap
  iommu/io-pgtable: Introduce explicit coherency
  iommu/io-pgtable-arm: Support lockless operation
  iommu/io-pgtable-arm-v7s: Support lockless operation
  iommu/arm-smmu: Remove io-pgtable spinlock
  iommu/arm-smmu-v3: Remove io-pgtable spinlock

 drivers/iommu/arm-smmu-v3.c|  36 ++-
 drivers/iommu/arm-smmu.c   |  48 --
 drivers/iommu/io-pgtable-arm-v7s.c | 173 +
 drivers/iommu/io-pgtable-arm.c | 190 -
 drivers/iommu/io-pgtable.h |   6 ++
 5 files changed, 268 insertions(+), 185 deletions(-)

-- 
2.12.2.dirty

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 2/8] iommu/io-pgtable-arm: Improve split_blk_unmap

The current split_blk_unmap implementation suffers from some inscrutable
pointer trickery for creating the tables to replace the block entry, but
more than that it also suffers from hideous inefficiency. For example,
the most pathological case of unmapping a level 3 page from a level 1
block will allocate 513 lower-level tables to remap the entire block at
page granularity, when only 2 are actually needed (the rest can be
covered by level 2 block entries).

Also, we would like to be able to relax the spinlock requirement in
future, for which the roll-back-and-try-again logic for race resolution
would be pretty hideous under the current paradigm.

Both issues can be resolved most neatly by turning things sideways:
instead of repeatedly recursing into __arm_lpae_map() map to build up an
entire new sub-table depth-first, we can directly replace the block
entry with a next-level table of block/page entries, then repeat by
unmapping at the next level if necessary. With a little refactoring of
some helper functions, the code ends up not much bigger than before, but
considerably easier to follow and to adapt in future.

Signed-off-by: Robin Murphy 
---
 drivers/iommu/io-pgtable-arm.c | 116 -
 1 file changed, 68 insertions(+), 48 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 6e5df5e0a3bd..97d039952367 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -264,13 +264,32 @@ static int __arm_lpae_unmap(struct arm_lpae_io_pgtable 
*data,
unsigned long iova, size_t size, int lvl,
arm_lpae_iopte *ptep);
 
+static arm_lpae_iopte __arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
+ arm_lpae_iopte prot, int lvl,
+ phys_addr_t paddr)
+{
+   arm_lpae_iopte pte = prot;
+
+   if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_NS)
+   pte |= ARM_LPAE_PTE_NS;
+
+   if (lvl == ARM_LPAE_MAX_LEVELS - 1)
+   pte |= ARM_LPAE_PTE_TYPE_PAGE;
+   else
+   pte |= ARM_LPAE_PTE_TYPE_BLOCK;
+
+   pte |= ARM_LPAE_PTE_AF | ARM_LPAE_PTE_SH_IS;
+   pte |= pfn_to_iopte(paddr >> data->pg_shift, data);
+
+   return pte;
+}
+
 static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
 unsigned long iova, phys_addr_t paddr,
 arm_lpae_iopte prot, int lvl,
 arm_lpae_iopte *ptep)
 {
-   arm_lpae_iopte pte = prot;
-   struct io_pgtable_cfg *cfg = &data->iop.cfg;
+   arm_lpae_iopte pte;
 
if (iopte_leaf(*ptep, lvl)) {
/* We require an unmap first */
@@ -289,21 +308,25 @@ static int arm_lpae_init_pte(struct arm_lpae_io_pgtable 
*data,
return -EINVAL;
}
 
-   if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS)
-   pte |= ARM_LPAE_PTE_NS;
-
-   if (lvl == ARM_LPAE_MAX_LEVELS - 1)
-   pte |= ARM_LPAE_PTE_TYPE_PAGE;
-   else
-   pte |= ARM_LPAE_PTE_TYPE_BLOCK;
-
-   pte |= ARM_LPAE_PTE_AF | ARM_LPAE_PTE_SH_IS;
-   pte |= pfn_to_iopte(paddr >> data->pg_shift, data);
-
-   __arm_lpae_set_pte(ptep, pte, cfg);
+   pte = __arm_lpae_init_pte(data, prot, lvl, paddr);
+   __arm_lpae_set_pte(ptep, pte, &data->iop.cfg);
return 0;
 }
 
+static arm_lpae_iopte arm_lpae_install_table(arm_lpae_iopte *table,
+arm_lpae_iopte *ptep,
+struct io_pgtable_cfg *cfg)
+{
+   arm_lpae_iopte new;
+
+   new = __pa(table) | ARM_LPAE_PTE_TYPE_TABLE;
+   if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS)
+   new |= ARM_LPAE_PTE_NSTABLE;
+
+   __arm_lpae_set_pte(ptep, new, cfg);
+   return new;
+}
+
 static int __arm_lpae_map(struct arm_lpae_io_pgtable *data, unsigned long iova,
  phys_addr_t paddr, size_t size, arm_lpae_iopte prot,
  int lvl, arm_lpae_iopte *ptep)
@@ -331,10 +354,7 @@ static int __arm_lpae_map(struct arm_lpae_io_pgtable 
*data, unsigned long iova,
if (!cptep)
return -ENOMEM;
 
-   pte = __pa(cptep) | ARM_LPAE_PTE_TYPE_TABLE;
-   if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS)
-   pte |= ARM_LPAE_PTE_NSTABLE;
-   __arm_lpae_set_pte(ptep, pte, cfg);
+   arm_lpae_install_table(cptep, ptep, cfg);
} else if (!iopte_leaf(pte, lvl)) {
cptep = iopte_deref(pte, data);
} else {
@@ -452,40 +472,42 @@ static void arm_lpae_free_pgtable(struct io_pgtable *iop)
 
 static int arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data,
unsigned long iova, size_t size,
-   arm_lpae_iopte prot, int lv

[PATCH 1/8] iommu/io-pgtable-arm-v7s: Check table PTEs more precisely

Whilst we don't support the PXN bit at all, so should never encounter a
level 1 section or supersection PTE with it set, it would still be wise
to check both table type bits to resolve any theoretical ambiguity.

Signed-off-by: Robin Murphy 
---
 drivers/iommu/io-pgtable-arm-v7s.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index 8d6ca28c3e1f..a490db032c51 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -92,7 +92,8 @@
 #define ARM_V7S_PTE_TYPE_CONT_PAGE 0x1
 
 #define ARM_V7S_PTE_IS_VALID(pte)  (((pte) & 0x3) != 0)
-#define ARM_V7S_PTE_IS_TABLE(pte, lvl) (lvl == 1 && ((pte) & 
ARM_V7S_PTE_TYPE_TABLE))
+#define ARM_V7S_PTE_IS_TABLE(pte, lvl) \
+   ((lvl) == 1 && (((pte) & 0x3) == ARM_V7S_PTE_TYPE_TABLE))
 
 /* Page table bits */
 #define ARM_V7S_ATTR_XN(lvl)   BIT(4 * (2 - (lvl)))
-- 
2.12.2.dirty

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 8/8] iommu/arm-smmu-v3: Remove io-pgtable spinlock

As for SMMUv2, take advantage of io-pgtable's newfound tolerance for
concurrency. Unfortunately in this case the command queue lock remains a
point of serialisation for the unmap path, but there may be a little
more we can do to ameliorate that in future.

Signed-off-by: Robin Murphy 
---
 drivers/iommu/arm-smmu-v3.c | 33 ++---
 1 file changed, 6 insertions(+), 27 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index b9c4cf4ccca2..291da5f918d5 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -645,7 +645,6 @@ struct arm_smmu_domain {
struct mutexinit_mutex; /* Protects smmu pointer */
 
struct io_pgtable_ops   *pgtbl_ops;
-   spinlock_t  pgtbl_lock;
 
enum arm_smmu_domain_stage  stage;
union {
@@ -1406,7 +1405,6 @@ static struct iommu_domain 
*arm_smmu_domain_alloc(unsigned type)
}
 
mutex_init(&smmu_domain->init_mutex);
-   spin_lock_init(&smmu_domain->pgtbl_lock);
return &smmu_domain->domain;
 }
 
@@ -1678,44 +1676,29 @@ static int arm_smmu_attach_dev(struct iommu_domain 
*domain, struct device *dev)
 static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
phys_addr_t paddr, size_t size, int prot)
 {
-   int ret;
-   unsigned long flags;
-   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
-   struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
+   struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
 
if (!ops)
return -ENODEV;
 
-   spin_lock_irqsave(&smmu_domain->pgtbl_lock, flags);
-   ret = ops->map(ops, iova, paddr, size, prot);
-   spin_unlock_irqrestore(&smmu_domain->pgtbl_lock, flags);
-   return ret;
+   return ops->map(ops, iova, paddr, size, prot);
 }
 
 static size_t
 arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova, size_t size)
 {
-   size_t ret;
-   unsigned long flags;
-   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
-   struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
+   struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
 
if (!ops)
return 0;
 
-   spin_lock_irqsave(&smmu_domain->pgtbl_lock, flags);
-   ret = ops->unmap(ops, iova, size);
-   spin_unlock_irqrestore(&smmu_domain->pgtbl_lock, flags);
-   return ret;
+   return ops->unmap(ops, iova, size);
 }
 
 static phys_addr_t
 arm_smmu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova)
 {
-   phys_addr_t ret;
-   unsigned long flags;
-   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
-   struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
+   struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
 
if (domain->type == IOMMU_DOMAIN_IDENTITY)
return iova;
@@ -1723,11 +1706,7 @@ arm_smmu_iova_to_phys(struct iommu_domain *domain, 
dma_addr_t iova)
if (!ops)
return 0;
 
-   spin_lock_irqsave(&smmu_domain->pgtbl_lock, flags);
-   ret = ops->iova_to_phys(ops, iova);
-   spin_unlock_irqrestore(&smmu_domain->pgtbl_lock, flags);
-
-   return ret;
+   return ops->iova_to_phys(ops, iova);
 }
 
 static struct platform_driver arm_smmu_driver;
-- 
2.12.2.dirty

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 4/8] iommu/io-pgtable: Introduce explicit coherency

Once we remove the serialising spinlock, a potential race opens up for
non-coherent IOMMUs whereby a caller of .map() can be sure that cache
maintenance has been performed on their new PTE, but will have no
guarantee that such maintenance for table entries above it has actually
completed (e.g. if another CPU took an interrupt immediately after
writing the table entry, but before initiating the DMA sync).

Handling this race safely will add some potentially non-trivial overhead
to installing a table entry, which we would much rather avoid on
coherent systems where it will be unnecessary, and where we are stirivng
to minimise latency by removing the locking in the first place.

To that end, let's introduce an explicit notion of cache-coherency to
io-pgtable, such that we will be able to avoid penalising IOMMUs which
know enough to know when they are coherent.

Signed-off-by: Robin Murphy 
---
 drivers/iommu/arm-smmu-v3.c|  3 +++
 drivers/iommu/arm-smmu.c   |  3 +++
 drivers/iommu/io-pgtable-arm-v7s.c | 17 ++---
 drivers/iommu/io-pgtable-arm.c | 11 ++-
 drivers/iommu/io-pgtable.h |  6 ++
 5 files changed, 28 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 380969aa60d5..b9c4cf4ccca2 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1555,6 +1555,9 @@ static int arm_smmu_domain_finalise(struct iommu_domain 
*domain)
.iommu_dev  = smmu->dev,
};
 
+   if (smmu->features & ARM_SMMU_FEAT_COHERENCY)
+   pgtbl_cfg.quirks = IO_PGTABLE_QUIRK_NO_DMA;
+
pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
if (!pgtbl_ops)
return -ENOMEM;
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 7ec30b08b3bd..1f42f339a284 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1010,6 +1010,9 @@ static int arm_smmu_init_domain_context(struct 
iommu_domain *domain,
.iommu_dev  = smmu->dev,
};
 
+   if (smmu->features & ARM_SMMU_FEAT_COHERENT_WALK)
+   pgtbl_cfg.quirks = IO_PGTABLE_QUIRK_NO_DMA;
+
smmu_domain->smmu = smmu;
pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
if (!pgtbl_ops) {
diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index 3ee8e61eeb18..2437f2899661 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -187,7 +187,8 @@ static arm_v7s_iopte *iopte_deref(arm_v7s_iopte pte, int 
lvl)
 static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
   struct arm_v7s_io_pgtable *data)
 {
-   struct device *dev = data->iop.cfg.iommu_dev;
+   struct io_pgtable_cfg *cfg = &data->iop.cfg;
+   struct device *dev = cfg->iommu_dev;
dma_addr_t dma;
size_t size = ARM_V7S_TABLE_SIZE(lvl);
void *table = NULL;
@@ -196,7 +197,7 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
table = (void *)__get_dma_pages(__GFP_ZERO, get_order(size));
else if (lvl == 2)
table = kmem_cache_zalloc(data->l2_tables, gfp | GFP_DMA);
-   if (table && !selftest_running) {
+   if (table && !(cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA)) {
dma = dma_map_single(dev, table, size, DMA_TO_DEVICE);
if (dma_mapping_error(dev, dma))
goto out_free;
@@ -225,10 +226,11 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
 static void __arm_v7s_free_table(void *table, int lvl,
 struct arm_v7s_io_pgtable *data)
 {
-   struct device *dev = data->iop.cfg.iommu_dev;
+   struct io_pgtable_cfg *cfg = &data->iop.cfg;
+   struct device *dev = cfg->iommu_dev;
size_t size = ARM_V7S_TABLE_SIZE(lvl);
 
-   if (!selftest_running)
+   if (!(cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA))
dma_unmap_single(dev, __arm_v7s_dma_addr(table), size,
 DMA_TO_DEVICE);
if (lvl == 1)
@@ -240,7 +242,7 @@ static void __arm_v7s_free_table(void *table, int lvl,
 static void __arm_v7s_pte_sync(arm_v7s_iopte *ptep, int num_entries,
   struct io_pgtable_cfg *cfg)
 {
-   if (selftest_running)
+   if (!(cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA))
return;
 
dma_sync_single_for_device(cfg->iommu_dev, __arm_v7s_dma_addr(ptep),
@@ -657,7 +659,8 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct 
io_pgtable_cfg *cfg,
if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS |
IO_PGTABLE_QUIRK_NO_PERMS |
IO_PGTABLE_QUIRK_TLBI_ON_MAP |
-   IO_PGTABLE_QUIRK_ARM_MTK_4GB))
+   IO_PGTABLE_QUIRK_ARM_MTK_4GB |
+   IO_PGTABLE_QUIRK_NO_DMA))

[PATCH 7/8] iommu/arm-smmu: Remove io-pgtable spinlock

With the io-pgtable code now robust against (valid) races, we no longer
need to serialise all operations with a lock. This might make broken
callers who issue concurrent operations on overlapping addresses go even
more wrong than before, but hey, they already had little hope of useful
or deterministic results.

We do however still have to keep a lock around to serialise the ATS1*
translation ops, as parallel iova_to_phys() calls could lead to
unpredictable hardware behaviour otherwise.

Signed-off-by: Robin Murphy 
---
 drivers/iommu/arm-smmu.c | 45 ++---
 1 file changed, 14 insertions(+), 31 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 1f42f339a284..b8d069a2b31d 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -425,10 +425,10 @@ enum arm_smmu_domain_stage {
 struct arm_smmu_domain {
struct arm_smmu_device  *smmu;
struct io_pgtable_ops   *pgtbl_ops;
-   spinlock_t  pgtbl_lock;
struct arm_smmu_cfg cfg;
enum arm_smmu_domain_stage  stage;
struct mutexinit_mutex; /* Protects smmu pointer */
+   spinlock_t  cb_lock; /* Serialises ATS1* ops */
struct iommu_domain domain;
 };
 
@@ -1105,7 +1105,7 @@ static struct iommu_domain 
*arm_smmu_domain_alloc(unsigned type)
}
 
mutex_init(&smmu_domain->init_mutex);
-   spin_lock_init(&smmu_domain->pgtbl_lock);
+   spin_lock_init(&smmu_domain->cb_lock);
 
return &smmu_domain->domain;
 }
@@ -1383,35 +1383,23 @@ static int arm_smmu_attach_dev(struct iommu_domain 
*domain, struct device *dev)
 static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
phys_addr_t paddr, size_t size, int prot)
 {
-   int ret;
-   unsigned long flags;
-   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
-   struct io_pgtable_ops *ops= smmu_domain->pgtbl_ops;
+   struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
 
if (!ops)
return -ENODEV;
 
-   spin_lock_irqsave(&smmu_domain->pgtbl_lock, flags);
-   ret = ops->map(ops, iova, paddr, size, prot);
-   spin_unlock_irqrestore(&smmu_domain->pgtbl_lock, flags);
-   return ret;
+   return ops->map(ops, iova, paddr, size, prot);
 }
 
 static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
 size_t size)
 {
-   size_t ret;
-   unsigned long flags;
-   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
-   struct io_pgtable_ops *ops= smmu_domain->pgtbl_ops;
+   struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
 
if (!ops)
return 0;
 
-   spin_lock_irqsave(&smmu_domain->pgtbl_lock, flags);
-   ret = ops->unmap(ops, iova, size);
-   spin_unlock_irqrestore(&smmu_domain->pgtbl_lock, flags);
-   return ret;
+   return ops->unmap(ops, iova, size);
 }
 
 static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain,
@@ -1425,10 +1413,11 @@ static phys_addr_t arm_smmu_iova_to_phys_hard(struct 
iommu_domain *domain,
void __iomem *cb_base;
u32 tmp;
u64 phys;
-   unsigned long va;
+   unsigned long va, flags;
 
cb_base = ARM_SMMU_CB(smmu, cfg->cbndx);
 
+   spin_lock_irqsave(&smmu_domain->cb_lock, flags);
/* ATS1 registers can only be written atomically */
va = iova & ~0xfffUL;
if (smmu->version == ARM_SMMU_V2)
@@ -1438,6 +1427,7 @@ static phys_addr_t arm_smmu_iova_to_phys_hard(struct 
iommu_domain *domain,
 
if (readl_poll_timeout_atomic(cb_base + ARM_SMMU_CB_ATSR, tmp,
  !(tmp & ATSR_ACTIVE), 5, 50)) {
+   spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
dev_err(dev,
"iova to phys timed out on %pad. Falling back to 
software table walk.\n",
&iova);
@@ -1445,6 +1435,7 @@ static phys_addr_t arm_smmu_iova_to_phys_hard(struct 
iommu_domain *domain,
}
 
phys = readq_relaxed(cb_base + ARM_SMMU_CB_PAR);
+   spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
if (phys & CB_PAR_F) {
dev_err(dev, "translation fault!\n");
dev_err(dev, "PAR = 0x%llx\n", phys);
@@ -1457,10 +1448,8 @@ static phys_addr_t arm_smmu_iova_to_phys_hard(struct 
iommu_domain *domain,
 static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
dma_addr_t iova)
 {
-   phys_addr_t ret;
-   unsigned long flags;
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
-   struct io_pgtable_ops *ops= smmu_domain->pgtbl_ops;
+   struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
 
if (domain->type == IOMMU_D

[PATCH 5/8] iommu/io-pgtable-arm: Support lockless operation

For parallel I/O with multiple concurrent threads servicing the same
device (or devices, if several share a domain), serialising page table
updates becomes a massive bottleneck. On reflection, though, we don't
strictly need to do that - for valid IOMMU API usage, there are in fact
only two races that we need to guard against: multiple map requests for
different blocks within the same region, when the intermediate-level
table for that region does not yet exist; and multiple unmaps of
different parts of the same block entry. Both of those are fairly easily
solved by using a cmpxchg to install the new table, such that if we then
find that someone else's table got there first, we can simply free ours
and continue.

Make the requisite changes such that we can withstand being called
without the caller maintaining a lock. In theory, this opens up a few
corners in which wildly misbehaving callers making nonsensical
overlapping requests might lead to crashes instead of just unpredictable
results, but correct code really does not deserve to pay a significant
performance cost for the sake of masking bugs in theoretical broken code.

Signed-off-by: Robin Murphy 
---
 drivers/iommu/io-pgtable-arm.c | 75 --
 1 file changed, 58 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index bdc954cb98c1..d857961af1d3 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -20,6 +20,7 @@
 
 #define pr_fmt(fmt)"arm-lpae io-pgtable: " fmt
 
+#include 
 #include 
 #include 
 #include 
@@ -99,6 +100,8 @@
 #define ARM_LPAE_PTE_ATTR_HI_MASK  (((arm_lpae_iopte)6) << 52)
 #define ARM_LPAE_PTE_ATTR_MASK (ARM_LPAE_PTE_ATTR_LO_MASK |\
 ARM_LPAE_PTE_ATTR_HI_MASK)
+/* Software bit for solving coherency races */
+#define ARM_LPAE_PTE_SW_SYNC   (((arm_lpae_iopte)1) << 55)
 
 /* Stage-1 PTE */
 #define ARM_LPAE_PTE_AP_UNPRIV (((arm_lpae_iopte)1) << 6)
@@ -249,15 +252,20 @@ static void __arm_lpae_free_pages(void *pages, size_t 
size,
free_pages_exact(pages, size);
 }
 
+static void __arm_lpae_sync_pte(arm_lpae_iopte *ptep,
+   struct io_pgtable_cfg *cfg)
+{
+   dma_sync_single_for_device(cfg->iommu_dev, __arm_lpae_dma_addr(ptep),
+  sizeof(*ptep), DMA_TO_DEVICE);
+}
+
 static void __arm_lpae_set_pte(arm_lpae_iopte *ptep, arm_lpae_iopte pte,
   struct io_pgtable_cfg *cfg)
 {
*ptep = pte;
 
if (!(cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA))
-   dma_sync_single_for_device(cfg->iommu_dev,
-  __arm_lpae_dma_addr(ptep),
-  sizeof(pte), DMA_TO_DEVICE);
+   __arm_lpae_sync_pte(ptep, cfg);
 }
 
 static int __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
@@ -289,13 +297,13 @@ static int arm_lpae_init_pte(struct arm_lpae_io_pgtable 
*data,
 arm_lpae_iopte prot, int lvl,
 arm_lpae_iopte *ptep)
 {
-   arm_lpae_iopte pte;
+   arm_lpae_iopte pte = *ptep;
 
-   if (iopte_leaf(*ptep, lvl)) {
+   if (iopte_leaf(pte, lvl)) {
/* We require an unmap first */
WARN_ON(!selftest_running);
return -EEXIST;
-   } else if (iopte_type(*ptep, lvl) == ARM_LPAE_PTE_TYPE_TABLE) {
+   } else if (iopte_type(pte, lvl) == ARM_LPAE_PTE_TYPE_TABLE) {
/*
 * We need to unmap and free the old table before
 * overwriting it with a block entry.
@@ -315,16 +323,30 @@ static int arm_lpae_init_pte(struct arm_lpae_io_pgtable 
*data,
 
 static arm_lpae_iopte arm_lpae_install_table(arm_lpae_iopte *table,
 arm_lpae_iopte *ptep,
+arm_lpae_iopte curr,
 struct io_pgtable_cfg *cfg)
 {
-   arm_lpae_iopte new;
+   arm_lpae_iopte old, new;
 
new = __pa(table) | ARM_LPAE_PTE_TYPE_TABLE;
if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS)
new |= ARM_LPAE_PTE_NSTABLE;
 
-   __arm_lpae_set_pte(ptep, new, cfg);
-   return new;
+   old = cmpxchg64_relaxed(ptep, curr, new);
+
+   if (cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA)
+   return old;
+
+   /* Even if it's not ours, there's no point waiting; just kick it */
+   if (!(old & ARM_LPAE_PTE_SW_SYNC))
+   __arm_lpae_sync_pte(ptep, cfg);
+   if (old == curr) {
+   /* Ensure our sync is finished before we mark it as such */
+   wmb();
+   *ptep |= ARM_LPAE_PTE_SW_SYNC;
+   }
+
+   return old;
 }
 
 static int __arm_lpae_map(struct arm_lpae_io_pgtable *data, unsigned long iova,
@@ -333,6 +355,7 @@ static int __arm_lpae_map(str

[PATCH 6/8] iommu/io-pgtable-arm-v7s: Support lockless operation

Mirroring the LPAE implementation, rework the v7s code to be robust
against concurrent operations. The same two potential races exist, and
are solved in the same manner, with the fixed 2-level structure making
life ever so slightly simpler.

What complicates matters compared to LPAE, however, is large page
entries, since we can't update a block of 16 PTEs atomically, nor assume
available software bits to do clever things with. As most users are
never likely to do partial unmaps anyway (due to DMA API rules), it
doesn't seem unreasonable for this case to remain behind a serialising
lock; we just pull said lock down into the bowels of the implementation
so it's well out of the way of the normal call paths.

Signed-off-by: Robin Murphy 
---
 drivers/iommu/io-pgtable-arm-v7s.c | 78 --
 1 file changed, 58 insertions(+), 20 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index 2437f2899661..35d0f41af276 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -32,6 +32,7 @@
 
 #define pr_fmt(fmt)"arm-v7s io-pgtable: " fmt
 
+#include 
 #include 
 #include 
 #include 
@@ -39,6 +40,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -168,6 +170,7 @@ struct arm_v7s_io_pgtable {
 
arm_v7s_iopte   *pgd;
struct kmem_cache   *l2_tables;
+   spinlock_t  split_lock;
 };
 
 static dma_addr_t __arm_v7s_dma_addr(void *pages)
@@ -396,16 +399,19 @@ static int arm_v7s_init_pte(struct arm_v7s_io_pgtable 
*data,
 
 static arm_v7s_iopte arm_v7s_install_table(arm_v7s_iopte *table,
   arm_v7s_iopte *ptep,
+  arm_v7s_iopte curr,
   struct io_pgtable_cfg *cfg)
 {
-   arm_v7s_iopte new;
+   arm_v7s_iopte old, new;
 
new = virt_to_phys(table) | ARM_V7S_PTE_TYPE_TABLE;
if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS)
new |= ARM_V7S_ATTR_NS_TABLE;
 
-   __arm_v7s_set_pte(ptep, new, 1, cfg);
-   return new;
+   old = cmpxchg_relaxed(ptep, curr, new);
+   __arm_v7s_pte_sync(ptep, 1, cfg);
+
+   return old;
 }
 
 static int __arm_v7s_map(struct arm_v7s_io_pgtable *data, unsigned long iova,
@@ -429,16 +435,23 @@ static int __arm_v7s_map(struct arm_v7s_io_pgtable *data, 
unsigned long iova,
return -EINVAL;
 
/* Grab a pointer to the next level */
-   pte = *ptep;
+   pte = READ_ONCE(*ptep);
if (!pte) {
cptep = __arm_v7s_alloc_table(lvl + 1, GFP_ATOMIC, data);
if (!cptep)
return -ENOMEM;
 
-   arm_v7s_install_table(cptep, ptep, cfg);
-   } else if (ARM_V7S_PTE_IS_TABLE(pte, lvl)) {
-   cptep = iopte_deref(pte, lvl);
+   pte = arm_v7s_install_table(cptep, ptep, 0, cfg);
+   if (pte)
+   __arm_v7s_free_table(cptep, lvl + 1, data);
} else {
+   /* We've no easy way of knowing if it's synced yet, so... */
+   __arm_v7s_pte_sync(ptep, 1, cfg);
+   }
+
+   if (ARM_V7S_PTE_IS_TABLE(pte, lvl)) {
+   cptep = iopte_deref(pte, lvl);
+   } else if (pte) {
/* We require an unmap first */
WARN_ON(!selftest_running);
return -EEXIST;
@@ -491,27 +504,31 @@ static void arm_v7s_free_pgtable(struct io_pgtable *iop)
kfree(data);
 }
 
-static void arm_v7s_split_cont(struct arm_v7s_io_pgtable *data,
-  unsigned long iova, int idx, int lvl,
-  arm_v7s_iopte *ptep)
+static arm_v7s_iopte arm_v7s_split_cont(struct arm_v7s_io_pgtable *data,
+   unsigned long iova, int idx, int lvl,
+   arm_v7s_iopte *ptep)
 {
struct io_pgtable *iop = &data->iop;
arm_v7s_iopte pte;
size_t size = ARM_V7S_BLOCK_SIZE(lvl);
int i;
 
+   /* Check that we didn't lose a race to get the lock */
+   pte = *ptep;
+   if (!arm_v7s_pte_is_cont(pte, lvl))
+   return pte;
+
ptep -= idx & (ARM_V7S_CONT_PAGES - 1);
-   pte = arm_v7s_cont_to_pte(*ptep, lvl);
-   for (i = 0; i < ARM_V7S_CONT_PAGES; i++) {
-   ptep[i] = pte;
-   pte += size;
-   }
+   pte = arm_v7s_cont_to_pte(pte, lvl);
+   for (i = 0; i < ARM_V7S_CONT_PAGES; i++)
+   ptep[i] = pte + i * size;
 
__arm_v7s_pte_sync(ptep, ARM_V7S_CONT_PAGES, &iop->cfg);
 
size *= ARM_V7S_CONT_PAGES;
io_pgtable_tlb_add_flush(iop, iova, size, size, true);
io_pgtable_tlb_sync(iop);
+   return pte;
 }
 
 static int arm_v7s_split_blk_unmap(struct arm_v7s_io_pgtable *data,
@@ -542,7 +559,16 @@ static int arm_v7s_split_blk_unmap(struct 
arm_v7s_io_pg

Re: [PATCH v1 0/3] iommu/amd: AMD IOMMU performance updates 2017-06-05

2017-06-08 Thread Joerg Roedel

On Mon, Jun 05, 2017 at 02:52:03PM -0500, Tom Lendacky wrote:
> This patch series addresses some performance issues in the AMD IOMMU
> driver:
> 
> - Reduce the amount of MMIO performed during command submission
> - When the command queue is (near) full, only wait till there is enough
>   room for the command rather than wait for the whole queue to be empty
> - Limit the flushing of protection domain TLBs to only the protection
>   domains associated with the iova data being freed
> 
> This patch series is based on the master branch of the iommu tree.
> 
> ---
> 
> Tom Lendacky (3):
>   iommu/amd: Reduce amount of MMIO when submitting commands
>   iommu/amd: Reduce delay waiting for command buffer space

Okay, applied these two and my queue-ring series atop.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 03/12] intel-ipu3: Add DMA API implementation

On 07/06/17 10:47, Tomasz Figa wrote:
> Hi Yong,
> 
> +Robin, Joerg, IOMMU ML
> 
> Please see my comments inline.
> 
> On Tue, Jun 6, 2017 at 5:39 AM, Yong Zhi  wrote:
>> IPU3 mmu based DMA mapping driver
>>
>> Signed-off-by: Yong Zhi 
>> ---
>>  drivers/media/pci/intel/ipu3/Kconfig   |   6 +
>>  drivers/media/pci/intel/ipu3/Makefile  |   1 +
>>  drivers/media/pci/intel/ipu3/ipu3-dmamap.c | 408 
>> +
>>  drivers/media/pci/intel/ipu3/ipu3-dmamap.h |  20 ++
>>  4 files changed, 435 insertions(+)
>>  create mode 100644 drivers/media/pci/intel/ipu3/ipu3-dmamap.c
>>  create mode 100644 drivers/media/pci/intel/ipu3/ipu3-dmamap.h
>>
>> diff --git a/drivers/media/pci/intel/ipu3/Kconfig 
>> b/drivers/media/pci/intel/ipu3/Kconfig
>> index ab2edcb..2030be7 100644
>> --- a/drivers/media/pci/intel/ipu3/Kconfig
>> +++ b/drivers/media/pci/intel/ipu3/Kconfig
>> @@ -26,3 +26,9 @@ config INTEL_IPU3_MMU
>>
>>   Say Y here if you have Skylake/Kaby Lake SoC with IPU3.
>>   Say N if un-sure.
>> +
>> +config INTEL_IPU3_DMAMAP
>> +   bool "Intel ipu3 DMA mapping driver"
>> +   select IOMMU_IOVA
>> +   ---help---
>> + This is IPU3 IOMMU domain specific DMA driver.
>> diff --git a/drivers/media/pci/intel/ipu3/Makefile 
>> b/drivers/media/pci/intel/ipu3/Makefile
>> index 2b669df..2c2a035 100644
>> --- a/drivers/media/pci/intel/ipu3/Makefile
>> +++ b/drivers/media/pci/intel/ipu3/Makefile
>> @@ -1,2 +1,3 @@
>>  obj-$(CONFIG_VIDEO_IPU3_CIO2) += ipu3-cio2.o
>>  obj-$(CONFIG_INTEL_IPU3_MMU) += ipu3-mmu.o
>> +obj-$(CONFIG_INTEL_IPU3_DMAMAP) += ipu3-dmamap.o
>> diff --git a/drivers/media/pci/intel/ipu3/ipu3-dmamap.c 
>> b/drivers/media/pci/intel/ipu3/ipu3-dmamap.c
>> new file mode 100644
>> index 000..74704d9
>> --- /dev/null
>> +++ b/drivers/media/pci/intel/ipu3/ipu3-dmamap.c
>> @@ -0,0 +1,408 @@
>> +/*
>> + * Copyright (c) 2017 Intel Corporation.
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU General Public License version
>> + * 2 as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + */
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include "ipu3-mmu.h"
>> +
>> +/* Begin of things adapted from arch/arm/mm/dma-mapping.c */
> 
> ARM's DMA ops are not a good example of today's coding standards.
> There are already generic DMA mapping helpers available in
> drivers/iommu/dma-iommu.c and drivers/base/dma-*. (Hmm, I remember
> writing this already, déjà vu maybe...)

Yes, dma-iommu exists for precisely this purpose -
arch/arm64/mm/dma-mapping.c would have been a better point of reference.

>> +static void ipu3_dmamap_clear_buffer(struct page *page, size_t size,
>> +unsigned long attrs)
>> +{
>> +   /*
>> +* Ensure that the allocated pages are zeroed, and that any data
>> +* lurking in the kernel direct-mapped region is invalidated.
>> +*/
>> +   if (PageHighMem(page)) {
>> +   while (size > 0) {
>> +   void *ptr = kmap_atomic(page);
>> +
>> +   memset(ptr, 0, PAGE_SIZE);
>> +   if ((attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
>> +   clflush_cache_range(ptr, PAGE_SIZE);
>> +   kunmap_atomic(ptr);
>> +   page++;
>> +   size -= PAGE_SIZE;
>> +   }
>> +   } else {
>> +   void *ptr = page_address(page);
>> +
>> +   memset(ptr, 0, size);
>> +   if ((attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
>> +   clflush_cache_range(ptr, size);
>> +   }
>> +}
>> +
>> +/**
>> + * ipu3_dmamap_alloc_buffer - allocate buffer based on attributes
>> + * @dev: struct device pointer
>> + * @size: size of buffer in bytes
>> + * @gfp: specify the free page type
>> + * @attrs: defined in linux/dma-attrs.h
>> + *
>> + * This is a helper function for physical page allocation
>> + *
>> + * Return array representing buffer from alloc_pages() on success
>> + * or NULL on failure
>> + *
>> + * Must be freed with ipu3_dmamap_free_buffer.
>> + */
>> +static struct page **ipu3_dmamap_alloc_buffer(struct device *dev, size_t 
>> size,
>> + gfp_t gfp, unsigned long attrs)
>> +{
>> +   struct page **pages;
>> +   int count = size >> PAGE_SHIFT;
>> +   int array_size = count * sizeof(struct page *);
>> +   int i = 0;
>> +
>> +   /* Allocate mem for array of page ptrs */
>> +   if (array_size <= PAGE_SIZE)
>> +   pages = kzalloc(array_size, GFP_KERNEL);
>> +   else
>> +   pages =

clean up and modularize arch dma_mapping interface

Hi all,

for a while we have a generic implementation of the dma mapping routines
that call into per-arch or per-device operations.  But right now there
still are various bits in the interfaces where don't clearly operate
on these ops.  This series tries to clean up a lot of those (but not all
yet, but the series is big enough).  It gets rid of the DMA_ERROR_CODE
way of signaling failures of the mapping routines from the
implementations to the generic code (and cleans up various drivers that
were incorrectly using it), and gets rid of the ->set_dma_mask routine
in favor of relying on the ->dma_capable method that can be used in
the same way, but which requires less code duplication.

Btw, we don't seem to have a tree every-growing amount of common dma
mapping code, and given that I have a fair amount of all over the tree
work in that area in my plate I'd like to start one.  Any good reason
to that?  Anyone willing to volunteer as co maintainer?

The whole series is also available in git:

git://git.infradead.org/users/hch/misc.git dma-map

Gitweb:

http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/dma-map
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 01/44] firmware/ivc: use dma_mapping_error

DMA_ERROR_CODE is not supposed to be used by drivers.

Signed-off-by: Christoph Hellwig 
---
 drivers/firmware/tegra/ivc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/firmware/tegra/ivc.c b/drivers/firmware/tegra/ivc.c
index 29ecfd815320..a01461d63f68 100644
--- a/drivers/firmware/tegra/ivc.c
+++ b/drivers/firmware/tegra/ivc.c
@@ -646,12 +646,12 @@ int tegra_ivc_init(struct tegra_ivc *ivc, struct device 
*peer, void *rx,
if (peer) {
ivc->rx.phys = dma_map_single(peer, rx, queue_size,
  DMA_BIDIRECTIONAL);
-   if (ivc->rx.phys == DMA_ERROR_CODE)
+   if (dma_mapping_error(peer, ivc->rx.phys))
return -ENOMEM;
 
ivc->tx.phys = dma_map_single(peer, tx, queue_size,
  DMA_BIDIRECTIONAL);
-   if (ivc->tx.phys == DMA_ERROR_CODE) {
+   if (dma_mapping_error(peer, ivc->tx.phys)) {
dma_unmap_single(peer, ivc->rx.phys, queue_size,
 DMA_BIDIRECTIONAL);
return -ENOMEM;
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 02/44] ibmveth: properly unwind on init errors

That way the driver doesn't have to rely on DMA_ERROR_CODE, which
is not a public API and going away.

Signed-off-by: Christoph Hellwig 
---
 drivers/net/ethernet/ibm/ibmveth.c | 159 +
 1 file changed, 74 insertions(+), 85 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmveth.c 
b/drivers/net/ethernet/ibm/ibmveth.c
index 72ab7b6bf20b..3ac27f59e595 100644
--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -467,56 +467,6 @@ static void ibmveth_rxq_harvest_buffer(struct 
ibmveth_adapter *adapter)
}
 }
 
-static void ibmveth_cleanup(struct ibmveth_adapter *adapter)
-{
-   int i;
-   struct device *dev = &adapter->vdev->dev;
-
-   if (adapter->buffer_list_addr != NULL) {
-   if (!dma_mapping_error(dev, adapter->buffer_list_dma)) {
-   dma_unmap_single(dev, adapter->buffer_list_dma, 4096,
-   DMA_BIDIRECTIONAL);
-   adapter->buffer_list_dma = DMA_ERROR_CODE;
-   }
-   free_page((unsigned long)adapter->buffer_list_addr);
-   adapter->buffer_list_addr = NULL;
-   }
-
-   if (adapter->filter_list_addr != NULL) {
-   if (!dma_mapping_error(dev, adapter->filter_list_dma)) {
-   dma_unmap_single(dev, adapter->filter_list_dma, 4096,
-   DMA_BIDIRECTIONAL);
-   adapter->filter_list_dma = DMA_ERROR_CODE;
-   }
-   free_page((unsigned long)adapter->filter_list_addr);
-   adapter->filter_list_addr = NULL;
-   }
-
-   if (adapter->rx_queue.queue_addr != NULL) {
-   dma_free_coherent(dev, adapter->rx_queue.queue_len,
- adapter->rx_queue.queue_addr,
- adapter->rx_queue.queue_dma);
-   adapter->rx_queue.queue_addr = NULL;
-   }
-
-   for (i = 0; i < IBMVETH_NUM_BUFF_POOLS; i++)
-   if (adapter->rx_buff_pool[i].active)
-   ibmveth_free_buffer_pool(adapter,
-&adapter->rx_buff_pool[i]);
-
-   if (adapter->bounce_buffer != NULL) {
-   if (!dma_mapping_error(dev, adapter->bounce_buffer_dma)) {
-   dma_unmap_single(&adapter->vdev->dev,
-   adapter->bounce_buffer_dma,
-   adapter->netdev->mtu + IBMVETH_BUFF_OH,
-   DMA_BIDIRECTIONAL);
-   adapter->bounce_buffer_dma = DMA_ERROR_CODE;
-   }
-   kfree(adapter->bounce_buffer);
-   adapter->bounce_buffer = NULL;
-   }
-}
-
 static int ibmveth_register_logical_lan(struct ibmveth_adapter *adapter,
 union ibmveth_buf_desc rxq_desc, u64 mac_address)
 {
@@ -573,14 +523,17 @@ static int ibmveth_open(struct net_device *netdev)
for(i = 0; i < IBMVETH_NUM_BUFF_POOLS; i++)
rxq_entries += adapter->rx_buff_pool[i].size;
 
+   rc = -ENOMEM;
adapter->buffer_list_addr = (void*) get_zeroed_page(GFP_KERNEL);
-   adapter->filter_list_addr = (void*) get_zeroed_page(GFP_KERNEL);
+   if (!adapter->buffer_list_addr) {
+   netdev_err(netdev, "unable to allocate list pages\n");
+   goto out;
+   }
 
-   if (!adapter->buffer_list_addr || !adapter->filter_list_addr) {
-   netdev_err(netdev, "unable to allocate filter or buffer list "
-  "pages\n");
-   rc = -ENOMEM;
-   goto err_out;
+   adapter->filter_list_addr = (void*) get_zeroed_page(GFP_KERNEL);
+   if (!adapter->filter_list_addr) {
+   netdev_err(netdev, "unable to allocate filter pages\n");
+   goto out_free_buffer_list;
}
 
dev = &adapter->vdev->dev;
@@ -590,22 +543,21 @@ static int ibmveth_open(struct net_device *netdev)
adapter->rx_queue.queue_addr =
dma_alloc_coherent(dev, adapter->rx_queue.queue_len,
   &adapter->rx_queue.queue_dma, GFP_KERNEL);
-   if (!adapter->rx_queue.queue_addr) {
-   rc = -ENOMEM;
-   goto err_out;
-   }
+   if (!adapter->rx_queue.queue_addr)
+   goto out_free_filter_list;
 
adapter->buffer_list_dma = dma_map_single(dev,
adapter->buffer_list_addr, 4096, DMA_BIDIRECTIONAL);
+   if (dma_mapping_error(dev, adapter->buffer_list_dma)) {
+   netdev_err(netdev, "unable to map buffer list pages\n");
+   goto out_free_queue_mem;
+   }
+
adapter->filter_list_dma = dma_map_single(dev,
adapter->filter_list_addr, 4096, DMA_BIDIRECTIONAL);
-
-   if ((dma_mapping_error(dev, adapter->buffer_list_dma)) ||
-   (dma_mapping_e

[PATCH 03/44] dmaengine: ioat: don't use DMA_ERROR_CODE

DMA_ERROR_CODE is not a public API and will go away.  Instead properly
unwind based on the loop counter.

Signed-off-by: Christoph Hellwig 
---
 drivers/dma/ioat/init.c | 24 +++-
 1 file changed, 7 insertions(+), 17 deletions(-)

diff --git a/drivers/dma/ioat/init.c b/drivers/dma/ioat/init.c
index 6ad4384b3fa8..ed8ed1192775 100644
--- a/drivers/dma/ioat/init.c
+++ b/drivers/dma/ioat/init.c
@@ -839,8 +839,6 @@ static int ioat_xor_val_self_test(struct ioatdma_device 
*ioat_dma)
goto free_resources;
}
 
-   for (i = 0; i < IOAT_NUM_SRC_TEST; i++)
-   dma_srcs[i] = DMA_ERROR_CODE;
for (i = 0; i < IOAT_NUM_SRC_TEST; i++) {
dma_srcs[i] = dma_map_page(dev, xor_srcs[i], 0, PAGE_SIZE,
   DMA_TO_DEVICE);
@@ -910,8 +908,6 @@ static int ioat_xor_val_self_test(struct ioatdma_device 
*ioat_dma)
 
xor_val_result = 1;
 
-   for (i = 0; i < IOAT_NUM_SRC_TEST + 1; i++)
-   dma_srcs[i] = DMA_ERROR_CODE;
for (i = 0; i < IOAT_NUM_SRC_TEST + 1; i++) {
dma_srcs[i] = dma_map_page(dev, xor_val_srcs[i], 0, PAGE_SIZE,
   DMA_TO_DEVICE);
@@ -965,8 +961,6 @@ static int ioat_xor_val_self_test(struct ioatdma_device 
*ioat_dma)
op = IOAT_OP_XOR_VAL;
 
xor_val_result = 0;
-   for (i = 0; i < IOAT_NUM_SRC_TEST + 1; i++)
-   dma_srcs[i] = DMA_ERROR_CODE;
for (i = 0; i < IOAT_NUM_SRC_TEST + 1; i++) {
dma_srcs[i] = dma_map_page(dev, xor_val_srcs[i], 0, PAGE_SIZE,
   DMA_TO_DEVICE);
@@ -1017,18 +1011,14 @@ static int ioat_xor_val_self_test(struct ioatdma_device 
*ioat_dma)
goto free_resources;
 dma_unmap:
if (op == IOAT_OP_XOR) {
-   if (dest_dma != DMA_ERROR_CODE)
-   dma_unmap_page(dev, dest_dma, PAGE_SIZE,
-  DMA_FROM_DEVICE);
-   for (i = 0; i < IOAT_NUM_SRC_TEST; i++)
-   if (dma_srcs[i] != DMA_ERROR_CODE)
-   dma_unmap_page(dev, dma_srcs[i], PAGE_SIZE,
-  DMA_TO_DEVICE);
+   while (--i >= 0)
+   dma_unmap_page(dev, dma_srcs[i], PAGE_SIZE,
+  DMA_TO_DEVICE);
+   dma_unmap_page(dev, dest_dma, PAGE_SIZE, DMA_FROM_DEVICE);
} else if (op == IOAT_OP_XOR_VAL) {
-   for (i = 0; i < IOAT_NUM_SRC_TEST + 1; i++)
-   if (dma_srcs[i] != DMA_ERROR_CODE)
-   dma_unmap_page(dev, dma_srcs[i], PAGE_SIZE,
-  DMA_TO_DEVICE);
+   while (--i >= 0)
+   dma_unmap_page(dev, dma_srcs[i], PAGE_SIZE,
+  DMA_TO_DEVICE);
}
 free_resources:
dma->device_free_chan_resources(dma_chan);
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 04/44] drm/exynos: don't use DMA_ERROR_CODE

DMA_ERROR_CODE already isn't a valid API to user for drivers and will
go away soon.  exynos_drm_fb_dma_addr uses it a an error return when
the passed in index is invalid, but the callers never check for it
but instead pass the address straight to the hardware.

Add a WARN_ON instead and just return 0.

Signed-off-by: Christoph Hellwig 
---
 drivers/gpu/drm/exynos/exynos_drm_fb.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_fb.c 
b/drivers/gpu/drm/exynos/exynos_drm_fb.c
index c77a5aced81a..d48fd7c918f8 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_fb.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_fb.c
@@ -181,8 +181,8 @@ dma_addr_t exynos_drm_fb_dma_addr(struct drm_framebuffer 
*fb, int index)
 {
struct exynos_drm_fb *exynos_fb = to_exynos_fb(fb);
 
-   if (index >= MAX_FB_BUFFER)
-   return DMA_ERROR_CODE;
+   if (WARN_ON_ONCE(index >= MAX_FB_BUFFER))
+   return 0;
 
return exynos_fb->dma_addr[index];
 }
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 05/44] drm/armada: don't abuse DMA_ERROR_CODE

dev_addr isn't even a dma_addr_t, and DMA_ERROR_CODE has never been
a valid driver API.  Add a bool mapped flag instead.

Signed-off-by: Christoph Hellwig 
---
 drivers/gpu/drm/armada/armada_fb.c  | 2 +-
 drivers/gpu/drm/armada/armada_gem.c | 5 ++---
 drivers/gpu/drm/armada/armada_gem.h | 1 +
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/armada/armada_fb.c 
b/drivers/gpu/drm/armada/armada_fb.c
index 2a7eb6817c36..92e6b08ea64a 100644
--- a/drivers/gpu/drm/armada/armada_fb.c
+++ b/drivers/gpu/drm/armada/armada_fb.c
@@ -133,7 +133,7 @@ static struct drm_framebuffer *armada_fb_create(struct 
drm_device *dev,
}
 
/* Framebuffer objects must have a valid device address for scanout */
-   if (obj->dev_addr == DMA_ERROR_CODE) {
+   if (!obj->mapped) {
ret = -EINVAL;
goto err_unref;
}
diff --git a/drivers/gpu/drm/armada/armada_gem.c 
b/drivers/gpu/drm/armada/armada_gem.c
index d6c2a5d190eb..a76ca21d063b 100644
--- a/drivers/gpu/drm/armada/armada_gem.c
+++ b/drivers/gpu/drm/armada/armada_gem.c
@@ -175,6 +175,7 @@ armada_gem_linear_back(struct drm_device *dev, struct 
armada_gem_object *obj)
 
obj->phys_addr = obj->linear->start;
obj->dev_addr = obj->linear->start;
+   obj->mapped = true;
}
 
DRM_DEBUG_DRIVER("obj %p phys %#llx dev %#llx\n", obj,
@@ -205,7 +206,6 @@ armada_gem_alloc_private_object(struct drm_device *dev, 
size_t size)
return NULL;
 
drm_gem_private_object_init(dev, &obj->obj, size);
-   obj->dev_addr = DMA_ERROR_CODE;
 
DRM_DEBUG_DRIVER("alloc private obj %p size %zu\n", obj, size);
 
@@ -229,8 +229,6 @@ static struct armada_gem_object 
*armada_gem_alloc_object(struct drm_device *dev,
return NULL;
}
 
-   obj->dev_addr = DMA_ERROR_CODE;
-
mapping = obj->obj.filp->f_mapping;
mapping_set_gfp_mask(mapping, GFP_HIGHUSER | __GFP_RECLAIMABLE);
 
@@ -610,5 +608,6 @@ int armada_gem_map_import(struct armada_gem_object *dobj)
return -EINVAL;
}
dobj->dev_addr = sg_dma_address(dobj->sgt->sgl);
+   dobj->mapped = true;
return 0;
 }
diff --git a/drivers/gpu/drm/armada/armada_gem.h 
b/drivers/gpu/drm/armada/armada_gem.h
index b88d2b9853c7..6e524e0676bb 100644
--- a/drivers/gpu/drm/armada/armada_gem.h
+++ b/drivers/gpu/drm/armada/armada_gem.h
@@ -16,6 +16,7 @@ struct armada_gem_object {
void*addr;
phys_addr_t phys_addr;
resource_size_t dev_addr;
+   boolmapped;
struct drm_mm_node  *linear;/* for linear backed */
struct page *page;  /* for page backed */
struct sg_table *sgt;   /* for imported */
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 06/44] iommu/dma: don't rely on DMA_ERROR_CODE

DMA_ERROR_CODE is not a public API and will go away soon.  dma dma-iommu
driver already implements a proper ->mapping_error method, so it's only
using the value internally.  Add a new local define using the value
that arm64 which is the only current user of dma-iommu.

Signed-off-by: Christoph Hellwig 
---
 drivers/iommu/dma-iommu.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 62618e77bedc..638aea814b94 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -31,6 +31,8 @@
 #include 
 #include 
 
+#define IOMMU_MAPPING_ERROR(~(dma_addr_t)0)
+
 struct iommu_dma_msi_page {
struct list_headlist;
dma_addr_t  iova;
@@ -500,7 +502,7 @@ void iommu_dma_free(struct device *dev, struct page 
**pages, size_t size,
 {
__iommu_dma_unmap(iommu_get_domain_for_dev(dev), *handle, size);
__iommu_dma_free_pages(pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
-   *handle = DMA_ERROR_CODE;
+   *handle = IOMMU_MAPPING_ERROR;
 }
 
 /**
@@ -533,7 +535,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t 
size, gfp_t gfp,
dma_addr_t iova;
unsigned int count, min_size, alloc_sizes = domain->pgsize_bitmap;
 
-   *handle = DMA_ERROR_CODE;
+   *handle = IOMMU_MAPPING_ERROR;
 
min_size = alloc_sizes & -alloc_sizes;
if (min_size < PAGE_SIZE) {
@@ -627,11 +629,11 @@ static dma_addr_t __iommu_dma_map(struct device *dev, 
phys_addr_t phys,
 
iova = iommu_dma_alloc_iova(domain, size, dma_get_mask(dev), dev);
if (!iova)
-   return DMA_ERROR_CODE;
+   return IOMMU_MAPPING_ERROR;
 
if (iommu_map(domain, iova, phys - iova_off, size, prot)) {
iommu_dma_free_iova(cookie, iova, size);
-   return DMA_ERROR_CODE;
+   return IOMMU_MAPPING_ERROR;
}
return iova + iova_off;
 }
@@ -671,7 +673,7 @@ static int __finalise_sg(struct device *dev, struct 
scatterlist *sg, int nents,
 
s->offset += s_iova_off;
s->length = s_length;
-   sg_dma_address(s) = DMA_ERROR_CODE;
+   sg_dma_address(s) = IOMMU_MAPPING_ERROR;
sg_dma_len(s) = 0;
 
/*
@@ -714,11 +716,11 @@ static void __invalidate_sg(struct scatterlist *sg, int 
nents)
int i;
 
for_each_sg(sg, s, nents, i) {
-   if (sg_dma_address(s) != DMA_ERROR_CODE)
+   if (sg_dma_address(s) != IOMMU_MAPPING_ERROR)
s->offset += sg_dma_address(s);
if (sg_dma_len(s))
s->length = sg_dma_len(s);
-   sg_dma_address(s) = DMA_ERROR_CODE;
+   sg_dma_address(s) = IOMMU_MAPPING_ERROR;
sg_dma_len(s) = 0;
}
 }
@@ -836,7 +838,7 @@ void iommu_dma_unmap_resource(struct device *dev, 
dma_addr_t handle,
 
 int iommu_dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
 {
-   return dma_addr == DMA_ERROR_CODE;
+   return dma_addr == IOMMU_MAPPING_ERROR;
 }
 
 static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 07/44] xen-swiotlb: consolidate xen_swiotlb_dma_ops

ARM and x86 had duplicated versions of the dma_ops structure, the
only difference is that x86 hasn't wired up the set_dma_mask,
mmap, and get_sgtable ops yet.  On x86 all of them are identical
to the generic version, so they aren't needed but harmless.

All the symbols used only for xen_swiotlb_dma_ops can now be marked
static as well.

Signed-off-by: Christoph Hellwig 
---
 arch/arm/xen/mm.c  | 17 
 arch/x86/xen/pci-swiotlb-xen.c | 14 ---
 drivers/xen/swiotlb-xen.c  | 93 ++
 include/xen/swiotlb-xen.h  | 62 +---
 4 files changed, 49 insertions(+), 137 deletions(-)

diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index f0325d96b97a..785d2a562a23 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -185,23 +185,6 @@ EXPORT_SYMBOL_GPL(xen_destroy_contiguous_region);
 const struct dma_map_ops *xen_dma_ops;
 EXPORT_SYMBOL(xen_dma_ops);
 
-static const struct dma_map_ops xen_swiotlb_dma_ops = {
-   .alloc = xen_swiotlb_alloc_coherent,
-   .free = xen_swiotlb_free_coherent,
-   .sync_single_for_cpu = xen_swiotlb_sync_single_for_cpu,
-   .sync_single_for_device = xen_swiotlb_sync_single_for_device,
-   .sync_sg_for_cpu = xen_swiotlb_sync_sg_for_cpu,
-   .sync_sg_for_device = xen_swiotlb_sync_sg_for_device,
-   .map_sg = xen_swiotlb_map_sg_attrs,
-   .unmap_sg = xen_swiotlb_unmap_sg_attrs,
-   .map_page = xen_swiotlb_map_page,
-   .unmap_page = xen_swiotlb_unmap_page,
-   .dma_supported = xen_swiotlb_dma_supported,
-   .set_dma_mask = xen_swiotlb_set_dma_mask,
-   .mmap = xen_swiotlb_dma_mmap,
-   .get_sgtable = xen_swiotlb_get_sgtable,
-};
-
 int __init xen_mm_init(void)
 {
struct gnttab_cache_flush cflush;
diff --git a/arch/x86/xen/pci-swiotlb-xen.c b/arch/x86/xen/pci-swiotlb-xen.c
index 42b08f8fc2ca..37c6056a7bba 100644
--- a/arch/x86/xen/pci-swiotlb-xen.c
+++ b/arch/x86/xen/pci-swiotlb-xen.c
@@ -18,20 +18,6 @@
 
 int xen_swiotlb __read_mostly;
 
-static const struct dma_map_ops xen_swiotlb_dma_ops = {
-   .alloc = xen_swiotlb_alloc_coherent,
-   .free = xen_swiotlb_free_coherent,
-   .sync_single_for_cpu = xen_swiotlb_sync_single_for_cpu,
-   .sync_single_for_device = xen_swiotlb_sync_single_for_device,
-   .sync_sg_for_cpu = xen_swiotlb_sync_sg_for_cpu,
-   .sync_sg_for_device = xen_swiotlb_sync_sg_for_device,
-   .map_sg = xen_swiotlb_map_sg_attrs,
-   .unmap_sg = xen_swiotlb_unmap_sg_attrs,
-   .map_page = xen_swiotlb_map_page,
-   .unmap_page = xen_swiotlb_unmap_page,
-   .dma_supported = xen_swiotlb_dma_supported,
-};
-
 /*
  * pci_xen_swiotlb_detect - set xen_swiotlb to 1 if necessary
  *
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 8dab0d3dc172..a0f006daab48 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -295,7 +295,8 @@ int __ref xen_swiotlb_init(int verbose, bool early)
free_pages((unsigned long)xen_io_tlb_start, order);
return rc;
 }
-void *
+
+static void *
 xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
   dma_addr_t *dma_handle, gfp_t flags,
   unsigned long attrs)
@@ -346,9 +347,8 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t 
size,
memset(ret, 0, size);
return ret;
 }
-EXPORT_SYMBOL_GPL(xen_swiotlb_alloc_coherent);
 
-void
+static void
 xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
  dma_addr_t dev_addr, unsigned long attrs)
 {
@@ -369,8 +369,6 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t 
size, void *vaddr,
 
xen_free_coherent_pages(hwdev, size, vaddr, (dma_addr_t)phys, attrs);
 }
-EXPORT_SYMBOL_GPL(xen_swiotlb_free_coherent);
-
 
 /*
  * Map a single buffer of the indicated size for DMA in streaming mode.  The
@@ -379,7 +377,7 @@ EXPORT_SYMBOL_GPL(xen_swiotlb_free_coherent);
  * Once the device is given the dma address, the device owns this memory until
  * either xen_swiotlb_unmap_page or xen_swiotlb_dma_sync_single is performed.
  */
-dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
+static dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
unsigned long offset, size_t size,
enum dma_data_direction dir,
unsigned long attrs)
@@ -429,7 +427,6 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct 
page *page,
 
return DMA_ERROR_CODE;
 }
-EXPORT_SYMBOL_GPL(xen_swiotlb_map_page);
 
 /*
  * Unmap a single streaming mode DMA translation.  The dma_addr and size must
@@ -467,13 +464,12 @@ static void xen_unmap_single(struct device *hwdev, 
dma_addr_t dev_addr,
dma_mark_clean(phys_to_virt(paddr), size);
 }
 
-void xen_swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
+static void xen_

[PATCH 08/44] xen-swiotlb: implement ->mapping_error

DMA_ERROR_CODE is going to go away, so don't rely on it.

Signed-off-by: Christoph Hellwig 
---
 drivers/xen/swiotlb-xen.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index a0f006daab48..c3a04b2d7532 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -67,6 +67,8 @@ static unsigned long dma_alloc_coherent_mask(struct device 
*dev,
 }
 #endif
 
+#define XEN_SWIOTLB_ERROR_CODE (~(dma_addr_t)0x0)
+
 static char *xen_io_tlb_start, *xen_io_tlb_end;
 static unsigned long xen_io_tlb_nslabs;
 /*
@@ -410,7 +412,7 @@ static dma_addr_t xen_swiotlb_map_page(struct device *dev, 
struct page *page,
map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, size, dir,
 attrs);
if (map == SWIOTLB_MAP_ERROR)
-   return DMA_ERROR_CODE;
+   return XEN_SWIOTLB_ERROR_CODE;
 
dev_addr = xen_phys_to_bus(map);
xen_dma_map_page(dev, pfn_to_page(map >> PAGE_SHIFT),
@@ -425,7 +427,7 @@ static dma_addr_t xen_swiotlb_map_page(struct device *dev, 
struct page *page,
attrs |= DMA_ATTR_SKIP_CPU_SYNC;
swiotlb_tbl_unmap_single(dev, map, size, dir, attrs);
 
-   return DMA_ERROR_CODE;
+   return XEN_SWIOTLB_ERROR_CODE;
 }
 
 /*
@@ -715,6 +717,11 @@ xen_swiotlb_get_sgtable(struct device *dev, struct 
sg_table *sgt,
return dma_common_get_sgtable(dev, sgt, cpu_addr, handle, size);
 }
 
+static int xen_swiotlb_mapping_error(struct device *dev, dma_addr_t dma_addr)
+{
+   return dma_addr == XEN_SWIOTLB_ERROR_CODE;
+}
+
 const struct dma_map_ops xen_swiotlb_dma_ops = {
.alloc = xen_swiotlb_alloc_coherent,
.free = xen_swiotlb_free_coherent,
@@ -730,4 +737,5 @@ const struct dma_map_ops xen_swiotlb_dma_ops = {
.set_dma_mask = xen_swiotlb_set_dma_mask,
.mmap = xen_swiotlb_dma_mmap,
.get_sgtable = xen_swiotlb_get_sgtable,
+   .mapping_error  = xen_swiotlb_mapping_error,
 };
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 13/44] openrisc: remove DMA_ERROR_CODE

openrisc does not return errors for dma_map_page.

Signed-off-by: Christoph Hellwig 
---
 arch/openrisc/include/asm/dma-mapping.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/openrisc/include/asm/dma-mapping.h 
b/arch/openrisc/include/asm/dma-mapping.h
index 0c0075f17145..a4ea139c2ef9 100644
--- a/arch/openrisc/include/asm/dma-mapping.h
+++ b/arch/openrisc/include/asm/dma-mapping.h
@@ -26,8 +26,6 @@
 #include 
 #include 
 
-#define DMA_ERROR_CODE (~(dma_addr_t)0x0)
-
 extern const struct dma_map_ops or1k_dma_map_ops;
 
 static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 09/44] c6x: remove DMA_ERROR_CODE

Signed-off-by: Christoph Hellwig 
---
 arch/c6x/include/asm/dma-mapping.h | 5 -
 1 file changed, 5 deletions(-)

diff --git a/arch/c6x/include/asm/dma-mapping.h 
b/arch/c6x/include/asm/dma-mapping.h
index aca9f755e4f8..05daf1038111 100644
--- a/arch/c6x/include/asm/dma-mapping.h
+++ b/arch/c6x/include/asm/dma-mapping.h
@@ -12,11 +12,6 @@
 #ifndef _ASM_C6X_DMA_MAPPING_H
 #define _ASM_C6X_DMA_MAPPING_H
 
-/*
- * DMA errors are defined by all-bits-set in the DMA address.
- */
-#define DMA_ERROR_CODE ~0
-
 extern const struct dma_map_ops c6x_dma_ops;
 
 static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 10/44] ia64: remove DMA_ERROR_CODE

All ia64 dma_mapping_ops instances already have a mapping_error member.

Signed-off-by: Christoph Hellwig 
---
 arch/ia64/include/asm/dma-mapping.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/ia64/include/asm/dma-mapping.h 
b/arch/ia64/include/asm/dma-mapping.h
index 73ec3c6f4cfe..3ce5ab4339f3 100644
--- a/arch/ia64/include/asm/dma-mapping.h
+++ b/arch/ia64/include/asm/dma-mapping.h
@@ -12,8 +12,6 @@
 
 #define ARCH_HAS_DMA_GET_REQUIRED_MASK
 
-#define DMA_ERROR_CODE 0
-
 extern const struct dma_map_ops *dma_ops;
 extern struct ia64_machine_vector ia64_mv;
 extern void set_iommu_machvec(void);
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 12/44] microblaze: remove DMA_ERROR_CODE

microblaze does not return errors for dma_map_page.

Signed-off-by: Christoph Hellwig 
---
 arch/microblaze/include/asm/dma-mapping.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/microblaze/include/asm/dma-mapping.h 
b/arch/microblaze/include/asm/dma-mapping.h
index 3fad5e722a66..e15cd2f76e23 100644
--- a/arch/microblaze/include/asm/dma-mapping.h
+++ b/arch/microblaze/include/asm/dma-mapping.h
@@ -28,8 +28,6 @@
 #include 
 #include 
 
-#define DMA_ERROR_CODE (~(dma_addr_t)0x0)
-
 #define __dma_alloc_coherent(dev, gfp, size, handle)   NULL
 #define __dma_free_coherent(size, addr)((void)0)
 
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 11/44] m32r: remove DMA_ERROR_CODE

dma-noop is the only dma_mapping_ops instance for m32r and does not return
errors.

Signed-off-by: Christoph Hellwig 
---
 arch/m32r/include/asm/dma-mapping.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/m32r/include/asm/dma-mapping.h 
b/arch/m32r/include/asm/dma-mapping.h
index c01d9f52d228..aff3ae8b62f7 100644
--- a/arch/m32r/include/asm/dma-mapping.h
+++ b/arch/m32r/include/asm/dma-mapping.h
@@ -8,8 +8,6 @@
 #include 
 #include 
 
-#define DMA_ERROR_CODE (~(dma_addr_t)0x0)
-
 static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
 {
return &dma_noop_ops;
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 14/44] sh: remove DMA_ERROR_CODE

sh does not return errors for dma_map_page.

Signed-off-by: Christoph Hellwig 
---
 arch/sh/include/asm/dma-mapping.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/sh/include/asm/dma-mapping.h 
b/arch/sh/include/asm/dma-mapping.h
index d99008af5f73..9b06be07db4d 100644
--- a/arch/sh/include/asm/dma-mapping.h
+++ b/arch/sh/include/asm/dma-mapping.h
@@ -9,8 +9,6 @@ static inline const struct dma_map_ops *get_arch_dma_ops(struct 
bus_type *bus)
return dma_ops;
 }
 
-#define DMA_ERROR_CODE 0
-
 void dma_cache_sync(struct device *dev, void *vaddr, size_t size,
enum dma_data_direction dir);
 
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 15/44] xtensa: remove DMA_ERROR_CODE

xtensa already implements the mapping_error method for its only
dma_map_ops instance.

Signed-off-by: Christoph Hellwig 
---
 arch/xtensa/include/asm/dma-mapping.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/xtensa/include/asm/dma-mapping.h 
b/arch/xtensa/include/asm/dma-mapping.h
index c6140fa8c0be..269738dc9d1d 100644
--- a/arch/xtensa/include/asm/dma-mapping.h
+++ b/arch/xtensa/include/asm/dma-mapping.h
@@ -16,8 +16,6 @@
 #include 
 #include 
 
-#define DMA_ERROR_CODE (~(dma_addr_t)0x0)
-
 extern const struct dma_map_ops xtensa_dma_map_ops;
 
 static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 17/44] hexagon: switch to use ->mapping_error for error reporting

Signed-off-by: Christoph Hellwig 
---
 arch/hexagon/include/asm/dma-mapping.h |  2 --
 arch/hexagon/kernel/dma.c  | 12 +---
 arch/hexagon/kernel/hexagon_ksyms.c|  1 -
 3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/hexagon/include/asm/dma-mapping.h 
b/arch/hexagon/include/asm/dma-mapping.h
index d3a87bd9b686..00e3f10113b0 100644
--- a/arch/hexagon/include/asm/dma-mapping.h
+++ b/arch/hexagon/include/asm/dma-mapping.h
@@ -29,8 +29,6 @@
 #include 
 
 struct device;
-extern int bad_dma_address;
-#define DMA_ERROR_CODE bad_dma_address
 
 extern const struct dma_map_ops *dma_ops;
 
diff --git a/arch/hexagon/kernel/dma.c b/arch/hexagon/kernel/dma.c
index e74b65009587..71269dc0f225 100644
--- a/arch/hexagon/kernel/dma.c
+++ b/arch/hexagon/kernel/dma.c
@@ -25,11 +25,11 @@
 #include 
 #include 
 
+#define HEXAGON_MAPPING_ERROR  0
+
 const struct dma_map_ops *dma_ops;
 EXPORT_SYMBOL(dma_ops);
 
-int bad_dma_address;  /*  globals are automatically initialized to zero  */
-
 static inline void *dma_addr_to_virt(dma_addr_t dma_addr)
 {
return phys_to_virt((unsigned long) dma_addr);
@@ -181,7 +181,7 @@ static dma_addr_t hexagon_map_page(struct device *dev, 
struct page *page,
WARN_ON(size == 0);
 
if (!check_addr("map_single", dev, bus, size))
-   return bad_dma_address;
+   return HEXAGON_MAPPING_ERROR;
 
if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
dma_sync(dma_addr_to_virt(bus), size, dir);
@@ -203,6 +203,11 @@ static void hexagon_sync_single_for_device(struct device 
*dev,
dma_sync(dma_addr_to_virt(dma_handle), size, dir);
 }
 
+static int hexagon_mapping_error(struct device *dev, dma_addr_t dma_addr)
+{
+   return dma_addr == HEXAGON_MAPPING_ERROR;
+}
+
 const struct dma_map_ops hexagon_dma_ops = {
.alloc  = hexagon_dma_alloc_coherent,
.free   = hexagon_free_coherent,
@@ -210,6 +215,7 @@ const struct dma_map_ops hexagon_dma_ops = {
.map_page   = hexagon_map_page,
.sync_single_for_cpu = hexagon_sync_single_for_cpu,
.sync_single_for_device = hexagon_sync_single_for_device,
+   .mapping_error  = hexagon_mapping_error;
.is_phys= 1,
 };
 
diff --git a/arch/hexagon/kernel/hexagon_ksyms.c 
b/arch/hexagon/kernel/hexagon_ksyms.c
index 00bcad9cbd8f..aa248f595431 100644
--- a/arch/hexagon/kernel/hexagon_ksyms.c
+++ b/arch/hexagon/kernel/hexagon_ksyms.c
@@ -40,7 +40,6 @@ EXPORT_SYMBOL(memset);
 /* Additional variables */
 EXPORT_SYMBOL(__phys_offset);
 EXPORT_SYMBOL(_dflt_cache_att);
-EXPORT_SYMBOL(bad_dma_address);
 
 #define DECLARE_EXPORT(name) \
extern void name(void); EXPORT_SYMBOL(name)
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 16/44] arm64: remove DMA_ERROR_CODE

The dma alloc interface returns an error by return NULL, and the
mapping interfaces rely on the mapping_error method, which the dummy
ops already implement correctly.

Thus remove the DMA_ERROR_CODE define.

Signed-off-by: Christoph Hellwig 
---
 arch/arm64/include/asm/dma-mapping.h | 1 -
 arch/arm64/mm/dma-mapping.c  | 3 +--
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/dma-mapping.h 
b/arch/arm64/include/asm/dma-mapping.h
index 5392dbeffa45..cf8fc8f05580 100644
--- a/arch/arm64/include/asm/dma-mapping.h
+++ b/arch/arm64/include/asm/dma-mapping.h
@@ -24,7 +24,6 @@
 #include 
 #include 
 
-#define DMA_ERROR_CODE (~(dma_addr_t)0)
 extern const struct dma_map_ops dummy_dma_ops;
 
 static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 3216e098c058..147fbb907a2f 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -184,7 +184,6 @@ static void *__dma_alloc(struct device *dev, size_t size,
 no_map:
__dma_free_coherent(dev, size, ptr, *dma_handle, attrs);
 no_mem:
-   *dma_handle = DMA_ERROR_CODE;
return NULL;
 }
 
@@ -487,7 +486,7 @@ static dma_addr_t __dummy_map_page(struct device *dev, 
struct page *page,
   enum dma_data_direction dir,
   unsigned long attrs)
 {
-   return DMA_ERROR_CODE;
+   return 0;
 }
 
 static void __dummy_unmap_page(struct device *dev, dma_addr_t dev_addr,
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v6 10/34] x86, x86/mm, x86/xen, olpc: Use __va() against just the physical address in cr3


On 6/7/2017 5:06 PM, Boris Ostrovsky wrote:

On 06/07/2017 03:14 PM, Tom Lendacky wrote:

The cr3 register entry can contain the SME encryption bit that indicates
the PGD is encrypted.  The encryption bit should not be used when creating
a virtual address for the PGD table.

Create a new function, read_cr3_pa(), that will extract the physical
address from the cr3 register. This function is then used where a virtual
address of the PGD needs to be created/used from the cr3 register.

Signed-off-by: Tom Lendacky 
---
  arch/x86/include/asm/special_insns.h |9 +
  arch/x86/kernel/head64.c |2 +-
  arch/x86/mm/fault.c  |   10 +-
  arch/x86/mm/ioremap.c|2 +-
  arch/x86/platform/olpc/olpc-xo1-pm.c |2 +-
  arch/x86/power/hibernate_64.c|2 +-
  arch/x86/xen/mmu_pv.c|6 +++---
  7 files changed, 21 insertions(+), 12 deletions(-)



...


diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
index 1f386d7..2dc5243 100644
--- a/arch/x86/xen/mmu_pv.c
+++ b/arch/x86/xen/mmu_pv.c
@@ -2022,7 +2022,7 @@ static phys_addr_t __init xen_early_virt_to_phys(unsigned 
long vaddr)
pmd_t pmd;
pte_t pte;
  
-	pa = read_cr3();

+   pa = read_cr3_pa();
pgd = native_make_pgd(xen_read_phys_ulong(pa + pgd_index(vaddr) *
   sizeof(pgd)));
if (!pgd_present(pgd))
@@ -2102,7 +2102,7 @@ void __init xen_relocate_p2m(void)
pt_phys = pmd_phys + PFN_PHYS(n_pmd);
p2m_pfn = PFN_DOWN(pt_phys) + n_pt;
  
-	pgd = __va(read_cr3());

+   pgd = __va(read_cr3_pa());
new_p2m = (unsigned long *)(2 * PGDIR_SIZE);
idx_p4d = 0;
save_pud = n_pud;
@@ -2209,7 +2209,7 @@ static void __init xen_write_cr3_init(unsigned long cr3)
  {
unsigned long pfn = PFN_DOWN(__pa(swapper_pg_dir));
  
-	BUG_ON(read_cr3() != __pa(initial_page_table));

+   BUG_ON(read_cr3_pa() != __pa(initial_page_table));
BUG_ON(cr3 != __pa(swapper_pg_dir));
  
  	/*



(Please copy Xen maintainers when modifying xen-related files.)


Sorry about that, missed adding the Xen maintainers when I added this
change.



Given that page tables for Xen PV guests are controlled by the
hypervisor I don't think this change (although harmless) is necessary.


I can back this change out if the Xen maintainers think that's best.


What may be needed is making sure X86_FEATURE_SME is not set for PV guests.


And that may be something that Xen will need to control through either
CPUID or MSR support for the PV guests.

Thanks,
Tom



-boris


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 22/44] x86/pci-nommu: implement ->mapping_error

DMA_ERROR_CODE is going to go away, so don't rely on it.

Signed-off-by: Christoph Hellwig 
---
 arch/x86/kernel/pci-nommu.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/pci-nommu.c b/arch/x86/kernel/pci-nommu.c
index a88952ef371c..085fe6ce4049 100644
--- a/arch/x86/kernel/pci-nommu.c
+++ b/arch/x86/kernel/pci-nommu.c
@@ -11,6 +11,8 @@
 #include 
 #include 
 
+#define NOMMU_MAPPING_ERROR0
+
 static int
 check_addr(char *name, struct device *hwdev, dma_addr_t bus, size_t size)
 {
@@ -33,7 +35,7 @@ static dma_addr_t nommu_map_page(struct device *dev, struct 
page *page,
dma_addr_t bus = page_to_phys(page) + offset;
WARN_ON(size == 0);
if (!check_addr("map_single", dev, bus, size))
-   return DMA_ERROR_CODE;
+   return NOMMU_MAPPING_ERROR;
flush_write_buffers();
return bus;
 }
@@ -88,6 +90,11 @@ static void nommu_sync_sg_for_device(struct device *dev,
flush_write_buffers();
 }
 
+static int nommu_mapping_error(struct device *dev, dma_addr_t dma_addr)
+{
+   return dma_addr == NOMMU_MAPPING_ERROR;
+}
+
 const struct dma_map_ops nommu_dma_ops = {
.alloc  = dma_generic_alloc_coherent,
.free   = dma_generic_free_coherent,
@@ -96,4 +103,5 @@ const struct dma_map_ops nommu_dma_ops = {
.sync_single_for_device = nommu_sync_single_for_device,
.sync_sg_for_device = nommu_sync_sg_for_device,
.is_phys= 1,
+   .mapping_error  = nommu_mapping_error,
 };
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 33/44] openrisc: remove arch-specific dma_supported implementation

This implementation is simply bogus - hexagon only has a simple
direct mapped DMA implementation and thus doesn't care about the
address.

Signed-off-by: Christoph Hellwig 
---
 arch/openrisc/include/asm/dma-mapping.h | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/arch/openrisc/include/asm/dma-mapping.h 
b/arch/openrisc/include/asm/dma-mapping.h
index a4ea139c2ef9..f41bd3cb76d9 100644
--- a/arch/openrisc/include/asm/dma-mapping.h
+++ b/arch/openrisc/include/asm/dma-mapping.h
@@ -33,11 +33,4 @@ static inline const struct dma_map_ops 
*get_arch_dma_ops(struct bus_type *bus)
return &or1k_dma_map_ops;
 }
 
-#define HAVE_ARCH_DMA_SUPPORTED 1
-static inline int dma_supported(struct device *dev, u64 dma_mask)
-{
-   /* Support 32 bit DMA mask exclusively */
-   return dma_mask == DMA_BIT_MASK(32);
-}
-
 #endif /* __ASM_OPENRISC_DMA_MAPPING_H */
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 42/44] powerpc/cell: use the dma_supported method for ops switching

Besides removing the last instance of the set_dma_mask method this also
reduced the code duplication.

Signed-off-by: Christoph Hellwig 
---
 arch/powerpc/platforms/cell/iommu.c | 25 +
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/platforms/cell/iommu.c 
b/arch/powerpc/platforms/cell/iommu.c
index 497bfbdbd967..29d4f96ed33e 100644
--- a/arch/powerpc/platforms/cell/iommu.c
+++ b/arch/powerpc/platforms/cell/iommu.c
@@ -644,20 +644,14 @@ static void dma_fixed_unmap_sg(struct device *dev, struct 
scatterlist *sg,
   direction, attrs);
 }
 
-static int dma_fixed_dma_supported(struct device *dev, u64 mask)
-{
-   return mask == DMA_BIT_MASK(64);
-}
-
-static int dma_set_mask_and_switch(struct device *dev, u64 dma_mask);
+static int dma_suported_and_switch(struct device *dev, u64 dma_mask);
 
 static const struct dma_map_ops dma_iommu_fixed_ops = {
.alloc  = dma_fixed_alloc_coherent,
.free   = dma_fixed_free_coherent,
.map_sg = dma_fixed_map_sg,
.unmap_sg   = dma_fixed_unmap_sg,
-   .dma_supported  = dma_fixed_dma_supported,
-   .set_dma_mask   = dma_set_mask_and_switch,
+   .dma_supported  = dma_suported_and_switch,
.map_page   = dma_fixed_map_page,
.unmap_page = dma_fixed_unmap_page,
.mapping_error  = dma_iommu_mapping_error,
@@ -952,11 +946,8 @@ static u64 cell_iommu_get_fixed_address(struct device *dev)
return dev_addr;
 }
 
-static int dma_set_mask_and_switch(struct device *dev, u64 dma_mask)
+static int dma_suported_and_switch(struct device *dev, u64 dma_mask)
 {
-   if (!dev->dma_mask || !dma_supported(dev, dma_mask))
-   return -EIO;
-
if (dma_mask == DMA_BIT_MASK(64) &&
cell_iommu_get_fixed_address(dev) != OF_BAD_ADDR) {
u64 addr = cell_iommu_get_fixed_address(dev) +
@@ -965,14 +956,16 @@ static int dma_set_mask_and_switch(struct device *dev, 
u64 dma_mask)
dev_dbg(dev, "iommu: fixed addr = %llx\n", addr);
set_dma_ops(dev, &dma_iommu_fixed_ops);
set_dma_offset(dev, addr);
-   } else {
+   return 1;
+   }
+
+   if (dma_iommu_dma_supported(dev, dma_mask)) {
dev_dbg(dev, "iommu: not 64-bit, using default ops\n");
set_dma_ops(dev, get_pci_dma_ops());
cell_dma_dev_setup(dev);
+   return 1;
}
 
-   *dev->dma_mask = dma_mask;
-
return 0;
 }
 
@@ -1127,7 +1120,7 @@ static int __init cell_iommu_fixed_mapping_init(void)
cell_iommu_setup_window(iommu, np, dbase, dsize, 0);
}
 
-   dma_iommu_ops.set_dma_mask = dma_set_mask_and_switch;
+   dma_iommu_ops.dma_supported = dma_suported_and_switch;
set_pci_dma_ops(&dma_iommu_ops);
 
return 0;
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 26/44] dma-mapping: remove DMA_ERROR_CODE

And update the documentation - dma_mapping_error has been supported
everywhere for a long time.

Signed-off-by: Christoph Hellwig 
---
 Documentation/DMA-API-HOWTO.txt | 31 +--
 include/linux/dma-mapping.h |  5 -
 2 files changed, 5 insertions(+), 31 deletions(-)

diff --git a/Documentation/DMA-API-HOWTO.txt b/Documentation/DMA-API-HOWTO.txt
index 979228bc9035..4ed388356898 100644
--- a/Documentation/DMA-API-HOWTO.txt
+++ b/Documentation/DMA-API-HOWTO.txt
@@ -550,32 +550,11 @@ and to unmap it:
dma_unmap_single(dev, dma_handle, size, direction);
 
 You should call dma_mapping_error() as dma_map_single() could fail and return
-error. Not all DMA implementations support the dma_mapping_error() interface.
-However, it is a good practice to call dma_mapping_error() interface, which
-will invoke the generic mapping error check interface. Doing so will ensure
-that the mapping code will work correctly on all DMA implementations without
-any dependency on the specifics of the underlying implementation. Using the
-returned address without checking for errors could result in failures ranging
-from panics to silent data corruption. A couple of examples of incorrect ways
-to check for errors that make assumptions about the underlying DMA
-implementation are as follows and these are applicable to dma_map_page() as
-well.
-
-Incorrect example 1:
-   dma_addr_t dma_handle;
-
-   dma_handle = dma_map_single(dev, addr, size, direction);
-   if ((dma_handle & 0x != 0) || (dma_handle >= 0x100)) {
-   goto map_error;
-   }
-
-Incorrect example 2:
-   dma_addr_t dma_handle;
-
-   dma_handle = dma_map_single(dev, addr, size, direction);
-   if (dma_handle == DMA_ERROR_CODE) {
-   goto map_error;
-   }
+error.  Doing so will ensure that the mapping code will work correctly on all
+DMA implementations without any dependency on the specifics of the underlying
+implementation. Using the returned address without checking for errors could
+result in failures ranging from panics to silent data corruption.  The same
+applies to dma_map_page() as well.
 
 You should call dma_unmap_single() when the DMA activity is finished, e.g.,
 from the interrupt which told you that the DMA transfer is done.
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 4f3eecedca2d..a57875309bfd 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -546,12 +546,7 @@ static inline int dma_mapping_error(struct device *dev, 
dma_addr_t dma_addr)
 
if (get_dma_ops(dev)->mapping_error)
return get_dma_ops(dev)->mapping_error(dev, dma_addr);
-
-#ifdef DMA_ERROR_CODE
-   return dma_addr == DMA_ERROR_CODE;
-#else
return 0;
-#endif
 }
 
 #ifndef HAVE_ARCH_DMA_SUPPORTED
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 40/44] tile: remove dma_supported and mapping_error methods

These just duplicate the default behavior if no method is provided.

Signed-off-by: Christoph Hellwig 
---
 arch/tile/kernel/pci-dma.c | 30 --
 1 file changed, 30 deletions(-)

diff --git a/arch/tile/kernel/pci-dma.c b/arch/tile/kernel/pci-dma.c
index 569bb6dd154a..f2abedc8a080 100644
--- a/arch/tile/kernel/pci-dma.c
+++ b/arch/tile/kernel/pci-dma.c
@@ -317,18 +317,6 @@ static void tile_dma_sync_sg_for_device(struct device *dev,
}
 }
 
-static inline int
-tile_dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-   return 0;
-}
-
-static inline int
-tile_dma_supported(struct device *dev, u64 mask)
-{
-   return 1;
-}
-
 static const struct dma_map_ops tile_default_dma_map_ops = {
.alloc = tile_dma_alloc_coherent,
.free = tile_dma_free_coherent,
@@ -340,8 +328,6 @@ static const struct dma_map_ops tile_default_dma_map_ops = {
.sync_single_for_device = tile_dma_sync_single_for_device,
.sync_sg_for_cpu = tile_dma_sync_sg_for_cpu,
.sync_sg_for_device = tile_dma_sync_sg_for_device,
-   .mapping_error = tile_dma_mapping_error,
-   .dma_supported = tile_dma_supported
 };
 
 const struct dma_map_ops *tile_dma_map_ops = &tile_default_dma_map_ops;
@@ -504,18 +490,6 @@ static void tile_pci_dma_sync_sg_for_device(struct device 
*dev,
}
 }
 
-static inline int
-tile_pci_dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-   return 0;
-}
-
-static inline int
-tile_pci_dma_supported(struct device *dev, u64 mask)
-{
-   return 1;
-}
-
 static const struct dma_map_ops tile_pci_default_dma_map_ops = {
.alloc = tile_pci_dma_alloc_coherent,
.free = tile_pci_dma_free_coherent,
@@ -527,8 +501,6 @@ static const struct dma_map_ops 
tile_pci_default_dma_map_ops = {
.sync_single_for_device = tile_pci_dma_sync_single_for_device,
.sync_sg_for_cpu = tile_pci_dma_sync_sg_for_cpu,
.sync_sg_for_device = tile_pci_dma_sync_sg_for_device,
-   .mapping_error = tile_pci_dma_mapping_error,
-   .dma_supported = tile_pci_dma_supported
 };
 
 const struct dma_map_ops *gx_pci_dma_map_ops = &tile_pci_default_dma_map_ops;
@@ -578,8 +550,6 @@ static const struct dma_map_ops pci_hybrid_dma_ops = {
.sync_single_for_device = tile_pci_dma_sync_single_for_device,
.sync_sg_for_cpu = tile_pci_dma_sync_sg_for_cpu,
.sync_sg_for_device = tile_pci_dma_sync_sg_for_device,
-   .mapping_error = tile_pci_dma_mapping_error,
-   .dma_supported = tile_pci_dma_supported
 };
 
 const struct dma_map_ops *gx_legacy_pci_dma_map_ops = &pci_swiotlb_dma_ops;
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 44/44] powerpc: merge __dma_set_mask into dma_set_mask

Signed-off-by: Christoph Hellwig 
---
 arch/powerpc/include/asm/dma-mapping.h |  1 -
 arch/powerpc/kernel/dma.c  | 13 -
 2 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/dma-mapping.h 
b/arch/powerpc/include/asm/dma-mapping.h
index 73aedbe6c977..eaece3d3e225 100644
--- a/arch/powerpc/include/asm/dma-mapping.h
+++ b/arch/powerpc/include/asm/dma-mapping.h
@@ -112,7 +112,6 @@ static inline void set_dma_offset(struct device *dev, 
dma_addr_t off)
 #define HAVE_ARCH_DMA_SET_MASK 1
 extern int dma_set_mask(struct device *dev, u64 dma_mask);
 
-extern int __dma_set_mask(struct device *dev, u64 dma_mask);
 extern u64 __dma_get_required_mask(struct device *dev);
 
 static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t 
size)
diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
index 466c9f07b288..4194db10 100644
--- a/arch/powerpc/kernel/dma.c
+++ b/arch/powerpc/kernel/dma.c
@@ -314,14 +314,6 @@ EXPORT_SYMBOL(dma_set_coherent_mask);
 
 #define PREALLOC_DMA_DEBUG_ENTRIES (1 << 16)
 
-int __dma_set_mask(struct device *dev, u64 dma_mask)
-{
-   if (!dev->dma_mask || !dma_supported(dev, dma_mask))
-   return -EIO;
-   *dev->dma_mask = dma_mask;
-   return 0;
-}
-
 int dma_set_mask(struct device *dev, u64 dma_mask)
 {
if (ppc_md.dma_set_mask)
@@ -334,7 +326,10 @@ int dma_set_mask(struct device *dev, u64 dma_mask)
return phb->controller_ops.dma_set_mask(pdev, dma_mask);
}
 
-   return __dma_set_mask(dev, dma_mask);
+   if (!dev->dma_mask || !dma_supported(dev, dma_mask))
+   return -EIO;
+   *dev->dma_mask = dma_mask;
+   return 0;
 }
 EXPORT_SYMBOL(dma_set_mask);
 
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 27/44] sparc: remove leon_dma_ops

We can just use pci32_dma_ops.

Btw, given that leon is 32-bit and appears to be PCI based, do even need
the special case for it in get_arch_dma_ops at all?

Signed-off-by: Christoph Hellwig 
---
 arch/sparc/include/asm/dma-mapping.h | 3 +--
 arch/sparc/kernel/ioport.c   | 5 +
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/sparc/include/asm/dma-mapping.h 
b/arch/sparc/include/asm/dma-mapping.h
index b8e8dfcd065d..98da9f92c318 100644
--- a/arch/sparc/include/asm/dma-mapping.h
+++ b/arch/sparc/include/asm/dma-mapping.h
@@ -17,7 +17,6 @@ static inline void dma_cache_sync(struct device *dev, void 
*vaddr, size_t size,
 }
 
 extern const struct dma_map_ops *dma_ops;
-extern const struct dma_map_ops *leon_dma_ops;
 extern const struct dma_map_ops pci32_dma_ops;
 
 extern struct bus_type pci_bus_type;
@@ -26,7 +25,7 @@ static inline const struct dma_map_ops 
*get_arch_dma_ops(struct bus_type *bus)
 {
 #ifdef CONFIG_SPARC_LEON
if (sparc_cpu_model == sparc_leon)
-   return leon_dma_ops;
+   return &pci32_dma_ops;
 #endif
 #if defined(CONFIG_SPARC32) && defined(CONFIG_PCI)
if (bus == &pci_bus_type)
diff --git a/arch/sparc/kernel/ioport.c b/arch/sparc/kernel/ioport.c
index cf20033a1458..dd081d557609 100644
--- a/arch/sparc/kernel/ioport.c
+++ b/arch/sparc/kernel/ioport.c
@@ -637,6 +637,7 @@ static void pci32_sync_sg_for_device(struct device *device, 
struct scatterlist *
}
 }
 
+/* note: leon re-uses pci32_dma_ops */
 const struct dma_map_ops pci32_dma_ops = {
.alloc  = pci32_alloc_coherent,
.free   = pci32_free_coherent,
@@ -651,10 +652,6 @@ const struct dma_map_ops pci32_dma_ops = {
 };
 EXPORT_SYMBOL(pci32_dma_ops);
 
-/* leon re-uses pci32_dma_ops */
-const struct dma_map_ops *leon_dma_ops = &pci32_dma_ops;
-EXPORT_SYMBOL(leon_dma_ops);
-
 const struct dma_map_ops *dma_ops = &sbus_dma_ops;
 EXPORT_SYMBOL(dma_ops);
 
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 34/44] arm: remove arch specific dma_supported implementation

And instead wire it up as method for all the dma_map_ops instances.

Note that the code seems a little fishy for dmabounce and iommu, but
for now I'd like to preserve the existing behavior 1:1.

Signed-off-by: Christoph Hellwig 
---
 arch/arm/common/dmabounce.c| 1 +
 arch/arm/include/asm/dma-iommu.h   | 2 ++
 arch/arm/include/asm/dma-mapping.h | 3 ---
 arch/arm/mm/dma-mapping.c  | 7 +--
 4 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/arm/common/dmabounce.c b/arch/arm/common/dmabounce.c
index bad457395ff1..4aabf117e136 100644
--- a/arch/arm/common/dmabounce.c
+++ b/arch/arm/common/dmabounce.c
@@ -476,6 +476,7 @@ static const struct dma_map_ops dmabounce_ops = {
.sync_sg_for_device = arm_dma_sync_sg_for_device,
.set_dma_mask   = dmabounce_set_mask,
.mapping_error  = dmabounce_mapping_error,
+   .dma_supported  = arm_dma_supported,
 };
 
 static int dmabounce_init_pool(struct dmabounce_pool *pool, struct device *dev,
diff --git a/arch/arm/include/asm/dma-iommu.h b/arch/arm/include/asm/dma-iommu.h
index 389a26a10ea3..c090ec675eac 100644
--- a/arch/arm/include/asm/dma-iommu.h
+++ b/arch/arm/include/asm/dma-iommu.h
@@ -35,5 +35,7 @@ int arm_iommu_attach_device(struct device *dev,
struct dma_iommu_mapping *mapping);
 void arm_iommu_detach_device(struct device *dev);
 
+int arm_dma_supported(struct device *dev, u64 mask);
+
 #endif /* __KERNEL__ */
 #endif
diff --git a/arch/arm/include/asm/dma-mapping.h 
b/arch/arm/include/asm/dma-mapping.h
index 52a8fd5a8edb..8dabcfdf4505 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -20,9 +20,6 @@ static inline const struct dma_map_ops 
*get_arch_dma_ops(struct bus_type *bus)
return &arm_dma_ops;
 }
 
-#define HAVE_ARCH_DMA_SUPPORTED 1
-extern int dma_supported(struct device *dev, u64 mask);
-
 #ifdef __arch_page_to_dma
 #error Please update to __arch_pfn_to_dma
 #endif
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 2dbc94b5fe5c..2938b724826e 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -199,6 +199,7 @@ const struct dma_map_ops arm_dma_ops = {
.sync_sg_for_cpu= arm_dma_sync_sg_for_cpu,
.sync_sg_for_device = arm_dma_sync_sg_for_device,
.mapping_error  = arm_dma_mapping_error,
+   .dma_supported  = arm_dma_supported,
 };
 EXPORT_SYMBOL(arm_dma_ops);
 
@@ -218,6 +219,7 @@ const struct dma_map_ops arm_coherent_dma_ops = {
.map_page   = arm_coherent_dma_map_page,
.map_sg = arm_dma_map_sg,
.mapping_error  = arm_dma_mapping_error,
+   .dma_supported  = arm_dma_supported,
 };
 EXPORT_SYMBOL(arm_coherent_dma_ops);
 
@@ -1184,11 +1186,10 @@ void arm_dma_sync_sg_for_device(struct device *dev, 
struct scatterlist *sg,
  * during bus mastering, then you would pass 0x00ff as the mask
  * to this function.
  */
-int dma_supported(struct device *dev, u64 mask)
+int arm_dma_supported(struct device *dev, u64 mask)
 {
return __dma_supported(dev, mask, false);
 }
-EXPORT_SYMBOL(dma_supported);
 
 #define PREALLOC_DMA_DEBUG_ENTRIES 4096
 
@@ -2149,6 +2150,7 @@ const struct dma_map_ops iommu_ops = {
.unmap_resource = arm_iommu_unmap_resource,
 
.mapping_error  = arm_dma_mapping_error,
+   .dma_supported  = arm_dma_supported,
 };
 
 const struct dma_map_ops iommu_coherent_ops = {
@@ -2167,6 +2169,7 @@ const struct dma_map_ops iommu_coherent_ops = {
.unmap_resource = arm_iommu_unmap_resource,
 
.mapping_error  = arm_dma_mapping_error,
+   .dma_supported  = arm_dma_supported,
 };
 
 /**
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 25/44] arm: implement ->mapping_error

DMA_ERROR_CODE is going to go away, so don't rely on it.

Signed-off-by: Christoph Hellwig 
---
 arch/arm/common/dmabounce.c| 16 ---
 arch/arm/include/asm/dma-iommu.h   |  2 ++
 arch/arm/include/asm/dma-mapping.h |  1 -
 arch/arm/mm/dma-mapping.c  | 41 --
 4 files changed, 41 insertions(+), 19 deletions(-)

diff --git a/arch/arm/common/dmabounce.c b/arch/arm/common/dmabounce.c
index 9b1b7be2ec0e..bad457395ff1 100644
--- a/arch/arm/common/dmabounce.c
+++ b/arch/arm/common/dmabounce.c
@@ -33,6 +33,7 @@
 #include 
 
 #include 
+#include 
 
 #undef STATS
 
@@ -256,7 +257,7 @@ static inline dma_addr_t map_single(struct device *dev, 
void *ptr, size_t size,
if (buf == NULL) {
dev_err(dev, "%s: unable to map unsafe buffer %p!\n",
   __func__, ptr);
-   return DMA_ERROR_CODE;
+   return ARM_MAPPING_ERROR;
}
 
dev_dbg(dev, "%s: unsafe buffer %p (dma=%#x) mapped to %p (dma=%#x)\n",
@@ -326,7 +327,7 @@ static dma_addr_t dmabounce_map_page(struct device *dev, 
struct page *page,
 
ret = needs_bounce(dev, dma_addr, size);
if (ret < 0)
-   return DMA_ERROR_CODE;
+   return ARM_MAPPING_ERROR;
 
if (ret == 0) {
arm_dma_ops.sync_single_for_device(dev, dma_addr, size, dir);
@@ -335,7 +336,7 @@ static dma_addr_t dmabounce_map_page(struct device *dev, 
struct page *page,
 
if (PageHighMem(page)) {
dev_err(dev, "DMA buffer bouncing of HIGHMEM pages is not 
supported\n");
-   return DMA_ERROR_CODE;
+   return ARM_MAPPING_ERROR;
}
 
return map_single(dev, page_address(page) + offset, size, dir, attrs);
@@ -452,6 +453,14 @@ static int dmabounce_set_mask(struct device *dev, u64 
dma_mask)
return arm_dma_ops.set_dma_mask(dev, dma_mask);
 }
 
+static int dmabounce_mapping_error(struct device *dev, dma_addr_t dma_addr)
+{
+   if (dev->archdata.dmabounce)
+   return 0;
+
+   return arm_dma_ops.mapping_error(dev, dma_addr);
+}
+
 static const struct dma_map_ops dmabounce_ops = {
.alloc  = arm_dma_alloc,
.free   = arm_dma_free,
@@ -466,6 +475,7 @@ static const struct dma_map_ops dmabounce_ops = {
.sync_sg_for_cpu= arm_dma_sync_sg_for_cpu,
.sync_sg_for_device = arm_dma_sync_sg_for_device,
.set_dma_mask   = dmabounce_set_mask,
+   .mapping_error  = dmabounce_mapping_error,
 };
 
 static int dmabounce_init_pool(struct dmabounce_pool *pool, struct device *dev,
diff --git a/arch/arm/include/asm/dma-iommu.h b/arch/arm/include/asm/dma-iommu.h
index 2ef282f96651..389a26a10ea3 100644
--- a/arch/arm/include/asm/dma-iommu.h
+++ b/arch/arm/include/asm/dma-iommu.h
@@ -9,6 +9,8 @@
 #include 
 #include 
 
+#define ARM_MAPPING_ERROR  (~(dma_addr_t)0x0)
+
 struct dma_iommu_mapping {
/* iommu specific data */
struct iommu_domain *domain;
diff --git a/arch/arm/include/asm/dma-mapping.h 
b/arch/arm/include/asm/dma-mapping.h
index 680d3f3889e7..52a8fd5a8edb 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -12,7 +12,6 @@
 #include 
 #include 
 
-#define DMA_ERROR_CODE (~(dma_addr_t)0x0)
 extern const struct dma_map_ops arm_dma_ops;
 extern const struct dma_map_ops arm_coherent_dma_ops;
 
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index c742dfd2967b..2dbc94b5fe5c 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -180,6 +180,11 @@ static void arm_dma_sync_single_for_device(struct device 
*dev,
__dma_page_cpu_to_dev(page, offset, size, dir);
 }
 
+static int arm_dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
+{
+   return dma_addr == ARM_MAPPING_ERROR;
+}
+
 const struct dma_map_ops arm_dma_ops = {
.alloc  = arm_dma_alloc,
.free   = arm_dma_free,
@@ -193,6 +198,7 @@ const struct dma_map_ops arm_dma_ops = {
.sync_single_for_device = arm_dma_sync_single_for_device,
.sync_sg_for_cpu= arm_dma_sync_sg_for_cpu,
.sync_sg_for_device = arm_dma_sync_sg_for_device,
+   .mapping_error  = arm_dma_mapping_error,
 };
 EXPORT_SYMBOL(arm_dma_ops);
 
@@ -211,6 +217,7 @@ const struct dma_map_ops arm_coherent_dma_ops = {
.get_sgtable= arm_dma_get_sgtable,
.map_page   = arm_coherent_dma_map_page,
.map_sg = arm_dma_map_sg,
+   .mapping_error  = arm_dma_mapping_error,
 };
 EXPORT_SYMBOL(arm_coherent_dma_ops);
 
@@ -799,7 +806,7 @@ static void *__dma_alloc(struct device *dev, size_t size, 
dma_addr_t *handle,
gfp &= ~(__GFP_COMP);
args.gfp = gfp;
 
-   *handle = DMA_ERROR_CODE;
+   *handle = ARM_MAPPING_ERROR;
allowblock = gfpflags_allow_bl

[PATCH 36/44] dma-mapping: remove HAVE_ARCH_DMA_SUPPORTED

Signed-off-by: Christoph Hellwig 
---
 include/linux/dma-mapping.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index a57875309bfd..3e5908656226 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -549,7 +549,6 @@ static inline int dma_mapping_error(struct device *dev, 
dma_addr_t dma_addr)
return 0;
 }
 
-#ifndef HAVE_ARCH_DMA_SUPPORTED
 static inline int dma_supported(struct device *dev, u64 mask)
 {
const struct dma_map_ops *ops = get_dma_ops(dev);
@@ -560,7 +559,6 @@ static inline int dma_supported(struct device *dev, u64 
mask)
return 1;
return ops->dma_supported(dev, mask);
 }
-#endif
 
 #ifndef HAVE_ARCH_DMA_SET_MASK
 static inline int dma_set_mask(struct device *dev, u64 mask)
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 06/44] iommu/dma: don't rely on DMA_ERROR_CODE

Hi Christoph,

On 08/06/17 14:25, Christoph Hellwig wrote:
> DMA_ERROR_CODE is not a public API and will go away soon.  dma dma-iommu
> driver already implements a proper ->mapping_error method, so it's only
> using the value internally.  Add a new local define using the value
> that arm64 which is the only current user of dma-iommu.

It would be fine to just use 0, since dma-iommu already makes sure that
that will never be allocated for a valid DMA address.

Otherwise, looks good!

Robin.

> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/iommu/dma-iommu.c | 18 ++
>  1 file changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 62618e77bedc..638aea814b94 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -31,6 +31,8 @@
>  #include 
>  #include 
>  
> +#define IOMMU_MAPPING_ERROR  (~(dma_addr_t)0)
> +
>  struct iommu_dma_msi_page {
>   struct list_headlist;
>   dma_addr_t  iova;
> @@ -500,7 +502,7 @@ void iommu_dma_free(struct device *dev, struct page 
> **pages, size_t size,
>  {
>   __iommu_dma_unmap(iommu_get_domain_for_dev(dev), *handle, size);
>   __iommu_dma_free_pages(pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
> - *handle = DMA_ERROR_CODE;
> + *handle = IOMMU_MAPPING_ERROR;
>  }
>  
>  /**
> @@ -533,7 +535,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t 
> size, gfp_t gfp,
>   dma_addr_t iova;
>   unsigned int count, min_size, alloc_sizes = domain->pgsize_bitmap;
>  
> - *handle = DMA_ERROR_CODE;
> + *handle = IOMMU_MAPPING_ERROR;
>  
>   min_size = alloc_sizes & -alloc_sizes;
>   if (min_size < PAGE_SIZE) {
> @@ -627,11 +629,11 @@ static dma_addr_t __iommu_dma_map(struct device *dev, 
> phys_addr_t phys,
>  
>   iova = iommu_dma_alloc_iova(domain, size, dma_get_mask(dev), dev);
>   if (!iova)
> - return DMA_ERROR_CODE;
> + return IOMMU_MAPPING_ERROR;
>  
>   if (iommu_map(domain, iova, phys - iova_off, size, prot)) {
>   iommu_dma_free_iova(cookie, iova, size);
> - return DMA_ERROR_CODE;
> + return IOMMU_MAPPING_ERROR;
>   }
>   return iova + iova_off;
>  }
> @@ -671,7 +673,7 @@ static int __finalise_sg(struct device *dev, struct 
> scatterlist *sg, int nents,
>  
>   s->offset += s_iova_off;
>   s->length = s_length;
> - sg_dma_address(s) = DMA_ERROR_CODE;
> + sg_dma_address(s) = IOMMU_MAPPING_ERROR;
>   sg_dma_len(s) = 0;
>  
>   /*
> @@ -714,11 +716,11 @@ static void __invalidate_sg(struct scatterlist *sg, int 
> nents)
>   int i;
>  
>   for_each_sg(sg, s, nents, i) {
> - if (sg_dma_address(s) != DMA_ERROR_CODE)
> + if (sg_dma_address(s) != IOMMU_MAPPING_ERROR)
>   s->offset += sg_dma_address(s);
>   if (sg_dma_len(s))
>   s->length = sg_dma_len(s);
> - sg_dma_address(s) = DMA_ERROR_CODE;
> + sg_dma_address(s) = IOMMU_MAPPING_ERROR;
>   sg_dma_len(s) = 0;
>   }
>  }
> @@ -836,7 +838,7 @@ void iommu_dma_unmap_resource(struct device *dev, 
> dma_addr_t handle,
>  
>  int iommu_dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
>  {
> - return dma_addr == DMA_ERROR_CODE;
> + return dma_addr == IOMMU_MAPPING_ERROR;
>  }
>  
>  static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
> 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 43/44] dma-mapping: remove the set_dma_mask method

Signed-off-by: Christoph Hellwig 
---
 arch/powerpc/kernel/dma.c   | 4 
 include/linux/dma-mapping.h | 6 --
 2 files changed, 10 deletions(-)

diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
index 41c749586bd2..466c9f07b288 100644
--- a/arch/powerpc/kernel/dma.c
+++ b/arch/powerpc/kernel/dma.c
@@ -316,10 +316,6 @@ EXPORT_SYMBOL(dma_set_coherent_mask);
 
 int __dma_set_mask(struct device *dev, u64 dma_mask)
 {
-   const struct dma_map_ops *dma_ops = get_dma_ops(dev);
-
-   if ((dma_ops != NULL) && (dma_ops->set_dma_mask != NULL))
-   return dma_ops->set_dma_mask(dev, dma_mask);
if (!dev->dma_mask || !dma_supported(dev, dma_mask))
return -EIO;
*dev->dma_mask = dma_mask;
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 3e5908656226..527f2ed8c645 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -127,7 +127,6 @@ struct dma_map_ops {
   enum dma_data_direction dir);
int (*mapping_error)(struct device *dev, dma_addr_t dma_addr);
int (*dma_supported)(struct device *dev, u64 mask);
-   int (*set_dma_mask)(struct device *dev, u64 mask);
 #ifdef ARCH_HAS_DMA_GET_REQUIRED_MASK
u64 (*get_required_mask)(struct device *dev);
 #endif
@@ -563,11 +562,6 @@ static inline int dma_supported(struct device *dev, u64 
mask)
 #ifndef HAVE_ARCH_DMA_SET_MASK
 static inline int dma_set_mask(struct device *dev, u64 mask)
 {
-   const struct dma_map_ops *ops = get_dma_ops(dev);
-
-   if (ops->set_dma_mask)
-   return ops->set_dma_mask(dev, mask);
-
if (!dev->dma_mask || !dma_supported(dev, mask))
return -EIO;
*dev->dma_mask = mask;
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 16/44] arm64: remove DMA_ERROR_CODE

On 08/06/17 14:25, Christoph Hellwig wrote:
> The dma alloc interface returns an error by return NULL, and the
> mapping interfaces rely on the mapping_error method, which the dummy
> ops already implement correctly.
> 
> Thus remove the DMA_ERROR_CODE define.

Reviewed-by: Robin Murphy 

> Signed-off-by: Christoph Hellwig 
> ---
>  arch/arm64/include/asm/dma-mapping.h | 1 -
>  arch/arm64/mm/dma-mapping.c  | 3 +--
>  2 files changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/dma-mapping.h 
> b/arch/arm64/include/asm/dma-mapping.h
> index 5392dbeffa45..cf8fc8f05580 100644
> --- a/arch/arm64/include/asm/dma-mapping.h
> +++ b/arch/arm64/include/asm/dma-mapping.h
> @@ -24,7 +24,6 @@
>  #include 
>  #include 
>  
> -#define DMA_ERROR_CODE   (~(dma_addr_t)0)
>  extern const struct dma_map_ops dummy_dma_ops;
>  
>  static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type 
> *bus)
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index 3216e098c058..147fbb907a2f 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -184,7 +184,6 @@ static void *__dma_alloc(struct device *dev, size_t size,
>  no_map:
>   __dma_free_coherent(dev, size, ptr, *dma_handle, attrs);
>  no_mem:
> - *dma_handle = DMA_ERROR_CODE;
>   return NULL;
>  }
>  
> @@ -487,7 +486,7 @@ static dma_addr_t __dummy_map_page(struct device *dev, 
> struct page *page,
>  enum dma_data_direction dir,
>  unsigned long attrs)
>  {
> - return DMA_ERROR_CODE;
> + return 0;
>  }
>  
>  static void __dummy_unmap_page(struct device *dev, dma_addr_t dev_addr,
> 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 18/44] iommu/amd: implement ->mapping_error

DMA_ERROR_CODE is going to go away, so don't rely on it.

Signed-off-by: Christoph Hellwig 
---
 drivers/iommu/amd_iommu.c | 18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 63cacf5d6cf2..d41280e869de 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -54,6 +54,8 @@
 #include "amd_iommu_types.h"
 #include "irq_remapping.h"
 
+#define AMD_IOMMU_MAPPING_ERROR0
+
 #define CMD_SET_TYPE(cmd, t) ((cmd)->data[1] |= ((t) << 28))
 
 #define LOOP_TIMEOUT   10
@@ -2394,7 +2396,7 @@ static dma_addr_t __map_single(struct device *dev,
paddr &= PAGE_MASK;
 
address = dma_ops_alloc_iova(dev, dma_dom, pages, dma_mask);
-   if (address == DMA_ERROR_CODE)
+   if (address == AMD_IOMMU_MAPPING_ERROR)
goto out;
 
prot = dir2prot(direction);
@@ -2431,7 +2433,7 @@ static dma_addr_t __map_single(struct device *dev,
 
dma_ops_free_iova(dma_dom, address, pages);
 
-   return DMA_ERROR_CODE;
+   return AMD_IOMMU_MAPPING_ERROR;
 }
 
 /*
@@ -2483,7 +2485,7 @@ static dma_addr_t map_page(struct device *dev, struct 
page *page,
if (PTR_ERR(domain) == -EINVAL)
return (dma_addr_t)paddr;
else if (IS_ERR(domain))
-   return DMA_ERROR_CODE;
+   return AMD_IOMMU_MAPPING_ERROR;
 
dma_mask = *dev->dma_mask;
dma_dom = to_dma_ops_domain(domain);
@@ -2560,7 +2562,7 @@ static int map_sg(struct device *dev, struct scatterlist 
*sglist,
npages = sg_num_pages(dev, sglist, nelems);
 
address = dma_ops_alloc_iova(dev, dma_dom, npages, dma_mask);
-   if (address == DMA_ERROR_CODE)
+   if (address == AMD_IOMMU_MAPPING_ERROR)
goto out_err;
 
prot = dir2prot(direction);
@@ -2683,7 +2685,7 @@ static void *alloc_coherent(struct device *dev, size_t 
size,
*dma_addr = __map_single(dev, dma_dom, page_to_phys(page),
 size, DMA_BIDIRECTIONAL, dma_mask);
 
-   if (*dma_addr == DMA_ERROR_CODE)
+   if (*dma_addr == AMD_IOMMU_MAPPING_ERROR)
goto out_free;
 
return page_address(page);
@@ -2732,6 +2734,11 @@ static int amd_iommu_dma_supported(struct device *dev, 
u64 mask)
return check_device(dev);
 }
 
+static int amd_iommu_mapping_error(struct device *dev, dma_addr_t dma_addr)
+{
+   return dma_addr == AMD_IOMMU_MAPPING_ERROR;
+}
+
 static const struct dma_map_ops amd_iommu_dma_ops = {
.alloc  = alloc_coherent,
.free   = free_coherent,
@@ -2740,6 +2747,7 @@ static const struct dma_map_ops amd_iommu_dma_ops = {
.map_sg = map_sg,
.unmap_sg   = unmap_sg,
.dma_supported  = amd_iommu_dma_supported,
+   .mapping_error  = amd_iommu_mapping_error,
 };
 
 static int init_reserved_iova_ranges(void)
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 35/44] x86: remove arch specific dma_supported implementation

And instead wire it up as method for all the dma_map_ops instances.

Note that this also means the arch specific check will be fully instead
of partially applied in the AMD iommu driver.

Signed-off-by: Christoph Hellwig 
---
 arch/x86/include/asm/dma-mapping.h | 3 ---
 arch/x86/include/asm/iommu.h   | 2 ++
 arch/x86/kernel/amd_gart_64.c  | 1 +
 arch/x86/kernel/pci-calgary_64.c   | 1 +
 arch/x86/kernel/pci-dma.c  | 7 +--
 arch/x86/kernel/pci-nommu.c| 1 +
 arch/x86/pci/sta2x11-fixup.c   | 3 ++-
 drivers/iommu/amd_iommu.c  | 2 ++
 drivers/iommu/intel-iommu.c| 3 +++
 9 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/dma-mapping.h 
b/arch/x86/include/asm/dma-mapping.h
index c35d228aa381..398c79889f5c 100644
--- a/arch/x86/include/asm/dma-mapping.h
+++ b/arch/x86/include/asm/dma-mapping.h
@@ -33,9 +33,6 @@ static inline const struct dma_map_ops 
*get_arch_dma_ops(struct bus_type *bus)
 bool arch_dma_alloc_attrs(struct device **dev, gfp_t *gfp);
 #define arch_dma_alloc_attrs arch_dma_alloc_attrs
 
-#define HAVE_ARCH_DMA_SUPPORTED 1
-extern int dma_supported(struct device *hwdev, u64 mask);
-
 extern void *dma_generic_alloc_coherent(struct device *dev, size_t size,
dma_addr_t *dma_addr, gfp_t flag,
unsigned long attrs);
diff --git a/arch/x86/include/asm/iommu.h b/arch/x86/include/asm/iommu.h
index 793869879464..fca144a104e4 100644
--- a/arch/x86/include/asm/iommu.h
+++ b/arch/x86/include/asm/iommu.h
@@ -6,6 +6,8 @@ extern int force_iommu, no_iommu;
 extern int iommu_detected;
 extern int iommu_pass_through;
 
+int x86_dma_supported(struct device *dev, u64 mask);
+
 /* 10 seconds */
 #define DMAR_OPERATION_TIMEOUT ((cycles_t) tsc_khz*10*1000)
 
diff --git a/arch/x86/kernel/amd_gart_64.c b/arch/x86/kernel/amd_gart_64.c
index 815dd63f49d0..cc0e8bc0ea3f 100644
--- a/arch/x86/kernel/amd_gart_64.c
+++ b/arch/x86/kernel/amd_gart_64.c
@@ -704,6 +704,7 @@ static const struct dma_map_ops gart_dma_ops = {
.alloc  = gart_alloc_coherent,
.free   = gart_free_coherent,
.mapping_error  = gart_mapping_error,
+   .dma_supported  = x86_dma_supported,
 };
 
 static void gart_iommu_shutdown(void)
diff --git a/arch/x86/kernel/pci-calgary_64.c b/arch/x86/kernel/pci-calgary_64.c
index e75b490f2b0b..5286a4a92cf7 100644
--- a/arch/x86/kernel/pci-calgary_64.c
+++ b/arch/x86/kernel/pci-calgary_64.c
@@ -493,6 +493,7 @@ static const struct dma_map_ops calgary_dma_ops = {
.map_page = calgary_map_page,
.unmap_page = calgary_unmap_page,
.mapping_error = calgary_mapping_error,
+   .dma_supported = x86_dma_supported,
 };
 
 static inline void __iomem * busno_to_bbar(unsigned char num)
diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index 3a216ec869cd..b6f5684be3b5 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -213,10 +213,8 @@ static __init int iommu_setup(char *p)
 }
 early_param("iommu", iommu_setup);
 
-int dma_supported(struct device *dev, u64 mask)
+int x86_dma_supported(struct device *dev, u64 mask)
 {
-   const struct dma_map_ops *ops = get_dma_ops(dev);
-
 #ifdef CONFIG_PCI
if (mask > 0x && forbid_dac > 0) {
dev_info(dev, "PCI: Disallowing DAC for device\n");
@@ -224,9 +222,6 @@ int dma_supported(struct device *dev, u64 mask)
}
 #endif
 
-   if (ops->dma_supported)
-   return ops->dma_supported(dev, mask);
-
/* Copied from i386. Doesn't make much sense, because it will
   only work for pci_alloc_coherent.
   The caller just has to use GFP_DMA in this case. */
diff --git a/arch/x86/kernel/pci-nommu.c b/arch/x86/kernel/pci-nommu.c
index 085fe6ce4049..a6d404087fe3 100644
--- a/arch/x86/kernel/pci-nommu.c
+++ b/arch/x86/kernel/pci-nommu.c
@@ -104,4 +104,5 @@ const struct dma_map_ops nommu_dma_ops = {
.sync_sg_for_device = nommu_sync_sg_for_device,
.is_phys= 1,
.mapping_error  = nommu_mapping_error,
+   .dma_supported  = x86_dma_supported,
 };
diff --git a/arch/x86/pci/sta2x11-fixup.c b/arch/x86/pci/sta2x11-fixup.c
index ec008e800b45..53d600217973 100644
--- a/arch/x86/pci/sta2x11-fixup.c
+++ b/arch/x86/pci/sta2x11-fixup.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define STA2X11_SWIOTLB_SIZE (4*1024*1024)
 extern int swiotlb_late_init_with_default_size(size_t default_size);
@@ -191,7 +192,7 @@ static const struct dma_map_ops sta2x11_dma_ops = {
.sync_sg_for_cpu = swiotlb_sync_sg_for_cpu,
.sync_sg_for_device = swiotlb_sync_sg_for_device,
.mapping_error = swiotlb_dma_mapping_error,
-   .dma_supported = NULL, /* FIXME: we should use this instead! */
+   .dma_supported = x86_dma_supported,
 };
 
 /* At setup

[PATCH 37/44] mips/loongson64: implement ->dma_supported instead of ->set_dma_mask

Same behavior, less code duplication.

Signed-off-by: Christoph Hellwig 
---
 arch/mips/loongson64/common/dma-swiotlb.c | 19 +--
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/arch/mips/loongson64/common/dma-swiotlb.c 
b/arch/mips/loongson64/common/dma-swiotlb.c
index 178ca17a5667..34486c138206 100644
--- a/arch/mips/loongson64/common/dma-swiotlb.c
+++ b/arch/mips/loongson64/common/dma-swiotlb.c
@@ -75,19 +75,11 @@ static void loongson_dma_sync_sg_for_device(struct device 
*dev,
mb();
 }
 
-static int loongson_dma_set_mask(struct device *dev, u64 mask)
+static int loongson_dma_supported(struct device *dev, u64 mask)
 {
-   if (!dev->dma_mask || !dma_supported(dev, mask))
-   return -EIO;
-
-   if (mask > DMA_BIT_MASK(loongson_sysconf.dma_mask_bits)) {
-   *dev->dma_mask = DMA_BIT_MASK(loongson_sysconf.dma_mask_bits);
-   return -EIO;
-   }
-
-   *dev->dma_mask = mask;
-
-   return 0;
+   if (mask > DMA_BIT_MASK(loongson_sysconf.dma_mask_bits))
+   return 0;
+   return swiotlb_dma_supported(dev, mask);
 }
 
 dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
@@ -126,8 +118,7 @@ static const struct dma_map_ops loongson_dma_map_ops = {
.sync_sg_for_cpu = swiotlb_sync_sg_for_cpu,
.sync_sg_for_device = loongson_dma_sync_sg_for_device,
.mapping_error = swiotlb_dma_mapping_error,
-   .dma_supported = swiotlb_dma_supported,
-   .set_dma_mask = loongson_dma_set_mask
+   .dma_supported = loongson_dma_supported,
 };
 
 void __init plat_swiotlb_setup(void)
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 29/44] dma-noop: remove dma_supported and mapping_error methods

These just duplicate the default behavior if no method is provided.

Signed-off-by: Christoph Hellwig 
---
 lib/dma-noop.c | 12 
 1 file changed, 12 deletions(-)

diff --git a/lib/dma-noop.c b/lib/dma-noop.c
index de26c8b68f34..643a074f139d 100644
--- a/lib/dma-noop.c
+++ b/lib/dma-noop.c
@@ -54,23 +54,11 @@ static int dma_noop_map_sg(struct device *dev, struct 
scatterlist *sgl, int nent
return nents;
 }
 
-static int dma_noop_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-   return 0;
-}
-
-static int dma_noop_supported(struct device *dev, u64 mask)
-{
-   return 1;
-}
-
 const struct dma_map_ops dma_noop_ops = {
.alloc  = dma_noop_alloc,
.free   = dma_noop_free,
.map_page   = dma_noop_map_page,
.map_sg = dma_noop_map_sg,
-   .mapping_error  = dma_noop_mapping_error,
-   .dma_supported  = dma_noop_supported,
 };
 
 EXPORT_SYMBOL(dma_noop_ops);
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 24/44] x86: remove DMA_ERROR_CODE

All dma_map_ops instances now handle their errors through
->mapping_error.

Signed-off-by: Christoph Hellwig 
---
 arch/x86/include/asm/dma-mapping.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/x86/include/asm/dma-mapping.h 
b/arch/x86/include/asm/dma-mapping.h
index 08a0838b83fb..c35d228aa381 100644
--- a/arch/x86/include/asm/dma-mapping.h
+++ b/arch/x86/include/asm/dma-mapping.h
@@ -19,8 +19,6 @@
 # define ISA_DMA_BIT_MASK DMA_BIT_MASK(32)
 #endif
 
-#define DMA_ERROR_CODE 0
-
 extern int iommu_merge;
 extern struct device x86_dma_fallback_dev;
 extern int panic_on_overflow;
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 39/44] xen-swiotlb: remove xen_swiotlb_set_dma_mask

This just duplicates the generic implementation.

Signed-off-by: Christoph Hellwig 
---
 drivers/xen/swiotlb-xen.c | 12 
 1 file changed, 12 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index c3a04b2d7532..82fc54f8eb77 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -661,17 +661,6 @@ xen_swiotlb_dma_supported(struct device *hwdev, u64 mask)
return xen_virt_to_bus(xen_io_tlb_end - 1) <= mask;
 }
 
-static int
-xen_swiotlb_set_dma_mask(struct device *dev, u64 dma_mask)
-{
-   if (!dev->dma_mask || !xen_swiotlb_dma_supported(dev, dma_mask))
-   return -EIO;
-
-   *dev->dma_mask = dma_mask;
-
-   return 0;
-}
-
 /*
  * Create userspace mapping for the DMA-coherent memory.
  * This function should be called with the pages from the current domain only,
@@ -734,7 +723,6 @@ const struct dma_map_ops xen_swiotlb_dma_ops = {
.map_page = xen_swiotlb_map_page,
.unmap_page = xen_swiotlb_unmap_page,
.dma_supported = xen_swiotlb_dma_supported,
-   .set_dma_mask = xen_swiotlb_set_dma_mask,
.mmap = xen_swiotlb_dma_mmap,
.get_sgtable = xen_swiotlb_get_sgtable,
.mapping_error  = xen_swiotlb_mapping_error,
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 28/44] sparc: remove arch specific dma_supported implementations

Usually dma_supported decisions are done by the dma_map_ops instance.
Switch sparc to that model by providing a ->dma_supported instance for
sbus that always returns false, and implementations tailored to the sun4u
and sun4v cases for sparc64, and leave it unimplemented for PCI on
sparc32, which means always supported.

Signed-off-by: Christoph Hellwig 
---
 arch/sparc/include/asm/dma-mapping.h |  3 ---
 arch/sparc/kernel/iommu.c| 40 +++-
 arch/sparc/kernel/ioport.c   | 22 ++--
 arch/sparc/kernel/pci_sun4v.c| 17 +++
 4 files changed, 39 insertions(+), 43 deletions(-)

diff --git a/arch/sparc/include/asm/dma-mapping.h 
b/arch/sparc/include/asm/dma-mapping.h
index 98da9f92c318..60bf1633d554 100644
--- a/arch/sparc/include/asm/dma-mapping.h
+++ b/arch/sparc/include/asm/dma-mapping.h
@@ -5,9 +5,6 @@
 #include 
 #include 
 
-#define HAVE_ARCH_DMA_SUPPORTED 1
-int dma_supported(struct device *dev, u64 mask);
-
 static inline void dma_cache_sync(struct device *dev, void *vaddr, size_t size,
  enum dma_data_direction dir)
 {
diff --git a/arch/sparc/kernel/iommu.c b/arch/sparc/kernel/iommu.c
index dafa316d978d..fcbcc031f615 100644
--- a/arch/sparc/kernel/iommu.c
+++ b/arch/sparc/kernel/iommu.c
@@ -746,6 +746,21 @@ static int dma_4u_mapping_error(struct device *dev, 
dma_addr_t dma_addr)
return dma_addr == SPARC_MAPPING_ERROR;
 }
 
+static int dma_4u_supported(struct device *dev, u64 device_mask)
+{
+   struct iommu *iommu = dev->archdata.iommu;
+
+   if (device_mask > DMA_BIT_MASK(32))
+   return 0;
+   if ((device_mask & iommu->dma_addr_mask) == iommu->dma_addr_mask)
+   return 1;
+#ifdef CONFIG_PCI
+   if (dev_is_pci(dev))
+   return pci64_dma_supported(to_pci_dev(dev), device_mask);
+#endif
+   return 0;
+}
+
 static const struct dma_map_ops sun4u_dma_ops = {
.alloc  = dma_4u_alloc_coherent,
.free   = dma_4u_free_coherent,
@@ -755,32 +770,9 @@ static const struct dma_map_ops sun4u_dma_ops = {
.unmap_sg   = dma_4u_unmap_sg,
.sync_single_for_cpu= dma_4u_sync_single_for_cpu,
.sync_sg_for_cpu= dma_4u_sync_sg_for_cpu,
+   .dma_supported  = dma_4u_supported,
.mapping_error  = dma_4u_mapping_error,
 };
 
 const struct dma_map_ops *dma_ops = &sun4u_dma_ops;
 EXPORT_SYMBOL(dma_ops);
-
-int dma_supported(struct device *dev, u64 device_mask)
-{
-   struct iommu *iommu = dev->archdata.iommu;
-   u64 dma_addr_mask = iommu->dma_addr_mask;
-
-   if (device_mask > DMA_BIT_MASK(32)) {
-   if (iommu->atu)
-   dma_addr_mask = iommu->atu->dma_addr_mask;
-   else
-   return 0;
-   }
-
-   if ((device_mask & dma_addr_mask) == dma_addr_mask)
-   return 1;
-
-#ifdef CONFIG_PCI
-   if (dev_is_pci(dev))
-   return pci64_dma_supported(to_pci_dev(dev), device_mask);
-#endif
-
-   return 0;
-}
-EXPORT_SYMBOL(dma_supported);
diff --git a/arch/sparc/kernel/ioport.c b/arch/sparc/kernel/ioport.c
index dd081d557609..12894f259bea 100644
--- a/arch/sparc/kernel/ioport.c
+++ b/arch/sparc/kernel/ioport.c
@@ -401,6 +401,11 @@ static void sbus_sync_sg_for_device(struct device *dev, 
struct scatterlist *sg,
BUG();
 }
 
+static int sbus_dma_supported(struct device *dev, u64 mask)
+{
+   return 0;
+}
+
 static const struct dma_map_ops sbus_dma_ops = {
.alloc  = sbus_alloc_coherent,
.free   = sbus_free_coherent,
@@ -410,6 +415,7 @@ static const struct dma_map_ops sbus_dma_ops = {
.unmap_sg   = sbus_unmap_sg,
.sync_sg_for_cpu= sbus_sync_sg_for_cpu,
.sync_sg_for_device = sbus_sync_sg_for_device,
+   .dma_supported  = sbus_dma_supported,
 };
 
 static int __init sparc_register_ioport(void)
@@ -655,22 +661,6 @@ EXPORT_SYMBOL(pci32_dma_ops);
 const struct dma_map_ops *dma_ops = &sbus_dma_ops;
 EXPORT_SYMBOL(dma_ops);
 
-
-/*
- * Return whether the given PCI device DMA address mask can be
- * supported properly.  For example, if your device can only drive the
- * low 24-bits during PCI bus mastering, then you would pass
- * 0x00ff as the mask to this function.
- */
-int dma_supported(struct device *dev, u64 mask)
-{
-   if (dev_is_pci(dev))
-   return 1;
-
-   return 0;
-}
-EXPORT_SYMBOL(dma_supported);
-
 #ifdef CONFIG_PROC_FS
 
 static int sparc_io_proc_show(struct seq_file *m, void *v)
diff --git a/arch/sparc/kernel/pci_sun4v.c b/arch/sparc/kernel/pci_sun4v.c
index 8e2a56f4c03a..24f21c726dfa 100644
--- a/arch/sparc/kernel/pci_sun4v.c
+++ b/arch/sparc/kernel/pci_sun4v.c
@@ -24,6 +24,7 @@
 
 #include "pci_impl.h"
 #include "iommu_common.h"
+#include "kernel.h"
 
 #include "pci_sun4v.h"
 
@@ -669,6 +670,

[PATCH 30/44] dma-virt: remove dma_supported and mapping_error methods

These just duplicate the default behavior if no method is provided.

Signed-off-by: Christoph Hellwig 
---
 lib/dma-virt.c | 12 
 1 file changed, 12 deletions(-)

diff --git a/lib/dma-virt.c b/lib/dma-virt.c
index dcd4df1f7174..5c4f11329721 100644
--- a/lib/dma-virt.c
+++ b/lib/dma-virt.c
@@ -51,22 +51,10 @@ static int dma_virt_map_sg(struct device *dev, struct 
scatterlist *sgl,
return nents;
 }
 
-static int dma_virt_mapping_error(struct device *dev, dma_addr_t dma_addr)
-{
-   return false;
-}
-
-static int dma_virt_supported(struct device *dev, u64 mask)
-{
-   return true;
-}
-
 const struct dma_map_ops dma_virt_ops = {
.alloc  = dma_virt_alloc,
.free   = dma_virt_free,
.map_page   = dma_virt_map_page,
.map_sg = dma_virt_map_sg,
-   .mapping_error  = dma_virt_mapping_error,
-   .dma_supported  = dma_virt_supported,
 };
 EXPORT_SYMBOL(dma_virt_ops);
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 21/44] powerpc: implement ->mapping_error

DMA_ERROR_CODE is going to go away, so don't rely on it.  Instead
define a ->mapping_error method for all IOMMU based dma operation
instances.  The direct ops don't ever return an error and don't
need a ->mapping_error method.

Signed-off-by: Christoph Hellwig 
---
 arch/powerpc/include/asm/dma-mapping.h |  4 
 arch/powerpc/include/asm/iommu.h   |  4 
 arch/powerpc/kernel/dma-iommu.c|  6 ++
 arch/powerpc/kernel/iommu.c| 28 ++--
 arch/powerpc/platforms/cell/iommu.c|  1 +
 arch/powerpc/platforms/pseries/vio.c   |  3 ++-
 6 files changed, 27 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/include/asm/dma-mapping.h 
b/arch/powerpc/include/asm/dma-mapping.h
index 181a095468e4..73aedbe6c977 100644
--- a/arch/powerpc/include/asm/dma-mapping.h
+++ b/arch/powerpc/include/asm/dma-mapping.h
@@ -17,10 +17,6 @@
 #include 
 #include 
 
-#ifdef CONFIG_PPC64
-#define DMA_ERROR_CODE (~(dma_addr_t)0x0)
-#endif
-
 /* Some dma direct funcs must be visible for use in other dma_ops */
 extern void *__dma_direct_alloc_coherent(struct device *dev, size_t size,
 dma_addr_t *dma_handle, gfp_t flag,
diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
index 8a8ce220d7d0..20febe0b7f32 100644
--- a/arch/powerpc/include/asm/iommu.h
+++ b/arch/powerpc/include/asm/iommu.h
@@ -139,6 +139,8 @@ struct scatterlist;
 
 #ifdef CONFIG_PPC64
 
+#define IOMMU_MAPPING_ERROR(~(dma_addr_t)0x0)
+
 static inline void set_iommu_table_base(struct device *dev,
struct iommu_table *base)
 {
@@ -238,6 +240,8 @@ static inline int __init tce_iommu_bus_notifier_init(void)
 }
 #endif /* !CONFIG_IOMMU_API */
 
+int dma_iommu_mapping_error(struct device *dev, dma_addr_t dma_addr);
+
 #else
 
 static inline void *get_iommu_table_base(struct device *dev)
diff --git a/arch/powerpc/kernel/dma-iommu.c b/arch/powerpc/kernel/dma-iommu.c
index fb7cbaa37658..8f7abf9baa63 100644
--- a/arch/powerpc/kernel/dma-iommu.c
+++ b/arch/powerpc/kernel/dma-iommu.c
@@ -105,6 +105,11 @@ static u64 dma_iommu_get_required_mask(struct device *dev)
return mask;
 }
 
+int dma_iommu_mapping_error(struct device *dev, dma_addr_t dma_addr)
+{
+   return dma_addr == IOMMU_MAPPING_ERROR;
+}
+
 struct dma_map_ops dma_iommu_ops = {
.alloc  = dma_iommu_alloc_coherent,
.free   = dma_iommu_free_coherent,
@@ -115,5 +120,6 @@ struct dma_map_ops dma_iommu_ops = {
.map_page   = dma_iommu_map_page,
.unmap_page = dma_iommu_unmap_page,
.get_required_mask  = dma_iommu_get_required_mask,
+   .mapping_error  = dma_iommu_mapping_error,
 };
 EXPORT_SYMBOL(dma_iommu_ops);
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index f2b724cd9e64..233ca3fe4754 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -198,11 +198,11 @@ static unsigned long iommu_range_alloc(struct device *dev,
if (unlikely(npages == 0)) {
if (printk_ratelimit())
WARN_ON(1);
-   return DMA_ERROR_CODE;
+   return IOMMU_MAPPING_ERROR;
}
 
if (should_fail_iommu(dev))
-   return DMA_ERROR_CODE;
+   return IOMMU_MAPPING_ERROR;
 
/*
 * We don't need to disable preemption here because any CPU can
@@ -278,7 +278,7 @@ static unsigned long iommu_range_alloc(struct device *dev,
} else {
/* Give up */
spin_unlock_irqrestore(&(pool->lock), flags);
-   return DMA_ERROR_CODE;
+   return IOMMU_MAPPING_ERROR;
}
}
 
@@ -310,13 +310,13 @@ static dma_addr_t iommu_alloc(struct device *dev, struct 
iommu_table *tbl,
  unsigned long attrs)
 {
unsigned long entry;
-   dma_addr_t ret = DMA_ERROR_CODE;
+   dma_addr_t ret = IOMMU_MAPPING_ERROR;
int build_fail;
 
entry = iommu_range_alloc(dev, tbl, npages, NULL, mask, align_order);
 
-   if (unlikely(entry == DMA_ERROR_CODE))
-   return DMA_ERROR_CODE;
+   if (unlikely(entry == IOMMU_MAPPING_ERROR))
+   return IOMMU_MAPPING_ERROR;
 
entry += tbl->it_offset;/* Offset into real TCE table */
ret = entry << tbl->it_page_shift;  /* Set the return dma address */
@@ -328,12 +328,12 @@ static dma_addr_t iommu_alloc(struct device *dev, struct 
iommu_table *tbl,
 
/* tbl->it_ops->set() only returns non-zero for transient errors.
 * Clean up the table bitmap in this case and return
-* DMA_ERROR_CODE. For all other errors the functionality is
+* IOMMU_MAPPING_ERROR. For all other errors the functionality is
 * not altered.
 */
if (unlikely(buil

[PATCH 19/44] s390: implement ->mapping_error

s390 can also use noop_dma_ops, and while that currently does not return
errors it will so in the future.  Implementing the mapping_error method
is the proper way to have per-ops error conditions.

Signed-off-by: Christoph Hellwig 
---
 arch/s390/include/asm/dma-mapping.h |  2 --
 arch/s390/pci/pci_dma.c | 18 +-
 2 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/arch/s390/include/asm/dma-mapping.h 
b/arch/s390/include/asm/dma-mapping.h
index 3108b8dbe266..512ad0eaa11a 100644
--- a/arch/s390/include/asm/dma-mapping.h
+++ b/arch/s390/include/asm/dma-mapping.h
@@ -8,8 +8,6 @@
 #include 
 #include 
 
-#define DMA_ERROR_CODE (~(dma_addr_t) 0x0)
-
 extern const struct dma_map_ops s390_pci_dma_ops;
 
 static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
index 9081a57fa340..ea623faab525 100644
--- a/arch/s390/pci/pci_dma.c
+++ b/arch/s390/pci/pci_dma.c
@@ -14,6 +14,8 @@
 #include 
 #include 
 
+#define S390_MAPPING_ERROR (~(dma_addr_t) 0x0)
+
 static struct kmem_cache *dma_region_table_cache;
 static struct kmem_cache *dma_page_table_cache;
 static int s390_iommu_strict;
@@ -281,7 +283,7 @@ static dma_addr_t dma_alloc_address(struct device *dev, int 
size)
 
 out_error:
spin_unlock_irqrestore(&zdev->iommu_bitmap_lock, flags);
-   return DMA_ERROR_CODE;
+   return S390_MAPPING_ERROR;
 }
 
 static void dma_free_address(struct device *dev, dma_addr_t dma_addr, int size)
@@ -329,7 +331,7 @@ static dma_addr_t s390_dma_map_pages(struct device *dev, 
struct page *page,
/* This rounds up number of pages based on size and offset */
nr_pages = iommu_num_pages(pa, size, PAGE_SIZE);
dma_addr = dma_alloc_address(dev, nr_pages);
-   if (dma_addr == DMA_ERROR_CODE) {
+   if (dma_addr == S390_MAPPING_ERROR) {
ret = -ENOSPC;
goto out_err;
}
@@ -352,7 +354,7 @@ static dma_addr_t s390_dma_map_pages(struct device *dev, 
struct page *page,
 out_err:
zpci_err("map error:\n");
zpci_err_dma(ret, pa);
-   return DMA_ERROR_CODE;
+   return S390_MAPPING_ERROR;
 }
 
 static void s390_dma_unmap_pages(struct device *dev, dma_addr_t dma_addr,
@@ -429,7 +431,7 @@ static int __s390_dma_map_sg(struct device *dev, struct 
scatterlist *sg,
int ret;
 
dma_addr_base = dma_alloc_address(dev, nr_pages);
-   if (dma_addr_base == DMA_ERROR_CODE)
+   if (dma_addr_base == S390_MAPPING_ERROR)
return -ENOMEM;
 
dma_addr = dma_addr_base;
@@ -476,7 +478,7 @@ static int s390_dma_map_sg(struct device *dev, struct 
scatterlist *sg,
for (i = 1; i < nr_elements; i++) {
s = sg_next(s);
 
-   s->dma_address = DMA_ERROR_CODE;
+   s->dma_address = S390_MAPPING_ERROR;
s->dma_length = 0;
 
if (s->offset || (size & ~PAGE_MASK) ||
@@ -525,6 +527,11 @@ static void s390_dma_unmap_sg(struct device *dev, struct 
scatterlist *sg,
s->dma_length = 0;
}
 }
+   
+static int s390_mapping_error(struct device *dev, dma_addr_t dma_addr)
+{
+   return dma_addr == S390_MAPPING_ERROR;
+}
 
 int zpci_dma_init_device(struct zpci_dev *zdev)
 {
@@ -657,6 +664,7 @@ const struct dma_map_ops s390_pci_dma_ops = {
.unmap_sg   = s390_dma_unmap_sg,
.map_page   = s390_dma_map_pages,
.unmap_page = s390_dma_unmap_pages,
+   .mapping_error  = s390_mapping_error,
/* if we support direct DMA this must be conditional */
.is_phys= 0,
/* dma_supported is unconditionally true without a callback */
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 32/44] hexagon: remove the unused dma_is_consistent prototype

Signed-off-by: Christoph Hellwig 
---
 arch/hexagon/include/asm/dma-mapping.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/hexagon/include/asm/dma-mapping.h 
b/arch/hexagon/include/asm/dma-mapping.h
index 9c15cb5271a6..463dbc18f853 100644
--- a/arch/hexagon/include/asm/dma-mapping.h
+++ b/arch/hexagon/include/asm/dma-mapping.h
@@ -37,7 +37,6 @@ static inline const struct dma_map_ops 
*get_arch_dma_ops(struct bus_type *bus)
return dma_ops;
 }
 
-extern int dma_is_consistent(struct device *dev, dma_addr_t dma_handle);
 extern void dma_cache_sync(struct device *dev, void *vaddr, size_t size,
   enum dma_data_direction direction);
 
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 23/44] x86/calgary: implement ->mapping_error

DMA_ERROR_CODE is going to go away, so don't rely on it.

Signed-off-by: Christoph Hellwig 
---
 arch/x86/kernel/pci-calgary_64.c | 24 
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/pci-calgary_64.c b/arch/x86/kernel/pci-calgary_64.c
index fda7867046d0..e75b490f2b0b 100644
--- a/arch/x86/kernel/pci-calgary_64.c
+++ b/arch/x86/kernel/pci-calgary_64.c
@@ -50,6 +50,8 @@
 #include 
 #include 
 
+#define CALGARY_MAPPING_ERROR  0
+
 #ifdef CONFIG_CALGARY_IOMMU_ENABLED_BY_DEFAULT
 int use_calgary __read_mostly = 1;
 #else
@@ -252,7 +254,7 @@ static unsigned long iommu_range_alloc(struct device *dev,
if (panic_on_overflow)
panic("Calgary: fix the allocator.\n");
else
-   return DMA_ERROR_CODE;
+   return CALGARY_MAPPING_ERROR;
}
}
 
@@ -272,10 +274,10 @@ static dma_addr_t iommu_alloc(struct device *dev, struct 
iommu_table *tbl,
 
entry = iommu_range_alloc(dev, tbl, npages);
 
-   if (unlikely(entry == DMA_ERROR_CODE)) {
+   if (unlikely(entry == CALGARY_MAPPING_ERROR)) {
pr_warn("failed to allocate %u pages in iommu %p\n",
npages, tbl);
-   return DMA_ERROR_CODE;
+   return CALGARY_MAPPING_ERROR;
}
 
/* set the return dma address */
@@ -295,7 +297,7 @@ static void iommu_free(struct iommu_table *tbl, dma_addr_t 
dma_addr,
unsigned long flags;
 
/* were we called with bad_dma_address? */
-   badend = DMA_ERROR_CODE + (EMERGENCY_PAGES * PAGE_SIZE);
+   badend = CALGARY_MAPPING_ERROR + (EMERGENCY_PAGES * PAGE_SIZE);
if (unlikely(dma_addr < badend)) {
WARN(1, KERN_ERR "Calgary: driver tried unmapping bad DMA "
   "address 0x%Lx\n", dma_addr);
@@ -380,7 +382,7 @@ static int calgary_map_sg(struct device *dev, struct 
scatterlist *sg,
npages = iommu_num_pages(vaddr, s->length, PAGE_SIZE);
 
entry = iommu_range_alloc(dev, tbl, npages);
-   if (entry == DMA_ERROR_CODE) {
+   if (entry == CALGARY_MAPPING_ERROR) {
/* makes sure unmap knows to stop */
s->dma_length = 0;
goto error;
@@ -398,7 +400,7 @@ static int calgary_map_sg(struct device *dev, struct 
scatterlist *sg,
 error:
calgary_unmap_sg(dev, sg, nelems, dir, 0);
for_each_sg(sg, s, nelems, i) {
-   sg->dma_address = DMA_ERROR_CODE;
+   sg->dma_address = CALGARY_MAPPING_ERROR;
sg->dma_length = 0;
}
return 0;
@@ -453,7 +455,7 @@ static void* calgary_alloc_coherent(struct device *dev, 
size_t size,
 
/* set up tces to cover the allocated range */
mapping = iommu_alloc(dev, tbl, ret, npages, DMA_BIDIRECTIONAL);
-   if (mapping == DMA_ERROR_CODE)
+   if (mapping == CALGARY_MAPPING_ERROR)
goto free;
*dma_handle = mapping;
return ret;
@@ -478,6 +480,11 @@ static void calgary_free_coherent(struct device *dev, 
size_t size,
free_pages((unsigned long)vaddr, get_order(size));
 }
 
+static int calgary_mapping_error(struct device *dev, dma_addr_t dma_addr)
+{
+   return dma_addr == CALGARY_MAPPING_ERROR;
+}
+
 static const struct dma_map_ops calgary_dma_ops = {
.alloc = calgary_alloc_coherent,
.free = calgary_free_coherent,
@@ -485,6 +492,7 @@ static const struct dma_map_ops calgary_dma_ops = {
.unmap_sg = calgary_unmap_sg,
.map_page = calgary_map_page,
.unmap_page = calgary_unmap_page,
+   .mapping_error = calgary_mapping_error,
 };
 
 static inline void __iomem * busno_to_bbar(unsigned char num)
@@ -732,7 +740,7 @@ static void __init calgary_reserve_regions(struct pci_dev 
*dev)
struct iommu_table *tbl = pci_iommu(dev->bus);
 
/* reserve EMERGENCY_PAGES from bad_dma_address and up */
-   iommu_range_reserve(tbl, DMA_ERROR_CODE, EMERGENCY_PAGES);
+   iommu_range_reserve(tbl, CALGARY_MAPPING_ERROR, EMERGENCY_PAGES);
 
/* avoid the BIOS/VGA first 640KB-1MB region */
/* for CalIOC2 - avoid the entire first MB */
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: clean up and modularize arch dma_mapping interface

From: Christoph Hellwig 
Date: Thu,  8 Jun 2017 15:25:25 +0200

> for a while we have a generic implementation of the dma mapping routines
> that call into per-arch or per-device operations.  But right now there
> still are various bits in the interfaces where don't clearly operate
> on these ops.  This series tries to clean up a lot of those (but not all
> yet, but the series is big enough).  It gets rid of the DMA_ERROR_CODE
> way of signaling failures of the mapping routines from the
> implementations to the generic code (and cleans up various drivers that
> were incorrectly using it), and gets rid of the ->set_dma_mask routine
> in favor of relying on the ->dma_capable method that can be used in
> the same way, but which requires less code duplication.

There is unlikely to be conflicts for the sparc and net changes, so I
will simply ACK them.

Thanks Christoph.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 02/44] ibmveth: properly unwind on init errors

From: Christoph Hellwig 
Date: Thu,  8 Jun 2017 15:25:27 +0200

> That way the driver doesn't have to rely on DMA_ERROR_CODE, which
> is not a public API and going away.
> 
> Signed-off-by: Christoph Hellwig 

Acked-by: David S. Miller 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 27/44] sparc: remove leon_dma_ops

From: Christoph Hellwig 
Date: Thu,  8 Jun 2017 15:25:52 +0200

> We can just use pci32_dma_ops.
> 
> Btw, given that leon is 32-bit and appears to be PCI based, do even need
> the special case for it in get_arch_dma_ops at all?

I would need to defer to the LEON developers on that, but they haven't
been very actively lately so whether you'll get a response or not is
hard to predict.

> Signed-off-by: Christoph Hellwig 

Acked-by: David S. Miller 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 20/44] sparc: implement ->mapping_error

From: Christoph Hellwig 
Date: Thu,  8 Jun 2017 15:25:45 +0200

> DMA_ERROR_CODE is going to go away, so don't rely on it.
> 
> Signed-off-by: Christoph Hellwig 

Acked-by: David S. Miller 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 28/44] sparc: remove arch specific dma_supported implementations

2017-06-08 Thread Julian Calaby

Hi Christoph,

On Thu, Jun 8, 2017 at 11:25 PM, Christoph Hellwig  wrote:
> Usually dma_supported decisions are done by the dma_map_ops instance.
> Switch sparc to that model by providing a ->dma_supported instance for
> sbus that always returns false, and implementations tailored to the sun4u
> and sun4v cases for sparc64, and leave it unimplemented for PCI on
> sparc32, which means always supported.
>
> Signed-off-by: Christoph Hellwig 
> ---
>  arch/sparc/include/asm/dma-mapping.h |  3 ---
>  arch/sparc/kernel/iommu.c| 40 
> +++-
>  arch/sparc/kernel/ioport.c   | 22 ++--
>  arch/sparc/kernel/pci_sun4v.c| 17 +++
>  4 files changed, 39 insertions(+), 43 deletions(-)
>
> diff --git a/arch/sparc/kernel/ioport.c b/arch/sparc/kernel/ioport.c
> index dd081d557609..12894f259bea 100644
> --- a/arch/sparc/kernel/ioport.c
> +++ b/arch/sparc/kernel/ioport.c
> @@ -401,6 +401,11 @@ static void sbus_sync_sg_for_device(struct device *dev, 
> struct scatterlist *sg,
> BUG();
>  }
>
> +static int sbus_dma_supported(struct device *dev, u64 mask)
> +{
> +   return 0;
> +}
> +

I'm guessing there's a few places that have DMA ops but DMA isn't
actually supported. Why not have a common method for this, maybe
"dma_not_supported"?

>  static const struct dma_map_ops sbus_dma_ops = {
> .alloc  = sbus_alloc_coherent,
> .free   = sbus_free_coherent,
> @@ -410,6 +415,7 @@ static const struct dma_map_ops sbus_dma_ops = {
> .unmap_sg   = sbus_unmap_sg,
> .sync_sg_for_cpu= sbus_sync_sg_for_cpu,
> .sync_sg_for_device = sbus_sync_sg_for_device,
> +   .dma_supported  = sbus_dma_supported,
>  };
>
>  static int __init sparc_register_ioport(void)

Thanks,

-- 
Julian Calaby

Email: julian.cal...@gmail.com
Profile: http://www.google.com/profiles/julian.calaby/
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 28/44] sparc: remove arch specific dma_supported implementations

From: Christoph Hellwig 
Date: Thu,  8 Jun 2017 15:25:53 +0200

> Usually dma_supported decisions are done by the dma_map_ops instance.
> Switch sparc to that model by providing a ->dma_supported instance for
> sbus that always returns false, and implementations tailored to the sun4u
> and sun4v cases for sparc64, and leave it unimplemented for PCI on
> sparc32, which means always supported.
> 
> Signed-off-by: Christoph Hellwig 

Acked-by: David S. Miller 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 31/44] hexagon: remove arch-specific dma_supported implementation

This implementation is simply bogus - hexagon only has a simple
direct mapped DMA implementation and thus doesn't care about the
address.

Signed-off-by: Christoph Hellwig 
---
 arch/hexagon/include/asm/dma-mapping.h | 2 --
 arch/hexagon/kernel/dma.c  | 9 -
 2 files changed, 11 deletions(-)

diff --git a/arch/hexagon/include/asm/dma-mapping.h 
b/arch/hexagon/include/asm/dma-mapping.h
index 00e3f10113b0..9c15cb5271a6 100644
--- a/arch/hexagon/include/asm/dma-mapping.h
+++ b/arch/hexagon/include/asm/dma-mapping.h
@@ -37,8 +37,6 @@ static inline const struct dma_map_ops 
*get_arch_dma_ops(struct bus_type *bus)
return dma_ops;
 }
 
-#define HAVE_ARCH_DMA_SUPPORTED 1
-extern int dma_supported(struct device *dev, u64 mask);
 extern int dma_is_consistent(struct device *dev, dma_addr_t dma_handle);
 extern void dma_cache_sync(struct device *dev, void *vaddr, size_t size,
   enum dma_data_direction direction);
diff --git a/arch/hexagon/kernel/dma.c b/arch/hexagon/kernel/dma.c
index 71269dc0f225..9ff1b2041f85 100644
--- a/arch/hexagon/kernel/dma.c
+++ b/arch/hexagon/kernel/dma.c
@@ -35,15 +35,6 @@ static inline void *dma_addr_to_virt(dma_addr_t dma_addr)
return phys_to_virt((unsigned long) dma_addr);
 }
 
-int dma_supported(struct device *dev, u64 mask)
-{
-   if (mask == DMA_BIT_MASK(32))
-   return 1;
-   else
-   return 0;
-}
-EXPORT_SYMBOL(dma_supported);
-
 static struct gen_pool *coherent_pool;
 
 
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 41/44] powerpc/cell: clean up fixed mapping dma_ops initialization

By the time cell_pci_dma_dev_setup calls cell_dma_dev_setup no device can
have the fixed map_ops set yet as it's only set by the set_dma_mask
method.  So move the setup for the fixed case to be only called in that
place instead of indirecting through cell_dma_dev_setup.

Signed-off-by: Christoph Hellwig 
---
 arch/powerpc/platforms/cell/iommu.c | 27 +++
 1 file changed, 7 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/platforms/cell/iommu.c 
b/arch/powerpc/platforms/cell/iommu.c
index 948086e33a0c..497bfbdbd967 100644
--- a/arch/powerpc/platforms/cell/iommu.c
+++ b/arch/powerpc/platforms/cell/iommu.c
@@ -663,14 +663,9 @@ static const struct dma_map_ops dma_iommu_fixed_ops = {
.mapping_error  = dma_iommu_mapping_error,
 };
 
-static void cell_dma_dev_setup_fixed(struct device *dev);
-
 static void cell_dma_dev_setup(struct device *dev)
 {
-   /* Order is important here, these are not mutually exclusive */
-   if (get_dma_ops(dev) == &dma_iommu_fixed_ops)
-   cell_dma_dev_setup_fixed(dev);
-   else if (get_pci_dma_ops() == &dma_iommu_ops)
+   if (get_pci_dma_ops() == &dma_iommu_ops)
set_iommu_table_base(dev, cell_get_iommu_table(dev));
else if (get_pci_dma_ops() == &dma_direct_ops)
set_dma_offset(dev, cell_dma_direct_offset);
@@ -963,32 +958,24 @@ static int dma_set_mask_and_switch(struct device *dev, 
u64 dma_mask)
return -EIO;
 
if (dma_mask == DMA_BIT_MASK(64) &&
-   cell_iommu_get_fixed_address(dev) != OF_BAD_ADDR)
-   {
+   cell_iommu_get_fixed_address(dev) != OF_BAD_ADDR) {
+   u64 addr = cell_iommu_get_fixed_address(dev) +
+   dma_iommu_fixed_base;
dev_dbg(dev, "iommu: 64-bit OK, using fixed ops\n");
+   dev_dbg(dev, "iommu: fixed addr = %llx\n", addr);
set_dma_ops(dev, &dma_iommu_fixed_ops);
+   set_dma_offset(dev, addr);
} else {
dev_dbg(dev, "iommu: not 64-bit, using default ops\n");
set_dma_ops(dev, get_pci_dma_ops());
+   cell_dma_dev_setup(dev);
}
 
-   cell_dma_dev_setup(dev);
-
*dev->dma_mask = dma_mask;
 
return 0;
 }
 
-static void cell_dma_dev_setup_fixed(struct device *dev)
-{
-   u64 addr;
-
-   addr = cell_iommu_get_fixed_address(dev) + dma_iommu_fixed_base;
-   set_dma_offset(dev, addr);
-
-   dev_dbg(dev, "iommu: fixed addr = %llx\n", addr);
-}
-
 static void insert_16M_pte(unsigned long addr, unsigned long *ptab,
   unsigned long base_pte)
 {
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 38/44] arm: implement ->dma_supported instead of ->set_dma_mask

Same behavior, less code duplication.

Signed-off-by: Christoph Hellwig 
---
 arch/arm/common/dmabounce.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/arm/common/dmabounce.c b/arch/arm/common/dmabounce.c
index 4aabf117e136..d89a0b56b245 100644
--- a/arch/arm/common/dmabounce.c
+++ b/arch/arm/common/dmabounce.c
@@ -445,12 +445,12 @@ static void dmabounce_sync_for_device(struct device *dev,
arm_dma_ops.sync_single_for_device(dev, handle, size, dir);
 }
 
-static int dmabounce_set_mask(struct device *dev, u64 dma_mask)
+static int dmabounce_dma_supported(struct device *dev, u64 dma_mask)
 {
if (dev->archdata.dmabounce)
return 0;
 
-   return arm_dma_ops.set_dma_mask(dev, dma_mask);
+   return arm_dma_ops.dma_supported(dev, dma_mask);
 }
 
 static int dmabounce_mapping_error(struct device *dev, dma_addr_t dma_addr)
@@ -474,9 +474,8 @@ static const struct dma_map_ops dmabounce_ops = {
.unmap_sg   = arm_dma_unmap_sg,
.sync_sg_for_cpu= arm_dma_sync_sg_for_cpu,
.sync_sg_for_device = arm_dma_sync_sg_for_device,
-   .set_dma_mask   = dmabounce_set_mask,
+   .dma_supported  = dmabounce_dma_supported,
.mapping_error  = dmabounce_mapping_error,
-   .dma_supported  = arm_dma_supported,
 };
 
 static int dmabounce_init_pool(struct dmabounce_pool *pool, struct device *dev,
-- 
2.11.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v6 26/34] iommu/amd: Allow the AMD IOMMU to work with memory encryption


On 6/7/2017 9:38 PM, Nick Sarnie wrote:

On Wed, Jun 7, 2017 at 3:17 PM, Tom Lendacky  wrote:

The IOMMU is programmed with physical addresses for the various tables
and buffers that are used to communicate between the device and the
driver. When the driver allocates this memory it is encrypted. In order
for the IOMMU to access the memory as encrypted the encryption mask needs
to be included in these physical addresses during configuration.

The PTE entries created by the IOMMU should also include the encryption
mask so that when the device behind the IOMMU performs a DMA, the DMA
will be performed to encrypted memory.

Signed-off-by: Tom Lendacky 
---
  arch/x86/include/asm/mem_encrypt.h |7 +++
  arch/x86/mm/mem_encrypt.c  |   30 ++
  drivers/iommu/amd_iommu.c  |   36 +++-
  drivers/iommu/amd_iommu_init.c |   18 --
  drivers/iommu/amd_iommu_proto.h|   10 ++
  drivers/iommu/amd_iommu_types.h|2 +-
  include/asm-generic/mem_encrypt.h  |5 +
  7 files changed, 84 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/mem_encrypt.h 
b/arch/x86/include/asm/mem_encrypt.h
index c7a2525..d86e544 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -31,6 +31,8 @@ void __init sme_early_decrypt(resource_size_t paddr,

  void __init sme_early_init(void);

+bool sme_iommu_supported(void);
+
  /* Architecture __weak replacement functions */
  void __init mem_encrypt_init(void);

@@ -62,6 +64,11 @@ static inline void __init sme_early_init(void)
  {
  }

+static inline bool sme_iommu_supported(void)
+{
+   return true;
+}
+
  #endif /* CONFIG_AMD_MEM_ENCRYPT */

  static inline bool sme_active(void)
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 5d7c51d..018b58a 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -197,6 +197,36 @@ void __init sme_early_init(void)
 protection_map[i] = pgprot_encrypted(protection_map[i]);
  }

+bool sme_iommu_supported(void)
+{
+   struct cpuinfo_x86 *c = &boot_cpu_data;
+
+   if (!sme_me_mask || (c->x86 != 0x17))
+   return true;
+
+   /* For Fam17h, a specific level of support is required */
+   switch (c->microcode & 0xf000) {
+   case 0x:
+   return false;
+   case 0x1000:
+   switch (c->microcode & 0x0f00) {
+   case 0x:
+   return false;
+   case 0x0100:
+   if ((c->microcode & 0xff) < 0x26)
+   return false;
+   break;
+   case 0x0200:
+   if ((c->microcode & 0xff) < 0x05)
+   return false;
+   break;
+   }
+   break;
+   }
+
+   return true;
+}
+
  /* Architecture __weak replacement functions */
  void __init mem_encrypt_init(void)
  {




...



Hi Tom,

This sounds like a cool feature. I'm trying to test it on my Ryzen
system, but c->microcode & 0xf000 is evaluating as 0, so IOMMU is not
being enabled on my system. I'm using the latest microcode for AGESA
1.0.0.6, 0x08001126. Is this work reliant on a future microcode
update, or is there some other issue?


This is my mistake. I moved the check and didn't re-test. At this point
the c->microcode field hasn't been filled in so I'll need to read
MSR_AMD64_PATCH_LEVEL directly in the sme_iommu_supported() function.

Thanks,
Tom



Thanks,
Sarnex


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 20/44] sparc: implement ->mapping_error

DMA_ERROR_CODE is going to go away, so don't rely on it.

Signed-off-by: Christoph Hellwig 
---
 arch/sparc/include/asm/dma-mapping.h |  2 --
 arch/sparc/kernel/iommu.c| 12 +---
 arch/sparc/kernel/iommu_common.h |  2 ++
 arch/sparc/kernel/pci_sun4v.c| 14 ++
 4 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/arch/sparc/include/asm/dma-mapping.h 
b/arch/sparc/include/asm/dma-mapping.h
index 69cc627779f2..b8e8dfcd065d 100644
--- a/arch/sparc/include/asm/dma-mapping.h
+++ b/arch/sparc/include/asm/dma-mapping.h
@@ -5,8 +5,6 @@
 #include 
 #include 
 
-#define DMA_ERROR_CODE (~(dma_addr_t)0x0)
-
 #define HAVE_ARCH_DMA_SUPPORTED 1
 int dma_supported(struct device *dev, u64 mask);
 
diff --git a/arch/sparc/kernel/iommu.c b/arch/sparc/kernel/iommu.c
index c63ba99ca551..dafa316d978d 100644
--- a/arch/sparc/kernel/iommu.c
+++ b/arch/sparc/kernel/iommu.c
@@ -314,7 +314,7 @@ static dma_addr_t dma_4u_map_page(struct device *dev, 
struct page *page,
 bad_no_ctx:
if (printk_ratelimit())
WARN_ON(1);
-   return DMA_ERROR_CODE;
+   return SPARC_MAPPING_ERROR;
 }
 
 static void strbuf_flush(struct strbuf *strbuf, struct iommu *iommu,
@@ -547,7 +547,7 @@ static int dma_4u_map_sg(struct device *dev, struct 
scatterlist *sglist,
 
if (outcount < incount) {
outs = sg_next(outs);
-   outs->dma_address = DMA_ERROR_CODE;
+   outs->dma_address = SPARC_MAPPING_ERROR;
outs->dma_length = 0;
}
 
@@ -573,7 +573,7 @@ static int dma_4u_map_sg(struct device *dev, struct 
scatterlist *sglist,
iommu_tbl_range_free(&iommu->tbl, vaddr, npages,
 IOMMU_ERROR_CODE);
 
-   s->dma_address = DMA_ERROR_CODE;
+   s->dma_address = SPARC_MAPPING_ERROR;
s->dma_length = 0;
}
if (s == outs)
@@ -741,6 +741,11 @@ static void dma_4u_sync_sg_for_cpu(struct device *dev,
spin_unlock_irqrestore(&iommu->lock, flags);
 }
 
+static int dma_4u_mapping_error(struct device *dev, dma_addr_t dma_addr)
+{
+   return dma_addr == SPARC_MAPPING_ERROR;
+}
+
 static const struct dma_map_ops sun4u_dma_ops = {
.alloc  = dma_4u_alloc_coherent,
.free   = dma_4u_free_coherent,
@@ -750,6 +755,7 @@ static const struct dma_map_ops sun4u_dma_ops = {
.unmap_sg   = dma_4u_unmap_sg,
.sync_single_for_cpu= dma_4u_sync_single_for_cpu,
.sync_sg_for_cpu= dma_4u_sync_sg_for_cpu,
+   .mapping_error  = dma_4u_mapping_error,
 };
 
 const struct dma_map_ops *dma_ops = &sun4u_dma_ops;
diff --git a/arch/sparc/kernel/iommu_common.h b/arch/sparc/kernel/iommu_common.h
index 828493329f68..5ea5c192b1d9 100644
--- a/arch/sparc/kernel/iommu_common.h
+++ b/arch/sparc/kernel/iommu_common.h
@@ -47,4 +47,6 @@ static inline int is_span_boundary(unsigned long entry,
return iommu_is_span_boundary(entry, nr, shift, boundary_size);
 }
 
+#define SPARC_MAPPING_ERROR(~(dma_addr_t)0x0)
+
 #endif /* _IOMMU_COMMON_H */
diff --git a/arch/sparc/kernel/pci_sun4v.c b/arch/sparc/kernel/pci_sun4v.c
index 68bec7c97cb8..8e2a56f4c03a 100644
--- a/arch/sparc/kernel/pci_sun4v.c
+++ b/arch/sparc/kernel/pci_sun4v.c
@@ -412,12 +412,12 @@ static dma_addr_t dma_4v_map_page(struct device *dev, 
struct page *page,
 bad:
if (printk_ratelimit())
WARN_ON(1);
-   return DMA_ERROR_CODE;
+   return SPARC_MAPPING_ERROR;
 
 iommu_map_fail:
local_irq_restore(flags);
iommu_tbl_range_free(tbl, bus_addr, npages, IOMMU_ERROR_CODE);
-   return DMA_ERROR_CODE;
+   return SPARC_MAPPING_ERROR;
 }
 
 static void dma_4v_unmap_page(struct device *dev, dma_addr_t bus_addr,
@@ -590,7 +590,7 @@ static int dma_4v_map_sg(struct device *dev, struct 
scatterlist *sglist,
 
if (outcount < incount) {
outs = sg_next(outs);
-   outs->dma_address = DMA_ERROR_CODE;
+   outs->dma_address = SPARC_MAPPING_ERROR;
outs->dma_length = 0;
}
 
@@ -607,7 +607,7 @@ static int dma_4v_map_sg(struct device *dev, struct 
scatterlist *sglist,
iommu_tbl_range_free(tbl, vaddr, npages,
 IOMMU_ERROR_CODE);
/* XXX demap? XXX */
-   s->dma_address = DMA_ERROR_CODE;
+   s->dma_address = SPARC_MAPPING_ERROR;
s->dma_length = 0;
}
if (s == outs)
@@ -669,6 +669,11 @@ static void dma_4v_unmap_sg(struct device *dev, struct 
scatterlist *sglist,
local_irq_restore(flags);
 }
 
+static int dma_4v_mapping_error(struct device *dev, dma_addr_t dma_addr)
+{
+   return dma_addr == SPARC_MAPPING_ERROR;
+}
+
 static const struct dma_map_ops

Re: [PATCH 03/12] intel-ipu3: Add DMA API implementation

2017-06-08 Thread Tomasz Figa

On Thu, Jun 8, 2017 at 10:22 PM, Robin Murphy  wrote:
> On 07/06/17 10:47, Tomasz Figa wrote:
>> Hi Yong,
>>
>> +Robin, Joerg, IOMMU ML
>>
>> Please see my comments inline.
>>
>> On Tue, Jun 6, 2017 at 5:39 AM, Yong Zhi  wrote:
[snip]
>>> +
>>> +/* End of things adapted from arch/arm/mm/dma-mapping.c */
>>> +static void ipu3_dmamap_sync_single_for_cpu(struct device *dev,
>>> +   dma_addr_t dma_handle, size_t 
>>> size,
>>> +   enum dma_data_direction dir)
>>> +{
>>> +   struct ipu3_mmu *mmu = to_ipu3_mmu(dev);
>>> +   dma_addr_t daddr = iommu_iova_to_phys(mmu->domain, dma_handle);
>>> +
>>> +   clflush_cache_range(phys_to_virt(daddr), size);
>>
>> You might need to consider another IOMMU on the way here. Generally,
>> given that daddr is your MMU DMA address (not necessarily CPU physical
>> address), you should be able to call
>>
>> dma_sync_single_for_cpu(, daddr, size, dir)
>
> I'd hope that this IPU complex is some kind of embedded endpoint thing
> that bypasses the VT-d IOMMU or is always using its own local RAM,
> because it would be pretty much unworkable otherwise.

It uses system RAM and, as far as my understanding goes, by default it
operates without the VT-d IOMMU and that's how it's used right now.
I'm suggesting VT-d IOMMU as a way to further strengthen the security
and error resilience in future (due to the IPU complex being
non-coherent and also running a closed source firmware).

> The whole
> infrastructure isn't really capable of dealing with nested IOMMUs, and
> nested DMA ops would be an equally horrible idea.

Could you elaborate a bit more on this? I think we should be able to
deal with this in a way I suggested before:

a) the PCI device would use the system DMA ops,
b) the PCI device would implement a secondary bus for which it would
provide its own DMA and IOMMU ops.
c) a secondary device would be registered on the secondary bus,
d) all memory for the IPU would be managed on behalf of the secondary device.

In fact, the driver already is designed in a way that all the points
above are true. If I'm not missing something, the only significant
missing point is calling into system DMA ops from IPU DMA ops.

Best regards,
Tomasz
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 25/44] arm: implement ->mapping_error

2017-06-08 Thread Russell King - ARM Linux

BOn Thu, Jun 08, 2017 at 03:25:50PM +0200, Christoph Hellwig wrote:
> +static int dmabounce_mapping_error(struct device *dev, dma_addr_t dma_addr)
> +{
> + if (dev->archdata.dmabounce)
> + return 0;

I'm not convinced that we need this check here:

dev->archdata.dmabounce = device_info;
set_dma_ops(dev, &dmabounce_ops);

There shouldn't be any chance of dev->archdata.dmabounce being NULL if
the dmabounce_ops has been set as the current device DMA ops.  So I
think that test can be killed.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v6 00/34] x86: Secure Memory Encryption (AMD)

On 6/7/2017 9:40 PM, Nick Sarnie wrote:

On Wed, Jun 7, 2017 at 3:13 PM, Tom Lendacky wrote:

This patch series provides support for AMD's new Secure Memory Encryption (SME)
feature.

SME can be used to mark individual pages of memory as encrypted through the
page tables. A page of memory that is marked encrypted will be automatically
decrypted when read from DRAM and will be automatically encrypted when
written to DRAM. Details on SME can found in the links below.

The SME feature is identified through a CPUID function and enabled through
the SYSCFG MSR. Once enabled, page table entries will determine how the
memory is accessed. If a page table entry has the memory encryption mask set,
then that memory will be accessed as encrypted memory. The memory encryption
mask (as well as other related information) is determined from settings
returned through the same CPUID function that identifies the presence of the
feature.

The approach that this patch series takes is to encrypt everything possible
starting early in the boot where the kernel is encrypted. Using the page
table macros the encryption mask can be incorporated into all page table
entries and page allocations. By updating the protection map, userspace
allocations are also marked encrypted. Certain data must be accounted for
as having been placed in memory before SME was enabled (EFI, initrd, etc.)
and accessed accordingly.

This patch series is a pre-cursor to another AMD processor feature called
Secure Encrypted Virtualization (SEV). The support for SEV will build upon
the SME support and will be submitted later. Details on SEV can be found
in the links below.

The following links provide additional detail:

AMD Memory Encryption whitepaper:

http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2013/12/AMD_Memory_Encryption_Whitepaper_v7-Public.pdf

AMD64 Architecture Programmer's Manual:
http://support.amd.com/TechDocs/24593.pdf
SME is section 7.10
SEV is section 15.34

---

...

Hi Tom,

Thanks for your work on this. This may be a stupid question, but is
using bounce buffers for the GPU(s) expected to reduce performance in
any/a noticeable way? I'm hitting another issue which I've already
sent mail about so I can't test it for myself at the moment,

That all depends on the workload, how much DMA is being performed, etc.
But it is extra overhead to use bounce buffers.

Thanks,
Tom

Thanks,
Sarnex

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 19/44] s390: implement ->mapping_error

2017-06-08 Thread Gerald Schaefer

On Thu,  8 Jun 2017 15:25:44 +0200
Christoph Hellwig  wrote:

> s390 can also use noop_dma_ops, and while that currently does not return
> errors it will so in the future.  Implementing the mapping_error method
> is the proper way to have per-ops error conditions.
> 
> Signed-off-by: Christoph Hellwig 

Acked-by: Gerald Schaefer 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v7 0/3] Cavium ThunderX2 SMMUv3 errata workarounds

On Tue, May 30, 2017 at 05:33:38PM +0530, Geetha sowjanya wrote:
> Cavium ThunderX2 SMMUv3 implementation has two Silicon Erratas.
> 1. Errata ID #74
>SMMU register alias Page 1 is not implemented
> 2. Errata ID #126
>SMMU doesnt support unique IRQ lines and also MSI for gerror,
>eventq and cmdq-sync
> 
> The following patchset does software workaround for these two erratas.
> 
> This series is based on patchset.
> https://www.spinics.net/lists/arm-kernel/msg578443.html

Yes so it is not standalone. How are we going to merge these
ACPI IORT/ACPICA/SMMU patches - inclusive of:

[1] https://www.spinics.net/lists/arm-kernel/msg586458.html

Rafael, do ACPICA patches go upstream via the ACPI tree pull request ?

To remove dependency on ACPICA changes this series needs updating
anyway and for [1] above I think the only solution is for all the
patches to go via the ACPI tree (if ACPICA updates go upstream with it).

Thanks,
Lorenzo

> Changes since v6:
>- Changed device tree compatible string to vendor specific.
>- Rebased on Robin's latest "Update SMMU models for IORT rev. C" v2 patch.
>  https://www.spinics.net/lists/arm-kernel/msg582809.html
> 
> Changes since v5:
>   - Rebased on Robin's "Update SMMU models for IORT rev. C" patch.
>  https://www.spinics.net/lists/arm-kernel/msg580728.html
>   - Replaced ACPI_IORT_SMMU_V3_CAVIUM_CN99XX macro with 
> ACPI_IORT_SMMU_CAVIUM_CN99XX
> 
> Changes since v4:
>  - Replaced all page1 offset macros ARM_SMMU_EVTQ/PRIQ_PROD/CONS with
> arm_smmu_page1_fixup(ARM_SMMU_EVTQ/PRIQ_PROD/CONS, smmu)
> 
> Changes since v3:
>  - Merged patches 1, 2 and 4 of Version 3.
>  - Modified the page1_offset_adjust() and get_irq_flags() implementation as
>suggested by Robin.
> 
> Changes since v2:
>  - Updated "Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt" document 
> with
>new SMMU option used to enable errata workaround.
> 
> Changes since v1:
>  - Since the use of MIDR register is rejected and SMMU_IIDR is broken on this
>silicon, as suggested by Will Deacon modified the patches to use ThunderX2
>SMMUv3 IORT model number to enable errata workaround.
> 
> Geetha Sowjanya (1):
>   iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #126
> 
> Linu Cherian (2):
>   ACPI/IORT: Fixup SMMUv3 resource size for Cavium ThunderX2 SMMUv3
> model
>   iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2  erratum
> #74
> 
>  Documentation/arm64/silicon-errata.txt |2 +
>  .../devicetree/bindings/iommu/arm,smmu-v3.txt  |6 ++
>  drivers/acpi/arm64/iort.c  |   10 ++-
>  drivers/iommu/arm-smmu-v3.c|   93 
> 
>  4 files changed, 91 insertions(+), 20 deletions(-)
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 03/44] dmaengine: ioat: don't use DMA_ERROR_CODE

2017-06-08 Thread Dave Jiang

On 06/08/2017 06:25 AM, Christoph Hellwig wrote:
> DMA_ERROR_CODE is not a public API and will go away.  Instead properly
> unwind based on the loop counter.
> 
> Signed-off-by: Christoph Hellwig 

Acked-by: Dave Jiang 

> ---
>  drivers/dma/ioat/init.c | 24 +++-
>  1 file changed, 7 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/dma/ioat/init.c b/drivers/dma/ioat/init.c
> index 6ad4384b3fa8..ed8ed1192775 100644
> --- a/drivers/dma/ioat/init.c
> +++ b/drivers/dma/ioat/init.c
> @@ -839,8 +839,6 @@ static int ioat_xor_val_self_test(struct ioatdma_device 
> *ioat_dma)
>   goto free_resources;
>   }
>  
> - for (i = 0; i < IOAT_NUM_SRC_TEST; i++)
> - dma_srcs[i] = DMA_ERROR_CODE;
>   for (i = 0; i < IOAT_NUM_SRC_TEST; i++) {
>   dma_srcs[i] = dma_map_page(dev, xor_srcs[i], 0, PAGE_SIZE,
>  DMA_TO_DEVICE);
> @@ -910,8 +908,6 @@ static int ioat_xor_val_self_test(struct ioatdma_device 
> *ioat_dma)
>  
>   xor_val_result = 1;
>  
> - for (i = 0; i < IOAT_NUM_SRC_TEST + 1; i++)
> - dma_srcs[i] = DMA_ERROR_CODE;
>   for (i = 0; i < IOAT_NUM_SRC_TEST + 1; i++) {
>   dma_srcs[i] = dma_map_page(dev, xor_val_srcs[i], 0, PAGE_SIZE,
>  DMA_TO_DEVICE);
> @@ -965,8 +961,6 @@ static int ioat_xor_val_self_test(struct ioatdma_device 
> *ioat_dma)
>   op = IOAT_OP_XOR_VAL;
>  
>   xor_val_result = 0;
> - for (i = 0; i < IOAT_NUM_SRC_TEST + 1; i++)
> - dma_srcs[i] = DMA_ERROR_CODE;
>   for (i = 0; i < IOAT_NUM_SRC_TEST + 1; i++) {
>   dma_srcs[i] = dma_map_page(dev, xor_val_srcs[i], 0, PAGE_SIZE,
>  DMA_TO_DEVICE);
> @@ -1017,18 +1011,14 @@ static int ioat_xor_val_self_test(struct 
> ioatdma_device *ioat_dma)
>   goto free_resources;
>  dma_unmap:
>   if (op == IOAT_OP_XOR) {
> - if (dest_dma != DMA_ERROR_CODE)
> - dma_unmap_page(dev, dest_dma, PAGE_SIZE,
> -DMA_FROM_DEVICE);
> - for (i = 0; i < IOAT_NUM_SRC_TEST; i++)
> - if (dma_srcs[i] != DMA_ERROR_CODE)
> - dma_unmap_page(dev, dma_srcs[i], PAGE_SIZE,
> -DMA_TO_DEVICE);
> + while (--i >= 0)
> + dma_unmap_page(dev, dma_srcs[i], PAGE_SIZE,
> +DMA_TO_DEVICE);
> + dma_unmap_page(dev, dest_dma, PAGE_SIZE, DMA_FROM_DEVICE);
>   } else if (op == IOAT_OP_XOR_VAL) {
> - for (i = 0; i < IOAT_NUM_SRC_TEST + 1; i++)
> - if (dma_srcs[i] != DMA_ERROR_CODE)
> - dma_unmap_page(dev, dma_srcs[i], PAGE_SIZE,
> -DMA_TO_DEVICE);
> + while (--i >= 0)
> + dma_unmap_page(dev, dma_srcs[i], PAGE_SIZE,
> +DMA_TO_DEVICE);
>   }
>  free_resources:
>   dma->device_free_chan_resources(dma_chan);
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 02/12] intel-ipu3: mmu: implement driver

2017-06-08 Thread Sakari Ailus

Hi Tomasz,

On Wed, Jun 07, 2017 at 05:35:13PM +0900, Tomasz Figa wrote:
> Hi Yong, Tuukka,
> 
> Continuing from yesterday. Please see comments inline.
> 
> > On Tue, Jun 6, 2017 at 5:39 AM, Yong Zhi  wrote:
> [snip]
> >> +   ptr = ipu3_mmu_alloc_page_table(mmu_dom, false);
> >> +   if (!ptr)
> >> +   goto fail_page_table;
> >> +
> >> +   /*
> >> +* We always map the L1 page table (a single page as well as
> >> +* the L2 page tables).
> >> +*/
> >> +   mmu_dom->dummy_l2_tbl = virt_to_phys(ptr) >> IPU3_MMU_PAGE_SHIFT;
> >> +   mmu_dom->pgtbl = ipu3_mmu_alloc_page_table(mmu_dom, true);
> >> +   if (!mmu_dom->pgtbl)
> >> +   goto fail_page_table;
> >> +
> >> +   spin_lock_init(&mmu_dom->lock);
> >> +   return &mmu_dom->domain;
> >> +
> >> +fail_page_table:
> >> +   free_page((unsigned long)TBL_VIRT_ADDR(mmu_dom->dummy_page));
> >> +   free_page((unsigned long)TBL_VIRT_ADDR(mmu_dom->dummy_l2_tbl));
> >> +fail_get_page:
> >> +   kfree(mmu_dom);
> >> +   return NULL;
> >> +}
> >> +
> >> +static void ipu3_mmu_domain_free(struct iommu_domain *dom)
> >> +{
> >> +   struct ipu3_mmu_domain *mmu_dom =
> >> +   container_of(dom, struct ipu3_mmu_domain, domain);
> >> +   uint32_t l1_idx;
> >> +
> >> +   for (l1_idx = 0; l1_idx < IPU3_MMU_L1PT_PTES; l1_idx++)
> >> +   if (mmu_dom->pgtbl[l1_idx] != mmu_dom->dummy_l2_tbl)
> >> +   free_page((unsigned long)
> >> + TBL_VIRT_ADDR(mmu_dom->pgtbl[l1_idx]));
> >> +
> >> +   free_page((unsigned long)TBL_VIRT_ADDR(mmu_dom->dummy_page));
> >> +   free_page((unsigned long)TBL_VIRT_ADDR(mmu_dom->dummy_l2_tbl));
> 
> I might be overly paranoid, but reading back kernel virtual pointers
> from device accessible memory doesn't seem safe to me. Other drivers
> keep kernel pointers of page tables in a dedicated array (it's only 8K
> of memory, but much better safety).

Do you happen to have an example of that?

All system memory typically is accessible for devices, I think you wanted to
say that the device is intended to access that memory. Albeit for reading
only.

...

> >> +static int ipu3_mmu_bus_remove(struct device *dev)
> >> +{
> >> +   struct ipu3_mmu *mmu = dev_get_drvdata(dev);
> >> +
> >> +   put_iova_domain(&mmu->iova_domain);
> >> +   iova_cache_put();
> 
> Don't we need to set the L1 table address to something invalid and
> invalidate the TLB, so that the IOMMU doesn't reference the page freed
> below anymore?

I think the expectation is that if a device gets removed, its memory is
unmapped by that time. Unmapping that memory will cause explicit TLB flush.

-- 
Regards,

Sakari Ailus
e-mail: sakari.ai...@iki.fi XMPP: sai...@retiisi.org.uk
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 03/12] intel-ipu3: Add DMA API implementation

2017-06-08 Thread Sakari Ailus

Hi Tomasz and Alan,

On Thu, Jun 08, 2017 at 11:55:18AM +0900, Tomasz Figa wrote:
> Hi Alan,
> 
> On Thu, Jun 8, 2017 at 2:45 AM, Alan Cox  wrote:
> >> > +   struct ipu3_mmu *mmu = to_ipu3_mmu(dev);
> >> > +   dma_addr_t daddr = iommu_iova_to_phys(mmu->domain, dma_handle);
> >> > +
> >> > +   clflush_cache_range(phys_to_virt(daddr), size);
> >>
> >> You might need to consider another IOMMU on the way here. Generally,
> >> given that daddr is your MMU DMA address (not necessarily CPU physical
> >> address), you should be able to call
> >>
> >> dma_sync_single_for_cpu(, daddr, size, dir)
> >
> > Te system IOMMU (if enabled) may be cache coherent - and on x86 would be,
> > so it doesn't think it needs to do anything for cache synchronization
> > and the dma_sync won't actually do any work.
> 
> I'm not very familiar with x86, but typically I found coherency to be
> an attribute of the DMA master (i.e. if it is connected to a coherent
> memory port).
> 
> Looking at all the IPU3 code, it looks like the whole PCI device is
> non-coherent for some reason (e.g. you can see implicit cache flushes
> for page tables). So I would have expected that a non-coherent variant
> of x86 dma_ops is used for the PCI struct device, which would do cache
> maintenance in its dma_sync_* ops.

It can actually do both --- in most cases.

The MMU page tables are an exception so they will still need an explicit
flush.

-- 
Sakari Ailus
e-mail: sakari.ai...@iki.fi XMPP: sai...@retiisi.org.uk
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v7 0/3] Cavium ThunderX2 SMMUv3 errata workarounds

2017-06-08 Thread Rafael J. Wysocki

On Thu, Jun 8, 2017 at 6:32 PM, Lorenzo Pieralisi
 wrote:
> On Tue, May 30, 2017 at 05:33:38PM +0530, Geetha sowjanya wrote:
>> Cavium ThunderX2 SMMUv3 implementation has two Silicon Erratas.
>> 1. Errata ID #74
>>SMMU register alias Page 1 is not implemented
>> 2. Errata ID #126
>>SMMU doesnt support unique IRQ lines and also MSI for gerror,
>>eventq and cmdq-sync
>>
>> The following patchset does software workaround for these two erratas.
>>
>> This series is based on patchset.
>> https://www.spinics.net/lists/arm-kernel/msg578443.html
>
> Yes so it is not standalone. How are we going to merge these
> ACPI IORT/ACPICA/SMMU patches - inclusive of:
>
> [1] https://www.spinics.net/lists/arm-kernel/msg586458.html
>
> Rafael, do ACPICA patches go upstream via the ACPI tree pull request ?

Not as a rule.

> To remove dependency on ACPICA changes this series needs updating
> anyway and for [1] above I think the only solution is for all the
> patches to go via the ACPI tree (if ACPICA updates go upstream with it).

I think we may ask Lv to backport the header changes once they have
been merged into Linux.

Lv, would that work?

Thanks,
Rafael
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v7 0/3] Cavium ThunderX2 SMMUv3 errata workarounds

On 08/06/17 18:13, Rafael J. Wysocki wrote:
> On Thu, Jun 8, 2017 at 6:32 PM, Lorenzo Pieralisi
>  wrote:
>> On Tue, May 30, 2017 at 05:33:38PM +0530, Geetha sowjanya wrote:
>>> Cavium ThunderX2 SMMUv3 implementation has two Silicon Erratas.
>>> 1. Errata ID #74
>>>SMMU register alias Page 1 is not implemented
>>> 2. Errata ID #126
>>>SMMU doesnt support unique IRQ lines and also MSI for gerror,
>>>eventq and cmdq-sync
>>>
>>> The following patchset does software workaround for these two erratas.
>>>
>>> This series is based on patchset.
>>> https://www.spinics.net/lists/arm-kernel/msg578443.html
>>
>> Yes so it is not standalone. How are we going to merge these
>> ACPI IORT/ACPICA/SMMU patches - inclusive of:
>>
>> [1] https://www.spinics.net/lists/arm-kernel/msg586458.html
>>
>> Rafael, do ACPICA patches go upstream via the ACPI tree pull request ?
> 
> Not as a rule.
> 
>> To remove dependency on ACPICA changes this series needs updating
>> anyway and for [1] above I think the only solution is for all the
>> patches to go via the ACPI tree (if ACPICA updates go upstream with it).
> 
> I think we may ask Lv to backport the header changes once they have
> been merged into Linux.
> 
> Lv, would that work?

FWIW, I have already sent a PR for the header patch for the new model
IDs to ACPICA upstream. I briefly considered the actual table update as
well, but didn't find time to comprehend the code changes that appeared
to be necessary for that.

Robin.

> 
> Thanks,
> Rafael
> 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 03/12] intel-ipu3: Add DMA API implementation

On 08/06/17 15:35, Tomasz Figa wrote:
> On Thu, Jun 8, 2017 at 10:22 PM, Robin Murphy  wrote:
>> On 07/06/17 10:47, Tomasz Figa wrote:
>>> Hi Yong,
>>>
>>> +Robin, Joerg, IOMMU ML
>>>
>>> Please see my comments inline.
>>>
>>> On Tue, Jun 6, 2017 at 5:39 AM, Yong Zhi  wrote:
> [snip]
 +
 +/* End of things adapted from arch/arm/mm/dma-mapping.c */
 +static void ipu3_dmamap_sync_single_for_cpu(struct device *dev,
 +   dma_addr_t dma_handle, size_t 
 size,
 +   enum dma_data_direction dir)
 +{
 +   struct ipu3_mmu *mmu = to_ipu3_mmu(dev);
 +   dma_addr_t daddr = iommu_iova_to_phys(mmu->domain, dma_handle);
 +
 +   clflush_cache_range(phys_to_virt(daddr), size);
>>>
>>> You might need to consider another IOMMU on the way here. Generally,
>>> given that daddr is your MMU DMA address (not necessarily CPU physical
>>> address), you should be able to call
>>>
>>> dma_sync_single_for_cpu(, daddr, size, dir)
>>
>> I'd hope that this IPU complex is some kind of embedded endpoint thing
>> that bypasses the VT-d IOMMU or is always using its own local RAM,
>> because it would be pretty much unworkable otherwise.
> 
> It uses system RAM and, as far as my understanding goes, by default it
> operates without the VT-d IOMMU and that's how it's used right now.

OK, if it *is* behind a DMAR unit then booting with "iommu=force" (or
whatever the exact incantation for intel-iommu is) should be fun...

> I'm suggesting VT-d IOMMU as a way to further strengthen the security
> and error resilience in future (due to the IPU complex being
> non-coherent and also running a closed source firmware).

TBH, doing DMA remapping through *two* IOMMUS will add horrible hardware
overhead, increase the scope for kernel-side bugs, and not much more. If
we don't trust this IOMMU to behave, why are we trying to drive it in
the first place? If we do, then a second IOMMU behind it won't protect
anything that the first one doesn't already.

>> The whole
>> infrastructure isn't really capable of dealing with nested IOMMUs, and
>> nested DMA ops would be an equally horrible idea.
> 
> Could you elaborate a bit more on this? I think we should be able to
> deal with this in a way I suggested before:
> 
> a) the PCI device would use the system DMA ops,
> b) the PCI device would implement a secondary bus for which it would
> provide its own DMA and IOMMU ops.
> c) a secondary device would be registered on the secondary bus,
> d) all memory for the IPU would be managed on behalf of the secondary device.
> 
> In fact, the driver already is designed in a way that all the points
> above are true. If I'm not missing something, the only significant
> missing point is calling into system DMA ops from IPU DMA ops.

I don't believe x86 has any non-coherent DMA ops, therefore the IPU DMA
ops would still probably have to do all their own cache maintenance.
Allocation/mapping, though, would have to be done with the parent DMA
ops first (in case DMA address != physical address), *then* mapped at
the IPU MMU, which is the real killer - if the PCI DMA ops are from
intel-iommu, then there's little need for the IPU MMU mapping to be
anything other than 1:1, so you may as well not bother. If the PCI DMA
ops are from SWIOTLB, then the constraints of having to go through that
first eliminate all the scatter-gather benefit of the IPU MMU.

The IOMMU API ops would have to be handled similarly, by checking for
ops on the parent bus, calling those first if present, then running the
intermediate results through the IPU MMU's own functions. Sure, it's not
impossible, but it's really really grim. Not to mention that all the IPU
MMU's page tables/control structures/etc. would also have to be
DMA-allocated/mapped because it may or may not be operating in physical
address space.

The reasonable option - assuming the topology really is this way - would
seem to be special-casing the IPU in intel-iommu in a similar manner to
integrated graphics, to make sure it gets a passthrough domain for DMA
ops, but still allowing the whole PCI device to be passed through to a
guest VM via VFIO if desired (which is really the only case where nested
translation does start to make sense).

Robin.

> 
> Best regards,
> Tomasz
> 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH] iommu/vt-d: correctly disable Intel IOMMU force on

2017-06-08 Thread Shaohua Li

From: Shaohua Li 

I made a mistake in commit bfd20f1. We should skip the force on with the
option enabled instead of vice versa. Not sure why this passed our
performance test, sorry.

Fix: bfd20f1(x86, iommu/vt-d: Add an option to disable Intel IOMMU force on)
Signed-off-by: Shaohua Li 
---
 arch/x86/kernel/tboot.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index 4b17240..a4eb279 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -514,7 +514,7 @@ int tboot_force_iommu(void)
if (!tboot_enabled())
return 0;
 
-   if (!intel_iommu_tboot_noforce)
+   if (intel_iommu_tboot_noforce)
return 1;
 
if (no_iommu || swiotlb || dmar_disabled)
-- 
2.9.3

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v1 3/3] iommu/amd: Optimize the IOMMU queue flush

2017-06-08 Thread Jan Vesely

On Tue, 2017-06-06 at 10:02 +, Nath, Arindam wrote:
> > -Original Message-
> > From: Lendacky, Thomas
> > Sent: Tuesday, June 06, 2017 1:23 AM
> > To: iommu@lists.linux-foundation.org
> > Cc: Nath, Arindam ; Joerg Roedel
> > ; Duran, Leo ; Suthikulpanit,
> > Suravee 
> > Subject: [PATCH v1 3/3] iommu/amd: Optimize the IOMMU queue flush
> > 
> > After reducing the amount of MMIO performed by the IOMMU during
> > operation,
> > perf data shows that flushing the TLB for all protection domains during
> > DMA unmapping is a performance issue. It is not necessary to flush the
> > TLBs for all protection domains, only the protection domains associated
> > with iova's on the flush queue.
> > 
> > Create a separate queue that tracks the protection domains associated with
> > the iova's on the flush queue. This new queue optimizes the flushing of
> > TLBs to the required protection domains.
> > 
> > Reviewed-by: Arindam Nath 
> > Signed-off-by: Tom Lendacky 
> > ---
> > drivers/iommu/amd_iommu.c |   56
> > -
> > 1 file changed, 50 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> > index 856103b..a5e77f0 100644
> > --- a/drivers/iommu/amd_iommu.c
> > +++ b/drivers/iommu/amd_iommu.c
> > @@ -103,7 +103,18 @@ struct flush_queue {
> > struct flush_queue_entry *entries;
> > };
> > 
> > +struct flush_pd_queue_entry {
> > +   struct protection_domain *pd;
> > +};
> > +
> > +struct flush_pd_queue {
> > +   /* No lock needed, protected by flush_queue lock */
> > +   unsigned next;
> > +   struct flush_pd_queue_entry *entries;
> > +};
> > +
> > static DEFINE_PER_CPU(struct flush_queue, flush_queue);
> > +static DEFINE_PER_CPU(struct flush_pd_queue, flush_pd_queue);
> > 
> > static atomic_t queue_timer_on;
> > static struct timer_list queue_timer;
> > @@ -2227,16 +2238,20 @@ static struct iommu_group
> > *amd_iommu_device_group(struct device *dev)
> >  *
> > 
> > ***
> > **/
> > 
> > -static void __queue_flush(struct flush_queue *queue)
> > +static void __queue_flush(struct flush_queue *queue,
> > + struct flush_pd_queue *pd_queue)
> > {
> > -   struct protection_domain *domain;
> > unsigned long flags;
> > int idx;
> > 
> > /* First flush TLB of all known domains */
> > spin_lock_irqsave(&amd_iommu_pd_lock, flags);
> > -   list_for_each_entry(domain, &amd_iommu_pd_list, list)
> > -   domain_flush_tlb(domain);
> > +   for (idx = 0; idx < pd_queue->next; ++idx) {
> > +   struct flush_pd_queue_entry *entry;
> > +
> > +   entry = pd_queue->entries + idx;
> > +   domain_flush_tlb(entry->pd);
> > +   }
> > spin_unlock_irqrestore(&amd_iommu_pd_lock, flags);
> > 
> > /* Wait until flushes have completed */
> > @@ -2255,6 +2270,7 @@ static void __queue_flush(struct flush_queue
> > *queue)
> > entry->dma_dom = NULL;
> > }
> > 
> > +   pd_queue->next = 0;
> > queue->next = 0;
> > }
> > 
> > @@ -2263,13 +2279,15 @@ static void queue_flush_all(void)
> > int cpu;
> > 
> > for_each_possible_cpu(cpu) {
> > +   struct flush_pd_queue *pd_queue;
> > struct flush_queue *queue;
> > unsigned long flags;
> > 
> > queue = per_cpu_ptr(&flush_queue, cpu);
> > +   pd_queue = per_cpu_ptr(&flush_pd_queue, cpu);
> > spin_lock_irqsave(&queue->lock, flags);
> > if (queue->next > 0)
> > -   __queue_flush(queue);
> > +   __queue_flush(queue, pd_queue);
> > spin_unlock_irqrestore(&queue->lock, flags);
> > }
> > }
> > @@ -2283,6 +2301,8 @@ static void queue_flush_timeout(unsigned long
> > unsused)
> > static void queue_add(struct dma_ops_domain *dma_dom,
> >   unsigned long address, unsigned long pages)
> > {
> > +   struct flush_pd_queue_entry *pd_entry;
> > +   struct flush_pd_queue *pd_queue;
> > struct flush_queue_entry *entry;
> > struct flush_queue *queue;
> > unsigned long flags;
> > @@ -2292,10 +2312,22 @@ static void queue_add(struct dma_ops_domain
> > *dma_dom,
> > address >>= PAGE_SHIFT;
> > 
> > queue = get_cpu_ptr(&flush_queue);
> > +   pd_queue = get_cpu_ptr(&flush_pd_queue);
> > spin_lock_irqsave(&queue->lock, flags);
> > 
> > if (queue->next == FLUSH_QUEUE_SIZE)
> > -   __queue_flush(queue);
> > +   __queue_flush(queue, pd_queue);
> > +
> > +   for (idx = 0; idx < pd_queue->next; ++idx) {
> > +   pd_entry = pd_queue->entries + idx;
> > +   if (pd_entry->pd == &dma_dom->domain)
> > +   break;
> > +   }
> > +   if (idx == pd_queue->next) {
> > +   /* New protection domain, add it to the list */
> > +   pd_entry = pd_queue->entries + pd_queue->next++;
> > +   pd_entry->pd = &dma_dom->domain;
> > +   }
> > 
> > idx   = qu

Re: [PATCH v6 10/34] x86, x86/mm, x86/xen, olpc: Use __va() against just the physical address in cr3

2017-06-08 Thread Boris Ostrovsky


>
>> What may be needed is making sure X86_FEATURE_SME is not set for PV
>> guests.
>
> And that may be something that Xen will need to control through either
> CPUID or MSR support for the PV guests.


Only on newer versions of Xen. On earlier versions (2-3 years old) leaf
0x8007 is passed to the guest unchanged. And so is MSR_K8_SYSCFG.

-boris
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v6 10/34] x86, x86/mm, x86/xen, olpc: Use __va() against just the physical address in cr3


On 6/8/2017 3:51 PM, Boris Ostrovsky wrote:





What may be needed is making sure X86_FEATURE_SME is not set for PV
guests.


And that may be something that Xen will need to control through either
CPUID or MSR support for the PV guests.



Only on newer versions of Xen. On earlier versions (2-3 years old) leaf
0x8007 is passed to the guest unchanged. And so is MSR_K8_SYSCFG.


The SME feature is in leaf 0x801f, is that leaf passed to the guest
unchanged?

Thanks,
Tom



-boris


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v6 25/34] swiotlb: Add warnings for use of bounce buffers with SME


On 6/8/2017 12:53 AM, kbuild test robot wrote:

Hi Tom,

[auto build test ERROR on linus/master]
[also build test ERROR on v4.12-rc4 next-20170607]
[cannot apply to tip/x86/core]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Tom-Lendacky/x86-Secure-Memory-Encryption-AMD/20170608-104147
config: sparc-defconfig (attached as .config)
compiler: sparc-linux-gcc (GCC) 6.2.0
reproduce:
 wget 
https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
 chmod +x ~/bin/make.cross
 # save the attached .config to linux build tree
 make.cross ARCH=sparc

All errors (new ones prefixed by >>):

In file included from include/linux/dma-mapping.h:13:0,
 from include/linux/skbuff.h:34,
 from include/linux/filter.h:12,
 from kernel//bpf/core.c:24:

include/linux/mem_encrypt.h:16:29: fatal error: asm/mem_encrypt.h: No such file 
or directory

 #include 
 ^
compilation terminated.


Okay, I had the wrong understanding of the asm-generic directory. The
next series will fix this so it is not an issue for other arches.

Thanks,
Tom



vim +16 include/linux/mem_encrypt.h

2d7c2ec4 Tom Lendacky 2017-06-07  10   * published by the Free Software 
Foundation.
2d7c2ec4 Tom Lendacky 2017-06-07  11   */
2d7c2ec4 Tom Lendacky 2017-06-07  12
2d7c2ec4 Tom Lendacky 2017-06-07  13  #ifndef __MEM_ENCRYPT_H__
2d7c2ec4 Tom Lendacky 2017-06-07  14  #define __MEM_ENCRYPT_H__
2d7c2ec4 Tom Lendacky 2017-06-07  15
2d7c2ec4 Tom Lendacky 2017-06-07 @16  #include 
2d7c2ec4 Tom Lendacky 2017-06-07  17
2d7c2ec4 Tom Lendacky 2017-06-07  18  #endif/* __MEM_ENCRYPT_H__ */

:: The code at line 16 was first introduced by commit
:: 2d7c2ec4c60e83432b27bfb32042706f404d4158 x86/mm: Add Secure Memory 
Encryption (SME) support

:: TO: Tom Lendacky 
:: CC: 0day robot 

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v6 10/34] x86, x86/mm, x86/xen, olpc: Use __va() against just the physical address in cr3

2017-06-08 Thread Boris Ostrovsky

On 06/08/2017 05:02 PM, Tom Lendacky wrote:
> On 6/8/2017 3:51 PM, Boris Ostrovsky wrote:
>>
>>>
 What may be needed is making sure X86_FEATURE_SME is not set for PV
 guests.
>>>
>>> And that may be something that Xen will need to control through either
>>> CPUID or MSR support for the PV guests.
>>
>>
>> Only on newer versions of Xen. On earlier versions (2-3 years old) leaf
>> 0x8007 is passed to the guest unchanged. And so is MSR_K8_SYSCFG.
>
> The SME feature is in leaf 0x801f, is that leaf passed to the guest
> unchanged?

Oh, I misread the patch where X86_FEATURE_SME is defined. Then all
versions, including the current one, pass it unchanged.

All that's needed is setup_clear_cpu_cap(X86_FEATURE_SME) in
xen_init_capabilities().


-boris
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v6 10/34] x86, x86/mm, x86/xen, olpc: Use __va() against just the physical address in cr3


On 6/8/2017 1:05 AM, Andy Lutomirski wrote:

On Wed, Jun 7, 2017 at 12:14 PM, Tom Lendacky  wrote:

The cr3 register entry can contain the SME encryption bit that indicates
the PGD is encrypted.  The encryption bit should not be used when creating
a virtual address for the PGD table.

Create a new function, read_cr3_pa(), that will extract the physical
address from the cr3 register. This function is then used where a virtual
address of the PGD needs to be created/used from the cr3 register.


This is going to conflict with:

https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/pcid&id=555c81e5d01a62b629ec426a2f50d27e2127c1df

We're both encountering the fact that CR3 munges the page table PA
with some other stuff, and some readers want to see the actual CR3
value and other readers just want the PA.  The thing I prefer about my
patch is that I get rid of read_cr3() entirely, forcing the patch to
update every single reader, making review and conflict resolution much
safer.

I'd be willing to send a patch tomorrow that just does the split into
__read_cr3() and read_cr3_pa() (I like your name better) and then we
can both base on top of it.  Would that make sense?


That makes sense to me.



Also:


+static inline unsigned long read_cr3_pa(void)
+{
+   return (read_cr3() & PHYSICAL_PAGE_MASK);
+}


Is there any guarantee that the magic encryption bit is masked out in
PHYSICAL_PAGE_MASK?  The docs make it sound like it could be any bit.
(But if it's one of the low 12 bits, that would be quite confusing.)


Right now it's bit 47 and we're steering away from any of the currently
reserved bits so we should be safe.

Thanks,
Tom



--Andy


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v6 25/34] swiotlb: Add warnings for use of bounce buffers with SME