Re: [PATCHv4 6/6] iommu: arm-smmu-impl: Remove unwanted extra blank lines

2020-09-14 Thread Sai Prakash Ranjan

On 2020-09-11 22:20, Sai Prakash Ranjan wrote:

On 2020-09-11 22:04, Robin Murphy wrote:

On 2020-09-11 17:21, Sai Prakash Ranjan wrote:

On 2020-09-11 21:37, Will Deacon wrote:

On Fri, Sep 11, 2020 at 05:03:06PM +0100, Robin Murphy wrote:
BTW am I supposed to have received 3 copies of everything? Because 
I did...


Yeah, this seems to be happening for all of Sai's emails :/



Sorry, I am not sure what went wrong as I only sent this once
and there are no recent changes to any of my configs, I'll
check it further.


Actually on closer inspection it appears to be "correct" behaviour.
I'm still subscribed to LAKML and the IOMMU list on this account, but
normally Office 365 deduplicates so aggressively that I have rules set
up to copy list mails that I'm cc'ed on back to my inbox, in case they
arrive first and cause the direct copy to get eaten - apparently
there's something unique about your email setup that manages to defeat
the deduplicator and make it deliver all 3 copies intact... :/



No changes in my local setup at least, but in the past we have
had cases with codeaurora mail acting weird, or it could be my VPN;
I will have to check.



This was an issue with codeaurora servers and I am told that it is
fixed now.

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH] irqchip/qcom-pdc: Allow QCOM_PDC to be loadable as a permanent module

2020-09-14 Thread Greg Kroah-Hartman
On Mon, Sep 14, 2020 at 09:04:23PM +, John Stultz wrote:
> Allows the qcom-pdc driver to be loaded as a permanent module.
> 
> An earlier version of this patch was merged in a larger patchset
> but was reverted entirely when issues were found with other
> drivers, so now that Marc has provided a better solution in his
> Hybrid probing patch set, I wanted to re-submit this change.
> 
> Cc: Andy Gross 
> Cc: Bjorn Andersson 
> Cc: Joerg Roedel 
> Cc: Thomas Gleixner 
> Cc: Jason Cooper 
> Cc: Marc Zyngier 
> Cc: Linus Walleij 
> Cc: Maulik Shah 
> Cc: Lina Iyer 
> Cc: Saravana Kannan 
> Cc: Todd Kjos 
> Cc: Greg Kroah-Hartman 
> Cc: linux-arm-...@vger.kernel.org
> Cc: iommu@lists.linux-foundation.org
> Cc: linux-g...@vger.kernel.org
> Signed-off-by: John Stultz 
> ---

Reviewed-by: Greg Kroah-Hartman 


Re: a saner API for allocating DMA addressable pages v2

2020-09-14 Thread Christoph Hellwig
On Mon, Sep 14, 2020 at 04:26:17PM +0100, Matthew Wilcox wrote:
> On Mon, Sep 14, 2020 at 04:44:16PM +0200, Christoph Hellwig wrote:
> > I'm still a little unsure about the API naming, as alloc_pages sort of
> > implies a struct page return value, but we return a kernel virtual
> > address.
> 
> Erm ... dma_alloc_pages() returns a struct page, so is this sentence
> stale?

Yes.

> You say that like it's a bad thing.  I think the problem is more that
> people don't understand what non-coherent means and think they're
> supporting it when they're not.
> 
> dma_alloc_manual_flushing()?

That sounds pretty awkward..

> 
> > As a follow up I plan to move the implementation of the
> > DMA_ATTR_NO_KERNEL_MAPPING flag over to this framework as well, given
> > that it also is a fundamentally non-coherent allocation.  The replacement
> > for that flag would then return a struct page, as it is allowed to
> > actually return pages without a kernel mapping as the name suggested
> > (although most of the time they will actually have a kernel mapping..)
> 
> If the page doesn't have a kernel mapping, shouldn't it return a PFN
> or a phys_addr?

Most APIs we'll feed it into need a struct page.  The difference is just
that it can be a highmem page.  And if we want to get fancy we could
change the kernel mapping to PROT_NONE eventually.
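
For reference, a minimal usage sketch of the API shape under discussion,
assuming the dma_alloc_pages()/dma_free_pages() prototypes proposed in this
series (the exact signatures were still being settled at this point, so
treat this as illustrative, not authoritative):

	#include <linux/dma-mapping.h>

	static int example_noncoherent_alloc(struct device *dev)
	{
		dma_addr_t dma;
		struct page *page;

		/* Returns a struct page (possibly highmem), not a kernel
		 * virtual address; the backing memory may be non-coherent. */
		page = dma_alloc_pages(dev, PAGE_SIZE, &dma, DMA_TO_DEVICE,
				       GFP_KERNEL);
		if (!page)
			return -ENOMEM;

		/* For lowmem pages a kernel mapping exists: */
		memset(page_address(page), 0, PAGE_SIZE);

		/* The caller owns cache maintenance: hand the buffer to the
		 * device explicitly before starting DMA. */
		dma_sync_single_for_device(dev, dma, PAGE_SIZE, DMA_TO_DEVICE);

		/* ... program the device with "dma" ... */

		dma_free_pages(dev, PAGE_SIZE, page, dma, DMA_TO_DEVICE);
		return 0;
	}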


Re: [PATCH 03/17] drm/exynos: stop setting DMA_ATTR_NON_CONSISTENT

2020-09-14 Thread Christoph Hellwig
On Mon, Sep 14, 2020 at 06:34:02PM +0300, Sergei Shtylyov wrote:
> On 9/14/20 5:44 PM, Christoph Hellwig wrote:
> 
> > DMA_ATTR_NON_CONSISTENT is a no-op except on PARISC and some mips
> > configs, so don't set it in this ARM specific driver.
> 
>    Hm, PARISC and ARM capitalized but mips in lower case? :-)

I guess it should also be PA-RISC if we start nitpicking..


Re: [PATCH 11/17] sgiseeq: convert to dma_alloc_noncoherent

2020-09-14 Thread Christoph Hellwig
On Mon, Sep 14, 2020 at 04:13:58PM +0100, Matthew Wilcox wrote:
> On Mon, Sep 14, 2020 at 04:44:27PM +0200, Christoph Hellwig wrote:
> >  drivers/net/ethernet/i825xx/lasi_82596.c |  25 ++---
> >  drivers/net/ethernet/i825xx/lib82596.c   | 114 ++-
> >  drivers/net/ethernet/i825xx/sni_82596.c  |   4 -
> >  drivers/net/ethernet/seeq/sgiseeq.c  |  28 --
> >  drivers/scsi/53c700.c|   9 +-
> >  5 files changed, 103 insertions(+), 77 deletions(-)
> 
> I think your patch slicing-and-dicing went wrong here ;-(

Pretty much..  Fixed up for the next version.


Re: [PATCH 07/17] 53c700: improve non-coherent DMA handling

2020-09-14 Thread Christoph Hellwig
On Mon, Sep 14, 2020 at 08:20:18AM -0700, James Bottomley wrote:
> If you're going to change the macros from taking a device to taking a
> hostdata structure then the descriptive argument name needs to change
> ... it can't be dev anymore.  I'm happy with it simply becoming 'h' if
> hostdata is too long.
> 
> I already asked for this on the first go around:

And I did rename them; those hunks just accidentally slipped into patch
12 instead of this one.  Fixed for the next version.


Re: [PATCH v2 02/23] dt-bindings: memory: mediatek: Convert SMI to DT schema

2020-09-14 Thread Yong Wu
On Mon, 2020-09-14 at 17:23 -0600, Rob Herring wrote:
> On Sat, Sep 05, 2020 at 04:08:59PM +0800, Yong Wu wrote:
> > Convert MediaTek SMI to DT schema.
> > 
> > Signed-off-by: Yong Wu 
> > ---
> >  .../mediatek,smi-common.txt   | 49 --
> >  .../mediatek,smi-common.yaml  | 96 +++
> >  .../memory-controllers/mediatek,smi-larb.txt  | 49 --
> >  .../memory-controllers/mediatek,smi-larb.yaml | 85 

[...]

> > +  mediatek,smi:
> > +$ref: /schemas/types.yaml#/definitions/phandle-array
> > +description: a phandle to the smi_common node.
> > +
> > +  mediatek,larb-id:
> > +$ref: /schemas/types.yaml#/definitions/uint32
> > +description: the hardware id of this larb.
> > +  Required property for mt2701, mt2712, mt6779 and mt7623.
> 
> Is there a set of valid values?

In patch [4/23] I change larb_nr to 32; normally this id is from
0 to 31. I will add this in the next version:

minimum: 0
maximum: 31
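
Put together, the constrained property might then read as below (a sketch
combining the schema fragment above with the stated range; not the final
posted binding):

  mediatek,larb-id:
    $ref: /schemas/types.yaml#/definitions/uint32
    minimum: 0
    maximum: 31
    description: the hardware id of this larb.
      Required property for mt2701, mt2712, mt6779 and mt7623.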

> 
> > +
> > +required:
> > +  - compatible
> > +  - reg
> > +  - clocks
> > +  - clock-names
> > +  - power-domains
> > +
> > +additionalProperties: false
> > +
> > +examples:
> > +  - |
> > +#include 
> > +#include 
> > +
> > +larb1: larb@1601 {
> > +  compatible = "mediatek,mt8173-smi-larb";
> > +  reg = <0x1601 0x1000>;
> > +  mediatek,smi = <&smi_common>;
> > +  power-domains = <&scpsys MT8173_POWER_DOMAIN_VDEC>;
> > +  clocks = <&vdecsys CLK_VDEC_CKEN>,
> > +   <&vdecsys CLK_VDEC_LARB_CKEN>;
> > +  clock-names = "apb", "smi";
> > +};
> > +
> > -- 
> > 2.18.0



Re: [PATCH v2 01/23] dt-bindings: iommu: mediatek: Convert IOMMU to DT schema

2020-09-14 Thread Yong Wu
On Mon, 2020-09-14 at 17:22 -0600, Rob Herring wrote:
> On Sat, Sep 05, 2020 at 04:08:58PM +0800, Yong Wu wrote:
> > Convert MediaTek IOMMU to DT schema.
> > 
> > Signed-off-by: Yong Wu 
> > ---

[...]

> > +properties:
> > +  compatible:
> > +enum:
> > +  - mediatek,mt2701-m4u #mt2701 generation one HW
> > +  - mediatek,mt2712-m4u #mt2712 generation two HW
> > +  - mediatek,mt6779-m4u #mt6779 generation two HW
> > +  - mediatek,mt7623-m4u, mediatek,mt2701-m4u #mt7623 generation one HW
> 
> This is not right.
> 
> items:
>   - const: mediatek,mt7623-m4u
>   - const: mediatek,mt2701-m4u
> 
> And that has to be under a 'oneOf' with the rest of this.

Thanks for the review. Is this OK?

  compatible:
oneOf:
  - const: mediatek,mt2701-m4u # mt2701 generation one HW
  - const: mediatek,mt2712-m4u # mt2712 generation two HW
  - const: mediatek,mt6779-m4u # mt6779 generation two HW
  - const: mediatek,mt8173-m4u # mt8173 generation two HW
  - const: mediatek,mt8183-m4u # mt8183 generation two HW
  - const: mediatek,mt8192-m4u # mt8192 generation two HW

  - description: mt7623 generation one HW
items:
  - const: mediatek,mt7623-m4u
  - const: mediatek,mt2701-m4u

> 
> > +  - mediatek,mt8173-m4u #mt8173 generation two HW
> > +  - mediatek,mt8183-m4u #mt8183 generation two HW
> > +
> > +  reg:
> > +maxItems: 1

[snip]


Re: [PATCH 6/6] dma-mapping: introduce DMA range map, supplanting dma_pfn_offset

2020-09-14 Thread Christoph Hellwig
On Mon, Sep 14, 2020 at 05:01:47PM -0600, Mathieu Poirier wrote:

[700 lines of the fullquote deleted..]

> > +   for (r = map; r->size; r++)
> > +   num_ranges++;
> > +
> > +   new_map = kmemdup(map, array_size(num_ranges + 1, sizeof(*map)),
> > + GFP_KERNEL);
> > +   if (!new_map)
> > +   return -ENOMEM;
> > +   to->dma_range_map = new_map;
> > +   return 0;
> > +}
> > +
> 
> This patch seemed OK to me but it broke the stm32 remoteproc
> implementation.  When I tested things out, the function
> dma_coerce_mask_and_coherent() returns -5 and the rest of the
> initialisation fails.  I isolated things to function dma_to_pfn()
> [2].  In the original implementation __bus_to_pfn() returns 0xf and
> dev->dma_pfn_offset is equal to 0x38000.  As such the function returns
> 0x137fff and dma_supported() a non-zero value [3].
> 
> With this set, function dma_to_pfn() received a face lift.  Function
> __bus_to_pfn() still returns 0xf but translate_dma_to_phys() returns 0,
> which forces dma_supported() to also return 0, and that is where the
> -5 (-EIO) comes from.
> 
> Taking a further look at translate_dma_to_phys(), @dma_addr never falls
> within the bus_dma_region ranges, so it returns 0.
> 
> I'm suspecting an initialisation problem, and if it occurred here, it
> will likely show up elsewhere.

Can you try this incremental patch?

diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index 088c97181ab146..c6b21acba7a459 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -46,7 +46,7 @@ static inline phys_addr_t translate_dma_to_phys(struct device 
*dev,
if (dma_addr >= m->dma_start && dma_addr - m->dma_start < 
m->size)
return (phys_addr_t)dma_addr + m->offset;
 
-   return 0;
+   return (phys_addr_t)-1;
 }
 
 #ifdef CONFIG_ARCH_HAS_PHYS_TO_DMA
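
For context, a sketch of the whole lookup this one-liner changes,
reconstructed from the hunks posted in this series (field names follow the
posted patch; illustrative, not the final merged code). Returning
(phys_addr_t)-1 instead of 0 lets callers such as dma_to_pfn() and
dma_supported() tell "no matching bus_dma_region" apart from a genuine
physical address of 0, which is what broke the stm32 case above:

	/* Reconstructed sketch; illustrative only. */
	struct bus_dma_region {
		phys_addr_t	cpu_start;
		dma_addr_t	dma_start;
		u64		size;
		u64		offset;		/* cpu_start - dma_start */
	};

	static inline phys_addr_t translate_dma_to_phys(struct device *dev,
							dma_addr_t dma_addr)
	{
		const struct bus_dma_region *m;

		for (m = dev->dma_range_map; m->size; m++)
			if (dma_addr >= m->dma_start &&
			    dma_addr - m->dma_start < m->size)
				return (phys_addr_t)dma_addr + m->offset;

		/* No range matched: fail loudly rather than return phys 0. */
		return (phys_addr_t)-1;
	}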


RE: [PATCH v7 04/16] vfio: Add PASID allocation/free support

2020-09-14 Thread Liu, Yi L
Hi Alex,

> From: Alex Williamson 
> Sent: Saturday, September 12, 2020 4:55 AM
> 
> On Thu, 10 Sep 2020 03:45:21 -0700
> Liu Yi L  wrote:
> 
> > Shared Virtual Addressing (a.k.a. Shared Virtual Memory) allows sharing
> > multiple process virtual address spaces with the device for a simplified
> > programming model. PASID is used to tag a virtual address space in
> > DMA requests and to identify the related translation structure in
> > IOMMU. When a PASID-capable device is assigned to a VM, we want the
> > same capability of using PASID to tag guest process virtual address
> > spaces to achieve virtual SVA (vSVA).
> >
> > PASID management for guest is vendor specific. Some vendors (e.g.
> > Intel VT-d) require system-wide managed PASIDs across all devices,
> > regardless of whether a device is used by host or assigned to guest.
> > Other vendors (e.g. ARM SMMU) may allow PASIDs managed per-device thus
> > could be fully delegated to the guest for assigned devices.
> >
> > For system-wide managed PASIDs, this patch introduces a vfio module to
> > handle explicit PASID alloc/free requests from guest. Allocated PASIDs
> > are associated to a process (or, mm_struct) in IOASID core. A vfio_mm
> > object is introduced to track mm_struct. Multiple VFIO containers
> > within a process share the same vfio_mm object.
> >
> > A quota mechanism is provided to prevent a malicious user from
> > exhausting available PASIDs. Currently the quota is a global parameter
> > applied to all VFIO devices. In the future per-device quota might be 
> > supported too.
> >
> > Cc: Kevin Tian 
> > CC: Jacob Pan 
> > Cc: Eric Auger 
> > Cc: Jean-Philippe Brucker 
> > Cc: Joerg Roedel 
> > Cc: Lu Baolu 
> > Suggested-by: Alex Williamson 
> > Signed-off-by: Liu Yi L 
> > Reviewed-by: Eric Auger 
> > ---
> > v6 -> v7:
> > *) remove "#include " and add r-b from Eric Auger.
> >
> > v5 -> v6:
> > *) address comments from Eric. Add vfio_unlink_pasid() to be consistent
> >with vfio_unlink_dma(). Add a comment in vfio_pasid_exit().
> >
> > v4 -> v5:
> > *) address comments from Eric Auger.
> > *) address the comments from Alex on the pasid free range support. Added
> >per vfio_mm pasid r-b tree.
> >https://lore.kernel.org/kvm/20200709082751.32074...@x1.home/
> >
> > v3 -> v4:
> > *) fix lock leak in vfio_mm_get_from_task()
> > *) drop pasid_quota field in struct vfio_mm
> > *) vfio_mm_get_from_task() returns ERR_PTR(-ENOTTY) when
> > !CONFIG_VFIO_PASID
> >
> > v1 -> v2:
> > *) added in v2, split from the pasid alloc/free support of v1
> > ---
> >  drivers/vfio/Kconfig  |   5 +
> >  drivers/vfio/Makefile |   1 +
> >  drivers/vfio/vfio_pasid.c | 247
> ++
> >  include/linux/vfio.h  |  28 ++
> >  4 files changed, 281 insertions(+)
> >  create mode 100644 drivers/vfio/vfio_pasid.c
> >
> > diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig index
> > fd17db9..3d8a108 100644
> > --- a/drivers/vfio/Kconfig
> > +++ b/drivers/vfio/Kconfig
> > @@ -19,6 +19,11 @@ config VFIO_VIRQFD
> > depends on VFIO && EVENTFD
> > default n
> >
> > +config VFIO_PASID
> > +   tristate
> > +   depends on IOASID && VFIO
> > +   default n
> > +
> >  menuconfig VFIO
> > tristate "VFIO Non-Privileged userspace driver framework"
> > depends on IOMMU_API
> > diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile index
> > de67c47..bb836a3 100644
> > --- a/drivers/vfio/Makefile
> > +++ b/drivers/vfio/Makefile
> > @@ -3,6 +3,7 @@ vfio_virqfd-y := virqfd.o
> >
> >  obj-$(CONFIG_VFIO) += vfio.o
> >  obj-$(CONFIG_VFIO_VIRQFD) += vfio_virqfd.o
> > +obj-$(CONFIG_VFIO_PASID) += vfio_pasid.o
> >  obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o
> >  obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o
> >  obj-$(CONFIG_VFIO_SPAPR_EEH) += vfio_spapr_eeh.o diff --git
> > a/drivers/vfio/vfio_pasid.c b/drivers/vfio/vfio_pasid.c new file mode
> > 100644 index 000..44ecdd5
> > --- /dev/null
> > +++ b/drivers/vfio/vfio_pasid.c
> > @@ -0,0 +1,247 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * Copyright (C) 2020 Intel Corporation.
> > + * Author: Liu Yi L 
> > + *
> > + */
> > +
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#define DRIVER_VERSION  "0.1"
> > +#define DRIVER_AUTHOR   "Liu Yi L "
> > +#define DRIVER_DESC "PASID management for VFIO bus drivers"
> > +
> > +#define VFIO_DEFAULT_PASID_QUOTA   1000
> 
> I'm not sure we really need a macro to define this since it's only used
> once, but a comment discussing the basis for this default value would
> be useful.

Yep, I may remove the macro. 1000 is actually a value that came from an
offline discussion with Jacob, and was first mentioned in the link below.
Since we don't have much data to decide a default quota today, we'd like
to make 1000 the default quota as a start. In the future we would give
the administrator the ability to tune the quota.

https://lore.kernel.org/kvm/a297566
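
A sketch of what such an administrator-tunable default could look like (a
hypothetical replacement for the macro; the parameter name is made up and
is not part of the posted series):

	/* Hypothetical sketch; not part of the posted series. */
	static unsigned int pasid_quota = 1000; /* default per-mm PASID quota */
	module_param(pasid_quota, uint, 0444);
	MODULE_PARM_DESC(pasid_quota,
			 "Maximum number of PASIDs a single mm may allocate");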

Re: [PATCH v7 3/9] docs: x86: Add documentation for SVA (Shared Virtual Addressing)

2020-09-14 Thread Fenghua Yu
Hi, Randy,

On Sat, Sep 05, 2020 at 10:54:59AM -0700, Randy Dunlap wrote:
> Hi,
> 
> I'll add a few edits other than those that Borislav made.
> (nice review job, BP)
> 
> 
> On 8/27/20 8:06 AM, Fenghua Yu wrote:
> > From: Ashok Raj 
> > 
> > ENQCMD and Data Streaming Accelerator (DSA) and all of their associated
> > features are a complicated stack with lots of interconnected pieces.
> > This documentation provides a big picture overview for all of the
> > features.
> > 
> > Signed-off-by: Ashok Raj 
> > Co-developed-by: Fenghua Yu 
> > Signed-off-by: Fenghua Yu 
> > Reviewed-by: Tony Luck 
> > ---
> > diff --git a/Documentation/x86/sva.rst b/Documentation/x86/sva.rst
> > new file mode 100644
> > index ..6e7ac565e127
> > --- /dev/null
> > +++ b/Documentation/x86/sva.rst
> > @@ -0,0 +1,254 @@
> > +MMIO. This doesn't scale as the number of threads becomes quite large. The
> > +hardware also manages the queue depth for Shared Work Queues (SWQ), and
> > +consumers don't need to track queue depth. If there is no space to accept
> > +a command, the device will return an error indicating retry. Also
> > +submitting a command to an MMIO address that can't accept ENQCMD will
> > +return retry in response. In the new DMWr PCIe terminology, devices need to
> 
> so how does a submitter know whether a return of "retry" means no_space or
> invalid_for_this_device?

I will add "A user should check the Deferrable Memory Write (DMWr)
capability on the device and only submit ENQCMD when the device supports
it."

So the user doesn't need to distinguish between "no space" and "invalid
for this device" errors.
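
To make the retry semantics concrete, a hedged user-side sketch; enqcmd()
below stands in for whatever instruction wrapper the toolchain provides
(an intrinsic or inline asm) and is not a real library call:

	/* Hedged sketch; enqcmd() is a stand-in for the actual instruction
	 * wrapper, and the retry policy is illustrative. */
	static int submit_descriptor(void *portal, const void *desc /* 64B */)
	{
		int retries = 16;

		while (retries--) {
			/* ENQCMD performs a non-posted 64-byte write; a
			 * "retry" result (ZF set) means the queue is full --
			 * after the DMWr capability check above, that is the
			 * only remaining reason. */
			if (!enqcmd(portal, desc))
				return 0;
		}
		return -EAGAIN;
	}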

All of your other comments will be addressed in the next version.

Thank you very much for your comments!

-Fenghua


Re: [PATCH v3 0/6] Convert the intel iommu driver to the dma-iommu api

2020-09-14 Thread Lu Baolu

Hi Tvrtko,

On 9/14/20 4:04 PM, Tvrtko Ursulin wrote:


Hi,

On 12/09/2020 04:21, Lu Baolu wrote:

Tom Murphy has almost done all the work. His latest patch series was
posted here.

https://lore.kernel.org/linux-iommu/20200903201839.7327-1-murph...@tcd.ie/

Thanks a lot!

This series is a follow-up with below changes:

1. Add a quirk for the i915 driver issue described in Tom's cover
letter.


Last week I copied you on an i915 series which appears to remove the need
for this quirk. So if we get those i915 patches reviewed and merged, do you
still want to pursue this quirk?


It's up to the graphics guys. I don't know the details of the i915 driver.
I don't think my tests could cover all cases.




2. Fix several bugs in patch "iommu: Allow the dma-iommu api to use
bounce buffers" to make the bounce buffer work for untrusted devices.
3. Several cleanups in iommu/vt-d driver after the conversion.


With the previous version of the series I hit a problem on Ivybridge where
apparently the DMA engine width is not respected. At least that is my
layman's interpretation of the errors. From the older thread:

<3> [209.526605] DMAR: intel_iommu_map: iommu width (39) is not sufficient for 
the mapped address (008000)

Relevant iommu boot related messages are:

<6>[0.184234] DMAR: Host address width 36
<6>[0.184245] DMAR: DRHD base: 0x00fed9 flags: 0x0
<6>[0.184288] DMAR: dmar0: reg_base_addr fed9 ver 1:0 cap 
c020e60262 ecap f0101a
<6>[0.184308] DMAR: DRHD base: 0x00fed91000 flags: 0x1
<6>[0.184337] DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap 
c9008020660262 ecap f0105a
<6>[0.184357] DMAR: RMRR base: 0x00d8d28000 end: 0x00d8d46fff
<6>[0.184377] DMAR: RMRR base: 0x00db00 end: 0x00df1f
<6>[0.184398] DMAR-IR: IOAPIC id 2 under DRHD base  0xfed91000 IOMMU 1
<6>[0.184414] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
<6>[0.184428] DMAR-IR: Queued invalidation will be enabled to support 
x2apic and Intr-remapping.
<6>[0.185173] DMAR-IR: Enabled IRQ remapping in x2apic mode

<6>[0.878934] DMAR: No ATSR found
<6>[0.878966] DMAR: dmar0: Using Queued invalidation
<6>[0.879007] DMAR: dmar1: Using Queued invalidation

<6>[0.915032] DMAR: Intel(R) Virtualization Technology for Directed I/O
<6>[0.915060] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
<6>[0.915084] software IO TLB: mapped [mem 0xc80d4000-0xcc0d4000] (64MB)

(Full boot log at 
https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_7054/fi-ivb-3770/boot0.txt, 
failures at 
https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_7054/fi-ivb-3770/igt@i915_selftest@l...@blt.html.)

Does this look familiar or at least plausible to you? Is this something your 
new series has fixed?


This happens during attaching a domain to a device. It has nothing to do
with this patch series. I will look into this issue, but not in this
email thread context.

Best regards,
baolu



Regards,

Tvrtko



Please review and test.

Best regards,
baolu

Lu Baolu (2):
iommu: Add quirk for Intel graphic devices in map_sg
iommu/vt-d: Cleanup after converting to dma-iommu ops

Tom Murphy (4):
iommu: Handle freelists when using deferred flushing in iommu drivers
iommu: Add iommu_dma_free_cpu_cached_iovas()
iommu: Allow the dma-iommu api to use bounce buffers
iommu/vt-d: Convert intel iommu driver to the iommu ops

   .../admin-guide/kernel-parameters.txt |   5 -
   drivers/iommu/dma-iommu.c | 229 -
   drivers/iommu/intel/Kconfig   |   1 +
   drivers/iommu/intel/iommu.c   | 885 +++---
   include/linux/dma-iommu.h |   8 +
   include/linux/iommu.h |   1 +
   6 files changed, 323 insertions(+), 806 deletions(-)




Re: [PATCH v2 06/23] dt-bindings: mediatek: Add binding for mt8192 IOMMU and SMI

2020-09-14 Thread Rob Herring
On Sat, 05 Sep 2020 16:09:03 +0800, Yong Wu wrote:
> This patch adds descriptions for mt8192 IOMMU and SMI.
> 
> mt8192 also is MTK IOMMU gen2 which uses ARM Short-Descriptor translation
> table format. The M4U-SMI HW diagram is as below:
> 
>   EMI
>|
>   M4U
>|
>   --
>SMI Common
>   --
>|
>   +---+--+--+--+---+
>   |   |  |  |   .. |   |
>   |   |  |  |  |   |
> larb0   larb1  larb2  larb4 ..  larb19   larb20
> disp0   disp1   mdpvdec   IPE  IPE
> 
> All the connections are HW fixed; SW can NOT adjust them.
> 
> mt8192 M4U supports a 0~16GB iova range. We preassign different engines
> into different iova ranges:
> 
> domain-id  module iova-range  larbs
>0   disp0 ~ 4G  larb0/1
>1   vcodec  4G ~ 8G larb4/5/7
>2   cam/mdp 8G ~ 12G larb2/9/11/13/14/16/17/18/19/20
>3   CCU00x4000_ ~ 0x43ff_ larb13: port 9/10
>4   CCU10x4400_ ~ 0x47ff_ larb14: port 4/5
> 
> The iova range for CCU0/1(camera control unit) is HW requirement.
> 
> Signed-off-by: Yong Wu 
> ---
>  .../bindings/iommu/mediatek,iommu.yaml|   9 +-
>  .../mediatek,smi-common.yaml  |   5 +-
>  .../memory-controllers/mediatek,smi-larb.yaml |   3 +-
>  include/dt-bindings/memory/mt8192-larb-port.h | 239 ++
>  4 files changed, 251 insertions(+), 5 deletions(-)
>  create mode 100644 include/dt-bindings/memory/mt8192-larb-port.h
> 

Reviewed-by: Rob Herring 


Re: [PATCH v2 02/23] dt-bindings: memory: mediatek: Convert SMI to DT schema

2020-09-14 Thread Rob Herring
On Sat, Sep 05, 2020 at 04:08:59PM +0800, Yong Wu wrote:
> Convert MediaTek SMI to DT schema.
> 
> Signed-off-by: Yong Wu 
> ---
>  .../mediatek,smi-common.txt   | 49 --
>  .../mediatek,smi-common.yaml  | 96 +++
>  .../memory-controllers/mediatek,smi-larb.txt  | 49 --
>  .../memory-controllers/mediatek,smi-larb.yaml | 85 
>  4 files changed, 181 insertions(+), 98 deletions(-)
>  delete mode 100644 
> Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt
>  create mode 100644 
> Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
>  delete mode 100644 
> Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.txt
>  create mode 100644 
> Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml
> 
> diff --git 
> a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt
>  
> b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt
> deleted file mode 100644
> index b64573680b42..
> --- 
> a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt
> +++ /dev/null
> @@ -1,49 +0,0 @@
> -SMI (Smart Multimedia Interface) Common
> -
> -The hardware block diagram please check bindings/iommu/mediatek,iommu.txt
> -
> -Mediatek SMI have two generations of HW architecture, here is the list
> -which generation the SoCs use:
> -generation 1: mt2701 and mt7623.
> -generation 2: mt2712, mt6779, mt8173 and mt8183.
> -
> -There's slight differences between the two SMI, for generation 2, the
> -register which control the iommu port is at each larb's register base. But
> -for generation 1, the register is at smi ao base(smi always on register
> -base). Besides that, the smi async clock should be prepared and enabled for
> -SMI generation 1 to transform the smi clock into emi clock domain, but that 
> is
> -not needed for SMI generation 2.
> -
> -Required properties:
> -- compatible : must be one of :
> - "mediatek,mt2701-smi-common"
> - "mediatek,mt2712-smi-common"
> - "mediatek,mt6779-smi-common"
> - "mediatek,mt7623-smi-common", "mediatek,mt2701-smi-common"
> - "mediatek,mt8173-smi-common"
> - "mediatek,mt8183-smi-common"
> -- reg : the register and size of the SMI block.
> -- power-domains : a phandle to the power domain of this local arbiter.
> -- clocks : Must contain an entry for each entry in clock-names.
> -- clock-names : must contain 3 entries for generation 1 smi HW and 2 entries
> -  for generation 2 smi HW as follows:
> -  - "apb" : Advanced Peripheral Bus clock, It's the clock for setting
> - the register.
> -  - "smi" : It's the clock for transfer data and command.
> - They may be the same if both source clocks are the same.
> -  - "async" : asynchronous clock, it help transform the smi clock into the 
> emi
> -   clock domain, this clock is only needed by generation 1 smi HW.
> -  and these 2 option clocks for generation 2 smi HW:
> -  - "gals0": the path0 clock of GALS(Global Async Local Sync).
> -  - "gals1": the path1 clock of GALS(Global Async Local Sync).
> -  Here is the list which has this GALS: mt6779 and mt8183.
> -
> -Example:
> - smi_common: smi@14022000 {
> - compatible = "mediatek,mt8173-smi-common";
> - reg = <0 0x14022000 0 0x1000>;
> - power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
> - clocks = <&mmsys CLK_MM_SMI_COMMON>,
> -  <&mmsys CLK_MM_SMI_COMMON>;
> - clock-names = "apb", "smi";
> - };
> diff --git 
> a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
>  
> b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
> new file mode 100644
> index ..781d7b7617d9
> --- /dev/null
> +++ 
> b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
> @@ -0,0 +1,96 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: 
> http://devicetree.org/schemas/memory-controllers/mediatek,smi-common.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: SMI (Smart Multimedia Interface) Common
> +
> +maintainers:
> +  - Yong Wu 
> +
> +description: |+
> +  The hardware block diagram please check bindings/iommu/mediatek,iommu.yaml
> +
> +  MediaTek SMI have two generations of HW architecture, here is the list
> +  which generation the SoCs use:
> +  generation 1: mt2701 and mt7623.
> +  generation 2: mt2712, mt6779, mt8173 and mt8183.
> +
> +  There's slight differences between the two SMI, for generation 2, the
> +  register which control the iommu port is at each larb's register base. But
> +  for generation 1, the register is at smi ao base(smi always on register
> +  base). Besides that, the smi async clock should be prepared and enabled for
> +  SMI generation 1 to transform the smi clock in

Re: [PATCH v2 01/23] dt-bindings: iommu: mediatek: Convert IOMMU to DT schema

2020-09-14 Thread Rob Herring
On Sat, Sep 05, 2020 at 04:08:58PM +0800, Yong Wu wrote:
> Convert MediaTek IOMMU to DT schema.
> 
> Signed-off-by: Yong Wu 
> ---
>  .../bindings/iommu/mediatek,iommu.txt | 103 
>  .../bindings/iommu/mediatek,iommu.yaml| 150 ++
>  2 files changed, 150 insertions(+), 103 deletions(-)
>  delete mode 100644 Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
>  create mode 100644 
> Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
> 
> diff --git a/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt 
> b/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
> deleted file mode 100644
> index c1ccd8582eb2..
> --- a/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
> +++ /dev/null
> @@ -1,103 +0,0 @@
> -* Mediatek IOMMU Architecture Implementation
> -
> -  Some Mediatek SOCs contain a Multimedia Memory Management Unit (M4U), and
> -this M4U have two generations of HW architecture. Generation one uses flat
> -pagetable, and only supports 4K size page mapping. Generation two uses the
> -ARM Short-Descriptor translation table format for address translation.
> -
> -  About the M4U Hardware Block Diagram, please check below:
> -
> -  EMI (External Memory Interface)
> -   |
> -  m4u (Multimedia Memory Management Unit)
> -   |
> -  ++
> -  ||
> -  gals0-rx   gals1-rx(Global Async Local Sync rx)
> -  ||
> -  ||
> -  gals0-tx   gals1-tx(Global Async Local Sync tx)
> -  ||  Some SoCs may have GALS.
> -  ++
> -   |
> -   SMI Common(Smart Multimedia Interface Common)
> -   |
> -   ++---
> -   ||
> -   | gals-rxThere may be GALS in some larbs.
> -   ||
> -   ||
> -   | gals-tx
> -   ||
> -   SMI larb0SMI larb1   ... SoCs have several SMI local 
> arbiter(larb).
> -   (display) (vdec)
> -   ||
> -   ||
> - +-+-+ +++
> - | | | |||
> - | | |...  |||  ... There are different ports in each larb.
> - | | | |||
> -OVL0 RDMA0 WDMA0  MC   PP   VLD
> -
> -  As above, The Multimedia HW will go through SMI and M4U while it
> -access EMI. SMI is a bridge between m4u and the Multimedia HW. It contain
> -smi local arbiter and smi common. It will control whether the Multimedia
> -HW should go though the m4u for translation or bypass it and talk
> -directly with EMI. And also SMI help control the power domain and clocks for
> -each local arbiter.
> -  Normally we specify a local arbiter(larb) for each multimedia HW
> -like display, video decode, and camera. And there are different ports
> -in each larb. Take a example, There are many ports like MC, PP, VLD in the
> -video decode local arbiter, all these ports are according to the video HW.
> -  In some SoCs, there may be a GALS(Global Async Local Sync) module between
> -smi-common and m4u, and additional GALS module between smi-larb and
> -smi-common. GALS can been seen as a "asynchronous fifo" which could help
> -synchronize for the modules in different clock frequency.
> -
> -Required properties:
> -- compatible : must be one of the following string:
> - "mediatek,mt2701-m4u" for mt2701 which uses generation one m4u HW.
> - "mediatek,mt2712-m4u" for mt2712 which uses generation two m4u HW.
> - "mediatek,mt6779-m4u" for mt6779 which uses generation two m4u HW.
> - "mediatek,mt7623-m4u", "mediatek,mt2701-m4u" for mt7623 which uses
> -  generation one m4u HW.
> - "mediatek,mt8173-m4u" for mt8173 which uses generation two m4u HW.
> - "mediatek,mt8183-m4u" for mt8183 which uses generation two m4u HW.
> -- reg : m4u register base and size.
> -- interrupts : the interrupt of m4u.
> -- clocks : must contain one entry for each clock-names.
> -- clock-names : Only 1 optional clock:
> -  - "bclk": the block clock of m4u.
> -  Here is the list which require this "bclk":
> -  - mt2701, mt2712, mt7623 and mt8173.
> -  Note that m4u use the EMI clock which always has been enabled before kernel
> -  if there is no this "bclk".
> -- mediatek,larbs : List of phandle to the local arbiters in the current Socs.
> - Refer to bindings/memory-controllers/mediatek,smi-larb.txt. It must sort
> - according to the local arbiter index, like larb0, larb1, larb2...
> -- iommu-cells : must be 1. This is the mtk_m4u_id according to the HW.
> - Specifies the mtk_m4u_id as defined in
> - dt-binding/memory/mt2701-larb-port.h for mt2701, mt7623
> - dt-binding/memory/mt2712-larb-port.h for mt2712,
> - dt-binding/memory/mt6779-larb-port.h for mt6779,
> - dt-binding/me

Re: [PATCH 6/6] dma-mapping: introduce DMA range map, supplanting dma_pfn_offset

2020-09-14 Thread Mathieu Poirier
Hi Christoph,

On Mon, Sep 14, 2020 at 09:33:43AM +0200, Christoph Hellwig wrote:
> From: Jim Quinlan 
> 
> The new field 'dma_range_map' in struct device is used to facilitate the
> use of single or multiple offsets between mapping regions of cpu addrs and
> dma addrs.  It subsumes the role of "dev->dma_pfn_offset" which was only
> capable of holding a single uniform offset and had no region bounds
> checking.
> 
> The function of_dma_get_range() has been modified so that it takes a single
> argument -- the device node -- and returns a map, NULL, or an error code.
> The map is an array that holds the information regarding the DMA regions.
> Each range entry contains the address offset, the cpu_start address, the
> dma_start address, and the size of the region.
> 
> of_dma_configure() is the typical manner to set range offsets but there are
> a number of ad hoc assignments to "dev->dma_pfn_offset" in the kernel
> driver code.  These cases now invoke the function
> dma_direct_set_offset(dev, cpu_addr, dma_addr, size).
> 
> Signed-off-by: Jim Quinlan 
> [hch: various interface cleanups]
> Signed-off-by: Christoph Hellwig 
> Tested-by: Nathan Chancellor 
> ---
>  arch/arm/include/asm/dma-direct.h |  9 +--
>  arch/arm/mach-keystone/keystone.c | 17 ++---
>  arch/arm/mach-omap1/include/mach/memory.h |  4 +
>  arch/sh/drivers/pci/pcie-sh7786.c |  9 ++-
>  arch/x86/pci/sta2x11-fixup.c  |  6 +-
>  drivers/acpi/arm64/iort.c |  6 +-
>  drivers/base/core.c   |  2 +
>  drivers/gpu/drm/sun4i/sun4i_backend.c |  8 +-
>  drivers/iommu/io-pgtable-arm.c|  2 +-
>  .../platform/sunxi/sun4i-csi/sun4i_csi.c  |  9 ++-
>  .../platform/sunxi/sun6i-csi/sun6i_csi.c  | 11 ++-
>  drivers/of/address.c  | 73 ---
>  drivers/of/device.c   | 44 ++-
>  drivers/of/of_private.h   | 11 +--
>  drivers/of/unittest.c | 34 ++---
>  drivers/remoteproc/remoteproc_core.c  | 24 +-
>  .../staging/media/sunxi/cedrus/cedrus_hw.c| 10 ++-
>  include/linux/device.h|  4 +-
>  include/linux/dma-direct.h| 52 +++--
>  include/linux/dma-mapping.h   |  9 ++-
>  kernel/dma/coherent.c |  7 +-
>  kernel/dma/direct.c   | 51 -
>  22 files changed, 284 insertions(+), 118 deletions(-)
> 
> diff --git a/arch/arm/include/asm/dma-direct.h 
> b/arch/arm/include/asm/dma-direct.h
> index fbcf4367b5cb1a..436544aeb83405 100644
> --- a/arch/arm/include/asm/dma-direct.h
> +++ b/arch/arm/include/asm/dma-direct.h
> @@ -12,8 +12,8 @@
>  #ifndef __arch_pfn_to_dma
>  static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
>  {
> - if (dev)
> - pfn -= dev->dma_pfn_offset;
> + if (dev && dev->dma_range_map)
> + pfn = PFN_DOWN(translate_phys_to_dma(dev, PFN_PHYS(pfn)));
>   return (dma_addr_t)__pfn_to_bus(pfn);
>  }
>  
> @@ -21,9 +21,8 @@ static inline unsigned long dma_to_pfn(struct device *dev, 
> dma_addr_t addr)
>  {
>   unsigned long pfn = __bus_to_pfn(addr);
>  
> - if (dev)
> - pfn += dev->dma_pfn_offset;
> -
> + if (dev && dev->dma_range_map)
> + pfn = PFN_DOWN(translate_dma_to_phys(dev, PFN_PHYS(pfn)));
>   return pfn;
>  }
>  
> diff --git a/arch/arm/mach-keystone/keystone.c 
> b/arch/arm/mach-keystone/keystone.c
> index dcd031ba84c2e0..09a65c2dfd7327 100644
> --- a/arch/arm/mach-keystone/keystone.c
> +++ b/arch/arm/mach-keystone/keystone.c
> @@ -8,6 +8,7 @@
>   */
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -25,8 +26,6 @@
>  #include "keystone.h"
>  
>  #ifdef CONFIG_ARM_LPAE
> -static unsigned long keystone_dma_pfn_offset __read_mostly;
> -
>  static int keystone_platform_notifier(struct notifier_block *nb,
> unsigned long event, void *data)
>  {
> @@ -39,9 +38,12 @@ static int keystone_platform_notifier(struct 
> notifier_block *nb,
>   return NOTIFY_BAD;
>  
>   if (!dev->of_node) {
> - dev->dma_pfn_offset = keystone_dma_pfn_offset;
> - dev_err(dev, "set dma_pfn_offset%08lx\n",
> - dev->dma_pfn_offset);
> + int ret = dma_direct_set_offset(dev, KEYSTONE_HIGH_PHYS_START,
> + KEYSTONE_LOW_PHYS_START,
> + KEYSTONE_HIGH_PHYS_SIZE);
> + dev_err(dev, "set dma_offset%08llx%s\n",
> + KEYSTONE_HIGH_PHYS_START - KEYSTONE_LOW_PHYS_START,
> + ret ? " failed" : "");
>   }
>   return NOTIFY_OK;
>  }
> @@ -54,11 +56,8 @@ static struct notifier_block platform_nb = {
>  static void __init keystone_init(void)
>  {
>  #ifd

Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs

2020-09-14 Thread Raj, Ashok
Hi Jason,

I thought we discussed this at LPC, but this still seems to be going in
circles :-(.


On Mon, Sep 14, 2020 at 04:00:57PM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 14, 2020 at 12:23:28PM -0600, Alex Williamson wrote:
> > On Mon, 14 Sep 2020 14:41:21 -0300
> > Jason Gunthorpe  wrote:
> > 
> > > On Mon, Sep 14, 2020 at 10:58:57AM -0600, Alex Williamson wrote:
> > >  
> > > > "its own special way" is arguable, VFIO is just making use of what's
> > > > being proposed as the uapi via its existing IOMMU interface.  
> > > 
> > > I mean, if we have a /dev/sva then it makes no sense to extend the
> > > VFIO interfaces with the same stuff. VFIO should simply accept a PASID
> > > created from /dev/sva and use it just like any other user-DMA driver
> > > would.
> > 
> > I don't think that's absolutely true.  By the same logic, we could say
> > that pci-sysfs provides access to PCI BAR and config space
> > resources,
> 
> No, it is the reverse, VFIO is a better version of pci-sysfs, so
> pci-sysfs is the one that is obsoleted by VFIO. Similarly a /dev/sva
> would be the superset interface for PASID, so whatever VFIO has would
> be obsoleted.

As you suggested earlier in the mail thread, could Jason Wang maybe
build out what it takes to have a full-fledged /dev/sva interface for vDPA
and figure out how the interfaces should emerge? Otherwise it appears
everyone is talking at a very high level, with a limited understanding of
how things work at the moment.

As Kevin pointed out there are several aspects, and a real prototype from
interested people would be the best way to understand the easy/hard aspects
of moving between the proposals.

- PASID allocation and life cycle management
  Managing both 1-1 (as it's done today) and also supporting a guest PASID
  space. (Supporting a guest PASID range is required for migration, I suppose.)
- Page request processing.
- Interaction with vIOMMU, vSVA requires vIOMMU for supporting
  invalidations, forwarding prq and such.
- Supporting ENQCMD in the guest. (Today it's just in Intel products, but
  it's also submitted to the PCIe SIG.) If you are a member you should be
  able to see that. FWIW, it might already be open for public review; if
  not now, maybe pretty soon.
  
  For Intel we have some KVM interaction setting up the guest pasid->host
  pasid interfaces.

This has to move ahead of these email discussions; hoping someone with
the right ideas will help move this forward.

Cheers,
Ashok




Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs

2020-09-14 Thread Alex Williamson
On Mon, 14 Sep 2020 16:00:57 -0300
Jason Gunthorpe  wrote:

> On Mon, Sep 14, 2020 at 12:23:28PM -0600, Alex Williamson wrote:
> > On Mon, 14 Sep 2020 14:41:21 -0300
> > Jason Gunthorpe  wrote:
> >   
> > > On Mon, Sep 14, 2020 at 10:58:57AM -0600, Alex Williamson wrote:
> > >
> > > > "its own special way" is arguable, VFIO is just making use of what's
> > > > being proposed as the uapi via its existing IOMMU interface.
> > > 
> > > I mean, if we have a /dev/sva then it makes no sense to extend the
> > > VFIO interfaces with the same stuff. VFIO should simply accept a PASID
> > > created from /dev/sva and use it just like any other user-DMA driver
> > > would.  
> > 
> > I don't think that's absolutely true.  By the same logic, we could say
> > that pci-sysfs provides access to PCI BAR and config space
> > resources,  
> 
> No, it is the reverse, VFIO is a better version of pci-sysfs, so
> pci-sysfs is the one that is obsoleted by VFIO. Similarly a /dev/sva
> > would be the superset interface for PASID, so whatever VFIO has would
> be obsoleted.
> 
> It would be very unusual for the kernel to have two 'preferred'
> interfaces for the same thing, IMHO. The review process for uAPI
> should really prevent that by allowing all interests to be served
> while the uAPI is designed.
> 
> > the VFIO device interface duplicates part of that interface therefore it
> > should be abandoned.  But in reality, VFIO providing access to those
> > resources puts those accesses within the scope and control of the VFIO
> > interface.  
> 
> Not clear to me why VFIO needs that. PASID seems quite orthogonal from
> VFIO to me.


Can you explain that further, or spit-ball what you think this /dev/sva
interface looks like and how a user might interact between vfio and
this new interface?  The interface proposed here definitely does not
seem orthogonal to the vfio IOMMU interface, ie. selecting a specific
IOMMU domain mode during vfio setup, allocating pasids and associating
them with page tables for that two-stage IOMMU setup, performing cache
invalidations based on page table updates, etc.  How does it make more
sense for a vIOMMU to setup some aspects of the IOMMU through vfio and
others through a TBD interface?


> > > This has already happened, the SVA patches generally allow unpriv user
> > > space to allocate a PASID for their process.
> > > 
> > > If a device implements a mdev shared with a kernel driver (like IDXD)
> > > then it will be sharing that PASID pool across both drivers. In this
> > > case it makes no sense that VFIO has PASID quota logic because it has
> > > an incomplete view. It could only make sense if VFIO is the exclusive
> > > owner of the bus/device/function.
> > > 
> > > The tracking logic needs to be global.. Most probably in some kind of
> > > PASID cgroup controller?  
> > 
> > AIUI, that doesn't exist yet, so it makes sense that VFIO, as the
> > mechanism through which a user would allocate a PASID,   
> 
> VFIO is not the exclusive user interface for PASID. Other SVA drivers
> will allocate PASIDs. Any quota has to be implemented by the IOMMU
> layer, and shared across all drivers.


The IOMMU needs to allocate PASIDs, so in that sense it enforces a
quota via the architectural limits, but is the IOMMU layer going to
distinguish in-kernel versus user limits?  A cgroup limit seems like a
good idea, but that's not really at the IOMMU layer either and I don't
see that a /dev/sva and vfio interface couldn't both support a cgroup
type quota.

 
> > space.  Also, "unprivileged user" is a bit of a misnomer in this
> > context as the VFIO user must be privileged with ownership of a device
> > before they can even participate in PASID allocation.  Is truly
> > unprivileged access reasonable for a limited resource?  
> 
> I'm not talking about VFIO, I'm talking about the other SVA drivers. I
> expect some of them will be unpriv safe, like IDXD, for
> instance.
> 
> Some way to manage the limited PASID resource will be necessary beyond
> just VFIO.

And it's not clear that they'll have compatible requirements.  A
userspace idxd driver might have limited needs versus a vIOMMU backend.
Does a single quota model adequately support both or are we back to the
differences between access to a device and ownership of a device?
Maybe a single pasid per user makes sense in the former.  If we could
bring this discussion to some sort of more concrete proposal it might
be easier to weigh the choices.
 
> > QEMU typically runs in a sandbox with limited access, when a device or
> > mdev is assigned to a VM, file permissions are configured to allow that
> > access.  QEMU doesn't get to poke at any random dev file it likes,
> > that's part of how userspace reduces the potential attack surface.  
> 
> Plumbing the exact same APIs through VFIO's uAPI vs /dev/sva doesn't
> reduce the attack surface. qemu can simply include /dev/sva in the
> sandbox when using VFIO with no increase in attack surface from this
> proposed series.

Re: [PATCH v2 1/4] dt-bindings: reserved-memory: Document "active" property

2020-09-14 Thread Rob Herring
On Fri, Sep 04, 2020 at 02:59:57PM +0200, Thierry Reding wrote:
> From: Thierry Reding 
> 
> Reserved memory regions can be marked as "active" if hardware is
> expected to access the regions during boot and before the operating
> system can take control. One example where this is useful is for the
> operating system to infer whether the region needs to be identity-
> mapped through an IOMMU.

I like simple solutions, but this hardly seems adequate to solve the 
problem of passing IOMMU setup from bootloader/firmware to the OS. Like 
what is the IOVA that's supposed to be used if identity mapping is not 
used?

If you know enough about the regions to assume identity mapping, then 
can't you know if active or not?

> Signed-off-by: Thierry Reding 
> ---
>  .../bindings/reserved-memory/reserved-memory.txt   | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git 
> a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt 
> b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> index 4dd20de6977f..163d2927e4fc 100644
> --- a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> +++ b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> @@ -63,6 +63,13 @@ reusable (optional) - empty property
>able to reclaim it back. Typically that means that the operating
>system can use that region to store volatile or cached data that
>can be otherwise regenerated or migrated elsewhere.
> +active (optional) - empty property
> +- If this property is set for a reserved memory region, it indicates
> +  that some piece of hardware may be actively accessing this region.
> +  Should the operating system want to enable IOMMU protection for a
> +  device, all active memory regions must have been identity-mapped
> +  in order to ensure that non-quiescent hardware during boot can
> +  continue to access the memory.
>  
>  Linux implementation note:
>  - If a "linux,cma-default" property is present, then Linux will use the
> -- 
> 2.28.0
> 
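
For illustration, a hedged device-tree sketch of how the proposed property
might be used (node name, address and size are made up):

	reserved-memory {
		#address-cells = <2>;
		#size-cells = <2>;
		ranges;

		/* Framebuffer left scanning out by the bootloader. */
		fb: framebuffer@92cb2000 {
			reg = <0x0 0x92cb2000 0x0 0x00800000>;
			no-map;
			/* Hardware keeps accessing this region across boot,
			 * so an IOMMU-enabling OS must identity-map it. */
			active;
		};
	};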


[PATCH] irqchip/qcom-pdc: Allow QCOM_PDC to be loadable as a permanent module

2020-09-14 Thread John Stultz
Allows the qcom-pdc driver to be loaded as a permanent module.

An earlier version of this patch was merged in a larger patchset
but was reverted entirely when issues were found with other
drivers, so now that Marc has provided a better solution in his
Hybrid probing patch set, I wanted to re-submit this change.

Cc: Andy Gross 
Cc: Bjorn Andersson 
Cc: Joerg Roedel 
Cc: Thomas Gleixner 
Cc: Jason Cooper 
Cc: Marc Zyngier 
Cc: Linus Walleij 
Cc: Maulik Shah 
Cc: Lina Iyer 
Cc: Saravana Kannan 
Cc: Todd Kjos 
Cc: Greg Kroah-Hartman 
Cc: linux-arm-...@vger.kernel.org
Cc: iommu@lists.linux-foundation.org
Cc: linux-g...@vger.kernel.org
Signed-off-by: John Stultz 
---
 drivers/irqchip/Kconfig| 2 +-
 drivers/irqchip/qcom-pdc.c | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index bfc9719dbcdc..bb70b7177f94 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -425,7 +425,7 @@ config GOLDFISH_PIC
  for Goldfish based virtual platforms.
 
 config QCOM_PDC
-   bool "QCOM PDC"
+   tristate "QCOM PDC"
depends on ARCH_QCOM
select IRQ_DOMAIN_HIERARCHY
help
diff --git a/drivers/irqchip/qcom-pdc.c b/drivers/irqchip/qcom-pdc.c
index 8543fa23da10..59eb3c8473b0 100644
--- a/drivers/irqchip/qcom-pdc.c
+++ b/drivers/irqchip/qcom-pdc.c
@@ -433,3 +433,5 @@ static int qcom_pdc_init(struct device_node *node, struct 
device_node *parent)
 IRQCHIP_HYBRID_DRIVER_BEGIN(qcom_pdc)
 IRQCHIP_MATCH("qcom,pdc", qcom_pdc_init)
 IRQCHIP_HYBRID_DRIVER_END(qcom_pdc)
+MODULE_DESCRIPTION("Qualcomm Technologies, Inc. Power Domain Controller");
+MODULE_LICENSE("GPL v2");
-- 
2.17.1



Re: [PATCH v7 9/9] x86/mmu: Allocate/free PASID

2020-09-14 Thread Borislav Petkov
On Mon, Sep 14, 2020 at 06:37:15PM +, Fenghua Yu wrote:
> get_xsave_addr() will not return NULL because xsave->header.xfeatures has
> XFEATURE_MASK_PASID set.

Ah, you're setting it in the line above.

> I will remove the unnecessary check "if (ppasid_state)" to optimize
> the function and add a comment on why the check is unnecessary.

Ok.

> I addressed all of your comments and will send out the updated series soon.
> 
> Thank you very much for your review!

You're welcome and thanks!

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs

2020-09-14 Thread Jason Gunthorpe
On Mon, Sep 14, 2020 at 12:23:28PM -0600, Alex Williamson wrote:
> On Mon, 14 Sep 2020 14:41:21 -0300
> Jason Gunthorpe  wrote:
> 
> > On Mon, Sep 14, 2020 at 10:58:57AM -0600, Alex Williamson wrote:
> >  
> > > "its own special way" is arguable, VFIO is just making use of what's
> > > being proposed as the uapi via its existing IOMMU interface.  
> > 
> > I mean, if we have a /dev/sva then it makes no sense to extend the
> > VFIO interfaces with the same stuff. VFIO should simply accept a PASID
> > created from /dev/sva and use it just like any other user-DMA driver
> > would.
> 
> I don't think that's absolutely true.  By the same logic, we could say
> that pci-sysfs provides access to PCI BAR and config space
> resources,

No, it is the reverse, VFIO is a better version of pci-sysfs, so
pci-sysfs is the one that is obsoleted by VFIO. Similarly a /dev/sva
would be the superset interface for PASID, so whatever VFIO has would
be obsoleted.

It would be very unusual for the kernel to have two 'preferred'
interfaces for the same thing, IMHO. The review process for uAPI
should really prevent that by allowing all interests to be served
while the uAPI is designed.

> the VFIO device interface duplicates part of that interface therefore it
> should be abandoned.  But in reality, VFIO providing access to those
> resources puts those accesses within the scope and control of the VFIO
> interface.

Not clear to me why VFIO needs that. PASID seems quite orthogonal from
VFIO to me.

> > This has already happened, the SVA patches generally allow unpriv user
> > space to allocate a PASID for their process.
> > 
> > If a device implements a mdev shared with a kernel driver (like IDXD)
> > then it will be sharing that PASID pool across both drivers. In this
> > case it makes no sense that VFIO has PASID quota logic because it has
> > an incomplete view. It could only make sense if VFIO is the exclusive
> > owner of the bus/device/function.
> > 
> > The tracking logic needs to be global.. Most probably in some kind of
> > PASID cgroup controller?
> 
> AIUI, that doesn't exist yet, so it makes sense that VFIO, as the
> mechanism through which a user would allocate a PASID, 

VFIO is not the exclusive user interface for PASID. Other SVA drivers
will allocate PASIDs. Any quota has to be implemented by the IOMMU
layer, and shared across all drivers.

> space.  Also, "unprivileged user" is a bit of a misnomer in this
> context as the VFIO user must be privileged with ownership of a device
> before they can even participate in PASID allocation.  Is truly
> unprivileged access reasonable for a limited resource?

I'm not talking about VFIO, I'm talking about the other SVA drivers. I
expect some of them will be unpriv safe, like IDXD, for
instance.

Some way to manage the limited PASID resource will be necessary beyond
just VFIO.

> QEMU typically runs in a sandbox with limited access, when a device or
> mdev is assigned to a VM, file permissions are configured to allow that
> access.  QEMU doesn't get to poke at any random dev file it likes,
> that's part of how userspace reduces the potential attack surface.

Plumbing the exact same APIs through VFIO's uAPI vs /dev/sva doesn't
reduce the attack surface. qemu can simply include /dev/sva in the
sandbox when using VFIO with no increase in attack surface from this
proposed series.

> This series is a blueprint within the context of the ownership and
> permission model that VFIO already provides.  It doesn't seem like we
> can pluck that out on its own, nor is it necessarily the case that VFIO
> wouldn't want to provide PASID services within its own API even if we
> did have this undefined /dev/sva interface.

I don't see what you do - VFIO does not own PASID, and in this
vfio-mdev mode it does not own the PCI device/IOMMU either. So why
would this need to be part of the VFIO ownership and permission model?

Jason


Re: [PATCH v7 9/9] x86/mmu: Allocate/free PASID

2020-09-14 Thread Fenghua Yu
Hi, Boris,

On Mon, Sep 07, 2020 at 01:18:43PM +0200, Borislav Petkov wrote:
> On Thu, Aug 27, 2020 at 08:06:34AM -0700, Fenghua Yu wrote:
> > +*/
> > +   xsave = &fpu->state.xsave;
> > +   xsave->header.xfeatures |= XFEATURE_MASK_PASID;
> > +   ppasid_state = get_xsave_addr(xsave, XFEATURE_PASID);
> > +   if (ppasid_state) {
> > +   /*
> > +* Only update the task's PASID state when it's
> > +* different from the mm's pasid.
> > +*/
> > +   if (ppasid_state->pasid != pasid_state) {
> > +   /*
> > +* Invalid fpregs so that xrstors will pick up
> ^^^
> 
> Not "xrstors" but the "state restoring" or so.

Fixed.

> > +* the PASID state.
> > +*/
> > +   __fpu_invalidate_fpregs_state(fpu);
> > +   ppasid_state->pasid = pasid_state;
> > +   }
> 
> What happens if get_xsave_addr() returns NULL? A WARN_ONCE maybe?

get_xsave_addr() will not return NULL because xsave->header.xfeatures has
XFEATURE_MASK_PASID set. I will remove the unnecessary check "if (ppasid_state)"
to optimize the function and add a comment on why the check is unnecessary.

> Ok, done with review.

I addressed all of your comments and will send out the updated series soon.

Thank you very much for your review!

-Fenghua


Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs

2020-09-14 Thread Alex Williamson
On Mon, 14 Sep 2020 14:41:21 -0300
Jason Gunthorpe  wrote:

> On Mon, Sep 14, 2020 at 10:58:57AM -0600, Alex Williamson wrote:
>  
> > "its own special way" is arguable, VFIO is just making use of what's
> > being proposed as the uapi via its existing IOMMU interface.  
> 
> I mean, if we have a /dev/sva then it makes no sense to extend the
> VFIO interfaces with the same stuff. VFIO should simply accept a PASID
> created from /dev/sva and use it just like any other user-DMA driver
> would.

I don't think that's absolutely true.  By the same logic, we could say
that pci-sysfs provides access to PCI BAR and config space resources and
that the VFIO device interface duplicates part of that interface, so it
should be abandoned.  But in reality, VFIO providing access to those
resources puts those accesses within the scope and control of the VFIO
interface.  Ownership of a device through vfio is provided by allowing
the user access to the vfio group dev file alone, not by the group file
plus some number of resource files, plus the config file, plus running
with admin permissions to see the full extent of config space.
ranges for the IOMMU are also provided via sysfs, but VFIO includes a
capability on the IOMMU get_info ioctl for the user to learn about
available IOVA ranges w/o scraping through sysfs.

> > are also a system resource, so we require some degree of access control
> > and quotas for management of PASIDs.
> 
> This has already happened, the SVA patches generally allow unpriv user
> space to allocate a PASID for their process.
> 
> If a device implements an mdev shared with a kernel driver (like IDXD)
> then it will be sharing that PASID pool across both drivers. In this
> case it makes no sense for VFIO to have PASID quota logic because it has
> an incomplete view. It could only make sense if VFIO were the exclusive
> owner of the bus/device/function.
> 
> The tracking logic needs to be global... most probably in some kind of
> PASID cgroup controller?

AIUI, that doesn't exist yet, so it makes sense that VFIO, as the
mechanism through which a user would allocate a PASID, implements a
reasonable quota to avoid an unprivileged user exhausting the address
space.  Also, "unprivileged user" is a bit of a misnomer in this
context as the VFIO user must be privileged with ownership of a device
before they can even participate in PASID allocation.  Is truly
unprivileged access reasonable for a limited resource?
 
> > know whether an assigned device requires PASIDs such that access to
> > this dev file is provided to QEMU?  
> 
> Wouldn't QEMU just open /dev/sva if it needs it? Like other dev files?
> Why would it need something special?

QEMU typically runs in a sandbox with limited access; when a device or
mdev is assigned to a VM, file permissions are configured to allow that
access.  QEMU doesn't get to poke at any random dev file it likes;
that's part of how userspace reduces the potential attack surface.
 
> > would be an obvious DoS path if any user can create arbitrary
> > allocations.  If we can move code out of VFIO, I'm all for it, but I
> > think it needs to be better defined than "implement magic universal sva
> > uapi interface" before we can really consider it.  Thanks,  
> 
> Jason began by saying VDPA will need this too, I agree with him.
> 
> I'm not sure why it would be "magic"? This series already gives a
> pretty solid blueprint for what the interface would need to
> have. Interested folks need to sit down and talk about it not just
> default everything to being built inside VFIO.

This series is a blueprint within the context of the ownership and
permission model that VFIO already provides.  It doesn't seem like we
can pluck that out on its own, nor is it necessarily the case that VFIO
wouldn't want to provide PASID services within its own API even if we
did have this undefined /dev/sva interface.  Thanks,

Alex

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2] iommu/arm: Add module parameter to set msi iova address

2020-09-14 Thread Vennila Megavannan
From: Srinath Mannam 

Add a provision to change the default MSI IOVA base to a platform-suitable
IOVA using a module parameter. The present hardcoded MSI IOVA base may not
fall within the accessible IOVA ranges of a platform.

If a platform has a limitation that prevents it from accessing the default
MSI IOVA, the base can be changed using the
"arm-smmu.msi_iova_base=0xa000" command line argument.

Signed-off-by: Srinath Mannam 
Co-developed-by: Vennila Megavannan 
Signed-off-by: Vennila Megavannan 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 5 ++++-
 drivers/iommu/arm/arm-smmu/arm-smmu.c   | 5 ++++-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index c192544e874b..dfef0df66c19 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -417,6 +417,9 @@ static bool disable_bypass = 1;
 module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO);
 MODULE_PARM_DESC(disable_bypass,
"Disable bypass streams such that incoming transactions from devices 
that are not attached to an iommu domain will report an abort back to the 
device and will not be allowed to pass through the SMMU.");
+static unsigned long msi_iova_base = MSI_IOVA_BASE;
+module_param(msi_iova_base, ulong, S_IRUGO);
MODULE_PARM_DESC(msi_iova_base, "set MSI IOVA base address if default 
(0x8000000) does not work for your platform.");
 
 enum pri_resp {
PRI_RESP_DENY = 0,
@@ -3102,7 +3105,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
struct iommu_resv_region *region;
int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
 
-   region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
+   region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
 prot, IOMMU_RESV_SW_MSI);
if (!region)
return;
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 09c42af9f31e..9d46a2628dd5 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -64,6 +64,9 @@ static bool disable_bypass =
 module_param(disable_bypass, bool, S_IRUGO);
 MODULE_PARM_DESC(disable_bypass,
"Disable bypass streams such that incoming transactions from devices 
that are not attached to an iommu domain will report an abort back to the 
device and will not be allowed to pass through the SMMU.");
+static unsigned long msi_iova_base = MSI_IOVA_BASE;
+module_param(msi_iova_base, ulong, S_IRUGO);
MODULE_PARM_DESC(msi_iova_base, "set MSI IOVA base address if default 
(0x8000000) does not work for your platform.");
 
 struct arm_smmu_s2cr {
struct iommu_group  *group;
@@ -1603,7 +1606,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
struct iommu_resv_region *region;
int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
 
-   region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
+   region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
 prot, IOMMU_RESV_SW_MSI);
if (!region)
return;
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs

2020-09-14 Thread Jason Gunthorpe
On Mon, Sep 14, 2020 at 10:58:57AM -0600, Alex Williamson wrote:
 
> "its own special way" is arguable, VFIO is just making use of what's
> being proposed as the uapi via its existing IOMMU interface.

I mean, if we have a /dev/sva then it makes no sense to extend the
VFIO interfaces with the same stuff. VFIO should simply accept a PASID
created from /dev/sva and use it just like any other user-DMA driver
would.

> are also a system resource, so we require some degree of access control
> and quotas for management of PASIDs.  

This has already happened, the SVA patches generally allow unpriv user
space to allocate a PASID for their process.

If a device implements an mdev shared with a kernel driver (like IDXD)
then it will be sharing that PASID pool across both drivers. In this
case it makes no sense for VFIO to have PASID quota logic because it has
an incomplete view. It could only make sense if VFIO were the exclusive
owner of the bus/device/function.

The tracking logic needs to be global... most probably in some kind of
PASID cgroup controller?

> know whether an assigned device requires PASIDs such that access to
> this dev file is provided to QEMU?

Wouldn't QEMU just open /dev/sva if it needs it? Like other dev files?
Why would it need something special?

> would be an obvious DoS path if any user can create arbitrary
> allocations.  If we can move code out of VFIO, I'm all for it, but I
> think it needs to be better defined than "implement magic universal sva
> uapi interface" before we can really consider it.  Thanks,

Jason began by saying VDPA will need this too, I agree with him.

I'm not sure why it would be "magic"? This series already gives a
pretty solid blueprint for what the interface would need to
have. Interested folks need to sit down and talk about it not just
default everything to being built inside VFIO.

Jason
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs

2020-09-14 Thread Alex Williamson
On Mon, 14 Sep 2020 13:33:54 -0300
Jason Gunthorpe  wrote:

> On Mon, Sep 14, 2020 at 09:22:47AM -0700, Raj, Ashok wrote:
> > Hi Jason,
> > 
> > On Mon, Sep 14, 2020 at 10:47:38AM -0300, Jason Gunthorpe wrote:  
> > > On Mon, Sep 14, 2020 at 03:31:13PM +0200, Jean-Philippe Brucker wrote:
> > >   
> > > > > Jason suggest something like /dev/sva. There will be a lot of other
> > > > > subsystems that could benefit from this (e.g vDPA).  
> > > > 
> > > > Do you have a more precise idea of the interface /dev/sva would provide,
> > > > how it would interact with VFIO and others?  vDPA could transport the
> > > > generic iommu.h structures via its own uAPI, and call the IOMMU API
> > > > directly without going through an intermediate /dev/sva handle.  
> > > 
> > > Prior to PASID IOMMU really only makes sense as part of vfio-pci
> > > because the iommu can only key on the BDF. That can't work unless the
> > > whole PCI function can be assigned. It is hard to see how a shared PCI
> > > device can work with IOMMU like this, so may as well use vfio.
> > > 
> > > SVA and various vIOMMU models change this, a shared PCI driver can
> > > absolutely work with a PASID that is assigned to a VM safely, and
> > > actually doesn't need to know if its PASID maps an mm_struct or
> > > something else.
> > 
> > Well, IOMMU does care if it's a native mm_struct or something that belongs
> > to the guest, because you need the ability to forward page-requests and
> > pick up page-responses from the guest. Since there is just one PRQ on the
> > IOMMU and responses can't be sent directly, you have to depend on a vIOMMU
> > type interface in the guest to make all of this magic work right?
> 
> Yes, IOMMU cares, but not the PCI driver. It just knows it has a
> PASID. Details of how page faulting is handled or how the mapping is
> set up are abstracted by the PASID.
> 
> > > This new PASID allocator would match the guest memory layout and  
> > 
> > Not sure what you mean by "match guest memory layout"? 
> > Probably, meaning first level is gVA or gIOVA?   
> 
> It means whatever the qemu/viommu/guest/etc needs across all the
> IOMMU/arch implementations.
> 
> Basically, there should only be two ways to get a PASID
>  - From an mm_struct that mirrors the creating process
>  - Via '/dev/sva', which has a complete interface to create and
>    control a PASID suitable for virtualization and more
> 
> VFIO should not have its own special way to get a PASID.

"its own special way" is arguable, VFIO is just making use of what's
being proposed as the uapi via its existing IOMMU interface.  PASIDs
are also a system resource, so we require some degree of access control
and quotas for management of PASIDs.  Does libvirt now get involved to
know whether an assigned device requires PASIDs such that access to
this dev file is provided to QEMU?  How does the kernel validate usage
or implement quotas when disconnected from device ownership?  PASIDs
would be an obvious DoS path if any user can create arbitrary
allocations.  If we can move code out of VFIO, I'm all for it, but I
think it needs to be better defined than "implement magic universal sva
uapi interface" before we can really consider it.  Thanks,

Alex

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs

2020-09-14 Thread Jason Gunthorpe
On Mon, Sep 14, 2020 at 09:22:47AM -0700, Raj, Ashok wrote:
> Hi Jason,
> 
> On Mon, Sep 14, 2020 at 10:47:38AM -0300, Jason Gunthorpe wrote:
> > On Mon, Sep 14, 2020 at 03:31:13PM +0200, Jean-Philippe Brucker wrote:
> > 
> > > > Jason suggest something like /dev/sva. There will be a lot of other
> > > > subsystems that could benefit from this (e.g vDPA).
> > > 
> > > Do you have a more precise idea of the interface /dev/sva would provide,
> > > how it would interact with VFIO and others?  vDPA could transport the
> > > generic iommu.h structures via its own uAPI, and call the IOMMU API
> > > directly without going through an intermediate /dev/sva handle.
> > 
> > Prior to PASID IOMMU really only makes sense as part of vfio-pci
> > because the iommu can only key on the BDF. That can't work unless the
> > whole PCI function can be assigned. It is hard to see how a shared PCI
> > device can work with IOMMU like this, so may as well use vfio.
> > 
> > SVA and various vIOMMU models change this, a shared PCI driver can
> > absolutely work with a PASID that is assigned to a VM safely, and
> > actually doesn't need to know if its PASID maps an mm_struct or
> > something else.
> 
> Well, IOMMU does care if it's a native mm_struct or something that belongs
> to the guest, because you need the ability to forward page-requests and
> pick up page-responses from the guest. Since there is just one PRQ on the
> IOMMU and responses can't be sent directly, you have to depend on a vIOMMU
> type interface in the guest to make all of this magic work right?

Yes, IOMMU cares, but not the PCI driver. It just knows it has a
PASID. Details of how page faulting is handled or how the mapping is
set up are abstracted by the PASID.

> > This new PASID allocator would match the guest memory layout and
> 
> Not sure what you mean by "match guest memory layout"? 
> Probably, meaning first level is gVA or gIOVA? 

It means whatever the qemu/viommu/guest/etc needs across all the
IOMMU/arch implementations.

Basically, there should only be two ways to get a PASID
 - From an mm_struct that mirrors the creating process
 - Via '/dev/sva', which has a complete interface to create and
   control a PASID suitable for virtualization and more

VFIO should not have its own special way to get a PASID.

Jason
 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs

2020-09-14 Thread Raj, Ashok
Hi Jason,

On Mon, Sep 14, 2020 at 10:47:38AM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 14, 2020 at 03:31:13PM +0200, Jean-Philippe Brucker wrote:
> 
> > > Jason suggest something like /dev/sva. There will be a lot of other
> > > subsystems that could benefit from this (e.g vDPA).
> > 
> > Do you have a more precise idea of the interface /dev/sva would provide,
> > how it would interact with VFIO and others?  vDPA could transport the
> > generic iommu.h structures via its own uAPI, and call the IOMMU API
> > directly without going through an intermediate /dev/sva handle.
> 
> Prior to PASID IOMMU really only makes sense as part of vfio-pci
> because the iommu can only key on the BDF. That can't work unless the
> whole PCI function can be assigned. It is hard to see how a shared PCI
> device can work with IOMMU like this, so may as well use vfio.
> 
> SVA and various vIOMMU models change this, a shared PCI driver can
> absolutely work with a PASID that is assigned to a VM safely, and
> actually doesn't need to know if its PASID maps an mm_struct or
> something else.

Well, IOMMU does care if it's a native mm_struct or something that belongs
to the guest, because you need the ability to forward page-requests and
pick up page-responses from the guest. Since there is just one PRQ on the
IOMMU and responses can't be sent directly, you have to depend on a vIOMMU
type interface in the guest to make all of this magic work right?

> 
> So, some /dev/sva is another way to allocate a PASID that is not 1:1
> with mm_struct, as the existing SVA stuff enforces. ie it is a way to
> program the DMA address map of the PASID.
> 
> This new PASID allocator would match the guest memory layout and

Not sure what you mean by "match guest memory layout"? 
Probably, meaning first level is gVA or gIOVA? 

Cheers,
Ashok
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 03/17] drm/exynos: stop setting DMA_ATTR_NON_CONSISTENT

2020-09-14 Thread Sergei Shtylyov
On 9/14/20 5:44 PM, Christoph Hellwig wrote:

> DMA_ATTR_NON_CONSISTENT is a no-op except on PARISC and some mips
> configs, so don't set it in this ARM specific driver.

   Hm, PARISC and ARM capitalized but mips in lower case? :-)

> Signed-off-by: Christoph Hellwig 
[...]

MBR, Sergei
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: a saner API for allocating DMA addressable pages v2

2020-09-14 Thread Matthew Wilcox
On Mon, Sep 14, 2020 at 04:44:16PM +0200, Christoph Hellwig wrote:
> I'm still a little unsure about the API naming, as alloc_pages sort of
> implies a struct page return value, but we return a kernel virtual
> address.

Erm ... dma_alloc_pages() returns a struct page, so is this sentence
stale?

From patch 14:

+struct page *dma_alloc_pages(struct device *dev, size_t size,
+   dma_addr_t *dma_handle, enum dma_data_direction dir, gfp_t gfp);

> The other alternative would be to name the API
> dma_alloc_noncoherent, but the whole non-coherent naming seems to put
> people off.

You say that like it's a bad thing.  I think the problem is more that
people don't understand what non-coherent means and think they're
supporting it when they're not.

dma_alloc_manual_flushing()?

> As a follow up I plan to move the implementation of the
> DMA_ATTR_NO_KERNEL_MAPPING flag over to this framework as well, given
> that is also is a fundamentally non coherent allocation.  The replacement
> for that flag would then return a struct page, as it is allowed to
> actually return pages without a kernel mapping as the name suggested
> (although most of the time they will actually have a kernel mapping..)

If the page doesn't have a kernel mapping, shouldn't it return a PFN
or a phys_addr?

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 17/17] firewire-ohci: use dma_alloc_pages

2020-09-14 Thread Christoph Hellwig
Use dma_alloc_pages to allocate DMAable pages instead of hoping that
the architecture either has GFP_DMA32 or no more than 4G of memory.

Signed-off-by: Christoph Hellwig 
---
 drivers/firewire/ohci.c | 26 +++---
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/drivers/firewire/ohci.c b/drivers/firewire/ohci.c
index 020cb15a4d8fcc..9811c40956e54d 100644
--- a/drivers/firewire/ohci.c
+++ b/drivers/firewire/ohci.c
@@ -674,17 +674,16 @@ static void ar_context_link_page(struct ar_context *ctx, 
unsigned int index)
 
 static void ar_context_release(struct ar_context *ctx)
 {
+   struct device *dev = ctx->ohci->card.device;
unsigned int i;
 
vunmap(ctx->buffer);
 
-   for (i = 0; i < AR_BUFFERS; i++)
-   if (ctx->pages[i]) {
-   dma_unmap_page(ctx->ohci->card.device,
-  ar_buffer_bus(ctx, i),
-  PAGE_SIZE, DMA_FROM_DEVICE);
-   __free_page(ctx->pages[i]);
-   }
+   for (i = 0; i < AR_BUFFERS; i++) {
+   if (ctx->pages[i])
+   dma_free_pages(dev, PAGE_SIZE, ctx->pages[i],
+  ar_buffer_bus(ctx, i), DMA_FROM_DEVICE);
+   }
 }
 
 static void ar_context_abort(struct ar_context *ctx, const char *error_msg)
@@ -970,6 +969,7 @@ static void ar_context_tasklet(unsigned long data)
 static int ar_context_init(struct ar_context *ctx, struct fw_ohci *ohci,
   unsigned int descriptors_offset, u32 regs)
 {
+   struct device *dev = ohci->card.device;
unsigned int i;
dma_addr_t dma_addr;
struct page *pages[AR_BUFFERS + AR_WRAPAROUND_PAGES];
@@ -980,17 +980,13 @@ static int ar_context_init(struct ar_context *ctx, struct 
fw_ohci *ohci,
tasklet_init(&ctx->tasklet, ar_context_tasklet, (unsigned long)ctx);
 
for (i = 0; i < AR_BUFFERS; i++) {
-   ctx->pages[i] = alloc_page(GFP_KERNEL | GFP_DMA32);
+   ctx->pages[i] = dma_alloc_pages(dev, PAGE_SIZE, &dma_addr,
+   DMA_FROM_DEVICE, GFP_KERNEL);
if (!ctx->pages[i])
goto out_of_memory;
-   dma_addr = dma_map_page(ohci->card.device, ctx->pages[i],
-   0, PAGE_SIZE, DMA_FROM_DEVICE);
-   if (dma_mapping_error(ohci->card.device, dma_addr)) {
-   __free_page(ctx->pages[i]);
-   ctx->pages[i] = NULL;
-   goto out_of_memory;
-   }
set_page_private(ctx->pages[i], dma_addr);
+   dma_sync_single_for_device(dev, dma_addr, PAGE_SIZE,
+  DMA_FROM_DEVICE);
}
 
for (i = 0; i < AR_BUFFERS; i++)
-- 
2.28.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 16/17] dma-iommu: implement ->alloc_noncoherent

2020-09-14 Thread Christoph Hellwig
Implement the alloc_noncoherent method to provide memory that is neither
coherent nor contiguous.

Signed-off-by: Christoph Hellwig 
---
 drivers/iommu/dma-iommu.c | 41 +++
 1 file changed, 37 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 00a5b49248e334..c12c1dc43d312e 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -572,6 +572,7 @@ static struct page **__iommu_dma_alloc_pages(struct device 
*dev,
  * @size: Size of buffer in bytes
  * @dma_handle: Out argument for allocated DMA handle
  * @gfp: Allocation flags
+ * @prot: pgprot_t to use for the remapped mapping
  * @attrs: DMA attributes for this allocation
  *
  * If @size is less than PAGE_SIZE, then a full CPU page will be allocated,
@@ -580,14 +581,14 @@ static struct page **__iommu_dma_alloc_pages(struct 
device *dev,
  * Return: Mapped virtual address, or NULL on failure.
  */
 static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
-   dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
+   dma_addr_t *dma_handle, gfp_t gfp, pgprot_t prot,
+   unsigned long attrs)
 {
struct iommu_domain *domain = iommu_get_dma_domain(dev);
struct iommu_dma_cookie *cookie = domain->iova_cookie;
struct iova_domain *iovad = &cookie->iovad;
bool coherent = dev_is_dma_coherent(dev);
int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
-   pgprot_t prot = dma_pgprot(dev, PAGE_KERNEL, attrs);
unsigned int count, min_size, alloc_sizes = domain->pgsize_bitmap;
struct page **pages;
struct sg_table sgt;
@@ -1030,8 +1031,10 @@ static void *iommu_dma_alloc(struct device *dev, size_t 
size,
gfp |= __GFP_ZERO;
 
if (IS_ENABLED(CONFIG_DMA_REMAP) && gfpflags_allow_blocking(gfp) &&
-   !(attrs & DMA_ATTR_FORCE_CONTIGUOUS))
-   return iommu_dma_alloc_remap(dev, size, handle, gfp, attrs);
+   !(attrs & DMA_ATTR_FORCE_CONTIGUOUS)) {
+   return iommu_dma_alloc_remap(dev, size, handle, gfp,
+   dma_pgprot(dev, PAGE_KERNEL, attrs), attrs);
+   }
 
if (IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) &&
!gfpflags_allow_blocking(gfp) && !coherent)
@@ -1052,6 +1055,34 @@ static void *iommu_dma_alloc(struct device *dev, size_t 
size,
return cpu_addr;
 }
 
+#ifdef CONFIG_DMA_REMAP
+static void *iommu_dma_alloc_noncoherent(struct device *dev, size_t size,
+   dma_addr_t *handle, enum dma_data_direction dir, gfp_t gfp)
+{
+   if (!gfpflags_allow_blocking(gfp)) {
+   struct page *page;
+
+   page = dma_common_alloc_pages(dev, size, handle, dir, gfp);
+   if (!page)
+   return NULL;
+   return page_address(page);
+   }
+
+   return iommu_dma_alloc_remap(dev, size, handle, gfp | __GFP_ZERO,
+PAGE_KERNEL, 0);
+}
+
+static void iommu_dma_free_noncoherent(struct device *dev, size_t size,
+   void *cpu_addr, dma_addr_t handle, enum dma_data_direction dir)
+{
+   __iommu_dma_unmap(dev, handle, size);
+   __iommu_dma_free(dev, size, cpu_addr);
+}
+#else
+#define iommu_dma_alloc_noncoherentNULL
+#define iommu_dma_free_noncoherent NULL
+#endif /* CONFIG_DMA_REMAP */
+
 static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
void *cpu_addr, dma_addr_t dma_addr, size_t size,
unsigned long attrs)
@@ -1122,6 +1153,8 @@ static const struct dma_map_ops iommu_dma_ops = {
.free   = iommu_dma_free,
.alloc_pages= dma_common_alloc_pages,
.free_pages = dma_common_free_pages,
+   .alloc_noncoherent  = iommu_dma_alloc_noncoherent,
+   .free_noncoherent   = iommu_dma_free_noncoherent,
.mmap   = iommu_dma_mmap,
.get_sgtable= iommu_dma_get_sgtable,
.map_page   = iommu_dma_map_page,
-- 
2.28.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 07/17] 53c700: improve non-coherent DMA handling

2020-09-14 Thread James Bottomley
On Mon, 2020-09-14 at 16:44 +0200, Christoph Hellwig wrote:
> @@ -429,7 +430,7 @@ struct NCR_700_Host_Parameters {
>   for(i=0; i< (sizeof(A_##symbol##_used) / sizeof(__u32));
> i++) { \
>   __u32 val =
> bS_to_cpu((script)[A_##symbol##_used[i]]) + da; \
>   (script)[A_##symbol##_used[i]] = bS_to_host(val); \
> - dma_cache_sync((dev),
> &(script)[A_##symbol##_used[i]], 4, DMA_TO_DEVICE); \
> + dma_sync_to_dev((dev),
> &(script)[A_##symbol##_used[i]], 4); \
>   DEBUG((" script, patching %s at %d to %pad\n", \
>  #symbol, A_##symbol##_used[i], &da)); \
>   } \
> @@ -441,7 +442,7 @@ struct NCR_700_Host_Parameters {
>   dma_addr_t da = value; \
>   for(i=0; i< (sizeof(A_##symbol##_used) / sizeof(__u32));
> i++) { \
>   (script)[A_##symbol##_used[i]] = bS_to_host(da); \
> - dma_cache_sync((dev),
> &(script)[A_##symbol##_used[i]], 4, DMA_TO_DEVICE); \
> + dma_sync_to_dev((dev),
> &(script)[A_##symbol##_used[i]], 4); \
>   DEBUG((" script, patching %s at %d to %pad\n", \
>  #symbol, A_##symbol##_used[i], &da)); \
>   } \
> @@ -456,7 +457,7 @@ struct NCR_700_Host_Parameters {
>   val &= 0xff00ffff; \
>   val |= ((value) & 0xff) << 16; \
>   (script)[A_##symbol##_used[i]] = bS_to_host(val); \
> - dma_cache_sync((dev),
> &(script)[A_##symbol##_used[i]], 4, DMA_TO_DEVICE); \
> + dma_sync_to_dev((dev),
> &(script)[A_##symbol##_used[i]], 4); \
>   DEBUG((" script, patching ID field %s at %d to
> 0x%x\n", \
>  #symbol, A_##symbol##_used[i], val)); \
>   } \
> @@ -470,7 +471,7 @@ struct NCR_700_Host_Parameters {
>   val &= 0xffff0000; \
>   val |= ((value) & 0xffff); \
>   (script)[A_##symbol##_used[i]] = bS_to_host(val); \
> - dma_cache_sync((dev),
> &(script)[A_##symbol##_used[i]], 4, DMA_TO_DEVICE); \
> + dma_sync_to_dev((dev),
> &(script)[A_##symbol##_used[i]], 4); \
>   DEBUG((" script, patching short field %s at %d to
> 0x%x\n", \
>  #symbol, A_##symbol##_used[i], val)); \
>   } \

If you're going to change the macros from taking a device to taking a
hostdata structure then the descriptive argument name needs to change
... it can't be dev anymore.  I'm happy with it simply becoming 'h' if
hostdata is too long.

I already asked for this on the first go around:

https://lore.kernel.org/alsa-devel/1598971960.4238.5.ca...@hansenpartnership.com/

James

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 15/17] dma-mapping: add new {alloc, free}_noncoherent dma_map_ops methods

2020-09-14 Thread Christoph Hellwig
This will allow IOMMU drivers to allocate non-contiguous memory and
return a vmapped virtual address.

Signed-off-by: Christoph Hellwig 
---
 include/linux/dma-mapping.h |  5 +
 kernel/dma/mapping.c| 33 +++--
 2 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index bf592cf0db4acb..b4b5d75260d6dc 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -80,6 +80,11 @@ struct dma_map_ops {
gfp_t gfp);
void (*free_pages)(struct device *dev, size_t size, struct page *vaddr,
dma_addr_t dma_handle, enum dma_data_direction dir);
+   void* (*alloc_noncoherent)(struct device *dev, size_t size,
+   dma_addr_t *dma_handle, enum dma_data_direction dir,
+   gfp_t gfp);
+   void (*free_noncoherent)(struct device *dev, size_t size, void *vaddr,
+   dma_addr_t dma_handle, enum dma_data_direction dir);
int (*mmap)(struct device *, struct vm_area_struct *,
  void *, dma_addr_t, size_t,
  unsigned long attrs);
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 6f86c925b8251d..8614d7d2ee59a9 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -502,19 +502,40 @@ EXPORT_SYMBOL_GPL(dma_free_pages);
 void *dma_alloc_noncoherent(struct device *dev, size_t size,
dma_addr_t *dma_handle, enum dma_data_direction dir, gfp_t gfp)
 {
-   struct page *page;
+   const struct dma_map_ops *ops = get_dma_ops(dev);
+   void *vaddr;
 
-   page = dma_alloc_pages(dev, size, dma_handle, dir, gfp);
-   if (!page)
-   return NULL;
-   return page_address(page);
+   if (!ops || !ops->alloc_noncoherent) {
+   struct page *page;
+
+   page = dma_alloc_pages(dev, size, dma_handle, dir, gfp);
+   if (!page)
+   return NULL;
+   return page_address(page);
+   }
+
+   size = PAGE_ALIGN(size);
+   vaddr = ops->alloc_noncoherent(dev, size, dma_handle, dir, gfp);
+   if (vaddr)
+   debug_dma_map_page(dev, virt_to_page(vaddr), 0, size, dir,
+  *dma_handle);
+   return vaddr;
 }
 EXPORT_SYMBOL_GPL(dma_alloc_noncoherent);
 
 void dma_free_noncoherent(struct device *dev, size_t size, void *vaddr,
dma_addr_t dma_handle, enum dma_data_direction dir)
 {
-   dma_free_pages(dev, size, virt_to_page(vaddr), dma_handle, dir);
+   const struct dma_map_ops *ops = get_dma_ops(dev);
+
+   if (!ops || !ops->free_noncoherent) {
+   dma_free_pages(dev, size, virt_to_page(vaddr), dma_handle, dir);
+   return;
+   }
+
+   size = PAGE_ALIGN(size);
+   debug_dma_unmap_page(dev, dma_handle, size, dir);
+   ops->free_noncoherent(dev, size, vaddr, dma_handle, dir);
 }
 EXPORT_SYMBOL_GPL(dma_free_noncoherent);
 
-- 
2.28.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 14/17] dma-mapping: add a new dma_alloc_pages API

2020-09-14 Thread Christoph Hellwig
This API is the equivalent of alloc_pages, except that the returned memory
is guaranteed to be DMA addressable by the passed-in device.  The
implementation will also be used to provide a more sensible replacement
for the DMA_ATTR_NON_CONSISTENT flag.

Additionally dma_alloc_noncoherent is switched over to use dma_alloc_pages
as its backend.
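
A minimal usage sketch (illustration only, not part of the patch; the
calls match the signatures added below, error handling trimmed):

	struct page *page;
	dma_addr_t dma_handle;
	void *buf;

	page = dma_alloc_pages(dev, PAGE_SIZE, &dma_handle,
			       DMA_FROM_DEVICE, GFP_KERNEL);
	if (!page)
		return -ENOMEM;
	/* transfer ownership to the device before it writes */
	dma_sync_single_for_device(dev, dma_handle, PAGE_SIZE,
				   DMA_FROM_DEVICE);
	/* ... device DMA completes ... */
	dma_sync_single_for_cpu(dev, dma_handle, PAGE_SIZE,
				DMA_FROM_DEVICE);
	/*
	 * Unlike dma_alloc_coherent, this is a normal page in the
	 * kernel direct mapping, so page_address() works on it.
	 */
	buf = page_address(page);
	dma_free_pages(dev, PAGE_SIZE, page, dma_handle,
		       DMA_FROM_DEVICE);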

Signed-off-by: Christoph Hellwig 
---
 Documentation/core-api/dma-attributes.rst |  8 ---
 arch/alpha/kernel/pci_iommu.c |  2 +
 arch/arm/mm/dma-mapping-nommu.c   |  2 +
 arch/arm/mm/dma-mapping.c |  4 ++
 arch/ia64/hp/common/sba_iommu.c   |  2 +
 arch/mips/jazz/jazzdma.c  |  7 +--
 arch/powerpc/kernel/dma-iommu.c   |  2 +
 arch/powerpc/platforms/ps3/system-bus.c   |  4 ++
 arch/powerpc/platforms/pseries/vio.c  |  2 +
 arch/s390/pci/pci_dma.c   |  2 +
 arch/x86/kernel/amd_gart_64.c |  2 +
 drivers/iommu/dma-iommu.c |  2 +
 drivers/iommu/intel/iommu.c   |  4 ++
 drivers/parisc/ccio-dma.c |  2 +
 drivers/parisc/sba_iommu.c|  2 +
 drivers/xen/swiotlb-xen.c |  2 +
 include/linux/dma-direct.h|  5 ++
 include/linux/dma-mapping.h   | 34 ++--
 include/linux/dma-noncoherent.h   |  3 --
 kernel/dma/direct.c   | 52 ++-
 kernel/dma/mapping.c  | 63 +--
 kernel/dma/ops_helpers.c  | 35 +
 kernel/dma/virt.c |  2 +
 23 files changed, 206 insertions(+), 37 deletions(-)

diff --git a/Documentation/core-api/dma-attributes.rst 
b/Documentation/core-api/dma-attributes.rst
index 29dcbe8826e85e..1887d92e8e9269 100644
--- a/Documentation/core-api/dma-attributes.rst
+++ b/Documentation/core-api/dma-attributes.rst
@@ -25,14 +25,6 @@ Since it is optional for platforms to implement 
DMA_ATTR_WRITE_COMBINE,
 those that do not will simply ignore the attribute and exhibit default
 behavior.
 
-DMA_ATTR_NON_CONSISTENT
------------------------
-
-DMA_ATTR_NON_CONSISTENT lets the platform to choose to return either
-consistent or non-consistent memory as it sees fit.  By using this API,
-you are guaranteeing to the platform that you have all the correct and
-necessary sync points for this memory in the driver.
-
 DMA_ATTR_NO_KERNEL_MAPPING
 --
 
diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
index 6f7de4f4e191e7..447e0fd0ed3895 100644
--- a/arch/alpha/kernel/pci_iommu.c
+++ b/arch/alpha/kernel/pci_iommu.c
@@ -952,5 +952,7 @@ const struct dma_map_ops alpha_pci_ops = {
.dma_supported  = alpha_pci_supported,
.mmap   = dma_common_mmap,
.get_sgtable= dma_common_get_sgtable,
+   .alloc_pages= dma_common_alloc_pages,
+   .free_pages = dma_common_free_pages,
 };
 EXPORT_SYMBOL(alpha_pci_ops);
diff --git a/arch/arm/mm/dma-mapping-nommu.c b/arch/arm/mm/dma-mapping-nommu.c
index 287ef898a55e11..43c6d66b6e733a 100644
--- a/arch/arm/mm/dma-mapping-nommu.c
+++ b/arch/arm/mm/dma-mapping-nommu.c
@@ -176,6 +176,8 @@ static void arm_nommu_dma_sync_sg_for_cpu(struct device 
*dev, struct scatterlist
 const struct dma_map_ops arm_nommu_dma_ops = {
.alloc  = arm_nommu_dma_alloc,
.free   = arm_nommu_dma_free,
+   .alloc_pages= dma_direct_alloc_pages,
+   .free_pages = dma_direct_free_pages,
.mmap   = arm_nommu_dma_mmap,
.map_page   = arm_nommu_dma_map_page,
.unmap_page = arm_nommu_dma_unmap_page,
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 8a8949174b1c06..7738b4d23f692f 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -199,6 +199,8 @@ static int arm_dma_supported(struct device *dev, u64 mask)
 const struct dma_map_ops arm_dma_ops = {
.alloc  = arm_dma_alloc,
.free   = arm_dma_free,
+   .alloc_pages= dma_direct_alloc_pages,
+   .free_pages = dma_direct_free_pages,
.mmap   = arm_dma_mmap,
.get_sgtable= arm_dma_get_sgtable,
.map_page   = arm_dma_map_page,
@@ -226,6 +228,8 @@ static int arm_coherent_dma_mmap(struct device *dev, struct 
vm_area_struct *vma,
 const struct dma_map_ops arm_coherent_dma_ops = {
.alloc  = arm_coherent_dma_alloc,
.free   = arm_coherent_dma_free,
+   .alloc_pages= dma_direct_alloc_pages,
+   .free_pages = dma_direct_free_pages,
.mmap   = arm_coherent_dma_mmap,
.get_sgtable= arm_dma_get_sgtable,
.map_page   = arm_coherent_dma_map_page,
diff --git a/a

[PATCH 13/17] dma-mapping: remove dma_cache_sync

2020-09-14 Thread Christoph Hellwig
All users are gone now, remove the API.

Signed-off-by: Christoph Hellwig 
---
 arch/mips/Kconfig   |  1 -
 arch/mips/jazz/jazzdma.c|  1 -
 arch/mips/mm/dma-noncoherent.c  |  6 --
 arch/parisc/Kconfig |  1 -
 arch/parisc/kernel/pci-dma.c|  6 --
 include/linux/dma-mapping.h |  8 
 include/linux/dma-noncoherent.h | 10 --
 kernel/dma/Kconfig  |  3 ---
 kernel/dma/mapping.c| 14 --
 9 files changed, 50 deletions(-)

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index c95fa3a2484cf0..1be91c5d666e61 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -1134,7 +1134,6 @@ config DMA_NONCOHERENT
select ARCH_HAS_SYNC_DMA_FOR_DEVICE
select ARCH_HAS_DMA_SET_UNCACHED
select DMA_NONCOHERENT_MMAP
-   select DMA_NONCOHERENT_CACHE_SYNC
select NEED_DMA_MAP_STATE
 
 config SYS_HAS_EARLY_PRINTK
diff --git a/arch/mips/jazz/jazzdma.c b/arch/mips/jazz/jazzdma.c
index dab4d058cea9b1..2bf849caf507b1 100644
--- a/arch/mips/jazz/jazzdma.c
+++ b/arch/mips/jazz/jazzdma.c
@@ -620,7 +620,6 @@ const struct dma_map_ops jazz_dma_ops = {
.sync_single_for_device = jazz_dma_sync_single_for_device,
.sync_sg_for_cpu= jazz_dma_sync_sg_for_cpu,
.sync_sg_for_device = jazz_dma_sync_sg_for_device,
-   .cache_sync = arch_dma_cache_sync,
.mmap   = dma_common_mmap,
.get_sgtable= dma_common_get_sgtable,
 };
diff --git a/arch/mips/mm/dma-noncoherent.c b/arch/mips/mm/dma-noncoherent.c
index 97a14adbafc99c..f34ad1f09799f1 100644
--- a/arch/mips/mm/dma-noncoherent.c
+++ b/arch/mips/mm/dma-noncoherent.c
@@ -137,12 +137,6 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
 }
 #endif
 
-void arch_dma_cache_sync(struct device *dev, void *vaddr, size_t size,
-   enum dma_data_direction direction)
-{
-   dma_sync_virt_for_device(vaddr, size, direction);
-}
-
 #ifdef CONFIG_DMA_PERDEV_COHERENT
 void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
const struct iommu_ops *iommu, bool coherent)
diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
index 3b0f53dd70bc9b..ed15da1da174e0 100644
--- a/arch/parisc/Kconfig
+++ b/arch/parisc/Kconfig
@@ -195,7 +195,6 @@ config PA11
depends on PA7000 || PA7100LC || PA7200 || PA7300LC
select ARCH_HAS_SYNC_DMA_FOR_CPU
select ARCH_HAS_SYNC_DMA_FOR_DEVICE
-   select DMA_NONCOHERENT_CACHE_SYNC
 
 config PREFETCH
def_bool y
diff --git a/arch/parisc/kernel/pci-dma.c b/arch/parisc/kernel/pci-dma.c
index 38c68e131bbe2a..ce38c0b9158125 100644
--- a/arch/parisc/kernel/pci-dma.c
+++ b/arch/parisc/kernel/pci-dma.c
@@ -454,9 +454,3 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
 {
flush_kernel_dcache_range((unsigned long)phys_to_virt(paddr), size);
 }
-
-void arch_dma_cache_sync(struct device *dev, void *vaddr, size_t size,
-  enum dma_data_direction direction)
-{
-   flush_kernel_dcache_range((unsigned long)vaddr, size);
-}
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 4e1de194b45cbf..5b4e97b0846fd3 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -123,8 +123,6 @@ struct dma_map_ops {
void (*sync_sg_for_device)(struct device *dev,
   struct scatterlist *sg, int nents,
   enum dma_data_direction dir);
-   void (*cache_sync)(struct device *dev, void *vaddr, size_t size,
-   enum dma_data_direction direction);
int (*dma_supported)(struct device *dev, u64 mask);
u64 (*get_required_mask)(struct device *dev);
size_t (*max_mapping_size)(struct device *dev);
@@ -254,8 +252,6 @@ void *dmam_alloc_attrs(struct device *dev, size_t size, 
dma_addr_t *dma_handle,
gfp_t gfp, unsigned long attrs);
 void dmam_free_coherent(struct device *dev, size_t size, void *vaddr,
dma_addr_t dma_handle);
-void dma_cache_sync(struct device *dev, void *vaddr, size_t size,
-   enum dma_data_direction dir);
 int dma_get_sgtable_attrs(struct device *dev, struct sg_table *sgt,
void *cpu_addr, dma_addr_t dma_addr, size_t size,
unsigned long attrs);
@@ -339,10 +335,6 @@ static inline void dmam_free_coherent(struct device *dev, 
size_t size,
void *vaddr, dma_addr_t dma_handle)
 {
 }
-static inline void dma_cache_sync(struct device *dev, void *vaddr, size_t size,
-   enum dma_data_direction dir)
-{
-}
 static inline int dma_get_sgtable_attrs(struct device *dev,
struct sg_table *sgt, void *cpu_addr, dma_addr_t dma_addr,
size_t size, unsigned long attrs)
diff --git a/include/linux/dma-noncoherent.h b/include/linux/dma-noncoherent.h
index b9bc6c557ea46f..0888656369a45b 100644
--- a/include/linux/dma-nonc

Re: [PATCH 11/17] sgiseeq: convert to dma_alloc_noncoherent

2020-09-14 Thread Matthew Wilcox
On Mon, Sep 14, 2020 at 04:44:27PM +0200, Christoph Hellwig wrote:
>  drivers/net/ethernet/i825xx/lasi_82596.c |  25 ++---
>  drivers/net/ethernet/i825xx/lib82596.c   | 114 ++-
>  drivers/net/ethernet/i825xx/sni_82596.c  |   4 -
>  drivers/net/ethernet/seeq/sgiseeq.c  |  28 --
>  drivers/scsi/53c700.c|   9 +-
>  5 files changed, 103 insertions(+), 77 deletions(-)

I think your patch slicing-and-dicing went wrong here ;-(
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 12/17] 53c700: convert to dma_alloc_noncoherent

2020-09-14 Thread Christoph Hellwig
Use the new non-coherent DMA API including proper ownership transfers.

Signed-off-by: Christoph Hellwig 
---
 drivers/scsi/53c700.c | 11 +--
 drivers/scsi/53c700.h | 16 
 2 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/53c700.c b/drivers/scsi/53c700.c
index 9a343f8ecb6c3e..5117d90ccd9edf 100644
--- a/drivers/scsi/53c700.c
+++ b/drivers/scsi/53c700.c
@@ -269,18 +269,25 @@ NCR_700_get_SXFER(struct scsi_device *SDp)
  spi_period(SDp->sdev_target));
 }
 
+static inline dma_addr_t virt_to_dma(struct NCR_700_Host_Parameters *h, void 
*p)
+{
+   return h->pScript + ((uintptr_t)p - (uintptr_t)h->script);
+}
+
 static inline void dma_sync_to_dev(struct NCR_700_Host_Parameters *h,
void *addr, size_t size)
 {
if (h->noncoherent)
-   dma_cache_sync(h->dev, addr, size, DMA_TO_DEVICE);
+   dma_sync_single_for_device(h->dev, virt_to_dma(h, addr),
+  size, DMA_BIDIRECTIONAL);
 }
 
 static inline void dma_sync_from_dev(struct NCR_700_Host_Parameters *h,
void *addr, size_t size)
 {
if (h->noncoherent)
-   dma_cache_sync(h->dev, addr, size, DMA_FROM_DEVICE);
+   dma_sync_single_for_cpu(h->dev, virt_to_dma(h, addr), size,
+  DMA_BIDIRECTIONAL);
 }
 
 struct Scsi_Host *
diff --git a/drivers/scsi/53c700.h b/drivers/scsi/53c700.h
index 0f545b05fe611d..c9f8c497babb3d 100644
--- a/drivers/scsi/53c700.h
+++ b/drivers/scsi/53c700.h
@@ -423,33 +423,33 @@ struct NCR_700_Host_Parameters {
 #define NCR_710_MIN_XFERP  0
 #define NCR_700_MIN_PERIOD 25 /* for SDTR message, 100ns */
 
-#define script_patch_32(dev, script, symbol, value) \
+#define script_patch_32(h, script, symbol, value) \
 { \
int i; \
dma_addr_t da = value; \
for(i=0; i< (sizeof(A_##symbol##_used) / sizeof(__u32)); i++) { \
__u32 val = bS_to_cpu((script)[A_##symbol##_used[i]]) + da; \
(script)[A_##symbol##_used[i]] = bS_to_host(val); \
-   dma_sync_to_dev((dev), &(script)[A_##symbol##_used[i]], 4); \
+   dma_sync_to_dev((h), &(script)[A_##symbol##_used[i]], 4); \
DEBUG((" script, patching %s at %d to %pad\n", \
   #symbol, A_##symbol##_used[i], &da)); \
} \
 }
 
-#define script_patch_32_abs(dev, script, symbol, value) \
+#define script_patch_32_abs(h, script, symbol, value) \
 { \
int i; \
dma_addr_t da = value; \
for(i=0; i< (sizeof(A_##symbol##_used) / sizeof(__u32)); i++) { \
(script)[A_##symbol##_used[i]] = bS_to_host(da); \
-   dma_sync_to_dev((dev), &(script)[A_##symbol##_used[i]], 4); \
+   dma_sync_to_dev((h), &(script)[A_##symbol##_used[i]], 4); \
DEBUG((" script, patching %s at %d to %pad\n", \
   #symbol, A_##symbol##_used[i], &da)); \
} \
 }
 
 /* Used for patching the SCSI ID in the SELECT instruction */
-#define script_patch_ID(dev, script, symbol, value) \
+#define script_patch_ID(h, script, symbol, value) \
 { \
int i; \
for(i=0; i< (sizeof(A_##symbol##_used) / sizeof(__u32)); i++) { \
@@ -457,13 +457,13 @@ struct NCR_700_Host_Parameters {
val &= 0xff00ffff; \
val |= ((value) & 0xff) << 16; \
(script)[A_##symbol##_used[i]] = bS_to_host(val); \
-   dma_sync_to_dev((dev), &(script)[A_##symbol##_used[i]], 4); \
+   dma_sync_to_dev((h), &(script)[A_##symbol##_used[i]], 4); \
DEBUG((" script, patching ID field %s at %d to 0x%x\n", \
   #symbol, A_##symbol##_used[i], val)); \
} \
 }
 
-#define script_patch_16(dev, script, symbol, value) \
+#define script_patch_16(h, script, symbol, value) \
 { \
int i; \
for(i=0; i< (sizeof(A_##symbol##_used) / sizeof(__u32)); i++) { \
@@ -471,7 +471,7 @@ struct NCR_700_Host_Parameters {
val &= 0xffff0000; \
val |= ((value) & 0xffff); \
(script)[A_##symbol##_used[i]] = bS_to_host(val); \
-   dma_sync_to_dev((dev), &(script)[A_##symbol##_used[i]], 4); \
+   dma_sync_to_dev((h), &(script)[A_##symbol##_used[i]], 4); \
DEBUG((" script, patching short field %s at %d to 0x%x\n", \
   #symbol, A_##symbol##_used[i], val)); \
} \
-- 
2.28.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 11/17] sgiseeq: convert to dma_alloc_noncoherent

2020-09-14 Thread Christoph Hellwig
Use the new non-coherent DMA API including proper ownership transfers.
This includes adding additional calls to dma_sync_desc_dev as the
old syncing was rather ad-hoc.

Thanks to Thomas Bogendoerfer for debugging the ownership transfer
issues.

Signed-off-by: Christoph Hellwig 
---
 drivers/net/ethernet/i825xx/lasi_82596.c |  25 ++---
 drivers/net/ethernet/i825xx/lib82596.c   | 114 ++-
 drivers/net/ethernet/i825xx/sni_82596.c  |   4 -
 drivers/net/ethernet/seeq/sgiseeq.c  |  28 --
 drivers/scsi/53c700.c|   9 +-
 5 files changed, 103 insertions(+), 77 deletions(-)

diff --git a/drivers/net/ethernet/i825xx/lasi_82596.c 
b/drivers/net/ethernet/i825xx/lasi_82596.c
index a12218e940a2fa..96c6f4f36904ed 100644
--- a/drivers/net/ethernet/i825xx/lasi_82596.c
+++ b/drivers/net/ethernet/i825xx/lasi_82596.c
@@ -96,21 +96,14 @@
 
 #define OPT_SWAP_PORT  0x0001  /* Need to wordswp on the MPU port */
 
-#define DMA_WBACK(ndev, addr, len) \
-   do { dma_cache_sync((ndev)->dev.parent, (void *)addr, len, 
DMA_TO_DEVICE); } while (0)
-
-#define DMA_INV(ndev, addr, len) \
-   do { dma_cache_sync((ndev)->dev.parent, (void *)addr, len, 
DMA_FROM_DEVICE); } while (0)
-
-#define DMA_WBACK_INV(ndev, addr, len) \
-   do { dma_cache_sync((ndev)->dev.parent, (void *)addr, len, 
DMA_BIDIRECTIONAL); } while (0)
-
 #define SYSBUS  0x006c
 
 /* big endian CPU, 82596 "big" endian mode */
#define SWAP32(x)   (((u32)(x)<<16) | ((((u32)(x)))>>16))
 #define SWAP16(x)   (x)
 
+#define NONCOHERENT_DMA 1
+
 #include "lib82596.c"
 
 MODULE_AUTHOR("Richard Hirst");
@@ -184,9 +177,9 @@ lan_init_chip(struct parisc_device *dev)
 
lp = netdev_priv(netdevice);
lp->options = dev->id.sversion == 0x72 ? OPT_SWAP_PORT : 0;
-   lp->dma = dma_alloc_attrs(&dev->dev, sizeof(struct i596_dma),
- &lp->dma_addr, GFP_KERNEL,
- DMA_ATTR_NON_CONSISTENT);
+   lp->dma = dma_alloc_noncoherent(&dev->dev,
+   sizeof(struct i596_dma), &lp->dma_addr,
+   DMA_BIDIRECTIONAL, GFP_KERNEL);
if (!lp->dma)
goto out_free_netdev;
 
@@ -196,8 +189,8 @@ lan_init_chip(struct parisc_device *dev)
return 0;
 
 out_free_dma:
-   dma_free_attrs(&dev->dev, sizeof(struct i596_dma), lp->dma,
-   lp->dma_addr, DMA_ATTR_NON_CONSISTENT);
+   dma_free_noncoherent(&dev->dev, sizeof(struct i596_dma),
+  lp->dma, lp->dma_addr, DMA_BIDIRECTIONAL);
 out_free_netdev:
free_netdev(netdevice);
return retval;
@@ -209,8 +202,8 @@ static int __exit lan_remove_chip(struct parisc_device 
*pdev)
struct i596_private *lp = netdev_priv(dev);
 
unregister_netdev (dev);
-   dma_free_attrs(&pdev->dev, sizeof(struct i596_private), lp->dma,
-  lp->dma_addr, DMA_ATTR_NON_CONSISTENT);
+   dma_free_noncoherent(&pdev->dev, sizeof(struct i596_private), lp->dma,
+  lp->dma_addr, DMA_BIDIRECTIONAL);
free_netdev (dev);
return 0;
 }
diff --git a/drivers/net/ethernet/i825xx/lib82596.c 
b/drivers/net/ethernet/i825xx/lib82596.c
index b4e4b3eb5758b5..ca2fb303fcc6f6 100644
--- a/drivers/net/ethernet/i825xx/lib82596.c
+++ b/drivers/net/ethernet/i825xx/lib82596.c
@@ -365,13 +365,44 @@ static int max_cmd_backlog = TX_RING_SIZE-1;
 static void i596_poll_controller(struct net_device *dev);
 #endif
 
+static inline dma_addr_t virt_to_dma(struct i596_private *lp, volatile void *v)
+{
+   return lp->dma_addr + ((unsigned long)v - (unsigned long)lp->dma);
+}
+
+#ifdef NONCOHERENT_DMA
+static inline void dma_sync_dev(struct net_device *ndev, volatile void *addr,
+   size_t len)
+{
+   dma_sync_single_for_device(ndev->dev.parent,
+   virt_to_dma(netdev_priv(ndev), addr), len,
+   DMA_BIDIRECTIONAL);
+}
+
+static inline void dma_sync_cpu(struct net_device *ndev, volatile void *addr,
+   size_t len)
+{
+   dma_sync_single_for_cpu(ndev->dev.parent,
+   virt_to_dma(netdev_priv(ndev), addr), len,
+   DMA_BIDIRECTIONAL);
+}
+#else
+static inline void dma_sync_dev(struct net_device *ndev, volatile void *addr,
+   size_t len)
+{
+}
+static inline void dma_sync_cpu(struct net_device *ndev, volatile void *addr,
+   size_t len)
+{
+}
+#endif /* NONCOHERENT_DMA */
 
 static inline int wait_istat(struct net_device *dev, struct i596_dma *dma, int 
delcnt, char *str)
 {
-   DMA_INV(dev, &(dma->iscp), sizeof(struct i596_iscp));
+   dma_sync_cpu(dev, &(dma->iscp), sizeof(struct i596_iscp));
while (--delcnt && dma->iscp.stat) {
udelay(10);
-   DMA_INV(dev, &(dma->iscp), sizeof(struct i596_iscp));
+   dma_sync_cpu(dev, &(dma->iscp), sizeof(struct i596_iscp));
}
if (!delcnt) {
print

[PATCH 10/17] hal2: convert to dma_alloc_noncoherent

2020-09-14 Thread Christoph Hellwig
Use the new non-coherent DMA API including proper ownership transfers.
This also means we can allocate the buffer memory with the proper
direction instead of bidirectional.

Signed-off-by: Christoph Hellwig 
---
 sound/mips/hal2.c | 58 ++-
 1 file changed, 27 insertions(+), 31 deletions(-)

diff --git a/sound/mips/hal2.c b/sound/mips/hal2.c
index ec84bc4c3a6e77..9ac9b58d7c8cdd 100644
--- a/sound/mips/hal2.c
+++ b/sound/mips/hal2.c
@@ -441,7 +441,8 @@ static inline void hal2_stop_adc(struct snd_hal2 *hal2)
hal2->adc.pbus.pbus->pbdma_ctrl = HPC3_PDMACTRL_LD;
 }
 
-static int hal2_alloc_dmabuf(struct snd_hal2 *hal2, struct hal2_codec *codec)
+static int hal2_alloc_dmabuf(struct snd_hal2 *hal2, struct hal2_codec *codec,
+   enum dma_data_direction buffer_dir)
 {
struct device *dev = hal2->card->dev;
struct hal2_desc *desc;
@@ -449,15 +450,15 @@ static int hal2_alloc_dmabuf(struct snd_hal2 *hal2, 
struct hal2_codec *codec)
int count = H2_BUF_SIZE / H2_BLOCK_SIZE;
int i;
 
-   codec->buffer = dma_alloc_attrs(dev, H2_BUF_SIZE, &buffer_dma,
-   GFP_KERNEL, DMA_ATTR_NON_CONSISTENT);
+   codec->buffer = dma_alloc_noncoherent(dev, H2_BUF_SIZE, &buffer_dma,
+   buffer_dir, GFP_KERNEL);
if (!codec->buffer)
return -ENOMEM;
-   desc = dma_alloc_attrs(dev, count * sizeof(struct hal2_desc),
-  &desc_dma, GFP_KERNEL, DMA_ATTR_NON_CONSISTENT);
+   desc = dma_alloc_noncoherent(dev, count * sizeof(struct hal2_desc),
+   &desc_dma, DMA_BIDIRECTIONAL, GFP_KERNEL);
if (!desc) {
-   dma_free_attrs(dev, H2_BUF_SIZE, codec->buffer, buffer_dma,
-  DMA_ATTR_NON_CONSISTENT);
+   dma_free_noncoherent(dev, H2_BUF_SIZE, codec->buffer, 
buffer_dma,
+   buffer_dir);
return -ENOMEM;
}
codec->buffer_dma = buffer_dma;
@@ -470,20 +471,22 @@ static int hal2_alloc_dmabuf(struct snd_hal2 *hal2, 
struct hal2_codec *codec)
  desc_dma : desc_dma + (i + 1) * sizeof(struct hal2_desc);
desc++;
}
-   dma_cache_sync(dev, codec->desc, count * sizeof(struct hal2_desc),
-  DMA_TO_DEVICE);
+   dma_sync_single_for_device(dev, codec->desc_dma,
+  count * sizeof(struct hal2_desc),
+  DMA_BIDIRECTIONAL);
codec->desc_count = count;
return 0;
 }
 
-static void hal2_free_dmabuf(struct snd_hal2 *hal2, struct hal2_codec *codec)
+static void hal2_free_dmabuf(struct snd_hal2 *hal2, struct hal2_codec *codec,
+   enum dma_data_direction buffer_dir)
 {
struct device *dev = hal2->card->dev;
 
-   dma_free_attrs(dev, codec->desc_count * sizeof(struct hal2_desc),
-  codec->desc, codec->desc_dma, DMA_ATTR_NON_CONSISTENT);
-   dma_free_attrs(dev, H2_BUF_SIZE, codec->buffer, codec->buffer_dma,
-  DMA_ATTR_NON_CONSISTENT);
+   dma_free_noncoherent(dev, codec->desc_count * sizeof(struct hal2_desc),
+  codec->desc, codec->desc_dma, DMA_BIDIRECTIONAL);
+   dma_free_noncoherent(dev, H2_BUF_SIZE, codec->buffer, codec->buffer_dma,
+   buffer_dir);
 }
 
 static const struct snd_pcm_hardware hal2_pcm_hw = {
@@ -509,21 +512,16 @@ static int hal2_playback_open(struct snd_pcm_substream 
*substream)
 {
struct snd_pcm_runtime *runtime = substream->runtime;
struct snd_hal2 *hal2 = snd_pcm_substream_chip(substream);
-   int err;
 
runtime->hw = hal2_pcm_hw;
-
-   err = hal2_alloc_dmabuf(hal2, &hal2->dac);
-   if (err)
-   return err;
-   return 0;
+   return hal2_alloc_dmabuf(hal2, &hal2->dac, DMA_TO_DEVICE);
 }
 
 static int hal2_playback_close(struct snd_pcm_substream *substream)
 {
struct snd_hal2 *hal2 = snd_pcm_substream_chip(substream);
 
-   hal2_free_dmabuf(hal2, &hal2->dac);
+   hal2_free_dmabuf(hal2, &hal2->dac, DMA_TO_DEVICE);
return 0;
 }
 
@@ -579,7 +577,9 @@ static void hal2_playback_transfer(struct snd_pcm_substream 
*substream,
unsigned char *buf = hal2->dac.buffer + rec->hw_data;
 
memcpy(buf, substream->runtime->dma_area + rec->sw_data, bytes);
-   dma_cache_sync(hal2->card->dev, buf, bytes, DMA_TO_DEVICE);
+   dma_sync_single_for_device(hal2->card->dev,
+   hal2->dac.buffer_dma + rec->hw_data, bytes,
+   DMA_TO_DEVICE);
 
 }
 
@@ -597,22 +597,16 @@ static int hal2_capture_open(struct snd_pcm_substream 
*substream)
 {
struct snd_pcm_runtime *runtime = substream->runtime;
struct snd_hal2 *hal2 = snd_pcm_substream_chip(substream);
-   struct hal2_codec *adc = &hal2->adc;
-   int err

[PATCH 09/17] sgiwd93: convert to dma_alloc_noncoherent

2020-09-14 Thread Christoph Hellwig
Use the new non-coherent DMA API including proper ownership transfers.
This also means we can allocate the memory as DMA_TO_DEVICE instead
of bidirectional.

Signed-off-by: Christoph Hellwig 
---
 drivers/scsi/sgiwd93.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/sgiwd93.c b/drivers/scsi/sgiwd93.c
index 3bdf0deb8f1529..cf1030c9dda17f 100644
--- a/drivers/scsi/sgiwd93.c
+++ b/drivers/scsi/sgiwd93.c
@@ -95,7 +95,7 @@ void fill_hpc_entries(struct ip22_hostdata *hd, struct 
scsi_cmnd *cmd, int din)
 */
hcp->desc.pbuf = 0;
hcp->desc.cntinfo = HPCDMA_EOX;
-   dma_cache_sync(hd->dev, hd->cpu,
+   dma_sync_single_for_device(hd->dev, hd->dma,
   (unsigned long)(hcp + 1) - (unsigned long)hd->cpu,
   DMA_TO_DEVICE);
 }
@@ -234,8 +234,8 @@ static int sgiwd93_probe(struct platform_device *pdev)
 
hdata = host_to_hostdata(host);
hdata->dev = &pdev->dev;
-   hdata->cpu = dma_alloc_attrs(&pdev->dev, HPC_DMA_SIZE, &hdata->dma,
-GFP_KERNEL, DMA_ATTR_NON_CONSISTENT);
+   hdata->cpu = dma_alloc_noncoherent(&pdev->dev, HPC_DMA_SIZE,
+   &hdata->dma, DMA_TO_DEVICE, GFP_KERNEL);
if (!hdata->cpu) {
printk(KERN_WARNING "sgiwd93: Could not allocate memory for "
   "host %d buffer.\n", unit);
@@ -274,8 +274,8 @@ static int sgiwd93_probe(struct platform_device *pdev)
 out_irq:
free_irq(irq, host);
 out_free:
-   dma_free_attrs(&pdev->dev, HPC_DMA_SIZE, hdata->cpu, hdata->dma,
-  DMA_ATTR_NON_CONSISTENT);
+   dma_free_noncoherent(&pdev->dev, HPC_DMA_SIZE, hdata->cpu, hdata->dma,
+   DMA_TO_DEVICE);
 out_put:
scsi_host_put(host);
 out:
@@ -291,8 +291,8 @@ static int sgiwd93_remove(struct platform_device *pdev)
 
scsi_remove_host(host);
free_irq(pd->irq, host);
-   dma_free_attrs(&pdev->dev, HPC_DMA_SIZE, hdata->cpu, hdata->dma,
-  DMA_ATTR_NON_CONSISTENT);
+   dma_free_noncoherent(&pdev->dev, HPC_DMA_SIZE, hdata->cpu, hdata->dma,
+   DMA_TO_DEVICE);
scsi_host_put(host);
return 0;
 }
-- 
2.28.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 08/17] dma-mapping: add a new dma_alloc_noncoherent API

2020-09-14 Thread Christoph Hellwig
Add a new API to allocate and free memory that is guaranteed to be
addressable by a device, but which potentially is not cache coherent
for DMA.

To transfer ownership to and from the device, the existing streaming
DMA API calls dma_sync_single_for_device and dma_sync_single_for_cpu
must be used.
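
For example (a minimal sketch of the intended usage; 'data' and 'len'
are placeholders, not names from this series):

	vaddr = dma_alloc_noncoherent(dev, size, &dma_handle,
				      DMA_TO_DEVICE, GFP_KERNEL);
	if (!vaddr)
		return -ENOMEM;

	/* the CPU fills the buffer, then hands ownership to the device */
	memcpy(vaddr, data, len);
	dma_sync_single_for_device(dev, dma_handle, len, DMA_TO_DEVICE);

	/* ... the device reads the buffer via dma_handle ... */

	dma_free_noncoherent(dev, size, vaddr, dma_handle, DMA_TO_DEVICE);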

For now the new calls are implemented on top of dma_alloc_attrs just
like the old non-coherent API, but once all drivers are switched to
the new API, it will be replaced with a better-working implementation
that is available on all architectures.

Signed-off-by: Christoph Hellwig 
---
 Documentation/core-api/dma-api.rst | 75 ++
 include/linux/dma-mapping.h| 12 +
 2 files changed, 48 insertions(+), 39 deletions(-)

diff --git a/Documentation/core-api/dma-api.rst 
b/Documentation/core-api/dma-api.rst
index 90239348b30f6f..ea0413276ddb70 100644
--- a/Documentation/core-api/dma-api.rst
+++ b/Documentation/core-api/dma-api.rst
@@ -516,48 +516,56 @@ routines, e.g.:::
}
 
 
-Part II - Advanced dma usage
-----------------------------
+Part II - Non-coherent DMA allocations
+--------------------------------------
 
-Warning: These pieces of the DMA API should not be used in the
-majority of cases, since they cater for unlikely corner cases that
-don't belong in usual drivers.
+These APIs allow to allocate pages in the kernel direct mapping that are
+guaranteed to be DMA addressable.  This means that unlike dma_alloc_coherent,
+virt_to_page can be called on the resulting address, and the resulting
+struct page can be used for everything a struct page is suitable for.
 
-If you don't understand how cache line coherency works between a
-processor and an I/O device, you should not be using this part of the
-API at all.
+If you don't understand how cache line coherency works between a processor and
+an I/O device, you should not be using this part of the API.
 
 ::
 
void *
-   dma_alloc_attrs(struct device *dev, size_t size, dma_addr_t *dma_handle,
-   gfp_t flag, unsigned long attrs)
+   dma_alloc_noncoherent(struct device *dev, size_t size,
+   dma_addr_t *dma_handle, enum dma_data_direction dir,
+   gfp_t gfp)
 
-Identical to dma_alloc_coherent() except that when the
-DMA_ATTR_NON_CONSISTENT flags is passed in the attrs argument, the
-platform will choose to return either consistent or non-consistent memory
-as it sees fit.  By using this API, you are guaranteeing to the platform
-that you have all the correct and necessary sync points for this memory
-in the driver should it choose to return non-consistent memory.
+This routine allocates a region of <size> bytes of consistent memory.  It
+returns a pointer to the allocated region (in the processor's virtual address
+space) or NULL if the allocation failed.  The returned memory may or may not
+be in the kernels direct mapping.  Drivers must not call virt_to_page on
+the returned memory region.
 
-Note: where the platform can return consistent memory, it will
-guarantee that the sync points become nops.
+It also returns a <dma_handle> which may be cast to an unsigned integer the
+same width as the bus and given to the device as the DMA address base of
+the region.
 
-Warning:  Handling non-consistent memory is a real pain.  You should
-only use this API if you positively know your driver will be
-required to work on one of the rare (usually non-PCI) architectures
-that simply cannot make consistent memory.
+The dir parameter specifies whether data is read and/or written by the device,
+see dma_map_single() for details.
+
+The gfp parameter allows the caller to specify the ``GFP_`` flags (see
+kmalloc()) for the allocation, but rejects flags used to specify a memory
+zone such as GFP_DMA or GFP_HIGHMEM.
+
+Before giving the memory to the device, dma_sync_single_for_device() needs
+to be called, and before reading memory written by the device,
+dma_sync_single_for_cpu(), just like for streaming DMA mappings that are
+reused.
 
 ::
 
void
-   dma_free_attrs(struct device *dev, size_t size, void *cpu_addr,
-  dma_addr_t dma_handle, unsigned long attrs)
+   dma_free_noncoherent(struct device *dev, size_t size, void *cpu_addr,
+   dma_addr_t dma_handle, enum dma_data_direction dir)
 
-Free memory allocated by the dma_alloc_attrs().  All common
-parameters must be identical to those otherwise passed to dma_free_coherent,
-and the attrs argument must be identical to the attrs passed to
-dma_alloc_attrs().
+Free a region of memory previously allocated using dma_alloc_noncoherent().
+dev, size, dma_handle and dir must all be the same as those passed into
+dma_alloc_noncoherent().  cpu_addr must be the virtual address returned by
+dma_alloc_noncoherent().
 
 ::
 
@@ -575,17 +583,6 @@ memory or doing partial flushes.
into the width returned by this call.  It will also always be a power
of two for easy alignment

Re: [PATCH] iommu/amd: fix interrupt remapping for avic

2020-09-14 Thread Suravee Suthikulpanit

Maxim,

On 9/13/2020 7:42 PM, Maxim Levitsky wrote:

Commit e52d58d54a32 ("iommu/amd: Use cmpxchg_double() when updating 128-bit 
IRTE")
accidentally removed the guarantee that modify_irte_ga always sets the valid
bit, which amd_iommu_activate_guest_mode relied on.

A side effect of this is that on my machine, VFIO based VMs with AVIC enabled
would eventually crash and show IOMMU errors like that:

AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0055 address=0xfffdf800 
flags=0x0008]

Fixes: e52d58d54a321 ("iommu/amd: Use cmpxchg_double() when updating 128-bit 
IRTE")
Signed-off-by: Maxim Levitsky 
---
  drivers/iommu/amd/iommu.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 07ae8b93887e5..aff4cc1869356 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3853,6 +3853,7 @@ int amd_iommu_activate_guest_mode(void *data)
entry->hi.fields.ga_root_ptr   = ir_data->ga_root_ptr;
entry->hi.fields.vector= ir_data->ga_vector;
entry->lo.fields_vapic.ga_tag  = ir_data->ga_tag;
+   entry->lo.fields_remap.valid = 1;
  
  	return modify_irte_ga(ir_data->irq_2_irte.devid,
  			      ir_data->irq_2_irte.index, entry, ir_data);



Could you please try with the following patch instead?

--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3840,14 +3840,18 @@ int amd_iommu_activate_guest_mode(void *data)
 {
struct amd_ir_data *ir_data = (struct amd_ir_data *)data;
struct irte_ga *entry = (struct irte_ga *) ir_data->entry;
+   u64 valid;

if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) ||
!entry || entry->lo.fields_vapic.guest_mode)
return 0;

+   valid = entry->lo.fields_vapic.valid;
+
entry->lo.val = 0;
entry->hi.val = 0;

+   entry->lo.fields_vapic.valid   = valid;
entry->lo.fields_vapic.guest_mode  = 1;
entry->lo.fields_vapic.ga_log_intr = 1;
entry->hi.fields.ga_root_ptr   = ir_data->ga_root_ptr;
@@ -3864,12 +3868,14 @@ int amd_iommu_deactivate_guest_mode(void *data)
struct amd_ir_data *ir_data = (struct amd_ir_data *)data;
struct irte_ga *entry = (struct irte_ga *) ir_data->entry;
struct irq_cfg *cfg = ir_data->cfg;
-   u64 valid = entry->lo.fields_remap.valid;
+   u64 valid;

if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) ||
!entry || !entry->lo.fields_vapic.guest_mode)
return 0;

+   valid = entry->lo.fields_remap.valid;
+
entry->lo.val = 0;
entry->hi.val = 0;
--

Thanks,
Suravee


[PATCH 07/17] 53c700: improve non-coherent DMA handling

2020-09-14 Thread Christoph Hellwig
Switch the 53c700 driver to only use non-coherent descriptor memory if it
really has to, that is when dma_alloc_coherent fails.  This doesn't matter
for any of the platforms it runs on currently, but that will change soon.

To help with this, two new helpers to transfer ownership to and from the
device are added that abstract the syncing of the non-coherent memory.
The two current bidirectional cases are mapped to transfers to the
device, as that appears to be what they are used for.  Note that for parisc,
which is the only architecture this driver needs to use non-coherent
memory on, the direction argument of dma_cache_sync is ignored, so this
will not change behavior in any way.

Signed-off-by: Christoph Hellwig 
---
 drivers/scsi/53c700.c | 113 +++---
 drivers/scsi/53c700.h |   9 ++--
 2 files changed, 68 insertions(+), 54 deletions(-)

diff --git a/drivers/scsi/53c700.c b/drivers/scsi/53c700.c
index 84b57a8f86bfa9..c59226d7e2f6b5 100644
--- a/drivers/scsi/53c700.c
+++ b/drivers/scsi/53c700.c
@@ -269,6 +269,20 @@ NCR_700_get_SXFER(struct scsi_device *SDp)
  spi_period(SDp->sdev_target));
 }
 
+static inline void dma_sync_to_dev(struct NCR_700_Host_Parameters *h,
+   void *addr, size_t size)
+{
+   if (h->noncoherent)
+   dma_cache_sync(h->dev, addr, size, DMA_TO_DEVICE);
+}
+
+static inline void dma_sync_from_dev(struct NCR_700_Host_Parameters *h,
+   void *addr, size_t size)
+{
+   if (h->noncoherent)
+   dma_cache_sync(h->dev, addr, size, DMA_FROM_DEVICE);
+}
+
 struct Scsi_Host *
 NCR_700_detect(struct scsi_host_template *tpnt,
   struct NCR_700_Host_Parameters *hostdata, struct device *dev)
@@ -283,9 +297,13 @@ NCR_700_detect(struct scsi_host_template *tpnt,
if(tpnt->sdev_attrs == NULL)
tpnt->sdev_attrs = NCR_700_dev_attrs;
 
-   memory = dma_alloc_attrs(dev, TOTAL_MEM_SIZE, &pScript,
-GFP_KERNEL, DMA_ATTR_NON_CONSISTENT);
-   if(memory == NULL) {
+   memory = dma_alloc_coherent(dev, TOTAL_MEM_SIZE, &pScript, GFP_KERNEL);
+   if (!memory) {
+   hostdata->noncoherent = 1;
+   memory = dma_alloc_attrs(dev, TOTAL_MEM_SIZE, &pScript,
+GFP_KERNEL, DMA_ATTR_NON_CONSISTENT);
+   }
+   if (!memory) {
printk(KERN_ERR "53c700: Failed to allocate memory for driver, 
detaching\n");
return NULL;
}
@@ -339,11 +357,11 @@ NCR_700_detect(struct scsi_host_template *tpnt,
for (j = 0; j < PATCHES; j++)
script[LABELPATCHES[j]] = bS_to_host(pScript + 
SCRIPT[LABELPATCHES[j]]);
/* now patch up fixed addresses. */
-   script_patch_32(hostdata->dev, script, MessageLocation,
+   script_patch_32(hostdata, script, MessageLocation,
pScript + MSGOUT_OFFSET);
-   script_patch_32(hostdata->dev, script, StatusAddress,
+   script_patch_32(hostdata, script, StatusAddress,
pScript + STATUS_OFFSET);
-   script_patch_32(hostdata->dev, script, ReceiveMsgAddress,
+   script_patch_32(hostdata, script, ReceiveMsgAddress,
pScript + MSGIN_OFFSET);
 
hostdata->script = script;
@@ -395,8 +413,12 @@ NCR_700_release(struct Scsi_Host *host)
struct NCR_700_Host_Parameters *hostdata = 
(struct NCR_700_Host_Parameters *)host->hostdata[0];
 
-   dma_free_attrs(hostdata->dev, TOTAL_MEM_SIZE, hostdata->script,
-  hostdata->pScript, DMA_ATTR_NON_CONSISTENT);
+   if (hostdata->noncoherent)
+   dma_free_attrs(hostdata->dev, TOTAL_MEM_SIZE, hostdata->script,
+  hostdata->pScript, DMA_ATTR_NON_CONSISTENT);
+   else
+   dma_free_coherent(hostdata->dev, TOTAL_MEM_SIZE,
+ hostdata->script, hostdata->pScript);
return 1;
 }
 
@@ -804,8 +826,8 @@ process_extended_message(struct Scsi_Host *host,
shost_printk(KERN_WARNING, host,
"Unexpected SDTR msg\n");
hostdata->msgout[0] = A_REJECT_MSG;
-   dma_cache_sync(hostdata->dev, hostdata->msgout, 1, 
DMA_TO_DEVICE);
-   script_patch_16(hostdata->dev, hostdata->script,
+   dma_sync_to_dev(hostdata, hostdata->msgout, 1);
+   script_patch_16(hostdata, hostdata->script,
MessageCount, 1);
/* SendMsgOut returns, so set up the return
 * address */
@@ -817,9 +839,8 @@ process_extended_message(struct Scsi_Host *host,
printk(KERN_INFO "scsi%d: (%d:%d), Unsolicited WDTR after CMD, 
Rejecting\n",
   host->host_no, pun, lun);
hostdata->msgout[0] 

[PATCH 06/17] lib82596: move DMA allocation into the callers of i82596_probe

2020-09-14 Thread Christoph Hellwig
This allows us to get rid of the LIB82596_DMA_ATTR define and prepare
for untangling the coherent vs non-coherent DMA allocation API.

Signed-off-by: Christoph Hellwig 
---
 drivers/net/ethernet/i825xx/lasi_82596.c | 24 ++--
 drivers/net/ethernet/i825xx/lib82596.c   | 36 
 drivers/net/ethernet/i825xx/sni_82596.c  | 19 +
 3 files changed, 40 insertions(+), 39 deletions(-)

diff --git a/drivers/net/ethernet/i825xx/lasi_82596.c 
b/drivers/net/ethernet/i825xx/lasi_82596.c
index aec7e98bcc853a..a12218e940a2fa 100644
--- a/drivers/net/ethernet/i825xx/lasi_82596.c
+++ b/drivers/net/ethernet/i825xx/lasi_82596.c
@@ -96,8 +96,6 @@
 
 #define OPT_SWAP_PORT  0x0001  /* Need to wordswp on the MPU port */
 
-#define LIB82596_DMA_ATTR  DMA_ATTR_NON_CONSISTENT
-
 #define DMA_WBACK(ndev, addr, len) \
do { dma_cache_sync((ndev)->dev.parent, (void *)addr, len, 
DMA_TO_DEVICE); } while (0)
 
@@ -155,7 +153,7 @@ lan_init_chip(struct parisc_device *dev)
 {
struct  net_device *netdevice;
struct i596_private *lp;
-   int retval;
+   int retval = -ENOMEM;
int i;
 
if (!dev->irq) {
@@ -186,12 +184,22 @@ lan_init_chip(struct parisc_device *dev)
 
lp = netdev_priv(netdevice);
lp->options = dev->id.sversion == 0x72 ? OPT_SWAP_PORT : 0;
+   lp->dma = dma_alloc_attrs(&dev->dev, sizeof(struct i596_dma),
+ &lp->dma_addr, GFP_KERNEL,
+ DMA_ATTR_NON_CONSISTENT);
+   if (!lp->dma)
+   goto out_free_netdev;
 
retval = i82596_probe(netdevice);
-   if (retval) {
-   free_netdev(netdevice);
-   return -ENODEV;
-   }
+   if (retval)
+   goto out_free_dma;
+   return 0;
+
+out_free_dma:
+   dma_free_attrs(&dev->dev, sizeof(struct i596_dma), lp->dma,
+   lp->dma_addr, DMA_ATTR_NON_CONSISTENT);
+out_free_netdev:
+   free_netdev(netdevice);
return retval;
 }
 
@@ -202,7 +210,7 @@ static int __exit lan_remove_chip(struct parisc_device 
*pdev)
 
unregister_netdev (dev);
dma_free_attrs(&pdev->dev, sizeof(struct i596_private), lp->dma,
-  lp->dma_addr, LIB82596_DMA_ATTR);
+  lp->dma_addr, DMA_ATTR_NON_CONSISTENT);
free_netdev (dev);
return 0;
 }
diff --git a/drivers/net/ethernet/i825xx/lib82596.c 
b/drivers/net/ethernet/i825xx/lib82596.c
index b03757e169e475..b4e4b3eb5758b5 100644
--- a/drivers/net/ethernet/i825xx/lib82596.c
+++ b/drivers/net/ethernet/i825xx/lib82596.c
@@ -1047,9 +1047,8 @@ static const struct net_device_ops i596_netdev_ops = {
 
 static int i82596_probe(struct net_device *dev)
 {
-   int i;
struct i596_private *lp = netdev_priv(dev);
-   struct i596_dma *dma;
+   int ret;
 
/* This lot is ensure things have been cache line aligned. */
BUILD_BUG_ON(sizeof(struct i596_rfd) != 32);
@@ -1063,41 +1062,28 @@ static int i82596_probe(struct net_device *dev)
if (!dev->base_addr || !dev->irq)
return -ENODEV;
 
-   dma = dma_alloc_attrs(dev->dev.parent, sizeof(struct i596_dma),
- &lp->dma_addr, GFP_KERNEL,
- LIB82596_DMA_ATTR);
-   if (!dma) {
-   printk(KERN_ERR "%s: Couldn't get shared memory\n", __FILE__);
-   return -ENOMEM;
-   }
-
dev->netdev_ops = &i596_netdev_ops;
dev->watchdog_timeo = TX_TIMEOUT;
 
-   memset(dma, 0, sizeof(struct i596_dma));
-   lp->dma = dma;
-
-   dma->scb.command = 0;
-   dma->scb.cmd = I596_NULL;
-   dma->scb.rfd = I596_NULL;
+   memset(lp->dma, 0, sizeof(struct i596_dma));
+   lp->dma->scb.command = 0;
+   lp->dma->scb.cmd = I596_NULL;
+   lp->dma->scb.rfd = I596_NULL;
spin_lock_init(&lp->lock);
 
-   DMA_WBACK_INV(dev, dma, sizeof(struct i596_dma));
+   DMA_WBACK_INV(dev, lp->dma, sizeof(struct i596_dma));
 
-   i = register_netdev(dev);
-   if (i) {
-   dma_free_attrs(dev->dev.parent, sizeof(struct i596_dma),
-  dma, lp->dma_addr, LIB82596_DMA_ATTR);
-   return i;
-   }
+   ret = register_netdev(dev);
+   if (ret)
+   return ret;
 
DEB(DEB_PROBE, printk(KERN_INFO "%s: 82596 at %#3lx, %pM IRQ %d.\n",
  dev->name, dev->base_addr, dev->dev_addr,
  dev->irq));
DEB(DEB_INIT, printk(KERN_INFO
 "%s: dma at 0x%p (%d bytes), lp->scb at 0x%p\n",
-dev->name, dma, (int)sizeof(struct i596_dma),
-&dma->scb));
+dev->name, lp->dma, (int)sizeof(struct i596_dma),
+&lp->dma->scb));
 
return 0;
 }
diff --git a/drivers/net/ethernet/i825xx/sni_82596.c 
b/drivers/n

[PATCH 05/17] net/au1000-eth: stop using DMA_ATTR_NON_CONSISTENT

2020-09-14 Thread Christoph Hellwig
The au1000-eth driver contains none of the manual cache synchronization
required for using DMA_ATTR_NON_CONSISTENT.  From what I can tell it
can be used on both dma coherent and non-coherent DMA platforms, but
I suspect it has been buggy on the non-coherent platforms all along.
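
For reference, a driver correctly using DMA_ATTR_NON_CONSISTENT memory
would have to bracket device accesses with explicit synchronization,
roughly like this (a sketch only; buf, dma and size are placeholders):

	buf = dma_alloc_attrs(dev, size, &dma, GFP_KERNEL,
			DMA_ATTR_NON_CONSISTENT);

	/* flush CPU writes before the device reads the buffer */
	dma_cache_sync(dev, buf, size, DMA_TO_DEVICE);
	/* ... device DMA runs ... */
	/* invalidate before the CPU reads device-written data */
	dma_cache_sync(dev, buf, size, DMA_FROM_DEVICE);

au1000_eth does none of this, so switching it to plain dma_alloc_coherent
does not remove anything it relied on.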

Signed-off-by: Christoph Hellwig 
---
 drivers/net/ethernet/amd/au1000_eth.c | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/amd/au1000_eth.c 
b/drivers/net/ethernet/amd/au1000_eth.c
index 75dbd221dc594b..19e195420e2434 100644
--- a/drivers/net/ethernet/amd/au1000_eth.c
+++ b/drivers/net/ethernet/amd/au1000_eth.c
@@ -1131,10 +1131,9 @@ static int au1000_probe(struct platform_device *pdev)
/* Allocate the data buffers
 * Snooping works fine with eth on all au1xxx
 */
-   aup->vaddr = (u32)dma_alloc_attrs(&pdev->dev, MAX_BUF_SIZE *
+   aup->vaddr = (u32)dma_alloc_coherent(&pdev->dev, MAX_BUF_SIZE *
  (NUM_TX_BUFFS + NUM_RX_BUFFS),
- &aup->dma_addr, 0,
- DMA_ATTR_NON_CONSISTENT);
+ &aup->dma_addr, 0);
if (!aup->vaddr) {
dev_err(&pdev->dev, "failed to allocate data buffers\n");
err = -ENOMEM;
@@ -1310,9 +1309,8 @@ static int au1000_probe(struct platform_device *pdev)
 err_remap2:
iounmap(aup->mac);
 err_remap1:
-   dma_free_attrs(&pdev->dev, MAX_BUF_SIZE * (NUM_TX_BUFFS + NUM_RX_BUFFS),
-   (void *)aup->vaddr, aup->dma_addr,
-   DMA_ATTR_NON_CONSISTENT);
+   dma_free_coherent(&pdev->dev, MAX_BUF_SIZE * (NUM_TX_BUFFS + 
NUM_RX_BUFFS),
+   (void *)aup->vaddr, aup->dma_addr);
 err_vaddr:
free_netdev(dev);
 err_alloc:
@@ -1344,9 +1342,8 @@ static int au1000_remove(struct platform_device *pdev)
if (aup->tx_db_inuse[i])
au1000_ReleaseDB(aup, aup->tx_db_inuse[i]);
 
-   dma_free_attrs(&pdev->dev, MAX_BUF_SIZE * (NUM_TX_BUFFS + NUM_RX_BUFFS),
-   (void *)aup->vaddr, aup->dma_addr,
-   DMA_ATTR_NON_CONSISTENT);
+   dma_free_coherent(&pdev->dev, MAX_BUF_SIZE * (NUM_TX_BUFFS + 
NUM_RX_BUFFS),
+   (void *)aup->vaddr, aup->dma_addr);
 
iounmap(aup->macdma);
iounmap(aup->mac);
-- 
2.28.0



[PATCH 04/17] drm/nouveau/gk20a: stop setting DMA_ATTR_NON_CONSISTENT

2020-09-14 Thread Christoph Hellwig
DMA_ATTR_NON_CONSISTENT is a no-op except on PARISC and some mips
configs, so don't set it in this ARM specific driver part.

Signed-off-by: Christoph Hellwig 
---
 drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c
index 985f2990ab0dda..13d4d7ac0697b4 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c
@@ -594,8 +594,7 @@ gk20a_instmem_new(struct nvkm_device *device, int index,
 
nvkm_info(&imem->base.subdev, "using IOMMU\n");
} else {
-   imem->attrs = DMA_ATTR_NON_CONSISTENT |
- DMA_ATTR_WEAK_ORDERING |
+   imem->attrs = DMA_ATTR_WEAK_ORDERING |
  DMA_ATTR_WRITE_COMBINE;
 
nvkm_info(&imem->base.subdev, "using DMA API\n");
-- 
2.28.0



[PATCH 03/17] drm/exynos: stop setting DMA_ATTR_NON_CONSISTENT

2020-09-14 Thread Christoph Hellwig
DMA_ATTR_NON_CONSISTENT is a no-op except on PARISC and some mips
configs, so don't set it in this ARM specific driver.

Signed-off-by: Christoph Hellwig 
---
 drivers/gpu/drm/exynos/exynos_drm_gem.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c 
b/drivers/gpu/drm/exynos/exynos_drm_gem.c
index efa476858db54b..07073222b8f691 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
@@ -42,8 +42,6 @@ static int exynos_drm_alloc_buf(struct exynos_drm_gem 
*exynos_gem, bool kvmap)
if (exynos_gem->flags & EXYNOS_BO_WC ||
!(exynos_gem->flags & EXYNOS_BO_CACHABLE))
attr |= DMA_ATTR_WRITE_COMBINE;
-   else
-   attr |= DMA_ATTR_NON_CONSISTENT;
 
/* FBDev emulation requires kernel mapping */
if (!kvmap)
-- 
2.28.0



[PATCH 02/17] mm: turn alloc_pages into an inline function

2020-09-14 Thread Christoph Hellwig
To prevent a compiler error when a call to an alloc_pages method is
added (which I plan to do for the dma_map_ops).
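
A hypothetical sketch of the problem (the struct and variable names here
are illustrative only): because the old definition is a function-like
macro, a call through a method of the same name gets rewritten by the
preprocessor:

	struct dma_ops_sketch {
		struct page *(*alloc_pages)(struct device *dev, size_t size);
	};

	/*
	 * With "#define alloc_pages(gfp_mask, order) ..." in scope, this
	 * call expands to ops->alloc_pages_node(numa_node_id(), dev, size)
	 * and fails to compile.  An inline function is not expanded here,
	 * so the member name can be reused.
	 */
	struct page *p = ops->alloc_pages(dev, size);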

Signed-off-by: Christoph Hellwig 
---
 include/linux/gfp.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 67a0774e080b98..dd2577c5407112 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -550,8 +550,10 @@ extern struct page *alloc_pages_vma(gfp_t gfp_mask, int 
order,
 #define alloc_hugepage_vma(gfp_mask, vma, addr, order) \
alloc_pages_vma(gfp_mask, order, vma, addr, numa_node_id(), true)
 #else
-#define alloc_pages(gfp_mask, order) \
-   alloc_pages_node(numa_node_id(), gfp_mask, order)
+static inline struct page *alloc_pages(gfp_t gfp_mask, unsigned int order)
+{
+   return alloc_pages_node(numa_node_id(), gfp_mask, order);
+}
 #define alloc_pages_vma(gfp_mask, order, vma, addr, node, false)\
alloc_pages(gfp_mask, order)
 #define alloc_hugepage_vma(gfp_mask, vma, addr, order) \
-- 
2.28.0



[PATCH 01/17] media/v4l2: remove V4L2-FLAG-MEMORY-NON-CONSISTENT flag

2020-09-14 Thread Christoph Hellwig
From: Sergey Senozhatsky 

The patch partially reverts some of the UAPI bits of the buffer
cache management hints. Namely, the queue consistency (memory
coherency) user-space hint because, as it turned out, the kernel
implementation of this feature was misusing DMA_ATTR_NON_CONSISTENT.

The patch reverts both kernel and user space parts: it removes the
DMA consistency attr functions, rolls back changes to the
v4l2_requestbuffers and v4l2_create_buffers structures and the
corresponding UAPI functions (plus the compat32 layer), and cleans up
the documentation.

Signed-off-by: Sergey Senozhatsky 
Signed-off-by: Christoph Hellwig 
---
 .../userspace-api/media/v4l/buffer.rst| 17 ---
 .../media/v4l/vidioc-create-bufs.rst  |  6 +--
 .../media/v4l/vidioc-reqbufs.rst  | 12 +
 .../media/common/videobuf2/videobuf2-core.c   | 46 +++
 .../common/videobuf2/videobuf2-dma-contig.c   | 19 
 .../media/common/videobuf2/videobuf2-dma-sg.c |  3 +-
 .../media/common/videobuf2/videobuf2-v4l2.c   | 18 +---
 drivers/media/v4l2-core/v4l2-compat-ioctl32.c | 10 +---
 drivers/media/v4l2-core/v4l2-ioctl.c  |  5 +-
 include/media/videobuf2-core.h|  7 +--
 include/uapi/linux/videodev2.h| 13 +-
 11 files changed, 22 insertions(+), 134 deletions(-)

diff --git a/Documentation/userspace-api/media/v4l/buffer.rst 
b/Documentation/userspace-api/media/v4l/buffer.rst
index 57e752aaf414a7..2044ed13cd9d7d 100644
--- a/Documentation/userspace-api/media/v4l/buffer.rst
+++ b/Documentation/userspace-api/media/v4l/buffer.rst
@@ -701,23 +701,6 @@ Memory Consistency Flags
 :stub-columns: 0
 :widths:   3 1 4
 
-* .. _`V4L2-FLAG-MEMORY-NON-CONSISTENT`:
-
-  - ``V4L2_FLAG_MEMORY_NON_CONSISTENT``
-  - 0x0001
-  - A buffer is allocated either in consistent (it will be automatically
-   coherent between the CPU and the bus) or non-consistent memory. The
-   latter can provide performance gains, for instance the CPU cache
-   sync/flush operations can be avoided if the buffer is accessed by the
-   corresponding device only and the CPU does not read/write to/from that
-   buffer. However, this requires extra care from the driver -- it must
-   guarantee memory consistency by issuing a cache flush/sync when
-   consistency is needed. If this flag is set V4L2 will attempt to
-   allocate the buffer in non-consistent memory. The flag takes effect
-   only if the buffer is used for :ref:`memory mapping ` I/O and the
-   queue reports the :ref:`V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS
-   ` capability.
-
 .. c:type:: v4l2_memory
 
 enum v4l2_memory
diff --git a/Documentation/userspace-api/media/v4l/vidioc-create-bufs.rst 
b/Documentation/userspace-api/media/v4l/vidioc-create-bufs.rst
index f2a702870fadc1..12cf6b44f414f7 100644
--- a/Documentation/userspace-api/media/v4l/vidioc-create-bufs.rst
+++ b/Documentation/userspace-api/media/v4l/vidioc-create-bufs.rst
@@ -120,13 +120,9 @@ than the number requested.
If you want to just query the capabilities without making any
other changes, then set ``count`` to 0, ``memory`` to
``V4L2_MEMORY_MMAP`` and ``format.type`` to the buffer type.
-* - __u32
-  - ``flags``
-  - Specifies additional buffer management attributes.
-   See :ref:`memory-flags`.
 
 * - __u32
-  - ``reserved``\ [6]
+  - ``reserved``\ [7]
   - A place holder for future extensions. Drivers and applications
must set the array to zero.
 
diff --git a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst 
b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
index 75d894d9c36c42..0e3e2fde65e850 100644
--- a/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
+++ b/Documentation/userspace-api/media/v4l/vidioc-reqbufs.rst
@@ -112,17 +112,10 @@ aborting or finishing any DMA in progress, an implicit
``V4L2_MEMORY_MMAP`` and ``type`` set to the buffer type. This will
free any previously allocated buffers, so this is typically something
that will be done at the start of the application.
-* - union {
-  - (anonymous)
-* - __u32
-  - ``flags``
-  - Specifies additional buffer management attributes.
-   See :ref:`memory-flags`.
 * - __u32
   - ``reserved``\ [1]
-  - Kept for backwards compatibility. Use ``flags`` instead.
-* - }
-  -
+  - A place holder for future extensions. Drivers and applications
+   must set the array to zero.
 
 .. tabularcolumns:: |p{6.1cm}|p{2.2cm}|p{8.7cm}|
 
@@ -169,7 +162,6 @@ aborting or finishing any DMA in progress, an implicit
   - This capability is set by the driver to indicate that the queue 
supports
 cache and memory management hints. However, it's only valid when the
 queue is used for :ref:`memory mapping ` streaming I/O. See
-:ref:`V4L2_FLAG_MEMORY_NON_CONSISTENT 

a saner API for allocating DMA addressable pages v2

2020-09-14 Thread Christoph Hellwig
Hi all,

this series replaces the DMA_ATTR_NON_CONSISTENT flag to dma_alloc_attrs
with a separate new dma_alloc_pages API, which is available on all
platforms.  In addition to cleaning up the convoluted code path, this
ensures that other drivers that have asked for better support for
non-coherent DMA to pages without incurring bounce buffering can finally
be properly supported.

I'm still a little unsure about the API naming: in this version
dma_alloc_pages returns a struct page in the kernel direct mapping,
while dma_alloc_noncoherent returns a kernel virtual address.  The
other alternative would have been to use the dma_alloc_noncoherent
name everywhere, but the whole non-coherent naming seems to put
people off.  As a follow-up I plan to move the implementation of the
DMA_ATTR_NO_KERNEL_MAPPING flag over to this framework as well, given
that it also is a fundamentally non-coherent allocation.  The
replacement for that flag would then return a struct page, as it is
allowed to actually return pages without a kernel mapping as the name
suggests (although most of the time they will actually have a kernel
mapping..)

In addition to the conversions of the existing non-coherent DMA users,
I've also added a patch to convert the firewire ohci driver to use
the new dma_alloc_pages API.
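
A minimal sketch of how a driver uses the new call (hedged: dev and
size are placeholders, and the signature shown is the one this series
introduces):

	struct page *page;
	dma_addr_t dma;

	page = dma_alloc_pages(dev, size, &dma, DMA_BIDIRECTIONAL,
			GFP_KERNEL);
	if (!page)
		return -ENOMEM;

	/* the memory is in the direct mapping, so this is valid: */
	void *va = page_address(page);

	/* ownership transfers work like for streaming mappings */
	dma_sync_single_for_device(dev, dma, size, DMA_BIDIRECTIONAL);

	dma_free_pages(dev, size, page, dma, DMA_BIDIRECTIONAL);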

Note that I haven't carried over any Tested-by: tags for the noncoherent
allocation conversions as there was a bit of a patch reshuffle, but the
result should be the same.

The first patch is queued up for 5.9 in the media tree, but included here
for completeness.


A git tree is available here:

git://git.infradead.org/users/hch/misc.git dma_alloc_pages

Gitweb:


http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/dma_alloc_pages


Changes since v1:
 - rebased on the latest dma-mapping tree, which merged many of the
   cleanups
 - fix an argument passing typo in 53c700, caught by sparse
 - rename a few macro arguments in 53c700
 - pass the right device to the DMA API in the lib82596 drivers
 - fix memory ownership transfers in sgiseeq
 - better document what a page in the direct kernel mapping means
 - split into dma_alloc_pages that returns a struct page and is in the
   direct mapping vs dma_alloc_noncoherent that can be vmapped
 - convert the firewire ohci driver to dma_alloc_pages

Diffstat:


Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs

2020-09-14 Thread Jason Gunthorpe
On Mon, Sep 14, 2020 at 03:31:13PM +0200, Jean-Philippe Brucker wrote:

> > Jason suggest something like /dev/sva. There will be a lot of other
> > subsystems that could benefit from this (e.g vDPA).
> 
> Do you have a more precise idea of the interface /dev/sva would provide,
> how it would interact with VFIO and others?  vDPA could transport the
> generic iommu.h structures via its own uAPI, and call the IOMMU API
> directly without going through an intermediate /dev/sva handle.

Prior to PASID, IOMMU really only makes sense as part of vfio-pci
because the iommu can only key on the BDF. That can't work unless the
whole PCI function can be assigned. It is hard to see how a shared PCI
device can work with IOMMU like this, so may as well use vfio.

SVA and various vIOMMU models change this, a shared PCI driver can
absolutely work with a PASID that is assigned to a VM safely, and
actually don't need to know if their PASID maps a mm_struct or
something else.

So, some /dev/sva is another way to allocate a PASID that is not 1:1
with mm_struct, as the existing SVA stuff enforces. ie it is a way to
program the DMA address map of the PASID.

This new PASID allocator would match the guest memory layout and
support the IOMMU nesting stuff needed for vPASID.

This is the common code for the complex cases of virtualization with
PASID, shared by all user DMA drivers, including VFIO.

It doesn't make a lot of sense to build a uAPI exclusive to VFIO just
for PASID and vPASID. We already know everything doing user DMA will
eventually need this stuff.

Jason


Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs

2020-09-14 Thread Jean-Philippe Brucker
On Mon, Sep 14, 2020 at 12:20:10PM +0800, Jason Wang wrote:
> 
> On 2020/9/10 6:45 PM, Liu Yi L wrote:
> > Shared Virtual Addressing (SVA), a.k.a, Shared Virtual Memory (SVM) on
> > Intel platforms allows address space sharing between device DMA and
> > applications. SVA can reduce programming complexity and enhance security.
> > 
> > This VFIO series is intended to expose SVA usage to VMs. i.e. Sharing
> > guest application address space with passthru devices. This is called
> > vSVA in this series. The whole vSVA enabling requires QEMU/VFIO/IOMMU
> > changes. For IOMMU and QEMU changes, they are in separate series (listed
> > in the "Related series").
> > 
> > The high-level architecture for SVA virtualization is as below, the key
> > design of vSVA support is to utilize the dual-stage IOMMU translation (
> > also known as IOMMU nesting translation) capability in host IOMMU.
> > 
> > 
> >  .-.  .---.
> >  |   vIOMMU|  | Guest process CR3, FL only|
> >  | |  '---'
> >  ./
> >  | PASID Entry |--- PASID cache flush -
> >  '-'   |
> >  | |   V
> >  | |CR3 in GPA
> >  '-'
> > Guest
> > --| Shadow |--|
> >vv  v
> > Host
> >  .-.  .--.
> >  |   pIOMMU|  | Bind FL for GVA-GPA  |
> >  | |  '--'
> >  ./  |
> >  | PASID Entry | V (Nested xlate)
> >  '\.--.
> >  | ||SL for GPA-HPA, default domain|
> >  | |   '--'
> >  '-'
> > Where:
> >   - FL = First level/stage one page tables
> >   - SL = Second level/stage two page tables
> > 
> > Patch Overview:
> >   1. reports IOMMU nesting info to userspace ( patch 0001, 0002, 0003, 0015 
> > , 0016)
> >   2. vfio support for PASID allocation and free for VMs (patch 0004, 0005, 
> > 0007)
> >   3. a fix to a revisit in intel iommu driver (patch 0006)
> >   4. vfio support for binding guest page table to host (patch 0008, 0009, 
> > 0010)
> >   5. vfio support for IOMMU cache invalidation from VMs (patch 0011)
> >   6. vfio support for vSVA usage on IOMMU-backed mdevs (patch 0012)
> >   7. expose PASID capability to VM (patch 0013)
> >   8. add doc for VFIO dual stage control (patch 0014)
> 
> 
> If it's possible, I would suggest a generic uAPI instead of a VFIO specific
> one.

A large part of this work is already generic uAPI, in
include/uapi/linux/iommu.h. This patchset connects that generic interface
to the pre-existing VFIO uAPI that deals with IOMMU mappings of an
assigned device. But the bulk of the work is done by the IOMMU subsystem,
and is available to all device drivers.

> Jason suggest something like /dev/sva. There will be a lot of other
> subsystems that could benefit from this (e.g vDPA).

Do you have a more precise idea of the interface /dev/sva would provide,
how it would interact with VFIO and others?  vDPA could transport the
generic iommu.h structures via its own uAPI, and call the IOMMU API
directly without going through an intermediate /dev/sva handle.

Thanks,
Jean

> Have you ever considered this approach?
> 
> Thanks
> 

Re: [PATCH] iommu: Kconfig: Update help description for IPMMU_VMSA config

2020-09-14 Thread Geert Uytterhoeven
On Fri, Sep 11, 2020 at 12:19 PM Lad Prabhakar
 wrote:
> ipmmu-vmsa driver is also used on Renesas RZ/G{1,2} Soc's, update the
> same to reflect the help description for IPMMU_VMSA config.

... update the help description for the IPMMU_VMSA config symbol to reflect
this?

>
> Signed-off-by: Lad Prabhakar 
> Reviewed-by: Chris Paterson 

Reviewed-by: Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs

2020-09-14 Thread Jason Gunthorpe
On Mon, Sep 14, 2020 at 10:38:10AM +, Tian, Kevin wrote:

> is widely used and thus can better help verify the core logic with
> many existing devices. For vSVA, vDPA support has not been started
> while VFIO support is close to being accepted. It doesn't make much
> sense to block the VFIO part until vDPA is ready for wide adoption.

You keep saying that, but if we keep ignoring the right architecture
we end up with a mess inside VFIO just to save some development
time. That is usually not how the kernel process works.

Jason


[PATCH RFC v1 04/18] iommu/hyperv: don't setup IRQ remapping when running as root

2020-09-14 Thread Wei Liu
The IOMMU code needs more work. What we are sure about for now is that
the IRQ remapping hooks are not applicable when Linux is the root.

Signed-off-by: Wei Liu 
---
 drivers/iommu/hyperv-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/hyperv-iommu.c b/drivers/iommu/hyperv-iommu.c
index 8919c1c70b68..4da684ab292c 100644
--- a/drivers/iommu/hyperv-iommu.c
+++ b/drivers/iommu/hyperv-iommu.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include <asm/mshyperv.h>
 
 #include "irq_remapping.h"
 
@@ -143,7 +144,7 @@ static int __init hyperv_prepare_irq_remapping(void)
int i;
 
if (!hypervisor_is_type(X86_HYPER_MS_HYPERV) ||
-   !x2apic_supported())
+   !x2apic_supported() || hv_root_partition)
return -ENODEV;
 
fn = irq_domain_alloc_named_id_fwnode("HYPERV-IR", 0);
-- 
2.20.1



RE: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs

2020-09-14 Thread Tian, Kevin
> From: Jason Wang
> Sent: Monday, September 14, 2020 4:57 PM
> 
> On 2020/9/14 4:01 PM, Tian, Kevin wrote:
> >> From: Jason Wang 
> >> Sent: Monday, September 14, 2020 12:20 PM
> >>
> >> On 2020/9/10 下午6:45, Liu Yi L wrote:
> >>> Shared Virtual Addressing (SVA), a.k.a, Shared Virtual Memory (SVM) on
> >>> Intel platforms allows address space sharing between device DMA and
> >>> applications. SVA can reduce programming complexity and enhance
> >> security.
> >>> This VFIO series is intended to expose SVA usage to VMs. i.e. Sharing
> >>> guest application address space with passthru devices. This is called
> >>> vSVA in this series. The whole vSVA enabling requires
> QEMU/VFIO/IOMMU
> >>> changes. For IOMMU and QEMU changes, they are in separate series
> (listed
> >>> in the "Related series").
> >>>
> >>> The high-level architecture for SVA virtualization is as below, the key
> >>> design of vSVA support is to utilize the dual-stage IOMMU translation (
> >>> also known as IOMMU nesting translation) capability in host IOMMU.
> >>>
> >>>
> >>>   .-.  .---.
> >>>   |   vIOMMU|  | Guest process CR3, FL only|
> >>>   | |  '---'
> >>>   ./
> >>>   | PASID Entry |--- PASID cache flush -
> >>>   '-'   |
> >>>   | |   V
> >>>   | |CR3 in GPA
> >>>   '-'
> >>> Guest
> >>> --| Shadow |--|
> >>> vv  v
> >>> Host
> >>>   .-.  .--.
> >>>   |   pIOMMU|  | Bind FL for GVA-GPA  |
> >>>   | |  '--'
> >>>   ./  |
> >>>   | PASID Entry | V (Nested xlate)
> >>>   '\.--.
> >>>   | ||SL for GPA-HPA, default domain|
> >>>   | |   '--'
> >>>   '-'
> >>> Where:
> >>>- FL = First level/stage one page tables
> >>>- SL = Second level/stage two page tables
> >>>
> >>> Patch Overview:
> >>>1. reports IOMMU nesting info to userspace ( patch 0001, 0002, 0003,
> >> 0015 , 0016)
> >>>2. vfio support for PASID allocation and free for VMs (patch 0004, 
> >>> 0005,
> >> 0007)
> >>>3. a fix to a revisit in intel iommu driver (patch 0006)
> >>>4. vfio support for binding guest page table to host (patch 0008, 0009,
> >> 0010)
> >>>5. vfio support for IOMMU cache invalidation from VMs (patch 0011)
> >>>6. vfio support for vSVA usage on IOMMU-backed mdevs (patch 0012)
> >>>7. expose PASID capability to VM (patch 0013)
> >>>8. add doc for VFIO dual stage control (patch 0014)
> >>
> >> If it's possible, I would suggest a generic uAPI instead of a VFIO
> >> specific one.
> >>
> >> Jason suggest something like /dev/sva. There will be a lot of other
> >> subsystems that could benefit from this (e.g vDPA).
> >>
> > Just be curious. When does vDPA subsystem plan to support vSVA and
> > when could one expect a SVA-capable vDPA device in market?
> >
> > Thanks
> > Kevin
> 
> 
> vSVA is in the plan but there's no ETA. I think we might start the work
> after control vq support.  It will probably start from SVA first and
> then vSVA (since it might require platform support).
> 
> For the device part, it really depends on the chipset and other device
> vendors. We plan to do the prototype in virtio by introducing PASID
> support in the spec.
> 

Thanks for the info. Then here is my thought.

First, I don't think /dev/sva is the right interface. Once we start 
considering such a generic uAPI, it had better behave as the one interface
for all kinds of DMA requirements on device/subdevice passthrough.
Nested page table thru vSVA is one way. Manual map/unmap is
another way. It doesn't make sense to have one through generic
uAPI and the other through subsystem specific uAPI. In the end
the interface might become /dev/iommu, for delegating certain
IOMMU operations to userspace. 

In addition, delegated IOMMU operations have different scopes.
PASID allocation is per process/VM. pgtbl-bind/unbind, map/unmap 
and cache invalidation are per iommu domain. page request/
response are per device/subdevice. This requires the uAPI to also
understand and manage the association between domain/group/
device/subdevice (such as group attach/detach), instead of doing 
it separately in VFIO or vDPA as is done today.

Based on the above, I feel a more reasonable way is to first make a
/dev/iommu uAPI supporting DMA map/unmap usages and then
introduce vSVA to it. This order is preferable because DMA map/unmap
is widely used and thus can better help verify the core logic with
many existing devices. For vSVA, vDPA support has not been started
while VFIO support is close to being accepted. It doesn't make much
sense to block the VFIO part until vDPA is ready for wide adoption.

Re: [PATCH v4 2/3] iommu/mediatek: add flag for legacy ivrp paddr

2020-09-14 Thread Matthias Brugger




On 07/09/2020 12:16, Fabien Parent wrote:

Add a new flag in order to select which IVRP_PADDR format is used
by an SoC.

Signed-off-by: Fabien Parent 
Reviewed-by: Yong Wu 


Reviewed-by: Matthias Brugger 


---

v4: no change
v3: set LEGACY_IVRP_PADDR as a flag instead of platform data
v2: new patch

---
  drivers/iommu/mtk_iommu.c | 6 --
  1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 785b228d39a6..b1f85a7e9346 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -116,6 +116,7 @@
  #define OUT_ORDER_WR_EN   BIT(4)
  #define HAS_SUB_COMM  BIT(5)
  #define WR_THROT_EN   BIT(6)
+#define HAS_LEGACY_IVRP_PADDR  BIT(7)
  
   #define MTK_IOMMU_HAS_FLAG(pdata, _x) \
	((((pdata)->flags) & (_x)) == (_x))
@@ -582,7 +583,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data 
*data)
F_INT_PRETETCH_TRANSATION_FIFO_FAULT;
writel_relaxed(regval, data->base + REG_MMU_INT_MAIN_CONTROL);
  
-	if (data->plat_data->m4u_plat == M4U_MT8173)
+   if (MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_LEGACY_IVRP_PADDR))
regval = (data->protect_base >> 1) | (data->enable_4GB << 31);
else
regval = lower_32_bits(data->protect_base) |
@@ -818,7 +819,8 @@ static const struct mtk_iommu_plat_data mt6779_data = {
  
   static const struct mtk_iommu_plat_data mt8173_data = {
	.m4u_plat = M4U_MT8173,
-   .flags= HAS_4GB_MODE | HAS_BCLK | RESET_AXI,
+   .flags= HAS_4GB_MODE | HAS_BCLK | RESET_AXI |
+   HAS_LEGACY_IVRP_PADDR,
.inv_sel_reg  = REG_MMU_INV_SEL_GEN1,
.larbid_remap = {{0}, {1}, {2}, {3}, {4}, {5}}, /* Linear mapping. */
  };




Re: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs

2020-09-14 Thread Jason Wang


On 2020/9/14 4:01 PM, Tian, Kevin wrote:

From: Jason Wang 
Sent: Monday, September 14, 2020 12:20 PM

On 2020/9/10 6:45 PM, Liu Yi L wrote:

Shared Virtual Addressing (SVA), a.k.a, Shared Virtual Memory (SVM) on
Intel platforms allows address space sharing between device DMA and
applications. SVA can reduce programming complexity and enhance

security.

This VFIO series is intended to expose SVA usage to VMs. i.e. Sharing
guest application address space with passthru devices. This is called
vSVA in this series. The whole vSVA enabling requires QEMU/VFIO/IOMMU
changes. For IOMMU and QEMU changes, they are in separate series (listed
in the "Related series").

The high-level architecture for SVA virtualization is as below, the key
design of vSVA support is to utilize the dual-stage IOMMU translation (
also known as IOMMU nesting translation) capability in host IOMMU.


  .-.  .---.
  |   vIOMMU|  | Guest process CR3, FL only|
  | |  '---'
  ./
  | PASID Entry |--- PASID cache flush -
  '-'   |
  | |   V
  | |CR3 in GPA
  '-'
Guest
--| Shadow |--|
vv  v
Host
  .-.  .--.
  |   pIOMMU|  | Bind FL for GVA-GPA  |
  | |  '--'
  ./  |
  | PASID Entry | V (Nested xlate)
  '\.--.
  | ||SL for GPA-HPA, default domain|
  | |   '--'
  '-'
Where:
   - FL = First level/stage one page tables
   - SL = Second level/stage two page tables

Patch Overview:
   1. reports IOMMU nesting info to userspace ( patch 0001, 0002, 0003,

0015 , 0016)

   2. vfio support for PASID allocation and free for VMs (patch 0004, 0005,

0007)

   3. a fix to a revisit in intel iommu driver (patch 0006)
   4. vfio support for binding guest page table to host (patch 0008, 0009,

0010)

   5. vfio support for IOMMU cache invalidation from VMs (patch 0011)
   6. vfio support for vSVA usage on IOMMU-backed mdevs (patch 0012)
   7. expose PASID capability to VM (patch 0013)
   8. add doc for VFIO dual stage control (patch 0014)


If it's possible, I would suggest a generic uAPI instead of a VFIO
specific one.

Jason suggest something like /dev/sva. There will be a lot of other
subsystems that could benefit from this (e.g vDPA).


Just be curious. When does vDPA subsystem plan to support vSVA and
when could one expect a SVA-capable vDPA device in market?

Thanks
Kevin



vSVA is in the plan but there's no ETA. I think we might start the work 
after control vq support.  It will probably start from SVA first and 
then vSVA (since it might require platform support).


For the device part, it really depends on the chipset and other device 
vendors. We plan to do the prototype in virtio by introducing PASID 
support in the spec.


Thanks



Re: [PATCH 2/2] iommu/iova: Free global iova rcache on iova alloc failure

2020-09-14 Thread John Garry

On 13/08/2020 06:36, Vijayanand Jitta wrote:



On 8/12/2020 8:46 PM, Joerg Roedel wrote:

On Mon, Aug 03, 2020 at 03:30:48PM +0530, Vijayanand Jitta wrote:

ping?


Please repost when v5.9-rc1 is released and add

Robin Murphy 

on your Cc list.

Thanks,

Joerg



Sure, will do.

Thanks,
Vijay



And a cover letter would be useful also, to at least let us know what 
changes have been made per version.


>
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index 4e77116..5836c87 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -442,6 +442,7 @@ struct iova *find_iova(struct iova_domain *iovad, 
unsigned long pfn)

>flush_rcache = false;
>for_each_online_cpu(cpu)
>free_cpu_cached_iovas(cpu, iovad);
> +  free_global_cached_iovas(iovad);

Have you seen an issue where this is needed?

If we have filled the IOVA space, then as a fallback we flush all the CPU 
rcaches, after which there should be free IOVA space and we can make 
progress. And it may be useful to still have the global depots available 
then, to swap straight into the emptied CPU rcaches.


>goto retry;
>}
>
> @@ -1055,5 +1056,27 @@ void free_cpu_cached_iovas(unsigned int cpu, 
struct iova_domain *iovad)

>}
>   }
>
> +/*
> + * free all the IOVA ranges of global cache
> + */
> +void free_global_cached_iovas(struct iova_domain *iovad)
> +{
> +  struct iova_rcache *rcache;
> +  unsigned long flags;
> +  int i, j;
> +
> +  for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
> +  rcache = &iovad->rcaches[i];
> +  spin_lock_irqsave(&rcache->lock, flags);
> +  for (j = 0; j < rcache->depot_size; ++j) {
> +  iova_magazine_free_pfns(rcache->depot[j], iovad);
> +  iova_magazine_free(rcache->depot[j]);
> +  rcache->depot[j] = NULL;
> +  }
> +  rcache->depot_size = 0;
> +  spin_unlock_irqrestore(&rcache->lock, flags);
> +  }
> +}
> +
>   MODULE_AUTHOR("Anil S Keshavamurthy ");
>   MODULE_LICENSE("GPL");
> diff --git a/include/linux/iova.h b/include/linux/iova.h
> index a0637ab..a905726 100644
> --- a/include/linux/iova.h
> +++ b/include/linux/iova.h

why is this in the iova.h, when it is only used internally in iova.c?

> @@ -163,6 +163,7 @@ int init_iova_flush_queue(struct iova_domain *iovad,
>   struct iova *split_and_remove_iova(struct iova_domain *iovad,
>struct iova *iova, unsigned long pfn_lo, unsigned long pfn_hi);
>   void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain 
*iovad);

> +void free_global_cached_iovas(struct iova_domain *iovad);
>   #else
>   static inline int iova_cache_get(void)
>   {
> @@ -270,6 +271,11 @@ static inline void 
free_cpu_cached_iovas(unsigned int cpu,

> struct iova_domain *iovad)
>   {
>   }
> +
> +static inline void free_global_cached_iovas(struct iova_domain *iovad)
> +{
> +}
> +
>   #endif
>
>   #endif




Re: [PATCH v4 3/3] iommu/mediatek: add support for MT8167

2020-09-14 Thread Yong Wu
On Mon, 2020-09-07 at 12:16 +0200, Fabien Parent wrote:
> Add support for the IOMMU on MT8167
> 
> Signed-off-by: Fabien Parent 

Reviewed-by: Yong Wu 

> ---
> 
> V4;
>   * Removed HAS_4GB_MODE flag since this SoC does not seem to support it
> V3:
>   * use LEGACY_IVRP_PADDR flag instead of using a platform data member
> V2:
>   * removed if based on m4u_plat, and using instead the new
> has_legacy_ivrp_paddr member that was introduced in patch 2.
> 
> ---
>  drivers/iommu/mtk_iommu.c | 8 
>  drivers/iommu/mtk_iommu.h | 1 +
>  2 files changed, 9 insertions(+)
> 
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index b1f85a7e9346..4ff071eb5279 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -817,6 +817,13 @@ static const struct mtk_iommu_plat_data mt6779_data = {
>   .larbid_remap  = {{0}, {1}, {2}, {3}, {5}, {7, 8}, {10}, {9}},
>  };
>  
> +static const struct mtk_iommu_plat_data mt8167_data = {
> + .m4u_plat = M4U_MT8167,
> + .flags= RESET_AXI | HAS_LEGACY_IVRP_PADDR,
> + .inv_sel_reg  = REG_MMU_INV_SEL_GEN1,
> + .larbid_remap = {{0}, {1}, {2}}, /* Linear mapping. */
> +};
> +
>  static const struct mtk_iommu_plat_data mt8173_data = {
>   .m4u_plat = M4U_MT8173,
>   .flags= HAS_4GB_MODE | HAS_BCLK | RESET_AXI |
> @@ -835,6 +842,7 @@ static const struct mtk_iommu_plat_data mt8183_data = {
>  static const struct of_device_id mtk_iommu_of_ids[] = {
>   { .compatible = "mediatek,mt2712-m4u", .data = &mt2712_data},
>   { .compatible = "mediatek,mt6779-m4u", .data = &mt6779_data},
> + { .compatible = "mediatek,mt8167-m4u", .data = &mt8167_data},
>   { .compatible = "mediatek,mt8173-m4u", .data = &mt8173_data},
>   { .compatible = "mediatek,mt8183-m4u", .data = &mt8183_data},
>   {}
> diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> index 122925dbe547..df32b3e3408b 100644
> --- a/drivers/iommu/mtk_iommu.h
> +++ b/drivers/iommu/mtk_iommu.h
> @@ -39,6 +39,7 @@ enum mtk_iommu_plat {
>   M4U_MT2701,
>   M4U_MT2712,
>   M4U_MT6779,
> + M4U_MT8167,
>   M4U_MT8173,
>   M4U_MT8183,
>  };



Re: [PATCH v3 0/6] Convert the intel iommu driver to the dma-iommu api

2020-09-14 Thread Tvrtko Ursulin


Hi,

On 12/09/2020 04:21, Lu Baolu wrote:
> Tom Murphy has almost done all the work. His latest patch series was
> posted here.
> 
> https://lore.kernel.org/linux-iommu/20200903201839.7327-1-murph...@tcd.ie/
> 
> Thanks a lot!
> 
> This series is a follow-up with below changes:
> 
> 1. Add a quirk for the i915 driver issue described in Tom's cover
> letter.

Last week I copied you on an i915 series which appears to remove the need 
for this quirk. So if we get those i915 patches reviewed and merged, do you 
still want to pursue this quirk?

> 2. Fix several bugs in patch "iommu: Allow the dma-iommu api to use
> bounce buffers" to make the bounce buffer work for untrusted devices.
> 3. Several cleanups in iommu/vt-d driver after the conversion.

With the previous version of the series I hit a problem on Ivybridge where 
apparently the dma engine width is not respected. At least that is my layman's 
interpretation of the errors. From the older thread:

<3> [209.526605] DMAR: intel_iommu_map: iommu width (39) is not sufficient for 
the mapped address (008000)

Relevant iommu boot related messages are:

<6>[0.184234] DMAR: Host address width 36
<6>[0.184245] DMAR: DRHD base: 0x00fed9 flags: 0x0
<6>[0.184288] DMAR: dmar0: reg_base_addr fed9 ver 1:0 cap 
c020e60262 ecap f0101a
<6>[0.184308] DMAR: DRHD base: 0x00fed91000 flags: 0x1
<6>[0.184337] DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap 
c9008020660262 ecap f0105a
<6>[0.184357] DMAR: RMRR base: 0x00d8d28000 end: 0x00d8d46fff
<6>[0.184377] DMAR: RMRR base: 0x00db00 end: 0x00df1f
<6>[0.184398] DMAR-IR: IOAPIC id 2 under DRHD base  0xfed91000 IOMMU 1
<6>[0.184414] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
<6>[0.184428] DMAR-IR: Queued invalidation will be enabled to support 
x2apic and Intr-remapping.
<6>[0.185173] DMAR-IR: Enabled IRQ remapping in x2apic mode

<6>[0.878934] DMAR: No ATSR found
<6>[0.878966] DMAR: dmar0: Using Queued invalidation
<6>[0.879007] DMAR: dmar1: Using Queued invalidation

<6>[0.915032] DMAR: Intel(R) Virtualization Technology for Directed I/O
<6>[0.915060] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
<6>[0.915084] software IO TLB: mapped [mem 0xc80d4000-0xcc0d4000] (64MB)

(Full boot log at 
https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_7054/fi-ivb-3770/boot0.txt, 
failures at 
https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_7054/fi-ivb-3770/igt@i915_selftest@l...@blt.html.)

Does this look familiar or at least plausible to you? Is this something your 
new series has fixed?

Regards,

Tvrtko

> 
> Please review and test.
> 
> Best regards,
> baolu
> 
> Lu Baolu (2):
>iommu: Add quirk for Intel graphic devices in map_sg
>iommu/vt-d: Cleanup after converting to dma-iommu ops
> 
> Tom Murphy (4):
>iommu: Handle freelists when using deferred flushing in iommu drivers
>iommu: Add iommu_dma_free_cpu_cached_iovas()
>iommu: Allow the dma-iommu api to use bounce buffers
>iommu/vt-d: Convert intel iommu driver to the iommu ops
> 
>   .../admin-guide/kernel-parameters.txt |   5 -
>   drivers/iommu/dma-iommu.c | 229 -
>   drivers/iommu/intel/Kconfig   |   1 +
>   drivers/iommu/intel/iommu.c   | 885 +++---
>   include/linux/dma-iommu.h |   8 +
>   include/linux/iommu.h |   1 +
>   6 files changed, 323 insertions(+), 806 deletions(-)
> 


RE: [PATCH v7 00/16] vfio: expose virtual Shared Virtual Addressing to VMs

2020-09-14 Thread Tian, Kevin
> From: Jason Wang 
> Sent: Monday, September 14, 2020 12:20 PM
> 
> On 2020/9/10 下午6:45, Liu Yi L wrote:
> > Shared Virtual Addressing (SVA), a.k.a, Shared Virtual Memory (SVM) on
> > Intel platforms allows address space sharing between device DMA and
> > applications. SVA can reduce programming complexity and enhance
> security.
> >
> > This VFIO series is intended to expose SVA usage to VMs. i.e. Sharing
> > guest application address space with passthru devices. This is called
> > vSVA in this series. The whole vSVA enabling requires QEMU/VFIO/IOMMU
> > changes. For IOMMU and QEMU changes, they are in separate series (listed
> > in the "Related series").
> >
> > The high-level architecture for SVA virtualization is as below, the key
> > design of vSVA support is to utilize the dual-stage IOMMU translation (
> > also known as IOMMU nesting translation) capability in host IOMMU.
> >
> >
> >  .-.  .---.
> >  |   vIOMMU|  | Guest process CR3, FL only|
> >  | |  '---'
> >  ./
> >  | PASID Entry |--- PASID cache flush -
> >  '-'   |
> >  | |   V
> >  | |CR3 in GPA
> >  '-'
> > Guest
> > --| Shadow |--|
> >vv  v
> > Host
> >  .-.  .--.
> >  |   pIOMMU|  | Bind FL for GVA-GPA  |
> >  | |  '--'
> >  ./  |
> >  | PASID Entry | V (Nested xlate)
> >  '\.--.
> >  | ||SL for GPA-HPA, default domain|
> >  | |   '--'
> >  '-'
> > Where:
> >   - FL = First level/stage one page tables
> >   - SL = Second level/stage two page tables
> >
> > Patch Overview:
> >   1. reports IOMMU nesting info to userspace ( patch 0001, 0002, 0003,
> 0015 , 0016)
> >   2. vfio support for PASID allocation and free for VMs (patch 0004, 0005,
> 0007)
> >   3. a fix to a revisit in intel iommu driver (patch 0006)
> >   4. vfio support for binding guest page table to host (patch 0008, 0009,
> 0010)
> >   5. vfio support for IOMMU cache invalidation from VMs (patch 0011)
> >   6. vfio support for vSVA usage on IOMMU-backed mdevs (patch 0012)
> >   7. expose PASID capability to VM (patch 0013)
> >   8. add doc for VFIO dual stage control (patch 0014)
> 
> 
> If it's possible, I would suggest a generic uAPI instead of a VFIO
> specific one.
> 
> Jason suggest something like /dev/sva. There will be a lot of other
> subsystems that could benefit from this (e.g vDPA).
> 

Just be curious. When does vDPA subsystem plan to support vSVA and 
when could one expect a SVA-capable vDPA device in market?

Thanks
Kevin

Re: [PATCH 5/6] usb: don't inherit DMA properties for USB devices

2020-09-14 Thread Greg Kroah-Hartman
On Mon, Sep 14, 2020 at 09:33:42AM +0200, Christoph Hellwig wrote:
> As the comment in usb_alloc_dev correctly states, drivers can't use
> the DMA API on usb devices, and at least calling dma_set_mask on them
> is highly dangerous.  Unlike what the comment states, upper level drivers
> also can't really use the presence of a dma mask to check for DMA
> support, as the dma_mask is set by default for most busses.
> 
> Setting the dma_mask comes from "[PATCH] usbcore dma updates (and doc)"
> in BitKeeper times, as it seems like it was primarily for setting the
> NETIF_F_HIGHDMA flag in USB drivers, something that has long been
> fixed up since.
> 
> Setting the dma_pfn_offset comes from commit b44bbc46a8bb
> ("usb: core: setup dma_pfn_offset for USB devices and, interfaces"),
> which worked around the fact that the scsi_calculate_bounce_limits
> functions wasn't going through the proper driver interface to query
> DMA information, but that function was removed in commit 21e07dba9fb1
> ("scsi: reduce use of block bounce buffers") years ago.
> 
> Signed-off-by: Christoph Hellwig 

Reviewed-by: Greg Kroah-Hartman 


[PATCH 6/6] dma-mapping: introduce DMA range map, supplanting dma_pfn_offset

2020-09-14 Thread Christoph Hellwig
From: Jim Quinlan 

The new field 'dma_range_map' in struct device is used to facilitate the
use of single or multiple offsets between mapping regions of cpu addrs and
dma addrs.  It subsumes the role of "dev->dma_pfn_offset" which was only
capable of holding a single uniform offset and had no region bounds
checking.

The function of_dma_get_range() has been modified so that it takes a single
argument -- the device node -- and returns a map, NULL, or an error code.
The map is an array that holds the information regarding the DMA regions.
Each range entry contains the address offset, the cpu_start address, the
dma_start address, and the size of the region.
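
For reference, the map entries and a lookup over them look roughly like
this (a sketch based on this patch; the exact definitions live in
include/linux/dma-direct.h):

	struct bus_dma_region {
		phys_addr_t	cpu_start;
		dma_addr_t	dma_start;
		u64		size;
		u64		offset;
	};

	/* walk the zero-terminated map and apply the matching offset */
	static inline dma_addr_t translate_phys_to_dma(struct device *dev,
			phys_addr_t paddr)
	{
		const struct bus_dma_region *m;

		for (m = dev->dma_range_map; m->size; m++)
			if (paddr >= m->cpu_start &&
			    paddr - m->cpu_start < m->size)
				return (dma_addr_t)paddr - m->offset;
		return DMA_MAPPING_ERROR;
	}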

of_dma_configure() is the typical way to set range offsets, but there are
a number of ad hoc assignments to "dev->dma_pfn_offset" in the kernel
driver code.  These cases now invoke the function
dma_direct_set_offset(dev, cpu_addr, dma_addr, size).

Signed-off-by: Jim Quinlan 
[hch: various interface cleanups]
Signed-off-by: Christoph Hellwig 
Tested-by: Nathan Chancellor 
---
 arch/arm/include/asm/dma-direct.h |  9 +--
 arch/arm/mach-keystone/keystone.c | 17 ++---
 arch/arm/mach-omap1/include/mach/memory.h |  4 +
 arch/sh/drivers/pci/pcie-sh7786.c |  9 ++-
 arch/x86/pci/sta2x11-fixup.c  |  6 +-
 drivers/acpi/arm64/iort.c |  6 +-
 drivers/base/core.c   |  2 +
 drivers/gpu/drm/sun4i/sun4i_backend.c |  8 +-
 drivers/iommu/io-pgtable-arm.c|  2 +-
 .../platform/sunxi/sun4i-csi/sun4i_csi.c  |  9 ++-
 .../platform/sunxi/sun6i-csi/sun6i_csi.c  | 11 ++-
 drivers/of/address.c  | 73 ---
 drivers/of/device.c   | 44 ++-
 drivers/of/of_private.h   | 11 +--
 drivers/of/unittest.c | 34 ++---
 drivers/remoteproc/remoteproc_core.c  | 24 +-
 .../staging/media/sunxi/cedrus/cedrus_hw.c| 10 ++-
 include/linux/device.h|  4 +-
 include/linux/dma-direct.h| 52 +++--
 include/linux/dma-mapping.h   |  9 ++-
 kernel/dma/coherent.c |  7 +-
 kernel/dma/direct.c   | 51 -
 22 files changed, 284 insertions(+), 118 deletions(-)

diff --git a/arch/arm/include/asm/dma-direct.h 
b/arch/arm/include/asm/dma-direct.h
index fbcf4367b5cb1a..436544aeb83405 100644
--- a/arch/arm/include/asm/dma-direct.h
+++ b/arch/arm/include/asm/dma-direct.h
@@ -12,8 +12,8 @@
 #ifndef __arch_pfn_to_dma
 static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
 {
-   if (dev)
-   pfn -= dev->dma_pfn_offset;
+   if (dev && dev->dma_range_map)
+   pfn = PFN_DOWN(translate_phys_to_dma(dev, PFN_PHYS(pfn)));
return (dma_addr_t)__pfn_to_bus(pfn);
 }
 
@@ -21,9 +21,8 @@ static inline unsigned long dma_to_pfn(struct device *dev, 
dma_addr_t addr)
 {
unsigned long pfn = __bus_to_pfn(addr);
 
-   if (dev)
-   pfn += dev->dma_pfn_offset;
-
+   if (dev && dev->dma_range_map)
+   pfn = PFN_DOWN(translate_dma_to_phys(dev, PFN_PHYS(pfn)));
return pfn;
 }
 
diff --git a/arch/arm/mach-keystone/keystone.c 
b/arch/arm/mach-keystone/keystone.c
index dcd031ba84c2e0..09a65c2dfd7327 100644
--- a/arch/arm/mach-keystone/keystone.c
+++ b/arch/arm/mach-keystone/keystone.c
@@ -8,6 +8,7 @@
  */
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -25,8 +26,6 @@
 #include "keystone.h"
 
 #ifdef CONFIG_ARM_LPAE
-static unsigned long keystone_dma_pfn_offset __read_mostly;
-
 static int keystone_platform_notifier(struct notifier_block *nb,
  unsigned long event, void *data)
 {
@@ -39,9 +38,12 @@ static int keystone_platform_notifier(struct notifier_block 
*nb,
return NOTIFY_BAD;
 
if (!dev->of_node) {
-   dev->dma_pfn_offset = keystone_dma_pfn_offset;
-   dev_err(dev, "set dma_pfn_offset%08lx\n",
-   dev->dma_pfn_offset);
+   int ret = dma_direct_set_offset(dev, KEYSTONE_HIGH_PHYS_START,
+   KEYSTONE_LOW_PHYS_START,
+   KEYSTONE_HIGH_PHYS_SIZE);
+   dev_err(dev, "set dma_offset%08llx%s\n",
+   KEYSTONE_HIGH_PHYS_START - KEYSTONE_LOW_PHYS_START,
+   ret ? " failed" : "");
}
return NOTIFY_OK;
 }
@@ -54,11 +56,8 @@ static struct notifier_block platform_nb = {
 static void __init keystone_init(void)
 {
 #ifdef CONFIG_ARM_LPAE
-   if (PHYS_OFFSET >= KEYSTONE_HIGH_PHYS_START) {
-   keystone_dma_pfn_offset = PFN_DOWN(KEYSTONE_HIGH_PHYS_START -
-  KEYSTONE_LOW_PHYS_START);
+   if (PHYS_OFFSET >= KEYSTONE_HIGH_PHYS_START) {

[PATCH 5/6] usb: don't inherit DMA properties for USB devices

2020-09-14 Thread Christoph Hellwig
As the comment in usb_alloc_dev correctly states, drivers can't use
the DMA API on USB devices, and at least calling dma_set_mask on them
is highly dangerous.  Contrary to what the comment states, upper level
drivers also can't really use the presence of a dma mask to check for
DMA support, as the dma_mask is set by default for most busses.

Setting the dma_mask comes from "[PATCH] usbcore dma updates (and doc)"
in BitKeeper times, as it seems like it was primarily for setting the
NETIF_F_HIGHDMA flag in USB drivers, something that has long been
fixed up since.

Setting the dma_pfn_offset comes from commit b44bbc46a8bb
("usb: core: setup dma_pfn_offset for USB devices and, interfaces"),
which worked around the fact that the scsi_calculate_bounce_limits
function wasn't going through the proper driver interface to query
DMA information, but that function was removed in commit 21e07dba9fb1
("scsi: reduce use of block bounce buffers") years ago.

Signed-off-by: Christoph Hellwig 
---
 drivers/usb/core/message.c |  6 --
 drivers/usb/core/usb.c | 12 
 2 files changed, 18 deletions(-)

diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
index 6197938dcc2d8f..9e45732dc1d1d1 100644
--- a/drivers/usb/core/message.c
+++ b/drivers/usb/core/message.c
@@ -1954,12 +1954,6 @@ int usb_set_configuration(struct usb_device *dev, int 
configuration)
intf->dev.bus = &usb_bus_type;
intf->dev.type = &usb_if_device_type;
intf->dev.groups = usb_interface_groups;
-   /*
-* Please refer to usb_alloc_dev() to see why we set
-* dma_mask and dma_pfn_offset.
-*/
-   intf->dev.dma_mask = dev->dev.dma_mask;
-   intf->dev.dma_pfn_offset = dev->dev.dma_pfn_offset;
INIT_WORK(&intf->reset_ws, __usb_queue_reset_device);
intf->minor = -1;
device_initialize(&intf->dev);
diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c
index bafc113f2b3ef3..9b4ac4415f1a47 100644
--- a/drivers/usb/core/usb.c
+++ b/drivers/usb/core/usb.c
@@ -599,18 +599,6 @@ struct usb_device *usb_alloc_dev(struct usb_device *parent,
dev->dev.bus = &usb_bus_type;
dev->dev.type = &usb_device_type;
dev->dev.groups = usb_device_groups;
-   /*
-* Fake a dma_mask/offset for the USB device:
-* We cannot really use the dma-mapping API (dma_alloc_* and
-* dma_map_*) for USB devices but instead need to use
-* usb_alloc_coherent and pass data in 'urb's, but some subsystems
-* manually look into the mask/offset pair to determine whether
-* they need bounce buffers.
-* Note: calling dma_set_mask() on a USB device would set the
-* mask for the entire HCD, so don't do that.
-*/
-   dev->dev.dma_mask = bus->sysdev->dma_mask;
-   dev->dev.dma_pfn_offset = bus->sysdev->dma_pfn_offset;
set_dev_node(&dev->dev, dev_to_node(bus->sysdev));
dev->state = USB_STATE_ATTACHED;
dev->lpm_disable_count = 1;
-- 
2.28.0



[PATCH 4/6] ARM/keystone: move the DMA offset handling under ifdef CONFIG_ARM_LPAE

2020-09-14 Thread Christoph Hellwig
The DMA offset notifier can only be used if PHYS_OFFSET is at least
KEYSTONE_HIGH_PHYS_START, which can't be represented by a 32-bit
phys_addr_t.  Currently the code compiles fine despite that, but a
pending change to the DMA offset handling would create a compiler
warning for this case.  Add an ifdef so that the code is only compiled
for LPAE configs.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Robin Murphy 
---
 arch/arm/mach-keystone/keystone.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/arm/mach-keystone/keystone.c 
b/arch/arm/mach-keystone/keystone.c
index 638808c4e12247..dcd031ba84c2e0 100644
--- a/arch/arm/mach-keystone/keystone.c
+++ b/arch/arm/mach-keystone/keystone.c
@@ -24,6 +24,7 @@
 
 #include "keystone.h"
 
+#ifdef CONFIG_ARM_LPAE
 static unsigned long keystone_dma_pfn_offset __read_mostly;
 
 static int keystone_platform_notifier(struct notifier_block *nb,
@@ -48,14 +49,17 @@ static int keystone_platform_notifier(struct notifier_block 
*nb,
 static struct notifier_block platform_nb = {
.notifier_call = keystone_platform_notifier,
 };
+#endif /* CONFIG_ARM_LPAE */
 
 static void __init keystone_init(void)
 {
+#ifdef CONFIG_ARM_LPAE
if (PHYS_OFFSET >= KEYSTONE_HIGH_PHYS_START) {
keystone_dma_pfn_offset = PFN_DOWN(KEYSTONE_HIGH_PHYS_START -
   KEYSTONE_LOW_PHYS_START);
bus_register_notifier(&platform_bus_type, &platform_nb);
}
+#endif
keystone_pm_runtime_init();
 }
 
-- 
2.28.0



[PATCH 3/6] ARM/dma-mapping: move various helpers from dma-mapping.h to dma-direct.h

2020-09-14 Thread Christoph Hellwig
Move the helpers that translate to and from direct-mapping DMA addresses
into dma-direct.h.  This is not only the most logical place; the new
placement also avoids dependency loops with pending commits.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Robin Murphy 
---
 arch/arm/common/dmabounce.c|  2 +-
 arch/arm/include/asm/dma-direct.h  | 50 ++
 arch/arm/include/asm/dma-mapping.h | 50 --
 3 files changed, 51 insertions(+), 51 deletions(-)

diff --git a/arch/arm/common/dmabounce.c b/arch/arm/common/dmabounce.c
index f4b719bde76367..d3e00ea9208834 100644
--- a/arch/arm/common/dmabounce.c
+++ b/arch/arm/common/dmabounce.c
@@ -24,7 +24,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/arch/arm/include/asm/dma-direct.h 
b/arch/arm/include/asm/dma-direct.h
index bca0de56753439..fbcf4367b5cb1a 100644
--- a/arch/arm/include/asm/dma-direct.h
+++ b/arch/arm/include/asm/dma-direct.h
@@ -2,6 +2,56 @@
 #ifndef ASM_ARM_DMA_DIRECT_H
 #define ASM_ARM_DMA_DIRECT_H 1
 
+#include 
+
+/*
+ * dma_to_pfn/pfn_to_dma/virt_to_dma are architecture private
+ * functions used internally by the DMA-mapping API to provide DMA
+ * addresses. They must not be used by drivers.
+ */
+#ifndef __arch_pfn_to_dma
+static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
+{
+   if (dev)
+   pfn -= dev->dma_pfn_offset;
+   return (dma_addr_t)__pfn_to_bus(pfn);
+}
+
+static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
+{
+   unsigned long pfn = __bus_to_pfn(addr);
+
+   if (dev)
+   pfn += dev->dma_pfn_offset;
+
+   return pfn;
+}
+
+static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
+{
+   if (dev)
+   return pfn_to_dma(dev, virt_to_pfn(addr));
+
+   return (dma_addr_t)__virt_to_bus((unsigned long)(addr));
+}
+
+#else
+static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
+{
+   return __arch_pfn_to_dma(dev, pfn);
+}
+
+static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
+{
+   return __arch_dma_to_pfn(dev, addr);
+}
+
+static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
+{
+   return __arch_virt_to_dma(dev, addr);
+}
+#endif
+
 static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
 {
unsigned int offset = paddr & ~PAGE_MASK;
diff --git a/arch/arm/include/asm/dma-mapping.h 
b/arch/arm/include/asm/dma-mapping.h
index cf2535fb8891f5..0a1a536368c3a4 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -8,8 +8,6 @@
 #include 
 #include 
 
-#include 
-
 #include 
 #include 
 
@@ -23,54 +21,6 @@ static inline const struct dma_map_ops 
*get_arch_dma_ops(struct bus_type *bus)
return NULL;
 }
 
-/*
- * dma_to_pfn/pfn_to_dma/virt_to_dma are architecture private
- * functions used internally by the DMA-mapping API to provide DMA
- * addresses. They must not be used by drivers.
- */
-#ifndef __arch_pfn_to_dma
-static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
-{
-   if (dev)
-   pfn -= dev->dma_pfn_offset;
-   return (dma_addr_t)__pfn_to_bus(pfn);
-}
-
-static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
-{
-   unsigned long pfn = __bus_to_pfn(addr);
-
-   if (dev)
-   pfn += dev->dma_pfn_offset;
-
-   return pfn;
-}
-
-static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
-{
-   if (dev)
-   return pfn_to_dma(dev, virt_to_pfn(addr));
-
-   return (dma_addr_t)__virt_to_bus((unsigned long)(addr));
-}
-
-#else
-static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
-{
-   return __arch_pfn_to_dma(dev, pfn);
-}
-
-static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
-{
-   return __arch_dma_to_pfn(dev, addr);
-}
-
-static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
-{
-   return __arch_virt_to_dma(dev, addr);
-}
-#endif
-
 /**
  * arm_dma_alloc - allocate consistent memory for DMA
  * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
-- 
2.28.0



[PATCH 2/6] ARM/dma-mapping: remove dma_to_virt

2020-09-14 Thread Christoph Hellwig
dma_to_virt is entirely unused, remove it.

Signed-off-by: Christoph Hellwig 
---
 arch/arm/include/asm/dma-mapping.h| 18 +-
 arch/arm/mach-omap1/include/mach/memory.h |  4 
 2 files changed, 1 insertion(+), 21 deletions(-)

diff --git a/arch/arm/include/asm/dma-mapping.h 
b/arch/arm/include/asm/dma-mapping.h
index 70d95677656044..cf2535fb8891f5 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -24,7 +24,7 @@ static inline const struct dma_map_ops 
*get_arch_dma_ops(struct bus_type *bus)
 }
 
 /*
- * dma_to_pfn/pfn_to_dma/dma_to_virt/virt_to_dma are architecture private
+ * dma_to_pfn/pfn_to_dma/virt_to_dma are architecture private
  * functions used internally by the DMA-mapping API to provide DMA
  * addresses. They must not be used by drivers.
  */
@@ -46,17 +46,6 @@ static inline unsigned long dma_to_pfn(struct device *dev, 
dma_addr_t addr)
return pfn;
 }
 
-static inline void *dma_to_virt(struct device *dev, dma_addr_t addr)
-{
-   if (dev) {
-   unsigned long pfn = dma_to_pfn(dev, addr);
-
-   return phys_to_virt(__pfn_to_phys(pfn));
-   }
-
-   return (void *)__bus_to_virt((unsigned long)addr);
-}
-
 static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
 {
if (dev)
@@ -76,11 +65,6 @@ static inline unsigned long dma_to_pfn(struct device *dev, 
dma_addr_t addr)
return __arch_dma_to_pfn(dev, addr);
 }
 
-static inline void *dma_to_virt(struct device *dev, dma_addr_t addr)
-{
-   return __arch_dma_to_virt(dev, addr);
-}
-
 static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
 {
return __arch_virt_to_dma(dev, addr);
diff --git a/arch/arm/mach-omap1/include/mach/memory.h 
b/arch/arm/mach-omap1/include/mach/memory.h
index 1142560e0078f5..e43697c3297bf2 100644
--- a/arch/arm/mach-omap1/include/mach/memory.h
+++ b/arch/arm/mach-omap1/include/mach/memory.h
@@ -41,10 +41,6 @@
   __phys_to_pfn(__dma);\
})
 
-#define __arch_dma_to_virt(dev, addr)  ({ (void *) (is_lbus_device(dev) ? \
-   lbus_to_virt(addr) : \
-   __phys_to_virt(addr)); })
-
 #define __arch_virt_to_dma(dev, addr)  ({ unsigned long __addr = (unsigned 
long)(addr); \
   (dma_addr_t) (is_lbus_device(dev) ? \
virt_to_lbus(__addr) : \
-- 
2.28.0



[PATCH 1/6] ARM/dma-mapping: remove a __arch_page_to_dma #error

2020-09-14 Thread Christoph Hellwig
The __arch_page_to_dma hook is long gone.

Signed-off-by: Christoph Hellwig 
---
 arch/arm/include/asm/dma-mapping.h | 4 
 1 file changed, 4 deletions(-)

diff --git a/arch/arm/include/asm/dma-mapping.h 
b/arch/arm/include/asm/dma-mapping.h
index bdd80ddbca3451..70d95677656044 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -23,10 +23,6 @@ static inline const struct dma_map_ops 
*get_arch_dma_ops(struct bus_type *bus)
return NULL;
 }
 
-#ifdef __arch_page_to_dma
-#error Please update to __arch_pfn_to_dma
-#endif
-
 /*
  * dma_to_pfn/pfn_to_dma/dma_to_virt/virt_to_dma are architecture private
  * functions used internally by the DMA-mapping API to provide DMA
-- 
2.28.0



support range based offsets in dma-direct v2

2020-09-14 Thread Christoph Hellwig
Hi all,

this series adds range-based offsets to the dma-direct implementation.  The
guts of the change are a patch from Jim with some modifications from me,
but to do it nicely we need two ARM patches to prepare for it as well.

Changes since v1:
 - rebased on top of the latest dma-mapping for-next tree
 - add two more trivial ARM cleanups
 - remove the DMA property inheritance hack in usb
 - move the remaining copy of the ranges into the remoteproc driver
   as it should not be seen as a general API, but as a quirk for
   remoteproc that we need to fix ASAP

Diffstat:


[PATCH v3] iommu/dma: Fix IOVA reserve dma ranges

2020-09-14 Thread Srinath Mannam via iommu
Fix an IOVA reserve failure in the case where the address of the first
memory region listed in dma-ranges is equal to 0x0.
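
To illustrate the fix (a restatement of the diff below, with the
surrounding logic assumed from iova_reserve_pci_windows()): when the first
dma-range starts at address 0x0 there is no gap below it to reserve, so
end == start; only an inverted gap (end < start) actually indicates an
unsorted dma-ranges list:

	if (end > start) {
		/* reserve the IOVA gap below this dma-range */
		lo = iova_pfn(iovad, start);
		hi = iova_pfn(iovad, end);
		reserve_iova(iovad, lo, hi);
	} else if (end < start) {
		/* dma_ranges list should be sorted */
		dev_err(&dev->dev,
			"Failed to reserve IOVA [%#010llx-%#010llx]\n",
			start, end);
		return -EINVAL;
	}
	/* end == start: empty gap, nothing to reserve and no error */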

Fixes: aadad097cd46f ("iommu/dma: Reserve IOVA for PCIe inaccessible DMA 
address")
Signed-off-by: Srinath Mannam 
---
Changes from v2:
   Extend the error message with useful information based on Bjorn's comments.

Changes from v1:
   Removed unnecessary changes based on Robin's review comments.

 drivers/iommu/dma-iommu.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 5141d49a046b..5b9791f35c5e 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -217,9 +217,11 @@ static int iova_reserve_pci_windows(struct pci_dev *dev,
lo = iova_pfn(iovad, start);
hi = iova_pfn(iovad, end);
reserve_iova(iovad, lo, hi);
-   } else {
+   } else if (end < start) {
/* dma_ranges list should be sorted */
-   dev_err(&dev->dev, "Failed to reserve IOVA\n");
+   dev_err(&dev->dev,
+   "Failed to reserve IOVA [%#010llx-%#010llx]\n",
+   start, end);
return -EINVAL;
}
 
-- 
2.17.1



Re: [PATCH v2] iommu/dma: Fix IOVA reserve dma ranges

2020-09-14 Thread Srinath Mannam via iommu
On Fri, Sep 11, 2020 at 8:47 PM Bjorn Helgaas  wrote:
>
Hi Bjorn,
Thanks for the review.
> On Fri, Sep 11, 2020 at 03:55:34PM +0530, Srinath Mannam wrote:
> > Fix IOVA reserve failure in the case when address of first memory region
> > listed in dma-ranges is equal to 0x0.
> >
> > Fixes: aadad097cd46f ("iommu/dma: Reserve IOVA for PCIe inaccessible DMA 
> > address")
> > Signed-off-by: Srinath Mannam 
> > ---
> > Changes from v1:
> >Removed unnecessary changes based on Robin's review comments.
> >
> >  drivers/iommu/dma-iommu.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> > index 5141d49a046b..682068a9aae7 100644
> > --- a/drivers/iommu/dma-iommu.c
> > +++ b/drivers/iommu/dma-iommu.c
> > @@ -217,7 +217,7 @@ static int iova_reserve_pci_windows(struct pci_dev *dev,
> >   lo = iova_pfn(iovad, start);
> >   hi = iova_pfn(iovad, end);
> >   reserve_iova(iovad, lo, hi);
> > - } else {
> > + } else if (end < start) {
> >   /* dma_ranges list should be sorted */
> >   dev_err(&dev->dev, "Failed to reserve IOVA\n");
>
It is very unlikely that we hit this error, since the dma_ranges list is
sorted in the "devm_of_pci_get_host_bridge_resources" function.
> You didn't actually change the error message, but the message would be
> way more useful if it included the IOVA address range, e.g., the
> format used in pci_register_host_bridge():
>
>   bus address [%#010llx-%#010llx]
I will add this change and send the next patchset.

Thanks & Regards,
Srinath.
>
> Incidentally, the pr_err() in copy_reserved_iova() looks bogus; it
> prints iova->pfn_lo twice, when it should probably print the base and
> size or (my preference) something like the above:
>
> pr_err("Reserve iova range %lx@%lx failed\n",
>iova->pfn_lo, iova->pfn_lo);
>
> >   return -EINVAL;
> > --
> > 2.17.1
> >