Re: [PATCH v11 04/13] vfio/pci: Add VFIO_REGION_TYPE_NESTED region type

2020-12-17 Thread Kunkun Jiang

Hi Eric,

On 2020/11/16 19:00, Eric Auger wrote:

Add a new specific DMA_FAULT region aiming to exposed nested mode
translation faults. This region only is exposed if the device
is attached to a nested domain.

The region has a ring buffer that contains the actual fault
records plus a header allowing to handle it (tail/head indices,
max capacity, entry size). At the moment the region is dimensionned
for 512 fault records.

Signed-off-by: Eric Auger 

---

[...]

diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index b352e76cfb71..629dfb38d9e7 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -343,6 +343,9 @@ struct vfio_region_info_cap_type {
  /* sub-types for VFIO_REGION_TYPE_GFX */
  #define VFIO_REGION_SUBTYPE_GFX_EDID(1)
  
+#define VFIO_REGION_TYPE_NESTED			(2)

+#define VFIO_REGION_SUBTYPE_NESTED_DMA_FAULT   (1)
+


The macro *define VFIO_REGION_TYPE_NESTED    (2)* is in conflict with

*#define VFIO_REGION_TYPE_CCW    (2)*.


Thanks,

Kunkun Jiang.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH v2 0/8] ACPI/IORT: Support for IORT RMR node

2020-12-17 Thread Jon Nettleton
On Mon, Dec 14, 2020 at 3:48 PM Shameerali Kolothum Thodi
 wrote:
>
>
>
> > -Original Message-
> > From: Steven Price [mailto:steven.pr...@arm.com]
> > Sent: 14 December 2020 13:43
> > To: Robin Murphy ; Shameerali Kolothum Thodi
> > ;
> > linux-arm-ker...@lists.infradead.org; linux-a...@vger.kernel.org;
> > iommu@lists.linux-foundation.org; de...@acpica.org
> > Cc: Linuxarm ; lorenzo.pieral...@arm.com;
> > j...@8bytes.org; wanghuiqiang ; Guohanjun
> > (Hanjun Guo) ; Jonathan Cameron
> > ; sami.muja...@arm.com
> > Subject: Re: [RFC PATCH v2 0/8] ACPI/IORT: Support for IORT RMR node
> >
> > On 14/12/2020 12:33, Robin Murphy wrote:
> > > On 2020-12-14 10:55, Shameerali Kolothum Thodi wrote:
> > >> Hi Steve,
> > >>
> > >>> -Original Message-
> > >>> From: Steven Price [mailto:steven.pr...@arm.com]
> > >>> Sent: 10 December 2020 10:26
> > >>> To: Shameerali Kolothum Thodi
> > ;
> > >>> linux-arm-ker...@lists.infradead.org; linux-a...@vger.kernel.org;
> > >>> iommu@lists.linux-foundation.org; de...@acpica.org
> > >>> Cc: Linuxarm ; lorenzo.pieral...@arm.com;
> > >>> j...@8bytes.org; robin.mur...@arm.com; wanghuiqiang
> > >>> ; Guohanjun (Hanjun Guo)
> > >>> ; Jonathan Cameron
> > >>> ; sami.muja...@arm.com
> > >>> Subject: Re: [RFC PATCH v2 0/8] ACPI/IORT: Support for IORT RMR node
> > >>>
> > >>> On 19/11/2020 12:11, Shameer Kolothum wrote:
> >  RFC v1 --> v2:
> > - Added a generic interface for IOMMU drivers to retrieve all the
> >   RMR info associated with a given IOMMU.
> > - SMMUv3 driver gets the RMR list during probe() and installs
> >   bypass STEs for all the SIDs in the RMR list. This is to keep
> >   the ongoing traffic alive(if any) during SMMUv3 reset. This is
> >   based on the suggestions received for v1 to take care of the
> >   EFI framebuffer use case. Only sanity tested for now.
> > >>>
> > >>> Hi Shameer,
> > >>>
> > >>> Sorry for not looking at this before.
> > >>>
> > >>> Do you have any plans to implement support in the SMMUv2 driver? The
> > >>> platform I've been testing the EFI framebuffer support on has the
> > >>> display controller behind SMMUv2, so as it stands this series doesn't
> > >>> work. I did hack something up for SMMUv2 so I was able to test the first
> > >>> 4 patches.
> > >>
> > >> Thanks for taking a look. Sure, I can look into adding the support for
> > >> SMMUv2.
> >
> > Great, thanks!
> >
> > >>>
> > - During the probe/attach device, SMMUv3 driver reserves any
> >   RMR region associated with the device such that there is a unity
> >   mapping for them in SMMU.
> > >>>
> > >>> For the EFI framebuffer use case there is no device to attach so I
> > >>> believe we are left with just the stream ID in bypass mode - which is
> > >>> definitely an improvement (the display works!)
> > >>
> > >> Cool. That’s good to know.
> > >>
> > >>   but not actually a unity
> > >>> mapping of the RMR range. I'm not sure whether it's worth fixing this or
> > >>> not, but I just wanted to point out there's still a need for a driver
> > >>> for the device before the bypass mode is replaced with the unity
> > >>> mapping.
> > >>
> > >> I am not sure either. My idea was we will have bypass STE setup for
> > >> all devices
> > >> with RMR initially and when the corresponding driver takes over(if
> > >> that happens)
> > >> we will have the unity mapping setup properly for the RMR regions. And
> > >> for cases
> > >> like the above, it will remain in the bypass mode.
> > >>
> > >> Do you see any problem(security?) if the dev streams remain in bypass
> > >> mode for
> > >> this dev? Or is it possible to have a stub driver for this dev, so
> > >> that we will have
> > >> the probe/attach invoked and everything will fall in place?
> > >
> > > The downside is that if a driver never binds to that device, it remains
> > > bypassed. If some other externally-controlled malicious device could
> > > manage to spoof that device's requester ID, that could potentially be
> > > exploited.
> > >
> > >> TBH, I haven't looked into creating a temp domain for these types of
> > >> the devices
> > >> and also not sure how we benefit from that compared to the STE bypass
> > >> mode.
> > >
> > > That said, setting up temporary translation contexts that ensure any
> > > access can *only* be to RMR regions until a driver takes over is an
> > > awful lot more work. I'm inclined to be pragmatic here and say we should
> > > get things working at all with the simple bypass STE/S2CR method, then
> > > look at adding the higher-security effort on top.
> > >
> > > Right now systems that need this are either broken (but effectively
> > > secure) or using default bypass with SMMUv2. People who prefer to
> > > maintain security over functionality in the interim can maintain that
> > > status quo by simply continuing to not describe any RMRs.
> >
> > I agree with Robin, let's get this working with the bypass mode (until
> > the d

Re: [PATCH v2 1/4] dt-bindings: reserved-memory: Document "active" property

2020-12-17 Thread Thierry Reding
On Tue, Nov 10, 2020 at 08:33:09PM +0100, Thierry Reding wrote:
> On Fri, Nov 06, 2020 at 04:25:48PM +0100, Thierry Reding wrote:
> > On Thu, Nov 05, 2020 at 05:47:21PM +, Robin Murphy wrote:
> > > On 2020-11-05 16:43, Thierry Reding wrote:
> > > > On Thu, Sep 24, 2020 at 01:27:25PM +0200, Thierry Reding wrote:
> > > > > On Tue, Sep 15, 2020 at 02:36:48PM +0200, Thierry Reding wrote:
> > > > > > On Mon, Sep 14, 2020 at 04:08:29PM -0600, Rob Herring wrote:
> > > > > > > On Fri, Sep 04, 2020 at 02:59:57PM +0200, Thierry Reding wrote:
> > > > > > > > From: Thierry Reding 
> > > > > > > > 
> > > > > > > > Reserved memory regions can be marked as "active" if hardware is
> > > > > > > > expected to access the regions during boot and before the 
> > > > > > > > operating
> > > > > > > > system can take control. One example where this is useful is 
> > > > > > > > for the
> > > > > > > > operating system to infer whether the region needs to be 
> > > > > > > > identity-
> > > > > > > > mapped through an IOMMU.
> > > > > > > 
> > > > > > > I like simple solutions, but this hardly seems adequate to solve 
> > > > > > > the
> > > > > > > problem of passing IOMMU setup from bootloader/firmware to the 
> > > > > > > OS. Like
> > > > > > > what is the IOVA that's supposed to be used if identity mapping 
> > > > > > > is not
> > > > > > > used?
> > > > > > 
> > > > > > The assumption here is that if the region is not active there is no 
> > > > > > need
> > > > > > for the IOVA to be specified because the kernel will allocate 
> > > > > > memory and
> > > > > > assign any IOVA of its choosing.
> > > > > > 
> > > > > > Also, note that this is not meant as a way of passing IOMMU setup 
> > > > > > from
> > > > > > the bootloader or firmware to the OS. The purpose of this is to 
> > > > > > specify
> > > > > > that some region of memory is actively being accessed during boot. 
> > > > > > The
> > > > > > particular case that I'm looking at is where the bootloader set up a
> > > > > > splash screen and keeps it on during boot. The bootloader has not 
> > > > > > set up
> > > > > > an IOMMU mapping and the identity mapping serves as a way of 
> > > > > > keeping the
> > > > > > accesses by the display hardware working during the transitional 
> > > > > > period
> > > > > > after the IOMMU translations have been enabled by the kernel but 
> > > > > > before
> > > > > > the kernel display driver has had a chance to set up its own IOMMU
> > > > > > mappings.
> > > > > > 
> > > > > > > If you know enough about the regions to assume identity mapping, 
> > > > > > > then
> > > > > > > can't you know if active or not?
> > > > > > 
> > > > > > We could alternatively add some property that describes the region 
> > > > > > as
> > > > > > requiring an identity mapping. But note that we can't make any
> > > > > > assumptions here about the usage of these regions because the IOMMU
> > > > > > driver simply has no way of knowing what they are being used for.
> > > > > > 
> > > > > > Some additional information is required in device tree for the IOMMU
> > > > > > driver to be able to make that decision.
> > > > > 
> > > > > Rob, can you provide any hints on exactly how you want to move this
> > > > > forward? I don't know in what direction you'd like to proceed.
> > > > 
> > > > Hi Rob,
> > > > 
> > > > do you have any suggestions on how to proceed with this? I'd like to get
> > > > this moving again because it's something that's been nagging me for some
> > > > months now. It also requires changes across two levels in the bootloader
> > > > stack as well as Linux and it takes quite a bit of work to make all the
> > > > changes, so before I go and rewrite everything I'd like to get the DT
> > > > bindings sorted out first.
> > > > 
> > > > So just to summarize why I think this simple solution is good enough: it
> > > > tries to solve a very narrow and simple problem. This is not an attempt
> > > > at describing the firmware's full IOMMU setup to the kernel. In fact, it
> > > > is primarily targetted at cases where the firmware hasn't setup an IOMMU
> > > > at all, and we just want to make sure that when the kernel takes over
> > > > and does want to enable the IOMMU, that all the regions that are
> > > > actively being accessed by non-quiesced hardware (the most typical
> > > > example would be a framebuffer scanning out a splat screen or animation,
> > > > but it could equally well be some sort of welcoming tone or music being
> > > > played back) are described in device tree.
> > > > 
> > > > In other words, and this is perhaps better answering your second
> > > > question: in addition to describing reserved memory regions, we want to
> > > > add a bit of information here about the usage of these memory regions.
> > > > Some memory regions may contain information that the kernel may want to
> > > > use (such an external memory frequency scaling tables) and those I would
> > > > describe as "inactive" memory because it isn't being acc

RE: [RFC PATCH v2 0/8] ACPI/IORT: Support for IORT RMR node

2020-12-17 Thread Shameerali Kolothum Thodi


> -Original Message-
> From: Jon Nettleton [mailto:j...@solid-run.com]
> Sent: 17 December 2020 14:48
> To: Shameerali Kolothum Thodi 
> Cc: Steven Price ; Robin Murphy
> ; linux-arm-ker...@lists.infradead.org;
> linux-a...@vger.kernel.org; iommu@lists.linux-foundation.org;
> de...@acpica.org; lorenzo.pieral...@arm.com; j...@8bytes.org; Guohanjun
> (Hanjun Guo) ; Linuxarm ;
> Jonathan Cameron ;
> sami.muja...@arm.com; wanghuiqiang 
> Subject: Re: [RFC PATCH v2 0/8] ACPI/IORT: Support for IORT RMR node
> 
> On Mon, Dec 14, 2020 at 3:48 PM Shameerali Kolothum Thodi
>  wrote:
> >
> >
> >
> > > -Original Message-
> > > From: Steven Price [mailto:steven.pr...@arm.com]
> > > Sent: 14 December 2020 13:43
> > > To: Robin Murphy ; Shameerali Kolothum Thodi
> > > ;
> > > linux-arm-ker...@lists.infradead.org; linux-a...@vger.kernel.org;
> > > iommu@lists.linux-foundation.org; de...@acpica.org
> > > Cc: Linuxarm ; lorenzo.pieral...@arm.com;
> > > j...@8bytes.org; wanghuiqiang ; Guohanjun
> > > (Hanjun Guo) ; Jonathan Cameron
> > > ; sami.muja...@arm.com
> > > Subject: Re: [RFC PATCH v2 0/8] ACPI/IORT: Support for IORT RMR node
> > >
> > > On 14/12/2020 12:33, Robin Murphy wrote:
> > > > On 2020-12-14 10:55, Shameerali Kolothum Thodi wrote:
> > > >> Hi Steve,
> > > >>
> > > >>> -Original Message-
> > > >>> From: Steven Price [mailto:steven.pr...@arm.com]
> > > >>> Sent: 10 December 2020 10:26
> > > >>> To: Shameerali Kolothum Thodi
> > > ;
> > > >>> linux-arm-ker...@lists.infradead.org; linux-a...@vger.kernel.org;
> > > >>> iommu@lists.linux-foundation.org; de...@acpica.org
> > > >>> Cc: Linuxarm ; lorenzo.pieral...@arm.com;
> > > >>> j...@8bytes.org; robin.mur...@arm.com; wanghuiqiang
> > > >>> ; Guohanjun (Hanjun Guo)
> > > >>> ; Jonathan Cameron
> > > >>> ; sami.muja...@arm.com
> > > >>> Subject: Re: [RFC PATCH v2 0/8] ACPI/IORT: Support for IORT RMR node
> > > >>>
> > > >>> On 19/11/2020 12:11, Shameer Kolothum wrote:
> > >  RFC v1 --> v2:
> > > - Added a generic interface for IOMMU drivers to retrieve all the
> > >   RMR info associated with a given IOMMU.
> > > - SMMUv3 driver gets the RMR list during probe() and installs
> > >   bypass STEs for all the SIDs in the RMR list. This is to keep
> > >   the ongoing traffic alive(if any) during SMMUv3 reset. This is
> > >   based on the suggestions received for v1 to take care of the
> > >   EFI framebuffer use case. Only sanity tested for now.
> > > >>>
> > > >>> Hi Shameer,
> > > >>>
> > > >>> Sorry for not looking at this before.
> > > >>>
> > > >>> Do you have any plans to implement support in the SMMUv2 driver?
> The
> > > >>> platform I've been testing the EFI framebuffer support on has the
> > > >>> display controller behind SMMUv2, so as it stands this series doesn't
> > > >>> work. I did hack something up for SMMUv2 so I was able to test the
> first
> > > >>> 4 patches.
> > > >>
> > > >> Thanks for taking a look. Sure, I can look into adding the support for
> > > >> SMMUv2.
> > >
> > > Great, thanks!
> > >
> > > >>>
> > > - During the probe/attach device, SMMUv3 driver reserves any
> > >   RMR region associated with the device such that there is a
> unity
> > >   mapping for them in SMMU.
> > > >>>
> > > >>> For the EFI framebuffer use case there is no device to attach so I
> > > >>> believe we are left with just the stream ID in bypass mode - which is
> > > >>> definitely an improvement (the display works!)
> > > >>
> > > >> Cool. That’s good to know.
> > > >>
> > > >>   but not actually a unity
> > > >>> mapping of the RMR range. I'm not sure whether it's worth fixing this
> or
> > > >>> not, but I just wanted to point out there's still a need for a driver
> > > >>> for the device before the bypass mode is replaced with the unity
> > > >>> mapping.
> > > >>
> > > >> I am not sure either. My idea was we will have bypass STE setup for
> > > >> all devices
> > > >> with RMR initially and when the corresponding driver takes over(if
> > > >> that happens)
> > > >> we will have the unity mapping setup properly for the RMR regions. And
> > > >> for cases
> > > >> like the above, it will remain in the bypass mode.
> > > >>
> > > >> Do you see any problem(security?) if the dev streams remain in bypass
> > > >> mode for
> > > >> this dev? Or is it possible to have a stub driver for this dev, so
> > > >> that we will have
> > > >> the probe/attach invoked and everything will fall in place?
> > > >
> > > > The downside is that if a driver never binds to that device, it remains
> > > > bypassed. If some other externally-controlled malicious device could
> > > > manage to spoof that device's requester ID, that could potentially be
> > > > exploited.
> > > >
> > > >> TBH, I haven't looked into creating a temp domain for these types of
> > > >> the devices
> > > >> and also not sure how we benefit from that compared to the STE bypass
> > > >> mode.
> > > >
> > > > T

Re: [RFC PATCH v2 0/8] ACPI/IORT: Support for IORT RMR node

2020-12-17 Thread Jon Nettleton
On Thu, Dec 17, 2020 at 4:42 PM Shameerali Kolothum Thodi
 wrote:
>
>
>
> > -Original Message-
> > From: Jon Nettleton [mailto:j...@solid-run.com]
> > Sent: 17 December 2020 14:48
> > To: Shameerali Kolothum Thodi 
> > Cc: Steven Price ; Robin Murphy
> > ; linux-arm-ker...@lists.infradead.org;
> > linux-a...@vger.kernel.org; iommu@lists.linux-foundation.org;
> > de...@acpica.org; lorenzo.pieral...@arm.com; j...@8bytes.org; Guohanjun
> > (Hanjun Guo) ; Linuxarm ;
> > Jonathan Cameron ;
> > sami.muja...@arm.com; wanghuiqiang 
> > Subject: Re: [RFC PATCH v2 0/8] ACPI/IORT: Support for IORT RMR node
> >
> > On Mon, Dec 14, 2020 at 3:48 PM Shameerali Kolothum Thodi
> >  wrote:
> > >
> > >
> > >
> > > > -Original Message-
> > > > From: Steven Price [mailto:steven.pr...@arm.com]
> > > > Sent: 14 December 2020 13:43
> > > > To: Robin Murphy ; Shameerali Kolothum Thodi
> > > > ;
> > > > linux-arm-ker...@lists.infradead.org; linux-a...@vger.kernel.org;
> > > > iommu@lists.linux-foundation.org; de...@acpica.org
> > > > Cc: Linuxarm ; lorenzo.pieral...@arm.com;
> > > > j...@8bytes.org; wanghuiqiang ; Guohanjun
> > > > (Hanjun Guo) ; Jonathan Cameron
> > > > ; sami.muja...@arm.com
> > > > Subject: Re: [RFC PATCH v2 0/8] ACPI/IORT: Support for IORT RMR node
> > > >
> > > > On 14/12/2020 12:33, Robin Murphy wrote:
> > > > > On 2020-12-14 10:55, Shameerali Kolothum Thodi wrote:
> > > > >> Hi Steve,
> > > > >>
> > > > >>> -Original Message-
> > > > >>> From: Steven Price [mailto:steven.pr...@arm.com]
> > > > >>> Sent: 10 December 2020 10:26
> > > > >>> To: Shameerali Kolothum Thodi
> > > > ;
> > > > >>> linux-arm-ker...@lists.infradead.org; linux-a...@vger.kernel.org;
> > > > >>> iommu@lists.linux-foundation.org; de...@acpica.org
> > > > >>> Cc: Linuxarm ; lorenzo.pieral...@arm.com;
> > > > >>> j...@8bytes.org; robin.mur...@arm.com; wanghuiqiang
> > > > >>> ; Guohanjun (Hanjun Guo)
> > > > >>> ; Jonathan Cameron
> > > > >>> ; sami.muja...@arm.com
> > > > >>> Subject: Re: [RFC PATCH v2 0/8] ACPI/IORT: Support for IORT RMR node
> > > > >>>
> > > > >>> On 19/11/2020 12:11, Shameer Kolothum wrote:
> > > >  RFC v1 --> v2:
> > > > - Added a generic interface for IOMMU drivers to retrieve all 
> > > >  the
> > > >   RMR info associated with a given IOMMU.
> > > > - SMMUv3 driver gets the RMR list during probe() and installs
> > > >   bypass STEs for all the SIDs in the RMR list. This is to keep
> > > >   the ongoing traffic alive(if any) during SMMUv3 reset. This is
> > > >   based on the suggestions received for v1 to take care of the
> > > >   EFI framebuffer use case. Only sanity tested for now.
> > > > >>>
> > > > >>> Hi Shameer,
> > > > >>>
> > > > >>> Sorry for not looking at this before.
> > > > >>>
> > > > >>> Do you have any plans to implement support in the SMMUv2 driver?
> > The
> > > > >>> platform I've been testing the EFI framebuffer support on has the
> > > > >>> display controller behind SMMUv2, so as it stands this series 
> > > > >>> doesn't
> > > > >>> work. I did hack something up for SMMUv2 so I was able to test the
> > first
> > > > >>> 4 patches.
> > > > >>
> > > > >> Thanks for taking a look. Sure, I can look into adding the support 
> > > > >> for
> > > > >> SMMUv2.
> > > >
> > > > Great, thanks!
> > > >
> > > > >>>
> > > > - During the probe/attach device, SMMUv3 driver reserves any
> > > >   RMR region associated with the device such that there is a
> > unity
> > > >   mapping for them in SMMU.
> > > > >>>
> > > > >>> For the EFI framebuffer use case there is no device to attach so I
> > > > >>> believe we are left with just the stream ID in bypass mode - which 
> > > > >>> is
> > > > >>> definitely an improvement (the display works!)
> > > > >>
> > > > >> Cool. That’s good to know.
> > > > >>
> > > > >>   but not actually a unity
> > > > >>> mapping of the RMR range. I'm not sure whether it's worth fixing 
> > > > >>> this
> > or
> > > > >>> not, but I just wanted to point out there's still a need for a 
> > > > >>> driver
> > > > >>> for the device before the bypass mode is replaced with the unity
> > > > >>> mapping.
> > > > >>
> > > > >> I am not sure either. My idea was we will have bypass STE setup for
> > > > >> all devices
> > > > >> with RMR initially and when the corresponding driver takes over(if
> > > > >> that happens)
> > > > >> we will have the unity mapping setup properly for the RMR regions. 
> > > > >> And
> > > > >> for cases
> > > > >> like the above, it will remain in the bypass mode.
> > > > >>
> > > > >> Do you see any problem(security?) if the dev streams remain in bypass
> > > > >> mode for
> > > > >> this dev? Or is it possible to have a stub driver for this dev, so
> > > > >> that we will have
> > > > >> the probe/attach invoked and everything will fall in place?
> > > > >
> > > > > The downside is that if a driver never binds to that device, it 
> > > > > remains
> 

Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

2020-12-17 Thread Bjorn Helgaas
On Wed, Dec 16, 2020 at 07:24:30PM +0800, Zhou Wang wrote:
> On 2020/6/23 23:04, Bjorn Helgaas wrote:
> > On Fri, Jun 19, 2020 at 10:26:54AM +0800, Zhangfei Gao wrote:
> >> Have studied _DSM method, two issues we met comparing using quirk.
> >>
> >> 1. Need change definition of either pci_host_bridge or pci_dev, like adding
> >> member can_stall,
> >> while pci system does not know stall now.
> >>
> >> a, pci devices do not have uuid: uuid need be described in dsdt, while pci
> >> devices are not defined in dsdt.
> >> so we have to use host bridge.
> > 
> > PCI devices *can* be described in the DSDT.  IIUC these particular
> > devices are hardwired (not plug-in cards), so platform firmware can
> > know about them and could describe them in the DSDT.
> > 
> >> b,  Parsing dsdt is in in pci subsystem.
> >> Like drivers/acpi/pci_root.c:
> >>obj = acpi_evaluate_dsm(ACPI_HANDLE(bus->bridge), 
> >> &pci_acpi_dsm_guid,
> >> 1,
> >> IGNORE_PCI_BOOT_CONFIG_DSM, NULL);
> >>
> >> After parsing DSM in pci, we need record this info.
> >> Currently, can_stall info is recorded in iommu_fwspec,
> >> which is allocated in iommu_fwspec_init and called by iort_iommu_configure
> >> for uefi.
> > 
> > You can look for a _DSM wherever it is convenient for you.  It could
> > be in an AMBA shim layer.
> > 
> >> 2. Guest kernel also need support sva.
> >> Using quirk, the guest can boot with sva enabled, since quirk is
> >> self-contained by kernel.
> >> If using  _DSM, a specific uefi or dtb has to be provided,
> >> currently we can useQEMU_EFI.fd from apt install qemu-efi
> > 
> > I don't quite understand what this means, but as I mentioned before, a
> > quirk for a *limited* number of devices is OK, as long as there is a
> > plan that removes the need for a quirk for future devices.
> > 
> > E.g., if the next platform version ships with a DTB or firmware with a
> > _DSM or other mechanism that enables the kernel to discover this
> > information without a kernel change, it's fine to use a quirk to cover
> > the early platform.
> > 
> > The principles are:
> > 
> >   - I don't want to have to update a quirk for every new Device ID
> > that needs this.
> 
> Hi Bjorn and Zhangfei,
> 
> We plan to use ATS/PRI to support SVA in future PCI devices. However, for
> current devices, we need to add limited number of quirk to let them
> work. The device IDs of current quirk needed devices are ZIP engine(0xa250, 
> 0xa251),
> SEC engine(0xa255, 0xa256), HPRE engine(0xa258, 0xa259), revision id are
> 0x21 and 0x30.
> 
> Let's continue to upstream these quirks!

Please post the patches you propose.  I don't think the previous ones
are in my queue.  Please include the lore URL for the previous
posting(s) in the cover letter so we can connect the discussion.

> >   - I don't really want to have to manage non-PCI information in the
> > struct pci_dev.  If this is AMBA- or IOMMU-related, it should be
> > stored in a structure related to AMBA or the IOMMU.
> > .
> > 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: swiotlb/virtio: unchecked device dma address and length

2020-12-17 Thread Ashish Kalra
On Thu, Dec 17, 2020 at 12:19:16PM +0800, Jason Wang wrote:
> 
> On 2020/12/16 下午9:04, Konrad Rzeszutek Wilk wrote:
> > On December 16, 2020 1:41:48 AM EST, Jason Wang  wrote:
> > > 
> > > - Original Message -
> > > > 
> > > > - Original Message -
> > > > > .snip.
> > > > > > > > This raises two issues:
> > > > > > > > 1) swiotlb_tlb_unmap_single fails to check whether the index
> > > > > > > > generated
> > > > > > > > from the dma_addr is in range of the io_tlb_orig_addr array.
> > > > > > > That is fairly simple to implement I would think. That is it
> > > can check
> > > > > > > that the dma_addr is from the PA in the io_tlb pool when
> > > SWIOTLB=force
> > > > > > > is used.
> > > > > > 
> > > > > > I'm not sure this can fix all the cases. It looks to me we should
> > > map
> > > > > > descriptor coherent but readonly (which is not supported by
> > > current DMA
> > > > > > API).
> > > > > I think I am missing something obvious here. The attacker is the
> > > > > hypervisor,
> > > > > aka
> > > > > the owner of the VirtIO device (ring0). The attacker is the one
> > > that
> > > > > provides the addr/len - having that readonly from a guest
> > > perspective
> > > > > does not change the fact that the hypervisor can modify the memory
> > > range
> > > > > by mapping it via a different virtual address in the hypervisor?
> > > (aka
> > > > > aliasing it).
> > > > Right, but if we allow hypervisor to provide arbitrary addr/len, does
> > > > it mean hypervisor can read encrypted content of encrypted memory of
> > > > guest through swiotlb?
> > Yes .
> > > > Thanks
> > > Actually not. I think you're right.
> > 
> > Your sentence is very confusing.
> 
> 
> Sorry for  being unclear. This is all a reply to your suggestion of adding
> checks in the swiotlb.
> 
> 
> > 
> > On a DMA unmap SWIOTLB (when force is used) it trusts the driver from 
> > providing the correct DMA address and length which SWIOTLB uses to match to 
> > its associated original PA address.
> > 
> > Think original PA having a mapping to a PA in the SWIOTLB pool.
> > 
> > 
> > The length is not checked so the attacker can modify that to say a huge 
> > number and cause SWIOTLB bounce code to write or read data from non SWIOTLB 
> > PA into the SWIOTLB PA pool.
> 
> 
> How can we read in this case? It looks to me we don't try to read during
> dma_unmap().
> 

That seems to be correct as in the unmap path, swiotlb_bounce() is being
called with DMA_FROM_DEVICE flag, so there is no read involved during
dma_unmap().

Thanks,
Ashish

> 
> 
> > 
> > 
> > 
> > 
> > > Thanks
> > > 
> > > > > > Otherwise, device can modify the desc[i].addr/desc[i].len at any
> > > time to
> > > > > > pretend a valid mapping.
> > > > > With the swiotlb=force as long as addr/len are within the PA
> > > boundaries
> > > > > within the SWIOTLB pool this should be OK?
> > > > > 
> > > > > After all that whole area is in cleartext and visible to the
> > > attacker.
> > > > > 
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: swiotlb/virtio: unchecked device dma address and length

2020-12-17 Thread Ashish Kalra
On Wed, Dec 16, 2020 at 10:07:31PM +, Radev, Martin wrote:
> Hello everybody,
> 
> I will try help clarify some things.
> 
> > On a DMA unmap SWIOTLB (when force is used) it trusts the driver from 
> > providing
> > the correct DMA address and length which SWIOTLB uses to match to its 
> > associated
> > original PA address.
> > The length is not checked so the attacker can modify that to say a huge 
> > number
> > and cause SWIOTLB bounce code to write or read data from non SWIOTLB PA 
> > into the
> > SWIOTLB PA pool.
> 
> This is true.
> As an example, I attached to the QEMU process, set a BP to 
> `virtqueue_split_fill`
> and modified the length field from 0x40 to 0x1, and filled the 
> corresponding
> buffer in the swiotlb region with As (0x41).
> 
> Immediately after resuming execution, the kernel would crash:
> [  122.154142] general protection fault, probably for non-canonical address 
> 0x4141414141414141:  [#1] PREEMPT SMP NOPTI
> [  122.156088] CPU: 0 PID: 917 Comm: kworker/0:6 Kdump: loaded Tainted: G 
>W   E 5.6.12-sevault+ #28
> [  122.157855] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 
> 02/06/2015
> [  122.159079] Workqueue: events_freezable_power_ disk_events_workfn
> [  122.160040] RIP: 0010:scsi_queue_rq+0x5af/0xa70 [scsi_mod]
> [  122.160916] Code: 01 89 83 9c 02 00 00 41 80 7f 08 00 74 07 83 8b 9c 02 00 
> 00 08 48 8b 83 40 02 00 00 c7 83 3c 01 00 00 00 00 00 00 48 8d 78 08 <48>
>  c7 00 00 00 00 00 48 c7 40 58 00 00 00 00 48 83 e7 f8 48 
> 29 f8
> [  122.163821] RSP: 0018:c92efb08 EFLAGS: 00010202
> [  122.164637] RAX: 4141414141414141 RBX: 888035b89c00 RCX: 
> 888035b89ed0
> [  122.165775] RDX: 0008 RSI:  RDI: 
> 4141414141414149
> [  122.166891] RBP: 888035946000 R08: 888035a79860 R09: 
> 
> [  122.168016] R10: ea0001287280 R11: 0008 R12: 
> 888035b89d18
> [  122.169159] R13: 888035945000 R14: 888035946000 R15: 
> c92efba0
> [  122.170287] FS:  () GS:88807f80() 
> knlGS:
> [  122.171564] CS:  0010 DS:  ES:  CR0: 80050033
> [  122.172470] CR2: 560e654b77b8 CR3: 4dd38000 CR4: 
> 003406f0
> 

I believe the above example is without a SEV guest enabled/active, as SEV
guest debugging can only be done with SEV Debug patches applied.

> What and where gets overwritten entirely depends on what virtio driver is 
> being
> targeted. All manage their memory for the descriptor buffers differently so 
> the overwrite
> may require to be large.
> 
> In the context of VirtIO and SWIOTLB, there are also these three fields other 
> than the length:
> dma_addr, flags, next
> 
> I had a look around a little bit, so my take is the following:
> 
> 1) There's already validation for dma_addr before doing the unmap with a call
>to is_swiotlb_buffer (1). I think this check is sufficient.
> 
> 2) flags
>Before doing the unmap, the virtio implementation would check the flag and 
> based on it
>would select a DMA direction (TO/FROM DEVICE). Still, it seems that this 
> would not
>trick the driver to copy data to the device since only a `sync for CPU` 
> may be performed
>in the unmap path.

That seems to be true. 

Thanks,
Ashish

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 4/9] of/iommu: Support dma-can-stall property

2020-12-17 Thread Zhangfei Gao



On 2020/12/14 下午8:51, Jean-Philippe Brucker wrote:

On Thu, Nov 26, 2020 at 06:09:26PM +, Robin Murphy wrote:

On 2020-11-12 12:55, Jean-Philippe Brucker wrote:

Copy the dma-can-stall property into the fwspec structure.

Can't we just handle this as a regular device property? It's not part of the
actual IOMMU specifier, it doesn't need to be translated in any way, and
AFAICS it's used a grand total of once, in a slow path. Simply treating it
as the per-device property that it is should require zero additional code
for DT, and a simple device_add_properties() call for IORT.

TBH that appears to be true of pasid-num-bits as well.

Right I think that's better, thanks for the pointer. I'll take care of
pasid-num-bits too. The Huawei quirk (fake PCIe supporting stall) is a
little worse this way, but it should work.


Thanks Jean, I tested the following diff, it works with Huawei quirk.

Thanks
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu