RE: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer

2015-05-11 Thread William Davis


> -Original Message-
> From: Mark Hounschell [mailto:ma...@compro.net]
> Sent: Friday, May 08, 2015 3:46 PM
> To: Konrad Rzeszutek Wilk; William Davis
> Cc: linux-...@vger.kernel.org; iommu@lists.linux-foundation.org; 
> jgli...@redhat.com; John Hubbard;
> Terence Ripperda
> Subject: Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
> 
> On 05/08/2015 04:21 PM, Konrad Rzeszutek Wilk wrote:
> > On Fri, May 01, 2015 at 01:32:12PM -0500, wda...@nvidia.com wrote:
> >> From: Will Davis 
> >>
> >> Hi,
> >>
> ...
> >>
> >> This solves a long-standing problem with the existing DMA-remapping 
> >> interfaces,
> >> which require that a struct page be given for the region to be mapped into 
> >> a
> >> device's IOVA domain. This requirement cannot support peer device BAR 
> >> ranges,
> >> for which no struct pages exist.
> >>
> ...
> >>
> >> The Intel and nommu versions have been verified on a dual Intel Xeon E5405
> >> workstation.
> 
> PCIe peer2peer is borked on all motherboards I've tried. Only writes are
> possible. Reads are not supported. I suppose if you have a platform with
> only PCI and an IOMMU this would be very useful. Without both read and
> write PCIe peer2peer support, this seems unnecessary.
> 

PCIe peer-to-peer isn't inherently broken or useless, even if many of its 
implementations are; I've successfully tested these patches on existing 
hardware (dual NVIDIA GPUs in an older workstation), and they solve a 
long-standing problem for us [1], so I disagree with the assessment that this 
would be unnecessary.

I don't see why the existence of some, or even many, poor implementations is 
sufficient reason to reject a generic mechanism that supports the "good", 
standardized implementations.

[1] 
http://stackoverflow.com/questions/19841815/does-the-nvidia-rdma-gpudirect-always-operate-only-physical-addresses-in-physic

Thanks,
Will

--
nvpublic


RE: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer

2015-05-11 Thread William Davis


> -Original Message-
> From: Konrad Rzeszutek Wilk [mailto:konrad.w...@oracle.com]
> Sent: Friday, May 08, 2015 3:22 PM
> To: William Davis
> Cc: j...@8bytes.org; jgli...@redhat.com; linux-...@vger.kernel.org; 
> iommu@lists.linux-foundation.org;
> John Hubbard; Terence Ripperda
> Subject: Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
> 
> ...
> >
> > The Intel and nommu versions have been verified on a dual Intel Xeon E5405
> > workstation. I'm in the process of obtaining hardware to test the AMD 
> > version
> > as well. Please review.
> 
> Does it work if you boot with 'iommu=soft swiotlb=force', which will mandate
> strict usage of the DMA API?
> 

This patch series doesn't yet add a SWIOTLB implementation, so the 
dma_map_resource() call would return 0 to indicate that the path is not 
implemented (see patch 2/6). So no, the new interfaces would not work with that 
configuration, but they're also not expected to at this point.
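
For illustration, a caller has to treat that 0 return as "backend not
implemented". A minimal sketch of the calling side (the prototype below is
assumed from the cover letter's description -- the real one is defined in
patch 2/6 -- and "peer_res" stands in for the peer device's BAR resource):

  dma_addr_t handle;

  handle = dma_map_resource(dev, peer_res, 0 /* offset into the BAR */,
                            resource_size(peer_res), DMA_BIDIRECTIONAL);
  if (!handle) {
          /* Backend (e.g. SWIOTLB) has no map_resource implementation. */
          return -EOPNOTSUPP;
  }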

Thanks,
Will
--
nvpublic


Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer

2015-05-11 Thread Don Dutile

On 05/07/2015 02:11 PM, Jerome Glisse wrote:
> On Thu, May 07, 2015 at 12:16:30PM -0500, Bjorn Helgaas wrote:
>> On Thu, May 7, 2015 at 11:23 AM, William Davis  wrote:
>>>> From: Bjorn Helgaas [mailto:bhelg...@google.com]
>>>> Sent: Thursday, May 7, 2015 8:13 AM
>>>> To: Yijing Wang
>>>> Cc: William Davis; Joerg Roedel; open list:INTEL IOMMU (VT-d); linux-
>>>> p...@vger.kernel.org; Terence Ripperda; John Hubbard; Jerome Glisse; Dave
>>>> Jiang; David S. Miller; Alex Williamson
>>>> Subject: Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
>>>>
>>>> On Wed, May 6, 2015 at 8:48 PM, Yijing Wang  wrote:
>>>>> On 2015/5/7 6:18, Bjorn Helgaas wrote:
>>>>>> [+cc Yijing, Dave J, Dave M, Alex]
>>>>>>
>>>>>> On Fri, May 01, 2015 at 01:32:12PM -0500, wda...@nvidia.com wrote:
>>>>>>> From: Will Davis 
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> This patch series adds DMA APIs to map and unmap a struct resource
>>>>>>> to and from a PCI device's IOVA domain, and implements the AMD,
>>>>>>> Intel, and nommu versions of these interfaces.
>>>>>>>
>>>>>>> This solves a long-standing problem with the existing DMA-remapping
>>>>>>> interfaces, which require that a struct page be given for the region
>>>>>>> to be mapped into a device's IOVA domain. This requirement cannot
>>>>>>> support peer device BAR ranges, for which no struct pages exist.
>>>>>>> ...
>>>>>>
>>>>>> I think we currently assume there's no peer-to-peer traffic.
>>>>>>
>>>>>> I don't know whether changing that will break anything, but I'm
>>>>>> concerned about these:
>>>>>>
>>>>>>   - PCIe MPS configuration (see pcie_bus_configure_settings()).
>>>>>
>>>>> I think it should be OK for PCIe MPS configuration; PCIE_BUS_PEER2PEER
>>>>> forces every device's MPS to 128B, and its concern is the TLP payload
>>>>> size. This series seems to only map an IOVA for a device BAR region.
>>>>
>>>> MPS configuration makes assumptions about whether there will be any peer-
>>>> to-peer traffic.  If there will be none, MPS can be configured more
>>>> aggressively.
>>>>
>>>> I don't think Linux has any way to detect whether a driver is doing peer-
>>>> to-peer, and there's no way to prevent a driver from doing it.
>>>> We're stuck with requiring the user to specify boot options
>>>> ("pci=pcie_bus_safe", "pci=pcie_bus_perf", "pci=pcie_bus_peer2peer",
>>>> etc.) that tell the PCI core what the user expects to happen.
>>>>
>>>> This is a terrible user experience.  The user has no way to tell what
>>>> drivers are going to do.  If he specifies the wrong thing, e.g., "assume no
>>>> peer-to-peer traffic," and then loads a driver that does peer-to-peer, the
>>>> kernel will configure MPS aggressively and when the device does a peer-to-
>>>> peer transfer, it may cause a Malformed TLP error.
>>>
>>> I agree that this isn't a great user experience, but just want to clarify
>>> that this problem is orthogonal to this patch series, correct?
>>>
>>> Prior to this series, the MPS mismatch is still possible with p2p traffic,
>>> but when an IOMMU is enabled p2p traffic will result in DMAR faults. The
>>> aim of the series is to allow drivers to fix the latter, not the former.
>>
>> Prior to this series, there wasn't any infrastructure for drivers to
>> do p2p, so it was mostly reasonable to assume that there *was* no p2p
>> traffic.
>>
>> I think we currently default to doing nothing to MPS.  Prior to this
>> series, it might have been reasonable to optimize based on a "no-p2p"
>> assumption, e.g., default to pcie_bus_safe or pcie_bus_perf.  After
>> this series, I'm not sure what we could do, because p2p will be much
>> more likely.
>>
>> It's just an issue; I don't know what the resolution is.
>
> Can't we just have each device update its MPS at runtime? So if device A
> decides to map something from device B, then device A updates the MPS for
> A and B to the lowest common supported value.
>
> Of course you need to keep track of that per device, so that if a device C
> comes around and wants to exchange with device B, and both C and B support
> a higher payload than A, then C reprogramming B would trigger issues for A.
>
> I know we update other PCIe configuration parameters at runtime for GPUs;
> I don't know whether it is widely tested for other devices.


I believe all these cases are between endpoints and the upstream ports of the
PCIe root port/host-bridge/PCIe switch they are connected to, i.e., true wire
peers -- not across a PCIe domain, which is the context this p2p (and
therefore the MPS) has to span.



> Cheers,
> Jérôme





Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer

2015-05-11 Thread Konrad Rzeszutek Wilk
On Fri, May 08, 2015 at 04:46:17PM -0400, Mark Hounschell wrote:
> On 05/08/2015 04:21 PM, Konrad Rzeszutek Wilk wrote:
> >On Fri, May 01, 2015 at 01:32:12PM -0500, wda...@nvidia.com wrote:
> >>From: Will Davis 
> >>
> >>Hi,
> >>
> >>This patch series adds DMA APIs to map and unmap a struct resource to and 
> >>from
> >>a PCI device's IOVA domain, and implements the AMD, Intel, and nommu 
> >>versions
> >>of these interfaces.
> >>
> >>This solves a long-standing problem with the existing DMA-remapping 
> >>interfaces,
> >>which require that a struct page be given for the region to be mapped into a
> >>device's IOVA domain. This requirement cannot support peer device BAR 
> >>ranges,
> >>for which no struct pages exist.
> >>
> >>The underlying implementations of map_page and map_sg convert the struct 
> >>page
> >>into its physical address anyway, so we just need a way to route the 
> >>physical
> >>address of the BAR region to these implementations. The new interfaces do 
> >>this
> >>by taking the struct resource describing a device's BAR region, from which 
> >>the
> >>physical address is derived.
> >>
> >>The Intel and nommu versions have been verified on a dual Intel Xeon E5405
> >>workstation. I'm in the process of obtaining hardware to test the AMD 
> >>version
> >>as well. Please review.
> >
> >Does it work if you boot with 'iommu=soft swiotlb=force', which will mandate
> >strict usage of the DMA API?
> >
> 
> PCIe peer2peer is borked on all motherboards I've tried. Only writes are

:-(

> possible. Reads are not supported. I suppose if you have a platform with
> only PCI and an IOMMU this would be very useful. Without both read and write
> PCIe peer2peer support, this seems unnecessary.
> 

It is a perfect way to test the code, to make sure the API works (or fails in
its expected failure modes) _and_ that the drivers do as well (use
pci_map_sync, and so on).

> Mark
> 
> 


Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer

2015-05-08 Thread Mark Hounschell

On 05/08/2015 04:21 PM, Konrad Rzeszutek Wilk wrote:
> On Fri, May 01, 2015 at 01:32:12PM -0500, wda...@nvidia.com wrote:
>> From: Will Davis 
>>
>> Hi,
>>
>> This patch series adds DMA APIs to map and unmap a struct resource to and from
>> a PCI device's IOVA domain, and implements the AMD, Intel, and nommu versions
>> of these interfaces.
>>
>> This solves a long-standing problem with the existing DMA-remapping interfaces,
>> which require that a struct page be given for the region to be mapped into a
>> device's IOVA domain. This requirement cannot support peer device BAR ranges,
>> for which no struct pages exist.
>>
>> The underlying implementations of map_page and map_sg convert the struct page
>> into its physical address anyway, so we just need a way to route the physical
>> address of the BAR region to these implementations. The new interfaces do this
>> by taking the struct resource describing a device's BAR region, from which the
>> physical address is derived.
>>
>> The Intel and nommu versions have been verified on a dual Intel Xeon E5405
>> workstation. I'm in the process of obtaining hardware to test the AMD version
>> as well. Please review.
>
> Does it work if you boot with 'iommu=soft swiotlb=force', which will mandate
> strict usage of the DMA API?



PCIe peer2peer is borked on all motherboards I've tried. Only writes are 
possible. Reads are not supported. I suppose if you have a platform with 
only PCI and an IOMMU this would be very useful. Without both read and 
write PCIe peer2peer support, this seems unnecessary.


Mark




Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer

2015-05-08 Thread Konrad Rzeszutek Wilk
On Fri, May 01, 2015 at 01:32:12PM -0500, wda...@nvidia.com wrote:
> From: Will Davis 
> 
> Hi,
> 
> This patch series adds DMA APIs to map and unmap a struct resource to and from
> a PCI device's IOVA domain, and implements the AMD, Intel, and nommu versions
> of these interfaces.
> 
> This solves a long-standing problem with the existing DMA-remapping 
> interfaces,
> which require that a struct page be given for the region to be mapped into a
> device's IOVA domain. This requirement cannot support peer device BAR ranges,
> for which no struct pages exist.
> 
> The underlying implementations of map_page and map_sg convert the struct page
> into its physical address anyway, so we just need a way to route the physical
> address of the BAR region to these implementations. The new interfaces do this
> by taking the struct resource describing a device's BAR region, from which the
> physical address is derived.
> 
> The Intel and nommu versions have been verified on a dual Intel Xeon E5405
> workstation. I'm in the process of obtaining hardware to test the AMD version
> as well. Please review.

Does it work if you boot with 'iommu=soft swiotlb=force', which will mandate
strict usage of the DMA API?

> 
> Thanks,
> Will
> 
> Will Davis (6):
>   dma-debug: add checking for map/unmap_resource
>   DMA-API: Introduce dma_(un)map_resource
>   dma-mapping: pci: add pci_(un)map_resource
>   iommu/amd: Implement (un)map_resource
>   iommu/vt-d: implement (un)map_resource
>   x86: add pci-nommu implementation of map_resource
> 
>  arch/x86/kernel/pci-nommu.c  | 17 +++
>  drivers/iommu/amd_iommu.c| 76 
> ++--
>  drivers/iommu/intel-iommu.c  | 18 
>  include/asm-generic/dma-mapping-broken.h |  9 
>  include/asm-generic/dma-mapping-common.h | 34 ++
>  include/asm-generic/pci-dma-compat.h | 14 ++
>  include/linux/dma-debug.h| 20 +
>  include/linux/dma-mapping.h  |  7 +++
>  lib/dma-debug.c  | 48 
>  9 files changed, 230 insertions(+), 13 deletions(-)
> 
> -- 
> 2.3.7
> 


Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer

2015-05-07 Thread Jerome Glisse
On Thu, May 07, 2015 at 12:16:30PM -0500, Bjorn Helgaas wrote:
> On Thu, May 7, 2015 at 11:23 AM, William Davis  wrote:
> >> From: Bjorn Helgaas [mailto:bhelg...@google.com]
> >> Sent: Thursday, May 7, 2015 8:13 AM
> >> To: Yijing Wang
> >> Cc: William Davis; Joerg Roedel; open list:INTEL IOMMU (VT-d); linux-
> >> p...@vger.kernel.org; Terence Ripperda; John Hubbard; Jerome Glisse; Dave
> >> Jiang; David S. Miller; Alex Williamson
> >> Subject: Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
> >>
> >> On Wed, May 6, 2015 at 8:48 PM, Yijing Wang  wrote:
> >> > On 2015/5/7 6:18, Bjorn Helgaas wrote:
> >> >> [+cc Yijing, Dave J, Dave M, Alex]
> >> >>
> >> >> On Fri, May 01, 2015 at 01:32:12PM -0500, wda...@nvidia.com wrote:
> >> >>> From: Will Davis 
> >> >>>
> >> >>> Hi,
> >> >>>
> >> >>> This patch series adds DMA APIs to map and unmap a struct resource
> >> >>> to and from a PCI device's IOVA domain, and implements the AMD,
> >> >>> Intel, and nommu versions of these interfaces.
> >> >>>
> >> >>> This solves a long-standing problem with the existing DMA-remapping
> >> >>> interfaces, which require that a struct page be given for the region
> >> >>> to be mapped into a device's IOVA domain. This requirement cannot
> >> >>> support peer device BAR ranges, for which no struct pages exist.
> >> >>> ...
> >>
> >> >> I think we currently assume there's no peer-to-peer traffic.
> >> >>
> >> >> I don't know whether changing that will break anything, but I'm
> >> >> concerned about these:
> >> >>
> >> >>   - PCIe MPS configuration (see pcie_bus_configure_settings()).
> >> >
> >> > I think it should be OK for PCIe MPS configuration; PCIE_BUS_PEER2PEER
> >> > forces every device's MPS to 128B, and its concern is the TLP payload
> >> > size. This series seems to only map an IOVA for a device BAR region.
> >>
> >> MPS configuration makes assumptions about whether there will be any peer-
> >> to-peer traffic.  If there will be none, MPS can be configured more
> >> aggressively.
> >>
> >> I don't think Linux has any way to detect whether a driver is doing peer-
> >> to-peer, and there's no way to prevent a driver from doing it.
> >> We're stuck with requiring the user to specify boot options
> >> ("pci=pcie_bus_safe", "pci=pcie_bus_perf", "pci=pcie_bus_peer2peer",
> >> etc.) that tell the PCI core what the user expects to happen.
> >>
> >> This is a terrible user experience.  The user has no way to tell what
> >> drivers are going to do.  If he specifies the wrong thing, e.g., "assume no
> >> peer-to-peer traffic," and then loads a driver that does peer-to-peer, the
> >> kernel will configure MPS aggressively and when the device does a peer-to-
> >> peer transfer, it may cause a Malformed TLP error.
> >>
> >
> > I agree that this isn't a great user experience, but just want to clarify
> > that this problem is orthogonal to this patch series, correct?
> >
> > Prior to this series, the MPS mismatch is still possible with p2p traffic,
> > but when an IOMMU is enabled p2p traffic will result in DMAR faults. The
> > aim of the series is to allow drivers to fix the latter, not the former.
> 
> Prior to this series, there wasn't any infrastructure for drivers to
> do p2p, so it was mostly reasonable to assume that there *was* no p2p
> traffic.
> 
> I think we currently default to doing nothing to MPS.  Prior to this
> series, it might have been reasonable to optimize based on a "no-p2p"
> assumption, e.g., default to pcie_bus_safe or pcie_bus_perf.  After
> this series, I'm not sure what we could do, because p2p will be much
> more likely.
> 
> It's just an issue; I don't know what the resolution is.

Can't we just have each device update its MPS at runtime? So if device A
decides to map something from device B, then device A updates the MPS for
A and B to the lowest common supported value.

Of course you need to keep track of that per device, so that if a device C
comes around and wants to exchange with device B, and both C and B support
a higher payload than A, then C reprogramming B would trigger issues for A.

I know we update other PCIe configuration parameters at runtime for GPUs;
I don't know whether it is widely tested for other devices.
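
A rough sketch of that idea using the existing pcie_get_mps()/pcie_set_mps()
helpers and the pcie_mpss capability field -- illustrative only, since it
ignores both the switches on the path between the two devices and the
per-device tracking described above:

  /* Clamp two peers to the largest payload size both support (sketch). */
  static int p2p_clamp_mps(struct pci_dev *a, struct pci_dev *b)
  {
          /* pcie_mpss encodes the supported MPS as 128 << pcie_mpss bytes. */
          int mps = min(128 << a->pcie_mpss, 128 << b->pcie_mpss);
          int ret;

          ret = pcie_set_mps(a, min(mps, pcie_get_mps(a)));
          if (ret)
                  return ret;
          return pcie_set_mps(b, min(mps, pcie_get_mps(b)));
  }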

Cheers,
Jérôme


Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer

2015-05-07 Thread Bjorn Helgaas
On Thu, May 7, 2015 at 11:23 AM, William Davis  wrote:
>
>
>> -Original Message-
>> From: Bjorn Helgaas [mailto:bhelg...@google.com]
>> Sent: Thursday, May 7, 2015 8:13 AM
>> To: Yijing Wang
>> Cc: William Davis; Joerg Roedel; open list:INTEL IOMMU (VT-d); linux-
>> p...@vger.kernel.org; Terence Ripperda; John Hubbard; Jerome Glisse; Dave
>> Jiang; David S. Miller; Alex Williamson
>> Subject: Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
>>
>> On Wed, May 6, 2015 at 8:48 PM, Yijing Wang  wrote:
>> > On 2015/5/7 6:18, Bjorn Helgaas wrote:
>> >> [+cc Yijing, Dave J, Dave M, Alex]
>> >>
>> >> On Fri, May 01, 2015 at 01:32:12PM -0500, wda...@nvidia.com wrote:
>> >>> From: Will Davis 
>> >>>
>> >>> Hi,
>> >>>
>> >>> This patch series adds DMA APIs to map and unmap a struct resource
>> >>> to and from a PCI device's IOVA domain, and implements the AMD,
>> >>> Intel, and nommu versions of these interfaces.
>> >>>
>> >>> This solves a long-standing problem with the existing DMA-remapping
>> >>> interfaces, which require that a struct page be given for the region
>> >>> to be mapped into a device's IOVA domain. This requirement cannot
>> >>> support peer device BAR ranges, for which no struct pages exist.
>> >>> ...
>>
>> >> I think we currently assume there's no peer-to-peer traffic.
>> >>
>> >> I don't know whether changing that will break anything, but I'm
>> >> concerned about these:
>> >>
>> >>   - PCIe MPS configuration (see pcie_bus_configure_settings()).
>> >
>> > I think it should be OK for PCIe MPS configuration; PCIE_BUS_PEER2PEER
>> > forces every device's MPS to 128B, and its concern is the TLP payload
>> > size. This series seems to only map an IOVA for a device BAR region.
>>
>> MPS configuration makes assumptions about whether there will be any peer-
>> to-peer traffic.  If there will be none, MPS can be configured more
>> aggressively.
>>
>> I don't think Linux has any way to detect whether a driver is doing peer-
>> to-peer, and there's no way to prevent a driver from doing it.
>> We're stuck with requiring the user to specify boot options
>> ("pci=pcie_bus_safe", "pci=pcie_bus_perf", "pci=pcie_bus_peer2peer",
>> etc.) that tell the PCI core what the user expects to happen.
>>
>> This is a terrible user experience.  The user has no way to tell what
>> drivers are going to do.  If he specifies the wrong thing, e.g., "assume no
>> peer-to-peer traffic," and then loads a driver that does peer-to-peer, the
>> kernel will configure MPS aggressively and when the device does a peer-to-
>> peer transfer, it may cause a Malformed TLP error.
>>
>
> I agree that this isn't a great user experience, but just want to clarify 
> that this problem is orthogonal to this patch series, correct?
>
> Prior to this series, the MPS mismatch is still possible with p2p traffic, 
> but when an IOMMU is enabled p2p traffic will result in DMAR faults. The aim 
> of the series is to allow drivers to fix the latter, not the former.

Prior to this series, there wasn't any infrastructure for drivers to
do p2p, so it was mostly reasonable to assume that there *was* no p2p
traffic.

I think we currently default to doing nothing to MPS.  Prior to this
series, it might have been reasonable to optimize based on a "no-p2p"
assumption, e.g., default to pcie_bus_safe or pcie_bus_perf.  After
this series, I'm not sure what we could do, because p2p will be much
more likely.

It's just an issue; I don't know what the resolution is.

Bjorn


RE: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer

2015-05-07 Thread William Davis


> -Original Message-
> From: Bjorn Helgaas [mailto:bhelg...@google.com]
> Sent: Thursday, May 7, 2015 8:13 AM
> To: Yijing Wang
> Cc: William Davis; Joerg Roedel; open list:INTEL IOMMU (VT-d); linux-
> p...@vger.kernel.org; Terence Ripperda; John Hubbard; Jerome Glisse; Dave
> Jiang; David S. Miller; Alex Williamson
> Subject: Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
> 
> On Wed, May 6, 2015 at 8:48 PM, Yijing Wang  wrote:
> > On 2015/5/7 6:18, Bjorn Helgaas wrote:
> >> [+cc Yijing, Dave J, Dave M, Alex]
> >>
> >> On Fri, May 01, 2015 at 01:32:12PM -0500, wda...@nvidia.com wrote:
> >>> From: Will Davis 
> >>>
> >>> Hi,
> >>>
> >>> This patch series adds DMA APIs to map and unmap a struct resource
> >>> to and from a PCI device's IOVA domain, and implements the AMD,
> >>> Intel, and nommu versions of these interfaces.
> >>>
> >>> This solves a long-standing problem with the existing DMA-remapping
> >>> interfaces, which require that a struct page be given for the region
> >>> to be mapped into a device's IOVA domain. This requirement cannot
> >>> support peer device BAR ranges, for which no struct pages exist.
> >>> ...
> 
> >> I think we currently assume there's no peer-to-peer traffic.
> >>
> >> I don't know whether changing that will break anything, but I'm
> >> concerned about these:
> >>
> >>   - PCIe MPS configuration (see pcie_bus_configure_settings()).
> >
> > I think it should be OK for PCIe MPS configuration; PCIE_BUS_PEER2PEER
> > forces every device's MPS to 128B, and its concern is the TLP payload
> > size. This series seems to only map an IOVA for a device BAR region.
> 
> MPS configuration makes assumptions about whether there will be any peer-
> to-peer traffic.  If there will be none, MPS can be configured more
> aggressively.
> 
> I don't think Linux has any way to detect whether a driver is doing peer-
> to-peer, and there's no way to prevent a driver from doing it.
> We're stuck with requiring the user to specify boot options
> ("pci=pcie_bus_safe", "pci=pcie_bus_perf", "pci=pcie_bus_peer2peer",
> etc.) that tell the PCI core what the user expects to happen.
> 
> This is a terrible user experience.  The user has no way to tell what
> drivers are going to do.  If he specifies the wrong thing, e.g., "assume no
> peer-to-peer traffic," and then loads a driver that does peer-to-peer, the
> kernel will configure MPS aggressively and when the device does a peer-to-
> peer transfer, it may cause a Malformed TLP error.
> 

I agree that this isn't a great user experience, but just want to clarify that 
this problem is orthogonal to this patch series, correct?

Prior to this series, the MPS mismatch is still possible with p2p traffic, but 
when an IOMMU is enabled p2p traffic will result in DMAR faults. The aim of the 
series is to allow drivers to fix the latter, not the former.

Thanks,
Will

--
nvpublic


Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer

2015-05-07 Thread Bjorn Helgaas
On Wed, May 6, 2015 at 8:48 PM, Yijing Wang  wrote:
> On 2015/5/7 6:18, Bjorn Helgaas wrote:
>> [+cc Yijing, Dave J, Dave M, Alex]
>>
>> On Fri, May 01, 2015 at 01:32:12PM -0500, wda...@nvidia.com wrote:
>>> From: Will Davis 
>>>
>>> Hi,
>>>
>>> This patch series adds DMA APIs to map and unmap a struct resource to and 
>>> from
>>> a PCI device's IOVA domain, and implements the AMD, Intel, and nommu 
>>> versions
>>> of these interfaces.
>>>
>>> This solves a long-standing problem with the existing DMA-remapping 
>>> interfaces,
>>> which require that a struct page be given for the region to be mapped into a
>>> device's IOVA domain. This requirement cannot support peer device BAR 
>>> ranges,
>>> for which no struct pages exist.
>>> ...

>> I think we currently assume there's no peer-to-peer traffic.
>>
>> I don't know whether changing that will break anything, but I'm concerned
>> about these:
>>
>>   - PCIe MPS configuration (see pcie_bus_configure_settings()).
>
> I think it should be OK for PCIe MPS configuration; PCIE_BUS_PEER2PEER forces
> every device's MPS to 128B, and its concern is the TLP payload size. This
> series seems to only map an IOVA for a device BAR region.

MPS configuration makes assumptions about whether there will be any
peer-to-peer traffic.  If there will be none, MPS can be configured
more aggressively.

I don't think Linux has any way to detect whether a driver is doing
peer-to-peer, and there's no way to prevent a driver from doing it.
We're stuck with requiring the user to specify boot options
("pci=pcie_bus_safe", "pci=pcie_bus_perf", "pci=pcie_bus_peer2peer",
etc.) that tell the PCI core what the user expects to happen.

This is a terrible user experience.  The user has no way to tell what
drivers are going to do.  If he specifies the wrong thing, e.g.,
"assume no peer-to-peer traffic," and then loads a driver that does
peer-to-peer, the kernel will configure MPS aggressively and when the
device does a peer-to-peer transfer, it may cause a Malformed TLP
error.
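
(For reference, the peer2peer option avoids the Malformed TLP risk by giving
up the optimization entirely; roughly what the core does during the bus walk,
paraphrased rather than quoted from pcie_bus_configure_set():

  if (pcie_bus_config == PCIE_BUS_PEER2PEER)
          mps = 128;      /* floor value every PCIe device supports */
  pcie_set_mps(dev, mps);

so any pair of devices can exchange TLPs, at the cost of throughput.)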

Bjorn


Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer

2015-05-06 Thread Alex Williamson
On Wed, 2015-05-06 at 17:18 -0500, Bjorn Helgaas wrote:
> [+cc Yijing, Dave J, Dave M, Alex]
> 
> On Fri, May 01, 2015 at 01:32:12PM -0500, wda...@nvidia.com wrote:
> > From: Will Davis 
> > 
> > Hi,
> > 
> > This patch series adds DMA APIs to map and unmap a struct resource to and 
> > from
> > a PCI device's IOVA domain, and implements the AMD, Intel, and nommu 
> > versions
> > of these interfaces.
> > 
> > This solves a long-standing problem with the existing DMA-remapping 
> > interfaces,
> > which require that a struct page be given for the region to be mapped into a
> > device's IOVA domain. This requirement cannot support peer device BAR 
> > ranges,
> > for which no struct pages exist.
> > 
> > The underlying implementations of map_page and map_sg convert the struct 
> > page
> > into its physical address anyway, so we just need a way to route the 
> > physical
> > address of the BAR region to these implementations. The new interfaces do 
> > this
> > by taking the struct resource describing a device's BAR region, from which 
> > the
> > physical address is derived.
> > 
> > The Intel and nommu versions have been verified on a dual Intel Xeon E5405
> > workstation. I'm in the process of obtaining hardware to test the AMD 
> > version
> > as well. Please review.
> 
> I think we currently assume there's no peer-to-peer traffic.
> 
> I don't know whether changing that will break anything, but I'm concerned
> about these:
> 
>   - PCIe MPS configuration (see pcie_bus_configure_settings()).
> 
>   - PCIe ACS, e.g., pci_acs_enabled().  My guess is that this one is OK,
> but Alex would know better.

I think it should be OK too.  ACS will force the transaction upstream
for IOMMU translation rather than possibly allowing redirection lower in
the topology, but that's sort of the price we pay for isolation.  The
p2p context entries need to be present in the IOMMU, so without actually
reading the patches, this does seem like something a driver might want
to do via the DMA API.  The IOMMU API already allows us to avoid the
struct page issue and create mappings for p2p in the IOMMU.
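
For comparison, a minimal sketch of that IOMMU API route -- mapping a peer
BAR's physical range into a domain with no struct page involved; "iova_base"
and the prot flags are illustrative, and error handling is trimmed:

  struct iommu_domain *domain = iommu_domain_alloc(&pci_bus_type);

  iommu_attach_device(domain, &pdev_a->dev);
  /* Map device B's BAR into the domain used by device A. */
  iommu_map(domain, iova_base, peer_res->start,
            resource_size(peer_res), IOMMU_READ | IOMMU_WRITE);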

>   - dma_addr_t.  Currently dma_addr_t is big enough to hold any address
> returned from the DMA API.  That's not necessarily big enough to hold a
> PCI bus address, e.g., a raw BAR value.
> 
> > Will Davis (6):
> >   dma-debug: add checking for map/unmap_resource
> >   DMA-API: Introduce dma_(un)map_resource
> >   dma-mapping: pci: add pci_(un)map_resource
> >   iommu/amd: Implement (un)map_resource
> >   iommu/vt-d: implement (un)map_resource
> >   x86: add pci-nommu implementation of map_resource
> > 
> >  arch/x86/kernel/pci-nommu.c  | 17 +++
> >  drivers/iommu/amd_iommu.c| 76 
> > ++--
> >  drivers/iommu/intel-iommu.c  | 18 
> >  include/asm-generic/dma-mapping-broken.h |  9 
> >  include/asm-generic/dma-mapping-common.h | 34 ++
> >  include/asm-generic/pci-dma-compat.h | 14 ++
> >  include/linux/dma-debug.h| 20 +
> >  include/linux/dma-mapping.h  |  7 +++
> >  lib/dma-debug.c  | 48 
> >  9 files changed, 230 insertions(+), 13 deletions(-)
> > 
> > -- 
> > 2.3.7
> > 





Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer

2015-05-06 Thread Bjorn Helgaas
[+cc Yijing, Dave J, Dave M, Alex]

On Fri, May 01, 2015 at 01:32:12PM -0500, wda...@nvidia.com wrote:
> From: Will Davis 
> 
> Hi,
> 
> This patch series adds DMA APIs to map and unmap a struct resource to and from
> a PCI device's IOVA domain, and implements the AMD, Intel, and nommu versions
> of these interfaces.
> 
> This solves a long-standing problem with the existing DMA-remapping 
> interfaces,
> which require that a struct page be given for the region to be mapped into a
> device's IOVA domain. This requirement cannot support peer device BAR ranges,
> for which no struct pages exist.
> 
> The underlying implementations of map_page and map_sg convert the struct page
> into its physical address anyway, so we just need a way to route the physical
> address of the BAR region to these implementations. The new interfaces do this
> by taking the struct resource describing a device's BAR region, from which the
> physical address is derived.
> 
> The Intel and nommu versions have been verified on a dual Intel Xeon E5405
> workstation. I'm in the process of obtaining hardware to test the AMD version
> as well. Please review.

I think we currently assume there's no peer-to-peer traffic.

I don't know whether changing that will break anything, but I'm concerned
about these:

  - PCIe MPS configuration (see pcie_bus_configure_settings()).

  - PCIe ACS, e.g., pci_acs_enabled().  My guess is that this one is OK,
but Alex would know better.

  - dma_addr_t.  Currently dma_addr_t is big enough to hold any address
returned from the DMA API.  That's not necessarily big enough to hold a
PCI bus address, e.g., a raw BAR value.
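
To make the last point concrete, here is the kind of guard a map_resource
implementation might need on configurations where dma_addr_t is 32-bit but a
64-bit BAR can sit above 4GB (sketch only; "peer_res" is the BAR resource):

  phys_addr_t phys = peer_res->start;     /* raw BAR value, possibly >4GB */

  /* Refuse addresses that would silently truncate in a 32-bit dma_addr_t. */
  if (sizeof(dma_addr_t) < 8 && upper_32_bits(phys))
          return 0;       /* not representable as a dma_addr_t */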

> Will Davis (6):
>   dma-debug: add checking for map/unmap_resource
>   DMA-API: Introduce dma_(un)map_resource
>   dma-mapping: pci: add pci_(un)map_resource
>   iommu/amd: Implement (un)map_resource
>   iommu/vt-d: implement (un)map_resource
>   x86: add pci-nommu implementation of map_resource
> 
>  arch/x86/kernel/pci-nommu.c  | 17 +++
>  drivers/iommu/amd_iommu.c| 76 
> ++--
>  drivers/iommu/intel-iommu.c  | 18 
>  include/asm-generic/dma-mapping-broken.h |  9 
>  include/asm-generic/dma-mapping-common.h | 34 ++
>  include/asm-generic/pci-dma-compat.h | 14 ++
>  include/linux/dma-debug.h| 20 +
>  include/linux/dma-mapping.h  |  7 +++
>  lib/dma-debug.c  | 48 
>  9 files changed, 230 insertions(+), 13 deletions(-)
> 
> -- 
> 2.3.7
> 


[PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer

2015-05-01 Thread wdavis
From: Will Davis 

Hi,

This patch series adds DMA APIs to map and unmap a struct resource to and from
a PCI device's IOVA domain, and implements the AMD, Intel, and nommu versions
of these interfaces.

This solves a long-standing problem with the existing DMA-remapping interfaces,
which require that a struct page be given for the region to be mapped into a
device's IOVA domain. This requirement cannot support peer device BAR ranges,
for which no struct pages exist.

The underlying implementations of map_page and map_sg convert the struct page
into its physical address anyway, so we just need a way to route the physical
address of the BAR region to these implementations. The new interfaces do this
by taking the struct resource describing a device's BAR region, from which the
physical address is derived.
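
As a concrete illustration of that routing step, a nommu-style backend reduces
to the resource's physical address. A sketch with an assumed prototype (the
real implementations are in patches 4-6):

  /* Without an IOMMU, the bus address is just the BAR's physical address. */
  static dma_addr_t nommu_map_resource(struct device *dev, struct resource *res,
                                       unsigned long offset, size_t size,
                                       enum dma_data_direction dir)
  {
          return (dma_addr_t)(res->start + offset);
  }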

The Intel and nommu versions have been verified on a dual Intel Xeon E5405
workstation. I'm in the process of obtaining hardware to test the AMD version
as well. Please review.

Thanks,
Will

Will Davis (6):
  dma-debug: add checking for map/unmap_resource
  DMA-API: Introduce dma_(un)map_resource
  dma-mapping: pci: add pci_(un)map_resource
  iommu/amd: Implement (un)map_resource
  iommu/vt-d: implement (un)map_resource
  x86: add pci-nommu implementation of map_resource

 arch/x86/kernel/pci-nommu.c  | 17 +++
 drivers/iommu/amd_iommu.c| 76 ++--
 drivers/iommu/intel-iommu.c  | 18 
 include/asm-generic/dma-mapping-broken.h |  9 
 include/asm-generic/dma-mapping-common.h | 34 ++
 include/asm-generic/pci-dma-compat.h | 14 ++
 include/linux/dma-debug.h| 20 +
 include/linux/dma-mapping.h  |  7 +++
 lib/dma-debug.c  | 48 
 9 files changed, 230 insertions(+), 13 deletions(-)

-- 
2.3.7
