RE: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
> -Original Message-
> From: Mark Hounschell [mailto:ma...@compro.net]
> Sent: Friday, May 08, 2015 3:46 PM
> To: Konrad Rzeszutek Wilk; William Davis
> Cc: linux-...@vger.kernel.org; iommu@lists.linux-foundation.org; jgli...@redhat.com; John Hubbard; Terence Ripperda
> Subject: Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
>
> On 05/08/2015 04:21 PM, Konrad Rzeszutek Wilk wrote:
> > On Fri, May 01, 2015 at 01:32:12PM -0500, wda...@nvidia.com wrote:
> >> From: Will Davis
> >>
> >> Hi,
> >>
> ...
> >>
> >> This solves a long-standing problem with the existing DMA-remapping interfaces, which require that a struct page be given for the region to be mapped into a device's IOVA domain. This requirement cannot support peer device BAR ranges, for which no struct pages exist.
> >>
> ...
> >>
> >> The Intel and nommu versions have been verified on a dual Intel Xeon E5405 workstation.
>
> PCIe peer2peer is borked on all motherboards I've tried. Only writes are possible. Reads are not supported. I suppose if you have a platform with only PCI and an IOMMU this would be very useful. Without both read and write PCIe peer2peer support, this seems unnecessary.

PCIe peer-to-peer isn't inherently broken or useless, even if a lot of its implementations are. I've successfully tested these patches with existing hardware (dual NVIDIA GPUs and an older workstation), and they solve a long-standing problem for us [1], so I disagree with the assessment that this work is unnecessary. I don't see why the existence of some poor implementations, or even a lot of them, is sufficient reason to reject a generic mechanism supporting the good, standardized implementations.
[1] http://stackoverflow.com/questions/19841815/does-the-nvidia-rdma-gpudirect-always-operate-only-physical-addresses-in-physic

Thanks,
Will

--
nvpublic

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
> -Original Message-
> From: Konrad Rzeszutek Wilk [mailto:konrad.w...@oracle.com]
> Sent: Friday, May 08, 2015 3:22 PM
> To: William Davis
> Cc: j...@8bytes.org; jgli...@redhat.com; linux-...@vger.kernel.org; iommu@lists.linux-foundation.org; John Hubbard; Terence Ripperda
> Subject: Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
>
> ...
>
> > > The Intel and nommu versions have been verified on a dual Intel Xeon E5405 workstation. I'm in the process of obtaining hardware to test the AMD version as well. Please review.
>
> Does it work if you boot with 'iommu=soft swiotlb=force' which will mandate a strict usage of the DMA API?

This patch series doesn't yet add a SWIOTLB implementation, so the dma_map_resource() call would return 0 to indicate that the path is not implemented (see patch 2/6). So no, the new interfaces would not work with that configuration, but they're also not expected to at this point.

Thanks,
Will

--
nvpublic
Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
On 05/07/2015 02:11 PM, Jerome Glisse wrote:

On Thu, May 07, 2015 at 12:16:30PM -0500, Bjorn Helgaas wrote:

On Thu, May 7, 2015 at 11:23 AM, William Davis wrote:

From: Bjorn Helgaas [mailto:bhelg...@google.com]
Sent: Thursday, May 7, 2015 8:13 AM
To: Yijing Wang
Cc: William Davis; Joerg Roedel; open list:INTEL IOMMU (VT-d); linux-p...@vger.kernel.org; Terence Ripperda; John Hubbard; Jerome Glisse; Dave Jiang; David S. Miller; Alex Williamson
Subject: Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer

On Wed, May 6, 2015 at 8:48 PM, Yijing Wang wrote:

On 2015/5/7 6:18, Bjorn Helgaas wrote:
[+cc Yijing, Dave J, Dave M, Alex]

On Fri, May 01, 2015 at 01:32:12PM -0500, wda...@nvidia.com wrote:

From: Will Davis

Hi,

This patch series adds DMA APIs to map and unmap a struct resource to and from a PCI device's IOVA domain, and implements the AMD, Intel, and nommu versions of these interfaces. This solves a long-standing problem with the existing DMA-remapping interfaces, which require that a struct page be given for the region to be mapped into a device's IOVA domain. This requirement cannot support peer device BAR ranges, for which no struct pages exist.

...

I think we currently assume there's no peer-to-peer traffic. I don't know whether changing that will break anything, but I'm concerned about these:

- PCIe MPS configuration (see pcie_bus_configure_settings()).

I think it should be OK for PCIe MPS configuration: PCIE_BUS_PEER2PEER forces every device's MPS to 128B, and its concern is the TLP payload size. This series seems to only map an IOVA for the device BAR region.

MPS configuration makes assumptions about whether there will be any peer-to-peer traffic. If there will be none, MPS can be configured more aggressively. I don't think Linux has any way to detect whether a driver is doing peer-to-peer, and there's no way to prevent a driver from doing it. We're stuck with requiring the user to specify boot options ("pci=pcie_bus_safe", "pci=pcie_bus_perf", "pci=pcie_bus_peer2peer", etc.) that tell the PCI core what the user expects to happen. This is a terrible user experience. The user has no way to tell what drivers are going to do. If he specifies the wrong thing, e.g., "assume no peer-to-peer traffic," and then loads a driver that does peer-to-peer, the kernel will configure MPS aggressively, and when the device does a peer-to-peer transfer, it may cause a Malformed TLP error.

I agree that this isn't a great user experience, but I just want to clarify that this problem is orthogonal to this patch series, correct? Prior to this series, the MPS mismatch is still possible with p2p traffic, but when an IOMMU is enabled p2p traffic will result in DMAR faults. The aim of the series is to allow drivers to fix the latter, not the former.

Prior to this series, there wasn't any infrastructure for drivers to do p2p, so it was mostly reasonable to assume that there *was* no p2p traffic. I think we currently default to doing nothing to MPS. Prior to this series, it might have been reasonable to optimize based on a "no-p2p" assumption, e.g., default to pcie_bus_safe or pcie_bus_perf. After this series, I'm not sure what we could do, because p2p will be much more likely. It's just an issue; I don't know what the resolution is.

Can't we just have each device update its MPS at runtime? So if device A decides to map something from device B, then device A updates the MPS for A and B to the lowest common supported value. Of course you need to keep track of that per device, so that if a device C comes around and wants to exchange with device B, and both C and B support a higher payload than A, then if C reprograms B it will trigger issues for A. I know we update other PCIe configuration parameters at runtime for the GPU; dunno if it is widely tested for other devices.
I believe all these cases are between endpoints and the upstream ports of the PCIe root port/host bridge/PCIe switch they are connected to, i.e., true wire peers -- not across a PCIe domain, which is the context of the p2p that the MPS has to span.

Cheers,
Jérôme
Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
On Fri, May 08, 2015 at 04:46:17PM -0400, Mark Hounschell wrote:
> On 05/08/2015 04:21 PM, Konrad Rzeszutek Wilk wrote:
> >On Fri, May 01, 2015 at 01:32:12PM -0500, wda...@nvidia.com wrote:
> >>From: Will Davis
> >>
> >>Hi,
> >>
> >>This patch series adds DMA APIs to map and unmap a struct resource to and from a PCI device's IOVA domain, and implements the AMD, Intel, and nommu versions of these interfaces.
> >>
> >>This solves a long-standing problem with the existing DMA-remapping interfaces, which require that a struct page be given for the region to be mapped into a device's IOVA domain. This requirement cannot support peer device BAR ranges, for which no struct pages exist.
> >>
> >>The underlying implementations of map_page and map_sg convert the struct page into its physical address anyway, so we just need a way to route the physical address of the BAR region to these implementations. The new interfaces do this by taking the struct resource describing a device's BAR region, from which the physical address is derived.
> >>
> >>The Intel and nommu versions have been verified on a dual Intel Xeon E5405 workstation. I'm in the process of obtaining hardware to test the AMD version as well. Please review.
> >
> >Does it work if you boot with 'iommu=soft swiotlb=force' which will mandate a strict usage of the DMA API?
> >
> PCIe peer2peer is borked on all motherboards I've tried. Only writes are

:-(

> possible. Reads are not supported. I suppose if you have a platform with only PCI and an IOMMU this would be very useful. Without both read and write PCIe peer2peer support, this seems unnecessary.

It is a perfect way to test the code: to make sure the API works (or fails in the expected failure modes), _and_ that the drivers do as well (using pci_map_sync, and so on).
> Mark
Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
On 05/08/2015 04:21 PM, Konrad Rzeszutek Wilk wrote:

On Fri, May 01, 2015 at 01:32:12PM -0500, wda...@nvidia.com wrote:

From: Will Davis

Hi,

This patch series adds DMA APIs to map and unmap a struct resource to and from a PCI device's IOVA domain, and implements the AMD, Intel, and nommu versions of these interfaces.

This solves a long-standing problem with the existing DMA-remapping interfaces, which require that a struct page be given for the region to be mapped into a device's IOVA domain. This requirement cannot support peer device BAR ranges, for which no struct pages exist.

The underlying implementations of map_page and map_sg convert the struct page into its physical address anyway, so we just need a way to route the physical address of the BAR region to these implementations. The new interfaces do this by taking the struct resource describing a device's BAR region, from which the physical address is derived.

The Intel and nommu versions have been verified on a dual Intel Xeon E5405 workstation. I'm in the process of obtaining hardware to test the AMD version as well. Please review.

Does it work if you boot with 'iommu=soft swiotlb=force' which will mandate a strict usage of the DMA API?

PCIe peer2peer is borked on all motherboards I've tried. Only writes are possible. Reads are not supported. I suppose if you have a platform with only PCI and an IOMMU this would be very useful. Without both read and write PCIe peer2peer support, this seems unnecessary.

Mark
Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
On Fri, May 01, 2015 at 01:32:12PM -0500, wda...@nvidia.com wrote:
> From: Will Davis
>
> Hi,
>
> This patch series adds DMA APIs to map and unmap a struct resource to and from a PCI device's IOVA domain, and implements the AMD, Intel, and nommu versions of these interfaces.
>
> This solves a long-standing problem with the existing DMA-remapping interfaces, which require that a struct page be given for the region to be mapped into a device's IOVA domain. This requirement cannot support peer device BAR ranges, for which no struct pages exist.
>
> The underlying implementations of map_page and map_sg convert the struct page into its physical address anyway, so we just need a way to route the physical address of the BAR region to these implementations. The new interfaces do this by taking the struct resource describing a device's BAR region, from which the physical address is derived.
>
> The Intel and nommu versions have been verified on a dual Intel Xeon E5405 workstation. I'm in the process of obtaining hardware to test the AMD version as well. Please review.

Does it work if you boot with 'iommu=soft swiotlb=force' which will mandate a strict usage of the DMA API?
> > Thanks, > Will > > Will Davis (6): > dma-debug: add checking for map/unmap_resource > DMA-API: Introduce dma_(un)map_resource > dma-mapping: pci: add pci_(un)map_resource > iommu/amd: Implement (un)map_resource > iommu/vt-d: implement (un)map_resource > x86: add pci-nommu implementation of map_resource > > arch/x86/kernel/pci-nommu.c | 17 +++ > drivers/iommu/amd_iommu.c| 76 > ++-- > drivers/iommu/intel-iommu.c | 18 > include/asm-generic/dma-mapping-broken.h | 9 > include/asm-generic/dma-mapping-common.h | 34 ++ > include/asm-generic/pci-dma-compat.h | 14 ++ > include/linux/dma-debug.h| 20 + > include/linux/dma-mapping.h | 7 +++ > lib/dma-debug.c | 48 > 9 files changed, 230 insertions(+), 13 deletions(-) > > -- > 2.3.7 > > ___ > iommu mailing list > iommu@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
On Thu, May 07, 2015 at 12:16:30PM -0500, Bjorn Helgaas wrote: > On Thu, May 7, 2015 at 11:23 AM, William Davis wrote: > >> From: Bjorn Helgaas [mailto:bhelg...@google.com] > >> Sent: Thursday, May 7, 2015 8:13 AM > >> To: Yijing Wang > >> Cc: William Davis; Joerg Roedel; open list:INTEL IOMMU (VT-d); linux- > >> p...@vger.kernel.org; Terence Ripperda; John Hubbard; Jerome Glisse; Dave > >> Jiang; David S. Miller; Alex Williamson > >> Subject: Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer > >> > >> On Wed, May 6, 2015 at 8:48 PM, Yijing Wang wrote: > >> > On 2015/5/7 6:18, Bjorn Helgaas wrote: > >> >> [+cc Yijing, Dave J, Dave M, Alex] > >> >> > >> >> On Fri, May 01, 2015 at 01:32:12PM -0500, wda...@nvidia.com wrote: > >> >>> From: Will Davis > >> >>> > >> >>> Hi, > >> >>> > >> >>> This patch series adds DMA APIs to map and unmap a struct resource > >> >>> to and from a PCI device's IOVA domain, and implements the AMD, > >> >>> Intel, and nommu versions of these interfaces. > >> >>> > >> >>> This solves a long-standing problem with the existing DMA-remapping > >> >>> interfaces, which require that a struct page be given for the region > >> >>> to be mapped into a device's IOVA domain. This requirement cannot > >> >>> support peer device BAR ranges, for which no struct pages exist. > >> >>> ... > >> > >> >> I think we currently assume there's no peer-to-peer traffic. > >> >> > >> >> I don't know whether changing that will break anything, but I'm > >> >> concerned about these: > >> >> > >> >> - PCIe MPS configuration (see pcie_bus_configure_settings()). > >> > > >> > I think it should be ok for PCIe MPS configuration, PCIE_BUS_PEER2PEER > >> > force every device's MPS to 128B, what its concern is the TLP payload > >> > size. In this series, it seems to only map a iova for device bar region. > >> > >> MPS configuration makes assumptions about whether there will be any peer- > >> to-peer traffic. 
If there will be none, MPS can be configured more > >> aggressively. > >> > >> I don't think Linux has any way to detect whether a driver is doing peer- > >> to-peer, and there's no way to prevent a driver from doing it. > >> We're stuck with requiring the user to specify boot options > >> ("pci=pcie_bus_safe", "pci=pcie_bus_perf", "pci=pcie_bus_peer2peer", > >> etc.) that tell the PCI core what the user expects to happen. > >> > >> This is a terrible user experience. The user has no way to tell what > >> drivers are going to do. If he specifies the wrong thing, e.g., "assume no > >> peer-to-peer traffic," and then loads a driver that does peer-to-peer, the > >> kernel will configure MPS aggressively and when the device does a peer-to- > >> peer transfer, it may cause a Malformed TLP error. > >> > > > > I agree that this isn't a great user experience, but just want to clarify > > that this problem is orthogonal to this patch series, correct? > > > > Prior to this series, the MPS mismatch is still possible with p2p traffic, > > but when an IOMMU is enabled p2p traffic will result in DMAR faults. The > > aim of the series is to allow drivers to fix the latter, not the former. > > Prior to this series, there wasn't any infrastructure for drivers to > do p2p, so it was mostly reasonable to assume that there *was* no p2p > traffic. > > I think we currently default to doing nothing to MPS. Prior to this > series, it might have been reasonable to optimize based on a "no-p2p" > assumption, e.g., default to pcie_bus_safe or pcie_bus_perf. After > this series, I'm not sure what we could do, because p2p will be much > more likely. > > It's just an issue; I don't know what the resolution is. Can't we just have each device update its MPS at runtime. So if device A decide to map something from device B then device A update MPS for A and B to lowest common supported value. 
Of course you need to keep track of that per device, so that if a device C comes around and wants to exchange with device B, and both C and B support a higher payload than A, then if C reprograms B it will trigger issues for A. I know we update other PCIe configuration parameters at runtime for the GPU; dunno if it is widely tested for other devices.

Cheers,
Jérôme
Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
On Thu, May 7, 2015 at 11:23 AM, William Davis wrote: > > >> -Original Message- >> From: Bjorn Helgaas [mailto:bhelg...@google.com] >> Sent: Thursday, May 7, 2015 8:13 AM >> To: Yijing Wang >> Cc: William Davis; Joerg Roedel; open list:INTEL IOMMU (VT-d); linux- >> p...@vger.kernel.org; Terence Ripperda; John Hubbard; Jerome Glisse; Dave >> Jiang; David S. Miller; Alex Williamson >> Subject: Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer >> >> On Wed, May 6, 2015 at 8:48 PM, Yijing Wang wrote: >> > On 2015/5/7 6:18, Bjorn Helgaas wrote: >> >> [+cc Yijing, Dave J, Dave M, Alex] >> >> >> >> On Fri, May 01, 2015 at 01:32:12PM -0500, wda...@nvidia.com wrote: >> >>> From: Will Davis >> >>> >> >>> Hi, >> >>> >> >>> This patch series adds DMA APIs to map and unmap a struct resource >> >>> to and from a PCI device's IOVA domain, and implements the AMD, >> >>> Intel, and nommu versions of these interfaces. >> >>> >> >>> This solves a long-standing problem with the existing DMA-remapping >> >>> interfaces, which require that a struct page be given for the region >> >>> to be mapped into a device's IOVA domain. This requirement cannot >> >>> support peer device BAR ranges, for which no struct pages exist. >> >>> ... >> >> >> I think we currently assume there's no peer-to-peer traffic. >> >> >> >> I don't know whether changing that will break anything, but I'm >> >> concerned about these: >> >> >> >> - PCIe MPS configuration (see pcie_bus_configure_settings()). >> > >> > I think it should be ok for PCIe MPS configuration, PCIE_BUS_PEER2PEER >> > force every device's MPS to 128B, what its concern is the TLP payload >> > size. In this series, it seems to only map a iova for device bar region. >> >> MPS configuration makes assumptions about whether there will be any peer- >> to-peer traffic. If there will be none, MPS can be configured more >> aggressively. 
>> >> I don't think Linux has any way to detect whether a driver is doing peer- >> to-peer, and there's no way to prevent a driver from doing it. >> We're stuck with requiring the user to specify boot options >> ("pci=pcie_bus_safe", "pci=pcie_bus_perf", "pci=pcie_bus_peer2peer", >> etc.) that tell the PCI core what the user expects to happen. >> >> This is a terrible user experience. The user has no way to tell what >> drivers are going to do. If he specifies the wrong thing, e.g., "assume no >> peer-to-peer traffic," and then loads a driver that does peer-to-peer, the >> kernel will configure MPS aggressively and when the device does a peer-to- >> peer transfer, it may cause a Malformed TLP error. >> > > I agree that this isn't a great user experience, but just want to clarify > that this problem is orthogonal to this patch series, correct? > > Prior to this series, the MPS mismatch is still possible with p2p traffic, > but when an IOMMU is enabled p2p traffic will result in DMAR faults. The aim > of the series is to allow drivers to fix the latter, not the former. Prior to this series, there wasn't any infrastructure for drivers to do p2p, so it was mostly reasonable to assume that there *was* no p2p traffic. I think we currently default to doing nothing to MPS. Prior to this series, it might have been reasonable to optimize based on a "no-p2p" assumption, e.g., default to pcie_bus_safe or pcie_bus_perf. After this series, I'm not sure what we could do, because p2p will be much more likely. It's just an issue; I don't know what the resolution is. Bjorn
RE: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
> -Original Message- > From: Bjorn Helgaas [mailto:bhelg...@google.com] > Sent: Thursday, May 7, 2015 8:13 AM > To: Yijing Wang > Cc: William Davis; Joerg Roedel; open list:INTEL IOMMU (VT-d); linux- > p...@vger.kernel.org; Terence Ripperda; John Hubbard; Jerome Glisse; Dave > Jiang; David S. Miller; Alex Williamson > Subject: Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer > > On Wed, May 6, 2015 at 8:48 PM, Yijing Wang wrote: > > On 2015/5/7 6:18, Bjorn Helgaas wrote: > >> [+cc Yijing, Dave J, Dave M, Alex] > >> > >> On Fri, May 01, 2015 at 01:32:12PM -0500, wda...@nvidia.com wrote: > >>> From: Will Davis > >>> > >>> Hi, > >>> > >>> This patch series adds DMA APIs to map and unmap a struct resource > >>> to and from a PCI device's IOVA domain, and implements the AMD, > >>> Intel, and nommu versions of these interfaces. > >>> > >>> This solves a long-standing problem with the existing DMA-remapping > >>> interfaces, which require that a struct page be given for the region > >>> to be mapped into a device's IOVA domain. This requirement cannot > >>> support peer device BAR ranges, for which no struct pages exist. > >>> ... > > >> I think we currently assume there's no peer-to-peer traffic. > >> > >> I don't know whether changing that will break anything, but I'm > >> concerned about these: > >> > >> - PCIe MPS configuration (see pcie_bus_configure_settings()). > > > > I think it should be ok for PCIe MPS configuration, PCIE_BUS_PEER2PEER > > force every device's MPS to 128B, what its concern is the TLP payload > > size. In this series, it seems to only map a iova for device bar region. > > MPS configuration makes assumptions about whether there will be any peer- > to-peer traffic. If there will be none, MPS can be configured more > aggressively. > > I don't think Linux has any way to detect whether a driver is doing peer- > to-peer, and there's no way to prevent a driver from doing it. 
> We're stuck with requiring the user to specify boot options > ("pci=pcie_bus_safe", "pci=pcie_bus_perf", "pci=pcie_bus_peer2peer", > etc.) that tell the PCI core what the user expects to happen. > > This is a terrible user experience. The user has no way to tell what > drivers are going to do. If he specifies the wrong thing, e.g., "assume no > peer-to-peer traffic," and then loads a driver that does peer-to-peer, the > kernel will configure MPS aggressively and when the device does a peer-to- > peer transfer, it may cause a Malformed TLP error. > I agree that this isn't a great user experience, but just want to clarify that this problem is orthogonal to this patch series, correct? Prior to this series, the MPS mismatch is still possible with p2p traffic, but when an IOMMU is enabled p2p traffic will result in DMAR faults. The aim of the series is to allow drivers to fix the latter, not the former. Thanks, Will -- nvpublic
Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
On Wed, May 6, 2015 at 8:48 PM, Yijing Wang wrote: > On 2015/5/7 6:18, Bjorn Helgaas wrote: >> [+cc Yijing, Dave J, Dave M, Alex] >> >> On Fri, May 01, 2015 at 01:32:12PM -0500, wda...@nvidia.com wrote: >>> From: Will Davis >>> >>> Hi, >>> >>> This patch series adds DMA APIs to map and unmap a struct resource to and >>> from >>> a PCI device's IOVA domain, and implements the AMD, Intel, and nommu >>> versions >>> of these interfaces. >>> >>> This solves a long-standing problem with the existing DMA-remapping >>> interfaces, >>> which require that a struct page be given for the region to be mapped into a >>> device's IOVA domain. This requirement cannot support peer device BAR >>> ranges, >>> for which no struct pages exist. >>> ... >> I think we currently assume there's no peer-to-peer traffic. >> >> I don't know whether changing that will break anything, but I'm concerned >> about these: >> >> - PCIe MPS configuration (see pcie_bus_configure_settings()). > > I think it should be ok for PCIe MPS configuration, PCIE_BUS_PEER2PEER force > every > device's MPS to 128B, what its concern is the TLP payload size. In this > series, it > seems to only map a iova for device bar region. MPS configuration makes assumptions about whether there will be any peer-to-peer traffic. If there will be none, MPS can be configured more aggressively. I don't think Linux has any way to detect whether a driver is doing peer-to-peer, and there's no way to prevent a driver from doing it. We're stuck with requiring the user to specify boot options ("pci=pcie_bus_safe", "pci=pcie_bus_perf", "pci=pcie_bus_peer2peer", etc.) that tell the PCI core what the user expects to happen. This is a terrible user experience. The user has no way to tell what drivers are going to do. 
If he specifies the wrong thing, e.g., "assume no peer-to-peer traffic," and then loads a driver that does peer-to-peer, the kernel will configure MPS aggressively and when the device does a peer-to-peer transfer, it may cause a Malformed TLP error. Bjorn
Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
On Wed, 2015-05-06 at 17:18 -0500, Bjorn Helgaas wrote: > [+cc Yijing, Dave J, Dave M, Alex] > > On Fri, May 01, 2015 at 01:32:12PM -0500, wda...@nvidia.com wrote: > > From: Will Davis > > > > Hi, > > > > This patch series adds DMA APIs to map and unmap a struct resource to and > > from > > a PCI device's IOVA domain, and implements the AMD, Intel, and nommu > > versions > > of these interfaces. > > > > This solves a long-standing problem with the existing DMA-remapping > > interfaces, > > which require that a struct page be given for the region to be mapped into a > > device's IOVA domain. This requirement cannot support peer device BAR > > ranges, > > for which no struct pages exist. > > > > The underlying implementations of map_page and map_sg convert the struct > > page > > into its physical address anyway, so we just need a way to route the > > physical > > address of the BAR region to these implementations. The new interfaces do > > this > > by taking the struct resource describing a device's BAR region, from which > > the > > physical address is derived. > > > > The Intel and nommu versions have been verified on a dual Intel Xeon E5405 > > workstation. I'm in the process of obtaining hardware to test the AMD > > version > > as well. Please review. > > I think we currently assume there's no peer-to-peer traffic. > > I don't know whether changing that will break anything, but I'm concerned > about these: > > - PCIe MPS configuration (see pcie_bus_configure_settings()). > > - PCIe ACS, e.g., pci_acs_enabled(). My guess is that this one is OK, > but Alex would know better. I think it should be OK too. ACS will force the transaction upstream for IOMMU translation rather than possible allowing redirection lower in the topology, but that's sort of the price we pay for isolation. The p2p context entries need to be present in the IOMMU, so without actually reading the patches, this does seem like something a driver might want to do via the DMA API. 
The IOMMU API already allows us to avoid the struct page issue and create mappings for p2p in the IOMMU. > - dma_addr_t. Currently dma_addr_t is big enough to hold any address > returned from the DMA API. That's not necessarily big enough to hold a > PCI bus address, e.g., a raw BAR value. > > > Will Davis (6): > > dma-debug: add checking for map/unmap_resource > > DMA-API: Introduce dma_(un)map_resource > > dma-mapping: pci: add pci_(un)map_resource > > iommu/amd: Implement (un)map_resource > > iommu/vt-d: implement (un)map_resource > > x86: add pci-nommu implementation of map_resource > > > > arch/x86/kernel/pci-nommu.c | 17 +++ > > drivers/iommu/amd_iommu.c| 76 > > ++-- > > drivers/iommu/intel-iommu.c | 18 > > include/asm-generic/dma-mapping-broken.h | 9 > > include/asm-generic/dma-mapping-common.h | 34 ++ > > include/asm-generic/pci-dma-compat.h | 14 ++ > > include/linux/dma-debug.h| 20 + > > include/linux/dma-mapping.h | 7 +++ > > lib/dma-debug.c | 48 > > 9 files changed, 230 insertions(+), 13 deletions(-) > > > > -- > > 2.3.7 > > > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > > the body of a message to majord...@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
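Alex's point is that iommu_map() already takes a raw phys_addr_t, so a BAR range can be mapped without any struct page; this series adds the same capability at the DMA API layer, where most drivers live. A non-buildable, kernel-context sketch of the existing IOMMU-API route (domain setup and IOVA allocation omitted; the prot flags are illustrative):

```c
#include <linux/iommu.h>
#include <linux/pci.h>

/* Sketch: map a peer device's BAR through the IOMMU API, which works
 * on physical addresses rather than struct pages. */
static int map_peer_bar_iommu(struct iommu_domain *domain,
			      unsigned long iova,
			      struct pci_dev *peer, int bar)
{
	phys_addr_t phys = pci_resource_start(peer, bar);
	size_t size = pci_resource_len(peer, bar);

	/* iommu_map() takes the BAR's physical address directly. */
	return iommu_map(domain, iova, phys, size,
			 IOMMU_READ | IOMMU_WRITE);
}
```

The trade-off is that the IOMMU API requires the caller to manage its own domain and IOVA space, whereas the proposed dma_map_resource() keeps drivers on the ordinary DMA API.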
Re: [PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
[+cc Yijing, Dave J, Dave M, Alex] On Fri, May 01, 2015 at 01:32:12PM -0500, wda...@nvidia.com wrote: > From: Will Davis > > Hi, > > This patch series adds DMA APIs to map and unmap a struct resource to and from > a PCI device's IOVA domain, and implements the AMD, Intel, and nommu versions > of these interfaces. > > This solves a long-standing problem with the existing DMA-remapping > interfaces, > which require that a struct page be given for the region to be mapped into a > device's IOVA domain. This requirement cannot support peer device BAR ranges, > for which no struct pages exist. > > The underlying implementations of map_page and map_sg convert the struct page > into its physical address anyway, so we just need a way to route the physical > address of the BAR region to these implementations. The new interfaces do this > by taking the struct resource describing a device's BAR region, from which the > physical address is derived. > > The Intel and nommu versions have been verified on a dual Intel Xeon E5405 > workstation. I'm in the process of obtaining hardware to test the AMD version > as well. Please review. I think we currently assume there's no peer-to-peer traffic. I don't know whether changing that will break anything, but I'm concerned about these: - PCIe MPS configuration (see pcie_bus_configure_settings()). - PCIe ACS, e.g., pci_acs_enabled(). My guess is that this one is OK, but Alex would know better. - dma_addr_t. Currently dma_addr_t is big enough to hold any address returned from the DMA API. That's not necessarily big enough to hold a PCI bus address, e.g., a raw BAR value. 
> Will Davis (6): > dma-debug: add checking for map/unmap_resource > DMA-API: Introduce dma_(un)map_resource > dma-mapping: pci: add pci_(un)map_resource > iommu/amd: Implement (un)map_resource > iommu/vt-d: implement (un)map_resource > x86: add pci-nommu implementation of map_resource > > arch/x86/kernel/pci-nommu.c | 17 +++ > drivers/iommu/amd_iommu.c| 76 > ++-- > drivers/iommu/intel-iommu.c | 18 > include/asm-generic/dma-mapping-broken.h | 9 > include/asm-generic/dma-mapping-common.h | 34 ++ > include/asm-generic/pci-dma-compat.h | 14 ++ > include/linux/dma-debug.h| 20 + > include/linux/dma-mapping.h | 7 +++ > lib/dma-debug.c | 48 > 9 files changed, 230 insertions(+), 13 deletions(-) > > -- > 2.3.7 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/6] IOMMU/DMA map_resource support for peer-to-peer
From: Will Davis

Hi,

This patch series adds DMA APIs to map and unmap a struct resource to and from a PCI device's IOVA domain, and implements the AMD, Intel, and nommu versions of these interfaces.

This solves a long-standing problem with the existing DMA-remapping interfaces, which require that a struct page be given for the region to be mapped into a device's IOVA domain. This requirement cannot support peer device BAR ranges, for which no struct pages exist.

The underlying implementations of map_page and map_sg convert the struct page into its physical address anyway, so we just need a way to route the physical address of the BAR region to these implementations. The new interfaces do this by taking the struct resource describing a device's BAR region, from which the physical address is derived.

The Intel and nommu versions have been verified on a dual Intel Xeon E5405 workstation. I'm in the process of obtaining hardware to test the AMD version as well. Please review.

Thanks,
Will

Will Davis (6):
  dma-debug: add checking for map/unmap_resource
  DMA-API: Introduce dma_(un)map_resource
  dma-mapping: pci: add pci_(un)map_resource
  iommu/amd: Implement (un)map_resource
  iommu/vt-d: implement (un)map_resource
  x86: add pci-nommu implementation of map_resource

 arch/x86/kernel/pci-nommu.c              | 17 +++
 drivers/iommu/amd_iommu.c                | 76 ++--
 drivers/iommu/intel-iommu.c              | 18
 include/asm-generic/dma-mapping-broken.h |  9
 include/asm-generic/dma-mapping-common.h | 34 ++
 include/asm-generic/pci-dma-compat.h     | 14 ++
 include/linux/dma-debug.h                | 20 +
 include/linux/dma-mapping.h              |  7 +++
 lib/dma-debug.c                          | 48
 9 files changed, 230 insertions(+), 13 deletions(-)

--
2.3.7
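For readers skimming the thread, a driver's-eye view of the proposed interface might look roughly like the following. This is a non-buildable sketch: the prototype is inferred from the existing dma_map_page()/dma_unmap_page() pattern and from the discussion (a backend without support returns 0, per the SWIOTLB exchange elsewhere in the thread); the actual signatures are in patch 2/6, and the peer device and BAR index here are placeholders.

```c
#include <linux/pci.h>
#include <linux/dma-mapping.h>

/* Sketch only: map a peer PCI device's BAR into this device's IOVA
 * domain using the proposed interface, then unmap it. */
static int demo_peer_mapping(struct pci_dev *self, struct pci_dev *peer)
{
	/* The peer BAR is described by a struct resource; no struct
	 * pages exist for this range. */
	struct resource *res = &peer->resource[0];
	dma_addr_t iova;

	iova = dma_map_resource(&self->dev, res, resource_size(res),
				DMA_BIDIRECTIONAL);
	if (!iova)
		return -EOPNOTSUPP; /* backend lacks map_resource support */

	/* ... program 'iova' into 'self' as a DMA read/write target ... */

	dma_unmap_resource(&self->dev, iova, resource_size(res),
			   DMA_BIDIRECTIONAL);
	return 0;
}
```

The key difference from dma_map_page() is only the input type: a struct resource (from which the physical address is derived) instead of a struct page.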