Re: Partial BAR Address Allocation
On Mon, Mar 06, 2017 at 12:04:39PM +0100, Joerg Roedel wrote:
> On Wed, Feb 22, 2017 at 05:39:44PM -0600, Bjorn Helgaas wrote:
> > [+cc Joerg, iommu list]
> >
> > On Wed, Feb 22, 2017 at 03:44:53PM -0500, Sinan Kaya wrote:
> > > On 2/22/2017 1:44 PM, Bjorn Helgaas wrote:
> > > > There is no way for a driver to say "I only need this memory BAR and
> > > > not the other ones." The reason is because the PCI_COMMAND_MEMORY bit
> > > > enables *all* the memory BARs; there's no way to enable memory BARs
> > > > selectively. If we enable memory BARs and one of them is unassigned,
> > > > that unassigned BAR is enabled, and the device will respond at
> > > > whatever address the register happens to contain, and that may cause
> > > > conflicts.
>
> Hmm, maybe I am missing something, but isn't this only a problem if the
> 'unassigned' BAR has an address configured that also falls into the
> Bridge-Window of the parent bridge? Otherwise no requests should be
> routed to the BAR anyway, right?

I guess it's true that we could safely enable a memory BAR if the
upstream bridge would never route anything to it. But it would depend
on the size of the BAR and the upstream bridge's configuration, so it
doesn't feel like it would really be reliable in general.

> > But if there *is* address translation in the PIO direction, we can
> > have conflicts because the bridge can translate CPU-side PIO accesses
> > to arbitrary PCI bus addresses.
>
> I am not aware of any hardware that does translation on the PIO space.
> The IOMMUs I know of don't care about PIO at all.

Right, address translation in the PIO direction would be done by the
host bridge, not the IOMMU. There are a fair number of bridges that
do this -- basically all the callers of pci_add_resource_offset().
They just apply a constant offset, often by chopping off some
high-order bits of the CPU address.

Bjorn
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
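The constant-offset translation described in the message above can be sketched as a small standalone C model. This is only an illustration, not kernel code: the struct hb_window type, the cpu_to_bus() helper and the field names are hypothetical, and the sketch assumes the offset is defined as CPU address minus PCI bus address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one host bridge window; not a kernel structure. */
struct hb_window {
    uint64_t cpu_start; /* first CPU physical address of the window */
    uint64_t cpu_end;   /* last CPU physical address, inclusive */
    uint64_t offset;    /* assumed convention: CPU address minus bus address */
};

/*
 * Translate a CPU-side PIO address into the PCI bus address the host
 * bridge would emit. Returns false when the CPU address is outside the
 * window, in which case *bus is left untouched.
 */
bool cpu_to_bus(const struct hb_window *w, uint64_t cpu, uint64_t *bus)
{
    assert(w->cpu_start <= w->cpu_end);
    if (cpu < w->cpu_start || cpu > w->cpu_end)
        return false;
    *bus = cpu - w->offset; /* constant offset applied by the bridge */
    return true;
}
```

With illustrative values (a window whose offset is 0x1000000000, chosen here only for the example), subtracting the offset is the same as chopping off the high-order bits of the CPU address, so a CPU address above 4GB lands in the 2-4GB range on the PCI bus.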
Re: Partial BAR Address Allocation
On 3/6/2017 6:04 AM, Joerg Roedel wrote:
> On Wed, Feb 22, 2017 at 05:39:44PM -0600, Bjorn Helgaas wrote:
>> [+cc Joerg, iommu list]
>>
>> On Wed, Feb 22, 2017 at 03:44:53PM -0500, Sinan Kaya wrote:
>>> On 2/22/2017 1:44 PM, Bjorn Helgaas wrote:
>>>> There is no way for a driver to say "I only need this memory BAR and
>>>> not the other ones." The reason is because the PCI_COMMAND_MEMORY bit
>>>> enables *all* the memory BARs; there's no way to enable memory BARs
>>>> selectively. If we enable memory BARs and one of them is unassigned,
>>>> that unassigned BAR is enabled, and the device will respond at
>>>> whatever address the register happens to contain, and that may cause
>>>> conflicts.
>
> Hmm, maybe I am missing something, but isn't this only a problem if the
> 'unassigned' BAR has an address configured that also falls into the
> Bridge-Window of the parent bridge? Otherwise no requests should be
> routed to the BAR anyway, right?

Correct, in order for this to happen you need to have multiple devices
under a bridge. One device sends a read request towards the system
address that happens to overlap with the BAR address of the unassigned
BAR. The device with the unassigned resource will start responding. This
is one of those P2P use cases.

>
>>> The problem is that according to PCI specification BAR addresses and
>>> DMA addresses cannot overlap.
>>>
>>> From PCI-to-PCI Bridge Arch. spec.: "A bridge forwards PCI memory
>>> transactions from its primary interface to its secondary interface
>>> (downstream) if a memory address is in the range defined by the
>>> Memory Base and Memory Limit registers (when the base is less than
>>> or equal to the limit) as illustrated in Figure 4-3. Conversely, a
>>> memory transaction on the secondary interface that is within this
>>> address range will not be forwarded upstream to the primary
>>> interface."
>>>
>>> To be specific, if your DMA address happens to be in
>>> [0x8000-0x] and root port's aperture includes this
>>> range; the DMA will never make it to the system memory.
>
> If there is no translation by an IOMMU this shouldn't be a problem, as
> long as the bridge windows don't overlap with system ram. With
> translation the IOMMU driver has to take care of that, which they
> usually do.

Correct, the IOMMU drivers that I have reviewed all carve the bridge
windows out of the IOMMU driver's allocatable address range in the
iova_reserve_pci_windows() function.

>
>> Hmmm. I guess SWIOTLB assumes there's no address translation in the
>> DMA direction, right? If there's no address translation in the PIO
>> direction, PCI bus BAR addresses are identical to the CPU-side
>> addresses. In that case, there's no conflict because we already have
>> to assign BARs so they never look like a system memory address.
>
> Yes, SWIOTLB assumes that IOVA == PA.
>
>> But if there *is* address translation in the PIO direction, we can
>> have conflicts because the bridge can translate CPU-side PIO accesses
>> to arbitrary PCI bus addresses.
>
> I am not aware of any hardware that does translation on the PIO space.
> The IOMMUs I know of don't care about PIO at all.

IOMMUs are used mostly in the inbound path. Most of the HW has some sort
of reserved address range in the first 4 GB, with or without
translation, to support existing 32-bit-only cards. We are talking about
a problem in the outbound/PIO path.

>
>
>
> Joerg
>
>

--
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
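The forwarding rule quoted from the PCI-to-PCI Bridge Architecture spec in this message can be modelled in a few lines of C. This is a simplified sketch under stated assumptions: the struct bridge_window type and both helper names are hypothetical, only the non-prefetchable memory window is modelled, and other conditions a real bridge checks (for example its command register enables) are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a bridge's Memory Base / Memory Limit pair. */
struct bridge_window {
    uint64_t base;  /* first address of the window */
    uint64_t limit; /* last address of the window, inclusive */
};

/* The window is only active when base <= limit, as the spec text says. */
bool in_window(const struct bridge_window *w, uint64_t addr)
{
    return w->base <= w->limit && addr >= w->base && addr <= w->limit;
}

/* Primary -> secondary: forwarded only when the address is in the window. */
bool forwarded_downstream(const struct bridge_window *w, uint64_t addr)
{
    return in_window(w, addr);
}

/*
 * Secondary -> primary: a transaction inside the window is NOT forwarded
 * upstream, so a DMA to such an address never reaches system memory.
 */
bool forwarded_upstream(const struct bridge_window *w, uint64_t addr)
{
    return !in_window(w, addr);
}
```

In this model a DMA whose bus address falls inside the window stays below the bridge, which is the peer-to-peer case discussed in the thread; an address outside the window is passed upstream.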
Re: Partial BAR Address Allocation
On Wed, Feb 22, 2017 at 05:39:44PM -0600, Bjorn Helgaas wrote:
> [+cc Joerg, iommu list]
>
> On Wed, Feb 22, 2017 at 03:44:53PM -0500, Sinan Kaya wrote:
> > On 2/22/2017 1:44 PM, Bjorn Helgaas wrote:
> > > There is no way for a driver to say "I only need this memory BAR and
> > > not the other ones." The reason is because the PCI_COMMAND_MEMORY bit
> > > enables *all* the memory BARs; there's no way to enable memory BARs
> > > selectively. If we enable memory BARs and one of them is unassigned,
> > > that unassigned BAR is enabled, and the device will respond at
> > > whatever address the register happens to contain, and that may cause
> > > conflicts.

Hmm, maybe I am missing something, but isn't this only a problem if the
'unassigned' BAR has an address configured that also falls into the
Bridge-Window of the parent bridge? Otherwise no requests should be
routed to the BAR anyway, right?

> > The problem is that according to PCI specification BAR addresses and
> > DMA addresses cannot overlap.
> >
> > From PCI-to-PCI Bridge Arch. spec.: "A bridge forwards PCI memory
> > transactions from its primary interface to its secondary interface
> > (downstream) if a memory address is in the range defined by the
> > Memory Base and Memory Limit registers (when the base is less than
> > or equal to the limit) as illustrated in Figure 4-3. Conversely, a
> > memory transaction on the secondary interface that is within this
> > address range will not be forwarded upstream to the primary
> > interface."
> >
> > To be specific, if your DMA address happens to be in
> > [0x8000-0x] and root port's aperture includes this
> > range; the DMA will never make it to the system memory.

If there is no translation by an IOMMU this shouldn't be a problem, as
long as the bridge windows don't overlap with system ram. With
translation the IOMMU driver has to take care of that, which they
usually do.

> Hmmm. I guess SWIOTLB assumes there's no address translation in the
> DMA direction, right? If there's no address translation in the PIO
> direction, PCI bus BAR addresses are identical to the CPU-side
> addresses. In that case, there's no conflict because we already have
> to assign BARs so they never look like a system memory address.

Yes, SWIOTLB assumes that IOVA == PA.

> But if there *is* address translation in the PIO direction, we can
> have conflicts because the bridge can translate CPU-side PIO accesses
> to arbitrary PCI bus addresses.

I am not aware of any hardware that does translation on the PIO space.
The IOMMUs I know of don't care about PIO at all.

Joerg
Re: Partial BAR Address Allocation
Hi Robin,

On 2/23/2017 6:40 AM, Robin Murphy wrote:
> On 22/02/17 23:39, Bjorn Helgaas wrote:
>> [+cc Joerg, iommu list]
>>
>> On Wed, Feb 22, 2017 at 03:44:53PM -0500, Sinan Kaya wrote:
>>> On 2/22/2017 1:44 PM, Bjorn Helgaas wrote:
>>>> There is no way for a driver to say "I only need this memory BAR and
>>>> not the other ones." The reason is because the PCI_COMMAND_MEMORY bit
>>>> enables *all* the memory BARs; there's no way to enable memory BARs
>>>> selectively. If we enable memory BARs and one of them is unassigned,
>>>> that unassigned BAR is enabled, and the device will respond at
>>>> whatever address the register happens to contain, and that may cause
>>>> conflicts.
>>>>
>>>> I'm not sure this answers your question. Do you want to get rid of
>>>> 32-bit BAR addresses because your host bridge doesn't have a window to
>>>> 32-bit PCI addresses? It's typical for a bridge to support a window
>>>> to the 32-bit PCI space as well as one to the 64-bit PCI space. Often
>>>> it performs address translation for the 32-bit window so it doesn't
>>>> have to be in the 32-bit area on the CPU side, e.g., you could have
>>>> something like this where we have three host bridges and the 2-4GB
>>>> space on each PCI root bus is addressable:
>>>>
>>>>   pci_bus :00: root bus resource [mem 0x108000-0x10] (bus address [0x8000-0x])
>>>>   pci_bus 0001:00: root bus resource [mem 0x118000-0x11] (bus address [0x8000-0x])
>>>>   pci_bus 0002:00: root bus resource [mem 0x128000-0x12] (bus address [0x8000-0x])
>>>
>>> The problem is that according to PCI specification BAR addresses and
>>> DMA addresses cannot overlap.
>>>
>>> From PCI-to-PCI Bridge Arch. spec.: "A bridge forwards PCI memory
>>> transactions from its primary interface to its secondary interface
>>> (downstream) if a memory address is in the range defined by the
>>> Memory Base and Memory Limit registers (when the base is less than
>>> or equal to the limit) as illustrated in Figure 4-3. Conversely, a
>>> memory transaction on the secondary interface that is within this
>>> address range will not be forwarded upstream to the primary
>>> interface."
>>>
>>> To be specific, if your DMA address happens to be in
>>> [0x8000-0x] and root port's aperture includes this
>>> range; the DMA will never make it to the system memory.
>>>
>>> Lorenzo and Robin took some steps to carve out PCI addresses out of
>>> DMA addresses in IOMMU drivers by using iova_reserve_pci_windows()
>>> function.
>>>
>>> However, I see that we are still exposed when the operating system
>>> doesn't have any IOMMU driver and is using the SWIOTLB for instance.
>>
>> Hmmm. I guess SWIOTLB assumes there's no address translation in the
>> DMA direction, right?
>
> Not entirely - it does rely on arch-provided dma_to_phys() and
> phys_to_dma() helpers which are free to accommodate such translations in
> a device-specific manner. On arm64 we use these to account for
> dev->dma_pfn_offset describing a straightforward linear offset, but
> unless one constant offset would apply to all possible outbound windows
> I'm not sure that's much help here.

Yeah, that won't help. This is a PCI-only problem. An arch-layer
solution would move the entire DMA range for all peripherals in the SoC
to a specific offset. That would be most useful if the entire DDR
started at some non-zero offset. Even then, PCI usually has several
ranges: one range like the one below to provide some space below 4GB,
and another untranslated range for true 64-bit cards.

pci_bus 0002:00: root bus resource [mem 0x128000-0x12] (bus address [0x8000-0x])

We have to emulate some range in the first 4GB to make PCI cards happy.

>
>> If there's no address translation in the PIO
>> direction, PCI bus BAR addresses are identical to the CPU-side
>> addresses. In that case, there's no conflict because we already have
>> to assign BARs so they never look like a system memory address.
>>
>> But if there *is* address translation in the PIO direction, we can
>> have conflicts because the bridge can translate CPU-side PIO accesses
>> to arbitrary PCI bus addresses.
>>
>>> The FW solution I'm looking at requires carving out some part of the
>>> DDR from before OS boot so that OS doesn't reclaim that area for
>>> DMA.
>>
>> If you want to reach system RAM, I guess you need to make sure you
>> only DMA to bus addresses outside the host bridge windows, as you said
>> above. DMA inside the windows would be handled as peer-to-peer DMA.
>>
>>> I'm not very happy with this solution. I'm also surprised that there
>>> is no generic solution in the kernel that takes care of this for all
>>> root ports regardless of IOMMU driver presence.
>>
>> The PCI core isn't really involved in allocating DMA addresses,
>> although there definitely is the connection with PCI-to-PCI bridge
>> windows that you mentioned. I added IOMMU
Re: Partial BAR Address Allocation
On 22/02/17 23:39, Bjorn Helgaas wrote:
> [+cc Joerg, iommu list]
>
> On Wed, Feb 22, 2017 at 03:44:53PM -0500, Sinan Kaya wrote:
>> On 2/22/2017 1:44 PM, Bjorn Helgaas wrote:
>>> There is no way for a driver to say "I only need this memory BAR and
>>> not the other ones." The reason is because the PCI_COMMAND_MEMORY bit
>>> enables *all* the memory BARs; there's no way to enable memory BARs
>>> selectively. If we enable memory BARs and one of them is unassigned,
>>> that unassigned BAR is enabled, and the device will respond at
>>> whatever address the register happens to contain, and that may cause
>>> conflicts.
>>>
>>> I'm not sure this answers your question. Do you want to get rid of
>>> 32-bit BAR addresses because your host bridge doesn't have a window to
>>> 32-bit PCI addresses? It's typical for a bridge to support a window
>>> to the 32-bit PCI space as well as one to the 64-bit PCI space. Often
>>> it performs address translation for the 32-bit window so it doesn't
>>> have to be in the 32-bit area on the CPU side, e.g., you could have
>>> something like this where we have three host bridges and the 2-4GB
>>> space on each PCI root bus is addressable:
>>>
>>>   pci_bus :00: root bus resource [mem 0x108000-0x10] (bus address [0x8000-0x])
>>>   pci_bus 0001:00: root bus resource [mem 0x118000-0x11] (bus address [0x8000-0x])
>>>   pci_bus 0002:00: root bus resource [mem 0x128000-0x12] (bus address [0x8000-0x])
>>
>> The problem is that according to PCI specification BAR addresses and
>> DMA addresses cannot overlap.
>>
>> From PCI-to-PCI Bridge Arch. spec.: "A bridge forwards PCI memory
>> transactions from its primary interface to its secondary interface
>> (downstream) if a memory address is in the range defined by the
>> Memory Base and Memory Limit registers (when the base is less than
>> or equal to the limit) as illustrated in Figure 4-3. Conversely, a
>> memory transaction on the secondary interface that is within this
>> address range will not be forwarded upstream to the primary
>> interface."
>>
>> To be specific, if your DMA address happens to be in
>> [0x8000-0x] and root port's aperture includes this
>> range; the DMA will never make it to the system memory.
>>
>> Lorenzo and Robin took some steps to carve out PCI addresses out of
>> DMA addresses in IOMMU drivers by using iova_reserve_pci_windows()
>> function.
>>
>> However, I see that we are still exposed when the operating system
>> doesn't have any IOMMU driver and is using the SWIOTLB for instance.
>
> Hmmm. I guess SWIOTLB assumes there's no address translation in the
> DMA direction, right?

Not entirely - it does rely on arch-provided dma_to_phys() and
phys_to_dma() helpers which are free to accommodate such translations in
a device-specific manner. On arm64 we use these to account for
dev->dma_pfn_offset describing a straightforward linear offset, but
unless one constant offset would apply to all possible outbound windows
I'm not sure that's much help here.

> If there's no address translation in the PIO
> direction, PCI bus BAR addresses are identical to the CPU-side
> addresses. In that case, there's no conflict because we already have
> to assign BARs so they never look like a system memory address.
>
> But if there *is* address translation in the PIO direction, we can
> have conflicts because the bridge can translate CPU-side PIO accesses
> to arbitrary PCI bus addresses.
>
>> The FW solution I'm looking at requires carving out some part of the
>> DDR from before OS boot so that OS doesn't reclaim that area for
>> DMA.
>
> If you want to reach system RAM, I guess you need to make sure you
> only DMA to bus addresses outside the host bridge windows, as you said
> above. DMA inside the windows would be handled as peer-to-peer DMA.
>
>> I'm not very happy with this solution. I'm also surprised that there
>> is no generic solution in the kernel that takes care of this for all
>> root ports regardless of IOMMU driver presence.
>
> The PCI core isn't really involved in allocating DMA addresses,
> although there definitely is the connection with PCI-to-PCI bridge
> windows that you mentioned. I added IOMMU guys, who would know a lot
> more than I do.

To me, having the bus addresses of windows shadow assigned physical
addresses sounds mostly like a broken system configuration. Can the
firmware not reprogram them elsewhere, or is the entire bottom 4GB of
the physical memory map occupied by system RAM?

Robin.

>
> Bjorn
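The linear offset mentioned for the dma_to_phys()/phys_to_dma() helpers can be illustrated with a standalone C model. It is a sketch only: the model_ function names, the struct model_dev type and the 4 KiB page size are assumptions made for the example and do not reproduce any architecture's actual implementation.

```c
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12 /* assumed 4 KiB pages, for illustration only */

/* Hypothetical stand-in for a device carrying a DMA page-frame offset. */
struct model_dev {
    uint64_t dma_pfn_offset; /* offset in pages between CPU and DMA views */
};

/* CPU physical address -> DMA (bus) address under one linear offset. */
uint64_t model_phys_to_dma(const struct model_dev *dev, uint64_t paddr)
{
    return paddr - (dev->dma_pfn_offset << MODEL_PAGE_SHIFT);
}

/* DMA (bus) address -> CPU physical address; inverse of the above. */
uint64_t model_dma_to_phys(const struct model_dev *dev, uint64_t dma)
{
    return dma + (dev->dma_pfn_offset << MODEL_PAGE_SHIFT);
}
```

Because the model has exactly one offset per device, it cannot express a host bridge with several outbound windows that each need a different translation, which is the limitation raised in the message above.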
Re: Partial BAR Address Allocation
[+cc Joerg, iommu list]

On Wed, Feb 22, 2017 at 03:44:53PM -0500, Sinan Kaya wrote:
> On 2/22/2017 1:44 PM, Bjorn Helgaas wrote:
> > There is no way for a driver to say "I only need this memory BAR and
> > not the other ones." The reason is because the PCI_COMMAND_MEMORY bit
> > enables *all* the memory BARs; there's no way to enable memory BARs
> > selectively. If we enable memory BARs and one of them is unassigned,
> > that unassigned BAR is enabled, and the device will respond at
> > whatever address the register happens to contain, and that may cause
> > conflicts.
> >
> > I'm not sure this answers your question. Do you want to get rid of
> > 32-bit BAR addresses because your host bridge doesn't have a window to
> > 32-bit PCI addresses? It's typical for a bridge to support a window
> > to the 32-bit PCI space as well as one to the 64-bit PCI space. Often
> > it performs address translation for the 32-bit window so it doesn't
> > have to be in the 32-bit area on the CPU side, e.g., you could have
> > something like this where we have three host bridges and the 2-4GB
> > space on each PCI root bus is addressable:
> >
> >   pci_bus :00: root bus resource [mem 0x108000-0x10] (bus address [0x8000-0x])
> >   pci_bus 0001:00: root bus resource [mem 0x118000-0x11] (bus address [0x8000-0x])
> >   pci_bus 0002:00: root bus resource [mem 0x128000-0x12] (bus address [0x8000-0x])
>
> The problem is that according to PCI specification BAR addresses and
> DMA addresses cannot overlap.
>
> From PCI-to-PCI Bridge Arch. spec.: "A bridge forwards PCI memory
> transactions from its primary interface to its secondary interface
> (downstream) if a memory address is in the range defined by the
> Memory Base and Memory Limit registers (when the base is less than
> or equal to the limit) as illustrated in Figure 4-3. Conversely, a
> memory transaction on the secondary interface that is within this
> address range will not be forwarded upstream to the primary
> interface."
>
> To be specific, if your DMA address happens to be in
> [0x8000-0x] and root port's aperture includes this
> range; the DMA will never make it to the system memory.
>
> Lorenzo and Robin took some steps to carve out PCI addresses out of
> DMA addresses in IOMMU drivers by using iova_reserve_pci_windows()
> function.
>
> However, I see that we are still exposed when the operating system
> doesn't have any IOMMU driver and is using the SWIOTLB for instance.

Hmmm. I guess SWIOTLB assumes there's no address translation in the
DMA direction, right? If there's no address translation in the PIO
direction, PCI bus BAR addresses are identical to the CPU-side
addresses. In that case, there's no conflict because we already have
to assign BARs so they never look like a system memory address.

But if there *is* address translation in the PIO direction, we can
have conflicts because the bridge can translate CPU-side PIO accesses
to arbitrary PCI bus addresses.

> The FW solution I'm looking at requires carving out some part of the
> DDR from before OS boot so that OS doesn't reclaim that area for
> DMA.

If you want to reach system RAM, I guess you need to make sure you
only DMA to bus addresses outside the host bridge windows, as you said
above. DMA inside the windows would be handled as peer-to-peer DMA.

> I'm not very happy with this solution. I'm also surprised that there
> is no generic solution in the kernel that takes care of this for all
> root ports regardless of IOMMU driver presence.

The PCI core isn't really involved in allocating DMA addresses,
although there definitely is the connection with PCI-to-PCI bridge
windows that you mentioned. I added IOMMU guys, who would know a lot
more than I do.

Bjorn
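The constraint stated in this message, that DMA meant for system RAM must use bus addresses outside every host bridge window, amounts to an interval-overlap check. The following C sketch shows one way to express it; struct bus_window and dma_range_hits_window() are hypothetical names, and the sketch is not how the kernel or any IOMMU driver implements the reservation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one host bridge window in bus address space. */
struct bus_window {
    uint64_t start; /* first bus address of the window */
    uint64_t end;   /* last bus address of the window, inclusive */
};

/*
 * Return true if the DMA buffer [dma_start, dma_start + len - 1] overlaps
 * any window. Such a buffer would be claimed as peer-to-peer traffic
 * instead of reaching system RAM.
 */
bool dma_range_hits_window(const struct bus_window *wins, size_t n,
                           uint64_t dma_start, uint64_t len)
{
    /* Reject empty buffers and ranges that would wrap around. */
    assert(len > 0 && dma_start <= UINT64_MAX - (len - 1));
    uint64_t dma_end = dma_start + (len - 1);

    for (size_t i = 0; i < n; i++) {
        /* Two inclusive ranges overlap when each starts before the other ends. */
        if (dma_start <= wins[i].end && wins[i].start <= dma_end)
            return true;
    }
    return false;
}
```

A caller could run this check over candidate buffers before handing a bus address to a device; a buffer that ends one byte before a window is accepted, while one that extends a single byte into it is rejected.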