Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling
On 04/04/2017 10:28 AM, Robin Murphy wrote: > So (at the risk of Jon mooing at me) m -- Computer Architect | Sent from my Fedora powered laptop ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling
On Wed, Apr 12, 2017 at 08:41:20PM +, Jayachandran C wrote: > On Wed, Apr 12, 2017 at 02:11:38PM -0500, Bjorn Helgaas wrote: > > On Wed, Apr 12, 2017 at 06:10:34PM +, Jayachandran C wrote: > > > On Wed, Apr 12, 2017 at 11:21:18AM -0500, Bjorn Helgaas wrote: > > > > On Tue, Apr 11, 2017 at 03:27:02PM +, Jayachandran C wrote: > > > > > On Tue, Apr 11, 2017 at 08:41:25AM -0500, Bjorn Helgaas wrote: > > > > > > [+cc Joerg] > > > > > > > > > > > > On Tue, Apr 11, 2017 at 07:10:48AM +, Jayachandran C wrote: > > > > > > > On Mon, Apr 10, 2017 at 08:28:47PM -0500, Bjorn Helgaas wrote: > > > > > > > > Hi Jayachandran, > > > > > > > > > > > > > > > > On Mon, Apr 03, 2017 at 01:15:04PM +, Jayachandran C wrote: > > > > > > > > > The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan > > > > > > > > > earlier), the PCI > > > > > > > > > topology is slightly unusual. For a multi-node system, it > > > > > > > > > looks like: > > > > > > > > > > > > > > > > > > [node level PCI bridges - one per node] > > > > > > > > > [SoC PCI devices with MSI-X but no IOMMU] > > > > > > > > > [PCI-PCIe "glue" bridges - upto 14, one per real port > > > > > > > > > below] > > > > > > > > > [PCIe real root ports associated with IOMMU and GICv3 > > > > > > > > > ITS] > > > > > > > > > [External PCI devices connected to PCIe links] > > > > > > > > > > > > > > > > > > The top two levels of bridges should have introduced aliases > > > > > > > > > since they > > > > > > > > > are PCI and PCI/PCIe bridges, but in the case of ThunderX2 > > > > > > > > > they do not. > > > > > > > > > In the case of external PCIe devices, the "real" root ports > > > > > > > > > are connected > > > > > > > > > to the SMMU and the GIC ITS, so PCI-PCIe bridge does not > > > > > > > > > introduce an > > > > > > > > > alias. The SoC PCI devices are directly connected to the GIC > > > > > > > > > ITS, so the > > > > > > > > > node level bridges do not introduce an alias either. > > > > > > > > > > > > > > > > > > To handle this quirk, we mark the real PCIe root ports and > > > > > > > > > node level > > > > > > > > > PCI bridges with the flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT. > > > > > > > > > With this, > > > > > > > > > pci_for_each_dma_alias() works correctly for external PCIe > > > > > > > > > devices and > > > > > > > > > SoC PCI devices. > > > > > > > > > > > > > > > > > > For the current revision of Cavium ThunderX2, the VendorID > > > > > > > > > and Device ID > > > > > > > > > are from Broadcom Vulcan (14e4:90XX). > > > > > > > > > > > > > > > > Can you supply some text here about why we want to apply this > > > > > > > > patch? > > > > > > > > E.g., does it avoid making unnecessary IOMMU mappings, improve > > > > > > > > performance, avoid a crash, etc? > > > > > > > > > > > > > > If this is for the commit message, I hope the following is ok: > > > > > > > > > > > > > > "With this change, both MSI-X and IO virtualization work > > > > > > > correctly on > > > > > > > Cavium ThunderX2. The GIC ITS driver gets the correct device ID to > > > > > > > configure MSI-X, the SMMUv3 driver gets the correct Stream IDs > > > > > > > for PCI > > > > > > > devices, and the IOMMU groups are setup correctly." > > > > > > > > > > > > This doesn't get at what the actual problem is. I'm hoping for > > > > > > something like "without this change, we set up an IOMMU mapping for > > > > > > requestor ID X, but device DMA uses requestor ID Y because , > > > > > > which > > > > > > results in an IOMMU fault" > > > > > > > > > > Ok. I hope this would be better: > > > > > > > > > > "Without this change, the last alias seen while traversing the PCI > > > > > hierarchy will be used as the RID to generate the device ID for ITS > > > > > and stream ID for SMMU. This in turn causes the MSI-X generated by the > > > > > device to fail since the ITS expects to have translation tables based > > > > > on the actual PCIe RID and not the (irrelevant) alias. Similarly, the > > > > > device DMA also fails when SMMU is enabled due to incorrect value in > > > > > SMMU translation tables" > > > > > > > > This description is true, but I don't think it addresses the real > > > > problem. I think the real problem is that your IOMMU code doesn't > > > > handle aliases correctly, and by ignoring these invalid aliases, we > > > > happen to map an alias that works for the builtin devices. But that's > > > > only because we got lucky (those devices use a single RID and they're > > > > not behind bridges that optionally take ownership). > > > > > > > > It would make sense to me if we fixed the IOMMU code to map *all* the > > > > aliases, which should be enough to make your devices work. If we then > > > > wanted to apply a patch like this on top, it would be simply an > > > > optimization that avoids unnecessary IOMMU mappings. > > > > > > The issue that the IOMMU code does not handle valid aliases is > > > unrelated to what
Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling
On Wed, Apr 12, 2017 at 02:11:38PM -0500, Bjorn Helgaas wrote: > On Wed, Apr 12, 2017 at 06:10:34PM +, Jayachandran C wrote: > > On Wed, Apr 12, 2017 at 11:21:18AM -0500, Bjorn Helgaas wrote: > > > On Tue, Apr 11, 2017 at 03:27:02PM +, Jayachandran C wrote: > > > > On Tue, Apr 11, 2017 at 08:41:25AM -0500, Bjorn Helgaas wrote: > > > > > [+cc Joerg] > > > > > > > > > > On Tue, Apr 11, 2017 at 07:10:48AM +, Jayachandran C wrote: > > > > > > On Mon, Apr 10, 2017 at 08:28:47PM -0500, Bjorn Helgaas wrote: > > > > > > > Hi Jayachandran, > > > > > > > > > > > > > > On Mon, Apr 03, 2017 at 01:15:04PM +, Jayachandran C wrote: > > > > > > > > The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan > > > > > > > > earlier), the PCI > > > > > > > > topology is slightly unusual. For a multi-node system, it looks > > > > > > > > like: > > > > > > > > > > > > > > > > [node level PCI bridges - one per node] > > > > > > > > [SoC PCI devices with MSI-X but no IOMMU] > > > > > > > > [PCI-PCIe "glue" bridges - upto 14, one per real port below] > > > > > > > > [PCIe real root ports associated with IOMMU and GICv3 > > > > > > > > ITS] > > > > > > > > [External PCI devices connected to PCIe links] > > > > > > > > > > > > > > > > The top two levels of bridges should have introduced aliases > > > > > > > > since they > > > > > > > > are PCI and PCI/PCIe bridges, but in the case of ThunderX2 they > > > > > > > > do not. > > > > > > > > In the case of external PCIe devices, the "real" root ports are > > > > > > > > connected > > > > > > > > to the SMMU and the GIC ITS, so PCI-PCIe bridge does not > > > > > > > > introduce an > > > > > > > > alias. The SoC PCI devices are directly connected to the GIC > > > > > > > > ITS, so the > > > > > > > > node level bridges do not introduce an alias either. > > > > > > > > > > > > > > > > To handle this quirk, we mark the real PCIe root ports and node > > > > > > > > level > > > > > > > > PCI bridges with the flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT. > > > > > > > > With this, > > > > > > > > pci_for_each_dma_alias() works correctly for external PCIe > > > > > > > > devices and > > > > > > > > SoC PCI devices. > > > > > > > > > > > > > > > > For the current revision of Cavium ThunderX2, the VendorID and > > > > > > > > Device ID > > > > > > > > are from Broadcom Vulcan (14e4:90XX). > > > > > > > > > > > > > > Can you supply some text here about why we want to apply this > > > > > > > patch? > > > > > > > E.g., does it avoid making unnecessary IOMMU mappings, improve > > > > > > > performance, avoid a crash, etc? > > > > > > > > > > > > If this is for the commit message, I hope the following is ok: > > > > > > > > > > > > "With this change, both MSI-X and IO virtualization work correctly > > > > > > on > > > > > > Cavium ThunderX2. The GIC ITS driver gets the correct device ID to > > > > > > configure MSI-X, the SMMUv3 driver gets the correct Stream IDs for > > > > > > PCI > > > > > > devices, and the IOMMU groups are setup correctly." > > > > > > > > > > This doesn't get at what the actual problem is. I'm hoping for > > > > > something like "without this change, we set up an IOMMU mapping for > > > > > requestor ID X, but device DMA uses requestor ID Y because , which > > > > > results in an IOMMU fault" > > > > > > > > Ok. I hope this would be better: > > > > > > > > "Without this change, the last alias seen while traversing the PCI > > > > hierarchy will be used as the RID to generate the device ID for ITS > > > > and stream ID for SMMU. This in turn causes the MSI-X generated by the > > > > device to fail since the ITS expects to have translation tables based > > > > on the actual PCIe RID and not the (irrelevant) alias. Similarly, the > > > > device DMA also fails when SMMU is enabled due to incorrect value in > > > > SMMU translation tables" > > > > > > This description is true, but I don't think it addresses the real > > > problem. I think the real problem is that your IOMMU code doesn't > > > handle aliases correctly, and by ignoring these invalid aliases, we > > > happen to map an alias that works for the builtin devices. But that's > > > only because we got lucky (those devices use a single RID and they're > > > not behind bridges that optionally take ownership). > > > > > > It would make sense to me if we fixed the IOMMU code to map *all* the > > > aliases, which should be enough to make your devices work. If we then > > > wanted to apply a patch like this on top, it would be simply an > > > optimization that avoids unnecessary IOMMU mappings. > > > > The issue that the IOMMU code does not handle valid aliases is > > unrelated to what I am trying to fix. The quirk is to make sure > > that invalid aliases are not seen on ThunderX2 while doing > > pci_for_each_dma_alias(). > > > > The DMA and MSI-X requests leave the PCI/PCIe hierarchy at the point > > where the SMMU (or ITS) is attached, i.e. at the bridge
Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling
On Wed, Apr 12, 2017 at 06:10:34PM +, Jayachandran C wrote: > On Wed, Apr 12, 2017 at 11:21:18AM -0500, Bjorn Helgaas wrote: > > On Tue, Apr 11, 2017 at 03:27:02PM +, Jayachandran C wrote: > > > On Tue, Apr 11, 2017 at 08:41:25AM -0500, Bjorn Helgaas wrote: > > > > [+cc Joerg] > > > > > > > > On Tue, Apr 11, 2017 at 07:10:48AM +, Jayachandran C wrote: > > > > > On Mon, Apr 10, 2017 at 08:28:47PM -0500, Bjorn Helgaas wrote: > > > > > > Hi Jayachandran, > > > > > > > > > > > > On Mon, Apr 03, 2017 at 01:15:04PM +, Jayachandran C wrote: > > > > > > > The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), > > > > > > > the PCI > > > > > > > topology is slightly unusual. For a multi-node system, it looks > > > > > > > like: > > > > > > > > > > > > > > [node level PCI bridges - one per node] > > > > > > > [SoC PCI devices with MSI-X but no IOMMU] > > > > > > > [PCI-PCIe "glue" bridges - upto 14, one per real port below] > > > > > > > [PCIe real root ports associated with IOMMU and GICv3 ITS] > > > > > > > [External PCI devices connected to PCIe links] > > > > > > > > > > > > > > The top two levels of bridges should have introduced aliases > > > > > > > since they > > > > > > > are PCI and PCI/PCIe bridges, but in the case of ThunderX2 they > > > > > > > do not. > > > > > > > In the case of external PCIe devices, the "real" root ports are > > > > > > > connected > > > > > > > to the SMMU and the GIC ITS, so PCI-PCIe bridge does not > > > > > > > introduce an > > > > > > > alias. The SoC PCI devices are directly connected to the GIC ITS, > > > > > > > so the > > > > > > > node level bridges do not introduce an alias either. > > > > > > > > > > > > > > To handle this quirk, we mark the real PCIe root ports and node > > > > > > > level > > > > > > > PCI bridges with the flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT. With > > > > > > > this, > > > > > > > pci_for_each_dma_alias() works correctly for external PCIe > > > > > > > devices and > > > > > > > SoC PCI devices. > > > > > > > > > > > > > > For the current revision of Cavium ThunderX2, the VendorID and > > > > > > > Device ID > > > > > > > are from Broadcom Vulcan (14e4:90XX). > > > > > > > > > > > > Can you supply some text here about why we want to apply this patch? > > > > > > E.g., does it avoid making unnecessary IOMMU mappings, improve > > > > > > performance, avoid a crash, etc? > > > > > > > > > > If this is for the commit message, I hope the following is ok: > > > > > > > > > > "With this change, both MSI-X and IO virtualization work correctly on > > > > > Cavium ThunderX2. The GIC ITS driver gets the correct device ID to > > > > > configure MSI-X, the SMMUv3 driver gets the correct Stream IDs for PCI > > > > > devices, and the IOMMU groups are setup correctly." > > > > > > > > This doesn't get at what the actual problem is. I'm hoping for > > > > something like "without this change, we set up an IOMMU mapping for > > > > requestor ID X, but device DMA uses requestor ID Y because , which > > > > results in an IOMMU fault" > > > > > > Ok. I hope this would be better: > > > > > > "Without this change, the last alias seen while traversing the PCI > > > hierarchy will be used as the RID to generate the device ID for ITS > > > and stream ID for SMMU. This in turn causes the MSI-X generated by the > > > device to fail since the ITS expects to have translation tables based > > > on the actual PCIe RID and not the (irrelevant) alias. Similarly, the > > > device DMA also fails when SMMU is enabled due to incorrect value in > > > SMMU translation tables" > > > > This description is true, but I don't think it addresses the real > > problem. I think the real problem is that your IOMMU code doesn't > > handle aliases correctly, and by ignoring these invalid aliases, we > > happen to map an alias that works for the builtin devices. But that's > > only because we got lucky (those devices use a single RID and they're > > not behind bridges that optionally take ownership). > > > > It would make sense to me if we fixed the IOMMU code to map *all* the > > aliases, which should be enough to make your devices work. If we then > > wanted to apply a patch like this on top, it would be simply an > > optimization that avoids unnecessary IOMMU mappings. > > The issue that the IOMMU code does not handle valid aliases is > unrelated to what I am trying to fix. The quirk is to make sure > that invalid aliases are not seen on ThunderX2 while doing > pci_for_each_dma_alias(). > > The DMA and MSI-X requests leave the PCI/PCIe hierarchy at the point > where the SMMU (or ITS) is attached, i.e. at the bridge marked with > PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT. The quirk ensures that we don't look > for aliases above that point. > > The toplevel bridge is marked PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT since > the on-chip devices are directly connected to the ITS (they do not > use SMMU). > > The top two levels of bridg
Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling
On Wed, Apr 12, 2017 at 11:21:18AM -0500, Bjorn Helgaas wrote: > On Tue, Apr 11, 2017 at 03:27:02PM +, Jayachandran C wrote: > > On Tue, Apr 11, 2017 at 08:41:25AM -0500, Bjorn Helgaas wrote: > > > [+cc Joerg] > > > > > > On Tue, Apr 11, 2017 at 07:10:48AM +, Jayachandran C wrote: > > > > On Mon, Apr 10, 2017 at 08:28:47PM -0500, Bjorn Helgaas wrote: > > > > > Hi Jayachandran, > > > > > > > > > > On Mon, Apr 03, 2017 at 01:15:04PM +, Jayachandran C wrote: > > > > > > The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), > > > > > > the PCI > > > > > > topology is slightly unusual. For a multi-node system, it looks > > > > > > like: > > > > > > > > > > > > [node level PCI bridges - one per node] > > > > > > [SoC PCI devices with MSI-X but no IOMMU] > > > > > > [PCI-PCIe "glue" bridges - upto 14, one per real port below] > > > > > > [PCIe real root ports associated with IOMMU and GICv3 ITS] > > > > > > [External PCI devices connected to PCIe links] > > > > > > > > > > > > The top two levels of bridges should have introduced aliases since > > > > > > they > > > > > > are PCI and PCI/PCIe bridges, but in the case of ThunderX2 they do > > > > > > not. > > > > > > In the case of external PCIe devices, the "real" root ports are > > > > > > connected > > > > > > to the SMMU and the GIC ITS, so PCI-PCIe bridge does not introduce > > > > > > an > > > > > > alias. The SoC PCI devices are directly connected to the GIC ITS, > > > > > > so the > > > > > > node level bridges do not introduce an alias either. > > > > > > > > > > > > To handle this quirk, we mark the real PCIe root ports and node > > > > > > level > > > > > > PCI bridges with the flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT. With > > > > > > this, > > > > > > pci_for_each_dma_alias() works correctly for external PCIe devices > > > > > > and > > > > > > SoC PCI devices. > > > > > > > > > > > > For the current revision of Cavium ThunderX2, the VendorID and > > > > > > Device ID > > > > > > are from Broadcom Vulcan (14e4:90XX). > > > > > > > > > > Can you supply some text here about why we want to apply this patch? > > > > > E.g., does it avoid making unnecessary IOMMU mappings, improve > > > > > performance, avoid a crash, etc? > > > > > > > > If this is for the commit message, I hope the following is ok: > > > > > > > > "With this change, both MSI-X and IO virtualization work correctly on > > > > Cavium ThunderX2. The GIC ITS driver gets the correct device ID to > > > > configure MSI-X, the SMMUv3 driver gets the correct Stream IDs for PCI > > > > devices, and the IOMMU groups are setup correctly." > > > > > > This doesn't get at what the actual problem is. I'm hoping for > > > something like "without this change, we set up an IOMMU mapping for > > > requestor ID X, but device DMA uses requestor ID Y because , which > > > results in an IOMMU fault" > > > > Ok. I hope this would be better: > > > > "Without this change, the last alias seen while traversing the PCI > > hierarchy will be used as the RID to generate the device ID for ITS > > and stream ID for SMMU. This in turn causes the MSI-X generated by the > > device to fail since the ITS expects to have translation tables based > > on the actual PCIe RID and not the (irrelevant) alias. Similarly, the > > device DMA also fails when SMMU is enabled due to incorrect value in > > SMMU translation tables" > > This description is true, but I don't think it addresses the real > problem. I think the real problem is that your IOMMU code doesn't > handle aliases correctly, and by ignoring these invalid aliases, we > happen to map an alias that works for the builtin devices. But that's > only because we got lucky (those devices use a single RID and they're > not behind bridges that optionally take ownership). > > It would make sense to me if we fixed the IOMMU code to map *all* the > aliases, which should be enough to make your devices work. If we then > wanted to apply a patch like this on top, it would be simply an > optimization that avoids unnecessary IOMMU mappings. The issue that the IOMMU code does not handle valid aliases is unrelated to what I am trying to fix. The quirk is to make sure that invalid aliases are not seen on ThunderX2 while doing pci_for_each_dma_alias(). The DMA and MSI-X requests leave the PCI/PCIe hierarchy at the point where the SMMU (or ITS) is attached, i.e. at the bridge marked with PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT. The quirk ensures that we don't look for aliases above that point. The toplevel bridge is marked PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT since the on-chip devices are directly connected to the ITS (they do not use SMMU). The top two levels of bridges are not real PCI bridges but just PCI bridge-like things that were added to tie the whole hierarchy together for configuration and enumeration. They do not handle PCI/PCIe transactions in the traditional sense. I think my problem description is still not c
Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling
On Tue, Apr 11, 2017 at 03:27:02PM +, Jayachandran C wrote: > On Tue, Apr 11, 2017 at 08:41:25AM -0500, Bjorn Helgaas wrote: > > [+cc Joerg] > > > > On Tue, Apr 11, 2017 at 07:10:48AM +, Jayachandran C wrote: > > > On Mon, Apr 10, 2017 at 08:28:47PM -0500, Bjorn Helgaas wrote: > > > > Hi Jayachandran, > > > > > > > > On Mon, Apr 03, 2017 at 01:15:04PM +, Jayachandran C wrote: > > > > > The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), the > > > > > PCI > > > > > topology is slightly unusual. For a multi-node system, it looks like: > > > > > > > > > > [node level PCI bridges - one per node] > > > > > [SoC PCI devices with MSI-X but no IOMMU] > > > > > [PCI-PCIe "glue" bridges - upto 14, one per real port below] > > > > > [PCIe real root ports associated with IOMMU and GICv3 ITS] > > > > > [External PCI devices connected to PCIe links] > > > > > > > > > > The top two levels of bridges should have introduced aliases since > > > > > they > > > > > are PCI and PCI/PCIe bridges, but in the case of ThunderX2 they do > > > > > not. > > > > > In the case of external PCIe devices, the "real" root ports are > > > > > connected > > > > > to the SMMU and the GIC ITS, so PCI-PCIe bridge does not introduce an > > > > > alias. The SoC PCI devices are directly connected to the GIC ITS, so > > > > > the > > > > > node level bridges do not introduce an alias either. > > > > > > > > > > To handle this quirk, we mark the real PCIe root ports and node level > > > > > PCI bridges with the flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT. With this, > > > > > pci_for_each_dma_alias() works correctly for external PCIe devices and > > > > > SoC PCI devices. > > > > > > > > > > For the current revision of Cavium ThunderX2, the VendorID and Device > > > > > ID > > > > > are from Broadcom Vulcan (14e4:90XX). > > > > > > > > Can you supply some text here about why we want to apply this patch? > > > > E.g., does it avoid making unnecessary IOMMU mappings, improve > > > > performance, avoid a crash, etc? > > > > > > If this is for the commit message, I hope the following is ok: > > > > > > "With this change, both MSI-X and IO virtualization work correctly on > > > Cavium ThunderX2. The GIC ITS driver gets the correct device ID to > > > configure MSI-X, the SMMUv3 driver gets the correct Stream IDs for PCI > > > devices, and the IOMMU groups are setup correctly." > > > > This doesn't get at what the actual problem is. I'm hoping for > > something like "without this change, we set up an IOMMU mapping for > > requestor ID X, but device DMA uses requestor ID Y because , which > > results in an IOMMU fault" > > Ok. I hope this would be better: > > "Without this change, the last alias seen while traversing the PCI > hierarchy will be used as the RID to generate the device ID for ITS > and stream ID for SMMU. This in turn causes the MSI-X generated by the > device to fail since the ITS expects to have translation tables based > on the actual PCIe RID and not the (irrelevant) alias. Similarly, the > device DMA also fails when SMMU is enabled due to incorrect value in > SMMU translation tables" This description is true, but I don't think it addresses the real problem. I think the real problem is that your IOMMU code doesn't handle aliases correctly, and by ignoring these invalid aliases, we happen to map an alias that works for the builtin devices. But that's only because we got lucky (those devices use a single RID and they're not behind bridges that optionally take ownership). It would make sense to me if we fixed the IOMMU code to map *all* the aliases, which should be enough to make your devices work. If we then wanted to apply a patch like this on top, it would be simply an optimization that avoids unnecessary IOMMU mappings. > > I suspect the reason this patch makes a difference is because the > > current pci_for_each_dma_alias() believes one of those top-level > > bridges is an alias, and the iterator produces it last, so that's the > > one you map. The IOMMU is attached lower down, so that top-level > > bridge is not in fact an alias, but since you only look at the *last* > > one, you don't map the correct aliases from lower down in the tree. > > Exactly. The IORT spec allows a range of RIDs to map to an SMMU, which > means that a PCI RC can multiple SMMUs, each handling a subset of RIDs. > > In the case of Cavium ThunderX2, the RID which we should see on the RC > - if we follow the standard and factor in the aliasing introduced by the > PCI bridge and the PCI/PCIe bridge - is not the RID seen by the SMMU (or > ITS). > > But, if we stop the traversal at the point where SMMU (or ITS) is > attached, we will get the correct RID as seen by these. There is a single "correct RID" for your builtin SATA and USB, but in general there is no single RID. > > Stopping the iterator earlier happens to make the last alias be one of > > the correct ones, but it doesn't solve t
Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling
On 04/11/2017 11:27 AM, Jayachandran C wrote: > On Tue, Apr 11, 2017 at 08:41:25AM -0500, Bjorn Helgaas wrote: >> I suspect the reason this patch makes a difference is because the >> current pci_for_each_dma_alias() believes one of those top-level >> bridges is an alias, and the iterator produces it last, so that's the >> one you map. The IOMMU is attached lower down, so that top-level >> bridge is not in fact an alias, but since you only look at the *last* >> one, you don't map the correct aliases from lower down in the tree. > > Exactly. The IORT spec allows a range of RIDs to map to an SMMU, which > means that a PCI RC can multiple SMMUs, each handling a subset of RIDs. > > In the case of Cavium ThunderX2, the RID which we should see on the RC > - if we follow the standard and factor in the aliasing introduced by the > PCI bridge and the PCI/PCIe bridge - is not the RID seen by the SMMU (or > ITS). > > But, if we stop the traversal at the point where SMMU (or ITS) is > attached, we will get the correct RID as seen by these. Side note that I am trying to get various specifications clarified to promote more of a familiar alternative architecture (x86) approach in the future in which these aren't at different levels in the topology. But to do that requires integrated Root Complex IP with bells/whistles. Jon. -- Computer Architect | Sent from my Fedora powered laptop ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling
On 11/04/17 14:41, Bjorn Helgaas wrote: > [+cc Joerg] > > On Tue, Apr 11, 2017 at 07:10:48AM +, Jayachandran C wrote: >> On Mon, Apr 10, 2017 at 08:28:47PM -0500, Bjorn Helgaas wrote: >>> Hi Jayachandran, >>> >>> On Mon, Apr 03, 2017 at 01:15:04PM +, Jayachandran C wrote: The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), the PCI topology is slightly unusual. For a multi-node system, it looks like: [node level PCI bridges - one per node] [SoC PCI devices with MSI-X but no IOMMU] [PCI-PCIe "glue" bridges - upto 14, one per real port below] [PCIe real root ports associated with IOMMU and GICv3 ITS] [External PCI devices connected to PCIe links] The top two levels of bridges should have introduced aliases since they are PCI and PCI/PCIe bridges, but in the case of ThunderX2 they do not. In the case of external PCIe devices, the "real" root ports are connected to the SMMU and the GIC ITS, so PCI-PCIe bridge does not introduce an alias. The SoC PCI devices are directly connected to the GIC ITS, so the node level bridges do not introduce an alias either. To handle this quirk, we mark the real PCIe root ports and node level PCI bridges with the flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT. With this, pci_for_each_dma_alias() works correctly for external PCIe devices and SoC PCI devices. For the current revision of Cavium ThunderX2, the VendorID and Device ID are from Broadcom Vulcan (14e4:90XX). >>> >>> Can you supply some text here about why we want to apply this patch? >>> E.g., does it avoid making unnecessary IOMMU mappings, improve >>> performance, avoid a crash, etc? >> >> If this is for the commit message, I hope the following is ok: >> >> "With this change, both MSI-X and IO virtualization work correctly on >> Cavium ThunderX2. The GIC ITS driver gets the correct device ID to >> configure MSI-X, the SMMUv3 driver gets the correct Stream IDs for PCI >> devices, and the IOMMU groups are setup correctly." > > This doesn't get at what the actual problem is. I'm hoping for > something like "without this change, we set up an IOMMU mapping for > requestor ID X, but device DMA uses requestor ID Y because , which > results in an IOMMU fault" The fact that certain bits of code only consider the last-seen alias RID does mean we will install all mappings for a RID the IOMMU will never see, causing subsequent attempt to access them (via the real RID) to fault. Even with that fixed, considering the past-the-IOMMU-connection bridges will cause IOMMU group assignment to believe the IOMMU can't distinguish them and thus prevent individual devices being assignable to different VMs. > I've been puzzling over the fact that most of the callers of > pci_for_each_dma_alias() don't seem to use it correctly. For Intel > IOMMUs, domain_context_mapping() uses it to add a mapping for every > possible alias. But most of the other callers only look at the last > alias and ignore all the others. As it happens, I've just been looking into this and reaching the same conclusion (not least because a fair few of of the incorrect uses have my fingerprints all over them). The of_iommu_configure() and corresponding iort_iommu_configure() uses turn out to already be broken WRT DMA phantom functions vs. MSIs, but pretty straightforward to fix (and I now have one of the offending Marvell SATA cards to test things with). The one in its_pci_msi_prepare() turns out to be totally backwards, as it actually wants to iterate through every device which may alias with the given device (any suggestions for the neatest way to do that are most welcome). The other uses in pci_msi_*() are a little trickier, as in general we're assuming every device is associated with just a single ID (certainly for ARM GICv3 this assumption runs quite strongly through the architecture) - Marc and I have chucked some ideas around, but in the short term, it might be easier to just not even pretend to support MSIs from behind aliasing bridges for the DT/IORT cases. Either way, those are also broken for the phantom function case (and the tentative fix I've written will need rethinking in light of this discussion, oh well). > That might work most of the time, > but: > > - There's no guarantee that pci_for_each_dma_alias() iterates in any > particular order, so relying on the current order is fragile, > > - The pci_add_dma_alias() interface allows an arbitrary number of > aliases (as long as they're all on the same bus), and some devices > do use more than one, e.g., quirk_dma_func0_alias(), > quirk_mic_x200_dma_alias(), > > - pci_for_each_dma_alias() translates the rules in the PCIe to > PCI/PCI-X Bridge spec, r1.0, sec 2.3, about taking ownership into > aliases. I think it's important to pay attention to *every* > possible alias, not just the last one. Ha, that exact page
Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling
On Tue, Apr 11, 2017 at 08:41:25AM -0500, Bjorn Helgaas wrote: > [+cc Joerg] > > On Tue, Apr 11, 2017 at 07:10:48AM +, Jayachandran C wrote: > > On Mon, Apr 10, 2017 at 08:28:47PM -0500, Bjorn Helgaas wrote: > > > Hi Jayachandran, > > > > > > On Mon, Apr 03, 2017 at 01:15:04PM +, Jayachandran C wrote: > > > > The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), the > > > > PCI > > > > topology is slightly unusual. For a multi-node system, it looks like: > > > > > > > > [node level PCI bridges - one per node] > > > > [SoC PCI devices with MSI-X but no IOMMU] > > > > [PCI-PCIe "glue" bridges - upto 14, one per real port below] > > > > [PCIe real root ports associated with IOMMU and GICv3 ITS] > > > > [External PCI devices connected to PCIe links] > > > > > > > > The top two levels of bridges should have introduced aliases since they > > > > are PCI and PCI/PCIe bridges, but in the case of ThunderX2 they do not. > > > > In the case of external PCIe devices, the "real" root ports are > > > > connected > > > > to the SMMU and the GIC ITS, so PCI-PCIe bridge does not introduce an > > > > alias. The SoC PCI devices are directly connected to the GIC ITS, so the > > > > node level bridges do not introduce an alias either. > > > > > > > > To handle this quirk, we mark the real PCIe root ports and node level > > > > PCI bridges with the flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT. With this, > > > > pci_for_each_dma_alias() works correctly for external PCIe devices and > > > > SoC PCI devices. > > > > > > > > For the current revision of Cavium ThunderX2, the VendorID and Device ID > > > > are from Broadcom Vulcan (14e4:90XX). > > > > > > Can you supply some text here about why we want to apply this patch? > > > E.g., does it avoid making unnecessary IOMMU mappings, improve > > > performance, avoid a crash, etc? > > > > If this is for the commit message, I hope the following is ok: > > > > "With this change, both MSI-X and IO virtualization work correctly on > > Cavium ThunderX2. The GIC ITS driver gets the correct device ID to > > configure MSI-X, the SMMUv3 driver gets the correct Stream IDs for PCI > > devices, and the IOMMU groups are setup correctly." > > This doesn't get at what the actual problem is. I'm hoping for > something like "without this change, we set up an IOMMU mapping for > requestor ID X, but device DMA uses requestor ID Y because , which > results in an IOMMU fault" Ok. I hope this would be better: "Without this change, the last alias seen while traversing the PCI hierarchy will be used as the RID to generate the device ID for ITS and stream ID for SMMU. This in turn causes the MSI-X generated by the device to fail since the ITS expects to have translation tables based on the actual PCIe RID and not the (irrelevant) alias. Similarly, the device DMA also fails when SMMU is enabled due to incorrect value in SMMU translation tables" > I've been puzzling over the fact that most of the callers of > pci_for_each_dma_alias() don't seem to use it correctly. For Intel > IOMMUs, domain_context_mapping() uses it to add a mapping for every > possible alias. But most of the other callers only look at the last > alias and ignore all the others. That might work most of the time, > but: > > - There's no guarantee that pci_for_each_dma_alias() iterates in any > particular order, so relying on the current order is fragile, > > - The pci_add_dma_alias() interface allows an arbitrary number of > aliases (as long as they're all on the same bus), and some devices > do use more than one, e.g., quirk_dma_func0_alias(), > quirk_mic_x200_dma_alias(), > > - pci_for_each_dma_alias() translates the rules in the PCIe to > PCI/PCI-X Bridge spec, r1.0, sec 2.3, about taking ownership into > aliases. I think it's important to pay attention to *every* > possible alias, not just the last one. pci_for_each_dma_alias() is used by the ARM code to find the RID (Requester ID), and this is taken as the last alias as seen from the PCI controller (RC). The RID is then used to program the Device ID of the GIC ITS (ARM generic interrupt controller's interrupt translation service) for MSI-X (and similarly to program Stream ID of the SMMU). The translation from RID to Device ID or stream ID is provided by the IORT ACPI table[1] or by the a {iommu,msi}-{map,mask} [2] property in the device tree. Taking the last alias maybe reasonable since the mapping is from (PCI RC, RID) to (SMMU, streamID) or (GIC ITS, deviceID) and we are looking for a single the RID for a device as seen from the controller. > I suspect the reason this patch makes a difference is because the > current pci_for_each_dma_alias() believes one of those top-level > bridges is an alias, and the iterator produces it last, so that's the > one you map. The IOMMU is attached lower down, so that top-level > bridge is not in fact an alias, but since you only look at the *last* >
Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling
[+cc Joerg] On Tue, Apr 11, 2017 at 07:10:48AM +, Jayachandran C wrote: > On Mon, Apr 10, 2017 at 08:28:47PM -0500, Bjorn Helgaas wrote: > > Hi Jayachandran, > > > > On Mon, Apr 03, 2017 at 01:15:04PM +, Jayachandran C wrote: > > > The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), the PCI > > > topology is slightly unusual. For a multi-node system, it looks like: > > > > > > [node level PCI bridges - one per node] > > > [SoC PCI devices with MSI-X but no IOMMU] > > > [PCI-PCIe "glue" bridges - upto 14, one per real port below] > > > [PCIe real root ports associated with IOMMU and GICv3 ITS] > > > [External PCI devices connected to PCIe links] > > > > > > The top two levels of bridges should have introduced aliases since they > > > are PCI and PCI/PCIe bridges, but in the case of ThunderX2 they do not. > > > In the case of external PCIe devices, the "real" root ports are connected > > > to the SMMU and the GIC ITS, so PCI-PCIe bridge does not introduce an > > > alias. The SoC PCI devices are directly connected to the GIC ITS, so the > > > node level bridges do not introduce an alias either. > > > > > > To handle this quirk, we mark the real PCIe root ports and node level > > > PCI bridges with the flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT. With this, > > > pci_for_each_dma_alias() works correctly for external PCIe devices and > > > SoC PCI devices. > > > > > > For the current revision of Cavium ThunderX2, the VendorID and Device ID > > > are from Broadcom Vulcan (14e4:90XX). > > > > Can you supply some text here about why we want to apply this patch? > > E.g., does it avoid making unnecessary IOMMU mappings, improve > > performance, avoid a crash, etc? > > If this is for the commit message, I hope the following is ok: > > "With this change, both MSI-X and IO virtualization work correctly on > Cavium ThunderX2. The GIC ITS driver gets the correct device ID to > configure MSI-X, the SMMUv3 driver gets the correct Stream IDs for PCI > devices, and the IOMMU groups are setup correctly." This doesn't get at what the actual problem is. I'm hoping for something like "without this change, we set up an IOMMU mapping for requestor ID X, but device DMA uses requestor ID Y because , which results in an IOMMU fault" I've been puzzling over the fact that most of the callers of pci_for_each_dma_alias() don't seem to use it correctly. For Intel IOMMUs, domain_context_mapping() uses it to add a mapping for every possible alias. But most of the other callers only look at the last alias and ignore all the others. That might work most of the time, but: - There's no guarantee that pci_for_each_dma_alias() iterates in any particular order, so relying on the current order is fragile, - The pci_add_dma_alias() interface allows an arbitrary number of aliases (as long as they're all on the same bus), and some devices do use more than one, e.g., quirk_dma_func0_alias(), quirk_mic_x200_dma_alias(), - pci_for_each_dma_alias() translates the rules in the PCIe to PCI/PCI-X Bridge spec, r1.0, sec 2.3, about taking ownership into aliases. I think it's important to pay attention to *every* possible alias, not just the last one. I suspect the reason this patch makes a difference is because the current pci_for_each_dma_alias() believes one of those top-level bridges is an alias, and the iterator produces it last, so that's the one you map. The IOMMU is attached lower down, so that top-level bridge is not in fact an alias, but since you only look at the *last* one, you don't map the correct aliases from lower down in the tree. Stopping the iterator earlier happens to make the last alias be one of the correct ones, but it doesn't solve the problems of quirked devices that can use multiple requester IDs, and it doesn't solve the problem of PCIe-to-PCI bridges that optionally take ownership of transactions. > I can send out a new patch if needed. > > The on chip SATA and USB use MSI-X, so this is needed for basic > functionality of the platform. No need for a new patch; I can integrate something into the changelog. > > > Signed-off-by: Jayachandran C > > > --- > > > drivers/pci/quirks.c | 14 ++ > > > 1 file changed, 14 insertions(+) > > > > > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c > > > index 6736836..564a84a 100644 > > > --- a/drivers/pci/quirks.c > > > +++ b/drivers/pci/quirks.c > > > @@ -3958,6 +3958,20 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, > > > 0x2260, quirk_mic_x200_dma_alias); > > > DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2264, > > > quirk_mic_x200_dma_alias); > > > > > > /* > > > + * The IOMMU and interrupt controller on Broadcom Vulcan/Cavium > > > ThunderX2 are > > > + * associated not at the root bus, but at a bridge below. This quirk flag > > > + * will ensure that the aliases are identified correctly. > > > + */ > > > +static void quirk_bridge_cavm_thrx2_pcie_roo
Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling
On Mon, Apr 10, 2017 at 08:28:47PM -0500, Bjorn Helgaas wrote: > Hi Jayachandran, > > On Mon, Apr 03, 2017 at 01:15:04PM +, Jayachandran C wrote: > > The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), the PCI > > topology is slightly unusual. For a multi-node system, it looks like: > > > > [node level PCI bridges - one per node] > > [SoC PCI devices with MSI-X but no IOMMU] > > [PCI-PCIe "glue" bridges - upto 14, one per real port below] > > [PCIe real root ports associated with IOMMU and GICv3 ITS] > > [External PCI devices connected to PCIe links] > > > > The top two levels of bridges should have introduced aliases since they > > are PCI and PCI/PCIe bridges, but in the case of ThunderX2 they do not. > > In the case of external PCIe devices, the "real" root ports are connected > > to the SMMU and the GIC ITS, so PCI-PCIe bridge does not introduce an > > alias. The SoC PCI devices are directly connected to the GIC ITS, so the > > node level bridges do not introduce an alias either. > > > > To handle this quirk, we mark the real PCIe root ports and node level > > PCI bridges with the flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT. With this, > > pci_for_each_dma_alias() works correctly for external PCIe devices and > > SoC PCI devices. > > > > For the current revision of Cavium ThunderX2, the VendorID and Device ID > > are from Broadcom Vulcan (14e4:90XX). > > Can you supply some text here about why we want to apply this patch? > E.g., does it avoid making unnecessary IOMMU mappings, improve > performance, avoid a crash, etc? If this is for the commit message, I hope the following is ok: "With this change, both MSI-X and IO virtualization work correctly on Cavium ThunderX2. The GIC ITS driver gets the correct device ID to configure MSI-X, the SMMUv3 driver gets the correct Stream IDs for PCI devices, and the IOMMU groups are setup correctly." I can send out a new patch if needed. The on chip SATA and USB use MSI-X, so this is needed for basic functionality of the platform. > > > Signed-off-by: Jayachandran C > > --- > > drivers/pci/quirks.c | 14 ++ > > 1 file changed, 14 insertions(+) > > > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c > > index 6736836..564a84a 100644 > > --- a/drivers/pci/quirks.c > > +++ b/drivers/pci/quirks.c > > @@ -3958,6 +3958,20 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, > > 0x2260, quirk_mic_x200_dma_alias); > > DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2264, > > quirk_mic_x200_dma_alias); > > > > /* > > + * The IOMMU and interrupt controller on Broadcom Vulcan/Cavium ThunderX2 > > are > > + * associated not at the root bus, but at a bridge below. This quirk flag > > + * will ensure that the aliases are identified correctly. > > + */ > > +static void quirk_bridge_cavm_thrx2_pcie_root(struct pci_dev *pdev) > > +{ > > + pdev->dev_flags |= PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT; > > +} > > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9000, > > + quirk_bridge_cavm_thrx2_pcie_root); > > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9084, > > + quirk_bridge_cavm_thrx2_pcie_root); > > + > > +/* > > * Intersil/Techwell TW686[4589]-based video capture cards have an empty > > (zero) > > * class code. Fix it. > > */ Thanks, JC. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling
Hi Jayachandran, On Mon, Apr 03, 2017 at 01:15:04PM +, Jayachandran C wrote: > The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), the PCI > topology is slightly unusual. For a multi-node system, it looks like: > > [node level PCI bridges - one per node] > [SoC PCI devices with MSI-X but no IOMMU] > [PCI-PCIe "glue" bridges - upto 14, one per real port below] > [PCIe real root ports associated with IOMMU and GICv3 ITS] > [External PCI devices connected to PCIe links] > > The top two levels of bridges should have introduced aliases since they > are PCI and PCI/PCIe bridges, but in the case of ThunderX2 they do not. > In the case of external PCIe devices, the "real" root ports are connected > to the SMMU and the GIC ITS, so PCI-PCIe bridge does not introduce an > alias. The SoC PCI devices are directly connected to the GIC ITS, so the > node level bridges do not introduce an alias either. > > To handle this quirk, we mark the real PCIe root ports and node level > PCI bridges with the flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT. With this, > pci_for_each_dma_alias() works correctly for external PCIe devices and > SoC PCI devices. > > For the current revision of Cavium ThunderX2, the VendorID and Device ID > are from Broadcom Vulcan (14e4:90XX). Can you supply some text here about why we want to apply this patch? E.g., does it avoid making unnecessary IOMMU mappings, improve performance, avoid a crash, etc? > Signed-off-by: Jayachandran C > --- > drivers/pci/quirks.c | 14 ++ > 1 file changed, 14 insertions(+) > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c > index 6736836..564a84a 100644 > --- a/drivers/pci/quirks.c > +++ b/drivers/pci/quirks.c > @@ -3958,6 +3958,20 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2260, > quirk_mic_x200_dma_alias); > DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2264, > quirk_mic_x200_dma_alias); > > /* > + * The IOMMU and interrupt controller on Broadcom Vulcan/Cavium ThunderX2 are > + * associated not at the root bus, but at a bridge below. This quirk flag > + * will ensure that the aliases are identified correctly. > + */ > +static void quirk_bridge_cavm_thrx2_pcie_root(struct pci_dev *pdev) > +{ > + pdev->dev_flags |= PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT; > +} > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9000, > + quirk_bridge_cavm_thrx2_pcie_root); > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9084, > + quirk_bridge_cavm_thrx2_pcie_root); > + > +/* > * Intersil/Techwell TW686[4589]-based video capture cards have an empty > (zero) > * class code. Fix it. > */ > -- > 2.7.4 > > > ___ > linux-arm-kernel mailing list > linux-arm-ker...@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling
[Moving Bjorn back to to: ] On Tue, Apr 04, 2017 at 03:28:26PM +0100, Robin Murphy wrote: > On 04/04/17 12:50, Jayachandran C wrote: > > On Mon, Apr 03, 2017 at 04:07:53PM +0100, Robin Murphy wrote: > >> On 03/04/17 14:15, Jayachandran C wrote: > >>> The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), the PCI > >>> topology is slightly unusual. For a multi-node system, it looks like: > >>> > >>> [node level PCI bridges - one per node] > >>> [SoC PCI devices with MSI-X but no IOMMU] > >>> [PCI-PCIe "glue" bridges - upto 14, one per real port below] > >>> [PCIe real root ports associated with IOMMU and GICv3 ITS] > >>> [External PCI devices connected to PCIe links] > >> > >> Since it's not entirely obvious, what does the actual DT - or IORT if > >> you must ;) - topology for this look like? I can't help thinking that > >> either it's inaccurate, or that this is going to expose a shortcoming in > >> pci_dma_configure() which breaks things - unless I'm missing something, > >> isn't find_pci_root_bus() going to go all the way up to the top-level > >> glue bridge and pick up the wrong firmware node (if any) for the > >> appropriate DMA properties? > > > > I will try to describe the ACPI interface: > > > > There is just one ECAM area, a single bus range and one set of memory > > windows for the whole system - so there is just one entry in DSDT for > > the PCI controller. This entry also corresponds to the PCI RC node in > > IORT. DMA is coherent and supports 64 bits system-wide, the attributes > > (in DSDT and IORT) reflect this. > > > > lspci on the system looks like this: > > -[:00]-+-00.0-[01-1e]--+-04.0 14e4:9026 > >| +-04.1 14e4:9026 > >| +-05.0 14e4:9027 > >| +-05.1 14e4:9027 > >| +-0a.0-[02-03]00.0-[03]-- > >| +-0a.1-[04-05]00.0-[05]-- > >| [...etc...] > >| +-0b.0-[12-14]00.0-[13-14]--+-00.0 8086:1583 > >| | \-00.1 8086:1583 > >| [...etc...] > >| \-0b.5-[1d-1e]00.0-[1e]-- > >\-00.1-[1f-3b]--+-04.0 14e4:9026 > >+-04.1 14e4:9026 > >+-05.0 14e4:9027 > >+-05.1 14e4:9027 > >+-0a.0-[20-21]00.0-[21]-- > >[...etc...] > > > > The devices here are: > > - 00:00.0 and 00:00.1 are the node (socket) level bridges > > - 01:[45].x and 1f:[45].x are SoC PCI devices like SATA and USB > > - 01:[ab].x and 1f:[ab].x are the PCI-PCIe "reverse"/glue bridges > > - 02:00.0 etc are the "real" PCIe ports connected to external PCIe cards. > > Each node has a GIC ITS, and a group of 4 PCIe ports have an SMMU. > > > > The IORT is built by the firmware based on its PCI enumeration. The IORT > > will have multiple entries under the PCI RC node: > > - one entry per node to map the SoC devices directly to ITS for MSI-X, > >since the SoC devices are not attached to any SMMU. > > - An entry per "real" PCIe port to map RIDs under it to the corresponding > >SMMU. > > The SMMU nodes will have an entry to map its RID ranges to the node ITS. > > > > The IORT spec supports this configuration, and the corresponding code is > > already upstream, so the only sticking point right now is > > pci_for_each_dma_alias(). > > Thanks, that helps a lot. The "single global ECAM space" idea was > eluding me, but in that context it all makes much more sense - I'm > assuming the two quirked device IDs correspond to the 00:00.[01] devices > and the [02-1e]:00.0 ones. > > So (at the risk of Jon mooing at me), I guess the DT description would > be a single node looking something like: > > pcie { > reg = [global ECAM space for segment ]; > > msi-map = <0x0100 &its0 0x0100 0x1d00>, > <0x1f00 &its1 0x1f00 0x1d00>; > iommu-map = <0x0200 &smmu0 0x0200 0x1c00>, > <0x2000 &smmu0 0x2000 0x1c00>; > }; > > (note to self: which incidentally also means of_pci_map_rid() probably > wants fixing to not treat gaps in the map as an error) > > With only one node like that, rather than having the whole first 3 > levels of bridges described, the "stop at the appropriate node in the > callback" approach does become even more impractical in all cases. So, > for $TITLE, based on the above understanding: > > Reviewed-by: Robin Murphy Hi Bjorn, This seems to be the reasonable way to add support for the quirk. Would really appreciate feedback from you. Thanks, JC. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling
On 04/04/17 12:50, Jayachandran C wrote: > On Mon, Apr 03, 2017 at 04:07:53PM +0100, Robin Murphy wrote: >> On 03/04/17 14:15, Jayachandran C wrote: >>> The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), the PCI >>> topology is slightly unusual. For a multi-node system, it looks like: >>> >>> [node level PCI bridges - one per node] >>> [SoC PCI devices with MSI-X but no IOMMU] >>> [PCI-PCIe "glue" bridges - upto 14, one per real port below] >>> [PCIe real root ports associated with IOMMU and GICv3 ITS] >>> [External PCI devices connected to PCIe links] >> >> Since it's not entirely obvious, what does the actual DT - or IORT if >> you must ;) - topology for this look like? I can't help thinking that >> either it's inaccurate, or that this is going to expose a shortcoming in >> pci_dma_configure() which breaks things - unless I'm missing something, >> isn't find_pci_root_bus() going to go all the way up to the top-level >> glue bridge and pick up the wrong firmware node (if any) for the >> appropriate DMA properties? > > I will try to describe the ACPI interface: > > There is just one ECAM area, a single bus range and one set of memory > windows for the whole system - so there is just one entry in DSDT for > the PCI controller. This entry also corresponds to the PCI RC node in > IORT. DMA is coherent and supports 64 bits system-wide, the attributes > (in DSDT and IORT) reflect this. > > lspci on the system looks like this: > -[:00]-+-00.0-[01-1e]--+-04.0 14e4:9026 >| +-04.1 14e4:9026 >| +-05.0 14e4:9027 >| +-05.1 14e4:9027 >| +-0a.0-[02-03]00.0-[03]-- >| +-0a.1-[04-05]00.0-[05]-- >| [...etc...] >| +-0b.0-[12-14]00.0-[13-14]--+-00.0 8086:1583 >| | \-00.1 8086:1583 >| [...etc...] >| \-0b.5-[1d-1e]00.0-[1e]-- >\-00.1-[1f-3b]--+-04.0 14e4:9026 >+-04.1 14e4:9026 >+-05.0 14e4:9027 >+-05.1 14e4:9027 >+-0a.0-[20-21]00.0-[21]-- >[...etc...] > > The devices here are: > - 00:00.0 and 00:00.1 are the node (socket) level bridges > - 01:[45].x and 1f:[45].x are SoC PCI devices like SATA and USB > - 01:[ab].x and 1f:[ab].x are the PCI-PCIe "reverse"/glue bridges > - 02:00.0 etc are the "real" PCIe ports connected to external PCIe cards. > Each node has a GIC ITS, and a group of 4 PCIe ports have an SMMU. > > The IORT is built by the firmware based on its PCI enumeration. The IORT > will have multiple entries under the PCI RC node: > - one entry per node to map the SoC devices directly to ITS for MSI-X, >since the SoC devices are not attached to any SMMU. > - An entry per "real" PCIe port to map RIDs under it to the corresponding >SMMU. > The SMMU nodes will have an entry to map its RID ranges to the node ITS. > > The IORT spec supports this configuration, and the corresponding code is > already upstream, so the only sticking point right now is > pci_for_each_dma_alias(). Thanks, that helps a lot. The "single global ECAM space" idea was eluding me, but in that context it all makes much more sense - I'm assuming the two quirked device IDs correspond to the 00:00.[01] devices and the [02-1e]:00.0 ones. So (at the risk of Jon mooing at me), I guess the DT description would be a single node looking something like: pcie { reg = [global ECAM space for segment ]; msi-map = <0x0100 &its0 0x0100 0x1d00>, <0x1f00 &its1 0x1f00 0x1d00>; iommu-map = <0x0200 &smmu0 0x0200 0x1c00>, <0x2000 &smmu0 0x2000 0x1c00>; }; (note to self: which incidentally also means of_pci_map_rid() probably wants fixing to not treat gaps in the map as an error) With only one node like that, rather than having the whole first 3 levels of bridges described, the "stop at the appropriate node in the callback" approach does become even more impractical in all cases. So, for $TITLE, based on the above understanding: Reviewed-by: Robin Murphy Cheers, Robin. > > JC. > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling
On Mon, Apr 03, 2017 at 04:07:53PM +0100, Robin Murphy wrote: > On 03/04/17 14:15, Jayachandran C wrote: > > The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), the PCI > > topology is slightly unusual. For a multi-node system, it looks like: > > > > [node level PCI bridges - one per node] > > [SoC PCI devices with MSI-X but no IOMMU] > > [PCI-PCIe "glue" bridges - upto 14, one per real port below] > > [PCIe real root ports associated with IOMMU and GICv3 ITS] > > [External PCI devices connected to PCIe links] > > Since it's not entirely obvious, what does the actual DT - or IORT if > you must ;) - topology for this look like? I can't help thinking that > either it's inaccurate, or that this is going to expose a shortcoming in > pci_dma_configure() which breaks things - unless I'm missing something, > isn't find_pci_root_bus() going to go all the way up to the top-level > glue bridge and pick up the wrong firmware node (if any) for the > appropriate DMA properties? I will try to describe the ACPI interface: There is just one ECAM area, a single bus range and one set of memory windows for the whole system - so there is just one entry in DSDT for the PCI controller. This entry also corresponds to the PCI RC node in IORT. DMA is coherent and supports 64 bits system-wide, the attributes (in DSDT and IORT) reflect this. lspci on the system looks like this: -[:00]-+-00.0-[01-1e]--+-04.0 14e4:9026 | +-04.1 14e4:9026 | +-05.0 14e4:9027 | +-05.1 14e4:9027 | +-0a.0-[02-03]00.0-[03]-- | +-0a.1-[04-05]00.0-[05]-- | [...etc...] | +-0b.0-[12-14]00.0-[13-14]--+-00.0 8086:1583 | | \-00.1 8086:1583 | [...etc...] | \-0b.5-[1d-1e]00.0-[1e]-- \-00.1-[1f-3b]--+-04.0 14e4:9026 +-04.1 14e4:9026 +-05.0 14e4:9027 +-05.1 14e4:9027 +-0a.0-[20-21]00.0-[21]-- [...etc...] The devices here are: - 00:00.0 and 00:00.1 are the node (socket) level bridges - 01:[45].x and 1f:[45].x are SoC PCI devices like SATA and USB - 01:[ab].x and 1f:[ab].x are the PCI-PCIe "reverse"/glue bridges - 02:00.0 etc are the "real" PCIe ports connected to external PCIe cards. Each node has a GIC ITS, and a group of 4 PCIe ports have an SMMU. The IORT is built by the firmware based on its PCI enumeration. The IORT will have multiple entries under the PCI RC node: - one entry per node to map the SoC devices directly to ITS for MSI-X, since the SoC devices are not attached to any SMMU. - An entry per "real" PCIe port to map RIDs under it to the corresponding SMMU. The SMMU nodes will have an entry to map its RID ranges to the node ITS. The IORT spec supports this configuration, and the corresponding code is already upstream, so the only sticking point right now is pci_for_each_dma_alias(). JC. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling
On 03/04/17 14:15, Jayachandran C wrote: > The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), the PCI > topology is slightly unusual. For a multi-node system, it looks like: > > [node level PCI bridges - one per node] > [SoC PCI devices with MSI-X but no IOMMU] > [PCI-PCIe "glue" bridges - upto 14, one per real port below] > [PCIe real root ports associated with IOMMU and GICv3 ITS] > [External PCI devices connected to PCIe links] Since it's not entirely obvious, what does the actual DT - or IORT if you must ;) - topology for this look like? I can't help thinking that either it's inaccurate, or that this is going to expose a shortcoming in pci_dma_configure() which breaks things - unless I'm missing something, isn't find_pci_root_bus() going to go all the way up to the top-level glue bridge and pick up the wrong firmware node (if any) for the appropriate DMA properties? Robin. > The top two levels of bridges should have introduced aliases since they > are PCI and PCI/PCIe bridges, but in the case of ThunderX2 they do not. > In the case of external PCIe devices, the "real" root ports are connected > to the SMMU and the GIC ITS, so PCI-PCIe bridge does not introduce an > alias. The SoC PCI devices are directly connected to the GIC ITS, so the > node level bridges do not introduce an alias either. > > To handle this quirk, we mark the real PCIe root ports and node level > PCI bridges with the flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT. With this, > pci_for_each_dma_alias() works correctly for external PCIe devices and > SoC PCI devices. > > For the current revision of Cavium ThunderX2, the VendorID and Device ID > are from Broadcom Vulcan (14e4:90XX). > > Signed-off-by: Jayachandran C > --- > drivers/pci/quirks.c | 14 ++ > 1 file changed, 14 insertions(+) > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c > index 6736836..564a84a 100644 > --- a/drivers/pci/quirks.c > +++ b/drivers/pci/quirks.c > @@ -3958,6 +3958,20 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2260, > quirk_mic_x200_dma_alias); > DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2264, > quirk_mic_x200_dma_alias); > > /* > + * The IOMMU and interrupt controller on Broadcom Vulcan/Cavium ThunderX2 are > + * associated not at the root bus, but at a bridge below. This quirk flag > + * will ensure that the aliases are identified correctly. > + */ > +static void quirk_bridge_cavm_thrx2_pcie_root(struct pci_dev *pdev) > +{ > + pdev->dev_flags |= PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT; > +} > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9000, > + quirk_bridge_cavm_thrx2_pcie_root); > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9084, > + quirk_bridge_cavm_thrx2_pcie_root); > + > +/* > * Intersil/Techwell TW686[4589]-based video capture cards have an empty > (zero) > * class code. Fix it. > */ > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling
The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), the PCI topology is slightly unusual. For a multi-node system, it looks like: [node level PCI bridges - one per node] [SoC PCI devices with MSI-X but no IOMMU] [PCI-PCIe "glue" bridges - upto 14, one per real port below] [PCIe real root ports associated with IOMMU and GICv3 ITS] [External PCI devices connected to PCIe links] The top two levels of bridges should have introduced aliases since they are PCI and PCI/PCIe bridges, but in the case of ThunderX2 they do not. In the case of external PCIe devices, the "real" root ports are connected to the SMMU and the GIC ITS, so PCI-PCIe bridge does not introduce an alias. The SoC PCI devices are directly connected to the GIC ITS, so the node level bridges do not introduce an alias either. To handle this quirk, we mark the real PCIe root ports and node level PCI bridges with the flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT. With this, pci_for_each_dma_alias() works correctly for external PCIe devices and SoC PCI devices. For the current revision of Cavium ThunderX2, the VendorID and Device ID are from Broadcom Vulcan (14e4:90XX). Signed-off-by: Jayachandran C --- drivers/pci/quirks.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 6736836..564a84a 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -3958,6 +3958,20 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2260, quirk_mic_x200_dma_alias); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2264, quirk_mic_x200_dma_alias); /* + * The IOMMU and interrupt controller on Broadcom Vulcan/Cavium ThunderX2 are + * associated not at the root bus, but at a bridge below. This quirk flag + * will ensure that the aliases are identified correctly. + */ +static void quirk_bridge_cavm_thrx2_pcie_root(struct pci_dev *pdev) +{ + pdev->dev_flags |= PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT; +} +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9000, + quirk_bridge_cavm_thrx2_pcie_root); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9084, + quirk_bridge_cavm_thrx2_pcie_root); + +/* * Intersil/Techwell TW686[4589]-based video capture cards have an empty (zero) * class code. Fix it. */ -- 2.7.4 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu