Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling

2017-04-12 Thread Jon Masters
On 04/04/2017 10:28 AM, Robin Murphy wrote:

> So (at the risk of Jon mooing at me)

m


-- 
Computer Architect | Sent from my Fedora powered laptop
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling

2017-04-12 Thread Bjorn Helgaas
On Wed, Apr 12, 2017 at 08:41:20PM +, Jayachandran C wrote:
> On Wed, Apr 12, 2017 at 02:11:38PM -0500, Bjorn Helgaas wrote:
> > On Wed, Apr 12, 2017 at 06:10:34PM +, Jayachandran C wrote:
> > > On Wed, Apr 12, 2017 at 11:21:18AM -0500, Bjorn Helgaas wrote:
> > > > On Tue, Apr 11, 2017 at 03:27:02PM +, Jayachandran C wrote:
> > > > > On Tue, Apr 11, 2017 at 08:41:25AM -0500, Bjorn Helgaas wrote:
> > > > > > [+cc Joerg]
> > > > > > 
> > > > > > On Tue, Apr 11, 2017 at 07:10:48AM +, Jayachandran C wrote:
> > > > > > > On Mon, Apr 10, 2017 at 08:28:47PM -0500, Bjorn Helgaas wrote:
> > > > > > > > Hi Jayachandran,
> > > > > > > > 
> > > > > > > > On Mon, Apr 03, 2017 at 01:15:04PM +, Jayachandran C wrote:
> > > > > > > > > The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan 
> > > > > > > > > earlier), the PCI
> > > > > > > > > topology is slightly unusual. For a multi-node system, it 
> > > > > > > > > looks like:
> > > > > > > > > 
> > > > > > > > > [node level PCI bridges - one per node]
> > > > > > > > > [SoC PCI devices with MSI-X but no IOMMU]
> > > > > > > > > [PCI-PCIe "glue" bridges - upto 14, one per real port 
> > > > > > > > > below]
> > > > > > > > > [PCIe real root ports associated with IOMMU and GICv3 
> > > > > > > > > ITS]
> > > > > > > > > [External PCI devices connected to PCIe links]
> > > > > > > > > 
> > > > > > > > > The top two levels of bridges should have introduced aliases 
> > > > > > > > > since they
> > > > > > > > > are PCI and PCI/PCIe bridges, but in the case of ThunderX2 
> > > > > > > > > they do not.
> > > > > > > > > In the case of external PCIe devices, the "real" root ports 
> > > > > > > > > are connected
> > > > > > > > > to the SMMU and the GIC ITS, so PCI-PCIe bridge does not 
> > > > > > > > > introduce an
> > > > > > > > > alias. The SoC PCI devices are directly connected to the GIC 
> > > > > > > > > ITS, so the
> > > > > > > > > node level bridges do not introduce an alias either.
> > > > > > > > > 
> > > > > > > > > To handle this quirk, we mark the real PCIe root ports and 
> > > > > > > > > node level
> > > > > > > > > PCI bridges with the flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT.  
> > > > > > > > > With this,
> > > > > > > > > pci_for_each_dma_alias() works correctly for external PCIe 
> > > > > > > > > devices and
> > > > > > > > > SoC PCI devices.
> > > > > > > > > 
> > > > > > > > > For the current revision of Cavium ThunderX2, the VendorID 
> > > > > > > > > and Device ID
> > > > > > > > > are from Broadcom Vulcan (14e4:90XX).
> > > > > > > > 
> > > > > > > > Can you supply some text here about why we want to apply this 
> > > > > > > > patch?
> > > > > > > > E.g., does it avoid making unnecessary IOMMU mappings, improve
> > > > > > > > performance, avoid a crash, etc?
> > > > > > > 
> > > > > > > If this is for the commit message, I hope the following is ok:
> > > > > > > 
> > > > > > > "With this change, both MSI-X and IO virtualization work 
> > > > > > > correctly on
> > > > > > > Cavium ThunderX2. The GIC ITS driver gets the correct device ID to
> > > > > > > configure MSI-X, the SMMUv3 driver gets the correct Stream IDs 
> > > > > > > for PCI
> > > > > > > devices, and the IOMMU groups are setup correctly."
> > > > > > 
> > > > > > This doesn't get at what the actual problem is.  I'm hoping for
> > > > > > something like "without this change, we set up an IOMMU mapping for
> > > > > > requestor ID X, but device DMA uses requestor ID Y because , 
> > > > > > which
> > > > > > results in an IOMMU fault"
> > > > > 
> > > > > Ok. I hope this would be better:
> > > > > 
> > > > > "Without this change, the last alias seen while traversing the PCI
> > > > > hierarchy will be used as the RID to generate the device ID for ITS
> > > > > and stream ID for SMMU. This in turn causes the MSI-X generated by the
> > > > > device to fail since the ITS expects to have translation tables based
> > > > > on the actual PCIe RID and not the (irrelevant) alias. Similarly, the
> > > > > device DMA also fails when SMMU is enabled due to incorrect value in
> > > > > SMMU translation tables"
> > > > 
> > > > This description is true, but I don't think it addresses the real
> > > > problem.  I think the real problem is that your IOMMU code doesn't
> > > > handle aliases correctly, and by ignoring these invalid aliases, we
> > > > happen to map an alias that works for the builtin devices.  But that's
> > > > only because we got lucky (those devices use a single RID and they're
> > > > not behind bridges that optionally take ownership).
> > > > 
> > > > It would make sense to me if we fixed the IOMMU code to map *all* the
> > > > aliases, which should be enough to make your devices work.  If we then
> > > > wanted to apply a patch like this on top, it would be simply an
> > > > optimization that avoids unnecessary IOMMU mappings.
> > > 
> > > The issue that the IOMMU code does not handle valid aliases is
> > > unrelated to what

Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling

2017-04-12 Thread Jayachandran C
On Wed, Apr 12, 2017 at 02:11:38PM -0500, Bjorn Helgaas wrote:
> On Wed, Apr 12, 2017 at 06:10:34PM +, Jayachandran C wrote:
> > On Wed, Apr 12, 2017 at 11:21:18AM -0500, Bjorn Helgaas wrote:
> > > On Tue, Apr 11, 2017 at 03:27:02PM +, Jayachandran C wrote:
> > > > On Tue, Apr 11, 2017 at 08:41:25AM -0500, Bjorn Helgaas wrote:
> > > > > [+cc Joerg]
> > > > > 
> > > > > On Tue, Apr 11, 2017 at 07:10:48AM +, Jayachandran C wrote:
> > > > > > On Mon, Apr 10, 2017 at 08:28:47PM -0500, Bjorn Helgaas wrote:
> > > > > > > Hi Jayachandran,
> > > > > > > 
> > > > > > > On Mon, Apr 03, 2017 at 01:15:04PM +, Jayachandran C wrote:
> > > > > > > > The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan 
> > > > > > > > earlier), the PCI
> > > > > > > > topology is slightly unusual. For a multi-node system, it looks 
> > > > > > > > like:
> > > > > > > > 
> > > > > > > > [node level PCI bridges - one per node]
> > > > > > > > [SoC PCI devices with MSI-X but no IOMMU]
> > > > > > > > [PCI-PCIe "glue" bridges - upto 14, one per real port below]
> > > > > > > > [PCIe real root ports associated with IOMMU and GICv3 
> > > > > > > > ITS]
> > > > > > > > [External PCI devices connected to PCIe links]
> > > > > > > > 
> > > > > > > > The top two levels of bridges should have introduced aliases 
> > > > > > > > since they
> > > > > > > > are PCI and PCI/PCIe bridges, but in the case of ThunderX2 they 
> > > > > > > > do not.
> > > > > > > > In the case of external PCIe devices, the "real" root ports are 
> > > > > > > > connected
> > > > > > > > to the SMMU and the GIC ITS, so PCI-PCIe bridge does not 
> > > > > > > > introduce an
> > > > > > > > alias. The SoC PCI devices are directly connected to the GIC 
> > > > > > > > ITS, so the
> > > > > > > > node level bridges do not introduce an alias either.
> > > > > > > > 
> > > > > > > > To handle this quirk, we mark the real PCIe root ports and node 
> > > > > > > > level
> > > > > > > > PCI bridges with the flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT.  
> > > > > > > > With this,
> > > > > > > > pci_for_each_dma_alias() works correctly for external PCIe 
> > > > > > > > devices and
> > > > > > > > SoC PCI devices.
> > > > > > > > 
> > > > > > > > For the current revision of Cavium ThunderX2, the VendorID and 
> > > > > > > > Device ID
> > > > > > > > are from Broadcom Vulcan (14e4:90XX).
> > > > > > > 
> > > > > > > Can you supply some text here about why we want to apply this 
> > > > > > > patch?
> > > > > > > E.g., does it avoid making unnecessary IOMMU mappings, improve
> > > > > > > performance, avoid a crash, etc?
> > > > > > 
> > > > > > If this is for the commit message, I hope the following is ok:
> > > > > > 
> > > > > > "With this change, both MSI-X and IO virtualization work correctly 
> > > > > > on
> > > > > > Cavium ThunderX2. The GIC ITS driver gets the correct device ID to
> > > > > > configure MSI-X, the SMMUv3 driver gets the correct Stream IDs for 
> > > > > > PCI
> > > > > > devices, and the IOMMU groups are setup correctly."
> > > > > 
> > > > > This doesn't get at what the actual problem is.  I'm hoping for
> > > > > something like "without this change, we set up an IOMMU mapping for
> > > > > requestor ID X, but device DMA uses requestor ID Y because , which
> > > > > results in an IOMMU fault"
> > > > 
> > > > Ok. I hope this would be better:
> > > > 
> > > > "Without this change, the last alias seen while traversing the PCI
> > > > hierarchy will be used as the RID to generate the device ID for ITS
> > > > and stream ID for SMMU. This in turn causes the MSI-X generated by the
> > > > device to fail since the ITS expects to have translation tables based
> > > > on the actual PCIe RID and not the (irrelevant) alias. Similarly, the
> > > > device DMA also fails when SMMU is enabled due to incorrect value in
> > > > SMMU translation tables"
> > > 
> > > This description is true, but I don't think it addresses the real
> > > problem.  I think the real problem is that your IOMMU code doesn't
> > > handle aliases correctly, and by ignoring these invalid aliases, we
> > > happen to map an alias that works for the builtin devices.  But that's
> > > only because we got lucky (those devices use a single RID and they're
> > > not behind bridges that optionally take ownership).
> > > 
> > > It would make sense to me if we fixed the IOMMU code to map *all* the
> > > aliases, which should be enough to make your devices work.  If we then
> > > wanted to apply a patch like this on top, it would be simply an
> > > optimization that avoids unnecessary IOMMU mappings.
> > 
> > The issue that the IOMMU code does not handle valid aliases is
> > unrelated to what I am trying to fix. The quirk is to make sure
> > that invalid aliases are not seen on ThunderX2 while doing
> > pci_for_each_dma_alias().
> > 
> > The DMA and MSI-X requests leave the PCI/PCIe hierarchy at the point
> > where the SMMU (or ITS) is attached, i.e. at the bridge 

Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling

2017-04-12 Thread Bjorn Helgaas
On Wed, Apr 12, 2017 at 06:10:34PM +, Jayachandran C wrote:
> On Wed, Apr 12, 2017 at 11:21:18AM -0500, Bjorn Helgaas wrote:
> > On Tue, Apr 11, 2017 at 03:27:02PM +, Jayachandran C wrote:
> > > On Tue, Apr 11, 2017 at 08:41:25AM -0500, Bjorn Helgaas wrote:
> > > > [+cc Joerg]
> > > > 
> > > > On Tue, Apr 11, 2017 at 07:10:48AM +, Jayachandran C wrote:
> > > > > On Mon, Apr 10, 2017 at 08:28:47PM -0500, Bjorn Helgaas wrote:
> > > > > > Hi Jayachandran,
> > > > > > 
> > > > > > On Mon, Apr 03, 2017 at 01:15:04PM +, Jayachandran C wrote:
> > > > > > > The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), 
> > > > > > > the PCI
> > > > > > > topology is slightly unusual. For a multi-node system, it looks 
> > > > > > > like:
> > > > > > > 
> > > > > > > [node level PCI bridges - one per node]
> > > > > > > [SoC PCI devices with MSI-X but no IOMMU]
> > > > > > > [PCI-PCIe "glue" bridges - upto 14, one per real port below]
> > > > > > > [PCIe real root ports associated with IOMMU and GICv3 ITS]
> > > > > > > [External PCI devices connected to PCIe links]
> > > > > > > 
> > > > > > > The top two levels of bridges should have introduced aliases 
> > > > > > > since they
> > > > > > > are PCI and PCI/PCIe bridges, but in the case of ThunderX2 they 
> > > > > > > do not.
> > > > > > > In the case of external PCIe devices, the "real" root ports are 
> > > > > > > connected
> > > > > > > to the SMMU and the GIC ITS, so PCI-PCIe bridge does not 
> > > > > > > introduce an
> > > > > > > alias. The SoC PCI devices are directly connected to the GIC ITS, 
> > > > > > > so the
> > > > > > > node level bridges do not introduce an alias either.
> > > > > > > 
> > > > > > > To handle this quirk, we mark the real PCIe root ports and node 
> > > > > > > level
> > > > > > > PCI bridges with the flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT.  With 
> > > > > > > this,
> > > > > > > pci_for_each_dma_alias() works correctly for external PCIe 
> > > > > > > devices and
> > > > > > > SoC PCI devices.
> > > > > > > 
> > > > > > > For the current revision of Cavium ThunderX2, the VendorID and 
> > > > > > > Device ID
> > > > > > > are from Broadcom Vulcan (14e4:90XX).
> > > > > > 
> > > > > > Can you supply some text here about why we want to apply this patch?
> > > > > > E.g., does it avoid making unnecessary IOMMU mappings, improve
> > > > > > performance, avoid a crash, etc?
> > > > > 
> > > > > If this is for the commit message, I hope the following is ok:
> > > > > 
> > > > > "With this change, both MSI-X and IO virtualization work correctly on
> > > > > Cavium ThunderX2. The GIC ITS driver gets the correct device ID to
> > > > > configure MSI-X, the SMMUv3 driver gets the correct Stream IDs for PCI
> > > > > devices, and the IOMMU groups are setup correctly."
> > > > 
> > > > This doesn't get at what the actual problem is.  I'm hoping for
> > > > something like "without this change, we set up an IOMMU mapping for
> > > > requestor ID X, but device DMA uses requestor ID Y because , which
> > > > results in an IOMMU fault"
> > > 
> > > Ok. I hope this would be better:
> > > 
> > > "Without this change, the last alias seen while traversing the PCI
> > > hierarchy will be used as the RID to generate the device ID for ITS
> > > and stream ID for SMMU. This in turn causes the MSI-X generated by the
> > > device to fail since the ITS expects to have translation tables based
> > > on the actual PCIe RID and not the (irrelevant) alias. Similarly, the
> > > device DMA also fails when SMMU is enabled due to incorrect value in
> > > SMMU translation tables"
> > 
> > This description is true, but I don't think it addresses the real
> > problem.  I think the real problem is that your IOMMU code doesn't
> > handle aliases correctly, and by ignoring these invalid aliases, we
> > happen to map an alias that works for the builtin devices.  But that's
> > only because we got lucky (those devices use a single RID and they're
> > not behind bridges that optionally take ownership).
> > 
> > It would make sense to me if we fixed the IOMMU code to map *all* the
> > aliases, which should be enough to make your devices work.  If we then
> > wanted to apply a patch like this on top, it would be simply an
> > optimization that avoids unnecessary IOMMU mappings.
> 
> The issue that the IOMMU code does not handle valid aliases is
> unrelated to what I am trying to fix. The quirk is to make sure
> that invalid aliases are not seen on ThunderX2 while doing
> pci_for_each_dma_alias().
> 
> The DMA and MSI-X requests leave the PCI/PCIe hierarchy at the point
> where the SMMU (or ITS) is attached, i.e. at the bridge marked with
> PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT. The quirk ensures that we don't look
> for aliases above that point.
> 
> The toplevel bridge is marked PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT since
> the on-chip devices are directly connected to the ITS (they do not
> use SMMU).
> 
> The top two levels of bridg

Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling

2017-04-12 Thread Jayachandran C
On Wed, Apr 12, 2017 at 11:21:18AM -0500, Bjorn Helgaas wrote:
> On Tue, Apr 11, 2017 at 03:27:02PM +, Jayachandran C wrote:
> > On Tue, Apr 11, 2017 at 08:41:25AM -0500, Bjorn Helgaas wrote:
> > > [+cc Joerg]
> > > 
> > > On Tue, Apr 11, 2017 at 07:10:48AM +, Jayachandran C wrote:
> > > > On Mon, Apr 10, 2017 at 08:28:47PM -0500, Bjorn Helgaas wrote:
> > > > > Hi Jayachandran,
> > > > > 
> > > > > On Mon, Apr 03, 2017 at 01:15:04PM +, Jayachandran C wrote:
> > > > > > The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), 
> > > > > > the PCI
> > > > > > topology is slightly unusual. For a multi-node system, it looks 
> > > > > > like:
> > > > > > 
> > > > > > [node level PCI bridges - one per node]
> > > > > > [SoC PCI devices with MSI-X but no IOMMU]
> > > > > > [PCI-PCIe "glue" bridges - upto 14, one per real port below]
> > > > > > [PCIe real root ports associated with IOMMU and GICv3 ITS]
> > > > > > [External PCI devices connected to PCIe links]
> > > > > > 
> > > > > > The top two levels of bridges should have introduced aliases since 
> > > > > > they
> > > > > > are PCI and PCI/PCIe bridges, but in the case of ThunderX2 they do 
> > > > > > not.
> > > > > > In the case of external PCIe devices, the "real" root ports are 
> > > > > > connected
> > > > > > to the SMMU and the GIC ITS, so PCI-PCIe bridge does not introduce 
> > > > > > an
> > > > > > alias. The SoC PCI devices are directly connected to the GIC ITS, 
> > > > > > so the
> > > > > > node level bridges do not introduce an alias either.
> > > > > > 
> > > > > > To handle this quirk, we mark the real PCIe root ports and node 
> > > > > > level
> > > > > > PCI bridges with the flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT.  With 
> > > > > > this,
> > > > > > pci_for_each_dma_alias() works correctly for external PCIe devices 
> > > > > > and
> > > > > > SoC PCI devices.
> > > > > > 
> > > > > > For the current revision of Cavium ThunderX2, the VendorID and 
> > > > > > Device ID
> > > > > > are from Broadcom Vulcan (14e4:90XX).
> > > > > 
> > > > > Can you supply some text here about why we want to apply this patch?
> > > > > E.g., does it avoid making unnecessary IOMMU mappings, improve
> > > > > performance, avoid a crash, etc?
> > > > 
> > > > If this is for the commit message, I hope the following is ok:
> > > > 
> > > > "With this change, both MSI-X and IO virtualization work correctly on
> > > > Cavium ThunderX2. The GIC ITS driver gets the correct device ID to
> > > > configure MSI-X, the SMMUv3 driver gets the correct Stream IDs for PCI
> > > > devices, and the IOMMU groups are setup correctly."
> > > 
> > > This doesn't get at what the actual problem is.  I'm hoping for
> > > something like "without this change, we set up an IOMMU mapping for
> > > requestor ID X, but device DMA uses requestor ID Y because , which
> > > results in an IOMMU fault"
> > 
> > Ok. I hope this would be better:
> > 
> > "Without this change, the last alias seen while traversing the PCI
> > hierarchy will be used as the RID to generate the device ID for ITS
> > and stream ID for SMMU. This in turn causes the MSI-X generated by the
> > device to fail since the ITS expects to have translation tables based
> > on the actual PCIe RID and not the (irrelevant) alias. Similarly, the
> > device DMA also fails when SMMU is enabled due to incorrect value in
> > SMMU translation tables"
> 
> This description is true, but I don't think it addresses the real
> problem.  I think the real problem is that your IOMMU code doesn't
> handle aliases correctly, and by ignoring these invalid aliases, we
> happen to map an alias that works for the builtin devices.  But that's
> only because we got lucky (those devices use a single RID and they're
> not behind bridges that optionally take ownership).
> 
> It would make sense to me if we fixed the IOMMU code to map *all* the
> aliases, which should be enough to make your devices work.  If we then
> wanted to apply a patch like this on top, it would be simply an
> optimization that avoids unnecessary IOMMU mappings.

The issue that the IOMMU code does not handle valid aliases is
unrelated to what I am trying to fix. The quirk is to make sure
that invalid aliases are not seen on ThunderX2 while doing
pci_for_each_dma_alias().

The DMA and MSI-X requests leave the PCI/PCIe hierarchy at the point
where the SMMU (or ITS) is attached, i.e. at the bridge marked with
PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT. The quirk ensures that we don't look
for aliases above that point.

The toplevel bridge is marked PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT since
the on-chip devices are directly connected to the ITS (they do not
use SMMU).

The top two levels of bridges are not real PCI bridges but just
PCI bridge-like things that were added to tie the whole hierarchy
together for configuration and enumeration. They do not handle
PCI/PCIe transactions in the traditional sense.

I think my problem description is still not c

Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling

2017-04-12 Thread Bjorn Helgaas
On Tue, Apr 11, 2017 at 03:27:02PM +, Jayachandran C wrote:
> On Tue, Apr 11, 2017 at 08:41:25AM -0500, Bjorn Helgaas wrote:
> > [+cc Joerg]
> > 
> > On Tue, Apr 11, 2017 at 07:10:48AM +, Jayachandran C wrote:
> > > On Mon, Apr 10, 2017 at 08:28:47PM -0500, Bjorn Helgaas wrote:
> > > > Hi Jayachandran,
> > > > 
> > > > On Mon, Apr 03, 2017 at 01:15:04PM +, Jayachandran C wrote:
> > > > > The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), the 
> > > > > PCI
> > > > > topology is slightly unusual. For a multi-node system, it looks like:
> > > > > 
> > > > > [node level PCI bridges - one per node]
> > > > > [SoC PCI devices with MSI-X but no IOMMU]
> > > > > [PCI-PCIe "glue" bridges - upto 14, one per real port below]
> > > > > [PCIe real root ports associated with IOMMU and GICv3 ITS]
> > > > > [External PCI devices connected to PCIe links]
> > > > > 
> > > > > The top two levels of bridges should have introduced aliases since 
> > > > > they
> > > > > are PCI and PCI/PCIe bridges, but in the case of ThunderX2 they do 
> > > > > not.
> > > > > In the case of external PCIe devices, the "real" root ports are 
> > > > > connected
> > > > > to the SMMU and the GIC ITS, so PCI-PCIe bridge does not introduce an
> > > > > alias. The SoC PCI devices are directly connected to the GIC ITS, so 
> > > > > the
> > > > > node level bridges do not introduce an alias either.
> > > > > 
> > > > > To handle this quirk, we mark the real PCIe root ports and node level
> > > > > PCI bridges with the flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT.  With this,
> > > > > pci_for_each_dma_alias() works correctly for external PCIe devices and
> > > > > SoC PCI devices.
> > > > > 
> > > > > For the current revision of Cavium ThunderX2, the VendorID and Device 
> > > > > ID
> > > > > are from Broadcom Vulcan (14e4:90XX).
> > > > 
> > > > Can you supply some text here about why we want to apply this patch?
> > > > E.g., does it avoid making unnecessary IOMMU mappings, improve
> > > > performance, avoid a crash, etc?
> > > 
> > > If this is for the commit message, I hope the following is ok:
> > > 
> > > "With this change, both MSI-X and IO virtualization work correctly on
> > > Cavium ThunderX2. The GIC ITS driver gets the correct device ID to
> > > configure MSI-X, the SMMUv3 driver gets the correct Stream IDs for PCI
> > > devices, and the IOMMU groups are setup correctly."
> > 
> > This doesn't get at what the actual problem is.  I'm hoping for
> > something like "without this change, we set up an IOMMU mapping for
> > requestor ID X, but device DMA uses requestor ID Y because , which
> > results in an IOMMU fault"
> 
> Ok. I hope this would be better:
> 
> "Without this change, the last alias seen while traversing the PCI
> hierarchy will be used as the RID to generate the device ID for ITS
> and stream ID for SMMU. This in turn causes the MSI-X generated by the
> device to fail since the ITS expects to have translation tables based
> on the actual PCIe RID and not the (irrelevant) alias. Similarly, the
> device DMA also fails when SMMU is enabled due to incorrect value in
> SMMU translation tables"

This description is true, but I don't think it addresses the real
problem.  I think the real problem is that your IOMMU code doesn't
handle aliases correctly, and by ignoring these invalid aliases, we
happen to map an alias that works for the builtin devices.  But that's
only because we got lucky (those devices use a single RID and they're
not behind bridges that optionally take ownership).

It would make sense to me if we fixed the IOMMU code to map *all* the
aliases, which should be enough to make your devices work.  If we then
wanted to apply a patch like this on top, it would be simply an
optimization that avoids unnecessary IOMMU mappings.

> > I suspect the reason this patch makes a difference is because the
> > current pci_for_each_dma_alias() believes one of those top-level
> > bridges is an alias, and the iterator produces it last, so that's the
> > one you map.  The IOMMU is attached lower down, so that top-level
> > bridge is not in fact an alias, but since you only look at the *last*
> > one, you don't map the correct aliases from lower down in the tree.
> 
> Exactly. The IORT spec allows a range of RIDs to map to an SMMU, which
> means that a PCI RC can multiple SMMUs, each handling a subset of RIDs.
> 
> In the case of Cavium ThunderX2, the RID which we should see on the RC
> - if we follow the standard and factor in the aliasing introduced by the
> PCI bridge and the PCI/PCIe bridge - is not the RID seen by the SMMU (or
> ITS).
> 
> But, if we stop the traversal at the point where SMMU (or ITS) is
> attached, we will get the correct RID as seen by these.

There is a single "correct RID" for your builtin SATA and USB, but in
general there is no single RID.

> > Stopping the iterator earlier happens to make the last alias be one of
> > the correct ones, but it doesn't solve t

Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling

2017-04-11 Thread Jon Masters
On 04/11/2017 11:27 AM, Jayachandran C wrote:
> On Tue, Apr 11, 2017 at 08:41:25AM -0500, Bjorn Helgaas wrote:

>> I suspect the reason this patch makes a difference is because the
>> current pci_for_each_dma_alias() believes one of those top-level
>> bridges is an alias, and the iterator produces it last, so that's the
>> one you map.  The IOMMU is attached lower down, so that top-level
>> bridge is not in fact an alias, but since you only look at the *last*
>> one, you don't map the correct aliases from lower down in the tree.
> 
> Exactly. The IORT spec allows a range of RIDs to map to an SMMU, which
> means that a PCI RC can multiple SMMUs, each handling a subset of RIDs.
> 
> In the case of Cavium ThunderX2, the RID which we should see on the RC
> - if we follow the standard and factor in the aliasing introduced by the
> PCI bridge and the PCI/PCIe bridge - is not the RID seen by the SMMU (or
> ITS).
> 
> But, if we stop the traversal at the point where SMMU (or ITS) is
> attached, we will get the correct RID as seen by these.

Side note that I am trying to get various specifications clarified to
promote more of a familiar alternative architecture (x86) approach in
the future in which these aren't at different levels in the topology.
But to do that requires integrated Root Complex IP with bells/whistles.

Jon.

-- 
Computer Architect | Sent from my Fedora powered laptop

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling

2017-04-11 Thread Robin Murphy
On 11/04/17 14:41, Bjorn Helgaas wrote:
> [+cc Joerg]
> 
> On Tue, Apr 11, 2017 at 07:10:48AM +, Jayachandran C wrote:
>> On Mon, Apr 10, 2017 at 08:28:47PM -0500, Bjorn Helgaas wrote:
>>> Hi Jayachandran,
>>>
>>> On Mon, Apr 03, 2017 at 01:15:04PM +, Jayachandran C wrote:
 The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), the PCI
 topology is slightly unusual. For a multi-node system, it looks like:

 [node level PCI bridges - one per node]
 [SoC PCI devices with MSI-X but no IOMMU]
 [PCI-PCIe "glue" bridges - upto 14, one per real port below]
 [PCIe real root ports associated with IOMMU and GICv3 ITS]
 [External PCI devices connected to PCIe links]

 The top two levels of bridges should have introduced aliases since they
 are PCI and PCI/PCIe bridges, but in the case of ThunderX2 they do not.
 In the case of external PCIe devices, the "real" root ports are connected
 to the SMMU and the GIC ITS, so PCI-PCIe bridge does not introduce an
 alias. The SoC PCI devices are directly connected to the GIC ITS, so the
 node level bridges do not introduce an alias either.

 To handle this quirk, we mark the real PCIe root ports and node level
 PCI bridges with the flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT.  With this,
 pci_for_each_dma_alias() works correctly for external PCIe devices and
 SoC PCI devices.

 For the current revision of Cavium ThunderX2, the VendorID and Device ID
 are from Broadcom Vulcan (14e4:90XX).
>>>
>>> Can you supply some text here about why we want to apply this patch?
>>> E.g., does it avoid making unnecessary IOMMU mappings, improve
>>> performance, avoid a crash, etc?
>>
>> If this is for the commit message, I hope the following is ok:
>>
>> "With this change, both MSI-X and IO virtualization work correctly on
>> Cavium ThunderX2. The GIC ITS driver gets the correct device ID to
>> configure MSI-X, the SMMUv3 driver gets the correct Stream IDs for PCI
>> devices, and the IOMMU groups are setup correctly."
> 
> This doesn't get at what the actual problem is.  I'm hoping for
> something like "without this change, we set up an IOMMU mapping for
> requestor ID X, but device DMA uses requestor ID Y because , which
> results in an IOMMU fault"

The fact that certain bits of code only consider the last-seen alias RID
does mean we will install all mappings for a RID the IOMMU will never
see, causing subsequent attempt to access them (via the real RID) to
fault. Even with that fixed, considering the past-the-IOMMU-connection
bridges will cause IOMMU group assignment to believe the IOMMU can't
distinguish them and thus prevent individual devices being assignable to
different VMs.

> I've been puzzling over the fact that most of the callers of
> pci_for_each_dma_alias() don't seem to use it correctly.  For Intel
> IOMMUs, domain_context_mapping() uses it to add a mapping for every
> possible alias.  But most of the other callers only look at the last
> alias and ignore all the others.

As it happens, I've just been looking into this and reaching the same
conclusion (not least because a fair few of of the incorrect uses have
my fingerprints all over them). The of_iommu_configure() and
corresponding iort_iommu_configure() uses turn out to already be broken
WRT DMA phantom functions vs. MSIs, but pretty straightforward to fix
(and I now have one of the offending Marvell SATA cards to test things
with). The one in its_pci_msi_prepare() turns out to be totally
backwards, as it actually wants to iterate through every device which
may alias with the given device (any suggestions for the neatest way to
do that are most welcome).

The other uses in pci_msi_*() are a little trickier, as in general we're
assuming every device is associated with just a single ID (certainly for
ARM GICv3 this assumption runs quite strongly through the architecture)
- Marc and I have chucked some ideas around, but in the short term, it
might be easier to just not even pretend to support MSIs from behind
aliasing bridges for the DT/IORT cases. Either way, those are also
broken for the phantom function case (and the tentative fix I've written
will need rethinking in light of this discussion, oh well).

>  That might work most of the time,
> but:
> 
>   - There's no guarantee that pci_for_each_dma_alias() iterates in any
> particular order, so relying on the current order is fragile,
> 
>   - The pci_add_dma_alias() interface allows an arbitrary number of
> aliases (as long as they're all on the same bus), and some devices
> do use more than one, e.g., quirk_dma_func0_alias(),
> quirk_mic_x200_dma_alias(),
> 
>   - pci_for_each_dma_alias() translates the rules in the PCIe to
> PCI/PCI-X Bridge spec, r1.0, sec 2.3, about taking ownership into
> aliases.  I think it's important to pay attention to *every*
> possible alias, not just the last one.

Ha, that exact page

Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling

2017-04-11 Thread Jayachandran C
On Tue, Apr 11, 2017 at 08:41:25AM -0500, Bjorn Helgaas wrote:
> [+cc Joerg]
> 
> On Tue, Apr 11, 2017 at 07:10:48AM +, Jayachandran C wrote:
> > On Mon, Apr 10, 2017 at 08:28:47PM -0500, Bjorn Helgaas wrote:
> > > Hi Jayachandran,
> > > 
> > > On Mon, Apr 03, 2017 at 01:15:04PM +, Jayachandran C wrote:
> > > > The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), the 
> > > > PCI
> > > > topology is slightly unusual. For a multi-node system, it looks like:
> > > > 
> > > > [node level PCI bridges - one per node]
> > > > [SoC PCI devices with MSI-X but no IOMMU]
> > > > [PCI-PCIe "glue" bridges - upto 14, one per real port below]
> > > > [PCIe real root ports associated with IOMMU and GICv3 ITS]
> > > > [External PCI devices connected to PCIe links]
> > > > 
> > > > The top two levels of bridges should have introduced aliases since they
> > > > are PCI and PCI/PCIe bridges, but in the case of ThunderX2 they do not.
> > > > In the case of external PCIe devices, the "real" root ports are 
> > > > connected
> > > > to the SMMU and the GIC ITS, so PCI-PCIe bridge does not introduce an
> > > > alias. The SoC PCI devices are directly connected to the GIC ITS, so the
> > > > node level bridges do not introduce an alias either.
> > > > 
> > > > To handle this quirk, we mark the real PCIe root ports and node level
> > > > PCI bridges with the flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT.  With this,
> > > > pci_for_each_dma_alias() works correctly for external PCIe devices and
> > > > SoC PCI devices.
> > > > 
> > > > For the current revision of Cavium ThunderX2, the VendorID and Device ID
> > > > are from Broadcom Vulcan (14e4:90XX).
> > > 
> > > Can you supply some text here about why we want to apply this patch?
> > > E.g., does it avoid making unnecessary IOMMU mappings, improve
> > > performance, avoid a crash, etc?
> > 
> > If this is for the commit message, I hope the following is ok:
> > 
> > "With this change, both MSI-X and IO virtualization work correctly on
> > Cavium ThunderX2. The GIC ITS driver gets the correct device ID to
> > configure MSI-X, the SMMUv3 driver gets the correct Stream IDs for PCI
> > devices, and the IOMMU groups are setup correctly."
> 
> This doesn't get at what the actual problem is.  I'm hoping for
> something like "without this change, we set up an IOMMU mapping for
> requestor ID X, but device DMA uses requestor ID Y because , which
> results in an IOMMU fault"

Ok. I hope this would be better:

"Without this change, the last alias seen while traversing the PCI
hierarchy will be used as the RID to generate the device ID for ITS
and stream ID for SMMU. This in turn causes the MSI-X generated by the
device to fail since the ITS expects to have translation tables based
on the actual PCIe RID and not the (irrelevant) alias. Similarly, the
device DMA also fails when SMMU is enabled due to incorrect value in
SMMU translation tables"

> I've been puzzling over the fact that most of the callers of
> pci_for_each_dma_alias() don't seem to use it correctly.  For Intel
> IOMMUs, domain_context_mapping() uses it to add a mapping for every
> possible alias.  But most of the other callers only look at the last
> alias and ignore all the others.  That might work most of the time,
> but:
> 
>   - There's no guarantee that pci_for_each_dma_alias() iterates in any
> particular order, so relying on the current order is fragile,
> 
>   - The pci_add_dma_alias() interface allows an arbitrary number of
> aliases (as long as they're all on the same bus), and some devices
> do use more than one, e.g., quirk_dma_func0_alias(),
> quirk_mic_x200_dma_alias(),
> 
>   - pci_for_each_dma_alias() translates the rules in the PCIe to
> PCI/PCI-X Bridge spec, r1.0, sec 2.3, about taking ownership into
> aliases.  I think it's important to pay attention to *every*
> possible alias, not just the last one.

pci_for_each_dma_alias() is used by the ARM code to find the RID
(Requester ID), and this is taken as the last alias as seen from the
PCI controller (RC). The RID is then used to program the Device ID
of the GIC ITS (ARM generic interrupt controller's interrupt translation
service) for MSI-X (and similarly to program Stream ID of the SMMU).

The translation from RID to Device ID or stream ID is provided by the
IORT ACPI table[1] or by the a {iommu,msi}-{map,mask} [2] property in
the device tree.

Taking the last alias maybe reasonable since the mapping is from
(PCI RC, RID) to (SMMU, streamID) or (GIC ITS, deviceID) and we are
looking for a single the RID for a device as seen from the controller.

> I suspect the reason this patch makes a difference is because the
> current pci_for_each_dma_alias() believes one of those top-level
> bridges is an alias, and the iterator produces it last, so that's the
> one you map.  The IOMMU is attached lower down, so that top-level
> bridge is not in fact an alias, but since you only look at the *last*
> 

Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling

2017-04-11 Thread Bjorn Helgaas
[+cc Joerg]

On Tue, Apr 11, 2017 at 07:10:48AM +, Jayachandran C wrote:
> On Mon, Apr 10, 2017 at 08:28:47PM -0500, Bjorn Helgaas wrote:
> > Hi Jayachandran,
> > 
> > On Mon, Apr 03, 2017 at 01:15:04PM +, Jayachandran C wrote:
> > > The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), the PCI
> > > topology is slightly unusual. For a multi-node system, it looks like:
> > > 
> > > [node level PCI bridges - one per node]
> > > [SoC PCI devices with MSI-X but no IOMMU]
> > > [PCI-PCIe "glue" bridges - upto 14, one per real port below]
> > > [PCIe real root ports associated with IOMMU and GICv3 ITS]
> > > [External PCI devices connected to PCIe links]
> > > 
> > > The top two levels of bridges should have introduced aliases since they
> > > are PCI and PCI/PCIe bridges, but in the case of ThunderX2 they do not.
> > > In the case of external PCIe devices, the "real" root ports are connected
> > > to the SMMU and the GIC ITS, so PCI-PCIe bridge does not introduce an
> > > alias. The SoC PCI devices are directly connected to the GIC ITS, so the
> > > node level bridges do not introduce an alias either.
> > > 
> > > To handle this quirk, we mark the real PCIe root ports and node level
> > > PCI bridges with the flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT.  With this,
> > > pci_for_each_dma_alias() works correctly for external PCIe devices and
> > > SoC PCI devices.
> > > 
> > > For the current revision of Cavium ThunderX2, the VendorID and Device ID
> > > are from Broadcom Vulcan (14e4:90XX).
> > 
> > Can you supply some text here about why we want to apply this patch?
> > E.g., does it avoid making unnecessary IOMMU mappings, improve
> > performance, avoid a crash, etc?
> 
> If this is for the commit message, I hope the following is ok:
> 
> "With this change, both MSI-X and IO virtualization work correctly on
> Cavium ThunderX2. The GIC ITS driver gets the correct device ID to
> configure MSI-X, the SMMUv3 driver gets the correct Stream IDs for PCI
> devices, and the IOMMU groups are setup correctly."

This doesn't get at what the actual problem is.  I'm hoping for
something like "without this change, we set up an IOMMU mapping for
requestor ID X, but device DMA uses requestor ID Y because , which
results in an IOMMU fault"

I've been puzzling over the fact that most of the callers of
pci_for_each_dma_alias() don't seem to use it correctly.  For Intel
IOMMUs, domain_context_mapping() uses it to add a mapping for every
possible alias.  But most of the other callers only look at the last
alias and ignore all the others.  That might work most of the time,
but:

  - There's no guarantee that pci_for_each_dma_alias() iterates in any
particular order, so relying on the current order is fragile,

  - The pci_add_dma_alias() interface allows an arbitrary number of
aliases (as long as they're all on the same bus), and some devices
do use more than one, e.g., quirk_dma_func0_alias(),
quirk_mic_x200_dma_alias(),

  - pci_for_each_dma_alias() translates the rules in the PCIe to
PCI/PCI-X Bridge spec, r1.0, sec 2.3, about taking ownership into
aliases.  I think it's important to pay attention to *every*
possible alias, not just the last one.

I suspect the reason this patch makes a difference is because the
current pci_for_each_dma_alias() believes one of those top-level
bridges is an alias, and the iterator produces it last, so that's the
one you map.  The IOMMU is attached lower down, so that top-level
bridge is not in fact an alias, but since you only look at the *last*
one, you don't map the correct aliases from lower down in the tree.

Stopping the iterator earlier happens to make the last alias be one of
the correct ones, but it doesn't solve the problems of quirked devices
that can use multiple requester IDs, and it doesn't solve the problem
of PCIe-to-PCI bridges that optionally take ownership of transactions.

> I can send out a new patch if needed.
> 
> The on chip SATA and USB use MSI-X, so this is needed for basic
> functionality of the platform.

No need for a new patch; I can integrate something into the changelog.

> > > Signed-off-by: Jayachandran C 
> > > ---
> > >  drivers/pci/quirks.c | 14 ++
> > >  1 file changed, 14 insertions(+)
> > > 
> > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > > index 6736836..564a84a 100644
> > > --- a/drivers/pci/quirks.c
> > > +++ b/drivers/pci/quirks.c
> > > @@ -3958,6 +3958,20 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 
> > > 0x2260, quirk_mic_x200_dma_alias);
> > >  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2264, 
> > > quirk_mic_x200_dma_alias);
> > >  
> > >  /*
> > > + * The IOMMU and interrupt controller on Broadcom Vulcan/Cavium 
> > > ThunderX2 are
> > > + * associated not at the root bus, but at a bridge below. This quirk flag
> > > + * will ensure that the aliases are identified correctly.
> > > + */
> > > +static void quirk_bridge_cavm_thrx2_pcie_roo

Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling

2017-04-11 Thread Jayachandran C
On Mon, Apr 10, 2017 at 08:28:47PM -0500, Bjorn Helgaas wrote:
> Hi Jayachandran,
> 
> On Mon, Apr 03, 2017 at 01:15:04PM +, Jayachandran C wrote:
> > The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), the PCI
> > topology is slightly unusual. For a multi-node system, it looks like:
> > 
> > [node level PCI bridges - one per node]
> > [SoC PCI devices with MSI-X but no IOMMU]
> > [PCI-PCIe "glue" bridges - upto 14, one per real port below]
> > [PCIe real root ports associated with IOMMU and GICv3 ITS]
> > [External PCI devices connected to PCIe links]
> > 
> > The top two levels of bridges should have introduced aliases since they
> > are PCI and PCI/PCIe bridges, but in the case of ThunderX2 they do not.
> > In the case of external PCIe devices, the "real" root ports are connected
> > to the SMMU and the GIC ITS, so PCI-PCIe bridge does not introduce an
> > alias. The SoC PCI devices are directly connected to the GIC ITS, so the
> > node level bridges do not introduce an alias either.
> > 
> > To handle this quirk, we mark the real PCIe root ports and node level
> > PCI bridges with the flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT.  With this,
> > pci_for_each_dma_alias() works correctly for external PCIe devices and
> > SoC PCI devices.
> > 
> > For the current revision of Cavium ThunderX2, the VendorID and Device ID
> > are from Broadcom Vulcan (14e4:90XX).
> 
> Can you supply some text here about why we want to apply this patch?
> E.g., does it avoid making unnecessary IOMMU mappings, improve
> performance, avoid a crash, etc?

If this is for the commit message, I hope the following is ok:

"With this change, both MSI-X and IO virtualization work correctly on
Cavium ThunderX2. The GIC ITS driver gets the correct device ID to
configure MSI-X, the SMMUv3 driver gets the correct Stream IDs for PCI
devices, and the IOMMU groups are setup correctly."

I can send out a new patch if needed.

The on chip SATA and USB use MSI-X, so this is needed for basic
functionality of the platform.

> 
> > Signed-off-by: Jayachandran C 
> > ---
> >  drivers/pci/quirks.c | 14 ++
> >  1 file changed, 14 insertions(+)
> > 
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index 6736836..564a84a 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -3958,6 +3958,20 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 
> > 0x2260, quirk_mic_x200_dma_alias);
> >  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2264, 
> > quirk_mic_x200_dma_alias);
> >  
> >  /*
> > + * The IOMMU and interrupt controller on Broadcom Vulcan/Cavium ThunderX2 
> > are
> > + * associated not at the root bus, but at a bridge below. This quirk flag
> > + * will ensure that the aliases are identified correctly.
> > + */
> > +static void quirk_bridge_cavm_thrx2_pcie_root(struct pci_dev *pdev)
> > +{
> > +   pdev->dev_flags |= PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT;
> > +}
> > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9000,
> > +   quirk_bridge_cavm_thrx2_pcie_root);
> > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9084,
> > +   quirk_bridge_cavm_thrx2_pcie_root);
> > +
> > +/*
> >   * Intersil/Techwell TW686[4589]-based video capture cards have an empty 
> > (zero)
> >   * class code.  Fix it.
> >   */

Thanks,
JC.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling

2017-04-10 Thread Bjorn Helgaas
Hi Jayachandran,

On Mon, Apr 03, 2017 at 01:15:04PM +, Jayachandran C wrote:
> The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), the PCI
> topology is slightly unusual. For a multi-node system, it looks like:
> 
> [node level PCI bridges - one per node]
> [SoC PCI devices with MSI-X but no IOMMU]
> [PCI-PCIe "glue" bridges - upto 14, one per real port below]
> [PCIe real root ports associated with IOMMU and GICv3 ITS]
> [External PCI devices connected to PCIe links]
> 
> The top two levels of bridges should have introduced aliases since they
> are PCI and PCI/PCIe bridges, but in the case of ThunderX2 they do not.
> In the case of external PCIe devices, the "real" root ports are connected
> to the SMMU and the GIC ITS, so PCI-PCIe bridge does not introduce an
> alias. The SoC PCI devices are directly connected to the GIC ITS, so the
> node level bridges do not introduce an alias either.
> 
> To handle this quirk, we mark the real PCIe root ports and node level
> PCI bridges with the flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT.  With this,
> pci_for_each_dma_alias() works correctly for external PCIe devices and
> SoC PCI devices.
> 
> For the current revision of Cavium ThunderX2, the VendorID and Device ID
> are from Broadcom Vulcan (14e4:90XX).

Can you supply some text here about why we want to apply this patch?
E.g., does it avoid making unnecessary IOMMU mappings, improve
performance, avoid a crash, etc?

> Signed-off-by: Jayachandran C 
> ---
>  drivers/pci/quirks.c | 14 ++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 6736836..564a84a 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3958,6 +3958,20 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2260, 
> quirk_mic_x200_dma_alias);
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2264, 
> quirk_mic_x200_dma_alias);
>  
>  /*
> + * The IOMMU and interrupt controller on Broadcom Vulcan/Cavium ThunderX2 are
> + * associated not at the root bus, but at a bridge below. This quirk flag
> + * will ensure that the aliases are identified correctly.
> + */
> +static void quirk_bridge_cavm_thrx2_pcie_root(struct pci_dev *pdev)
> +{
> + pdev->dev_flags |= PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT;
> +}
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9000,
> + quirk_bridge_cavm_thrx2_pcie_root);
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9084,
> + quirk_bridge_cavm_thrx2_pcie_root);
> +
> +/*
>   * Intersil/Techwell TW686[4589]-based video capture cards have an empty 
> (zero)
>   * class code.  Fix it.
>   */
> -- 
> 2.7.4
> 
> 
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling

2017-04-10 Thread Jayachandran C
[Moving Bjorn back to to: ]

On Tue, Apr 04, 2017 at 03:28:26PM +0100, Robin Murphy wrote:
> On 04/04/17 12:50, Jayachandran C wrote:
> > On Mon, Apr 03, 2017 at 04:07:53PM +0100, Robin Murphy wrote:
> >> On 03/04/17 14:15, Jayachandran C wrote:
> >>> The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), the PCI
> >>> topology is slightly unusual. For a multi-node system, it looks like:
> >>>
> >>> [node level PCI bridges - one per node]
> >>> [SoC PCI devices with MSI-X but no IOMMU]
> >>> [PCI-PCIe "glue" bridges - upto 14, one per real port below]
> >>> [PCIe real root ports associated with IOMMU and GICv3 ITS]
> >>> [External PCI devices connected to PCIe links]
> >>
> >> Since it's not entirely obvious, what does the actual DT - or IORT if
> >> you must ;) - topology for this look like? I can't help thinking that
> >> either it's inaccurate, or that this is going to expose a shortcoming in
> >> pci_dma_configure() which breaks things - unless I'm missing something,
> >> isn't find_pci_root_bus() going to go all the way up to the top-level
> >> glue bridge and pick up the wrong firmware node (if any) for the
> >> appropriate DMA properties?
> > 
> > I will try to describe the ACPI interface:
> > 
> > There is just one ECAM area, a single bus range and one set of memory
> > windows for the whole system - so there is just one entry in DSDT for
> > the PCI controller. This entry also corresponds to the PCI RC node in
> > IORT. DMA is coherent and supports 64 bits system-wide, the attributes
> > (in DSDT and IORT) reflect this.
> > 
> > lspci on the system looks like this:
> > -[:00]-+-00.0-[01-1e]--+-04.0  14e4:9026
> >|   +-04.1  14e4:9026
> >|   +-05.0  14e4:9027
> >|   +-05.1  14e4:9027
> >|   +-0a.0-[02-03]00.0-[03]--
> >|   +-0a.1-[04-05]00.0-[05]--
> >|   [...etc...]
> >|   +-0b.0-[12-14]00.0-[13-14]--+-00.0  8086:1583
> >|   |   \-00.1  8086:1583
> >|   [...etc...]
> >|   \-0b.5-[1d-1e]00.0-[1e]--
> >\-00.1-[1f-3b]--+-04.0  14e4:9026
> >+-04.1  14e4:9026
> >+-05.0  14e4:9027
> >+-05.1  14e4:9027
> >+-0a.0-[20-21]00.0-[21]--
> >[...etc...]
> > 
> > The devices here are:
> >  - 00:00.0 and 00:00.1 are the node (socket) level bridges
> >  - 01:[45].x and 1f:[45].x are SoC PCI devices like SATA and USB
> >  - 01:[ab].x and 1f:[ab].x are the PCI-PCIe "reverse"/glue bridges
> >  - 02:00.0 etc are the "real" PCIe ports connected to external PCIe cards. 
> > Each node has a GIC ITS, and a group of 4 PCIe ports have an SMMU.
> > 
> > The IORT is built by the firmware based on its PCI enumeration. The IORT
> > will have multiple entries under the PCI RC node:
> >  - one entry per node to map the SoC devices directly to ITS for MSI-X,
> >since the SoC devices are not attached to any SMMU.
> >  - An entry per "real" PCIe port to map RIDs under it to the corresponding
> >SMMU.
> > The SMMU nodes will have an entry to map its RID ranges to the node ITS.
> > 
> > The IORT spec supports this configuration, and the corresponding code is
> > already upstream, so the only sticking point right now is
> > pci_for_each_dma_alias().
> 
> Thanks, that helps a lot. The "single global ECAM space" idea was
> eluding me, but in that context it all makes much more sense - I'm
> assuming the two quirked device IDs correspond to the 00:00.[01] devices
> and the [02-1e]:00.0 ones.
> 
> So (at the risk of Jon mooing at me), I guess the DT description would
> be a single node looking something like:
> 
> pcie {
>   reg = [global ECAM space for segment ];
> 
>   msi-map = <0x0100 &its0 0x0100 0x1d00>,
> <0x1f00 &its1 0x1f00 0x1d00>;
>   iommu-map = <0x0200 &smmu0 0x0200 0x1c00>,
>   <0x2000 &smmu0 0x2000 0x1c00>;
> };
> 
> (note to self: which incidentally also means of_pci_map_rid() probably
> wants fixing to not treat gaps in the map as an error)
> 
> With only one node like that, rather than having the whole first 3
> levels of bridges described, the "stop at the appropriate node in the
> callback" approach does become even more impractical in all cases. So,
> for $TITLE, based on the above understanding:
> 
> Reviewed-by: Robin Murphy 

Hi Bjorn,

This seems to be the reasonable way to add support for the quirk. 
Would really appreciate feedback from you.

Thanks,
JC.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling

2017-04-04 Thread Robin Murphy
On 04/04/17 12:50, Jayachandran C wrote:
> On Mon, Apr 03, 2017 at 04:07:53PM +0100, Robin Murphy wrote:
>> On 03/04/17 14:15, Jayachandran C wrote:
>>> The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), the PCI
>>> topology is slightly unusual. For a multi-node system, it looks like:
>>>
>>> [node level PCI bridges - one per node]
>>> [SoC PCI devices with MSI-X but no IOMMU]
>>> [PCI-PCIe "glue" bridges - upto 14, one per real port below]
>>> [PCIe real root ports associated with IOMMU and GICv3 ITS]
>>> [External PCI devices connected to PCIe links]
>>
>> Since it's not entirely obvious, what does the actual DT - or IORT if
>> you must ;) - topology for this look like? I can't help thinking that
>> either it's inaccurate, or that this is going to expose a shortcoming in
>> pci_dma_configure() which breaks things - unless I'm missing something,
>> isn't find_pci_root_bus() going to go all the way up to the top-level
>> glue bridge and pick up the wrong firmware node (if any) for the
>> appropriate DMA properties?
> 
> I will try to describe the ACPI interface:
> 
> There is just one ECAM area, a single bus range and one set of memory
> windows for the whole system - so there is just one entry in DSDT for
> the PCI controller. This entry also corresponds to the PCI RC node in
> IORT. DMA is coherent and supports 64 bits system-wide, the attributes
> (in DSDT and IORT) reflect this.
> 
> lspci on the system looks like this:
> -[:00]-+-00.0-[01-1e]--+-04.0  14e4:9026
>|   +-04.1  14e4:9026
>|   +-05.0  14e4:9027
>|   +-05.1  14e4:9027
>|   +-0a.0-[02-03]00.0-[03]--
>|   +-0a.1-[04-05]00.0-[05]--
>|   [...etc...]
>|   +-0b.0-[12-14]00.0-[13-14]--+-00.0  8086:1583
>|   |   \-00.1  8086:1583
>|   [...etc...]
>|   \-0b.5-[1d-1e]00.0-[1e]--
>\-00.1-[1f-3b]--+-04.0  14e4:9026
>+-04.1  14e4:9026
>+-05.0  14e4:9027
>+-05.1  14e4:9027
>+-0a.0-[20-21]00.0-[21]--
>[...etc...]
> 
> The devices here are:
>  - 00:00.0 and 00:00.1 are the node (socket) level bridges
>  - 01:[45].x and 1f:[45].x are SoC PCI devices like SATA and USB
>  - 01:[ab].x and 1f:[ab].x are the PCI-PCIe "reverse"/glue bridges
>  - 02:00.0 etc are the "real" PCIe ports connected to external PCIe cards. 
> Each node has a GIC ITS, and a group of 4 PCIe ports have an SMMU.
> 
> The IORT is built by the firmware based on its PCI enumeration. The IORT
> will have multiple entries under the PCI RC node:
>  - one entry per node to map the SoC devices directly to ITS for MSI-X,
>since the SoC devices are not attached to any SMMU.
>  - An entry per "real" PCIe port to map RIDs under it to the corresponding
>SMMU.
> The SMMU nodes will have an entry to map its RID ranges to the node ITS.
> 
> The IORT spec supports this configuration, and the corresponding code is
> already upstream, so the only sticking point right now is
> pci_for_each_dma_alias().

Thanks, that helps a lot. The "single global ECAM space" idea was
eluding me, but in that context it all makes much more sense - I'm
assuming the two quirked device IDs correspond to the 00:00.[01] devices
and the [02-1e]:00.0 ones.

So (at the risk of Jon mooing at me), I guess the DT description would
be a single node looking something like:

pcie {
reg = [global ECAM space for segment ];

msi-map = <0x0100 &its0 0x0100 0x1d00>,
  <0x1f00 &its1 0x1f00 0x1d00>;
iommu-map = <0x0200 &smmu0 0x0200 0x1c00>,
<0x2000 &smmu0 0x2000 0x1c00>;
};

(note to self: which incidentally also means of_pci_map_rid() probably
wants fixing to not treat gaps in the map as an error)

With only one node like that, rather than having the whole first 3
levels of bridges described, the "stop at the appropriate node in the
callback" approach does become even more impractical in all cases. So,
for $TITLE, based on the above understanding:

Reviewed-by: Robin Murphy 

Cheers,
Robin.

> 
> JC.
> 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling

2017-04-04 Thread Jayachandran C
On Mon, Apr 03, 2017 at 04:07:53PM +0100, Robin Murphy wrote:
> On 03/04/17 14:15, Jayachandran C wrote:
> > The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), the PCI
> > topology is slightly unusual. For a multi-node system, it looks like:
> > 
> > [node level PCI bridges - one per node]
> > [SoC PCI devices with MSI-X but no IOMMU]
> > [PCI-PCIe "glue" bridges - upto 14, one per real port below]
> > [PCIe real root ports associated with IOMMU and GICv3 ITS]
> > [External PCI devices connected to PCIe links]
> 
> Since it's not entirely obvious, what does the actual DT - or IORT if
> you must ;) - topology for this look like? I can't help thinking that
> either it's inaccurate, or that this is going to expose a shortcoming in
> pci_dma_configure() which breaks things - unless I'm missing something,
> isn't find_pci_root_bus() going to go all the way up to the top-level
> glue bridge and pick up the wrong firmware node (if any) for the
> appropriate DMA properties?

I will try to describe the ACPI interface:

There is just one ECAM area, a single bus range and one set of memory
windows for the whole system - so there is just one entry in DSDT for
the PCI controller. This entry also corresponds to the PCI RC node in
IORT. DMA is coherent and supports 64 bits system-wide, the attributes
(in DSDT and IORT) reflect this.

lspci on the system looks like this:
-[:00]-+-00.0-[01-1e]--+-04.0  14e4:9026
   |   +-04.1  14e4:9026
   |   +-05.0  14e4:9027
   |   +-05.1  14e4:9027
   |   +-0a.0-[02-03]00.0-[03]--
   |   +-0a.1-[04-05]00.0-[05]--
   |   [...etc...]
   |   +-0b.0-[12-14]00.0-[13-14]--+-00.0  8086:1583
   |   |   \-00.1  8086:1583
   |   [...etc...]
   |   \-0b.5-[1d-1e]00.0-[1e]--
   \-00.1-[1f-3b]--+-04.0  14e4:9026
   +-04.1  14e4:9026
   +-05.0  14e4:9027
   +-05.1  14e4:9027
   +-0a.0-[20-21]00.0-[21]--
   [...etc...]

The devices here are:
 - 00:00.0 and 00:00.1 are the node (socket) level bridges
 - 01:[45].x and 1f:[45].x are SoC PCI devices like SATA and USB
 - 01:[ab].x and 1f:[ab].x are the PCI-PCIe "reverse"/glue bridges
 - 02:00.0 etc are the "real" PCIe ports connected to external PCIe cards. 
Each node has a GIC ITS, and a group of 4 PCIe ports have an SMMU.

The IORT is built by the firmware based on its PCI enumeration. The IORT
will have multiple entries under the PCI RC node:
 - one entry per node to map the SoC devices directly to ITS for MSI-X,
   since the SoC devices are not attached to any SMMU.
 - An entry per "real" PCIe port to map RIDs under it to the corresponding
   SMMU.
The SMMU nodes will have an entry to map its RID ranges to the node ITS.

The IORT spec supports this configuration, and the corresponding code is
already upstream, so the only sticking point right now is
pci_for_each_dma_alias().

JC.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling

2017-04-03 Thread Robin Murphy
On 03/04/17 14:15, Jayachandran C wrote:
> The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), the PCI
> topology is slightly unusual. For a multi-node system, it looks like:
> 
> [node level PCI bridges - one per node]
> [SoC PCI devices with MSI-X but no IOMMU]
> [PCI-PCIe "glue" bridges - upto 14, one per real port below]
> [PCIe real root ports associated with IOMMU and GICv3 ITS]
> [External PCI devices connected to PCIe links]

Since it's not entirely obvious, what does the actual DT - or IORT if
you must ;) - topology for this look like? I can't help thinking that
either it's inaccurate, or that this is going to expose a shortcoming in
pci_dma_configure() which breaks things - unless I'm missing something,
isn't find_pci_root_bus() going to go all the way up to the top-level
glue bridge and pick up the wrong firmware node (if any) for the
appropriate DMA properties?

Robin.

> The top two levels of bridges should have introduced aliases since they
> are PCI and PCI/PCIe bridges, but in the case of ThunderX2 they do not.
> In the case of external PCIe devices, the "real" root ports are connected
> to the SMMU and the GIC ITS, so PCI-PCIe bridge does not introduce an
> alias. The SoC PCI devices are directly connected to the GIC ITS, so the
> node level bridges do not introduce an alias either.
> 
> To handle this quirk, we mark the real PCIe root ports and node level
> PCI bridges with the flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT.  With this,
> pci_for_each_dma_alias() works correctly for external PCIe devices and
> SoC PCI devices.
> 
> For the current revision of Cavium ThunderX2, the VendorID and Device ID
> are from Broadcom Vulcan (14e4:90XX).
> 
> Signed-off-by: Jayachandran C 
> ---
>  drivers/pci/quirks.c | 14 ++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 6736836..564a84a 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3958,6 +3958,20 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2260, 
> quirk_mic_x200_dma_alias);
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2264, 
> quirk_mic_x200_dma_alias);
>  
>  /*
> + * The IOMMU and interrupt controller on Broadcom Vulcan/Cavium ThunderX2 are
> + * associated not at the root bus, but at a bridge below. This quirk flag
> + * will ensure that the aliases are identified correctly.
> + */
> +static void quirk_bridge_cavm_thrx2_pcie_root(struct pci_dev *pdev)
> +{
> + pdev->dev_flags |= PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT;
> +}
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9000,
> + quirk_bridge_cavm_thrx2_pcie_root);
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9084,
> + quirk_bridge_cavm_thrx2_pcie_root);
> +
> +/*
>   * Intersil/Techwell TW686[4589]-based video capture cards have an empty 
> (zero)
>   * class code.  Fix it.
>   */
> 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v4 2/2] PCI: quirks: Fix ThunderX2 dma alias handling

2017-04-03 Thread Jayachandran C
The Cavium ThunderX2 arm64 SoCs (called Broadcom Vulcan earlier), the PCI
topology is slightly unusual. For a multi-node system, it looks like:

[node level PCI bridges - one per node]
[SoC PCI devices with MSI-X but no IOMMU]
[PCI-PCIe "glue" bridges - upto 14, one per real port below]
[PCIe real root ports associated with IOMMU and GICv3 ITS]
[External PCI devices connected to PCIe links]

The top two levels of bridges should have introduced aliases since they
are PCI and PCI/PCIe bridges, but in the case of ThunderX2 they do not.
In the case of external PCIe devices, the "real" root ports are connected
to the SMMU and the GIC ITS, so PCI-PCIe bridge does not introduce an
alias. The SoC PCI devices are directly connected to the GIC ITS, so the
node level bridges do not introduce an alias either.

To handle this quirk, we mark the real PCIe root ports and node level
PCI bridges with the flag PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT.  With this,
pci_for_each_dma_alias() works correctly for external PCIe devices and
SoC PCI devices.

For the current revision of Cavium ThunderX2, the VendorID and Device ID
are from Broadcom Vulcan (14e4:90XX).

Signed-off-by: Jayachandran C 
---
 drivers/pci/quirks.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 6736836..564a84a 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3958,6 +3958,20 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2260, 
quirk_mic_x200_dma_alias);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2264, 
quirk_mic_x200_dma_alias);
 
 /*
+ * The IOMMU and interrupt controller on Broadcom Vulcan/Cavium ThunderX2 are
+ * associated not at the root bus, but at a bridge below. This quirk flag
+ * will ensure that the aliases are identified correctly.
+ */
+static void quirk_bridge_cavm_thrx2_pcie_root(struct pci_dev *pdev)
+{
+   pdev->dev_flags |= PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT;
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9000,
+   quirk_bridge_cavm_thrx2_pcie_root);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_BROADCOM, 0x9084,
+   quirk_bridge_cavm_thrx2_pcie_root);
+
+/*
  * Intersil/Techwell TW686[4589]-based video capture cards have an empty (zero)
  * class code.  Fix it.
  */
-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu