Re: [PATCH 05/13] pci: New pci_acs_enabled()
On 05/18/2012 10:47 PM, Alex Williamson wrote: On Fri, 2012-05-18 at 19:00 -0400, Don Dutile wrote: On 05/18/2012 06:02 PM, Alex Williamson wrote: On Wed, 2012-05-16 at 09:29 -0400, Don Dutile wrote: On 05/15/2012 05:09 PM, Alex Williamson wrote: On Tue, 2012-05-15 at 13:56 -0600, Bjorn Helgaas wrote: On Mon, May 14, 2012 at 4:49 PM, Alex Williamson alex.william...@redhat.com wrote: On Mon, 2012-05-14 at 16:02 -0600, Bjorn Helgaas wrote: On Fri, May 11, 2012 at 4:56 PM, Alex Williamson alex.william...@redhat.com wrote: In a PCIe environment, transactions aren't always required to reach the root bus before being re-routed. Peer-to-peer DMA may actually not be seen by the IOMMU in these cases. For IOMMU groups, we want to provide IOMMU drivers a way to detect these restrictions. Provided with a PCI device, pci_acs_enabled returns the furthest downstream device with a complete PCI ACS chain. This information can then be used in grouping to create fully isolated groups. ACS chain logic extracted from libvirt. The name pci_acs_enabled() sounds like it returns a boolean, but it doesn't. Right, maybe this should be: struct pci_dev *pci_find_upstream_acs(struct pci_dev *pdev); +1; there is a global in the PCI code, pci_acs_enable, and a function pci_enable_acs(), which the above name certainly confuses. I recommend pci_find_top_acs_bridge() would be most descriptive. Finally, with my email filters fixed, I can see this email... :) Welcome back ;) Indeed... and I recvd 3 copies of this reply, so the pendulum has flipped the other direction... ;-) Yep, the new API I'm working with is: bool pci_acs_enabled(struct pci_dev *pdev, u16 acs_flags); bool pci_acs_path_enabled(struct pci_dev *start, struct pci_dev *end, u16 acs_flags); ok. I'm not sure what a complete PCI ACS chain means. The function starts from dev and searches *upstream*, so I'm guessing it returns the root of a subtree that must be contained in a group. 
Any intermediate switch between an endpoint and the root bus can redirect a dma access without iommu translation, Is this redirection just the normal PCI bridge forwarding that allows peer-to-peer transactions, i.e., the rule (from P2P bridge spec, rev 1.2, sec 4.1) that the bridge apertures define address ranges that are forwarded from primary to secondary interface, and the inverse ranges are forwarded from secondary to primary? For example, here: ^ | ++---+ || +--+-++-++-+ | Downstream || Downstream | |Port||Port| | 06:05.0 || 06:06.0 | +--+-++--+-+ | | +v+ +v+ | Endpoint| | Endpoint| | 07:00.0 | | 08:00.0 | +-+ +-+ that rule is all that's needed for a transaction from 07:00.0 to be forwarded from upstream to the internal switch bus 06, then claimed by 06:06.0 and forwarded downstream to 08:00.0. This is plain old PCI, nothing specific to PCIe. Right, I think the main PCI difference is the point-to-point nature of PCIe vs legacy PCI bus. On a legacy PCI bus there's no way to prevent devices talking to each other, but on PCIe the transaction makes a U-turn at some point and heads out another downstream port. ACS allows us to prevent that from happening. detail: PCIe up/downstream routing is really done by an internal switch; ACS forces the legacy, PCI base-limit address routing and *forces* the switch to always route the transaction from a downstream port to the upstream port. I don't understand ACS very well, but it looks like it basically provides ways to prevent that peer-to-peer forwarding, so transactions would be sent upstream toward the root (and specifically, the IOMMU) instead of being directly claimed by 06:06.0. Yep, that's my meager understanding as well. +1 so we're looking for the furthest upstream device for which acs is enabled all the way up to the root bus. Correct me if this is wrong: To force device A's DMAs to be processed by an IOMMU, ACS must be enabled on the root port and every downstream port along the path to A. 
Yes, modulo this comment in libvirt source: /* if we have no parent, and this is the root bus, ACS doesn't come * into play since devices on the root bus can't P2P without going * through the root IOMMU. */ Correct. PCIe spec says roots must support ACS. I believe all the root bridges that have an IOMMU have ACS wired in/on. Would you mind looking for the paragraph that says this? I'd rather code this into the iommu driver callers than core PCI code if this is just a platform standard. In section 6.12.1.1 of PCIe Base spec, rev 3.0, it states: ACS upstream fwding: Must be implemented by
Re: [PATCH 05/13] pci: New pci_acs_enabled()
On Mon, 2012-05-21 at 09:31 -0400, Don Dutile wrote: On 05/18/2012 10:47 PM, Alex Williamson wrote: On Fri, 2012-05-18 at 19:00 -0400, Don Dutile wrote: On 05/18/2012 06:02 PM, Alex Williamson wrote: On Wed, 2012-05-16 at 09:29 -0400, Don Dutile wrote: On 05/15/2012 05:09 PM, Alex Williamson wrote: On Tue, 2012-05-15 at 13:56 -0600, Bjorn Helgaas wrote: On Mon, May 14, 2012 at 4:49 PM, Alex Williamson alex.william...@redhat.com wrote: On Mon, 2012-05-14 at 16:02 -0600, Bjorn Helgaas wrote: On Fri, May 11, 2012 at 4:56 PM, Alex Williamson alex.william...@redhat.com wrote: In a PCIe environment, transactions aren't always required to reach the root bus before being re-routed. Peer-to-peer DMA may actually not be seen by the IOMMU in these cases. For IOMMU groups, we want to provide IOMMU drivers a way to detect these restrictions. Provided with a PCI device, pci_acs_enabled returns the furthest downstream device with a complete PCI ACS chain. This information can then be used in grouping to create fully isolated groups. ACS chain logic extracted from libvirt. The name pci_acs_enabled() sounds like it returns a boolean, but it doesn't. Right, maybe this should be: struct pci_dev *pci_find_upstream_acs(struct pci_dev *pdev); +1; there is a global in the PCI code, pci_acs_enable, and a function pci_enable_acs(), which the above name certainly confuses. I recommend pci_find_top_acs_bridge() would be most descriptive. Finally, with my email filters fixed, I can see this email... :) Welcome back ;) Indeed... and I recvd 3 copies of this reply, so the pendulum has flipped the other direction... ;-) Yep, the new API I'm working with is: bool pci_acs_enabled(struct pci_dev *pdev, u16 acs_flags); bool pci_acs_path_enabled(struct pci_dev *start, struct pci_dev *end, u16 acs_flags); ok. I'm not sure what a complete PCI ACS chain means. The function starts from dev and searches *upstream*, so I'm guessing it returns the root of a subtree that must be contained in a group. 
Any intermediate switch between an endpoint and the root bus can redirect a dma access without iommu translation, Is this redirection just the normal PCI bridge forwarding that allows peer-to-peer transactions, i.e., the rule (from P2P bridge spec, rev 1.2, sec 4.1) that the bridge apertures define address ranges that are forwarded from primary to secondary interface, and the inverse ranges are forwarded from secondary to primary? For example, here: ^ | ++---+ || +--+-++-++-+ | Downstream || Downstream | |Port||Port| | 06:05.0 || 06:06.0 | +--+-++--+-+ | | +v+ +v+ | Endpoint| | Endpoint| | 07:00.0 | | 08:00.0 | +-+ +-+ that rule is all that's needed for a transaction from 07:00.0 to be forwarded from upstream to the internal switch bus 06, then claimed by 06:06.0 and forwarded downstream to 08:00.0. This is plain old PCI, nothing specific to PCIe. Right, I think the main PCI difference is the point-to-point nature of PCIe vs legacy PCI bus. On a legacy PCI bus there's no way to prevent devices talking to each other, but on PCIe the transaction makes a U-turn at some point and heads out another downstream port. ACS allows us to prevent that from happening. detail: PCIe up/downstream routing is really done by an internal switch; ACS forces the legacy, PCI base-limit address routing and *forces* the switch to always route the transaction from a downstream port to the upstream port. I don't understand ACS very well, but it looks like it basically provides ways to prevent that peer-to-peer forwarding, so transactions would be sent upstream toward the root (and specifically, the IOMMU) instead of being directly claimed by 06:06.0. Yep, that's my meager understanding as well. +1 so we're looking for the furthest upstream device for which acs is enabled all the way up to the root bus. Correct me if this is wrong: To force device A's DMAs to be processed by an IOMMU, ACS must be enabled on the root port and every downstream port along the path to A. 
Yes, modulo this comment in libvirt source: /* if we have no parent, and this is the root bus, ACS doesn't come * into play since devices on the root bus can't P2P without going * through the root IOMMU. */ Correct. PCIe spec says roots must support ACS. I believe all the root bridges that have an IOMMU have ACS wired in/on. Would you mind
Re: [PATCH 05/13] pci: New pci_acs_enabled()
On 05/21/2012 10:59 AM, Alex Williamson wrote: On Mon, 2012-05-21 at 09:31 -0400, Don Dutile wrote: On 05/18/2012 10:47 PM, Alex Williamson wrote: On Fri, 2012-05-18 at 19:00 -0400, Don Dutile wrote: On 05/18/2012 06:02 PM, Alex Williamson wrote: On Wed, 2012-05-16 at 09:29 -0400, Don Dutile wrote: On 05/15/2012 05:09 PM, Alex Williamson wrote: On Tue, 2012-05-15 at 13:56 -0600, Bjorn Helgaas wrote: On Mon, May 14, 2012 at 4:49 PM, Alex Williamson alex.william...@redhat.com wrote: On Mon, 2012-05-14 at 16:02 -0600, Bjorn Helgaas wrote: On Fri, May 11, 2012 at 4:56 PM, Alex Williamson alex.william...@redhat.com wrote: In a PCIe environment, transactions aren't always required to reach the root bus before being re-routed. Peer-to-peer DMA may actually not be seen by the IOMMU in these cases. For IOMMU groups, we want to provide IOMMU drivers a way to detect these restrictions. Provided with a PCI device, pci_acs_enabled returns the furthest downstream device with a complete PCI ACS chain. This information can then be used in grouping to create fully isolated groups. ACS chain logic extracted from libvirt. The name pci_acs_enabled() sounds like it returns a boolean, but it doesn't. Right, maybe this should be: struct pci_dev *pci_find_upstream_acs(struct pci_dev *pdev); +1; there is a global in the PCI code, pci_acs_enable, and a function pci_enable_acs(), which the above name certainly confuses. I recommend pci_find_top_acs_bridge() would be most descriptive. Finally, with my email filters fixed, I can see this email... :) Welcome back ;) Indeed... and I recvd 3 copies of this reply, so the pendulum has flipped the other direction... ;-) Yep, the new API I'm working with is: bool pci_acs_enabled(struct pci_dev *pdev, u16 acs_flags); bool pci_acs_path_enabled(struct pci_dev *start, struct pci_dev *end, u16 acs_flags); ok. I'm not sure what a complete PCI ACS chain means. 
The function starts from dev and searches *upstream*, so I'm guessing it returns the root of a subtree that must be contained in a group. Any intermediate switch between an endpoint and the root bus can redirect a dma access without iommu translation, Is this redirection just the normal PCI bridge forwarding that allows peer-to-peer transactions, i.e., the rule (from P2P bridge spec, rev 1.2, sec 4.1) that the bridge apertures define address ranges that are forwarded from primary to secondary interface, and the inverse ranges are forwarded from secondary to primary? For example, here: ^ | ++---+ || +--+-++-++-+ | Downstream || Downstream | |Port||Port| | 06:05.0 || 06:06.0 | +--+-++--+-+ | | +v+ +v+ | Endpoint| | Endpoint| | 07:00.0 | | 08:00.0 | +-+ +-+ that rule is all that's needed for a transaction from 07:00.0 to be forwarded from upstream to the internal switch bus 06, then claimed by 06:06.0 and forwarded downstream to 08:00.0. This is plain old PCI, nothing specific to PCIe. Right, I think the main PCI difference is the point-to-point nature of PCIe vs legacy PCI bus. On a legacy PCI bus there's no way to prevent devices talking to each other, but on PCIe the transaction makes a U-turn at some point and heads out another downstream port. ACS allows us to prevent that from happening. detail: PCIe up/downstream routing is really done by an internal switch; ACS forces the legacy, PCI base-limit address routing and *forces* the switch to always route the transaction from a downstream port to the upstream port. I don't understand ACS very well, but it looks like it basically provides ways to prevent that peer-to-peer forwarding, so transactions would be sent upstream toward the root (and specifically, the IOMMU) instead of being directly claimed by 06:06.0. Yep, that's my meager understanding as well. +1 so we're looking for the furthest upstream device for which acs is enabled all the way up to the root bus. 
Correct me if this is wrong: To force device A's DMAs to be processed by an IOMMU, ACS must be enabled on the root port and every downstream port along the path to A. Yes, modulo this comment in libvirt source: /* if we have no parent, and this is the root bus, ACS doesn't come * into play since devices on the root bus can't P2P without going * through the root IOMMU. */ Correct. PCIe spec says roots must support ACS. I believe all the root bridges that have an IOMMU have ACS wired in/on. Would you mind looking for the paragraph that says this? I'd rather code this into the iommu driver callers than core PCI code if this is just a
Re: RESEND3: Re: [PATCH 05/13] pci: New pci_acs_enabled()
On 05/18/2012 06:02 PM, Alex Williamson wrote: On Wed, 2012-05-16 at 09:29 -0400, Don Dutile wrote: On 05/15/2012 05:09 PM, Alex Williamson wrote: On Tue, 2012-05-15 at 13:56 -0600, Bjorn Helgaas wrote: On Mon, May 14, 2012 at 4:49 PM, Alex Williamson alex.william...@redhat.com wrote: On Mon, 2012-05-14 at 16:02 -0600, Bjorn Helgaas wrote: On Fri, May 11, 2012 at 4:56 PM, Alex Williamson alex.william...@redhat.com wrote: In a PCIe environment, transactions aren't always required to reach the root bus before being re-routed. Peer-to-peer DMA may actually not be seen by the IOMMU in these cases. For IOMMU groups, we want to provide IOMMU drivers a way to detect these restrictions. Provided with a PCI device, pci_acs_enabled returns the furthest downstream device with a complete PCI ACS chain. This information can then be used in grouping to create fully isolated groups. ACS chain logic extracted from libvirt. The name pci_acs_enabled() sounds like it returns a boolean, but it doesn't. Right, maybe this should be: struct pci_dev *pci_find_upstream_acs(struct pci_dev *pdev); +1; there is a global in the PCI code, pci_acs_enable, and a function pci_enable_acs(), which the above name certainly confuses. I recommend pci_find_top_acs_bridge() would be most descriptive. Finally, with my email filters fixed, I can see this email... :) Yep, the new API I'm working with is: bool pci_acs_enabled(struct pci_dev *pdev, u16 acs_flags); bool pci_acs_path_enabled(struct pci_dev *start, struct pci_dev *end, u16 acs_flags); ok. I'm not sure what a complete PCI ACS chain means. The function starts from dev and searches *upstream*, so I'm guessing it returns the root of a subtree that must be contained in a group. 
Any intermediate switch between an endpoint and the root bus can redirect a dma access without iommu translation, Is this redirection just the normal PCI bridge forwarding that allows peer-to-peer transactions, i.e., the rule (from P2P bridge spec, rev 1.2, sec 4.1) that the bridge apertures define address ranges that are forwarded from primary to secondary interface, and the inverse ranges are forwarded from secondary to primary? For example, here: ^ | ++---+ || +--+-++-++-+ | Downstream || Downstream | |Port||Port| | 06:05.0 || 06:06.0 | +--+-++--+-+ | | +v+ +v+ | Endpoint| | Endpoint| | 07:00.0 | | 08:00.0 | +-+ +-+ that rule is all that's needed for a transaction from 07:00.0 to be forwarded from upstream to the internal switch bus 06, then claimed by 06:06.0 and forwarded downstream to 08:00.0. This is plain old PCI, nothing specific to PCIe. Right, I think the main PCI difference is the point-to-point nature of PCIe vs legacy PCI bus. On a legacy PCI bus there's no way to prevent devices talking to each other, but on PCIe the transaction makes a U-turn at some point and heads out another downstream port. ACS allows us to prevent that from happening. detail: PCIe up/downstream routing is really done by an internal switch; ACS forces the legacy, PCI base-limit address routing and *forces* the switch to always route the transaction from a downstream port to the upstream port. I don't understand ACS very well, but it looks like it basically provides ways to prevent that peer-to-peer forwarding, so transactions would be sent upstream toward the root (and specifically, the IOMMU) instead of being directly claimed by 06:06.0. Yep, that's my meager understanding as well. +1 so we're looking for the furthest upstream device for which acs is enabled all the way up to the root bus. Correct me if this is wrong: To force device A's DMAs to be processed by an IOMMU, ACS must be enabled on the root port and every downstream port along the path to A. 
Yes, modulo this comment in libvirt source: /* if we have no parent, and this is the root bus, ACS doesn't come * into play since devices on the root bus can't P2P without going * through the root IOMMU. */ Correct. PCIe spec says roots must support ACS. I believe all the root bridges that have an IOMMU have ACS wired in/on. Would you mind looking for the paragraph that says this? I'd rather code this into the iommu driver callers than core PCI code if this is just a platform standard. In section 6.12.1.1 of PCIe Base spec, rev 3.0, it states: ACS upstream fwding: Must be implemented by Root Ports if the RC supports Redirected Request Validation; -- which means, if a Root port allows a peer-to-peer transaction to another one of its ports, then it has to support ACS. So, this means that: (a) if a Root
Re: [PATCH 05/13] pci: New pci_acs_enabled()
On Fri, 2012-05-18 at 19:00 -0400, Don Dutile wrote: On 05/18/2012 06:02 PM, Alex Williamson wrote: On Wed, 2012-05-16 at 09:29 -0400, Don Dutile wrote: On 05/15/2012 05:09 PM, Alex Williamson wrote: On Tue, 2012-05-15 at 13:56 -0600, Bjorn Helgaas wrote: On Mon, May 14, 2012 at 4:49 PM, Alex Williamson alex.william...@redhat.com wrote: On Mon, 2012-05-14 at 16:02 -0600, Bjorn Helgaas wrote: On Fri, May 11, 2012 at 4:56 PM, Alex Williamson alex.william...@redhat.com wrote: In a PCIe environment, transactions aren't always required to reach the root bus before being re-routed. Peer-to-peer DMA may actually not be seen by the IOMMU in these cases. For IOMMU groups, we want to provide IOMMU drivers a way to detect these restrictions. Provided with a PCI device, pci_acs_enabled returns the furthest downstream device with a complete PCI ACS chain. This information can then be used in grouping to create fully isolated groups. ACS chain logic extracted from libvirt. The name pci_acs_enabled() sounds like it returns a boolean, but it doesn't. Right, maybe this should be: struct pci_dev *pci_find_upstream_acs(struct pci_dev *pdev); +1; there is a global in the PCI code, pci_acs_enable, and a function pci_enable_acs(), which the above name certainly confuses. I recommend pci_find_top_acs_bridge() would be most descriptive. Finally, with my email filters fixed, I can see this email... :) Welcome back ;) Yep, the new API I'm working with is: bool pci_acs_enabled(struct pci_dev *pdev, u16 acs_flags); bool pci_acs_path_enabled(struct pci_dev *start, struct pci_dev *end, u16 acs_flags); ok. I'm not sure what a complete PCI ACS chain means. The function starts from dev and searches *upstream*, so I'm guessing it returns the root of a subtree that must be contained in a group. 
Any intermediate switch between an endpoint and the root bus can redirect a dma access without iommu translation, Is this redirection just the normal PCI bridge forwarding that allows peer-to-peer transactions, i.e., the rule (from P2P bridge spec, rev 1.2, sec 4.1) that the bridge apertures define address ranges that are forwarded from primary to secondary interface, and the inverse ranges are forwarded from secondary to primary? For example, here: ^ | ++---+ || +--+-++-++-+ | Downstream || Downstream | |Port||Port| | 06:05.0 || 06:06.0 | +--+-++--+-+ | | +v+ +v+ | Endpoint| | Endpoint| | 07:00.0 | | 08:00.0 | +-+ +-+ that rule is all that's needed for a transaction from 07:00.0 to be forwarded from upstream to the internal switch bus 06, then claimed by 06:06.0 and forwarded downstream to 08:00.0. This is plain old PCI, nothing specific to PCIe. Right, I think the main PCI difference is the point-to-point nature of PCIe vs legacy PCI bus. On a legacy PCI bus there's no way to prevent devices talking to each other, but on PCIe the transaction makes a U-turn at some point and heads out another downstream port. ACS allows us to prevent that from happening. detail: PCIe up/downstream routing is really done by an internal switch; ACS forces the legacy, PCI base-limit address routing and *forces* the switch to always route the transaction from a downstream port to the upstream port. I don't understand ACS very well, but it looks like it basically provides ways to prevent that peer-to-peer forwarding, so transactions would be sent upstream toward the root (and specifically, the IOMMU) instead of being directly claimed by 06:06.0. Yep, that's my meager understanding as well. +1 so we're looking for the furthest upstream device for which acs is enabled all the way up to the root bus. Correct me if this is wrong: To force device A's DMAs to be processed by an IOMMU, ACS must be enabled on the root port and every downstream port along the path to A. 
Yes, modulo this comment in libvirt source: /* if we have no parent, and this is the root bus, ACS doesn't come * into play since devices on the root bus can't P2P without going * through the root IOMMU. */ Correct. PCIe spec says roots must support ACS. I believe all the root bridges that have an IOMMU have ACS wired in/on. Would you mind looking for the paragraph that says this? I'd rather code this into the iommu driver callers than core PCI code if this is just a platform standard. In section 6.12.1.1 of PCIe Base spec, rev 3.0, it states: ACS upstream fwding: Must be implemented
Re: [PATCH 05/13] pci: New pci_acs_enabled()
On 05/15/2012 05:09 PM, Alex Williamson wrote: On Tue, 2012-05-15 at 13:56 -0600, Bjorn Helgaas wrote: On Mon, May 14, 2012 at 4:49 PM, Alex Williamson alex.william...@redhat.com wrote: On Mon, 2012-05-14 at 16:02 -0600, Bjorn Helgaas wrote: On Fri, May 11, 2012 at 4:56 PM, Alex Williamson alex.william...@redhat.com wrote: In a PCIe environment, transactions aren't always required to reach the root bus before being re-routed. Peer-to-peer DMA may actually not be seen by the IOMMU in these cases. For IOMMU groups, we want to provide IOMMU drivers a way to detect these restrictions. Provided with a PCI device, pci_acs_enabled returns the furthest downstream device with a complete PCI ACS chain. This information can then be used in grouping to create fully isolated groups. ACS chain logic extracted from libvirt. The name pci_acs_enabled() sounds like it returns a boolean, but it doesn't. Right, maybe this should be: struct pci_dev *pci_find_upstream_acs(struct pci_dev *pdev); +1; there is a global in the PCI code, pci_acs_enable, and a function pci_enable_acs(), which the above name certainly confuses. I recommend pci_find_top_acs_bridge() would be most descriptive. I'm not sure what a complete PCI ACS chain means. The function starts from dev and searches *upstream*, so I'm guessing it returns the root of a subtree that must be contained in a group. Any intermediate switch between an endpoint and the root bus can redirect a dma access without iommu translation, Is this redirection just the normal PCI bridge forwarding that allows peer-to-peer transactions, i.e., the rule (from P2P bridge spec, rev 1.2, sec 4.1) that the bridge apertures define address ranges that are forwarded from primary to secondary interface, and the inverse ranges are forwarded from secondary to primary? 
For example, here: ^ | ++---+ || +--+-++-++-+ | Downstream || Downstream | |Port||Port| | 06:05.0 || 06:06.0 | +--+-++--+-+ | | +v+ +v+ | Endpoint| | Endpoint| | 07:00.0 | | 08:00.0 | +-+ +-+ that rule is all that's needed for a transaction from 07:00.0 to be forwarded from upstream to the internal switch bus 06, then claimed by 06:06.0 and forwarded downstream to 08:00.0. This is plain old PCI, nothing specific to PCIe. Right, I think the main PCI difference is the point-to-point nature of PCIe vs legacy PCI bus. On a legacy PCI bus there's no way to prevent devices talking to each other, but on PCIe the transaction makes a U-turn at some point and heads out another downstream port. ACS allows us to prevent that from happening. detail: PCIe up/downstream routing is really done by an internal switch; ACS forces the legacy, PCI base-limit address routing and *forces* the switch to always route the transaction from a downstream port to the upstream port. I don't understand ACS very well, but it looks like it basically provides ways to prevent that peer-to-peer forwarding, so transactions would be sent upstream toward the root (and specifically, the IOMMU) instead of being directly claimed by 06:06.0. Yep, that's my meager understanding as well. +1 so we're looking for the furthest upstream device for which acs is enabled all the way up to the root bus. Correct me if this is wrong: To force device A's DMAs to be processed by an IOMMU, ACS must be enabled on the root port and every downstream port along the path to A. Yes, modulo this comment in libvirt source: /* if we have no parent, and this is the root bus, ACS doesn't come * into play since devices on the root bus can't P2P without going * through the root IOMMU. */ Correct. PCIe spec says roots must support ACS. I believe all the root bridges that have an IOMMU have ACS wired in/on. So we assume that a redirect at the point of the iommu will factor in iommu translation. 
If so, I think you're trying to find out the closest upstream device X such that everything leading to X has ACS enabled. Every device below X can DMA freely to other devices below X, so they would all have to be in the same isolated group. Yes I tried to work through some examples to develop some intuition about this: (inserting fixed url) http://www.asciiflow.com/#3736558963405980039 pci_acs_enabled(00:00.0) = 00:00.0 (on root bus (but doesn't it matter if 00:00.0 is PCIe or if RP has ACS?)) Hmm, the latter is the assumption above. For the former, I think libvirt was probably assuming that PCI devices must have a PCIe device upstream from them because x86 doesn't have assignment friendly IOMMUs except on PCIe. I'll need to work on making that more generic. pci_acs_enabled(00:01.0) = 00:01.0 (on root bus)
Re: [PATCH 05/13] pci: New pci_acs_enabled()
On Wed, 2012-05-16 at 09:29 -0400, Don Dutile wrote: On 05/15/2012 05:09 PM, Alex Williamson wrote: On Tue, 2012-05-15 at 13:56 -0600, Bjorn Helgaas wrote: On Mon, May 14, 2012 at 4:49 PM, Alex Williamson alex.william...@redhat.com wrote: On Mon, 2012-05-14 at 16:02 -0600, Bjorn Helgaas wrote: On Fri, May 11, 2012 at 4:56 PM, Alex Williamson alex.william...@redhat.com wrote: In a PCIe environment, transactions aren't always required to reach the root bus before being re-routed. Peer-to-peer DMA may actually not be seen by the IOMMU in these cases. For IOMMU groups, we want to provide IOMMU drivers a way to detect these restrictions. Provided with a PCI device, pci_acs_enabled returns the furthest downstream device with a complete PCI ACS chain. This information can then be used in grouping to create fully isolated groups. ACS chain logic extracted from libvirt. The name pci_acs_enabled() sounds like it returns a boolean, but it doesn't. Right, maybe this should be: struct pci_dev *pci_find_upstream_acs(struct pci_dev *pdev); +1; there is a global in the PCI code, pci_acs_enable, and a function pci_enable_acs(), which the above name certainly confuses. I recommend pci_find_top_acs_bridge() would be most descriptive. Yep, the new API I'm working with is: bool pci_acs_enabled(struct pci_dev *pdev, u16 acs_flags); bool pci_acs_path_enabled(struct pci_dev *start, struct pci_dev *end, u16 acs_flags); I'm not sure what a complete PCI ACS chain means. The function starts from dev and searches *upstream*, so I'm guessing it returns the root of a subtree that must be contained in a group. 
Any intermediate switch between an endpoint and the root bus can redirect a dma access without iommu translation, Is this redirection just the normal PCI bridge forwarding that allows peer-to-peer transactions, i.e., the rule (from P2P bridge spec, rev 1.2, sec 4.1) that the bridge apertures define address ranges that are forwarded from primary to secondary interface, and the inverse ranges are forwarded from secondary to primary? For example, here: ^ | ++---+ || +--+-++-++-+ | Downstream || Downstream | |Port||Port| | 06:05.0 || 06:06.0 | +--+-++--+-+ | | +v+ +v+ | Endpoint| | Endpoint| | 07:00.0 | | 08:00.0 | +-+ +-+ that rule is all that's needed for a transaction from 07:00.0 to be forwarded from upstream to the internal switch bus 06, then claimed by 06:06.0 and forwarded downstream to 08:00.0. This is plain old PCI, nothing specific to PCIe. Right, I think the main PCI difference is the point-to-point nature of PCIe vs legacy PCI bus. On a legacy PCI bus there's no way to prevent devices talking to each other, but on PCIe the transaction makes a U-turn at some point and heads out another downstream port. ACS allows us to prevent that from happening. detail: PCIe up/downstream routing is really done by an internal switch; ACS forces the legacy, PCI base-limit address routing and *forces* the switch to always route the transaction from a downstream port to the upstream port. I don't understand ACS very well, but it looks like it basically provides ways to prevent that peer-to-peer forwarding, so transactions would be sent upstream toward the root (and specifically, the IOMMU) instead of being directly claimed by 06:06.0. Yep, that's my meager understanding as well. +1 so we're looking for the furthest upstream device for which acs is enabled all the way up to the root bus. Correct me if this is wrong: To force device A's DMAs to be processed by an IOMMU, ACS must be enabled on the root port and every downstream port along the path to A. 
Yes, modulo this comment in libvirt source: /* if we have no parent, and this is the root bus, ACS doesn't come * into play since devices on the root bus can't P2P without going * through the root IOMMU. */ Correct. PCIe spec says roots must support ACS. I believe all the root bridges that have an IOMMU have ACS wired in/on. Would you mind looking for the paragraph that says this? I'd rather code this into the iommu driver callers than core PCI code if this is just a platform standard. So we assume that a redirect at the point of the iommu will factor in iommu translation. If so, I think you're trying to find out the closest upstream device X such that everything leading to X has ACS enabled. Every device below X can DMA freely to other devices below X, so they would all have to be in the same isolated group.
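The two predicates named in this message can be modelled in user space to make the intended shape concrete. This is only a sketch under stated assumptions: struct model_dev, the MODEL_ACS_* names and the "walk from start up to, but not including, end" semantics are all hypothetical and are not taken from the patch. The flag values follow the bit layout of the ACS Control Register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag names; bit positions follow the ACS Control Register. */
#define MODEL_ACS_SV 0x01 /* source validation */
#define MODEL_ACS_RR 0x04 /* P2P request redirect */
#define MODEL_ACS_CR 0x08 /* P2P completion redirect */
#define MODEL_ACS_UF 0x10 /* upstream forwarding */

/* Hypothetical stand-in for a PCI device; not the kernel's struct pci_dev. */
struct model_dev {
    const struct model_dev *parent; /* upstream device, NULL on the root bus */
    uint16_t acs_ctrl;              /* ACS flags currently enabled here */
};

/* True only if every flag requested in acs_flags is enabled on dev. */
bool model_acs_enabled(const struct model_dev *dev, uint16_t acs_flags)
{
    assert(dev != NULL);
    return (dev->acs_ctrl & acs_flags) == acs_flags;
}

/*
 * Assumed semantics: check each device from start upward, stopping
 * before end. Passing end == NULL means "all the way to the root bus".
 */
bool model_acs_path_enabled(const struct model_dev *start,
                            const struct model_dev *end,
                            uint16_t acs_flags)
{
    const struct model_dev *d;

    for (d = start; d != end; d = d->parent) {
        if (d == NULL)
            return false; /* passed the root bus without meeting end */
        if (!model_acs_enabled(d, acs_flags))
            return false;
    }
    return true;
}
```

Taking a flag mask rather than a single "ACS on/off" bit lets each caller decide which ACS features it needs for isolation, which fits the preference expressed above for keeping platform policy in the IOMMU driver callers.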
Re: [PATCH 05/13] pci: New pci_acs_enabled()
On Mon, May 14, 2012 at 4:49 PM, Alex Williamson alex.william...@redhat.com wrote:
> On Mon, 2012-05-14 at 16:02 -0600, Bjorn Helgaas wrote:
>> On Fri, May 11, 2012 at 4:56 PM, Alex Williamson alex.william...@redhat.com wrote:
>>> In a PCIe environment, transactions aren't always required to reach the root bus before being re-routed. Peer-to-peer DMA may actually not be seen by the IOMMU in these cases. For IOMMU groups, we want to provide IOMMU drivers a way to detect these restrictions. Provided with a PCI device, pci_acs_enabled returns the furthest downstream device with a complete PCI ACS chain. This information can then be used in grouping to create fully isolated groups. ACS chain logic extracted from libvirt.
>>
>> The name pci_acs_enabled() sounds like it returns a boolean, but it doesn't.
>
> Right, maybe this should be:
>
> struct pci_dev *pci_find_upstream_acs(struct pci_dev *pdev);
>
>> I'm not sure what a complete PCI ACS chain means. The function starts from dev and searches *upstream*, so I'm guessing it returns the root of a subtree that must be contained in a group.
>
> Any intermediate switch between an endpoint and the root bus can redirect a dma access without iommu translation,

Is this redirection just the normal PCI bridge forwarding that allows peer-to-peer transactions, i.e., the rule (from P2P bridge spec, rev 1.2, sec 4.1) that the bridge apertures define address ranges that are forwarded from primary to secondary interface, and the inverse ranges are forwarded from secondary to primary? For example, here:

                 ^
                 |
        +--------+-------+
        |                |
 +------+-----+    +-----++-----+
 | Downstream |    | Downstream |
 |    Port    |    |    Port    |
 |   06:05.0  |    |   06:06.0  |
 +------+-----+    +------+-----+
        |                 |
   +----v----+       +----v----+
   | Endpoint|       | Endpoint|
   | 07:00.0 |       | 08:00.0 |
   +---------+       +---------+

that rule is all that's needed for a transaction from 07:00.0 to be forwarded from upstream to the internal switch bus 06, then claimed by 06:06.0 and forwarded downstream to 08:00.0. This is plain old PCI, nothing specific to PCIe.

I don't understand ACS very well, but it looks like it basically provides ways to prevent that peer-to-peer forwarding, so transactions would be sent upstream toward the root (and specifically, the IOMMU) instead of being directly claimed by 06:06.0.

> so we're looking for the furthest upstream device for which acs is enabled all the way up to the root bus.

Correct me if this is wrong: To force device A's DMAs to be processed by an IOMMU, ACS must be enabled on the root port and every downstream port along the path to A.

If so, I think you're trying to find out the closest upstream device X such that everything leading to X has ACS enabled. Every device below X can DMA freely to other devices below X, so they would all have to be in the same isolated group.

I tried to work through some examples to develop some intuition about this:

[ASCII topology diagram, garbled in transit. Recoverable node labels: 00:00.0 PCI; 00:01.0 PCIe-to-PCI; 00:02.0 Upstream; 02:00.0 Downstream w/o ACS; 02:01.0 Downstream w/ ACS; 02:02.0 Downstream w/ ACS; 01:00.0 PCI; 01:01.0 PCI; 03:00.0 PCIe; 04:00.0 PCIe w/o ACS; 05:00.0 PCIe w/ ACS]

pci_acs_enabled(00:00.0) = 00:00.0 (on root bus (but doesn't it matter if 00:00.0 is PCIe or if RP has ACS?))
Re: [PATCH 05/13] pci: New pci_acs_enabled()
I tried to work through some examples to develop some intuition about this: Sorry, gmail inserted line breaks that ruined this picture. Here's a URL for it: http://www.asciiflow.com/#3736558963405980039
Re: [PATCH 05/13] pci: New pci_acs_enabled()
On Tue, 2012-05-15 at 13:56 -0600, Bjorn Helgaas wrote:
> On Mon, May 14, 2012 at 4:49 PM, Alex Williamson alex.william...@redhat.com wrote:
> > On Mon, 2012-05-14 at 16:02 -0600, Bjorn Helgaas wrote:
> > > On Fri, May 11, 2012 at 4:56 PM, Alex Williamson alex.william...@redhat.com wrote:
> > > > In a PCIe environment, transactions aren't always required to
> > > > reach the root bus before being re-routed.  Peer-to-peer DMA
> > > > may actually not be seen by the IOMMU in these cases.  For
> > > > IOMMU groups, we want to provide IOMMU drivers a way to detect
> > > > these restrictions.  Provided with a PCI device, pci_acs_enabled
> > > > returns the furthest downstream device with a complete PCI ACS
> > > > chain.  This information can then be used in grouping to create
> > > > fully isolated groups.  ACS chain logic extracted from libvirt.
> > >
> > > The name pci_acs_enabled() sounds like it returns a boolean, but it doesn't.
> >
> > Right, maybe this should be:
> >
> > struct pci_dev *pci_find_upstream_acs(struct pci_dev *pdev);
> >
> > > I'm not sure what a complete PCI ACS chain means.
> > >
> > > The function starts from dev and searches *upstream*, so I'm guessing
> > > it returns the root of a subtree that must be contained in a group.
> >
> > Any intermediate switch between an endpoint and the root bus can
> > redirect a dma access without iommu translation,
>
> Is this redirection just the normal PCI bridge forwarding that allows
> peer-to-peer transactions, i.e., the rule (from P2P bridge spec, rev
> 1.2, sec 4.1) that the bridge apertures define address ranges that are
> forwarded from primary to secondary interface, and the inverse ranges
> are forwarded from secondary to primary?  For example, here:
>
>                    ^
>                    |
>           +--------+--------+
>           |                 |
>     +-----+------+    +-----+------+
>     | Downstream |    | Downstream |
>     |    Port    |    |    Port    |
>     |  06:05.0   |    |  06:06.0   |
>     +-----+------+    +-----+------+
>           |                 |
>      +----v----+       +----v----+
>      | Endpoint|       | Endpoint|
>      | 07:00.0 |       | 08:00.0 |
>      +---------+       +---------+
>
> that rule is all that's needed for a transaction from 07:00.0 to be
> forwarded from upstream to the internal switch bus 06, then claimed by
> 06:06.0 and forwarded downstream to 08:00.0.  This is plain old PCI,
> nothing specific to PCIe.

Right, I think the main PCI difference is the point-to-point nature of
PCIe vs legacy PCI bus.  On a legacy PCI bus there's no way to prevent
devices talking to each other, but on PCIe the transaction makes a
U-turn at some point and heads out another downstream port.  ACS allows
us to prevent that from happening.

> I don't understand ACS very well, but it looks like it basically
> provides ways to prevent that peer-to-peer forwarding, so transactions
> would be sent upstream toward the root (and specifically, the IOMMU)
> instead of being directly claimed by 06:06.0.

Yep, that's my meager understanding as well.

> > so we're looking for the furthest upstream device for which acs is
> > enabled all the way up to the root bus.
>
> Correct me if this is wrong: To force device A's DMAs to be processed
> by an IOMMU, ACS must be enabled on the root port and every downstream
> port along the path to A.

Yes, modulo this comment in libvirt source:

    /* if we have no parent, and this is the root bus, ACS doesn't come
     * into play since devices on the root bus can't P2P without going
     * through the root IOMMU.
     */

So we assume that a redirect at the point of the iommu will factor in
iommu translation.

> If so, I think you're trying to find out the closest upstream device X
> such that everything leading to X has ACS enabled.  Every device below
> X can DMA freely to other devices below X, so they would all have to
> be in the same isolated group.

Yes

> I tried to work through some examples to develop some intuition about this:

(inserting fixed url)
> http://www.asciiflow.com/#3736558963405980039

> pci_acs_enabled(00:00.0) = 00:00.0 (on root bus (but doesn't it matter
> if 00:00.0 is PCIe or if RP has ACS?))

Hmm, the latter is the assumption above.  For the former, I think
libvirt was probably assuming that PCI devices must have a PCIe device
upstream from them because x86 doesn't have assignment friendly IOMMUs
except on PCIe.  I'll need to work on making that more generic.

> pci_acs_enabled(00:01.0) = 00:01.0 (on root bus)
> pci_acs_enabled(01:00.0) = 01:00.0 (acs_dev = 00:01.0, 01:00.0 is not
> PCIe; seems wrong)

Oops, I'm calling pci_find_upstream_pcie_bridge() first on any of my
input devices, so this was passing for me.  I'll need to incorporate
that generically.

> pci_acs_enabled(00:02.0) = 00:02.0 (on root bus; seems wrong if RP
> doesn't have ACS)

Yeah, let me validate the libvirt assumption.  I see ACS on my root
port, so maybe they're just assuming it's always enabled or that the
precedence favors IOMMU translation.  I'm also starting to think that
we might want from and to
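The message above is cut off at "from and to", so what follows is only one possible reading: a check that takes both ends of a path and a set of required ACS bits, in the spirit of the two-argument form mentioned elsewhere in this thread. It is a minimal userspace sketch, not the kernel API. struct toy_bridge, toy_acs_path_enabled() and the three example bridges are hypothetical, the ACS bit values are the ones I understand the kernel's pci_regs.h to use, and devices to which ACS does not apply are ignored entirely.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ACS control bits; values assumed to match the kernel's pci_regs.h. */
#define PCI_ACS_SV 0x01	/* Source Validation */
#define PCI_ACS_RR 0x04	/* P2P Request Redirect */
#define PCI_ACS_CR 0x08	/* P2P Completion Redirect */
#define PCI_ACS_UF 0x10	/* Upstream Forwarding */
#define TOY_ACS_ALL (PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF)

/* Hypothetical stand-in for a bridge or port in the hierarchy. */
struct toy_bridge {
	struct toy_bridge *parent;	/* NULL when the bridge sits on the root bus */
	uint16_t acs_ctrl;		/* ACS control bits currently enabled */
};

/*
 * Walk upstream from start. Returns true only if every bridge visited,
 * start and end included, has all of acs_flags enabled. end == NULL means
 * "walk all the way to the root bus". If end is not an ancestor of start,
 * the walk runs out at the root bus and the result is false.
 */
bool toy_acs_path_enabled(const struct toy_bridge *start,
			  const struct toy_bridge *end, uint16_t acs_flags)
{
	const struct toy_bridge *d = start;

	for (;;) {
		if ((d->acs_ctrl & acs_flags) != acs_flags)
			return false;
		if (d == end || !d->parent)
			break;
		d = d->parent;
	}
	return end == NULL || d == end;
}

/* Example chain: two downstream ports below one root port. */
struct toy_bridge root_port   = { NULL, TOY_ACS_ALL };
struct toy_bridge dsp_ok      = { &root_port, TOY_ACS_ALL };
struct toy_bridge dsp_partial = { &root_port, PCI_ACS_SV | PCI_ACS_RR };
```

With this table, toy_acs_path_enabled(&dsp_ok, NULL, TOY_ACS_ALL) is true, while the same call on dsp_partial is false because CR and UF are missing there. Passing the required bits as an argument lets a caller decide which ACS features matter for its own isolation policy instead of hard-coding one mask.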
Re: [PATCH 05/13] pci: New pci_acs_enabled()
On Fri, May 11, 2012 at 4:56 PM, Alex Williamson alex.william...@redhat.com wrote:
> In a PCIe environment, transactions aren't always required to
> reach the root bus before being re-routed.  Peer-to-peer DMA
> may actually not be seen by the IOMMU in these cases.  For
> IOMMU groups, we want to provide IOMMU drivers a way to detect
> these restrictions.  Provided with a PCI device, pci_acs_enabled
> returns the furthest downstream device with a complete PCI ACS
> chain.  This information can then be used in grouping to create
> fully isolated groups.  ACS chain logic extracted from libvirt.

The name pci_acs_enabled() sounds like it returns a boolean, but it doesn't.

I'm not sure what a complete PCI ACS chain means.

The function starts from dev and searches *upstream*, so I'm guessing
it returns the root of a subtree that must be contained in a group.

> Signed-off-by: Alex Williamson alex.william...@redhat.com
> ---
>
>  drivers/pci/pci.c   |   43 +++
>  include/linux/pci.h |    1 +
>  2 files changed, 44 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 111569c..d7f05ce 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -2358,6 +2358,49 @@ void pci_enable_acs(struct pci_dev *dev)
>  	pci_write_config_word(dev, pos + PCI_ACS_CTRL, ctrl);
>  }
>
> +#define PCI_EXT_CAP_ACS_ENABLED	(PCI_ACS_SV | PCI_ACS_RR | \
> +				 PCI_ACS_CR | PCI_ACS_UF)
> +
> +/**
> + * pci_acs_enabled - test ACS support in downstream chain
> + * @dev: starting PCI device
> + *
> + * Returns the furthest downstream device with an unbroken ACS chain.  If
> + * ACS is enabled throughout the chain, the returned device is the same as
> + * the one passed in.
> + */
> +struct pci_dev *pci_acs_enabled(struct pci_dev *dev)
> +{
> +	struct pci_dev *acs_dev;
> +	int pos;
> +	u16 ctrl;
> +
> +	if (!pci_is_root_bus(dev->bus))
> +		acs_dev = pci_acs_enabled(dev->bus->self);
> +	else
> +		return dev;
> +
> +	/* If the chain is already broken, pass on the device */
> +	if (acs_dev != dev->bus->self)
> +		return acs_dev;
> +
> +	if (!pci_is_pcie(dev) || (dev->class >> 8) != PCI_CLASS_BRIDGE_PCI)
> +		return dev;
> +
> +	if (dev->pcie_type != PCI_EXP_TYPE_DOWNSTREAM)
> +		return dev;
> +
> +	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ACS);
> +	if (!pos)
> +		return acs_dev;
> +
> +	pci_read_config_word(dev, pos + PCI_ACS_CTRL, &ctrl);
> +	if ((ctrl & PCI_EXT_CAP_ACS_ENABLED) != PCI_EXT_CAP_ACS_ENABLED)
> +		return acs_dev;
> +
> +	return dev;
> +}
> +
>  /**
>   * pci_swizzle_interrupt_pin - swizzle INTx for device behind bridge
>   * @dev: the PCI device
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 9910b5c..dc25da3 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1586,6 +1586,7 @@ static inline bool pci_is_pcie(struct pci_dev *dev)
>  }
>
>  void pci_request_acs(void);
> +struct pci_dev *pci_acs_enabled(struct pci_dev *dev);
>
>  #define PCI_VPD_LRDT			0x80	/* Large Resource Data Type */
>
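The walk in this patch can be modelled outside the kernel, which makes it easier to see what it returns for a given tree. The following is a minimal userspace sketch, not kernel code: struct toy_dev, its fields, toy_acs_enabled() and the device table are all hypothetical stand-ins, and the topology is an assumption read from the damaged example diagram elsewhere in this thread (only the devices needed for the checks are included). The function keeps the same decision order as the patch and folds "no ACS capability" and "not all four control bits set" into one acs_on flag, since the patch returns acs_dev in both cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct pci_dev; every field here is hypothetical. */
struct toy_dev {
	const char *name;		/* bus:dev.fn label */
	struct toy_dev *parent;		/* bridge above; NULL when on the root bus */
	bool is_pcie;
	bool is_bridge;
	bool is_downstream_port;
	bool acs_on;			/* ACS present with SV, RR, CR and UF all set */
};

/* Same decision order as pci_acs_enabled() in the patch. */
struct toy_dev *toy_acs_enabled(struct toy_dev *dev)
{
	struct toy_dev *acs_dev;

	if (!dev->parent)
		return dev;		/* device is on the root bus */

	acs_dev = toy_acs_enabled(dev->parent);
	if (acs_dev != dev->parent)
		return acs_dev;		/* chain already broken above us */

	if (!dev->is_pcie || !dev->is_bridge)
		return dev;
	if (!dev->is_downstream_port)
		return dev;

	return dev->acs_on ? dev : acs_dev;
}

/* Assumed topology:          name       parent   pcie   bridge down   acs  */
struct toy_dev d00_00 = { "00:00.0", NULL,    false, false, false, false };
struct toy_dev d00_01 = { "00:01.0", NULL,    true,  true,  false, false };
struct toy_dev d00_02 = { "00:02.0", NULL,    true,  true,  false, false };
struct toy_dev d01_00 = { "01:00.0", &d00_01, false, false, false, false };
struct toy_dev d02_00 = { "02:00.0", &d00_02, true,  true,  true,  false };
struct toy_dev d02_01 = { "02:01.0", &d00_02, true,  true,  true,  true  };
struct toy_dev d03_00 = { "03:00.0", &d02_00, true,  false, false, false };
struct toy_dev d04_00 = { "04:00.0", &d02_01, true,  false, false, false };

/* The four results listed in the thread for this topology. */
void toy_check_listed_results(void)
{
	assert(toy_acs_enabled(&d00_00) == &d00_00);
	assert(toy_acs_enabled(&d00_01) == &d00_01);
	assert(toy_acs_enabled(&d01_00) == &d01_00);
	assert(toy_acs_enabled(&d00_02) == &d00_02);
}
```

Under these assumptions an endpoint below the port without ACS (03:00.0) resolves to 00:02.0, while one below a port with ACS (04:00.0) resolves to itself. The model also returns 01:00.0 for the conventional PCI device behind the PCIe-to-PCI bridge, which is the case questioned in the thread.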
Re: [PATCH 05/13] pci: New pci_acs_enabled()
On Mon, 2012-05-14 at 16:02 -0600, Bjorn Helgaas wrote:
> On Fri, May 11, 2012 at 4:56 PM, Alex Williamson alex.william...@redhat.com wrote:
> > In a PCIe environment, transactions aren't always required to
> > reach the root bus before being re-routed.  Peer-to-peer DMA
> > may actually not be seen by the IOMMU in these cases.  For
> > IOMMU groups, we want to provide IOMMU drivers a way to detect
> > these restrictions.  Provided with a PCI device, pci_acs_enabled
> > returns the furthest downstream device with a complete PCI ACS
> > chain.  This information can then be used in grouping to create
> > fully isolated groups.  ACS chain logic extracted from libvirt.
>
> The name pci_acs_enabled() sounds like it returns a boolean, but it doesn't.

Right, maybe this should be:

struct pci_dev *pci_find_upstream_acs(struct pci_dev *pdev);

> I'm not sure what a complete PCI ACS chain means.
>
> The function starts from dev and searches *upstream*, so I'm guessing
> it returns the root of a subtree that must be contained in a group.

Any intermediate switch between an endpoint and the root bus can
redirect a dma access without iommu translation, so we're looking for
the furthest upstream device for which acs is enabled all the way up to
the root bus.  I'll fix the function name and comments/commit log if
that makes it sufficiently clear.  Thanks,

Alex

> > Signed-off-by: Alex Williamson alex.william...@redhat.com
> > ---
> >
> >  drivers/pci/pci.c   |   43 +++
> >  include/linux/pci.h |    1 +
> >  2 files changed, 44 insertions(+), 0 deletions(-)
> >
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index 111569c..d7f05ce 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -2358,6 +2358,49 @@ void pci_enable_acs(struct pci_dev *dev)
> >  	pci_write_config_word(dev, pos + PCI_ACS_CTRL, ctrl);
> >  }
> >
> > +#define PCI_EXT_CAP_ACS_ENABLED	(PCI_ACS_SV | PCI_ACS_RR | \
> > +				 PCI_ACS_CR | PCI_ACS_UF)
> > +
> > +/**
> > + * pci_acs_enabled - test ACS support in downstream chain
> > + * @dev: starting PCI device
> > + *
> > + * Returns the furthest downstream device with an unbroken ACS chain.  If
> > + * ACS is enabled throughout the chain, the returned device is the same as
> > + * the one passed in.
> > + */
> > +struct pci_dev *pci_acs_enabled(struct pci_dev *dev)
> > +{
> > +	struct pci_dev *acs_dev;
> > +	int pos;
> > +	u16 ctrl;
> > +
> > +	if (!pci_is_root_bus(dev->bus))
> > +		acs_dev = pci_acs_enabled(dev->bus->self);
> > +	else
> > +		return dev;
> > +
> > +	/* If the chain is already broken, pass on the device */
> > +	if (acs_dev != dev->bus->self)
> > +		return acs_dev;
> > +
> > +	if (!pci_is_pcie(dev) || (dev->class >> 8) != PCI_CLASS_BRIDGE_PCI)
> > +		return dev;
> > +
> > +	if (dev->pcie_type != PCI_EXP_TYPE_DOWNSTREAM)
> > +		return dev;
> > +
> > +	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ACS);
> > +	if (!pos)
> > +		return acs_dev;
> > +
> > +	pci_read_config_word(dev, pos + PCI_ACS_CTRL, &ctrl);
> > +	if ((ctrl & PCI_EXT_CAP_ACS_ENABLED) != PCI_EXT_CAP_ACS_ENABLED)
> > +		return acs_dev;
> > +
> > +	return dev;
> > +}
> > +
> >  /**
> >   * pci_swizzle_interrupt_pin - swizzle INTx for device behind bridge
> >   * @dev: the PCI device
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index 9910b5c..dc25da3 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -1586,6 +1586,7 @@ static inline bool pci_is_pcie(struct pci_dev *dev)
> >  }
> >
> >  void pci_request_acs(void);
> > +struct pci_dev *pci_acs_enabled(struct pci_dev *dev);
> >
> >  #define PCI_VPD_LRDT			0x80	/* Large Resource Data Type */
> >
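The PCI_EXT_CAP_ACS_ENABLED test in the quoted patch treats a port as isolating only when all four of SV, RR, CR and UF are set in its ACS control register. A minimal sketch of that mask test on a raw 16-bit register value follows. The bit values are the ones I understand the kernel's pci_regs.h to define; acs_ctrl_complete() and acs_ctrl_missing() are hypothetical helper names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* ACS control register bits; values assumed to match the kernel's pci_regs.h. */
#define PCI_ACS_SV 0x01	/* Source Validation */
#define PCI_ACS_RR 0x04	/* P2P Request Redirect */
#define PCI_ACS_CR 0x08	/* P2P Completion Redirect */
#define PCI_ACS_UF 0x10	/* Upstream Forwarding */

/* Same combination as the mask defined in the patch. */
#define PCI_EXT_CAP_ACS_ENABLED (PCI_ACS_SV | PCI_ACS_RR | \
				 PCI_ACS_CR | PCI_ACS_UF)

/* True when every bit in the mask is set; extra bits in ctrl are ignored. */
bool acs_ctrl_complete(uint16_t ctrl)
{
	return (ctrl & PCI_EXT_CAP_ACS_ENABLED) == PCI_EXT_CAP_ACS_ENABLED;
}

/* The required bits that are not set in ctrl, handy for diagnostics. */
uint16_t acs_ctrl_missing(uint16_t ctrl)
{
	return (uint16_t)(PCI_EXT_CAP_ACS_ENABLED & ~ctrl);
}
```

Comparing the masked value against the full mask, rather than testing for non-zero, is what makes a partially configured port count as a break in the chain. For example, a register value with SV, RR and CR set but UF clear fails the test, and acs_ctrl_missing() reports PCI_ACS_UF.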