Re: [PATCH v3 7/7] PCI: Exclude PCIe ports used for virtual links in pcie_bandwidth_available()

2023-11-15 Thread Lazar, Lijo




On 11/16/2023 2:39 AM, Mario Limonciello wrote:

On 11/15/2023 11:04, Mario Limonciello wrote:

On 11/14/2023 21:23, Lazar, Lijo wrote:



On 11/15/2023 1:37 AM, Mario Limonciello wrote:

The USB4 spec specifies that PCIe ports that are used for tunneling
PCIe traffic over USB4 fabric will be hardcoded to advertise 2.5GT/s 
and

behave as a PCIe Gen1 device. The actual performance of these ports is
controlled by the fabric implementation.

Callers for pcie_bandwidth_available() will always find the PCIe ports
used for tunneling as a limiting factor potentially leading to 
incorrect

performance decisions.

To prevent such problems check explicitly for ports that are marked as
virtual links or as thunderbolt controllers and skip them when looking
for bandwidth limitations of the hierarchy. If the only device 
connected

is a port used for tunneling then report that device.

Callers to pcie_bandwidth_available() could make this change on their
own as well but then they wouldn't be able to detect other potential
speed bottlenecks from the hierarchy without duplicating
pcie_bandwidth_available() logic.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925#note_2145860
Link: https://www.usb.org/document-library/usb4r-specification-v20
   USB4 V2 with Errata and ECN through June 2023
   Section 11.2.1
Signed-off-by: Mario Limonciello 
---
v2->v3:
  * Split from previous patch version
  * Look for thunderbolt or virtual link
---
  drivers/pci/pci.c | 19 +++
  1 file changed, 19 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 0ff7883cc774..b1fb2258b211 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -6269,11 +6269,20 @@ static u32 pcie_calc_bw_limits(struct 
pci_dev *dev, u32 bw,
   * limiting_dev, speed, and width pointers are supplied) 
information about
   * that point.  The bandwidth returned is in Mb/s, i.e., 
megabits/second of

   * raw bandwidth.
+ *
+ * This excludes the bandwidth calculation that has been returned 
from a
+ * PCIe device that is used for transmitting tunneled PCIe traffic 
over a virtual
+ * link part of larger hierarchy. Examples include Thunderbolt3 and 
USB4 links.
+ * The calculation is excluded because the USB4 specification 
specifies that the
+ * max speed returned from PCIe configuration registers for the 
tunneling link is
+ * always PCI 1x 2.5 GT/s.  When only tunneled devices are present, 
the bandwidth

+ * returned is the bandwidth available from the first tunneled device.
   */
  u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev 
**limiting_dev,

   enum pci_bus_speed *speed,
   enum pcie_link_width *width)
  {
+    struct pci_dev *vdev = NULL;
  u32 bw = 0;
  if (speed)
@@ -6282,10 +6291,20 @@ u32 pcie_bandwidth_available(struct pci_dev 
*dev, struct pci_dev **limiting_dev,

  *width = PCIE_LNK_WIDTH_UNKNOWN;
  while (dev) {
+    if (dev->is_virtual_link || dev->is_thunderbolt) {
+    if (!vdev)
+    vdev = dev;
+    goto skip;
+    }


One problem with this is it *silently* ignores the bandwidth limiting 
device - the bandwidth may not be really available if there are 
virtual links in between. That is a change in behavior from the 
messages shown in __pcie_print_link_status.


That's a good point.  How about a matching behavioral change to 
__pcie_print_link_status() where it looks at the entire hierarchy for 
any links marked as virtual and prints a message along the lines of:


"This value may be further limited by virtual links".


I'll wait for some more feedback on the series before posting another 
version, but I did put this together and this is a sample from dmesg of 
the wording I'm planning on using for the next version:


31.504 Gb/s available PCIe bandwidth, this may be further limited by 
conditions of virtual link :00:03.1




This will cover the the message, but for any real user of the API this 
is not good enough as the speed returned doesn't really indicate the 
bandwidth available. Or, modify the description such that users know 
that the value cannot be trusted when there is virtual link in between 
(probably the API should indicate that through some param/return code) 
and act accordingly.


Thanks,
Lijo





Thanks,
Lijo

  bw = pcie_calc_bw_limits(dev, bw, limiting_dev, speed, 
width);

+skip:
  dev = pci_upstream_bridge(dev);
  }
+    /* If nothing "faster" found on hierarchy, limit to first 
virtual link */

+    if (vdev && !bw)
+    bw = pcie_calc_bw_limits(vdev, bw, limiting_dev, speed, 
width);

+
  return bw;
  }
  EXPORT_SYMBOL(pcie_bandwidth_available);






Re: [PATCH v3 7/7] PCI: Exclude PCIe ports used for virtual links in pcie_bandwidth_available()

2023-11-15 Thread Mario Limonciello

On 11/15/2023 11:04, Mario Limonciello wrote:

On 11/14/2023 21:23, Lazar, Lijo wrote:



On 11/15/2023 1:37 AM, Mario Limonciello wrote:

The USB4 spec specifies that PCIe ports that are used for tunneling
PCIe traffic over USB4 fabric will be hardcoded to advertise 2.5GT/s and
behave as a PCIe Gen1 device. The actual performance of these ports is
controlled by the fabric implementation.

Callers for pcie_bandwidth_available() will always find the PCIe ports
used for tunneling as a limiting factor potentially leading to incorrect
performance decisions.

To prevent such problems check explicitly for ports that are marked as
virtual links or as thunderbolt controllers and skip them when looking
for bandwidth limitations of the hierarchy. If the only device connected
is a port used for tunneling then report that device.

Callers to pcie_bandwidth_available() could make this change on their
own as well but then they wouldn't be able to detect other potential
speed bottlenecks from the hierarchy without duplicating
pcie_bandwidth_available() logic.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925#note_2145860
Link: https://www.usb.org/document-library/usb4r-specification-v20
   USB4 V2 with Errata and ECN through June 2023
   Section 11.2.1
Signed-off-by: Mario Limonciello 
---
v2->v3:
  * Split from previous patch version
  * Look for thunderbolt or virtual link
---
  drivers/pci/pci.c | 19 +++
  1 file changed, 19 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 0ff7883cc774..b1fb2258b211 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -6269,11 +6269,20 @@ static u32 pcie_calc_bw_limits(struct pci_dev 
*dev, u32 bw,
   * limiting_dev, speed, and width pointers are supplied) 
information about
   * that point.  The bandwidth returned is in Mb/s, i.e., 
megabits/second of

   * raw bandwidth.
+ *
+ * This excludes the bandwidth calculation that has been returned 
from a
+ * PCIe device that is used for transmitting tunneled PCIe traffic 
over a virtual
+ * link part of larger hierarchy. Examples include Thunderbolt3 and 
USB4 links.
+ * The calculation is excluded because the USB4 specification 
specifies that the
+ * max speed returned from PCIe configuration registers for the 
tunneling link is
+ * always PCI 1x 2.5 GT/s.  When only tunneled devices are present, 
the bandwidth

+ * returned is the bandwidth available from the first tunneled device.
   */
  u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev 
**limiting_dev,

   enum pci_bus_speed *speed,
   enum pcie_link_width *width)
  {
+    struct pci_dev *vdev = NULL;
  u32 bw = 0;
  if (speed)
@@ -6282,10 +6291,20 @@ u32 pcie_bandwidth_available(struct pci_dev 
*dev, struct pci_dev **limiting_dev,

  *width = PCIE_LNK_WIDTH_UNKNOWN;
  while (dev) {
+    if (dev->is_virtual_link || dev->is_thunderbolt) {
+    if (!vdev)
+    vdev = dev;
+    goto skip;
+    }


One problem with this is it *silently* ignores the bandwidth limiting 
device - the bandwidth may not be really available if there are 
virtual links in between. That is a change in behavior from the 
messages shown in __pcie_print_link_status.


That's a good point.  How about a matching behavioral change to 
__pcie_print_link_status() where it looks at the entire hierarchy for 
any links marked as virtual and prints a message along the lines of:


"This value may be further limited by virtual links".


I'll wait for some more feedback on the series before posting another 
version, but I did put this together and this is a sample from dmesg of 
the wording I'm planning on using for the next version:


31.504 Gb/s available PCIe bandwidth, this may be further limited by 
conditions of virtual link :00:03.1






Thanks,
Lijo


  bw = pcie_calc_bw_limits(dev, bw, limiting_dev, speed, width);
+skip:
  dev = pci_upstream_bridge(dev);
  }
+    /* If nothing "faster" found on hierarchy, limit to first 
virtual link */

+    if (vdev && !bw)
+    bw = pcie_calc_bw_limits(vdev, bw, limiting_dev, speed, width);
+
  return bw;
  }
  EXPORT_SYMBOL(pcie_bandwidth_available);






Re: [PATCH v3 7/7] PCI: Exclude PCIe ports used for virtual links in pcie_bandwidth_available()

2023-11-15 Thread Mario Limonciello

On 11/14/2023 21:23, Lazar, Lijo wrote:



On 11/15/2023 1:37 AM, Mario Limonciello wrote:

The USB4 spec specifies that PCIe ports that are used for tunneling
PCIe traffic over USB4 fabric will be hardcoded to advertise 2.5GT/s and
behave as a PCIe Gen1 device. The actual performance of these ports is
controlled by the fabric implementation.

Callers for pcie_bandwidth_available() will always find the PCIe ports
used for tunneling as a limiting factor potentially leading to incorrect
performance decisions.

To prevent such problems check explicitly for ports that are marked as
virtual links or as thunderbolt controllers and skip them when looking
for bandwidth limitations of the hierarchy. If the only device connected
is a port used for tunneling then report that device.

Callers to pcie_bandwidth_available() could make this change on their
own as well but then they wouldn't be able to detect other potential
speed bottlenecks from the hierarchy without duplicating
pcie_bandwidth_available() logic.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925#note_2145860
Link: https://www.usb.org/document-library/usb4r-specification-v20
   USB4 V2 with Errata and ECN through June 2023
   Section 11.2.1
Signed-off-by: Mario Limonciello 
---
v2->v3:
  * Split from previous patch version
  * Look for thunderbolt or virtual link
---
  drivers/pci/pci.c | 19 +++
  1 file changed, 19 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 0ff7883cc774..b1fb2258b211 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -6269,11 +6269,20 @@ static u32 pcie_calc_bw_limits(struct pci_dev 
*dev, u32 bw,
   * limiting_dev, speed, and width pointers are supplied) information 
about
   * that point.  The bandwidth returned is in Mb/s, i.e., 
megabits/second of

   * raw bandwidth.
+ *
+ * This excludes the bandwidth calculation that has been returned from a
+ * PCIe device that is used for transmitting tunneled PCIe traffic 
over a virtual
+ * link part of larger hierarchy. Examples include Thunderbolt3 and 
USB4 links.
+ * The calculation is excluded because the USB4 specification 
specifies that the
+ * max speed returned from PCIe configuration registers for the 
tunneling link is
+ * always PCI 1x 2.5 GT/s.  When only tunneled devices are present, 
the bandwidth

+ * returned is the bandwidth available from the first tunneled device.
   */
  u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev 
**limiting_dev,

   enum pci_bus_speed *speed,
   enum pcie_link_width *width)
  {
+    struct pci_dev *vdev = NULL;
  u32 bw = 0;
  if (speed)
@@ -6282,10 +6291,20 @@ u32 pcie_bandwidth_available(struct pci_dev 
*dev, struct pci_dev **limiting_dev,

  *width = PCIE_LNK_WIDTH_UNKNOWN;
  while (dev) {
+    if (dev->is_virtual_link || dev->is_thunderbolt) {
+    if (!vdev)
+    vdev = dev;
+    goto skip;
+    }


One problem with this is it *silently* ignores the bandwidth limiting 
device - the bandwidth may not be really available if there are virtual 
links in between. That is a change in behavior from the messages shown 
in __pcie_print_link_status.


That's a good point.  How about a matching behavioral change to 
__pcie_print_link_status() where it looks at the entire hierarchy for 
any links marked as virtual and prints a message along the lines of:


"This value may be further limited by virtual links".



Thanks,
Lijo


  bw = pcie_calc_bw_limits(dev, bw, limiting_dev, speed, width);
+skip:
  dev = pci_upstream_bridge(dev);
  }
+    /* If nothing "faster" found on hierarchy, limit to first virtual 
link */

+    if (vdev && !bw)
+    bw = pcie_calc_bw_limits(vdev, bw, limiting_dev, speed, width);
+
  return bw;
  }
  EXPORT_SYMBOL(pcie_bandwidth_available);




Re: [PATCH v3 7/7] PCI: Exclude PCIe ports used for virtual links in pcie_bandwidth_available()

2023-11-14 Thread Lazar, Lijo




On 11/15/2023 1:37 AM, Mario Limonciello wrote:

The USB4 spec specifies that PCIe ports that are used for tunneling
PCIe traffic over USB4 fabric will be hardcoded to advertise 2.5GT/s and
behave as a PCIe Gen1 device. The actual performance of these ports is
controlled by the fabric implementation.

Callers for pcie_bandwidth_available() will always find the PCIe ports
used for tunneling as a limiting factor potentially leading to incorrect
performance decisions.

To prevent such problems check explicitly for ports that are marked as
virtual links or as thunderbolt controllers and skip them when looking
for bandwidth limitations of the hierarchy. If the only device connected
is a port used for tunneling then report that device.

Callers to pcie_bandwidth_available() could make this change on their
own as well but then they wouldn't be able to detect other potential
speed bottlenecks from the hierarchy without duplicating
pcie_bandwidth_available() logic.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925#note_2145860
Link: https://www.usb.org/document-library/usb4r-specification-v20
   USB4 V2 with Errata and ECN through June 2023
   Section 11.2.1
Signed-off-by: Mario Limonciello 
---
v2->v3:
  * Split from previous patch version
  * Look for thunderbolt or virtual link
---
  drivers/pci/pci.c | 19 +++
  1 file changed, 19 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 0ff7883cc774..b1fb2258b211 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -6269,11 +6269,20 @@ static u32 pcie_calc_bw_limits(struct pci_dev *dev, u32 
bw,
   * limiting_dev, speed, and width pointers are supplied) information about
   * that point.  The bandwidth returned is in Mb/s, i.e., megabits/second of
   * raw bandwidth.
+ *
+ * This excludes the bandwidth calculation that has been returned from a
+ * PCIe device that is used for transmitting tunneled PCIe traffic over a 
virtual
+ * link part of larger hierarchy. Examples include Thunderbolt3 and USB4 links.
+ * The calculation is excluded because the USB4 specification specifies that 
the
+ * max speed returned from PCIe configuration registers for the tunneling link 
is
+ * always PCI 1x 2.5 GT/s.  When only tunneled devices are present, the 
bandwidth
+ * returned is the bandwidth available from the first tunneled device.
   */
  u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev 
**limiting_dev,
 enum pci_bus_speed *speed,
 enum pcie_link_width *width)
  {
+   struct pci_dev *vdev = NULL;
u32 bw = 0;
  
  	if (speed)

@@ -6282,10 +6291,20 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, 
struct pci_dev **limiting_dev,
*width = PCIE_LNK_WIDTH_UNKNOWN;
  
  	while (dev) {

+   if (dev->is_virtual_link || dev->is_thunderbolt) {
+   if (!vdev)
+   vdev = dev;
+   goto skip;
+   }


One problem with this is it *silently* ignores the bandwidth limiting 
device - the bandwidth may not be really available if there are virtual 
links in between. That is a change in behavior from the messages shown 
in __pcie_print_link_status.


Thanks,
Lijo


bw = pcie_calc_bw_limits(dev, bw, limiting_dev, speed, width);
+skip:
dev = pci_upstream_bridge(dev);
}
  
+	/* If nothing "faster" found on hierarchy, limit to first virtual link */

+   if (vdev && !bw)
+   bw = pcie_calc_bw_limits(vdev, bw, limiting_dev, speed, width);
+
return bw;
  }
  EXPORT_SYMBOL(pcie_bandwidth_available);


[PATCH v3 7/7] PCI: Exclude PCIe ports used for virtual links in pcie_bandwidth_available()

2023-11-14 Thread Mario Limonciello
The USB4 spec specifies that PCIe ports that are used for tunneling
PCIe traffic over USB4 fabric will be hardcoded to advertise 2.5GT/s and
behave as a PCIe Gen1 device. The actual performance of these ports is
controlled by the fabric implementation.

Callers for pcie_bandwidth_available() will always find the PCIe ports
used for tunneling as a limiting factor potentially leading to incorrect
performance decisions.

To prevent such problems check explicitly for ports that are marked as
virtual links or as thunderbolt controllers and skip them when looking
for bandwidth limitations of the hierarchy. If the only device connected
is a port used for tunneling then report that device.

Callers to pcie_bandwidth_available() could make this change on their
own as well but then they wouldn't be able to detect other potential
speed bottlenecks from the hierarchy without duplicating
pcie_bandwidth_available() logic.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925#note_2145860
Link: https://www.usb.org/document-library/usb4r-specification-v20
  USB4 V2 with Errata and ECN through June 2023
  Section 11.2.1
Signed-off-by: Mario Limonciello 
---
v2->v3:
 * Split from previous patch version
 * Look for thunderbolt or virtual link
---
 drivers/pci/pci.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 0ff7883cc774..b1fb2258b211 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -6269,11 +6269,20 @@ static u32 pcie_calc_bw_limits(struct pci_dev *dev, u32 
bw,
  * limiting_dev, speed, and width pointers are supplied) information about
  * that point.  The bandwidth returned is in Mb/s, i.e., megabits/second of
  * raw bandwidth.
+ *
+ * This excludes the bandwidth calculation that has been returned from a
+ * PCIe device that is used for transmitting tunneled PCIe traffic over a 
virtual
+ * link part of larger hierarchy. Examples include Thunderbolt3 and USB4 links.
+ * The calculation is excluded because the USB4 specification specifies that 
the
+ * max speed returned from PCIe configuration registers for the tunneling link 
is
+ * always PCI 1x 2.5 GT/s.  When only tunneled devices are present, the 
bandwidth
+ * returned is the bandwidth available from the first tunneled device.
  */
 u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev 
**limiting_dev,
 enum pci_bus_speed *speed,
 enum pcie_link_width *width)
 {
+   struct pci_dev *vdev = NULL;
u32 bw = 0;
 
if (speed)
@@ -6282,10 +6291,20 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, 
struct pci_dev **limiting_dev,
*width = PCIE_LNK_WIDTH_UNKNOWN;
 
while (dev) {
+   if (dev->is_virtual_link || dev->is_thunderbolt) {
+   if (!vdev)
+   vdev = dev;
+   goto skip;
+   }
bw = pcie_calc_bw_limits(dev, bw, limiting_dev, speed, width);
+skip:
dev = pci_upstream_bridge(dev);
}
 
+   /* If nothing "faster" found on hierarchy, limit to first virtual link 
*/
+   if (vdev && !bw)
+   bw = pcie_calc_bw_limits(vdev, bw, limiting_dev, speed, width);
+
return bw;
 }
 EXPORT_SYMBOL(pcie_bandwidth_available);
-- 
2.34.1