Re: [Nouveau] [PATCH v2 6/9] PCI: Rename is_thunderbolt to is_tunneled

2023-11-03 Thread Hans de Goede
Hi,

On 11/3/23 20:07, Mario Limonciello wrote:
> The `is_thunderbolt` bit has been used to indicate that a PCIe device
> contained the Intel VSEC which is used by various parts of the kernel
> to change behavior. To later allow usage with USB4 controllers as well,
> rename this to `is_tunneled`.
> 
> Signed-off-by: Mario Limonciello 

Here is my ack for the trivial drivers/platform/x86/apple-gmux.c change:

Acked-by: Hans de Goede 

Bjorn, feel free to route this through the PCI tree.

Regards,

Hans




> ---
>  drivers/pci/pci.c | 2 +-
>  drivers/pci/probe.c   | 2 +-
>  drivers/platform/x86/apple-gmux.c | 2 +-
>  include/linux/pci.h   | 2 +-
>  4 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 59c01d68c6d5..d9aa5a39f585 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3032,7 +3032,7 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge)
>   return true;
>  
>   /* Even the oldest 2010 Thunderbolt controller supports D3. */
> - if (bridge->is_thunderbolt)
> + if (bridge->is_tunneled)
>   return true;
>  
>   /* Platform might know better if the bridge supports D3 */
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 795534589b98..518413d15402 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1597,7 +1597,7 @@ static void set_pcie_thunderbolt(struct pci_dev *dev)
>   /* Is the device part of a Thunderbolt controller? */
>   vsec = pci_find_vsec_capability(dev, PCI_VENDOR_ID_INTEL, 
> PCI_VSEC_ID_INTEL_TBT);
>   if (vsec)
> - dev->is_thunderbolt = 1;
> + dev->is_tunneled = 1;
>  }
>  
>  static void set_pcie_untrusted(struct pci_dev *dev)
> diff --git a/drivers/platform/x86/apple-gmux.c 
> b/drivers/platform/x86/apple-gmux.c
> index 1417e230edbd..20315aa4463a 100644
> --- a/drivers/platform/x86/apple-gmux.c
> +++ b/drivers/platform/x86/apple-gmux.c
> @@ -774,7 +774,7 @@ static int gmux_resume(struct device *dev)
>  
>  static int is_thunderbolt(struct device *dev, void *data)
>  {
> - return to_pci_dev(dev)->is_thunderbolt;
> + return to_pci_dev(dev)->is_tunneled;
>  }
>  
>  static int gmux_probe(struct pnp_dev *pnp, const struct pnp_device_id *id)
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 439c2dac8a3e..b1724f25fb02 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -440,7 +440,7 @@ struct pci_dev {
>   unsigned intis_virtfn:1;
>   unsigned intis_hotplug_bridge:1;
>   unsigned intshpc_managed:1; /* SHPC owned by shpchp */
> - unsigned intis_thunderbolt:1;   /* Thunderbolt controller */
> + unsigned intis_tunneled:1;  /* Tunneled TBT or USB4 link */
>   unsigned intno_command_complete:1;  /* No command completion */
>   /*
>* Devices marked being untrusted are the ones that can potentially



Re: [Nouveau] [PATCH v2 0/9] Improvements to pcie_bandwidth_available() for eGPUs

2023-11-03 Thread Bjorn Helgaas
On Fri, Nov 03, 2023 at 02:07:49PM -0500, Mario Limonciello wrote:
> Downstream drivers are getting the wrong values from
> pcie_bandwidth_available() which is causing problems for performance
> of eGPUs.
> 
> This series overhauls Thunderbolt related device detection and uses
> the changes to change the behavior of pcie_bandwidth_available().
> 
> NOTE: This series is currently based on top of v6.6 + this change that
>   will be merged for 6.7:
> Link: https://patchwork.freedesktop.org/patch/564738/

Thanks, Mario, I'll look at this soon after v6.7-rc1 (probably Nov
12), so the amdgpu patch should be in mainline by then.

> v1->v2:
>  * Rename is_thunderbolt
>  * Look for _DSD instead of link
>  * Drop pci_is_thunderbolt_attached() from all drivers
>  * Adjust links
>  * Adjust commit messages
>  * Add quirk for Tiger Lake
> 
> Mario Limonciello (9):
>   drm/nouveau: Switch from pci_is_thunderbolt_attached() to
> dev_is_removable()
>   drm/radeon: Switch from pci_is_thunderbolt_attached() to
> dev_is_removable()
>   PCI: Drop pci_is_thunderbolt_attached()
>   PCI: Move the `PCI_CLASS_SERIAL_USB_USB4` definition to common header
>   PCI: pciehp: Move check for is_thunderbolt into a quirk
>   PCI: Rename is_thunderbolt to is_tunneled
>   PCI: ACPI: Detect PCIe root ports that are used for tunneling
>   PCI: Exclude PCIe ports used for tunneling in
> pcie_bandwidth_available()
>   PCI: Add a quirk to mark 0x8086 : 0x9a23 as supporting PCIe tunneling
> 
>  drivers/gpu/drm/nouveau/nouveau_vga.c  |  6 +-
>  drivers/gpu/drm/radeon/radeon_device.c |  4 +-
>  drivers/gpu/drm/radeon/radeon_kms.c|  2 +-
>  drivers/pci/hotplug/pciehp_hpc.c   |  6 +-
>  drivers/pci/pci-acpi.c | 16 ++
>  drivers/pci/pci.c  | 76 +-
>  drivers/pci/probe.c|  2 +-
>  drivers/pci/quirks.c   | 31 +++
>  drivers/platform/x86/apple-gmux.c  |  2 +-
>  drivers/thunderbolt/nhi.h  |  2 -
>  include/linux/pci.h| 25 +
>  include/linux/pci_ids.h|  1 +
>  12 files changed, 109 insertions(+), 64 deletions(-)
> 
> -- 
> 2.34.1
> 


[Nouveau] [PATCH v2 9/9] PCI: Add a quirk to mark 0x8086 : 0x9a23 as supporting PCIe tunneling

2023-11-03 Thread Mario Limonciello
The PCI root port used for tunneling USB4 traffic on Tiger Lake is
is not marked as tunneling but has the same limitations as other
PCIe root ports used for tunneling.

This causes pcie_bandwidth_available() to treat it as the limiting
device in the PCI hierarchy and downstream driver to program devices
incorrectly as a result.

Add a quirk to mark the device as tunneling so that it will be skipped
in pcie_bandwidth_available() like other TBT3/USB4 root ports and bridges.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2885
Signed-off-by: Mario Limonciello 
---
 drivers/pci/quirks.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 4bbf6e33ca11..0f124e075834 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3827,6 +3827,17 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 
PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_2C
quirk_thunderbolt_command_complete);
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PORT_RIDGE,
quirk_thunderbolt_command_complete);
+
+/*
+ * PCIe root port associated with the integrated controller is used for PCIe
+ * tunneling but can't be detected using ACPI.
+ */
+static void quirk_thunderbolt_tunneling(struct pci_dev *pdev)
+{
+   pdev->is_tunneled = 1;
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x9a23,
+   quirk_thunderbolt_tunneling);
 #ifdef CONFIG_ACPI
 /*
  * Apple: Shutdown Cactus Ridge Thunderbolt controller.
-- 
2.34.1



[Nouveau] [PATCH v2 8/9] PCI: Exclude PCIe ports used for tunneling in pcie_bandwidth_available()

2023-11-03 Thread Mario Limonciello
The USB4 spec specifies that PCIe ports that are used for tunneling
PCIe traffic over USB4 fabric will be hardcoded to advertise 2.5GT/s and
behave as a PCIe Gen1 device. The actual performance of these ports is
controlled by the fabric implementation.

Downstream drivers such as amdgpu which utilize pcie_bandwidth_available()
to program the device will always find the PCIe ports used for
tunneling as a limiting factor potentially leading to incorrect
performance decisions.

To prevent problems in downstream drivers check explicitly for ports
being used for PCIe tunneling and skip them when looking for bandwidth
limitations of the hierarchy. If the only device connected is a root port
used for tunneling then report that device.

Downstream drivers could make this change on their own but then they
wouldn't be able to detect other potential speed bottlenecks from the
hierarchy without duplicating pcie_bandwidth_available() logic.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925#note_2145860
Link: https://www.usb.org/document-library/usb4r-specification-v20
  USB4 V2 with Errata and ECN through June 2023
  Section 11.2.1
Signed-off-by: Mario Limonciello 
---
 drivers/pci/pci.c | 74 +++
 1 file changed, 49 insertions(+), 25 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index d9aa5a39f585..15e37164ce56 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -6223,6 +6223,35 @@ int pcie_set_mps(struct pci_dev *dev, int mps)
 }
 EXPORT_SYMBOL(pcie_set_mps);
 
+static u32 pcie_calc_bw_limits(struct pci_dev *dev, u32 bw,
+  struct pci_dev **limiting_dev,
+  enum pci_bus_speed *speed,
+  enum pcie_link_width *width)
+{
+   enum pcie_link_width next_width;
+   enum pci_bus_speed next_speed;
+   u32 next_bw;
+   u16 lnksta;
+
+   pcie_capability_read_word(dev, PCI_EXP_LNKSTA, );
+   next_speed = pcie_link_speed[lnksta & PCI_EXP_LNKSTA_CLS];
+   next_width = (lnksta & PCI_EXP_LNKSTA_NLW) >> PCI_EXP_LNKSTA_NLW_SHIFT;
+   next_bw = next_width * PCIE_SPEED2MBS_ENC(next_speed);
+
+   /* Check if current device limits the total bandwidth */
+   if (!bw || next_bw <= bw) {
+   bw = next_bw;
+   if (limiting_dev)
+   *limiting_dev = dev;
+   if (speed)
+   *speed = next_speed;
+   if (width)
+   *width = next_width;
+   }
+
+   return bw;
+}
+
 /**
  * pcie_bandwidth_available - determine minimum link settings of a PCIe
  *   device and its bandwidth limitation
@@ -6236,47 +6265,42 @@ EXPORT_SYMBOL(pcie_set_mps);
  * limiting_dev, speed, and width pointers are supplied) information about
  * that point.  The bandwidth returned is in Mb/s, i.e., megabits/second of
  * raw bandwidth.
+ *
+ * This excludes the bandwidth calculation that has been returned from a
+ * PCIe device used for transmitting tunneled PCIe traffic over a Thunderbolt
+ * or USB4 link that is part of larger hierarchy. The calculation is excluded
+ * because the USB4 specification specifies that the max speed returned from
+ * PCIe configuration registers for the tunneling link is always PCI 1x 2.5 
GT/s.
+ * When only tunneled devices are present, the bandwidth returned is the
+ * bandwidth available from the first tunneled device.
  */
 u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev 
**limiting_dev,
 enum pci_bus_speed *speed,
 enum pcie_link_width *width)
 {
-   u16 lnksta;
-   enum pci_bus_speed next_speed;
-   enum pcie_link_width next_width;
-   u32 bw, next_bw;
+   struct pci_dev *tdev = NULL;
+   u32 bw = 0;
 
if (speed)
*speed = PCI_SPEED_UNKNOWN;
if (width)
*width = PCIE_LNK_WIDTH_UNKNOWN;
 
-   bw = 0;
-
while (dev) {
-   pcie_capability_read_word(dev, PCI_EXP_LNKSTA, );
-
-   next_speed = pcie_link_speed[lnksta & PCI_EXP_LNKSTA_CLS];
-   next_width = (lnksta & PCI_EXP_LNKSTA_NLW) >>
-   PCI_EXP_LNKSTA_NLW_SHIFT;
-
-   next_bw = next_width * PCIE_SPEED2MBS_ENC(next_speed);
-
-   /* Check if current device limits the total bandwidth */
-   if (!bw || next_bw <= bw) {
-   bw = next_bw;
-
-   if (limiting_dev)
-   *limiting_dev = dev;
-   if (speed)
-   *speed = next_speed;
-   if (width)
-   *width = next_width;
+   if (dev->is_tunneled) {
+   if (!tdev)
+   tdev = dev;
+   goto skip;
}
-
+   bw 

[Nouveau] [PATCH v2 7/9] PCI: ACPI: Detect PCIe root ports that are used for tunneling

2023-11-03 Thread Mario Limonciello
USB4 routers support a feature called "PCIe tunneling". This
allows PCIe traffic to be transmitted over USB4 fabric.

PCIe root ports that are used in this fashion can be discovered
by device specific data that specifies the USB4 router they are
connected to. For the PCI core, the specific connection information
doesn't matter, but it's interesting to know that this root port is
used for tunneling traffic. This will allow other decisions to be
made based upon it.

Detect the `usb4-host-interface` _DSD and if it's found save it
into the `is_tunneling` bit in `struct pci_device`.

Link: https://www.usb.org/document-library/usb4r-specification-v20
  USB4 V2 with Errata and ECN through June 2023
  Section 2.2.10.3
Link: 
https://learn.microsoft.com/en-us/windows-hardware/design/component-guidelines/usb4-acpi-requirements#port-mapping-_dsd-for-usb-3x-and-pcie
Signed-off-by: Mario Limonciello 
---
 drivers/pci/pci-acpi.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index 05b7357bd258..81dbfd335f34 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -1414,12 +1414,28 @@ static void pci_acpi_set_external_facing(struct pci_dev 
*dev)
dev->external_facing = 1;
 }
 
+static void pci_acpi_set_tunneling(struct pci_dev *dev)
+{
+   if (!pci_is_pcie(dev))
+   return;
+
+   switch (pci_pcie_type(dev)) {
+   case PCI_EXP_TYPE_ROOT_PORT:
+   case PCI_EXP_TYPE_DOWNSTREAM:
+   dev->is_tunneled = device_property_present(>dev, 
"usb4-host-interface");
+   break;
+   default:
+   return;
+   }
+}
+
 void pci_acpi_setup(struct device *dev, struct acpi_device *adev)
 {
struct pci_dev *pci_dev = to_pci_dev(dev);
 
pci_acpi_optimize_delay(pci_dev, adev->handle);
pci_acpi_set_external_facing(pci_dev);
+   pci_acpi_set_tunneling(pci_dev);
pci_acpi_add_edr_notifier(pci_dev);
 
pci_acpi_add_pm_notifier(adev, pci_dev);
-- 
2.34.1



[Nouveau] [PATCH v2 6/9] PCI: Rename is_thunderbolt to is_tunneled

2023-11-03 Thread Mario Limonciello
The `is_thunderbolt` bit has been used to indicate that a PCIe device
contained the Intel VSEC which is used by various parts of the kernel
to change behavior. To later allow usage with USB4 controllers as well,
rename this to `is_tunneled`.

Signed-off-by: Mario Limonciello 
---
 drivers/pci/pci.c | 2 +-
 drivers/pci/probe.c   | 2 +-
 drivers/platform/x86/apple-gmux.c | 2 +-
 include/linux/pci.h   | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 59c01d68c6d5..d9aa5a39f585 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3032,7 +3032,7 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge)
return true;
 
/* Even the oldest 2010 Thunderbolt controller supports D3. */
-   if (bridge->is_thunderbolt)
+   if (bridge->is_tunneled)
return true;
 
/* Platform might know better if the bridge supports D3 */
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 795534589b98..518413d15402 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1597,7 +1597,7 @@ static void set_pcie_thunderbolt(struct pci_dev *dev)
/* Is the device part of a Thunderbolt controller? */
vsec = pci_find_vsec_capability(dev, PCI_VENDOR_ID_INTEL, 
PCI_VSEC_ID_INTEL_TBT);
if (vsec)
-   dev->is_thunderbolt = 1;
+   dev->is_tunneled = 1;
 }
 
 static void set_pcie_untrusted(struct pci_dev *dev)
diff --git a/drivers/platform/x86/apple-gmux.c 
b/drivers/platform/x86/apple-gmux.c
index 1417e230edbd..20315aa4463a 100644
--- a/drivers/platform/x86/apple-gmux.c
+++ b/drivers/platform/x86/apple-gmux.c
@@ -774,7 +774,7 @@ static int gmux_resume(struct device *dev)
 
 static int is_thunderbolt(struct device *dev, void *data)
 {
-   return to_pci_dev(dev)->is_thunderbolt;
+   return to_pci_dev(dev)->is_tunneled;
 }
 
 static int gmux_probe(struct pnp_dev *pnp, const struct pnp_device_id *id)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 439c2dac8a3e..b1724f25fb02 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -440,7 +440,7 @@ struct pci_dev {
unsigned intis_virtfn:1;
unsigned intis_hotplug_bridge:1;
unsigned intshpc_managed:1; /* SHPC owned by shpchp */
-   unsigned intis_thunderbolt:1;   /* Thunderbolt controller */
+   unsigned intis_tunneled:1;  /* Tunneled TBT or USB4 link */
unsigned intno_command_complete:1;  /* No command completion */
/*
 * Devices marked being untrusted are the ones that can potentially
-- 
2.34.1



[Nouveau] [PATCH v2 5/9] PCI: pciehp: Move check for is_thunderbolt into a quirk

2023-11-03 Thread Mario Limonciello
commit 493fb50e958c ("PCI: pciehp: Assume NoCompl+ for Thunderbolt
ports") added a check into pciehp code to explicitly set NoCompl+
for all Intel Thunderbolt controllers, including those that don't
need it.

This overloaded the purpose of the `is_thunderbolt` member of
`struct pci_device` because that means that any controller that
identifies as thunderbolt would set NoCompl+ even if it doesn't
suffer this deficiency. As that commit helpfully specifies all the
controllers with the problem, move them into a PCI quirk.

Signed-off-by: Mario Limonciello 
---
 drivers/pci/hotplug/pciehp_hpc.c |  6 +-
 drivers/pci/quirks.c | 20 
 include/linux/pci.h  |  1 +
 3 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index fd713abdfb9f..23a92d681d1c 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -991,11 +991,7 @@ struct controller *pcie_init(struct pcie_device *dev)
if (pdev->hotplug_user_indicators)
slot_cap &= ~(PCI_EXP_SLTCAP_AIP | PCI_EXP_SLTCAP_PIP);
 
-   /*
-* We assume no Thunderbolt controllers support Command Complete events,
-* but some controllers falsely claim they do.
-*/
-   if (pdev->is_thunderbolt)
+   if (pdev->no_command_complete)
slot_cap |= PCI_EXP_SLTCAP_NCCS;
 
ctrl->slot_cap = slot_cap;
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index eeec1d6f9023..4bbf6e33ca11 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3807,6 +3807,26 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 
PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_4C
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PORT_RIDGE,
quirk_thunderbolt_hotplug_msi);
 
+/*
+ * We assume no Thunderbolt controllers support Command Complete events,
+ * but some controllers falsely claim they do.
+ */
+static void quirk_thunderbolt_command_complete(struct pci_dev *pdev)
+{
+   pdev->no_command_complete = 1;
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_LIGHT_RIDGE,
+   quirk_thunderbolt_command_complete);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_EAGLE_RIDGE,
+   quirk_thunderbolt_command_complete);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_LIGHT_PEAK,
+   quirk_thunderbolt_command_complete);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 
PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_4C,
+   quirk_thunderbolt_command_complete);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 
PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_2C,
+   quirk_thunderbolt_command_complete);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PORT_RIDGE,
+   quirk_thunderbolt_command_complete);
 #ifdef CONFIG_ACPI
 /*
  * Apple: Shutdown Cactus Ridge Thunderbolt controller.
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 530b0a360514..439c2dac8a3e 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -441,6 +441,7 @@ struct pci_dev {
unsigned intis_hotplug_bridge:1;
unsigned intshpc_managed:1; /* SHPC owned by shpchp */
unsigned intis_thunderbolt:1;   /* Thunderbolt controller */
+   unsigned intno_command_complete:1;  /* No command completion */
/*
 * Devices marked being untrusted are the ones that can potentially
 * execute DMA attacks and similar. They are typically connected
-- 
2.34.1



[Nouveau] [PATCH v2 3/9] PCI: Drop pci_is_thunderbolt_attached()

2023-11-03 Thread Mario Limonciello
All callers have switched to dev_is_removable() for detecting
hotpluggable PCIe devices.

Signed-off-by: Mario Limonciello 
---
 include/linux/pci.h | 22 --
 1 file changed, 22 deletions(-)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index b56417276042..530b0a360514 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2616,28 +2616,6 @@ static inline bool pci_ari_enabled(struct pci_bus *bus)
return bus->self && bus->self->ari_enabled;
 }
 
-/**
- * pci_is_thunderbolt_attached - whether device is on a Thunderbolt daisy chain
- * @pdev: PCI device to check
- *
- * Walk upwards from @pdev and check for each encountered bridge if it's part
- * of a Thunderbolt controller.  Reaching the host bridge means @pdev is not
- * Thunderbolt-attached.  (But rather soldered to the mainboard usually.)
- */
-static inline bool pci_is_thunderbolt_attached(struct pci_dev *pdev)
-{
-   struct pci_dev *parent = pdev;
-
-   if (pdev->is_thunderbolt)
-   return true;
-
-   while ((parent = pci_upstream_bridge(parent)))
-   if (parent->is_thunderbolt)
-   return true;
-
-   return false;
-}
-
 #if defined(CONFIG_PCIEPORTBUS) || defined(CONFIG_EEH)
 void pci_uevent_ers(struct pci_dev *pdev, enum  pci_ers_result err_type);
 #endif
-- 
2.34.1



[Nouveau] [PATCH v2 4/9] PCI: Move the `PCI_CLASS_SERIAL_USB_USB4` definition to common header

2023-11-03 Thread Mario Limonciello
`PCI_CLASS_SERIAL_USB_USB4` may be used by code outside of thunderbolt.
Move the declaration into the common pci_ids.h header.

Acked-by: Mika Westerberberg 
Signed-off-by: Mario Limonciello 
---
 drivers/thunderbolt/nhi.h | 2 --
 include/linux/pci_ids.h   | 1 +
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/thunderbolt/nhi.h b/drivers/thunderbolt/nhi.h
index 0f029ce75882..675ddefe283c 100644
--- a/drivers/thunderbolt/nhi.h
+++ b/drivers/thunderbolt/nhi.h
@@ -91,6 +91,4 @@ extern const struct tb_nhi_ops icl_nhi_ops;
 #define PCI_DEVICE_ID_INTEL_RPL_NHI0   0xa73e
 #define PCI_DEVICE_ID_INTEL_RPL_NHI1   0xa76d
 
-#define PCI_CLASS_SERIAL_USB_USB4  0x0c0340
-
 #endif
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 91b457de262e..1fc8b5e72f80 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -121,6 +121,7 @@
 #define PCI_CLASS_SERIAL_USB_OHCI  0x0c0310
 #define PCI_CLASS_SERIAL_USB_EHCI  0x0c0320
 #define PCI_CLASS_SERIAL_USB_XHCI  0x0c0330
+#define PCI_CLASS_SERIAL_USB_USB4  0x0c0340
 #define PCI_CLASS_SERIAL_USB_DEVICE0x0c03fe
 #define PCI_CLASS_SERIAL_FIBER 0x0c04
 #define PCI_CLASS_SERIAL_SMBUS 0x0c05
-- 
2.34.1



[Nouveau] [PATCH v2 1/9] drm/nouveau: Switch from pci_is_thunderbolt_attached() to dev_is_removable()

2023-11-03 Thread Mario Limonciello
pci_is_thunderbolt_attached() only works for Intel TBT devices. Switch to
using dev_is_removable() to be able to detect USB4 devices as well.

Signed-off-by: Mario Limonciello 
---
 drivers/gpu/drm/nouveau/nouveau_vga.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_vga.c 
b/drivers/gpu/drm/nouveau/nouveau_vga.c
index f8bf0ec26844..14215b7ca187 100644
--- a/drivers/gpu/drm/nouveau/nouveau_vga.c
+++ b/drivers/gpu/drm/nouveau/nouveau_vga.c
@@ -94,8 +94,8 @@ nouveau_vga_init(struct nouveau_drm *drm)
 
vga_client_register(pdev, nouveau_vga_set_decode);
 
-   /* don't register Thunderbolt eGPU with vga_switcheroo */
-   if (pci_is_thunderbolt_attached(pdev))
+   /* don't register USB4/Thunderbolt eGPU with vga_switcheroo */
+   if (dev_is_removable(>dev))
return;
 
vga_switcheroo_register_client(pdev, _switcheroo_ops, runtime);
@@ -118,7 +118,7 @@ nouveau_vga_fini(struct nouveau_drm *drm)
 
vga_client_unregister(pdev);
 
-   if (pci_is_thunderbolt_attached(pdev))
+   if (dev_is_removable(>dev))
return;
 
vga_switcheroo_unregister_client(pdev);
-- 
2.34.1



[Nouveau] [PATCH v2 2/9] drm/radeon: Switch from pci_is_thunderbolt_attached() to dev_is_removable()

2023-11-03 Thread Mario Limonciello
pci_is_thunderbolt_attached() only works for Intel TBT devices. Switch to
using dev_is_removable() to be able to detect USB4 devices as well.

Signed-off-by: Mario Limonciello 
---
 drivers/gpu/drm/radeon/radeon_device.c | 4 ++--
 drivers/gpu/drm/radeon/radeon_kms.c| 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_device.c 
b/drivers/gpu/drm/radeon/radeon_device.c
index afbb3a80c0c6..ba0ca0694d18 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -1429,7 +1429,7 @@ int radeon_device_init(struct radeon_device *rdev,
 
if (rdev->flags & RADEON_IS_PX)
runtime = true;
-   if (!pci_is_thunderbolt_attached(rdev->pdev))
+   if (!dev_is_removable(>pdev->dev))
vga_switcheroo_register_client(rdev->pdev,
   _switcheroo_ops, runtime);
if (runtime)
@@ -1519,7 +1519,7 @@ void radeon_device_fini(struct radeon_device *rdev)
radeon_bo_evict_vram(rdev);
radeon_audio_component_fini(rdev);
radeon_fini(rdev);
-   if (!pci_is_thunderbolt_attached(rdev->pdev))
+   if (!dev_is_removable(>pdev->dev))
vga_switcheroo_unregister_client(rdev->pdev);
if (rdev->flags & RADEON_IS_PX)
vga_switcheroo_fini_domain_pm_ops(rdev->dev);
diff --git a/drivers/gpu/drm/radeon/radeon_kms.c 
b/drivers/gpu/drm/radeon/radeon_kms.c
index a16590c6247f..ead912a58ab8 100644
--- a/drivers/gpu/drm/radeon/radeon_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_kms.c
@@ -138,7 +138,7 @@ int radeon_driver_load_kms(struct drm_device *dev, unsigned 
long flags)
if ((radeon_runtime_pm != 0) &&
radeon_has_atpx() &&
((flags & RADEON_IS_IGP) == 0) &&
-   !pci_is_thunderbolt_attached(pdev))
+   !dev_is_removable(>dev))
flags |= RADEON_IS_PX;
 
/* radeon_device_init should report only fatal error
-- 
2.34.1



[Nouveau] [PATCH v2 0/9] Improvements to pcie_bandwidth_available() for eGPUs

2023-11-03 Thread Mario Limonciello
Downstream drivers are getting the wrong values from
pcie_bandwidth_available() which is causing problems for performance
of eGPUs.

This series overhauls Thunderbolt related device detection and uses
the changes to change the behavior of pcie_bandwidth_available().

NOTE: This series is currently based on top of v6.6 + this change that
  will be merged for 6.7:
Link: https://patchwork.freedesktop.org/patch/564738/

v1->v2:
 * Rename is_thunderbolt
 * Look for _DSD instead of link
 * Drop pci_is_thunderbolt_attached() from all drivers
 * Adjust links
 * Adjust commit messages
 * Add quirk for Tiger Lake

Mario Limonciello (9):
  drm/nouveau: Switch from pci_is_thunderbolt_attached() to
dev_is_removable()
  drm/radeon: Switch from pci_is_thunderbolt_attached() to
dev_is_removable()
  PCI: Drop pci_is_thunderbolt_attached()
  PCI: Move the `PCI_CLASS_SERIAL_USB_USB4` definition to common header
  PCI: pciehp: Move check for is_thunderbolt into a quirk
  PCI: Rename is_thunderbolt to is_tunneled
  PCI: ACPI: Detect PCIe root ports that are used for tunneling
  PCI: Exclude PCIe ports used for tunneling in
pcie_bandwidth_available()
  PCI: Add a quirk to mark 0x8086 : 0x9a23 as supporting PCIe tunneling

 drivers/gpu/drm/nouveau/nouveau_vga.c  |  6 +-
 drivers/gpu/drm/radeon/radeon_device.c |  4 +-
 drivers/gpu/drm/radeon/radeon_kms.c|  2 +-
 drivers/pci/hotplug/pciehp_hpc.c   |  6 +-
 drivers/pci/pci-acpi.c | 16 ++
 drivers/pci/pci.c  | 76 +-
 drivers/pci/probe.c|  2 +-
 drivers/pci/quirks.c   | 31 +++
 drivers/platform/x86/apple-gmux.c  |  2 +-
 drivers/thunderbolt/nhi.h  |  2 -
 include/linux/pci.h| 25 +
 include/linux/pci_ids.h|  1 +
 12 files changed, 109 insertions(+), 64 deletions(-)

-- 
2.34.1



Re: [Nouveau] Error "kernel rejected pushbuf" with RTX 3090 (GA102 chip)

2023-11-03 Thread Timothy Maden

On 10/3/23 10:44, Timothy Madden wrote:

Hello,

Whenever I try to run a graphical application with my RTX 3090 Strix I 
get the following error in the console output:


 nouveau: kernel rejected pushbuf: No such device
 nouveau: ch23: krec 0 pushes 1 bufs 21 relocs 0



Is there maybe a limit to the number of pushbufs in the source code, 
that I can hard-code to a smaller values, then recompile Mesa ?


Where would I find it in the source files ?

--
Thank you,
Timothy Madden




Re: [Nouveau] [PATCH drm-misc-next v8 09/12] drm/gpuvm: reference count drm_gpuvm structures

2023-11-03 Thread Danilo Krummrich

On 11/3/23 15:04, Christian König wrote:

Am 03.11.23 um 14:14 schrieb Danilo Krummrich:

On Fri, Nov 03, 2023 at 08:18:35AM +0100, Christian König wrote:

Am 02.11.23 um 00:31 schrieb Danilo Krummrich:

Implement reference counting for struct drm_gpuvm.

 From the design point of view what is that good for?

It was discussed in this thread [1].

Essentially, the idea is to make sure that vm_bo->vm is always valid without the
driver having the need to take extra care. It also ensures that GPUVM can't be
freed with mappings still held.


Well in this case I have some objections to this. The lifetime of the VM is 
driver and use case specific.


That's fine, I don't see how this changes with a reference count.



Especially we most likely don't want the VM to live longer than the application 
which originally used it. If you make the GPUVM an independent object you 
actually open up driver abuse for the lifetime of this.


Right, we don't want that. But I don't see how the reference count prevents 
that.

Independant object is relative. struct drm_gpuvm is still embedded into a driver
specific structure. It's working the same way as with struct drm_gem_obejct.



Additional to that see below for a quite real problem with this.


Background is that the most common use case I see is that this object is
embedded into something else and a reference count is then not really a good
idea.

Do you have a specific use-case in mind where this would interfere?


Yes, absolutely. For an example see amdgpu_mes_self_test(), here we initialize 
a temporary amdgpu VM for an in kernel unit test which runs during driver load.

When the function returns I need to guarantee that the VM is destroyed or 
otherwise I will mess up normal operation.


Nothing prevents that. The reference counting is well defined. If the driver 
did not
take additional references (which is clearly up to the driver taking care of) 
and all
VM_BOs and mappings are cleaned up, the reference count is guaranteed to be 1 
at this
point.

Also note that if the driver would have not cleaned up all VM_BOs and mappings 
before
shutting down the VM, it would have been a bug anyways and the driver would 
potentially
leak memory and UAF issues.

Hence, If the VM is still alive at a point where you don't expect it to be, 
then it's
simply a driver bug.



Reference counting is nice when you don't know who else is referring to your 
VM, but the cost is that you also don't know when the object will guardedly be 
destroyed.

I can trivially work around this by saying that the generic GPUVM object has a 
different lifetime than the amdgpu specific object, but that opens up doors for 
use after free again.


If your driver never touches the VM's reference count and exits the VM with a 
clean state
(no mappings and no VM_BOs left), effectively, this is the same as having no 
reference
count.

In the very worst case you could argue that we trade a potential UAF *and* 
memroy leak
(no reference count) with *only* a memory leak (with reference count), which to 
me seems
reasonable.



Regards,
Christian.


Thanks,
Christian.

[1]https://lore.kernel.org/dri-devel/6fa058a4-20d3-44b9-af58-755cfb375...@redhat.com/


Signed-off-by: Danilo Krummrich
---
   drivers/gpu/drm/drm_gpuvm.c| 44 +++---
   drivers/gpu/drm/nouveau/nouveau_uvmm.c | 20 +---
   include/drm/drm_gpuvm.h| 31 +-
   3 files changed, 78 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
index 53e2c406fb04..6a88eafc5229 100644
--- a/drivers/gpu/drm/drm_gpuvm.c
+++ b/drivers/gpu/drm/drm_gpuvm.c
@@ -746,6 +746,8 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char *name,
gpuvm->rb.tree = RB_ROOT_CACHED;
INIT_LIST_HEAD(>rb.list);
+   kref_init(>kref);
+
gpuvm->name = name ? name : "unknown";
gpuvm->flags = flags;
gpuvm->ops = ops;
@@ -770,15 +772,8 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char *name,
   }
   EXPORT_SYMBOL_GPL(drm_gpuvm_init);
-/**
- * drm_gpuvm_destroy() - cleanup a _gpuvm
- * @gpuvm: pointer to the _gpuvm to clean up
- *
- * Note that it is a bug to call this function on a manager that still
- * holds GPU VA mappings.
- */
-void
-drm_gpuvm_destroy(struct drm_gpuvm *gpuvm)
+static void
+drm_gpuvm_fini(struct drm_gpuvm *gpuvm)
   {
gpuvm->name = NULL;
@@ -790,7 +785,33 @@ drm_gpuvm_destroy(struct drm_gpuvm *gpuvm)
drm_gem_object_put(gpuvm->r_obj);
   }
-EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
+
+static void
+drm_gpuvm_free(struct kref *kref)
+{
+   struct drm_gpuvm *gpuvm = container_of(kref, struct drm_gpuvm, kref);
+
+   if (drm_WARN_ON(gpuvm->drm, !gpuvm->ops->vm_free))
+   return;
+
+   drm_gpuvm_fini(gpuvm);
+
+   gpuvm->ops->vm_free(gpuvm);
+}
+
+/**
+ * drm_gpuvm_bo_put() - drop a struct drm_gpuvm reference
+ * @gpuvm: the _gpuvm to release the reference of
+ *
+ * This releases 

Re: [Nouveau] [PATCH drm-misc-next v8 09/12] drm/gpuvm: reference count drm_gpuvm structures

2023-11-03 Thread Christian König

Am 03.11.23 um 14:14 schrieb Danilo Krummrich:

On Fri, Nov 03, 2023 at 08:18:35AM +0100, Christian König wrote:

Am 02.11.23 um 00:31 schrieb Danilo Krummrich:

Implement reference counting for struct drm_gpuvm.

 From the design point of view what is that good for?

It was discussed in this thread [1].

Essentially, the idea is to make sure that vm_bo->vm is always valid without the
driver having the need to take extra care. It also ensures that GPUVM can't be
freed with mappings still held.


Well in this case I have some objections to this. The lifetime of the VM 
is driver and use case specific.


Especially we most likely don't want the VM to live longer than the 
application which originally used it. If you make the GPUVM an 
independent object you actually open up driver abuse for the lifetime of 
this.


Additional to that see below for a quite real problem with this.


Background is that the most common use case I see is that this object is
embedded into something else and a reference count is then not really a good
idea.

Do you have a specific use-case in mind where this would interfere?


Yes, absolutely. For an example see amdgpu_mes_self_test(), here we 
initialize a temporary amdgpu VM for an in kernel unit test which runs 
during driver load.


When the function returns I need to guarantee that the VM is destroyed 
or otherwise I will mess up normal operation.


Reference counting is nice when you don't know who else is referring to 
your VM, but the cost is that you also don't know when the object will 
guardedly be destroyed.


I can trivially work around this by saying that the generic GPUVM object 
has a different lifetime than the amdgpu specific object, but that opens 
up doors for use after free again.


Regards,
Christian.




Thanks,
Christian.

[1]https://lore.kernel.org/dri-devel/6fa058a4-20d3-44b9-af58-755cfb375...@redhat.com/


Signed-off-by: Danilo Krummrich
---
   drivers/gpu/drm/drm_gpuvm.c| 44 +++---
   drivers/gpu/drm/nouveau/nouveau_uvmm.c | 20 +---
   include/drm/drm_gpuvm.h| 31 +-
   3 files changed, 78 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
index 53e2c406fb04..6a88eafc5229 100644
--- a/drivers/gpu/drm/drm_gpuvm.c
+++ b/drivers/gpu/drm/drm_gpuvm.c
@@ -746,6 +746,8 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char *name,
gpuvm->rb.tree = RB_ROOT_CACHED;
INIT_LIST_HEAD(>rb.list);
+   kref_init(>kref);
+
gpuvm->name = name ? name : "unknown";
gpuvm->flags = flags;
gpuvm->ops = ops;
@@ -770,15 +772,8 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char *name,
   }
   EXPORT_SYMBOL_GPL(drm_gpuvm_init);
-/**
- * drm_gpuvm_destroy() - cleanup a _gpuvm
- * @gpuvm: pointer to the _gpuvm to clean up
- *
- * Note that it is a bug to call this function on a manager that still
- * holds GPU VA mappings.
- */
-void
-drm_gpuvm_destroy(struct drm_gpuvm *gpuvm)
+static void
+drm_gpuvm_fini(struct drm_gpuvm *gpuvm)
   {
gpuvm->name = NULL;
@@ -790,7 +785,33 @@ drm_gpuvm_destroy(struct drm_gpuvm *gpuvm)
drm_gem_object_put(gpuvm->r_obj);
   }
-EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
+
+static void
+drm_gpuvm_free(struct kref *kref)
+{
+   struct drm_gpuvm *gpuvm = container_of(kref, struct drm_gpuvm, kref);
+
+   if (drm_WARN_ON(gpuvm->drm, !gpuvm->ops->vm_free))
+   return;
+
+   drm_gpuvm_fini(gpuvm);
+
+   gpuvm->ops->vm_free(gpuvm);
+}
+
+/**
+ * drm_gpuvm_bo_put() - drop a struct drm_gpuvm reference
+ * @gpuvm: the _gpuvm to release the reference of
+ *
+ * This releases a reference to @gpuvm.
+ */
+void
+drm_gpuvm_put(struct drm_gpuvm *gpuvm)
+{
+   if (gpuvm)
+   kref_put(>kref, drm_gpuvm_free);
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_put);
   static int
   __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
@@ -843,7 +864,7 @@ drm_gpuva_insert(struct drm_gpuvm *gpuvm,
if (unlikely(!drm_gpuvm_range_valid(gpuvm, addr, range)))
return -EINVAL;
-   return __drm_gpuva_insert(gpuvm, va);
+   return __drm_gpuva_insert(drm_gpuvm_get(gpuvm), va);
   }
   EXPORT_SYMBOL_GPL(drm_gpuva_insert);
@@ -876,6 +897,7 @@ drm_gpuva_remove(struct drm_gpuva *va)
}
__drm_gpuva_remove(va);
+   drm_gpuvm_put(va->vm);
   }
   EXPORT_SYMBOL_GPL(drm_gpuva_remove);
diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c 
b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
index 54be12c1272f..cb2f06565c46 100644
--- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
@@ -1780,6 +1780,18 @@ nouveau_uvmm_bo_unmap_all(struct nouveau_bo *nvbo)
}
   }
+static void
+nouveau_uvmm_free(struct drm_gpuvm *gpuvm)
+{
+   struct nouveau_uvmm *uvmm = uvmm_from_gpuvm(gpuvm);
+
+   kfree(uvmm);
+}
+
+static const struct drm_gpuvm_ops gpuvm_ops = {
+   .vm_free = nouveau_uvmm_free,
+};
+
   int
   

Re: [Nouveau] [PATCH drm-misc-next v8 09/12] drm/gpuvm: reference count drm_gpuvm structures

2023-11-03 Thread Danilo Krummrich
On Fri, Nov 03, 2023 at 08:18:35AM +0100, Christian König wrote:
> Am 02.11.23 um 00:31 schrieb Danilo Krummrich:
> > Implement reference counting for struct drm_gpuvm.
> 
> From the design point of view what is that good for?

It was discussed in this thread [1].

Essentially, the idea is to make sure that vm_bo->vm is always valid without the
driver having the need to take extra care. It also ensures that GPUVM can't be
freed with mappings still held.

> 
> Background is that the most common use case I see is that this object is
> embedded into something else and a reference count is then not really a good
> idea.

Do you have a specific use-case in mind where this would interfere?

> 
> Thanks,
> Christian.

[1] 
https://lore.kernel.org/dri-devel/6fa058a4-20d3-44b9-af58-755cfb375...@redhat.com/

> 
> > 
> > Signed-off-by: Danilo Krummrich 
> > ---
> >   drivers/gpu/drm/drm_gpuvm.c| 44 +++---
> >   drivers/gpu/drm/nouveau/nouveau_uvmm.c | 20 +---
> >   include/drm/drm_gpuvm.h| 31 +-
> >   3 files changed, 78 insertions(+), 17 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
> > index 53e2c406fb04..6a88eafc5229 100644
> > --- a/drivers/gpu/drm/drm_gpuvm.c
> > +++ b/drivers/gpu/drm/drm_gpuvm.c
> > @@ -746,6 +746,8 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char 
> > *name,
> > gpuvm->rb.tree = RB_ROOT_CACHED;
> > INIT_LIST_HEAD(>rb.list);
> > +   kref_init(>kref);
> > +
> > gpuvm->name = name ? name : "unknown";
> > gpuvm->flags = flags;
> > gpuvm->ops = ops;
> > @@ -770,15 +772,8 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char 
> > *name,
> >   }
> >   EXPORT_SYMBOL_GPL(drm_gpuvm_init);
> > -/**
> > - * drm_gpuvm_destroy() - cleanup a _gpuvm
> > - * @gpuvm: pointer to the _gpuvm to clean up
> > - *
> > - * Note that it is a bug to call this function on a manager that still
> > - * holds GPU VA mappings.
> > - */
> > -void
> > -drm_gpuvm_destroy(struct drm_gpuvm *gpuvm)
> > +static void
> > +drm_gpuvm_fini(struct drm_gpuvm *gpuvm)
> >   {
> > gpuvm->name = NULL;
> > @@ -790,7 +785,33 @@ drm_gpuvm_destroy(struct drm_gpuvm *gpuvm)
> > drm_gem_object_put(gpuvm->r_obj);
> >   }
> > -EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
> > +
> > +static void
> > +drm_gpuvm_free(struct kref *kref)
> > +{
> > +   struct drm_gpuvm *gpuvm = container_of(kref, struct drm_gpuvm, kref);
> > +
> > +   if (drm_WARN_ON(gpuvm->drm, !gpuvm->ops->vm_free))
> > +   return;
> > +
> > +   drm_gpuvm_fini(gpuvm);
> > +
> > +   gpuvm->ops->vm_free(gpuvm);
> > +}
> > +
> > +/**
> > + * drm_gpuvm_bo_put() - drop a struct drm_gpuvm reference
> > + * @gpuvm: the _gpuvm to release the reference of
> > + *
> > + * This releases a reference to @gpuvm.
> > + */
> > +void
> > +drm_gpuvm_put(struct drm_gpuvm *gpuvm)
> > +{
> > +   if (gpuvm)
> > +   kref_put(>kref, drm_gpuvm_free);
> > +}
> > +EXPORT_SYMBOL_GPL(drm_gpuvm_put);
> >   static int
> >   __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
> > @@ -843,7 +864,7 @@ drm_gpuva_insert(struct drm_gpuvm *gpuvm,
> > if (unlikely(!drm_gpuvm_range_valid(gpuvm, addr, range)))
> > return -EINVAL;
> > -   return __drm_gpuva_insert(gpuvm, va);
> > +   return __drm_gpuva_insert(drm_gpuvm_get(gpuvm), va);
> >   }
> >   EXPORT_SYMBOL_GPL(drm_gpuva_insert);
> > @@ -876,6 +897,7 @@ drm_gpuva_remove(struct drm_gpuva *va)
> > }
> > __drm_gpuva_remove(va);
> > +   drm_gpuvm_put(va->vm);
> >   }
> >   EXPORT_SYMBOL_GPL(drm_gpuva_remove);
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c 
> > b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
> > index 54be12c1272f..cb2f06565c46 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
> > @@ -1780,6 +1780,18 @@ nouveau_uvmm_bo_unmap_all(struct nouveau_bo *nvbo)
> > }
> >   }
> > +static void
> > +nouveau_uvmm_free(struct drm_gpuvm *gpuvm)
> > +{
> > +   struct nouveau_uvmm *uvmm = uvmm_from_gpuvm(gpuvm);
> > +
> > +   kfree(uvmm);
> > +}
> > +
> > +static const struct drm_gpuvm_ops gpuvm_ops = {
> > +   .vm_free = nouveau_uvmm_free,
> > +};
> > +
> >   int
> >   nouveau_uvmm_ioctl_vm_init(struct drm_device *dev,
> >void *data,
> > @@ -1830,7 +1842,7 @@ nouveau_uvmm_ioctl_vm_init(struct drm_device *dev,
> >NOUVEAU_VA_SPACE_END,
> >init->kernel_managed_addr,
> >init->kernel_managed_size,
> > -  NULL);
> > +  _ops);
> > /* GPUVM takes care from here on. */
> > drm_gem_object_put(r_obj);
> > @@ -1849,8 +1861,7 @@ nouveau_uvmm_ioctl_vm_init(struct drm_device *dev,
> > return 0;
> >   out_gpuvm_fini:
> > -   drm_gpuvm_destroy(>base);
> > -   kfree(uvmm);
> > +   drm_gpuvm_put(>base);
> >   out_unlock:
> > mutex_unlock(>mutex);
> > return ret;
> > @@ -1902,7 +1913,6 @@ nouveau_uvmm_fini(struct nouveau_uvmm 

Re: [Nouveau] [PATCH drm-misc-next v8 09/12] drm/gpuvm: reference count drm_gpuvm structures

2023-11-03 Thread Christian König

Am 02.11.23 um 00:31 schrieb Danilo Krummrich:

Implement reference counting for struct drm_gpuvm.


From the design point of view what is that good for?

Background is that the most common use case I see is that this object is 
embedded into something else and a reference count is then not really a 
good idea.


Thanks,
Christian.



Signed-off-by: Danilo Krummrich 
---
  drivers/gpu/drm/drm_gpuvm.c| 44 +++---
  drivers/gpu/drm/nouveau/nouveau_uvmm.c | 20 +---
  include/drm/drm_gpuvm.h| 31 +-
  3 files changed, 78 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
index 53e2c406fb04..6a88eafc5229 100644
--- a/drivers/gpu/drm/drm_gpuvm.c
+++ b/drivers/gpu/drm/drm_gpuvm.c
@@ -746,6 +746,8 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char *name,
gpuvm->rb.tree = RB_ROOT_CACHED;
INIT_LIST_HEAD(>rb.list);
  
+	kref_init(>kref);

+
gpuvm->name = name ? name : "unknown";
gpuvm->flags = flags;
gpuvm->ops = ops;
@@ -770,15 +772,8 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char *name,
  }
  EXPORT_SYMBOL_GPL(drm_gpuvm_init);
  
-/**

- * drm_gpuvm_destroy() - cleanup a _gpuvm
- * @gpuvm: pointer to the _gpuvm to clean up
- *
- * Note that it is a bug to call this function on a manager that still
- * holds GPU VA mappings.
- */
-void
-drm_gpuvm_destroy(struct drm_gpuvm *gpuvm)
+static void
+drm_gpuvm_fini(struct drm_gpuvm *gpuvm)
  {
gpuvm->name = NULL;
  
@@ -790,7 +785,33 @@ drm_gpuvm_destroy(struct drm_gpuvm *gpuvm)
  
  	drm_gem_object_put(gpuvm->r_obj);

  }
-EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
+
+static void
+drm_gpuvm_free(struct kref *kref)
+{
+   struct drm_gpuvm *gpuvm = container_of(kref, struct drm_gpuvm, kref);
+
+   if (drm_WARN_ON(gpuvm->drm, !gpuvm->ops->vm_free))
+   return;
+
+   drm_gpuvm_fini(gpuvm);
+
+   gpuvm->ops->vm_free(gpuvm);
+}
+
+/**
+ * drm_gpuvm_bo_put() - drop a struct drm_gpuvm reference
+ * @gpuvm: the _gpuvm to release the reference of
+ *
+ * This releases a reference to @gpuvm.
+ */
+void
+drm_gpuvm_put(struct drm_gpuvm *gpuvm)
+{
+   if (gpuvm)
+   kref_put(>kref, drm_gpuvm_free);
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_put);
  
  static int

  __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
@@ -843,7 +864,7 @@ drm_gpuva_insert(struct drm_gpuvm *gpuvm,
if (unlikely(!drm_gpuvm_range_valid(gpuvm, addr, range)))
return -EINVAL;
  
-	return __drm_gpuva_insert(gpuvm, va);

+   return __drm_gpuva_insert(drm_gpuvm_get(gpuvm), va);
  }
  EXPORT_SYMBOL_GPL(drm_gpuva_insert);
  
@@ -876,6 +897,7 @@ drm_gpuva_remove(struct drm_gpuva *va)

}
  
  	__drm_gpuva_remove(va);

+   drm_gpuvm_put(va->vm);
  }
  EXPORT_SYMBOL_GPL(drm_gpuva_remove);
  
diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c b/drivers/gpu/drm/nouveau/nouveau_uvmm.c

index 54be12c1272f..cb2f06565c46 100644
--- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
@@ -1780,6 +1780,18 @@ nouveau_uvmm_bo_unmap_all(struct nouveau_bo *nvbo)
}
  }
  
+static void

+nouveau_uvmm_free(struct drm_gpuvm *gpuvm)
+{
+   struct nouveau_uvmm *uvmm = uvmm_from_gpuvm(gpuvm);
+
+   kfree(uvmm);
+}
+
+static const struct drm_gpuvm_ops gpuvm_ops = {
+   .vm_free = nouveau_uvmm_free,
+};
+
  int
  nouveau_uvmm_ioctl_vm_init(struct drm_device *dev,
   void *data,
@@ -1830,7 +1842,7 @@ nouveau_uvmm_ioctl_vm_init(struct drm_device *dev,
   NOUVEAU_VA_SPACE_END,
   init->kernel_managed_addr,
   init->kernel_managed_size,
-  NULL);
+  _ops);
/* GPUVM takes care from here on. */
drm_gem_object_put(r_obj);
  
@@ -1849,8 +1861,7 @@ nouveau_uvmm_ioctl_vm_init(struct drm_device *dev,

return 0;
  
  out_gpuvm_fini:

-   drm_gpuvm_destroy(>base);
-   kfree(uvmm);
+   drm_gpuvm_put(>base);
  out_unlock:
mutex_unlock(>mutex);
return ret;
@@ -1902,7 +1913,6 @@ nouveau_uvmm_fini(struct nouveau_uvmm *uvmm)
  
  	mutex_lock(>mutex);

nouveau_vmm_fini(>vmm);
-   drm_gpuvm_destroy(>base);
-   kfree(uvmm);
+   drm_gpuvm_put(>base);
mutex_unlock(>mutex);
  }
diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
index 0c2e24155a93..4e6e1fd3485a 100644
--- a/include/drm/drm_gpuvm.h
+++ b/include/drm/drm_gpuvm.h
@@ -247,6 +247,11 @@ struct drm_gpuvm {
struct list_head list;
} rb;
  
+	/**

+* @kref: reference count of this object
+*/
+   struct kref kref;
+
/**
 * @kernel_alloc_node:
 *
@@ -273,7 +278,23 @@ void drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char 
*name,
u64 start_offset, u64 range,
u64 reserve_offset, u64 reserve_range,