Re: [Nouveau] [PATCH v2 6/9] PCI: Rename is_thunderbolt to is_tunneled
Hi, On 11/3/23 20:07, Mario Limonciello wrote: > The `is_thunderbolt` bit has been used to indicate that a PCIe device > contained the Intel VSEC which is used by various parts of the kernel > to change behavior. To later allow usage with USB4 controllers as well, > rename this to `is_tunneled`. > > Signed-off-by: Mario Limonciello Here is my ack for the trivial drivers/platform/x86/apple-gmux.c change: Acked-by: Hans de Goede Bjorn, feel free to route this through the PCI tree. Regards, Hans > --- > drivers/pci/pci.c | 2 +- > drivers/pci/probe.c | 2 +- > drivers/platform/x86/apple-gmux.c | 2 +- > include/linux/pci.h | 2 +- > 4 files changed, 4 insertions(+), 4 deletions(-) > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index 59c01d68c6d5..d9aa5a39f585 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -3032,7 +3032,7 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge) > return true; > > /* Even the oldest 2010 Thunderbolt controller supports D3. */ > - if (bridge->is_thunderbolt) > + if (bridge->is_tunneled) > return true; > > /* Platform might know better if the bridge supports D3 */ > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c > index 795534589b98..518413d15402 100644 > --- a/drivers/pci/probe.c > +++ b/drivers/pci/probe.c > @@ -1597,7 +1597,7 @@ static void set_pcie_thunderbolt(struct pci_dev *dev) > /* Is the device part of a Thunderbolt controller? */ > vsec = pci_find_vsec_capability(dev, PCI_VENDOR_ID_INTEL, > PCI_VSEC_ID_INTEL_TBT); > if (vsec) > - dev->is_thunderbolt = 1; > + dev->is_tunneled = 1; > } > > static void set_pcie_untrusted(struct pci_dev *dev) > diff --git a/drivers/platform/x86/apple-gmux.c > b/drivers/platform/x86/apple-gmux.c > index 1417e230edbd..20315aa4463a 100644 > --- a/drivers/platform/x86/apple-gmux.c > +++ b/drivers/platform/x86/apple-gmux.c > @@ -774,7 +774,7 @@ static int gmux_resume(struct device *dev) > > static int is_thunderbolt(struct device *dev, void *data) > { > - return to_pci_dev(dev)->is_thunderbolt; > + return to_pci_dev(dev)->is_tunneled; > } > > static int gmux_probe(struct pnp_dev *pnp, const struct pnp_device_id *id) > diff --git a/include/linux/pci.h b/include/linux/pci.h > index 439c2dac8a3e..b1724f25fb02 100644 > --- a/include/linux/pci.h > +++ b/include/linux/pci.h > @@ -440,7 +440,7 @@ struct pci_dev { > unsigned intis_virtfn:1; > unsigned intis_hotplug_bridge:1; > unsigned intshpc_managed:1; /* SHPC owned by shpchp */ > - unsigned intis_thunderbolt:1; /* Thunderbolt controller */ > + unsigned intis_tunneled:1; /* Tunneled TBT or USB4 link */ > unsigned intno_command_complete:1; /* No command completion */ > /* >* Devices marked being untrusted are the ones that can potentially
Re: [Nouveau] [PATCH v2 0/9] Improvements to pcie_bandwidth_available() for eGPUs
On Fri, Nov 03, 2023 at 02:07:49PM -0500, Mario Limonciello wrote: > Downstream drivers are getting the wrong values from > pcie_bandwidth_available() which is causing problems for performance > of eGPUs. > > This series overhauls Thunderbolt related device detection and uses > the changes to change the behavior of pcie_bandwidth_available(). > > NOTE: This series is currently based on top of v6.6 + this change that > will be merged for 6.7: > Link: https://patchwork.freedesktop.org/patch/564738/ Thanks, Mario, I'll look at this soon after v6.7-rc1 (probably Nov 12), so the amdgpu patch should be in mainline by then. > v1->v2: > * Rename is_thunderbolt > * Look for _DSD instead of link > * Drop pci_is_thunderbolt_attached() from all drivers > * Adjust links > * Adjust commit messages > * Add quirk for Tiger Lake > > Mario Limonciello (9): > drm/nouveau: Switch from pci_is_thunderbolt_attached() to > dev_is_removable() > drm/radeon: Switch from pci_is_thunderbolt_attached() to > dev_is_removable() > PCI: Drop pci_is_thunderbolt_attached() > PCI: Move the `PCI_CLASS_SERIAL_USB_USB4` definition to common header > PCI: pciehp: Move check for is_thunderbolt into a quirk > PCI: Rename is_thunderbolt to is_tunneled > PCI: ACPI: Detect PCIe root ports that are used for tunneling > PCI: Exclude PCIe ports used for tunneling in > pcie_bandwidth_available() > PCI: Add a quirk to mark 0x8086 : 0x9a23 as supporting PCIe tunneling > > drivers/gpu/drm/nouveau/nouveau_vga.c | 6 +- > drivers/gpu/drm/radeon/radeon_device.c | 4 +- > drivers/gpu/drm/radeon/radeon_kms.c| 2 +- > drivers/pci/hotplug/pciehp_hpc.c | 6 +- > drivers/pci/pci-acpi.c | 16 ++ > drivers/pci/pci.c | 76 +- > drivers/pci/probe.c| 2 +- > drivers/pci/quirks.c | 31 +++ > drivers/platform/x86/apple-gmux.c | 2 +- > drivers/thunderbolt/nhi.h | 2 - > include/linux/pci.h| 25 + > include/linux/pci_ids.h| 1 + > 12 files changed, 109 insertions(+), 64 deletions(-) > > -- > 2.34.1 >
[Nouveau] [PATCH v2 9/9] PCI: Add a quirk to mark 0x8086 : 0x9a23 as supporting PCIe tunneling
The PCI root port used for tunneling USB4 traffic on Tiger Lake is is not marked as tunneling but has the same limitations as other PCIe root ports used for tunneling. This causes pcie_bandwidth_available() to treat it as the limiting device in the PCI hierarchy and downstream driver to program devices incorrectly as a result. Add a quirk to mark the device as tunneling so that it will be skipped in pcie_bandwidth_available() like other TBT3/USB4 root ports and bridges. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2885 Signed-off-by: Mario Limonciello --- drivers/pci/quirks.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 4bbf6e33ca11..0f124e075834 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -3827,6 +3827,17 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_2C quirk_thunderbolt_command_complete); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PORT_RIDGE, quirk_thunderbolt_command_complete); + +/* + * PCIe root port associated with the integrated controller is used for PCIe + * tunneling but can't be detected using ACPI. + */ +static void quirk_thunderbolt_tunneling(struct pci_dev *pdev) +{ + pdev->is_tunneled = 1; +} +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x9a23, + quirk_thunderbolt_tunneling); #ifdef CONFIG_ACPI /* * Apple: Shutdown Cactus Ridge Thunderbolt controller. -- 2.34.1
[Nouveau] [PATCH v2 8/9] PCI: Exclude PCIe ports used for tunneling in pcie_bandwidth_available()
The USB4 spec specifies that PCIe ports that are used for tunneling PCIe traffic over USB4 fabric will be hardcoded to advertise 2.5GT/s and behave as a PCIe Gen1 device. The actual performance of these ports is controlled by the fabric implementation. Downstream drivers such as amdgpu which utilize pcie_bandwidth_available() to program the device will always find the PCIe ports used for tunneling as a limiting factor potentially leading to incorrect performance decisions. To prevent problems in downstream drivers check explicitly for ports being used for PCIe tunneling and skip them when looking for bandwidth limitations of the hierarchy. If the only device connected is a root port used for tunneling then report that device. Downstream drivers could make this change on their own but then they wouldn't be able to detect other potential speed bottlenecks from the hierarchy without duplicating pcie_bandwidth_available() logic. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925#note_2145860 Link: https://www.usb.org/document-library/usb4r-specification-v20 USB4 V2 with Errata and ECN through June 2023 Section 11.2.1 Signed-off-by: Mario Limonciello --- drivers/pci/pci.c | 74 +++ 1 file changed, 49 insertions(+), 25 deletions(-) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index d9aa5a39f585..15e37164ce56 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -6223,6 +6223,35 @@ int pcie_set_mps(struct pci_dev *dev, int mps) } EXPORT_SYMBOL(pcie_set_mps); +static u32 pcie_calc_bw_limits(struct pci_dev *dev, u32 bw, + struct pci_dev **limiting_dev, + enum pci_bus_speed *speed, + enum pcie_link_width *width) +{ + enum pcie_link_width next_width; + enum pci_bus_speed next_speed; + u32 next_bw; + u16 lnksta; + + pcie_capability_read_word(dev, PCI_EXP_LNKSTA, ); + next_speed = pcie_link_speed[lnksta & PCI_EXP_LNKSTA_CLS]; + next_width = (lnksta & PCI_EXP_LNKSTA_NLW) >> PCI_EXP_LNKSTA_NLW_SHIFT; + next_bw = next_width * PCIE_SPEED2MBS_ENC(next_speed); + + /* Check if current device limits the total bandwidth */ + if (!bw || next_bw <= bw) { + bw = next_bw; + if (limiting_dev) + *limiting_dev = dev; + if (speed) + *speed = next_speed; + if (width) + *width = next_width; + } + + return bw; +} + /** * pcie_bandwidth_available - determine minimum link settings of a PCIe * device and its bandwidth limitation @@ -6236,47 +6265,42 @@ EXPORT_SYMBOL(pcie_set_mps); * limiting_dev, speed, and width pointers are supplied) information about * that point. The bandwidth returned is in Mb/s, i.e., megabits/second of * raw bandwidth. + * + * This excludes the bandwidth calculation that has been returned from a + * PCIe device used for transmitting tunneled PCIe traffic over a Thunderbolt + * or USB4 link that is part of larger hierarchy. The calculation is excluded + * because the USB4 specification specifies that the max speed returned from + * PCIe configuration registers for the tunneling link is always PCI 1x 2.5 GT/s. + * When only tunneled devices are present, the bandwidth returned is the + * bandwidth available from the first tunneled device. */ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev, enum pci_bus_speed *speed, enum pcie_link_width *width) { - u16 lnksta; - enum pci_bus_speed next_speed; - enum pcie_link_width next_width; - u32 bw, next_bw; + struct pci_dev *tdev = NULL; + u32 bw = 0; if (speed) *speed = PCI_SPEED_UNKNOWN; if (width) *width = PCIE_LNK_WIDTH_UNKNOWN; - bw = 0; - while (dev) { - pcie_capability_read_word(dev, PCI_EXP_LNKSTA, ); - - next_speed = pcie_link_speed[lnksta & PCI_EXP_LNKSTA_CLS]; - next_width = (lnksta & PCI_EXP_LNKSTA_NLW) >> - PCI_EXP_LNKSTA_NLW_SHIFT; - - next_bw = next_width * PCIE_SPEED2MBS_ENC(next_speed); - - /* Check if current device limits the total bandwidth */ - if (!bw || next_bw <= bw) { - bw = next_bw; - - if (limiting_dev) - *limiting_dev = dev; - if (speed) - *speed = next_speed; - if (width) - *width = next_width; + if (dev->is_tunneled) { + if (!tdev) + tdev = dev; + goto skip; } - + bw
[Nouveau] [PATCH v2 7/9] PCI: ACPI: Detect PCIe root ports that are used for tunneling
USB4 routers support a feature called "PCIe tunneling". This allows PCIe traffic to be transmitted over USB4 fabric. PCIe root ports that are used in this fashion can be discovered by device specific data that specifies the USB4 router they are connected to. For the PCI core, the specific connection information doesn't matter, but it's interesting to know that this root port is used for tunneling traffic. This will allow other decisions to be made based upon it. Detect the `usb4-host-interface` _DSD and if it's found save it into the `is_tunneling` bit in `struct pci_device`. Link: https://www.usb.org/document-library/usb4r-specification-v20 USB4 V2 with Errata and ECN through June 2023 Section 2.2.10.3 Link: https://learn.microsoft.com/en-us/windows-hardware/design/component-guidelines/usb4-acpi-requirements#port-mapping-_dsd-for-usb-3x-and-pcie Signed-off-by: Mario Limonciello --- drivers/pci/pci-acpi.c | 16 1 file changed, 16 insertions(+) diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c index 05b7357bd258..81dbfd335f34 100644 --- a/drivers/pci/pci-acpi.c +++ b/drivers/pci/pci-acpi.c @@ -1414,12 +1414,28 @@ static void pci_acpi_set_external_facing(struct pci_dev *dev) dev->external_facing = 1; } +static void pci_acpi_set_tunneling(struct pci_dev *dev) +{ + if (!pci_is_pcie(dev)) + return; + + switch (pci_pcie_type(dev)) { + case PCI_EXP_TYPE_ROOT_PORT: + case PCI_EXP_TYPE_DOWNSTREAM: + dev->is_tunneled = device_property_present(>dev, "usb4-host-interface"); + break; + default: + return; + } +} + void pci_acpi_setup(struct device *dev, struct acpi_device *adev) { struct pci_dev *pci_dev = to_pci_dev(dev); pci_acpi_optimize_delay(pci_dev, adev->handle); pci_acpi_set_external_facing(pci_dev); + pci_acpi_set_tunneling(pci_dev); pci_acpi_add_edr_notifier(pci_dev); pci_acpi_add_pm_notifier(adev, pci_dev); -- 2.34.1
[Nouveau] [PATCH v2 6/9] PCI: Rename is_thunderbolt to is_tunneled
The `is_thunderbolt` bit has been used to indicate that a PCIe device contained the Intel VSEC which is used by various parts of the kernel to change behavior. To later allow usage with USB4 controllers as well, rename this to `is_tunneled`. Signed-off-by: Mario Limonciello --- drivers/pci/pci.c | 2 +- drivers/pci/probe.c | 2 +- drivers/platform/x86/apple-gmux.c | 2 +- include/linux/pci.h | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 59c01d68c6d5..d9aa5a39f585 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -3032,7 +3032,7 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge) return true; /* Even the oldest 2010 Thunderbolt controller supports D3. */ - if (bridge->is_thunderbolt) + if (bridge->is_tunneled) return true; /* Platform might know better if the bridge supports D3 */ diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 795534589b98..518413d15402 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -1597,7 +1597,7 @@ static void set_pcie_thunderbolt(struct pci_dev *dev) /* Is the device part of a Thunderbolt controller? */ vsec = pci_find_vsec_capability(dev, PCI_VENDOR_ID_INTEL, PCI_VSEC_ID_INTEL_TBT); if (vsec) - dev->is_thunderbolt = 1; + dev->is_tunneled = 1; } static void set_pcie_untrusted(struct pci_dev *dev) diff --git a/drivers/platform/x86/apple-gmux.c b/drivers/platform/x86/apple-gmux.c index 1417e230edbd..20315aa4463a 100644 --- a/drivers/platform/x86/apple-gmux.c +++ b/drivers/platform/x86/apple-gmux.c @@ -774,7 +774,7 @@ static int gmux_resume(struct device *dev) static int is_thunderbolt(struct device *dev, void *data) { - return to_pci_dev(dev)->is_thunderbolt; + return to_pci_dev(dev)->is_tunneled; } static int gmux_probe(struct pnp_dev *pnp, const struct pnp_device_id *id) diff --git a/include/linux/pci.h b/include/linux/pci.h index 439c2dac8a3e..b1724f25fb02 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -440,7 +440,7 @@ struct pci_dev { unsigned intis_virtfn:1; unsigned intis_hotplug_bridge:1; unsigned intshpc_managed:1; /* SHPC owned by shpchp */ - unsigned intis_thunderbolt:1; /* Thunderbolt controller */ + unsigned intis_tunneled:1; /* Tunneled TBT or USB4 link */ unsigned intno_command_complete:1; /* No command completion */ /* * Devices marked being untrusted are the ones that can potentially -- 2.34.1
[Nouveau] [PATCH v2 5/9] PCI: pciehp: Move check for is_thunderbolt into a quirk
commit 493fb50e958c ("PCI: pciehp: Assume NoCompl+ for Thunderbolt ports") added a check into pciehp code to explicitly set NoCompl+ for all Intel Thunderbolt controllers, including those that don't need it. This overloaded the purpose of the `is_thunderbolt` member of `struct pci_device` because that means that any controller that identifies as thunderbolt would set NoCompl+ even if it doesn't suffer this deficiency. As that commit helpfully specifies all the controllers with the problem, move them into a PCI quirk. Signed-off-by: Mario Limonciello --- drivers/pci/hotplug/pciehp_hpc.c | 6 +- drivers/pci/quirks.c | 20 include/linux/pci.h | 1 + 3 files changed, 22 insertions(+), 5 deletions(-) diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c index fd713abdfb9f..23a92d681d1c 100644 --- a/drivers/pci/hotplug/pciehp_hpc.c +++ b/drivers/pci/hotplug/pciehp_hpc.c @@ -991,11 +991,7 @@ struct controller *pcie_init(struct pcie_device *dev) if (pdev->hotplug_user_indicators) slot_cap &= ~(PCI_EXP_SLTCAP_AIP | PCI_EXP_SLTCAP_PIP); - /* -* We assume no Thunderbolt controllers support Command Complete events, -* but some controllers falsely claim they do. -*/ - if (pdev->is_thunderbolt) + if (pdev->no_command_complete) slot_cap |= PCI_EXP_SLTCAP_NCCS; ctrl->slot_cap = slot_cap; diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index eeec1d6f9023..4bbf6e33ca11 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -3807,6 +3807,26 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_4C DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PORT_RIDGE, quirk_thunderbolt_hotplug_msi); +/* + * We assume no Thunderbolt controllers support Command Complete events, + * but some controllers falsely claim they do. + */ +static void quirk_thunderbolt_command_complete(struct pci_dev *pdev) +{ + pdev->no_command_complete = 1; +} +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_LIGHT_RIDGE, + quirk_thunderbolt_command_complete); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_EAGLE_RIDGE, + quirk_thunderbolt_command_complete); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_LIGHT_PEAK, + quirk_thunderbolt_command_complete); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_4C, + quirk_thunderbolt_command_complete); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_2C, + quirk_thunderbolt_command_complete); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PORT_RIDGE, + quirk_thunderbolt_command_complete); #ifdef CONFIG_ACPI /* * Apple: Shutdown Cactus Ridge Thunderbolt controller. diff --git a/include/linux/pci.h b/include/linux/pci.h index 530b0a360514..439c2dac8a3e 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -441,6 +441,7 @@ struct pci_dev { unsigned intis_hotplug_bridge:1; unsigned intshpc_managed:1; /* SHPC owned by shpchp */ unsigned intis_thunderbolt:1; /* Thunderbolt controller */ + unsigned intno_command_complete:1; /* No command completion */ /* * Devices marked being untrusted are the ones that can potentially * execute DMA attacks and similar. They are typically connected -- 2.34.1
[Nouveau] [PATCH v2 3/9] PCI: Drop pci_is_thunderbolt_attached()
All callers have switched to dev_is_removable() for detecting hotpluggable PCIe devices. Signed-off-by: Mario Limonciello --- include/linux/pci.h | 22 -- 1 file changed, 22 deletions(-) diff --git a/include/linux/pci.h b/include/linux/pci.h index b56417276042..530b0a360514 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -2616,28 +2616,6 @@ static inline bool pci_ari_enabled(struct pci_bus *bus) return bus->self && bus->self->ari_enabled; } -/** - * pci_is_thunderbolt_attached - whether device is on a Thunderbolt daisy chain - * @pdev: PCI device to check - * - * Walk upwards from @pdev and check for each encountered bridge if it's part - * of a Thunderbolt controller. Reaching the host bridge means @pdev is not - * Thunderbolt-attached. (But rather soldered to the mainboard usually.) - */ -static inline bool pci_is_thunderbolt_attached(struct pci_dev *pdev) -{ - struct pci_dev *parent = pdev; - - if (pdev->is_thunderbolt) - return true; - - while ((parent = pci_upstream_bridge(parent))) - if (parent->is_thunderbolt) - return true; - - return false; -} - #if defined(CONFIG_PCIEPORTBUS) || defined(CONFIG_EEH) void pci_uevent_ers(struct pci_dev *pdev, enum pci_ers_result err_type); #endif -- 2.34.1
[Nouveau] [PATCH v2 4/9] PCI: Move the `PCI_CLASS_SERIAL_USB_USB4` definition to common header
`PCI_CLASS_SERIAL_USB_USB4` may be used by code outside of thunderbolt. Move the declaration into the common pci_ids.h header. Acked-by: Mika Westerberberg Signed-off-by: Mario Limonciello --- drivers/thunderbolt/nhi.h | 2 -- include/linux/pci_ids.h | 1 + 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/thunderbolt/nhi.h b/drivers/thunderbolt/nhi.h index 0f029ce75882..675ddefe283c 100644 --- a/drivers/thunderbolt/nhi.h +++ b/drivers/thunderbolt/nhi.h @@ -91,6 +91,4 @@ extern const struct tb_nhi_ops icl_nhi_ops; #define PCI_DEVICE_ID_INTEL_RPL_NHI0 0xa73e #define PCI_DEVICE_ID_INTEL_RPL_NHI1 0xa76d -#define PCI_CLASS_SERIAL_USB_USB4 0x0c0340 - #endif diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h index 91b457de262e..1fc8b5e72f80 100644 --- a/include/linux/pci_ids.h +++ b/include/linux/pci_ids.h @@ -121,6 +121,7 @@ #define PCI_CLASS_SERIAL_USB_OHCI 0x0c0310 #define PCI_CLASS_SERIAL_USB_EHCI 0x0c0320 #define PCI_CLASS_SERIAL_USB_XHCI 0x0c0330 +#define PCI_CLASS_SERIAL_USB_USB4 0x0c0340 #define PCI_CLASS_SERIAL_USB_DEVICE0x0c03fe #define PCI_CLASS_SERIAL_FIBER 0x0c04 #define PCI_CLASS_SERIAL_SMBUS 0x0c05 -- 2.34.1
[Nouveau] [PATCH v2 1/9] drm/nouveau: Switch from pci_is_thunderbolt_attached() to dev_is_removable()
pci_is_thunderbolt_attached() only works for Intel TBT devices. Switch to using dev_is_removable() to be able to detect USB4 devices as well. Signed-off-by: Mario Limonciello --- drivers/gpu/drm/nouveau/nouveau_vga.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_vga.c b/drivers/gpu/drm/nouveau/nouveau_vga.c index f8bf0ec26844..14215b7ca187 100644 --- a/drivers/gpu/drm/nouveau/nouveau_vga.c +++ b/drivers/gpu/drm/nouveau/nouveau_vga.c @@ -94,8 +94,8 @@ nouveau_vga_init(struct nouveau_drm *drm) vga_client_register(pdev, nouveau_vga_set_decode); - /* don't register Thunderbolt eGPU with vga_switcheroo */ - if (pci_is_thunderbolt_attached(pdev)) + /* don't register USB4/Thunderbolt eGPU with vga_switcheroo */ + if (dev_is_removable(>dev)) return; vga_switcheroo_register_client(pdev, _switcheroo_ops, runtime); @@ -118,7 +118,7 @@ nouveau_vga_fini(struct nouveau_drm *drm) vga_client_unregister(pdev); - if (pci_is_thunderbolt_attached(pdev)) + if (dev_is_removable(>dev)) return; vga_switcheroo_unregister_client(pdev); -- 2.34.1
[Nouveau] [PATCH v2 2/9] drm/radeon: Switch from pci_is_thunderbolt_attached() to dev_is_removable()
pci_is_thunderbolt_attached() only works for Intel TBT devices. Switch to using dev_is_removable() to be able to detect USB4 devices as well. Signed-off-by: Mario Limonciello --- drivers/gpu/drm/radeon/radeon_device.c | 4 ++-- drivers/gpu/drm/radeon/radeon_kms.c| 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c index afbb3a80c0c6..ba0ca0694d18 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c +++ b/drivers/gpu/drm/radeon/radeon_device.c @@ -1429,7 +1429,7 @@ int radeon_device_init(struct radeon_device *rdev, if (rdev->flags & RADEON_IS_PX) runtime = true; - if (!pci_is_thunderbolt_attached(rdev->pdev)) + if (!dev_is_removable(>pdev->dev)) vga_switcheroo_register_client(rdev->pdev, _switcheroo_ops, runtime); if (runtime) @@ -1519,7 +1519,7 @@ void radeon_device_fini(struct radeon_device *rdev) radeon_bo_evict_vram(rdev); radeon_audio_component_fini(rdev); radeon_fini(rdev); - if (!pci_is_thunderbolt_attached(rdev->pdev)) + if (!dev_is_removable(>pdev->dev)) vga_switcheroo_unregister_client(rdev->pdev); if (rdev->flags & RADEON_IS_PX) vga_switcheroo_fini_domain_pm_ops(rdev->dev); diff --git a/drivers/gpu/drm/radeon/radeon_kms.c b/drivers/gpu/drm/radeon/radeon_kms.c index a16590c6247f..ead912a58ab8 100644 --- a/drivers/gpu/drm/radeon/radeon_kms.c +++ b/drivers/gpu/drm/radeon/radeon_kms.c @@ -138,7 +138,7 @@ int radeon_driver_load_kms(struct drm_device *dev, unsigned long flags) if ((radeon_runtime_pm != 0) && radeon_has_atpx() && ((flags & RADEON_IS_IGP) == 0) && - !pci_is_thunderbolt_attached(pdev)) + !dev_is_removable(>dev)) flags |= RADEON_IS_PX; /* radeon_device_init should report only fatal error -- 2.34.1
[Nouveau] [PATCH v2 0/9] Improvements to pcie_bandwidth_available() for eGPUs
Downstream drivers are getting the wrong values from pcie_bandwidth_available() which is causing problems for performance of eGPUs. This series overhauls Thunderbolt related device detection and uses the changes to change the behavior of pcie_bandwidth_available(). NOTE: This series is currently based on top of v6.6 + this change that will be merged for 6.7: Link: https://patchwork.freedesktop.org/patch/564738/ v1->v2: * Rename is_thunderbolt * Look for _DSD instead of link * Drop pci_is_thunderbolt_attached() from all drivers * Adjust links * Adjust commit messages * Add quirk for Tiger Lake Mario Limonciello (9): drm/nouveau: Switch from pci_is_thunderbolt_attached() to dev_is_removable() drm/radeon: Switch from pci_is_thunderbolt_attached() to dev_is_removable() PCI: Drop pci_is_thunderbolt_attached() PCI: Move the `PCI_CLASS_SERIAL_USB_USB4` definition to common header PCI: pciehp: Move check for is_thunderbolt into a quirk PCI: Rename is_thunderbolt to is_tunneled PCI: ACPI: Detect PCIe root ports that are used for tunneling PCI: Exclude PCIe ports used for tunneling in pcie_bandwidth_available() PCI: Add a quirk to mark 0x8086 : 0x9a23 as supporting PCIe tunneling drivers/gpu/drm/nouveau/nouveau_vga.c | 6 +- drivers/gpu/drm/radeon/radeon_device.c | 4 +- drivers/gpu/drm/radeon/radeon_kms.c| 2 +- drivers/pci/hotplug/pciehp_hpc.c | 6 +- drivers/pci/pci-acpi.c | 16 ++ drivers/pci/pci.c | 76 +- drivers/pci/probe.c| 2 +- drivers/pci/quirks.c | 31 +++ drivers/platform/x86/apple-gmux.c | 2 +- drivers/thunderbolt/nhi.h | 2 - include/linux/pci.h| 25 + include/linux/pci_ids.h| 1 + 12 files changed, 109 insertions(+), 64 deletions(-) -- 2.34.1
Re: [Nouveau] Error "kernel rejected pushbuf" with RTX 3090 (GA102 chip)
On 10/3/23 10:44, Timothy Madden wrote: Hello, Whenever I try to run a graphical application with my RTX 3090 Strix I get the following error in the console output: nouveau: kernel rejected pushbuf: No such device nouveau: ch23: krec 0 pushes 1 bufs 21 relocs 0 Is there maybe a limit to the number of pushbufs in the source code, that I can hard-code to a smaller values, then recompile Mesa ? Where would I find it in the source files ? -- Thank you, Timothy Madden
Re: [Nouveau] [PATCH drm-misc-next v8 09/12] drm/gpuvm: reference count drm_gpuvm structures
On 11/3/23 15:04, Christian König wrote: Am 03.11.23 um 14:14 schrieb Danilo Krummrich: On Fri, Nov 03, 2023 at 08:18:35AM +0100, Christian König wrote: Am 02.11.23 um 00:31 schrieb Danilo Krummrich: Implement reference counting for struct drm_gpuvm. From the design point of view what is that good for? It was discussed in this thread [1]. Essentially, the idea is to make sure that vm_bo->vm is always valid without the driver having the need to take extra care. It also ensures that GPUVM can't be freed with mappings still held. Well in this case I have some objections to this. The lifetime of the VM is driver and use case specific. That's fine, I don't see how this changes with a reference count. Especially we most likely don't want the VM to live longer than the application which originally used it. If you make the GPUVM an independent object you actually open up driver abuse for the lifetime of this. Right, we don't want that. But I don't see how the reference count prevents that. Independant object is relative. struct drm_gpuvm is still embedded into a driver specific structure. It's working the same way as with struct drm_gem_obejct. Additional to that see below for a quite real problem with this. Background is that the most common use case I see is that this object is embedded into something else and a reference count is then not really a good idea. Do you have a specific use-case in mind where this would interfere? Yes, absolutely. For an example see amdgpu_mes_self_test(), here we initialize a temporary amdgpu VM for an in kernel unit test which runs during driver load. When the function returns I need to guarantee that the VM is destroyed or otherwise I will mess up normal operation. Nothing prevents that. The reference counting is well defined. If the driver did not take additional references (which is clearly up to the driver taking care of) and all VM_BOs and mappings are cleaned up, the reference count is guaranteed to be 1 at this point. Also note that if the driver would have not cleaned up all VM_BOs and mappings before shutting down the VM, it would have been a bug anyways and the driver would potentially leak memory and UAF issues. Hence, If the VM is still alive at a point where you don't expect it to be, then it's simply a driver bug. Reference counting is nice when you don't know who else is referring to your VM, but the cost is that you also don't know when the object will guardedly be destroyed. I can trivially work around this by saying that the generic GPUVM object has a different lifetime than the amdgpu specific object, but that opens up doors for use after free again. If your driver never touches the VM's reference count and exits the VM with a clean state (no mappings and no VM_BOs left), effectively, this is the same as having no reference count. In the very worst case you could argue that we trade a potential UAF *and* memroy leak (no reference count) with *only* a memory leak (with reference count), which to me seems reasonable. Regards, Christian. Thanks, Christian. [1]https://lore.kernel.org/dri-devel/6fa058a4-20d3-44b9-af58-755cfb375...@redhat.com/ Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/drm_gpuvm.c| 44 +++--- drivers/gpu/drm/nouveau/nouveau_uvmm.c | 20 +--- include/drm/drm_gpuvm.h| 31 +- 3 files changed, 78 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c index 53e2c406fb04..6a88eafc5229 100644 --- a/drivers/gpu/drm/drm_gpuvm.c +++ b/drivers/gpu/drm/drm_gpuvm.c @@ -746,6 +746,8 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char *name, gpuvm->rb.tree = RB_ROOT_CACHED; INIT_LIST_HEAD(>rb.list); + kref_init(>kref); + gpuvm->name = name ? name : "unknown"; gpuvm->flags = flags; gpuvm->ops = ops; @@ -770,15 +772,8 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char *name, } EXPORT_SYMBOL_GPL(drm_gpuvm_init); -/** - * drm_gpuvm_destroy() - cleanup a _gpuvm - * @gpuvm: pointer to the _gpuvm to clean up - * - * Note that it is a bug to call this function on a manager that still - * holds GPU VA mappings. - */ -void -drm_gpuvm_destroy(struct drm_gpuvm *gpuvm) +static void +drm_gpuvm_fini(struct drm_gpuvm *gpuvm) { gpuvm->name = NULL; @@ -790,7 +785,33 @@ drm_gpuvm_destroy(struct drm_gpuvm *gpuvm) drm_gem_object_put(gpuvm->r_obj); } -EXPORT_SYMBOL_GPL(drm_gpuvm_destroy); + +static void +drm_gpuvm_free(struct kref *kref) +{ + struct drm_gpuvm *gpuvm = container_of(kref, struct drm_gpuvm, kref); + + if (drm_WARN_ON(gpuvm->drm, !gpuvm->ops->vm_free)) + return; + + drm_gpuvm_fini(gpuvm); + + gpuvm->ops->vm_free(gpuvm); +} + +/** + * drm_gpuvm_bo_put() - drop a struct drm_gpuvm reference + * @gpuvm: the _gpuvm to release the reference of + * + * This releases
Re: [Nouveau] [PATCH drm-misc-next v8 09/12] drm/gpuvm: reference count drm_gpuvm structures
Am 03.11.23 um 14:14 schrieb Danilo Krummrich: On Fri, Nov 03, 2023 at 08:18:35AM +0100, Christian König wrote: Am 02.11.23 um 00:31 schrieb Danilo Krummrich: Implement reference counting for struct drm_gpuvm. From the design point of view what is that good for? It was discussed in this thread [1]. Essentially, the idea is to make sure that vm_bo->vm is always valid without the driver having the need to take extra care. It also ensures that GPUVM can't be freed with mappings still held. Well in this case I have some objections to this. The lifetime of the VM is driver and use case specific. Especially we most likely don't want the VM to live longer than the application which originally used it. If you make the GPUVM an independent object you actually open up driver abuse for the lifetime of this. Additional to that see below for a quite real problem with this. Background is that the most common use case I see is that this object is embedded into something else and a reference count is then not really a good idea. Do you have a specific use-case in mind where this would interfere? Yes, absolutely. For an example see amdgpu_mes_self_test(), here we initialize a temporary amdgpu VM for an in kernel unit test which runs during driver load. When the function returns I need to guarantee that the VM is destroyed or otherwise I will mess up normal operation. Reference counting is nice when you don't know who else is referring to your VM, but the cost is that you also don't know when the object will guardedly be destroyed. I can trivially work around this by saying that the generic GPUVM object has a different lifetime than the amdgpu specific object, but that opens up doors for use after free again. Regards, Christian. Thanks, Christian. [1]https://lore.kernel.org/dri-devel/6fa058a4-20d3-44b9-af58-755cfb375...@redhat.com/ Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/drm_gpuvm.c| 44 +++--- drivers/gpu/drm/nouveau/nouveau_uvmm.c | 20 +--- include/drm/drm_gpuvm.h| 31 +- 3 files changed, 78 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c index 53e2c406fb04..6a88eafc5229 100644 --- a/drivers/gpu/drm/drm_gpuvm.c +++ b/drivers/gpu/drm/drm_gpuvm.c @@ -746,6 +746,8 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char *name, gpuvm->rb.tree = RB_ROOT_CACHED; INIT_LIST_HEAD(>rb.list); + kref_init(>kref); + gpuvm->name = name ? name : "unknown"; gpuvm->flags = flags; gpuvm->ops = ops; @@ -770,15 +772,8 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char *name, } EXPORT_SYMBOL_GPL(drm_gpuvm_init); -/** - * drm_gpuvm_destroy() - cleanup a _gpuvm - * @gpuvm: pointer to the _gpuvm to clean up - * - * Note that it is a bug to call this function on a manager that still - * holds GPU VA mappings. - */ -void -drm_gpuvm_destroy(struct drm_gpuvm *gpuvm) +static void +drm_gpuvm_fini(struct drm_gpuvm *gpuvm) { gpuvm->name = NULL; @@ -790,7 +785,33 @@ drm_gpuvm_destroy(struct drm_gpuvm *gpuvm) drm_gem_object_put(gpuvm->r_obj); } -EXPORT_SYMBOL_GPL(drm_gpuvm_destroy); + +static void +drm_gpuvm_free(struct kref *kref) +{ + struct drm_gpuvm *gpuvm = container_of(kref, struct drm_gpuvm, kref); + + if (drm_WARN_ON(gpuvm->drm, !gpuvm->ops->vm_free)) + return; + + drm_gpuvm_fini(gpuvm); + + gpuvm->ops->vm_free(gpuvm); +} + +/** + * drm_gpuvm_bo_put() - drop a struct drm_gpuvm reference + * @gpuvm: the _gpuvm to release the reference of + * + * This releases a reference to @gpuvm. + */ +void +drm_gpuvm_put(struct drm_gpuvm *gpuvm) +{ + if (gpuvm) + kref_put(>kref, drm_gpuvm_free); +} +EXPORT_SYMBOL_GPL(drm_gpuvm_put); static int __drm_gpuva_insert(struct drm_gpuvm *gpuvm, @@ -843,7 +864,7 @@ drm_gpuva_insert(struct drm_gpuvm *gpuvm, if (unlikely(!drm_gpuvm_range_valid(gpuvm, addr, range))) return -EINVAL; - return __drm_gpuva_insert(gpuvm, va); + return __drm_gpuva_insert(drm_gpuvm_get(gpuvm), va); } EXPORT_SYMBOL_GPL(drm_gpuva_insert); @@ -876,6 +897,7 @@ drm_gpuva_remove(struct drm_gpuva *va) } __drm_gpuva_remove(va); + drm_gpuvm_put(va->vm); } EXPORT_SYMBOL_GPL(drm_gpuva_remove); diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c b/drivers/gpu/drm/nouveau/nouveau_uvmm.c index 54be12c1272f..cb2f06565c46 100644 --- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c +++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c @@ -1780,6 +1780,18 @@ nouveau_uvmm_bo_unmap_all(struct nouveau_bo *nvbo) } } +static void +nouveau_uvmm_free(struct drm_gpuvm *gpuvm) +{ + struct nouveau_uvmm *uvmm = uvmm_from_gpuvm(gpuvm); + + kfree(uvmm); +} + +static const struct drm_gpuvm_ops gpuvm_ops = { + .vm_free = nouveau_uvmm_free, +}; + int
Re: [Nouveau] [PATCH drm-misc-next v8 09/12] drm/gpuvm: reference count drm_gpuvm structures
On Fri, Nov 03, 2023 at 08:18:35AM +0100, Christian König wrote: > Am 02.11.23 um 00:31 schrieb Danilo Krummrich: > > Implement reference counting for struct drm_gpuvm. > > From the design point of view what is that good for? It was discussed in this thread [1]. Essentially, the idea is to make sure that vm_bo->vm is always valid without the driver having the need to take extra care. It also ensures that GPUVM can't be freed with mappings still held. > > Background is that the most common use case I see is that this object is > embedded into something else and a reference count is then not really a good > idea. Do you have a specific use-case in mind where this would interfere? > > Thanks, > Christian. [1] https://lore.kernel.org/dri-devel/6fa058a4-20d3-44b9-af58-755cfb375...@redhat.com/ > > > > > Signed-off-by: Danilo Krummrich > > --- > > drivers/gpu/drm/drm_gpuvm.c| 44 +++--- > > drivers/gpu/drm/nouveau/nouveau_uvmm.c | 20 +--- > > include/drm/drm_gpuvm.h| 31 +- > > 3 files changed, 78 insertions(+), 17 deletions(-) > > > > diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c > > index 53e2c406fb04..6a88eafc5229 100644 > > --- a/drivers/gpu/drm/drm_gpuvm.c > > +++ b/drivers/gpu/drm/drm_gpuvm.c > > @@ -746,6 +746,8 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char > > *name, > > gpuvm->rb.tree = RB_ROOT_CACHED; > > INIT_LIST_HEAD(>rb.list); > > + kref_init(>kref); > > + > > gpuvm->name = name ? name : "unknown"; > > gpuvm->flags = flags; > > gpuvm->ops = ops; > > @@ -770,15 +772,8 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char > > *name, > > } > > EXPORT_SYMBOL_GPL(drm_gpuvm_init); > > -/** > > - * drm_gpuvm_destroy() - cleanup a _gpuvm > > - * @gpuvm: pointer to the _gpuvm to clean up > > - * > > - * Note that it is a bug to call this function on a manager that still > > - * holds GPU VA mappings. > > - */ > > -void > > -drm_gpuvm_destroy(struct drm_gpuvm *gpuvm) > > +static void > > +drm_gpuvm_fini(struct drm_gpuvm *gpuvm) > > { > > gpuvm->name = NULL; > > @@ -790,7 +785,33 @@ drm_gpuvm_destroy(struct drm_gpuvm *gpuvm) > > drm_gem_object_put(gpuvm->r_obj); > > } > > -EXPORT_SYMBOL_GPL(drm_gpuvm_destroy); > > + > > +static void > > +drm_gpuvm_free(struct kref *kref) > > +{ > > + struct drm_gpuvm *gpuvm = container_of(kref, struct drm_gpuvm, kref); > > + > > + if (drm_WARN_ON(gpuvm->drm, !gpuvm->ops->vm_free)) > > + return; > > + > > + drm_gpuvm_fini(gpuvm); > > + > > + gpuvm->ops->vm_free(gpuvm); > > +} > > + > > +/** > > + * drm_gpuvm_bo_put() - drop a struct drm_gpuvm reference > > + * @gpuvm: the _gpuvm to release the reference of > > + * > > + * This releases a reference to @gpuvm. > > + */ > > +void > > +drm_gpuvm_put(struct drm_gpuvm *gpuvm) > > +{ > > + if (gpuvm) > > + kref_put(>kref, drm_gpuvm_free); > > +} > > +EXPORT_SYMBOL_GPL(drm_gpuvm_put); > > static int > > __drm_gpuva_insert(struct drm_gpuvm *gpuvm, > > @@ -843,7 +864,7 @@ drm_gpuva_insert(struct drm_gpuvm *gpuvm, > > if (unlikely(!drm_gpuvm_range_valid(gpuvm, addr, range))) > > return -EINVAL; > > - return __drm_gpuva_insert(gpuvm, va); > > + return __drm_gpuva_insert(drm_gpuvm_get(gpuvm), va); > > } > > EXPORT_SYMBOL_GPL(drm_gpuva_insert); > > @@ -876,6 +897,7 @@ drm_gpuva_remove(struct drm_gpuva *va) > > } > > __drm_gpuva_remove(va); > > + drm_gpuvm_put(va->vm); > > } > > EXPORT_SYMBOL_GPL(drm_gpuva_remove); > > diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c > > b/drivers/gpu/drm/nouveau/nouveau_uvmm.c > > index 54be12c1272f..cb2f06565c46 100644 > > --- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c > > +++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c > > @@ -1780,6 +1780,18 @@ nouveau_uvmm_bo_unmap_all(struct nouveau_bo *nvbo) > > } > > } > > +static void > > +nouveau_uvmm_free(struct drm_gpuvm *gpuvm) > > +{ > > + struct nouveau_uvmm *uvmm = uvmm_from_gpuvm(gpuvm); > > + > > + kfree(uvmm); > > +} > > + > > +static const struct drm_gpuvm_ops gpuvm_ops = { > > + .vm_free = nouveau_uvmm_free, > > +}; > > + > > int > > nouveau_uvmm_ioctl_vm_init(struct drm_device *dev, > >void *data, > > @@ -1830,7 +1842,7 @@ nouveau_uvmm_ioctl_vm_init(struct drm_device *dev, > >NOUVEAU_VA_SPACE_END, > >init->kernel_managed_addr, > >init->kernel_managed_size, > > - NULL); > > + _ops); > > /* GPUVM takes care from here on. */ > > drm_gem_object_put(r_obj); > > @@ -1849,8 +1861,7 @@ nouveau_uvmm_ioctl_vm_init(struct drm_device *dev, > > return 0; > > out_gpuvm_fini: > > - drm_gpuvm_destroy(>base); > > - kfree(uvmm); > > + drm_gpuvm_put(>base); > > out_unlock: > > mutex_unlock(>mutex); > > return ret; > > @@ -1902,7 +1913,6 @@ nouveau_uvmm_fini(struct nouveau_uvmm
Re: [Nouveau] [PATCH drm-misc-next v8 09/12] drm/gpuvm: reference count drm_gpuvm structures
Am 02.11.23 um 00:31 schrieb Danilo Krummrich: Implement reference counting for struct drm_gpuvm. From the design point of view what is that good for? Background is that the most common use case I see is that this object is embedded into something else and a reference count is then not really a good idea. Thanks, Christian. Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/drm_gpuvm.c| 44 +++--- drivers/gpu/drm/nouveau/nouveau_uvmm.c | 20 +--- include/drm/drm_gpuvm.h| 31 +- 3 files changed, 78 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c index 53e2c406fb04..6a88eafc5229 100644 --- a/drivers/gpu/drm/drm_gpuvm.c +++ b/drivers/gpu/drm/drm_gpuvm.c @@ -746,6 +746,8 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char *name, gpuvm->rb.tree = RB_ROOT_CACHED; INIT_LIST_HEAD(>rb.list); + kref_init(>kref); + gpuvm->name = name ? name : "unknown"; gpuvm->flags = flags; gpuvm->ops = ops; @@ -770,15 +772,8 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char *name, } EXPORT_SYMBOL_GPL(drm_gpuvm_init); -/** - * drm_gpuvm_destroy() - cleanup a _gpuvm - * @gpuvm: pointer to the _gpuvm to clean up - * - * Note that it is a bug to call this function on a manager that still - * holds GPU VA mappings. - */ -void -drm_gpuvm_destroy(struct drm_gpuvm *gpuvm) +static void +drm_gpuvm_fini(struct drm_gpuvm *gpuvm) { gpuvm->name = NULL; @@ -790,7 +785,33 @@ drm_gpuvm_destroy(struct drm_gpuvm *gpuvm) drm_gem_object_put(gpuvm->r_obj); } -EXPORT_SYMBOL_GPL(drm_gpuvm_destroy); + +static void +drm_gpuvm_free(struct kref *kref) +{ + struct drm_gpuvm *gpuvm = container_of(kref, struct drm_gpuvm, kref); + + if (drm_WARN_ON(gpuvm->drm, !gpuvm->ops->vm_free)) + return; + + drm_gpuvm_fini(gpuvm); + + gpuvm->ops->vm_free(gpuvm); +} + +/** + * drm_gpuvm_bo_put() - drop a struct drm_gpuvm reference + * @gpuvm: the _gpuvm to release the reference of + * + * This releases a reference to @gpuvm. + */ +void +drm_gpuvm_put(struct drm_gpuvm *gpuvm) +{ + if (gpuvm) + kref_put(>kref, drm_gpuvm_free); +} +EXPORT_SYMBOL_GPL(drm_gpuvm_put); static int __drm_gpuva_insert(struct drm_gpuvm *gpuvm, @@ -843,7 +864,7 @@ drm_gpuva_insert(struct drm_gpuvm *gpuvm, if (unlikely(!drm_gpuvm_range_valid(gpuvm, addr, range))) return -EINVAL; - return __drm_gpuva_insert(gpuvm, va); + return __drm_gpuva_insert(drm_gpuvm_get(gpuvm), va); } EXPORT_SYMBOL_GPL(drm_gpuva_insert); @@ -876,6 +897,7 @@ drm_gpuva_remove(struct drm_gpuva *va) } __drm_gpuva_remove(va); + drm_gpuvm_put(va->vm); } EXPORT_SYMBOL_GPL(drm_gpuva_remove); diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c b/drivers/gpu/drm/nouveau/nouveau_uvmm.c index 54be12c1272f..cb2f06565c46 100644 --- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c +++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c @@ -1780,6 +1780,18 @@ nouveau_uvmm_bo_unmap_all(struct nouveau_bo *nvbo) } } +static void +nouveau_uvmm_free(struct drm_gpuvm *gpuvm) +{ + struct nouveau_uvmm *uvmm = uvmm_from_gpuvm(gpuvm); + + kfree(uvmm); +} + +static const struct drm_gpuvm_ops gpuvm_ops = { + .vm_free = nouveau_uvmm_free, +}; + int nouveau_uvmm_ioctl_vm_init(struct drm_device *dev, void *data, @@ -1830,7 +1842,7 @@ nouveau_uvmm_ioctl_vm_init(struct drm_device *dev, NOUVEAU_VA_SPACE_END, init->kernel_managed_addr, init->kernel_managed_size, - NULL); + _ops); /* GPUVM takes care from here on. */ drm_gem_object_put(r_obj); @@ -1849,8 +1861,7 @@ nouveau_uvmm_ioctl_vm_init(struct drm_device *dev, return 0; out_gpuvm_fini: - drm_gpuvm_destroy(>base); - kfree(uvmm); + drm_gpuvm_put(>base); out_unlock: mutex_unlock(>mutex); return ret; @@ -1902,7 +1913,6 @@ nouveau_uvmm_fini(struct nouveau_uvmm *uvmm) mutex_lock(>mutex); nouveau_vmm_fini(>vmm); - drm_gpuvm_destroy(>base); - kfree(uvmm); + drm_gpuvm_put(>base); mutex_unlock(>mutex); } diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h index 0c2e24155a93..4e6e1fd3485a 100644 --- a/include/drm/drm_gpuvm.h +++ b/include/drm/drm_gpuvm.h @@ -247,6 +247,11 @@ struct drm_gpuvm { struct list_head list; } rb; + /** +* @kref: reference count of this object +*/ + struct kref kref; + /** * @kernel_alloc_node: * @@ -273,7 +278,23 @@ void drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char *name, u64 start_offset, u64 range, u64 reserve_offset, u64 reserve_range,