[PULL] drm-xe-fixes
Hi Dave and Simona, drm-xe-fixes for 6.12-rc2. A few fixes, particularly on error paths and corner cases. We have more than usual as I decided to skip last week pull request. Some tuning for Xe2 were missing and also got updated to match the spec. Thanks Lucas De Marchi drm-xe-fixes-2024-10-03: Driver Changes: - Restore pci state on resume (Rodrigo Vivi) - Fix locking on submission, queue and vm (Matthew Auld, Matthew Brost) - Fix UAF on queue destruction (Matthew Auld) - Fix resource release on freq init error path (He Lugang) - Use rw_semaphore to reduce contention on ASID->VM lookup (Matthew Brost) - Fix steering for media on Xe2_HPM (Gustavo Sousa) - Tuning updates to Xe2 (Gustavo Sousa) - Resume TDR after GT reset to prevent jobs running forever (Matthew Brost) - Move id allocation to avoid userspace using a guessed number to trigger UAF (Matthew Auld, Matthew Brost) - Fix OA stream close preventing pbatch buffers to complete (José) - Fix NPD when migrating memory on LNL (Zhanjun Dong) - Fix memory leak when aborting binds (Matthew Brost) The following changes since commit 9852d85ec9d492ebef56dc5f229416c925758edc: Linux 6.12-rc1 (2024-09-29 15:06:19 -0700) are available in the Git repository at: https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-fixes-2024-10-03 for you to fetch changes up to a6f3b2527375c786f2eff77d3ee8b805bcfe026d: drm/xe: Fix memory leak when aborting binds (2024-10-03 01:24:54 -0500) Driver Changes: - Restore pci state on resume (Rodrigo Vivi) - Fix locking on submission, queue and vm (Matthew Auld, Matthew Brost) - Fix UAF on queue destruction (Matthew Auld) - Fix resource release on freq init error path (He Lugang) - Use rw_semaphore to reduce contention on ASID->VM lookup (Matthew Brost) - Fix steering for media on Xe2_HPM (Gustavo Sousa) - Tuning updates to Xe2 (Gustavo Sousa) - Resume TDR after GT reset to prevent jobs running forever (Matthew Brost) - Move id allocation to avoid userspace using a guessed number to trigger UAF (Matthew Auld, Matthew Brost) - Fix OA stream close preventing pbatch buffers to complete (José) - Fix NPD when migrating memory on LNL (Zhanjun Dong) - Fix memory leak when aborting binds (Matthew Brost) Gustavo Sousa (3): drm/xe/mcr: Use Xe2_LPM steering tables for Xe2_HPM drm/xe/xe2: Extend performance tuning to media GT drm/xe/xe2: Add performance tuning for L3 cache flushing He Lugang (1): drm/xe: use devm_add_action_or_reset() helper José Roberto de Souza (1): drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close Matthew Auld (4): drm/xe/guc_submit: add missing locking in wedged_fini drm/xe: fix UAF around queue destruction drm/xe/vm: move xa_alloc to prevent UAF drm/xe/queue: move xa_alloc to prevent UAF Matthew Brost (5): drm/xe: Convert to USM lock to rwsem drm/xe: Use helper for ASID -> VM in GPU faults and access counters drm/xe: Resume TDR after GT reset drm/xe: Clean up VM / exec queue file lock usage. drm/xe: Fix memory leak when aborting binds Rodrigo Vivi (1): drm/xe: Restore pci state upon resume Zhanjun Dong (1): drm/xe: Prevent null pointer access in xe_migrate_copy drivers/gpu/drm/xe/regs/xe_gt_regs.h | 11 ++ drivers/gpu/drm/xe/xe_bo.c| 4 ++-- drivers/gpu/drm/xe/xe_device.c| 12 +-- drivers/gpu/drm/xe/xe_device_types.h | 19 + drivers/gpu/drm/xe/xe_drm_client.c| 9 +++- drivers/gpu/drm/xe/xe_exec_queue.c| 6 +++--- drivers/gpu/drm/xe/xe_gpu_scheduler.c | 5 + drivers/gpu/drm/xe/xe_gpu_scheduler.h | 2 ++ drivers/gpu/drm/xe/xe_gt_freq.c | 4 ++-- drivers/gpu/drm/xe/xe_gt_mcr.c| 2 +- drivers/gpu/drm/xe/xe_gt_pagefault.c | 39 +++ drivers/gpu/drm/xe/xe_gt_sysfs.c | 2 +- drivers/gpu/drm/xe/xe_guc_submit.c| 37 ++--- drivers/gpu/drm/xe/xe_guc_types.h | 2 ++ drivers/gpu/drm/xe/xe_oa.c| 9 +++- drivers/gpu/drm/xe/xe_pci.c | 2 ++ drivers/gpu/drm/xe/xe_pt.c| 2 +- drivers/gpu/drm/xe/xe_tuning.c| 28 + drivers/gpu/drm/xe/xe_vm.c| 28 +++-- 19 files changed, 159 insertions(+), 64 deletions(-)
Re: linux-next: build failure after merge of the kspp tree
On Thu, Sep 19, 2024 at 09:27:52AM GMT, Stephen Rothwell wrote: Hi all, On Mon, 9 Sep 2024 19:59:39 +1000 Stephen Rothwell wrote: After merging the kspp tree, today's linux-next build (x86_64 allmodconfig) failed like this: drivers/gpu/drm/xe/xe_gt_idle.c:56:27: error: redefinition of 'str_up_down' 56 | static inline const char *str_up_down(bool v) | ^~~ In file included from include/linux/string_helpers.h:7, from drivers/gpu/drm/xe/xe_assert.h:9, from drivers/gpu/drm/xe/xe_force_wake.h:9, from drivers/gpu/drm/xe/xe_gt_idle.c:8: include/linux/string_choices.h:62:27: note: previous definition of 'str_up_down' with type 'const char *(bool)' {aka 'const char *(_Bool)'} 62 | static inline const char *str_up_down(bool v) | ^~~ Caused by commit a98ae7f045b2 ("lib/string_choices: Add str_up_down() helper") interacting with commit 0914c1e45d3a ("drm/xe/xe_gt_idle: add debugfs entry for powergating info") from the drm-xe tree. I have applied the following patch for today. From: Stephen Rothwell Date: Mon, 9 Sep 2024 19:40:17 +1000 Subject: [PATCH] fix up for "lib/string_choices: Add str_up_down() helper" interacting wit commit "drm/xe/xe_gt_idle: add debugfs entry for powergating info" from the drm-xe tree. Signed-off-by: Stephen Rothwell --- drivers/gpu/drm/xe/xe_gt_idle.c | 5 - 1 file changed, 5 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_gt_idle.c b/drivers/gpu/drm/xe/xe_gt_idle.c index 85a35ed153a3..0f98c1539c64 100644 --- a/drivers/gpu/drm/xe/xe_gt_idle.c +++ b/drivers/gpu/drm/xe/xe_gt_idle.c @@ -53,11 +53,6 @@ pc_to_xe(struct xe_guc_pc *pc) return gt_to_xe(gt); } -static inline const char *str_up_down(bool v) -{ - return v ? "up" : "down"; -} - static const char *gt_idle_state_to_string(enum xe_gt_idle_state state) { switch (state) { -- 2.45.2 This is now needed in the merge between Linus' tree and the drm-xe tree. Thanks. This not going to 6.12. It's targeted to 6.13, so we should fix it when merging drm-next back to drm-xe-next. Lucas De Marchi -- Cheers, Stephen Rothwell
Re: [PULL] drm-xe-next-fixes
On Thu, Sep 19, 2024 at 09:56:47PM GMT, Lucas De Marchi wrote: Hi Dave and Simona, A few fixes for 6.11-rc1. oops, I meant 6.12-rc1, of course :) Lucas De Marchi
[PULL] drm-xe-next-fixes
Hi Dave and Simona, A few fixes for 6.11-rc1. Thanks Lucas De Marchi drm-xe-next-fixes-2024-09-19: Driver Changes: - Fix macro for checking minimum GuC version (Michal Wajdeczko) - Fix CCS offset calculation for some BMG SKUs (Matthew Auld) - Fix locking on memory usage reporting via fdinfo and BO destroy (Matthew Auld) - Fix GPU page fault handler on a closed VM (Matthew Brost) - Fix overflow in oa batch buffer (José) The following changes since commit ae2c6d8b3b88c176dff92028941a4023f1b4cb91: Merge tag 'drm-xe-next-fixes-2024-09-12' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next (2024-09-17 14:53:34 +1000) are available in the Git repository at: https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-next-fixes-2024-09-19 for you to fetch changes up to 6c10ba06bb1b48acce6d4d9c1e33beb9954f1788: drm/xe/oa: Fix overflow in oa batch buffer (2024-09-17 23:31:59 -0500) Driver Changes: - Fix macro for checking minimum GuC version (Michal Wajdeczko) - Fix CCS offset calculation for some BMG SKUs (Matthew Auld) - Fix locking on memory usage reporting via fdinfo and BO destroy (Matthew Auld) - Fix GPU page fault handler on a closed VM (Matthew Brost) - Fix overflow in oa batch buffer (José) José Roberto de Souza (1): drm/xe/oa: Fix overflow in oa batch buffer Matthew Auld (5): drm/xe/vram: fix ccs offset calculation drm/xe/client: fix deadlock in show_meminfo() drm/xe/client: add missing bo locking in show_meminfo() drm/xe/client: use mem_type from the current resource drm/xe/bo: add some annotations in bo_put() Matthew Brost (1): drm/xe: Do not run GPU page fault handler on a closed VM Michal Wajdeczko (1): drm/xe/guc: Fix GUC_{SUBMIT,FIRMWARE}_VER helper macros drivers/gpu/drm/xe/xe_bb.c | 3 ++- drivers/gpu/drm/xe/xe_bo.c | 14 ++ drivers/gpu/drm/xe/xe_bo.h | 6 + drivers/gpu/drm/xe/xe_drm_client.c | 50 +--- drivers/gpu/drm/xe/xe_gt_pagefault.c | 6 + drivers/gpu/drm/xe/xe_guc.h | 6 +++-- drivers/gpu/drm/xe/xe_vram.c | 1 + 7 files changed, 69 insertions(+), 17 deletions(-)
[PULL] drm-xe-next-fixes
Hi Dave and Simona, Fixes for 6.12 cycle. Very normal fixes, mostly on error paths. Maybe something different from previous pull requests is the addition of the workaround, which we were not doing since they don't have a "Fixes" trailer. However as we are enabling Xe2 platforms starting with this release, we are being more careful and adding them when they are really needed. Thanks Lucas De Marchi drm-xe-next-fixes-2024-09-12: Driver Changes: - Fix usefafter-free when provisioning VF (Matthew Auld) - Suppress rpm warning on false positive (Rodrigo) - Fix memleak on ioctl error path (Dafna) - Fix use-after-free while inserting ggtt (Michal Wajdeczko) - Add Wa_15016589081 workaround (Tejas) - Fix error path on suspend (Maarten) The following changes since commit b615b9c36cae0468491547206406a909a9a37f26: Merge v6.11-rc7 into drm-next (2024-09-11 09:18:15 +0200) are available in the Git repository at: https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-next-fixes-2024-09-12 for you to fetch changes up to f1a4dceeb2bd4b4478e4f0c77dac55569d153fb3: drm/xe: Fix missing conversion to xe_display_pm_runtime_resume (2024-09-12 18:04:36 -0500) Driver Changes: - Fix usefafter-free when provisioning VF (Matthew Auld) - Suppress rpm warning on false positive (Rodrigo) - Fix memleak on ioctl error path (Dafna) - Fix use-after-free while inserting ggtt (Michal Wajdeczko) - Add Wa_15016589081 workaround (Tejas) - Fix error path on suspend (Maarten) Arnd Bergmann (1): drm/xe: fix build warning with CONFIG_PM=n Dafna Hirschfeld (1): drm/xe: fix missing 'xe_vm_put' Maarten Lankhorst (1): drm/xe: Fix missing conversion to xe_display_pm_runtime_resume Matthew Auld (1): drm/xe: prevent potential UAF in pf_provision_vf_ggtt() Michal Wajdeczko (1): drm/xe: Don't keep stale pointer to bo->ggtt_node Rodrigo Vivi (1): drm/xe: Suppress missing outer rpm protection warning Tejas Upadhyay (1): drm/xe/xe2hpg: Add Wa_15016589081 drivers/gpu/drm/xe/regs/xe_gt_regs.h | 1 + drivers/gpu/drm/xe/xe_exec_queue.c | 4 +++- drivers/gpu/drm/xe/xe_ggtt.c | 7 +-- drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 8 drivers/gpu/drm/xe/xe_pm.c | 23 +-- drivers/gpu/drm/xe/xe_wa.c | 4 6 files changed, 38 insertions(+), 9 deletions(-)
Re: [PATCH v4 1/3] drm: Introduce device wedged event
On Tue, Sep 10, 2024 at 06:53:19PM GMT, Raag Jadav wrote: On Mon, Sep 09, 2024 at 03:01:50PM -0500, Lucas De Marchi wrote: On Sun, Sep 08, 2024 at 11:08:39PM GMT, Asahi Lina wrote: > On 9/8/24 12:07 AM, Lucas De Marchi wrote: > > On Sat, Sep 07, 2024 at 08:38:30PM GMT, Asahi Lina wrote: > > > On 9/6/24 6:42 PM, Raag Jadav wrote: > > > > Introduce device wedged event, which will notify userspace of wedged > > > > (hanged/unusable) state of the DRM device through a uevent. This is > > > > useful especially in cases where the device is in unrecoverable state > > > > and requires userspace intervention for recovery. > > > > > > > > Purpose of this implementation is to be vendor agnostic. Userspace > > > > consumers (sysadmin) can define udev rules to parse this event and > > > > take respective action to recover the device. > > > > > > > > Consumer expectations: > > > > -- > > > > 1) Unbind driver > > > > 2) Reset bus device > > > > 3) Re-bind driver > > > > > > Is this supposed to be normative? For drm/asahi we have a "wedged" > > > concept (firmware crashed), but the only possible recovery action is a > > > full system reboot (which might still be desirable to allow userspace to > > > trigger automatically in some scenarios) since there is no bus-level > > > reset and no firmware reload possible. > > > > maybe let drivers hint possible/supported recovery mechanisms and then > > sysadmin chooses what to do? > > How would we do this? A textual value for the event or something like > that? ("WEDGED=bus-reset" vs "WEDGED=reboot"?) If there's a need for more than one, than I think exposing the supported ones sorted by "side effect" in sysfs would be good. Something like: $ cat /sys/class/drm/card0/device/wedge_recover rebind bus-reset reboot How do we expect the drivers to flag supported ones? Extra hooks? The comment above... wedge_recover would be a sysfs exposed by the driver to userspace with the supported mechanisms. WEDGED= (which is also crafted by the driver or with explicit functions in drm) would report to userspace the minimum needed mechanism for recovery. Lucas De Marchi
Re: [PATCH 6/7] drm/i915/pmu: Lazy unregister
Hi Peter, On Wed, Jul 24, 2024 at 10:39:57AM GMT, Lucas De Marchi wrote: On Wed, Jul 24, 2024 at 02:41:05PM GMT, Peter Zijlstra wrote: On Tue, Jul 23, 2024 at 10:30:08AM -0500, Lucas De Marchi wrote: On Tue, Jul 23, 2024 at 09:03:25AM GMT, Tvrtko Ursulin wrote: On 22/07/2024 22:06, Lucas De Marchi wrote: > Instead of calling perf_pmu_unregister() when unbinding, defer that to > the destruction of i915 object. Since perf itself holds a reference in > the event, this only happens when all events are gone, which guarantees > i915 is not unregistering the pmu with live events. > > Previously, running the following sequence would crash the system after > ~2 tries: > >1) bind device to i915 >2) wait events to show up on sysfs >3) start perf stat -I 1000 -e i915/rcs0-busy/ >4) unbind driver >5) kill perf > > Most of the time this crashes in perf_pmu_disable() while accessing the > percpu pmu_disable_count. This happens because perf_pmu_unregister() > destroys it with free_percpu(pmu->pmu_disable_count). > > With a lazy unbind, the pmu is only unregistered after (5) as opposed to > after (4). The downside is that if a new bind operation is attempted for > the same device/driver without killing the perf process, i915 will fail > to register the pmu (but still load successfully). This seems better > than completely crashing the system. So effectively allows unbind to succeed without fully unbinding the driver from the device? That sounds like a significant drawback and if so, I wonder if a more complicated solution wouldn't be better after all. Or is there precedence for allowing userspace keeping their paws on unbound devices in this way? keeping the resources alive but "unplunged" while the hardware disappeared is a common thing to do... it's the whole point of the drmm-managed resource for example. If you bind the driver and then unbind it while userspace is holding a ref, next time you try to bind it will come up with a different card number. A similar thing that could be done is to adjust the name of the event - currently we add the mangled pci slot. That said, I agree a better approach would be to allow perf_pmu_unregister() to do its job even when there are open events. On top of that (or as a way to help achieve that), make perf core replace the callbacks with stubs when pmu is unregistered - that would even kill the need for i915's checks on pmu->closed (and fix the lack thereof in other drivers). It can be a can of worms though and may be pushed back by perf core maintainers, so it'd be good have their feedback. I don't think I understand the problem. I also don't understand drivers much -- so that might be the problem. We can bind/unbind the driver to/from the pci device. Example: echo -n ":00:02.0" > /sys/bus/pci/drivers/i915/unbind This is essentially unplugging the HW from the kernel. This will trigger the driver to deinitialize and free up all resources use by that device. So when the driver is binding it does: perf_pmu_register(); When it's unbinding: perf_pmu_unregister(); Reasons to unbind include: - driver testing so then we can unload the module and load it again - device is toast - doing an FLR and rebinding may fix/workaround it - For SR-IOV, in which we are exposing multiple instances of the same underlying PCI device, we may need to bind/unbind on-demand (it's not yet clear if perf_pmu_register() would be called on the VF instances, but listed here just to explain the bind/unbind) - Hotplug So the problem appears to be that the device just disappears without warning? How can a GPU go away like that? Since you have a notion of this device, can't you do this stubbing you talk about? That is, if your internal device reference becomes NULL, let the PMU methods preserve the state like no-ops. It's not clear if you are suggesting these stubs to be in each driver or to be assigned by perf in perf_pmu_unregister(). Some downsides of doing it in the driver: - you can't remove the module as perf will still call module code - need to replicate the stubs in every driver (or the equivalent of pmu->closed checks in i915_pmu.c) I *think* the stubs would be quiet the same for every device, so could be feasible to share them inside perf. Alternatively it could simply shortcut the call, without stubs, by looking at event->hw.state and a new pmu->state. I can give this a try. trying to revive these patches to get this fixed. Are you ok with one of those approaches, i.e. a) either add the stub in the perf side or b) shortcut the calls after perf_pmu_unregister() is called ? Lucas De Marchi thanks Lucas De Marchi And then when the last event goes away, tear down the whole thing. Again, I'm not sure I'm following.
Re: [PATCH v4 1/3] drm: Introduce device wedged event
On Sun, Sep 08, 2024 at 11:08:39PM GMT, Asahi Lina wrote: On 9/8/24 12:07 AM, Lucas De Marchi wrote: On Sat, Sep 07, 2024 at 08:38:30PM GMT, Asahi Lina wrote: On 9/6/24 6:42 PM, Raag Jadav wrote: Introduce device wedged event, which will notify userspace of wedged (hanged/unusable) state of the DRM device through a uevent. This is useful especially in cases where the device is in unrecoverable state and requires userspace intervention for recovery. Purpose of this implementation is to be vendor agnostic. Userspace consumers (sysadmin) can define udev rules to parse this event and take respective action to recover the device. Consumer expectations: -- 1) Unbind driver 2) Reset bus device 3) Re-bind driver Is this supposed to be normative? For drm/asahi we have a "wedged" concept (firmware crashed), but the only possible recovery action is a full system reboot (which might still be desirable to allow userspace to trigger automatically in some scenarios) since there is no bus-level reset and no firmware reload possible. maybe let drivers hint possible/supported recovery mechanisms and then sysadmin chooses what to do? How would we do this? A textual value for the event or something like that? ("WEDGED=bus-reset" vs "WEDGED=reboot"?) If there's a need for more than one, than I think exposing the supported ones sorted by "side effect" in sysfs would be good. Something like: $ cat /sys/class/drm/card0/device/wedge_recover rebind bus-reset reboot Although if there is actually an unrecoverable state like "reboot", you could simply remove the underlying device from the kernel side, with no userspace intervention. Lucas De Marchi ~~ Lina
Re: [PATCH v4 1/3] drm: Introduce device wedged event
On Sat, Sep 07, 2024 at 08:38:30PM GMT, Asahi Lina wrote: On 9/6/24 6:42 PM, Raag Jadav wrote: Introduce device wedged event, which will notify userspace of wedged (hanged/unusable) state of the DRM device through a uevent. This is useful especially in cases where the device is in unrecoverable state and requires userspace intervention for recovery. Purpose of this implementation is to be vendor agnostic. Userspace consumers (sysadmin) can define udev rules to parse this event and take respective action to recover the device. Consumer expectations: -- 1) Unbind driver 2) Reset bus device 3) Re-bind driver Is this supposed to be normative? For drm/asahi we have a "wedged" concept (firmware crashed), but the only possible recovery action is a full system reboot (which might still be desirable to allow userspace to trigger automatically in some scenarios) since there is no bus-level reset and no firmware reload possible. maybe let drivers hint possible/supported recovery mechanisms and then sysadmin chooses what to do? Lucas De Marchi
Re: linux-next: manual merge of the drm-xe tree with the drm-intel tree
On Fri, Sep 06, 2024 at 01:15:02PM GMT, Stephen Rothwell wrote: Hi all, Today's linux-next merge of the drm-xe tree got a conflict in: drivers/gpu/drm/xe/display/xe_display.c between commit: 11d0613af7c5 ("drm/i915/display: include drm/drm_probe_helper.h where needed") from the drm-intel tree and commit: 87d8ecf01544 ("drm/xe: replace #include with ") from the drm-xe tree. I fixed it up (see below) and can carry the fix as necessary. This looks good, thanks Lucas De Marchi is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell diff --cc drivers/gpu/drm/xe/display/xe_display.c index 303d00b99a68,75736faf2a80.. --- a/drivers/gpu/drm/xe/display/xe_display.c +++ b/drivers/gpu/drm/xe/display/xe_display.c @@@ -10,8 -10,7 +10,8 @@@ #include #include +#include - #include + #include #include "soc/intel_dram.h" #include "i915_drv.h"/* FIXME: HAS_DISPLAY() depends on this */
[PULL] drm-xe-next
Hi Dave and Simona, Last drm-xe-next pull for 6.12. I was going to bring only the fixes, but that would be basically most of the changes available. And what's not a fix it's pretty trivial stuff. So I'm including everything. The one outstanding thing is a prep patch to dma-buf that was needed to properly replace a fix we reverted last week. That's being tested for more than a week and proved better than the previous one. thanks Lucas De Marchi drm-xe-next-2024-09-05: Cross-subsystem Changes: - Split dma fence array creation into alloc and arm (Matthew Brost) Driver Changes: - Move kernel_lrc to execlist backend (Ilia) - Fix type width for pcode coommand (Karthik) - Make xe_drm.h include unambiguous (Jani) - Fixes and debug improvements for GSC load (Daniele) - Track resources and VF state by PF (Michal Wajdeczko) - Fix memory leak on error path (Nirmoy) - Cleanup header includes (Matt Roper) - Move pcode logic to tile scope (Matt Roper) - Move hwmon logic to device scope (Matt Roper) - Fix media TLB invalidation (Matthew Brost) - Threshold config fixes for PF (Michal Wajdeczko) - Remove extra "[drm]" from logs (Michal Wajdeczko) - Add missing runtime ref (Rodrigo Vivi) - Fix circular locking on runtime suspend (Rodrigo Vivi) - Fix rpm in TTM swapout path (Thomas) The following changes since commit 3adcf970dc7ec0469ec3116a5a8be9161d17a335: drm/xe/bmg: Drop force_probe requirement (2024-08-28 10:47:03 -0700) are available in the Git repository at: https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-next-2024-09-05 for you to fetch changes up to 34bb7b813ab398106f700b0a6b218509bb0b904c: drm/xe: Use xe_pm_runtime_get in xe_bo_move() if reclaim-safe. (2024-09-04 09:28:09 +0200) Cross-subsystem Changes: - Split dma fence array creation into alloc and arm (Matthew Brost) Driver Changes: - Move kernel_lrc to execlist backend (Ilia) - Fix type width for pcode coommand (Karthik) - Make xe_drm.h include unambiguous (Jani) - Fixes and debug improvements for GSC load (Daniele) - Track resources and VF state by PF (Michal Wajdeczko) - Fix memory leak on error path (Nirmoy) - Cleanup header includes (Matt Roper) - Move pcode logic to tile scope (Matt Roper) - Move hwmon logic to device scope (Matt Roper) - Fix media TLB invalidation (Matthew Brost) - Threshold config fixes for PF (Michal Wajdeczko) - Remove extra "[drm]" from logs (Michal Wajdeczko) - Add missing runtime ref (Rodrigo Vivi) - Fix circular locking on runtime suspend (Rodrigo Vivi) - Fix rpm in TTM swapout path (Thomas) Daniele Ceraolo Spurio (5): drm/xe/gsc: Do not attempt to load the GSC multiple times drm/xe/gsc: Fix FW status if the firmware is already loaded drm/xe/gsc: Track the platform in the compatibility version drm/xe/gsc: Add debugfs to print GSC info drm/xe/gsc: Wedge the device if the GSCCS reset fails Ilia Levi (1): drm/xe: move the kernel lrc from hwe to execlist port Jani Nikula (1): drm/xe: replace #include with Karthik Poosa (1): drm/xe/hwmon: Fix WRITE_I1 param from u32 to u16 Matt Roper (3): drm/xe/display: Drop unnecessary xe_gt.h includes drm/xe/pcode: Treat pcode as per-tile rather than per-GT drm/xe/hwmon: Treat hwmon as a per-device concept Matthew Brost (2): dma-buf: Split out dma fence array create into alloc and arm functions drm/xe: Invalidate media_gt TLBs in PT code Michal Wajdeczko (7): drm/xe/pf: Add function to sanitize VF resources drm/xe/pf: Fix documentation formatting drm/xe/pf: Drop GuC notifications for non-existing VF drm/xe/pf: Improve VF control drm/xe/pf: Add thresholds to the VF KLV config drm/xe/pf: Reset thresholds when releasing a VF config drm/xe: Remove redundant [drm] tag from xe_assert() message Nirmoy Das (1): drm/xe: Fix memory leak on xe_alloc_pf_queue failure Rodrigo Vivi (2): drm/xe: Add missing runtime reference to wedged upon gt_reset drm/xe/display: Avoid encoder_suspend at runtime suspend Thomas Hellström (1): drm/xe: Use xe_pm_runtime_get in xe_bo_move() if reclaim-safe. drivers/dma-buf/dma-fence-array.c | 78 +- drivers/gpu/drm/xe/Makefile|1 + .../gpu/drm/xe/compat-i915-headers/intel_pcode.h |8 +- .../gpu/drm/xe/compat-i915-headers/intel_uncore.h |7 + drivers/gpu/drm/xe/display/intel_fbdev_fb.c|1 - drivers/gpu/drm/xe/display/xe_display.c|8 +- drivers/gpu/drm/xe/display/xe_dsb_buffer.c |1 - drivers/gpu/drm/xe/display/xe_fb_pin.c |1 - drivers/gpu/drm/xe/display/xe_hdcp_gsc.c |1 - drivers/gpu/drm/xe/regs/xe_gsc_regs.h |4 + drivers/gpu/drm/xe/tests/xe_dma_buf.c |
[PULL] drm-xe-next
Hi Dave and Sima, Second drm-xe-next pull request for the 6.12 cycle. This includes the pull request from last week since it was not applied due to the ttm patch. That patch is now reverted and replacement on the back burner. The only UAPI change is actually a fix for building with gcc 5. Aside from the additional fixes compared to last week, 2 important patches to remove the force_probe requirement for LNL and BMG. Those are the first platforms to be officially supported by the xe driver: one integrated and one discrete. For BMG there are still some necessary changes going through the drm-intel-next pull request later this week. Cheers, Lucas De Marchi drm-xe-next-2024-08-28: UAPI Changes: - Fix OA format masks which were breaking build with gcc-5 Cross-subsystem Changes: Driver Changes: - Use dma_fence_chain_free in chain fence unused as a sync (Matthew Brost) - Refactor hw engine lookup and mmio access to be used in more places (Dominik, Matt Auld, Mika Kuoppala) - Enable priority mem read for Xe2 and later (Pallavi Mishra) - Fix PL1 disable flow in xe_hwmon_power_max_write (Karthik) - Fix refcount and speedup devcoredump (Matthew Brost) - Add performance tuning changes to Xe2 (Akshata, Shekhar) - Fix OA sysfs entry (Ashutosh) - Add first GuC firmware support for BMG (Julia) - Bump minimum GuC firmware for platforms under force_probe to match LNL and BMG (Julia) - Fix access check on user fence creation (Nirmoy) - Add/document workarounds for Xe2 (Julia, Daniele, John, Tejas) - Document workaround and use proper WA infra (Matt Roper) - Fix VF configuration on media GT (Michal Wajdeczko) - Fix VM dma-resv lock (Matthew Brost) - Allow suspend/resume exec queue backend op to be called multiple times (Matthew Brost) - Add GT stats to debugfs (Nirmoy) - Add hwconfig to debugfs (Matt Roper) - Compile out all debugfs code with ONFIG_DEUBG_FS=n (Lucas) - Remove dead kunit code (Jani Nikula) - Refactor drvdata storing to help display (Jani Nikula) - Cleanup unsused xe parameter in pte handling (Himal) - Rename s/enable_display/probe_display/ for clarity (Lucas) - Fix missing MCR annotation in couple of registers (Tejas) - Fix DGFX display suspend/resume (Maarten) - Prepare exec_queue_kill for PXP handling (Daniele) - Fix devm/drmm issues (Daniele, Matthew Brost) - Fix tile and ggtt fini sequences (Matthew Brost) - Fix crashes when probing without firmware in place (Daniele, Matthew Brost) - Use xe_managed for kernel BOs (Daniele, Matthew Brost) - Future-proof dss_per_group calculation by using hwconfig (Matt Roper) - Use reserved copy engine for user binds on faulting devices (Matthew Brost) - Allow mixing dma-fence jobs and long-running faulting jobs (Francois) - Cleanup redundant arg when creating use BO (Nirmoy) - Prevent UAF around preempt fence (Auld) - Fix display suspend/resume (Maarten) - Use vma_pages() helper (Thorsten) - Calculate pagefault queue size (Stuart, Matthew Auld) - Fix missing pagefault wq destroy (Stuart) - Fix lifetime handling of HW fence ctx (Matthew Brost) - Fix order destroy order for jobs (Matthew Brost) - Fix TLB invalidation for media GT (Matthew Brost) - Document GGTT (Rodrigo Vivi) - Refactor GGTT layering and fix runtime outer protection (Rodrigo Vivi) - Handle HPD polling on display pm runtime suspend/resume (Imre, Vinod) - Drop unrequired NULL checks (Apoorva, Himal) - Use separate rpm lockdep map for non-d3cold-capable devices (Thomas Hellström) - Support "nomodeset" kernel command-line option (Thomas Zimmermann) - Drop force_probe requirement for LNL and BMG (Lucas, Balasubramani) The following changes since commit a809b92ee0f84c3f655b16a8b4d04bc3665d954a: Merge tag 'drm-intel-next-2024-08-13' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next (2024-08-16 12:56:42 +1000) are available in the Git repository at: https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-next-2024-08-28 for you to fetch changes up to 3adcf970dc7ec0469ec3116a5a8be9161d17a335: drm/xe/bmg: Drop force_probe requirement (2024-08-28 10:47:03 -0700) UAPI Changes: - Fix OA format masks which were breaking build with gcc-5 Cross-subsystem Changes: Driver Changes: - Use dma_fence_chain_free in chain fence unused as a sync (Matthew Brost) - Refactor hw engine lookup and mmio access to be used in more places (Dominik, Matt Auld, Mika Kuoppala) - Enable priority mem read for Xe2 and later (Pallavi Mishra) - Fix PL1 disable flow in xe_hwmon_power_max_write (Karthik) - Fix refcount and speedup devcoredump (Matthew Brost) - Add performance tuning changes to Xe2 (Akshata, Shekhar) - Fix OA sysfs entry (Ashutosh) - Add first GuC firmware support for BMG (Julia) - Bump minimum GuC firmware for platforms under force_probe to match LNL and BMG (Julia) - Fix access check on user fence creation (Nirmoy) - Add/document workarounds for Xe2 (Julia, Daniele, John, Tejas) - Documen
Re: [PATCH v2] drm/xe: Support 'nomodeset' kernel command-line option
On Tue, 27 Aug 2024 14:09:05 +0200, Thomas Zimmermann wrote: > Setting 'nomodeset' on the kernel command line disables all graphics > drivers with modesetting capabilities, leaving only firmware drivers, > such as simpledrm or efifb. > > Most DRM drivers automatically support 'nomodeset' via DRM's module > helper macros. In xe, which uses regular module_init(), manually call > drm_firmware_drivers_only() to test for 'nomodeset'. Do not register > the driver if set. > > [...] Applied to drm-xe-next, thanks! [1/1] drm/xe: Support 'nomodeset' kernel command-line option commit: 014125c64d09e58e90dde49fbb57d802a13e2559 Best regards, -- Lucas De Marchi
Re: [PATCH 2/2] Revert "drm/ttm: Add a flag to allow drivers to skip clear-on-free"
On Tue, Aug 27, 2024 at 11:42:59AM GMT, Thomas Hellström wrote: On Wed, 2024-08-21 at 11:50 +0200, Nirmoy Das wrote: Remove TTM_TT_FLAG_CLEARED_ON_FREE now that XE stopped using this flag. This reverts commit decbfaf06db05fa1f9b33149ebb3c145b44e878f. Cc: Christian König Cc: Himal Prasad Ghimiray Cc: Matthew Auld Cc: Matthew Brost Cc: Thomas Hellström Signed-off-by: Nirmoy Das Reviewed-by: Thomas Hellström Christian, since that commit being reverted was merged through drm-xe... ack on reverting it in our branch? It's actually the only branch where it can be reverted. AFAICS you agreed with the revert in https://lore.kernel.org/intel-xe/02a383c5-db18-47ef-9118-72effd83a...@amd.com/ thanks Lucas De Marchi
Re: [PATCH] drm/xe: Support 'nomodeset' kernel command-line option
On Wed, Aug 21, 2024 at 04:48:23PM GMT, Thomas Zimmermann wrote: Hi Am 21.08.24 um 16:29 schrieb Jani Nikula: On Wed, 21 Aug 2024, Gustavo Sousa wrote: Quoting Thomas Zimmermann (2024-08-21 10:56:59-03:00) Setting 'nomodeset' on the kernel command line disables all graphics drivers with modesetting capabilities; leaving only firmware drivers, such as simpledrm or efifb. Most DRM drivers automatically support 'nomodeset' via DRM's module helper macros. In xe, which uses regular module_init(), manually call drm_firmware_drivers_only() to test for 'nomodeset'. Do not register the driver if set. Signed-off-by: Thomas Zimmermann --- drivers/gpu/drm/xe/xe_module.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c index 923460119cec..60fb7dd26903 100644 --- a/drivers/gpu/drm/xe/xe_module.c +++ b/drivers/gpu/drm/xe/xe_module.c @@ -8,6 +8,8 @@ #include #include +#include + #include "xe_drv.h" #include "xe_hw_fence.h" #include "xe_pci.h" @@ -92,6 +94,9 @@ static int __init xe_init(void) { int err, i; +if (drm_firmware_drivers_only()) +return -ENODEV; + Hm... But what if xe is to be used only for compute or render? Shouldn't we handle this somewhere else? The question becomes, what does "nomodeset" really mean here? That function's name 'firmware drivers only' says it better than the option's name. We used 'nomodeset', because it was there already and had the correct semantics. agreed this should be on a module-level to maintain the behavior already used. If we were not maintaining that behavior, then we should probably not use "nomodeset" and choose something else :). Also we already have the other 2 as module params: probe_display and disable_display, with driver still registering as a drm driver, but leaving the display part out. Thomas, are you going to send a v2 to use the init table? thanks Lucas De Marchi See what i915 does in i915_module.c. i915 and the other drivers for PCI-based hardware don't load at all. Drivers for external displays (e.g., SPI, USB) ignore nomodeset, as these displays are not initialized by firmware. Best regards Thomas Cc: Sima. BR, Jani. Taking a quick look, xe_display_probe() might be a good candidate? -- Gustavo Sousa for (i = 0; i < ARRAY_SIZE(init_funcs); i++) { err = init_funcs[i].init(); if (err) { -- 2.46.0 -- -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Frankenstrasse 146, 90461 Nuernberg, Germany GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman HRB 36809 (AG Nuernberg)
Re: [PATCH] drm/xe: Fix total initialization in xe_ggtt_print_holes()
On Fri, Aug 23, 2024 at 08:47:13PM GMT, Nathan Chancellor wrote: Clang warns (or errors with CONFIG_DRM_WERROR or CONFIG_WERROR): drivers/gpu/drm/xe/xe_ggtt.c:810:3: error: variable 'total' is uninitialized when used here [-Werror,-Wuninitialized] 810 | total += hole_size; | ^ drivers/gpu/drm/xe/xe_ggtt.c:798:11: note: initialize the variable 'total' to silence this warning 798 | u64 total; | ^ | = 0 1 error generated. Move the zero initialization of total from xe_gt_sriov_pf_config_print_available_ggtt() to xe_ggtt_print_holes() to resolve the warning. Fixes: 136367290ea5 ("drm/xe: Introduce xe_ggtt_print_holes") Signed-off-by: Nathan Chancellor Reviewed-by: Lucas De Marchi I will push it soon to drm-xe-next, thanks. Lucas De Marchi --- drivers/gpu/drm/xe/xe_ggtt.c | 2 +- drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c index a3ce91decdce..86fc6afa43bd 100644 --- a/drivers/gpu/drm/xe/xe_ggtt.c +++ b/drivers/gpu/drm/xe/xe_ggtt.c @@ -795,7 +795,7 @@ u64 xe_ggtt_print_holes(struct xe_ggtt *ggtt, u64 alignment, struct drm_printer const struct drm_mm_node *entry; u64 hole_min_start = xe_wopcm_size(tile_to_xe(ggtt->tile)); u64 hole_start, hole_end, hole_size; - u64 total; + u64 total = 0; char buf[10]; mutex_lock(&ggtt->lock); diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c index c8835d9eead6..41ed07b153b5 100644 --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c @@ -2110,7 +2110,7 @@ int xe_gt_sriov_pf_config_print_available_ggtt(struct xe_gt *gt, struct drm_prin { struct xe_ggtt *ggtt = gt_to_tile(gt)->mem.ggtt; u64 alignment = pf_get_ggtt_alignment(gt); - u64 spare, avail, total = 0; + u64 spare, avail, total; char buf[10]; xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt))); --- base-commit: 66a0f6b9f5fc205272035b6ffa4830be51e3f787 change-id: 20240823-drm-xe-fix-total-in-xe_ggtt_print_holes-0cf2c399aa2a Best regards, -- Nathan Chancellor
[PULL] drm-xe-next
Hi Dave and Sima, Second drm-xe-next pull request for the 6.12 cycle. Several fixes here that are also going through drm-xe-fixes. Most of the other changes are the missing bits for Xe2 (LNL and BMG) and general refactors. The only UAPI change is actually a fix for building with gcc 5. I left some commits out since they were too fresh and I didn't have core CI results for them. Plan is to send the final pull request for 6.12 next week with those commits. thanks Lucas De Marchi drm-xe-next-2024-08-22: UAPI Changes: - Fix OA format masks which were breaking build with gcc-5 Cross-subsystem Changes: - drm/ttm: Add a flag to allow drivers to skip clear-on-free Driver Changes: - Use dma_fence_chain_free in chain fence unused as a sync (Matthew Brost) - Refactor hw engine lookup and mmio access to be used in more places (Dominik, Matt Auld, Mika Kuoppala) - Enable priority mem read for Xe2 and later (Pallavi Mishra) - Fix PL1 disable flow in xe_hwmon_power_max_write (Karthik) - Fix refcount and speedup devcoredump (Matthew Brost) - Add performance tuning changes to Xe2 (Akshata, Shekhar) - Fix OA sysfs entry (Ashutosh) - Add first GuC firmware support for BMG (Julia) - Bump minimum GuC firmware for platforms under force_probe to match LNL and BMG (Julia) - Fix access check on user fence creation (Nirmoy) - Add/document workarounds for Xe2 (Julia, Daniele, John, Tejas) - Document workaround and use proper WA infra (Matt Roper) - Fix VF configuration on media GT (Michal Wajdeczko) - Fix VM dma-resv lock (Matthew Brost) - Allow suspend/resume exec queue backend op to be called multiple times (Matthew Brost) - Add GT stats to debugfs (Nirmoy) - Add hwconfig to debugfs (Matt Roper) - Compile out all debugfs code with ONFIG_DEUBG_FS=n (Lucas) - Offload system clear page to GPU (Nirmoy) - Remove dead kunit code (Jani Nikula) - Refactor drvdata storing to help display (Jani Nikula) - Cleanup unsused xe parameter in pte handling (Himal) - Rename s/enable_display/probe_display/ for clarity (Lucas) - Fix missing MCR annotation in couple of registers (Tejas) - Fix DGFX display suspend/resume (Maarten) - Prepare exec_queue_kill for PXP handling (Daniele) - Fix devm/drmm issues (Daniele, Matthew Brost) - Fix tile fini sequence (Brost) - Fix crashes when probing without firmware in place (Daniele) - Use xe_managed for kernel BOs (Daniele, Matthew Brost) - Future-proof dss_per_group calculation by using hwconfig (Matt Roper) - Use reserved copy engine for user binds on faulting devices (Matthew Brost) - Allow mixing dma-fence jobs and long-running faulting jobs (Francois) - Cleanup redundant arg when creating use BO (Nirmoy) - Prevent UAF around preempt fence (Auld) - Fix display suspend/resume (Maarten) - Use vma_pages() helper (Thorsten) The following changes since commit a809b92ee0f84c3f655b16a8b4d04bc3665d954a: Merge tag 'drm-intel-next-2024-08-13' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next (2024-08-16 12:56:42 +1000) are available in the Git repository at: https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-next-2024-08-22 for you to fetch changes up to 23683061805be368c8d1c7e7ff52abc470cac275: drm/xe/lnl: Offload system clear page activity to GPU (2024-08-19 17:49:00 +0200) UAPI Changes: - Fix OA format masks which were breaking build with gcc-5 Cross-subsystem Changes: - drm/ttm: Add a flag to allow drivers to skip clear-on-free Driver Changes: - Use dma_fence_chain_free in chain fence unused as a sync (Matthew Brost) - Refactor hw engine lookup and mmio access to be used in more places (Dominik, Matt Auld, Mika Kuoppala) - Enable priority mem read for Xe2 and later (Pallavi Mishra) - Fix PL1 disable flow in xe_hwmon_power_max_write (Karthik) - Fix refcount and speedup devcoredump (Matthew Brost) - Add performance tuning changes to Xe2 (Akshata, Shekhar) - Fix OA sysfs entry (Ashutosh) - Add first GuC firmware support for BMG (Julia) - Bump minimum GuC firmware for platforms under force_probe to match LNL and BMG (Julia) - Fix access check on user fence creation (Nirmoy) - Add/document workarounds for Xe2 (Julia, Daniele, John, Tejas) - Document workaround and use proper WA infra (Matt Roper) - Fix VF configuration on media GT (Michal Wajdeczko) - Fix VM dma-resv lock (Matthew Brost) - Allow suspend/resume exec queue backend op to be called multiple times (Matthew Brost) - Add GT stats to debugfs (Nirmoy) - Add hwconfig to debugfs (Matt Roper) - Compile out all debugfs code with ONFIG_DEUBG_FS=n (Lucas) - Offload system clear page to GPU (Nirmoy) - Remove dead kunit code (Jani Nikula) - Refactor drvdata storing to help display (Jani Nikula) - Cleanup unsused xe parameter in pte handling (Himal) - Rename s/enable_display/probe_display/ for clarity (Lucas) - Fix missing MCR annotation in couple of registers (Tejas) - Fix DGFX display suspend/resume (Maarten) - Prepare exec
Re: [PATCH] drm/xe: Support 'nomodeset' kernel command-line option
On Wed, Aug 21, 2024 at 03:56:59PM GMT, Thomas Zimmermann wrote: Setting 'nomodeset' on the kernel command line disables all graphics drivers with modesetting capabilities; leaving only firmware drivers, such as simpledrm or efifb. Most DRM drivers automatically support 'nomodeset' via DRM's module helper macros. In xe, which uses regular module_init(), manually call drm_firmware_drivers_only() to test for 'nomodeset'. Do not register the driver if set. I see some drivers like i915 and radeon using an additional 'modeset' parameter... probably to be able to avoid modeset for that specific driver while still allowing for others? Signed-off-by: Thomas Zimmermann --- drivers/gpu/drm/xe/xe_module.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c index 923460119cec..60fb7dd26903 100644 --- a/drivers/gpu/drm/xe/xe_module.c +++ b/drivers/gpu/drm/xe/xe_module.c @@ -8,6 +8,8 @@ #include #include +#include + #include "xe_drv.h" #include "xe_hw_fence.h" #include "xe_pci.h" @@ -92,6 +94,9 @@ static int __init xe_init(void) { int err, i; + if (drm_firmware_drivers_only()) + return -ENODEV; nit: being the first, without an .exit may be equivalent, but probably better for parity with i915 to use a xe_check_modeset() and add it as the first one in the table. We'd need to check for exit != NULL, though. Anyway, Reviewed-by: Lucas De Marchi thanks Lucas De Marchi + for (i = 0; i < ARRAY_SIZE(init_funcs); i++) { err = init_funcs[i].init(); if (err) { -- 2.46.0
Re: [PATCH v2] drm/i915/gt: Use kmemdup_array instead of kmemdup for multiple allocation
On Tue, Aug 20, 2024 at 05:53:02PM GMT, Yu Jiaoliang wrote: Let the kememdup_array() take care about multiplication and possible overflows. v2: - Change subject - Leave one blank line between the commit log and the tag section - Fix code alignment issue Signed-off-by: Yu Jiaoliang Reviewed-by: Jani Nikula Reviewed-by: Andi Shyti --- drivers/gpu/drm/i915/gt/intel_workarounds.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index d90348c56765..0fcfd55c62b4 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -111,9 +111,8 @@ static void wa_init_finish(struct i915_wa_list *wal) { /* Trim unused entries. */ if (!IS_ALIGNED(wal->count, WA_LIST_CHUNK)) { - struct i915_wa *list = kmemdup_array(wal->list, ^ it was already kmemdup_array, not kmemdup. Am I missing anything? Lucas De Marchi - wal->count, sizeof(*list), - GFP_KERNEL); + struct i915_wa *list = kmemdup_array(wal->list, wal->count, + sizeof(*list), GFP_KERNEL); if (list) { kfree(wal->list); -- 2.34.1
Re: Build regressions/improvements in v6.11-rc4
On Mon, Aug 19, 2024 at 09:30:32AM GMT, Geert Uytterhoeven wrote: On Mon, 19 Aug 2024, Geert Uytterhoeven wrote: JFYI, when comparing v6.11-rc4[1] to v6.11-rc3[3], the summaries are: - build errors: +6/-4 + /kisskb/src/arch/sparc/vdso/vdso32/../vclock_gettime.c: error: no previous prototype for '__vdso_clock_gettime' [-Werror=missing-prototypes]: => 254:1 + /kisskb/src/arch/sparc/vdso/vdso32/../vclock_gettime.c: error: no previous prototype for '__vdso_clock_gettime_stick' [-Werror=missing-prototypes]: => 282:1 + /kisskb/src/arch/sparc/vdso/vdso32/../vclock_gettime.c: error: no previous prototype for '__vdso_gettimeofday' [-Werror=missing-prototypes]: => 307:1 + /kisskb/src/arch/sparc/vdso/vdso32/../vclock_gettime.c: error: no previous prototype for '__vdso_gettimeofday_stick' [-Werror=missing-prototypes]: => 343:1 sparc64-gcc13/sparc64-allmodconfig (pre-existing, but now emitted twice in this config only?) + /kisskb/src/include/linux/compiler_types.h: error: call to '__compiletime_assert_952' declared with attribute error: FIELD_GET: mask is not constant: => 510:38 + /kisskb/src/include/linux/compiler_types.h: error: call to '__compiletime_assert_953' declared with attribute error: FIELD_GET: mask is not constant: => 510:38 powerpc-gcc5/powerpc-all{mod,yes}config In function 'decode_oa_format.isra.26', inlined from 'xe_oa_set_prop_oa_format' at drivers/gpu/drm/xe/xe_oa.c:1664:6: drivers/gpu/drm/xe/xe_oa.c:1573:18: note: in expansion of macro 'FIELD_GET' u32 bc_report = FIELD_GET(DRM_XE_OA_FORMAT_MASK_BC_REPORT, fmt); ^ Seen before, patch available. +Rodrigo This patch needs to propagate to drm-xe-fixes: f2881dfdaaa9 ("drm/xe/oa/uapi: Make bit masks unsigned") Lucas De Marchi Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. - Linus Torvalds
Re: [PATCH 2/3] drm/fourcc: define Intel Xe2 related tile4 ccs modifiers
On Fri, Aug 16, 2024 at 02:52:28PM GMT, Juha-Pekka Heikkila wrote: Add Tile4 type ccs modifiers to indicate presence of compression on Xe2. Here is defined I915_FORMAT_MOD_4_TILED_LNL_CCS which is meant for integrated graphics with igpu related limitations Here is also defined I915_FORMAT_MOD_4_TILED_BMG_CCS which is meant for discrete graphics with dgpu related limitations Signed-off-by: Juha-Pekka Heikkila not very fond of adding the platform names, but looks like this was always the approach, so this keeps the consistency. Reviewed-by: Lucas De Marchi We will need an ack from drm-misc maintainer to merge this through drm-intel. Let's add some Cc. Lucas De Marchi --- include/uapi/drm/drm_fourcc.h | 25 + 1 file changed, 25 insertions(+) diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h index 2d84a8052b15..78abd819fd62 100644 --- a/include/uapi/drm/drm_fourcc.h +++ b/include/uapi/drm/drm_fourcc.h @@ -702,6 +702,31 @@ extern "C" { */ #define I915_FORMAT_MOD_4_TILED_MTL_RC_CCS_CC fourcc_mod_code(INTEL, 15) +/* + * Intel Color Control Surfaces (CCS) for graphics ver. 20 unified compression + * on integrated graphics + * + * The main surface is Tile 4 and at plane index 0. For semi-planar formats + * like NV12, the Y and UV planes are Tile 4 and are located at plane indices + * 0 and 1, respectively. The CCS for all planes are stored outside of the + * GEM object in a reserved memory area dedicated for the storage of the + * CCS data for all compressible GEM objects. + */ +#define I915_FORMAT_MOD_4_TILED_LNL_CCS fourcc_mod_code(INTEL, 16) + +/* + * Intel Color Control Surfaces (CCS) for graphics ver. 20 unified compression + * on discrete graphics + * + * The main surface is Tile 4 and at plane index 0. For semi-planar formats + * like NV12, the Y and UV planes are Tile 4 and are located at plane indices + * 0 and 1, respectively. The CCS for all planes are stored outside of the + * GEM object in a reserved memory area dedicated for the storage of the + * CCS data for all compressible GEM objects. The GEM object must be stored in + * contiguous memory with a size aligned to 64KB + */ +#define I915_FORMAT_MOD_4_TILED_BMG_CCS fourcc_mod_code(INTEL, 17) + /* * Tiled, NV12MT, grouped in 64 (pixels) x 32 (lines) -sized macroblocks * -- 2.45.2
Re: [PATCH v2] drm/xe/uapi: Bring back reset uevent
On Fri, Aug 16, 2024 at 09:59:31AM GMT, Aravind Iddamsetty wrote: On 14/08/24 19:54, Lucas De Marchi wrote: On Wed, Aug 14, 2024 at 12:16:41PM GMT, Aravind Iddamsetty wrote: i see that this is even called from xe_guc_exec_queue_lr_cleanup which is for long running queue from various call paths need to check in what scenarios those happen. Should we add a reason for long running queue? Both of your questions would probably be answered if this was getting developed at the same time of the user space usage of these uevents. Can't we consider the generic Linux user as a consumer, via udevadm. No, udevadm just confirms that there is an event being sent. Main thing to understand here is "what is this event useful for? what can be done as outcome of receiving this event?". From your other reply it seems this is about "device is wedged/toast, and driver can't recover without userspace intervention since userspace may have different policies" What would be some examples of actions for userspace to take? - Unbind driver, kill clients, rebind? - FLR? - Something else? The expectation from userspace on receiving this event is to do a reset most probably SBR(unbind->sbr->rebind). The other requirement of this event is for L0 Sysman https://spec.oneapi.io/level-zero/latest/sysman/api.html#_CPPv4N21zes_event_type_flag_t41ZES_EVENT_TYPE_FLAG_DEVICE_RESET_REQUIREDE https://spec.oneapi.io/level-zero/latest/sysman/api.html#_CPPv4N18zes_device_state_t5resetE So we expect the admin do the above actions so these things should be in the commit message. where's the pending implementation of this spec? I imagine somewhere we'd see something on top of libudev generating these L0 events. The events themselves are not much more than confirming via udevadm, but then at least we know the implementation covers the expectations from the the spec without having to come back later fixing up stuff. Is this currently implemented somewhere? Do you mean by some other driver or subsystem? Also, is it possible to make is a generic drm event handling so distros don't have to setup udev rules for each vendor? I think doing via drm event mandates the observability applications(L0) to have open connection to the device to process these which is not desired. I'm not talking about keeping the device open. Here you are defining xe-specific uevents. Could this be drm-specific so someone implementing that specification you mentioned doesn't have to implement N different uevents for N different vendors? And a distro defining a policy via udev rules doesn't have to do the same thing slightly differently N times? Lucas De Marchi Thanks, Aravind. Lucas De Marchi Thanks, Aravind. Raag
Re: linux-next: manual merge of the drm-xe tree with the drm-intel tree
On Thu, Aug 15, 2024 at 11:37:17AM GMT, Stephen Rothwell wrote: Hi all, Today's linux-next merge of the drm-xe tree got a conflict in: drivers/gpu/drm/xe/display/xe_display.c between commit: 769b081c18b9 ("drm/i915/opregion: convert to struct intel_display") from the drm-intel tree and commit: 1eda95cba9df ("drm/xe: Rename enable_display module param") from the drm-xe tree. I fixed it up (see below) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. this matches our current merge and will be resolved when we backmerge drm-next, before sending our next pull. thanks Lucas De Marchi -- Cheers, Stephen Rothwell diff --cc drivers/gpu/drm/xe/display/xe_display.c index 0e4adde84cb2,56a940b39412.. --- a/drivers/gpu/drm/xe/display/xe_display.c +++ b/drivers/gpu/drm/xe/display/xe_display.c @@@ -127,9 -126,8 +127,9 @@@ int xe_display_init_nommio(struct xe_de static void xe_display_fini_noirq(void *arg) { struct xe_device *xe = arg; + struct intel_display *display = &xe->display; - if (!xe->info.enable_display) + if (!xe->info.probe_display) return; intel_display_driver_remove_noirq(xe); @@@ -138,10 -135,9 +138,10 @@@ int xe_display_init_noirq(struct xe_device *xe) { + struct intel_display *display = &xe->display; int err; - if (!xe->info.enable_display) + if (!xe->info.probe_display) return 0; intel_display_driver_early_probe(xe); @@@ -252,9 -246,7 +252,9 @@@ void xe_display_irq_handler(struct xe_d void xe_display_irq_enable(struct xe_device *xe, u32 gu_misc_iir) { + struct intel_display *display = &xe->display; + - if (!xe->info.enable_display) + if (!xe->info.probe_display) return; if (gu_misc_iir & GU_MISC_GSE) @@@ -289,9 -296,8 +289,9 @@@ static bool suspend_to_idle(void void xe_display_pm_suspend(struct xe_device *xe, bool runtime) { + struct intel_display *display = &xe->display; bool s2idle = suspend_to_idle(); - if (!xe->info.enable_display) + if (!xe->info.probe_display) return; /* @@@ -341,9 -347,7 +341,9 @@@ void xe_display_pm_resume_early(struct void xe_display_pm_resume(struct xe_device *xe, bool runtime) { + struct intel_display *display = &xe->display; + - if (!xe->info.enable_display) + if (!xe->info.probe_display) return; intel_dmc_resume(xe);
Re: [PATCH v2] drm/xe/uapi: Bring back reset uevent
On Wed, Aug 14, 2024 at 12:16:41PM GMT, Aravind Iddamsetty wrote: i see that this is even called from xe_guc_exec_queue_lr_cleanup which is for long running queue from various call paths need to check in what scenarios those happen. Should we add a reason for long running queue? Both of your questions would probably be answered if this was getting developed at the same time of the user space usage of these uevents. Can't we consider the generic Linux user as a consumer, via udevadm. No, udevadm just confirms that there is an event being sent. Main thing to understand here is "what is this event useful for? what can be done as outcome of receiving this event?". From your other reply it seems this is about "device is wedged/toast, and driver can't recover without userspace intervention since userspace may have different policies" What would be some examples of actions for userspace to take? - Unbind driver, kill clients, rebind? - FLR? - Something else? Is this currently implemented somewhere? Also, is it possible to make is a generic drm event handling so distros don't have to setup udev rules for each vendor? Lucas De Marchi Thanks, Aravind. Raag
Re: [PATCH v2] drm/xe/uapi: Bring back reset uevent
On Mon, Aug 12, 2024 at 01:18:12PM GMT, Raag Jadav wrote: From: Himal Prasad Ghimiray This was dropped in commit 77a0d4d1cea2 ("drm/xe/uapi: Remove reset uevent for now") as part of refactoring. Now that we have better uapi semantics and naming for the uevent, bring it back. With this in place, userspace will be notified of wedged device along with its reason. Title of this patch, this paragraph and what's being done in the implemenation don't match since reset != wedged $ udevadm monitor --property --kernel monitor will print the received events for: KERNEL - the kernel uevent KERNEL[871.188570] change /devices/pci:00/:00:01.0/:01:00.0/:02:01.0/:03:00.0 (pci) ACTION=change DEVPATH=/devices/pci:00/:00:01.0/:01:00.0/:02:01.0/:03:00.0 SUBSYSTEM=pci DEVICE_STATUS=NEEDS_RESET REASON=GT_RESET_FAILED TILE_ID=0 GT_ID=0 DRIVER=xe PCI_CLASS=3 PCI_ID=8086:56B1 PCI_SUBSYS_ID=8086:1210 PCI_SLOT_NAME=:03:00.0 MODALIAS=pci:v8086d56B1sv8086sd1210bc03sc00i00 SEQNUM=6104 v2: Change authorship to Himal (Aravind) Add uevent for all device wedged cases (Aravind) Signed-off-by: Himal Prasad Ghimiray Signed-off-by: Raag Jadav --- drivers/gpu/drm/xe/xe_device.c | 10 +- drivers/gpu/drm/xe/xe_device.h | 2 +- drivers/gpu/drm/xe/xe_gt.c | 23 +++ drivers/gpu/drm/xe/xe_guc.c| 13 - drivers/gpu/drm/xe/xe_guc_submit.c | 13 - include/uapi/drm/xe_drm.h | 29 + 6 files changed, 82 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c index 1aba6f9eaa19..d975bdce4a7d 100644 --- a/drivers/gpu/drm/xe/xe_device.c +++ b/drivers/gpu/drm/xe/xe_device.c @@ -955,6 +955,7 @@ static void xe_device_wedged_fini(struct drm_device *drm, void *arg) /** * xe_device_declare_wedged - Declare device wedged * @xe: xe device instance + * @event_params: parameters to be sent along with uevent * * This is a final state that can only be cleared with a mudule * re-probe (unbind + bind). @@ -965,8 +966,10 @@ static void xe_device_wedged_fini(struct drm_device *drm, void *arg) * on every single execution timeout (a.k.a. GPU hang) right after devcoredump * snapshot capture. In this mode, GT reset won't be attempted so the state of * the issue is preserved for further debugging. + * Caller is expected to pass respective parameters to be sent along with + * uevent. Pass NULL in case of no params. */ -void xe_device_declare_wedged(struct xe_device *xe) +void xe_device_declare_wedged(struct xe_device *xe, char **event_params) { struct xe_gt *gt; u8 id; @@ -984,12 +987,17 @@ void xe_device_declare_wedged(struct xe_device *xe) xe_pm_runtime_get_noresume(xe); if (!atomic_xchg(&xe->wedged.flag, 1)) { + struct pci_dev *pdev = to_pci_dev(xe->drm.dev); + Given the names of the uevents, I'm not sure if the intention is to notify when a gt reset happens or only when it fails (and wedged.mode != 0) Lucas De Marchi xe->needs_flr_on_fini = true; drm_err(&xe->drm, "CRITICAL: Xe has declared device %s as wedged.\n" "IOCTLs and executions are blocked. Only a rebind may clear the failure\n" "Please file a _new_ bug report at https://gitlab.freedesktop.org/drm/xe/kernel/issues/new\n";, dev_name(xe->drm.dev)); + + /* Notify userspace about reset required */ + kobject_uevent_env(&pdev->dev.kobj, KOBJ_CHANGE, event_params); } for_each_gt(gt, xe, id) diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h index db6cc8d0d6b8..5d40fc6f0904 100644 --- a/drivers/gpu/drm/xe/xe_device.h +++ b/drivers/gpu/drm/xe/xe_device.h @@ -174,7 +174,7 @@ static inline bool xe_device_wedged(struct xe_device *xe) return atomic_read(&xe->wedged.flag); } -void xe_device_declare_wedged(struct xe_device *xe); +void xe_device_declare_wedged(struct xe_device *xe, char **reset_event); struct xe_file *xe_file_get(struct xe_file *xef); void xe_file_put(struct xe_file *xef); diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c index 58895ed22f6e..519f3c2cf9e2 100644 --- a/drivers/gpu/drm/xe/xe_gt.c +++ b/drivers/gpu/drm/xe/xe_gt.c @@ -741,6 +741,24 @@ static int do_gt_restart(struct xe_gt *gt) return 0; } +static void xe_gt_reset_failed(struct xe_gt *gt, int err) +{ + char *event_params[5]; + + xe_gt_err(gt, "reset failed (%pe)\n", ERR_PTR(err)); + + event_params[0] = DRM_XE_RESET_REQUIRED_UEVENT; + event_params[1] = DRM_XE_RESET_REQUIRED_UEVENT_REASON_GT; + event_params[2] = kasprintf(GFP_KERNEL, "TILE_ID=%d", gt_to_ti
Re: [PATCH v1] drm/xe/uapi: Bring back reset uevent
On Tue, Aug 06, 2024 at 10:02:31AM GMT, Raag Jadav wrote: From: Lucas De Marchi Bring back uevent for gt reset failure with better uapi naming. With this in place we can receive failure event using udev. $ udevadm monitor --property --kernel monitor will print the received events for: KERNEL - the kernel uevent KERNEL[871.188570] change /devices/pci:00/:00:01.0/:01:00.0/:02:01.0/:03:00.0 (pci) ACTION=change DEVPATH=/devices/pci:00/:00:01.0/:01:00.0/:02:01.0/:03:00.0 SUBSYSTEM=pci DEVICE_STATUS=NEEDS_RESET REASON=GT_RESET_FAILED TILE_ID=0 GT_ID=0 DRIVER=xe PCI_CLASS=3 PCI_ID=8086:56B1 PCI_SUBSYS_ID=8086:1210 PCI_SLOT_NAME=:03:00.0 MODALIAS=pci:v8086d56B1sv8086sd1210bc03sc00i00 SEQNUM=6104 Signed-off-by: Lucas De Marchi please drop my s-o-b here and don't add me as the author of this patch, which I certainly am not. You need to point to the commit where it was reverted and *why* it's ok to have this uapi now. Lucas De Marchi Signed-off-by: Raag Jadav --- drivers/gpu/drm/xe/xe_gt.c | 27 +-- include/uapi/drm/xe_drm.h | 17 + 2 files changed, 42 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c index b04e47186f5b..5ceef0059861 100644 --- a/drivers/gpu/drm/xe/xe_gt.c +++ b/drivers/gpu/drm/xe/xe_gt.c @@ -740,6 +740,30 @@ static int do_gt_restart(struct xe_gt *gt) return 0; } +static void xe_uevent_gt_reset_failure(struct pci_dev *pdev, u8 tile_id, u8 gt_id) +{ + char *reset_event[5]; + + reset_event[0] = DRM_XE_RESET_REQUIRED_UEVENT; + reset_event[1] = DRM_XE_RESET_REQUIRED_UEVENT_REASON_GT; + reset_event[2] = kasprintf(GFP_KERNEL, "TILE_ID=%d", tile_id); + reset_event[3] = kasprintf(GFP_KERNEL, "GT_ID=%d", gt_id); + reset_event[4] = NULL; + kobject_uevent_env(&pdev->dev.kobj, KOBJ_CHANGE, reset_event); + + kfree(reset_event[2]); + kfree(reset_event[3]); +} + +static void gt_reset_failed(struct xe_gt *gt, int err) +{ + xe_gt_err(gt, "reset failed (%pe)\n", ERR_PTR(err)); + + /* Notify userspace about gt reset failure */ + xe_uevent_gt_reset_failure(to_pci_dev(gt_to_xe(gt)->drm.dev), + gt_to_tile(gt)->id, gt->info.id); +} + static int gt_reset(struct xe_gt *gt) { int err; @@ -795,8 +819,7 @@ static int gt_reset(struct xe_gt *gt) XE_WARN_ON(xe_uc_start(>->uc)); xe_pm_runtime_put(gt_to_xe(gt)); err_fail: - xe_gt_err(gt, "reset failed (%pe)\n", ERR_PTR(err)); - + gt_reset_failed(gt, err); xe_device_declare_wedged(gt_to_xe(gt)); return err; diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h index 19619d4952a8..9ea3be97535e 100644 --- a/include/uapi/drm/xe_drm.h +++ b/include/uapi/drm/xe_drm.h @@ -20,6 +20,7 @@ extern "C" { * 2. Extension definition and helper structs * 3. IOCTL's Query structs in the order of the Query's entries. * 4. The rest of IOCTL structs in the order of IOCTL declaration. + * 5. uEvents */ /** @@ -1686,6 +1687,22 @@ struct drm_xe_oa_stream_info { __u64 reserved[3]; }; +/** + * DOC: uevent generated by xe on it's pci node. + * + * DRM_XE_RESET_REQUIRED_UEVENT - Event is generated when device needs reset. + * The REASON is provided along with the event for which reset is required. + * On the basis of REASONS, additional information might be supplied. + */ +#define DRM_XE_RESET_REQUIRED_UEVENT "DEVICE_STATUS=NEEDS_RESET" + +/** + * DRM_XE_RESET_REQUIRED_UEVENT_REASON_GT - Reason provided to DRM_XE_RESET_REQUIRED_UEVENT + * incase of gt reset failure. The additional information supplied is tile id and + * gt id of the gt unit for which reset has failed. + */ +#define DRM_XE_RESET_REQUIRED_UEVENT_REASON_GT "REASON=GT_RESET_FAILED" + #if defined(__cplusplus) } #endif -- 2.34.1
Re: [PATCH 04/12] drm/xe: xe_gen_wa_oob: fix program_invocation_short_name for macos
On Wed, Aug 07, 2024 at 08:13:51AM GMT, Daniel Gomez wrote: On Tue, Aug 06, 2024 at 08:50:09PM GMT, Lucas De Marchi wrote: On Wed, Aug 07, 2024 at 01:09:18AM GMT, Daniel Gomez via B4 Relay wrote: > From: Daniel Gomez > > Use getprogname() [1] instead of program_invocation_short_name() [2] > for macOS hosts. > > [1]: > https://www.gnu.org/software/gnulib/manual/html_node/ > program_005finvocation_005fshort_005fname.html > > [2]: > https://developer.apple.com/library/archive/documentation/System/ > Conceptual/ManPages_iPhoneOS/man3/getprogname.3.html > > Fixes build error for macOS hosts: > > drivers/gpu/drm/xe/xe_gen_wa_oob.c:34:3: error: use of > undeclared identifier 'program_invocation_short_name'34 | > program_invocation_short_name); | ^ 1 error > generated. > > Signed-off-by: Daniel Gomez > --- > drivers/gpu/drm/xe/xe_gen_wa_oob.c | 8 +++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/xe/xe_gen_wa_oob.c b/drivers/gpu/drm/xe/xe_gen_wa_oob.c > index 904cf47925aa..079b8870c461 100644 > --- a/drivers/gpu/drm/xe/xe_gen_wa_oob.c > +++ b/drivers/gpu/drm/xe/xe_gen_wa_oob.c > @@ -9,6 +9,12 @@ > #include > #include > #include > +#define PROG_INV_NAME program_invocation_short_name > + > +#ifdef __APPLE__ > +#include > +#define PROG_INV_NAME getprogname() > +#endif > > #define HEADER \ >"// SPDX-License-Identifier: MIT\n" \ > @@ -31,7 +37,7 @@ > static void print_usage(FILE *f) > { >fprintf(f, "usage: %s \n", > - program_invocation_short_name); > + PROG_INV_NAME); instead of doing that, can we a) include stdlib.h unconditionally and b) add here a `static const char *program_invocation_short_name = getprogname()` so we don't need to change the common case and just handle the "build on macos" as a compat layer? Does this align with your suggestion (v1 diff)? yes, just a nit that in xe we keep the includes sorted alphabetically, so the stdlib.h include should move up one line. Other than that, // Reviewed-by: Lucas De Marchi ... assuming the rest of the series and idea about building the kernel on macOS is not pushed back. Lucas De Marchi Note that static cannot be use here. diff --git a/drivers/gpu/drm/xe/xe_gen_wa_oob.c b/drivers/gpu/drm/xe/xe_gen_wa_oob.c index 079b8870c461..b3add20ccb01 100644 --- a/drivers/gpu/drm/xe/xe_gen_wa_oob.c +++ b/drivers/gpu/drm/xe/xe_gen_wa_oob.c @@ -9,12 +9,7 @@ #include #include #include -#define PROG_INV_NAME program_invocation_short_name - -#ifdef __APPLE__ #include -#define PROG_INV_NAME getprogname() -#endif #define HEADER \ "// SPDX-License-Identifier: MIT\n" \ @@ -36,8 +31,11 @@ static void print_usage(FILE *f) { +#ifdef __APPLE__ + const char *program_invocation_short_name = getprogname(); +#endif fprintf(f, "usage: %s \n", - PROG_INV_NAME); + program_invocation_short_name); } static void print_parse_error(const char *err_msg, const char *line, Lucas De Marchi > } > > static void print_parse_error(const char *err_msg, const char *line, > > -- > Git-146) > >
Re: [PATCH 04/12] drm/xe: xe_gen_wa_oob: fix program_invocation_short_name for macos
On Wed, Aug 07, 2024 at 01:09:18AM GMT, Daniel Gomez via B4 Relay wrote: From: Daniel Gomez Use getprogname() [1] instead of program_invocation_short_name() [2] for macOS hosts. [1]: https://www.gnu.org/software/gnulib/manual/html_node/ program_005finvocation_005fshort_005fname.html [2]: https://developer.apple.com/library/archive/documentation/System/ Conceptual/ManPages_iPhoneOS/man3/getprogname.3.html Fixes build error for macOS hosts: drivers/gpu/drm/xe/xe_gen_wa_oob.c:34:3: error: use of undeclared identifier 'program_invocation_short_name'34 | program_invocation_short_name); | ^ 1 error generated. Signed-off-by: Daniel Gomez --- drivers/gpu/drm/xe/xe_gen_wa_oob.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_gen_wa_oob.c b/drivers/gpu/drm/xe/xe_gen_wa_oob.c index 904cf47925aa..079b8870c461 100644 --- a/drivers/gpu/drm/xe/xe_gen_wa_oob.c +++ b/drivers/gpu/drm/xe/xe_gen_wa_oob.c @@ -9,6 +9,12 @@ #include #include #include +#define PROG_INV_NAME program_invocation_short_name + +#ifdef __APPLE__ +#include +#define PROG_INV_NAME getprogname() +#endif #define HEADER \ "// SPDX-License-Identifier: MIT\n" \ @@ -31,7 +37,7 @@ static void print_usage(FILE *f) { fprintf(f, "usage: %s \n", - program_invocation_short_name); + PROG_INV_NAME); instead of doing that, can we a) include stdlib.h unconditionally and b) add here a `static const char *program_invocation_short_name = getprogname()` so we don't need to change the common case and just handle the "build on macos" as a compat layer? Lucas De Marchi } static void print_parse_error(const char *err_msg, const char *line, -- Git-146)
[PULL] drm-xe-next
Hi Dave and Sima, Early drm-xe-next pull request for 6.12. Main reason for being much earlier than usual is to have the additional SIMD16 EU reported as it's a needed UAPI for Lunar Lake and Battlemage. It's sitting in drm-xe-next for a few weeks and userspace already testing with it. There's also a minor uapi change with change of error return code. The other 2 already propagated via -fixes. Other changes bring general improvements and cleanups to the driver, further support for SR-IOV as well as head Lunar Lake and Battlemage to the finish line of being officially exposed by the driver. Some bits still influx, so not yet there though. thanks Lucas De Marchi drm-xe-next-2024-07-30: drm-xe-next for 6.12 UAPI Changes: - Rename xe perf layer as xe observation layer, but was also made available via fixes to previous verison (Ashutosh) - Use write-back caching mode for system memory on DGFX, but was also mad available via fixes to previous version (Thomas) - Expose SIMD16 EU mask in topology query for userspace to know the type of EU, as available in PVC, Lunar Lake and Battlemage (Lucas) - Return ENOBUFS instead of ENOMEM in vm_bind if failure is tied to an array of binds (Matthew Brost) Driver Changes: - Log cleanup moving messages to debug priority (Michal Wajdeczko) - Add timeout to fences to adhere to dma_buf rules (Matthew Brost) - Rename old engine nomenclature to exec_queue (Matthew Brost) - Convert multiple bind ops to 1 job (Matthew Brost) - Add error injection for vm bind to help testing error path (Matthew Brost) - Fix error handling in page table to propagate correctly to userspace (Matthew Brost) - Re-organize and cleanup SR-IOV related registers (Michal Wajdeczko) - Make the device write barrier compatible with VF (Michal Wajdeczko) - New display workarounds for Battlemage (Matthew Auld) - New media workarounds for Lunar Lake and Battlemage (Ngai-Mint Kwan) - New graphics workarounds for Lunar Lake (Bommu Krishnaiah) - Tracepoint updates (Matthew Brost, Nirmoy Das) - Cleanup the header generation for OOB workarounds (Lucas De Marchi) - Fix leaking HDCP-related object (Nirmoy Das) - Serialize L2 flushes to avoid races (Tejas Upadhyay) - Log pid and comm on job timeout (José Roberto de Souza) - Simplify boilerplate code for live kunit (Michal Wajdeczko) - Improve kunit skips for live kunit (Michal Wajdeczko) - Fix xe_sync cleanup when handling xe_exec ioctl (Ashutosh Dixit) - Limit fair VF LMEM provisioning (Michal Wajdeczko) - New workaround to fence mmio writes in Lunar Lake (Tejas Upadhyay) - Warn on writes inaccessible register in VF (Michal Wajdeczko) - Fix register lookup in VF (Michal Wajdeczko) - Add GSC support for Battlemage (Alexander Usyskin) - Fix wedging only the GT in which timeout occurred (Matthew Brost) - Block device suspend when wedging (Matthew Brost) - Handle compression and migration changes for Battlemage (Akshata Jahagirdar) - Limit access of stolen memory for Lunar Lake (Uma Shankar) - Fail invalid addresses during user fence creation (Matthew Brost) - Refcount xe_file to safely and accurately store fdinfo stats (Umesh Nerlige Ramappa) - Cleanup and fix PM reference for TLB invalidation code (Matthew Brost) - Fix PM reference handling when communicating with GuC (Matthew Brost) - Add new BO flag for 2 MiB alignement and use in VF (Michal Wajdeczko) - Simplify MMIO setup for multi-tile platforms (Lucas De Marchi) - Add check for uninitialized access to OOB workarounds (Lucas De Marchi) - New GSC and HuC firmware blobs for Lunar Lake and Battlemage (Daniele Ceraolo Spurio) - Unify mmio wait logic (Gustavo Sousa) - Fix off-by-one when processing RTP rules (Lucas De Marchi) - Future-proof migrate logic with compressed PAT flag (Matt Roper) - Add WA kunit tests for Battlemage (Lucas De Marchi) - Test active tracking for workaorunds with kunit (Lucas De Marchi) - Add kunit tests for RTP with no actions (Lucas De Marchi) - Unify parse of OR rules in RTP (Lucas De Marchi) - Add performance tuning for Battlemage (Sai Teja Pottumuttu) - Make bit masks unsigned (Geert Uytterhoeven) The following changes since commit aaa08078e7251131f045ba248a68671db7f7bdf7: drm/xe/bmg: Apply Wa_22019338487 (2024-07-02 12:14:00 -0400) are available in the Git repository at: https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-next-2024-07-30 for you to fetch changes up to f2881dfdaaa9ec873dbd383ef5512fc31e576cbb: drm/xe/oa/uapi: Make bit masks unsigned (2024-07-30 13:45:38 -0700) drm-xe-next for 6.12 UAPI Changes: - Rename xe perf layer as xe observation layer, but was also made available via fixes to previous verison (Ashutosh) - Use write-back caching mode for system memory on DGFX, but was also mad available via fixes to previous version (Thomas) - Expose SIMD16 EU mask in topology query for userspace to know the type of EU, as available in PVC, Lunar Lake
Re: [PATCH] drm/xe/oa/uapi: Make bit masks unsigned
On Mon, Jul 29, 2024 at 02:54:58PM GMT, Ashutosh Dixit wrote: On Mon, 29 Jul 2024 06:21:20 -0700, Lucas De Marchi wrote: Hi Lucas, Reviewed-by: Lucas De Marchi That fixes the build, but question to Ashutosh: it's odd to tie the format to a bspec. What happens on next platform if the HW changes? Hopefully it doesn't change in an incompatible way, but looking at this code it seems we could still keep the uapi by untying the HW from the uapi in the documentation. IMO, in this case, it is not possible to decouple the formats from Bspec because that is where they are specified (in Bspec 52198/60942). In i915 the OA formats were specified by an enum (enum drm_i915_oa_format), but I would argue that enum is meaningful only when we refer back to Bspec. Also the i915 enum had to constantly updated when HW added new formats. For Xe, we changed the way the formats are specified in a way which we believe will make the uapi more robust and uapi header update much less frequent (hopefully we will never have to update the header and if at all we have to, we should be able to do it in a backwards compatible way since we have sufficient number of free bits). HW has followed this scheme for specifying the formats for years and only recently for Xe2 has added a couple of bits and introduced new PEC formats which I think it is not going to change now for some time. But as I said the formats have to refer back to Bspec since that is where there are specified and there are too many of them. Any description or enum is ambiguous unless it refers back to Bspec. So I don't see how not to refer to Bspec in the documentation. If anyone has any ideas about not referring to Bspec I am willing to consider it but I think the best way forward is to leave the documentation as is: /* * OA_FORMAT's are specified the same way as in PRM/Bspec 52198/60942, * in terms of the following quantities: a. enum @drm_xe_oa_format_type * b. Counter select c. Counter size and d. BC report. Also refer to the * oa_formats array in drivers/gpu/drm/xe/xe_oa.c. */ #define DRM_XE_OA_FORMAT_MASK_FMT_TYPE (0xff << 0) #define DRM_XE_OA_FORMAT_MASK_COUNTER_SEL (0xff << 8) #define DRM_XE_OA_FORMAT_MASK_COUNTER_SIZE (0xff << 16) #define DRM_XE_OA_FORMAT_MASK_BC_REPORT (0xff << 24) I was under impression that these shifts were something coming from the HW definition and we were exposing that raw to userspace, rather than e.g. struct drm_xe_oa_format { __u8 fmt_type; __u8 counter_sel; __u8 counter_size; __u8 bc_report; __u32 rsvd; }; now I see the shifts are not tied to HW and it was only the chosen format rather than declaring a struct. Applied this patch to drm-xe-next. Thanks Lucas De Marchi Thanks. -- Ashutosh
Re: [PATCH] drm/xe/oa/uapi: Make bit masks unsigned
On Mon, Jul 29, 2024 at 11:26:34AM GMT, Geert Uytterhoeven wrote: When building with gcc-5: In function ‘decode_oa_format.isra.26’, inlined from ‘xe_oa_set_prop_oa_format’ at drivers/gpu/drm/xe/xe_oa.c:1664:6: ././include/linux/compiler_types.h:510:38: error: call to ‘__compiletime_assert_1336’ declared with attribute error: FIELD_GET: mask is not constant [...] ./include/linux/bitfield.h:155:3: note: in expansion of macro ‘__BF_FIELD_CHECK’ __BF_FIELD_CHECK(_mask, _reg, 0U, "FIELD_GET: "); \ ^ drivers/gpu/drm/xe/xe_oa.c:1573:18: note: in expansion of macro ‘FIELD_GET’ u32 bc_report = FIELD_GET(DRM_XE_OA_FORMAT_MASK_BC_REPORT, fmt); ^ Fixes: b6fd51c6211910b1 ("drm/xe/oa/uapi: Define and parse OA stream properties") Signed-off-by: Geert Uytterhoeven Reviewed-by: Lucas De Marchi That fixes the build, but question to Ashutosh: it's odd to tie the format to a bspec. What happens on next platform if the HW changes? Hopefully it doesn't change in an incompatible way, but looking at this code it seems we could still keep the uapi by untying the HW from the uapi in the documentation. Lucas De Marchi --- Compile-tested only. --- include/uapi/drm/xe_drm.h | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h index 19619d4952a863f7..db232a25189eba9f 100644 --- a/include/uapi/drm/xe_drm.h +++ b/include/uapi/drm/xe_drm.h @@ -1590,10 +1590,10 @@ enum drm_xe_oa_property_id { * b. Counter select c. Counter size and d. BC report. Also refer to the * oa_formats array in drivers/gpu/drm/xe/xe_oa.c. */ -#define DRM_XE_OA_FORMAT_MASK_FMT_TYPE (0xff << 0) -#define DRM_XE_OA_FORMAT_MASK_COUNTER_SEL (0xff << 8) -#define DRM_XE_OA_FORMAT_MASK_COUNTER_SIZE (0xff << 16) -#define DRM_XE_OA_FORMAT_MASK_BC_REPORT(0xff << 24) +#define DRM_XE_OA_FORMAT_MASK_FMT_TYPE (0xffu << 0) +#define DRM_XE_OA_FORMAT_MASK_COUNTER_SEL (0xffu << 8) +#define DRM_XE_OA_FORMAT_MASK_COUNTER_SIZE (0xffu << 16) +#define DRM_XE_OA_FORMAT_MASK_BC_REPORT(0xffu << 24) /** * @DRM_XE_OA_PROPERTY_OA_PERIOD_EXPONENT: Requests periodic OA unit -- 2.34.1
Re: [PATCH 6/7] drm/i915/pmu: Lazy unregister
On Wed, Jul 24, 2024 at 02:41:05PM GMT, Peter Zijlstra wrote: On Tue, Jul 23, 2024 at 10:30:08AM -0500, Lucas De Marchi wrote: On Tue, Jul 23, 2024 at 09:03:25AM GMT, Tvrtko Ursulin wrote: > > On 22/07/2024 22:06, Lucas De Marchi wrote: > > Instead of calling perf_pmu_unregister() when unbinding, defer that to > > the destruction of i915 object. Since perf itself holds a reference in > > the event, this only happens when all events are gone, which guarantees > > i915 is not unregistering the pmu with live events. > > > > Previously, running the following sequence would crash the system after > > ~2 tries: > > > > 1) bind device to i915 > > 2) wait events to show up on sysfs > > 3) start perf stat -I 1000 -e i915/rcs0-busy/ > > 4) unbind driver > > 5) kill perf > > > > Most of the time this crashes in perf_pmu_disable() while accessing the > > percpu pmu_disable_count. This happens because perf_pmu_unregister() > > destroys it with free_percpu(pmu->pmu_disable_count). > > > > With a lazy unbind, the pmu is only unregistered after (5) as opposed to > > after (4). The downside is that if a new bind operation is attempted for > > the same device/driver without killing the perf process, i915 will fail > > to register the pmu (but still load successfully). This seems better > > than completely crashing the system. > > So effectively allows unbind to succeed without fully unbinding the > driver from the device? That sounds like a significant drawback and if > so, I wonder if a more complicated solution wouldn't be better after > all. Or is there precedence for allowing userspace keeping their paws on > unbound devices in this way? keeping the resources alive but "unplunged" while the hardware disappeared is a common thing to do... it's the whole point of the drmm-managed resource for example. If you bind the driver and then unbind it while userspace is holding a ref, next time you try to bind it will come up with a different card number. A similar thing that could be done is to adjust the name of the event - currently we add the mangled pci slot. That said, I agree a better approach would be to allow perf_pmu_unregister() to do its job even when there are open events. On top of that (or as a way to help achieve that), make perf core replace the callbacks with stubs when pmu is unregistered - that would even kill the need for i915's checks on pmu->closed (and fix the lack thereof in other drivers). It can be a can of worms though and may be pushed back by perf core maintainers, so it'd be good have their feedback. I don't think I understand the problem. I also don't understand drivers much -- so that might be the problem. We can bind/unbind the driver to/from the pci device. Example: echo -n ":00:02.0" > /sys/bus/pci/drivers/i915/unbind This is essentially unplugging the HW from the kernel. This will trigger the driver to deinitialize and free up all resources use by that device. So when the driver is binding it does: perf_pmu_register(); When it's unbinding: perf_pmu_unregister(); Reasons to unbind include: - driver testing so then we can unload the module and load it again - device is toast - doing an FLR and rebinding may fix/workaround it - For SR-IOV, in which we are exposing multiple instances of the same underlying PCI device, we may need to bind/unbind on-demand (it's not yet clear if perf_pmu_register() would be called on the VF instances, but listed here just to explain the bind/unbind) - Hotplug So the problem appears to be that the device just disappears without warning? How can a GPU go away like that? Since you have a notion of this device, can't you do this stubbing you talk about? That is, if your internal device reference becomes NULL, let the PMU methods preserve the state like no-ops. It's not clear if you are suggesting these stubs to be in each driver or to be assigned by perf in perf_pmu_unregister(). Some downsides of doing it in the driver: - you can't remove the module as perf will still call module code - need to replicate the stubs in every driver (or the equivalent of pmu->closed checks in i915_pmu.c) I *think* the stubs would be quiet the same for every device, so could be feasible to share them inside perf. Alternatively it could simply shortcut the call, without stubs, by looking at event->hw.state and a new pmu->state. I can give this a try. thanks Lucas De Marchi And then when the last event goes away, tear down the whole thing. Again, I'm not sure I'm following.
Re: [PATCH 6/7] drm/i915/pmu: Lazy unregister
On Tue, Jul 23, 2024 at 09:03:25AM GMT, Tvrtko Ursulin wrote: On 22/07/2024 22:06, Lucas De Marchi wrote: Instead of calling perf_pmu_unregister() when unbinding, defer that to the destruction of i915 object. Since perf itself holds a reference in the event, this only happens when all events are gone, which guarantees i915 is not unregistering the pmu with live events. Previously, running the following sequence would crash the system after ~2 tries: 1) bind device to i915 2) wait events to show up on sysfs 3) start perf stat -I 1000 -e i915/rcs0-busy/ 4) unbind driver 5) kill perf Most of the time this crashes in perf_pmu_disable() while accessing the percpu pmu_disable_count. This happens because perf_pmu_unregister() destroys it with free_percpu(pmu->pmu_disable_count). With a lazy unbind, the pmu is only unregistered after (5) as opposed to after (4). The downside is that if a new bind operation is attempted for the same device/driver without killing the perf process, i915 will fail to register the pmu (but still load successfully). This seems better than completely crashing the system. So effectively allows unbind to succeed without fully unbinding the driver from the device? That sounds like a significant drawback and if so, I wonder if a more complicated solution wouldn't be better after all. Or is there precedence for allowing userspace keeping their paws on unbound devices in this way? keeping the resources alive but "unplunged" while the hardware disappeared is a common thing to do... it's the whole point of the drmm-managed resource for example. If you bind the driver and then unbind it while userspace is holding a ref, next time you try to bind it will come up with a different card number. A similar thing that could be done is to adjust the name of the event - currently we add the mangled pci slot. That said, I agree a better approach would be to allow perf_pmu_unregister() to do its job even when there are open events. On top of that (or as a way to help achieve that), make perf core replace the callbacks with stubs when pmu is unregistered - that would even kill the need for i915's checks on pmu->closed (and fix the lack thereof in other drivers). It can be a can of worms though and may be pushed back by perf core maintainers, so it'd be good have their feedback. thanks Lucas De Marchi Regards, Tvrtko Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/i915_pmu.c | 24 +--- 1 file changed, 9 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index 8708f905f4f4..df53a8fe53ec 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -1158,18 +1158,21 @@ static void free_pmu(struct drm_device *dev, void *res) struct i915_pmu *pmu = res; struct drm_i915_private *i915 = pmu_to_i915(pmu); + perf_pmu_unregister(&pmu->base); free_event_attributes(pmu); kfree(pmu->base.attr_groups); if (IS_DGFX(i915)) kfree(pmu->name); + + /* +* Make sure all currently running (but shortcut on pmu->closed) are +* gone before proceeding with free'ing the pmu object embedded in i915. +*/ + synchronize_rcu(); } static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node) { - struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node); - - GEM_BUG_ON(!pmu->base.event_init); - /* Select the first online CPU as a designated reader. */ if (cpumask_empty(&i915_pmu_cpumask)) cpumask_set_cpu(cpu, &i915_pmu_cpumask); @@ -1182,8 +1185,6 @@ static int i915_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node) struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node); unsigned int target = i915_pmu_target_cpu; - GEM_BUG_ON(!pmu->base.event_init); - /* * Unregistering an instance generates a CPU offline event which we must * ignore to avoid incorrectly modifying the shared i915_pmu_cpumask. @@ -1337,21 +1338,14 @@ void i915_pmu_unregister(struct drm_i915_private *i915) { struct i915_pmu *pmu = &i915->pmu; - if (!pmu->base.event_init) - return; - /* -* "Disconnect" the PMU callbacks - since all are atomic synchronize_rcu -* ensures all currently executing ones will have exited before we -* proceed with unregistration. +* "Disconnect" the PMU callbacks - unregistering the pmu will be done +* later when all currently open events are gone */ pmu->closed = true; - synchronize_rcu(); hrtimer_cancel(&pmu->timer); - i915_pmu_unregister_cpuhp_state(pmu); - perf_pmu_unregister(&pmu->base); pmu->base.event_init = NULL; }
[PATCH 6/7] drm/i915/pmu: Lazy unregister
Instead of calling perf_pmu_unregister() when unbinding, defer that to the destruction of i915 object. Since perf itself holds a reference in the event, this only happens when all events are gone, which guarantees i915 is not unregistering the pmu with live events. Previously, running the following sequence would crash the system after ~2 tries: 1) bind device to i915 2) wait events to show up on sysfs 3) start perf stat -I 1000 -e i915/rcs0-busy/ 4) unbind driver 5) kill perf Most of the time this crashes in perf_pmu_disable() while accessing the percpu pmu_disable_count. This happens because perf_pmu_unregister() destroys it with free_percpu(pmu->pmu_disable_count). With a lazy unbind, the pmu is only unregistered after (5) as opposed to after (4). The downside is that if a new bind operation is attempted for the same device/driver without killing the perf process, i915 will fail to register the pmu (but still load successfully). This seems better than completely crashing the system. Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/i915_pmu.c | 24 +--- 1 file changed, 9 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index 8708f905f4f4..df53a8fe53ec 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -1158,18 +1158,21 @@ static void free_pmu(struct drm_device *dev, void *res) struct i915_pmu *pmu = res; struct drm_i915_private *i915 = pmu_to_i915(pmu); + perf_pmu_unregister(&pmu->base); free_event_attributes(pmu); kfree(pmu->base.attr_groups); if (IS_DGFX(i915)) kfree(pmu->name); + + /* +* Make sure all currently running (but shortcut on pmu->closed) are +* gone before proceeding with free'ing the pmu object embedded in i915. +*/ + synchronize_rcu(); } static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node) { - struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node); - - GEM_BUG_ON(!pmu->base.event_init); - /* Select the first online CPU as a designated reader. */ if (cpumask_empty(&i915_pmu_cpumask)) cpumask_set_cpu(cpu, &i915_pmu_cpumask); @@ -1182,8 +1185,6 @@ static int i915_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node) struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node); unsigned int target = i915_pmu_target_cpu; - GEM_BUG_ON(!pmu->base.event_init); - /* * Unregistering an instance generates a CPU offline event which we must * ignore to avoid incorrectly modifying the shared i915_pmu_cpumask. @@ -1337,21 +1338,14 @@ void i915_pmu_unregister(struct drm_i915_private *i915) { struct i915_pmu *pmu = &i915->pmu; - if (!pmu->base.event_init) - return; - /* -* "Disconnect" the PMU callbacks - since all are atomic synchronize_rcu -* ensures all currently executing ones will have exited before we -* proceed with unregistration. +* "Disconnect" the PMU callbacks - unregistering the pmu will be done +* later when all currently open events are gone */ pmu->closed = true; - synchronize_rcu(); hrtimer_cancel(&pmu->timer); - i915_pmu_unregister_cpuhp_state(pmu); - perf_pmu_unregister(&pmu->base); pmu->base.event_init = NULL; } -- 2.43.0
[PATCH 1/7] perf/core: Add pmu get/put
If a pmu is unregistered while there's an active event, perf will still access the pmu via event->pmu, even after the event is destroyed. This makes it difficult for drivers like i915 that take a reference on the device when the event is created and put it when it's destroyed. Currently the following use-after-free happens just after destroying the event: BUG: KASAN: use-after-free in exclusive_event_destroy+0xd8/0xf0 Read of size 4 at addr 88816e2bb63c by task perf/7748 Whenever and event is created, get a pmu reference to use in event->pmu and just before calling module_put(), drop the reference.. Signed-off-by: Lucas De Marchi --- include/linux/perf_event.h | 3 +++ kernel/events/core.c | 32 2 files changed, 31 insertions(+), 4 deletions(-) diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index a5304ae8c654..7048a505e93c 100644 --- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -540,6 +540,9 @@ struct pmu { * Check period value for PERF_EVENT_IOC_PERIOD ioctl. */ int (*check_period) (struct perf_event *event, u64 value); /* optional */ + + struct pmu *(*get) (struct pmu *pmu); /* optional: get a reference */ + void (*put) (struct pmu *pmu); /* optional: put a reference */ }; enum perf_addr_filter_action_t { diff --git a/kernel/events/core.c b/kernel/events/core.c index 1b6f5dc7ed32..cc7541b644b0 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -5208,6 +5208,8 @@ static void perf_addr_filters_splice(struct perf_event *event, static void _free_event(struct perf_event *event) { + struct module *module; + irq_work_sync(&event->pending_irq); unaccount_event(event); @@ -5259,7 +5261,13 @@ static void _free_event(struct perf_event *event) put_ctx(event->ctx); exclusive_event_destroy(event); - module_put(event->pmu->module); + + module = event->pmu->module; + event->pmu->put(event->pmu); + /* can't touch pmu anymore */ + event->pmu = NULL; + + module_put(module); call_rcu(&event->rcu_head, free_event_rcu); } @@ -11331,6 +11339,11 @@ static int perf_pmu_nop_int(struct pmu *pmu) return 0; } +static struct pmu *perf_pmu_nop_pmu(struct pmu *pmu) +{ + return pmu; +} + static int perf_event_nop_int(struct perf_event *event, u64 value) { return 0; @@ -11617,6 +11630,12 @@ int perf_pmu_register(struct pmu *pmu, const char *name, int type) if (!pmu->event_idx) pmu->event_idx = perf_event_idx_default; + if (!pmu->get) + pmu->get = perf_pmu_nop_pmu; + + if (!pmu->put) + pmu->put = perf_pmu_nop_void; + list_add_rcu(&pmu->entry, &pmus); atomic_set(&pmu->exclusive_cnt, 0); ret = 0; @@ -11695,7 +11714,8 @@ static int perf_try_init_event(struct pmu *pmu, struct perf_event *event) BUG_ON(!ctx); } - event->pmu = pmu; + event->pmu = pmu->get(pmu); + ret = pmu->event_init(event); if (ctx) @@ -11714,8 +11734,12 @@ static int perf_try_init_event(struct pmu *pmu, struct perf_event *event) event->destroy(event); } - if (ret) - module_put(pmu->module); + if (ret) { + struct module *module = pmu->module; + + pmu->put(pmu); + module_put(module); + } return ret; } -- 2.43.0
[PATCH 2/7] drm/i915/pmu: Fix crash due to use-after-free
When an i915 PMU counter is enabled and the driver is then unbound, the PMU will be unregistered via perf_pmu_unregister(), however the event will still be alive. i915 currently tries to deal with this situation by: a) Marking the pmu as "closed" and shortcut the calls from perf b) Taking a reference from i915, that is put back when the event is destroyed. c) Setting event_init to NULL to avoid any further event (a) is ugly, but may be left as is since it protects not trying to access the HW that is now gone. Unless a pmu driver can call perf_pmu_unregister() and not receive any more calls, it's a necessary ugliness. (b) doesn't really work: when the event is destroyed and the i915 ref is put it may free the i915 object, that contains the pmu, not only the event. After event->destroy() callback, perf still expects the pmu object to be alive. Instead of pigging back on the event->destroy() to take and put the device reference, implement the new get()/put() on the pmu object for that purpose. (c) is not entirely correct as from the perf POV it's not an optional call: perf would just dereference the NULL pointer. However this also protects other entrypoints in i915_pmu. A new event creation from perf after the pmu has been unregistered should not be possible anyway: perf_init_event() bails out when not finding the pmu. This may be cleaned up later. Cc: # 5.11+ Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/i915_pmu.c | 34 +++-- 1 file changed, 20 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index 21eb0c5b320d..cb5f6471ec6e 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -514,15 +514,6 @@ static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer) return HRTIMER_RESTART; } -static void i915_pmu_event_destroy(struct perf_event *event) -{ - struct i915_pmu *pmu = event_to_pmu(event); - struct drm_i915_private *i915 = pmu_to_i915(pmu); - - drm_WARN_ON(&i915->drm, event->parent); - - drm_dev_put(&i915->drm); -} static int engine_event_status(struct intel_engine_cs *engine, @@ -628,11 +619,6 @@ static int i915_pmu_event_init(struct perf_event *event) if (ret) return ret; - if (!event->parent) { - drm_dev_get(&i915->drm); - event->destroy = i915_pmu_event_destroy; - } - return 0; } @@ -872,6 +858,24 @@ static int i915_pmu_event_event_idx(struct perf_event *event) return 0; } +static struct pmu *i915_pmu_get(struct pmu *base) +{ + struct i915_pmu *pmu = container_of(base, struct i915_pmu, base); + struct drm_i915_private *i915 = pmu_to_i915(pmu); + + drm_dev_get(&i915->drm); + + return base; +} + +static void i915_pmu_put(struct pmu *base) +{ + struct i915_pmu *pmu = container_of(base, struct i915_pmu, base); + struct drm_i915_private *i915 = pmu_to_i915(pmu); + + drm_dev_put(&i915->drm); +} + struct i915_str_attribute { struct device_attribute attr; const char *str; @@ -1299,6 +1303,8 @@ void i915_pmu_register(struct drm_i915_private *i915) pmu->base.stop = i915_pmu_event_stop; pmu->base.read = i915_pmu_event_read; pmu->base.event_idx = i915_pmu_event_event_idx; + pmu->base.get = i915_pmu_get; + pmu->base.put = i915_pmu_put; ret = perf_pmu_register(&pmu->base, pmu->name, -1); if (ret) -- 2.43.0
[PATCH 5/7] drm/i915/pmu: Let resource survive unbind
There's no need to free the resources during unbind. Since perf events may still access them due to open events, it's safer to free them when dropping the last i915 reference. Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/i915_pmu.c | 21 - 1 file changed, 16 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index b5d14dd318e4..8708f905f4f4 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -5,6 +5,7 @@ */ #include +#include #include "gt/intel_engine.h" #include "gt/intel_engine_pm.h" @@ -1152,6 +1153,17 @@ static void free_event_attributes(struct i915_pmu *pmu) pmu->pmu_attr = NULL; } +static void free_pmu(struct drm_device *dev, void *res) +{ + struct i915_pmu *pmu = res; + struct drm_i915_private *i915 = pmu_to_i915(pmu); + + free_event_attributes(pmu); + kfree(pmu->base.attr_groups); + if (IS_DGFX(i915)) + kfree(pmu->name); +} + static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node) { struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node); @@ -1302,6 +1314,9 @@ void i915_pmu_register(struct drm_i915_private *i915) if (ret) goto err_unreg; + if (drmm_add_action_or_reset(&i915->drm, free_pmu, pmu)) + goto err_unreg; + return; err_unreg: @@ -1336,11 +1351,7 @@ void i915_pmu_unregister(struct drm_i915_private *i915) hrtimer_cancel(&pmu->timer); i915_pmu_unregister_cpuhp_state(pmu); - perf_pmu_unregister(&pmu->base); + pmu->base.event_init = NULL; - kfree(pmu->base.attr_groups); - if (IS_DGFX(i915)) - kfree(pmu->name); - free_event_attributes(pmu); } -- 2.43.0
[PATCH 7/7] drm/i915/pmu: Do not set event_init to NULL
event_init is not an optional function pointer from perf events. Now that pmu unregister happens only when freeing i915, setting it to NULL only protects other functions in i915. Replace that by checking pmu->closed. Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/i915_pmu.c | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index df53a8fe53ec..c5738035bc2f 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -303,7 +303,7 @@ void i915_pmu_gt_parked(struct intel_gt *gt) { struct i915_pmu *pmu = >->i915->pmu; - if (!pmu->base.event_init) + if (pmu->closed) return; spin_lock_irq(&pmu->lock); @@ -325,7 +325,7 @@ void i915_pmu_gt_unparked(struct intel_gt *gt) { struct i915_pmu *pmu = >->i915->pmu; - if (!pmu->base.event_init) + if (pmu->closed) return; spin_lock_irq(&pmu->lock); @@ -1325,12 +1325,12 @@ void i915_pmu_register(struct drm_i915_private *i915) err_groups: kfree(pmu->base.attr_groups); err_attr: - pmu->base.event_init = NULL; free_event_attributes(pmu); err_name: if (IS_DGFX(i915)) kfree(pmu->name); err: + pmu->closed = true; drm_notice(&i915->drm, "Failed to register PMU!\n"); } @@ -1346,6 +1346,4 @@ void i915_pmu_unregister(struct drm_i915_private *i915) hrtimer_cancel(&pmu->timer); i915_pmu_unregister_cpuhp_state(pmu); - - pmu->base.event_init = NULL; } -- 2.43.0
[PATCH 3/7] drm/i915/pmu: Use event_to_pmu()
i915 pointer is not needed in this function and all the others simply calculate the i915_pmu container based on the event->pmu. Follow the same logic as in other functions. Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/i915_pmu.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index cb5f6471ec6e..3a8bd11b87e7 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -820,15 +820,14 @@ static void i915_pmu_event_start(struct perf_event *event, int flags) static void i915_pmu_event_stop(struct perf_event *event, int flags) { - struct drm_i915_private *i915 = - container_of(event->pmu, typeof(*i915), pmu.base); - struct i915_pmu *pmu = &i915->pmu; + struct i915_pmu *pmu = event_to_pmu(event); if (pmu->closed) goto out; if (flags & PERF_EF_UPDATE) i915_pmu_event_read(event); + i915_pmu_disable(event); out: -- 2.43.0
[PATCH 4/7] drm/i915/pmu: Drop is_igp()
There's no reason to hardcode checking for integrated graphics on a specific pci slot. That information is already available per platform an can be checked with IS_DGFX(). Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/i915_pmu.c | 17 +++-- 1 file changed, 3 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index 3a8bd11b87e7..b5d14dd318e4 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -1235,17 +1235,6 @@ static void i915_pmu_unregister_cpuhp_state(struct i915_pmu *pmu) cpuhp_state_remove_instance(cpuhp_slot, &pmu->cpuhp.node); } -static bool is_igp(struct drm_i915_private *i915) -{ - struct pci_dev *pdev = to_pci_dev(i915->drm.dev); - - /* IGP is :00:02.0 */ - return pci_domain_nr(pdev->bus) == 0 && - pdev->bus->number == 0 && - PCI_SLOT(pdev->devfn) == 2 && - PCI_FUNC(pdev->devfn) == 0; -} - void i915_pmu_register(struct drm_i915_private *i915) { struct i915_pmu *pmu = &i915->pmu; @@ -1269,7 +1258,7 @@ void i915_pmu_register(struct drm_i915_private *i915) pmu->cpuhp.cpu = -1; init_rc6(pmu); - if (!is_igp(i915)) { + if (IS_DGFX(i915)) { pmu->name = kasprintf(GFP_KERNEL, "i915_%s", dev_name(i915->drm.dev)); @@ -1323,7 +1312,7 @@ void i915_pmu_register(struct drm_i915_private *i915) pmu->base.event_init = NULL; free_event_attributes(pmu); err_name: - if (!is_igp(i915)) + if (IS_DGFX(i915)) kfree(pmu->name); err: drm_notice(&i915->drm, "Failed to register PMU!\n"); @@ -1351,7 +1340,7 @@ void i915_pmu_unregister(struct drm_i915_private *i915) perf_pmu_unregister(&pmu->base); pmu->base.event_init = NULL; kfree(pmu->base.attr_groups); - if (!is_igp(i915)) + if (IS_DGFX(i915)) kfree(pmu->name); free_event_attributes(pmu); } -- 2.43.0
[PATCH 0/7] Fix i915 pmu on bind/unbind
This is intended to fix the pmu integration in i915 when the device unbinds. I don't have the hardware to test, but I belive a similar issue occurs with any driver using perf_pmu_unregister() if they can unbind from the device - in drm subsystem, that would be the amd driver. Previous attempts I could find: 1) https://lore.kernel.org/lkml/20240115170120.662220-1-tvrtko.ursu...@linux.intel.com/ 2) https://lore.kernel.org/all/20240213180302.47266-1-umesh.nerlige.rama...@intel.com/ I think (2) is a no-go as it tries to do everything from the i915 side by force-closing the fd. I believe this series is simpler* than (1), but that could could also be a good alternative if we want to go with that approach. First patch is to perf. The rest is to i915 that builds on that and moves the unregister bit by bit to be done later, after the last reference to i915 is dropped. Peter/Ingo, could I get your opinion on this or if (1) would be a good alternative? Thanks. * simpler, but see downside mentioned in patch 6 Lucas De Marchi (7): perf/core: Add pmu get/put drm/i915/pmu: Fix crash due to use-after-free drm/i915/pmu: Use event_to_pmu() drm/i915/pmu: Drop is_igp() drm/i915/pmu: Let resource survive unbind drm/i915/pmu: Lazy unregister drm/i915/pmu: Do not set event_init to NULL drivers/gpu/drm/i915/i915_pmu.c | 103 include/linux/perf_event.h | 3 + kernel/events/core.c| 32 -- 3 files changed, 81 insertions(+), 57 deletions(-) -- 2.43.0
Re: When sysfs is not available (say containers)
On Fri, Jul 19, 2024 at 09:50:04AM GMT, Ashutosh Dixit wrote: On Mon, 17 Jun 2024 18:45:57 -0700, Ashutosh Dixit wrote: Folks, The below is just an example from one of the earlier OA patches (already merged): [PATCH 05/17] drm/xe/oa/uapi: Add/remove OA config perf ops +static ssize_t show_dynamic_id(struct kobject *kobj, + struct kobj_attribute *attr, + char *buf) +{ + struct xe_oa_config *oa_config = + container_of(attr, typeof(*oa_config), sysfs_metric_id); + + return sysfs_emit(buf, "%d\n", oa_config->id); +} + +static int create_dynamic_oa_sysfs_entry(struct xe_oa *oa, +struct xe_oa_config *oa_config) +{ + sysfs_attr_init(&oa_config->sysfs_metric_id.attr); + oa_config->sysfs_metric_id.attr.name = "id"; + oa_config->sysfs_metric_id.attr.mode = 0444; + oa_config->sysfs_metric_id.show = show_dynamic_id; + oa_config->sysfs_metric_id.store = NULL; + + oa_config->attrs[0] = &oa_config->sysfs_metric_id.attr; + oa_config->attrs[1] = NULL; + + oa_config->sysfs_metric.name = oa_config->uuid; + oa_config->sysfs_metric.attrs = oa_config->attrs; + + return sysfs_create_group(oa->metrics_kobj, &oa_config->sysfs_metric); +} So we often expose things in sysfs. The question is: are there general guidelines for what to do for environments (such as containers) where userspace cannot access sysfs? E.g. in such cases, do we expose the information exposed in sysfs via queries (i.e. an ioctl)? Or another way? What have we done in the past in drm and what should we do in these cases for Xe? userspace should be written in a way to handle sysfs potentially not being around and not crash in that case. Providing limited functionality is fine and user can decide what to do in that case. Creating duplicate and alternative API to handle this is not a good solution IMO. For containers, it's common to mount sysfs read-only to give container visibility on the host configuration... or parts of it in case you are giving the container privilege over that part of the system. Related, on another project I maintain (kmod) including systemd folks: https://github.com/kmod-project/kmod/issues/10 From https://systemd.io/CONTAINER_INTERFACE/: Make sure to pre-mount /proc/, /sys/, and /sys/fs/selinux/ before invoking systemd, and mount /sys/, /sys/fs/selinux/ and /proc/sys/ read-only (the latter via e.g. a read-only bind mount on itself) that page has more information on other parts of sysfs that people make writable/readable for similar issues in other subsystems and is worth reading. Lucas De Marchi
[PULL] drm-xe-fixes
Hi Dave and Sima, First patch has an important impact to userspace, changing the cahcing mode to write-back for system memory on DGFX. In this case we introduce a limitation in the cache selection uapi that is transparent to UMDs. I mean, no change on their side is needed. Coherence is maintained, but with some known possible and acceptable/accepted differences in CPU access speed. Second patch fixes a leak when finalizing hdcp gsc. drm-xe-fixes-2024-07-11: UAPI Changes: - Use write-back caching mode for system memory on DGFX (Thomas) Driver Changes: - Do not leak object when finalizing hdcp gsc (Nirmoy) The following changes since commit 256abd8e550ce977b728be79a74e1729438b4948: Linux 6.10-rc7 (2024-07-07 14:23:46 -0700) are available in the Git repository at: https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-fixes-2024-07-11 for you to fetch changes up to 609458abd5a10180f513ca364d6c0ae30128c821: drm/xe/display/xe_hdcp_gsc: Free arbiter on driver removal (2024-07-11 08:25:32 -0700) UAPI Changes: - Use write-back caching mode for system memory on DGFX (Thomas) Driver Changes: - Do not leak object when finalizing hdcp gsc (Nirmoy) Nirmoy Das (1): drm/xe/display/xe_hdcp_gsc: Free arbiter on driver removal Thomas Hellström (1): drm/xe: Use write-back caching mode for system memory on DGFX drivers/gpu/drm/xe/display/xe_hdcp_gsc.c | 12 +--- drivers/gpu/drm/xe/xe_bo.c | 47 +++- drivers/gpu/drm/xe/xe_bo_types.h | 3 +- include/uapi/drm/xe_drm.h| 8 +- 4 files changed, 45 insertions(+), 25 deletions(-)
Re: [PULL] drm-xe-next-fixes
On Tue, Jul 09, 2024 at 05:31:39PM GMT, Rodrigo Vivi wrote: Hi Dave and Sima, Here goes a very early drm-xe-next-fixes with 2 important fixes that are going to impact user space. 1. The first one is the rename of the OA stuff from the bad 'perf' name to the xe_observation. Although the rename in the uapi header is likely inoffensive because our UMDs always copy the header to their code, there's a sysfs filename change that is impacting mesa. For this one Mesa MR is ready and they are only waiting for this pull request to be picked by you so they can merge that to Mesa: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30027 With both sides in place, there won't be any kernel version released with the bad naming and no Mesa released using that filename. This is the main reason that I'm sending this PR earlier than normal. 2. The second case, which also impact user space is the write-back caching mode for system memory on DGFX. In this case we introduce a limitation in the cache selection uapi that is transparent to UMDs. I mean, no change on their side is needed. Coherence is maintained with some know possible visible and acceptable/accepted differences in CPU access speed. but this commit is also going to drm-xe-fixes, so I don't think we should have it in this pull. I'm looking at some changes to dim to avoid this kind of problem in future. Lucas De Marchi Thanks, Rodrigo. drm-xe-next-fixes-2024-07-09: UAPI Changes: - Rename xe perf layer as xe observation layer (Ashutosh) - Use write-back caching mode for system memory on DGFX (Thomas) Driver Changes: - Drop trace_xe_hw_fence_free (Brost) The following changes since commit 62a05f4ae9c1fb70bc75d494c9c1c373d2c2e374: Merge tag 'drm-msm-next-2024-07-04' of https://gitlab.freedesktop.org/drm/msm into drm-next (2024-07-05 12:45:41 +0200) are available in the Git repository at: https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-next-fixes-2024-07-09 for you to fetch changes up to 463108053c19f24fa228863824698d5ca72826b6: drm/xe: Drop trace_xe_hw_fence_free (2024-07-09 16:21:26 -0400) UAPI Changes: - Rename xe perf layer as xe observation layer (Ashutosh) - Use write-back caching mode for system memory on DGFX (Thomas) Driver Changes: - Drop trace_xe_hw_fence_free (Brost) Ashutosh Dixit (1): drm/xe/uapi: Rename xe perf layer as xe observation layer Matthew Brost (1): drm/xe: Drop trace_xe_hw_fence_free Thomas Hellström (1): drm/xe: Use write-back caching mode for system memory on DGFX drivers/gpu/drm/xe/Makefile | 2 +- drivers/gpu/drm/xe/xe_bo.c | 47 +-- drivers/gpu/drm/xe/xe_bo_types.h | 3 +- drivers/gpu/drm/xe/xe_device.c | 4 +- drivers/gpu/drm/xe/xe_device_types.h | 2 +- drivers/gpu/drm/xe/xe_gt_types.h | 2 +- drivers/gpu/drm/xe/xe_hw_fence.c | 1 - drivers/gpu/drm/xe/xe_module.c | 6 +- drivers/gpu/drm/xe/xe_oa.c | 34 +-- drivers/gpu/drm/xe/xe_observation.c | 93 + drivers/gpu/drm/xe/xe_observation.h | 20 +++ drivers/gpu/drm/xe/xe_perf.c | 92 - drivers/gpu/drm/xe/xe_perf.h | 20 --- drivers/gpu/drm/xe/xe_trace.h| 5 -- include/uapi/drm/xe_drm.h| 110 +++ 15 files changed, 227 insertions(+), 214 deletions(-) create mode 100644 drivers/gpu/drm/xe/xe_observation.c create mode 100644 drivers/gpu/drm/xe/xe_observation.h delete mode 100644 drivers/gpu/drm/xe/xe_perf.c delete mode 100644 drivers/gpu/drm/xe/xe_perf.h
Re: [PATCH -next] drm/xe: Remove duplicate generated/xe_wa_oob.h header
On Wed, Jul 10, 2024 at 01:49:41PM GMT, Jiapeng Chong wrote: ./drivers/gpu/drm/xe/xe_gt.c: generated/xe_wa_oob.h is included more than once. Reported-by: Abaci Robot Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=9493 Signed-off-by: Jiapeng Chong thanks, but a similar patch got merged yesterday: https://lore.kernel.org/intel-xe/20240708173301.1543871-1-lucas.demar...@intel.com/#r Lucas De Marchi
Re: [PATCH v2 1/2] drm/i915/gem: Return NULL instead of '0'
On Mon, Jun 17, 2024 at 08:42:42PM GMT, Andi Shyti wrote: Commit 05da7d9f717b ("drm/i915/gem: Downgrade stolen lmem setup warning") returns '0' from i915_gem_stolen_lmem_setup(), but it's supposed to return a pointer to the intel_memory_region structure. Sparse complains with the following message: drivers/gpu/drm/i915/gem/i915_gem_stolen.c:943:32: sparse: sparse: Using plain integer as NULL pointer Return NULL. Signed-off-by: Andi Shyti Cc: Jonathan Cavitt --- Cc: Lucas De Marchi Reviewed-by: Lucas De Marchi thanks Lucas De Marchi drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c index 004471f60117..9ca73936dc5e 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c @@ -940,7 +940,7 @@ i915_gem_stolen_lmem_setup(struct drm_i915_private *i915, u16 type, drm_dbg(&i915->drm, "Disabling stolen memory support due to OOB placement: lmem_size = %lli vs dsm_base = %lli\n", lmem_size, dsm_base); - return 0; + return NULL; } dsm_size = ALIGN_DOWN(lmem_size - dsm_base, SZ_1M); } -- 2.45.1
Re: [PATCH v2 2/2] drm/i915/gem: Use the correct format specifier for resource_size_t
On Mon, Jun 17, 2024 at 08:42:43PM GMT, Andi Shyti wrote: Commit 05da7d9f717b ("drm/i915/gem: Downgrade stolen lmem setup warning") adds a debug message where the "lmem_size" and "dsm_base" variables are printed using the %lli identifier. However, these variables are defined as resource_size_t, which are unsigned long for 32-bit machines and unsigned long long for 64-bit machines. The documentation (core-api/printk-formats.rst) recommends using the %pa specifier for printing addresses and sizes of resources. Replace %lli with %pa. This patch also mutes the following sparse warning when compiling with: s/sparse// I will do that while applying, thanks. Reviewed-by: Lucas De Marchi Lucas De Marchi
Re: [PATCH 1/2] drm/i915/gem: Return -EINVAL instead of '0'
On Mon, Jun 17, 2024 at 08:38:20PM GMT, Andi Shyti wrote: On Mon, Jun 17, 2024 at 10:46:07AM -0500, Lucas De Marchi wrote: On Mon, Jun 17, 2024 at 04:22:11PM GMT, Andi Shyti wrote: > On Mon, Jun 17, 2024 at 07:55:10AM -0500, Lucas De Marchi wrote: > > On Sun, Jun 16, 2024 at 09:03:48AM GMT, Andi Shyti wrote: > > > Commit 05da7d9f717b ("drm/i915/gem: Downgrade stolen lmem setup > > > warning") returns '0' from i915_gem_stolen_lmem_setup(), but it's > > > supposed to return a pointer to the intel_memory_region > > > structure. > > > > > > Sparse complains with the following message: > > > > > > > > drivers/gpu/drm/i915/gem/i915_gem_stolen.c:943:32: sparse: sparse: > > > Using plain integer as NULL pointer > > > > > > The caller checks for errors, and if no error is returned, it > > > stores the address of the stolen memory. Therefore, we can't > > > return NULL. Since we are handling a case of out-of-bounds, it's > > > appropriate to treat the "lmem_size < dsm_base" case as an error. > > > > which completely invalidates the point of the commit that introduced this > > regression. That was commit was supposed to do "let's continue, just > > disabling stolen". > > Yes, correct, I missed the point while fixing stuff. But patch 2 > is still valid. no, it's not. It's introduced by the same commit. I went to look into this exactly because of the second issue: it broke 32b build in xe and all the CI.Hooks in xe are failing. yes, it's broken because it's using %lli, right? In 32b it should be %li. Patch 2 is replacing %lli with %pa which should fix the 32b build. I'm sending a new series now. wait... but instead of reverting you are sending a new series changing the first patch to return NULL. However in your commit message you said for this version: The caller checks for errors, and if no error is returned, it stores the address of the stolen memory. Therefore, we can't return NULL. Since we are handling a case of out-of-bounds, it's appropriate to treat the "lmem_size < dsm_base" case as an error. Return -EINVAL embedded in a pointer instead of '0' (or NULL). This way, we avoid a potential NULL pointer dereference. So... what's it? Can we return NULL or not? Is this tested on that scenario with with small BAR or does the module just fail to load later and explode? Lucas De Marchi Andi Lucas De Marchi > > Thanks, > Andi
Re: [PATCH 1/2] drm/i915/gem: Return -EINVAL instead of '0'
On Mon, Jun 17, 2024 at 04:22:11PM GMT, Andi Shyti wrote: On Mon, Jun 17, 2024 at 07:55:10AM -0500, Lucas De Marchi wrote: On Sun, Jun 16, 2024 at 09:03:48AM GMT, Andi Shyti wrote: > Commit 05da7d9f717b ("drm/i915/gem: Downgrade stolen lmem setup > warning") returns '0' from i915_gem_stolen_lmem_setup(), but it's > supposed to return a pointer to the intel_memory_region > structure. > > Sparse complains with the following message: > > > > drivers/gpu/drm/i915/gem/i915_gem_stolen.c:943:32: sparse: sparse: > Using plain integer as NULL pointer > > The caller checks for errors, and if no error is returned, it > stores the address of the stolen memory. Therefore, we can't > return NULL. Since we are handling a case of out-of-bounds, it's > appropriate to treat the "lmem_size < dsm_base" case as an error. which completely invalidates the point of the commit that introduced this regression. That was commit was supposed to do "let's continue, just disabling stolen". Yes, correct, I missed the point while fixing stuff. But patch 2 is still valid. no, it's not. It's introduced by the same commit. I went to look into this exactly because of the second issue: it broke 32b build in xe and all the CI.Hooks in xe are failing. Lucas De Marchi Thanks, Andi
Re: [PATCH 1/2] drm/i915/gem: Return -EINVAL instead of '0'
On Sun, Jun 16, 2024 at 09:03:48AM GMT, Andi Shyti wrote: Commit 05da7d9f717b ("drm/i915/gem: Downgrade stolen lmem setup warning") returns '0' from i915_gem_stolen_lmem_setup(), but it's supposed to return a pointer to the intel_memory_region structure. Sparse complains with the following message: drivers/gpu/drm/i915/gem/i915_gem_stolen.c:943:32: sparse: sparse: Using plain integer as NULL pointer The caller checks for errors, and if no error is returned, it stores the address of the stolen memory. Therefore, we can't return NULL. Since we are handling a case of out-of-bounds, it's appropriate to treat the "lmem_size < dsm_base" case as an error. which completely invalidates the point of the commit that introduced this regression. That was commit was supposed to do "let's continue, just disabling stolen". Apparently we should just revert that patch instead since it introduced 2 new issues and didn't solve what it was supposed to... for probe failures, we are completely fine leaving the warning there. Lucas De Marchi Return -EINVAL embedded in a pointer instead of '0' (or NULL). This way, we avoid a potential NULL pointer dereference. Since we are returning an error, it makes sense to print an error message with drm_err() instead of a debug message using drm_dbg(). Fixes: 05da7d9f717b ("drm/i915/gem: Downgrade stolen lmem setup warning") Signed-off-by: Andi Shyti Cc: Jonathan Cavitt --- drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c index 004471f60117..bd774ce713cf 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c @@ -937,10 +937,10 @@ i915_gem_stolen_lmem_setup(struct drm_i915_private *i915, u16 type, /* Use DSM base address instead for stolen memory */ dsm_base = intel_uncore_read64(uncore, GEN6_DSMBASE) & GEN11_BDSM_MASK; if (lmem_size < dsm_base) { - drm_dbg(&i915->drm, + drm_err(&i915->drm, "Disabling stolen memory support due to OOB placement: lmem_size = %lli vs dsm_base = %lli\n", lmem_size, dsm_base); - return 0; + return ERR_PTR(-EINVAL); } dsm_size = ALIGN_DOWN(lmem_size - dsm_base, SZ_1M); } -- 2.45.1
Re: [PATCH 1/2] drm/dp: Describe target_rr_divider in struct drm_dp_as_sdp
On Thu, Jun 13, 2024 at 10:43:16AM GMT, Mitul Golani wrote: Describe newly added parameter target_rr_divider in struct drm_dp_as_sdp. Fixes: a20c6d954d75 ("drm/dp: Add refresh rate divider to struct representing AS SDP") Cc: Mitul Golani Cc: Arun R Murthy Cc: Suraj Kandpal Cc: Ankit Nautiyal Cc: Jani Nikula Cc: Stephen Rothwell ^ wrong newline. If you do that, git doesn't recognize them as part of the trailer. Lucas De Marchi Signed-off-by: Mitul Golani Reviewed-by: Ankit Nautiyal --- include/drm/display/drm_dp_helper.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/drm/display/drm_dp_helper.h b/include/drm/display/drm_dp_helper.h index ea03e1dd26ba..7f2567fa230d 100644 --- a/include/drm/display/drm_dp_helper.h +++ b/include/drm/display/drm_dp_helper.h @@ -112,6 +112,7 @@ struct drm_dp_vsc_sdp { * @target_rr: Target Refresh * @duration_incr_ms: Successive frame duration increase * @duration_decr_ms: Successive frame duration decrease + * @target_rr_divider: Target refresh rate divider * @mode: Adaptive Sync Operation Mode */ struct drm_dp_as_sdp { -- 2.45.2
Re: [PATCH v2] MAINTAINERS: Update Xe driver maintainers
On Sun, 02 Jun 2024 21:09:59 +0200, Thomas Hellström wrote: > Add Rodrigo Vivi as an Xe driver maintainer. > > v2: > - Cc also Lucas De Marchi (Rodrigo vivi) > - Remove a blank line in commit the commit message (Lucas De Marchi) > > > [...] Applied to drm-xe-fixes, thanks! [1/1] MAINTAINERS: Update Xe driver maintainers commit: a9f9b30e1748252d158f78a0c0affdc949671dd1 Best regards, -- Lucas De Marchi
Re: [PATCH] MAINTAINERS: Update Xe driver maintainers
On Fri, May 31, 2024 at 04:10:51PM GMT, Thomas Hellström wrote: Add Rodrigo Vivi as an Xe driver maintainer. Cc: David Airlie Cc: Daniel Vetter Cc: Rodrigo Vivi Cc: dri-devel@lists.freedesktop.org Cc: linux-ker...@vger.kernel.org shouldn't have a blank line here. Otherwise git doesn't consider the lines above as part of the trailer. Acked-by: Lucas De Marchi Maybe we should also add a T: git https://gitlab.freedesktop.org/drm/i915/kernel.git on the display side so we direct display patches to go through drm-intel like we are currently doing. But we can leave this for another patch. thanks Lucas De Marchi Signed-off-by: Thomas Hellström --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 572be0546e21..8f9982c99257 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -11037,6 +11037,7 @@ F: include/uapi/drm/i915_drm.h INTEL DRM XE DRIVER (Lunar Lake and newer) M: Lucas De Marchi M: Thomas Hellström +M: Rodrigo Vivi L: intel...@lists.freedesktop.org S: Supported W: https://drm.pages.freedesktop.org/intel-docs/ -- 2.44.0
Re: FYI: drm-intel repo moved to fd.o GitLab
Cc'ing drm and drm-intel maintainers too. And a few more details below. On Tue, May 21, 2024 at 02:56:22PM GMT, Knop, Ryszard wrote: Hello, As of today, we've moved the drm-intel repository (upstream for drivers/gpu/drm/i915) to a new location: Previously: https://cgit.freedesktop.org/drm-intel Moved to: https://gitlab.freedesktop.org/drm/i915/kernel For those using dim, it should ask you to migrate your remote automatically on your next branch update. By default it will use the ssh protocol. If you don't have your ssh key in gitlab already, you can follow https://docs.gitlab.com/ee/user/ssh.html#add-an-ssh-key-to-your-gitlab-account to add it. New server has no git:// protocol, so anyone using it is advised to move to https:// instead. The old repo is now a read-only mirror of the GitLab-hosted one. Relevant documentation and links were updated. Thanks to bentiss, mripard and demarchi for help with the move! and thank you for preparing all the patches to the different repos. Lucas De Marchi Ryszard
Re: [PATCH] nightly.conf: Update drm-intel URLs, add missing bare ssh drm-xe URL
On Wed, Apr 24, 2024 at 01:32:19PM GMT, Ryszard Knop wrote: - Switch drm-intel URLs to the new GitLab location. - Add a short SSH link for drm-xe for completeness with other blocks. - Add a missing tab in drm_tip_config for column alignment. Signed-off-by: Ryszard Knop Applied, thanks Lucas De Marchi
Re: [PATCH v5 0/8] drm/xe: Per client usage
On Fri, 17 May 2024 13:43:02 -0700, Lucas De Marchi wrote: > v5 of of > https://lore.kernel.org/all/20240515214258.59209-1-lucas.demar...@intel.com > > Add per-client usage statistics to xe. This ports xe to use the common > method in drm to export the usage to userspace per client (where 1 > client == 1 drm fd open). > > However instead of using the current format measured in nsec, this > creates a new one. The intention here is not to mix the GPU clock domain > with the CPU clock. It allows to cover a few more use cases without > extra complications. > > [...] Applied to drm-xe-next. Thanks everyone for the reviews and contributions. [1/8] drm/xe: Promote xe_hw_engine_class_to_str() commit: ab689514b6ac518ef6e88afa245b834b0dae15a5 [2/8] drm/xe: Add XE_ENGINE_CLASS_OTHER to str conversion commit: bd49e50d81b543e678965118a86958d87c045c73 [3/8] drm/xe/lrc: Add helper to capture context timestamp commit: 9b090d57746d965684f53a1aefcb363bab653ad3 [4/8] drm/xe: Add helper to capture engine timestamp commit: f2f6b667c67daee6fe2c51b5cec3bb0f1b4c1ce0 [5/8] drm/xe: Add helper to accumulate exec queue runtime commit: 6109f24f87d75122cf6de50901115cbee4285ce2 [6/8] drm/xe: Cache data about user-visible engines commit: baa14865529bf1f3c12dc6145bd9109ef289e038 [7/8] drm/xe: Add helper to return any available hw engine commit: 6aa18d7436b0c11f7e62fd6cdb707eaeab1dc473 [8/8] drm/xe/client: Print runtime to fdinfo commit: 188ced1e0ff892f0948f20480e2e0122380ae46d Best regards, -- Lucas De Marchi
Re: [PATCH v5 5/8] drm/xe: Add helper to accumulate exec queue runtime
On Tue, May 21, 2024 at 02:53:56AM GMT, Matthew Brost wrote: On Sat, May 18, 2024 at 09:37:20AM -0500, Lucas De Marchi wrote: On Fri, May 17, 2024 at 03:40:22PM GMT, Matt Roper wrote: > On Fri, May 17, 2024 at 01:43:07PM -0700, Lucas De Marchi wrote: > > From: Umesh Nerlige Ramappa > > > > Add a helper to accumulate per-client runtime of all its > > exec queues. This is called every time a sched job is finished. > > > > v2: > > - Use guc_exec_queue_free_job() and execlist_job_free() to accumulate > > runtime when job is finished since xe_sched_job_completed() is not a > > notification that job finished. > > - Stop trying to update runtime from xe_exec_queue_fini() - that is > > redundant and may happen after xef is closed, leading to a > > use-after-free > > - Do not special case the first timestamp read: the default LRC sets > > CTX_TIMESTAMP to zero, so even the first sample should be a valid > > one. > > - Handle the parallel submission case by multiplying the runtime by > > width. > > v3: Update comments > > > > Signed-off-by: Umesh Nerlige Ramappa > > Signed-off-by: Lucas De Marchi > > --- > > drivers/gpu/drm/xe/xe_device_types.h | 3 +++ > > drivers/gpu/drm/xe/xe_exec_queue.c | 37 > > drivers/gpu/drm/xe/xe_exec_queue.h | 1 + > > drivers/gpu/drm/xe/xe_execlist.c | 1 + > > drivers/gpu/drm/xe/xe_guc_submit.c | 2 ++ > > 5 files changed, 44 insertions(+) > > > > diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h > > index 5c5e36de452a..bc97990fd032 100644 > > --- a/drivers/gpu/drm/xe/xe_device_types.h > > +++ b/drivers/gpu/drm/xe/xe_device_types.h > > @@ -555,6 +555,9 @@ struct xe_file { > > struct mutex lock; > > } exec_queue; > > > > + /** @runtime: hw engine class runtime in ticks for this drm client */ > > + u64 runtime[XE_ENGINE_CLASS_MAX]; > > + > > /** @client: drm client */ > > struct xe_drm_client *client; > > }; > > diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c > > index 395de93579fa..fa6dc996eca8 100644 > > --- a/drivers/gpu/drm/xe/xe_exec_queue.c > > +++ b/drivers/gpu/drm/xe/xe_exec_queue.c > > @@ -769,6 +769,43 @@ bool xe_exec_queue_is_idle(struct xe_exec_queue *q) > > q->lrc[0].fence_ctx.next_seqno - 1; > > } > > > > +/** > > + * xe_exec_queue_update_runtime() - Update runtime for this exec queue from hw > > + * @q: The exec queue > > + * > > + * Update the timestamp saved by HW for this exec queue and save runtime > > + * calculated by using the delta from last update. On multi-lrc case, only the > > + * first is considered. > > + */ > > +void xe_exec_queue_update_runtime(struct xe_exec_queue *q) > > +{ > > + struct xe_file *xef; > > + struct xe_lrc *lrc; > > + u32 old_ts, new_ts; > > + > > + /* > > + * Jobs that are run during driver load may use an exec_queue, but are > > + * not associated with a user xe file, so avoid accumulating busyness > > + * for kernel specific work. > > + */ > > + if (!q->vm || !q->vm->xef) > > + return; > > + > > + xef = q->vm->xef; > > + > > + /* > > + * Only sample the first LRC. For parallel submission, all of them are > > + * scheduled together and we compensate that below by multiplying by > > + * width - this may introduce errors if that premise is not true and > > + * they don't exit 100% aligned. On the other hand, looping through > > + * the LRCs and reading them in different time could also introduce > > + * errors. > > At the time we're executing this function, those LRCs aren't executing > on the hardware anymore and their timestamps aren't continuing to move, not necessarily. Besides calling this function when execution ends, see last patch. This is called when userspace reads the fdinfo file, which portentially reads this for running LRCs. > right? I don't see where error could creep in from just looping over > each of them? > > I guess parallel submission is mostly just used by media these days, > where the batches submitted in parallel are nearly identical and > expected to run the same amount of time, right? Do we have any what I got from Matt Brost is that they are always scheduled together and will run together on the different instances of that engine class, regardles
Re: [PATCH v5 5/8] drm/xe: Add helper to accumulate exec queue runtime
On Fri, May 17, 2024 at 03:40:22PM GMT, Matt Roper wrote: On Fri, May 17, 2024 at 01:43:07PM -0700, Lucas De Marchi wrote: From: Umesh Nerlige Ramappa Add a helper to accumulate per-client runtime of all its exec queues. This is called every time a sched job is finished. v2: - Use guc_exec_queue_free_job() and execlist_job_free() to accumulate runtime when job is finished since xe_sched_job_completed() is not a notification that job finished. - Stop trying to update runtime from xe_exec_queue_fini() - that is redundant and may happen after xef is closed, leading to a use-after-free - Do not special case the first timestamp read: the default LRC sets CTX_TIMESTAMP to zero, so even the first sample should be a valid one. - Handle the parallel submission case by multiplying the runtime by width. v3: Update comments Signed-off-by: Umesh Nerlige Ramappa Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/xe_device_types.h | 3 +++ drivers/gpu/drm/xe/xe_exec_queue.c | 37 drivers/gpu/drm/xe/xe_exec_queue.h | 1 + drivers/gpu/drm/xe/xe_execlist.c | 1 + drivers/gpu/drm/xe/xe_guc_submit.c | 2 ++ 5 files changed, 44 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h index 5c5e36de452a..bc97990fd032 100644 --- a/drivers/gpu/drm/xe/xe_device_types.h +++ b/drivers/gpu/drm/xe/xe_device_types.h @@ -555,6 +555,9 @@ struct xe_file { struct mutex lock; } exec_queue; + /** @runtime: hw engine class runtime in ticks for this drm client */ + u64 runtime[XE_ENGINE_CLASS_MAX]; + /** @client: drm client */ struct xe_drm_client *client; }; diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c index 395de93579fa..fa6dc996eca8 100644 --- a/drivers/gpu/drm/xe/xe_exec_queue.c +++ b/drivers/gpu/drm/xe/xe_exec_queue.c @@ -769,6 +769,43 @@ bool xe_exec_queue_is_idle(struct xe_exec_queue *q) q->lrc[0].fence_ctx.next_seqno - 1; } +/** + * xe_exec_queue_update_runtime() - Update runtime for this exec queue from hw + * @q: The exec queue + * + * Update the timestamp saved by HW for this exec queue and save runtime + * calculated by using the delta from last update. On multi-lrc case, only the + * first is considered. + */ +void xe_exec_queue_update_runtime(struct xe_exec_queue *q) +{ + struct xe_file *xef; + struct xe_lrc *lrc; + u32 old_ts, new_ts; + + /* +* Jobs that are run during driver load may use an exec_queue, but are +* not associated with a user xe file, so avoid accumulating busyness +* for kernel specific work. +*/ + if (!q->vm || !q->vm->xef) + return; + + xef = q->vm->xef; + + /* +* Only sample the first LRC. For parallel submission, all of them are +* scheduled together and we compensate that below by multiplying by +* width - this may introduce errors if that premise is not true and +* they don't exit 100% aligned. On the other hand, looping through +* the LRCs and reading them in different time could also introduce +* errors. At the time we're executing this function, those LRCs aren't executing on the hardware anymore and their timestamps aren't continuing to move, not necessarily. Besides calling this function when execution ends, see last patch. This is called when userspace reads the fdinfo file, which portentially reads this for running LRCs. right? I don't see where error could creep in from just looping over each of them? I guess parallel submission is mostly just used by media these days, where the batches submitted in parallel are nearly identical and expected to run the same amount of time, right? Do we have any what I got from Matt Brost is that they are always scheduled together and will run together on the different instances of that engine class, regardless if they are different. As you said, I'm not sure there's even a use case for running different batches. +Matt Brost to confirm my reasoning below. Looking at our uapi and some tests in igt. This is controlled by the width arg when creating the exec queue. Later when executing, we can only pass 1 sync obj to wait for completion. So even if userspace passes different batch addresses during exec (i.e. different batches), the whole lot will not complete until everything finishes. I think it's reasonable to consider everything is busy while it doesn't complete. userspace (or potential future userspace) that might submit heterogeneous batches in parallel, which would make this inaccurate? I'm not very familiar with the use cases of parallel submission, so I'll trust that you've got a better understanding of the userspace usage than I do; everything else here looks fine to me. I kind of
[PATCH v5 6/8] drm/xe: Cache data about user-visible engines
gt->info.engine_mask used to indicate the available engines, but that is not always true anymore: some engines are reserved to kernel and some may be exposed as a single engine (e.g. with ccs_mode). Runtime changes only happen when no clients exist, so it's safe to cache the list of engines in the gt and update that when it's needed. This will help implementing per client engine utilization so this (mostly constant) information doesn't need to be re-calculated on every query. Reviewed-by: Jonathan Cavitt Reviewed-by: Umesh Nerlige Ramappa Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/xe_gt.c | 23 +++ drivers/gpu/drm/xe/xe_gt.h | 13 + drivers/gpu/drm/xe/xe_gt_ccs_mode.c | 1 + drivers/gpu/drm/xe/xe_gt_types.h| 21 - 4 files changed, 57 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c index e69a03ddd255..5194a3d38e76 100644 --- a/drivers/gpu/drm/xe/xe_gt.c +++ b/drivers/gpu/drm/xe/xe_gt.c @@ -560,9 +560,32 @@ int xe_gt_init(struct xe_gt *gt) if (err) return err; + xe_gt_record_user_engines(gt); + return drmm_add_action_or_reset(>_to_xe(gt)->drm, gt_fini, gt); } +void xe_gt_record_user_engines(struct xe_gt *gt) +{ + struct xe_hw_engine *hwe; + enum xe_hw_engine_id id; + + gt->user_engines.mask = 0; + memset(gt->user_engines.instances_per_class, 0, + sizeof(gt->user_engines.instances_per_class)); + + for_each_hw_engine(hwe, gt, id) { + if (xe_hw_engine_is_reserved(hwe)) + continue; + + gt->user_engines.mask |= BIT_ULL(id); + gt->user_engines.instances_per_class[hwe->class]++; + } + + xe_gt_assert(gt, (gt->user_engines.mask | gt->info.engine_mask) +== gt->info.engine_mask); +} + static int do_gt_reset(struct xe_gt *gt) { int err; diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h index 8474c50b1b30..1d010bf4a756 100644 --- a/drivers/gpu/drm/xe/xe_gt.h +++ b/drivers/gpu/drm/xe/xe_gt.h @@ -38,6 +38,19 @@ int xe_gt_init_hwconfig(struct xe_gt *gt); int xe_gt_init_early(struct xe_gt *gt); int xe_gt_init(struct xe_gt *gt); int xe_gt_record_default_lrcs(struct xe_gt *gt); + +/** + * xe_gt_record_user_engines - save data related to engines available to + * usersapce + * @gt: GT structure + * + * Walk the available HW engines from gt->info.engine_mask and calculate data + * related to those engines that may be used by userspace. To be used whenever + * available engines change in runtime (e.g. with ccs_mode) or during + * initialization + */ +void xe_gt_record_user_engines(struct xe_gt *gt); + void xe_gt_suspend_prepare(struct xe_gt *gt); int xe_gt_suspend(struct xe_gt *gt); int xe_gt_resume(struct xe_gt *gt); diff --git a/drivers/gpu/drm/xe/xe_gt_ccs_mode.c b/drivers/gpu/drm/xe/xe_gt_ccs_mode.c index a34c9a24dafc..c36218f4f6c8 100644 --- a/drivers/gpu/drm/xe/xe_gt_ccs_mode.c +++ b/drivers/gpu/drm/xe/xe_gt_ccs_mode.c @@ -134,6 +134,7 @@ ccs_mode_store(struct device *kdev, struct device_attribute *attr, if (gt->ccs_mode != num_engines) { xe_gt_info(gt, "Setting compute mode to %d\n", num_engines); gt->ccs_mode = num_engines; + xe_gt_record_user_engines(gt); xe_gt_reset_async(gt); } diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h index 475fb58882f1..10a9a9529377 100644 --- a/drivers/gpu/drm/xe/xe_gt_types.h +++ b/drivers/gpu/drm/xe/xe_gt_types.h @@ -113,7 +113,11 @@ struct xe_gt { enum xe_gt_type type; /** @info.reference_clock: clock frequency */ u32 reference_clock; - /** @info.engine_mask: mask of engines present on GT */ + /** +* @info.engine_mask: mask of engines present on GT. Some of +* them may be reserved in runtime and not available for user. +* See @user_engines.mask +*/ u64 engine_mask; /** @info.gmdid: raw GMD_ID value from hardware */ u32 gmdid; @@ -368,6 +372,21 @@ struct xe_gt { /** @wa_active.oob: bitmap with active OOB workaroudns */ unsigned long *oob; } wa_active; + + /** @user_engines: engines present in GT and available to userspace */ + struct { + /** +* @user_engines.mask: like @info->engine_mask, but take in +* consideration only engines available to userspace +*/ + u64 mask; + + /** +* @user_engines.instances_per_class: aggregate per class the +* number of engines available to userspace +
[PATCH v5 4/8] drm/xe: Add helper to capture engine timestamp
Just like CTX_TIMESTAMP is used to calculate runtime, add a helper to get the timestamp for the engine so it can be used to calculate the "engine time" with the same unit as the runtime is recorded. Reviewed-by: Umesh Nerlige Ramappa Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/xe_hw_engine.c | 5 + drivers/gpu/drm/xe/xe_hw_engine.h | 1 + 2 files changed, 6 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c index 942fca8f1eb9..de1aefaa2335 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.c +++ b/drivers/gpu/drm/xe/xe_hw_engine.c @@ -1121,3 +1121,8 @@ const char *xe_hw_engine_class_to_str(enum xe_engine_class class) return NULL; } + +u64 xe_hw_engine_read_timestamp(struct xe_hw_engine *hwe) +{ + return xe_mmio_read64_2x32(hwe->gt, RING_TIMESTAMP(hwe->mmio_base)); +} diff --git a/drivers/gpu/drm/xe/xe_hw_engine.h b/drivers/gpu/drm/xe/xe_hw_engine.h index 843de159e47c..7f2d27c0ba1a 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.h +++ b/drivers/gpu/drm/xe/xe_hw_engine.h @@ -68,5 +68,6 @@ static inline bool xe_hw_engine_is_valid(struct xe_hw_engine *hwe) } const char *xe_hw_engine_class_to_str(enum xe_engine_class class); +u64 xe_hw_engine_read_timestamp(struct xe_hw_engine *hwe); #endif -- 2.43.0
[PATCH v5 7/8] drm/xe: Add helper to return any available hw engine
Get the first available engine from a gt, which helps in the case any engine serves as a context, like when reading RING_TIMESTAMP. Reviewed-by: Umesh Nerlige Ramappa Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/xe_gt.c | 11 +++ drivers/gpu/drm/xe/xe_gt.h | 7 +++ 2 files changed, 18 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c index 5194a3d38e76..3432fef56486 100644 --- a/drivers/gpu/drm/xe/xe_gt.c +++ b/drivers/gpu/drm/xe/xe_gt.c @@ -833,3 +833,14 @@ struct xe_hw_engine *xe_gt_any_hw_engine_by_reset_domain(struct xe_gt *gt, return NULL; } + +struct xe_hw_engine *xe_gt_any_hw_engine(struct xe_gt *gt) +{ + struct xe_hw_engine *hwe; + enum xe_hw_engine_id id; + + for_each_hw_engine(hwe, gt, id) + return hwe; + + return NULL; +} diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h index 1d010bf4a756..9073ac68a777 100644 --- a/drivers/gpu/drm/xe/xe_gt.h +++ b/drivers/gpu/drm/xe/xe_gt.h @@ -67,6 +67,13 @@ void xe_gt_remove(struct xe_gt *gt); struct xe_hw_engine * xe_gt_any_hw_engine_by_reset_domain(struct xe_gt *gt, enum xe_engine_class class); +/** + * xe_gt_any_hw_engine - scan the list of engines and return the + * first available + * @gt: GT structure + */ +struct xe_hw_engine *xe_gt_any_hw_engine(struct xe_gt *gt); + struct xe_hw_engine *xe_gt_hw_engine(struct xe_gt *gt, enum xe_engine_class class, u16 instance, -- 2.43.0
[PATCH v5 5/8] drm/xe: Add helper to accumulate exec queue runtime
From: Umesh Nerlige Ramappa Add a helper to accumulate per-client runtime of all its exec queues. This is called every time a sched job is finished. v2: - Use guc_exec_queue_free_job() and execlist_job_free() to accumulate runtime when job is finished since xe_sched_job_completed() is not a notification that job finished. - Stop trying to update runtime from xe_exec_queue_fini() - that is redundant and may happen after xef is closed, leading to a use-after-free - Do not special case the first timestamp read: the default LRC sets CTX_TIMESTAMP to zero, so even the first sample should be a valid one. - Handle the parallel submission case by multiplying the runtime by width. v3: Update comments Signed-off-by: Umesh Nerlige Ramappa Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/xe_device_types.h | 3 +++ drivers/gpu/drm/xe/xe_exec_queue.c | 37 drivers/gpu/drm/xe/xe_exec_queue.h | 1 + drivers/gpu/drm/xe/xe_execlist.c | 1 + drivers/gpu/drm/xe/xe_guc_submit.c | 2 ++ 5 files changed, 44 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h index 5c5e36de452a..bc97990fd032 100644 --- a/drivers/gpu/drm/xe/xe_device_types.h +++ b/drivers/gpu/drm/xe/xe_device_types.h @@ -555,6 +555,9 @@ struct xe_file { struct mutex lock; } exec_queue; + /** @runtime: hw engine class runtime in ticks for this drm client */ + u64 runtime[XE_ENGINE_CLASS_MAX]; + /** @client: drm client */ struct xe_drm_client *client; }; diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c index 395de93579fa..fa6dc996eca8 100644 --- a/drivers/gpu/drm/xe/xe_exec_queue.c +++ b/drivers/gpu/drm/xe/xe_exec_queue.c @@ -769,6 +769,43 @@ bool xe_exec_queue_is_idle(struct xe_exec_queue *q) q->lrc[0].fence_ctx.next_seqno - 1; } +/** + * xe_exec_queue_update_runtime() - Update runtime for this exec queue from hw + * @q: The exec queue + * + * Update the timestamp saved by HW for this exec queue and save runtime + * calculated by using the delta from last update. On multi-lrc case, only the + * first is considered. + */ +void xe_exec_queue_update_runtime(struct xe_exec_queue *q) +{ + struct xe_file *xef; + struct xe_lrc *lrc; + u32 old_ts, new_ts; + + /* +* Jobs that are run during driver load may use an exec_queue, but are +* not associated with a user xe file, so avoid accumulating busyness +* for kernel specific work. +*/ + if (!q->vm || !q->vm->xef) + return; + + xef = q->vm->xef; + + /* +* Only sample the first LRC. For parallel submission, all of them are +* scheduled together and we compensate that below by multiplying by +* width - this may introduce errors if that premise is not true and +* they don't exit 100% aligned. On the other hand, looping through +* the LRCs and reading them in different time could also introduce +* errors. +*/ + lrc = &q->lrc[0]; + new_ts = xe_lrc_update_timestamp(lrc, &old_ts); + xef->runtime[q->class] += (new_ts - old_ts) * q->width; +} + void xe_exec_queue_kill(struct xe_exec_queue *q) { struct xe_exec_queue *eq = q, *next; diff --git a/drivers/gpu/drm/xe/xe_exec_queue.h b/drivers/gpu/drm/xe/xe_exec_queue.h index 48f6da53a292..e0f07d28ee1a 100644 --- a/drivers/gpu/drm/xe/xe_exec_queue.h +++ b/drivers/gpu/drm/xe/xe_exec_queue.h @@ -75,5 +75,6 @@ struct dma_fence *xe_exec_queue_last_fence_get(struct xe_exec_queue *e, struct xe_vm *vm); void xe_exec_queue_last_fence_set(struct xe_exec_queue *e, struct xe_vm *vm, struct dma_fence *fence); +void xe_exec_queue_update_runtime(struct xe_exec_queue *q); #endif diff --git a/drivers/gpu/drm/xe/xe_execlist.c b/drivers/gpu/drm/xe/xe_execlist.c index e9dee1e14fef..bd7f27efe0e0 100644 --- a/drivers/gpu/drm/xe/xe_execlist.c +++ b/drivers/gpu/drm/xe/xe_execlist.c @@ -306,6 +306,7 @@ static void execlist_job_free(struct drm_sched_job *drm_job) { struct xe_sched_job *job = to_xe_sched_job(drm_job); + xe_exec_queue_update_runtime(job->q); xe_sched_job_put(job); } diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c index 4efb88e3e056..ad2b8067d071 100644 --- a/drivers/gpu/drm/xe/xe_guc_submit.c +++ b/drivers/gpu/drm/xe/xe_guc_submit.c @@ -749,6 +749,8 @@ static void guc_exec_queue_free_job(struct drm_sched_job *drm_job) { struct xe_sched_job *job = to_xe_sched_job(drm_job); + xe_exec_queue_update_runtime(job->q); + trace_xe_sched_job_free(job); xe_sched_job_put(job); } -- 2.43.0
[PATCH v5 3/8] drm/xe/lrc: Add helper to capture context timestamp
From: Umesh Nerlige Ramappa Add a helper to capture CTX_TIMESTAMP from the context image so it can be used to calculate the runtime. v2: Add kernel-doc to clarify expectation from caller Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Francois Dugast Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/regs/xe_lrc_layout.h | 1 + drivers/gpu/drm/xe/xe_lrc.c | 12 drivers/gpu/drm/xe/xe_lrc.h | 14 ++ drivers/gpu/drm/xe/xe_lrc_types.h | 3 +++ 4 files changed, 30 insertions(+) diff --git a/drivers/gpu/drm/xe/regs/xe_lrc_layout.h b/drivers/gpu/drm/xe/regs/xe_lrc_layout.h index e6ca8bbda8f4..045dfd09db99 100644 --- a/drivers/gpu/drm/xe/regs/xe_lrc_layout.h +++ b/drivers/gpu/drm/xe/regs/xe_lrc_layout.h @@ -11,6 +11,7 @@ #define CTX_RING_TAIL (0x06 + 1) #define CTX_RING_START (0x08 + 1) #define CTX_RING_CTL (0x0a + 1) +#define CTX_TIMESTAMP (0x22 + 1) #define CTX_INDIRECT_RING_STATE(0x26 + 1) #define CTX_PDP0_UDW (0x30 + 1) #define CTX_PDP0_LDW (0x32 + 1) diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c index 9b0a4078add3..f679cb9aaea7 100644 --- a/drivers/gpu/drm/xe/xe_lrc.c +++ b/drivers/gpu/drm/xe/xe_lrc.c @@ -844,6 +844,7 @@ int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe, lrc->tile = gt_to_tile(hwe->gt); lrc->ring.size = ring_size; lrc->ring.tail = 0; + lrc->ctx_timestamp = 0; xe_hw_fence_ctx_init(&lrc->fence_ctx, hwe->gt, hwe->fence_irq, hwe->name); @@ -898,6 +899,8 @@ int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe, RING_CTL_SIZE(lrc->ring.size) | RING_VALID); } + xe_lrc_write_ctx_reg(lrc, CTX_TIMESTAMP, 0); + if (xe->info.has_asid && vm) xe_lrc_write_ctx_reg(lrc, PVC_CTX_ASID, vm->usm.asid); @@ -1576,3 +1579,12 @@ void xe_lrc_snapshot_free(struct xe_lrc_snapshot *snapshot) xe_bo_put(snapshot->lrc_bo); kfree(snapshot); } + +u32 xe_lrc_update_timestamp(struct xe_lrc *lrc, u32 *old_ts) +{ + *old_ts = lrc->ctx_timestamp; + + lrc->ctx_timestamp = xe_lrc_read_ctx_reg(lrc, CTX_TIMESTAMP); + + return lrc->ctx_timestamp; +} diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h index e0e841963c23..b9da1031083b 100644 --- a/drivers/gpu/drm/xe/xe_lrc.h +++ b/drivers/gpu/drm/xe/xe_lrc.h @@ -66,4 +66,18 @@ void xe_lrc_snapshot_capture_delayed(struct xe_lrc_snapshot *snapshot); void xe_lrc_snapshot_print(struct xe_lrc_snapshot *snapshot, struct drm_printer *p); void xe_lrc_snapshot_free(struct xe_lrc_snapshot *snapshot); +/** + * xe_lrc_update_timestamp - readout LRC timestamp and update cached value + * @lrc: logical ring context for this exec queue + * @old_ts: pointer where to save the previous timestamp + * + * Read the current timestamp for this LRC and update the cached value. The + * previous cached value is also returned in @old_ts so the caller can calculate + * the delta between 2 updates. Note that this is not intended to be called from + * any place, but just by the paths updating the drm client utilization. + * + * Returns the current LRC timestamp + */ +u32 xe_lrc_update_timestamp(struct xe_lrc *lrc, u32 *old_ts); + #endif diff --git a/drivers/gpu/drm/xe/xe_lrc_types.h b/drivers/gpu/drm/xe/xe_lrc_types.h index cdbf03faef15..0fa055da6b27 100644 --- a/drivers/gpu/drm/xe/xe_lrc_types.h +++ b/drivers/gpu/drm/xe/xe_lrc_types.h @@ -45,6 +45,9 @@ struct xe_lrc { /** @fence_ctx: context for hw fence */ struct xe_hw_fence_ctx fence_ctx; + + /** @ctx_timestamp: readout value of CTX_TIMESTAMP on last update */ + u32 ctx_timestamp; }; struct xe_lrc_snapshot; -- 2.43.0
[PATCH v5 2/8] drm/xe: Add XE_ENGINE_CLASS_OTHER to str conversion
XE_ENGINE_CLASS_OTHER was missing from the str conversion. Add it and remove the default handling so it's protected by -Wswitch. Currently the only user is xe_hw_engine_class_sysfs_init(), which already skips XE_ENGINE_CLASS_OTHER, so there's no change in behavior. Reviewed-by: Nirmoy Das Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/xe_hw_engine.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c index b71e90c555fa..942fca8f1eb9 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.c +++ b/drivers/gpu/drm/xe/xe_hw_engine.c @@ -,9 +,13 @@ const char *xe_hw_engine_class_to_str(enum xe_engine_class class) return "vecs"; case XE_ENGINE_CLASS_COPY: return "bcs"; + case XE_ENGINE_CLASS_OTHER: + return "other"; case XE_ENGINE_CLASS_COMPUTE: return "ccs"; - default: - return NULL; + case XE_ENGINE_CLASS_MAX: + break; } + + return NULL; } -- 2.43.0
[PATCH v5 1/8] drm/xe: Promote xe_hw_engine_class_to_str()
Move it out of the sysfs compilation unit so it can be re-used in other places. Reviewed-by: Nirmoy Das Reviewed-by: Oak Zeng Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/xe_hw_engine.c | 18 ++ drivers/gpu/drm/xe/xe_hw_engine.h | 2 ++ drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c | 18 -- 3 files changed, 20 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c index e19af179af33..b71e90c555fa 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.c +++ b/drivers/gpu/drm/xe/xe_hw_engine.c @@ -1099,3 +1099,21 @@ bool xe_hw_engine_is_reserved(struct xe_hw_engine *hwe) return xe->info.has_usm && hwe->class == XE_ENGINE_CLASS_COPY && hwe->instance == gt->usm.reserved_bcs_instance; } + +const char *xe_hw_engine_class_to_str(enum xe_engine_class class) +{ + switch (class) { + case XE_ENGINE_CLASS_RENDER: + return "rcs"; + case XE_ENGINE_CLASS_VIDEO_DECODE: + return "vcs"; + case XE_ENGINE_CLASS_VIDEO_ENHANCE: + return "vecs"; + case XE_ENGINE_CLASS_COPY: + return "bcs"; + case XE_ENGINE_CLASS_COMPUTE: + return "ccs"; + default: + return NULL; + } +} diff --git a/drivers/gpu/drm/xe/xe_hw_engine.h b/drivers/gpu/drm/xe/xe_hw_engine.h index 71968ee2f600..843de159e47c 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.h +++ b/drivers/gpu/drm/xe/xe_hw_engine.h @@ -67,4 +67,6 @@ static inline bool xe_hw_engine_is_valid(struct xe_hw_engine *hwe) return hwe->name; } +const char *xe_hw_engine_class_to_str(enum xe_engine_class class); + #endif diff --git a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c index 844ec68cbbb8..efce6c7dd2a2 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c +++ b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c @@ -618,24 +618,6 @@ static void hw_engine_class_sysfs_fini(struct drm_device *drm, void *arg) kobject_put(kobj); } -static const char *xe_hw_engine_class_to_str(enum xe_engine_class class) -{ - switch (class) { - case XE_ENGINE_CLASS_RENDER: - return "rcs"; - case XE_ENGINE_CLASS_VIDEO_DECODE: - return "vcs"; - case XE_ENGINE_CLASS_VIDEO_ENHANCE: - return "vecs"; - case XE_ENGINE_CLASS_COPY: - return "bcs"; - case XE_ENGINE_CLASS_COMPUTE: - return "ccs"; - default: - return NULL; - } -} - /** * xe_hw_engine_class_sysfs_init - Init HW engine classes on GT. * @gt: Xe GT. -- 2.43.0
[PATCH v5 0/8] drm/xe: Per client usage
v5 of of https://lore.kernel.org/all/20240515214258.59209-1-lucas.demar...@intel.com Add per-client usage statistics to xe. This ports xe to use the common method in drm to export the usage to userspace per client (where 1 client == 1 drm fd open). However instead of using the current format measured in nsec, this creates a new one. The intention here is not to mix the GPU clock domain with the CPU clock. It allows to cover a few more use cases without extra complications. This version is tested on an ADL-P and also checked gputop with i915 to make sure not regressed. Last patch also contains the documentation for the new key and sample output as requested in v1. The pre-existent drm-cycles- is used as is, which allows gputop to work with xe. This last patch still has some open discussion from v2, so we may need to hold it a little more. v2: - Create a new drm-total-cycles instead of re-using drm-engine with a different unit - Add documentation for the new interface and clarify usage of xe_lrc_update_timestamp() v3: - Fix bugs in "drm/xe: Add helper to accumulate exec queue runtime" - see commit message - Reorder commits so the ones that are useful in other patch series come first v4: - Fix some comments and documentation - Add 2 patches so we cache on the gt the mask of engines visible to userspace and the per-class capacity. Previously we were doing this during the query, but besides not being very efficient as pointed by Tvrtko, we were also not handling correclty reserved engines and engines "hidden" by e.g. ccs_mode. So move that part to be executed on init and when changing the available engines. - Simplify the fdinfo output loop since now we have the information cached on gt. This also moves the read of the gpu timestamp out of the loop as suggested by Tvrtko and using the helpers implemented in the new patches. v5: - Fix kernel-doc - Move pm_runtime_put() earlier in the function as it's not needed anymore after interacting with the HW. Lucas De Marchi (6): drm/xe: Promote xe_hw_engine_class_to_str() drm/xe: Add XE_ENGINE_CLASS_OTHER to str conversion drm/xe: Add helper to capture engine timestamp drm/xe: Cache data about user-visible engines drm/xe: Add helper to return any available hw engine drm/xe/client: Print runtime to fdinfo Umesh Nerlige Ramappa (2): drm/xe/lrc: Add helper to capture context timestamp drm/xe: Add helper to accumulate exec queue runtime Documentation/gpu/drm-usage-stats.rst | 21 ++- Documentation/gpu/xe/index.rst| 1 + Documentation/gpu/xe/xe-drm-usage-stats.rst | 10 ++ drivers/gpu/drm/xe/regs/xe_lrc_layout.h | 1 + drivers/gpu/drm/xe/xe_device_types.h | 3 + drivers/gpu/drm/xe/xe_drm_client.c| 121 +- drivers/gpu/drm/xe/xe_exec_queue.c| 37 ++ drivers/gpu/drm/xe/xe_exec_queue.h| 1 + drivers/gpu/drm/xe/xe_execlist.c | 1 + drivers/gpu/drm/xe/xe_gt.c| 34 + drivers/gpu/drm/xe/xe_gt.h| 20 +++ drivers/gpu/drm/xe/xe_gt_ccs_mode.c | 1 + drivers/gpu/drm/xe/xe_gt_types.h | 21 ++- drivers/gpu/drm/xe/xe_guc_submit.c| 2 + drivers/gpu/drm/xe/xe_hw_engine.c | 27 drivers/gpu/drm/xe/xe_hw_engine.h | 3 + drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c | 18 --- drivers/gpu/drm/xe/xe_lrc.c | 12 ++ drivers/gpu/drm/xe/xe_lrc.h | 14 ++ drivers/gpu/drm/xe/xe_lrc_types.h | 3 + 20 files changed, 329 insertions(+), 22 deletions(-) create mode 100644 Documentation/gpu/xe/xe-drm-usage-stats.rst -- 2.43.0
[PATCH v5 8/8] drm/xe/client: Print runtime to fdinfo
Print the accumulated runtime for client when printing fdinfo. Each time a query is done it first does 2 things: 1) loop through all the exec queues for the current client and accumulate the runtime, per engine class. CTX_TIMESTAMP is used for that, being read from the context image. 2) Read a "GPU timestamp" that can be used for considering "how much GPU time has passed" and that has the same unit/refclock as the one recording the runtime. RING_TIMESTAMP is used for that via MMIO. Since for all current platforms RING_TIMESTAMP follows the same refclock, just read it once, using any first engine available. This is exported to userspace as 2 numbers in fdinfo: drm-cycles-: drm-total-cycles-: Userspace is expected to collect at least 2 samples, which allows to know the client engine busyness as per: RUNTIME1 - RUNTIME0 busyness = - T1 - T0 Since drm-cycles- always starts at 0, it's also possible to know if and engine was ever used by a client. It's expected that userspace will read any 2 samples every few seconds. Given the update frequency of the counters involved and that CTX_TIMESTAMP is 32-bits, the counter for each exec_queue can wrap around (assuming 100% utilization) after ~200s. The wraparound is not perceived by userspace since it's just accumulated for all the exec_queues in a 64-bit counter) but the measurement will not be accurate if the samples are too far apart. This could be mitigated by adding a workqueue to accumulate the counters every so often, but it's additional complexity for something that is done already by userspace every few seconds in tools like gputop (from igt), htop, nvtop, etc, with none of them really defaulting to 1 sample per minute or more. Reviewed-by: Umesh Nerlige Ramappa Acked-by: Tvrtko Ursulin Signed-off-by: Lucas De Marchi --- Documentation/gpu/drm-usage-stats.rst | 21 +++- Documentation/gpu/xe/index.rst | 1 + Documentation/gpu/xe/xe-drm-usage-stats.rst | 10 ++ drivers/gpu/drm/xe/xe_drm_client.c | 121 +++- 4 files changed, 150 insertions(+), 3 deletions(-) create mode 100644 Documentation/gpu/xe/xe-drm-usage-stats.rst diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst index 6dc299343b48..a80f95ca1b2f 100644 --- a/Documentation/gpu/drm-usage-stats.rst +++ b/Documentation/gpu/drm-usage-stats.rst @@ -112,6 +112,19 @@ larger value within a reasonable period. Upon observing a value lower than what was previously read, userspace is expected to stay with that larger previous value until a monotonic update is seen. +- drm-total-cycles-: + +Engine identifier string must be the same as the one specified in the +drm-cycles- tag and shall contain the total number cycles for the given +engine. + +This is a timestamp in GPU unspecified unit that matches the update rate +of drm-cycles-. For drivers that implement this interface, the engine +utilization can be calculated entirely on the GPU clock domain, without +considering the CPU sleep time between 2 samples. + +A driver may implement either this key or drm-maxfreq-, but not both. + - drm-maxfreq-: [Hz|MHz|KHz] Engine identifier string must be the same as the one specified in the @@ -121,6 +134,9 @@ percentage utilization of the engine, whereas drm-engine- only reflects time active without considering what frequency the engine is operating as a percentage of its maximum frequency. +A driver may implement either this key or drm-total-cycles-, but not +both. + Memory ^^ @@ -168,5 +184,6 @@ be documented above and where possible, aligned with other drivers. Driver specific implementations --- -:ref:`i915-usage-stats` -:ref:`panfrost-usage-stats` +* :ref:`i915-usage-stats` +* :ref:`panfrost-usage-stats` +* :ref:`xe-usage-stats` diff --git a/Documentation/gpu/xe/index.rst b/Documentation/gpu/xe/index.rst index c224ecaee81e..3f07aa3b5432 100644 --- a/Documentation/gpu/xe/index.rst +++ b/Documentation/gpu/xe/index.rst @@ -23,3 +23,4 @@ DG2, etc is provided to prototype the driver. xe_firmware xe_tile xe_debugging + xe-drm-usage-stats.rst diff --git a/Documentation/gpu/xe/xe-drm-usage-stats.rst b/Documentation/gpu/xe/xe-drm-usage-stats.rst new file mode 100644 index ..482d503ae68a --- /dev/null +++ b/Documentation/gpu/xe/xe-drm-usage-stats.rst @@ -0,0 +1,10 @@ +.. SPDX-License-Identifier: GPL-2.0+ + +.. _xe-usage-stats: + + +Xe DRM client usage stats implementation + + +.. kernel-doc:: drivers/gpu/drm/xe/xe_drm_client.c + :doc: DRM Client usage stats diff --git a/drivers/gpu/drm/xe/xe_drm_client.c b/drivers/gpu/drm/xe/xe_drm_client.c index 08f0b7c95901..af404c9e5cc0 100644 --- a/drivers/gpu/drm/xe/xe_drm_client.c +++ b/dr
Re: [PATCH v4 6/8] drm/xe: Cache data about user-visible engines
On Thu, May 16, 2024 at 11:33:54AM GMT, Umesh Nerlige Ramappa wrote: On Wed, May 15, 2024 at 02:42:56PM -0700, Lucas De Marchi wrote: gt->info.engine_mask used to indicate the available engines, but that is not always true anymore: some engines are reserved to kernel and some may be exposed as a single engine (e.g. with ccs_mode). Runtime changes only happen when no clients exist, so it's safe to cache the list of engines in the gt and update that when it's needed. This will help implementing per client engine utilization so this (mostly constant) information doesn't need to be re-calculated on every query. Signed-off-by: Lucas De Marchi Just a few questions below, otherwise this looks good as is: Reviewed-by: Umesh Nerlige Ramappa --- drivers/gpu/drm/xe/xe_gt.c | 23 +++ drivers/gpu/drm/xe/xe_gt.h | 13 + drivers/gpu/drm/xe/xe_gt_ccs_mode.c | 1 + drivers/gpu/drm/xe/xe_gt_types.h| 21 - 4 files changed, 57 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c index e69a03ddd255..5194a3d38e76 100644 --- a/drivers/gpu/drm/xe/xe_gt.c +++ b/drivers/gpu/drm/xe/xe_gt.c @@ -560,9 +560,32 @@ int xe_gt_init(struct xe_gt *gt) if (err) return err; + xe_gt_record_user_engines(gt); + return drmm_add_action_or_reset(>_to_xe(gt)->drm, gt_fini, gt); } +void xe_gt_record_user_engines(struct xe_gt *gt) +{ + struct xe_hw_engine *hwe; + enum xe_hw_engine_id id; + + gt->user_engines.mask = 0; + memset(gt->user_engines.instances_per_class, 0, + sizeof(gt->user_engines.instances_per_class)); + + for_each_hw_engine(hwe, gt, id) { + if (xe_hw_engine_is_reserved(hwe)) + continue; + + gt->user_engines.mask |= BIT_ULL(id); + gt->user_engines.instances_per_class[hwe->class]++; + } + + xe_gt_assert(gt, (gt->user_engines.mask | gt->info.engine_mask) +== gt->info.engine_mask); I am not seeing a place where user_engines.mask is not a subset of info.engine_mask in the driver, so the above check will always be true. that's why it's an assert. user_engines.mask should always be a subset of info.engine_mask, otherwise something went terribly wrong. Did you mean to do and & instead of | above? That might make sense since then you are making sure that the user_engines are a subset of engine_mask. no, what I'm trying to assert is that user_engines.mask never has an engine that is not present in info.engine_mask. Example: engine_mask == 0b01 user_engines.mask == 0b11 That should never happen and it should fail the assert. I decided to add the assert because I'm not deriving the user_engines.mask directly from the mask, but indirectly. Early on probe we setup the mask and create the hw_engine instances and we are calculating the user_engines.mask from there. I just wanted to make sure we don't screw up something in the middle that causes issues. +} + static int do_gt_reset(struct xe_gt *gt) { int err; diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h index 8474c50b1b30..ad3fd31e0a41 100644 --- a/drivers/gpu/drm/xe/xe_gt.h +++ b/drivers/gpu/drm/xe/xe_gt.h @@ -38,6 +38,19 @@ int xe_gt_init_hwconfig(struct xe_gt *gt); int xe_gt_init_early(struct xe_gt *gt); int xe_gt_init(struct xe_gt *gt); int xe_gt_record_default_lrcs(struct xe_gt *gt); + +/** + * @xe_gt_record_user_engines - save data related to engines available to + * usersapce + * @gt: GT structure + * + * Walk the available HW engines from gt->info.engine_mask and calculate data + * related to those engines that may be used by userspace. To be used whenever + * available engines change in runtime (e.g. with ccs_mode) or during After the driver loads, do we expect ccs_mode to change dynamically based on some criteria OR is it a one time configuration at driver load? If former, can you provide an example where ccs_mode would change dynamically, just curious. it can be set via sysfs, but it blocks changing it if there are clients. For with display, it's easier to check by loading the driver with enable_display=0. Trying that on a DG2: # modprobe xe enable_display=0 # exec 3<> /dev/dri/card1 # tail -n4 /proc/self/fdinfo/3 drm-cycles-bcs: 0 drm-total-cycles-bcs: 37728138157 drm-cycles-ccs: 0 drm-total-cycles-ccs: 37728138157 # # exec 3<&- # echo 2 > /sys/devices/pci:00/:00:01.0/:01:00.0/:02:01.0/:03:00.0/tile0/gt0/ccs_mode # exec 3<> /dev/dri/card1 # tail -n4 /proc/self/fdinfo/3 drm-total-cycles-bcs: 38260910526 drm-cycles-ccs: 0 drm-total-cycles-ccs: 382609
Re: [PATCH 2/2] MAINTAINERS: update Xe driver maintainers
On Wed, May 15, 2024 at 07:22:22PM GMT, Oded Gabbay wrote: Because I left Intel, I'm removing myself from the list of Xe driver maintainers. Signed-off-by: Oded Gabbay Acked-by: Lucas De Marchi thanks Lucas De Marchi --- MAINTAINERS | 1 - 1 file changed, 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 5bd45a919aff..2469607ff5b7 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -10863,7 +10863,6 @@ F: include/uapi/drm/i915_drm.h INTEL DRM XE DRIVER (Lunar Lake and newer) M: Lucas De Marchi -M: Oded Gabbay M: Thomas Hellström L: intel...@lists.freedesktop.org S: Supported -- 2.34.1
Re: [PATCH v4 8/8] drm/xe/client: Print runtime to fdinfo
On Thu, May 16, 2024 at 08:57:48AM GMT, Tvrtko Ursulin wrote: On 15/05/2024 22:42, Lucas De Marchi wrote: Print the accumulated runtime for client when printing fdinfo. Each time a query is done it first does 2 things: 1) loop through all the exec queues for the current client and accumulate the runtime, per engine class. CTX_TIMESTAMP is used for that, being read from the context image. 2) Read a "GPU timestamp" that can be used for considering "how much GPU time has passed" and that has the same unit/refclock as the one recording the runtime. RING_TIMESTAMP is used for that via MMIO. Since for all current platforms RING_TIMESTAMP follows the same refclock, just read it once, using any first engine available. This is exported to userspace as 2 numbers in fdinfo: drm-cycles-: drm-total-cycles-: Userspace is expected to collect at least 2 samples, which allows to know the client engine busyness as per: RUNTIME1 - RUNTIME0 busyness = - T1 - T0 Since drm-cycles- always starts at 0, it's also possible to know if and engine was ever used by a client. It's expected that userspace will read any 2 samples every few seconds. Given the update frequency of the counters involved and that CTX_TIMESTAMP is 32-bits, the counter for each exec_queue can wrap around (assuming 100% utilization) after ~200s. The wraparound is not perceived by userspace since it's just accumulated for all the exec_queues in a 64-bit counter) but the measurement will not be accurate if the samples are too far apart. This could be mitigated by adding a workqueue to accumulate the counters every so often, but it's additional complexity for something that is done already by userspace every few seconds in tools like gputop (from igt), htop, nvtop, etc, with none of them really defaulting to 1 sample per minute or more. Signed-off-by: Lucas De Marchi --- Documentation/gpu/drm-usage-stats.rst | 21 +++- Documentation/gpu/xe/index.rst | 1 + Documentation/gpu/xe/xe-drm-usage-stats.rst | 10 ++ drivers/gpu/drm/xe/xe_drm_client.c | 121 +++- 4 files changed, 150 insertions(+), 3 deletions(-) create mode 100644 Documentation/gpu/xe/xe-drm-usage-stats.rst diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst index 6dc299343b48..a80f95ca1b2f 100644 --- a/Documentation/gpu/drm-usage-stats.rst +++ b/Documentation/gpu/drm-usage-stats.rst @@ -112,6 +112,19 @@ larger value within a reasonable period. Upon observing a value lower than what was previously read, userspace is expected to stay with that larger previous value until a monotonic update is seen. +- drm-total-cycles-: + +Engine identifier string must be the same as the one specified in the +drm-cycles- tag and shall contain the total number cycles for the given +engine. + +This is a timestamp in GPU unspecified unit that matches the update rate +of drm-cycles-. For drivers that implement this interface, the engine +utilization can be calculated entirely on the GPU clock domain, without +considering the CPU sleep time between 2 samples. + +A driver may implement either this key or drm-maxfreq-, but not both. + - drm-maxfreq-: [Hz|MHz|KHz] Engine identifier string must be the same as the one specified in the @@ -121,6 +134,9 @@ percentage utilization of the engine, whereas drm-engine- only reflects time active without considering what frequency the engine is operating as a percentage of its maximum frequency. +A driver may implement either this key or drm-total-cycles-, but not +both. + For the spec part: Acked-by: Tvrtko Ursulin thanks Some minor comments and questions below. Memory ^^ @@ -168,5 +184,6 @@ be documented above and where possible, aligned with other drivers. Driver specific implementations --- -:ref:`i915-usage-stats` -:ref:`panfrost-usage-stats` +* :ref:`i915-usage-stats` +* :ref:`panfrost-usage-stats` +* :ref:`xe-usage-stats` diff --git a/Documentation/gpu/xe/index.rst b/Documentation/gpu/xe/index.rst index c224ecaee81e..3f07aa3b5432 100644 --- a/Documentation/gpu/xe/index.rst +++ b/Documentation/gpu/xe/index.rst @@ -23,3 +23,4 @@ DG2, etc is provided to prototype the driver. xe_firmware xe_tile xe_debugging + xe-drm-usage-stats.rst diff --git a/Documentation/gpu/xe/xe-drm-usage-stats.rst b/Documentation/gpu/xe/xe-drm-usage-stats.rst new file mode 100644 index ..482d503ae68a --- /dev/null +++ b/Documentation/gpu/xe/xe-drm-usage-stats.rst @@ -0,0 +1,10 @@ +.. SPDX-License-Identifier: GPL-2.0+ + +.. _xe-usage-stats: + + +Xe DRM client usage stats implementation + + +.. kernel-doc:: drivers/gpu/drm/xe/xe_drm_client.c + :doc: DRM Client usage stats diff --git a/drivers/gpu/drm/
[PATCH v4 7/8] drm/xe: Add helper to return any available hw engine
Get the first available engine from a gt, which helps in the case any engine serves as a context, like when reading RING_TIMESTAMP. Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/xe_gt.c | 11 +++ drivers/gpu/drm/xe/xe_gt.h | 7 +++ 2 files changed, 18 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c index 5194a3d38e76..3432fef56486 100644 --- a/drivers/gpu/drm/xe/xe_gt.c +++ b/drivers/gpu/drm/xe/xe_gt.c @@ -833,3 +833,14 @@ struct xe_hw_engine *xe_gt_any_hw_engine_by_reset_domain(struct xe_gt *gt, return NULL; } + +struct xe_hw_engine *xe_gt_any_hw_engine(struct xe_gt *gt) +{ + struct xe_hw_engine *hwe; + enum xe_hw_engine_id id; + + for_each_hw_engine(hwe, gt, id) + return hwe; + + return NULL; +} diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h index ad3fd31e0a41..a53f01362d94 100644 --- a/drivers/gpu/drm/xe/xe_gt.h +++ b/drivers/gpu/drm/xe/xe_gt.h @@ -67,6 +67,13 @@ void xe_gt_remove(struct xe_gt *gt); struct xe_hw_engine * xe_gt_any_hw_engine_by_reset_domain(struct xe_gt *gt, enum xe_engine_class class); +/** + * xe_gt_any_hw_engine - scan the list of engines and return the + * first available + * @gt: GT structure + */ +struct xe_hw_engine *xe_gt_any_hw_engine(struct xe_gt *gt); + struct xe_hw_engine *xe_gt_hw_engine(struct xe_gt *gt, enum xe_engine_class class, u16 instance, -- 2.43.0
[PATCH v4 4/8] drm/xe: Add helper to capture engine timestamp
Just like CTX_TIMESTAMP is used to calculate runtime, add a helper to get the timestamp for the engine so it can be used to calculate the "engine time" with the same unit as the runtime is recorded. Reviewed-by: Umesh Nerlige Ramappa Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/xe_hw_engine.c | 5 + drivers/gpu/drm/xe/xe_hw_engine.h | 1 + 2 files changed, 6 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c index f76bb507e1a5..51536d05ad86 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.c +++ b/drivers/gpu/drm/xe/xe_hw_engine.c @@ -1121,3 +1121,8 @@ const char *xe_hw_engine_class_to_str(enum xe_engine_class class) return NULL; } + +u64 xe_hw_engine_read_timestamp(struct xe_hw_engine *hwe) +{ + return xe_mmio_read64_2x32(hwe->gt, RING_TIMESTAMP(hwe->mmio_base)); +} diff --git a/drivers/gpu/drm/xe/xe_hw_engine.h b/drivers/gpu/drm/xe/xe_hw_engine.h index 843de159e47c..7f2d27c0ba1a 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.h +++ b/drivers/gpu/drm/xe/xe_hw_engine.h @@ -68,5 +68,6 @@ static inline bool xe_hw_engine_is_valid(struct xe_hw_engine *hwe) } const char *xe_hw_engine_class_to_str(enum xe_engine_class class); +u64 xe_hw_engine_read_timestamp(struct xe_hw_engine *hwe); #endif -- 2.43.0
[PATCH v4 8/8] drm/xe/client: Print runtime to fdinfo
Print the accumulated runtime for client when printing fdinfo. Each time a query is done it first does 2 things: 1) loop through all the exec queues for the current client and accumulate the runtime, per engine class. CTX_TIMESTAMP is used for that, being read from the context image. 2) Read a "GPU timestamp" that can be used for considering "how much GPU time has passed" and that has the same unit/refclock as the one recording the runtime. RING_TIMESTAMP is used for that via MMIO. Since for all current platforms RING_TIMESTAMP follows the same refclock, just read it once, using any first engine available. This is exported to userspace as 2 numbers in fdinfo: drm-cycles-: drm-total-cycles-: Userspace is expected to collect at least 2 samples, which allows to know the client engine busyness as per: RUNTIME1 - RUNTIME0 busyness = - T1 - T0 Since drm-cycles- always starts at 0, it's also possible to know if and engine was ever used by a client. It's expected that userspace will read any 2 samples every few seconds. Given the update frequency of the counters involved and that CTX_TIMESTAMP is 32-bits, the counter for each exec_queue can wrap around (assuming 100% utilization) after ~200s. The wraparound is not perceived by userspace since it's just accumulated for all the exec_queues in a 64-bit counter) but the measurement will not be accurate if the samples are too far apart. This could be mitigated by adding a workqueue to accumulate the counters every so often, but it's additional complexity for something that is done already by userspace every few seconds in tools like gputop (from igt), htop, nvtop, etc, with none of them really defaulting to 1 sample per minute or more. Signed-off-by: Lucas De Marchi --- Documentation/gpu/drm-usage-stats.rst | 21 +++- Documentation/gpu/xe/index.rst | 1 + Documentation/gpu/xe/xe-drm-usage-stats.rst | 10 ++ drivers/gpu/drm/xe/xe_drm_client.c | 121 +++- 4 files changed, 150 insertions(+), 3 deletions(-) create mode 100644 Documentation/gpu/xe/xe-drm-usage-stats.rst diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst index 6dc299343b48..a80f95ca1b2f 100644 --- a/Documentation/gpu/drm-usage-stats.rst +++ b/Documentation/gpu/drm-usage-stats.rst @@ -112,6 +112,19 @@ larger value within a reasonable period. Upon observing a value lower than what was previously read, userspace is expected to stay with that larger previous value until a monotonic update is seen. +- drm-total-cycles-: + +Engine identifier string must be the same as the one specified in the +drm-cycles- tag and shall contain the total number cycles for the given +engine. + +This is a timestamp in GPU unspecified unit that matches the update rate +of drm-cycles-. For drivers that implement this interface, the engine +utilization can be calculated entirely on the GPU clock domain, without +considering the CPU sleep time between 2 samples. + +A driver may implement either this key or drm-maxfreq-, but not both. + - drm-maxfreq-: [Hz|MHz|KHz] Engine identifier string must be the same as the one specified in the @@ -121,6 +134,9 @@ percentage utilization of the engine, whereas drm-engine- only reflects time active without considering what frequency the engine is operating as a percentage of its maximum frequency. +A driver may implement either this key or drm-total-cycles-, but not +both. + Memory ^^ @@ -168,5 +184,6 @@ be documented above and where possible, aligned with other drivers. Driver specific implementations --- -:ref:`i915-usage-stats` -:ref:`panfrost-usage-stats` +* :ref:`i915-usage-stats` +* :ref:`panfrost-usage-stats` +* :ref:`xe-usage-stats` diff --git a/Documentation/gpu/xe/index.rst b/Documentation/gpu/xe/index.rst index c224ecaee81e..3f07aa3b5432 100644 --- a/Documentation/gpu/xe/index.rst +++ b/Documentation/gpu/xe/index.rst @@ -23,3 +23,4 @@ DG2, etc is provided to prototype the driver. xe_firmware xe_tile xe_debugging + xe-drm-usage-stats.rst diff --git a/Documentation/gpu/xe/xe-drm-usage-stats.rst b/Documentation/gpu/xe/xe-drm-usage-stats.rst new file mode 100644 index ..482d503ae68a --- /dev/null +++ b/Documentation/gpu/xe/xe-drm-usage-stats.rst @@ -0,0 +1,10 @@ +.. SPDX-License-Identifier: GPL-2.0+ + +.. _xe-usage-stats: + + +Xe DRM client usage stats implementation + + +.. kernel-doc:: drivers/gpu/drm/xe/xe_drm_client.c + :doc: DRM Client usage stats diff --git a/drivers/gpu/drm/xe/xe_drm_client.c b/drivers/gpu/drm/xe/xe_drm_client.c index 08f0b7c95901..952b0cc87708 100644 --- a/drivers/gpu/drm/xe/xe_drm_client.c +++ b/drivers/gpu/drm/xe/xe_drm_client.c @@ -2,6 +2,7 @@ /
[PATCH v4 6/8] drm/xe: Cache data about user-visible engines
gt->info.engine_mask used to indicate the available engines, but that is not always true anymore: some engines are reserved to kernel and some may be exposed as a single engine (e.g. with ccs_mode). Runtime changes only happen when no clients exist, so it's safe to cache the list of engines in the gt and update that when it's needed. This will help implementing per client engine utilization so this (mostly constant) information doesn't need to be re-calculated on every query. Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/xe_gt.c | 23 +++ drivers/gpu/drm/xe/xe_gt.h | 13 + drivers/gpu/drm/xe/xe_gt_ccs_mode.c | 1 + drivers/gpu/drm/xe/xe_gt_types.h| 21 - 4 files changed, 57 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c index e69a03ddd255..5194a3d38e76 100644 --- a/drivers/gpu/drm/xe/xe_gt.c +++ b/drivers/gpu/drm/xe/xe_gt.c @@ -560,9 +560,32 @@ int xe_gt_init(struct xe_gt *gt) if (err) return err; + xe_gt_record_user_engines(gt); + return drmm_add_action_or_reset(>_to_xe(gt)->drm, gt_fini, gt); } +void xe_gt_record_user_engines(struct xe_gt *gt) +{ + struct xe_hw_engine *hwe; + enum xe_hw_engine_id id; + + gt->user_engines.mask = 0; + memset(gt->user_engines.instances_per_class, 0, + sizeof(gt->user_engines.instances_per_class)); + + for_each_hw_engine(hwe, gt, id) { + if (xe_hw_engine_is_reserved(hwe)) + continue; + + gt->user_engines.mask |= BIT_ULL(id); + gt->user_engines.instances_per_class[hwe->class]++; + } + + xe_gt_assert(gt, (gt->user_engines.mask | gt->info.engine_mask) +== gt->info.engine_mask); +} + static int do_gt_reset(struct xe_gt *gt) { int err; diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h index 8474c50b1b30..ad3fd31e0a41 100644 --- a/drivers/gpu/drm/xe/xe_gt.h +++ b/drivers/gpu/drm/xe/xe_gt.h @@ -38,6 +38,19 @@ int xe_gt_init_hwconfig(struct xe_gt *gt); int xe_gt_init_early(struct xe_gt *gt); int xe_gt_init(struct xe_gt *gt); int xe_gt_record_default_lrcs(struct xe_gt *gt); + +/** + * @xe_gt_record_user_engines - save data related to engines available to + * usersapce + * @gt: GT structure + * + * Walk the available HW engines from gt->info.engine_mask and calculate data + * related to those engines that may be used by userspace. To be used whenever + * available engines change in runtime (e.g. with ccs_mode) or during + * initialization + */ +void xe_gt_record_user_engines(struct xe_gt *gt); + void xe_gt_suspend_prepare(struct xe_gt *gt); int xe_gt_suspend(struct xe_gt *gt); int xe_gt_resume(struct xe_gt *gt); diff --git a/drivers/gpu/drm/xe/xe_gt_ccs_mode.c b/drivers/gpu/drm/xe/xe_gt_ccs_mode.c index a34c9a24dafc..c36218f4f6c8 100644 --- a/drivers/gpu/drm/xe/xe_gt_ccs_mode.c +++ b/drivers/gpu/drm/xe/xe_gt_ccs_mode.c @@ -134,6 +134,7 @@ ccs_mode_store(struct device *kdev, struct device_attribute *attr, if (gt->ccs_mode != num_engines) { xe_gt_info(gt, "Setting compute mode to %d\n", num_engines); gt->ccs_mode = num_engines; + xe_gt_record_user_engines(gt); xe_gt_reset_async(gt); } diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h index 5a114fc9dde7..aaf2951749a6 100644 --- a/drivers/gpu/drm/xe/xe_gt_types.h +++ b/drivers/gpu/drm/xe/xe_gt_types.h @@ -112,7 +112,11 @@ struct xe_gt { enum xe_gt_type type; /** @info.reference_clock: clock frequency */ u32 reference_clock; - /** @info.engine_mask: mask of engines present on GT */ + /** +* @info.engine_mask: mask of engines present on GT. Some of +* them may be reserved in runtime and not available for user. +* See @user_engines.mask +*/ u64 engine_mask; /** @info.gmdid: raw GMD_ID value from hardware */ u32 gmdid; @@ -365,6 +369,21 @@ struct xe_gt { /** @wa_active.oob: bitmap with active OOB workaroudns */ unsigned long *oob; } wa_active; + + /** @user_engines: engines present in GT and available to userspace */ + struct { + /** +* @mask: like @info->engine_mask, but take in consideration +* only engines available to userspace +*/ + u64 mask; + + /** +* @instances_per_class: aggregate per class the number of +* engines available to userspace +*/ + u8 instances_per_class[XE_ENGINE_CLASS_MAX]; + } user_engines; }; #endif -- 2.43.0
[PATCH v4 3/8] drm/xe/lrc: Add helper to capture context timestamp
From: Umesh Nerlige Ramappa Add a helper to capture CTX_TIMESTAMP from the context image so it can be used to calculate the runtime. v2: Add kernel-doc to clarify expectation from caller Signed-off-by: Umesh Nerlige Ramappa Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/regs/xe_lrc_layout.h | 1 + drivers/gpu/drm/xe/xe_lrc.c | 12 drivers/gpu/drm/xe/xe_lrc.h | 14 ++ drivers/gpu/drm/xe/xe_lrc_types.h | 3 +++ 4 files changed, 30 insertions(+) diff --git a/drivers/gpu/drm/xe/regs/xe_lrc_layout.h b/drivers/gpu/drm/xe/regs/xe_lrc_layout.h index e6ca8bbda8f4..045dfd09db99 100644 --- a/drivers/gpu/drm/xe/regs/xe_lrc_layout.h +++ b/drivers/gpu/drm/xe/regs/xe_lrc_layout.h @@ -11,6 +11,7 @@ #define CTX_RING_TAIL (0x06 + 1) #define CTX_RING_START (0x08 + 1) #define CTX_RING_CTL (0x0a + 1) +#define CTX_TIMESTAMP (0x22 + 1) #define CTX_INDIRECT_RING_STATE(0x26 + 1) #define CTX_PDP0_UDW (0x30 + 1) #define CTX_PDP0_LDW (0x32 + 1) diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c index 9b0a4078add3..f679cb9aaea7 100644 --- a/drivers/gpu/drm/xe/xe_lrc.c +++ b/drivers/gpu/drm/xe/xe_lrc.c @@ -844,6 +844,7 @@ int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe, lrc->tile = gt_to_tile(hwe->gt); lrc->ring.size = ring_size; lrc->ring.tail = 0; + lrc->ctx_timestamp = 0; xe_hw_fence_ctx_init(&lrc->fence_ctx, hwe->gt, hwe->fence_irq, hwe->name); @@ -898,6 +899,8 @@ int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe, RING_CTL_SIZE(lrc->ring.size) | RING_VALID); } + xe_lrc_write_ctx_reg(lrc, CTX_TIMESTAMP, 0); + if (xe->info.has_asid && vm) xe_lrc_write_ctx_reg(lrc, PVC_CTX_ASID, vm->usm.asid); @@ -1576,3 +1579,12 @@ void xe_lrc_snapshot_free(struct xe_lrc_snapshot *snapshot) xe_bo_put(snapshot->lrc_bo); kfree(snapshot); } + +u32 xe_lrc_update_timestamp(struct xe_lrc *lrc, u32 *old_ts) +{ + *old_ts = lrc->ctx_timestamp; + + lrc->ctx_timestamp = xe_lrc_read_ctx_reg(lrc, CTX_TIMESTAMP); + + return lrc->ctx_timestamp; +} diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h index e0e841963c23..b9da1031083b 100644 --- a/drivers/gpu/drm/xe/xe_lrc.h +++ b/drivers/gpu/drm/xe/xe_lrc.h @@ -66,4 +66,18 @@ void xe_lrc_snapshot_capture_delayed(struct xe_lrc_snapshot *snapshot); void xe_lrc_snapshot_print(struct xe_lrc_snapshot *snapshot, struct drm_printer *p); void xe_lrc_snapshot_free(struct xe_lrc_snapshot *snapshot); +/** + * xe_lrc_update_timestamp - readout LRC timestamp and update cached value + * @lrc: logical ring context for this exec queue + * @old_ts: pointer where to save the previous timestamp + * + * Read the current timestamp for this LRC and update the cached value. The + * previous cached value is also returned in @old_ts so the caller can calculate + * the delta between 2 updates. Note that this is not intended to be called from + * any place, but just by the paths updating the drm client utilization. + * + * Returns the current LRC timestamp + */ +u32 xe_lrc_update_timestamp(struct xe_lrc *lrc, u32 *old_ts); + #endif diff --git a/drivers/gpu/drm/xe/xe_lrc_types.h b/drivers/gpu/drm/xe/xe_lrc_types.h index cdbf03faef15..0fa055da6b27 100644 --- a/drivers/gpu/drm/xe/xe_lrc_types.h +++ b/drivers/gpu/drm/xe/xe_lrc_types.h @@ -45,6 +45,9 @@ struct xe_lrc { /** @fence_ctx: context for hw fence */ struct xe_hw_fence_ctx fence_ctx; + + /** @ctx_timestamp: readout value of CTX_TIMESTAMP on last update */ + u32 ctx_timestamp; }; struct xe_lrc_snapshot; -- 2.43.0
[PATCH v4 1/8] drm/xe: Promote xe_hw_engine_class_to_str()
Move it out of the sysfs compilation unit so it can be re-used in other places. Reviewed-by: Nirmoy Das Reviewed-by: Oak Zeng Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/xe_hw_engine.c | 18 ++ drivers/gpu/drm/xe/xe_hw_engine.h | 2 ++ drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c | 18 -- 3 files changed, 20 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c index 45f582a7caaa..520db5d518c3 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.c +++ b/drivers/gpu/drm/xe/xe_hw_engine.c @@ -1099,3 +1099,21 @@ bool xe_hw_engine_is_reserved(struct xe_hw_engine *hwe) return xe->info.has_usm && hwe->class == XE_ENGINE_CLASS_COPY && hwe->instance == gt->usm.reserved_bcs_instance; } + +const char *xe_hw_engine_class_to_str(enum xe_engine_class class) +{ + switch (class) { + case XE_ENGINE_CLASS_RENDER: + return "rcs"; + case XE_ENGINE_CLASS_VIDEO_DECODE: + return "vcs"; + case XE_ENGINE_CLASS_VIDEO_ENHANCE: + return "vecs"; + case XE_ENGINE_CLASS_COPY: + return "bcs"; + case XE_ENGINE_CLASS_COMPUTE: + return "ccs"; + default: + return NULL; + } +} diff --git a/drivers/gpu/drm/xe/xe_hw_engine.h b/drivers/gpu/drm/xe/xe_hw_engine.h index 71968ee2f600..843de159e47c 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.h +++ b/drivers/gpu/drm/xe/xe_hw_engine.h @@ -67,4 +67,6 @@ static inline bool xe_hw_engine_is_valid(struct xe_hw_engine *hwe) return hwe->name; } +const char *xe_hw_engine_class_to_str(enum xe_engine_class class); + #endif diff --git a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c index 844ec68cbbb8..efce6c7dd2a2 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c +++ b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c @@ -618,24 +618,6 @@ static void hw_engine_class_sysfs_fini(struct drm_device *drm, void *arg) kobject_put(kobj); } -static const char *xe_hw_engine_class_to_str(enum xe_engine_class class) -{ - switch (class) { - case XE_ENGINE_CLASS_RENDER: - return "rcs"; - case XE_ENGINE_CLASS_VIDEO_DECODE: - return "vcs"; - case XE_ENGINE_CLASS_VIDEO_ENHANCE: - return "vecs"; - case XE_ENGINE_CLASS_COPY: - return "bcs"; - case XE_ENGINE_CLASS_COMPUTE: - return "ccs"; - default: - return NULL; - } -} - /** * xe_hw_engine_class_sysfs_init - Init HW engine classes on GT. * @gt: Xe GT. -- 2.43.0
[PATCH v4 5/8] drm/xe: Add helper to accumulate exec queue runtime
From: Umesh Nerlige Ramappa Add a helper to accumulate per-client runtime of all its exec queues. This is called every time a sched job is finished. v2: - Use guc_exec_queue_free_job() and execlist_job_free() to accumulate runtime when job is finished since xe_sched_job_completed() is not a notification that job finished. - Stop trying to update runtime from xe_exec_queue_fini() - that is redundant and may happen after xef is closed, leading to a use-after-free - Do not special case the first timestamp read: the default LRC sets CTX_TIMESTAMP to zero, so even the first sample should be a valid one. - Handle the parallel submission case by multiplying the runtime by width. v3: Update comments Signed-off-by: Umesh Nerlige Ramappa Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/xe_device_types.h | 3 +++ drivers/gpu/drm/xe/xe_exec_queue.c | 37 drivers/gpu/drm/xe/xe_exec_queue.h | 1 + drivers/gpu/drm/xe/xe_execlist.c | 1 + drivers/gpu/drm/xe/xe_guc_submit.c | 2 ++ 5 files changed, 44 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h index 5c5e36de452a..bc97990fd032 100644 --- a/drivers/gpu/drm/xe/xe_device_types.h +++ b/drivers/gpu/drm/xe/xe_device_types.h @@ -555,6 +555,9 @@ struct xe_file { struct mutex lock; } exec_queue; + /** @runtime: hw engine class runtime in ticks for this drm client */ + u64 runtime[XE_ENGINE_CLASS_MAX]; + /** @client: drm client */ struct xe_drm_client *client; }; diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c index 395de93579fa..fa6dc996eca8 100644 --- a/drivers/gpu/drm/xe/xe_exec_queue.c +++ b/drivers/gpu/drm/xe/xe_exec_queue.c @@ -769,6 +769,43 @@ bool xe_exec_queue_is_idle(struct xe_exec_queue *q) q->lrc[0].fence_ctx.next_seqno - 1; } +/** + * xe_exec_queue_update_runtime() - Update runtime for this exec queue from hw + * @q: The exec queue + * + * Update the timestamp saved by HW for this exec queue and save runtime + * calculated by using the delta from last update. On multi-lrc case, only the + * first is considered. + */ +void xe_exec_queue_update_runtime(struct xe_exec_queue *q) +{ + struct xe_file *xef; + struct xe_lrc *lrc; + u32 old_ts, new_ts; + + /* +* Jobs that are run during driver load may use an exec_queue, but are +* not associated with a user xe file, so avoid accumulating busyness +* for kernel specific work. +*/ + if (!q->vm || !q->vm->xef) + return; + + xef = q->vm->xef; + + /* +* Only sample the first LRC. For parallel submission, all of them are +* scheduled together and we compensate that below by multiplying by +* width - this may introduce errors if that premise is not true and +* they don't exit 100% aligned. On the other hand, looping through +* the LRCs and reading them in different time could also introduce +* errors. +*/ + lrc = &q->lrc[0]; + new_ts = xe_lrc_update_timestamp(lrc, &old_ts); + xef->runtime[q->class] += (new_ts - old_ts) * q->width; +} + void xe_exec_queue_kill(struct xe_exec_queue *q) { struct xe_exec_queue *eq = q, *next; diff --git a/drivers/gpu/drm/xe/xe_exec_queue.h b/drivers/gpu/drm/xe/xe_exec_queue.h index 48f6da53a292..e0f07d28ee1a 100644 --- a/drivers/gpu/drm/xe/xe_exec_queue.h +++ b/drivers/gpu/drm/xe/xe_exec_queue.h @@ -75,5 +75,6 @@ struct dma_fence *xe_exec_queue_last_fence_get(struct xe_exec_queue *e, struct xe_vm *vm); void xe_exec_queue_last_fence_set(struct xe_exec_queue *e, struct xe_vm *vm, struct dma_fence *fence); +void xe_exec_queue_update_runtime(struct xe_exec_queue *q); #endif diff --git a/drivers/gpu/drm/xe/xe_execlist.c b/drivers/gpu/drm/xe/xe_execlist.c index e9dee1e14fef..bd7f27efe0e0 100644 --- a/drivers/gpu/drm/xe/xe_execlist.c +++ b/drivers/gpu/drm/xe/xe_execlist.c @@ -306,6 +306,7 @@ static void execlist_job_free(struct drm_sched_job *drm_job) { struct xe_sched_job *job = to_xe_sched_job(drm_job); + xe_exec_queue_update_runtime(job->q); xe_sched_job_put(job); } diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c index 4efb88e3e056..ad2b8067d071 100644 --- a/drivers/gpu/drm/xe/xe_guc_submit.c +++ b/drivers/gpu/drm/xe/xe_guc_submit.c @@ -749,6 +749,8 @@ static void guc_exec_queue_free_job(struct drm_sched_job *drm_job) { struct xe_sched_job *job = to_xe_sched_job(drm_job); + xe_exec_queue_update_runtime(job->q); + trace_xe_sched_job_free(job); xe_sched_job_put(job); } -- 2.43.0
[PATCH v4 2/8] drm/xe: Add XE_ENGINE_CLASS_OTHER to str conversion
XE_ENGINE_CLASS_OTHER was missing from the str conversion. Add it and remove the default handling so it's protected by -Wswitch. Currently the only user is xe_hw_engine_class_sysfs_init(), which already skips XE_ENGINE_CLASS_OTHER, so there's no change in behavior. Reviewed-by: Nirmoy Das Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/xe_hw_engine.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c index 520db5d518c3..f76bb507e1a5 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.c +++ b/drivers/gpu/drm/xe/xe_hw_engine.c @@ -,9 +,13 @@ const char *xe_hw_engine_class_to_str(enum xe_engine_class class) return "vecs"; case XE_ENGINE_CLASS_COPY: return "bcs"; + case XE_ENGINE_CLASS_OTHER: + return "other"; case XE_ENGINE_CLASS_COMPUTE: return "ccs"; - default: - return NULL; + case XE_ENGINE_CLASS_MAX: + break; } + + return NULL; } -- 2.43.0
[PATCH v4 0/8] drm/xe: Per client usage
v4 of https://lore.kernel.org/all/20240507224510.442971-1-lucas.demar...@intel.com Add per-client usage statistics to xe. This ports xe to use the common method in drm to export the usage to userspace per client (where 1 client == 1 drm fd open). However instead of using the current format measured in nsec, this creates a new one. The intention here is not to mix the GPU clock domain with the CPU clock. It allows to cover a few more use cases without extra complications. This version is tested on an ADL-P and also checked gputop with i915 to make sure not regressed. Last patch also contains the documentation for the new key and sample output as requested in v1. The pre-existent drm-cycles- is used as is, which allows gputop to work with xe. This last patch still has some open discussion from v2, so we may need to hold it a little more. v2: - Create a new drm-total-cycles instead of re-using drm-engine with a different unit - Add documentation for the new interface and clarify usage of xe_lrc_update_timestamp() v3: - Fix bugs in "drm/xe: Add helper to accumulate exec queue runtime" - see commit message - Reorder commits so the ones that are useful in other patch series come first v4: - Fix some comments and documentation - Add 2 patches so we cache on the gt the mask of engines visible to userspace and the per-class capacity. Previously we were doing this during the query, but besides not being very efficient as pointed by Tvrtko, we were also not handling correclty reserved engines and engines "hidden" by e.g. ccs_mode. So move that part to be executed on init and when changing the available engines. - Simplify the fdinfo output loop since now we have the information cached on gt. This also moves the read of the gpu timestamp out of the loop as suggested by Tvrtko and using the helpers implemented in the new patches. Lucas De Marchi (6): drm/xe: Promote xe_hw_engine_class_to_str() drm/xe: Add XE_ENGINE_CLASS_OTHER to str conversion drm/xe: Add helper to capture engine timestamp drm/xe: Cache data about user-visible engines drm/xe: Add helper to return any available hw engine drm/xe/client: Print runtime to fdinfo Umesh Nerlige Ramappa (2): drm/xe/lrc: Add helper to capture context timestamp drm/xe: Add helper to accumulate exec queue runtime Documentation/gpu/drm-usage-stats.rst | 21 ++- Documentation/gpu/xe/index.rst| 1 + Documentation/gpu/xe/xe-drm-usage-stats.rst | 10 ++ drivers/gpu/drm/xe/regs/xe_lrc_layout.h | 1 + drivers/gpu/drm/xe/xe_device_types.h | 3 + drivers/gpu/drm/xe/xe_drm_client.c| 121 +- drivers/gpu/drm/xe/xe_exec_queue.c| 37 ++ drivers/gpu/drm/xe/xe_exec_queue.h| 1 + drivers/gpu/drm/xe/xe_execlist.c | 1 + drivers/gpu/drm/xe/xe_gt.c| 34 + drivers/gpu/drm/xe/xe_gt.h| 20 +++ drivers/gpu/drm/xe/xe_gt_ccs_mode.c | 1 + drivers/gpu/drm/xe/xe_gt_types.h | 21 ++- drivers/gpu/drm/xe/xe_guc_submit.c| 2 + drivers/gpu/drm/xe/xe_hw_engine.c | 27 drivers/gpu/drm/xe/xe_hw_engine.h | 3 + drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c | 18 --- drivers/gpu/drm/xe/xe_lrc.c | 12 ++ drivers/gpu/drm/xe/xe_lrc.h | 14 ++ drivers/gpu/drm/xe/xe_lrc_types.h | 3 + 20 files changed, 329 insertions(+), 22 deletions(-) create mode 100644 Documentation/gpu/xe/xe-drm-usage-stats.rst -- 2.43.0
[PULL] drm-xe-fixes
Hi Dave and Sima, Please pull the drm-xe-fixes for this week targeting v6.9. thanks Lucas De Marchi drm-xe-fixes-2024-05-09: - Fix use zero-length element array - Move more from system wq to ordered private wq - Do not ignore return for drmm_mutex_init The following changes since commit dd5a440a31fae6e459c0d627162825505361: Linux 6.9-rc7 (2024-05-05 14:06:01 -0700) are available in the Git repository at: https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-fixes-2024-05-09 for you to fetch changes up to c002bfe644a29ba600c571f2abba13a155a12dcd: drm/xe: Use ordered WQ for G2H handler (2024-05-09 09:41:27 -0500) - Fix use zero-length element array - Move more from system wq to ordered private wq - Do not ignore return for drmm_mutex_init Daniele Ceraolo Spurio (1): drm/xe/guc: Check error code when initializing the CT mutex Lucas De Marchi (1): drm/xe/ads: Use flexible-array Matthew Brost (1): drm/xe: Use ordered WQ for G2H handler drivers/gpu/drm/xe/xe_guc_ads.c | 2 +- drivers/gpu/drm/xe/xe_guc_ct.c | 10 +- drivers/gpu/drm/xe/xe_guc_ct.h | 2 +- drivers/gpu/drm/xe/xe_guc_ct_types.h | 2 ++ 4 files changed, 13 insertions(+), 3 deletions(-)
Re: [PATCH] drm/xe: Fix UBSAN shift-out-of-bounds failure
On Tue, May 07, 2024 at 01:04:11PM GMT, Shuicheng Lin wrote: Here is the failure stack: [ 12.988209] [ cut here ] [ 12.988216] UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13 [ 12.988232] shift exponent 64 is too large for 64-bit type 'long unsigned int' [ 12.988235] CPU: 4 PID: 1310 Comm: gnome-shell Tainted: G U 6.9.0-rc6+prerelease1158+ #19 [ 12.988237] Hardware name: Intel Corporation Raptor Lake Client Platform/RPL-S ADP-S DDR5 UDIMM CRB, BIOS RPLSFWI1.R00.3301.A02.2208050712 08/05/2022 [ 12.988239] Call Trace: [ 12.988240] [ 12.988242] dump_stack_lvl+0xd7/0xf0 [ 12.988248] dump_stack+0x10/0x20 [ 12.988250] ubsan_epilogue+0x9/0x40 [ 12.988253] __ubsan_handle_shift_out_of_bounds+0x10e/0x170 [ 12.988260] dma_resv_reserve_fences.cold+0x2b/0x48 [ 12.988262] ? ww_mutex_lock_interruptible+0x3c/0x110 [ 12.988267] drm_exec_prepare_obj+0x45/0x60 [drm_exec] [ 12.988271] ? vm_bind_ioctl_ops_execute+0x5b/0x740 [xe] [ 12.988345] vm_bind_ioctl_ops_execute+0x78/0x740 [xe] It is caused by the value 0 of parameter num_fences in function drm_exec_prepare_obj. And lead to in function __rounddown_pow_of_two, "0 - 1" causes the shift-out-of-bounds. ok For num_fences == 0 case, drm_exec_prepare_obj is the same as drm_exec_lock_obj in function, so call drm_exec_lock_obj instead to solve it. this is not true and per discussion in this thread it's not going to change. drm_exec_prepare_obj() should not be called with num_fences == 0. So I'd reworded with something like below so we have all breadcrumbs for anyone trying to understand the changes later: By design drm_exec_prepare_obj() should be called only when there are fences to be reserved. If num_fences is 0, calling drm_exec_lock_obj() is sufficient as was done in commit 9377de4cb3e8 ("drm/xe/vm: Avoid reserving zero fences") Cc: Nirmoy Das Cc: Matthew Brost Signed-off-by: Shuicheng Lin with the reword, Reviewed-by: Lucas De Marchi And also add: Link: https://lore.kernel.org/all/24d4a9a9-c622-4f56-8672-21f4c6785...@amd.com Could you also submit a patch to add the warning like mentioned by Christian? thanks Lucas De Marchi --- drivers/gpu/drm/xe/xe_vm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c index d17192c8b7de..c5b1694b292f 100644 --- a/drivers/gpu/drm/xe/xe_vm.c +++ b/drivers/gpu/drm/xe/xe_vm.c @@ -2692,7 +2692,7 @@ static int vma_lock_and_validate(struct drm_exec *exec, struct xe_vma *vma, if (bo) { if (!bo->vm) - err = drm_exec_prepare_obj(exec, &bo->ttm.base, 0); + err = drm_exec_lock_obj(exec, &bo->ttm.base); if (!err && validate) err = xe_bo_validate(bo, xe_vma_vm(vma), true); } @@ -2777,7 +2777,7 @@ static int vm_bind_ioctl_ops_lock_and_prep(struct drm_exec *exec, struct xe_vma_op *op; int err; - err = drm_exec_prepare_obj(exec, xe_vm_obj(vm), 0); + err = drm_exec_lock_obj(exec, xe_vm_obj(vm)); if (err) return err; -- 2.25.1
Re: [PATCH v2 6/6] drm/xe/client: Print runtime to fdinfo
On Wed, May 08, 2024 at 09:23:17AM GMT, Tvrtko Ursulin wrote: On 07/05/2024 22:35, Lucas De Marchi wrote: On Fri, Apr 26, 2024 at 11:47:37AM GMT, Tvrtko Ursulin wrote: On 24/04/2024 00:56, Lucas De Marchi wrote: Print the accumulated runtime for client when printing fdinfo. Each time a query is done it first does 2 things: 1) loop through all the exec queues for the current client and accumulate the runtime, per engine class. CTX_TIMESTAMP is used for that, being read from the context image. 2) Read a "GPU timestamp" that can be used for considering "how much GPU time has passed" and that has the same unit/refclock as the one recording the runtime. RING_TIMESTAMP is used for that via MMIO. Since for all current platforms RING_TIMESTAMP follows the same refclock, just read it once, using any first engine. This is exported to userspace as 2 numbers in fdinfo: drm-cycles-: drm-total-cycles-: Userspace is expected to collect at least 2 samples, which allows to know the client engine busyness as per: RUNTIME1 - RUNTIME0 busyness = - T1 - T0 Another thing to point out is that it's expected that userspace will read any 2 samples every few seconds. Given the update frequency of the counters involved and that CTX_TIMESTAMP is 32-bits, the counter for each exec_queue can wrap around (assuming 100% utilization) after ~200s. The wraparound is not perceived by userspace since it's just accumulated for all the exec_queues in a 64-bit counter), but the measurement will not be accurate if the samples are too far apart. This could be mitigated by adding a workqueue to accumulate the counters every so often, but it's additional complexity for something that is done already by userspace every few seconds in tools like gputop (from igt), htop, nvtop, etc with none of them really defaulting to 1 sample per minute or more. Signed-off-by: Lucas De Marchi --- Documentation/gpu/drm-usage-stats.rst | 16 ++- Documentation/gpu/xe/index.rst | 1 + Documentation/gpu/xe/xe-drm-usage-stats.rst | 10 ++ drivers/gpu/drm/xe/xe_drm_client.c | 138 +++- 4 files changed, 162 insertions(+), 3 deletions(-) create mode 100644 Documentation/gpu/xe/xe-drm-usage-stats.rst diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst index 6dc299343b48..421766289b78 100644 --- a/Documentation/gpu/drm-usage-stats.rst +++ b/Documentation/gpu/drm-usage-stats.rst @@ -112,6 +112,17 @@ larger value within a reasonable period. Upon observing a value lower than what was previously read, userspace is expected to stay with that larger previous value until a monotonic update is seen. +- drm-total-cycles-: + +Engine identifier string must be the same as the one specified in the +drm-cycles- tag and shall contain the total number cycles for the given +engine. + +This is a timestamp in GPU unspecified unit that matches the update rate +of drm-cycles-. For drivers that implement this interface, the engine +utilization can be calculated entirely on the GPU clock domain, without +considering the CPU sleep time between 2 samples. Two opens. 1) Do we need to explicity document that drm-total-cycles and drm-maxfreq are mutually exclusive? so userspace has a fallback mechanism to calculate utilization depending on what keys are available? No, to document all three at once do not make sense. Or at least are not expected. Or you envisage someone might legitimately emit all three? I don't see what would be the semantics. When we have cycles+maxfreq the latter is in Hz. And when we have cycles+total then it is unitless. All three? I don't follow what you mean here. *cycles* is actually a unit. The engine spent 10 cycles running this context (drm-cycles). In the same period there were 100 cycles available (drm-total-cycles). Current frequency is X MHz. Max frequency is Y MHz. For me all of them make sense if one wants to mix them together. For xe it doesn't make sense because the counter backing drm-cycles and drm-total-cycles is unrelated to the engine frequency. I can add something in the doc that we do not expected to see all of them together until we see a usecase. Each driver may implement a subset. 2) Should drm-total-cycles for now be documents as driver specific? you mean to call it xe-total-cycles? Yes but it is not an ask, just an open. Ok, my opinion is that we shouldn't. Just like we have drm-cycles today implemented by some drivers, but not all. I'd consider the drm-curfreq, not documented in the drm layer as something to be fixed or migrated to a driver-only interface (probably not possible anymore as it'd break the uapi). Problem I see with turning it into xe-total-cycles, is that the moment another driver decide to implement they will either have to use xe- prefix or xe will need to st
Re: [PATCH v3 5/6] drm/xe: Add helper to accumulate exec queue runtime
On Wed, May 08, 2024 at 09:34:59AM GMT, Tvrtko Ursulin wrote: On 07/05/2024 23:45, Lucas De Marchi wrote: From: Umesh Nerlige Ramappa Add a helper to accumulate per-client runtime of all its exec queues. This is called every time a sched job is finished. v2: - Use guc_exec_queue_free_job() and execlist_job_free() to accumulate runtime when job is finished since xe_sched_job_completed() is not a notification that job finished. - Stop trying to update runtime from xe_exec_queue_fini() - that is redundant and may happen after xef is closed, leading to a use-after-free - Do not special case the first timestamp read: the default LRC sets CTX_TIMESTAMP to zero, so even the first sample should be a valid one. - Handle the parallel submission case by multiplying the runtime by width. Signed-off-by: Umesh Nerlige Ramappa Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/xe_device_types.h | 9 +++ drivers/gpu/drm/xe/xe_exec_queue.c | 35 drivers/gpu/drm/xe/xe_exec_queue.h | 1 + drivers/gpu/drm/xe/xe_execlist.c | 1 + drivers/gpu/drm/xe/xe_guc_submit.c | 2 ++ 5 files changed, 48 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h index 906b98fb973b..de078bdf0ab9 100644 --- a/drivers/gpu/drm/xe/xe_device_types.h +++ b/drivers/gpu/drm/xe/xe_device_types.h @@ -560,6 +560,15 @@ struct xe_file { struct mutex lock; } exec_queue; + /** +* @runtime: hw engine class runtime in ticks for this drm client +* +* Only stats from xe_exec_queue->lrc[0] are accumulated. For multi-lrc +* case, since all jobs run in parallel on the engines, only the stats +* from lrc[0] are sufficient. +*/ + u64 runtime[XE_ENGINE_CLASS_MAX]; + /** @client: drm client */ struct xe_drm_client *client; }; diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c index 395de93579fa..86eb22e22c95 100644 --- a/drivers/gpu/drm/xe/xe_exec_queue.c +++ b/drivers/gpu/drm/xe/xe_exec_queue.c @@ -769,6 +769,41 @@ bool xe_exec_queue_is_idle(struct xe_exec_queue *q) q->lrc[0].fence_ctx.next_seqno - 1; } +/** + * xe_exec_queue_update_runtime() - Update runtime for this exec queue from hw + * @q: The exec queue + * + * Update the timestamp saved by HW for this exec queue and save runtime + * calculated by using the delta from last update. On multi-lrc case, only the + * first is considered. + */ +void xe_exec_queue_update_runtime(struct xe_exec_queue *q) +{ + struct xe_file *xef; + struct xe_lrc *lrc; + u32 old_ts, new_ts; + + /* +* Jobs that are run during driver load may use an exec_queue, but are +* not associated with a user xe file, so avoid accumulating busyness +* for kernel specific work. +*/ + if (!q->vm || !q->vm->xef) + return; + + xef = q->vm->xef; + + /* +* Only sample the first LRC. For parallel submission, all of them are +* scheduled together and we compensate that below by multiplying by +* width +*/ + lrc = &q->lrc[0]; + + new_ts = xe_lrc_update_timestamp(lrc, &old_ts); + xef->runtime[q->class] += (new_ts - old_ts) * q->width; I think in theory this could be introducing a systematic error depending on how firmware handles things and tick resolution. Or even regardless of the firmware, if the timestamps are saved on context exit by the GPU hw itself and parallel contexts do not exit 100% aligned. Undershoot would be I think fine, but systematic overshoot under constant 100% parallel load from mutlitple client could constantly show >100% class utilisation. Probably not a concern in practice but worthy a comment? Ok... just extend the comment above? I could have looped through the LRCs to read the timestamp, but I didn't like reading them in slightly different times, which could give slight different results if they are in a running state. This way here it doesn't have this problem at the expense of "we have to trust the hw/firmware". We can switch to the loop eventually if our level of trust decreases after getting more data/test :) thanks Lucas De Marchi Regards, Tvrtko +} + void xe_exec_queue_kill(struct xe_exec_queue *q) { struct xe_exec_queue *eq = q, *next; diff --git a/drivers/gpu/drm/xe/xe_exec_queue.h b/drivers/gpu/drm/xe/xe_exec_queue.h index 48f6da53a292..e0f07d28ee1a 100644 --- a/drivers/gpu/drm/xe/xe_exec_queue.h +++ b/drivers/gpu/drm/xe/xe_exec_queue.h @@ -75,5 +75,6 @@ struct dma_fence *xe_exec_queue_last_fence_get(struct xe_exec_queue *e, struct xe_vm *vm); void xe_exec_queue_last_fence_set(struct xe_exec_queue *e, struct xe_vm *vm, str
Re: [PATCH v3 5/6] drm/xe: Add helper to accumulate exec queue runtime
On Tue, May 07, 2024 at 04:45:36PM GMT, Umesh Nerlige Ramappa wrote: On Tue, May 07, 2024 at 03:45:09PM -0700, Lucas De Marchi wrote: From: Umesh Nerlige Ramappa Add a helper to accumulate per-client runtime of all its exec queues. This is called every time a sched job is finished. v2: - Use guc_exec_queue_free_job() and execlist_job_free() to accumulate runtime when job is finished since xe_sched_job_completed() is not a notification that job finished. - Stop trying to update runtime from xe_exec_queue_fini() - that is redundant and may happen after xef is closed, leading to a use-after-free - Do not special case the first timestamp read: the default LRC sets CTX_TIMESTAMP to zero, so even the first sample should be a valid one. - Handle the parallel submission case by multiplying the runtime by width. Signed-off-by: Umesh Nerlige Ramappa Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/xe_device_types.h | 9 +++ drivers/gpu/drm/xe/xe_exec_queue.c | 35 drivers/gpu/drm/xe/xe_exec_queue.h | 1 + drivers/gpu/drm/xe/xe_execlist.c | 1 + drivers/gpu/drm/xe/xe_guc_submit.c | 2 ++ 5 files changed, 48 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h index 906b98fb973b..de078bdf0ab9 100644 --- a/drivers/gpu/drm/xe/xe_device_types.h +++ b/drivers/gpu/drm/xe/xe_device_types.h @@ -560,6 +560,15 @@ struct xe_file { struct mutex lock; } exec_queue; + /** +* @runtime: hw engine class runtime in ticks for this drm client +* +* Only stats from xe_exec_queue->lrc[0] are accumulated. For multi-lrc +* case, since all jobs run in parallel on the engines, only the stats +* from lrc[0] are sufficient. Maybe we can drop the above comment altogether after the multi-lrc update. right... I removed it from the middle of the function and this was a leftover. I will remove it on next version. thanks Lucas De Marchi Umesh
[PATCH v3 6/6] drm/xe/client: Print runtime to fdinfo
Print the accumulated runtime for client when printing fdinfo. Each time a query is done it first does 2 things: 1) loop through all the exec queues for the current client and accumulate the runtime, per engine class. CTX_TIMESTAMP is used for that, being read from the context image. 2) Read a "GPU timestamp" that can be used for considering "how much GPU time has passed" and that has the same unit/refclock as the one recording the runtime. RING_TIMESTAMP is used for that via MMIO. Since for all current platforms RING_TIMESTAMP follows the same refclock, just read it once, using any first engine. This is exported to userspace as 2 numbers in fdinfo: drm-cycles-: drm-total-cycles-: Userspace is expected to collect at least 2 samples, which allows to know the client engine busyness as per: RUNTIME1 - RUNTIME0 busyness = - T1 - T0 Another thing to point out is that it's expected that userspace will read any 2 samples every few seconds. Given the update frequency of the counters involved and that CTX_TIMESTAMP is 32-bits, the counter for each exec_queue can wrap around (assuming 100% utilization) after ~200s. The wraparound is not perceived by userspace since it's just accumulated for all the exec_queues in a 64-bit counter), but the measurement will not be accurate if the samples are too far apart. This could be mitigated by adding a workqueue to accumulate the counters every so often, but it's additional complexity for something that is done already by userspace every few seconds in tools like gputop (from igt), htop, nvtop, etc with none of them really defaulting to 1 sample per minute or more. Signed-off-by: Lucas De Marchi --- Documentation/gpu/drm-usage-stats.rst | 16 ++- Documentation/gpu/xe/index.rst | 1 + Documentation/gpu/xe/xe-drm-usage-stats.rst | 10 ++ drivers/gpu/drm/xe/xe_drm_client.c | 136 +++- 4 files changed, 160 insertions(+), 3 deletions(-) create mode 100644 Documentation/gpu/xe/xe-drm-usage-stats.rst diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst index 6dc299343b48..421766289b78 100644 --- a/Documentation/gpu/drm-usage-stats.rst +++ b/Documentation/gpu/drm-usage-stats.rst @@ -112,6 +112,17 @@ larger value within a reasonable period. Upon observing a value lower than what was previously read, userspace is expected to stay with that larger previous value until a monotonic update is seen. +- drm-total-cycles-: + +Engine identifier string must be the same as the one specified in the +drm-cycles- tag and shall contain the total number cycles for the given +engine. + +This is a timestamp in GPU unspecified unit that matches the update rate +of drm-cycles-. For drivers that implement this interface, the engine +utilization can be calculated entirely on the GPU clock domain, without +considering the CPU sleep time between 2 samples. + - drm-maxfreq-: [Hz|MHz|KHz] Engine identifier string must be the same as the one specified in the @@ -168,5 +179,6 @@ be documented above and where possible, aligned with other drivers. Driver specific implementations --- -:ref:`i915-usage-stats` -:ref:`panfrost-usage-stats` +* :ref:`i915-usage-stats` +* :ref:`panfrost-usage-stats` +* :ref:`xe-usage-stats` diff --git a/Documentation/gpu/xe/index.rst b/Documentation/gpu/xe/index.rst index c224ecaee81e..3f07aa3b5432 100644 --- a/Documentation/gpu/xe/index.rst +++ b/Documentation/gpu/xe/index.rst @@ -23,3 +23,4 @@ DG2, etc is provided to prototype the driver. xe_firmware xe_tile xe_debugging + xe-drm-usage-stats.rst diff --git a/Documentation/gpu/xe/xe-drm-usage-stats.rst b/Documentation/gpu/xe/xe-drm-usage-stats.rst new file mode 100644 index ..482d503ae68a --- /dev/null +++ b/Documentation/gpu/xe/xe-drm-usage-stats.rst @@ -0,0 +1,10 @@ +.. SPDX-License-Identifier: GPL-2.0+ + +.. _xe-usage-stats: + + +Xe DRM client usage stats implementation + + +.. kernel-doc:: drivers/gpu/drm/xe/xe_drm_client.c + :doc: DRM Client usage stats diff --git a/drivers/gpu/drm/xe/xe_drm_client.c b/drivers/gpu/drm/xe/xe_drm_client.c index 08f0b7c95901..27494839d586 100644 --- a/drivers/gpu/drm/xe/xe_drm_client.c +++ b/drivers/gpu/drm/xe/xe_drm_client.c @@ -2,6 +2,7 @@ /* * Copyright © 2023 Intel Corporation */ +#include "xe_drm_client.h" #include #include @@ -12,9 +13,65 @@ #include "xe_bo.h" #include "xe_bo_types.h" #include "xe_device_types.h" -#include "xe_drm_client.h" +#include "xe_exec_queue.h" +#include "xe_gt.h" +#include "xe_hw_engine.h" +#include "xe_pm.h" #include "xe_trace.h" +/** + * DOC: DRM Client usag
[PATCH v3 3/6] drm/xe/lrc: Add helper to capture context timestamp
From: Umesh Nerlige Ramappa Add a helper to capture CTX_TIMESTAMP from the context image so it can be used to calculate the runtime. v2: Add kernel-doc to clarify expectation from caller Signed-off-by: Umesh Nerlige Ramappa Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/regs/xe_lrc_layout.h | 1 + drivers/gpu/drm/xe/xe_lrc.c | 11 +++ drivers/gpu/drm/xe/xe_lrc.h | 14 ++ drivers/gpu/drm/xe/xe_lrc_types.h | 3 +++ 4 files changed, 29 insertions(+) diff --git a/drivers/gpu/drm/xe/regs/xe_lrc_layout.h b/drivers/gpu/drm/xe/regs/xe_lrc_layout.h index 1825d8f79db6..8780e6c6b649 100644 --- a/drivers/gpu/drm/xe/regs/xe_lrc_layout.h +++ b/drivers/gpu/drm/xe/regs/xe_lrc_layout.h @@ -11,6 +11,7 @@ #define CTX_RING_TAIL (0x06 + 1) #define CTX_RING_START (0x08 + 1) #define CTX_RING_CTL (0x0a + 1) +#define CTX_TIMESTAMP (0x22 + 1) #define CTX_PDP0_UDW (0x30 + 1) #define CTX_PDP0_LDW (0x32 + 1) diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c index 2066d34ddf0b..8c540f2f43e3 100644 --- a/drivers/gpu/drm/xe/xe_lrc.c +++ b/drivers/gpu/drm/xe/xe_lrc.c @@ -751,6 +751,7 @@ int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe, lrc->tile = gt_to_tile(hwe->gt); lrc->ring.size = ring_size; lrc->ring.tail = 0; + lrc->ctx_timestamp = 0; xe_hw_fence_ctx_init(&lrc->fence_ctx, hwe->gt, hwe->fence_irq, hwe->name); @@ -786,6 +787,7 @@ int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe, xe_drm_client_add_bo(vm->xef->client, lrc->bo); } + xe_lrc_write_ctx_reg(lrc, CTX_TIMESTAMP, 0); xe_lrc_write_ctx_reg(lrc, CTX_RING_START, __xe_lrc_ring_ggtt_addr(lrc)); xe_lrc_write_ctx_reg(lrc, CTX_RING_HEAD, 0); xe_lrc_write_ctx_reg(lrc, CTX_RING_TAIL, lrc->ring.tail); @@ -1444,3 +1446,12 @@ void xe_lrc_snapshot_free(struct xe_lrc_snapshot *snapshot) xe_bo_put(snapshot->lrc_bo); kfree(snapshot); } + +u32 xe_lrc_update_timestamp(struct xe_lrc *lrc, u32 *old_ts) +{ + *old_ts = lrc->ctx_timestamp; + + lrc->ctx_timestamp = xe_lrc_read_ctx_reg(lrc, CTX_TIMESTAMP); + + return lrc->ctx_timestamp; +} diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h index d32fa31faa2c..dcbc6edd80da 100644 --- a/drivers/gpu/drm/xe/xe_lrc.h +++ b/drivers/gpu/drm/xe/xe_lrc.h @@ -60,4 +60,18 @@ void xe_lrc_snapshot_capture_delayed(struct xe_lrc_snapshot *snapshot); void xe_lrc_snapshot_print(struct xe_lrc_snapshot *snapshot, struct drm_printer *p); void xe_lrc_snapshot_free(struct xe_lrc_snapshot *snapshot); +/** + * xe_lrc_update_timestamp - readout LRC timestamp and update cached value + * @lrc: logical ring context for this exec queue + * @old_ts: pointer where to save the previous timestamp + * + * Read the current timestamp for this LRC and update the cached value. The + * previous cached value is also returned in @old_ts so the caller can calculate + * the delta between 2 updates. Note that this is not intended to be called from + * any place, but just by the paths updating the drm client utilization. + * + * Returns the current LRC timestamp + */ +u32 xe_lrc_update_timestamp(struct xe_lrc *lrc, u32 *old_ts); + #endif diff --git a/drivers/gpu/drm/xe/xe_lrc_types.h b/drivers/gpu/drm/xe/xe_lrc_types.h index b716df0dfb4e..5765d771b901 100644 --- a/drivers/gpu/drm/xe/xe_lrc_types.h +++ b/drivers/gpu/drm/xe/xe_lrc_types.h @@ -41,6 +41,9 @@ struct xe_lrc { /** @fence_ctx: context for hw fence */ struct xe_hw_fence_ctx fence_ctx; + + /** @ctx_timestamp: readout value of CTX_TIMESTAMP on last update */ + u32 ctx_timestamp; }; struct xe_lrc_snapshot; -- 2.43.0
[PATCH v3 0/6] drm/xe: Per client usage
Add per-client usage statistics to xe. This ports xe to use the common method in drm to export the usage to userspace per client (where 1 client == 1 drm fd open). However instead of using the current format measured in nsec, this creates a new one. The intention here is not to mix the GPU clock domain with the CPU clock. It allows to cover a few more use cases without extra complications. I tested this on DG2 and also checked gputop with i915 to make sure not regressed. Last patch also contains the documentation for the new key and sample output as requested in v1. Reproducing it partially here: - drm-total-cycles-: Engine identifier string must be the same as the one specified in the drm-cycles- tag and shall contain the total number cycles for the given engine. This is a timestamp in GPU unspecified unit that matches the update rate of drm-cycles-. For drivers that implement this interface, the engine utilization can be calculated entirely on the GPU clock domain, without considering the CPU sleep time between 2 samples. The pre-existent drm-cycles- is used as is, which allows gputop to work with xe. This last patch still has some open discussion from v2, so we may need to hold it a little more. Test-with: 20240504064643.25863-1-lucas.demar...@intel.com v2: - Create a new drm-total-cycles instead of re-using drm-engine with a different unit - Add documentation for the new interface and clarify usage of xe_lrc_update_timestamp() v3: - Fix bugs in "drm/xe: Add helper to accumulate exec queue runtime" - see commit message - Reorder commits so the ones that are useful in other patch series come first Lucas De Marchi (4): drm/xe: Promote xe_hw_engine_class_to_str() drm/xe: Add XE_ENGINE_CLASS_OTHER to str conversion drm/xe: Add helper to capture engine timestamp drm/xe/client: Print runtime to fdinfo Umesh Nerlige Ramappa (2): drm/xe/lrc: Add helper to capture context timestamp drm/xe: Add helper to accumulate exec queue runtime Documentation/gpu/drm-usage-stats.rst | 16 ++- Documentation/gpu/xe/index.rst| 1 + Documentation/gpu/xe/xe-drm-usage-stats.rst | 10 ++ drivers/gpu/drm/xe/regs/xe_lrc_layout.h | 1 + drivers/gpu/drm/xe/xe_device_types.h | 9 ++ drivers/gpu/drm/xe/xe_drm_client.c| 136 +- drivers/gpu/drm/xe/xe_exec_queue.c| 35 + drivers/gpu/drm/xe/xe_exec_queue.h| 1 + drivers/gpu/drm/xe/xe_execlist.c | 1 + drivers/gpu/drm/xe/xe_guc_submit.c| 2 + drivers/gpu/drm/xe/xe_hw_engine.c | 27 drivers/gpu/drm/xe/xe_hw_engine.h | 3 + drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c | 18 --- drivers/gpu/drm/xe/xe_lrc.c | 11 ++ drivers/gpu/drm/xe/xe_lrc.h | 14 ++ drivers/gpu/drm/xe/xe_lrc_types.h | 3 + 16 files changed, 267 insertions(+), 21 deletions(-) create mode 100644 Documentation/gpu/xe/xe-drm-usage-stats.rst -- 2.43.0
[PATCH v3 4/6] drm/xe: Add helper to capture engine timestamp
Just like CTX_TIMESTAMP is used to calculate runtime, add a helper to get the timestamp for the engine so it can be used to calculate the "engine time" with the same unit as the runtime is recorded. Reviewed-by: Umesh Nerlige Ramappa Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/xe_hw_engine.c | 5 + drivers/gpu/drm/xe/xe_hw_engine.h | 1 + 2 files changed, 6 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c index de30b74bf253..734f43a35b48 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.c +++ b/drivers/gpu/drm/xe/xe_hw_engine.c @@ -1110,3 +1110,8 @@ const char *xe_hw_engine_class_to_str(enum xe_engine_class class) return NULL; } + +u64 xe_hw_engine_read_timestamp(struct xe_hw_engine *hwe) +{ + return xe_mmio_read64_2x32(hwe->gt, RING_TIMESTAMP(hwe->mmio_base)); +} diff --git a/drivers/gpu/drm/xe/xe_hw_engine.h b/drivers/gpu/drm/xe/xe_hw_engine.h index 843de159e47c..7f2d27c0ba1a 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.h +++ b/drivers/gpu/drm/xe/xe_hw_engine.h @@ -68,5 +68,6 @@ static inline bool xe_hw_engine_is_valid(struct xe_hw_engine *hwe) } const char *xe_hw_engine_class_to_str(enum xe_engine_class class); +u64 xe_hw_engine_read_timestamp(struct xe_hw_engine *hwe); #endif -- 2.43.0
[PATCH v3 1/6] drm/xe: Promote xe_hw_engine_class_to_str()
Move it out of the sysfs compilation unit so it can be re-used in other places. Reviewed-by: Nirmoy Das Reviewed-by: Oak Zeng Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/xe_hw_engine.c | 18 ++ drivers/gpu/drm/xe/xe_hw_engine.h | 2 ++ drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c | 18 -- 3 files changed, 20 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c index ec69803152a2..85712650be22 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.c +++ b/drivers/gpu/drm/xe/xe_hw_engine.c @@ -1088,3 +1088,21 @@ bool xe_hw_engine_is_reserved(struct xe_hw_engine *hwe) return xe->info.has_usm && hwe->class == XE_ENGINE_CLASS_COPY && hwe->instance == gt->usm.reserved_bcs_instance; } + +const char *xe_hw_engine_class_to_str(enum xe_engine_class class) +{ + switch (class) { + case XE_ENGINE_CLASS_RENDER: + return "rcs"; + case XE_ENGINE_CLASS_VIDEO_DECODE: + return "vcs"; + case XE_ENGINE_CLASS_VIDEO_ENHANCE: + return "vecs"; + case XE_ENGINE_CLASS_COPY: + return "bcs"; + case XE_ENGINE_CLASS_COMPUTE: + return "ccs"; + default: + return NULL; + } +} diff --git a/drivers/gpu/drm/xe/xe_hw_engine.h b/drivers/gpu/drm/xe/xe_hw_engine.h index 71968ee2f600..843de159e47c 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.h +++ b/drivers/gpu/drm/xe/xe_hw_engine.h @@ -67,4 +67,6 @@ static inline bool xe_hw_engine_is_valid(struct xe_hw_engine *hwe) return hwe->name; } +const char *xe_hw_engine_class_to_str(enum xe_engine_class class); + #endif diff --git a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c index 844ec68cbbb8..efce6c7dd2a2 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c +++ b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c @@ -618,24 +618,6 @@ static void hw_engine_class_sysfs_fini(struct drm_device *drm, void *arg) kobject_put(kobj); } -static const char *xe_hw_engine_class_to_str(enum xe_engine_class class) -{ - switch (class) { - case XE_ENGINE_CLASS_RENDER: - return "rcs"; - case XE_ENGINE_CLASS_VIDEO_DECODE: - return "vcs"; - case XE_ENGINE_CLASS_VIDEO_ENHANCE: - return "vecs"; - case XE_ENGINE_CLASS_COPY: - return "bcs"; - case XE_ENGINE_CLASS_COMPUTE: - return "ccs"; - default: - return NULL; - } -} - /** * xe_hw_engine_class_sysfs_init - Init HW engine classes on GT. * @gt: Xe GT. -- 2.43.0
[PATCH v3 5/6] drm/xe: Add helper to accumulate exec queue runtime
From: Umesh Nerlige Ramappa Add a helper to accumulate per-client runtime of all its exec queues. This is called every time a sched job is finished. v2: - Use guc_exec_queue_free_job() and execlist_job_free() to accumulate runtime when job is finished since xe_sched_job_completed() is not a notification that job finished. - Stop trying to update runtime from xe_exec_queue_fini() - that is redundant and may happen after xef is closed, leading to a use-after-free - Do not special case the first timestamp read: the default LRC sets CTX_TIMESTAMP to zero, so even the first sample should be a valid one. - Handle the parallel submission case by multiplying the runtime by width. Signed-off-by: Umesh Nerlige Ramappa Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/xe_device_types.h | 9 +++ drivers/gpu/drm/xe/xe_exec_queue.c | 35 drivers/gpu/drm/xe/xe_exec_queue.h | 1 + drivers/gpu/drm/xe/xe_execlist.c | 1 + drivers/gpu/drm/xe/xe_guc_submit.c | 2 ++ 5 files changed, 48 insertions(+) diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h index 906b98fb973b..de078bdf0ab9 100644 --- a/drivers/gpu/drm/xe/xe_device_types.h +++ b/drivers/gpu/drm/xe/xe_device_types.h @@ -560,6 +560,15 @@ struct xe_file { struct mutex lock; } exec_queue; + /** +* @runtime: hw engine class runtime in ticks for this drm client +* +* Only stats from xe_exec_queue->lrc[0] are accumulated. For multi-lrc +* case, since all jobs run in parallel on the engines, only the stats +* from lrc[0] are sufficient. +*/ + u64 runtime[XE_ENGINE_CLASS_MAX]; + /** @client: drm client */ struct xe_drm_client *client; }; diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c index 395de93579fa..86eb22e22c95 100644 --- a/drivers/gpu/drm/xe/xe_exec_queue.c +++ b/drivers/gpu/drm/xe/xe_exec_queue.c @@ -769,6 +769,41 @@ bool xe_exec_queue_is_idle(struct xe_exec_queue *q) q->lrc[0].fence_ctx.next_seqno - 1; } +/** + * xe_exec_queue_update_runtime() - Update runtime for this exec queue from hw + * @q: The exec queue + * + * Update the timestamp saved by HW for this exec queue and save runtime + * calculated by using the delta from last update. On multi-lrc case, only the + * first is considered. + */ +void xe_exec_queue_update_runtime(struct xe_exec_queue *q) +{ + struct xe_file *xef; + struct xe_lrc *lrc; + u32 old_ts, new_ts; + + /* +* Jobs that are run during driver load may use an exec_queue, but are +* not associated with a user xe file, so avoid accumulating busyness +* for kernel specific work. +*/ + if (!q->vm || !q->vm->xef) + return; + + xef = q->vm->xef; + + /* +* Only sample the first LRC. For parallel submission, all of them are +* scheduled together and we compensate that below by multiplying by +* width +*/ + lrc = &q->lrc[0]; + + new_ts = xe_lrc_update_timestamp(lrc, &old_ts); + xef->runtime[q->class] += (new_ts - old_ts) * q->width; +} + void xe_exec_queue_kill(struct xe_exec_queue *q) { struct xe_exec_queue *eq = q, *next; diff --git a/drivers/gpu/drm/xe/xe_exec_queue.h b/drivers/gpu/drm/xe/xe_exec_queue.h index 48f6da53a292..e0f07d28ee1a 100644 --- a/drivers/gpu/drm/xe/xe_exec_queue.h +++ b/drivers/gpu/drm/xe/xe_exec_queue.h @@ -75,5 +75,6 @@ struct dma_fence *xe_exec_queue_last_fence_get(struct xe_exec_queue *e, struct xe_vm *vm); void xe_exec_queue_last_fence_set(struct xe_exec_queue *e, struct xe_vm *vm, struct dma_fence *fence); +void xe_exec_queue_update_runtime(struct xe_exec_queue *q); #endif diff --git a/drivers/gpu/drm/xe/xe_execlist.c b/drivers/gpu/drm/xe/xe_execlist.c index dece2785933c..a316431025c7 100644 --- a/drivers/gpu/drm/xe/xe_execlist.c +++ b/drivers/gpu/drm/xe/xe_execlist.c @@ -307,6 +307,7 @@ static void execlist_job_free(struct drm_sched_job *drm_job) { struct xe_sched_job *job = to_xe_sched_job(drm_job); + xe_exec_queue_update_runtime(job->q); xe_sched_job_put(job); } diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c index d274a139010b..e0ebfe83dfe8 100644 --- a/drivers/gpu/drm/xe/xe_guc_submit.c +++ b/drivers/gpu/drm/xe/xe_guc_submit.c @@ -749,6 +749,8 @@ static void guc_exec_queue_free_job(struct drm_sched_job *drm_job) { struct xe_sched_job *job = to_xe_sched_job(drm_job); + xe_exec_queue_update_runtime(job->q); + trace_xe_sched_job_free(job); xe_sched_job_put(job); } -- 2.43.0
[PATCH v3 2/6] drm/xe: Add XE_ENGINE_CLASS_OTHER to str conversion
XE_ENGINE_CLASS_OTHER was missing from the str conversion. Add it and remove the default handling so it's protected by -Wswitch. Currently the only user is xe_hw_engine_class_sysfs_init(), which already skips XE_ENGINE_CLASS_OTHER, so there's no change in behavior. Reviewed-by: Nirmoy Das Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/xe/xe_hw_engine.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c index 85712650be22..de30b74bf253 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine.c +++ b/drivers/gpu/drm/xe/xe_hw_engine.c @@ -1100,9 +1100,13 @@ const char *xe_hw_engine_class_to_str(enum xe_engine_class class) return "vecs"; case XE_ENGINE_CLASS_COPY: return "bcs"; + case XE_ENGINE_CLASS_OTHER: + return "other"; case XE_ENGINE_CLASS_COMPUTE: return "ccs"; - default: - return NULL; + case XE_ENGINE_CLASS_MAX: + break; } + + return NULL; } -- 2.43.0
Re: [PATCH v2 6/6] drm/xe/client: Print runtime to fdinfo
On Fri, Apr 26, 2024 at 11:47:37AM GMT, Tvrtko Ursulin wrote: On 24/04/2024 00:56, Lucas De Marchi wrote: Print the accumulated runtime for client when printing fdinfo. Each time a query is done it first does 2 things: 1) loop through all the exec queues for the current client and accumulate the runtime, per engine class. CTX_TIMESTAMP is used for that, being read from the context image. 2) Read a "GPU timestamp" that can be used for considering "how much GPU time has passed" and that has the same unit/refclock as the one recording the runtime. RING_TIMESTAMP is used for that via MMIO. Since for all current platforms RING_TIMESTAMP follows the same refclock, just read it once, using any first engine. This is exported to userspace as 2 numbers in fdinfo: drm-cycles-: drm-total-cycles-: Userspace is expected to collect at least 2 samples, which allows to know the client engine busyness as per: RUNTIME1 - RUNTIME0 busyness = - T1 - T0 Another thing to point out is that it's expected that userspace will read any 2 samples every few seconds. Given the update frequency of the counters involved and that CTX_TIMESTAMP is 32-bits, the counter for each exec_queue can wrap around (assuming 100% utilization) after ~200s. The wraparound is not perceived by userspace since it's just accumulated for all the exec_queues in a 64-bit counter), but the measurement will not be accurate if the samples are too far apart. This could be mitigated by adding a workqueue to accumulate the counters every so often, but it's additional complexity for something that is done already by userspace every few seconds in tools like gputop (from igt), htop, nvtop, etc with none of them really defaulting to 1 sample per minute or more. Signed-off-by: Lucas De Marchi --- Documentation/gpu/drm-usage-stats.rst | 16 ++- Documentation/gpu/xe/index.rst | 1 + Documentation/gpu/xe/xe-drm-usage-stats.rst | 10 ++ drivers/gpu/drm/xe/xe_drm_client.c | 138 +++- 4 files changed, 162 insertions(+), 3 deletions(-) create mode 100644 Documentation/gpu/xe/xe-drm-usage-stats.rst diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst index 6dc299343b48..421766289b78 100644 --- a/Documentation/gpu/drm-usage-stats.rst +++ b/Documentation/gpu/drm-usage-stats.rst @@ -112,6 +112,17 @@ larger value within a reasonable period. Upon observing a value lower than what was previously read, userspace is expected to stay with that larger previous value until a monotonic update is seen. +- drm-total-cycles-: + +Engine identifier string must be the same as the one specified in the +drm-cycles- tag and shall contain the total number cycles for the given +engine. + +This is a timestamp in GPU unspecified unit that matches the update rate +of drm-cycles-. For drivers that implement this interface, the engine +utilization can be calculated entirely on the GPU clock domain, without +considering the CPU sleep time between 2 samples. Two opens. 1) Do we need to explicity document that drm-total-cycles and drm-maxfreq are mutually exclusive? so userspace has a fallback mechanism to calculate utilization depending on what keys are available? 2) Should drm-total-cycles for now be documents as driver specific? you mean to call it xe-total-cycles? I have added some more poeple in the cc who were involved with driver fdinfo implementations if they will have an opinion. I would say potentially yes, and promote it to common if more than one driver would use it. For instance I see panfrost has the driver specific drm-curfreq (although isn't documenting it fully in panfrost.rst). And I have to say it is somewhat questionable to expose the current frequency per fdinfo per engine but not my call. aren't all of Documentation/gpu/drm-usage-stats.rst optional that driver may or may not implement? When you say driver-specific I'd think more of the ones not using as prefix as e.g. amd-*. I think drm-cycles + drm-total-cycles is just an alternative implementation for engine utilization. Like drm-cycles + drm-maxfreq already is an alternative to drm-engine and is not implemented by e.g. amdgpu/i915. I will submit a new version of the entire patch series to get the ball rolling, but let's keep this open for now. <...> +static void show_runtime(struct drm_printer *p, struct drm_file *file) +{ + struct xe_file *xef = file->driver_priv; + struct xe_device *xe = xef->xe; + struct xe_gt *gt; + struct xe_hw_engine *hwe; + struct xe_exec_queue *q; + unsigned long i, id_hwe, id_gt, capacity[XE_ENGINE_CLASS_MAX] = { }; + u64 gpu_timestamp, engine_mask = 0; + bool gpu_stamp = false; + + xe_pm_runtime_get(xe); + + /* Accumulate all the exec
Re: [PATCH] drm/xe: Fix UBSAN shift-out-of-bounds failure
+Thomas, +Christian, +dri-devel On Tue, May 07, 2024 at 11:42:46AM GMT, Nirmoy Das wrote: On 5/7/2024 11:39 AM, Nirmoy Das wrote: On 5/7/2024 10:04 AM, Shuicheng Lin wrote: Here is the failure stack: [ 12.988209] [ cut here ] [ 12.988216] UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13 [ 12.988232] shift exponent 64 is too large for 64-bit type 'long unsigned int' [ 12.988235] CPU: 4 PID: 1310 Comm: gnome-shell Tainted: G U 6.9.0-rc6+prerelease1158+ #19 [ 12.988237] Hardware name: Intel Corporation Raptor Lake Client Platform/RPL-S ADP-S DDR5 UDIMM CRB, BIOS RPLSFWI1.R00.3301.A02.2208050712 08/05/2022 [ 12.988239] Call Trace: [ 12.988240] [ 12.988242] dump_stack_lvl+0xd7/0xf0 [ 12.988248] dump_stack+0x10/0x20 [ 12.988250] ubsan_epilogue+0x9/0x40 [ 12.988253] __ubsan_handle_shift_out_of_bounds+0x10e/0x170 [ 12.988260] dma_resv_reserve_fences.cold+0x2b/0x48 [ 12.988262] ? ww_mutex_lock_interruptible+0x3c/0x110 [ 12.988267] drm_exec_prepare_obj+0x45/0x60 [drm_exec] [ 12.988271] ? vm_bind_ioctl_ops_execute+0x5b/0x740 [xe] [ 12.988345] vm_bind_ioctl_ops_execute+0x78/0x740 [xe] It is caused by the value 0 of parameter num_fences in function drm_exec_prepare_obj. And lead to in function __rounddown_pow_of_two, "0 - 1" causes the shift-out-of-bounds. For the num_fences, it should be 1 at least. Cc: Matthew Brost Signed-off-by: Shuicheng Lin --- drivers/gpu/drm/xe/xe_vm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c index d17192c8b7de..96cb4d9762a3 100644 --- a/drivers/gpu/drm/xe/xe_vm.c +++ b/drivers/gpu/drm/xe/xe_vm.c @@ -2692,7 +2692,7 @@ static int vma_lock_and_validate(struct drm_exec *exec, struct xe_vma *vma, if (bo) { if (!bo->vm) - err = drm_exec_prepare_obj(exec, &bo->ttm.base, 0); + err = drm_exec_prepare_obj(exec, &bo->ttm.base, 1); This needs to be fixed in drm_exec_prepare_obj() by checking num_fences and not calling dma_resv_reserve_fences() or just call drm_exec_lock_obj() here. ref: https://patchwork.freedesktop.org/patch/577487/ we are hit again by this. Couldn't we change drm_exec_prepare_obj() to check num_fences and if is 0 just fallback to just do drm_exec_lock_obj() as "the least amount of work needed in this case"? Something like this: | diff --git a/drivers/gpu/drm/drm_exec.c b/drivers/gpu/drm/drm_exec.c | index 2da094bdf8a4..68b5f6210b09 100644 | --- a/drivers/gpu/drm/drm_exec.c | +++ b/drivers/gpu/drm/drm_exec.c | @@ -296,10 +296,12 @@ int drm_exec_prepare_obj(struct drm_exec *exec, struct drm_gem_object *obj, | if (ret) | return ret; | | - ret = dma_resv_reserve_fences(obj->resv, num_fences); | - if (ret) { | - drm_exec_unlock_obj(exec, obj); | - return ret; | + if (num_fences) { | + ret = dma_resv_reserve_fences(obj->resv, num_fences); | + if (ret) { | + drm_exec_unlock_obj(exec, obj); | + return ret; | + } | } | | return 0; thanks Lucas De Marchi Nirmoy Regards, Nirmoy if (!err && validate) err = xe_bo_validate(bo, xe_vma_vm(vma), true); } @@ -2777,7 +2777,7 @@ static int vm_bind_ioctl_ops_lock_and_prep(struct drm_exec *exec, struct xe_vma_op *op; int err; - err = drm_exec_prepare_obj(exec, xe_vm_obj(vm), 0); + err = drm_exec_prepare_obj(exec, xe_vm_obj(vm), 1); if (err) return err;
[PULL] drm-xe-fixes
Hi Dave and Sima, Please pull the drm-xe-fixes for this week targeting v6.9-rc7. Two important fixes: one avoiding a use-after-free in the rebind worker and the other to make display work in ADL-N. thanks Lucas De Marchi drm-xe-fixes-2024-05-02: - Fix UAF on rebind worker - Fix ADL-N display integration The following changes since commit e67572cd2204894179d89bd7b984072f19313b03: Linux 6.9-rc6 (2024-04-28 13:47:24 -0700) are available in the Git repository at: https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-fixes-2024-05-02 for you to fetch changes up to df04b152fca2d46e75fbb74ed79299bc420bc9e6: drm/xe/display: Fix ADL-N detection (2024-05-02 11:11:01 -0500) - Fix UAF on rebind worker - Fix ADL-N display integration Lucas De Marchi (1): drm/xe/display: Fix ADL-N detection Matthew Auld (1): drm/xe/vm: prevent UAF in rebind_work_func() drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h | 3 ++- drivers/gpu/drm/xe/xe_vm.c| 3 +++ 2 files changed, 5 insertions(+), 1 deletion(-)
Re: [PATCH] drm/sysfs: Add drm class-wide attribute to get active device clients
On Wed, May 01, 2024 at 09:25:01AM GMT, Rob Clark wrote: On Wed, May 1, 2024 at 9:19 AM Lucas De Marchi wrote: On Wed, May 01, 2024 at 04:58:05PM GMT, Tvrtko Ursulin wrote: > >Hi, > >On 24/04/2024 15:48, Adrián Larumbe wrote: >>Hi Tvrtko, >> >>On 15.04.2024 13:50, Tvrtko Ursulin wrote: >>> >>>On 05/04/2024 18:59, Rob Clark wrote: >>>>On Wed, Apr 3, 2024 at 11:37 AM Adrián Larumbe >>>> wrote: >>>>> >>>>>Up to this day, all fdinfo-based GPU profilers must traverse the entire >>>>>/proc directory structure to find open DRM clients with fdinfo file >>>>>descriptors. This is inefficient and time-consuming. >>>>> >>>>>This patch adds a new device class attribute that will install a sysfs file >>>>>per DRM device, which can be queried by profilers to get a list of PIDs for >>>>>their open clients. This file isn't human-readable, and it's meant to be >>>>>queried only by GPU profilers like gputop and nvtop. >>>>> >>>>>Cc: Boris Brezillon >>>>>Cc: Tvrtko Ursulin >>>>>Cc: Christopher Healy >>>>>Signed-off-by: Adrián Larumbe >>>> >>>>It does seem like a good idea.. idk if there is some precedent to >>>>prefer binary vs ascii in sysfs, but having a way to avoid walking >>>>_all_ processes is a good idea. >>> >>>I naturally second that it is a needed feature, but I do not think binary >>>format is justified. AFAIR it should be used for things like hw/fw >>>standardised tables or firmware images, not when exporting a simple list of >>>PIDs. It also precludes easy shell/script access and the benefit of avoiding >>>parsing a short list is I suspect completely dwarfed by needing to parse all >>>the related fdinfo etc. >> >>I'd rather keep it as a binary file for the sake of easily parsing the number >>list on the client side, in gputop or nvtop. For textual access, there's already >>a debugfs file that presents the same information, so I thought it was best not >>to duplicate that functionality and restrict sysfs to serving the very specific >>use case of UM profilers having to access the DRM client list. >> >>I should mention I did something controversial here, which is a semantically >>binary attribute through the regular attribute interface. I guess if I keep it >>as a binary attribute in the end, I should switch over to the binary attribute >>API. >> >>Another reason why I implemented it as a binary file is that we can only send >>back at most a whole page. If a PID takes 4 bytes, that's usually 1024 clients >>at most, which is probably enough for any UM profiler, but will decrease even >>more if we turn it into an ASCII readable file. > >I'm afraid I still think there is no reason for a binary file, even >less so artificially limited to 1024 clients. Any consumer will have >to parse text fdinfo so a binary list of pids is not adding any real >cost. yeah, I don't really understand why you'd want the binary number that you'd then have to turn into a string to open the /proc//. To me it sounds more like we want a text output and that output to be: /fdinfo/ So gputop could just read this file to know where the info is. Too bad we can't symlink cross fs, otherwise we could just add symlinks to e.g. /sys/class/drm/card/clients/*, which then nicely separate it per gpu too. But see below. > >>I did some research into sysfs binary attributes, and while some sources mention that >>it's often used for dumping or loading of driver FW, none of them claim it cannot >>be used for other purposes. >> >>>>>--- >>>>> drivers/gpu/drm/drm_internal.h | 2 +- >>>>> drivers/gpu/drm/drm_privacy_screen.c | 2 +- >>>>> drivers/gpu/drm/drm_sysfs.c | 89 ++-- >>>>> 3 files changed, 74 insertions(+), 19 deletions(-) >>>>> >>>>>diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h >>>>>index 2215baef9a3e..9a399b03d11c 100644 >>>>>--- a/drivers/gpu/drm/drm_internal.h >>>>>+++ b/drivers/gpu/drm/drm_internal.h >>>>>@@ -145,7 +145,7 @@ bool drm_master_internal_acquire(struct drm_device *dev); >>>>> void drm_master_internal_release(struct drm_device *dev); >>>>> >>>>> /* drm_sysfs.c */ >>>>>-extern struct class *drm_class; >>>>>+extern struct
Re: [PATCH] drm/sysfs: Add drm class-wide attribute to get active device clients
ice *dev) { @@ -128,6 +126,62 @@ static const struct component_ops typec_connector_ops = { static CLASS_ATTR_STRING(version, S_IRUGO, "drm 1.1.0 20060810"); +static ssize_t clients_show(struct device *cd, struct device_attribute *attr, char *buf) +{ + struct drm_minor *minor = cd->driver_data; + struct drm_device *ddev = minor->dev; + struct drm_file *priv; + ssize_t offset = 0; + void *pid_buf; + + if (minor->type != DRM_MINOR_RENDER) + return 0; Why this? I return nothing in case of a non-render node because we don't want display drivers to confuse UM GPU profilers. Feels to arbitrary to me. Let them handle it. + + pid_buf = kvmalloc(PAGE_SIZE, GFP_KERNEL); I don't quite get the kvmalloc for just one page (or why even a temporay buffer and not write into buf directly?). Should've used kmalloc, you're right. Or else I could just write everything straight into 'buf'. + if (!pid_buf) + return 0; + + mutex_lock(&ddev->filelist_mutex); + list_for_each_entry_reverse(priv, &ddev->filelist, lhead) { + struct pid *pid; + + if (drm_WARN_ON(ddev, (PAGE_SIZE - offset) < sizeof(pid_t))) + break; Feels bad.. I would suggest exploring implementing a read callback (instead of show) and handling arbitrary size output. I think regular class attributes can only implement show() and set(). For a more complex interface, I would have to turn it into an actual binary attribute, and that would be the only choice if we want the list of clients to be of arbitrary size. Yeah, i915 uses that to dump the error capture file which can be huge and is text so it is doable. + + rcu_read_lock(); + pid = rcu_dereference(priv->pid); + (*(pid_t *)(pid_buf + offset)) = pid_vnr(pid); + rcu_read_unlock(); + + offset += sizeof(pid_t); + } + mutex_unlock(&ddev->filelist_mutex); + + if (offset < PAGE_SIZE) + (*(pid_t *)(pid_buf + offset)) = 0; Either NULL terminated or PAGE_SIZE/sizeof(pid) entries and not NULL terminated feels weird. If I got that right. For me everything points towards going for text output. Yes, I know it might sound weird, but my reasoning was: either there are PAGE_SIZE/sizeof(pid) entries and the file isn't NULL terminated (which should be picked up by clients as being one page worth of data, the sysfs attribute maximum transfer unit), or else there aren't enough entries to fill a page and after the last one there's a NULL entry. + + memcpy(buf, pid_buf, offset); + + kvfree(pid_buf); + + return offset; + +} +static DEVICE_ATTR_RO(clients); /proc//fdinfo/ is only readable by the owner. if we report what are the open fds (or even what are the pids with a drm fd), we are leaking that info. So we should probably make this DEVICE_ATTR_ADMIN_RO. Lucas De Marchi Shouldn't BIN_ATTR_RO be used for binary files in sysfs? Like I said above, I sort of faked a binary attribute through the regular sysfs attr API, which is most likely a bad idea. Regards, Tvrtko P.S. Or maybe it is time for drmfs? Where each client gets a directory and drivers can populate files. Such as per client logging streams and whatnot. Yes, but maybe this is something we can discuss in depth in an RFC at a later time? Yes of course, it is just a long standing idea for flexible per client stuff. Regards, Tvrtko + +static struct attribute *drm_device_attrs[] = { + &dev_attr_clients.attr, + NULL, +}; +ATTRIBUTE_GROUPS(drm_device); + +struct class drm_class = { + .name = "drm", + .dev_groups = drm_device_groups, +}; + +static bool drm_class_initialised; + /** * drm_sysfs_init - initialize sysfs helpers * @@ -142,18 +196,19 @@ int drm_sysfs_init(void) { int err; - drm_class = class_create("drm"); - if (IS_ERR(drm_class)) - return PTR_ERR(drm_class); + err = class_register(&drm_class); + if (err) + return err; - err = class_create_file(drm_class, &class_attr_version.attr); + err = class_create_file(&drm_class, &class_attr_version.attr); if (err) { - class_destroy(drm_class); - drm_class = NULL; + class_destroy(&drm_class); return err; } - drm_class->devnode = drm_devnode; + drm_class.devnode = drm_devnode; + + drm_class_initialised = true; drm_sysfs_acpi_register(); return 0; @@ -166,12 +221,12 @@ int drm_sysfs_init(void) */ void drm_sysfs_destroy(void) { - if (IS_ERR_OR_NULL(drm_class)) + if (!drm_class_initialised) return; dr
Re: [PATCH] nouveau: Add missing break statement
On Tue, Apr 30, 2024 at 06:48:40PM GMT, Chaitanya Kumar Borah wrote: Add the missing break statement that causes the following build error CC [M] drivers/gpu/drm/i915/display/intel_display_device.o ../drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c: In function ‘build_registry’: ../drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1266:3: error: label at end of compound statement 1266 | default: | ^~~ CC [M] drivers/gpu/drm/amd/amdgpu/gfx_v10_0.o HDRTEST drivers/gpu/drm/xe/compat-i915-headers/i915_reg.h CC [M] drivers/gpu/drm/amd/amdgpu/imu_v11_0.o make[7]: *** [../scripts/Makefile.build:244: drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.o] Error 1 make[7]: *** Waiting for unfinished jobs Fixes: b58a0bc904ff ("nouveau: add command-line GSP-RM registry support") Closes: https://lore.kernel.org/all/913052ca6c0988db1bab293cfae38529251b4594.ca...@nvidia.com/T/#m3c9acebac754f2e74a85b76c858c093bb1aacaf0 Just found a previous bug report, so we can add here: Closes: https://lore.kernel.org/all/ca+g9fyu7ug0k8h9qjt0wbtwh_ll9juc+vc0wmu_z_vsspdn...@mail.gmail.com/ Signed-off-by: Chaitanya Kumar Borah --- drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c index 0b46db5c77b8..63619512e7f6 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c @@ -1264,6 +1264,7 @@ static void build_registry(struct nvkm_gsp *gsp, PACKED_REGISTRY_TABLE *registry str_offset += reg->vlen; break; default: + break; since reg->type is an enum and we are handling all the values, another possible approach is to remove the default handling and then the compiler will warn you of missing cases if built with -Wswitch. Any of the approaches seem good to me. Reviewed-by: Lucas De Marchi Lucas De Marchi } i++; -- 2.25.1
Re: [PATCH] [v7] nouveau: add command-line GSP-RM registry support
On Fri, Apr 26, 2024 at 06:08:19PM GMT, Danilo Krummrich wrote: On 4/25/24 18:38, Timur Tabi wrote: On Thu, 2024-04-25 at 15:22 +0200, Danilo Krummrich wrote: + size_t length; + + /* Remove any whitespace from the parameter string */ + length = strip(p, " \t\n"); With that, I see the following warning compiling this patch. warning: variable ‘length’ set but not used [-Wunused-but-set-variable] Did you intend to use the length for anything? No, and I could have sworn I fixed that before sending out v7. I think I originally intended 'length' to determine when I finished parsing the string. Also, looking at the warning made me aware of 'p' potentially being NULL. If you agree, I can fix the warning and add the corresponding NULL check when applying the patch. Yes, that would be great. You can just delete 'length'. The NULL check for 'p' should call clean_registry() before returning -ENOMEM. Applied to drm-misc-next, thanks! since this commit our CI is failing to build with the following error: CC [M] drivers/gpu/drm/i915/display/intel_display_device.o ../drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c: In function ‘build_registry’: ../drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1266:3: error: label at end of compound statement 1266 | default: | ^~~ CC [M] drivers/gpu/drm/amd/amdgpu/gfx_v10_0.o HDRTEST drivers/gpu/drm/xe/compat-i915-headers/i915_reg.h CC [M] drivers/gpu/drm/amd/amdgpu/imu_v11_0.o make[7]: *** [../scripts/Makefile.build:244: drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.o] Error 1 make[7]: *** Waiting for unfinished jobs you are missing a `break;` after that default. Thanks for catching this.
Re: [PATCH 1/2] drm: print top commit sha after loading the driver
On Mon, Apr 29, 2024 at 02:02:28PM GMT, Jani Nikula wrote: On Wed, 24 Apr 2024, Farah Kassabri wrote: Add the last driver sha to the existing log message which prints the drm device info. The commit message fails to answer the most important question: why? Especially, what makes drm devices such special snowflakes that they'd need to be the only ones in the kernel to print git commit sha1's? the closest to what was added here would be srcversion: $ modinfo build64/drivers/gpu/drm/xe/xe.ko | grep srcversion srcversion: 0EA08A43AC399A0D942740 which makes more sense and doesn't depend on the git tree checkout and other possible problems with dirty checkouts. If you want to print it on module load to be able to tell from a log, you could probably just access mod->srcversion. but I'm not sure what we are trying to accomplish here and the commit message certainly missed that. Please Cc dri-devel on changes outside xe and provide the motivation in the commit message. thanks Lucas De Marchi BR, Jani. Signed-off-by: Farah Kassabri --- drivers/gpu/drm/drm_drv.c | 6 +++--- include/drm/drm_drv.h | 2 ++ 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c index 535b624d4c9d..e0f7af1b6ec3 100644 --- a/drivers/gpu/drm/drm_drv.c +++ b/drivers/gpu/drm/drm_drv.c @@ -947,10 +947,10 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags) } drm_panic_register(dev); - DRM_INFO("Initialized %s %d.%d.%d %s for %s on minor %d\n", + DRM_INFO("Initialized %s %d.%d.%d%s %s for %s on minor %d\n", driver->name, driver->major, driver->minor, -driver->patchlevel, driver->date, -dev->dev ? dev_name(dev->dev) : "virtual device", +driver->patchlevel, driver->git_sha ? driver->git_sha : "", +driver->date, dev->dev ? dev_name(dev->dev) : "virtual device", dev->primary ? dev->primary->index : dev->accel->index); goto out_unlock; diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h index 8878260d7529..7578a1f4ce74 100644 --- a/include/drm/drm_drv.h +++ b/include/drm/drm_drv.h @@ -407,6 +407,8 @@ struct drm_driver { int minor; /** @patchlevel: driver patch level */ int patchlevel; + /** @git_sha: driver last commit sha */ + char *git_sha; /** @name: driver name */ char *name; /** @desc: driver description */ -- Jani Nikula, Intel
Re: [PATCH] MAINTAINERS: Move the drm-intel repo location to fd.o GitLab
On Wed, Apr 24, 2024 at 01:41:59PM GMT, Ryszard Knop wrote: The drm-intel repo is moving from the classic fd.o git host to GitLab. Update its location with a URL matching other fd.o GitLab kernel trees. Signed-off-by: Ryszard Knop Acked-by: Lucas De Marchi Also Cc'ing maintainers --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index d6327dc12cb1..fbf7371a0bb0 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -10854,7 +10854,7 @@ W: https://drm.pages.freedesktop.org/intel-docs/ Q: http://patchwork.freedesktop.org/project/intel-gfx/ B: https://drm.pages.freedesktop.org/intel-docs/how-to-file-i915-bugs.html C: irc://irc.oftc.net/intel-gfx -T: git git://anongit.freedesktop.org/drm-intel +T: git https://gitlab.freedesktop.org/drm/i915/kernel.git F: Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon F: Documentation/gpu/i915.rst F: drivers/gpu/drm/ci/xfails/i915* -- 2.44.0