[PULL] drm-xe-fixes

2024-10-03 Thread Lucas De Marchi

Hi Dave and Simona,

drm-xe-fixes for 6.12-rc2. A few fixes, particularly on error paths and
corner cases. We have more than usual since I decided to skip last week's
pull request. Some tuning settings for Xe2 were missing and others were
updated to match the spec.

Thanks
Lucas De Marchi

drm-xe-fixes-2024-10-03:
Driver Changes:
- Restore pci state on resume (Rodrigo Vivi)
- Fix locking on submission, queue and vm (Matthew Auld, Matthew Brost)
- Fix UAF on queue destruction (Matthew Auld)
- Fix resource release on freq init error path (He Lugang)
- Use rw_semaphore to reduce contention on ASID->VM lookup (Matthew Brost)
- Fix steering for media on Xe2_HPM (Gustavo Sousa)
- Tuning updates to Xe2 (Gustavo Sousa)
- Resume TDR after GT reset to prevent jobs running forever (Matthew Brost)
- Move id allocation to avoid userspace using a guessed number
  to trigger UAF (Matthew Auld, Matthew Brost)
- Fix OA stream close preventing batch buffers from completing (José)
- Fix NPD when migrating memory on LNL (Zhanjun Dong)
- Fix memory leak when aborting binds (Matthew Brost)
The following changes since commit 9852d85ec9d492ebef56dc5f229416c925758edc:

  Linux 6.12-rc1 (2024-09-29 15:06:19 -0700)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-fixes-2024-10-03

for you to fetch changes up to a6f3b2527375c786f2eff77d3ee8b805bcfe026d:

  drm/xe: Fix memory leak when aborting binds (2024-10-03 01:24:54 -0500)


Driver Changes:
- Restore pci state on resume (Rodrigo Vivi)
- Fix locking on submission, queue and vm (Matthew Auld, Matthew Brost)
- Fix UAF on queue destruction (Matthew Auld)
- Fix resource release on freq init error path (He Lugang)
- Use rw_semaphore to reduce contention on ASID->VM lookup (Matthew Brost)
- Fix steering for media on Xe2_HPM (Gustavo Sousa)
- Tuning updates to Xe2 (Gustavo Sousa)
- Resume TDR after GT reset to prevent jobs running forever (Matthew Brost)
- Move id allocation to avoid userspace using a guessed number
  to trigger UAF (Matthew Auld, Matthew Brost)
- Fix OA stream close preventing batch buffers from completing (José)
- Fix NPD when migrating memory on LNL (Zhanjun Dong)
- Fix memory leak when aborting binds (Matthew Brost)


Gustavo Sousa (3):
  drm/xe/mcr: Use Xe2_LPM steering tables for Xe2_HPM
  drm/xe/xe2: Extend performance tuning to media GT
  drm/xe/xe2: Add performance tuning for L3 cache flushing

He Lugang (1):
  drm/xe: use devm_add_action_or_reset() helper

José Roberto de Souza (1):
  drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close

Matthew Auld (4):
  drm/xe/guc_submit: add missing locking in wedged_fini
  drm/xe: fix UAF around queue destruction
  drm/xe/vm: move xa_alloc to prevent UAF
  drm/xe/queue: move xa_alloc to prevent UAF

Matthew Brost (5):
  drm/xe: Convert to USM lock to rwsem
  drm/xe: Use helper for ASID -> VM in GPU faults and access counters
  drm/xe: Resume TDR after GT reset
  drm/xe: Clean up VM / exec queue file lock usage.
  drm/xe: Fix memory leak when aborting binds

Rodrigo Vivi (1):
  drm/xe: Restore pci state upon resume

Zhanjun Dong (1):
  drm/xe: Prevent null pointer access in xe_migrate_copy

 drivers/gpu/drm/xe/regs/xe_gt_regs.h  | 11 ++
 drivers/gpu/drm/xe/xe_bo.c            |  4 ++--
 drivers/gpu/drm/xe/xe_device.c        | 12 +--
 drivers/gpu/drm/xe/xe_device_types.h  | 19 +
 drivers/gpu/drm/xe/xe_drm_client.c    |  9 +++-
 drivers/gpu/drm/xe/xe_exec_queue.c    |  6 +++---
 drivers/gpu/drm/xe/xe_gpu_scheduler.c |  5 +
 drivers/gpu/drm/xe/xe_gpu_scheduler.h |  2 ++
 drivers/gpu/drm/xe/xe_gt_freq.c       |  4 ++--
 drivers/gpu/drm/xe/xe_gt_mcr.c        |  2 +-
 drivers/gpu/drm/xe/xe_gt_pagefault.c  | 39 +++
 drivers/gpu/drm/xe/xe_gt_sysfs.c      |  2 +-
 drivers/gpu/drm/xe/xe_guc_submit.c    | 37 ++---
 drivers/gpu/drm/xe/xe_guc_types.h     |  2 ++
 drivers/gpu/drm/xe/xe_oa.c            |  9 +++-
 drivers/gpu/drm/xe/xe_pci.c           |  2 ++
 drivers/gpu/drm/xe/xe_pt.c            |  2 +-
 drivers/gpu/drm/xe/xe_tuning.c        | 28 +
 drivers/gpu/drm/xe/xe_vm.c            | 28 +++--
 19 files changed, 159 insertions(+), 64 deletions(-)


Re: linux-next: build failure after merge of the kspp tree

2024-09-19 Thread Lucas De Marchi

On Thu, Sep 19, 2024 at 09:27:52AM GMT, Stephen Rothwell wrote:

Hi all,

On Mon, 9 Sep 2024 19:59:39 +1000 Stephen Rothwell  
wrote:


After merging the kspp tree, today's linux-next build (x86_64
allmodconfig) failed like this:

drivers/gpu/drm/xe/xe_gt_idle.c:56:27: error: redefinition of 'str_up_down'
   56 | static inline const char *str_up_down(bool v)
  |   ^~~
In file included from include/linux/string_helpers.h:7,
 from drivers/gpu/drm/xe/xe_assert.h:9,
 from drivers/gpu/drm/xe/xe_force_wake.h:9,
 from drivers/gpu/drm/xe/xe_gt_idle.c:8:
include/linux/string_choices.h:62:27: note: previous definition of 
'str_up_down' with type 'const char *(bool)' {aka 'const char *(_Bool)'}
   62 | static inline const char *str_up_down(bool v)
  |   ^~~

Caused by commit

  a98ae7f045b2 ("lib/string_choices: Add str_up_down() helper")

interacting with commit

  0914c1e45d3a ("drm/xe/xe_gt_idle: add debugfs entry for powergating info")

from the drm-xe tree.

I have applied the following patch for today.

From: Stephen Rothwell 
Date: Mon, 9 Sep 2024 19:40:17 +1000
Subject: [PATCH] fix up for "lib/string_choices: Add str_up_down() helper"

interacting with commit "drm/xe/xe_gt_idle: add debugfs entry for
powergating info" from the drm-xe tree.

Signed-off-by: Stephen Rothwell 
---
 drivers/gpu/drm/xe/xe_gt_idle.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_gt_idle.c b/drivers/gpu/drm/xe/xe_gt_idle.c
index 85a35ed153a3..0f98c1539c64 100644
--- a/drivers/gpu/drm/xe/xe_gt_idle.c
+++ b/drivers/gpu/drm/xe/xe_gt_idle.c
@@ -53,11 +53,6 @@ pc_to_xe(struct xe_guc_pc *pc)
return gt_to_xe(gt);
 }

-static inline const char *str_up_down(bool v)
-{
-   return v ? "up" : "down";
-}
-
 static const char *gt_idle_state_to_string(enum xe_gt_idle_state state)
 {
switch (state) {
--
2.45.2


This is now needed in the merge between Linus' tree and the drm-xe tree.


Thanks. This is not going to 6.12. It's targeted at 6.13, so we should fix
it when merging drm-next back into drm-xe-next.

Lucas De Marchi



--
Cheers,
Stephen Rothwell





Re: [PULL] drm-xe-next-fixes

2024-09-19 Thread Lucas De Marchi

On Thu, Sep 19, 2024 at 09:56:47PM GMT, Lucas De Marchi wrote:

Hi Dave and Simona,

A few fixes for 6.11-rc1.


oops, I meant 6.12-rc1, of course :)

Lucas De Marchi


[PULL] drm-xe-next-fixes

2024-09-19 Thread Lucas De Marchi

Hi Dave and Simona,

A few fixes for 6.11-rc1.

Thanks
Lucas De Marchi

drm-xe-next-fixes-2024-09-19:
Driver Changes:
- Fix macro for checking minimum GuC version (Michal Wajdeczko)
- Fix CCS offset calculation for some BMG SKUs (Matthew Auld)
- Fix locking on memory usage reporting via fdinfo and BO destroy (Matthew Auld)
- Fix GPU page fault handler on a closed VM (Matthew Brost)
- Fix overflow in oa batch buffer (José)
The following changes since commit ae2c6d8b3b88c176dff92028941a4023f1b4cb91:

  Merge tag 'drm-xe-next-fixes-2024-09-12' of 
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next (2024-09-17 14:53:34 
+1000)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/xe/kernel.git 
tags/drm-xe-next-fixes-2024-09-19

for you to fetch changes up to 6c10ba06bb1b48acce6d4d9c1e33beb9954f1788:

  drm/xe/oa: Fix overflow in oa batch buffer (2024-09-17 23:31:59 -0500)


Driver Changes:
- Fix macro for checking minimum GuC version (Michal Wajdeczko)
- Fix CCS offset calculation for some BMG SKUs (Matthew Auld)
- Fix locking on memory usage reporting via fdinfo and BO destroy (Matthew Auld)
- Fix GPU page fault handler on a closed VM (Matthew Brost)
- Fix overflow in oa batch buffer (José)


José Roberto de Souza (1):
  drm/xe/oa: Fix overflow in oa batch buffer

Matthew Auld (5):
  drm/xe/vram: fix ccs offset calculation
  drm/xe/client: fix deadlock in show_meminfo()
  drm/xe/client: add missing bo locking in show_meminfo()
  drm/xe/client: use mem_type from the current resource
  drm/xe/bo: add some annotations in bo_put()

Matthew Brost (1):
  drm/xe: Do not run GPU page fault handler on a closed VM

Michal Wajdeczko (1):
  drm/xe/guc: Fix GUC_{SUBMIT,FIRMWARE}_VER helper macros

 drivers/gpu/drm/xe/xe_bb.c   |  3 ++-
 drivers/gpu/drm/xe/xe_bo.c   | 14 ++
 drivers/gpu/drm/xe/xe_bo.h   |  6 +
 drivers/gpu/drm/xe/xe_drm_client.c   | 50 +---
 drivers/gpu/drm/xe/xe_gt_pagefault.c |  6 +
 drivers/gpu/drm/xe/xe_guc.h  |  6 +++--
 drivers/gpu/drm/xe/xe_vram.c |  1 +
 7 files changed, 69 insertions(+), 17 deletions(-)


[PULL] drm-xe-next-fixes

2024-09-12 Thread Lucas De Marchi

Hi Dave and Simona,

Fixes for the 6.12 cycle. Very normal fixes, mostly on error paths.
One difference from previous pull requests is the addition of
workarounds, which we were not including since they don't have a "Fixes"
trailer. However, as we are enabling Xe2 platforms starting with this
release, we are being more careful and adding them when they are really
needed.

Thanks
Lucas De Marchi

drm-xe-next-fixes-2024-09-12:
Driver Changes:
- Fix use-after-free when provisioning VF (Matthew Auld)
- Suppress rpm warning on false positive (Rodrigo)
- Fix memleak on ioctl error path (Dafna)
- Fix use-after-free while inserting ggtt (Michal Wajdeczko)
- Add Wa_15016589081 workaround (Tejas)
- Fix error path on suspend (Maarten)
The following changes since commit b615b9c36cae0468491547206406a909a9a37f26:

  Merge v6.11-rc7 into drm-next (2024-09-11 09:18:15 +0200)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/xe/kernel.git 
tags/drm-xe-next-fixes-2024-09-12

for you to fetch changes up to f1a4dceeb2bd4b4478e4f0c77dac55569d153fb3:

  drm/xe: Fix missing conversion to xe_display_pm_runtime_resume (2024-09-12 
18:04:36 -0500)


Driver Changes:
- Fix use-after-free when provisioning VF (Matthew Auld)
- Suppress rpm warning on false positive (Rodrigo)
- Fix memleak on ioctl error path (Dafna)
- Fix use-after-free while inserting ggtt (Michal Wajdeczko)
- Add Wa_15016589081 workaround (Tejas)
- Fix error path on suspend (Maarten)


Arnd Bergmann (1):
  drm/xe: fix build warning with CONFIG_PM=n

Dafna Hirschfeld (1):
  drm/xe: fix missing 'xe_vm_put'

Maarten Lankhorst (1):
  drm/xe: Fix missing conversion to xe_display_pm_runtime_resume

Matthew Auld (1):
  drm/xe: prevent potential UAF in pf_provision_vf_ggtt()

Michal Wajdeczko (1):
  drm/xe: Don't keep stale pointer to bo->ggtt_node

Rodrigo Vivi (1):
  drm/xe: Suppress missing outer rpm protection warning

Tejas Upadhyay (1):
  drm/xe/xe2hpg: Add Wa_15016589081

 drivers/gpu/drm/xe/regs/xe_gt_regs.h   |  1 +
 drivers/gpu/drm/xe/xe_exec_queue.c |  4 +++-
 drivers/gpu/drm/xe/xe_ggtt.c   |  7 +--
 drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c |  8 
 drivers/gpu/drm/xe/xe_pm.c | 23 +--
 drivers/gpu/drm/xe/xe_wa.c |  4 
 6 files changed, 38 insertions(+), 9 deletions(-)


Re: [PATCH v4 1/3] drm: Introduce device wedged event

2024-09-10 Thread Lucas De Marchi

On Tue, Sep 10, 2024 at 06:53:19PM GMT, Raag Jadav wrote:

On Mon, Sep 09, 2024 at 03:01:50PM -0500, Lucas De Marchi wrote:

On Sun, Sep 08, 2024 at 11:08:39PM GMT, Asahi Lina wrote:
> On 9/8/24 12:07 AM, Lucas De Marchi wrote:
> > On Sat, Sep 07, 2024 at 08:38:30PM GMT, Asahi Lina wrote:
> > > On 9/6/24 6:42 PM, Raag Jadav wrote:
> > > > Introduce device wedged event, which will notify userspace of wedged
> > > > (hung/unusable) state of the DRM device through a uevent. This is
> > > > useful especially in cases where the device is in unrecoverable state
> > > > and requires userspace intervention for recovery.
> > > >
> > > > Purpose of this implementation is to be vendor agnostic. Userspace
> > > > consumers (sysadmin) can define udev rules to parse this event and
> > > > take respective action to recover the device.
> > > >
> > > > Consumer expectations:
> > > > --
> > > > 1) Unbind driver
> > > > 2) Reset bus device
> > > > 3) Re-bind driver
> > >
> > > Is this supposed to be normative? For drm/asahi we have a "wedged"
> > > concept (firmware crashed), but the only possible recovery action is a
> > > full system reboot (which might still be desirable to allow userspace to
> > > trigger automatically in some scenarios) since there is no bus-level
> > > reset and no firmware reload possible.
> >
> > maybe let drivers hint possible/supported recovery mechanisms and then
> > sysadmin chooses what to do?
>
> How would we do this? A textual value for the event or something like
> that? ("WEDGED=bus-reset" vs "WEDGED=reboot"?)

If there's a need for more than one, then I think exposing the supported
ones sorted by "side effect" in sysfs would be good. Something like:

$ cat /sys/class/drm/card0/device/wedge_recover
rebind
bus-reset
reboot


How do we expect the drivers to flag supported ones? Extra hooks?


The comment above... wedge_recover would be a sysfs file exposed by the
driver to userspace listing the supported mechanisms.

WEDGED= (which is also crafted by the driver or with explicit
functions in drm) would report to userspace the minimum
needed mechanism for recovery.
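As a sketch of the consumer side (all names here are hypothetical, not a
defined uAPI), a helper invoked from a udev rule could map the WEDGED=
value to a recovery action:

```shell
#!/bin/sh
# Hedged sketch: "wedged_action" and the method names (rebind, bus-reset,
# reboot) are hypothetical examples from the discussion above, not an
# agreed-upon interface.
wedged_action() {
    case "$1" in
        rebind)    echo "unbind/rebind driver" ;;
        bus-reset) echo "reset PCI device, then rebind" ;;
        reboot)    echo "reboot the system" ;;
        *)         echo "unknown method: $1" >&2; return 1 ;;
    esac
}
```

A udev rule along the lines of ENV{WEDGED}=="bus-reset",
RUN+="/usr/lib/wedge-recover" could then dispatch to such a helper (rule
shown only as an illustration).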

Lucas De Marchi


Re: [PATCH 6/7] drm/i915/pmu: Lazy unregister

2024-09-09 Thread Lucas De Marchi

Hi Peter,

On Wed, Jul 24, 2024 at 10:39:57AM GMT, Lucas De Marchi wrote:

On Wed, Jul 24, 2024 at 02:41:05PM GMT, Peter Zijlstra wrote:

On Tue, Jul 23, 2024 at 10:30:08AM -0500, Lucas De Marchi wrote:

On Tue, Jul 23, 2024 at 09:03:25AM GMT, Tvrtko Ursulin wrote:


On 22/07/2024 22:06, Lucas De Marchi wrote:
> Instead of calling perf_pmu_unregister() when unbinding, defer that to
> the destruction of i915 object. Since perf itself holds a reference in
> the event, this only happens when all events are gone, which guarantees
> i915 is not unregistering the pmu with live events.
>
> Previously, running the following sequence would crash the system after
> ~2 tries:
>
>1) bind device to i915
>2) wait events to show up on sysfs
>3) start perf stat -I 1000 -e i915/rcs0-busy/
>4) unbind driver
>5) kill perf
>
> Most of the time this crashes in perf_pmu_disable() while accessing the
> percpu pmu_disable_count. This happens because perf_pmu_unregister()
> destroys it with free_percpu(pmu->pmu_disable_count).
>
> With a lazy unbind, the pmu is only unregistered after (5) as opposed to
> after (4). The downside is that if a new bind operation is attempted for
> the same device/driver without killing the perf process, i915 will fail
> to register the pmu (but still load successfully). This seems better
> than completely crashing the system.

So effectively allows unbind to succeed without fully unbinding the
driver from the device? That sounds like a significant drawback and if
so, I wonder if a more complicated solution wouldn't be better after
all. Or is there precedent for allowing userspace to keep their paws on
unbound devices in this way?


keeping the resources alive but "unplunged" while the hardware
disappeared is a common thing to do... it's the whole point of the
drmm-managed resource for example. If you bind the driver and then
unbind it while userspace is holding a ref, next time you try to bind it
will come up with a different card number. A similar thing that could be
done is to adjust the name of the event - currently we add the mangled
pci slot.

That said, I agree a better approach would be to allow
perf_pmu_unregister() to do its job even when there are open events. On
top of that (or as a way to help achieve that), make perf core replace
the callbacks with stubs when pmu is unregistered - that would even kill
the need for i915's checks on pmu->closed (and fix the lack thereof in
other drivers).

It can be a can of worms though and may be pushed back by perf core
maintainers, so it'd be good have their feedback.


I don't think I understand the problem. I also don't understand drivers
much -- so that might be the problem.


We can bind/unbind the driver to/from the pci device. Example:

echo -n "0000:00:02.0" > /sys/bus/pci/drivers/i915/unbind

This is essentially unplugging the HW from the kernel.  This will
trigger the driver to deinitialize and free up all resources used by that
device.

So when the driver is binding it does:

perf_pmu_register();

When it's unbinding:

perf_pmu_unregister();

Reasons to unbind include:

- driver testing so then we can unload the module and load it
  again
- device is toast - doing an FLR and rebinding may
  fix/workaround it
- For SR-IOV, in which we are exposing multiple instances of the
  same underlying PCI device, we may need to bind/unbind
  on-demand  (it's not yet clear if perf_pmu_register() would be
  called on the VF instances, but listed here just to explain
  the bind/unbind)
- Hotplug



So the problem appears to be that the device just disappears without
warning? How can a GPU go away like that?

Since you have a notion of this device, can't you do this stubbing you
talk about? That is, if your internal device reference becomes NULL, let
the PMU methods preserve the state like no-ops.


It's not clear if you are suggesting these stubs to be in each driver or
to be assigned by perf in perf_pmu_unregister(). Some downsides
of doing it in the driver:

- you can't remove the module as perf will still call module
  code
- need to replicate the stubs in every driver (or the
  equivalent of pmu->closed checks in i915_pmu.c)

I *think* the stubs would be quite the same for every device, so it could
be feasible to share them inside perf. Alternatively it could simply
shortcut the call, without stubs, by looking at event->hw.state and
a new pmu->state. I can give this a try.


trying to revive these patches to get this fixed. Are you ok with one of
those approaches, i.e. either a) add the stubs on the perf side or b)
shortcut the calls after perf_pmu_unregister() is called?


Lucas De Marchi



thanks
Lucas De Marchi



And then when the last event goes away, tear down the whole thing.

Again, I'm not sure I'm following.


Re: [PATCH v4 1/3] drm: Introduce device wedged event

2024-09-09 Thread Lucas De Marchi

On Sun, Sep 08, 2024 at 11:08:39PM GMT, Asahi Lina wrote:



On 9/8/24 12:07 AM, Lucas De Marchi wrote:

On Sat, Sep 07, 2024 at 08:38:30PM GMT, Asahi Lina wrote:



On 9/6/24 6:42 PM, Raag Jadav wrote:

Introduce device wedged event, which will notify userspace of wedged
(hung/unusable) state of the DRM device through a uevent. This is
useful especially in cases where the device is in unrecoverable state
and requires userspace intervention for recovery.

Purpose of this implementation is to be vendor agnostic. Userspace
consumers (sysadmin) can define udev rules to parse this event and
take respective action to recover the device.

Consumer expectations:
--
1) Unbind driver
2) Reset bus device
3) Re-bind driver


Is this supposed to be normative? For drm/asahi we have a "wedged"
concept (firmware crashed), but the only possible recovery action is a
full system reboot (which might still be desirable to allow userspace to
trigger automatically in some scenarios) since there is no bus-level
reset and no firmware reload possible.


maybe let drivers hint possible/supported recovery mechanisms and then
sysadmin chooses what to do?


How would we do this? A textual value for the event or something like
that? ("WEDGED=bus-reset" vs "WEDGED=reboot"?)


If there's a need for more than one, then I think exposing the supported
ones sorted by "side effect" in sysfs would be good. Something like:

$ cat /sys/class/drm/card0/device/wedge_recover
rebind
bus-reset
reboot

Although if there is actually an unrecoverable state like "reboot", you
could simply remove the underlying device from the kernel side, with no
userspace intervention.

Lucas De Marchi



~~ Lina


Re: [PATCH v4 1/3] drm: Introduce device wedged event

2024-09-07 Thread Lucas De Marchi

On Sat, Sep 07, 2024 at 08:38:30PM GMT, Asahi Lina wrote:



On 9/6/24 6:42 PM, Raag Jadav wrote:

Introduce device wedged event, which will notify userspace of wedged
(hung/unusable) state of the DRM device through a uevent. This is
useful especially in cases where the device is in unrecoverable state
and requires userspace intervention for recovery.

Purpose of this implementation is to be vendor agnostic. Userspace
consumers (sysadmin) can define udev rules to parse this event and
take respective action to recover the device.

Consumer expectations:
--
1) Unbind driver
2) Reset bus device
3) Re-bind driver
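The three consumer steps above could be scripted roughly as below (a
sketch only; the device address, driver name, and the parameterized sysfs
root are illustrative, the latter so the function can be exercised against
a scratch directory):

```shell
#!/bin/sh
# Hedged sketch of the unbind -> bus reset -> re-bind recovery sequence,
# written against the standard sysfs bind/unbind/reset attributes.
recover_device() {
    dev="$1"            # e.g. 0000:03:00.0
    drv="$2"            # e.g. xe
    root="${3:-/sys}"   # sysfs root, overridable for testing

    echo "$dev" > "$root/bus/pci/drivers/$drv/unbind"
    echo 1      > "$root/bus/pci/devices/$dev/reset"
    echo "$dev" > "$root/bus/pci/drivers/$drv/bind"
}
```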


Is this supposed to be normative? For drm/asahi we have a "wedged"
concept (firmware crashed), but the only possible recovery action is a
full system reboot (which might still be desirable to allow userspace to
trigger automatically in some scenarios) since there is no bus-level
reset and no firmware reload possible.


maybe let drivers hint possible/supported recovery mechanisms and then
sysadmin chooses what to do?

Lucas De Marchi


Re: linux-next: manual merge of the drm-xe tree with the drm-intel tree

2024-09-05 Thread Lucas De Marchi

On Fri, Sep 06, 2024 at 01:15:02PM GMT, Stephen Rothwell wrote:

Hi all,

Today's linux-next merge of the drm-xe tree got a conflict in:

 drivers/gpu/drm/xe/display/xe_display.c

between commit:

 11d0613af7c5 ("drm/i915/display: include drm/drm_probe_helper.h where needed")

from the drm-intel tree and commit:

  87d8ecf01544 ("drm/xe: replace #include <drm/xe_drm.h> with <uapi/drm/xe_drm.h>")

from the drm-xe tree.

I fixed it up (see below) and can carry the fix as necessary. This


looks good, thanks

Lucas De Marchi


is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

--
Cheers,
Stephen Rothwell

diff --cc drivers/gpu/drm/xe/display/xe_display.c
index 303d00b99a68,75736faf2a80..
--- a/drivers/gpu/drm/xe/display/xe_display.c
+++ b/drivers/gpu/drm/xe/display/xe_display.c
@@@ -10,8 -10,7 +10,8 @@@

 #include 
 #include 
+#include 
- #include 
+ #include 

 #include "soc/intel_dram.h"
 #include "i915_drv.h"/* FIXME: HAS_DISPLAY() depends on this */





[PULL] drm-xe-next

2024-09-05 Thread Lucas De Marchi

Hi Dave and Simona,

Last drm-xe-next pull for 6.12. I was going to bring only the fixes, but
that would be basically most of the changes available. And what's not a
fix is pretty trivial stuff. So I'm including everything.

The one outstanding thing is a prep patch to dma-buf that was needed to
properly replace a fix we reverted last week. It has been tested for
more than a week and has proved better than the previous one.

thanks
Lucas De Marchi

drm-xe-next-2024-09-05:
Cross-subsystem Changes:
- Split dma fence array creation into alloc and arm (Matthew Brost)

Driver Changes:
- Move kernel_lrc to execlist backend (Ilia)
- Fix type width for pcode command (Karthik)
- Make xe_drm.h include unambiguous (Jani)
- Fixes and debug improvements for GSC load (Daniele)
- Track resources and VF state by PF (Michal Wajdeczko)
- Fix memory leak on error path (Nirmoy)
- Cleanup header includes (Matt Roper)
- Move pcode logic to tile scope (Matt Roper)
- Move hwmon logic to device scope (Matt Roper)
- Fix media TLB invalidation (Matthew Brost)
- Threshold config fixes for PF (Michal Wajdeczko)
- Remove extra "[drm]" from logs (Michal Wajdeczko)
- Add missing runtime ref (Rodrigo Vivi)
- Fix circular locking on runtime suspend (Rodrigo Vivi)
- Fix rpm in TTM swapout path (Thomas)
The following changes since commit 3adcf970dc7ec0469ec3116a5a8be9161d17a335:

  drm/xe/bmg: Drop force_probe requirement (2024-08-28 10:47:03 -0700)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-next-2024-09-05

for you to fetch changes up to 34bb7b813ab398106f700b0a6b218509bb0b904c:

  drm/xe: Use xe_pm_runtime_get in xe_bo_move() if reclaim-safe. (2024-09-04 
09:28:09 +0200)


Cross-subsystem Changes:
- Split dma fence array creation into alloc and arm (Matthew Brost)

Driver Changes:
- Move kernel_lrc to execlist backend (Ilia)
- Fix type width for pcode command (Karthik)
- Make xe_drm.h include unambiguous (Jani)
- Fixes and debug improvements for GSC load (Daniele)
- Track resources and VF state by PF (Michal Wajdeczko)
- Fix memory leak on error path (Nirmoy)
- Cleanup header includes (Matt Roper)
- Move pcode logic to tile scope (Matt Roper)
- Move hwmon logic to device scope (Matt Roper)
- Fix media TLB invalidation (Matthew Brost)
- Threshold config fixes for PF (Michal Wajdeczko)
- Remove extra "[drm]" from logs (Michal Wajdeczko)
- Add missing runtime ref (Rodrigo Vivi)
- Fix circular locking on runtime suspend (Rodrigo Vivi)
- Fix rpm in TTM swapout path (Thomas)


Daniele Ceraolo Spurio (5):
  drm/xe/gsc: Do not attempt to load the GSC multiple times
  drm/xe/gsc: Fix FW status if the firmware is already loaded
  drm/xe/gsc: Track the platform in the compatibility version
  drm/xe/gsc: Add debugfs to print GSC info
  drm/xe/gsc: Wedge the device if the GSCCS reset fails

Ilia Levi (1):
  drm/xe: move the kernel lrc from hwe to execlist port

Jani Nikula (1):
  drm/xe: replace #include <drm/xe_drm.h> with <uapi/drm/xe_drm.h>

Karthik Poosa (1):
  drm/xe/hwmon: Fix WRITE_I1 param from u32 to u16

Matt Roper (3):
  drm/xe/display: Drop unnecessary xe_gt.h includes
  drm/xe/pcode: Treat pcode as per-tile rather than per-GT
  drm/xe/hwmon: Treat hwmon as a per-device concept

Matthew Brost (2):
  dma-buf: Split out dma fence array create into alloc and arm functions
  drm/xe: Invalidate media_gt TLBs in PT code

Michal Wajdeczko (7):
  drm/xe/pf: Add function to sanitize VF resources
  drm/xe/pf: Fix documentation formatting
  drm/xe/pf: Drop GuC notifications for non-existing VF
  drm/xe/pf: Improve VF control
  drm/xe/pf: Add thresholds to the VF KLV config
  drm/xe/pf: Reset thresholds when releasing a VF config
  drm/xe: Remove redundant [drm] tag from xe_assert() message

Nirmoy Das (1):
  drm/xe: Fix memory leak on xe_alloc_pf_queue failure

Rodrigo Vivi (2):
  drm/xe: Add missing runtime reference to wedged upon gt_reset
  drm/xe/display: Avoid encoder_suspend at runtime suspend

Thomas Hellström (1):
  drm/xe: Use xe_pm_runtime_get in xe_bo_move() if reclaim-safe.

 drivers/dma-buf/dma-fence-array.c  |   78 +-
 drivers/gpu/drm/xe/Makefile|1 +
 .../gpu/drm/xe/compat-i915-headers/intel_pcode.h   |8 +-
 .../gpu/drm/xe/compat-i915-headers/intel_uncore.h  |7 +
 drivers/gpu/drm/xe/display/intel_fbdev_fb.c|1 -
 drivers/gpu/drm/xe/display/xe_display.c|8 +-
 drivers/gpu/drm/xe/display/xe_dsb_buffer.c |1 -
 drivers/gpu/drm/xe/display/xe_fb_pin.c |1 -
 drivers/gpu/drm/xe/display/xe_hdcp_gsc.c   |1 -
 drivers/gpu/drm/xe/regs/xe_gsc_regs.h  |4 +
 drivers/gpu/drm/xe/tests/xe_dma_buf.c  |

[PULL] drm-xe-next

2024-08-28 Thread Lucas De Marchi

Hi Dave and Sima,

Second drm-xe-next pull request for the 6.12 cycle. This includes the
pull request from last week since it was not applied due to the ttm
patch. That patch is now reverted and replacement on the back burner.

The only UAPI change is actually a fix for building with gcc 5.

Aside from the additional fixes compared to last week, there are two
important patches removing the force_probe requirement for LNL and BMG. Those
are the first platforms to be officially supported by the xe driver: one
integrated and one discrete. For BMG there are still some necessary
changes going through the drm-intel-next pull request later this week.

Cheers,
Lucas De Marchi

drm-xe-next-2024-08-28:
UAPI Changes:
- Fix OA format masks which were breaking build with gcc-5

Cross-subsystem Changes:

Driver Changes:
- Use dma_fence_chain_free in chain fence unused as a sync (Matthew Brost)
- Refactor hw engine lookup and mmio access to be used in more places
  (Dominik, Matt Auld, Mika Kuoppala)
- Enable priority mem read for Xe2 and later (Pallavi Mishra)
- Fix PL1 disable flow in xe_hwmon_power_max_write (Karthik)
- Fix refcount and speedup devcoredump (Matthew Brost)
- Add performance tuning changes to Xe2 (Akshata, Shekhar)
- Fix OA sysfs entry (Ashutosh)
- Add first GuC firmware support for BMG (Julia)
- Bump minimum GuC firmware for platforms under force_probe to match LNL
  and BMG (Julia)
- Fix access check on user fence creation (Nirmoy)
- Add/document workarounds for Xe2 (Julia, Daniele, John, Tejas)
- Document workaround and use proper WA infra (Matt Roper)
- Fix VF configuration on media GT (Michal Wajdeczko)
- Fix VM dma-resv lock (Matthew Brost)
- Allow suspend/resume exec queue backend op to be called multiple times
  (Matthew Brost)
- Add GT stats to debugfs (Nirmoy)
- Add hwconfig to debugfs (Matt Roper)
- Compile out all debugfs code with CONFIG_DEBUG_FS=n (Lucas)
- Remove dead kunit code (Jani Nikula)
- Refactor drvdata storing to help display (Jani Nikula)
- Cleanup unused xe parameter in pte handling (Himal)
- Rename s/enable_display/probe_display/ for clarity (Lucas)
- Fix missing MCR annotation in a couple of registers (Tejas)
- Fix DGFX display suspend/resume (Maarten)
- Prepare exec_queue_kill for PXP handling (Daniele)
- Fix devm/drmm issues (Daniele, Matthew Brost)
- Fix tile and ggtt fini sequences (Matthew Brost)
- Fix crashes when probing without firmware in place (Daniele, Matthew Brost)
- Use xe_managed for kernel BOs (Daniele, Matthew Brost)
- Future-proof dss_per_group calculation by using hwconfig (Matt Roper)
- Use reserved copy engine for user binds on faulting devices
  (Matthew Brost)
- Allow mixing dma-fence jobs and long-running faulting jobs (Francois)
- Cleanup redundant arg when creating user BO (Nirmoy)
- Prevent UAF around preempt fence (Auld)
- Fix display suspend/resume (Maarten)
- Use vma_pages() helper (Thorsten)
- Calculate pagefault queue size (Stuart, Matthew Auld)
- Fix missing pagefault wq destroy (Stuart)
- Fix lifetime handling of HW fence ctx (Matthew Brost)
- Fix destroy order for jobs (Matthew Brost)
- Fix TLB invalidation for media GT (Matthew Brost)
- Document GGTT (Rodrigo Vivi)
- Refactor GGTT layering and fix runtime outer protection (Rodrigo Vivi)
- Handle HPD polling on display pm runtime suspend/resume (Imre, Vinod)
- Drop unrequired NULL checks (Apoorva, Himal)
- Use separate rpm lockdep map for non-d3cold-capable devices (Thomas Hellström)
- Support "nomodeset" kernel command-line option (Thomas Zimmermann)
- Drop force_probe requirement for LNL and BMG (Lucas, Balasubramani)
The following changes since commit a809b92ee0f84c3f655b16a8b4d04bc3665d954a:

  Merge tag 'drm-intel-next-2024-08-13' of 
https://gitlab.freedesktop.org/drm/i915/kernel into drm-next (2024-08-16 
12:56:42 +1000)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-next-2024-08-28

for you to fetch changes up to 3adcf970dc7ec0469ec3116a5a8be9161d17a335:

  drm/xe/bmg: Drop force_probe requirement (2024-08-28 10:47:03 -0700)


UAPI Changes:
- Fix OA format masks which were breaking build with gcc-5

Cross-subsystem Changes:

Driver Changes:
- Use dma_fence_chain_free in chain fence unused as a sync (Matthew Brost)
- Refactor hw engine lookup and mmio access to be used in more places
  (Dominik, Matt Auld, Mika Kuoppala)
- Enable priority mem read for Xe2 and later (Pallavi Mishra)
- Fix PL1 disable flow in xe_hwmon_power_max_write (Karthik)
- Fix refcount and speedup devcoredump (Matthew Brost)
- Add performance tuning changes to Xe2 (Akshata, Shekhar)
- Fix OA sysfs entry (Ashutosh)
- Add first GuC firmware support for BMG (Julia)
- Bump minimum GuC firmware for platforms under force_probe to match LNL
  and BMG (Julia)
- Fix access check on user fence creation (Nirmoy)
- Add/document workarounds for Xe2 (Julia, Daniele, John, Tejas)
- Documen

Re: [PATCH v2] drm/xe: Support 'nomodeset' kernel command-line option

2024-08-27 Thread Lucas De Marchi


On Tue, 27 Aug 2024 14:09:05 +0200, Thomas Zimmermann wrote:
> Setting 'nomodeset' on the kernel command line disables all graphics
> drivers with modesetting capabilities, leaving only firmware drivers,
> such as simpledrm or efifb.
> 
> Most DRM drivers automatically support 'nomodeset' via DRM's module
> helper macros. In xe, which uses regular module_init(), manually call
> drm_firmware_drivers_only() to test for 'nomodeset'. Do not register
> the driver if set.
> 
> [...]

Applied to drm-xe-next, thanks!

[1/1] drm/xe: Support 'nomodeset' kernel command-line option
  commit: 014125c64d09e58e90dde49fbb57d802a13e2559

Best regards,
-- 
Lucas De Marchi 


Re: [PATCH 2/2] Revert "drm/ttm: Add a flag to allow drivers to skip clear-on-free"

2024-08-27 Thread Lucas De Marchi

On Tue, Aug 27, 2024 at 11:42:59AM GMT, Thomas Hellström wrote:

On Wed, 2024-08-21 at 11:50 +0200, Nirmoy Das wrote:

Remove TTM_TT_FLAG_CLEARED_ON_FREE now that XE stopped using this
flag.

This reverts commit decbfaf06db05fa1f9b33149ebb3c145b44e878f.

Cc: Christian König 
Cc: Himal Prasad Ghimiray 
Cc: Matthew Auld 
Cc: Matthew Brost 
Cc: Thomas Hellström 
Signed-off-by: Nirmoy Das 


Reviewed-by: Thomas Hellström 


Christian, since that commit being reverted was merged through drm-xe...
ack on reverting it in our branch? It's actually the only branch where
it can be reverted. AFAICS you agreed with the revert in
https://lore.kernel.org/intel-xe/02a383c5-db18-47ef-9118-72effd83a...@amd.com/


thanks
Lucas De Marchi


Re: [PATCH] drm/xe: Support 'nomodeset' kernel command-line option

2024-08-26 Thread Lucas De Marchi

On Wed, Aug 21, 2024 at 04:48:23PM GMT, Thomas Zimmermann wrote:

Hi

Am 21.08.24 um 16:29 schrieb Jani Nikula:

On Wed, 21 Aug 2024, Gustavo Sousa  wrote:

Quoting Thomas Zimmermann (2024-08-21 10:56:59-03:00)

Setting 'nomodeset' on the kernel command line disables all graphics
drivers with modesetting capabilities; leaving only firmware drivers,
such as simpledrm or efifb.

Most DRM drivers automatically support 'nomodeset' via DRM's module
helper macros. In xe, which uses regular module_init(), manually call
drm_firmware_drivers_only() to test for 'nomodeset'. Do not register
the driver if set.

Signed-off-by: Thomas Zimmermann 
---
drivers/gpu/drm/xe/xe_module.c | 5 +
1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c
index 923460119cec..60fb7dd26903 100644
--- a/drivers/gpu/drm/xe/xe_module.c
+++ b/drivers/gpu/drm/xe/xe_module.c
@@ -8,6 +8,8 @@
#include 
#include 

+#include 
+
#include "xe_drv.h"
#include "xe_hw_fence.h"
#include "xe_pci.h"
@@ -92,6 +94,9 @@ static int __init xe_init(void)
{
int err, i;

+if (drm_firmware_drivers_only())
+return -ENODEV;
+

Hm... But what if xe is to be used only for compute or render? Shouldn't
we handle this somewhere else?

The question becomes, what does "nomodeset" really mean here?


That function's name 'firmware drivers only' says it better than the 
option's name. We used 'nomodeset', because it was there already and 
had the correct semantics.


agreed this should be on a module-level to maintain the behavior already
used. If we were not maintaining that behavior, then we should probably
not use "nomodeset" and choose something else :).

Also we already have the other 2 as module params:  probe_display and
disable_display, with driver still registering as a drm driver, but
leaving the display part out.

Thomas, are you going to send a v2 to use the init table?

thanks
Lucas De Marchi





See what i915 does in i915_module.c.


i915 and the other drivers for PCI-based hardware don't load at all. 
Drivers for external displays (e.g., SPI, USB) ignore nomodeset, as 
these displays are not initialized by firmware.


Best regards
Thomas



Cc: Sima.

BR,
Jani.




Taking a quick look, xe_display_probe() might be a good candidate?

--
Gustavo Sousa


for (i = 0; i < ARRAY_SIZE(init_funcs); i++) {
err = init_funcs[i].init();
if (err) {
--
2.46.0



--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)



Re: [PATCH] drm/xe: Fix total initialization in xe_ggtt_print_holes()

2024-08-24 Thread Lucas De Marchi

On Fri, Aug 23, 2024 at 08:47:13PM GMT, Nathan Chancellor wrote:

Clang warns (or errors with CONFIG_DRM_WERROR or CONFIG_WERROR):

 drivers/gpu/drm/xe/xe_ggtt.c:810:3: error: variable 'total' is uninitialized 
when used here [-Werror,-Wuninitialized]
   810 | total += hole_size;
   | ^
 drivers/gpu/drm/xe/xe_ggtt.c:798:11: note: initialize the variable 'total' to 
silence this warning
   798 | u64 total;
   |  ^
   |   = 0
 1 error generated.

Move the zero initialization of total from
xe_gt_sriov_pf_config_print_available_ggtt() to xe_ggtt_print_holes() to
resolve the warning.

Fixes: 136367290ea5 ("drm/xe: Introduce xe_ggtt_print_holes")
Signed-off-by: Nathan Chancellor 



Reviewed-by: Lucas De Marchi 

I will push it soon to drm-xe-next, thanks.

Lucas De Marchi


---
drivers/gpu/drm/xe/xe_ggtt.c   | 2 +-
drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
index a3ce91decdce..86fc6afa43bd 100644
--- a/drivers/gpu/drm/xe/xe_ggtt.c
+++ b/drivers/gpu/drm/xe/xe_ggtt.c
@@ -795,7 +795,7 @@ u64 xe_ggtt_print_holes(struct xe_ggtt *ggtt, u64 
alignment, struct drm_printer
const struct drm_mm_node *entry;
u64 hole_min_start = xe_wopcm_size(tile_to_xe(ggtt->tile));
u64 hole_start, hole_end, hole_size;
-   u64 total;
+   u64 total = 0;
char buf[10];

mutex_lock(&ggtt->lock);
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c 
b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
index c8835d9eead6..41ed07b153b5 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
@@ -2110,7 +2110,7 @@ int xe_gt_sriov_pf_config_print_available_ggtt(struct 
xe_gt *gt, struct drm_prin
{
struct xe_ggtt *ggtt = gt_to_tile(gt)->mem.ggtt;
u64 alignment = pf_get_ggtt_alignment(gt);
-   u64 spare, avail, total = 0;
+   u64 spare, avail, total;
char buf[10];

xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));

---
base-commit: 66a0f6b9f5fc205272035b6ffa4830be51e3f787
change-id: 20240823-drm-xe-fix-total-in-xe_ggtt_print_holes-0cf2c399aa2a

Best regards,
--
Nathan Chancellor 



[PULL] drm-xe-next

2024-08-22 Thread Lucas De Marchi

Hi Dave and Sima,

Second drm-xe-next pull request for the 6.12 cycle.

Several fixes here that are also going through drm-xe-fixes. Most of the
other changes are the missing bits for Xe2 (LNL and BMG) and general
refactors.

The only UAPI change is actually a fix for building with gcc 5.

I left some commits out since they were too fresh and I didn't have core
CI results for them. Plan is to send the final pull request for 6.12
next week with those commits.

thanks
Lucas De Marchi

drm-xe-next-2024-08-22:
UAPI Changes:
- Fix OA format masks which were breaking build with gcc-5

Cross-subsystem Changes:
- drm/ttm: Add a flag to allow drivers to skip clear-on-free

Driver Changes:
- Use dma_fence_chain_free in chain fence unused as a sync (Matthew Brost)
- Refactor hw engine lookup and mmio access to be used in more places
  (Dominik, Matt Auld, Mika Kuoppala)
- Enable priority mem read for Xe2 and later (Pallavi Mishra)
- Fix PL1 disable flow in xe_hwmon_power_max_write (Karthik)
- Fix refcount and speedup devcoredump (Matthew Brost)
- Add performance tuning changes to Xe2 (Akshata, Shekhar)
- Fix OA sysfs entry (Ashutosh)
- Add first GuC firmware support for BMG (Julia)
- Bump minimum GuC firmware for platforms under force_probe to match LNL
  and BMG (Julia)
- Fix access check on user fence creation (Nirmoy)
- Add/document workarounds for Xe2 (Julia, Daniele, John, Tejas)
- Document workaround and use proper WA infra (Matt Roper)
- Fix VF configuration on media GT (Michal Wajdeczko)
- Fix VM dma-resv lock (Matthew Brost)
- Allow suspend/resume exec queue backend op to be called multiple times
  (Matthew Brost)
- Add GT stats to debugfs (Nirmoy)
- Add hwconfig to debugfs (Matt Roper)
- Compile out all debugfs code with CONFIG_DEBUG_FS=n (Lucas)
- Offload system clear page to GPU (Nirmoy)
- Remove dead kunit code (Jani Nikula)
- Refactor drvdata storing to help display (Jani Nikula)
- Cleanup unused xe parameter in pte handling (Himal)
- Rename s/enable_display/probe_display/ for clarity (Lucas)
- Fix missing MCR annotation in a couple of registers (Tejas)
- Fix DGFX display suspend/resume (Maarten)
- Prepare exec_queue_kill for PXP handling (Daniele)
- Fix devm/drmm issues (Daniele, Matthew Brost)
- Fix tile fini sequence (Brost)
- Fix crashes when probing without firmware in place (Daniele)
- Use xe_managed for kernel BOs (Daniele, Matthew Brost)
- Future-proof dss_per_group calculation by using hwconfig (Matt Roper)
- Use reserved copy engine for user binds on faulting devices
  (Matthew Brost)
- Allow mixing dma-fence jobs and long-running faulting jobs (Francois)
- Cleanup redundant arg when creating user BO (Nirmoy)
- Prevent UAF around preempt fence (Auld)
- Fix display suspend/resume (Maarten)
- Use vma_pages() helper (Thorsten)
The following changes since commit a809b92ee0f84c3f655b16a8b4d04bc3665d954a:

  Merge tag 'drm-intel-next-2024-08-13' of 
https://gitlab.freedesktop.org/drm/i915/kernel into drm-next (2024-08-16 
12:56:42 +1000)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-next-2024-08-22

for you to fetch changes up to 23683061805be368c8d1c7e7ff52abc470cac275:

  drm/xe/lnl: Offload system clear page activity to GPU (2024-08-19 17:49:00 
+0200)


UAPI Changes:
- Fix OA format masks which were breaking build with gcc-5

Cross-subsystem Changes:
- drm/ttm: Add a flag to allow drivers to skip clear-on-free

Driver Changes:
- Use dma_fence_chain_free in chain fence unused as a sync (Matthew Brost)
- Refactor hw engine lookup and mmio access to be used in more places
  (Dominik, Matt Auld, Mika Kuoppala)
- Enable priority mem read for Xe2 and later (Pallavi Mishra)
- Fix PL1 disable flow in xe_hwmon_power_max_write (Karthik)
- Fix refcount and speedup devcoredump (Matthew Brost)
- Add performance tuning changes to Xe2 (Akshata, Shekhar)
- Fix OA sysfs entry (Ashutosh)
- Add first GuC firmware support for BMG (Julia)
- Bump minimum GuC firmware for platforms under force_probe to match LNL
  and BMG (Julia)
- Fix access check on user fence creation (Nirmoy)
- Add/document workarounds for Xe2 (Julia, Daniele, John, Tejas)
- Document workaround and use proper WA infra (Matt Roper)
- Fix VF configuration on media GT (Michal Wajdeczko)
- Fix VM dma-resv lock (Matthew Brost)
- Allow suspend/resume exec queue backend op to be called multiple times
  (Matthew Brost)
- Add GT stats to debugfs (Nirmoy)
- Add hwconfig to debugfs (Matt Roper)
- Compile out all debugfs code with CONFIG_DEBUG_FS=n (Lucas)
- Offload system clear page to GPU (Nirmoy)
- Remove dead kunit code (Jani Nikula)
- Refactor drvdata storing to help display (Jani Nikula)
- Cleanup unused xe parameter in pte handling (Himal)
- Rename s/enable_display/probe_display/ for clarity (Lucas)
- Fix missing MCR annotation in a couple of registers (Tejas)
- Fix DGFX display suspend/resume (Maarten)
- Prepare exec_queue_kill for PXP handling (Daniele)

Re: [PATCH] drm/xe: Support 'nomodeset' kernel command-line option

2024-08-21 Thread Lucas De Marchi

On Wed, Aug 21, 2024 at 03:56:59PM GMT, Thomas Zimmermann wrote:

Setting 'nomodeset' on the kernel command line disables all graphics
drivers with modesetting capabilities; leaving only firmware drivers,
such as simpledrm or efifb.

Most DRM drivers automatically support 'nomodeset' via DRM's module
helper macros. In xe, which uses regular module_init(), manually call
drm_firmware_drivers_only() to test for 'nomodeset'. Do not register
the driver if set.


I see some drivers like i915 and radeon using an additional 'modeset'
parameter... probably to be able to avoid modeset for that specific
driver while still allowing for others?



Signed-off-by: Thomas Zimmermann 
---
drivers/gpu/drm/xe/xe_module.c | 5 +
1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c
index 923460119cec..60fb7dd26903 100644
--- a/drivers/gpu/drm/xe/xe_module.c
+++ b/drivers/gpu/drm/xe/xe_module.c
@@ -8,6 +8,8 @@
#include 
#include 

+#include 
+
#include "xe_drv.h"
#include "xe_hw_fence.h"
#include "xe_pci.h"
@@ -92,6 +94,9 @@ static int __init xe_init(void)
{
int err, i;

+   if (drm_firmware_drivers_only())
+   return -ENODEV;


nit: being the first, without an .exit may be equivalent, but probably
better for parity with i915 to use a xe_check_modeset() and add it as
the first one in the table. We'd need to check for exit != NULL, though.

Anyway,

Reviewed-by: Lucas De Marchi 

thanks
Lucas De Marchi


+
for (i = 0; i < ARRAY_SIZE(init_funcs); i++) {
err = init_funcs[i].init();
if (err) {
--
2.46.0



Re: [PATCH v2] drm/i915/gt: Use kmemdup_array instead of kmemdup for multiple allocation

2024-08-20 Thread Lucas De Marchi

On Tue, Aug 20, 2024 at 05:53:02PM GMT, Yu Jiaoliang wrote:

Let the kmemdup_array() take care of multiplication and possible
overflows.

v2:
- Change subject
- Leave one blank line between the commit log and the tag section
- Fix code alignment issue

Signed-off-by: Yu Jiaoliang 
Reviewed-by: Jani Nikula 
Reviewed-by: Andi Shyti 
---
drivers/gpu/drm/i915/gt/intel_workarounds.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index d90348c56765..0fcfd55c62b4 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -111,9 +111,8 @@ static void wa_init_finish(struct i915_wa_list *wal)
{
/* Trim unused entries. */
if (!IS_ALIGNED(wal->count, WA_LIST_CHUNK)) {
-   struct i915_wa *list = kmemdup_array(wal->list,


^

it was already kmemdup_array, not kmemdup. Am I missing anything?

Lucas De Marchi


-  wal->count, sizeof(*list),
-  GFP_KERNEL);
+   struct i915_wa *list = kmemdup_array(wal->list, wal->count,
+   
 sizeof(*list), GFP_KERNEL);

if (list) {
kfree(wal->list);
--
2.34.1



Re: Build regressions/improvements in v6.11-rc4

2024-08-19 Thread Lucas De Marchi

On Mon, Aug 19, 2024 at 09:30:32AM GMT, Geert Uytterhoeven wrote:

On Mon, 19 Aug 2024, Geert Uytterhoeven wrote:

JFYI, when comparing v6.11-rc4[1] to v6.11-rc3[3], the summaries are:
- build errors: +6/-4


 + /kisskb/src/arch/sparc/vdso/vdso32/../vclock_gettime.c: error: no previous 
prototype for '__vdso_clock_gettime' [-Werror=missing-prototypes]:  => 254:1
 + /kisskb/src/arch/sparc/vdso/vdso32/../vclock_gettime.c: error: no previous 
prototype for '__vdso_clock_gettime_stick' [-Werror=missing-prototypes]:  => 
282:1
 + /kisskb/src/arch/sparc/vdso/vdso32/../vclock_gettime.c: error: no previous 
prototype for '__vdso_gettimeofday' [-Werror=missing-prototypes]:  => 307:1
 + /kisskb/src/arch/sparc/vdso/vdso32/../vclock_gettime.c: error: no previous 
prototype for '__vdso_gettimeofday_stick' [-Werror=missing-prototypes]:  => 
343:1

sparc64-gcc13/sparc64-allmodconfig
(pre-existing, but now emitted twice in this config only?)

 + /kisskb/src/include/linux/compiler_types.h: error: call to 
'__compiletime_assert_952' declared with attribute error: FIELD_GET: mask is not 
constant:  => 510:38
 + /kisskb/src/include/linux/compiler_types.h: error: call to 
'__compiletime_assert_953' declared with attribute error: FIELD_GET: mask is not 
constant:  => 510:38

powerpc-gcc5/powerpc-all{mod,yes}config

In function 'decode_oa_format.isra.26',
   inlined from 'xe_oa_set_prop_oa_format' at drivers/gpu/drm/xe/xe_oa.c:1664:6:
drivers/gpu/drm/xe/xe_oa.c:1573:18: note: in expansion of macro 'FIELD_GET'
 u32 bc_report = FIELD_GET(DRM_XE_OA_FORMAT_MASK_BC_REPORT, fmt);
 ^
Seen before, patch available.


+Rodrigo

This patch needs to propagate to drm-xe-fixes:
f2881dfdaaa9 ("drm/xe/oa/uapi: Make bit masks unsigned")

Lucas De Marchi



Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
- Linus Torvalds


Re: [PATCH 2/3] drm/fourcc: define Intel Xe2 related tile4 ccs modifiers

2024-08-16 Thread Lucas De Marchi

On Fri, Aug 16, 2024 at 02:52:28PM GMT, Juha-Pekka Heikkila wrote:

Add Tile4 type ccs modifiers to indicate presence of compression on Xe2.
Here is defined I915_FORMAT_MOD_4_TILED_LNL_CCS which is meant for
integrated graphics with igpu related limitations
Here is also defined I915_FORMAT_MOD_4_TILED_BMG_CCS which is meant
for discrete graphics with dgpu related limitations

Signed-off-by: Juha-Pekka Heikkila 


not very fond of adding the platform names, but looks like this was
always the approach, so this keeps the consistency.

Reviewed-by: Lucas De Marchi 

We will need an ack from drm-misc maintainer to merge this through
drm-intel. Let's add some Cc.

Lucas De Marchi


---
include/uapi/drm/drm_fourcc.h | 25 +
1 file changed, 25 insertions(+)

diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
index 2d84a8052b15..78abd819fd62 100644
--- a/include/uapi/drm/drm_fourcc.h
+++ b/include/uapi/drm/drm_fourcc.h
@@ -702,6 +702,31 @@ extern "C" {
 */
#define I915_FORMAT_MOD_4_TILED_MTL_RC_CCS_CC fourcc_mod_code(INTEL, 15)

+/*
+ * Intel Color Control Surfaces (CCS) for graphics ver. 20 unified compression
+ * on integrated graphics
+ *
+ * The main surface is Tile 4 and at plane index 0. For semi-planar formats
+ * like NV12, the Y and UV planes are Tile 4 and are located at plane indices
+ * 0 and 1, respectively. The CCS for all planes are stored outside of the
+ * GEM object in a reserved memory area dedicated for the storage of the
+ * CCS data for all compressible GEM objects.
+ */
+#define I915_FORMAT_MOD_4_TILED_LNL_CCS fourcc_mod_code(INTEL, 16)
+
+/*
+ * Intel Color Control Surfaces (CCS) for graphics ver. 20 unified compression
+ * on discrete graphics
+ *
+ * The main surface is Tile 4 and at plane index 0. For semi-planar formats
+ * like NV12, the Y and UV planes are Tile 4 and are located at plane indices
+ * 0 and 1, respectively. The CCS for all planes are stored outside of the
+ * GEM object in a reserved memory area dedicated for the storage of the
+ * CCS data for all compressible GEM objects. The GEM object must be stored in
+ * contiguous memory with a size aligned to 64KB
+ */
+#define I915_FORMAT_MOD_4_TILED_BMG_CCS fourcc_mod_code(INTEL, 17)
+
/*
 * Tiled, NV12MT, grouped in 64 (pixels) x 32 (lines) -sized macroblocks
 *
--
2.45.2



Re: [PATCH v2] drm/xe/uapi: Bring back reset uevent

2024-08-15 Thread Lucas De Marchi

On Fri, Aug 16, 2024 at 09:59:31AM GMT, Aravind Iddamsetty wrote:


On 14/08/24 19:54, Lucas De Marchi wrote:

On Wed, Aug 14, 2024 at 12:16:41PM GMT, Aravind Iddamsetty wrote:

I see that this is even called from xe_guc_exec_queue_lr_cleanup, which is for
long-running queues; from the various call paths we need to check in what
scenarios those happen.

Should we add a reason for long running queue?

Both of your questions would probably be answered if this was getting developed
at the same time as the user-space usage of these uevents.


Can't we consider the generic Linux user as a consumer, via udevadm.


No, udevadm just confirms that there is an event being sent. Main thing
to understand here is "what is this event useful for? what can be done
as outcome of receiving this event?". From your other reply it seems
this is about "device is wedged/toast, and driver can't recover without
userspace intervention since userspace may have different policies"

What would be some examples of actions for userspace to take?

- Unbind driver, kill clients, rebind?
- FLR?
- Something else?


The expectation from userspace on receiving this event is to do a reset
most probably SBR(unbind->sbr->rebind).

The other requirement of this event is for L0 Sysman

https://spec.oneapi.io/level-zero/latest/sysman/api.html#_CPPv4N21zes_event_type_flag_t41ZES_EVENT_TYPE_FLAG_DEVICE_RESET_REQUIREDE

https://spec.oneapi.io/level-zero/latest/sysman/api.html#_CPPv4N18zes_device_state_t5resetE

So we expect the admin to do the above actions


so these things should be in the commit message. 


where's the pending implementation of this spec?  I imagine somewhere
we'd see something on top of libudev generating these L0 events. The
events themselves are not much more than confirming via udevadm, but
then at least we know the implementation covers the expectations from
the spec without having to come back later fixing up stuff.





Is this currently implemented somewhere?

Do you mean by some other driver or subsystem?


Also, is it possible to make this a generic drm event handling so distros
don't have to set up udev rules for each vendor?


I think doing this via a drm event mandates that the observability applications
(L0) have an open connection to the device to process these, which is not desired.


I'm not talking about keeping the device open. Here you are defining
xe-specific uevents. Could this be drm-specific so someone implementing
that specification you mentioned doesn't have to implement N different
uevents for N different vendors?  And a distro defining a policy via
udev rules doesn't have to do the same thing slightly differently N times?

Lucas De Marchi



Thanks,
Aravind.




Lucas De Marchi



Thanks,
Aravind.



Raag


Re: linux-next: manual merge of the drm-xe tree with the drm-intel tree

2024-08-15 Thread Lucas De Marchi

On Thu, Aug 15, 2024 at 11:37:17AM GMT, Stephen Rothwell wrote:

Hi all,

Today's linux-next merge of the drm-xe tree got a conflict in:

 drivers/gpu/drm/xe/display/xe_display.c

between commit:

 769b081c18b9 ("drm/i915/opregion: convert to struct intel_display")

from the drm-intel tree and commit:

 1eda95cba9df ("drm/xe: Rename enable_display module param")

from the drm-xe tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.


this matches our current merge and will be resolved when we backmerge
drm-next, before sending our next pull.

thanks
Lucas De Marchi



--
Cheers,
Stephen Rothwell

diff --cc drivers/gpu/drm/xe/display/xe_display.c
index 0e4adde84cb2,56a940b39412..
--- a/drivers/gpu/drm/xe/display/xe_display.c
+++ b/drivers/gpu/drm/xe/display/xe_display.c
@@@ -127,9 -126,8 +127,9 @@@ int xe_display_init_nommio(struct xe_de
 static void xe_display_fini_noirq(void *arg)
 {
struct xe_device *xe = arg;
+   struct intel_display *display = &xe->display;

-   if (!xe->info.enable_display)
+   if (!xe->info.probe_display)
return;

intel_display_driver_remove_noirq(xe);
@@@ -138,10 -135,9 +138,10 @@@

 int xe_display_init_noirq(struct xe_device *xe)
 {
+   struct intel_display *display = &xe->display;
int err;

-   if (!xe->info.enable_display)
+   if (!xe->info.probe_display)
return 0;

intel_display_driver_early_probe(xe);
@@@ -252,9 -246,7 +252,9 @@@ void xe_display_irq_handler(struct xe_d

 void xe_display_irq_enable(struct xe_device *xe, u32 gu_misc_iir)
 {
+   struct intel_display *display = &xe->display;
+
-   if (!xe->info.enable_display)
+   if (!xe->info.probe_display)
return;

if (gu_misc_iir & GU_MISC_GSE)
@@@ -289,9 -296,8 +289,9 @@@ static bool suspend_to_idle(void

 void xe_display_pm_suspend(struct xe_device *xe, bool runtime)
 {
+   struct intel_display *display = &xe->display;
bool s2idle = suspend_to_idle();
-   if (!xe->info.enable_display)
+   if (!xe->info.probe_display)
return;

/*
@@@ -341,9 -347,7 +341,9 @@@ void xe_display_pm_resume_early(struct

 void xe_display_pm_resume(struct xe_device *xe, bool runtime)
 {
+   struct intel_display *display = &xe->display;
+
-   if (!xe->info.enable_display)
+   if (!xe->info.probe_display)
return;

intel_dmc_resume(xe);





Re: [PATCH v2] drm/xe/uapi: Bring back reset uevent

2024-08-14 Thread Lucas De Marchi

On Wed, Aug 14, 2024 at 12:16:41PM GMT, Aravind Iddamsetty wrote:

I see that this is even called from xe_guc_exec_queue_lr_cleanup, which is for
long-running queues; from the various call paths we need to check in what
scenarios those happen.

Should we add a reason for long running queue?

Both of your questions would probably be answered if this was getting developed
at the same time as the user-space usage of these uevents.


Can't we consider the generic Linux user as a consumer, via udevadm.


No, udevadm just confirms that there is an event being sent. Main thing
to understand here is "what is this event useful for? what can be done
as outcome of receiving this event?". From your other reply it seems
this is about "device is wedged/toast, and driver can't recover without
userspace intervention since userspace may have different policies"

What would be some examples of actions for userspace to take?

- Unbind driver, kill clients, rebind?
- FLR?
- Something else?

Is this currently implemented somewhere?

Also, is it possible to make is a generic drm event handling so distros
don't have to setup udev rules for each vendor?


Lucas De Marchi



Thanks,
Aravind.



Raag


Re: [PATCH v2] drm/xe/uapi: Bring back reset uevent

2024-08-13 Thread Lucas De Marchi

On Mon, Aug 12, 2024 at 01:18:12PM GMT, Raag Jadav wrote:

From: Himal Prasad Ghimiray 

This was dropped in commit 77a0d4d1cea2 ("drm/xe/uapi: Remove reset
uevent for now") as part of refactoring.

Now that we have better uapi semantics and naming for the uevent,
bring it back. With this in place, userspace will be notified of
wedged device along with its reason.


Title of this patch, this paragraph and what's being done in the
implementation don't match since reset != wedged



$ udevadm monitor --property --kernel
monitor will print the received events for:
KERNEL - the kernel uevent

KERNEL[871.188570] change   
/devices/pci:00/:00:01.0/:01:00.0/:02:01.0/:03:00.0 (pci)
ACTION=change
DEVPATH=/devices/pci:00/:00:01.0/:01:00.0/:02:01.0/:03:00.0
SUBSYSTEM=pci
DEVICE_STATUS=NEEDS_RESET
REASON=GT_RESET_FAILED
TILE_ID=0
GT_ID=0
DRIVER=xe
PCI_CLASS=3
PCI_ID=8086:56B1
PCI_SUBSYS_ID=8086:1210
PCI_SLOT_NAME=:03:00.0
MODALIAS=pci:v8086d56B1sv8086sd1210bc03sc00i00
SEQNUM=6104

v2: Change authorship to Himal (Aravind)
   Add uevent for all device wedged cases (Aravind)

Signed-off-by: Himal Prasad Ghimiray 
Signed-off-by: Raag Jadav 
---
drivers/gpu/drm/xe/xe_device.c | 10 +-
drivers/gpu/drm/xe/xe_device.h |  2 +-
drivers/gpu/drm/xe/xe_gt.c | 23 +++
drivers/gpu/drm/xe/xe_guc.c| 13 -
drivers/gpu/drm/xe/xe_guc_submit.c | 13 -
include/uapi/drm/xe_drm.h  | 29 +
6 files changed, 82 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 1aba6f9eaa19..d975bdce4a7d 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -955,6 +955,7 @@ static void xe_device_wedged_fini(struct drm_device *drm, 
void *arg)
/**
 * xe_device_declare_wedged - Declare device wedged
 * @xe: xe device instance
+ * @event_params: parameters to be sent along with uevent
 *
 * This is a final state that can only be cleared with a module
 * re-probe (unbind + bind).
@@ -965,8 +966,10 @@ static void xe_device_wedged_fini(struct drm_device *drm, 
void *arg)
 * on every single execution timeout (a.k.a. GPU hang) right after devcoredump
 * snapshot capture. In this mode, GT reset won't be attempted so the state of
 * the issue is preserved for further debugging.
+ * Caller is expected to pass respective parameters to be sent along with
+ * uevent. Pass NULL in case of no params.
 */
-void xe_device_declare_wedged(struct xe_device *xe)
+void xe_device_declare_wedged(struct xe_device *xe, char **event_params)
{
struct xe_gt *gt;
u8 id;
@@ -984,12 +987,17 @@ void xe_device_declare_wedged(struct xe_device *xe)
xe_pm_runtime_get_noresume(xe);

if (!atomic_xchg(&xe->wedged.flag, 1)) {
+   struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
+


Given the names of the uevents, I'm not sure if the intention is to
notify when a gt reset happens or only when it fails
(and wedged.mode != 0)

Lucas De Marchi


xe->needs_flr_on_fini = true;
drm_err(&xe->drm,
"CRITICAL: Xe has declared device %s as wedged.\n"
"IOCTLs and executions are blocked. Only a rebind may clear 
the failure\n"
"Please file a _new_ bug report at 
https://gitlab.freedesktop.org/drm/xe/kernel/issues/new\n";,
dev_name(xe->drm.dev));
+
+   /* Notify userspace about reset required */
+   kobject_uevent_env(&pdev->dev.kobj, KOBJ_CHANGE, event_params);
}

for_each_gt(gt, xe, id)
diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index db6cc8d0d6b8..5d40fc6f0904 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -174,7 +174,7 @@ static inline bool xe_device_wedged(struct xe_device *xe)
return atomic_read(&xe->wedged.flag);
}

-void xe_device_declare_wedged(struct xe_device *xe);
+void xe_device_declare_wedged(struct xe_device *xe, char **reset_event);

struct xe_file *xe_file_get(struct xe_file *xef);
void xe_file_put(struct xe_file *xef);
diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index 58895ed22f6e..519f3c2cf9e2 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -741,6 +741,24 @@ static int do_gt_restart(struct xe_gt *gt)
return 0;
}

+static void xe_gt_reset_failed(struct xe_gt *gt, int err)
+{
+   char *event_params[5];
+
+   xe_gt_err(gt, "reset failed (%pe)\n", ERR_PTR(err));
+
+   event_params[0] = DRM_XE_RESET_REQUIRED_UEVENT;
+   event_params[1] = DRM_XE_RESET_REQUIRED_UEVENT_REASON_GT;
+   event_params[2] = kasprintf(GFP_KERNEL, "TILE_ID=%d", 
gt_to_tile(gt)->id);

Re: [PATCH v1] drm/xe/uapi: Bring back reset uevent

2024-08-07 Thread Lucas De Marchi

On Tue, Aug 06, 2024 at 10:02:31AM GMT, Raag Jadav wrote:

From: Lucas De Marchi 

Bring back uevent for gt reset failure with better uapi naming.
With this in place we can receive failure event using udev.

$ udevadm monitor --property --kernel
monitor will print the received events for:
KERNEL - the kernel uevent

KERNEL[871.188570] change   
/devices/pci:00/:00:01.0/:01:00.0/:02:01.0/:03:00.0 (pci)
ACTION=change
DEVPATH=/devices/pci:00/:00:01.0/:01:00.0/:02:01.0/:03:00.0
SUBSYSTEM=pci
DEVICE_STATUS=NEEDS_RESET
REASON=GT_RESET_FAILED
TILE_ID=0
GT_ID=0
DRIVER=xe
PCI_CLASS=3
PCI_ID=8086:56B1
PCI_SUBSYS_ID=8086:1210
PCI_SLOT_NAME=:03:00.0
MODALIAS=pci:v8086d56B1sv8086sd1210bc03sc00i00
SEQNUM=6104

Signed-off-by: Lucas De Marchi 


please drop my s-o-b here and don't add me as the author of this patch,
which I certainly am not.

You need to point to the commit where it was reverted and *why* it's ok
to have this uapi now.

Lucas De Marchi



Signed-off-by: Raag Jadav 
---
drivers/gpu/drm/xe/xe_gt.c | 27 +--
include/uapi/drm/xe_drm.h  | 17 +
2 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index b04e47186f5b..5ceef0059861 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -740,6 +740,30 @@ static int do_gt_restart(struct xe_gt *gt)
return 0;
}

+static void xe_uevent_gt_reset_failure(struct pci_dev *pdev, u8 tile_id, u8 
gt_id)
+{
+   char *reset_event[5];
+
+   reset_event[0] = DRM_XE_RESET_REQUIRED_UEVENT;
+   reset_event[1] = DRM_XE_RESET_REQUIRED_UEVENT_REASON_GT;
+   reset_event[2] = kasprintf(GFP_KERNEL, "TILE_ID=%d", tile_id);
+   reset_event[3] = kasprintf(GFP_KERNEL, "GT_ID=%d", gt_id);
+   reset_event[4] = NULL;
+   kobject_uevent_env(&pdev->dev.kobj, KOBJ_CHANGE, reset_event);
+
+   kfree(reset_event[2]);
+   kfree(reset_event[3]);
+}
+
+static void gt_reset_failed(struct xe_gt *gt, int err)
+{
+   xe_gt_err(gt, "reset failed (%pe)\n", ERR_PTR(err));
+
+   /* Notify userspace about gt reset failure */
+   xe_uevent_gt_reset_failure(to_pci_dev(gt_to_xe(gt)->drm.dev),
+  gt_to_tile(gt)->id, gt->info.id);
+}
+
static int gt_reset(struct xe_gt *gt)
{
int err;
@@ -795,8 +819,7 @@ static int gt_reset(struct xe_gt *gt)
XE_WARN_ON(xe_uc_start(&gt->uc));
xe_pm_runtime_put(gt_to_xe(gt));
err_fail:
-   xe_gt_err(gt, "reset failed (%pe)\n", ERR_PTR(err));
-
+   gt_reset_failed(gt, err);
xe_device_declare_wedged(gt_to_xe(gt));

return err;
diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 19619d4952a8..9ea3be97535e 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -20,6 +20,7 @@ extern "C" {
 *   2. Extension definition and helper structs
 *   3. IOCTL's Query structs in the order of the Query's entries.
 *   4. The rest of IOCTL structs in the order of IOCTL declaration.
+ *   5. uEvents
 */

/**
@@ -1686,6 +1687,22 @@ struct drm_xe_oa_stream_info {
__u64 reserved[3];
};

+/**
+ * DOC: uevent generated by xe on its pci node.
+ *
+ * DRM_XE_RESET_REQUIRED_UEVENT - Event is generated when the device needs a reset.
+ * The REASON for which the reset is required is provided along with the event.
+ * Depending on the REASON, additional information might be supplied.
+ */
+#define DRM_XE_RESET_REQUIRED_UEVENT "DEVICE_STATUS=NEEDS_RESET"
+
+/**
+ * DRM_XE_RESET_REQUIRED_UEVENT_REASON_GT - Reason provided to DRM_XE_RESET_REQUIRED_UEVENT
+ * in case of gt reset failure. The additional information supplied is the tile id
+ * and gt id of the gt unit for which reset has failed.
+ */
+#define DRM_XE_RESET_REQUIRED_UEVENT_REASON_GT "REASON=GT_RESET_FAILED"
+
#if defined(__cplusplus)
}
#endif
--
2.34.1



Re: [PATCH 04/12] drm/xe: xe_gen_wa_oob: fix program_invocation_short_name for macos

2024-08-07 Thread Lucas De Marchi

On Wed, Aug 07, 2024 at 08:13:51AM GMT, Daniel Gomez wrote:

On Tue, Aug 06, 2024 at 08:50:09PM GMT, Lucas De Marchi wrote:

On Wed, Aug 07, 2024 at 01:09:18AM GMT, Daniel Gomez via B4 Relay wrote:
> From: Daniel Gomez 
>
> Use getprogname() [1] instead of program_invocation_short_name() [2]
> for macOS hosts.
>
> [1]:
> https://www.gnu.org/software/gnulib/manual/html_node/
> program_005finvocation_005fshort_005fname.html
>
> [2]:
> https://developer.apple.com/library/archive/documentation/System/
> Conceptual/ManPages_iPhoneOS/man3/getprogname.3.html
>
> Fixes build error for macOS hosts:
>
> drivers/gpu/drm/xe/xe_gen_wa_oob.c:34:3: error: use of
> undeclared identifier 'program_invocation_short_name'34 |
> program_invocation_short_name);   | ^ 1 error
> generated.
>
> Signed-off-by: Daniel Gomez 
> ---
> drivers/gpu/drm/xe/xe_gen_wa_oob.c | 8 +++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_gen_wa_oob.c 
b/drivers/gpu/drm/xe/xe_gen_wa_oob.c
> index 904cf47925aa..079b8870c461 100644
> --- a/drivers/gpu/drm/xe/xe_gen_wa_oob.c
> +++ b/drivers/gpu/drm/xe/xe_gen_wa_oob.c
> @@ -9,6 +9,12 @@
> #include 
> #include 
> #include 
> +#define PROG_INV_NAME program_invocation_short_name
> +
> +#ifdef __APPLE__
> +#include 
> +#define PROG_INV_NAME getprogname()
> +#endif
>
> #define HEADER \
>"// SPDX-License-Identifier: MIT\n" \
> @@ -31,7 +37,7 @@
> static void print_usage(FILE *f)
> {
>fprintf(f, "usage: %s   
\n",
> -  program_invocation_short_name);
> +  PROG_INV_NAME);

instead of doing that, can we a) include stdlib.h unconditionally and b)
add here a
`static const char *program_invocation_short_name = getprogname()` so we
don't need to change the common case and just handle the "build on
macos" as a compat layer?


Does this align with your suggestion (v1 diff)?


yes, just a nit that in xe we keep the includes sorted alphabetically,
so the stdlib.h include should move up one line. Other than that,

// Reviewed-by: Lucas De Marchi 

... assuming the rest of the series and idea about building the kernel
on macOS is not pushed back.

Lucas De Marchi




Note that static cannot be used here.

diff --git a/drivers/gpu/drm/xe/xe_gen_wa_oob.c 
b/drivers/gpu/drm/xe/xe_gen_wa_oob.c
index 079b8870c461..b3add20ccb01 100644
--- a/drivers/gpu/drm/xe/xe_gen_wa_oob.c
+++ b/drivers/gpu/drm/xe/xe_gen_wa_oob.c
@@ -9,12 +9,7 @@
#include 
#include 
#include 
-#define PROG_INV_NAME program_invocation_short_name
-
-#ifdef __APPLE__
#include 
-#define PROG_INV_NAME getprogname()
-#endif

#define HEADER \
   "// SPDX-License-Identifier: MIT\n" \
@@ -36,8 +31,11 @@

static void print_usage(FILE *f)
{
+#ifdef __APPLE__
+   const char *program_invocation_short_name = getprogname();
+#endif
   fprintf(f, "usage: %s   
\n",
-   PROG_INV_NAME);
+   program_invocation_short_name);
}

static void print_parse_error(const char *err_msg, const char *line,



Lucas De Marchi

> }
>
> static void print_parse_error(const char *err_msg, const char *line,
>
> --
> Git-146)
>
>


Re: [PATCH 04/12] drm/xe: xe_gen_wa_oob: fix program_invocation_short_name for macos

2024-08-06 Thread Lucas De Marchi

On Wed, Aug 07, 2024 at 01:09:18AM GMT, Daniel Gomez via B4 Relay wrote:

From: Daniel Gomez 

Use getprogname() [1] instead of program_invocation_short_name() [2]
for macOS hosts.

[1]:
https://www.gnu.org/software/gnulib/manual/html_node/
program_005finvocation_005fshort_005fname.html

[2]:
https://developer.apple.com/library/archive/documentation/System/
Conceptual/ManPages_iPhoneOS/man3/getprogname.3.html

Fixes build error for macOS hosts:

drivers/gpu/drm/xe/xe_gen_wa_oob.c:34:3: error: use of
undeclared identifier 'program_invocation_short_name'34 |
program_invocation_short_name);   | ^ 1 error
generated.

Signed-off-by: Daniel Gomez 
---
drivers/gpu/drm/xe/xe_gen_wa_oob.c | 8 +++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_gen_wa_oob.c 
b/drivers/gpu/drm/xe/xe_gen_wa_oob.c
index 904cf47925aa..079b8870c461 100644
--- a/drivers/gpu/drm/xe/xe_gen_wa_oob.c
+++ b/drivers/gpu/drm/xe/xe_gen_wa_oob.c
@@ -9,6 +9,12 @@
#include 
#include 
#include 
+#define PROG_INV_NAME program_invocation_short_name
+
+#ifdef __APPLE__
+#include 
+#define PROG_INV_NAME getprogname()
+#endif

#define HEADER \
"// SPDX-License-Identifier: MIT\n" \
@@ -31,7 +37,7 @@
static void print_usage(FILE *f)
{
fprintf(f, "usage: %s   
\n",
-   program_invocation_short_name);
+   PROG_INV_NAME);


instead of doing that, can we a) include stdlib.h unconditionally and b)
add here a
`static const char *program_invocation_short_name = getprogname()` so we
don't need to change the common case and just handle the "build on
macos" as a compat layer?

Lucas De Marchi


}

static void print_parse_error(const char *err_msg, const char *line,

--
Git-146)




[PULL] drm-xe-next

2024-07-30 Thread Lucas De Marchi

Hi Dave and Sima,

Early drm-xe-next pull request for 6.12. The main reason for being much
earlier than usual is to have the additional SIMD16 EU mask reported, as
it's a needed UAPI for Lunar Lake and Battlemage. It's been sitting in
drm-xe-next for a few weeks and userspace is already testing with it.

There's also a minor uapi change: a changed error return code. The other
two uapi changes already propagated via -fixes.

Other changes bring general improvements and cleanups to the driver and
further support for SR-IOV, as well as bringing Lunar Lake and Battlemage
closer to the finish line of being officially exposed by the driver. Some
bits are still in flux, so we're not quite there yet.

thanks
Lucas De Marchi

drm-xe-next-2024-07-30:
drm-xe-next for 6.12

UAPI Changes:
- Rename xe perf layer as xe observation layer, but was
  also made available via fixes to previous version (Ashutosh)
- Use write-back caching mode for system memory on DGFX,
  but was also made available via fixes to previous version (Thomas)
- Expose SIMD16 EU mask in topology query for userspace to know
  the type of EU, as available in PVC, Lunar Lake and Battlemage
  (Lucas)
- Return ENOBUFS instead of ENOMEM in vm_bind if failure is tied
  to an array of binds (Matthew Brost)

Driver Changes:
- Log cleanup moving messages to debug priority (Michal Wajdeczko)
- Add timeout to fences to adhere to dma_buf rules (Matthew Brost)
- Rename old engine nomenclature to exec_queue (Matthew Brost)
- Convert multiple bind ops to 1 job (Matthew Brost)
- Add error injection for vm bind to help testing error path
  (Matthew Brost)
- Fix error handling in page table to propagate correctly
  to userspace (Matthew Brost)
- Re-organize and cleanup SR-IOV related registers (Michal Wajdeczko)
- Make the device write barrier compatible with VF (Michal Wajdeczko)
- New display workarounds for Battlemage (Matthew Auld)
- New media workarounds for Lunar Lake and Battlemage (Ngai-Mint Kwan)
- New graphics workarounds for Lunar Lake (Bommu Krishnaiah)
- Tracepoint updates (Matthew Brost, Nirmoy Das)
- Cleanup the header generation for OOB workarounds (Lucas De Marchi)
- Fix leaking HDCP-related object (Nirmoy Das)
- Serialize L2 flushes to avoid races (Tejas Upadhyay)
- Log pid and comm on job timeout (José Roberto de Souza)
- Simplify boilerplate code for live kunit (Michal Wajdeczko)
- Improve kunit skips for live kunit (Michal Wajdeczko)
- Fix xe_sync cleanup when handling xe_exec ioctl (Ashutosh Dixit)
- Limit fair VF LMEM provisioning (Michal Wajdeczko)
- New workaround to fence mmio writes in Lunar Lake (Tejas Upadhyay)
- Warn on writes to inaccessible registers in VF (Michal Wajdeczko)
- Fix register lookup in VF (Michal Wajdeczko)
- Add GSC support for Battlemage (Alexander Usyskin)
- Fix wedging only the GT in which timeout occurred (Matthew Brost)
- Block device suspend when wedging (Matthew Brost)
- Handle compression and migration changes for Battlemage
  (Akshata Jahagirdar)
- Limit access of stolen memory for Lunar Lake (Uma Shankar)
- Fail invalid addresses during user fence creation (Matthew Brost)
- Refcount xe_file to safely and accurately store fdinfo stats
  (Umesh Nerlige Ramappa)
- Cleanup and fix PM reference for TLB invalidation code
  (Matthew Brost)
- Fix PM reference handling when communicating with GuC (Matthew Brost)
- Add new BO flag for 2 MiB alignment and use in VF (Michal Wajdeczko)
- Simplify MMIO setup for multi-tile platforms (Lucas De Marchi)
- Add check for uninitialized access to OOB workarounds
  (Lucas De Marchi)
- New GSC and HuC firmware blobs for Lunar Lake and Battlemage
  (Daniele Ceraolo Spurio)
- Unify mmio wait logic (Gustavo Sousa)
- Fix off-by-one when processing RTP rules (Lucas De Marchi)
- Future-proof migrate logic with compressed PAT flag (Matt Roper)
- Add WA kunit tests for Battlemage (Lucas De Marchi)
- Test active tracking for workarounds with kunit (Lucas De Marchi)
- Add kunit tests for RTP with no actions (Lucas De Marchi)
- Unify parse of OR rules in RTP (Lucas De Marchi)
- Add performance tuning for Battlemage (Sai Teja Pottumuttu)
- Make bit masks unsigned (Geert Uytterhoeven)
The following changes since commit aaa08078e7251131f045ba248a68671db7f7bdf7:

  drm/xe/bmg: Apply Wa_22019338487 (2024-07-02 12:14:00 -0400)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-next-2024-07-30

for you to fetch changes up to f2881dfdaaa9ec873dbd383ef5512fc31e576cbb:

  drm/xe/oa/uapi: Make bit masks unsigned (2024-07-30 13:45:38 -0700)


drm-xe-next for 6.12

UAPI Changes:
- Rename xe perf layer as xe observation layer, but was
  also made available via fixes to previous version (Ashutosh)
- Use write-back caching mode for system memory on DGFX,
  but was also made available via fixes to previous version (Thomas)
- Expose SIMD16 EU mask in topology query for userspace to know
  the type of EU, as available in PVC, Lunar Lake

Re: [PATCH] drm/xe/oa/uapi: Make bit masks unsigned

2024-07-30 Thread Lucas De Marchi

On Mon, Jul 29, 2024 at 02:54:58PM GMT, Ashutosh Dixit wrote:

On Mon, 29 Jul 2024 06:21:20 -0700, Lucas De Marchi wrote:




Hi Lucas,


Reviewed-by: Lucas De Marchi 

That fixes the build, but question to Ashutosh: it's odd to tie the
format to a bspec. What happens on next platform if the HW changes?
Hopefully it doesn't change in an incompatible way, but looking at this
code it seems we could still keep the uapi by untying the HW from the
uapi in the documentation.


IMO, in this case, it is not possible to decouple the formats from Bspec
because that is where they are specified (in Bspec 52198/60942).

In i915 the OA formats were specified by an enum (enum drm_i915_oa_format),
but I would argue that enum is meaningful only when we refer back to Bspec.
Also, the i915 enum had to be constantly updated when HW added new formats.

For Xe, we changed the way the formats are specified in a way which we
believe will make the uapi more robust and uapi header updates much less
frequent (hopefully we will never have to update the header, and if at all
we have to, we should be able to do it in a backwards compatible way since
we have a sufficient number of free bits). HW has followed this scheme for
specifying the formats for years and only recently for Xe2 has added a
couple of bits and introduced new PEC formats, which I think is not going
to change now for some time.

But as I said the formats have to refer back to Bspec since that is where
they are specified, and there are too many of them. Any description or enum
is ambiguous unless it refers back to Bspec. So I don't see how not to
refer to Bspec in the documentation. If anyone has any ideas about not
referring to Bspec I am willing to consider it but I think the best way
forward is to leave the documentation as is:

/*
 * OA_FORMAT's are specified the same way as in PRM/Bspec 52198/60942,
 * in terms of the following quantities: a. enum @drm_xe_oa_format_type
 * b. Counter select c. Counter size and d. BC report. Also refer to the
 * oa_formats array in drivers/gpu/drm/xe/xe_oa.c.
 */
#define DRM_XE_OA_FORMAT_MASK_FMT_TYPE  (0xff << 0)
#define DRM_XE_OA_FORMAT_MASK_COUNTER_SEL   (0xff << 8)
#define DRM_XE_OA_FORMAT_MASK_COUNTER_SIZE  (0xff << 16)
#define DRM_XE_OA_FORMAT_MASK_BC_REPORT (0xff << 24)


I was under impression that these shifts were something coming from the
HW definition and we were exposing that raw to userspace, rather than
e.g.

struct drm_xe_oa_format {
__u8 fmt_type;
__u8 counter_sel;
__u8 counter_size;
__u8 bc_report;
__u32 rsvd;
};

now I see the shifts are not tied to HW and this was simply the chosen
encoding rather than declaring a struct.

Applied this patch to drm-xe-next.

Thanks
Lucas De Marchi



Thanks.
--
Ashutosh


Re: [PATCH] drm/xe/oa/uapi: Make bit masks unsigned

2024-07-29 Thread Lucas De Marchi

On Mon, Jul 29, 2024 at 11:26:34AM GMT, Geert Uytterhoeven wrote:

When building with gcc-5:

   In function ‘decode_oa_format.isra.26’,
inlined from ‘xe_oa_set_prop_oa_format’ at 
drivers/gpu/drm/xe/xe_oa.c:1664:6:
   ././include/linux/compiler_types.h:510:38: error: call to 
‘__compiletime_assert_1336’ declared with attribute error: FIELD_GET: mask is 
not constant
   [...]
   ./include/linux/bitfield.h:155:3: note: in expansion of macro 
‘__BF_FIELD_CHECK’
  __BF_FIELD_CHECK(_mask, _reg, 0U, "FIELD_GET: "); \
  ^
   drivers/gpu/drm/xe/xe_oa.c:1573:18: note: in expansion of macro ‘FIELD_GET’
 u32 bc_report = FIELD_GET(DRM_XE_OA_FORMAT_MASK_BC_REPORT, fmt);
  ^

Fixes: b6fd51c6211910b1 ("drm/xe/oa/uapi: Define and parse OA stream 
properties")
Signed-off-by: Geert Uytterhoeven 



Reviewed-by: Lucas De Marchi 


That fixes the build, but question to Ashutosh: it's odd to tie the
format to a bspec. What happens on next platform if the HW changes?
Hopefully it doesn't change in an incompatible way, but looking at this
code it seems we could still keep the uapi by untying the HW from the
uapi in the documentation.

Lucas De Marchi


---
Compile-tested only.
---
include/uapi/drm/xe_drm.h | 8 
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
index 19619d4952a863f7..db232a25189eba9f 100644
--- a/include/uapi/drm/xe_drm.h
+++ b/include/uapi/drm/xe_drm.h
@@ -1590,10 +1590,10 @@ enum drm_xe_oa_property_id {
 * b. Counter select c. Counter size and d. BC report. Also refer to the
 * oa_formats array in drivers/gpu/drm/xe/xe_oa.c.
 */
-#define DRM_XE_OA_FORMAT_MASK_FMT_TYPE (0xff << 0)
-#define DRM_XE_OA_FORMAT_MASK_COUNTER_SEL  (0xff << 8)
-#define DRM_XE_OA_FORMAT_MASK_COUNTER_SIZE (0xff << 16)
-#define DRM_XE_OA_FORMAT_MASK_BC_REPORT(0xff << 24)
+#define DRM_XE_OA_FORMAT_MASK_FMT_TYPE (0xffu << 0)
+#define DRM_XE_OA_FORMAT_MASK_COUNTER_SEL  (0xffu << 8)
+#define DRM_XE_OA_FORMAT_MASK_COUNTER_SIZE (0xffu << 16)
+#define DRM_XE_OA_FORMAT_MASK_BC_REPORT(0xffu << 24)

/**
 * @DRM_XE_OA_PROPERTY_OA_PERIOD_EXPONENT: Requests periodic OA unit
--
2.34.1



Re: [PATCH 6/7] drm/i915/pmu: Lazy unregister

2024-07-24 Thread Lucas De Marchi

On Wed, Jul 24, 2024 at 02:41:05PM GMT, Peter Zijlstra wrote:

On Tue, Jul 23, 2024 at 10:30:08AM -0500, Lucas De Marchi wrote:

On Tue, Jul 23, 2024 at 09:03:25AM GMT, Tvrtko Ursulin wrote:
>
> On 22/07/2024 22:06, Lucas De Marchi wrote:
> > Instead of calling perf_pmu_unregister() when unbinding, defer that to
> > the destruction of i915 object. Since perf itself holds a reference in
> > the event, this only happens when all events are gone, which guarantees
> > i915 is not unregistering the pmu with live events.
> >
> > Previously, running the following sequence would crash the system after
> > ~2 tries:
> >
> >   1) bind device to i915
> >   2) wait events to show up on sysfs
> >   3) start perf  stat -I 1000 -e i915/rcs0-busy/
> >   4) unbind driver
> >   5) kill perf
> >
> > Most of the time this crashes in perf_pmu_disable() while accessing the
> > percpu pmu_disable_count. This happens because perf_pmu_unregister()
> > destroys it with free_percpu(pmu->pmu_disable_count).
> >
> > With a lazy unbind, the pmu is only unregistered after (5) as opposed to
> > after (4). The downside is that if a new bind operation is attempted for
> > the same device/driver without killing the perf process, i915 will fail
> > to register the pmu (but still load successfully). This seems better
> > than completely crashing the system.
>
> So effectively allows unbind to succeed without fully unbinding the
> driver from the device? That sounds like a significant drawback and if
> so, I wonder if a more complicated solution wouldn't be better after
> all. Or is there precedence for allowing userspace keeping their paws on
> unbound devices in this way?

keeping the resources alive but "unplugged" while the hardware
disappeared is a common thing to do... it's the whole point of the
drmm-managed resource for example. If you bind the driver and then
unbind it while userspace is holding a ref, next time you try to bind it
will come up with a different card number. A similar thing that could be
done is to adjust the name of the event - currently we add the mangled
pci slot.

That said, I agree a better approach would be to allow
perf_pmu_unregister() to do its job even when there are open events. On
top of that (or as a way to help achieve that), make perf core replace
the callbacks with stubs when pmu is unregistered - that would even kill
the need for i915's checks on pmu->closed (and fix the lack thereof in
other drivers).

It can be a can of worms though and may be pushed back by perf core
maintainers, so it'd be good to have their feedback.


I don't think I understand the problem. I also don't understand drivers
much -- so that might be the problem.


We can bind/unbind the driver to/from the pci device. Example:

echo -n ":00:02.0" > /sys/bus/pci/drivers/i915/unbind

This is essentially unplugging the HW from the kernel.  This will
trigger the driver to deinitialize and free up all resources use by that
device.

So when the driver is binding it does:

perf_pmu_register();

When it's unbinding:

perf_pmu_unregister();

Reasons to unbind include:

- driver testing so then we can unload the module and load it
  again
- device is toast - doing an FLR and rebinding may
  fix/workaround it
- For SR-IOV, in which we are exposing multiple instances of the
  same underlying PCI device, we may need to bind/unbind
  on-demand  (it's not yet clear if perf_pmu_register() would be
  called on the VF instances, but listed here just to explain
  the bind/unbind)
- Hotplug



So the problem appears to be that the device just disappears without
warning? How can a GPU go away like that?

Since you have a notion of this device, can't you do this stubbing you
talk about? That is, if your internal device reference becomes NULL, let
the PMU methods preserve the state like no-ops.


It's not clear if you are suggesting these stubs to be in each driver or
to be assigned by perf in perf_pmu_unregister(). Some downsides
of doing it in the driver:

- you can't remove the module as perf will still call module
  code
- need to replicate the stubs in every driver (or the
  equivalent of pmu->closed checks in i915_pmu.c)

I *think* the stubs would be quite the same for every device, so it could
be feasible to share them inside perf. Alternatively it could simply
shortcut the call, without stubs, by looking at event->hw.state and
a new pmu->state. I can give this a try.

thanks
Lucas De Marchi



And then when the last event goes away, tear down the whole thing.

Again, I'm not sure I'm following.


Re: [PATCH 6/7] drm/i915/pmu: Lazy unregister

2024-07-23 Thread Lucas De Marchi

On Tue, Jul 23, 2024 at 09:03:25AM GMT, Tvrtko Ursulin wrote:


On 22/07/2024 22:06, Lucas De Marchi wrote:

Instead of calling perf_pmu_unregister() when unbinding, defer that to
the destruction of i915 object. Since perf itself holds a reference in
the event, this only happens when all events are gone, which guarantees
i915 is not unregistering the pmu with live events.

Previously, running the following sequence would crash the system after
~2 tries:

1) bind device to i915
2) wait events to show up on sysfs
3) start perf  stat -I 1000 -e i915/rcs0-busy/
4) unbind driver
5) kill perf

Most of the time this crashes in perf_pmu_disable() while accessing the
percpu pmu_disable_count. This happens because perf_pmu_unregister()
destroys it with free_percpu(pmu->pmu_disable_count).

With a lazy unbind, the pmu is only unregistered after (5) as opposed to
after (4). The downside is that if a new bind operation is attempted for
the same device/driver without killing the perf process, i915 will fail
to register the pmu (but still load successfully). This seems better
than completely crashing the system.


So effectively allows unbind to succeed without fully unbinding the 
driver from the device? That sounds like a significant drawback and if 
so, I wonder if a more complicated solution wouldn't be better after 
all. Or is there precedence for allowing userspace keeping their paws 
on unbound devices in this way?


keeping the resources alive but "unplugged" while the hardware
disappeared is a common thing to do... it's the whole point of the
drmm-managed resource for example. If you bind the driver and then
unbind it while userspace is holding a ref, next time you try to bind it
will come up with a different card number. A similar thing that could be
done is to adjust the name of the event - currently we add the mangled
pci slot.

That said, I agree a better approach would be to allow
perf_pmu_unregister() to do its job even when there are open events. On
top of that (or as a way to help achieve that), make perf core replace
the callbacks with stubs when pmu is unregistered - that would even kill
the need for i915's checks on pmu->closed (and fix the lack thereof in
other drivers).

It can be a can of worms though and may be pushed back by perf core
maintainers, so it'd be good to have their feedback.

thanks
Lucas De Marchi



Regards,

Tvrtko



Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/i915_pmu.c | 24 +---
 1 file changed, 9 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 8708f905f4f4..df53a8fe53ec 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -1158,18 +1158,21 @@ static void free_pmu(struct drm_device *dev, void *res)
struct i915_pmu *pmu = res;
struct drm_i915_private *i915 = pmu_to_i915(pmu);
+   perf_pmu_unregister(&pmu->base);
free_event_attributes(pmu);
kfree(pmu->base.attr_groups);
if (IS_DGFX(i915))
kfree(pmu->name);
+
+   /*
+* Make sure all currently running (but shortcut on pmu->closed) are
+* gone before proceeding with free'ing the pmu object embedded in i915.
+*/
+   synchronize_rcu();
 }
 static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
 {
-   struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
-
-   GEM_BUG_ON(!pmu->base.event_init);
-
/* Select the first online CPU as a designated reader. */
if (cpumask_empty(&i915_pmu_cpumask))
cpumask_set_cpu(cpu, &i915_pmu_cpumask);
@@ -1182,8 +1185,6 @@ static int i915_pmu_cpu_offline(unsigned int cpu, struct 
hlist_node *node)
struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
unsigned int target = i915_pmu_target_cpu;
-   GEM_BUG_ON(!pmu->base.event_init);
-
/*
 * Unregistering an instance generates a CPU offline event which we must
 * ignore to avoid incorrectly modifying the shared i915_pmu_cpumask.
@@ -1337,21 +1338,14 @@ void i915_pmu_unregister(struct drm_i915_private *i915)
 {
struct i915_pmu *pmu = &i915->pmu;
-   if (!pmu->base.event_init)
-   return;
-
/*
-* "Disconnect" the PMU callbacks - since all are atomic synchronize_rcu
-* ensures all currently executing ones will have exited before we
-* proceed with unregistration.
+* "Disconnect" the PMU callbacks - unregistering the pmu will be done
+* later when all currently open events are gone
 */
pmu->closed = true;
-   synchronize_rcu();
hrtimer_cancel(&pmu->timer);
-
i915_pmu_unregister_cpuhp_state(pmu);
-   perf_pmu_unregister(&pmu->base);
pmu->base.event_init = NULL;
 }


[PATCH 6/7] drm/i915/pmu: Lazy unregister

2024-07-22 Thread Lucas De Marchi
Instead of calling perf_pmu_unregister() when unbinding, defer that to
the destruction of i915 object. Since perf itself holds a reference in
the event, this only happens when all events are gone, which guarantees
i915 is not unregistering the pmu with live events.

Previously, running the following sequence would crash the system after
~2 tries:

1) bind device to i915
2) wait events to show up on sysfs
3) start perf  stat -I 1000 -e i915/rcs0-busy/
4) unbind driver
5) kill perf

Most of the time this crashes in perf_pmu_disable() while accessing the
percpu pmu_disable_count. This happens because perf_pmu_unregister()
destroys it with free_percpu(pmu->pmu_disable_count).

With a lazy unbind, the pmu is only unregistered after (5) as opposed to
after (4). The downside is that if a new bind operation is attempted for
the same device/driver without killing the perf process, i915 will fail
to register the pmu (but still load successfully). This seems better
than completely crashing the system.

Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/i915_pmu.c | 24 +---
 1 file changed, 9 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 8708f905f4f4..df53a8fe53ec 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -1158,18 +1158,21 @@ static void free_pmu(struct drm_device *dev, void *res)
struct i915_pmu *pmu = res;
struct drm_i915_private *i915 = pmu_to_i915(pmu);
 
+   perf_pmu_unregister(&pmu->base);
free_event_attributes(pmu);
kfree(pmu->base.attr_groups);
if (IS_DGFX(i915))
kfree(pmu->name);
+
+   /*
+* Make sure all currently running (but shortcut on pmu->closed) are
+* gone before proceeding with free'ing the pmu object embedded in i915.
+*/
+   synchronize_rcu();
 }
 
 static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
 {
-   struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
-
-   GEM_BUG_ON(!pmu->base.event_init);
-
/* Select the first online CPU as a designated reader. */
if (cpumask_empty(&i915_pmu_cpumask))
cpumask_set_cpu(cpu, &i915_pmu_cpumask);
@@ -1182,8 +1185,6 @@ static int i915_pmu_cpu_offline(unsigned int cpu, struct 
hlist_node *node)
struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
unsigned int target = i915_pmu_target_cpu;
 
-   GEM_BUG_ON(!pmu->base.event_init);
-
/*
 * Unregistering an instance generates a CPU offline event which we must
 * ignore to avoid incorrectly modifying the shared i915_pmu_cpumask.
@@ -1337,21 +1338,14 @@ void i915_pmu_unregister(struct drm_i915_private *i915)
 {
struct i915_pmu *pmu = &i915->pmu;
 
-   if (!pmu->base.event_init)
-   return;
-
/*
-* "Disconnect" the PMU callbacks - since all are atomic synchronize_rcu
-* ensures all currently executing ones will have exited before we
-* proceed with unregistration.
+* "Disconnect" the PMU callbacks - unregistering the pmu will be done
+* later when all currently open events are gone
 */
pmu->closed = true;
-   synchronize_rcu();
 
hrtimer_cancel(&pmu->timer);
-
i915_pmu_unregister_cpuhp_state(pmu);
-   perf_pmu_unregister(&pmu->base);
 
pmu->base.event_init = NULL;
 }
-- 
2.43.0



[PATCH 1/7] perf/core: Add pmu get/put

2024-07-22 Thread Lucas De Marchi
If a pmu is unregistered while there's an active event, perf will still
access the pmu via event->pmu, even after the event is destroyed. This
makes it difficult for drivers like i915 that take a reference on the
device when the event is created and put it when it's destroyed.
Currently the following use-after-free happens just after destroying the
event:

BUG: KASAN: use-after-free in exclusive_event_destroy+0xd8/0xf0
Read of size 4 at addr 88816e2bb63c by task perf/7748

Whenever an event is created, get a pmu reference to use in event->pmu,
and just before calling module_put(), drop the reference.

Signed-off-by: Lucas De Marchi 
---
 include/linux/perf_event.h |  3 +++
 kernel/events/core.c   | 32 
 2 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index a5304ae8c654..7048a505e93c 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -540,6 +540,9 @@ struct pmu {
 * Check period value for PERF_EVENT_IOC_PERIOD ioctl.
 */
int (*check_period) (struct perf_event *event, u64 value); 
/* optional */
+
+   struct pmu *(*get)  (struct pmu *pmu); /* optional: get a 
reference */
+   void (*put) (struct pmu *pmu); /* optional: put a 
reference */
 };
 
 enum perf_addr_filter_action_t {
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 1b6f5dc7ed32..cc7541b644b0 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5208,6 +5208,8 @@ static void perf_addr_filters_splice(struct perf_event 
*event,
 
 static void _free_event(struct perf_event *event)
 {
+   struct module *module;
+
irq_work_sync(&event->pending_irq);
 
unaccount_event(event);
@@ -5259,7 +5261,13 @@ static void _free_event(struct perf_event *event)
put_ctx(event->ctx);
 
exclusive_event_destroy(event);
-   module_put(event->pmu->module);
+
+   module = event->pmu->module;
+   event->pmu->put(event->pmu);
+   /* can't touch pmu anymore */
+   event->pmu = NULL;
+
+   module_put(module);
 
call_rcu(&event->rcu_head, free_event_rcu);
 }
@@ -11331,6 +11339,11 @@ static int perf_pmu_nop_int(struct pmu *pmu)
return 0;
 }
 
+static struct pmu *perf_pmu_nop_pmu(struct pmu *pmu)
+{
+   return pmu;
+}
+
 static int perf_event_nop_int(struct perf_event *event, u64 value)
 {
return 0;
@@ -11617,6 +11630,12 @@ int perf_pmu_register(struct pmu *pmu, const char 
*name, int type)
if (!pmu->event_idx)
pmu->event_idx = perf_event_idx_default;
 
+   if (!pmu->get)
+   pmu->get = perf_pmu_nop_pmu;
+
+   if (!pmu->put)
+   pmu->put = perf_pmu_nop_void;
+
list_add_rcu(&pmu->entry, &pmus);
atomic_set(&pmu->exclusive_cnt, 0);
ret = 0;
@@ -11695,7 +11714,8 @@ static int perf_try_init_event(struct pmu *pmu, struct perf_event *event)
BUG_ON(!ctx);
}
 
-   event->pmu = pmu;
+   event->pmu = pmu->get(pmu);
+
ret = pmu->event_init(event);
 
if (ctx)
@@ -11714,8 +11734,12 @@ static int perf_try_init_event(struct pmu *pmu, struct perf_event *event)
event->destroy(event);
}
 
-   if (ret)
-   module_put(pmu->module);
+   if (ret) {
+   struct module *module = pmu->module;
+
+   pmu->put(pmu);
+   module_put(module);
+   }
 
return ret;
 }
-- 
2.43.0
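The defaulting scheme in the patch above, where missing get()/put() hooks are replaced with nops at register time so callers can invoke them unconditionally, can be sketched as a small userspace C model. The struct and function names below are invented for illustration; only the defaulting logic mirrors what perf_pmu_register() does in the hunk above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace model of the pmu get()/put() hooks (hypothetical names, not
 * the kernel API). A pmu that does not provide the hooks gets nop
 * implementations at register time, so callers never need NULL checks.
 */
struct pmu {
	struct pmu *(*get)(struct pmu *pmu);
	void (*put)(struct pmu *pmu);
	int refcount; /* stands in for the driver reference (e.g. drm_dev_get) */
};

static struct pmu *pmu_nop_get(struct pmu *pmu) { return pmu; }
static void pmu_nop_put(struct pmu *pmu) { (void)pmu; }

static struct pmu *pmu_ref_get(struct pmu *pmu) { pmu->refcount++; return pmu; }
static void pmu_ref_put(struct pmu *pmu) { pmu->refcount--; }

/* Mirror of the perf_pmu_register() defaulting logic from the patch */
static void pmu_register(struct pmu *pmu)
{
	if (!pmu->get)
		pmu->get = pmu_nop_get;
	if (!pmu->put)
		pmu->put = pmu_nop_put;
}
```

With this in place, the event-creation path can always write `event->pmu = pmu->get(pmu)` and the teardown path can always call `pmu->put(pmu)`, whether or not the driver opted in.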



[PATCH 2/7] drm/i915/pmu: Fix crash due to use-after-free

2024-07-22 Thread Lucas De Marchi
When an i915 PMU counter is enabled and the driver is then unbound, the
PMU will be unregistered via perf_pmu_unregister(); however, the event
will still be alive. i915 currently tries to deal with this situation
by:

a) Marking the pmu as "closed" and short-circuiting the calls from perf
b) Taking a reference from i915 that is put back when the event
   is destroyed.
c) Setting event_init to NULL to avoid any further event creation

(a) is ugly, but may be left as is since it protects against accessing
the HW that is now gone. Unless a pmu driver can call
perf_pmu_unregister() and not receive any more calls, it's a necessary
ugliness.

(b) doesn't really work: when the event is destroyed and the i915 ref is
put, it may free the i915 object, which contains the pmu, not only the
event. After the event->destroy() callback, perf still expects the pmu
object to be alive.

Instead of piggybacking on event->destroy() to take and put the
device reference, implement the new get()/put() on the pmu object for
that purpose.

(c) is not entirely correct since, from the perf POV, it's not an
optional call: perf would just dereference the NULL pointer. However this
also protects other entrypoints in i915_pmu. A new event creation from
perf after the pmu has been unregistered should not be possible anyway:
perf_init_event() bails out when it doesn't find the pmu. This may be
cleaned up later.

Cc:  # 5.11+
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/i915_pmu.c | 34 +++--
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 21eb0c5b320d..cb5f6471ec6e 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -514,15 +514,6 @@ static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer)
return HRTIMER_RESTART;
 }
 
-static void i915_pmu_event_destroy(struct perf_event *event)
-{
-   struct i915_pmu *pmu = event_to_pmu(event);
-   struct drm_i915_private *i915 = pmu_to_i915(pmu);
-
-   drm_WARN_ON(&i915->drm, event->parent);
-
-   drm_dev_put(&i915->drm);
-}
 
 static int
 engine_event_status(struct intel_engine_cs *engine,
@@ -628,11 +619,6 @@ static int i915_pmu_event_init(struct perf_event *event)
if (ret)
return ret;
 
-   if (!event->parent) {
-   drm_dev_get(&i915->drm);
-   event->destroy = i915_pmu_event_destroy;
-   }
-
return 0;
 }
 
@@ -872,6 +858,24 @@ static int i915_pmu_event_event_idx(struct perf_event *event)
return 0;
 }
 
+static struct pmu *i915_pmu_get(struct pmu *base)
+{
+   struct i915_pmu *pmu = container_of(base, struct i915_pmu, base);
+   struct drm_i915_private *i915 = pmu_to_i915(pmu);
+
+   drm_dev_get(&i915->drm);
+
+   return base;
+}
+
+static void i915_pmu_put(struct pmu *base)
+{
+   struct i915_pmu *pmu = container_of(base, struct i915_pmu, base);
+   struct drm_i915_private *i915 = pmu_to_i915(pmu);
+
+   drm_dev_put(&i915->drm);
+}
+
 struct i915_str_attribute {
struct device_attribute attr;
const char *str;
@@ -1299,6 +1303,8 @@ void i915_pmu_register(struct drm_i915_private *i915)
pmu->base.stop  = i915_pmu_event_stop;
pmu->base.read  = i915_pmu_event_read;
pmu->base.event_idx = i915_pmu_event_event_idx;
+   pmu->base.get   = i915_pmu_get;
+   pmu->base.put   = i915_pmu_put;
 
ret = perf_pmu_register(&pmu->base, pmu->name, -1);
if (ret)
-- 
2.43.0



[PATCH 5/7] drm/i915/pmu: Let resource survive unbind

2024-07-22 Thread Lucas De Marchi
There's no need to free the resources during unbind. Since open perf
events may still access them, it's safer to free them when dropping the
last i915 reference.

Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/i915_pmu.c | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index b5d14dd318e4..8708f905f4f4 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -5,6 +5,7 @@
  */
 
 #include 
+#include 
 
 #include "gt/intel_engine.h"
 #include "gt/intel_engine_pm.h"
@@ -1152,6 +1153,17 @@ static void free_event_attributes(struct i915_pmu *pmu)
pmu->pmu_attr = NULL;
 }
 
+static void free_pmu(struct drm_device *dev, void *res)
+{
+   struct i915_pmu *pmu = res;
+   struct drm_i915_private *i915 = pmu_to_i915(pmu);
+
+   free_event_attributes(pmu);
+   kfree(pmu->base.attr_groups);
+   if (IS_DGFX(i915))
+   kfree(pmu->name);
+}
+
 static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
 {
struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
@@ -1302,6 +1314,9 @@ void i915_pmu_register(struct drm_i915_private *i915)
if (ret)
goto err_unreg;
 
+   if (drmm_add_action_or_reset(&i915->drm, free_pmu, pmu))
+   goto err_unreg;
+
return;
 
 err_unreg:
@@ -1336,11 +1351,7 @@ void i915_pmu_unregister(struct drm_i915_private *i915)
hrtimer_cancel(&pmu->timer);
 
i915_pmu_unregister_cpuhp_state(pmu);
-
perf_pmu_unregister(&pmu->base);
+
pmu->base.event_init = NULL;
-   kfree(pmu->base.attr_groups);
-   if (IS_DGFX(i915))
-   kfree(pmu->name);
-   free_event_attributes(pmu);
 }
-- 
2.43.0



[PATCH 7/7] drm/i915/pmu: Do not set event_init to NULL

2024-07-22 Thread Lucas De Marchi
event_init is not an optional function pointer for perf events. Now
that pmu unregister happens only when freeing i915, setting it to NULL
only protects other functions in i915. Replace that by checking
pmu->closed.

Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/i915_pmu.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index df53a8fe53ec..c5738035bc2f 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -303,7 +303,7 @@ void i915_pmu_gt_parked(struct intel_gt *gt)
 {
struct i915_pmu *pmu = >->i915->pmu;
 
-   if (!pmu->base.event_init)
+   if (pmu->closed)
return;
 
spin_lock_irq(&pmu->lock);
@@ -325,7 +325,7 @@ void i915_pmu_gt_unparked(struct intel_gt *gt)
 {
struct i915_pmu *pmu = >->i915->pmu;
 
-   if (!pmu->base.event_init)
+   if (pmu->closed)
return;
 
spin_lock_irq(&pmu->lock);
@@ -1325,12 +1325,12 @@ void i915_pmu_register(struct drm_i915_private *i915)
 err_groups:
kfree(pmu->base.attr_groups);
 err_attr:
-   pmu->base.event_init = NULL;
free_event_attributes(pmu);
 err_name:
if (IS_DGFX(i915))
kfree(pmu->name);
 err:
+   pmu->closed = true;
drm_notice(&i915->drm, "Failed to register PMU!\n");
 }
 
@@ -1346,6 +1346,4 @@ void i915_pmu_unregister(struct drm_i915_private *i915)
 
hrtimer_cancel(&pmu->timer);
i915_pmu_unregister_cpuhp_state(pmu);
-
-   pmu->base.event_init = NULL;
 }
-- 
2.43.0
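The pmu->closed guard that this patch leans on can be modeled in a few lines of userspace C. This is a sketch under stated assumptions: `struct fake_pmu`, `fake_pmu_gt_parked()` and `parked_count` are invented names standing in for the i915 entrypoints; only the "check a closed flag and bail before doing real work" pattern reflects the code above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Model of the pmu->closed guard pattern (illustrative, not the i915
 * code): once the device is gone, entry points become no-ops instead of
 * touching hardware. The driver sets the flag on unbind; every callback
 * checks it before doing real work.
 */
struct fake_pmu {
	atomic_bool closed;
	int parked_count; /* stands in for the real state updates */
};

static void fake_pmu_gt_parked(struct fake_pmu *pmu)
{
	if (atomic_load(&pmu->closed))
		return; /* device unbound: nothing to do */
	pmu->parked_count++;
}
```

The point of the patch is that this flag, rather than a NULLed-out event_init (which perf treats as mandatory), is the right way to express "the device is gone".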



[PATCH 3/7] drm/i915/pmu: Use event_to_pmu()

2024-07-22 Thread Lucas De Marchi
The i915 pointer is not needed in this function, and all the others
simply calculate the i915_pmu container based on event->pmu. Follow the
same logic here.

Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/i915_pmu.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index cb5f6471ec6e..3a8bd11b87e7 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -820,15 +820,14 @@ static void i915_pmu_event_start(struct perf_event *event, int flags)
 
 static void i915_pmu_event_stop(struct perf_event *event, int flags)
 {
-   struct drm_i915_private *i915 =
-   container_of(event->pmu, typeof(*i915), pmu.base);
-   struct i915_pmu *pmu = &i915->pmu;
+   struct i915_pmu *pmu = event_to_pmu(event);
 
if (pmu->closed)
goto out;
 
if (flags & PERF_EF_UPDATE)
i915_pmu_event_read(event);
+
i915_pmu_disable(event);
 
 out:
-- 
2.43.0



[PATCH 4/7] drm/i915/pmu: Drop is_igp()

2024-07-22 Thread Lucas De Marchi
There's no reason to hardcode checking for integrated graphics on a
specific pci slot. That information is already available per platform and
can be checked with IS_DGFX().

Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/i915_pmu.c | 17 +++--
 1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 3a8bd11b87e7..b5d14dd318e4 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -1235,17 +1235,6 @@ static void i915_pmu_unregister_cpuhp_state(struct i915_pmu *pmu)
cpuhp_state_remove_instance(cpuhp_slot, &pmu->cpuhp.node);
 }
 
-static bool is_igp(struct drm_i915_private *i915)
-{
-   struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
-
-   /* IGP is 0000:00:02.0 */
-   return pci_domain_nr(pdev->bus) == 0 &&
-  pdev->bus->number == 0 &&
-  PCI_SLOT(pdev->devfn) == 2 &&
-  PCI_FUNC(pdev->devfn) == 0;
-}
-
 void i915_pmu_register(struct drm_i915_private *i915)
 {
struct i915_pmu *pmu = &i915->pmu;
@@ -1269,7 +1258,7 @@ void i915_pmu_register(struct drm_i915_private *i915)
pmu->cpuhp.cpu = -1;
init_rc6(pmu);
 
-   if (!is_igp(i915)) {
+   if (IS_DGFX(i915)) {
pmu->name = kasprintf(GFP_KERNEL,
  "i915_%s",
  dev_name(i915->drm.dev));
@@ -1323,7 +1312,7 @@ void i915_pmu_register(struct drm_i915_private *i915)
pmu->base.event_init = NULL;
free_event_attributes(pmu);
 err_name:
-   if (!is_igp(i915))
+   if (IS_DGFX(i915))
kfree(pmu->name);
 err:
drm_notice(&i915->drm, "Failed to register PMU!\n");
@@ -1351,7 +1340,7 @@ void i915_pmu_unregister(struct drm_i915_private *i915)
perf_pmu_unregister(&pmu->base);
pmu->base.event_init = NULL;
kfree(pmu->base.attr_groups);
-   if (!is_igp(i915))
+   if (IS_DGFX(i915))
kfree(pmu->name);
free_event_attributes(pmu);
 }
-- 
2.43.0



[PATCH 0/7] Fix i915 pmu on bind/unbind

2024-07-22 Thread Lucas De Marchi
This is intended to fix the pmu integration in i915 when the device
unbinds. I don't have the hardware to test, but I believe a similar issue
occurs with any driver using perf_pmu_unregister() if they can unbind
from the device - in drm subsystem, that would be the amd driver.

Previous attempts I could find:
1) 
https://lore.kernel.org/lkml/20240115170120.662220-1-tvrtko.ursu...@linux.intel.com/
2) 
https://lore.kernel.org/all/20240213180302.47266-1-umesh.nerlige.rama...@intel.com/

I think (2) is a no-go as it tries to do everything from the i915 side
by force-closing the fd.

I believe this series is simpler* than (1), but that could also be
a good alternative if we want to go with that approach.

First patch is to perf. The rest is to i915, building on that and
moving the unregister bit by bit to be done later, after the last
reference to i915 is dropped.  Peter/Ingo, could I get your opinion on
this, or on whether (1) would be a good alternative? Thanks.

* simpler, but see downside mentioned in patch 6

Lucas De Marchi (7):
  perf/core: Add pmu get/put
  drm/i915/pmu: Fix crash due to use-after-free
  drm/i915/pmu: Use event_to_pmu()
  drm/i915/pmu: Drop is_igp()
  drm/i915/pmu: Let resource survive unbind
  drm/i915/pmu: Lazy unregister
  drm/i915/pmu: Do not set event_init to NULL

 drivers/gpu/drm/i915/i915_pmu.c | 103 
 include/linux/perf_event.h  |   3 +
 kernel/events/core.c|  32 --
 3 files changed, 81 insertions(+), 57 deletions(-)

-- 
2.43.0



Re: When sysfs is not available (say containers)

2024-07-19 Thread Lucas De Marchi

On Fri, Jul 19, 2024 at 09:50:04AM GMT, Ashutosh Dixit wrote:

On Mon, 17 Jun 2024 18:45:57 -0700, Ashutosh Dixit wrote:




Folks,

The below is just an example from one of the earlier OA patches (already
merged):


[PATCH 05/17] drm/xe/oa/uapi: Add/remove OA config perf ops

+static ssize_t show_dynamic_id(struct kobject *kobj,
+  struct kobj_attribute *attr,
+  char *buf)
+{
+   struct xe_oa_config *oa_config =
+   container_of(attr, typeof(*oa_config), sysfs_metric_id);
+
+   return sysfs_emit(buf, "%d\n", oa_config->id);
+}
+
+static int create_dynamic_oa_sysfs_entry(struct xe_oa *oa,
+struct xe_oa_config *oa_config)
+{
+   sysfs_attr_init(&oa_config->sysfs_metric_id.attr);
+   oa_config->sysfs_metric_id.attr.name = "id";
+   oa_config->sysfs_metric_id.attr.mode = 0444;
+   oa_config->sysfs_metric_id.show = show_dynamic_id;
+   oa_config->sysfs_metric_id.store = NULL;
+
+   oa_config->attrs[0] = &oa_config->sysfs_metric_id.attr;
+   oa_config->attrs[1] = NULL;
+
+   oa_config->sysfs_metric.name = oa_config->uuid;
+   oa_config->sysfs_metric.attrs = oa_config->attrs;
+
+   return sysfs_create_group(oa->metrics_kobj, &oa_config->sysfs_metric);
+}


So we often expose things in sysfs. The question is: are there general
guidelines for what to do for environments (such as containers) where
userspace cannot access sysfs? E.g. in such cases, do we expose the
information exposed in sysfs via queries (i.e. an ioctl)? Or another way?
What have we done in the past in drm and what should we do in these cases
for Xe?


userspace should be written in a way that handles sysfs potentially not
being around and doesn't crash in that case. Providing limited
functionality is fine and the user can decide what to do in that case.
Creating a duplicate and alternative API to handle this is not a good
solution IMO.

For containers, it's common to mount sysfs read-only to give container
visibility on the host configuration... or parts of it in case you are
giving the container privilege over that part of the system.

Related, on another project I maintain (kmod) including systemd folks:
https://github.com/kmod-project/kmod/issues/10

From https://systemd.io/CONTAINER_INTERFACE/:

Make sure to pre-mount /proc/, /sys/, and /sys/fs/selinux/ before
invoking systemd, and mount /sys/, /sys/fs/selinux/ and /proc/sys/
read-only (the latter via e.g. a read-only bind mount on itself)

that page has more information on other parts of sysfs that people make
writable/readable for similar issues in other subsystems and is worth
reading.
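The "degrade gracefully when sysfs is missing" recommendation above can be sketched in a few lines of plain C. The path and fallback value are illustrative only, not a real driver attribute:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Sketch of the userspace behavior described above: try sysfs, and fall
 * back to limited functionality if it is not mounted or the attribute is
 * hidden (e.g. inside a container without /sys).
 */
static long read_sysfs_long(const char *path, long fallback)
{
	FILE *f = fopen(path, "r");
	long val;

	if (!f)
		return fallback; /* sysfs missing: run in limited mode */
	if (fscanf(f, "%ld", &val) != 1)
		val = fallback;  /* unreadable attribute: same fallback */
	fclose(f);
	return val;
}
```

The key design point is that the absence of the file is an expected condition handled with a fallback, not an error that aborts the tool.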

Lucas De Marchi


[PULL] drm-xe-fixes

2024-07-11 Thread Lucas De Marchi

Hi Dave and Sima,

First patch has an important impact on userspace, changing the caching
mode to write-back for system memory on DGFX. In this case we introduce
a limitation in the cache selection uapi that is transparent to UMDs. I
mean, no change on their side is needed.  Coherence is maintained, but
with some known possible and acceptable/accepted differences in CPU
access speed.

Second patch fixes a leak when finalizing hdcp gsc.

drm-xe-fixes-2024-07-11:
UAPI Changes:
- Use write-back caching mode for system memory on DGFX (Thomas)

Driver Changes:
- Do not leak object when finalizing hdcp gsc (Nirmoy)
The following changes since commit 256abd8e550ce977b728be79a74e1729438b4948:

  Linux 6.10-rc7 (2024-07-07 14:23:46 -0700)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-fixes-2024-07-11

for you to fetch changes up to 609458abd5a10180f513ca364d6c0ae30128c821:

  drm/xe/display/xe_hdcp_gsc: Free arbiter on driver removal (2024-07-11 08:25:32 -0700)


UAPI Changes:
- Use write-back caching mode for system memory on DGFX (Thomas)

Driver Changes:
- Do not leak object when finalizing hdcp gsc (Nirmoy)


Nirmoy Das (1):
  drm/xe/display/xe_hdcp_gsc: Free arbiter on driver removal

Thomas Hellström (1):
  drm/xe: Use write-back caching mode for system memory on DGFX

 drivers/gpu/drm/xe/display/xe_hdcp_gsc.c | 12 +---
 drivers/gpu/drm/xe/xe_bo.c   | 47 +++-
 drivers/gpu/drm/xe/xe_bo_types.h |  3 +-
 include/uapi/drm/xe_drm.h|  8 +-
 4 files changed, 45 insertions(+), 25 deletions(-)


Re: [PULL] drm-xe-next-fixes

2024-07-10 Thread Lucas De Marchi

On Tue, Jul 09, 2024 at 05:31:39PM GMT, Rodrigo Vivi wrote:

Hi Dave and Sima,

Here goes a very early drm-xe-next-fixes with 2 important fixes
that are going to impact user space.

1. The first one is the rename of the OA stuff from the bad 'perf'
name to the xe_observation. Although the rename in the uapi
header is likely inoffensive because our UMDs always copy the
header to their code, there's a sysfs filename change that is
impacting mesa.

For this one Mesa MR is ready and they are only waiting for this
pull request to be picked by you so they can merge that to Mesa:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30027

With both sides in place, there won't be any kernel version
released with the bad naming and no Mesa released using that
filename. This is the main reason that I'm sending this PR
earlier than normal.

2. The second case, which also impact user space is the
write-back caching mode for system memory on DGFX. In this case
we introduce a limitation in the cache selection uapi that is
transparent to UMDs. I mean, no change on their side is needed.
Coherence is maintained with some known possible visible and
acceptable/accepted differences in CPU access speed.


but this commit is also going to drm-xe-fixes, so I don't think we
should have it in this pull. I'm looking at some changes to dim to avoid
this kind of problem in future.

Lucas De Marchi



Thanks,
Rodrigo.

drm-xe-next-fixes-2024-07-09:
UAPI Changes:
- Rename xe perf layer as xe observation layer (Ashutosh)
- Use write-back caching mode for system memory on DGFX (Thomas)

Driver Changes:
- Drop trace_xe_hw_fence_free (Brost)
The following changes since commit 62a05f4ae9c1fb70bc75d494c9c1c373d2c2e374:

 Merge tag 'drm-msm-next-2024-07-04' of https://gitlab.freedesktop.org/drm/msm into drm-next (2024-07-05 12:45:41 +0200)

are available in the Git repository at:

 https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-next-fixes-2024-07-09

for you to fetch changes up to 463108053c19f24fa228863824698d5ca72826b6:

 drm/xe: Drop trace_xe_hw_fence_free (2024-07-09 16:21:26 -0400)


UAPI Changes:
- Rename xe perf layer as xe observation layer (Ashutosh)
- Use write-back caching mode for system memory on DGFX (Thomas)

Driver Changes:
- Drop trace_xe_hw_fence_free (Brost)


Ashutosh Dixit (1):
 drm/xe/uapi: Rename xe perf layer as xe observation layer

Matthew Brost (1):
 drm/xe: Drop trace_xe_hw_fence_free

Thomas Hellström (1):
 drm/xe: Use write-back caching mode for system memory on DGFX

drivers/gpu/drm/xe/Makefile  |   2 +-
drivers/gpu/drm/xe/xe_bo.c   |  47 +--
drivers/gpu/drm/xe/xe_bo_types.h |   3 +-
drivers/gpu/drm/xe/xe_device.c   |   4 +-
drivers/gpu/drm/xe/xe_device_types.h |   2 +-
drivers/gpu/drm/xe/xe_gt_types.h |   2 +-
drivers/gpu/drm/xe/xe_hw_fence.c |   1 -
drivers/gpu/drm/xe/xe_module.c   |   6 +-
drivers/gpu/drm/xe/xe_oa.c   |  34 +--
drivers/gpu/drm/xe/xe_observation.c  |  93 +
drivers/gpu/drm/xe/xe_observation.h  |  20 +++
drivers/gpu/drm/xe/xe_perf.c |  92 -
drivers/gpu/drm/xe/xe_perf.h |  20 ---
drivers/gpu/drm/xe/xe_trace.h|   5 --
include/uapi/drm/xe_drm.h| 110 +++
15 files changed, 227 insertions(+), 214 deletions(-)
create mode 100644 drivers/gpu/drm/xe/xe_observation.c
create mode 100644 drivers/gpu/drm/xe/xe_observation.h
delete mode 100644 drivers/gpu/drm/xe/xe_perf.c
delete mode 100644 drivers/gpu/drm/xe/xe_perf.h


Re: [PATCH -next] drm/xe: Remove duplicate generated/xe_wa_oob.h header

2024-07-09 Thread Lucas De Marchi

On Wed, Jul 10, 2024 at 01:49:41PM GMT, Jiapeng Chong wrote:

./drivers/gpu/drm/xe/xe_gt.c: generated/xe_wa_oob.h is included more than once.

Reported-by: Abaci Robot 
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=9493
Signed-off-by: Jiapeng Chong 


thanks, but a similar patch got merged yesterday:
https://lore.kernel.org/intel-xe/20240708173301.1543871-1-lucas.demar...@intel.com/#r

Lucas De Marchi


Re: [PATCH v2 1/2] drm/i915/gem: Return NULL instead of '0'

2024-06-27 Thread Lucas De Marchi

On Mon, Jun 17, 2024 at 08:42:42PM GMT, Andi Shyti wrote:

Commit 05da7d9f717b ("drm/i915/gem: Downgrade stolen lmem setup
warning") returns '0' from i915_gem_stolen_lmem_setup(), but it's
supposed to return a pointer to the intel_memory_region
structure.

Sparse complains with the following message:


drivers/gpu/drm/i915/gem/i915_gem_stolen.c:943:32: sparse: sparse:

  Using plain integer as NULL pointer

Return NULL.

Signed-off-by: Andi Shyti 
Cc: Jonathan Cavitt 
---
Cc: Lucas De Marchi 


Reviewed-by: Lucas De Marchi 

thanks
Lucas De Marchi



drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
index 004471f60117..9ca73936dc5e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
@@ -940,7 +940,7 @@ i915_gem_stolen_lmem_setup(struct drm_i915_private *i915, u16 type,
drm_dbg(&i915->drm,
"Disabling stolen memory support due to OOB placement: lmem_size = %lli vs dsm_base = %lli\n",
lmem_size, dsm_base);
-   return 0;
+   return NULL;
}
dsm_size = ALIGN_DOWN(lmem_size - dsm_base, SZ_1M);
}
--
2.45.1



Re: [PATCH v2 2/2] drm/i915/gem: Use the correct format specifier for resource_size_t

2024-06-27 Thread Lucas De Marchi

On Mon, Jun 17, 2024 at 08:42:43PM GMT, Andi Shyti wrote:

Commit 05da7d9f717b ("drm/i915/gem: Downgrade stolen lmem setup
warning") adds a debug message where the "lmem_size" and
"dsm_base" variables are printed using the %lli identifier.

However, these variables are defined as resource_size_t, which
are unsigned long for 32-bit machines and unsigned long long for
64-bit machines.

The documentation (core-api/printk-formats.rst) recommends using
the %pa specifier for printing addresses and sizes of resources.

Replace %lli with %pa.

This patch also mutes the following sparse warning when compiling
with:


s/sparse//

I will do that while applying, thanks.

Reviewed-by: Lucas De Marchi 

Lucas De Marchi
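The portability issue behind the %lli-vs-%pa discussion above is the classic one of a format specifier hardcoded for a type whose width differs between 32-bit and 64-bit builds. The kernel's answer for resource_size_t is %pa; the plain-C analogue of deriving the format from the type is shown below as a sketch (function and parameter names are invented for illustration):

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/*
 * Print a size whose underlying type may be 32 or 64 bits wide. Using a
 * fixed-width type plus its matching PRIu64 macro keeps the format string
 * correct on both 32b and 64b builds, the same property the kernel gets
 * from %pa for resource_size_t.
 */
static void format_size(char *buf, size_t buflen,
			uint64_t lmem_size, uint64_t dsm_base)
{
	snprintf(buf, buflen, "lmem_size = %" PRIu64 " vs dsm_base = %" PRIu64,
		 lmem_size, dsm_base);
}
```

Hardcoding %lli instead is exactly what broke the 32b build referenced in the neighboring thread: there the argument is only long-sized, so the format and the argument disagree.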


Re: [PATCH 1/2] drm/i915/gem: Return -EINVAL instead of '0'

2024-06-17 Thread Lucas De Marchi

On Mon, Jun 17, 2024 at 08:38:20PM GMT, Andi Shyti wrote:

On Mon, Jun 17, 2024 at 10:46:07AM -0500, Lucas De Marchi wrote:

On Mon, Jun 17, 2024 at 04:22:11PM GMT, Andi Shyti wrote:
> On Mon, Jun 17, 2024 at 07:55:10AM -0500, Lucas De Marchi wrote:
> > On Sun, Jun 16, 2024 at 09:03:48AM GMT, Andi Shyti wrote:
> > > Commit 05da7d9f717b ("drm/i915/gem: Downgrade stolen lmem setup
> > > warning") returns '0' from i915_gem_stolen_lmem_setup(), but it's
> > > supposed to return a pointer to the intel_memory_region
> > > structure.
> > >
> > > Sparse complains with the following message:
> > >
> > > > > drivers/gpu/drm/i915/gem/i915_gem_stolen.c:943:32: sparse: sparse:
> > >   Using plain integer as NULL pointer
> > >
> > > The caller checks for errors, and if no error is returned, it
> > > stores the address of the stolen memory. Therefore, we can't
> > > return NULL. Since we are handling a case of out-of-bounds, it's
> > > appropriate to treat the "lmem_size < dsm_base" case as an error.
> >
> > which completely invalidates the point of the commit that introduced this
> > regression. That commit was supposed to do "let's continue, just
> > disabling stolen".
>
> Yes, correct, I missed the point while fixing stuff. But patch 2
> is still valid.

no, it's not. It's introduced by the same commit. I went to look into
this exactly because of the second issue: it broke the 32b build in xe
and all the CI. Hooks in xe are failing.


yes, it's broken because it's using %lli, right? In 32b it should
be %li.

Patch 2 is replacing %lli with %pa which should fix the 32b
build.

I'm sending a new series now.


wait... but instead of reverting you are sending a new series changing
the first patch to return NULL. However in your commit message you said
for this version:

The caller checks for errors, and if no error is returned, it
stores the address of the stolen memory. Therefore, we can't
return NULL. Since we are handling a case of out-of-bounds, it's
appropriate to treat the "lmem_size < dsm_base" case as an error.

    Return -EINVAL embedded in a pointer instead of '0' (or NULL).

This way, we avoid a potential NULL pointer dereference.

So... which is it?  Can we return NULL or not? Is this tested in that
scenario with small BAR, or does the module
just fail to load later and explode?

Lucas De Marchi



Andi


Lucas De Marchi


>
> Thanks,
> Andi


Re: [PATCH 1/2] drm/i915/gem: Return -EINVAL instead of '0'

2024-06-17 Thread Lucas De Marchi

On Mon, Jun 17, 2024 at 04:22:11PM GMT, Andi Shyti wrote:

On Mon, Jun 17, 2024 at 07:55:10AM -0500, Lucas De Marchi wrote:

On Sun, Jun 16, 2024 at 09:03:48AM GMT, Andi Shyti wrote:
> Commit 05da7d9f717b ("drm/i915/gem: Downgrade stolen lmem setup
> warning") returns '0' from i915_gem_stolen_lmem_setup(), but it's
> supposed to return a pointer to the intel_memory_region
> structure.
>
> Sparse complains with the following message:
>
> > > drivers/gpu/drm/i915/gem/i915_gem_stolen.c:943:32: sparse: sparse:
>   Using plain integer as NULL pointer
>
> The caller checks for errors, and if no error is returned, it
> stores the address of the stolen memory. Therefore, we can't
> return NULL. Since we are handling a case of out-of-bounds, it's
> appropriate to treat the "lmem_size < dsm_base" case as an error.

which completely invalidates the point of the commit that introduced this
regression. That commit was supposed to do "let's continue, just
disabling stolen".


Yes, correct, I missed the point while fixing stuff. But patch 2
is still valid.


no, it's not. It's introduced by the same commit. I went to look into
this exactly because of the second issue: it broke the 32b build in xe
and all the CI. Hooks in xe are failing.

Lucas De Marchi




Thanks,
Andi


Re: [PATCH 1/2] drm/i915/gem: Return -EINVAL instead of '0'

2024-06-17 Thread Lucas De Marchi

On Sun, Jun 16, 2024 at 09:03:48AM GMT, Andi Shyti wrote:

Commit 05da7d9f717b ("drm/i915/gem: Downgrade stolen lmem setup
warning") returns '0' from i915_gem_stolen_lmem_setup(), but it's
supposed to return a pointer to the intel_memory_region
structure.

Sparse complains with the following message:


drivers/gpu/drm/i915/gem/i915_gem_stolen.c:943:32: sparse: sparse:

  Using plain integer as NULL pointer

The caller checks for errors, and if no error is returned, it
stores the address of the stolen memory. Therefore, we can't
return NULL. Since we are handling a case of out-of-bounds, it's
appropriate to treat the "lmem_size < dsm_base" case as an error.


which completely invalidates the point of the commit that introduced this
regression. That commit was supposed to do "let's continue, just
disabling stolen".  Apparently we should just revert that patch instead,
since it introduced 2 new issues and didn't solve what it was supposed
to... for probe failures, we are completely fine leaving the warning
there.
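For reference, the ERR_PTR()/IS_ERR() convention being argued over can be modeled in userspace C. This is a hypothetical reimplementation for illustration, not the kernel's err.h; it shows why returning 0/NULL from a function whose callers only check IS_ERR() leads to a NULL dereference later:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Model of the kernel's error-pointer encoding: small negative errnos are
 * stored in the top page of the address space, so IS_ERR() distinguishes
 * them from valid pointers. Note that NULL is NOT an error under this
 * scheme: a caller that only checks IS_ERR() treats NULL as success and
 * will dereference it.
 */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long err) { return (void *)(intptr_t)err; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
```

So the two proposed fixes really are different contracts: ERR_PTR(-EINVAL) makes the caller's IS_ERR() check fail the probe, while NULL sails through it, which is why the caller's error-handling convention has to drive the choice.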


Lucas De Marchi



Return -EINVAL embedded in a pointer instead of '0' (or NULL).

This way, we avoid a potential NULL pointer dereference.

Since we are returning an error, it makes sense to print an error
message with drm_err() instead of a debug message using
drm_dbg().

Fixes: 05da7d9f717b ("drm/i915/gem: Downgrade stolen lmem setup warning")
Signed-off-by: Andi Shyti 
Cc: Jonathan Cavitt 
---
drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
index 004471f60117..bd774ce713cf 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
@@ -937,10 +937,10 @@ i915_gem_stolen_lmem_setup(struct drm_i915_private *i915, u16 type,
/* Use DSM base address instead for stolen memory */
dsm_base = intel_uncore_read64(uncore, GEN6_DSMBASE) & 
GEN11_BDSM_MASK;
if (lmem_size < dsm_base) {
-   drm_dbg(&i915->drm,
+   drm_err(&i915->drm,
"Disabling stolen memory support due to OOB placement: lmem_size = %lli vs dsm_base = %lli\n",
lmem_size, dsm_base);
-   return 0;
+   return ERR_PTR(-EINVAL);
}
dsm_size = ALIGN_DOWN(lmem_size - dsm_base, SZ_1M);
}
--
2.45.1



Re: [PATCH 1/2] drm/dp: Describe target_rr_divider in struct drm_dp_as_sdp

2024-06-13 Thread Lucas De Marchi

On Thu, Jun 13, 2024 at 10:43:16AM GMT, Mitul Golani wrote:

Describe newly added parameter target_rr_divider in struct
drm_dp_as_sdp.

Fixes: a20c6d954d75 ("drm/dp: Add refresh rate divider to struct representing AS SDP")
Cc: Mitul Golani 
Cc: Arun R Murthy 
Cc: Suraj Kandpal 
Cc: Ankit Nautiyal 
Cc: Jani Nikula 
Cc: Stephen Rothwell 



^ wrong newline. If you do that, git doesn't recognize them as part of
the trailer.

Lucas De Marchi


Signed-off-by: Mitul Golani 
Reviewed-by: Ankit Nautiyal 
---
include/drm/display/drm_dp_helper.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/include/drm/display/drm_dp_helper.h b/include/drm/display/drm_dp_helper.h
index ea03e1dd26ba..7f2567fa230d 100644
--- a/include/drm/display/drm_dp_helper.h
+++ b/include/drm/display/drm_dp_helper.h
@@ -112,6 +112,7 @@ struct drm_dp_vsc_sdp {
 * @target_rr: Target Refresh
 * @duration_incr_ms: Successive frame duration increase
 * @duration_decr_ms: Successive frame duration decrease
+ * @target_rr_divider: Target refresh rate divider
 * @mode: Adaptive Sync Operation Mode
 */
struct drm_dp_as_sdp {
--
2.45.2



Re: [PATCH v2] MAINTAINERS: Update Xe driver maintainers

2024-06-06 Thread Lucas De Marchi


On Sun, 02 Jun 2024 21:09:59 +0200, Thomas Hellström wrote:
> Add Rodrigo Vivi as an Xe driver maintainer.
> 
> v2:
> - Cc also Lucas De Marchi (Rodrigo vivi)
> - Remove a blank line in commit the commit message (Lucas De Marchi)
> 
> 
> [...]

Applied to drm-xe-fixes, thanks!

[1/1] MAINTAINERS: Update Xe driver maintainers
  commit: a9f9b30e1748252d158f78a0c0affdc949671dd1

Best regards,
-- 
Lucas De Marchi 


Re: [PATCH] MAINTAINERS: Update Xe driver maintainers

2024-05-31 Thread Lucas De Marchi

On Fri, May 31, 2024 at 04:10:51PM GMT, Thomas Hellström wrote:

Add Rodrigo Vivi as an Xe driver maintainer.

Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Rodrigo Vivi 
Cc: dri-devel@lists.freedesktop.org
Cc: linux-ker...@vger.kernel.org



shouldn't have a blank line here. Otherwise git doesn't consider the
lines above as part of the trailer.

Acked-by: Lucas De Marchi 


Maybe we should also add a

T: git https://gitlab.freedesktop.org/drm/i915/kernel.git

on the display side so we direct display patches to go through
drm-intel like we are currently doing. But we can leave this for another
patch.

thanks
Lucas De Marchi


Signed-off-by: Thomas Hellström 
---
MAINTAINERS | 1 +
1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 572be0546e21..8f9982c99257 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11037,6 +11037,7 @@ F:  include/uapi/drm/i915_drm.h
INTEL DRM XE DRIVER (Lunar Lake and newer)
M:      Lucas De Marchi 
M:  Thomas Hellström 
+M: Rodrigo Vivi 
L:  intel...@lists.freedesktop.org
S:  Supported
W:  https://drm.pages.freedesktop.org/intel-docs/
--
2.44.0



Re: FYI: drm-intel repo moved to fd.o GitLab

2024-05-21 Thread Lucas De Marchi

Cc'ing drm and drm-intel maintainers too.

And a few more details below.

On Tue, May 21, 2024 at 02:56:22PM GMT, Knop, Ryszard wrote:

Hello,

As of today, we've moved the drm-intel repository (upstream for
drivers/gpu/drm/i915) to a new location:

Previously: https://cgit.freedesktop.org/drm-intel
Moved to: https://gitlab.freedesktop.org/drm/i915/kernel


For those using dim, it should ask you to migrate your remote
automatically on your next branch update. By default it will use
the ssh protocol. If you don't have your ssh key in gitlab already,
you can follow 
https://docs.gitlab.com/ee/user/ssh.html#add-an-ssh-key-to-your-gitlab-account
to add it.

The new server has no git:// protocol, so anyone using it is advised
to move to https:// instead.



The old repo is now a read-only mirror of the GitLab-hosted one.
Relevant documentation and links were updated. Thanks to bentiss,
mripard and demarchi for help with the move!


and thank you for preparing all the patches to the different repos.

Lucas De Marchi



Ryszard


Re: [PATCH] nightly.conf: Update drm-intel URLs, add missing bare ssh drm-xe URL

2024-05-21 Thread Lucas De Marchi

On Wed, Apr 24, 2024 at 01:32:19PM GMT, Ryszard Knop wrote:

- Switch drm-intel URLs to the new GitLab location.
- Add a short SSH link for drm-xe for completeness with other blocks.
- Add a missing tab in drm_tip_config for column alignment.

Signed-off-by: Ryszard Knop 


Applied, thanks
Lucas De Marchi


Re: [PATCH v5 0/8] drm/xe: Per client usage

2024-05-21 Thread Lucas De Marchi


On Fri, 17 May 2024 13:43:02 -0700, Lucas De Marchi wrote:
> v5 of 
> https://lore.kernel.org/all/20240515214258.59209-1-lucas.demar...@intel.com
> 
> Add per-client usage statistics to xe. This ports xe to use the common
> method in drm to export the usage to userspace per client (where 1
> client == 1 drm fd open).
> 
> However instead of using the current format measured in nsec, this
> creates a new one. The intention here is not to mix the GPU clock domain
> with the CPU clock. It allows to cover a few more use cases without
> extra complications.
> 
> [...]

Applied to drm-xe-next. Thanks everyone for the reviews and contributions.

[1/8] drm/xe: Promote xe_hw_engine_class_to_str()
  commit: ab689514b6ac518ef6e88afa245b834b0dae15a5
[2/8] drm/xe: Add XE_ENGINE_CLASS_OTHER to str conversion
  commit: bd49e50d81b543e678965118a86958d87c045c73
[3/8] drm/xe/lrc: Add helper to capture context timestamp
  commit: 9b090d57746d965684f53a1aefcb363bab653ad3
[4/8] drm/xe: Add helper to capture engine timestamp
  commit: f2f6b667c67daee6fe2c51b5cec3bb0f1b4c1ce0
[5/8] drm/xe: Add helper to accumulate exec queue runtime
  commit: 6109f24f87d75122cf6de50901115cbee4285ce2
[6/8] drm/xe: Cache data about user-visible engines
  commit: baa14865529bf1f3c12dc6145bd9109ef289e038
[7/8] drm/xe: Add helper to return any available hw engine
  commit: 6aa18d7436b0c11f7e62fd6cdb707eaeab1dc473
[8/8] drm/xe/client: Print runtime to fdinfo
  commit: 188ced1e0ff892f0948f20480e2e0122380ae46d

Best regards,
-- 
Lucas De Marchi 


Re: [PATCH v5 5/8] drm/xe: Add helper to accumulate exec queue runtime

2024-05-20 Thread Lucas De Marchi

On Tue, May 21, 2024 at 02:53:56AM GMT, Matthew Brost wrote:

On Sat, May 18, 2024 at 09:37:20AM -0500, Lucas De Marchi wrote:

On Fri, May 17, 2024 at 03:40:22PM GMT, Matt Roper wrote:
> On Fri, May 17, 2024 at 01:43:07PM -0700, Lucas De Marchi wrote:
> > From: Umesh Nerlige Ramappa 
> >
> > Add a helper to accumulate per-client runtime of all its
> > exec queues. This is called every time a sched job is finished.
> >
> > v2:
> >   - Use guc_exec_queue_free_job() and execlist_job_free() to accumulate
> > runtime when job is finished since xe_sched_job_completed() is not a
> > notification that job finished.
> >   - Stop trying to update runtime from xe_exec_queue_fini() - that is
> > redundant and may happen after xef is closed, leading to a
> > use-after-free
> >   - Do not special case the first timestamp read: the default LRC sets
> > CTX_TIMESTAMP to zero, so even the first sample should be a valid
> > one.
> >   - Handle the parallel submission case by multiplying the runtime by
> >     width.
> > v3: Update comments
> >
> > Signed-off-by: Umesh Nerlige Ramappa 
> > Signed-off-by: Lucas De Marchi 
> > ---
> >  drivers/gpu/drm/xe/xe_device_types.h |  3 +++
> >  drivers/gpu/drm/xe/xe_exec_queue.c   | 37 
> >  drivers/gpu/drm/xe/xe_exec_queue.h   |  1 +
> >  drivers/gpu/drm/xe/xe_execlist.c |  1 +
> >  drivers/gpu/drm/xe/xe_guc_submit.c   |  2 ++
> >  5 files changed, 44 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_device_types.h 
b/drivers/gpu/drm/xe/xe_device_types.h
> > index 5c5e36de452a..bc97990fd032 100644
> > --- a/drivers/gpu/drm/xe/xe_device_types.h
> > +++ b/drivers/gpu/drm/xe/xe_device_types.h
> > @@ -555,6 +555,9 @@ struct xe_file {
> >   struct mutex lock;
> >   } exec_queue;
> >
> > + /** @runtime: hw engine class runtime in ticks for this drm client */
> > + u64 runtime[XE_ENGINE_CLASS_MAX];
> > +
> >   /** @client: drm client */
> >   struct xe_drm_client *client;
> >  };
> > diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c 
b/drivers/gpu/drm/xe/xe_exec_queue.c
> > index 395de93579fa..fa6dc996eca8 100644
> > --- a/drivers/gpu/drm/xe/xe_exec_queue.c
> > +++ b/drivers/gpu/drm/xe/xe_exec_queue.c
> > @@ -769,6 +769,43 @@ bool xe_exec_queue_is_idle(struct xe_exec_queue *q)
> >   q->lrc[0].fence_ctx.next_seqno - 1;
> >  }
> >
> > +/**
> > + * xe_exec_queue_update_runtime() - Update runtime for this exec queue 
from hw
> > + * @q: The exec queue
> > + *
> > + * Update the timestamp saved by HW for this exec queue and save runtime
> > + * calculated by using the delta from last update. In the multi-lrc case, only 
the
> > + * first is considered.
> > + */
> > +void xe_exec_queue_update_runtime(struct xe_exec_queue *q)
> > +{
> > + struct xe_file *xef;
> > + struct xe_lrc *lrc;
> > + u32 old_ts, new_ts;
> > +
> > + /*
> > +  * Jobs that are run during driver load may use an exec_queue, but are
> > +  * not associated with a user xe file, so avoid accumulating busyness
> > +  * for kernel specific work.
> > +  */
> > + if (!q->vm || !q->vm->xef)
> > + return;
> > +
> > + xef = q->vm->xef;
> > +
> > + /*
> > +  * Only sample the first LRC. For parallel submission, all of them are
> > +  * scheduled together and we compensate that below by multiplying by
> > +  * width - this may introduce errors if that premise is not true and
> > +  * they don't exit 100% aligned. On the other hand, looping through
> > +  * the LRCs and reading them at different times could also introduce
> > +  * errors.
>
> At the time we're executing this function, those LRCs aren't executing
> on the hardware anymore and their timestamps aren't continuing to move,

not necessarily. Besides calling this function when execution ends, see
the last patch: this is also called when userspace reads the fdinfo file, which
potentially reads this for running LRCs.

> right?  I don't see where error could creep in from just looping over
> each of them?
>
> I guess parallel submission is mostly just used by media these days,
> where the batches submitted in parallel are nearly identical and
> expected to run the same amount of time, right?  Do we have any

what I got from Matt Brost is that they are always scheduled together
and will run together on the different instances of that engine class,
regardless of whether they are different.

Re: [PATCH v5 5/8] drm/xe: Add helper to accumulate exec queue runtime

2024-05-18 Thread Lucas De Marchi

On Fri, May 17, 2024 at 03:40:22PM GMT, Matt Roper wrote:

On Fri, May 17, 2024 at 01:43:07PM -0700, Lucas De Marchi wrote:

From: Umesh Nerlige Ramappa 

Add a helper to accumulate per-client runtime of all its
exec queues. This is called every time a sched job is finished.

v2:
  - Use guc_exec_queue_free_job() and execlist_job_free() to accumulate
runtime when job is finished since xe_sched_job_completed() is not a
notification that job finished.
  - Stop trying to update runtime from xe_exec_queue_fini() - that is
redundant and may happen after xef is closed, leading to a
use-after-free
  - Do not special case the first timestamp read: the default LRC sets
CTX_TIMESTAMP to zero, so even the first sample should be a valid
one.
  - Handle the parallel submission case by multiplying the runtime by
width.
v3: Update comments

Signed-off-by: Umesh Nerlige Ramappa 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_device_types.h |  3 +++
 drivers/gpu/drm/xe/xe_exec_queue.c   | 37 
 drivers/gpu/drm/xe/xe_exec_queue.h   |  1 +
 drivers/gpu/drm/xe/xe_execlist.c |  1 +
 drivers/gpu/drm/xe/xe_guc_submit.c   |  2 ++
 5 files changed, 44 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_device_types.h 
b/drivers/gpu/drm/xe/xe_device_types.h
index 5c5e36de452a..bc97990fd032 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -555,6 +555,9 @@ struct xe_file {
struct mutex lock;
} exec_queue;

+   /** @runtime: hw engine class runtime in ticks for this drm client */
+   u64 runtime[XE_ENGINE_CLASS_MAX];
+
/** @client: drm client */
struct xe_drm_client *client;
 };
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c 
b/drivers/gpu/drm/xe/xe_exec_queue.c
index 395de93579fa..fa6dc996eca8 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -769,6 +769,43 @@ bool xe_exec_queue_is_idle(struct xe_exec_queue *q)
q->lrc[0].fence_ctx.next_seqno - 1;
 }

+/**
+ * xe_exec_queue_update_runtime() - Update runtime for this exec queue from hw
+ * @q: The exec queue
+ *
+ * Update the timestamp saved by HW for this exec queue and save runtime
+ * calculated by using the delta from last update. In the multi-lrc case, only the
+ * first is considered.
+ */
+void xe_exec_queue_update_runtime(struct xe_exec_queue *q)
+{
+   struct xe_file *xef;
+   struct xe_lrc *lrc;
+   u32 old_ts, new_ts;
+
+   /*
+* Jobs that are run during driver load may use an exec_queue, but are
+* not associated with a user xe file, so avoid accumulating busyness
+* for kernel specific work.
+*/
+   if (!q->vm || !q->vm->xef)
+   return;
+
+   xef = q->vm->xef;
+
+   /*
+* Only sample the first LRC. For parallel submission, all of them are
+* scheduled together and we compensate that below by multiplying by
+* width - this may introduce errors if that premise is not true and
+* they don't exit 100% aligned. On the other hand, looping through
+* the LRCs and reading them at different times could also introduce
+* errors.


At the time we're executing this function, those LRCs aren't executing
on the hardware anymore and their timestamps aren't continuing to move,


not necessarily. Besides calling this function when execution ends, see the
last patch: this is also called when userspace reads the fdinfo file, which
potentially reads this for running LRCs.


right?  I don't see where error could creep in from just looping over
each of them?

I guess parallel submission is mostly just used by media these days,
where the batches submitted in parallel are nearly identical and
expected to run the same amount of time, right?  Do we have any


what I got from Matt Brost is that they are always scheduled together
and will run together on the different instances of that engine class,
regardless of whether they are different. As you said, I'm not sure there's
even a use case for running different batches.  +Matt Brost to confirm
my reasoning below.

Looking at our uapi and some tests in igt. This is controlled by the width
arg when creating the exec queue. Later when executing, we can only pass 1
sync obj to wait for completion. So even if userspace passes different
batch addresses during exec (i.e. different batches), the whole lot will
not complete until everything finishes. I think it's reasonable to
consider everything busy until it completes.


userspace (or potential future userspace) that might submit
heterogeneous batches in parallel, which would make this inaccurate?

I'm not very familiar with the use cases of parallel submission, so I'll
trust that you've got a better understanding of the userspace usage than
I do; everything else here looks fine to me.


I kind of

[PATCH v5 6/8] drm/xe: Cache data about user-visible engines

2024-05-17 Thread Lucas De Marchi
gt->info.engine_mask used to indicate the available engines, but that
is not always true anymore: some engines are reserved to kernel and some
may be exposed as a single engine (e.g. with ccs_mode).

Runtime changes only happen when no clients exist, so it's safe to cache
the list of engines in the gt and update it when needed. This will
help implement per-client engine utilization so this (mostly
constant) information doesn't need to be recalculated on every query.

Reviewed-by: Jonathan Cavitt 
Reviewed-by: Umesh Nerlige Ramappa 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_gt.c  | 23 +++
 drivers/gpu/drm/xe/xe_gt.h  | 13 +
 drivers/gpu/drm/xe/xe_gt_ccs_mode.c |  1 +
 drivers/gpu/drm/xe/xe_gt_types.h| 21 -
 4 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index e69a03ddd255..5194a3d38e76 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -560,9 +560,32 @@ int xe_gt_init(struct xe_gt *gt)
if (err)
return err;
 
+   xe_gt_record_user_engines(gt);
+
return drmm_add_action_or_reset(>_to_xe(gt)->drm, gt_fini, gt);
 }
 
+void xe_gt_record_user_engines(struct xe_gt *gt)
+{
+   struct xe_hw_engine *hwe;
+   enum xe_hw_engine_id id;
+
+   gt->user_engines.mask = 0;
+   memset(gt->user_engines.instances_per_class, 0,
+  sizeof(gt->user_engines.instances_per_class));
+
+   for_each_hw_engine(hwe, gt, id) {
+   if (xe_hw_engine_is_reserved(hwe))
+   continue;
+
+   gt->user_engines.mask |= BIT_ULL(id);
+   gt->user_engines.instances_per_class[hwe->class]++;
+   }
+
+   xe_gt_assert(gt, (gt->user_engines.mask | gt->info.engine_mask)
+== gt->info.engine_mask);
+}
+
 static int do_gt_reset(struct xe_gt *gt)
 {
int err;
diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
index 8474c50b1b30..1d010bf4a756 100644
--- a/drivers/gpu/drm/xe/xe_gt.h
+++ b/drivers/gpu/drm/xe/xe_gt.h
@@ -38,6 +38,19 @@ int xe_gt_init_hwconfig(struct xe_gt *gt);
 int xe_gt_init_early(struct xe_gt *gt);
 int xe_gt_init(struct xe_gt *gt);
 int xe_gt_record_default_lrcs(struct xe_gt *gt);
+
+/**
+ * xe_gt_record_user_engines - save data related to engines available to
+ * userspace
+ * @gt: GT structure
+ *
+ * Walk the available HW engines from gt->info.engine_mask and calculate data
+ * related to those engines that may be used by userspace. To be used whenever
+ * available engines change in runtime (e.g. with ccs_mode) or during
+ * initialization
+ */
+void xe_gt_record_user_engines(struct xe_gt *gt);
+
 void xe_gt_suspend_prepare(struct xe_gt *gt);
 int xe_gt_suspend(struct xe_gt *gt);
 int xe_gt_resume(struct xe_gt *gt);
diff --git a/drivers/gpu/drm/xe/xe_gt_ccs_mode.c 
b/drivers/gpu/drm/xe/xe_gt_ccs_mode.c
index a34c9a24dafc..c36218f4f6c8 100644
--- a/drivers/gpu/drm/xe/xe_gt_ccs_mode.c
+++ b/drivers/gpu/drm/xe/xe_gt_ccs_mode.c
@@ -134,6 +134,7 @@ ccs_mode_store(struct device *kdev, struct device_attribute 
*attr,
if (gt->ccs_mode != num_engines) {
xe_gt_info(gt, "Setting compute mode to %d\n", num_engines);
gt->ccs_mode = num_engines;
+   xe_gt_record_user_engines(gt);
xe_gt_reset_async(gt);
}
 
diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
index 475fb58882f1..10a9a9529377 100644
--- a/drivers/gpu/drm/xe/xe_gt_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_types.h
@@ -113,7 +113,11 @@ struct xe_gt {
enum xe_gt_type type;
/** @info.reference_clock: clock frequency */
u32 reference_clock;
-   /** @info.engine_mask: mask of engines present on GT */
+   /**
+* @info.engine_mask: mask of engines present on GT. Some of
+* them may be reserved in runtime and not available for user.
+* See @user_engines.mask
+*/
u64 engine_mask;
/** @info.gmdid: raw GMD_ID value from hardware */
u32 gmdid;
@@ -368,6 +372,21 @@ struct xe_gt {
/** @wa_active.oob: bitmap with active OOB workarounds */
unsigned long *oob;
} wa_active;
+
+   /** @user_engines: engines present in GT and available to userspace */
+   struct {
+   /**
+* @user_engines.mask: like @info->engine_mask, but take in
+* consideration only engines available to userspace
+*/
+   u64 mask;
+
+   /**
+* @user_engines.instances_per_class: aggregate per class the
+* number of engines available to userspace
+

[PATCH v5 4/8] drm/xe: Add helper to capture engine timestamp

2024-05-17 Thread Lucas De Marchi
Just like CTX_TIMESTAMP is used to calculate runtime, add a helper to
get the timestamp for the engine so it can be used to calculate the
"engine time" in the same unit in which the runtime is recorded.

Reviewed-by: Umesh Nerlige Ramappa 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_hw_engine.c | 5 +
 drivers/gpu/drm/xe/xe_hw_engine.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c 
b/drivers/gpu/drm/xe/xe_hw_engine.c
index 942fca8f1eb9..de1aefaa2335 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.c
+++ b/drivers/gpu/drm/xe/xe_hw_engine.c
@@ -1121,3 +1121,8 @@ const char *xe_hw_engine_class_to_str(enum 
xe_engine_class class)
 
return NULL;
 }
+
+u64 xe_hw_engine_read_timestamp(struct xe_hw_engine *hwe)
+{
+   return xe_mmio_read64_2x32(hwe->gt, RING_TIMESTAMP(hwe->mmio_base));
+}
diff --git a/drivers/gpu/drm/xe/xe_hw_engine.h 
b/drivers/gpu/drm/xe/xe_hw_engine.h
index 843de159e47c..7f2d27c0ba1a 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.h
+++ b/drivers/gpu/drm/xe/xe_hw_engine.h
@@ -68,5 +68,6 @@ static inline bool xe_hw_engine_is_valid(struct xe_hw_engine 
*hwe)
 }
 
 const char *xe_hw_engine_class_to_str(enum xe_engine_class class);
+u64 xe_hw_engine_read_timestamp(struct xe_hw_engine *hwe);
 
 #endif
-- 
2.43.0



[PATCH v5 7/8] drm/xe: Add helper to return any available hw engine

2024-05-17 Thread Lucas De Marchi
Get the first available engine from a gt, which helps in cases where any
engine can serve as a context, such as when reading RING_TIMESTAMP.

Reviewed-by: Umesh Nerlige Ramappa 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_gt.c | 11 +++
 drivers/gpu/drm/xe/xe_gt.h |  7 +++
 2 files changed, 18 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index 5194a3d38e76..3432fef56486 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -833,3 +833,14 @@ struct xe_hw_engine 
*xe_gt_any_hw_engine_by_reset_domain(struct xe_gt *gt,
 
return NULL;
 }
+
+struct xe_hw_engine *xe_gt_any_hw_engine(struct xe_gt *gt)
+{
+   struct xe_hw_engine *hwe;
+   enum xe_hw_engine_id id;
+
+   for_each_hw_engine(hwe, gt, id)
+   return hwe;
+
+   return NULL;
+}
diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
index 1d010bf4a756..9073ac68a777 100644
--- a/drivers/gpu/drm/xe/xe_gt.h
+++ b/drivers/gpu/drm/xe/xe_gt.h
@@ -67,6 +67,13 @@ void xe_gt_remove(struct xe_gt *gt);
 struct xe_hw_engine *
 xe_gt_any_hw_engine_by_reset_domain(struct xe_gt *gt, enum xe_engine_class 
class);
 
+/**
+ * xe_gt_any_hw_engine - scan the list of engines and return the
+ * first available
+ * @gt: GT structure
+ */
+struct xe_hw_engine *xe_gt_any_hw_engine(struct xe_gt *gt);
+
 struct xe_hw_engine *xe_gt_hw_engine(struct xe_gt *gt,
 enum xe_engine_class class,
 u16 instance,
-- 
2.43.0



[PATCH v5 5/8] drm/xe: Add helper to accumulate exec queue runtime

2024-05-17 Thread Lucas De Marchi
From: Umesh Nerlige Ramappa 

Add a helper to accumulate per-client runtime of all its
exec queues. This is called every time a sched job is finished.

v2:
  - Use guc_exec_queue_free_job() and execlist_job_free() to accumulate
runtime when job is finished since xe_sched_job_completed() is not a
notification that job finished.
  - Stop trying to update runtime from xe_exec_queue_fini() - that is
redundant and may happen after xef is closed, leading to a
use-after-free
  - Do not special case the first timestamp read: the default LRC sets
CTX_TIMESTAMP to zero, so even the first sample should be a valid
one.
  - Handle the parallel submission case by multiplying the runtime by
width.
v3: Update comments

Signed-off-by: Umesh Nerlige Ramappa 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_device_types.h |  3 +++
 drivers/gpu/drm/xe/xe_exec_queue.c   | 37 
 drivers/gpu/drm/xe/xe_exec_queue.h   |  1 +
 drivers/gpu/drm/xe/xe_execlist.c |  1 +
 drivers/gpu/drm/xe/xe_guc_submit.c   |  2 ++
 5 files changed, 44 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_device_types.h 
b/drivers/gpu/drm/xe/xe_device_types.h
index 5c5e36de452a..bc97990fd032 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -555,6 +555,9 @@ struct xe_file {
struct mutex lock;
} exec_queue;
 
+   /** @runtime: hw engine class runtime in ticks for this drm client */
+   u64 runtime[XE_ENGINE_CLASS_MAX];
+
/** @client: drm client */
struct xe_drm_client *client;
 };
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c 
b/drivers/gpu/drm/xe/xe_exec_queue.c
index 395de93579fa..fa6dc996eca8 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -769,6 +769,43 @@ bool xe_exec_queue_is_idle(struct xe_exec_queue *q)
q->lrc[0].fence_ctx.next_seqno - 1;
 }
 
+/**
+ * xe_exec_queue_update_runtime() - Update runtime for this exec queue from hw
+ * @q: The exec queue
+ *
+ * Update the timestamp saved by HW for this exec queue and save runtime
+ * calculated by using the delta from last update. In the multi-lrc case, only the
+ * first is considered.
+ */
+void xe_exec_queue_update_runtime(struct xe_exec_queue *q)
+{
+   struct xe_file *xef;
+   struct xe_lrc *lrc;
+   u32 old_ts, new_ts;
+
+   /*
+* Jobs that are run during driver load may use an exec_queue, but are
+* not associated with a user xe file, so avoid accumulating busyness
+* for kernel specific work.
+*/
+   if (!q->vm || !q->vm->xef)
+   return;
+
+   xef = q->vm->xef;
+
+   /*
+* Only sample the first LRC. For parallel submission, all of them are
+* scheduled together and we compensate that below by multiplying by
+* width - this may introduce errors if that premise is not true and
+* they don't exit 100% aligned. On the other hand, looping through
+* the LRCs and reading them at different times could also introduce
+* errors.
+*/
+   lrc = &q->lrc[0];
+   new_ts = xe_lrc_update_timestamp(lrc, &old_ts);
+   xef->runtime[q->class] += (new_ts - old_ts) * q->width;
+}
+
 void xe_exec_queue_kill(struct xe_exec_queue *q)
 {
struct xe_exec_queue *eq = q, *next;
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.h 
b/drivers/gpu/drm/xe/xe_exec_queue.h
index 48f6da53a292..e0f07d28ee1a 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.h
+++ b/drivers/gpu/drm/xe/xe_exec_queue.h
@@ -75,5 +75,6 @@ struct dma_fence *xe_exec_queue_last_fence_get(struct 
xe_exec_queue *e,
   struct xe_vm *vm);
 void xe_exec_queue_last_fence_set(struct xe_exec_queue *e, struct xe_vm *vm,
  struct dma_fence *fence);
+void xe_exec_queue_update_runtime(struct xe_exec_queue *q);
 
 #endif
diff --git a/drivers/gpu/drm/xe/xe_execlist.c b/drivers/gpu/drm/xe/xe_execlist.c
index e9dee1e14fef..bd7f27efe0e0 100644
--- a/drivers/gpu/drm/xe/xe_execlist.c
+++ b/drivers/gpu/drm/xe/xe_execlist.c
@@ -306,6 +306,7 @@ static void execlist_job_free(struct drm_sched_job *drm_job)
 {
struct xe_sched_job *job = to_xe_sched_job(drm_job);
 
+   xe_exec_queue_update_runtime(job->q);
xe_sched_job_put(job);
 }
 
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c 
b/drivers/gpu/drm/xe/xe_guc_submit.c
index 4efb88e3e056..ad2b8067d071 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -749,6 +749,8 @@ static void guc_exec_queue_free_job(struct drm_sched_job 
*drm_job)
 {
struct xe_sched_job *job = to_xe_sched_job(drm_job);
 
+   xe_exec_queue_update_runtime(job->q);
+
trace_xe_sched_job_free(job);
xe_sched_job_put(job);
 }
-- 
2.43.0



[PATCH v5 3/8] drm/xe/lrc: Add helper to capture context timestamp

2024-05-17 Thread Lucas De Marchi
From: Umesh Nerlige Ramappa 

Add a helper to capture CTX_TIMESTAMP from the context image so it can
be used to calculate the runtime.

v2: Add kernel-doc to clarify expectation from caller

Signed-off-by: Umesh Nerlige Ramappa 
Reviewed-by: Francois Dugast 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/regs/xe_lrc_layout.h |  1 +
 drivers/gpu/drm/xe/xe_lrc.c | 12 
 drivers/gpu/drm/xe/xe_lrc.h | 14 ++
 drivers/gpu/drm/xe/xe_lrc_types.h   |  3 +++
 4 files changed, 30 insertions(+)

diff --git a/drivers/gpu/drm/xe/regs/xe_lrc_layout.h 
b/drivers/gpu/drm/xe/regs/xe_lrc_layout.h
index e6ca8bbda8f4..045dfd09db99 100644
--- a/drivers/gpu/drm/xe/regs/xe_lrc_layout.h
+++ b/drivers/gpu/drm/xe/regs/xe_lrc_layout.h
@@ -11,6 +11,7 @@
 #define CTX_RING_TAIL  (0x06 + 1)
 #define CTX_RING_START (0x08 + 1)
 #define CTX_RING_CTL   (0x0a + 1)
+#define CTX_TIMESTAMP  (0x22 + 1)
 #define CTX_INDIRECT_RING_STATE(0x26 + 1)
 #define CTX_PDP0_UDW   (0x30 + 1)
 #define CTX_PDP0_LDW   (0x32 + 1)
diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
index 9b0a4078add3..f679cb9aaea7 100644
--- a/drivers/gpu/drm/xe/xe_lrc.c
+++ b/drivers/gpu/drm/xe/xe_lrc.c
@@ -844,6 +844,7 @@ int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine 
*hwe,
lrc->tile = gt_to_tile(hwe->gt);
lrc->ring.size = ring_size;
lrc->ring.tail = 0;
+   lrc->ctx_timestamp = 0;
 
xe_hw_fence_ctx_init(&lrc->fence_ctx, hwe->gt,
 hwe->fence_irq, hwe->name);
@@ -898,6 +899,8 @@ int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine 
*hwe,
 RING_CTL_SIZE(lrc->ring.size) | 
RING_VALID);
}
 
+   xe_lrc_write_ctx_reg(lrc, CTX_TIMESTAMP, 0);
+
if (xe->info.has_asid && vm)
xe_lrc_write_ctx_reg(lrc, PVC_CTX_ASID, vm->usm.asid);
 
@@ -1576,3 +1579,12 @@ void xe_lrc_snapshot_free(struct xe_lrc_snapshot 
*snapshot)
xe_bo_put(snapshot->lrc_bo);
kfree(snapshot);
 }
+
+u32 xe_lrc_update_timestamp(struct xe_lrc *lrc, u32 *old_ts)
+{
+   *old_ts = lrc->ctx_timestamp;
+
+   lrc->ctx_timestamp = xe_lrc_read_ctx_reg(lrc, CTX_TIMESTAMP);
+
+   return lrc->ctx_timestamp;
+}
diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
index e0e841963c23..b9da1031083b 100644
--- a/drivers/gpu/drm/xe/xe_lrc.h
+++ b/drivers/gpu/drm/xe/xe_lrc.h
@@ -66,4 +66,18 @@ void xe_lrc_snapshot_capture_delayed(struct xe_lrc_snapshot 
*snapshot);
 void xe_lrc_snapshot_print(struct xe_lrc_snapshot *snapshot, struct 
drm_printer *p);
 void xe_lrc_snapshot_free(struct xe_lrc_snapshot *snapshot);
 
+/**
+ * xe_lrc_update_timestamp - readout LRC timestamp and update cached value
+ * @lrc: logical ring context for this exec queue
+ * @old_ts: pointer where to save the previous timestamp
+ *
+ * Read the current timestamp for this LRC and update the cached value. The
+ * previous cached value is also returned in @old_ts so the caller can 
calculate
+ * the delta between 2 updates. Note that this is not intended to be called 
from
+ * arbitrary places, but only by the paths updating the drm client utilization.
+ *
+ * Returns the current LRC timestamp
+ */
+u32 xe_lrc_update_timestamp(struct xe_lrc *lrc, u32 *old_ts);
+
 #endif
diff --git a/drivers/gpu/drm/xe/xe_lrc_types.h 
b/drivers/gpu/drm/xe/xe_lrc_types.h
index cdbf03faef15..0fa055da6b27 100644
--- a/drivers/gpu/drm/xe/xe_lrc_types.h
+++ b/drivers/gpu/drm/xe/xe_lrc_types.h
@@ -45,6 +45,9 @@ struct xe_lrc {
 
/** @fence_ctx: context for hw fence */
struct xe_hw_fence_ctx fence_ctx;
+
+   /** @ctx_timestamp: readout value of CTX_TIMESTAMP on last update */
+   u32 ctx_timestamp;
 };
 
 struct xe_lrc_snapshot;
-- 
2.43.0



[PATCH v5 2/8] drm/xe: Add XE_ENGINE_CLASS_OTHER to str conversion

2024-05-17 Thread Lucas De Marchi
XE_ENGINE_CLASS_OTHER was missing from the str conversion. Add it and
remove the default handling so it's protected by -Wswitch.
Currently the only user is xe_hw_engine_class_sysfs_init(), which
already skips XE_ENGINE_CLASS_OTHER, so there's no change in behavior.

Reviewed-by: Nirmoy Das 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_hw_engine.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c 
b/drivers/gpu/drm/xe/xe_hw_engine.c
index b71e90c555fa..942fca8f1eb9 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.c
+++ b/drivers/gpu/drm/xe/xe_hw_engine.c
@@ -,9 +,13 @@ const char *xe_hw_engine_class_to_str(enum 
xe_engine_class class)
return "vecs";
case XE_ENGINE_CLASS_COPY:
return "bcs";
+   case XE_ENGINE_CLASS_OTHER:
+   return "other";
case XE_ENGINE_CLASS_COMPUTE:
return "ccs";
-   default:
-   return NULL;
+   case XE_ENGINE_CLASS_MAX:
+   break;
}
+
+   return NULL;
 }
-- 
2.43.0



[PATCH v5 1/8] drm/xe: Promote xe_hw_engine_class_to_str()

2024-05-17 Thread Lucas De Marchi
Move it out of the sysfs compilation unit so it can be re-used in other
places.

Reviewed-by: Nirmoy Das 
Reviewed-by: Oak Zeng 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_hw_engine.c | 18 ++
 drivers/gpu/drm/xe/xe_hw_engine.h |  2 ++
 drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c | 18 --
 3 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c 
b/drivers/gpu/drm/xe/xe_hw_engine.c
index e19af179af33..b71e90c555fa 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.c
+++ b/drivers/gpu/drm/xe/xe_hw_engine.c
@@ -1099,3 +1099,21 @@ bool xe_hw_engine_is_reserved(struct xe_hw_engine *hwe)
return xe->info.has_usm && hwe->class == XE_ENGINE_CLASS_COPY &&
hwe->instance == gt->usm.reserved_bcs_instance;
 }
+
+const char *xe_hw_engine_class_to_str(enum xe_engine_class class)
+{
+   switch (class) {
+   case XE_ENGINE_CLASS_RENDER:
+   return "rcs";
+   case XE_ENGINE_CLASS_VIDEO_DECODE:
+   return "vcs";
+   case XE_ENGINE_CLASS_VIDEO_ENHANCE:
+   return "vecs";
+   case XE_ENGINE_CLASS_COPY:
+   return "bcs";
+   case XE_ENGINE_CLASS_COMPUTE:
+   return "ccs";
+   default:
+   return NULL;
+   }
+}
diff --git a/drivers/gpu/drm/xe/xe_hw_engine.h 
b/drivers/gpu/drm/xe/xe_hw_engine.h
index 71968ee2f600..843de159e47c 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.h
+++ b/drivers/gpu/drm/xe/xe_hw_engine.h
@@ -67,4 +67,6 @@ static inline bool xe_hw_engine_is_valid(struct xe_hw_engine 
*hwe)
return hwe->name;
 }
 
+const char *xe_hw_engine_class_to_str(enum xe_engine_class class);
+
 #endif
diff --git a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c 
b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c
index 844ec68cbbb8..efce6c7dd2a2 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c
+++ b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c
@@ -618,24 +618,6 @@ static void hw_engine_class_sysfs_fini(struct drm_device 
*drm, void *arg)
kobject_put(kobj);
 }
 
-static const char *xe_hw_engine_class_to_str(enum xe_engine_class class)
-{
-   switch (class) {
-   case XE_ENGINE_CLASS_RENDER:
-   return "rcs";
-   case XE_ENGINE_CLASS_VIDEO_DECODE:
-   return "vcs";
-   case XE_ENGINE_CLASS_VIDEO_ENHANCE:
-   return "vecs";
-   case XE_ENGINE_CLASS_COPY:
-   return "bcs";
-   case XE_ENGINE_CLASS_COMPUTE:
-   return "ccs";
-   default:
-   return NULL;
-   }
-}
-
 /**
  * xe_hw_engine_class_sysfs_init - Init HW engine classes on GT.
  * @gt: Xe GT.
-- 
2.43.0



[PATCH v5 0/8] drm/xe: Per client usage

2024-05-17 Thread Lucas De Marchi
v5 of 
https://lore.kernel.org/all/20240515214258.59209-1-lucas.demar...@intel.com

Add per-client usage statistics to xe. This ports xe to use the common
method in drm to export the usage to userspace per client (where 1
client == 1 drm fd open).

However instead of using the current format measured in nsec, this
creates a new one. The intention here is not to mix the GPU clock domain
with the CPU clock. This allows covering a few more use cases without
extra complications.

This version is tested on an ADL-P; gputop was also checked with i915 to
make sure it hasn't regressed. The last patch also contains the documentation
for the new key and sample output, as requested in v1.

The pre-existing drm-cycles- is used as is, which allows gputop
to work with xe.

This last patch still has some open discussion from v2, so we may need
to hold it a little more.

v2:
  - Create a new drm-total-cycles instead of re-using drm-engine with a
different unit
  - Add documentation for the new interface and clarify usage of
xe_lrc_update_timestamp()

v3:
  - Fix bugs in "drm/xe: Add helper to accumulate exec queue runtime" -
see commit message
  - Reorder commits so the ones that are useful in other patch series
come first

v4:
  - Fix some comments and documentation
  - Add 2 patches so we cache on the gt the mask of engines visible to
userspace and the per-class capacity. Previously we were doing this
during the query, but besides not being very efficient as pointed
by Tvrtko, we were also not correctly handling reserved engines and
engines "hidden" by e.g. ccs_mode.  So move that part to be executed
on init and when changing the available engines.
  - Simplify the fdinfo output loop since now we have the information
cached on gt. This also moves the read of the gpu timestamp out
of the loop as suggested by Tvrtko and using the helpers implemented
in the new patches.

v5:
  - Fix kernel-doc
  - Move pm_runtime_put() earlier in the function as it's not needed
anymore after interacting with the HW.

Lucas De Marchi (6):
  drm/xe: Promote xe_hw_engine_class_to_str()
  drm/xe: Add XE_ENGINE_CLASS_OTHER to str conversion
  drm/xe: Add helper to capture engine timestamp
  drm/xe: Cache data about user-visible engines
  drm/xe: Add helper to return any available hw engine
  drm/xe/client: Print runtime to fdinfo

Umesh Nerlige Ramappa (2):
  drm/xe/lrc: Add helper to capture context timestamp
  drm/xe: Add helper to accumulate exec queue runtime

 Documentation/gpu/drm-usage-stats.rst |  21 ++-
 Documentation/gpu/xe/index.rst|   1 +
 Documentation/gpu/xe/xe-drm-usage-stats.rst   |  10 ++
 drivers/gpu/drm/xe/regs/xe_lrc_layout.h   |   1 +
 drivers/gpu/drm/xe/xe_device_types.h  |   3 +
 drivers/gpu/drm/xe/xe_drm_client.c| 121 +-
 drivers/gpu/drm/xe/xe_exec_queue.c|  37 ++
 drivers/gpu/drm/xe/xe_exec_queue.h|   1 +
 drivers/gpu/drm/xe/xe_execlist.c  |   1 +
 drivers/gpu/drm/xe/xe_gt.c|  34 +
 drivers/gpu/drm/xe/xe_gt.h|  20 +++
 drivers/gpu/drm/xe/xe_gt_ccs_mode.c   |   1 +
 drivers/gpu/drm/xe/xe_gt_types.h  |  21 ++-
 drivers/gpu/drm/xe/xe_guc_submit.c|   2 +
 drivers/gpu/drm/xe/xe_hw_engine.c |  27 
 drivers/gpu/drm/xe/xe_hw_engine.h |   3 +
 drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c |  18 ---
 drivers/gpu/drm/xe/xe_lrc.c   |  12 ++
 drivers/gpu/drm/xe/xe_lrc.h   |  14 ++
 drivers/gpu/drm/xe/xe_lrc_types.h |   3 +
 20 files changed, 329 insertions(+), 22 deletions(-)
 create mode 100644 Documentation/gpu/xe/xe-drm-usage-stats.rst

-- 
2.43.0



[PATCH v5 8/8] drm/xe/client: Print runtime to fdinfo

2024-05-17 Thread Lucas De Marchi
Print the accumulated runtime for the client when printing fdinfo.
Each time a query is done, it first does 2 things:

1) loop through all the exec queues for the current client and
   accumulate the runtime, per engine class. CTX_TIMESTAMP is used for
   that, being read from the context image.

2) Read a "GPU timestamp" that can be used for considering "how much GPU
   time has passed" and that has the same unit/refclock as the one
   recording the runtime. RING_TIMESTAMP is used for that via MMIO.

Since for all current platforms RING_TIMESTAMP follows the same
refclock, just read it once, using any first engine available.

This is exported to userspace as 2 numbers in fdinfo:

drm-cycles-<class>: <cycles>
drm-total-cycles-<class>: <total cycles>

Userspace is expected to collect at least 2 samples, which allows it to
compute the client engine busyness as:

                       RUNTIME1 - RUNTIME0
            busyness = -------------------
                             T1 - T0

Since drm-cycles-<class> always starts at 0, it's also possible to know
if an engine was ever used by a client.

It's expected that userspace will read any 2 samples every few seconds.
Given the update frequency of the counters involved and that
CTX_TIMESTAMP is 32-bits, the counter for each exec_queue can wrap
around (assuming 100% utilization) after ~200s. The wraparound is not
perceived by userspace (it's just accumulated for all the
exec_queues in a 64-bit counter), but the measurement will not be
accurate if the samples are too far apart.

This could be mitigated by adding a workqueue to accumulate the counters
every so often, but it's additional complexity for something that is
done already by userspace every few seconds in tools like gputop (from
igt), htop, nvtop, etc, with none of them really defaulting to 1 sample
per minute or more.

Reviewed-by: Umesh Nerlige Ramappa 
Acked-by: Tvrtko Ursulin 
Signed-off-by: Lucas De Marchi 
---
 Documentation/gpu/drm-usage-stats.rst   |  21 +++-
 Documentation/gpu/xe/index.rst  |   1 +
 Documentation/gpu/xe/xe-drm-usage-stats.rst |  10 ++
 drivers/gpu/drm/xe/xe_drm_client.c  | 121 +++-
 4 files changed, 150 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/gpu/xe/xe-drm-usage-stats.rst

diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
index 6dc299343b48..a80f95ca1b2f 100644
--- a/Documentation/gpu/drm-usage-stats.rst
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -112,6 +112,19 @@ larger value within a reasonable period. Upon observing a value lower than what
 was previously read, userspace is expected to stay with that larger previous
 value until a monotonic update is seen.
 
+- drm-total-cycles-<keystr>: <uint>
+
+Engine identifier string must be the same as the one specified in the
+drm-cycles-<keystr> tag and shall contain the total number of cycles for the
+given engine.
+
+This is a timestamp in GPU unspecified unit that matches the update rate
+of drm-cycles-<keystr>. For drivers that implement this interface, the engine
+utilization can be calculated entirely on the GPU clock domain, without
+considering the CPU sleep time between 2 samples.
+
+A driver may implement either this key or drm-maxfreq-<keystr>, but not both.
+
 - drm-maxfreq-<keystr>: <uint> [Hz|MHz|KHz]
 
 Engine identifier string must be the same as the one specified in the
@@ -121,6 +134,9 @@ percentage utilization of the engine, whereas drm-engine-<keystr> only reflects
 time active without considering what frequency the engine is operating as a
 percentage of its maximum frequency.
 
+A driver may implement either this key or drm-total-cycles-<keystr>, but not
+both.
+
 Memory
 ^^^^^^
 
@@ -168,5 +184,6 @@ be documented above and where possible, aligned with other drivers.
 Driver specific implementations
 -------------------------------
 
-:ref:`i915-usage-stats`
-:ref:`panfrost-usage-stats`
+* :ref:`i915-usage-stats`
+* :ref:`panfrost-usage-stats`
+* :ref:`xe-usage-stats`
diff --git a/Documentation/gpu/xe/index.rst b/Documentation/gpu/xe/index.rst
index c224ecaee81e..3f07aa3b5432 100644
--- a/Documentation/gpu/xe/index.rst
+++ b/Documentation/gpu/xe/index.rst
@@ -23,3 +23,4 @@ DG2, etc is provided to prototype the driver.
xe_firmware
xe_tile
xe_debugging
+   xe-drm-usage-stats.rst
diff --git a/Documentation/gpu/xe/xe-drm-usage-stats.rst b/Documentation/gpu/xe/xe-drm-usage-stats.rst
new file mode 100644
index ..482d503ae68a
--- /dev/null
+++ b/Documentation/gpu/xe/xe-drm-usage-stats.rst
@@ -0,0 +1,10 @@
+.. SPDX-License-Identifier: GPL-2.0+
+
+.. _xe-usage-stats:
+
+========================================
+Xe DRM client usage stats implementation
+========================================
+
+.. kernel-doc:: drivers/gpu/drm/xe/xe_drm_client.c
+   :doc: DRM Client usage stats
diff --git a/drivers/gpu/drm/xe/xe_drm_client.c b/drivers/gpu/drm/xe/xe_drm_client.c
index 08f0b7c95901..af404c9e5cc0 100644
--- a/drivers/gpu/drm/xe/xe_drm_client.c
+++ b/dr

Re: [PATCH v4 6/8] drm/xe: Cache data about user-visible engines

2024-05-16 Thread Lucas De Marchi

On Thu, May 16, 2024 at 11:33:54AM GMT, Umesh Nerlige Ramappa wrote:

On Wed, May 15, 2024 at 02:42:56PM -0700, Lucas De Marchi wrote:

gt->info.engine_mask used to indicate the available engines, but that
is not always true anymore: some engines are reserved for the kernel and
some may be exposed as a single engine (e.g. with ccs_mode).

Runtime changes only happen when no clients exist, so it's safe to cache
the list of engines in the gt and update it when needed. This will help
implement per-client engine utilization so this (mostly constant)
information doesn't need to be re-calculated on every query.

Signed-off-by: Lucas De Marchi 


Just a few questions below, otherwise this looks good as is:

Reviewed-by: Umesh Nerlige Ramappa 


---
drivers/gpu/drm/xe/xe_gt.c  | 23 +++
drivers/gpu/drm/xe/xe_gt.h  | 13 +
drivers/gpu/drm/xe/xe_gt_ccs_mode.c |  1 +
drivers/gpu/drm/xe/xe_gt_types.h| 21 -
4 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index e69a03ddd255..5194a3d38e76 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -560,9 +560,32 @@ int xe_gt_init(struct xe_gt *gt)
if (err)
return err;

+   xe_gt_record_user_engines(gt);
+
	return drmm_add_action_or_reset(&gt_to_xe(gt)->drm, gt_fini, gt);
}

+void xe_gt_record_user_engines(struct xe_gt *gt)
+{
+   struct xe_hw_engine *hwe;
+   enum xe_hw_engine_id id;
+
+   gt->user_engines.mask = 0;
+   memset(gt->user_engines.instances_per_class, 0,
+  sizeof(gt->user_engines.instances_per_class));
+
+   for_each_hw_engine(hwe, gt, id) {
+   if (xe_hw_engine_is_reserved(hwe))
+   continue;
+
+   gt->user_engines.mask |= BIT_ULL(id);
+   gt->user_engines.instances_per_class[hwe->class]++;
+   }
+
+   xe_gt_assert(gt, (gt->user_engines.mask | gt->info.engine_mask)
+== gt->info.engine_mask);


I am not seeing a place where user_engines.mask is not a subset of 
info.engine_mask in the driver, so the above check will always be 
true.


that's why it's an assert. user_engines.mask should always be a
subset of info.engine_mask, otherwise something went terribly wrong.



Did you mean to do an & instead of a | above? That might make sense 
since then you are making sure that the user_engines are a subset of 
engine_mask.


no, what I'm trying to assert is that user_engines.mask never has an
engine that is not present in info.engine_mask. Example:

engine_mask   == 0b01
user_engines.mask == 0b11

That should never happen and it should fail the assert.

I decided to add the assert because I'm not deriving user_engines.mask
directly from info.engine_mask, but indirectly. Early in probe we set up
the mask and create the hw_engine instances, and we calculate
user_engines.mask from there. I just wanted to make sure
we don't screw up something in the middle that causes issues.




+}
+
static int do_gt_reset(struct xe_gt *gt)
{
int err;
diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
index 8474c50b1b30..ad3fd31e0a41 100644
--- a/drivers/gpu/drm/xe/xe_gt.h
+++ b/drivers/gpu/drm/xe/xe_gt.h
@@ -38,6 +38,19 @@ int xe_gt_init_hwconfig(struct xe_gt *gt);
int xe_gt_init_early(struct xe_gt *gt);
int xe_gt_init(struct xe_gt *gt);
int xe_gt_record_default_lrcs(struct xe_gt *gt);
+
+/**
+ * xe_gt_record_user_engines - save data related to engines available to
+ * userspace
+ * @gt: GT structure
+ *
+ * Walk the available HW engines from gt->info.engine_mask and calculate data
+ * related to those engines that may be used by userspace. To be used whenever
+ * available engines change in runtime (e.g. with ccs_mode) or during


After the driver loads, do we expect ccs_mode to change dynamically 
based on some criteria OR is it a one time configuration at driver 
load?


If former, can you provide an example where ccs_mode would change 
dynamically, just curious.


It can be set via sysfs, but changing it is blocked while there are clients.
On platforms with display, it's easier to check by loading the driver with
enable_display=0. Trying that on a DG2:

# modprobe xe enable_display=0
# exec 3<> /dev/dri/card1
# tail -n4 /proc/self/fdinfo/3
drm-cycles-bcs: 0
drm-total-cycles-bcs:   37728138157
drm-cycles-ccs: 0
drm-total-cycles-ccs:   37728138157
#
# exec 3<&-
# echo 2 > 
/sys/devices/pci:00/:00:01.0/:01:00.0/:02:01.0/:03:00.0/tile0/gt0/ccs_mode
# exec 3<> /dev/dri/card1
# tail -n4 /proc/self/fdinfo/3
drm-total-cycles-bcs:   38260910526
drm-cycles-ccs: 0
drm-total-cycles-ccs:   382609

Re: [PATCH 2/2] MAINTAINERS: update Xe driver maintainers

2024-05-16 Thread Lucas De Marchi

On Wed, May 15, 2024 at 07:22:22PM GMT, Oded Gabbay wrote:

Because I left Intel, I'm removing myself from the list
of Xe driver maintainers.

Signed-off-by: Oded Gabbay 


Acked-by: Lucas De Marchi 

thanks
Lucas De Marchi


---
MAINTAINERS | 1 -
1 file changed, 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5bd45a919aff..2469607ff5b7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10863,7 +10863,6 @@ F:  include/uapi/drm/i915_drm.h

INTEL DRM XE DRIVER (Lunar Lake and newer)
M:  Lucas De Marchi 
-M: Oded Gabbay 
M:  Thomas Hellström 
L:  intel...@lists.freedesktop.org
S:  Supported
--
2.34.1



Re: [PATCH v4 8/8] drm/xe/client: Print runtime to fdinfo

2024-05-16 Thread Lucas De Marchi

On Thu, May 16, 2024 at 08:57:48AM GMT, Tvrtko Ursulin wrote:


On 15/05/2024 22:42, Lucas De Marchi wrote:

Print the accumulated runtime for the client when printing fdinfo.
Each time a query is done, it first does 2 things:

1) loop through all the exec queues for the current client and
   accumulate the runtime, per engine class. CTX_TIMESTAMP is used for
   that, being read from the context image.

2) Read a "GPU timestamp" that can be used for considering "how much GPU
   time has passed" and that has the same unit/refclock as the one
   recording the runtime. RING_TIMESTAMP is used for that via MMIO.

Since for all current platforms RING_TIMESTAMP follows the same
refclock, just read it once, using any first engine available.

This is exported to userspace as 2 numbers in fdinfo:

drm-cycles-<class>: <cycles>
drm-total-cycles-<class>: <total cycles>

Userspace is expected to collect at least 2 samples, which allows it to
compute the client engine busyness as:

                       RUNTIME1 - RUNTIME0
            busyness = -------------------
                             T1 - T0

Since drm-cycles-<class> always starts at 0, it's also possible to know
if an engine was ever used by a client.

It's expected that userspace will read any 2 samples every few seconds.
Given the update frequency of the counters involved and that
CTX_TIMESTAMP is 32-bits, the counter for each exec_queue can wrap
around (assuming 100% utilization) after ~200s. The wraparound is not
perceived by userspace (it's just accumulated for all the
exec_queues in a 64-bit counter), but the measurement will not be
accurate if the samples are too far apart.

This could be mitigated by adding a workqueue to accumulate the counters
every so often, but it's additional complexity for something that is
done already by userspace every few seconds in tools like gputop (from
igt), htop, nvtop, etc, with none of them really defaulting to 1 sample
per minute or more.

Signed-off-by: Lucas De Marchi 
---
 Documentation/gpu/drm-usage-stats.rst   |  21 +++-
 Documentation/gpu/xe/index.rst  |   1 +
 Documentation/gpu/xe/xe-drm-usage-stats.rst |  10 ++
 drivers/gpu/drm/xe/xe_drm_client.c  | 121 +++-
 4 files changed, 150 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/gpu/xe/xe-drm-usage-stats.rst

diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
index 6dc299343b48..a80f95ca1b2f 100644
--- a/Documentation/gpu/drm-usage-stats.rst
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -112,6 +112,19 @@ larger value within a reasonable period. Upon observing a value lower than what
 was previously read, userspace is expected to stay with that larger previous
 value until a monotonic update is seen.
+- drm-total-cycles-<keystr>: <uint>
+
+Engine identifier string must be the same as the one specified in the
+drm-cycles-<keystr> tag and shall contain the total number of cycles for the
+given engine.
+
+This is a timestamp in GPU unspecified unit that matches the update rate
+of drm-cycles-<keystr>. For drivers that implement this interface, the engine
+utilization can be calculated entirely on the GPU clock domain, without
+considering the CPU sleep time between 2 samples.
+
+A driver may implement either this key or drm-maxfreq-<keystr>, but not both.
+
 - drm-maxfreq-<keystr>: <uint> [Hz|MHz|KHz]
 Engine identifier string must be the same as the one specified in the
@@ -121,6 +134,9 @@ percentage utilization of the engine, whereas drm-engine-<keystr> only reflects
 time active without considering what frequency the engine is operating as a
 percentage of its maximum frequency.
+A driver may implement either this key or drm-total-cycles-<keystr>, but not
+both.
+


For the spec part:

Acked-by: Tvrtko Ursulin 


thanks



Some minor comments and questions below.


 Memory
 ^^^^^^
@@ -168,5 +184,6 @@ be documented above and where possible, aligned with other drivers.
 Driver specific implementations
 -------------------------------
-:ref:`i915-usage-stats`
-:ref:`panfrost-usage-stats`
+* :ref:`i915-usage-stats`
+* :ref:`panfrost-usage-stats`
+* :ref:`xe-usage-stats`
diff --git a/Documentation/gpu/xe/index.rst b/Documentation/gpu/xe/index.rst
index c224ecaee81e..3f07aa3b5432 100644
--- a/Documentation/gpu/xe/index.rst
+++ b/Documentation/gpu/xe/index.rst
@@ -23,3 +23,4 @@ DG2, etc is provided to prototype the driver.
xe_firmware
xe_tile
xe_debugging
+   xe-drm-usage-stats.rst
diff --git a/Documentation/gpu/xe/xe-drm-usage-stats.rst b/Documentation/gpu/xe/xe-drm-usage-stats.rst
new file mode 100644
index ..482d503ae68a
--- /dev/null
+++ b/Documentation/gpu/xe/xe-drm-usage-stats.rst
@@ -0,0 +1,10 @@
+.. SPDX-License-Identifier: GPL-2.0+
+
+.. _xe-usage-stats:
+
+========================================
+Xe DRM client usage stats implementation
+========================================
+
+.. kernel-doc:: drivers/gpu/drm/xe/xe_drm_client.c
+   :doc: DRM Client usage stats
diff --git a/drivers/gpu/drm/

[PATCH v4 7/8] drm/xe: Add helper to return any available hw engine

2024-05-15 Thread Lucas De Marchi
Get the first available engine from a gt, which helps in cases where any
engine can serve as the context, like when reading RING_TIMESTAMP.

Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_gt.c | 11 +++
 drivers/gpu/drm/xe/xe_gt.h |  7 +++
 2 files changed, 18 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index 5194a3d38e76..3432fef56486 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -833,3 +833,14 @@ struct xe_hw_engine *xe_gt_any_hw_engine_by_reset_domain(struct xe_gt *gt,
 
return NULL;
 }
+
+struct xe_hw_engine *xe_gt_any_hw_engine(struct xe_gt *gt)
+{
+   struct xe_hw_engine *hwe;
+   enum xe_hw_engine_id id;
+
+   for_each_hw_engine(hwe, gt, id)
+   return hwe;
+
+   return NULL;
+}
diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
index ad3fd31e0a41..a53f01362d94 100644
--- a/drivers/gpu/drm/xe/xe_gt.h
+++ b/drivers/gpu/drm/xe/xe_gt.h
@@ -67,6 +67,13 @@ void xe_gt_remove(struct xe_gt *gt);
 struct xe_hw_engine *
 xe_gt_any_hw_engine_by_reset_domain(struct xe_gt *gt, enum xe_engine_class class);
 
+/**
+ * xe_gt_any_hw_engine - scan the list of engines and return the
+ * first available
+ * @gt: GT structure
+ */
+struct xe_hw_engine *xe_gt_any_hw_engine(struct xe_gt *gt);
+
 struct xe_hw_engine *xe_gt_hw_engine(struct xe_gt *gt,
 enum xe_engine_class class,
 u16 instance,
-- 
2.43.0



[PATCH v4 4/8] drm/xe: Add helper to capture engine timestamp

2024-05-15 Thread Lucas De Marchi
Just like CTX_TIMESTAMP is used to calculate runtime, add a helper to
get the timestamp for the engine so it can be used to calculate the
"engine time" with the same unit as the runtime is recorded.

Reviewed-by: Umesh Nerlige Ramappa 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_hw_engine.c | 5 +
 drivers/gpu/drm/xe/xe_hw_engine.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c
index f76bb507e1a5..51536d05ad86 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.c
+++ b/drivers/gpu/drm/xe/xe_hw_engine.c
@@ -1121,3 +1121,8 @@ const char *xe_hw_engine_class_to_str(enum xe_engine_class class)
 
return NULL;
 }
+
+u64 xe_hw_engine_read_timestamp(struct xe_hw_engine *hwe)
+{
+   return xe_mmio_read64_2x32(hwe->gt, RING_TIMESTAMP(hwe->mmio_base));
+}
diff --git a/drivers/gpu/drm/xe/xe_hw_engine.h b/drivers/gpu/drm/xe/xe_hw_engine.h
index 843de159e47c..7f2d27c0ba1a 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.h
+++ b/drivers/gpu/drm/xe/xe_hw_engine.h
@@ -68,5 +68,6 @@ static inline bool xe_hw_engine_is_valid(struct xe_hw_engine *hwe)
 }
 
 const char *xe_hw_engine_class_to_str(enum xe_engine_class class);
+u64 xe_hw_engine_read_timestamp(struct xe_hw_engine *hwe);
 
 #endif
-- 
2.43.0



[PATCH v4 8/8] drm/xe/client: Print runtime to fdinfo

2024-05-15 Thread Lucas De Marchi
Print the accumulated runtime for the client when printing fdinfo.
Each time a query is done, it first does 2 things:

1) loop through all the exec queues for the current client and
   accumulate the runtime, per engine class. CTX_TIMESTAMP is used for
   that, being read from the context image.

2) Read a "GPU timestamp" that can be used for considering "how much GPU
   time has passed" and that has the same unit/refclock as the one
   recording the runtime. RING_TIMESTAMP is used for that via MMIO.

Since for all current platforms RING_TIMESTAMP follows the same
refclock, just read it once, using any first engine available.

This is exported to userspace as 2 numbers in fdinfo:

drm-cycles-<class>: <cycles>
drm-total-cycles-<class>: <total cycles>

Userspace is expected to collect at least 2 samples, which allows it to
compute the client engine busyness as:

                       RUNTIME1 - RUNTIME0
            busyness = -------------------
                             T1 - T0

Since drm-cycles-<class> always starts at 0, it's also possible to know
if an engine was ever used by a client.

It's expected that userspace will read any 2 samples every few seconds.
Given the update frequency of the counters involved and that
CTX_TIMESTAMP is 32-bits, the counter for each exec_queue can wrap
around (assuming 100% utilization) after ~200s. The wraparound is not
perceived by userspace (it's just accumulated for all the
exec_queues in a 64-bit counter), but the measurement will not be
accurate if the samples are too far apart.

This could be mitigated by adding a workqueue to accumulate the counters
every so often, but it's additional complexity for something that is
done already by userspace every few seconds in tools like gputop (from
igt), htop, nvtop, etc, with none of them really defaulting to 1 sample
per minute or more.

Signed-off-by: Lucas De Marchi 
---
 Documentation/gpu/drm-usage-stats.rst   |  21 +++-
 Documentation/gpu/xe/index.rst  |   1 +
 Documentation/gpu/xe/xe-drm-usage-stats.rst |  10 ++
 drivers/gpu/drm/xe/xe_drm_client.c  | 121 +++-
 4 files changed, 150 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/gpu/xe/xe-drm-usage-stats.rst

diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
index 6dc299343b48..a80f95ca1b2f 100644
--- a/Documentation/gpu/drm-usage-stats.rst
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -112,6 +112,19 @@ larger value within a reasonable period. Upon observing a value lower than what
 was previously read, userspace is expected to stay with that larger previous
 value until a monotonic update is seen.
 
+- drm-total-cycles-<keystr>: <uint>
+
+Engine identifier string must be the same as the one specified in the
+drm-cycles-<keystr> tag and shall contain the total number of cycles for the
+given engine.
+
+This is a timestamp in GPU unspecified unit that matches the update rate
+of drm-cycles-<keystr>. For drivers that implement this interface, the engine
+utilization can be calculated entirely on the GPU clock domain, without
+considering the CPU sleep time between 2 samples.
+
+A driver may implement either this key or drm-maxfreq-<keystr>, but not both.
+
 - drm-maxfreq-<keystr>: <uint> [Hz|MHz|KHz]
 
 Engine identifier string must be the same as the one specified in the
@@ -121,6 +134,9 @@ percentage utilization of the engine, whereas drm-engine-<keystr> only reflects
 time active without considering what frequency the engine is operating as a
 percentage of its maximum frequency.
 
+A driver may implement either this key or drm-total-cycles-<keystr>, but not
+both.
+
 Memory
 ^^^^^^
 
@@ -168,5 +184,6 @@ be documented above and where possible, aligned with other drivers.
 Driver specific implementations
 -------------------------------
 
-:ref:`i915-usage-stats`
-:ref:`panfrost-usage-stats`
+* :ref:`i915-usage-stats`
+* :ref:`panfrost-usage-stats`
+* :ref:`xe-usage-stats`
diff --git a/Documentation/gpu/xe/index.rst b/Documentation/gpu/xe/index.rst
index c224ecaee81e..3f07aa3b5432 100644
--- a/Documentation/gpu/xe/index.rst
+++ b/Documentation/gpu/xe/index.rst
@@ -23,3 +23,4 @@ DG2, etc is provided to prototype the driver.
xe_firmware
xe_tile
xe_debugging
+   xe-drm-usage-stats.rst
diff --git a/Documentation/gpu/xe/xe-drm-usage-stats.rst b/Documentation/gpu/xe/xe-drm-usage-stats.rst
new file mode 100644
index ..482d503ae68a
--- /dev/null
+++ b/Documentation/gpu/xe/xe-drm-usage-stats.rst
@@ -0,0 +1,10 @@
+.. SPDX-License-Identifier: GPL-2.0+
+
+.. _xe-usage-stats:
+
+========================================
+Xe DRM client usage stats implementation
+========================================
+
+.. kernel-doc:: drivers/gpu/drm/xe/xe_drm_client.c
+   :doc: DRM Client usage stats
diff --git a/drivers/gpu/drm/xe/xe_drm_client.c b/drivers/gpu/drm/xe/xe_drm_client.c
index 08f0b7c95901..952b0cc87708 100644
--- a/drivers/gpu/drm/xe/xe_drm_client.c
+++ b/drivers/gpu/drm/xe/xe_drm_client.c
@@ -2,6 +2,7 @@
 /

[PATCH v4 6/8] drm/xe: Cache data about user-visible engines

2024-05-15 Thread Lucas De Marchi
gt->info.engine_mask used to indicate the available engines, but that
is not always true anymore: some engines are reserved for the kernel and
some may be exposed as a single engine (e.g. with ccs_mode).

Runtime changes only happen when no clients exist, so it's safe to cache
the list of engines in the gt and update it when needed. This will help
implement per-client engine utilization so this (mostly constant)
information doesn't need to be re-calculated on every query.

Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_gt.c  | 23 +++
 drivers/gpu/drm/xe/xe_gt.h  | 13 +
 drivers/gpu/drm/xe/xe_gt_ccs_mode.c |  1 +
 drivers/gpu/drm/xe/xe_gt_types.h| 21 -
 4 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index e69a03ddd255..5194a3d38e76 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -560,9 +560,32 @@ int xe_gt_init(struct xe_gt *gt)
if (err)
return err;
 
+   xe_gt_record_user_engines(gt);
+
	return drmm_add_action_or_reset(&gt_to_xe(gt)->drm, gt_fini, gt);
 }
 
+void xe_gt_record_user_engines(struct xe_gt *gt)
+{
+   struct xe_hw_engine *hwe;
+   enum xe_hw_engine_id id;
+
+   gt->user_engines.mask = 0;
+   memset(gt->user_engines.instances_per_class, 0,
+  sizeof(gt->user_engines.instances_per_class));
+
+   for_each_hw_engine(hwe, gt, id) {
+   if (xe_hw_engine_is_reserved(hwe))
+   continue;
+
+   gt->user_engines.mask |= BIT_ULL(id);
+   gt->user_engines.instances_per_class[hwe->class]++;
+   }
+
+   xe_gt_assert(gt, (gt->user_engines.mask | gt->info.engine_mask)
+== gt->info.engine_mask);
+}
+
 static int do_gt_reset(struct xe_gt *gt)
 {
int err;
diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
index 8474c50b1b30..ad3fd31e0a41 100644
--- a/drivers/gpu/drm/xe/xe_gt.h
+++ b/drivers/gpu/drm/xe/xe_gt.h
@@ -38,6 +38,19 @@ int xe_gt_init_hwconfig(struct xe_gt *gt);
 int xe_gt_init_early(struct xe_gt *gt);
 int xe_gt_init(struct xe_gt *gt);
 int xe_gt_record_default_lrcs(struct xe_gt *gt);
+
+/**
+ * xe_gt_record_user_engines - save data related to engines available to
+ * userspace
+ * @gt: GT structure
+ *
+ * Walk the available HW engines from gt->info.engine_mask and calculate data
+ * related to those engines that may be used by userspace. To be used whenever
+ * available engines change in runtime (e.g. with ccs_mode) or during
+ * initialization
+ */
+void xe_gt_record_user_engines(struct xe_gt *gt);
+
 void xe_gt_suspend_prepare(struct xe_gt *gt);
 int xe_gt_suspend(struct xe_gt *gt);
 int xe_gt_resume(struct xe_gt *gt);
diff --git a/drivers/gpu/drm/xe/xe_gt_ccs_mode.c b/drivers/gpu/drm/xe/xe_gt_ccs_mode.c
index a34c9a24dafc..c36218f4f6c8 100644
--- a/drivers/gpu/drm/xe/xe_gt_ccs_mode.c
+++ b/drivers/gpu/drm/xe/xe_gt_ccs_mode.c
@@ -134,6 +134,7 @@ ccs_mode_store(struct device *kdev, struct device_attribute *attr,
if (gt->ccs_mode != num_engines) {
xe_gt_info(gt, "Setting compute mode to %d\n", num_engines);
gt->ccs_mode = num_engines;
+   xe_gt_record_user_engines(gt);
xe_gt_reset_async(gt);
}
 
diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
index 5a114fc9dde7..aaf2951749a6 100644
--- a/drivers/gpu/drm/xe/xe_gt_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_types.h
@@ -112,7 +112,11 @@ struct xe_gt {
enum xe_gt_type type;
/** @info.reference_clock: clock frequency */
u32 reference_clock;
-   /** @info.engine_mask: mask of engines present on GT */
+   /**
+* @info.engine_mask: mask of engines present on GT. Some of
+* them may be reserved in runtime and not available for user.
+* See @user_engines.mask
+*/
u64 engine_mask;
/** @info.gmdid: raw GMD_ID value from hardware */
u32 gmdid;
@@ -365,6 +369,21 @@ struct xe_gt {
/** @wa_active.oob: bitmap with active OOB workaroudns */
unsigned long *oob;
} wa_active;
+
+   /** @user_engines: engines present in GT and available to userspace */
+   struct {
+   /**
+* @mask: like @info->engine_mask, but take in consideration
+* only engines available to userspace
+*/
+   u64 mask;
+
+   /**
+* @instances_per_class: aggregate per class the number of
+* engines available to userspace
+*/
+   u8 instances_per_class[XE_ENGINE_CLASS_MAX];
+   } user_engines;
 };
 
 #endif
-- 
2.43.0



[PATCH v4 3/8] drm/xe/lrc: Add helper to capture context timestamp

2024-05-15 Thread Lucas De Marchi
From: Umesh Nerlige Ramappa 

Add a helper to capture CTX_TIMESTAMP from the context image so it can
be used to calculate the runtime.

v2: Add kernel-doc to clarify expectation from caller

Signed-off-by: Umesh Nerlige Ramappa 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/regs/xe_lrc_layout.h |  1 +
 drivers/gpu/drm/xe/xe_lrc.c | 12 
 drivers/gpu/drm/xe/xe_lrc.h | 14 ++
 drivers/gpu/drm/xe/xe_lrc_types.h   |  3 +++
 4 files changed, 30 insertions(+)

diff --git a/drivers/gpu/drm/xe/regs/xe_lrc_layout.h b/drivers/gpu/drm/xe/regs/xe_lrc_layout.h
index e6ca8bbda8f4..045dfd09db99 100644
--- a/drivers/gpu/drm/xe/regs/xe_lrc_layout.h
+++ b/drivers/gpu/drm/xe/regs/xe_lrc_layout.h
@@ -11,6 +11,7 @@
 #define CTX_RING_TAIL  (0x06 + 1)
 #define CTX_RING_START (0x08 + 1)
 #define CTX_RING_CTL   (0x0a + 1)
+#define CTX_TIMESTAMP  (0x22 + 1)
 #define CTX_INDIRECT_RING_STATE(0x26 + 1)
 #define CTX_PDP0_UDW   (0x30 + 1)
 #define CTX_PDP0_LDW   (0x32 + 1)
diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
index 9b0a4078add3..f679cb9aaea7 100644
--- a/drivers/gpu/drm/xe/xe_lrc.c
+++ b/drivers/gpu/drm/xe/xe_lrc.c
@@ -844,6 +844,7 @@ int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
lrc->tile = gt_to_tile(hwe->gt);
lrc->ring.size = ring_size;
lrc->ring.tail = 0;
+   lrc->ctx_timestamp = 0;
 
xe_hw_fence_ctx_init(&lrc->fence_ctx, hwe->gt,
 hwe->fence_irq, hwe->name);
@@ -898,6 +899,8 @@ int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe,
 RING_CTL_SIZE(lrc->ring.size) | RING_VALID);
}
 
+   xe_lrc_write_ctx_reg(lrc, CTX_TIMESTAMP, 0);
+
if (xe->info.has_asid && vm)
xe_lrc_write_ctx_reg(lrc, PVC_CTX_ASID, vm->usm.asid);
 
@@ -1576,3 +1579,12 @@ void xe_lrc_snapshot_free(struct xe_lrc_snapshot *snapshot)
xe_bo_put(snapshot->lrc_bo);
kfree(snapshot);
 }
+
+u32 xe_lrc_update_timestamp(struct xe_lrc *lrc, u32 *old_ts)
+{
+   *old_ts = lrc->ctx_timestamp;
+
+   lrc->ctx_timestamp = xe_lrc_read_ctx_reg(lrc, CTX_TIMESTAMP);
+
+   return lrc->ctx_timestamp;
+}
diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
index e0e841963c23..b9da1031083b 100644
--- a/drivers/gpu/drm/xe/xe_lrc.h
+++ b/drivers/gpu/drm/xe/xe_lrc.h
@@ -66,4 +66,18 @@ void xe_lrc_snapshot_capture_delayed(struct xe_lrc_snapshot *snapshot);
 void xe_lrc_snapshot_print(struct xe_lrc_snapshot *snapshot, struct drm_printer *p);
 void xe_lrc_snapshot_free(struct xe_lrc_snapshot *snapshot);
 
+/**
+ * xe_lrc_update_timestamp - readout LRC timestamp and update cached value
+ * @lrc: logical ring context for this exec queue
+ * @old_ts: pointer where to save the previous timestamp
+ *
+ * Read the current timestamp for this LRC and update the cached value. The
+ * previous cached value is also returned in @old_ts so the caller can
+ * calculate the delta between 2 updates. Note that this is not intended to be
+ * called from arbitrary places, but only by the paths updating the drm client
+ * utilization.
+ *
+ * Returns the current LRC timestamp
+ */
+u32 xe_lrc_update_timestamp(struct xe_lrc *lrc, u32 *old_ts);
+
 #endif
diff --git a/drivers/gpu/drm/xe/xe_lrc_types.h 
b/drivers/gpu/drm/xe/xe_lrc_types.h
index cdbf03faef15..0fa055da6b27 100644
--- a/drivers/gpu/drm/xe/xe_lrc_types.h
+++ b/drivers/gpu/drm/xe/xe_lrc_types.h
@@ -45,6 +45,9 @@ struct xe_lrc {
 
/** @fence_ctx: context for hw fence */
struct xe_hw_fence_ctx fence_ctx;
+
+   /** @ctx_timestamp: readout value of CTX_TIMESTAMP on last update */
+   u32 ctx_timestamp;
 };
 
 struct xe_lrc_snapshot;
-- 
2.43.0



[PATCH v4 1/8] drm/xe: Promote xe_hw_engine_class_to_str()

2024-05-15 Thread Lucas De Marchi
Move it out of the sysfs compilation unit so it can be re-used in other
places.

Reviewed-by: Nirmoy Das 
Reviewed-by: Oak Zeng 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_hw_engine.c | 18 ++
 drivers/gpu/drm/xe/xe_hw_engine.h |  2 ++
 drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c | 18 --
 3 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c 
b/drivers/gpu/drm/xe/xe_hw_engine.c
index 45f582a7caaa..520db5d518c3 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.c
+++ b/drivers/gpu/drm/xe/xe_hw_engine.c
@@ -1099,3 +1099,21 @@ bool xe_hw_engine_is_reserved(struct xe_hw_engine *hwe)
return xe->info.has_usm && hwe->class == XE_ENGINE_CLASS_COPY &&
hwe->instance == gt->usm.reserved_bcs_instance;
 }
+
+const char *xe_hw_engine_class_to_str(enum xe_engine_class class)
+{
+   switch (class) {
+   case XE_ENGINE_CLASS_RENDER:
+   return "rcs";
+   case XE_ENGINE_CLASS_VIDEO_DECODE:
+   return "vcs";
+   case XE_ENGINE_CLASS_VIDEO_ENHANCE:
+   return "vecs";
+   case XE_ENGINE_CLASS_COPY:
+   return "bcs";
+   case XE_ENGINE_CLASS_COMPUTE:
+   return "ccs";
+   default:
+   return NULL;
+   }
+}
diff --git a/drivers/gpu/drm/xe/xe_hw_engine.h 
b/drivers/gpu/drm/xe/xe_hw_engine.h
index 71968ee2f600..843de159e47c 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.h
+++ b/drivers/gpu/drm/xe/xe_hw_engine.h
@@ -67,4 +67,6 @@ static inline bool xe_hw_engine_is_valid(struct xe_hw_engine 
*hwe)
return hwe->name;
 }
 
+const char *xe_hw_engine_class_to_str(enum xe_engine_class class);
+
 #endif
diff --git a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c 
b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c
index 844ec68cbbb8..efce6c7dd2a2 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c
+++ b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c
@@ -618,24 +618,6 @@ static void hw_engine_class_sysfs_fini(struct drm_device 
*drm, void *arg)
kobject_put(kobj);
 }
 
-static const char *xe_hw_engine_class_to_str(enum xe_engine_class class)
-{
-   switch (class) {
-   case XE_ENGINE_CLASS_RENDER:
-   return "rcs";
-   case XE_ENGINE_CLASS_VIDEO_DECODE:
-   return "vcs";
-   case XE_ENGINE_CLASS_VIDEO_ENHANCE:
-   return "vecs";
-   case XE_ENGINE_CLASS_COPY:
-   return "bcs";
-   case XE_ENGINE_CLASS_COMPUTE:
-   return "ccs";
-   default:
-   return NULL;
-   }
-}
-
 /**
  * xe_hw_engine_class_sysfs_init - Init HW engine classes on GT.
  * @gt: Xe GT.
-- 
2.43.0



[PATCH v4 5/8] drm/xe: Add helper to accumulate exec queue runtime

2024-05-15 Thread Lucas De Marchi
From: Umesh Nerlige Ramappa 

Add a helper to accumulate per-client runtime of all its
exec queues. This is called every time a sched job is finished.

v2:
  - Use guc_exec_queue_free_job() and execlist_job_free() to accumulate
runtime when job is finished since xe_sched_job_completed() is not a
notification that job finished.
  - Stop trying to update runtime from xe_exec_queue_fini() - that is
redundant and may happen after xef is closed, leading to a
use-after-free
  - Do not special case the first timestamp read: the default LRC sets
CTX_TIMESTAMP to zero, so even the first sample should be a valid
one.
  - Handle the parallel submission case by multiplying the runtime by
width.
v3: Update comments

Signed-off-by: Umesh Nerlige Ramappa 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_device_types.h |  3 +++
 drivers/gpu/drm/xe/xe_exec_queue.c   | 37 
 drivers/gpu/drm/xe/xe_exec_queue.h   |  1 +
 drivers/gpu/drm/xe/xe_execlist.c |  1 +
 drivers/gpu/drm/xe/xe_guc_submit.c   |  2 ++
 5 files changed, 44 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_device_types.h 
b/drivers/gpu/drm/xe/xe_device_types.h
index 5c5e36de452a..bc97990fd032 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -555,6 +555,9 @@ struct xe_file {
struct mutex lock;
} exec_queue;
 
+   /** @runtime: hw engine class runtime in ticks for this drm client */
+   u64 runtime[XE_ENGINE_CLASS_MAX];
+
/** @client: drm client */
struct xe_drm_client *client;
 };
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c 
b/drivers/gpu/drm/xe/xe_exec_queue.c
index 395de93579fa..fa6dc996eca8 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -769,6 +769,43 @@ bool xe_exec_queue_is_idle(struct xe_exec_queue *q)
q->lrc[0].fence_ctx.next_seqno - 1;
 }
 
+/**
+ * xe_exec_queue_update_runtime() - Update runtime for this exec queue from hw
+ * @q: The exec queue
+ *
+ * Update the timestamp saved by HW for this exec queue and save runtime
+ * calculated by using the delta from the last update. In the multi-lrc case,
+ * only the first one is considered.
+ */
+void xe_exec_queue_update_runtime(struct xe_exec_queue *q)
+{
+   struct xe_file *xef;
+   struct xe_lrc *lrc;
+   u32 old_ts, new_ts;
+
+   /*
+* Jobs that are run during driver load may use an exec_queue, but are
+* not associated with a user xe file, so avoid accumulating busyness
+* for kernel specific work.
+*/
+   if (!q->vm || !q->vm->xef)
+   return;
+
+   xef = q->vm->xef;
+
+   /*
+* Only sample the first LRC. For parallel submission, all of them are
+* scheduled together and we compensate that below by multiplying by
+* width - this may introduce errors if that premise is not true and
+* they don't exit 100% aligned. On the other hand, looping through
+* the LRCs and reading them at different times could also introduce
+* errors.
+*/
+   lrc = &q->lrc[0];
+   new_ts = xe_lrc_update_timestamp(lrc, &old_ts);
+   xef->runtime[q->class] += (new_ts - old_ts) * q->width;
+}
+
 void xe_exec_queue_kill(struct xe_exec_queue *q)
 {
struct xe_exec_queue *eq = q, *next;
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.h 
b/drivers/gpu/drm/xe/xe_exec_queue.h
index 48f6da53a292..e0f07d28ee1a 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.h
+++ b/drivers/gpu/drm/xe/xe_exec_queue.h
@@ -75,5 +75,6 @@ struct dma_fence *xe_exec_queue_last_fence_get(struct 
xe_exec_queue *e,
   struct xe_vm *vm);
 void xe_exec_queue_last_fence_set(struct xe_exec_queue *e, struct xe_vm *vm,
  struct dma_fence *fence);
+void xe_exec_queue_update_runtime(struct xe_exec_queue *q);
 
 #endif
diff --git a/drivers/gpu/drm/xe/xe_execlist.c b/drivers/gpu/drm/xe/xe_execlist.c
index e9dee1e14fef..bd7f27efe0e0 100644
--- a/drivers/gpu/drm/xe/xe_execlist.c
+++ b/drivers/gpu/drm/xe/xe_execlist.c
@@ -306,6 +306,7 @@ static void execlist_job_free(struct drm_sched_job *drm_job)
 {
struct xe_sched_job *job = to_xe_sched_job(drm_job);
 
+   xe_exec_queue_update_runtime(job->q);
xe_sched_job_put(job);
 }
 
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c 
b/drivers/gpu/drm/xe/xe_guc_submit.c
index 4efb88e3e056..ad2b8067d071 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -749,6 +749,8 @@ static void guc_exec_queue_free_job(struct drm_sched_job 
*drm_job)
 {
struct xe_sched_job *job = to_xe_sched_job(drm_job);
 
+   xe_exec_queue_update_runtime(job->q);
+
trace_xe_sched_job_free(job);
xe_sched_job_put(job);
 }
-- 
2.43.0



[PATCH v4 2/8] drm/xe: Add XE_ENGINE_CLASS_OTHER to str conversion

2024-05-15 Thread Lucas De Marchi
XE_ENGINE_CLASS_OTHER was missing from the str conversion. Add it and
remove the default handling so it's protected by -Wswitch.
Currently the only user is xe_hw_engine_class_sysfs_init(), which
already skips XE_ENGINE_CLASS_OTHER, so there's no change in behavior.

Reviewed-by: Nirmoy Das 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_hw_engine.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c 
b/drivers/gpu/drm/xe/xe_hw_engine.c
index 520db5d518c3..f76bb507e1a5 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.c
+++ b/drivers/gpu/drm/xe/xe_hw_engine.c
@@ -,9 +,13 @@ const char *xe_hw_engine_class_to_str(enum 
xe_engine_class class)
return "vecs";
case XE_ENGINE_CLASS_COPY:
return "bcs";
+   case XE_ENGINE_CLASS_OTHER:
+   return "other";
case XE_ENGINE_CLASS_COMPUTE:
return "ccs";
-   default:
-   return NULL;
+   case XE_ENGINE_CLASS_MAX:
+   break;
}
+
+   return NULL;
 }
-- 
2.43.0



[PATCH v4 0/8] drm/xe: Per client usage

2024-05-15 Thread Lucas De Marchi
v4 of 
https://lore.kernel.org/all/20240507224510.442971-1-lucas.demar...@intel.com

Add per-client usage statistics to xe. This ports xe to use the common
method in drm to export the usage to userspace per client (where 1
client == 1 drm fd open).

However, instead of using the current format measured in nsec, this
creates a new one. The intention is to avoid mixing the GPU clock domain
with the CPU clock, which also allows covering a few more use cases
without extra complications.

This version was tested on an ADL-P, and gputop with i915 was also
checked to make sure it did not regress. The last patch also contains the
documentation for the new key and sample output, as requested in v1.

The pre-existing drm-cycles-<class> key is used as-is, which allows
gputop to work with xe.

This last patch still has some open discussion from v2, so we may need
to hold it a little more.

v2:
  - Create a new drm-total-cycles instead of re-using drm-engine with a
different unit
  - Add documentation for the new interface and clarify usage of
xe_lrc_update_timestamp()

v3:
  - Fix bugs in "drm/xe: Add helper to accumulate exec queue runtime" -
see commit message
  - Reorder commits so the ones that are useful in other patch series
come first

v4:
  - Fix some comments and documentation
  - Add 2 patches so we cache on the gt the mask of engines visible to
    userspace and the per-class capacity. Previously we were doing this
    during the query, but besides not being very efficient, as pointed
    out by Tvrtko, we were also not correctly handling reserved engines
    and engines "hidden" by e.g. ccs_mode.  So move that part to be
    executed on init and when changing the available engines.
  - Simplify the fdinfo output loop since now we have the information
cached on gt. This also moves the read of the gpu timestamp out
of the loop as suggested by Tvrtko and using the helpers implemented
in the new patches.

Lucas De Marchi (6):
  drm/xe: Promote xe_hw_engine_class_to_str()
  drm/xe: Add XE_ENGINE_CLASS_OTHER to str conversion
  drm/xe: Add helper to capture engine timestamp
  drm/xe: Cache data about user-visible engines
  drm/xe: Add helper to return any available hw engine
  drm/xe/client: Print runtime to fdinfo

Umesh Nerlige Ramappa (2):
  drm/xe/lrc: Add helper to capture context timestamp
  drm/xe: Add helper to accumulate exec queue runtime

 Documentation/gpu/drm-usage-stats.rst |  21 ++-
 Documentation/gpu/xe/index.rst|   1 +
 Documentation/gpu/xe/xe-drm-usage-stats.rst   |  10 ++
 drivers/gpu/drm/xe/regs/xe_lrc_layout.h   |   1 +
 drivers/gpu/drm/xe/xe_device_types.h  |   3 +
 drivers/gpu/drm/xe/xe_drm_client.c| 121 +-
 drivers/gpu/drm/xe/xe_exec_queue.c|  37 ++
 drivers/gpu/drm/xe/xe_exec_queue.h|   1 +
 drivers/gpu/drm/xe/xe_execlist.c  |   1 +
 drivers/gpu/drm/xe/xe_gt.c|  34 +
 drivers/gpu/drm/xe/xe_gt.h|  20 +++
 drivers/gpu/drm/xe/xe_gt_ccs_mode.c   |   1 +
 drivers/gpu/drm/xe/xe_gt_types.h  |  21 ++-
 drivers/gpu/drm/xe/xe_guc_submit.c|   2 +
 drivers/gpu/drm/xe/xe_hw_engine.c |  27 
 drivers/gpu/drm/xe/xe_hw_engine.h |   3 +
 drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c |  18 ---
 drivers/gpu/drm/xe/xe_lrc.c   |  12 ++
 drivers/gpu/drm/xe/xe_lrc.h   |  14 ++
 drivers/gpu/drm/xe/xe_lrc_types.h |   3 +
 20 files changed, 329 insertions(+), 22 deletions(-)
 create mode 100644 Documentation/gpu/xe/xe-drm-usage-stats.rst

-- 
2.43.0



[PULL] drm-xe-fixes

2024-05-09 Thread Lucas De Marchi

Hi Dave and Sima,

Please pull the drm-xe-fixes for this week targeting v6.9.

thanks
Lucas De Marchi

drm-xe-fixes-2024-05-09:
- Fix use zero-length element array
- Move more from system wq to ordered private wq
- Do not ignore return for drmm_mutex_init
The following changes since commit dd5a440a31fae6e459c0d627162825505361:

  Linux 6.9-rc7 (2024-05-05 14:06:01 -0700)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-fixes-2024-05-09

for you to fetch changes up to c002bfe644a29ba600c571f2abba13a155a12dcd:

  drm/xe: Use ordered WQ for G2H handler (2024-05-09 09:41:27 -0500)


- Fix use zero-length element array
- Move more from system wq to ordered private wq
- Do not ignore return for drmm_mutex_init


Daniele Ceraolo Spurio (1):
  drm/xe/guc: Check error code when initializing the CT mutex

Lucas De Marchi (1):
  drm/xe/ads: Use flexible-array

Matthew Brost (1):
  drm/xe: Use ordered WQ for G2H handler

 drivers/gpu/drm/xe/xe_guc_ads.c  |  2 +-
 drivers/gpu/drm/xe/xe_guc_ct.c   | 10 +-
 drivers/gpu/drm/xe/xe_guc_ct.h   |  2 +-
 drivers/gpu/drm/xe/xe_guc_ct_types.h |  2 ++
 4 files changed, 13 insertions(+), 3 deletions(-)


Re: [PATCH] drm/xe: Fix UBSAN shift-out-of-bounds failure

2024-05-08 Thread Lucas De Marchi

On Tue, May 07, 2024 at 01:04:11PM GMT, Shuicheng Lin wrote:

Here is the failure stack:
[   12.988209] [ cut here ]
[   12.988216] UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13
[   12.988232] shift exponent 64 is too large for 64-bit type 'long unsigned 
int'
[   12.988235] CPU: 4 PID: 1310 Comm: gnome-shell Tainted: G U 
6.9.0-rc6+prerelease1158+ #19
[   12.988237] Hardware name: Intel Corporation Raptor Lake Client 
Platform/RPL-S ADP-S DDR5 UDIMM CRB, BIOS RPLSFWI1.R00.3301.A02.2208050712 
08/05/2022
[   12.988239] Call Trace:
[   12.988240]  
[   12.988242]  dump_stack_lvl+0xd7/0xf0
[   12.988248]  dump_stack+0x10/0x20
[   12.988250]  ubsan_epilogue+0x9/0x40
[   12.988253]  __ubsan_handle_shift_out_of_bounds+0x10e/0x170
[   12.988260]  dma_resv_reserve_fences.cold+0x2b/0x48
[   12.988262]  ? ww_mutex_lock_interruptible+0x3c/0x110
[   12.988267]  drm_exec_prepare_obj+0x45/0x60 [drm_exec]
[   12.988271]  ? vm_bind_ioctl_ops_execute+0x5b/0x740 [xe]
[   12.988345]  vm_bind_ioctl_ops_execute+0x78/0x740 [xe]

It is caused by the value 0 of the num_fences parameter in
drm_exec_prepare_obj(). This leads to __rounddown_pow_of_two(), where
"0 - 1" causes the shift-out-of-bounds.


ok


For num_fences == 0 case, drm_exec_prepare_obj is the same as drm_exec_lock_obj 
in
function, so call drm_exec_lock_obj instead to solve it.


this is not true and, per the discussion in this thread, it's not going to
change. drm_exec_prepare_obj() should not be called with num_fences ==
0. So I'd reword it with something like below so we have all the breadcrumbs
for anyone trying to understand the changes later:

By design drm_exec_prepare_obj() should be called only when there are
fences to be reserved. If num_fences is 0, calling drm_exec_lock_obj()
is sufficient as was done in commit 9377de4cb3e8 ("drm/xe/vm: Avoid
reserving zero fences")



Cc: Nirmoy Das 
Cc: Matthew Brost 
Signed-off-by: Shuicheng Lin 


with the reword,

Reviewed-by: Lucas De Marchi 

And also add:
Link: https://lore.kernel.org/all/24d4a9a9-c622-4f56-8672-21f4c6785...@amd.com

Could you also submit a patch to add the warning like mentioned by
Christian?

thanks
Lucas De Marchi


---
drivers/gpu/drm/xe/xe_vm.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index d17192c8b7de..c5b1694b292f 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -2692,7 +2692,7 @@ static int vma_lock_and_validate(struct drm_exec *exec, 
struct xe_vma *vma,

if (bo) {
if (!bo->vm)
-   err = drm_exec_prepare_obj(exec, &bo->ttm.base, 0);
+   err = drm_exec_lock_obj(exec, &bo->ttm.base);
if (!err && validate)
err = xe_bo_validate(bo, xe_vma_vm(vma), true);
}
@@ -2777,7 +2777,7 @@ static int vm_bind_ioctl_ops_lock_and_prep(struct 
drm_exec *exec,
struct xe_vma_op *op;
int err;

-   err = drm_exec_prepare_obj(exec, xe_vm_obj(vm), 0);
+   err = drm_exec_lock_obj(exec, xe_vm_obj(vm));
if (err)
return err;

--
2.25.1



Re: [PATCH v2 6/6] drm/xe/client: Print runtime to fdinfo

2024-05-08 Thread Lucas De Marchi

On Wed, May 08, 2024 at 09:23:17AM GMT, Tvrtko Ursulin wrote:


On 07/05/2024 22:35, Lucas De Marchi wrote:

On Fri, Apr 26, 2024 at 11:47:37AM GMT, Tvrtko Ursulin wrote:


On 24/04/2024 00:56, Lucas De Marchi wrote:

Print the accumulated runtime for client when printing fdinfo.
Each time a query is done it first does 2 things:

1) loop through all the exec queues for the current client and
   accumulate the runtime, per engine class. CTX_TIMESTAMP is used for
   that, being read from the context image.

2) Read a "GPU timestamp" that can be used for considering "how much GPU
   time has passed" and that has the same unit/refclock as the one
   recording the runtime. RING_TIMESTAMP is used for that via MMIO.

Since for all current platforms RING_TIMESTAMP follows the same
refclock, just read it once, using any first engine.

This is exported to userspace as 2 numbers in fdinfo:

drm-cycles-<class>: <uint>
drm-total-cycles-<class>: <uint>

Userspace is expected to collect at least 2 samples, which allows it to
compute the client engine busyness as per:

               RUNTIME1 - RUNTIME0
    busyness = -------------------
                     T1 - T0

Another thing to point out is that it's expected that userspace will
read any 2 samples every few seconds.  Given the update frequency of the
counters involved and that CTX_TIMESTAMP is 32-bits, the counter for
each exec_queue can wrap around (assuming 100% utilization) after ~200s.
The wraparound is not perceived by userspace (since it's just accumulated
for all the exec_queues in a 64-bit counter), but the measurement will
not be accurate if the samples are too far apart.

This could be mitigated by adding a workqueue to accumulate the counters
every so often, but it's additional complexity for something that is
done already by userspace every few seconds in tools like gputop (from
igt), htop, nvtop, etc with none of them really defaulting to 1 sample
per minute or more.

Signed-off-by: Lucas De Marchi 
---
 Documentation/gpu/drm-usage-stats.rst   |  16 ++-
 Documentation/gpu/xe/index.rst  |   1 +
 Documentation/gpu/xe/xe-drm-usage-stats.rst |  10 ++
 drivers/gpu/drm/xe/xe_drm_client.c  | 138 +++-
 4 files changed, 162 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/gpu/xe/xe-drm-usage-stats.rst

diff --git a/Documentation/gpu/drm-usage-stats.rst 
b/Documentation/gpu/drm-usage-stats.rst

index 6dc299343b48..421766289b78 100644
--- a/Documentation/gpu/drm-usage-stats.rst
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -112,6 +112,17 @@ larger value within a reasonable period. 
Upon observing a value lower than what
 was previously read, userspace is expected to stay with that 
larger previous

 value until a monotonic update is seen.
+- drm-total-cycles-: 
+
+Engine identifier string must be the same as the one specified in the
+drm-cycles- tag and shall contain the total number 
cycles for the given

+engine.
+
+This is a timestamp in GPU unspecified unit that matches the 
update rate
+of drm-cycles-. For drivers that implement this 
interface, the engine

+utilization can be calculated entirely on the GPU clock domain, without
+considering the CPU sleep time between 2 samples.


Two opens.

1)
Do we need to explicitly document that drm-total-cycles and 
drm-maxfreq are mutually exclusive?


so userspace has a fallback mechanism to calculate utilization depending
on what keys are available?


No, documenting all three at once does not make sense. Or at least it is 
not expected. Or do you envisage someone might legitimately emit all 
three? I don't see what the semantics would be.
cycles+maxfreq the latter is in Hz. And when we have cycles+total then 
it is unitless. All three?


I don't follow what you mean here. *cycles* is actually a unit.

The engine spent 10 cycles running this context (drm-cycles). In the
same period there were 100 cycles available (drm-total-cycles). Current
frequency is X MHz. Max frequency is Y MHz. For me all of them make
sense if one wants to mix them together. For xe it doesn't make sense
because the counter backing drm-cycles and drm-total-cycles is unrelated
to the engine frequency.

I can add something in the doc that we do not expect to see all of them
together until we see a usecase. Each driver may implement a subset.




2)
Should drm-total-cycles for now be documents as driver specific?


you mean to call it xe-total-cycles?


Yes but it is not an ask, just an open.


Ok, my opinion is that we shouldn't. Just like we have drm-cycles today,
implemented by some drivers but not all. I'd consider drm-curfreq, which is
not documented in the drm layer, as something to be fixed or migrated to
a driver-only interface (probably not possible anymore as it'd break the
uapi).  The problem I see with turning it into xe-total-cycles is that the
moment another driver decides to implement it, they will either have to use
the xe- prefix or xe will need to st

Re: [PATCH v3 5/6] drm/xe: Add helper to accumulate exec queue runtime

2024-05-08 Thread Lucas De Marchi

On Wed, May 08, 2024 at 09:34:59AM GMT, Tvrtko Ursulin wrote:


On 07/05/2024 23:45, Lucas De Marchi wrote:

From: Umesh Nerlige Ramappa 

Add a helper to accumulate per-client runtime of all its
exec queues. This is called every time a sched job is finished.

v2:
  - Use guc_exec_queue_free_job() and execlist_job_free() to accumulate
runtime when job is finished since xe_sched_job_completed() is not a
notification that job finished.
  - Stop trying to update runtime from xe_exec_queue_fini() - that is
redundant and may happen after xef is closed, leading to a
use-after-free
  - Do not special case the first timestamp read: the default LRC sets
CTX_TIMESTAMP to zero, so even the first sample should be a valid
one.
  - Handle the parallel submission case by multiplying the runtime by
width.

Signed-off-by: Umesh Nerlige Ramappa 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_device_types.h |  9 +++
 drivers/gpu/drm/xe/xe_exec_queue.c   | 35 
 drivers/gpu/drm/xe/xe_exec_queue.h   |  1 +
 drivers/gpu/drm/xe/xe_execlist.c |  1 +
 drivers/gpu/drm/xe/xe_guc_submit.c   |  2 ++
 5 files changed, 48 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_device_types.h 
b/drivers/gpu/drm/xe/xe_device_types.h
index 906b98fb973b..de078bdf0ab9 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -560,6 +560,15 @@ struct xe_file {
struct mutex lock;
} exec_queue;
+   /**
+* @runtime: hw engine class runtime in ticks for this drm client
+*
+* Only stats from xe_exec_queue->lrc[0] are accumulated. For multi-lrc
+* case, since all jobs run in parallel on the engines, only the stats
+* from lrc[0] are sufficient.
+*/
+   u64 runtime[XE_ENGINE_CLASS_MAX];
+
/** @client: drm client */
struct xe_drm_client *client;
 };
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c 
b/drivers/gpu/drm/xe/xe_exec_queue.c
index 395de93579fa..86eb22e22c95 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -769,6 +769,41 @@ bool xe_exec_queue_is_idle(struct xe_exec_queue *q)
q->lrc[0].fence_ctx.next_seqno - 1;
 }
+/**
+ * xe_exec_queue_update_runtime() - Update runtime for this exec queue from hw
+ * @q: The exec queue
+ *
+ * Update the timestamp saved by HW for this exec queue and save runtime
+ * calculated by using the delta from last update. On multi-lrc case, only the
+ * first is considered.
+ */
+void xe_exec_queue_update_runtime(struct xe_exec_queue *q)
+{
+   struct xe_file *xef;
+   struct xe_lrc *lrc;
+   u32 old_ts, new_ts;
+
+   /*
+* Jobs that are run during driver load may use an exec_queue, but are
+* not associated with a user xe file, so avoid accumulating busyness
+* for kernel specific work.
+*/
+   if (!q->vm || !q->vm->xef)
+   return;
+
+   xef = q->vm->xef;
+
+   /*
+* Only sample the first LRC. For parallel submission, all of them are
+* scheduled together and we compensate that below by multiplying by
+* width
+*/
+   lrc = &q->lrc[0];
+
+   new_ts = xe_lrc_update_timestamp(lrc, &old_ts);
+   xef->runtime[q->class] += (new_ts - old_ts) * q->width;


I think in theory this could be introducing a systematic error 
depending on how firmware handles things and tick resolution. Or even 
regardless of the firmware, if the timestamps are saved on context 
exit by the GPU hw itself and parallel contexts do not exit 100% 
aligned. Undershoot would be I think fine, but systematic overshoot 
under constant 100% parallel load from mutlitple client could 
constantly show >100% class utilisation. Probably not a concern in 
practice but worthy a comment?


Ok... just extend the comment above? I could have looped through the
LRCs to read the timestamp, but I didn't like reading them at slightly
different times, which could give slightly different results if they are
in a running state.  This way it doesn't have that problem, at the
expense of "we have to trust the hw/firmware". We can switch to the loop
eventually if our level of trust decreases after getting more data/tests :)

thanks
Lucas De Marchi



Regards,

Tvrtko


+}
+
 void xe_exec_queue_kill(struct xe_exec_queue *q)
 {
struct xe_exec_queue *eq = q, *next;
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.h 
b/drivers/gpu/drm/xe/xe_exec_queue.h
index 48f6da53a292..e0f07d28ee1a 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.h
+++ b/drivers/gpu/drm/xe/xe_exec_queue.h
@@ -75,5 +75,6 @@ struct dma_fence *xe_exec_queue_last_fence_get(struct 
xe_exec_queue *e,
   struct xe_vm *vm);
 void xe_exec_queue_last_fence_set(struct xe_exec_queue *e, struct xe_vm *vm,
  str

Re: [PATCH v3 5/6] drm/xe: Add helper to accumulate exec queue runtime

2024-05-07 Thread Lucas De Marchi

On Tue, May 07, 2024 at 04:45:36PM GMT, Umesh Nerlige Ramappa wrote:

On Tue, May 07, 2024 at 03:45:09PM -0700, Lucas De Marchi wrote:

From: Umesh Nerlige Ramappa 

Add a helper to accumulate per-client runtime of all its
exec queues. This is called every time a sched job is finished.

v2:
- Use guc_exec_queue_free_job() and execlist_job_free() to accumulate
  runtime when job is finished since xe_sched_job_completed() is not a
  notification that job finished.
- Stop trying to update runtime from xe_exec_queue_fini() - that is
  redundant and may happen after xef is closed, leading to a
  use-after-free
- Do not special case the first timestamp read: the default LRC sets
  CTX_TIMESTAMP to zero, so even the first sample should be a valid
  one.
- Handle the parallel submission case by multiplying the runtime by
  width.

Signed-off-by: Umesh Nerlige Ramappa 
Signed-off-by: Lucas De Marchi 
---
drivers/gpu/drm/xe/xe_device_types.h |  9 +++
drivers/gpu/drm/xe/xe_exec_queue.c   | 35 
drivers/gpu/drm/xe/xe_exec_queue.h   |  1 +
drivers/gpu/drm/xe/xe_execlist.c |  1 +
drivers/gpu/drm/xe/xe_guc_submit.c   |  2 ++
5 files changed, 48 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_device_types.h 
b/drivers/gpu/drm/xe/xe_device_types.h
index 906b98fb973b..de078bdf0ab9 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -560,6 +560,15 @@ struct xe_file {
struct mutex lock;
} exec_queue;

+   /**
+* @runtime: hw engine class runtime in ticks for this drm client
+*
+* Only stats from xe_exec_queue->lrc[0] are accumulated. For multi-lrc
+* case, since all jobs run in parallel on the engines, only the stats
+* from lrc[0] are sufficient.


Maybe we can drop the above comment altogether after the multi-lrc 
update.


right... I removed it from the middle of the function and this was a
leftover. I will remove it on next version.

thanks
Lucas De Marchi




Umesh


[PATCH v3 6/6] drm/xe/client: Print runtime to fdinfo

2024-05-07 Thread Lucas De Marchi
Print the accumulated runtime for client when printing fdinfo.
Each time a query is done it first does 2 things:

1) loop through all the exec queues for the current client and
   accumulate the runtime, per engine class. CTX_TIMESTAMP is used for
   that, being read from the context image.

2) Read a "GPU timestamp" that can be used for considering "how much GPU
   time has passed" and that has the same unit/refclock as the one
   recording the runtime. RING_TIMESTAMP is used for that via MMIO.

Since for all current platforms RING_TIMESTAMP follows the same
refclock, just read it once, using any first engine.

This is exported to userspace as 2 numbers in fdinfo:

drm-cycles-<class>: <uint>
drm-total-cycles-<class>: <uint>

Userspace is expected to collect at least 2 samples, which allows it to
compute the client engine busyness as per:

               RUNTIME1 - RUNTIME0
    busyness = -------------------
                     T1 - T0

Another thing to point out is that it's expected that userspace will
read any 2 samples every few seconds.  Given the update frequency of the
counters involved and that CTX_TIMESTAMP is 32-bits, the counter for
each exec_queue can wrap around (assuming 100% utilization) after ~200s.
The wraparound is not perceived by userspace (since it's just accumulated
for all the exec_queues in a 64-bit counter), but the measurement will
not be accurate if the samples are too far apart.

This could be mitigated by adding a workqueue to accumulate the counters
every so often, but it's additional complexity for something that is
done already by userspace every few seconds in tools like gputop (from
igt), htop, nvtop, etc with none of them really defaulting to 1 sample
per minute or more.

Signed-off-by: Lucas De Marchi 
---
 Documentation/gpu/drm-usage-stats.rst   |  16 ++-
 Documentation/gpu/xe/index.rst  |   1 +
 Documentation/gpu/xe/xe-drm-usage-stats.rst |  10 ++
 drivers/gpu/drm/xe/xe_drm_client.c  | 136 +++-
 4 files changed, 160 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/gpu/xe/xe-drm-usage-stats.rst

diff --git a/Documentation/gpu/drm-usage-stats.rst 
b/Documentation/gpu/drm-usage-stats.rst
index 6dc299343b48..421766289b78 100644
--- a/Documentation/gpu/drm-usage-stats.rst
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -112,6 +112,17 @@ larger value within a reasonable period. Upon observing a 
value lower than what
 was previously read, userspace is expected to stay with that larger previous
 value until a monotonic update is seen.
 
+- drm-total-cycles-<keystr>: <uint>
+
+Engine identifier string must be the same as the one specified in the
+drm-cycles-<keystr> tag and shall contain the total number of cycles for the
+given engine.
+
+This is a timestamp in an unspecified GPU unit that matches the update rate
+of drm-cycles-<keystr>. For drivers that implement this interface, the engine
+utilization can be calculated entirely on the GPU clock domain, without
+considering the CPU sleep time between 2 samples.
+
 - drm-maxfreq-<keystr>: <uint> [Hz|MHz|KHz]
 
 Engine identifier string must be the same as the one specified in the
@@ -168,5 +179,6 @@ be documented above and where possible, aligned with other 
drivers.
 Driver specific implementations
 ---
 
-:ref:`i915-usage-stats`
-:ref:`panfrost-usage-stats`
+* :ref:`i915-usage-stats`
+* :ref:`panfrost-usage-stats`
+* :ref:`xe-usage-stats`
diff --git a/Documentation/gpu/xe/index.rst b/Documentation/gpu/xe/index.rst
index c224ecaee81e..3f07aa3b5432 100644
--- a/Documentation/gpu/xe/index.rst
+++ b/Documentation/gpu/xe/index.rst
@@ -23,3 +23,4 @@ DG2, etc is provided to prototype the driver.
xe_firmware
xe_tile
xe_debugging
+   xe-drm-usage-stats.rst
diff --git a/Documentation/gpu/xe/xe-drm-usage-stats.rst 
b/Documentation/gpu/xe/xe-drm-usage-stats.rst
new file mode 100644
index ..482d503ae68a
--- /dev/null
+++ b/Documentation/gpu/xe/xe-drm-usage-stats.rst
@@ -0,0 +1,10 @@
+.. SPDX-License-Identifier: GPL-2.0+
+
+.. _xe-usage-stats:
+
+
+Xe DRM client usage stats implementation
+
+
+.. kernel-doc:: drivers/gpu/drm/xe/xe_drm_client.c
+   :doc: DRM Client usage stats
diff --git a/drivers/gpu/drm/xe/xe_drm_client.c 
b/drivers/gpu/drm/xe/xe_drm_client.c
index 08f0b7c95901..27494839d586 100644
--- a/drivers/gpu/drm/xe/xe_drm_client.c
+++ b/drivers/gpu/drm/xe/xe_drm_client.c
@@ -2,6 +2,7 @@
 /*
  * Copyright © 2023 Intel Corporation
  */
+#include "xe_drm_client.h"
 
 #include 
 #include 
@@ -12,9 +13,65 @@
 #include "xe_bo.h"
 #include "xe_bo_types.h"
 #include "xe_device_types.h"
-#include "xe_drm_client.h"
+#include "xe_exec_queue.h"
+#include "xe_gt.h"
+#include "xe_hw_engine.h"
+#include "xe_pm.h"
 #include "xe_trace.h"
 
+/**
+ * DOC: DRM Client usage stats

[PATCH v3 3/6] drm/xe/lrc: Add helper to capture context timestamp

2024-05-07 Thread Lucas De Marchi
From: Umesh Nerlige Ramappa 

Add a helper to capture CTX_TIMESTAMP from the context image so it can
be used to calculate the runtime.

v2: Add kernel-doc to clarify expectation from caller

Signed-off-by: Umesh Nerlige Ramappa 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/regs/xe_lrc_layout.h |  1 +
 drivers/gpu/drm/xe/xe_lrc.c | 11 +++
 drivers/gpu/drm/xe/xe_lrc.h | 14 ++
 drivers/gpu/drm/xe/xe_lrc_types.h   |  3 +++
 4 files changed, 29 insertions(+)

diff --git a/drivers/gpu/drm/xe/regs/xe_lrc_layout.h 
b/drivers/gpu/drm/xe/regs/xe_lrc_layout.h
index 1825d8f79db6..8780e6c6b649 100644
--- a/drivers/gpu/drm/xe/regs/xe_lrc_layout.h
+++ b/drivers/gpu/drm/xe/regs/xe_lrc_layout.h
@@ -11,6 +11,7 @@
 #define CTX_RING_TAIL  (0x06 + 1)
 #define CTX_RING_START (0x08 + 1)
 #define CTX_RING_CTL   (0x0a + 1)
+#define CTX_TIMESTAMP  (0x22 + 1)
 #define CTX_PDP0_UDW   (0x30 + 1)
 #define CTX_PDP0_LDW   (0x32 + 1)
 
diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
index 2066d34ddf0b..8c540f2f43e3 100644
--- a/drivers/gpu/drm/xe/xe_lrc.c
+++ b/drivers/gpu/drm/xe/xe_lrc.c
@@ -751,6 +751,7 @@ int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine 
*hwe,
lrc->tile = gt_to_tile(hwe->gt);
lrc->ring.size = ring_size;
lrc->ring.tail = 0;
+   lrc->ctx_timestamp = 0;
 
xe_hw_fence_ctx_init(&lrc->fence_ctx, hwe->gt,
 hwe->fence_irq, hwe->name);
@@ -786,6 +787,7 @@ int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine 
*hwe,
xe_drm_client_add_bo(vm->xef->client, lrc->bo);
}
 
+   xe_lrc_write_ctx_reg(lrc, CTX_TIMESTAMP, 0);
xe_lrc_write_ctx_reg(lrc, CTX_RING_START, __xe_lrc_ring_ggtt_addr(lrc));
xe_lrc_write_ctx_reg(lrc, CTX_RING_HEAD, 0);
xe_lrc_write_ctx_reg(lrc, CTX_RING_TAIL, lrc->ring.tail);
@@ -1444,3 +1446,12 @@ void xe_lrc_snapshot_free(struct xe_lrc_snapshot 
*snapshot)
xe_bo_put(snapshot->lrc_bo);
kfree(snapshot);
 }
+
+u32 xe_lrc_update_timestamp(struct xe_lrc *lrc, u32 *old_ts)
+{
+   *old_ts = lrc->ctx_timestamp;
+
+   lrc->ctx_timestamp = xe_lrc_read_ctx_reg(lrc, CTX_TIMESTAMP);
+
+   return lrc->ctx_timestamp;
+}
diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
index d32fa31faa2c..dcbc6edd80da 100644
--- a/drivers/gpu/drm/xe/xe_lrc.h
+++ b/drivers/gpu/drm/xe/xe_lrc.h
@@ -60,4 +60,18 @@ void xe_lrc_snapshot_capture_delayed(struct xe_lrc_snapshot 
*snapshot);
 void xe_lrc_snapshot_print(struct xe_lrc_snapshot *snapshot, struct 
drm_printer *p);
 void xe_lrc_snapshot_free(struct xe_lrc_snapshot *snapshot);
 
+/**
+ * xe_lrc_update_timestamp - readout LRC timestamp and update cached value
+ * @lrc: logical ring context for this exec queue
+ * @old_ts: pointer where to save the previous timestamp
+ *
+ * Read the current timestamp for this LRC and update the cached value. The
+ * previous cached value is also returned in @old_ts so the caller can
+ * calculate the delta between two updates. Note that this is not intended
+ * to be called from just anywhere, only from the paths updating the drm
+ * client utilization.
+ *
+ * Returns the current LRC timestamp
+ */
+u32 xe_lrc_update_timestamp(struct xe_lrc *lrc, u32 *old_ts);
+
 #endif
diff --git a/drivers/gpu/drm/xe/xe_lrc_types.h 
b/drivers/gpu/drm/xe/xe_lrc_types.h
index b716df0dfb4e..5765d771b901 100644
--- a/drivers/gpu/drm/xe/xe_lrc_types.h
+++ b/drivers/gpu/drm/xe/xe_lrc_types.h
@@ -41,6 +41,9 @@ struct xe_lrc {
 
/** @fence_ctx: context for hw fence */
struct xe_hw_fence_ctx fence_ctx;
+
+   /** @ctx_timestamp: readout value of CTX_TIMESTAMP on last update */
+   u32 ctx_timestamp;
 };
 
 struct xe_lrc_snapshot;
-- 
2.43.0



[PATCH v3 0/6] drm/xe: Per client usage

2024-05-07 Thread Lucas De Marchi
Add per-client usage statistics to xe. This ports xe to use the common
method in drm to export the usage to userspace per client (where 1
client == 1 drm fd open).

However, instead of using the current format measured in nsec, this
creates a new one. The intention here is not to mix the GPU clock domain
with the CPU clock; it allows covering a few more use cases without
extra complications.

I tested this on DG2 and also checked gputop with i915 to make sure it did
not regress. The last patch also contains the documentation for the new key
and sample output, as requested in v1. Reproducing it partially here:

- drm-total-cycles-<keystr>: <uint>

Engine identifier string must be the same as the one specified in the
drm-cycles-<keystr> tag and shall contain the total number of cycles for the
given engine.

This is a timestamp in an unspecified GPU unit that matches the update rate
of drm-cycles-<keystr>. For drivers that implement this interface, the engine
utilization can be calculated entirely in the GPU clock domain, without
considering the CPU sleep time between two samples.

The pre-existing drm-cycles-<keystr> is used as is, which allows gputop
to work with xe.

This last patch still has some open discussion from v2, so we may need
to hold it a little more.

Test-with: 20240504064643.25863-1-lucas.demar...@intel.com

v2:
  - Create a new drm-total-cycles instead of re-using drm-engine with a
different unit
  - Add documentation for the new interface and clarify usage of
xe_lrc_update_timestamp()

v3:
  - Fix bugs in "drm/xe: Add helper to accumulate exec queue runtime" -
see commit message
  - Reorder commits so the ones that are useful in other patch series
come first

Lucas De Marchi (4):
  drm/xe: Promote xe_hw_engine_class_to_str()
  drm/xe: Add XE_ENGINE_CLASS_OTHER to str conversion
  drm/xe: Add helper to capture engine timestamp
  drm/xe/client: Print runtime to fdinfo

Umesh Nerlige Ramappa (2):
  drm/xe/lrc: Add helper to capture context timestamp
  drm/xe: Add helper to accumulate exec queue runtime

 Documentation/gpu/drm-usage-stats.rst |  16 ++-
 Documentation/gpu/xe/index.rst|   1 +
 Documentation/gpu/xe/xe-drm-usage-stats.rst   |  10 ++
 drivers/gpu/drm/xe/regs/xe_lrc_layout.h   |   1 +
 drivers/gpu/drm/xe/xe_device_types.h  |   9 ++
 drivers/gpu/drm/xe/xe_drm_client.c| 136 +-
 drivers/gpu/drm/xe/xe_exec_queue.c|  35 +
 drivers/gpu/drm/xe/xe_exec_queue.h|   1 +
 drivers/gpu/drm/xe/xe_execlist.c  |   1 +
 drivers/gpu/drm/xe/xe_guc_submit.c|   2 +
 drivers/gpu/drm/xe/xe_hw_engine.c |  27 
 drivers/gpu/drm/xe/xe_hw_engine.h |   3 +
 drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c |  18 ---
 drivers/gpu/drm/xe/xe_lrc.c   |  11 ++
 drivers/gpu/drm/xe/xe_lrc.h   |  14 ++
 drivers/gpu/drm/xe/xe_lrc_types.h |   3 +
 16 files changed, 267 insertions(+), 21 deletions(-)
 create mode 100644 Documentation/gpu/xe/xe-drm-usage-stats.rst

-- 
2.43.0



[PATCH v3 4/6] drm/xe: Add helper to capture engine timestamp

2024-05-07 Thread Lucas De Marchi
Just like CTX_TIMESTAMP is used to calculate runtime, add a helper to
get the timestamp for the engine so it can be used to calculate the
"engine time" with the same unit as the runtime is recorded.

Reviewed-by: Umesh Nerlige Ramappa 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_hw_engine.c | 5 +
 drivers/gpu/drm/xe/xe_hw_engine.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c 
b/drivers/gpu/drm/xe/xe_hw_engine.c
index de30b74bf253..734f43a35b48 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.c
+++ b/drivers/gpu/drm/xe/xe_hw_engine.c
@@ -1110,3 +1110,8 @@ const char *xe_hw_engine_class_to_str(enum 
xe_engine_class class)
 
return NULL;
 }
+
+u64 xe_hw_engine_read_timestamp(struct xe_hw_engine *hwe)
+{
+   return xe_mmio_read64_2x32(hwe->gt, RING_TIMESTAMP(hwe->mmio_base));
+}
diff --git a/drivers/gpu/drm/xe/xe_hw_engine.h 
b/drivers/gpu/drm/xe/xe_hw_engine.h
index 843de159e47c..7f2d27c0ba1a 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.h
+++ b/drivers/gpu/drm/xe/xe_hw_engine.h
@@ -68,5 +68,6 @@ static inline bool xe_hw_engine_is_valid(struct xe_hw_engine 
*hwe)
 }
 
 const char *xe_hw_engine_class_to_str(enum xe_engine_class class);
+u64 xe_hw_engine_read_timestamp(struct xe_hw_engine *hwe);
 
 #endif
-- 
2.43.0



[PATCH v3 1/6] drm/xe: Promote xe_hw_engine_class_to_str()

2024-05-07 Thread Lucas De Marchi
Move it out of the sysfs compilation unit so it can be re-used in other
places.

Reviewed-by: Nirmoy Das 
Reviewed-by: Oak Zeng 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_hw_engine.c | 18 ++
 drivers/gpu/drm/xe/xe_hw_engine.h |  2 ++
 drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c | 18 --
 3 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c 
b/drivers/gpu/drm/xe/xe_hw_engine.c
index ec69803152a2..85712650be22 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.c
+++ b/drivers/gpu/drm/xe/xe_hw_engine.c
@@ -1088,3 +1088,21 @@ bool xe_hw_engine_is_reserved(struct xe_hw_engine *hwe)
return xe->info.has_usm && hwe->class == XE_ENGINE_CLASS_COPY &&
hwe->instance == gt->usm.reserved_bcs_instance;
 }
+
+const char *xe_hw_engine_class_to_str(enum xe_engine_class class)
+{
+   switch (class) {
+   case XE_ENGINE_CLASS_RENDER:
+   return "rcs";
+   case XE_ENGINE_CLASS_VIDEO_DECODE:
+   return "vcs";
+   case XE_ENGINE_CLASS_VIDEO_ENHANCE:
+   return "vecs";
+   case XE_ENGINE_CLASS_COPY:
+   return "bcs";
+   case XE_ENGINE_CLASS_COMPUTE:
+   return "ccs";
+   default:
+   return NULL;
+   }
+}
diff --git a/drivers/gpu/drm/xe/xe_hw_engine.h 
b/drivers/gpu/drm/xe/xe_hw_engine.h
index 71968ee2f600..843de159e47c 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.h
+++ b/drivers/gpu/drm/xe/xe_hw_engine.h
@@ -67,4 +67,6 @@ static inline bool xe_hw_engine_is_valid(struct xe_hw_engine 
*hwe)
return hwe->name;
 }
 
+const char *xe_hw_engine_class_to_str(enum xe_engine_class class);
+
 #endif
diff --git a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c 
b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c
index 844ec68cbbb8..efce6c7dd2a2 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c
+++ b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c
@@ -618,24 +618,6 @@ static void hw_engine_class_sysfs_fini(struct drm_device 
*drm, void *arg)
kobject_put(kobj);
 }
 
-static const char *xe_hw_engine_class_to_str(enum xe_engine_class class)
-{
-   switch (class) {
-   case XE_ENGINE_CLASS_RENDER:
-   return "rcs";
-   case XE_ENGINE_CLASS_VIDEO_DECODE:
-   return "vcs";
-   case XE_ENGINE_CLASS_VIDEO_ENHANCE:
-   return "vecs";
-   case XE_ENGINE_CLASS_COPY:
-   return "bcs";
-   case XE_ENGINE_CLASS_COMPUTE:
-   return "ccs";
-   default:
-   return NULL;
-   }
-}
-
 /**
  * xe_hw_engine_class_sysfs_init - Init HW engine classes on GT.
  * @gt: Xe GT.
-- 
2.43.0



[PATCH v3 5/6] drm/xe: Add helper to accumulate exec queue runtime

2024-05-07 Thread Lucas De Marchi
From: Umesh Nerlige Ramappa 

Add a helper to accumulate per-client runtime of all its
exec queues. This is called every time a sched job is finished.

v2:
  - Use guc_exec_queue_free_job() and execlist_job_free() to accumulate
runtime when the job is finished, since xe_sched_job_completed() is not a
notification that the job finished.
  - Stop trying to update runtime from xe_exec_queue_fini() - that is
redundant and may happen after xef is closed, leading to a
use-after-free
  - Do not special case the first timestamp read: the default LRC sets
CTX_TIMESTAMP to zero, so even the first sample should be a valid
one.
  - Handle the parallel submission case by multiplying the runtime by
width.

Signed-off-by: Umesh Nerlige Ramappa 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_device_types.h |  9 +++
 drivers/gpu/drm/xe/xe_exec_queue.c   | 35 
 drivers/gpu/drm/xe/xe_exec_queue.h   |  1 +
 drivers/gpu/drm/xe/xe_execlist.c |  1 +
 drivers/gpu/drm/xe/xe_guc_submit.c   |  2 ++
 5 files changed, 48 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_device_types.h 
b/drivers/gpu/drm/xe/xe_device_types.h
index 906b98fb973b..de078bdf0ab9 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -560,6 +560,15 @@ struct xe_file {
struct mutex lock;
} exec_queue;
 
+   /**
+* @runtime: hw engine class runtime in ticks for this drm client
+*
+	 * Only stats from xe_exec_queue->lrc[0] are accumulated. For the
+	 * multi-lrc case, since all jobs run in parallel on the engines, the
+	 * stats from lrc[0] are sufficient.
+*/
+   u64 runtime[XE_ENGINE_CLASS_MAX];
+
/** @client: drm client */
struct xe_drm_client *client;
 };
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c 
b/drivers/gpu/drm/xe/xe_exec_queue.c
index 395de93579fa..86eb22e22c95 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -769,6 +769,41 @@ bool xe_exec_queue_is_idle(struct xe_exec_queue *q)
q->lrc[0].fence_ctx.next_seqno - 1;
 }
 
+/**
+ * xe_exec_queue_update_runtime() - Update runtime for this exec queue from hw
+ * @q: The exec queue
+ *
+ * Update the timestamp saved by HW for this exec queue and save runtime
+ * calculated by using the delta from last update. On multi-lrc case, only the
+ * first is considered.
+ */
+void xe_exec_queue_update_runtime(struct xe_exec_queue *q)
+{
+   struct xe_file *xef;
+   struct xe_lrc *lrc;
+   u32 old_ts, new_ts;
+
+   /*
+* Jobs that are run during driver load may use an exec_queue, but are
+* not associated with a user xe file, so avoid accumulating busyness
+* for kernel specific work.
+*/
+   if (!q->vm || !q->vm->xef)
+   return;
+
+   xef = q->vm->xef;
+
+   /*
+* Only sample the first LRC. For parallel submission, all of them are
+* scheduled together and we compensate that below by multiplying by
+* width
+*/
+   lrc = &q->lrc[0];
+
+   new_ts = xe_lrc_update_timestamp(lrc, &old_ts);
+   xef->runtime[q->class] += (new_ts - old_ts) * q->width;
+}
+
 void xe_exec_queue_kill(struct xe_exec_queue *q)
 {
struct xe_exec_queue *eq = q, *next;
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.h 
b/drivers/gpu/drm/xe/xe_exec_queue.h
index 48f6da53a292..e0f07d28ee1a 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.h
+++ b/drivers/gpu/drm/xe/xe_exec_queue.h
@@ -75,5 +75,6 @@ struct dma_fence *xe_exec_queue_last_fence_get(struct 
xe_exec_queue *e,
   struct xe_vm *vm);
 void xe_exec_queue_last_fence_set(struct xe_exec_queue *e, struct xe_vm *vm,
  struct dma_fence *fence);
+void xe_exec_queue_update_runtime(struct xe_exec_queue *q);
 
 #endif
diff --git a/drivers/gpu/drm/xe/xe_execlist.c b/drivers/gpu/drm/xe/xe_execlist.c
index dece2785933c..a316431025c7 100644
--- a/drivers/gpu/drm/xe/xe_execlist.c
+++ b/drivers/gpu/drm/xe/xe_execlist.c
@@ -307,6 +307,7 @@ static void execlist_job_free(struct drm_sched_job *drm_job)
 {
struct xe_sched_job *job = to_xe_sched_job(drm_job);
 
+   xe_exec_queue_update_runtime(job->q);
xe_sched_job_put(job);
 }
 
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c 
b/drivers/gpu/drm/xe/xe_guc_submit.c
index d274a139010b..e0ebfe83dfe8 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -749,6 +749,8 @@ static void guc_exec_queue_free_job(struct drm_sched_job 
*drm_job)
 {
struct xe_sched_job *job = to_xe_sched_job(drm_job);
 
+   xe_exec_queue_update_runtime(job->q);
+
trace_xe_sched_job_free(job);
xe_sched_job_put(job);
 }
-- 
2.43.0



[PATCH v3 2/6] drm/xe: Add XE_ENGINE_CLASS_OTHER to str conversion

2024-05-07 Thread Lucas De Marchi
XE_ENGINE_CLASS_OTHER was missing from the str conversion. Add it and
remove the default handling so it's protected by -Wswitch.
Currently the only user is xe_hw_engine_class_sysfs_init(), which
already skips XE_ENGINE_CLASS_OTHER, so there's no change in behavior.

Reviewed-by: Nirmoy Das 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_hw_engine.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c 
b/drivers/gpu/drm/xe/xe_hw_engine.c
index 85712650be22..de30b74bf253 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.c
+++ b/drivers/gpu/drm/xe/xe_hw_engine.c
@@ -1100,9 +1100,13 @@ const char *xe_hw_engine_class_to_str(enum 
xe_engine_class class)
return "vecs";
case XE_ENGINE_CLASS_COPY:
return "bcs";
+   case XE_ENGINE_CLASS_OTHER:
+   return "other";
case XE_ENGINE_CLASS_COMPUTE:
return "ccs";
-   default:
-   return NULL;
+   case XE_ENGINE_CLASS_MAX:
+   break;
}
+
+   return NULL;
 }
-- 
2.43.0



Re: [PATCH v2 6/6] drm/xe/client: Print runtime to fdinfo

2024-05-07 Thread Lucas De Marchi

On Fri, Apr 26, 2024 at 11:47:37AM GMT, Tvrtko Ursulin wrote:


On 24/04/2024 00:56, Lucas De Marchi wrote:

Print the accumulated runtime for client when printing fdinfo.
Each time a query is done it first does 2 things:

1) loop through all the exec queues for the current client and
   accumulate the runtime, per engine class. CTX_TIMESTAMP is used for
   that, being read from the context image.

2) Read a "GPU timestamp" that can be used for considering "how much GPU
   time has passed" and that has the same unit/refclock as the one
   recording the runtime. RING_TIMESTAMP is used for that via MMIO.

Since for all current platforms RING_TIMESTAMP follows the same
refclock, just read it once, using any first engine.

This is exported to userspace as 2 numbers in fdinfo:

drm-cycles-<keystr>: <uint>
drm-total-cycles-<keystr>: <uint>

Userspace is expected to collect at least 2 samples, which allows it to
compute the client engine busyness as per:

           RUNTIME1 - RUNTIME0
busyness = -------------------
                T1 - T0

Another thing to point out is that it's expected that userspace will
read any 2 samples every few seconds.  Given the update frequency of the
counters involved and that CTX_TIMESTAMP is 32 bits, the counter for
each exec_queue can wrap around (assuming 100% utilization) after ~200s.
The wraparound is not perceived by userspace since it's just accumulated
for all the exec_queues in a 64-bit counter, but the measurement will
not be accurate if the samples are too far apart.

This could be mitigated by adding a workqueue to accumulate the counters
every so often, but it's additional complexity for something that is
done already by userspace every few seconds in tools like gputop (from
igt), htop, nvtop, etc with none of them really defaulting to 1 sample
per minute or more.

Signed-off-by: Lucas De Marchi 
---
 Documentation/gpu/drm-usage-stats.rst   |  16 ++-
 Documentation/gpu/xe/index.rst  |   1 +
 Documentation/gpu/xe/xe-drm-usage-stats.rst |  10 ++
 drivers/gpu/drm/xe/xe_drm_client.c  | 138 +++-
 4 files changed, 162 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/gpu/xe/xe-drm-usage-stats.rst

diff --git a/Documentation/gpu/drm-usage-stats.rst 
b/Documentation/gpu/drm-usage-stats.rst
index 6dc299343b48..421766289b78 100644
--- a/Documentation/gpu/drm-usage-stats.rst
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -112,6 +112,17 @@ larger value within a reasonable period. Upon observing a 
value lower than what
 was previously read, userspace is expected to stay with that larger previous
 value until a monotonic update is seen.
+- drm-total-cycles-<keystr>: <uint>
+
+Engine identifier string must be the same as the one specified in the
+drm-cycles-<keystr> tag and shall contain the total number of cycles for the
+given engine.
+
+This is a timestamp in an unspecified GPU unit that matches the update rate
+of drm-cycles-<keystr>. For drivers that implement this interface, the engine
+utilization can be calculated entirely in the GPU clock domain, without
+considering the CPU sleep time between two samples.


Two opens.

1)
Do we need to explicitly document that drm-total-cycles and drm-maxfreq
are mutually exclusive?


so userspace has a fallback mechanism to calculate utilization depending
on what keys are available?



2)
Should drm-total-cycles for now be documented as driver specific?


you mean to call it xe-total-cycles?



I have added some more people on cc who were involved with driver
fdinfo implementations, in case they have an opinion.


I would say potentially yes, and promote it to common if more than one 
driver would use it.


For instance I see panfrost has the driver specific drm-curfreq
(although it isn't documented fully in panfrost.rst). And I have to
say it is somewhat questionable to expose the current frequency per
fdinfo per engine, but not my call.


aren't all the keys in Documentation/gpu/drm-usage-stats.rst optional,
ones a driver may or may not implement? When you say driver-specific I'd
think more of the ones not using "drm-" as prefix, e.g. amd-*.

I think drm-cycles + drm-total-cycles is just an alternative
implementation for engine utilization. Like drm-cycles + drm-maxfreq
already is an alternative to drm-engine and is not implemented by e.g.
amdgpu/i915.

I will submit a new version of the entire patch series to get the ball
rolling, but let's keep this open for now.

<...>


+static void show_runtime(struct drm_printer *p, struct drm_file *file)
+{
+   struct xe_file *xef = file->driver_priv;
+   struct xe_device *xe = xef->xe;
+   struct xe_gt *gt;
+   struct xe_hw_engine *hwe;
+   struct xe_exec_queue *q;
+   unsigned long i, id_hwe, id_gt, capacity[XE_ENGINE_CLASS_MAX] = { };
+   u64 gpu_timestamp, engine_mask = 0;
+   bool gpu_stamp = false;
+
+   xe_pm_runtime_get(xe);
+
+   /* Accumulate all the exec 

Re: [PATCH] drm/xe: Fix UBSAN shift-out-of-bounds failure

2024-05-07 Thread Lucas De Marchi



+Thomas, +Christian, +dri-devel

On Tue, May 07, 2024 at 11:42:46AM GMT, Nirmoy Das wrote:


On 5/7/2024 11:39 AM, Nirmoy Das wrote:



On 5/7/2024 10:04 AM, Shuicheng Lin wrote:

Here is the failure stack:
[   12.988209] [ cut here ]
[   12.988216] UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13
[   12.988232] shift exponent 64 is too large for 64-bit type 'long unsigned 
int'
[   12.988235] CPU: 4 PID: 1310 Comm: gnome-shell Tainted: G U 
6.9.0-rc6+prerelease1158+ #19
[   12.988237] Hardware name: Intel Corporation Raptor Lake Client 
Platform/RPL-S ADP-S DDR5 UDIMM CRB, BIOS RPLSFWI1.R00.3301.A02.2208050712 
08/05/2022
[   12.988239] Call Trace:
[   12.988240]  
[   12.988242]  dump_stack_lvl+0xd7/0xf0
[   12.988248]  dump_stack+0x10/0x20
[   12.988250]  ubsan_epilogue+0x9/0x40
[   12.988253]  __ubsan_handle_shift_out_of_bounds+0x10e/0x170
[   12.988260]  dma_resv_reserve_fences.cold+0x2b/0x48
[   12.988262]  ? ww_mutex_lock_interruptible+0x3c/0x110
[   12.988267]  drm_exec_prepare_obj+0x45/0x60 [drm_exec]
[   12.988271]  ? vm_bind_ioctl_ops_execute+0x5b/0x740 [xe]
[   12.988345]  vm_bind_ioctl_ops_execute+0x78/0x740 [xe]

It is caused by passing 0 as the num_fences parameter to
drm_exec_prepare_obj(). That leads to __rounddown_pow_of_two() computing
"0 - 1", which causes the shift-out-of-bounds. num_fences should be at
least 1.

Cc: Matthew Brost
Signed-off-by: Shuicheng Lin
---
 drivers/gpu/drm/xe/xe_vm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index d17192c8b7de..96cb4d9762a3 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -2692,7 +2692,7 @@ static int vma_lock_and_validate(struct drm_exec *exec, 
struct xe_vma *vma,
if (bo) {
if (!bo->vm)
-   err = drm_exec_prepare_obj(exec, &bo->ttm.base, 0);
+   err = drm_exec_prepare_obj(exec, &bo->ttm.base, 1);


This needs to be fixed in drm_exec_prepare_obj() by checking 
num_fences and not calling dma_resv_reserve_fences()


or just call drm_exec_lock_obj() here. ref: 
https://patchwork.freedesktop.org/patch/577487/


we are hit again by this. Couldn't we change drm_exec_prepare_obj() to
check num_fences and, if it is 0, just fall back to drm_exec_lock_obj()
as "the least amount of work needed in this case"?

Something like this:

| diff --git a/drivers/gpu/drm/drm_exec.c b/drivers/gpu/drm/drm_exec.c
| index 2da094bdf8a4..68b5f6210b09 100644
| --- a/drivers/gpu/drm/drm_exec.c
| +++ b/drivers/gpu/drm/drm_exec.c
| @@ -296,10 +296,12 @@ int drm_exec_prepare_obj(struct drm_exec *exec, struct drm_gem_object *obj,
|  	if (ret)
|  		return ret;
|  
| -	ret = dma_resv_reserve_fences(obj->resv, num_fences);
| -	if (ret) {
| -		drm_exec_unlock_obj(exec, obj);
| -		return ret;
| +	if (num_fences) {
| +		ret = dma_resv_reserve_fences(obj->resv, num_fences);
| +		if (ret) {
| +			drm_exec_unlock_obj(exec, obj);
| +			return ret;
| +		}
|  	}
|  
|  	return 0;


thanks
Lucas De Marchi



Nirmoy



Regards,

Nirmoy


if (!err && validate)
err = xe_bo_validate(bo, xe_vma_vm(vma), true);
}
@@ -2777,7 +2777,7 @@ static int vm_bind_ioctl_ops_lock_and_prep(struct 
drm_exec *exec,
struct xe_vma_op *op;
int err;
-   err = drm_exec_prepare_obj(exec, xe_vm_obj(vm), 0);
+   err = drm_exec_prepare_obj(exec, xe_vm_obj(vm), 1);
if (err)
return err;


[PULL] drm-xe-fixes

2024-05-02 Thread Lucas De Marchi

Hi Dave and Sima,

Please pull the drm-xe-fixes for this week targeting v6.9-rc7.

Two important fixes: one avoiding a use-after-free in the rebind worker
and the other to make display work in ADL-N.

thanks
Lucas De Marchi

drm-xe-fixes-2024-05-02:
- Fix UAF on rebind worker
- Fix ADL-N display integration
The following changes since commit e67572cd2204894179d89bd7b984072f19313b03:

  Linux 6.9-rc6 (2024-04-28 13:47:24 -0700)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-fixes-2024-05-02

for you to fetch changes up to df04b152fca2d46e75fbb74ed79299bc420bc9e6:

  drm/xe/display: Fix ADL-N detection (2024-05-02 11:11:01 -0500)


- Fix UAF on rebind worker
- Fix ADL-N display integration


Lucas De Marchi (1):
  drm/xe/display: Fix ADL-N detection

Matthew Auld (1):
  drm/xe/vm: prevent UAF in rebind_work_func()

 drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h | 3 ++-
 drivers/gpu/drm/xe/xe_vm.c| 3 +++
 2 files changed, 5 insertions(+), 1 deletion(-)


Re: [PATCH] drm/sysfs: Add drm class-wide attribute to get active device clients

2024-05-01 Thread Lucas De Marchi

On Wed, May 01, 2024 at 09:25:01AM GMT, Rob Clark wrote:

On Wed, May 1, 2024 at 9:19 AM Lucas De Marchi  wrote:


On Wed, May 01, 2024 at 04:58:05PM GMT, Tvrtko Ursulin wrote:
>
>Hi,
>
>On 24/04/2024 15:48, Adrián Larumbe wrote:
>>Hi Tvrtko,
>>
>>On 15.04.2024 13:50, Tvrtko Ursulin wrote:
>>>
>>>On 05/04/2024 18:59, Rob Clark wrote:
>>>>On Wed, Apr 3, 2024 at 11:37 AM Adrián Larumbe
>>>> wrote:
>>>>>
>>>>>Up to this day, all fdinfo-based GPU profilers must traverse the entire
>>>>>/proc directory structure to find open DRM clients with fdinfo file
>>>>>descriptors. This is inefficient and time-consuming.
>>>>>
>>>>>This patch adds a new device class attribute that will install a sysfs file
>>>>>per DRM device, which can be queried by profilers to get a list of PIDs for
>>>>>their open clients. This file isn't human-readable, and it's meant to be
>>>>>queried only by GPU profilers like gputop and nvtop.
>>>>>
>>>>>Cc: Boris Brezillon 
>>>>>Cc: Tvrtko Ursulin 
>>>>>Cc: Christopher Healy 
>>>>>Signed-off-by: Adrián Larumbe 
>>>>
>>>>It does seem like a good idea.. idk if there is some precedent to
>>>>prefer binary vs ascii in sysfs, but having a way to avoid walking
>>>>_all_ processes is a good idea.
>>>
>>>I naturally second that it is a needed feature, but I do not think binary
>>>format is justified. AFAIR it should be used for things like hw/fw
>>>standardised tables or firmware images, not when exporting a simple list of
>>>PIDs. It also precludes easy shell/script access and the benefit of avoiding
>>>parsing a short list is I suspect completely dwarfed by needing to parse all
>>>the related fdinfo etc.
>>
>>I'd rather keep it as a binary file for the sake of easily parsing the
>>number list on the client side, in gputop or nvtop. For textual access,
>>there's already a debugfs file that presents the same information, so I
>>thought it was best not to duplicate that functionality and restrict
>>sysfs to serving the very specific use case of UM profilers having to
>>access the DRM client list.
>>
>>I should mention I did something controversial here, which is a semantically
>>binary attribute through the regular attribute interface. I guess if I keep it
>>as a binary attribute in the end, I should switch over to the binary attribute
>>API.
>>
>>Another reason why I implemented it as a binary file is that we can only send
>>back at most a whole page. If a PID takes 4 bytes, that's usually 1024 clients
>>at most, which is probably enough for any UM profiler, but will decrease even
>>more if we turn it into an ASCII readable file.
>
>I'm afraid I still think there is no reason for a binary file, even
>less so artificially limited to 1024 clients. Any consumer will have
>to parse text fdinfo so a binary list of pids is not adding any real
>cost.

yeah, I don't really understand why you'd want the binary number that
you'd then have to turn into a string to open the /proc/<pid>/. To me it
sounds more like we want a text output and that output to be:

	/proc/<pid>/fdinfo/<fd>

So gputop could just read this file to know where the info is.
Too bad we can't symlink cross fs, otherwise we could just add symlinks
to e.g. /sys/class/drm/card<N>/clients/*, which would then nicely
separate it per gpu too.

But see below.

>
>>I did some research into sysfs binary attributes, and while some
>>sources mention that it's often used for dumping or loading of driver
>>FW, none of them claim it cannot be used for other purposes.
>>
>>>>>---
>>>>>   drivers/gpu/drm/drm_internal.h   |  2 +-
>>>>>   drivers/gpu/drm/drm_privacy_screen.c |  2 +-
>>>>>   drivers/gpu/drm/drm_sysfs.c  | 89 ++--
>>>>>   3 files changed, 74 insertions(+), 19 deletions(-)
>>>>>
>>>>>diff --git a/drivers/gpu/drm/drm_internal.h 
b/drivers/gpu/drm/drm_internal.h
>>>>>index 2215baef9a3e..9a399b03d11c 100644
>>>>>--- a/drivers/gpu/drm/drm_internal.h
>>>>>+++ b/drivers/gpu/drm/drm_internal.h
>>>>>@@ -145,7 +145,7 @@ bool drm_master_internal_acquire(struct drm_device 
*dev);
>>>>>   void drm_master_internal_release(struct drm_device *dev);
>>>>>
>>>>>   /* drm_sysfs.c */
>>>>>-extern struct class *drm_class;
>>>>>+extern struct 

Re: [PATCH] drm/sysfs: Add drm class-wide attribute to get active device clients

2024-05-01 Thread Lucas De Marchi
ice *dev)
  {
@@ -128,6 +126,62 @@ static const struct component_ops typec_connector_ops = {

  static CLASS_ATTR_STRING(version, S_IRUGO, "drm 1.1.0 20060810");

+static ssize_t clients_show(struct device *cd, struct device_attribute *attr, 
char *buf)
+{
+   struct drm_minor *minor = cd->driver_data;
+   struct drm_device *ddev = minor->dev;
+   struct drm_file *priv;
+   ssize_t offset = 0;
+   void *pid_buf;
+
+   if (minor->type != DRM_MINOR_RENDER)
+   return 0;


Why this?


I return nothing in case of a non-render node because we don't want display 
drivers
to confuse UM GPU profilers.


Feels too arbitrary to me. Let them handle it.


+
+   pid_buf = kvmalloc(PAGE_SIZE, GFP_KERNEL);


I don't quite get the kvmalloc for just one page (or why even a temporay
buffer and not write into buf directly?).


Should've used kmalloc, you're right. Or else I could just write everything
straight into 'buf'.

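For what it's worth, a quick userspace sketch (not kernel code; the helper
name and layout are hypothetical) of the "write text straight into buf"
variant: one PID per line, stopping cleanly once a page worth of output is
reached, which is the limit a sysfs show() callback has to honor anyway.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PAGE_SIZE 4096	/* sysfs show() buffers are one page */

/*
 * Userspace sketch of a text-based clients_show(): format each client
 * PID on its own line directly into the caller's page-sized buffer.
 * Stops before emitting a partial line when the page would overflow.
 * Returns the number of bytes written, as a show() callback would.
 */
static long format_clients(char *buf, const int *pids, int n)
{
	long off = 0;

	for (int i = 0; i < n; i++) {
		int ret = snprintf(buf + off, PAGE_SIZE - off, "%d\n", pids[i]);

		if (ret < 0 || ret >= PAGE_SIZE - off)
			break;	/* would truncate: drop the partial entry */
		off += ret;
	}
	return off;
}
```

Text output also sidesteps the terminator question below, since newline
delimits the entries.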

+   if (!pid_buf)
+   return 0;
+
+   mutex_lock(&ddev->filelist_mutex);
+   list_for_each_entry_reverse(priv, &ddev->filelist, lhead) {
+   struct pid *pid;
+
+   if (drm_WARN_ON(ddev, (PAGE_SIZE - offset) < sizeof(pid_t)))
+   break;


Feels bad.. I would suggest exploring implementing a read callback (instead of
show) and handling arbitrary size output.


I think regular class attributes can only implement show() and store(). For
a more complex interface, I would have to turn it into an actual binary
attribute, and that would be the only choice if we want the list of clients
to be of arbitrary size.


Yeah, i915 uses that to dump the error capture file, which can be huge and
is text, so it is doable.

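To illustrate why a read callback removes the single-page limit: sysfs
invokes a binary attribute's .read() repeatedly with a growing offset until
it returns 0, so the callback only ever has to produce a window of the
data. A userspace analogue (hypothetical, just to show the offset/count
contract, with a plain string standing in for the attribute contents):

```c
#include <assert.h>
#include <string.h>

/*
 * Userspace analogue of a sysfs binary-attribute .read() callback:
 * the caller invokes it repeatedly with an increasing offset until it
 * returns 0, so output of arbitrary size works without the one-page
 * limit of a show() method.
 */
static long attr_read(const char *contents, char *buf, long off, long count)
{
	long size = (long)strlen(contents);

	if (off >= size)
		return 0;	/* EOF: no more data past the end */
	if (count > size - off)
		count = size - off;
	memcpy(buf, contents + off, count);
	return count;
}
```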


+
+   rcu_read_lock();
+   pid = rcu_dereference(priv->pid);
+   (*(pid_t *)(pid_buf + offset)) = pid_vnr(pid);
+   rcu_read_unlock();
+
+   offset += sizeof(pid_t);
+   }
+   mutex_unlock(&ddev->filelist_mutex);
+
+   if (offset < PAGE_SIZE)
+   (*(pid_t *)(pid_buf + offset)) = 0;


Either zero-terminated, or exactly PAGE_SIZE/sizeof(pid_t) entries with no
terminator at all; that feels weird, if I got it right.

For me everything points towards going for text output.


Yes, I know it might sound weird, but my reasoning was: either there are
PAGE_SIZE/sizeof(pid_t) entries and the file isn't zero-terminated (which
should be picked up by clients as being one page worth of data, the sysfs
attribute maximum transfer unit), or else there aren't enough entries to
fill a page and after the last one there's a zero entry.

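Assuming that layout, the consumer side would read one page and take
entries up to the first zero PID or the end of the bytes read; a small
userspace sketch (hypothetical, just to pin down the parsing rule):

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

#define PAGE_SIZE 4096

/*
 * Sketch of the consumer-side rule for the proposed binary layout:
 * take consecutive pid_t entries until either a zero entry (short
 * list) or the end of the bytes read (a full page means the list
 * filled the sysfs transfer unit). Returns the number of PIDs found.
 */
static size_t parse_pid_list(const pid_t *buf, size_t nbytes, pid_t *out)
{
	size_t n = 0, max = nbytes / sizeof(*buf);

	while (n < max && buf[n] != 0) {
		out[n] = buf[n];
		n++;
	}
	return n;
}
```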


+
+   memcpy(buf, pid_buf, offset);
+
+   kvfree(pid_buf);
+
+   return offset;
+
+}
+static DEVICE_ATTR_RO(clients);



/proc//fdinfo/ is only readable by the owner. If we report what the open
fds are (or even which pids have a drm fd open), we are leaking that info.
So we should probably make this DEVICE_ATTR_ADMIN_RO.

Lucas De Marchi



Shouldn't BIN_ATTR_RO be used for binary files in sysfs?


Like I said above, I sort of faked a binary attribute through the regular
sysfs attr API, which is most likely a bad idea.


Regards,

Tvrtko

P.S. Or maybe it is time for drmfs, where each client gets a directory and
drivers can populate files, such as per-client logging streams and whatnot.


Yes, but maybe this is something we can discuss in depth in an RFC at a
later time?


Yes of course, it is just a long standing idea for flexible per client 
stuff.


Regards,

Tvrtko




+
+static struct attribute *drm_device_attrs[] = {
+   &dev_attr_clients.attr,
+   NULL,
+};
+ATTRIBUTE_GROUPS(drm_device);
+
+struct class drm_class = {
+   .name   = "drm",
+   .dev_groups = drm_device_groups,
+};
+
+static bool drm_class_initialised;
+
  /**
   * drm_sysfs_init - initialize sysfs helpers
   *
@@ -142,18 +196,19 @@ int drm_sysfs_init(void)
  {
 int err;

-   drm_class = class_create("drm");
-   if (IS_ERR(drm_class))
-   return PTR_ERR(drm_class);
+   err = class_register(&drm_class);
+   if (err)
+   return err;

-   err = class_create_file(drm_class, &class_attr_version.attr);
+   err = class_create_file(&drm_class, &class_attr_version.attr);
 if (err) {
-   class_destroy(drm_class);
-   drm_class = NULL;
+   class_destroy(&drm_class);
 return err;
 }

-   drm_class->devnode = drm_devnode;
+   drm_class.devnode = drm_devnode;
+
+   drm_class_initialised = true;

 drm_sysfs_acpi_register();
 return 0;
@@ -166,12 +221,12 @@ int drm_sysfs_init(void)
   */
  void drm_sysfs_destroy(void)
  {
-   if (IS_ERR_OR_NULL(drm_class))
+   if (!drm_class_initialised)
 return;
 dr

Re: [PATCH] nouveau: Add missing break statement

2024-04-30 Thread Lucas De Marchi

On Tue, Apr 30, 2024 at 06:48:40PM GMT, Chaitanya Kumar Borah wrote:

Add the missing break statement to fix the following build error:

  CC [M]  drivers/gpu/drm/i915/display/intel_display_device.o
../drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c: In function 
‘build_registry’:
../drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1266:3: error: label 
at end of compound statement
1266 |   default:
  |   ^~~
  CC [M]  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.o
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/i915_reg.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/imu_v11_0.o
make[7]: *** [../scripts/Makefile.build:244: 
drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.o] Error 1
make[7]: *** Waiting for unfinished jobs

Fixes: b58a0bc904ff ("nouveau: add command-line GSP-RM registry support")
Closes: 
https://lore.kernel.org/all/913052ca6c0988db1bab293cfae38529251b4594.ca...@nvidia.com/T/#m3c9acebac754f2e74a85b76c858c093bb1aacaf0


Just found a previous bug report, so we can add here:

Closes: 
https://lore.kernel.org/all/ca+g9fyu7ug0k8h9qjt0wbtwh_ll9juc+vc0wmu_z_vsspdn...@mail.gmail.com/



Signed-off-by: Chaitanya Kumar Borah 
---
drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index 0b46db5c77b8..63619512e7f6 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -1264,6 +1264,7 @@ static void build_registry(struct nvkm_gsp *gsp, 
PACKED_REGISTRY_TABLE *registry
str_offset += reg->vlen;
break;
default:
+   break;



since reg->type is an enum and we are handling all the values, another
possible approach is to remove the default handling; then the compiler
will warn about missing cases when building with -Wswitch.

Any of the approaches seem good to me.
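As an aside, the -Wswitch variant is easy to demonstrate standalone
(hypothetical enum, not the actual nouveau registry code): with every
enumerator handled and no default label, the build stays quiet, and adding
a new enumerator later produces a warning at every switch that misses it.

```c
#include <assert.h>

/* Hypothetical enum standing in for the registry entry type. */
enum reg_type { REG_DWORD, REG_STRING };

/*
 * No default label on purpose: with -Wswitch (enabled by -Wall), the
 * compiler warns here as soon as a new enum reg_type value is added
 * but not handled, instead of silently falling through a default.
 */
static int entry_size(enum reg_type t, int vlen)
{
	int size = 0;

	switch (t) {
	case REG_DWORD:
		size = 4;
		break;
	case REG_STRING:
		size = vlen;
		break;
	}
	return size;
}
```

Building with `gcc -Wall` then flags any switch that misses a newly added
enumerator.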

Reviewed-by: Lucas De Marchi 

Lucas De Marchi


}

i++;
--
2.25.1



Re: [PATCH] [v7] nouveau: add command-line GSP-RM registry support

2024-04-30 Thread Lucas De Marchi

On Fri, Apr 26, 2024 at 06:08:19PM GMT, Danilo Krummrich wrote:

On 4/25/24 18:38, Timur Tabi wrote:

On Thu, 2024-04-25 at 15:22 +0200, Danilo Krummrich wrote:

+   size_t length;
+
+   /* Remove any whitespace from the parameter string */
+   length = strip(p, " \t\n");


With that, I see the following warning compiling this patch.

warning: variable ‘length’ set but not used [-Wunused-but-set-variable]

Did you intend to use the length for anything?


No, and I could have sworn I fixed that before sending out v7.  I think I
originally intended 'length' to determine when I finished parsing the
string.


Also, looking at the warning made me aware of 'p' potentially being NULL.

If you agree, I can fix the warning and add the corresponding NULL check
when
applying the patch.


Yes, that would be great.  You can just delete 'length'.  The NULL check for
'p' should call clean_registry() before returning -ENOMEM.


Applied to drm-misc-next, thanks!



since this commit our CI is failing to build with the following error:

  CC [M]  drivers/gpu/drm/i915/display/intel_display_device.o
../drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c: In function 
‘build_registry’:
../drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1266:3: error: label 
at end of compound statement
1266 |   default:
  |   ^~~
  CC [M]  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.o
  HDRTEST drivers/gpu/drm/xe/compat-i915-headers/i915_reg.h
  CC [M]  drivers/gpu/drm/amd/amdgpu/imu_v11_0.o
make[7]: *** [../scripts/Makefile.build:244: 
drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.o] Error 1
make[7]: *** Waiting for unfinished jobs

you are missing a `break;` after that default.






Thanks for catching this.




Re: [PATCH 1/2] drm: print top commit sha after loading the driver

2024-04-29 Thread Lucas De Marchi

On Mon, Apr 29, 2024 at 02:02:28PM GMT, Jani Nikula wrote:

On Wed, 24 Apr 2024, Farah Kassabri  wrote:

Add the last driver sha to the existing log message
which prints the drm device info.


The commit message fails to answer the most important question: why?

Especially, what makes drm devices such special snowflakes that they'd
need to be the only ones in the kernel to print git commit sha1's?



the closest to what was added here would be srcversion:

$ modinfo build64/drivers/gpu/drm/xe/xe.ko  | grep srcversion
srcversion: 0EA08A43AC399A0D942740

which makes more sense and doesn't depend on the git tree checkout and
other possible problems with dirty checkouts.  If you want to print it
on module load to be able to tell from a log, you could probably just
access mod->srcversion.

but I'm not sure what we are trying to accomplish here and the commit
message certainly missed that. Please Cc dri-devel on changes outside xe
and provide the motivation in the commit message.

thanks
Lucas De Marchi




BR,
Jani.



Signed-off-by: Farah Kassabri 
---
 drivers/gpu/drm/drm_drv.c | 6 +++---
 include/drm/drm_drv.h | 2 ++
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 535b624d4c9d..e0f7af1b6ec3 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -947,10 +947,10 @@ int drm_dev_register(struct drm_device *dev, unsigned 
long flags)
}
drm_panic_register(dev);

-   DRM_INFO("Initialized %s %d.%d.%d %s for %s on minor %d\n",
+   DRM_INFO("Initialized %s %d.%d.%d%s %s for %s on minor %d\n",
 driver->name, driver->major, driver->minor,
-driver->patchlevel, driver->date,
-dev->dev ? dev_name(dev->dev) : "virtual device",
+driver->patchlevel, driver->git_sha ? driver->git_sha : "",
+driver->date, dev->dev ? dev_name(dev->dev) : "virtual device",
 dev->primary ? dev->primary->index : dev->accel->index);

goto out_unlock;
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index 8878260d7529..7578a1f4ce74 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -407,6 +407,8 @@ struct drm_driver {
int minor;
/** @patchlevel: driver patch level */
int patchlevel;
+   /** @git_sha: driver last commit sha */
+   char *git_sha;
/** @name: driver name */
char *name;
/** @desc: driver description */


--
Jani Nikula, Intel


Re: [PATCH] MAINTAINERS: Move the drm-intel repo location to fd.o GitLab

2024-04-26 Thread Lucas De Marchi

On Wed, Apr 24, 2024 at 01:41:59PM GMT, Ryszard Knop wrote:

The drm-intel repo is moving from the classic fd.o git host to GitLab.
Update its location with a URL matching other fd.o GitLab kernel trees.

Signed-off-by: Ryszard Knop 


Acked-by: Lucas De Marchi 

Also Cc'ing maintainers


---
MAINTAINERS | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index d6327dc12cb1..fbf7371a0bb0 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10854,7 +10854,7 @@ W:  https://drm.pages.freedesktop.org/intel-docs/
Q:  http://patchwork.freedesktop.org/project/intel-gfx/
B:  https://drm.pages.freedesktop.org/intel-docs/how-to-file-i915-bugs.html
C:  irc://irc.oftc.net/intel-gfx
-T: git git://anongit.freedesktop.org/drm-intel
+T: git https://gitlab.freedesktop.org/drm/i915/kernel.git
F:  Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
F:  Documentation/gpu/i915.rst
F:  drivers/gpu/drm/ci/xfails/i915*
--
2.44.0


