space will no longer
be able to access reset queues.
v2: move per-queue reset flag to this patch
rebase based on patch 1 changes
Signed-off-by: Jonathan Kim
---
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 31 ---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 1 +
include/uapi
call safe during power saving modes.
clean up some other nitpicks.
Signed-off-by: Jonathan Kim
---
.../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c | 2 +
.../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 4 +-
.../drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c | 4 +-
.../drm/amd/amdgpu
and refactor sdma resource
bit setting logic.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 16 ++
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 38 +-
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 5 +-
.../amd/amdkfd
The number of watchpoints should be set and constrained per logical
partition device, not by the socket device.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 20 ++--
drivers/gpu/drm/amd/amdkfd/kfd_device.c | 4 ++--
drivers/gpu/drm/amd/amdkfd
Certain GPUs have better copy performance over xGMI on specific
SDMA engines depending on the source and destination GPU.
Allow users to create SDMA queues on these recommended engines.
Close to 2x overall performance has been observed with this
optimization.
Signed-off-by: Jonathan Kim
by SET_RESOURCES first to identify the user queue
candidates to reset.
Only signal reset events to processes that have had a queue reset.
If queue reset fails, fall back to GPU reset.
Signed-off-by: Jonathan Kim
---
.../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c | 1 +
.../drm/amd/amdgpu
space will no longer
be able to access reset queues.
Signed-off-by: Jonathan Kim
---
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 30 +++
include/uapi/linux/kfd_ioctl.h| 4 +++
2 files changed, 29 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd
MES internally has a timeout allowance of 2 seconds.
Increase driver timeout to 3 seconds to be safe.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
b/drivers/gpu
Due to a CP interrupt bug, bad packet garbage exception codes are raised.
Do a range check so that the debugger and runtime do not receive garbage
codes.
Update the user api to guard exception code type checking as well.
Signed-off-by: Jonathan Kim
Tested-by: Jesse Zhang
---
.../gpu/drm/amd
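
The range check described in the patch above can be sketched in user-space C. This is only an illustration under stated assumptions, not the driver code: the enum `demo_ec_code`, its bounds, and the helpers `demo_ec_is_valid` and `demo_ec_sanitize` are hypothetical names invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical exception codes; the real list lives in the KFD uapi header. */
enum demo_ec_code {
    DEMO_EC_NONE = 0,          /* "no exception" sentinel */
    DEMO_EC_QUEUE_A,
    DEMO_EC_QUEUE_B,
    DEMO_EC_DEVICE_A,
    DEMO_EC_MAX                /* one past the last valid code */
};

/* A code is valid only if it lies strictly between the two sentinels. */
bool demo_ec_is_valid(uint32_t code)
{
    return code > DEMO_EC_NONE && code < DEMO_EC_MAX;
}

/* Map anything out of range to DEMO_EC_NONE so garbage is never forwarded. */
uint32_t demo_ec_sanitize(uint32_t raw)
{
    return demo_ec_is_valid(raw) ? raw : DEMO_EC_NONE;
}
```

A caller would pass every raw code from the interrupt path through `demo_ec_sanitize` before raising an event, so the debugger and runtime only ever see codes inside the known range.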
Prevent dropping the KFD process reference at the end of a debug
IOCTL call where the acquired process value is an error.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
b
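
The reference-handling rule in the patch above (do not drop a reference that was never taken) can be sketched as follows. This is a simplified user-space model: `demo_process`, `demo_lookup`, `demo_release`, and the error-pointer helpers are hypothetical and only mirror the kernel's error-pointer idiom, they are not the KFD functions.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

struct demo_process {
    int refcount;
};

/* Encode a negative errno value in a pointer, as the kernel idiom does. */
static inline void *demo_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* True if the pointer value falls in the top DEMO_MAX_ERRNO addresses. */
static inline int demo_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Returns a referenced process on success, or an encoded error on failure. */
struct demo_process *demo_lookup(struct demo_process *p, int fail)
{
    if (fail)
        return demo_err_ptr(-3);
    p->refcount++;
    return p;
}

/* End-of-call cleanup: only drop the reference if the lookup succeeded. */
void demo_release(struct demo_process *target)
{
    if (target == NULL || demo_is_err(target))
        return;
    target->refcount--;
}
```

Without the `demo_is_err` guard, the cleanup path would dereference an encoded error value and decrement a count that was never incremented.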
Fix up the MES process context flush to prevent non-MES devices from
spamming error messages or running into undefined behaviour during
process termination.
Fixes: 73204d028eb5 ("drm/amdkfd: fix mes set shader debugger process
management")
Signed-off-by: Jonathan Kim
---
drivers/g
that the flush call and the MES debugger calls use the same MES
interface but are separated as KFD calls to avoid conflicting with each
other.
Signed-off-by: Jonathan Kim
Tested-by: Alice Wong
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 31 +++
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
GC IP 9.4.2 and up support TA reporting of the number
of xGMI links between peers.
Tested-by: Vignesh Chander
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu
handling and running KFD tests.
The only time ADD_QUEUE.skip_process_ctx_clear is required is for
debugger use cases where a debugged process is always runtime enabled
when adding a queue.
Tested-by: Shikai Guo
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 6
adding a queue.
Tested-by: Shikai Guo
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
b/drivers/gpu/drm/amd/amdkfd
Remove redundant assignment when skipping process ctx clear.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
b/drivers/gpu/drm/amd/amdkfd
The MES cached process context must be cleared on adding any queue for
the first time.
For proper debug support, the MES will clear its cached process context
on the first call to SET_SHADER_DEBUGGER.
This allows TTMPs to be persistently enabled in a safe manner.
Signed-off-by: Jonathan Kim
do not want these to be cooperative
dispatches.
v2: fix up indentation and comments.
remove unnecessary perf warning on oversubscription.
change 0 init to 0 memset to deal with padding.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 2 ++
drivers/gpu/drm
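
The v2 note above about switching from a zero initializer to a zero memset can be illustrated with a small sketch. The struct `demo_input` and both helpers are hypothetical, chosen only to show a layout where a compiler typically inserts padding. The point is that `memset` writes every byte of the object, padding included, which a brace initializer is not relied upon to do.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical input struct: most ABIs pad between flag and value. */
struct demo_input {
    uint8_t  flag;
    uint32_t value;
};

/* Zero the whole object, including any padding bytes. */
void demo_input_init(struct demo_input *in)
{
    memset(in, 0, sizeof(*in));
}

/* Returns 1 if every byte in the buffer is zero, otherwise 0. */
int demo_all_bytes_zero(const void *buf, size_t len)
{
    const unsigned char *b = buf;

    for (size_t i = 0; i < len; i++) {
        if (b[i] != 0)
            return 0;
    }
    return 1;
}
```

This matters when the struct is handed to firmware or copied out byte for byte, because stale padding would otherwise travel along with the named fields.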
Update the list of devices that require the cwsr trap handling
workaround for debugging use cases.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_debug.c| 5 ++---
drivers/gpu/drm/amd/amdkfd/kfd_debug.h| 6 ++
drivers/gpu/drm/amd/amdkfd
do not want these to be cooperative
dispatches.
NOTE: FIXME MES FW enablement checks are a placeholder at the moment and
will be updated when the binary revision number is finalized.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 2 +-
drivers/gpu/drm
-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index 1a4cdee86759..eeedc3ddffeb 100644
--- a/drivers/gpu/drm/amd/am
Queue count should decrement on queue destruction regardless of HWS
support type.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
b
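
The bookkeeping rule in the patch above can be sketched as a toy model. The struct `demo_dqm` and its fields are hypothetical. The sketch only shows the shape of the fix: the scheduler-specific teardown is conditional, the counter update is not.

```c
#include <stdbool.h>

/* Hypothetical queue manager state. */
struct demo_dqm {
    int  queue_count;      /* number of live queues */
    bool uses_hws;         /* whether a hardware scheduler is in use */
    int  hws_unmap_calls;  /* stand-in for scheduler-specific teardown */
};

void demo_create_queue(struct demo_dqm *dqm)
{
    dqm->queue_count++;
}

void demo_destroy_queue(struct demo_dqm *dqm)
{
    /* Only the teardown step depends on the scheduler type. */
    if (dqm->uses_hws)
        dqm->hws_unmap_calls++;

    /* The count is decremented on every destruction path. */
    dqm->queue_count--;
}
```

If the decrement sat inside the conditional, one configuration would leak a count on every destroyed queue.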
Null check should be done on queue struct itself and not on the
process queue list node.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
b/drivers/gpu/drm/amd
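
The null-check fix above can be sketched with a hypothetical list layout. `demo_queue` and `demo_pqn` are illustrative types, and the assumption that a list node may exist without a user queue attached is part of the sketch, not something stated in the snippet. Testing the node pointer is not enough, because the loop already guarantees it is non-NULL; the pointer that may be NULL is the queue the node carries.

```c
#include <stddef.h>

struct demo_queue {
    int id;
};

/* Hypothetical process queue list node; q may be NULL for some nodes. */
struct demo_pqn {
    struct demo_queue *q;
    struct demo_pqn   *next;
};

/* Sum the ids of all queues in the list, skipping nodes without a queue. */
int demo_sum_queue_ids(const struct demo_pqn *head)
{
    int sum = 0;

    for (const struct demo_pqn *pqn = head; pqn != NULL; pqn = pqn->next) {
        if (pqn->q == NULL)     /* check the queue, not the node */
            continue;
        sum += pqn->q->id;
    }
    return sum;
}
```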
. Once the binaries have been created, this check may
be subject to change.
v2: do a trap_en safety check in case old mes doesn't accept
unused trap_en d-word.
remove unnecessary process termination work around.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c| 7
access issues.
Remove KFD GFX OFF enable toggle clutter by moving these calls into the
KGD debug calls themselves.
v2: toggle gfx off around address watch hi/lo settings as well.
Signed-off-by: Jonathan Kim
---
.../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c | 4 +++
.../drm/amd/amdgpu
to change.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 5 ++-
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 4 ++-
drivers/gpu/drm/amd/amdgpu/mes_v11_0.c| 1 +
drivers/gpu/drm/amd/amdkfd/kfd_debug.c| 31 ++-
.../drm/amd/amdkfd
Exception handling for vmfaults should be raised with additional data.
Reported-by: Mukul Joshi
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_events.c | 34 +++--
1 file changed, 20 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd
.
Also allow the debugger to clear exceptions when doing a snapshot.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 6 +++
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 36 +
.../drm/amd/amdkfd/kfd_device_queue_manager.h
of clearing the target exception on query.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 7 ++
drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 120 +++
drivers/gpu/drm/amd/amdkfd/kfd_debug.h | 6 ++
3 files changed, 133
a subsequent
successful call.
v2: add num_xcc to device snapshot and fixup new kfd_node reference
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 7 ++-
drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 73
drivers/gpu/drm/amd/amdkfd/kfd_debug.h
Allow the debugger to query a single queue, device and process
exception.
The KFD should also return the GPU or Queue id of the exception.
The debugger also has the option of clearing exceptions after
being queried.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm
. This is because the IV from SQ interrupts is
packed into a new contiguous format unlike GFX9. To make this clear,
a separate interrupt handling code file was created.
v2: use new kfd_node struct in prototypes.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
The debugger subscribes to notification for requested exceptions on attach.
Allow the debugger to change its subscription later on.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 3 ++
drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 36
Bump the minor version to declare debugging capability is now
available.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 1 -
include/uapi/linux/kfd_ioctl.h | 3 ++-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git
Allow the debugger to set wave behaviour to either operate normally,
halt at launch, trap on every instruction, terminate immediately or
stall on allocation.
v2: fixup with new kfd_node struct reference for mes check
Signed-off-by: Jonathan Kim
---
.../drm/amd/amdgpu
watch points are allocated or not.
v2: fixup with new kfd_node struct reference for mes and watch point
checks
Signed-off-by: Jonathan Kim
---
.../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c | 51 +++
.../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 2 +
.../drm/amd/amdgpu
engine, return the runtime status
as enabled but with an error.
In addition, like any other multi-process debug supported device,
disable trap temporary setup per-process to avoid performance impact from
setup overhead.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm
From: Jay Cornwall
Trap handler behavior will differ when a debugger is attached.
Make the debug trap flag available in the trap handler TMA.
Update it when the debug trap ioctl is invoked.
Signed-off-by: Jay Cornwall
Reviewed-by: Felix Kuehling
Signed-off-by: Jonathan Kim
Reviewed
.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 32 +++
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 20
drivers/gpu/drm/amd/amdgpu/mes_v11_0.c| 12 +++
drivers/gpu/drm/amd/include
cise at the cost of performance. This setting is not
permitted on debug devices that support only a global setting of this
option.
Return the previous set flags to the debugger as well.
v2: fixup with new kfd_node struct reference mes checks
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/am
Implement the per-device calls to enable or disable HW debug mode
for GFX11.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
.../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c| 38 +++
1 file changed, 38 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu
be overridden or fully replaced.
In order for the debugger to know what is permissible, return the
supported override mask back to the debugger along with the previously
enabled overrides.
v2: fixup with new kfd_node struct reference for mes check
Signed-off-by: Jonathan Kim
---
.../drm/amd
suspend or
resume queues).
v2: fixup new kfd_node struct reference for mes fw check.
also fixup missing EC_QUEUE_NEW flagging on newly created queue.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c| 5 +
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h| 1 +
drivers
.
For runtime exceptions, this will unblock the runtime enable
function which will be explained and implemented in a follow up
patch.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
.../gpu/drm/amd/amdkfd/cik_event_interrupt.c | 4 +-
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
.
For memory violation exceptions, extra exception data will be saved.
The debugger will be able to query the saved exception states through a
query operation that will be provided by follow up patches.
v2: use new kfd_node struct in prototype.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd
the required register values that the HWS needs to write on debug enable
and disable.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
.../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c | 42 ++-
1 file changed, 41 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm
functions are implemented in a follow up patch.
v2: spot fix with new kfd_node references
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 5 +
drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 148 ++-
drivers/gpu/drm/amd/amdkfd/kfd_debug.h | 29
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 143 ++-
drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 6 +-
drivers/gpu/drm/amd/amdkfd/kfd_debug.h | 4 +
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 1 +
4 files changed, 150 insertions(+), 4
SET_RESOURCES so that a debugged
process will never migrate away from its pinned VMID.
The KFD is responsible for reserving and releasing this pinned VMID
accordingly whenever the debugger attaches and detaches respectively.
v2: spot fix ups using new kfd_node references
Signed-off-by: Jonathan Kim
Flush delayed restore work in kfd_suspend_all_queues instead of
cancelling. Cancelling the work before it runs results in the queues
becoming permanently disabled. Flushing the work ensures that the
queue suspend/resume state stays balanced.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix
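
The difference between cancelling and flushing the delayed restore work, as described in the patch above, can be shown with a single-threaded toy model. Nothing here uses a real workqueue: `demo_proc`, the pending flag, and every function are hypothetical stand-ins. The model assumes each eviction is undone only by the restore work running.

```c
#include <stdbool.h>

struct demo_proc {
    int  evict_count;      /* greater than zero means queues are disabled */
    bool restore_pending;  /* a delayed restore has been scheduled */
};

/* Evicting disables the queues and schedules a delayed restore. */
void demo_evict(struct demo_proc *p)
{
    p->evict_count++;
    p->restore_pending = true;
}

/* The restore work is the only place the eviction is undone. */
static void demo_restore_work(struct demo_proc *p)
{
    p->evict_count--;
    p->restore_pending = false;
}

/* Cancel: discard the pending work without running it. */
void demo_cancel_restore(struct demo_proc *p)
{
    p->restore_pending = false;
}

/* Flush: run the pending work to completion before returning. */
void demo_flush_restore(struct demo_proc *p)
{
    if (p->restore_pending)
        demo_restore_work(p);
}

bool demo_queues_enabled(const struct demo_proc *p)
{
    return p->evict_count == 0;
}
```

In this model, cancelling after an eviction leaves `evict_count` stuck above zero, so the queues never come back, whereas flushing lets the restore run and keeps the suspend and resume counts balanced.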
changing the implicit wait count setting. Once set, resume all work.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 3 +
.../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 116 ++
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
.
v2: add null grace period function pointers to VI packet manager.
Signed-off-by: Jonathan Kim
---
.../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c | 2 +
.../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 2 +
.../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 43
.../drm/amd/amdgpu
.
v2: spot fixup new kfd_node references
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_debug.h| 5 ++
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 51 +++
.../drm/amd/amdkfd/kfd_device_queue_manager.h | 3 ++
.../drm/amd/amdkfd/kfd_packet_manager_v9.c
will be fixed for GFX11 onwards.
Also remove a bunch of deprecated misplaced references for GFX10.3.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
.../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 96
.../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.h| 28
.../drm/amd
rder to correctly set this up, set the special reserved CP bit by
default whenever the MQD is initialized.
v2: add missing 0-init of SPI_GDBG_TRAP_DATA0/1
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c| 26 +++
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
inheritance
of that mode is upheld.
Also ensure that exception overrides are reset to their original state
prior to debug enable or disable.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 92 +++
.../gpu/drm/amd/amdgpu
Introduce the required KGD debug calls that will execute hardware debug
mode setting.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
.../gpu/drm/amd/include/kgd_kfd_interface.h | 34 +++
1 file changed, 34 insertions(+)
diff --git a/drivers/gpu/drm/amd/include
events will notify the debugger through a pollable FIFO
file descriptor that the debugger provides to the KFD to manage.
Finally on process termination of either the debugger or the target,
debugging must be disabled if this has not already been done.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix
coordinates exception handling with the
HSA runtime.
Usage is available in the kern docs at uapi/linux/kfd_ioctl.h.
v2: add num_xcc to device snapshot entry.
fixup missing EC_QUEUE_PACKET_RESERVED mask.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 48 ++
include
-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 101 --
drivers/gpu/drm/amd/amdkfd/kfd_topology.h | 6 ++
include/uapi/linux/kfd_sysfs.h| 15
3 files changed, 117 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd
access issues.
Remove KFD GFX OFF enable toggle clutter by moving these calls into the
KGD debug calls themselves.
Signed-off-by: Jonathan Kim
---
.../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 7
.../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 33 ++-
.../gpu/drm/amd/amdgpu
Bump the minor version to declare debugging capability is now
available.
v2: bump to 1.13 after upstream rebase.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 1 -
include/uapi/linux/kfd_ioctl.h | 3 ++-
2 files changed, 2
-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
.../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c | 51 +++
.../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 2 +
.../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 78 ++
.../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.h| 8 ++
.../drm
flag setup on APUs
Signed-off-by: Jay Cornwall
Reviewed-by: Felix Kuehling
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 11 +++
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 2 ++
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 15 +++
3 files changed
and remove deprecated launch mode options
Signed-off-by: Jonathan Kim
---
.../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c | 12 +++
.../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 1 +
.../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 25 +
.../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.h
suspend or
resume queues).
v3: update safer copy context save header
v2: add gfx11/mes support.
prevent header copy on suspend from overwriting user fields.
simplify resume_queues function.
address other nit-picks
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c| 5
flag for now.
v2: add gfx11 support.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 +
drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 58
drivers/gpu/drm/amd/amdkfd/kfd_debug.h | 1 +
3 files changed, 61 insertions(+)
diff --git a/drivers/gpu
for runtime_enable.
v2: fix up hierarchy of semantics in description.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 143 ++-
drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 6 +-
drivers/gpu/drm/amd/amdkfd/kfd_debug.h | 4 +
drivers/gpu/drm/amd/amdkfd
.
For runtime exceptions, this will unblock the runtime enable
function which will be explained and implemented in a follow up
patch.
v2: missing closing brace in set workaround function got fixed
in patch 17.
Signed-off-by: Jonathan Kim
---
.../gpu/drm/amd/amdkfd/cik_event_interrupt.c | 4
for queue and device snapshot.
change device snapshot implementation to match queue snapshot
implementation.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 7 ++-
drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 72
drivers
fw checks. remove asic family name comments.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 5 +
drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 148 ++-
drivers/gpu/drm/amd/amdkfd/kfd_debug.h | 29 +
drivers/gpu/drm/amd/amdkfd/kfd_process.c
-by: Jonathan Kim
---
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 93 +++
.../drm/amd/amdkfd/kfd_device_queue_manager.h | 5 +
.../drm/amd/amdkfd/kfd_packet_manager_v9.c| 9 ++
.../gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h | 5 +-
4 files changed, 111 insertions(+), 1
-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 3 +
.../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 116 ++
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 +-
3 files changed, 121 insertions(+), 2 deletions(-)
diff --git
of clearing the target exception on query.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 7 ++
drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 120 +++
drivers/gpu/drm/amd/amdkfd/kfd_debug.h | 6 ++
3 files changed, 133
buf_size arg to num_queues for clarity.
fix minimum entry size calculation.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 6 +++
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 36 +
.../drm/amd/amdkfd
application.
disable debugging for now on gfx11 due to broken fw.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 2 +
drivers/gpu/drm/amd/amdgpu/mes_v11_0.c| 7 +--
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 -
drivers/gpu/drm/amd/amdkfd/kfd_debug.c
.
v3: remove unneeded comment. also add missing kfd_debug.h include
in dqm file.
v2: remove asic family code name comment in per vmid support check
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_debug.h| 5 ++
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 51
.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 32 +++
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 20
drivers/gpu/drm/amd/amdgpu/mes_v11_0.c| 12 +++
drivers/gpu/drm/amd/include
The debugger subscribes to notification for requested exceptions on attach.
Allow the debugger to change its subscription later on.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 3 ++
drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 36
v3: v2 was reviewed but requesting re-review for the added GFX11 support.
v2: switch unsupported override mode return from EPERM to EINVAL to
support unique EPERM on PTRACE failure.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
.../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c | 47
Allow the debugger to query a single queue, device and process
exception.
The KFD should also return the GPU or Queue id of the exception.
The debugger also has the option of clearing exceptions after
being queried.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm
Implement the per-device calls to enable or disable HW debug mode
for GFX11.
v2: remove unneeded ioctl reference and fix types and comment formats.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
.../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c| 38 +++
1 file
on queue create during -ERESTARTSYS.
fix up macros naming for ECODE parsing.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c| 16 +
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h| 2 +
drivers/gpu/drm/amd/amdkfd/Makefile | 1
-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 104 +++
drivers/gpu/drm/amd/amdkfd/kfd_debug.h | 7 ++
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 10 +++
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 2 +
4 files changed, 123 insertions(+)
diff --git a/drivers/gpu
-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 1e3795e7e18d..55a4ddd35e12 100644
--- a/drivers/gpu/drm/amd/amdkfd
lock renaming.
add comments to explain ignored arguments for debug trap enable and
disable.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 92 +++
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.h | 9 ++
2 files changed
.
- remove asic family code name comments in firmware support checking
- add gfx11 requirements in fw support checks and debug props and caps
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 101 --
drivers/gpu/drm/amd/amdkfd
init for gfx11.
add trap on wave start and end registers for gfx11.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c| 26 +++
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c| 1 +
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
.
v2: clarify purpose in the description of this patch
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
.../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c | 2 +
.../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 2 +
.../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 43
.../drm/amd
will be fixed for GFX11 onwards.
Also remove a bunch of deprecated misplaced references for GFX10.3.
v2: fix 'boundaray' typo in description and added gfx10 kgd2kfd header
to avoid kern bot missing prototype complaint.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
.../drm/amd/amdgpu
the required register values that the HWS needs to write on debug enable
and disable.
v3: fix typo and comment format kern bot complaint.
add back cu occupancy that was removed by mistake.
v2: add commentary on unused restore_dbg_registers for debug enable.
Signed-off-by: Jonathan Kim
Reviewed
and disable).
Also remove unneeded dbg flag option.
Add revision and subvendor info to debug device snapshot entry.
Add trap on wave start and end override option.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 48 ++
include/uapi/linux/kfd_ioctl.h | 667
Introduce the required KGD debug calls that will execute hardware debug
mode setting.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
.../gpu/drm/amd/include/kgd_kfd_interface.h | 34 +++
1 file changed, 34 insertions(+)
diff --git a/drivers/gpu/drm/amd/include
there's nothing
to evict.
change err code to EALREADY if attaching to an already attached process.
move debug disable to release worker to avoid race with disable from
ioctl call.
v2: relax debug trap disable and PTRACE ATTACH requirement.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd
the required register values that the HWS needs to write on debug enable
and disable.
v2: add commentary on unused restore_dbg_registers for debug enable.
Signed-off-by: Jonathan Kim
---
.../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c | 43 ++-
1 file changed, 41 insertions(+), 2
for queue and device snapshot.
change device snapshot implementation to match queue snapshot
implementation.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 7 ++-
drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 72
drivers/gpu/drm/amd/amdkfd
suspend or
resume queues).
v2: add gfx11/mes support.
prevent header copy on suspend from overwriting user fields.
simplify resume_queues function.
address other nit-picks
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c| 5 +
drivers/gpu/drm/amd/amdgpu
.
For runtime exceptions, this will unblock the runtime enable
function which will be explained and implemented in a follow up
patch.
Signed-off-by: Jonathan Kim
---
.../gpu/drm/amd/amdkfd/cik_event_interrupt.c | 4 +-
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 5 ++
drivers/gpu/drm/amd
functions are implemented in a follow up patch.
v2: add gfx11 support. fix fw checks. remove asic family name comments.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 5 +
drivers/gpu/drm/amd/amdkfd/kfd_debug.c| 148 +-
drivers/gpu/drm/amd
watch points are allocated or not.
v3: add gfx11 support.
cleanup gfx9 kgd calls to set and clear address watch.
use per device spinlock to set watch points.
fixup runlist refresh calls on set/clear address watch.
v2: change dev_id arg to gpu_id for consistency
Signed-off-by: Jonathan Kim