RE: [PATCH v2] gpu: drm/amd: Remove the redundant null pointer check in list_for_each_entry() loops
[Public] > -Original Message- > From: Kuehling, Felix > Sent: Monday, June 12, 2023 11:25 AM > To: Lu Hongfei ; Deucher, Alexander > ; Koenig, Christian > ; Pan, Xinhui ; David > Airlie ; Daniel Vetter ; open list:AMD > KFD ; open list:DRM DRIVERS de...@lists.freedesktop.org>; open list ; Kim, > Jonathan > Cc: opensource.ker...@vivo.com > Subject: Re: [PATCH v2] gpu: drm/amd: Remove the redundant null pointer > check in list_for_each_entry() loops > > [+Jon] > > Am 2023-06-12 um 07:58 schrieb Lu Hongfei: > > pqn bound in list_for_each_entry loop will not be null, so there is > > no need to check whether pqn is NULL or not. > > Thus remove a redundant null pointer check. > > > > Signed-off-by: Lu Hongfei > > --- > > The filename of the previous version was: > > 0001-gpu-drm-amd-Fix-the-bug-in-list_for_each_entry-loops.patch > > > > The modifications made compared to the previous version are as follows: > > 1. Modified the patch title > > 2. "Thus remove a redundant null pointer check." is used instead of > > "We could remove this check." > > > > drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 3 --- > > 1 file changed, 3 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c > b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c > > index cd34e7aaead4..10d0cef844f0 100644 > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c > > @@ -1097,9 +1097,6 @@ void > kfd_dbg_set_enabled_debug_exception_mask(struct kfd_process *target, > > > > pqm = >pqm; > > list_for_each_entry(pqn, >queues, process_queue_list) { > > - if (!pqn) > > Right, this check doesn't make a lot of sense. Jon, was this meant to > check pqn->q? Yes that's a bug. It should be a null check on the queue itself. I'll send out the fix shortly. Thanks, Jon > > Regards, >Felix > > > > - continue; > > - > > found_mask |= pqn->q->properties.exception_status; > > } > >
RE: [PATCH 01/33] drm/amdkfd: add debug and runtime enable interface
[Public] > -Original Message- > From: Alex Deucher > Sent: Wednesday, May 31, 2023 2:15 PM > To: Kuehling, Felix > Cc: Kim, Jonathan ; amd- > g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; Huang, JinHuiEric > > Subject: Re: [PATCH 01/33] drm/amdkfd: add debug and runtime enable > interface > > Caution: This message originated from an External Source. Use proper > caution when opening attachments, clicking links, or responding. > > > On Tue, May 30, 2023 at 3:17 PM Felix Kuehling > wrote: > > > > Am 2023-05-25 um 13:27 schrieb Jonathan Kim: > > > Introduce the GPU debug operations interface. > > > > > > For ROCm-GDB to extend the GNU Debugger's ability to inspect the AMD > GPU > > > instruction set, provide the necessary interface to allow the debugger > > > to HW debug-mode set and query exceptions per HSA queue, process or > > > device. > > > > > > The runtime_enable interface coordinates exception handling with the > > > HSA runtime. > > > > > > Usage is available in the kern docs at uapi/linux/kfd_ioctl.h. > > > > > > v2: add num_xcc to device snapshot entry. > > > fixup missing EC_QUEUE_PACKET_RESERVED mask. > > > > > > Signed-off-by: Jonathan Kim > > > > Reviewed-by: Felix Kuehling > > Can you provide a link to the userspace which uses this? Hi Alex, Current WIP user space link is here -> https://github.com/ROCm-Developer-Tools/ROCdbgapi/tree/wip-dbgapi. This will eventually go to amd-master. Thanks, Jon > > Alex > > > > > > > > --- > > > drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 48 ++ > > > include/uapi/linux/kfd_ioctl.h | 668 ++- > > > 2 files changed, 715 insertions(+), 1 deletion(-) > > > > > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > > > index 88fe1f31739d..f4b50b74818e 100644 > > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > > > @@ -2729,6 +2729,48 @@ static int kfd_ioctl_criu(struct file *filep, > > > struct > kfd_process *p, void *data) > > > return ret; > > > } > > > > > > +static int kfd_ioctl_runtime_enable(struct file *filep, struct > > > kfd_process > *p, void *data) > > > +{ > > > + return 0; > > > +} > > > + > > > +static int kfd_ioctl_set_debug_trap(struct file *filep, struct > > > kfd_process > *p, void *data) > > > +{ > > > + struct kfd_ioctl_dbg_trap_args *args = data; > > > + int r = 0; > > > + > > > + if (sched_policy == KFD_SCHED_POLICY_NO_HWS) { > > > + pr_err("Debugging does not support sched_policy %i", > sched_policy); > > > + return -EINVAL; > > > + } > > > + > > > + switch (args->op) { > > > + case KFD_IOC_DBG_TRAP_ENABLE: > > > + case KFD_IOC_DBG_TRAP_DISABLE: > > > + case KFD_IOC_DBG_TRAP_SEND_RUNTIME_EVENT: > > > + case KFD_IOC_DBG_TRAP_SET_EXCEPTIONS_ENABLED: > > > + case KFD_IOC_DBG_TRAP_SET_WAVE_LAUNCH_OVERRIDE: > > > + case KFD_IOC_DBG_TRAP_SET_WAVE_LAUNCH_MODE: > > > + case KFD_IOC_DBG_TRAP_SUSPEND_QUEUES: > > > + case KFD_IOC_DBG_TRAP_RESUME_QUEUES: > > > + case KFD_IOC_DBG_TRAP_SET_NODE_ADDRESS_WATCH: > > > + case KFD_IOC_DBG_TRAP_CLEAR_NODE_ADDRESS_WATCH: > > > + case KFD_IOC_DBG_TRAP_SET_FLAGS: > > > + case KFD_IOC_DBG_TRAP_QUERY_DEBUG_EVENT: > > > + case KFD_IOC_DBG_TRAP_QUERY_EXCEPTION_INFO: > > > + case KFD_IOC_DBG_TRAP_GET_QUEUE_SNAPSHOT: > > > + case KFD_IOC_DBG_TRAP_GET_DEVICE_SNAPSHOT: > > > + pr_warn("Debugging not supported yet\n"); > > > + r = -EACCES; > > > + break; > > > + default: > > > + pr_err("Invalid option: %i\n", args->op); > > > + r = -EINVAL; > > > + } > > > + > > > + return r; > > > +} > > > + > > > #define AMDKFD_IOCTL_DEF(ioctl, _func, _flags) \ > > > [_IOC_NR(ioctl)] = {.cmd = ioctl, .func = _func, .flags = _flags, \ > > > .cmd_drv = 0, .name = #ioctl} > > > @@ -2841,6 +2883,12 @@ static const struct amdkfd_ioctl_desc > amdkfd_ioctls[] = { > > > > > >
RE: [PATCH 14/33] drm/amdgpu: prepare map process for multi-process debug devices
[Public] > -Original Message- > From: Kuehling, Felix > Sent: Tuesday, May 30, 2023 3:56 PM > To: Kim, Jonathan ; amd- > g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org > Cc: Huang, JinHuiEric > Subject: Re: [PATCH 14/33] drm/amdgpu: prepare map process for multi- > process debug devices > > Am 2023-05-25 um 13:27 schrieb Jonathan Kim: > > Unlike single process debug devices, multi-process debug devices allow > > debug mode setting per-VMID (non-device-global). > > > > Because the HWS manages PASID-VMID mapping, the new MAP_PROCESS > API allows > > the KFD to forward the required SPI debug register write requests. > > > > To request a new debug mode setting change, the KFD must be able to > > preempt all queues then remap all queues with these new setting > > requests for MAP_PROCESS to take effect. > > > > Note that by default, trap enablement in non-debug mode must be > disabled > > for performance reasons for multi-process debug devices due to setup > > overhead in FW. > > > > v2: spot fixup new kfd_node references > > > > Signed-off-by: Jonathan Kim > > --- > > drivers/gpu/drm/amd/amdkfd/kfd_debug.h| 5 ++ > > .../drm/amd/amdkfd/kfd_device_queue_manager.c | 51 > +++ > > .../drm/amd/amdkfd/kfd_device_queue_manager.h | 3 ++ > > .../drm/amd/amdkfd/kfd_packet_manager_v9.c| 14 + > > drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 9 > > drivers/gpu/drm/amd/amdkfd/kfd_process.c | 5 ++ > > 6 files changed, 87 insertions(+) > > > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h > b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h > > index a8abfe2a0a14..db6d72e7930f 100644 > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h > > @@ -29,4 +29,9 @@ int kfd_dbg_trap_disable(struct kfd_process *target); > > int kfd_dbg_trap_enable(struct kfd_process *target, uint32_t fd, > > void __user *runtime_info, > > uint32_t *runtime_info_size); > > +static inline bool kfd_dbg_is_per_vmid_supported(struct kfd_node *dev) > > +{ > > + return KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 2); > > This needs to be updated to include 9.4.3 as well. Is that coming in a > different patch? Other than that, this patch is That's correct. This series does not enable the debugger for GFX9.4.3. This will be a follow-up series that Eric will provide. Thanks. Jon > > Reviewed-by: Felix Kuehling > > > > +} > > + > > #endif > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c > b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c > > index c8519adc89ac..badfe1210bc4 100644 > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c > > @@ -36,6 +36,7 @@ > > #include "kfd_kernel_queue.h" > > #include "amdgpu_amdkfd.h" > > #include "mes_api_def.h" > > +#include "kfd_debug.h" > > > > /* Size of the per-pipe EOP queue */ > > #define CIK_HPD_EOP_BYTES_LOG2 11 > > @@ -2593,6 +2594,56 @@ int release_debug_trap_vmid(struct > device_queue_manager *dqm, > > return r; > > } > > > > +int debug_lock_and_unmap(struct device_queue_manager *dqm) > > +{ > > + int r; > > + > > + if (dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS) { > > + pr_err("Unsupported on sched_policy: %i\n", dqm- > >sched_policy); > > + return -EINVAL; > > + } > > + > > + if (!kfd_dbg_is_per_vmid_supported(dqm->dev)) > > + return 0; > > + > > + dqm_lock(dqm); > > + > > + r = unmap_queues_cpsch(dqm, > KFD_UNMAP_QUEUES_FILTER_ALL_QUEUES, 0, 0, false); > > + if (r) > > + dqm_unlock(dqm); > > + > > + return r; > > +} > > + > > +int debug_map_and_unlock(struct device_queue_manager *dqm) > > +{ > > + int r; > > + > > + if (dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS) { > > + pr_err("Unsupported on sched_policy: %i\n", dqm- > >sched_policy); > > + return -EINVAL; > > + } > > + > > + if (!kfd_dbg_is_per_vmid_supported(dqm->dev)) > > + return 0; > > + > > + r = map_queues_cpsch(dqm); > > + > > + dqm_unlock(dqm); > > + > > + return r; > > +} > > + > > +int debug_refresh_runlist(struct device_queue_manager *d
RE: [PATCH] drm/amdkfd: remove unused sq_int_priv variable
[Public] Hi Felix, That is correct. The debugger will need sq_int_priv to work. Thanks, Jon > -Original Message- > From: Kuehling, Felix > Sent: Thursday, March 30, 2023 11:39 AM > To: Tom Rix ; Deucher, Alexander > ; Koenig, Christian > ; Pan, Xinhui ; > airl...@gmail.com; dan...@ffwll.ch; nat...@kernel.org; > ndesaulni...@google.com; Kim, Jonathan > Cc: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux- > ker...@vger.kernel.org; l...@lists.linux.dev > Subject: Re: [PATCH] drm/amdkfd: remove unused sq_int_priv variable > > Am 2023-03-30 um 11:20 schrieb Tom Rix: > > clang with W=1 reports > > drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_int_process_v11.c:282:38: > error: variable > >'sq_int_priv' set but not used [-Werror,-Wunused-but-set-variable] > > uint8_t sq_int_enc, sq_int_errtype, sq_int_priv; > > ^ > > This variable is not used so remove it. > > Hi Jon, > > I think your debugger patches are going to start using this. Can you > comment? > > I'd prefer not to apply this patch now, as Jon's patches are expected to > land soon, once Alex is done upstreaming GFX 9.4.3 support. > > Regards, >Felix > > > > > > Signed-off-by: Tom Rix > > --- > > drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c | 9 + > > 1 file changed, 1 insertion(+), 8 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c > b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c > > index 0d53f6067422..bbd646c0dee7 100644 > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c > > @@ -279,7 +279,7 @@ static void event_interrupt_wq_v11(struct kfd_dev > *dev, > > { > > uint16_t source_id, client_id, ring_id, pasid, vmid; > > uint32_t context_id0, context_id1; > > - uint8_t sq_int_enc, sq_int_errtype, sq_int_priv; > > + uint8_t sq_int_enc, sq_int_errtype; > > struct kfd_vm_fault_info info = {0}; > > struct kfd_hsa_memory_exception_data exception_data; > > > > @@ -348,13 +348,6 @@ static void event_interrupt_wq_v11(struct kfd_dev > *dev, > > break; > > case SQ_INTERRUPT_WORD_ENCODING_INST: > > print_sq_intr_info_inst(context_id0, > context_id1); > > - sq_int_priv = REG_GET_FIELD(context_id0, > > - > SQ_INTERRUPT_WORD_WAVE_CTXID0, PRIV); > > - /*if (sq_int_priv && > (kfd_set_dbg_ev_from_interrupt(dev, pasid, > > - > KFD_CTXID0_DOORBELL_ID(context_id0), > > - > KFD_CTXID0_TRAP_CODE(context_id0), > > - NULL, 0))) > > - return;*/ > > break; > > case SQ_INTERRUPT_WORD_ENCODING_ERROR: > > print_sq_intr_info_error(context_id0, > context_id1);
RE: [PATCH 12/34] drm/amdgpu: add configurable grace period for unmap queues
[Public] Thanks for catch Kent. I'll fix up the typos with a follow-on. Jon > -Original Message- > From: Russell, Kent > Sent: Tuesday, March 28, 2023 11:19 AM > To: Kim, Jonathan ; amd-...@lists.freedesktop.org; > dri-devel@lists.freedesktop.org > Cc: Kuehling, Felix ; Kim, Jonathan > > Subject: RE: [PATCH 12/34] drm/amdgpu: add configurable grace period for > unmap queues > > [AMD Official Use Only - General] > > 3 tiny grammar/spelling things inline (not critical) > > Kent > > > -Original Message- > > From: amd-gfx On Behalf Of > > Jonathan Kim > > Sent: Monday, March 27, 2023 2:43 PM > > To: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org > > Cc: Kuehling, Felix ; Kim, Jonathan > > > > Subject: [PATCH 12/34] drm/amdgpu: add configurable grace period for > unmap > > queues > > > > The HWS schedule allows a grace period for wave completion prior to > > preemption for better performance by avoiding CWSR on waves that can > > potentially complete quickly. The debugger, on the other hand, will > > want to inspect wave status immediately after it actively triggers > > preemption (a suspend function to be provided). > > > > To minimize latency between preemption and debugger wave inspection, > allow > > immediate preemption by setting the grace period to 0. > > > > Note that setting the preepmtion grace period to 0 will result in an > > infinite grace period being set due to a CP FW bug so set it to 1 for now. > > > > v2: clarify purpose in the description of this patch > > > > Signed-off-by: Jonathan Kim > > Reviewed-by: Felix Kuehling > > --- > > .../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c | 2 + > > .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 2 + > > .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 43 > > .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.h| 6 ++ > > .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c | 2 + > > .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 43 > > .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.h | 9 ++- > > .../drm/amd/amdkfd/kfd_device_queue_manager.c | 62 +- > > .../drm/amd/amdkfd/kfd_device_queue_manager.h | 2 + > > .../gpu/drm/amd/amdkfd/kfd_packet_manager.c | 32 + > > .../drm/amd/amdkfd/kfd_packet_manager_v9.c| 39 +++ > > .../gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h | 65 > +++ > > drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 5 ++ > > 13 files changed, 291 insertions(+), 21 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c > > index a6f98141c29c..b811a0985050 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c > > @@ -82,5 +82,7 @@ const struct kfd2kgd_calls aldebaran_kfd2kgd = { > > .get_cu_occupancy = kgd_gfx_v9_get_cu_occupancy, > > .enable_debug_trap = kgd_aldebaran_enable_debug_trap, > > .disable_debug_trap = kgd_aldebaran_disable_debug_trap, > > + .get_iq_wait_times = kgd_gfx_v9_get_iq_wait_times, > > + .build_grace_period_packet_info = > > kgd_gfx_v9_build_grace_period_packet_info, > > .program_trap_handler_settings = > > kgd_gfx_v9_program_trap_handler_settings, > > }; > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c > > index d2918e5c0dea..a62bd0068515 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c > > @@ -410,6 +410,8 @@ const struct kfd2kgd_calls arcturus_kfd2kgd = { > > > > kgd_gfx_v9_set_vm_context_page_table_base, > > .enable_debug_trap = kgd_arcturus_enable_debug_trap, > > .disable_debug_trap = kgd_arcturus_disable_debug_trap, > > + .get_iq_wait_times = kgd_gfx_v9_get_iq_wait_times, > > + .build_grace_period_packet_info = > > kgd_gfx_v9_build_grace_period_packet_info, > > .get_cu_occupancy = kgd_gfx_v9_get_cu_occupancy, > > .program_trap_handler_settings = > > kgd_gfx_v9_program_trap_handler_settings > > }; > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c > > index 969015281510..605387e55d33 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c > > @@ -802,6 +802,4
RE: [PATCH 19/32] drm/amdkfd: add runtime enable operation
[AMD Official Use Only - General] > -Original Message- > From: Kuehling, Felix > Sent: Monday, March 20, 2023 8:31 PM > To: Kim, Jonathan ; amd- > g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org > Subject: Re: [PATCH 19/32] drm/amdkfd: add runtime enable operation > > > On 2023-01-25 14:53, Jonathan Kim wrote: > > The debugger can attach to a process prior to HSA enablement (i.e. > > inferior is spawned by the debugger and attached to immediately before > > target process has been enabled for HSA dispatches) or it > > can attach to a running target that is already HSA enabled. Either > > way, the debugger needs to know the enablement status to know when > > it can inspect queues. > > > > For the scenario where the debugger spawns the target process, > > it will have to wait for ROCr's runtime enable request from the target. > > The runtime enable request will be able to see that its process has been > > debug attached. ROCr raises an EC_PROCESS_RUNTIME signal to the > > debugger then blocks the target process while waiting the debugger's > > response. Once the debugger has received the runtime signal, it will > > unblock the target process. > > > > For the scenario where the debugger attaches to a running target > > process, ROCr will set the target process' runtime status as enabled so > > that on an attach request, the debugger will be able to see this > > status and will continue with debug enablement as normal. > > > > A secondary requirement is to conditionally enable the trap tempories > only > > if the user requests it (env var HSA_ENABLE_DEBUG=1) or if the debugger > > attaches with HSA runtime enabled. This is because setting up the trap > > temporaries incurs a performance overhead that is unacceptable for > > microbench performance in normal mode for certain customers. > > > > In the scenario where the debugger spawns the target process, when ROCr > > detects that the debugger has attached during the runtime enable > > request, it will enable the trap temporaries before it blocks the target > > process while waiting for the debugger to respond. > > > > In the scenario where the debugger attaches to a running target process, > > it will enable to trap temporaries itself. > > > > Finally, there is an additional restriction that is required to be > > enforced with runtime enable and HW debug mode setting. The debugger > must > > first ensure that HW debug mode has been enabled before permitting HW > debug > > mode operations. > > > > With single process debug devices, allowing the debugger to set debug > > HW modes prior to trap activation means that debug HW mode setting can > > occur before the KFD has reserved the debug VMID (0xf) from the hardware > > scheduler's VMID allocation resource pool. This can result in the > > hardware scheduler assigning VMID 0xf to a non-debugged process and > > having that process inherit debug HW mode settings intended for the > > debugged target process instead, which is both incorrect and potentially > > fatal for normal mode operation. > > > > With multi process debug devices, allowing the debugger to set debug > > HW modes prior to trap activation means that non-debugged processes > > migrating to a new VMID could inherit unintended debug settings. > > > > All debug operations that touch HW settings must require trap activation > > where trap activation is triggered by both debug attach and runtime > > enablement (target has KFD opened and is ready to dispatch work). > > > > v2: fix up hierarchy of semantics in description. > > > > Signed-off-by: Jonathan Kim > > --- > > drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 150 > ++- > > drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 6 +- > > drivers/gpu/drm/amd/amdkfd/kfd_debug.h | 4 + > > drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 1 + > > 4 files changed, 157 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > > index 09fe8576dc8c..46f9d453dc5e 100644 > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > > @@ -2654,11 +2654,147 @@ static int kfd_ioctl_criu(struct file *filep, > struct kfd_process *p, void *data) > > return ret; > > } > > > > -static int kfd_ioctl_runtime_enable(struct file *filep, struct kfd_process > > *p, > void *data) > > +static int runtime_enable(struct kfd_process *p, uint64_t r_debug, >
RE: [PATCH 03/32] drm/amdkfd: prepare per-process debug enable and disable
[Public] > -Original Message- > From: Kuehling, Felix > Sent: Thursday, February 16, 2023 6:44 PM > To: Kim, Jonathan ; amd- > g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org > Subject: Re: [PATCH 03/32] drm/amdkfd: prepare per-process debug enable > and disable > > > On 2023-01-25 14:53, Jonathan Kim wrote: > > The ROCm debugger will attach to a process to debug by PTRACE and will > > expect the KFD to prepare a process for the target PID, whether the > > target PID has opened the KFD device or not. > > > > This patch is to explicity handle this requirement. Further HW mode > > setting and runtime coordination requirements will be handled in > > following patches. > > > > In the case where the target process has not opened the KFD device, > > a new KFD process must be created for the target PID. > > The debugger as well as the target process for this case will have not > > acquired any VMs so handle process restoration to correctly account for > > this. > > > > To coordinate with HSA runtime, the debugger must be aware of the target > > process' runtime enablement status and will copy the runtime status > > information into the debugged KFD process for later query. > > > > On enablement, the debugger will subscribe to a set of exceptions where > > each exception events will notify the debugger through a pollable FIFO > > file descriptor that the debugger provides to the KFD to manage. > > Some events will be synchronously raised while other are scheduled, > > which is why a debug_event_workarea worker is initialized. > > > > Finally on process termination of either the debugger or the target, > > debugging must be disabled if it has not been done so. > > > > v3: fix typo on debug trap disable and PTRACE ATTACH relax check. > > remove unnecessary queue eviction counter reset when there's nothing > > to evict. > > change err code to EALREADY if attaching to an already attached process. > > move debug disable to release worker to avoid race with disable from > > ioctl call. > > > > v2: relax debug trap disable and PTRACE ATTACH requirement. > > > > Signed-off-by: Jonathan Kim > > --- > > drivers/gpu/drm/amd/amdkfd/Makefile | 3 +- > > drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 88 - > > drivers/gpu/drm/amd/amdkfd/kfd_debug.c| 94 > +++ > > drivers/gpu/drm/amd/amdkfd/kfd_debug.h| 33 +++ > > .../drm/amd/amdkfd/kfd_device_queue_manager.c | 22 - > > drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 34 ++- > > drivers/gpu/drm/amd/amdkfd/kfd_process.c | 63 + > > 7 files changed, 308 insertions(+), 29 deletions(-) > > create mode 100644 drivers/gpu/drm/amd/amdkfd/kfd_debug.c > > create mode 100644 drivers/gpu/drm/amd/amdkfd/kfd_debug.h > > > > diff --git a/drivers/gpu/drm/amd/amdkfd/Makefile > b/drivers/gpu/drm/amd/amdkfd/Makefile > > index e758c2a24cd0..747754428073 100644 > > --- a/drivers/gpu/drm/amd/amdkfd/Makefile > > +++ b/drivers/gpu/drm/amd/amdkfd/Makefile > > @@ -55,7 +55,8 @@ AMDKFD_FILES := $(AMDKFD_PATH)/kfd_module.o \ > > $(AMDKFD_PATH)/kfd_int_process_v9.o \ > > $(AMDKFD_PATH)/kfd_int_process_v11.o \ > > $(AMDKFD_PATH)/kfd_smi_events.o \ > > - $(AMDKFD_PATH)/kfd_crat.o > > + $(AMDKFD_PATH)/kfd_crat.o \ > > + $(AMDKFD_PATH)/kfd_debug.o > > > > ifneq ($(CONFIG_AMD_IOMMU_V2),) > > AMDKFD_FILES += $(AMDKFD_PATH)/kfd_iommu.o > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > > index d3b019e64093..ee05c2e54ef6 100644 > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > > @@ -44,6 +44,7 @@ > > #include "amdgpu_amdkfd.h" > > #include "kfd_smi_events.h" > > #include "amdgpu_dma_buf.h" > > +#include "kfd_debug.h" > > > > static long kfd_ioctl(struct file *, unsigned int, unsigned long); > > static int kfd_open(struct inode *, struct file *); > > @@ -142,10 +143,15 @@ static int kfd_open(struct inode *inode, struct > file *filep) > > return -EPERM; > > } > > > > - process = kfd_create_process(filep); > > + process = kfd_create_process(current); > > if (IS_ERR(process)) > > return PTR_ERR(process); > > > > + if (kfd_process_init_cwsr_apu(process, filep)) { > > +
RE: [PATCH 15/32] drm/amdkfd: prepare trap workaround for gfx11
[Public] > -Original Message- > From: Kuehling, Felix > Sent: Monday, March 20, 2023 5:50 PM > To: Kim, Jonathan ; amd- > g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org > Subject: Re: [PATCH 15/32] drm/amdkfd: prepare trap workaround for gfx11 > > > On 2023-01-25 14:53, Jonathan Kim wrote: > > Due to a HW bug, waves in only half the shader arrays can enter trap. > > > > When starting a debug session, relocate all waves to the first shader > > array of each shader engine and mask off the 2nd shader array as > > unavailable. > > > > When ending a debug session, re-enable the 2nd shader array per > > shader engine. > > > > User CU masking per queue cannot be guaranteed to remain functional > > if requested during debugging (e.g. user cu mask requests only 2nd shader > > array as an available resource leading to zero HW resources available) > > nor can runtime be alerted of any of these changes during execution. > > > > Make user CU masking and debugging mutual exclusive with respect to > > availability. > > > > If the debugger tries to attach to a process with a user cu masked > > queue, return the runtime status as enabled but busy. > > > > If the debugger tries to attach and fails to reallocate queue waves to > > the first shader array of each shader engine, return the runtime status > > as enabled but with an error. > > > > In addition, like any other mutli-process debug supported devices, > > disable trap temporary setup per-process to avoid performance impact > from > > setup overhead. > > > > Signed-off-by: Jonathan Kim > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 2 + > > drivers/gpu/drm/amd/amdgpu/mes_v11_0.c| 7 +- > > drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 - > > drivers/gpu/drm/amd/amdkfd/kfd_debug.c| 64 > +++ > > drivers/gpu/drm/amd/amdkfd/kfd_debug.h| 3 +- > > .../drm/amd/amdkfd/kfd_device_queue_manager.c | 7 ++ > > .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c | 3 +- > > .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c | 3 +- > > .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c | 42 > > .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c | 3 +- > > .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c | 3 +- > > drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 5 +- > > .../amd/amdkfd/kfd_process_queue_manager.c| 9 ++- > > 13 files changed, 124 insertions(+), 29 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h > > index d20df0cf0d88..b5f5eed2b5ef 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h > > @@ -219,6 +219,8 @@ struct mes_add_queue_input { > > uint32_tgws_size; > > uint64_ttba_addr; > > uint64_ttma_addr; > > + uint32_ttrap_en; > > + uint32_tskip_process_ctx_clear; > > uint32_tis_kfd_process; > > uint32_tis_aql_queue; > > uint32_tqueue_size; > > diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c > b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c > > index fbacdc42efac..38c7a0cbf264 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c > > +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c > > @@ -197,17 +197,14 @@ static int mes_v11_0_add_hw_queue(struct > amdgpu_mes *mes, > > mes_add_queue_pkt.gws_size = input->gws_size; > > mes_add_queue_pkt.trap_handler_addr = input->tba_addr; > > mes_add_queue_pkt.tma_addr = input->tma_addr; > > + mes_add_queue_pkt.trap_en = input->trap_en; > > + mes_add_queue_pkt.skip_process_ctx_clear = input- > >skip_process_ctx_clear; > > mes_add_queue_pkt.is_kfd_process = input->is_kfd_process; > > > > /* For KFD, gds_size is re-used for queue size (needed in MES for AQL > queues) */ > > mes_add_queue_pkt.is_aql_queue = input->is_aql_queue; > > mes_add_queue_pkt.gds_size = input->queue_size; > > > > - if (!(((adev->mes.sched_version & AMDGPU_MES_VERSION_MASK) >= > 4) && > > - (adev->ip_versions[GC_HWIP][0] >= IP_VERSION(11, 0, 0)) > && > > - (adev->ip_versions[GC_HWIP][0] <= IP_VERSION(11, 0, 3 > > - mes_add_queue_pkt.trap_en = 1; > > - > > /* For KFD, gds_size is re-used for queue size (needed in MES for AQL > queues) */ > > mes_add_queue_pkt.is_aql_queue = i
RE: [PATCH 01/44] drm/amdgpu: replace per_device_list by array
[AMD Official Use Only - Internal Distribution Only] > -Original Message- > From: amd-gfx On Behalf Of Felix > Kuehling > Sent: Monday, March 22, 2021 6:58 AM > To: dri-devel@lists.freedesktop.org; amd-...@lists.freedesktop.org > Cc: Sierra Guiza, Alejandro (Alex) > Subject: [PATCH 01/44] drm/amdgpu: replace per_device_list by array > > [CAUTION: External Email] > > From: Alex Sierra > > Remove per_device_list from kfd_process and replace it with a > kfd_process_device pointers array of MAX_GPU_INSTANCES size. This helps > to manage the kfd_process_devices binded to a specific kfd_process. > Also, functions used by kfd_chardev to iterate over the list were removed, > since they are not valid anymore. Instead, it was replaced by a local loop > iterating the array. > > Signed-off-by: Alex Sierra > Signed-off-by: Felix Kuehling As discussed, this patch is required to sync internal branches for the KFD and is Reviewed-by: Jonathan Kim > --- > drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 116 -- > drivers/gpu/drm/amd/amdkfd/kfd_iommu.c| 8 +- > drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 20 +-- > drivers/gpu/drm/amd/amdkfd/kfd_process.c | 108 > .../amd/amdkfd/kfd_process_queue_manager.c| 6 +- > 5 files changed, 111 insertions(+), 147 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > index 6802c616e10e..43de260b2230 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > @@ -870,52 +870,47 @@ static int kfd_ioctl_get_process_apertures(struct > file *filp, { > struct kfd_ioctl_get_process_apertures_args *args = data; > struct kfd_process_device_apertures *pAperture; > - struct kfd_process_device *pdd; > + int i; > > dev_dbg(kfd_device, "get apertures for PASID 0x%x", p->pasid); > > args->num_of_nodes = 0; > > mutex_lock(>mutex); > + /* Run over all pdd of the process */ > + for (i = 0; i < p->n_pdds; i++) { > + struct kfd_process_device *pdd = p->pdds[i]; > + > + pAperture = > + >process_apertures[args->num_of_nodes]; > + pAperture->gpu_id = pdd->dev->id; > + pAperture->lds_base = pdd->lds_base; > + pAperture->lds_limit = pdd->lds_limit; > + pAperture->gpuvm_base = pdd->gpuvm_base; > + pAperture->gpuvm_limit = pdd->gpuvm_limit; > + pAperture->scratch_base = pdd->scratch_base; > + pAperture->scratch_limit = pdd->scratch_limit; > > - /*if the process-device list isn't empty*/ > - if (kfd_has_process_device_data(p)) { > - /* Run over all pdd of the process */ > - pdd = kfd_get_first_process_device_data(p); > - do { > - pAperture = > - >process_apertures[args->num_of_nodes]; > - pAperture->gpu_id = pdd->dev->id; > - pAperture->lds_base = pdd->lds_base; > - pAperture->lds_limit = pdd->lds_limit; > - pAperture->gpuvm_base = pdd->gpuvm_base; > - pAperture->gpuvm_limit = pdd->gpuvm_limit; > - pAperture->scratch_base = pdd->scratch_base; > - pAperture->scratch_limit = pdd->scratch_limit; > - > - dev_dbg(kfd_device, > - "node id %u\n", args->num_of_nodes); > - dev_dbg(kfd_device, > - "gpu id %u\n", pdd->dev->id); > - dev_dbg(kfd_device, > - "lds_base %llX\n", pdd->lds_base); > - dev_dbg(kfd_device, > - "lds_limit %llX\n", pdd->lds_limit); > - dev_dbg(kfd_device, > - "gpuvm_base %llX\n", pdd->gpuvm_base); > - dev_dbg(kfd_device, > - "gpuvm_limit %llX\n", pdd->gpuvm_limit); > - dev_dbg(kfd_device, > - "scratch_base %llX\n", pdd->scratch_base); > - dev_dbg(kfd_device, > - "scratch_limit %llX\n", pdd->scratch_limit); > - > - args->num_of_nodes++; > - > - pdd = kfd_get_next_process_device_data(p, pdd); > - } while (pdd && (args->num_of_nodes < > NUM_OF_SUPPORTED_GPUS)); > - } > + dev_dbg(kfd_device, > + "node id %u\n", args->num_of_nodes); > + dev_dbg(kfd_device, > + "gpu id %u\n", pdd->dev->id); > + dev_dbg(kfd_device, > + "lds_base %llX\n", pdd->lds_base); > +