RE: [PATCH v2] gpu: drm/amd: Remove the redundant null pointer check in list_for_each_entry() loops

2023-06-12 Thread Kim, Jonathan
[Public]

> -Original Message-
> From: Kuehling, Felix 
> Sent: Monday, June 12, 2023 11:25 AM
> To: Lu Hongfei ; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ; David
> Airlie ; Daniel Vetter ; open list:AMD
> KFD ; open list:DRM DRIVERS  de...@lists.freedesktop.org>; open list ; Kim,
> Jonathan 
> Cc: opensource.ker...@vivo.com
> Subject: Re: [PATCH v2] gpu: drm/amd: Remove the redundant null pointer
> check in list_for_each_entry() loops
>
> [+Jon]
>
> Am 2023-06-12 um 07:58 schrieb Lu Hongfei:
> > pqn bound in list_for_each_entry loop will not be null, so there is
> > no need to check whether pqn is NULL or not.
> > Thus remove a redundant null pointer check.
> >
> > Signed-off-by: Lu Hongfei 
> > ---
> > The filename of the previous version was:
> > 0001-gpu-drm-amd-Fix-the-bug-in-list_for_each_entry-loops.patch
> >
> > The modifications made compared to the previous version are as follows:
> > 1. Modified the patch title
> > 2. "Thus remove a redundant null pointer check." is used instead of
> > "We could remove this check."
> >
> >   drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 3 ---
> >   1 file changed, 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
> > index cd34e7aaead4..10d0cef844f0 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
> > @@ -1097,9 +1097,6 @@ void
> kfd_dbg_set_enabled_debug_exception_mask(struct kfd_process *target,
> >
> > pqm = >pqm;
> > list_for_each_entry(pqn, >queues, process_queue_list) {
> > -   if (!pqn)
>
> Right, this check doesn't make a lot of sense. Jon, was this meant to
> check pqn->q?

Yes that's a bug.  It should be a null check on the queue itself.
I'll send out the fix shortly.

Thanks,

Jon

>
> Regards,
>Felix
>
>
> > -   continue;
> > -
> > found_mask |= pqn->q->properties.exception_status;
> > }
> >


RE: [PATCH 01/33] drm/amdkfd: add debug and runtime enable interface

2023-05-31 Thread Kim, Jonathan
[Public]

> -Original Message-
> From: Alex Deucher 
> Sent: Wednesday, May 31, 2023 2:15 PM
> To: Kuehling, Felix 
> Cc: Kim, Jonathan ; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; Huang, JinHuiEric
> 
> Subject: Re: [PATCH 01/33] drm/amdkfd: add debug and runtime enable
> interface
>
> Caution: This message originated from an External Source. Use proper
> caution when opening attachments, clicking links, or responding.
>
>
> On Tue, May 30, 2023 at 3:17 PM Felix Kuehling 
> wrote:
> >
> > Am 2023-05-25 um 13:27 schrieb Jonathan Kim:
> > > Introduce the GPU debug operations interface.
> > >
> > > For ROCm-GDB to extend the GNU Debugger's ability to inspect the AMD
> GPU
> > > instruction set, provide the necessary interface to allow the debugger
> > > to HW debug-mode set and query exceptions per HSA queue, process or
> > > device.
> > >
> > > The runtime_enable interface coordinates exception handling with the
> > > HSA runtime.
> > >
> > > Usage is available in the kern docs at uapi/linux/kfd_ioctl.h.
> > >
> > > v2: add num_xcc to device snapshot entry.
> > > fixup missing EC_QUEUE_PACKET_RESERVED mask.
> > >
> > > Signed-off-by: Jonathan Kim 
> >
> > Reviewed-by: Felix Kuehling 
>
> Can you provide a link to the userspace which uses this?

Hi Alex,

Current WIP user space link is here -> 
https://github.com/ROCm-Developer-Tools/ROCdbgapi/tree/wip-dbgapi.
This will eventually go to amd-master.

Thanks,

Jon

>
> Alex
>
> >
> >
> > > ---
> > >   drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |  48 ++
> > >   include/uapi/linux/kfd_ioctl.h   | 668 ++-
> > >   2 files changed, 715 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> > > index 88fe1f31739d..f4b50b74818e 100644
> > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> > > @@ -2729,6 +2729,48 @@ static int kfd_ioctl_criu(struct file *filep, 
> > > struct
> kfd_process *p, void *data)
> > >   return ret;
> > >   }
> > >
> > > +static int kfd_ioctl_runtime_enable(struct file *filep, struct 
> > > kfd_process
> *p, void *data)
> > > +{
> > > + return 0;
> > > +}
> > > +
> > > +static int kfd_ioctl_set_debug_trap(struct file *filep, struct 
> > > kfd_process
> *p, void *data)
> > > +{
> > > + struct kfd_ioctl_dbg_trap_args *args = data;
> > > + int r = 0;
> > > +
> > > + if (sched_policy == KFD_SCHED_POLICY_NO_HWS) {
> > > + pr_err("Debugging does not support sched_policy %i",
> sched_policy);
> > > + return -EINVAL;
> > > + }
> > > +
> > > + switch (args->op) {
> > > + case KFD_IOC_DBG_TRAP_ENABLE:
> > > + case KFD_IOC_DBG_TRAP_DISABLE:
> > > + case KFD_IOC_DBG_TRAP_SEND_RUNTIME_EVENT:
> > > + case KFD_IOC_DBG_TRAP_SET_EXCEPTIONS_ENABLED:
> > > + case KFD_IOC_DBG_TRAP_SET_WAVE_LAUNCH_OVERRIDE:
> > > + case KFD_IOC_DBG_TRAP_SET_WAVE_LAUNCH_MODE:
> > > + case KFD_IOC_DBG_TRAP_SUSPEND_QUEUES:
> > > + case KFD_IOC_DBG_TRAP_RESUME_QUEUES:
> > > + case KFD_IOC_DBG_TRAP_SET_NODE_ADDRESS_WATCH:
> > > + case KFD_IOC_DBG_TRAP_CLEAR_NODE_ADDRESS_WATCH:
> > > + case KFD_IOC_DBG_TRAP_SET_FLAGS:
> > > + case KFD_IOC_DBG_TRAP_QUERY_DEBUG_EVENT:
> > > + case KFD_IOC_DBG_TRAP_QUERY_EXCEPTION_INFO:
> > > + case KFD_IOC_DBG_TRAP_GET_QUEUE_SNAPSHOT:
> > > + case KFD_IOC_DBG_TRAP_GET_DEVICE_SNAPSHOT:
> > > + pr_warn("Debugging not supported yet\n");
> > > + r = -EACCES;
> > > + break;
> > > + default:
> > > + pr_err("Invalid option: %i\n", args->op);
> > > + r = -EINVAL;
> > > + }
> > > +
> > > + return r;
> > > +}
> > > +
> > >   #define AMDKFD_IOCTL_DEF(ioctl, _func, _flags) \
> > >   [_IOC_NR(ioctl)] = {.cmd = ioctl, .func = _func, .flags = _flags, \
> > >   .cmd_drv = 0, .name = #ioctl}
> > > @@ -2841,6 +2883,12 @@ static const struct amdkfd_ioctl_desc
> amdkfd_ioctls[] = {
> > >
> > >

RE: [PATCH 14/33] drm/amdgpu: prepare map process for multi-process debug devices

2023-05-30 Thread Kim, Jonathan
[Public]

> -Original Message-
> From: Kuehling, Felix 
> Sent: Tuesday, May 30, 2023 3:56 PM
> To: Kim, Jonathan ; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> Cc: Huang, JinHuiEric 
> Subject: Re: [PATCH 14/33] drm/amdgpu: prepare map process for multi-
> process debug devices
>
> Am 2023-05-25 um 13:27 schrieb Jonathan Kim:
> > Unlike single process debug devices, multi-process debug devices allow
> > debug mode setting per-VMID (non-device-global).
> >
> > Because the HWS manages PASID-VMID mapping, the new MAP_PROCESS
> API allows
> > the KFD to forward the required SPI debug register write requests.
> >
> > To request a new debug mode setting change, the KFD must be able to
> > preempt all queues then remap all queues with these new setting
> > requests for MAP_PROCESS to take effect.
> >
> > Note that by default, trap enablement in non-debug mode must be
> disabled
> > for performance reasons for multi-process debug devices due to setup
> > overhead in FW.
> >
> > v2: spot fixup new kfd_node references
> >
> > Signed-off-by: Jonathan Kim 
> > ---
> >   drivers/gpu/drm/amd/amdkfd/kfd_debug.h|  5 ++
> >   .../drm/amd/amdkfd/kfd_device_queue_manager.c | 51
> +++
> >   .../drm/amd/amdkfd/kfd_device_queue_manager.h |  3 ++
> >   .../drm/amd/amdkfd/kfd_packet_manager_v9.c| 14 +
> >   drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  9 
> >   drivers/gpu/drm/amd/amdkfd/kfd_process.c  |  5 ++
> >   6 files changed, 87 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
> b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
> > index a8abfe2a0a14..db6d72e7930f 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
> > @@ -29,4 +29,9 @@ int kfd_dbg_trap_disable(struct kfd_process *target);
> >   int kfd_dbg_trap_enable(struct kfd_process *target, uint32_t fd,
> > void __user *runtime_info,
> > uint32_t *runtime_info_size);
> > +static inline bool kfd_dbg_is_per_vmid_supported(struct kfd_node *dev)
> > +{
> > +   return KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 2);
>
> This needs to be updated to include 9.4.3 as well. Is that coming in a
> different patch? Other than that, this patch is

That's correct.  This series does not enable the debugger for GFX9.4.3.
This will be a follow-up series that Eric will provide.

Thanks.

Jon

>
> Reviewed-by: Felix Kuehling 
>
>
> > +}
> > +
> >   #endif
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> > index c8519adc89ac..badfe1210bc4 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> > @@ -36,6 +36,7 @@
> >   #include "kfd_kernel_queue.h"
> >   #include "amdgpu_amdkfd.h"
> >   #include "mes_api_def.h"
> > +#include "kfd_debug.h"
> >
> >   /* Size of the per-pipe EOP queue */
> >   #define CIK_HPD_EOP_BYTES_LOG2 11
> > @@ -2593,6 +2594,56 @@ int release_debug_trap_vmid(struct
> device_queue_manager *dqm,
> > return r;
> >   }
> >
> > +int debug_lock_and_unmap(struct device_queue_manager *dqm)
> > +{
> > +   int r;
> > +
> > +   if (dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS) {
> > +   pr_err("Unsupported on sched_policy: %i\n", dqm-
> >sched_policy);
> > +   return -EINVAL;
> > +   }
> > +
> > +   if (!kfd_dbg_is_per_vmid_supported(dqm->dev))
> > +   return 0;
> > +
> > +   dqm_lock(dqm);
> > +
> > +   r = unmap_queues_cpsch(dqm,
> KFD_UNMAP_QUEUES_FILTER_ALL_QUEUES, 0, 0, false);
> > +   if (r)
> > +   dqm_unlock(dqm);
> > +
> > +   return r;
> > +}
> > +
> > +int debug_map_and_unlock(struct device_queue_manager *dqm)
> > +{
> > +   int r;
> > +
> > +   if (dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS) {
> > +   pr_err("Unsupported on sched_policy: %i\n", dqm-
> >sched_policy);
> > +   return -EINVAL;
> > +   }
> > +
> > +   if (!kfd_dbg_is_per_vmid_supported(dqm->dev))
> > +   return 0;
> > +
> > +   r = map_queues_cpsch(dqm);
> > +
> > +   dqm_unlock(dqm);
> > +
> > +   return r;
> > +}
> > +
> > +int debug_refresh_runlist(struct device_queue_manager *d

RE: [PATCH] drm/amdkfd: remove unused sq_int_priv variable

2023-03-30 Thread Kim, Jonathan
[Public]

Hi Felix,

That is correct.  The debugger will need sq_int_priv to work.

Thanks,

Jon

> -Original Message-
> From: Kuehling, Felix 
> Sent: Thursday, March 30, 2023 11:39 AM
> To: Tom Rix ; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; nat...@kernel.org;
> ndesaulni...@google.com; Kim, Jonathan 
> Cc: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org; linux-
> ker...@vger.kernel.org; l...@lists.linux.dev
> Subject: Re: [PATCH] drm/amdkfd: remove unused sq_int_priv variable
>
> Am 2023-03-30 um 11:20 schrieb Tom Rix:
> > clang with W=1 reports
> > drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_int_process_v11.c:282:38:
> error: variable
> >'sq_int_priv' set but not used [-Werror,-Wunused-but-set-variable]
> >  uint8_t sq_int_enc, sq_int_errtype, sq_int_priv;
> >  ^
> > This variable is not used so remove it.
>
> Hi Jon,
>
> I think your debugger patches are going to start using this. Can you
> comment?
>
> I'd prefer not to apply this patch now, as Jon's patches are expected to
> land soon, once Alex is done upstreaming GFX 9.4.3 support.
>
> Regards,
>Felix
>
>
> >
> > Signed-off-by: Tom Rix 
> > ---
> >   drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c | 9 +
> >   1 file changed, 1 insertion(+), 8 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c
> > index 0d53f6067422..bbd646c0dee7 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c
> > @@ -279,7 +279,7 @@ static void event_interrupt_wq_v11(struct kfd_dev
> *dev,
> >   {
> > uint16_t source_id, client_id, ring_id, pasid, vmid;
> > uint32_t context_id0, context_id1;
> > -   uint8_t sq_int_enc, sq_int_errtype, sq_int_priv;
> > +   uint8_t sq_int_enc, sq_int_errtype;
> > struct kfd_vm_fault_info info = {0};
> > struct kfd_hsa_memory_exception_data exception_data;
> >
> > @@ -348,13 +348,6 @@ static void event_interrupt_wq_v11(struct kfd_dev
> *dev,
> > break;
> > case SQ_INTERRUPT_WORD_ENCODING_INST:
> > print_sq_intr_info_inst(context_id0,
> context_id1);
> > -   sq_int_priv = REG_GET_FIELD(context_id0,
> > -
>   SQ_INTERRUPT_WORD_WAVE_CTXID0, PRIV);
> > -   /*if (sq_int_priv &&
> (kfd_set_dbg_ev_from_interrupt(dev, pasid,
> > -
>   KFD_CTXID0_DOORBELL_ID(context_id0),
> > -
>   KFD_CTXID0_TRAP_CODE(context_id0),
> > -   NULL, 0)))
> > -   return;*/
> > break;
> > case SQ_INTERRUPT_WORD_ENCODING_ERROR:
> > print_sq_intr_info_error(context_id0,
> context_id1);


RE: [PATCH 12/34] drm/amdgpu: add configurable grace period for unmap queues

2023-03-28 Thread Kim, Jonathan
[Public]

Thanks for catch Kent.
I'll fix up the typos with a follow-on.

Jon

> -Original Message-
> From: Russell, Kent 
> Sent: Tuesday, March 28, 2023 11:19 AM
> To: Kim, Jonathan ; amd-...@lists.freedesktop.org;
> dri-devel@lists.freedesktop.org
> Cc: Kuehling, Felix ; Kim, Jonathan
> 
> Subject: RE: [PATCH 12/34] drm/amdgpu: add configurable grace period for
> unmap queues
>
> [AMD Official Use Only - General]
>
> 3 tiny grammar/spelling things inline (not critical)
>
>  Kent
>
> > -Original Message-
> > From: amd-gfx  On Behalf Of
> > Jonathan Kim
> > Sent: Monday, March 27, 2023 2:43 PM
> > To: amd-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> > Cc: Kuehling, Felix ; Kim, Jonathan
> > 
> > Subject: [PATCH 12/34] drm/amdgpu: add configurable grace period for
> unmap
> > queues
> >
> > The HWS schedule allows a grace period for wave completion prior to
> > preemption for better performance by avoiding CWSR on waves that can
> > potentially complete quickly. The debugger, on the other hand, will
> > want to inspect wave status immediately after it actively triggers
> > preemption (a suspend function to be provided).
> >
> > To minimize latency between preemption and debugger wave inspection,
> allow
> > immediate preemption by setting the grace period to 0.
> >
> > Note that setting the preepmtion grace period to 0 will result in an
> > infinite grace period being set due to a CP FW bug so set it to 1 for now.
> >
> > v2: clarify purpose in the description of this patch
> >
> > Signed-off-by: Jonathan Kim 
> > Reviewed-by: Felix Kuehling 
> > ---
> >  .../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c  |  2 +
> >  .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c   |  2 +
> >  .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 43 
> >  .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.h|  6 ++
> >  .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c  |  2 +
> >  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 43 
> >  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.h |  9 ++-
> >  .../drm/amd/amdkfd/kfd_device_queue_manager.c | 62 +-
> >  .../drm/amd/amdkfd/kfd_device_queue_manager.h |  2 +
> >  .../gpu/drm/amd/amdkfd/kfd_packet_manager.c   | 32 +
> >  .../drm/amd/amdkfd/kfd_packet_manager_v9.c| 39 +++
> >  .../gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h   | 65
> +++
> >  drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  5 ++
> >  13 files changed, 291 insertions(+), 21 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
> > index a6f98141c29c..b811a0985050 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
> > @@ -82,5 +82,7 @@ const struct kfd2kgd_calls aldebaran_kfd2kgd = {
> > .get_cu_occupancy = kgd_gfx_v9_get_cu_occupancy,
> > .enable_debug_trap = kgd_aldebaran_enable_debug_trap,
> > .disable_debug_trap = kgd_aldebaran_disable_debug_trap,
> > +   .get_iq_wait_times = kgd_gfx_v9_get_iq_wait_times,
> > +   .build_grace_period_packet_info =
> > kgd_gfx_v9_build_grace_period_packet_info,
> > .program_trap_handler_settings =
> > kgd_gfx_v9_program_trap_handler_settings,
> >  };
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
> > index d2918e5c0dea..a62bd0068515 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
> > @@ -410,6 +410,8 @@ const struct kfd2kgd_calls arcturus_kfd2kgd = {
> >
> > kgd_gfx_v9_set_vm_context_page_table_base,
> > .enable_debug_trap = kgd_arcturus_enable_debug_trap,
> > .disable_debug_trap = kgd_arcturus_disable_debug_trap,
> > +   .get_iq_wait_times = kgd_gfx_v9_get_iq_wait_times,
> > +   .build_grace_period_packet_info =
> > kgd_gfx_v9_build_grace_period_packet_info,
> > .get_cu_occupancy = kgd_gfx_v9_get_cu_occupancy,
> > .program_trap_handler_settings =
> > kgd_gfx_v9_program_trap_handler_settings
> >  };
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
> > index 969015281510..605387e55d33 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
> > @@ -802,6 +802,4

RE: [PATCH 19/32] drm/amdkfd: add runtime enable operation

2023-03-23 Thread Kim, Jonathan
[AMD Official Use Only - General]

> -Original Message-
> From: Kuehling, Felix 
> Sent: Monday, March 20, 2023 8:31 PM
> To: Kim, Jonathan ; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> Subject: Re: [PATCH 19/32] drm/amdkfd: add runtime enable operation
>
>
> On 2023-01-25 14:53, Jonathan Kim wrote:
> > The debugger can attach to a process prior to HSA enablement (i.e.
> > inferior is spawned by the debugger and attached to immediately before
> > target process has been enabled for HSA dispatches) or it
> > can attach to a running target that is already HSA enabled.  Either
> > way, the debugger needs to know the enablement status to know when
> > it can inspect queues.
> >
> > For the scenario where the debugger spawns the target process,
> > it will have to wait for ROCr's runtime enable request from the target.
> > The runtime enable request will be able to see that its process has been
> > debug attached.  ROCr raises an EC_PROCESS_RUNTIME signal to the
> > debugger then blocks the target process while waiting the debugger's
> > response. Once the debugger has received the runtime signal, it will
> > unblock the target process.
> >
> > For the scenario where the debugger attaches to a running target
> > process, ROCr will set the target process' runtime status as enabled so
> > that on an attach request, the debugger will be able to see this
> > status and will continue with debug enablement as normal.
> >
> > A secondary requirement is to conditionally enable the trap tempories
> only
> > if the user requests it (env var HSA_ENABLE_DEBUG=1) or if the debugger
> > attaches with HSA runtime enabled.  This is because setting up the trap
> > temporaries incurs a performance overhead that is unacceptable for
> > microbench performance in normal mode for certain customers.
> >
> > In the scenario where the debugger spawns the target process, when ROCr
> > detects that the debugger has attached during the runtime enable
> > request, it will enable the trap temporaries before it blocks the target
> > process while waiting for the debugger to respond.
> >
> > In the scenario where the debugger attaches to a running target process,
> > it will enable to trap temporaries itself.
> >
> > Finally, there is an additional restriction that is required to be
> > enforced with runtime enable and HW debug mode setting. The debugger
> must
> > first ensure that HW debug mode has been enabled before permitting HW
> debug
> > mode operations.
> >
> > With single process debug devices, allowing the debugger to set debug
> > HW modes prior to trap activation means that debug HW mode setting can
> > occur before the KFD has reserved the debug VMID (0xf) from the hardware
> > scheduler's VMID allocation resource pool.  This can result in the
> > hardware scheduler assigning VMID 0xf to a non-debugged process and
> > having that process inherit debug HW mode settings intended for the
> > debugged target process instead, which is both incorrect and potentially
> > fatal for normal mode operation.
> >
> > With multi process debug devices, allowing the debugger to set debug
> > HW modes prior to trap activation means that non-debugged processes
> > migrating to a new VMID could inherit unintended debug settings.
> >
> > All debug operations that touch HW settings must require trap activation
> > where trap activation is triggered by both debug attach and runtime
> > enablement (target has KFD opened and is ready to dispatch work).
> >
> > v2: fix up hierarchy of semantics in description.
> >
> > Signed-off-by: Jonathan Kim 
> > ---
> >   drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 150
> ++-
> >   drivers/gpu/drm/amd/amdkfd/kfd_debug.c   |   6 +-
> >   drivers/gpu/drm/amd/amdkfd/kfd_debug.h   |   4 +
> >   drivers/gpu/drm/amd/amdkfd/kfd_priv.h|   1 +
> >   4 files changed, 157 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> > index 09fe8576dc8c..46f9d453dc5e 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> > @@ -2654,11 +2654,147 @@ static int kfd_ioctl_criu(struct file *filep,
> struct kfd_process *p, void *data)
> > return ret;
> >   }
> >
> > -static int kfd_ioctl_runtime_enable(struct file *filep, struct kfd_process 
> > *p,
> void *data)
> > +static int runtime_enable(struct kfd_process *p, uint64_t r_debug,
> 

RE: [PATCH 03/32] drm/amdkfd: prepare per-process debug enable and disable

2023-03-23 Thread Kim, Jonathan
[Public]

> -Original Message-
> From: Kuehling, Felix 
> Sent: Thursday, February 16, 2023 6:44 PM
> To: Kim, Jonathan ; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> Subject: Re: [PATCH 03/32] drm/amdkfd: prepare per-process debug enable
> and disable
>
>
> On 2023-01-25 14:53, Jonathan Kim wrote:
> > The ROCm debugger will attach to a process to debug by PTRACE and will
> > expect the KFD to prepare a process for the target PID, whether the
> > target PID has opened the KFD device or not.
> >
> > This patch is to explicity handle this requirement.  Further HW mode
> > setting and runtime coordination requirements will be handled in
> > following patches.
> >
> > In the case where the target process has not opened the KFD device,
> > a new KFD process must be created for the target PID.
> > The debugger as well as the target process for this case will have not
> > acquired any VMs so handle process restoration to correctly account for
> > this.
> >
> > To coordinate with HSA runtime, the debugger must be aware of the target
> > process' runtime enablement status and will copy the runtime status
> > information into the debugged KFD process for later query.
> >
> > On enablement, the debugger will subscribe to a set of exceptions where
> > each exception events will notify the debugger through a pollable FIFO
> > file descriptor that the debugger provides to the KFD to manage.
> > Some events will be synchronously raised while other are scheduled,
> > which is why a debug_event_workarea worker is initialized.
> >
> > Finally on process termination of either the debugger or the target,
> > debugging must be disabled if it has not been done so.
> >
> > v3: fix typo on debug trap disable and PTRACE ATTACH relax check.
> > remove unnecessary queue eviction counter reset when there's nothing
> > to evict.
> > change err code to EALREADY if attaching to an already attached process.
> > move debug disable to release worker to avoid race with disable from
> > ioctl call.
> >
> > v2: relax debug trap disable and PTRACE ATTACH requirement.
> >
> > Signed-off-by: Jonathan Kim
> > ---
> >   drivers/gpu/drm/amd/amdkfd/Makefile   |  3 +-
> >   drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 88 -
> >   drivers/gpu/drm/amd/amdkfd/kfd_debug.c| 94
> +++
> >   drivers/gpu/drm/amd/amdkfd/kfd_debug.h| 33 +++
> >   .../drm/amd/amdkfd/kfd_device_queue_manager.c | 22 -
> >   drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 34 ++-
> >   drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 63 +
> >   7 files changed, 308 insertions(+), 29 deletions(-)
> >   create mode 100644 drivers/gpu/drm/amd/amdkfd/kfd_debug.c
> >   create mode 100644 drivers/gpu/drm/amd/amdkfd/kfd_debug.h
> >
> > diff --git a/drivers/gpu/drm/amd/amdkfd/Makefile
> b/drivers/gpu/drm/amd/amdkfd/Makefile
> > index e758c2a24cd0..747754428073 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/Makefile
> > +++ b/drivers/gpu/drm/amd/amdkfd/Makefile
> > @@ -55,7 +55,8 @@ AMDKFD_FILES  := $(AMDKFD_PATH)/kfd_module.o \
> > $(AMDKFD_PATH)/kfd_int_process_v9.o \
> > $(AMDKFD_PATH)/kfd_int_process_v11.o \
> > $(AMDKFD_PATH)/kfd_smi_events.o \
> > -   $(AMDKFD_PATH)/kfd_crat.o
> > +   $(AMDKFD_PATH)/kfd_crat.o \
> > +   $(AMDKFD_PATH)/kfd_debug.o
> >
> >   ifneq ($(CONFIG_AMD_IOMMU_V2),)
> >   AMDKFD_FILES += $(AMDKFD_PATH)/kfd_iommu.o
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> > index d3b019e64093..ee05c2e54ef6 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> > @@ -44,6 +44,7 @@
> >   #include "amdgpu_amdkfd.h"
> >   #include "kfd_smi_events.h"
> >   #include "amdgpu_dma_buf.h"
> > +#include "kfd_debug.h"
> >
> >   static long kfd_ioctl(struct file *, unsigned int, unsigned long);
> >   static int kfd_open(struct inode *, struct file *);
> > @@ -142,10 +143,15 @@ static int kfd_open(struct inode *inode, struct
> file *filep)
> > return -EPERM;
> > }
> >
> > -   process = kfd_create_process(filep);
> > +   process = kfd_create_process(current);
> > if (IS_ERR(process))
> > return PTR_ERR(process);
> >
> > +   if (kfd_process_init_cwsr_apu(process, filep)) {
> > + 

RE: [PATCH 15/32] drm/amdkfd: prepare trap workaround for gfx11

2023-03-23 Thread Kim, Jonathan
[Public]

> -Original Message-
> From: Kuehling, Felix 
> Sent: Monday, March 20, 2023 5:50 PM
> To: Kim, Jonathan ; amd-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> Subject: Re: [PATCH 15/32] drm/amdkfd: prepare trap workaround for gfx11
>
>
> On 2023-01-25 14:53, Jonathan Kim wrote:
> > Due to a HW bug, waves in only half the shader arrays can enter trap.
> >
> > When starting a debug session, relocate all waves to the first shader
> > array of each shader engine and mask off the 2nd shader array as
> > unavailable.
> >
> > When ending a debug session, re-enable the 2nd shader array per
> > shader engine.
> >
> > User CU masking per queue cannot be guaranteed to remain functional
> > if requested during debugging (e.g. user cu mask requests only 2nd shader
> > array as an available resource leading to zero HW resources available)
> > nor can runtime be alerted of any of these changes during execution.
> >
> > Make user CU masking and debugging mutual exclusive with respect to
> > availability.
> >
> > If the debugger tries to attach to a process with a user cu masked
> > queue, return the runtime status as enabled but busy.
> >
> > If the debugger tries to attach and fails to reallocate queue waves to
> > the first shader array of each shader engine, return the runtime status
> > as enabled but with an error.
> >
> > In addition, like any other mutli-process debug supported devices,
> > disable trap temporary setup per-process to avoid performance impact
> from
> > setup overhead.
> >
> > Signed-off-by: Jonathan Kim 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h   |  2 +
> >   drivers/gpu/drm/amd/amdgpu/mes_v11_0.c|  7 +-
> >   drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |  2 -
> >   drivers/gpu/drm/amd/amdkfd/kfd_debug.c| 64
> +++
> >   drivers/gpu/drm/amd/amdkfd/kfd_debug.h|  3 +-
> >   .../drm/amd/amdkfd/kfd_device_queue_manager.c |  7 ++
> >   .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  |  3 +-
> >   .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c  |  3 +-
> >   .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c  | 42 
> >   .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   |  3 +-
> >   .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   |  3 +-
> >   drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  5 +-
> >   .../amd/amdkfd/kfd_process_queue_manager.c|  9 ++-
> >   13 files changed, 124 insertions(+), 29 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
> > index d20df0cf0d88..b5f5eed2b5ef 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
> > @@ -219,6 +219,8 @@ struct mes_add_queue_input {
> > uint32_tgws_size;
> > uint64_ttba_addr;
> > uint64_ttma_addr;
> > +   uint32_ttrap_en;
> > +   uint32_tskip_process_ctx_clear;
> > uint32_tis_kfd_process;
> > uint32_tis_aql_queue;
> > uint32_tqueue_size;
> > diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > index fbacdc42efac..38c7a0cbf264 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
> > @@ -197,17 +197,14 @@ static int mes_v11_0_add_hw_queue(struct
> amdgpu_mes *mes,
> > mes_add_queue_pkt.gws_size = input->gws_size;
> > mes_add_queue_pkt.trap_handler_addr = input->tba_addr;
> > mes_add_queue_pkt.tma_addr = input->tma_addr;
> > +   mes_add_queue_pkt.trap_en = input->trap_en;
> > +   mes_add_queue_pkt.skip_process_ctx_clear = input-
> >skip_process_ctx_clear;
> > mes_add_queue_pkt.is_kfd_process = input->is_kfd_process;
> >
> > /* For KFD, gds_size is re-used for queue size (needed in MES for AQL
> queues) */
> > mes_add_queue_pkt.is_aql_queue = input->is_aql_queue;
> > mes_add_queue_pkt.gds_size = input->queue_size;
> >
> > -   if (!(((adev->mes.sched_version & AMDGPU_MES_VERSION_MASK) >=
> 4) &&
> > - (adev->ip_versions[GC_HWIP][0] >= IP_VERSION(11, 0, 0))
> &&
> > - (adev->ip_versions[GC_HWIP][0] <= IP_VERSION(11, 0, 3
> > -   mes_add_queue_pkt.trap_en = 1;
> > -
> > /* For KFD, gds_size is re-used for queue size (needed in MES for AQL
> queues) */
> > mes_add_queue_pkt.is_aql_queue = i

RE: [PATCH 01/44] drm/amdgpu: replace per_device_list by array

2021-03-23 Thread Kim, Jonathan
[AMD Official Use Only - Internal Distribution Only]

> -Original Message-
> From: amd-gfx  On Behalf Of Felix
> Kuehling
> Sent: Monday, March 22, 2021 6:58 AM
> To: dri-devel@lists.freedesktop.org; amd-...@lists.freedesktop.org
> Cc: Sierra Guiza, Alejandro (Alex) 
> Subject: [PATCH 01/44] drm/amdgpu: replace per_device_list by array
>
> [CAUTION: External Email]
>
> From: Alex Sierra 
>
> Remove per_device_list from kfd_process and replace it with a
> kfd_process_device pointers array of MAX_GPU_INSTANCES size. This helps
> to manage the kfd_process_devices binded to a specific kfd_process.
> Also, functions used by kfd_chardev to iterate over the list were removed,
> since they are not valid anymore. Instead, it was replaced by a local loop
> iterating the array.
>
> Signed-off-by: Alex Sierra 
> Signed-off-by: Felix Kuehling 

As discussed, this patch is required to sync internal branches for the KFD and 
is
Reviewed-by: Jonathan Kim 

> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 116 --
>  drivers/gpu/drm/amd/amdkfd/kfd_iommu.c|   8 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  20 +--
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 108 
>  .../amd/amdkfd/kfd_process_queue_manager.c|   6 +-
>  5 files changed, 111 insertions(+), 147 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 6802c616e10e..43de260b2230 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -870,52 +870,47 @@ static int kfd_ioctl_get_process_apertures(struct
> file *filp,  {
> struct kfd_ioctl_get_process_apertures_args *args = data;
> struct kfd_process_device_apertures *pAperture;
> -   struct kfd_process_device *pdd;
> +   int i;
>
> dev_dbg(kfd_device, "get apertures for PASID 0x%x", p->pasid);
>
> args->num_of_nodes = 0;
>
> mutex_lock(>mutex);
> +   /* Run over all pdd of the process */
> +   for (i = 0; i < p->n_pdds; i++) {
> +   struct kfd_process_device *pdd = p->pdds[i];
> +
> +   pAperture =
> +   >process_apertures[args->num_of_nodes];
> +   pAperture->gpu_id = pdd->dev->id;
> +   pAperture->lds_base = pdd->lds_base;
> +   pAperture->lds_limit = pdd->lds_limit;
> +   pAperture->gpuvm_base = pdd->gpuvm_base;
> +   pAperture->gpuvm_limit = pdd->gpuvm_limit;
> +   pAperture->scratch_base = pdd->scratch_base;
> +   pAperture->scratch_limit = pdd->scratch_limit;
>
> -   /*if the process-device list isn't empty*/
> -   if (kfd_has_process_device_data(p)) {
> -   /* Run over all pdd of the process */
> -   pdd = kfd_get_first_process_device_data(p);
> -   do {
> -   pAperture =
> -   >process_apertures[args->num_of_nodes];
> -   pAperture->gpu_id = pdd->dev->id;
> -   pAperture->lds_base = pdd->lds_base;
> -   pAperture->lds_limit = pdd->lds_limit;
> -   pAperture->gpuvm_base = pdd->gpuvm_base;
> -   pAperture->gpuvm_limit = pdd->gpuvm_limit;
> -   pAperture->scratch_base = pdd->scratch_base;
> -   pAperture->scratch_limit = pdd->scratch_limit;
> -
> -   dev_dbg(kfd_device,
> -   "node id %u\n", args->num_of_nodes);
> -   dev_dbg(kfd_device,
> -   "gpu id %u\n", pdd->dev->id);
> -   dev_dbg(kfd_device,
> -   "lds_base %llX\n", pdd->lds_base);
> -   dev_dbg(kfd_device,
> -   "lds_limit %llX\n", pdd->lds_limit);
> -   dev_dbg(kfd_device,
> -   "gpuvm_base %llX\n", pdd->gpuvm_base);
> -   dev_dbg(kfd_device,
> -   "gpuvm_limit %llX\n", pdd->gpuvm_limit);
> -   dev_dbg(kfd_device,
> -   "scratch_base %llX\n", pdd->scratch_base);
> -   dev_dbg(kfd_device,
> -   "scratch_limit %llX\n", pdd->scratch_limit);
> -
> -   args->num_of_nodes++;
> -
> -   pdd = kfd_get_next_process_device_data(p, pdd);
> -   } while (pdd && (args->num_of_nodes <
> NUM_OF_SUPPORTED_GPUS));
> -   }
> +   dev_dbg(kfd_device,
> +   "node id %u\n", args->num_of_nodes);
> +   dev_dbg(kfd_device,
> +   "gpu id %u\n", pdd->dev->id);
> +   dev_dbg(kfd_device,
> +   "lds_base %llX\n", pdd->lds_base);
> +