On 2025-09-25 22:49, Zhu, Lingshan wrote:
On 9/25/2025 5:50 AM, Kuehling, Felix wrote:
On 2025-09-23 03:26, Zhu Lingshan wrote:
The user space program pass down a pid to kfd
through set_debug_trap ioctl, which can help
find the corresponding user space program and
its mm struct.

However, these information is insufficient to
locate a specific kfd process among the
multiple kfd_process that belong to the
same user space program.

For correctness and backward compatibilities,
this commit only allow set_debut_trap ioctl
work for a user space program which does not
own any secondary kfd_process.

What happens if a secondary context is created after the debugger attaches to a process?

It may be simpler and more consistent to allow debugging of the primary context, even if secondary contexts exist. Just ignore any secondary contexts for the purpose of the debug API.
Do you mean reject any set_debug_trap ioctls on secondary contexts, and only 
allow
debugging on the primary context, just like how we handle SVM and user ptr?

Yes, that is what I meant.

Regards,
  Felix



Thanks
Lingshan

Regards,
  Felix



Signed-off-by: Zhu Lingshan <[email protected]>
---
  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 29 ++++++++++++++++++++++++
  1 file changed, 29 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 662b181f1fd2..2df095e25c2e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -36,6 +36,7 @@
  #include <linux/ptrace.h>
  #include <linux/dma-buf.h>
  #include <linux/processor.h>
+#include <linux/fdtable.h>
  #include "kfd_priv.h"
  #include "kfd_device_queue_manager.h"
  #include "kfd_svm.h"
@@ -2918,6 +2919,25 @@ static int kfd_ioctl_runtime_enable(struct file *filep, struct kfd_process *p, v
      return r;
  }
  +static int kfd_process_count_cb(const void *num, struct file *filep, unsigned int n)
+{
+    u16 *ret = (u16 *)num;
+
+    if (filep->f_op == &kfd_fops)
+        (*ret)++;
+
+    return 0;
+}
+
+static u16 kfd_process_count(struct task_struct *thread)
+{
+    u16 count = 0;
+
+    iterate_fd(thread->files, 0, kfd_process_count_cb, (const void *)&count);
+
+    return count;
+}
+
  static int kfd_ioctl_set_debug_trap(struct file *filep, struct kfd_process *p, void *data)
  {
      struct kfd_ioctl_dbg_trap_args *args = data;
@@ -2927,6 +2947,7 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, struct kfd_process *p, v
      struct kfd_process *target = NULL;
      struct kfd_process_device *pdd = NULL;
      int r = 0;
+    u16 process_count = 0;
        if (sched_policy == KFD_SCHED_POLICY_NO_HWS) {
          pr_err("Debugging does not support sched_policy %i", sched_policy); @@ -2984,6 +3005,14 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, struct kfd_process *p, v
      if (r)
          goto out;
  +    process_count = kfd_process_count(thread);
+    /* The user space program has secondary contexts */
+    if (process_count > 1) {
+        pr_err("Detect multiple kfd_process owned by PID %i, please consider close any secondary kfd_process and try again\n", args->pid);
+        r = -EINVAL;
+        goto out;
+    }
+
      mutex_lock(&target->mutex);
        if (args->op != KFD_IOC_DBG_TRAP_ENABLE && !target->debug_trap_enabled) {

Reply via email to