date:20230525

RE: [PATCH] drm/amdgpu: add a flag to indicate if a VM is attached to fpriv

2023-05-25 Thread Chen, Guchun

[Public]

Ping..

> -----Original Message-----
> From: Chen, Guchun 
> Sent: Wednesday, May 24, 2023 5:23 PM
> To: amd-gfx@lists.freedesktop.org; Deucher, Alexander
> ; Zhang, Hawking
> ; Lazar, Lijo ; Yang, Philip
> ; Kuehling, Felix 
> Cc: Chen, Guchun 
> Subject: [PATCH] drm/amdgpu: add a flag to indicate if a VM is attached to
> fpriv
>
> Recent code stores xcp_id in the amdgpu bo, either to account memory usage
> or to find the correct KFD node, and this xcp_id comes from the file private
> data set up when the device is opened. However, not all VMs are attached to
> this fpriv structure; the VM in amdgpu_mes_self_test is one such case.
> So add a flag to differentiate the cases. Otherwise, KASAN complains about
> an out-of-bounds access.
>
> [   77.292314] BUG: KASAN: slab-out-of-bounds in
> amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
> [   77.293845] Read of size 4 at addr 888102c48a48 by task
> modprobe/1069
> [   77.294146] Call Trace:
> [   77.294178]  
> [   77.294208]  dump_stack_lvl+0x49/0x63
> [   77.294260]  print_report+0x16f/0x4a6
> [   77.294307]  ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
> [   77.295979]  ? kasan_complete_mode_report_info+0x3c/0x200
> [   77.296057]  ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
> [   77.297556]  kasan_report+0xb4/0x130
> [   77.297609]  ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
> [   77.299202]  __asan_load4+0x6f/0x90
> [   77.299272]  amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
> [   77.300796]  ? amdgpu_init+0x6e/0x1000 [amdgpu]
> [   77.30]  ? amdgpu_vm_pt_clear+0x750/0x750 [amdgpu]
> [   77.303721]  ? preempt_count_sub+0x18/0xc0
> [   77.303786]  amdgpu_vm_init+0x39e/0x870 [amdgpu]
> [   77.305186]  ? amdgpu_vm_wait_idle+0x90/0x90 [amdgpu]
> [   77.306683]  ? kasan_set_track+0x25/0x30
> [   77.306737]  ? kasan_save_alloc_info+0x1b/0x30
> [   77.306795]  ? __kasan_kmalloc+0x87/0xa0
> [   77.306852]  amdgpu_mes_self_test+0x169/0x620 [amdgpu]
>
> Fixes: ffc6deb773f7("drm/amdkfd: Store xcp partition id to amdgpu bo")
> Signed-off-by: Guchun Chen 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c   |  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c   |  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c|  5 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h|  5 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c | 12 +---
>  5 files changed, 19 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> index 41d047e5de69..79b80f9233db 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> @@ -1229,7 +1229,7 @@ int amdgpu_driver_open_kms(struct drm_device
> *dev, struct drm_file *file_priv)
>   pasid = 0;
>   }
>
> - r = amdgpu_vm_init(adev, &fpriv->vm);
> + r = amdgpu_vm_init(adev, &fpriv->vm, true);
>   if (r)
>   goto error_pasid;
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
> index 49bb6c03d606..3be5219edf88 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
> @@ -1345,7 +1345,7 @@ int amdgpu_mes_self_test(struct amdgpu_device
> *adev)
>   goto error_pasid;
>   }
>
> - r = amdgpu_vm_init(adev, vm);
> + r = amdgpu_vm_init(adev, vm, false);
>   if (r) {
>   DRM_ERROR("failed to initialize vm\n");
>   goto error_pasid;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 37b9d8a8dbec..47ffaa1526a0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -2099,13 +2099,15 @@ long amdgpu_vm_wait_idle(struct amdgpu_vm
> *vm, long timeout)
>   *
>   * @adev: amdgpu_device pointer
>   * @vm: requested vm
> + * @vm_attach_to_fpriv: flag to tell if vm is attached to file private
> + data
>   *
>   * Init @vm fields.
>   *
>   * Returns:
>   * 0 for success, error for failure.
>   */
> -int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm)
> +int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
> +bool vm_attach_to_fpriv)
>  {
>   struct amdgpu_bo *root_bo;
>   struct amdgpu_bo_vm *root;
> @@ -2131,6 +2133,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev,
> struct amdgpu_vm *vm)
>
>   vm->pte_support_ats = false;
>   vm->is_compute_context = false;
> + vm->vm_attach_to_fpriv = vm_attach_to_fpriv;
>
>   vm->use_cpu_for_update = !!(adev-
> >vm_manager.vm_update_mode &
>   AMDGPU_VM_USE_CPU_FOR_GFX);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index d551fca1780e..62ed14b1fc16 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -333,6 +333,9 @@ struct amdgpu_vm {
>   /* Flag to indicate if VM is used for compute */
>   boolis_compute_context;
>
> +
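
The idea behind the flag can be illustrated with a small userspace sketch. It assumes the partition id is reached by walking back from the VM to an enclosing file-private structure; the struct and function names below (file_priv, vm_xcp_id, XCP_ID_NONE) are hypothetical stand-ins, not the amdgpu definitions. A VM that is not embedded in such a structure must not be passed to container_of(), so the flag is checked first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver structures discussed above. */
struct vm {
	bool attached_to_fpriv; /* set once at init time */
};

struct file_priv {
	struct vm vm;    /* VM embedded in the per-file data */
	uint32_t xcp_id; /* partition id stored next to it */
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define XCP_ID_NONE UINT32_MAX /* assumed sentinel for "no partition" */

static void vm_init(struct vm *vm, bool attached_to_fpriv)
{
	vm->attached_to_fpriv = attached_to_fpriv;
}

/* Only walk back to the enclosing file_priv when the VM really lives there. */
static uint32_t vm_xcp_id(struct vm *vm)
{
	if (!vm->attached_to_fpriv)
		return XCP_ID_NONE;
	return container_of(vm, struct file_priv, vm)->xcp_id;
}
```

Without the check, the standalone VM case would read memory in front of or behind its own allocation, which is the kind of access KASAN reports.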

RE: [PATCH] drm/amdkfd: remove unused function get_reserved_sdma_queues_bitmap

2023-05-25 Thread Joshi, Mukul

[AMD Official Use Only - General]

> -----Original Message-----
> From: Kuehling, Felix 
> Sent: Thursday, May 25, 2023 5:10 PM
> To: Tom Rix ; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ;
> airl...@gmail.com; dan...@ffwll.ch; nat...@kernel.org;
> ndesaulni...@google.com; Joshi, Mukul 
> Cc: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org; linux-
> ker...@vger.kernel.org; l...@lists.linux.dev
> Subject: Re: [PATCH] drm/amdkfd: remove unused function
> get_reserved_sdma_queues_bitmap
>
> [+Mukul]
>
> Looks like this problem was introduced by Mukul's patch "drm/amdkfd:
> Update SDMA queue management for GFX9.4.3". Could this be a merge
> error between GFX 9.4.3 and GFX11 branches? I think the
> reserved_sdma_queues_bitmap was introduced after the 9.4.3 branch was
> created. Mukul, you worked on both, so you're probably in the best position
> to resolve this.
>

Yes, my patch introduced this regression. We need the
get_reserved_sdma_queues_bitmap function.
I will fix this regression and send out a new patch.

Thanks for noticing/catching this.

Regards,
Mukul

> Regards,
>Felix
>
>
> On 2023-05-25 16:07, Tom Rix wrote:
> > clang with W=1 reports
> > drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager.c:122:24: error:
> >    unused function 'get_reserved_sdma_queues_bitmap' [-Werror,-Wunused-function]
> > static inline uint64_t get_reserved_sdma_queues_bitmap(struct device_queue_manager *dqm)
> > ^
> > This function is not used so remove it.
> >
> > Signed-off-by: Tom Rix 
> > ---
> >   drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 5 -
> >   1 file changed, 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> > b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> > index 493b4b66f180..2fbd0a96424f 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> > @@ -119,11 +119,6 @@ unsigned int get_num_xgmi_sdma_queues(struct device_queue_manager *dqm)
> > dqm->dev->kfd->device_info.num_sdma_queues_per_engine;
> >   }
> >
> > -static inline uint64_t get_reserved_sdma_queues_bitmap(struct device_queue_manager *dqm)
> > -{
> > -   return dqm->dev->kfd->device_info.reserved_sdma_queues_bitmap;
> > -}
> > -
> >   static void init_sdma_bitmaps(struct device_queue_manager *dqm)
> >   {
> > bitmap_zero(dqm->sdma_bitmap, KFD_MAX_SDMA_QUEUES);

[PATCH 1/2] drm/amdgpu: Modify indirect buffer packages for resubmission

2023-05-25 Thread jiadong.zhu

From: Jiadong Zhu 

When a preempted IB frame is resubmitted to the CP, we need to modify the
frame data, including:
1. set PRE_RESUME to 1 in CONTEXT_CONTROL.
2. use the meta data (DE and CE) read from the CSA in WRITE_DATA.

Add functions to save the locations the first time the IBs are emitted, and
callbacks to patch the packages when resubmission happens.

Signed-off-by: Jiadong Zhu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 18 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  9 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c | 60 
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.h | 15 +
 4 files changed, 102 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 7429b20257a6..12ba863e69f4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -692,3 +692,21 @@ void amdgpu_ring_ib_end(struct amdgpu_ring *ring)
if (ring->is_sw_ring)
amdgpu_sw_ring_ib_end(ring);
 }
+
+void amdgpu_ring_ib_on_emit_cntl(struct amdgpu_ring *ring)
+{
+   if (ring->is_sw_ring)
+   amdgpu_sw_ring_ib_mark_offset(ring, AMDGPU_MUX_OFFSET_TYPE_CONTROL);
+}
+
+void amdgpu_ring_ib_on_emit_ce(struct amdgpu_ring *ring)
+{
+   if (ring->is_sw_ring)
+   amdgpu_sw_ring_ib_mark_offset(ring, AMDGPU_MUX_OFFSET_TYPE_CE);
+}
+
+void amdgpu_ring_ib_on_emit_de(struct amdgpu_ring *ring)
+{
+   if (ring->is_sw_ring)
+   amdgpu_sw_ring_ib_mark_offset(ring, AMDGPU_MUX_OFFSET_TYPE_DE);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index baa03527bf8b..702ce55b962a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -229,6 +229,9 @@ struct amdgpu_ring_funcs {
int (*preempt_ib)(struct amdgpu_ring *ring);
void (*emit_mem_sync)(struct amdgpu_ring *ring);
void (*emit_wave_limit)(struct amdgpu_ring *ring, bool enable);
+   void (*patch_cntl)(struct amdgpu_ring *ring, unsigned offset);
+   void (*patch_ce)(struct amdgpu_ring *ring, unsigned offset);
+   void (*patch_de)(struct amdgpu_ring *ring, unsigned offset);
 };
 
 struct amdgpu_ring {
@@ -323,11 +326,17 @@ struct amdgpu_ring {
 #define amdgpu_ring_init_cond_exec(r) (r)->funcs->init_cond_exec((r))
 #define amdgpu_ring_patch_cond_exec(r,o) (r)->funcs->patch_cond_exec((r),(o))
 #define amdgpu_ring_preempt_ib(r) (r)->funcs->preempt_ib(r)
+#define amdgpu_ring_patch_cntl(r, o) ((r)->funcs->patch_cntl((r), (o)))
+#define amdgpu_ring_patch_ce(r, o) ((r)->funcs->patch_ce((r), (o)))
+#define amdgpu_ring_patch_de(r, o) ((r)->funcs->patch_de((r), (o)))
 
 unsigned int amdgpu_ring_max_ibs(enum amdgpu_ring_type type);
 int amdgpu_ring_alloc(struct amdgpu_ring *ring, unsigned ndw);
 void amdgpu_ring_ib_begin(struct amdgpu_ring *ring);
 void amdgpu_ring_ib_end(struct amdgpu_ring *ring);
+void amdgpu_ring_ib_on_emit_cntl(struct amdgpu_ring *ring);
+void amdgpu_ring_ib_on_emit_ce(struct amdgpu_ring *ring);
+void amdgpu_ring_ib_on_emit_de(struct amdgpu_ring *ring);
 
 void amdgpu_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count);
 void amdgpu_ring_generic_pad_ib(struct amdgpu_ring *ring, struct amdgpu_ib 
*ib);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c
index 62079f0e3ee8..73516abef662 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c
@@ -105,6 +105,16 @@ static void amdgpu_mux_resubmit_chunks(struct amdgpu_ring_mux *mux)
                amdgpu_fence_update_start_timestamp(e->ring,
                                                    chunk->sync_seq,
                                                    ktime_get());
+               if (chunk->sync_seq ==
+                   le32_to_cpu(*(e->ring->fence_drv.cpu_addr + 2))) {
+                       if (chunk->cntl_offset <= e->ring->buf_mask)
+                               amdgpu_ring_patch_cntl(e->ring,
+                                                      chunk->cntl_offset);
+                       if (chunk->ce_offset <= e->ring->buf_mask)
+                               amdgpu_ring_patch_ce(e->ring, chunk->ce_offset);
+                       if (chunk->de_offset <= e->ring->buf_mask)
+                               amdgpu_ring_patch_de(e->ring, chunk->de_offset);
+               }
                amdgpu_ring_mux_copy_pkt_from_sw_ring(mux, e->ring,
                                                      chunk->start,
                                                      chunk->end);
@@ -407,6 +417,17 @@ void amdgpu_sw_ring_ib_end(struct amdgpu_
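
The mark-then-patch scheme described in the commit message can be sketched in userspace as follows. This is only an illustration under stated assumptions: the ring size, the OFFSET_UNSET marker, the PRE_RESUME_BIT value and all function names are hypothetical, not the amdgpu ones. The slot of the control dword is remembered when it is emitted, and the saved slot is patched later only if it lies inside the ring.

```c
#include <assert.h>
#include <stdint.h>

#define RING_DWORDS 8u            /* power of two, hypothetical size */
#define OFFSET_UNSET UINT32_MAX   /* assumed "nothing recorded" marker */
#define PRE_RESUME_BIT (1u << 30) /* illustrative bit, not the real field */

struct ring {
	uint32_t buf[RING_DWORDS];
	uint32_t wptr;        /* next dword to write */
	uint32_t buf_mask;    /* RING_DWORDS - 1 */
	uint32_t cntl_offset; /* where the control dword was emitted */
};

static void ring_init(struct ring *r)
{
	r->wptr = 0;
	r->buf_mask = RING_DWORDS - 1;
	r->cntl_offset = OFFSET_UNSET;
}

static void ring_write(struct ring *r, uint32_t v)
{
	r->buf[r->wptr & r->buf_mask] = v;
	r->wptr++;
}

/* Called right before the control dword is written: remember its slot. */
static void ring_mark_cntl(struct ring *r)
{
	r->cntl_offset = r->wptr & r->buf_mask;
}

/* On resubmission: patch the saved slot only if one was recorded. */
static int ring_patch_cntl(struct ring *r)
{
	if (r->cntl_offset > r->buf_mask)
		return 0;
	r->buf[r->cntl_offset] |= PRE_RESUME_BIT;
	return 1;
}
```

The comparison against buf_mask plays the same role as the `<= e->ring->buf_mask` checks in the hunk above: an offset that was never recorded is out of range and is skipped.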

[PATCH 2/2] drm/amdgpu: Implement gfx9 patch functions for resubmission

2023-05-25 Thread jiadong.zhu

From: Jiadong Zhu 

Patch the packages including CONTEXT_CONTROL and WRITE_DATA for gfx9
during the resubmission scenario.

Signed-off-by: Jiadong Zhu 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 80 +++
 1 file changed, 80 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index cbcf6126cce5..4fbeb9b5752c 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -5172,9 +5172,83 @@ static void gfx_v9_0_ring_emit_ib_gfx(struct amdgpu_ring 
*ring,
 #endif
lower_32_bits(ib->gpu_addr));
amdgpu_ring_write(ring, upper_32_bits(ib->gpu_addr));
+   amdgpu_ring_ib_on_emit_cntl(ring);
amdgpu_ring_write(ring, control);
 }
 
+static void gfx_v9_0_ring_patch_cntl(struct amdgpu_ring *ring,
+unsigned offset)
+{
+   u32 control = ring->ring[offset];
+
+   control |= INDIRECT_BUFFER_PRE_RESUME(1);
+   ring->ring[offset] = control;
+}
+
+static void gfx_v9_0_ring_patch_ce_meta(struct amdgpu_ring *ring,
+   unsigned offset)
+{
+   struct amdgpu_device *adev = ring->adev;
+   void *ce_payload_cpu_addr;
+   uint64_t payload_offset, payload_size;
+
+   payload_size = sizeof(struct v9_ce_ib_state);
+
+   if (ring->is_mes_queue) {
+   payload_offset = offsetof(struct amdgpu_mes_ctx_meta_data,
+ gfx[0].gfx_meta_data) +
+   offsetof(struct v9_gfx_meta_data, ce_payload);
+   ce_payload_cpu_addr =
+   amdgpu_mes_ctx_get_offs_cpu_addr(ring, payload_offset);
+   } else {
+   payload_offset = offsetof(struct v9_gfx_meta_data, ce_payload);
+   ce_payload_cpu_addr = adev->virt.csa_cpu_addr + payload_offset;
+   }
+
+   if (offset + (payload_size >> 2) <= ring->buf_mask + 1) {
+   memcpy((void *)&ring->ring[offset], ce_payload_cpu_addr, payload_size);
+   } else {
+   memcpy((void *)&ring->ring[offset], ce_payload_cpu_addr,
+  (ring->buf_mask + 1 - offset) << 2);
+   payload_size -= (ring->buf_mask + 1 - offset) << 2;
+   memcpy((void *)&ring->ring[0],
+  ce_payload_cpu_addr + ((ring->buf_mask + 1 - offset) << 2),
+  payload_size);
+   }
+}
+
+static void gfx_v9_0_ring_patch_de_meta(struct amdgpu_ring *ring,
+   unsigned offset)
+{
+   struct amdgpu_device *adev = ring->adev;
+   void *de_payload_cpu_addr;
+   uint64_t payload_offset, payload_size;
+
+   payload_size = sizeof(struct v9_de_ib_state);
+
+   if (ring->is_mes_queue) {
+   payload_offset = offsetof(struct amdgpu_mes_ctx_meta_data,
+ gfx[0].gfx_meta_data) +
+   offsetof(struct v9_gfx_meta_data, de_payload);
+   de_payload_cpu_addr =
+   amdgpu_mes_ctx_get_offs_cpu_addr(ring, payload_offset);
+   } else {
+   payload_offset = offsetof(struct v9_gfx_meta_data, de_payload);
+   de_payload_cpu_addr = adev->virt.csa_cpu_addr + payload_offset;
+   }
+
+   if (offset + (payload_size >> 2) <= ring->buf_mask + 1) {
+   memcpy((void *)&ring->ring[offset], de_payload_cpu_addr, payload_size);
+   } else {
+   memcpy((void *)&ring->ring[offset], de_payload_cpu_addr,
+  (ring->buf_mask + 1 - offset) << 2);
+   payload_size -= (ring->buf_mask + 1 - offset) << 2;
+   memcpy((void *)&ring->ring[0],
+  de_payload_cpu_addr + ((ring->buf_mask + 1 - offset) << 2),
+  payload_size);
+   }
+}
+
 static void gfx_v9_0_ring_emit_ib_compute(struct amdgpu_ring *ring,
  struct amdgpu_job *job,
  struct amdgpu_ib *ib,
@@ -5370,6 +5444,8 @@ static void gfx_v9_0_ring_emit_ce_meta(struct amdgpu_ring 
*ring, bool resume)
amdgpu_ring_write(ring, lower_32_bits(ce_payload_gpu_addr));
amdgpu_ring_write(ring, upper_32_bits(ce_payload_gpu_addr));
 
+   amdgpu_ring_ib_on_emit_ce(ring);
+
if (resume)
amdgpu_ring_write_multiple(ring, ce_payload_cpu_addr,
   sizeof(ce_payload) >> 2);
@@ -5481,6 +5557,7 @@ static void gfx_v9_0_ring_emit_de_meta(struct amdgpu_ring 
*ring, bool resume, bo
amdgpu_ring_write(ring, lower_32_bits(de_payload_gpu_addr));
amdgpu_ring_write(ring, upper_32_bits(de_payload_gpu_addr));
 
+   amdgpu_ring_ib_on_emit_de(ring);
if (resume)
amdgpu_ring_write_multiple(ring, de_payload_cpu_addr,
   sizeof(de_payload) >> 2);
@@ -6891,6 +6968,9 @@ stat
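
Both patch functions above copy a payload into the ring and split the copy in two when it would run past the end of the buffer. A minimal standalone sketch of that wrap-around copy is shown below; the helper name ring_copy_wrapped and its parameters are assumptions made for illustration, with ring_dwords standing in for buf_mask + 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy size_bytes from src into a dword ring starting at dword index offset,
 * wrapping to index 0 when the end of the ring is reached. size_bytes is
 * assumed to be a multiple of 4 and no larger than the ring itself.
 */
static void ring_copy_wrapped(uint32_t *ring, size_t ring_dwords,
			      size_t offset, const void *src, size_t size_bytes)
{
	const uint8_t *bytes = src;
	size_t tail_bytes;

	assert(offset < ring_dwords);
	assert(size_bytes % 4 == 0 && size_bytes / 4 <= ring_dwords);

	tail_bytes = (ring_dwords - offset) * 4; /* room before the wrap */
	if (size_bytes <= tail_bytes) {
		memcpy(&ring[offset], bytes, size_bytes);
	} else {
		/* First part fills the tail, the rest starts again at index 0. */
		memcpy(&ring[offset], bytes, tail_bytes);
		memcpy(&ring[0], bytes + tail_bytes, size_bytes - tail_bytes);
	}
}
```

A shared helper of this shape might also remove the duplication between the CE and DE variants, though that is a suggestion rather than something the patch does.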

Re: [PATCH] drm/amdkfd: fix gfx_target_version for certain 11.0.3 devices

2023-05-25 Thread Felix Kuehling


On 2023-05-25 16:12, Alex Deucher wrote:

Certain boards with GC IP 11.0.3 need slightly different handling
in the shader compiler due to board specific bounding box
optimizations.

Signed-off-by: Alex Deucher 


Acked-by: Felix Kuehling 



---
  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 11 +--
  1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 862a50f7b490..ebc3c3f965f9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -411,8 +411,15 @@ struct kfd_dev *kgd2kfd_probe(struct amdgpu_device *adev, 
bool vf)
f2g = &gfx_v11_kfd2kgd;
break;
case IP_VERSION(11, 0, 3):
-   /* Note: Compiler version is 11.0.1 while HW version is 11.0.3 */
-   gfx_target_version = 110001;
+   if ((adev->pdev->device == 0x7460 &&
+adev->pdev->revision == 0x00) ||
+   (adev->pdev->device == 0x7461 &&
+adev->pdev->revision == 0x00))
+   /* Note: Compiler version is 11.0.5 while HW version is 11.0.3 */
+   gfx_target_version = 110005;
+   else
+   /* Note: Compiler version is 11.0.1 while HW version is 11.0.3 */
+   gfx_target_version = 110001;
f2g = &gfx_v11_kfd2kgd;
break;
default:
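
The selection logic in this hunk can be read in isolation as a small function. The sketch below is a userspace illustration only: the function names are hypothetical, and the major * 10000 + minor * 100 + stepping encoding is an assumption inferred from the comments in the hunk (11.0.5 alongside 110005, 11.0.1 alongside 110001).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encoding: major * 10000 + minor * 100 + stepping. */
static uint32_t target_version(uint32_t major, uint32_t minor, uint32_t step)
{
	return major * 10000 + minor * 100 + step;
}

/* Device/revision pairs are taken from the hunk above. */
static uint32_t gc_11_0_3_target_version(uint16_t device, uint8_t revision)
{
	if ((device == 0x7460 && revision == 0x00) ||
	    (device == 0x7461 && revision == 0x00))
		return target_version(11, 0, 5); /* board-specific compiler target */
	return target_version(11, 0, 1);         /* default for GC 11.0.3 */
}
```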

[PATCH] drm/radeon: remove unused variable rbo

2023-05-25 Thread Tom Rix

gcc with W=1 reports
drivers/gpu/drm/radeon/radeon_ttm.c:200:27: error: variable
  ‘rbo’ set but not used [-Werror=unused-but-set-variable]
  200 | struct radeon_bo *rbo;
  |   ^~~
This variable is not used so remove it.

Signed-off-by: Tom Rix 
---
 drivers/gpu/drm/radeon/radeon_ttm.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c 
b/drivers/gpu/drm/radeon/radeon_ttm.c
index 4eb83ccc4906..de4e6d78f1e1 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -197,7 +197,6 @@ static int radeon_bo_move(struct ttm_buffer_object *bo, 
bool evict,
 {
struct ttm_resource *old_mem = bo->resource;
struct radeon_device *rdev;
-   struct radeon_bo *rbo;
int r;
 
if (new_mem->mem_type == TTM_PL_TT) {
@@ -210,7 +209,6 @@ static int radeon_bo_move(struct ttm_buffer_object *bo, 
bool evict,
if (r)
return r;
 
-   rbo = container_of(bo, struct radeon_bo, tbo);
rdev = radeon_get_rdev(bo->bdev);
if (!old_mem || (old_mem->mem_type == TTM_PL_SYSTEM &&
 bo->ttm == NULL)) {
-- 
2.27.0

Re: [PATCH] drm/amdkfd: remove unused function get_reserved_sdma_queues_bitmap

2023-05-25 Thread Felix Kuehling


[+Mukul]

Looks like this problem was introduced by Mukul's patch "drm/amdkfd: 
Update SDMA queue management for GFX9.4.3". Could this be a merge error 
between GFX 9.4.3 and GFX11 branches? I think the 
reserved_sdma_queues_bitmap was introduced after the 9.4.3 branch was 
created. Mukul, you worked on both, so you're probably in the best 
position to resolve this.


Regards,
  Felix


On 2023-05-25 16:07, Tom Rix wrote:

clang with W=1 reports
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager.c:122:24: error:
   unused function 'get_reserved_sdma_queues_bitmap' [-Werror,-Wunused-function]
static inline uint64_t get_reserved_sdma_queues_bitmap(struct 
device_queue_manager *dqm)
^
This function is not used so remove it.

Signed-off-by: Tom Rix 
---
  drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 5 -
  1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 493b4b66f180..2fbd0a96424f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -119,11 +119,6 @@ unsigned int get_num_xgmi_sdma_queues(struct 
device_queue_manager *dqm)
dqm->dev->kfd->device_info.num_sdma_queues_per_engine;
  }
  
-static inline uint64_t get_reserved_sdma_queues_bitmap(struct device_queue_manager *dqm)
-{
-   return dqm->dev->kfd->device_info.reserved_sdma_queues_bitmap;
-}
-
  static void init_sdma_bitmaps(struct device_queue_manager *dqm)
  {
bitmap_zero(dqm->sdma_bitmap, KFD_MAX_SDMA_QUEUES);

RE: [PATCH] drm/amdgpu: add the accelerator pcie class

2023-05-25 Thread Deucher, Alexander

[AMD Official Use Only - General]

> -----Original Message-----
> From: amd-gfx  On Behalf Of
> Christoph Hellwig
> Sent: Thursday, May 25, 2023 5:47 AM
> To: Alex Deucher 
> Cc: Christoph Hellwig ; bhelg...@google.com; amd-
> g...@lists.freedesktop.org; Zhang, Morris ; linux-
> p...@vger.kernel.org
> Subject: Re: [PATCH] drm/amdgpu: add the accelerator pcie class
> 
> On Tue, May 23, 2023 at 10:02:32AM -0400, Alex Deucher wrote:
> > On Tue, May 23, 2023 at 5:25 AM Christoph Hellwig 
> wrote:
> > >
> > > On Tue, May 23, 2023 at 12:02:32PM +0800, Shiwu Zhang wrote:
> > > > + { PCI_DEVICE(0x1002, PCI_ANY_ID),
> > > > +   .class = PCI_CLASS_ACCELERATOR_PROCESSING << 8,
> > > > +   .class_mask = 0xff,
> > > > +   .driver_data = CHIP_IP_DISCOVERY },
> > >
> > > > Probing for every single device of a given class for a single vendor
> > > > to a driver is just fundamentally wrong.  Please list the actual IDs
> > > > that the driver can handle.
> >
> > How so?  The driver handles all devices of that class.  We already do
> > that for PCI_CLASS_DISPLAY_VGA and PCI_CLASS_DISPLAY_OTHER.  Other
> > drivers do similar things.
> 
> How is that going to work in the long run?  The chance of totally
> incompatible devices from the same vendor appearing is a given.
> 

We already handle this today for CLASS_DISPLAY via a data table provided on our 
hardware that details the components on the board.  The driver can then 
determine whether or not that combination of components is supported.  If the 
data table doesn't exist or isn’t parse-able, or the components enumerated are 
not supported, the driver doesn't load.

Alex
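
The load decision described in this reply can be sketched as a simple table check. Everything below is hypothetical: the structures, the supported list and probe_by_table() are made up for illustration and do not reflect the actual discovery table format or the driver's code. The sketch only shows the three decline cases named above: no table, an unparseable table, or an unsupported component.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical component entry, as a board-provided table might list it. */
struct component {
	int block; /* which hardware block, e.g. graphics or display */
	int major;
	int minor;
};

/* Hypothetical parsed table; valid is false if parsing failed. */
struct component_table {
	bool valid;
	const struct component *items;
	size_t count;
};

/* Combinations this imaginary driver knows how to drive. */
static const struct component supported[] = {
	{ 1, 11, 0 },
	{ 2, 3, 2 },
};

static bool component_supported(const struct component *c)
{
	size_t i;

	for (i = 0; i < sizeof(supported) / sizeof(supported[0]); i++) {
		if (supported[i].block == c->block &&
		    supported[i].major == c->major &&
		    supported[i].minor == c->minor)
			return true;
	}
	return false;
}

/* Returns 0 to bind the device, -1 to decline it. */
static int probe_by_table(const struct component_table *table)
{
	size_t i;

	if (!table || !table->valid)
		return -1; /* no table, or it could not be parsed */

	for (i = 0; i < table->count; i++) {
		if (!component_supported(&table->items[i]))
			return -1; /* unknown component: do not load */
	}
	return 0;
}
```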

Re: [PATCH 01/13] drm: execution context for GEM buffers v4

2023-05-25 Thread Danilo Krummrich


On 5/4/23 13:51, Christian König wrote:

This adds the infrastructure for an execution context for GEM buffers
which is similar to the existing TTM execbuf util and intended to replace
it in the long term.

The basic functionality is that we abstract the necessary loop to lock
many different GEM buffers with automated deadlock and duplicate handling.

v2: drop xarray and use a dynamically resized array instead, the locking
 overhead is unnecessary and measurable.
v3: drop duplicate tracking, radeon is really the only one needing that.
v4: fix issues pointed out by Danilo, some typos in comments and add a
 helper for locking arrays of GEM objects.

Signed-off-by: Christian König 


Reviewed-by: Danilo Krummrich 


---
  Documentation/gpu/drm-mm.rst |  12 ++
  drivers/gpu/drm/Kconfig  |   6 +
  drivers/gpu/drm/Makefile |   2 +
  drivers/gpu/drm/drm_exec.c   | 278 +++
  include/drm/drm_exec.h   | 119 +++
  5 files changed, 417 insertions(+)
  create mode 100644 drivers/gpu/drm/drm_exec.c
  create mode 100644 include/drm/drm_exec.h

diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
index a79fd3549ff8..a52e6f4117d6 100644
--- a/Documentation/gpu/drm-mm.rst
+++ b/Documentation/gpu/drm-mm.rst
@@ -493,6 +493,18 @@ DRM Sync Objects
  .. kernel-doc:: drivers/gpu/drm/drm_syncobj.c
 :export:
  
+DRM Execution context
+=====================
+
+.. kernel-doc:: drivers/gpu/drm/drm_exec.c
+   :doc: Overview
+
+.. kernel-doc:: include/drm/drm_exec.h
+   :internal:
+
+.. kernel-doc:: drivers/gpu/drm/drm_exec.c
+   :export:
+
  GPU Scheduler
  =============
  
diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index ba3fb04bb691..2dc81eb062eb 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -201,6 +201,12 @@ config DRM_TTM
  GPU memory types. Will be enabled automatically if a device driver
  uses it.
  
+config DRM_EXEC
+   tristate
+   depends on DRM
+   help
+ Execution context for command submissions
+
  config DRM_BUDDY
tristate
depends on DRM
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index a33257d2bc7f..9c6446eb3c83 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -78,6 +78,8 @@ obj-$(CONFIG_DRM_PANEL_ORIENTATION_QUIRKS) += 
drm_panel_orientation_quirks.o
  #
  # Memory-management helpers
  #
+#
+obj-$(CONFIG_DRM_EXEC) += drm_exec.o
  
  obj-$(CONFIG_DRM_BUDDY) += drm_buddy.o
  
diff --git a/drivers/gpu/drm/drm_exec.c b/drivers/gpu/drm/drm_exec.c
new file mode 100644
index ..18071bff20f4
--- /dev/null
+++ b/drivers/gpu/drm/drm_exec.c
@@ -0,0 +1,278 @@
+/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+
+#include 
+#include 
+#include 
+
+/**
+ * DOC: Overview
+ *
+ * This component mainly abstracts the retry loop necessary for locking
+ * multiple GEM objects while preparing hardware operations (e.g. command
+ * submissions, page table updates etc..).
+ *
+ * If a contention is detected while locking a GEM object the cleanup procedure
+ * unlocks all previously locked GEM objects and locks the contended one first
+ * before locking any further objects.
+ *
+ * After an object is locked, fence slots can optionally be reserved on the
+ * dma_resv object inside the GEM object.
+ *
+ * A typical usage pattern should look like this::
+ *
+ * struct drm_gem_object *obj;
+ * struct drm_exec exec;
+ * unsigned long index;
+ * int ret;
+ *
+ * drm_exec_init(&exec, true);
+ * drm_exec_while_not_all_locked(&exec) {
+ * ret = drm_exec_prepare_obj(&exec, boA, 1);
+ * drm_exec_continue_on_contention(&exec);
+ * if (ret)
+ * goto error;
+ *
+ * ret = drm_exec_prepare_obj(&exec, boB, 1);
+ * drm_exec_continue_on_contention(&exec);
+ * if (ret)
+ * goto error;
+ * }
+ *
+ * drm_exec_for_each_locked_object(&exec, index, obj) {
+ * dma_resv_add_fence(obj->resv, fence, DMA_RESV_USAGE_READ);
+ * ...
+ * }
+ * drm_exec_fini(&exec);
+ *
+ * See struct drm_exec for more details.
+ */
+
+/* Dummy value used to initially enter the retry loop */
+#define DRM_EXEC_DUMMY (void*)~0
+
+/* Unlock all objects and drop references */
+static void drm_exec_unlock_all(struct drm_exec *exec)
+{
+   struct drm_gem_object *obj;
+   unsigned long index;
+
+   drm_exec_for_each_locked_object(exec, index, obj) {
+   dma_resv_unlock(obj->resv);
+   drm_gem_object_put(obj);
+   }
+
+   drm_gem_object_put(exec->prelocked);
+   exec->prelocked = NULL;
+}
+
+/**
+ * drm_exec_init - initialize a drm_exec object
+ * @exec: the drm_exec object to initialize
+ * @interruptible: if locks should be acquired interruptible
+ *
+ * Initialize the object and make sure that we can track locked objects.
+ */
+void drm_exec_init(struct d
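
The retry loop described in the DOC comment above (back off on contention, lock the contended object first, then try again) can be modelled without any kernel primitives. The sketch below is a deliberately simplified, single-threaded simulation: struct obj, obj_trylock(), obj_lock_slow() and lock_all() are hypothetical names, contention is faked with a counter, and nothing here is the drm_exec or ww_mutex API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a lockable GEM object. */
struct obj {
	bool locked;
	int contended_tries; /* trylock attempts that will still see contention */
};

static bool obj_trylock(struct obj *o)
{
	if (o->contended_tries > 0) {
		o->contended_tries--;
		return false;
	}
	o->locked = true;
	return true;
}

/* Models a blocking lock: waits out the other holder, then succeeds. */
static void obj_lock_slow(struct obj *o)
{
	o->contended_tries = 0;
	o->locked = true;
}

static void unlock_all(struct obj **objs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		objs[i]->locked = false;
}

/*
 * Lock every object in objs (assumed distinct). Returns the number of
 * passes through the retry loop that were needed.
 */
static int lock_all(struct obj **objs, size_t n)
{
	struct obj *contended = NULL;
	int passes = 0;

	for (;;) {
		bool restart = false;
		size_t i;

		passes++;
		if (contended)
			obj_lock_slow(contended); /* contended object goes first */

		for (i = 0; i < n; i++) {
			if (objs[i] == contended)
				continue; /* already held from the slow path */
			if (!obj_trylock(objs[i])) {
				unlock_all(objs, n); /* back off completely */
				contended = objs[i];
				restart = true;
				break;
			}
		}
		if (!restart)
			return passes;
	}
}
```

With three objects where the second one reports contention once, the loop needs two passes: the first backs off, the second takes the contended object first and then the rest.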

Re: [PATCH] drm/amdgpu: move gfx9_cs_data definition

2023-05-25 Thread Alex Deucher

On Thu, May 25, 2023 at 4:35 PM Tom Rix  wrote:
>
> gcc with W=1 reports
> In file included from drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:32:
> drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h:939:36: error:
>   ‘gfx9_cs_data’ defined but not used [-Werror=unused-const-variable=]
>   939 | static const struct cs_section_def gfx9_cs_data[] = {
>   |^~~~
>
> gfx9_cs_data is only used in gfx_v9_0.c, so move its definition there.
>
> Signed-off-by: Tom Rix 

Already fixed with:
https://patchwork.freedesktop.org/patch/539234/
which will show up in my tree momentarily.

Alex


> ---
>  drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h | 4 
>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c| 5 +
>  2 files changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h 
> b/drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h
> index 567a904804bc..6de4778789ed 100644
> --- a/drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h
> +++ b/drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h
> @@ -936,7 +936,3 @@ static const struct cs_extent_def 
> gfx9_SECT_CONTEXT_defs[] =
>  {gfx9_SECT_CONTEXT_def_8, 0xa2f5, 155 },
>  { 0, 0, 0 }
>  };
> -static const struct cs_section_def gfx9_cs_data[] = {
> -{ gfx9_SECT_CONTEXT_defs, SECT_CONTEXT },
> -{ 0, SECT_NONE }
> -};
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> index 8bf95a6b0767..c97a68a39d93 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> @@ -56,6 +56,11 @@
>  #include "asic_reg/pwr/pwr_10_0_sh_mask.h"
>  #include "asic_reg/gc/gc_9_0_default.h"
>
> +static const struct cs_section_def gfx9_cs_data[] = {
> +{ gfx9_SECT_CONTEXT_defs, SECT_CONTEXT },
> +{ 0, SECT_NONE }
> +};
> +
>  #define GFX9_NUM_GFX_RINGS 1
>  #define GFX9_NUM_SW_GFX_RINGS  2
>  #define GFX9_MEC_HPD_SIZE 4096
> --
> 2.27.0
>

[PATCH] drm/amdgpu: move gfx9_cs_data definition

2023-05-25 Thread Tom Rix

gcc with W=1 reports
In file included from drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:32:
drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h:939:36: error:
  ‘gfx9_cs_data’ defined but not used [-Werror=unused-const-variable=]
  939 | static const struct cs_section_def gfx9_cs_data[] = {
  |^~~~

gfx9_cs_data is only used in gfx_v9_0.c, so move its definition there.

Signed-off-by: Tom Rix 
---
 drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h | 4 
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c| 5 +
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h 
b/drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h
index 567a904804bc..6de4778789ed 100644
--- a/drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h
+++ b/drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h
@@ -936,7 +936,3 @@ static const struct cs_extent_def gfx9_SECT_CONTEXT_defs[] =
 {gfx9_SECT_CONTEXT_def_8, 0xa2f5, 155 },
 { 0, 0, 0 }
 };
-static const struct cs_section_def gfx9_cs_data[] = {
-{ gfx9_SECT_CONTEXT_defs, SECT_CONTEXT },
-{ 0, SECT_NONE }
-};
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 8bf95a6b0767..c97a68a39d93 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -56,6 +56,11 @@
 #include "asic_reg/pwr/pwr_10_0_sh_mask.h"
 #include "asic_reg/gc/gc_9_0_default.h"
 
+static const struct cs_section_def gfx9_cs_data[] = {
+{ gfx9_SECT_CONTEXT_defs, SECT_CONTEXT },
+{ 0, SECT_NONE }
+};
+
 #define GFX9_NUM_GFX_RINGS 1
 #define GFX9_NUM_SW_GFX_RINGS  2
 #define GFX9_MEC_HPD_SIZE 4096
-- 
2.27.0

Re: [PATCH 2/2] drm/amdgpu: Remove duplicate fdinfo fields

2023-05-25 Thread Alex Deucher

On Thu, May 25, 2023 at 11:52 AM Rob Clark  wrote:
>
> From: Rob Clark 
>
> Some of the fields that are handled by drm_show_fdinfo() crept back in
> when rebasing the patch.  Remove them again.
>
> Fixes: 376c25f8ca47 ("drm/amdgpu: Switch to fdinfo helper")
> Signed-off-by: Rob Clark 

Series is:
Reviewed-by: 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 3 ---
>  1 file changed, 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
> index 13d7413d4ca3..a93e5627901a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
> @@ -80,23 +80,20 @@ void amdgpu_show_fdinfo(struct drm_printer *p, struct 
> drm_file *file)
>
> amdgpu_ctx_mgr_usage(&fpriv->ctx_mgr, usage);
>
> /*
>  * **
>  * For text output format description please see drm-usage-stats.rst!
>  * **
>  */
>
> drm_printf(p, "pasid:\t%u\n", fpriv->vm.pasid);
> -   drm_printf(p, "drm-driver:\t%s\n", file->minor->dev->driver->name);
> -   drm_printf(p, "drm-pdev:\t%04x:%02x:%02x.%d\n", domain, bus, dev, fn);
> -   drm_printf(p, "drm-client-id:\t%Lu\n", vm->immediate.fence_context);
> drm_printf(p, "drm-memory-vram:\t%llu KiB\n", stats.vram/1024UL);
> drm_printf(p, "drm-memory-gtt: \t%llu KiB\n", stats.gtt/1024UL);
> drm_printf(p, "drm-memory-cpu: \t%llu KiB\n", stats.cpu/1024UL);
> drm_printf(p, "amd-memory-visible-vram:\t%llu KiB\n",
>stats.visible_vram/1024UL);
> drm_printf(p, "amd-evicted-vram:\t%llu KiB\n",
>stats.evicted_vram/1024UL);
> drm_printf(p, "amd-evicted-visible-vram:\t%llu KiB\n",
>stats.evicted_visible_vram/1024UL);
> drm_printf(p, "amd-requested-vram:\t%llu KiB\n",
> --
> 2.40.1
>

Re: [PATCH] drm/amdgpu: Fix up kdoc in amdgpu_acpi.c

2023-05-25 Thread Alex Deucher

On Thu, May 25, 2023 at 2:03 PM Srinivasan Shanmugam
 wrote:
>
> Fix these warnings by adding & deleting the deviant arguments.
>
> gcc with W=1
> drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:906: warning: Function parameter or 
> member 'numa_info' not described in 'amdgpu_acpi_get_node_id'
> drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:906: warning: Excess function 
> parameter 'nid' description in 'amdgpu_acpi_get_node_id'
>
> Cc: Christian König 
> Cc: Alex Deucher 
> Signed-off-by: Srinivasan Shanmugam 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> index b050d462b2f3..3a6b2e2089f6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> @@ -894,7 +894,7 @@ static struct amdgpu_numa_info 
> *amdgpu_acpi_get_numa_info(uint32_t pxm)
>   * acpi device handle
>   *
>   * @handle: acpi handle
> - * @nid: NUMA Node id returned by the platform firmware
> + * @numa_info: amdgpu_numa_info structure holding numa information
>   *
>   * Queries the ACPI interface to fetch the corresponding NUMA Node ID for a
>   * given amdgpu acpi device.
> --
> 2.25.1
>

Re: [PATCH] drm/amdkfd: remove unused function get_reserved_sdma_queues_bitmap

2023-05-25 Thread Nathan Chancellor

On Thu, May 25, 2023 at 04:07:59PM -0400, Tom Rix wrote:
> clang with W=1 reports
> drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager.c:122:24: error:
>   unused function 'get_reserved_sdma_queues_bitmap' 
> [-Werror,-Wunused-function]
> static inline uint64_t get_reserved_sdma_queues_bitmap(struct 
> device_queue_manager *dqm)
>^
> This function is not used so remove it.
> 
> Signed-off-by: Tom Rix 

Caused by commit 09a95a85cf3e ("drm/amdkfd: Update SDMA queue management
for GFX9.4.3") it seems.

You can actually go a step farther and remove the
reserved_sdma_queues_bitmap member from 'struct kfd_device_info' because
it is now only assigned, never read.

$ git grep reserved_sdma_queues_bitmap next-20230525
next-20230525:drivers/gpu/drm/amd/amdkfd/kfd_device.c:	kfd->device_info.reserved_sdma_queues_bitmap = 0xFULL;
next-20230525:drivers/gpu/drm/amd/amdkfd/kfd_device.c:	kfd->device_info.reserved_sdma_queues_bitmap = 0x3ULL;
next-20230525:drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c:static inline uint64_t get_reserved_sdma_queues_bitmap(struct device_queue_manager *dqm)
next-20230525:drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c:	return dqm->dev->kfd->device_info.reserved_sdma_queues_bitmap;
next-20230525:drivers/gpu/drm/amd/amdkfd/kfd_priv.h:	uint64_t reserved_sdma_queues_bitmap;

> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 5 -
>  1 file changed, 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index 493b4b66f180..2fbd0a96424f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -119,11 +119,6 @@ unsigned int get_num_xgmi_sdma_queues(struct 
> device_queue_manager *dqm)
>   dqm->dev->kfd->device_info.num_sdma_queues_per_engine;
>  }
>  
> -static inline uint64_t get_reserved_sdma_queues_bitmap(struct 
> device_queue_manager *dqm)
> -{
> - return dqm->dev->kfd->device_info.reserved_sdma_queues_bitmap;
> -}
> -
>  static void init_sdma_bitmaps(struct device_queue_manager *dqm)
>  {
>   bitmap_zero(dqm->sdma_bitmap, KFD_MAX_SDMA_QUEUES);
> -- 
> 2.27.0
>

[PATCH] drm/amdkfd: remove unused function get_reserved_sdma_queues_bitmap

2023-05-25 Thread Tom Rix

clang with W=1 reports
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager.c:122:24: error:
  unused function 'get_reserved_sdma_queues_bitmap' [-Werror,-Wunused-function]
static inline uint64_t get_reserved_sdma_queues_bitmap(struct 
device_queue_manager *dqm)
   ^
This function is not used so remove it.

Signed-off-by: Tom Rix 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 493b4b66f180..2fbd0a96424f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -119,11 +119,6 @@ unsigned int get_num_xgmi_sdma_queues(struct 
device_queue_manager *dqm)
dqm->dev->kfd->device_info.num_sdma_queues_per_engine;
 }
 
-static inline uint64_t get_reserved_sdma_queues_bitmap(struct 
device_queue_manager *dqm)
-{
-   return dqm->dev->kfd->device_info.reserved_sdma_queues_bitmap;
-}
-
 static void init_sdma_bitmaps(struct device_queue_manager *dqm)
 {
bitmap_zero(dqm->sdma_bitmap, KFD_MAX_SDMA_QUEUES);
-- 
2.27.0

[PATCH] drm/amdkfd: fix gfx_target_version for certain 11.0.3 devices

2023-05-25 Thread Alex Deucher

Certain boards with GC IP 11.0.3 need slightly different handling
in the shader compiler due to board-specific bounding box
optimizations.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 862a50f7b490..ebc3c3f965f9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -411,8 +411,15 @@ struct kfd_dev *kgd2kfd_probe(struct amdgpu_device *adev, 
bool vf)
f2g = &gfx_v11_kfd2kgd;
break;
case IP_VERSION(11, 0, 3):
-   /* Note: Compiler version is 11.0.1 while HW version is 
11.0.3 */
-   gfx_target_version = 110001;
+   if ((adev->pdev->device == 0x7460 &&
+adev->pdev->revision == 0x00) ||
+   (adev->pdev->device == 0x7461 &&
+adev->pdev->revision == 0x00))
+   /* Note: Compiler version is 11.0.5 while HW 
version is 11.0.3 */
+   gfx_target_version = 110005;
+   else
+   /* Note: Compiler version is 11.0.1 while HW 
version is 11.0.3 */
+   gfx_target_version = 110001;
f2g = &gfx_v11_kfd2kgd;
break;
default:
-- 
2.40.1

Re: [PATCH] drm/amd/amdgpu: Fix up locking etc in amdgpu_debugfs_gprwave_ioctl()

2023-05-25 Thread Alex Deucher

Applied.  Thanks!

Alex

On Thu, May 25, 2023 at 4:05 AM Dan Carpenter  wrote:
>
> There are two bugs here.
> 1) Drop the lock if copy_from_user() fails.
> 2) If the copy fails then the correct error code is -EFAULT instead of
>-EINVAL.
>
> I also broke up the long line and changed "sizeof rd->id" to
> "sizeof(rd->id)".
>
> Fixes: 164fb2940933 ("drm/amd/amdgpu: Update debugfs for XCC support (v3)")
> Signed-off-by: Dan Carpenter 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> index c657bed350ac..56e89e76ff17 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> @@ -478,15 +478,16 @@ static ssize_t amdgpu_debugfs_gprwave_read(struct file 
> *f, char __user *buf, siz
>  static long amdgpu_debugfs_gprwave_ioctl(struct file *f, unsigned int cmd, 
> unsigned long data)
>  {
> struct amdgpu_debugfs_gprwave_data *rd = f->private_data;
> -   int r;
> +   int r = 0;
>
> mutex_lock(&rd->lock);
>
> switch (cmd) {
> case AMDGPU_DEBUGFS_GPRWAVE_IOC_SET_STATE:
> -   r = copy_from_user(&rd->id, (struct 
> amdgpu_debugfs_gprwave_iocdata *)data, sizeof rd->id);
> -   if (r)
> -   return r ? -EINVAL : 0;
> +   if (copy_from_user(&rd->id,
> +  (struct amdgpu_debugfs_gprwave_iocdata 
> *)data,
> +  sizeof(rd->id)))
> +   r = -EFAULT;
> goto done;
> default:
> r = -EINVAL;
> --
> 2.39.2
>
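
The two fixes described above (always drop the lock, and report -EFAULT
for a failed copy) follow a common single-exit pattern. A minimal
userspace sketch of that pattern, assuming a hypothetical counting lock
and a hypothetical stand-in for copy_from_user() so that it compiles
outside the kernel:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a kernel mutex: only tracks lock depth. */
struct fake_lock {
	int held;
};

static void fake_lock_acquire(struct fake_lock *l) { l->held++; }
static void fake_lock_release(struct fake_lock *l) { l->held--; }

/*
 * Hypothetical stand-in for copy_from_user(): like the real helper it
 * returns the number of bytes that could NOT be copied (0 on success).
 */
static unsigned long fake_copy(void *dst, const void *src, size_t n)
{
	if (!src)
		return n;	/* simulate a faulting user pointer */
	memcpy(dst, src, n);
	return 0;
}

struct demo_state {
	struct fake_lock lock;
	int id;
};

#define DEMO_CMD_SET_STATE 1

static long demo_ioctl(struct demo_state *st, unsigned int cmd,
		       const int *arg)
{
	long r = 0;

	assert(st != NULL);
	fake_lock_acquire(&st->lock);

	switch (cmd) {
	case DEMO_CMD_SET_STATE:
		/* A non-zero result means a fault, so map it to -EFAULT. */
		if (fake_copy(&st->id, arg, sizeof(st->id)))
			r = -EFAULT;
		goto done;
	default:
		r = -EINVAL;
	}

done:
	/* Every path, including the error paths, releases the lock here. */
	fake_lock_release(&st->lock);
	return r;
}
```

Routing every exit through one label keeps the unlock in a single place,
which is what prevents the early-return leak the patch removes.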

[PATCH v4 05/13] drm/connector: Print connector colorspace in state debugfs

2023-05-25 Thread Harry Wentland

v3: Fix kerneldocs (kernel test robot)

v4: Avoid returning NULL from drm_get_colorspace_name

Signed-off-by: Harry Wentland 
Cc: Pekka Paalanen 
Cc: Sebastian Wick 
Cc: vitaly.pros...@amd.com
Cc: Uma Shankar 
Cc: Ville Syrjälä 
Cc: Joshua Ashton 
Cc: Jani Nikula 
Cc: Simon Ser 
Cc: Ville Syrjälä 
Cc: Melissa Wen 
Cc: dri-de...@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
---
 drivers/gpu/drm/drm_atomic.c|  1 +
 drivers/gpu/drm/drm_connector.c | 15 +++
 include/drm/drm_connector.h |  1 +
 3 files changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index c0dc5858a723..d6d04c4ccfc0 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -1071,6 +1071,7 @@ static void drm_atomic_connector_print_state(struct 
drm_printer *p,
drm_printf(p, "\tcrtc=%s\n", state->crtc ? state->crtc->name : 
"(null)");
drm_printf(p, "\tself_refresh_aware=%d\n", state->self_refresh_aware);
drm_printf(p, "\tmax_requested_bpc=%d\n", state->max_requested_bpc);
+   drm_printf(p, "\tcolorspace=%s\n", 
drm_get_colorspace_name(state->colorspace));
 
if (connector->connector_type == DRM_MODE_CONNECTOR_WRITEBACK)
if (state->writeback_job && state->writeback_job->fb)
diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index 8d24a5da4076..69480385eaf3 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -1048,6 +1048,21 @@ static const char * const colorspace_names[] = {
[DRM_MODE_COLORIMETRY_BT601_YCC] = "BT601_YCC",
 };
 
+/**
+ * drm_get_colorspace_name - return a string for color encoding
+ * @colorspace: color space to compute name of
+ *
+ * In contrast to the other drm_get_*_name functions this one here returns a
+ * const pointer and hence is threadsafe.
+ */
+const char *drm_get_colorspace_name(enum drm_colorspace colorspace)
+{
+   if (colorspace < ARRAY_SIZE(colorspace_names) && 
colorspace_names[colorspace])
+   return colorspace_names[colorspace];
+   else
+   return "(null)";
+}
+
 static const u32 hdmi_colorspaces =
BIT(DRM_MODE_COLORIMETRY_SMPTE_170M_YCC) |
BIT(DRM_MODE_COLORIMETRY_BT709_YCC) |
diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h
index 565311e194da..ae0b1ee5b99a 100644
--- a/include/drm/drm_connector.h
+++ b/include/drm/drm_connector.h
@@ -1980,6 +1980,7 @@ void drm_connector_list_iter_end(struct 
drm_connector_list_iter *iter);
 
 bool drm_connector_has_possible_encoder(struct drm_connector *connector,
struct drm_encoder *encoder);
+const char *drm_get_colorspace_name(enum drm_colorspace colorspace);
 
 /**
  * drm_for_each_connector_iter - connector_list iterator macro
-- 
2.40.1
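
The "never return NULL" lookup added in v4 can be sketched on its own.
The enum and table below are hypothetical and much smaller than the real
colorspace_names[] array; the point is only the two-part guard (index in
range, slot populated) in front of a sparse designated-initializer table.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical enum; values are illustrative, not the DRM uAPI ones. */
enum demo_colorspace {
	DEMO_CS_DEFAULT = 0,
	DEMO_CS_BT709_YCC = 2,	/* value 1 is intentionally left unnamed */
	DEMO_CS_OPRGB = 3,
	DEMO_CS_COUNT
};

/* Sparse table: slot 1 is implicitly NULL. */
static const char * const demo_names[] = {
	[DEMO_CS_DEFAULT]   = "Default",
	[DEMO_CS_BT709_YCC] = "BT709_YCC",
	[DEMO_CS_OPRGB]     = "opRGB",
};

static_assert(ARRAY_SIZE(demo_names) == DEMO_CS_COUNT,
	      "table must span every enumerator");

/* Returns a printable string for any input, never NULL. */
static const char *demo_get_colorspace_name(unsigned int cs)
{
	if (cs < ARRAY_SIZE(demo_names) && demo_names[cs])
		return demo_names[cs];

	return "(null)";
}
```

Because the fallback is a string literal, callers can pass the result
straight to a "%s" format without a NULL check of their own.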

[PATCH v4 11/13] drm/amd/display: Always set crtcinfo from create_stream_for_sink

2023-05-25 Thread Harry Wentland

From: Joshua Ashton 

Given that we always pass dm_state into here now, the !dm_state branch
won't ever trigger anymore, so drop the check.

This is needed because we would otherwise always fail mode validation
with invalid clocks or link bandwidth errors.

Signed-off-by: Joshua Ashton 
Signed-off-by: Harry Wentland 
Cc: Pekka Paalanen 
Cc: Sebastian Wick 
Cc: vitaly.pros...@amd.com
Cc: Joshua Ashton 
Cc: Simon Ser 
Cc: Melissa Wen 
Cc: dri-de...@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
Reviewed-By: Harry Wentland 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index a8de26f09806..4e96a34148cc 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -6046,7 +6046,7 @@ create_stream_for_sink(struct amdgpu_dm_connector 
*aconnector,
 
if (recalculate_timing)
drm_mode_set_crtcinfo(&saved_mode, 0);
-   else if (!dm_state)
+   else
drm_mode_set_crtcinfo(&mode, 0);
 
/*
-- 
2.40.1

[PATCH v4 13/13] drm/amd/display: Refactor avi_info_frame colorimetry determination

2023-05-25 Thread Harry Wentland

From: Joshua Ashton 

Replace the two messy if-else chains that tested the same
value with a single switch on the enum.

Signed-off-by: Joshua Ashton 
Signed-off-by: Harry Wentland 
Cc: Pekka Paalanen 
Cc: Sebastian Wick 
Cc: vitaly.pros...@amd.com
Cc: Joshua Ashton 
Cc: Simon Ser 
Cc: Melissa Wen 
Cc: dri-de...@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
Reviewed-by: Harry Wentland 
---
 .../gpu/drm/amd/display/dc/core/dc_resource.c | 28 +++
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
index 7e1e5532f88f..ac3062abec51 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
@@ -3015,23 +3015,29 @@ static void set_avi_info_frame(
hdmi_info.bits.S0_S1 = scan_type;
 
/* C0, C1 : Colorimetry */
-   if (color_space == COLOR_SPACE_YCBCR709 ||
-   color_space == COLOR_SPACE_YCBCR709_LIMITED)
+   switch (color_space) {
+   case COLOR_SPACE_YCBCR709:
+   case COLOR_SPACE_YCBCR709_LIMITED:
hdmi_info.bits.C0_C1 = COLORIMETRY_ITU709;
-   else if (color_space == COLOR_SPACE_YCBCR601 ||
-   color_space == COLOR_SPACE_YCBCR601_LIMITED)
+   break;
+   case COLOR_SPACE_YCBCR601:
+   case COLOR_SPACE_YCBCR601_LIMITED:
hdmi_info.bits.C0_C1 = COLORIMETRY_ITU601;
-   else {
-   hdmi_info.bits.C0_C1 = COLORIMETRY_NO_DATA;
-   }
-   if (color_space == COLOR_SPACE_2020_RGB_FULLRANGE ||
-   color_space == COLOR_SPACE_2020_RGB_LIMITEDRANGE ||
-   color_space == COLOR_SPACE_2020_YCBCR) {
+   break;
+   case COLOR_SPACE_2020_RGB_FULLRANGE:
+   case COLOR_SPACE_2020_RGB_LIMITEDRANGE:
+   case COLOR_SPACE_2020_YCBCR:
hdmi_info.bits.EC0_EC2 = COLORIMETRYEX_BT2020RGBYCBCR;
hdmi_info.bits.C0_C1   = COLORIMETRY_EXTENDED;
-   } else if (color_space == COLOR_SPACE_ADOBERGB) {
+   break;
+   case COLOR_SPACE_ADOBERGB:
hdmi_info.bits.EC0_EC2 = COLORIMETRYEX_ADOBERGB;
hdmi_info.bits.C0_C1   = COLORIMETRY_EXTENDED;
+   break;
+   case COLOR_SPACE_SRGB:
+   default:
+   hdmi_info.bits.C0_C1 = COLORIMETRY_NO_DATA;
+   break;
}
 
if (pixel_encoding && color_space == COLOR_SPACE_2020_YCBCR &&
-- 
2.40.1

[PATCH v4 10/13] drm/amd/display: Send correct DP colorspace infopacket

2023-05-25 Thread Harry Wentland

Look at connector_state->colorspace to determine the output colorspace.

We don't want to impact current SDR behavior, so
DRM_MODE_COLORIMETRY_DEFAULT preserves current behavior.

Also add support to explicitly set BT601 and BT709.

v4:
- Roll support for BT709 and BT601 into this patch
- Add default case to avoid warnings for unhandled
  enum values

Signed-off-by: Harry Wentland 
Cc: Pekka Paalanen 
Cc: Sebastian Wick 
Cc: vitaly.pros...@amd.com
Cc: Joshua Ashton 
Cc: Simon Ser 
Cc: Melissa Wen 
Cc: dri-de...@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 48 ---
 1 file changed, 31 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 5c290e6aac46..a8de26f09806 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -5330,21 +5330,44 @@ get_aspect_ratio(const struct drm_display_mode *mode_in)
 }
 
 static enum dc_color_space
-get_output_color_space(const struct dc_crtc_timing *dc_crtc_timing)
+get_output_color_space(const struct dc_crtc_timing *dc_crtc_timing,
+  const struct drm_connector_state *connector_state)
 {
enum dc_color_space color_space = COLOR_SPACE_SRGB;
 
-   switch (dc_crtc_timing->pixel_encoding) {
-   case PIXEL_ENCODING_YCBCR422:
-   case PIXEL_ENCODING_YCBCR444:
-   case PIXEL_ENCODING_YCBCR420:
-   {
+   switch (connector_state->colorspace) {
+   case DRM_MODE_COLORIMETRY_BT601_YCC:
+   if (dc_crtc_timing->flags.Y_ONLY)
+   color_space = COLOR_SPACE_YCBCR601_LIMITED;
+   else
+   color_space = COLOR_SPACE_YCBCR601;
+   break;
+   case DRM_MODE_COLORIMETRY_BT709_YCC:
+   if (dc_crtc_timing->flags.Y_ONLY)
+   color_space = COLOR_SPACE_YCBCR709_LIMITED;
+   else
+   color_space = COLOR_SPACE_YCBCR709;
+   break;
+   case DRM_MODE_COLORIMETRY_OPRGB:
+   color_space = COLOR_SPACE_ADOBERGB;
+   break;
+   case DRM_MODE_COLORIMETRY_BT2020_RGB:
+   case DRM_MODE_COLORIMETRY_BT2020_YCC:
+   if (dc_crtc_timing->pixel_encoding == PIXEL_ENCODING_RGB)
+   color_space = COLOR_SPACE_2020_RGB_FULLRANGE;
+   else
+   color_space = COLOR_SPACE_2020_YCBCR;
+   break;
+   case DRM_MODE_COLORIMETRY_DEFAULT: // ITU601
+   default:
+   if (dc_crtc_timing->pixel_encoding == PIXEL_ENCODING_RGB) {
+   color_space = COLOR_SPACE_SRGB;
/*
 * 27030khz is the separation point between HDTV and SDTV
 * according to HDMI spec, we use YCbCr709 and YCbCr601
 * respectively
 */
-   if (dc_crtc_timing->pix_clk_100hz > 270300) {
+   } else if (dc_crtc_timing->pix_clk_100hz > 270300) {
if (dc_crtc_timing->flags.Y_ONLY)
color_space =
COLOR_SPACE_YCBCR709_LIMITED;
@@ -5357,15 +5380,6 @@ get_output_color_space(const struct dc_crtc_timing 
*dc_crtc_timing)
else
color_space = COLOR_SPACE_YCBCR601;
}
-
-   }
-   break;
-   case PIXEL_ENCODING_RGB:
-   color_space = COLOR_SPACE_SRGB;
-   break;
-
-   default:
-   WARN_ON(1);
break;
}
 
@@ -5504,7 +5518,7 @@ static void fill_stream_properties_from_drm_display_mode(
}
}
 
-   stream->output_color_space = get_output_color_space(timing_out);
+   stream->output_color_space = get_output_color_space(timing_out, 
connector_state);
 }
 
 static void fill_audio_info(struct audio_info *audio_info,
-- 
2.40.1

[PATCH v4 12/13] drm/amd/display: Add debugfs for testing output colorspace

2023-05-25 Thread Harry Wentland

To test colorspace handling with IGT we want to print the
currently enabled colorspace on a stream. Add a new debugfs
entry to do so, using the same scheme as the current bpc
reporting.

This might also come in handy when debugging display
issues.

v4:
- Fix function doc comment
- Fix sRGB debug print

Signed-off-by: Harry Wentland 
Cc: Pekka Paalanen 
Cc: Sebastian Wick 
Cc: vitaly.pros...@amd.com
Cc: Joshua Ashton 
Cc: Simon Ser 
Cc: Melissa Wen 
Cc: dri-de...@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
---
 .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 57 +++
 1 file changed, 57 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
index 827fcb4fb3b3..9a885e2effec 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
@@ -906,6 +906,61 @@ static int amdgpu_current_bpc_show(struct seq_file *m, 
void *data)
 }
 DEFINE_SHOW_ATTRIBUTE(amdgpu_current_bpc);
 
+/*
+ * Returns the current colorspace for the crtc.
+ * Example usage: cat /sys/kernel/debug/dri/0/crtc-0/amdgpu_current_colorspace
+ */
+static int amdgpu_current_colorspace_show(struct seq_file *m, void *data)
+{
+   struct drm_crtc *crtc = m->private;
+   struct drm_device *dev = crtc->dev;
+   struct dm_crtc_state *dm_crtc_state = NULL;
+   int res = -ENODEV;
+
+   mutex_lock(&dev->mode_config.mutex);
+   drm_modeset_lock(&crtc->mutex, NULL);
+   if (crtc->state == NULL)
+   goto unlock;
+
+   dm_crtc_state = to_dm_crtc_state(crtc->state);
+   if (dm_crtc_state->stream == NULL)
+   goto unlock;
+
+   switch (dm_crtc_state->stream->output_color_space) {
+   case COLOR_SPACE_SRGB:
+   seq_printf(m, "sRGB");
+   break;
+   case COLOR_SPACE_YCBCR601:
+   case COLOR_SPACE_YCBCR601_LIMITED:
+   seq_printf(m, "BT601_YCC");
+   break;
+   case COLOR_SPACE_YCBCR709:
+   case COLOR_SPACE_YCBCR709_LIMITED:
+   seq_printf(m, "BT709_YCC");
+   break;
+   case COLOR_SPACE_ADOBERGB:
+   seq_printf(m, "opRGB");
+   break;
+   case COLOR_SPACE_2020_RGB_FULLRANGE:
+   seq_printf(m, "BT2020_RGB");
+   break;
+   case COLOR_SPACE_2020_YCBCR:
+   seq_printf(m, "BT2020_YCC");
+   break;
+   default:
+   goto unlock;
+   }
+   res = 0;
+
+unlock:
+   drm_modeset_unlock(&crtc->mutex);
+   mutex_unlock(&dev->mode_config.mutex);
+
+   return res;
+}
+DEFINE_SHOW_ATTRIBUTE(amdgpu_current_colorspace);
+
+
 /*
  * Example usage:
  * Disable dsc passthrough, i.e.,: have dsc decoding at converver, not 
external RX
@@ -3246,6 +3301,8 @@ void crtc_debugfs_init(struct drm_crtc *crtc)
 #endif
debugfs_create_file("amdgpu_current_bpc", 0644, crtc->debugfs_entry,
crtc, &amdgpu_current_bpc_fops);
+   debugfs_create_file("amdgpu_current_colorspace", 0644, 
crtc->debugfs_entry,
+   crtc, &amdgpu_current_colorspace_fops);
 }
 
 /*
-- 
2.40.1

[PATCH v4 08/13] drm/amd/display: Register Colorspace property for DP and HDMI

2023-05-25 Thread Harry Wentland

We want compositors to be able to set the output
colorspace on DP and HDMI outputs, based on the
caps reported from the receiver via EDID.

Signed-off-by: Harry Wentland 
Cc: Pekka Paalanen 
Cc: Sebastian Wick 
Cc: vitaly.pros...@amd.com
Cc: Joshua Ashton 
Cc: Simon Ser 
Cc: Ville Syrjälä 
Cc: Melissa Wen 
Cc: dri-de...@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index ca093396d1ac..dc99a8ffac70 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -7238,6 +7238,12 @@ static int amdgpu_dm_connector_get_modes(struct 
drm_connector *connector)
return amdgpu_dm_connector->num_modes;
 }
 
+static const u32 supported_colorspaces =
+   BIT(DRM_MODE_COLORIMETRY_BT709_YCC) |
+   BIT(DRM_MODE_COLORIMETRY_OPRGB) |
+   BIT(DRM_MODE_COLORIMETRY_BT2020_RGB) |
+   BIT(DRM_MODE_COLORIMETRY_BT2020_YCC);
+
 void amdgpu_dm_connector_init_helper(struct amdgpu_display_manager *dm,
 struct amdgpu_dm_connector *aconnector,
 int connector_type,
@@ -7318,6 +7324,15 @@ void amdgpu_dm_connector_init_helper(struct 
amdgpu_display_manager *dm,
adev->mode_info.abm_level_property, 0);
}
 
+   if (connector_type == DRM_MODE_CONNECTOR_HDMIA) {
+   if 
(!drm_mode_create_hdmi_colorspace_property(&aconnector->base, 
supported_colorspaces))
+   
drm_connector_attach_colorspace_property(&aconnector->base);
+   } else if (connector_type == DRM_MODE_CONNECTOR_DisplayPort ||
+  connector_type == DRM_MODE_CONNECTOR_eDP) {
+   if (!drm_mode_create_dp_colorspace_property(&aconnector->base, 
supported_colorspaces))
+   
drm_connector_attach_colorspace_property(&aconnector->base);
+   }
+
if (connector_type == DRM_MODE_CONNECTOR_HDMIA ||
connector_type == DRM_MODE_CONNECTOR_DisplayPort ||
connector_type == DRM_MODE_CONNECTOR_eDP) {
-- 
2.40.1
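
How a driver-supplied bitmask like supported_colorspaces narrows the
list offered to userspace can be sketched as below. The enumerators and
the connector-type mask are hypothetical; treating 0 as "no restriction"
follows the v2 note on patch 06/13, and the helper name is made up for
illustration rather than taken from the DRM core.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_BIT(n) (UINT32_C(1) << (n))

/* Hypothetical enumerators; the real DRM_MODE_COLORIMETRY_* values differ. */
enum demo_colorimetry {
	DEMO_DEFAULT = 0,
	DEMO_BT601_YCC,
	DEMO_BT709_YCC,
	DEMO_OPRGB,
	DEMO_BT2020_RGB,
	DEMO_BT2020_YCC,
};

/* What this (made-up) connector type is able to signal at all. */
static const uint32_t demo_hdmi_colorspaces =
	DEMO_BIT(DEMO_BT601_YCC) | DEMO_BIT(DEMO_BT709_YCC) |
	DEMO_BIT(DEMO_OPRGB) | DEMO_BIT(DEMO_BT2020_YCC);

/* What the driver claims to support, in the same shape as the patch. */
static const uint32_t demo_supported_colorspaces =
	DEMO_BIT(DEMO_BT709_YCC) | DEMO_BIT(DEMO_OPRGB) |
	DEMO_BIT(DEMO_BT2020_RGB) | DEMO_BIT(DEMO_BT2020_YCC);

/*
 * Intersect the driver's mask with the connector-type mask.
 * Assumption: a driver mask of 0 means "did not restrict the list".
 */
static uint32_t demo_effective_colorspaces(uint32_t supported,
					   uint32_t connector_mask)
{
	assert(connector_mask != 0);

	if (supported == 0)
		return connector_mask;

	return supported & connector_mask;
}
```

In this sketch a colorspace is exposed only when both masks have its bit
set, so BT2020_RGB drops out for the made-up HDMI mask even though the
driver lists it.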

[PATCH v4 09/13] drm/amd/display: Signal mode_changed if colorspace changed

2023-05-25 Thread Harry Wentland

We need to signal mode_changed to make sure we update the output
colorspace.

v2: No need to call drm_hdmi_avi_infoframe_colorimetry as DC does its
own infoframe packing.

Signed-off-by: Harry Wentland 
Cc: Pekka Paalanen 
Cc: Sebastian Wick 
Cc: vitaly.pros...@amd.com
Cc: Uma Shankar 
Cc: Joshua Ashton 
Cc: Simon Ser 
Cc: Melissa Wen 
Cc: dri-de...@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
Reviewed-by: Leo Li 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index dc99a8ffac70..5c290e6aac46 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -6691,6 +6691,14 @@ amdgpu_dm_connector_atomic_check(struct drm_connector 
*conn,
if (!crtc)
return 0;
 
+   if (new_con_state->colorspace != old_con_state->colorspace) {
+   new_crtc_state = drm_atomic_get_crtc_state(state, crtc);
+   if (IS_ERR(new_crtc_state))
+   return PTR_ERR(new_crtc_state);
+
+   new_crtc_state->mode_changed = true;
+   }
+
if (!drm_connector_atomic_hdr_metadata_equal(old_con_state, 
new_con_state)) {
struct dc_info_packet hdr_infopacket;
 
@@ -6713,7 +6721,7 @@ amdgpu_dm_connector_atomic_check(struct drm_connector 
*conn,
 * set is permissible, however. So only force a
 * modeset if we're entering or exiting HDR.
 */
-   new_crtc_state->mode_changed =
+   new_crtc_state->mode_changed = new_crtc_state->mode_changed ||
!old_con_state->hdr_output_metadata ||
!new_con_state->hdr_output_metadata;
}
-- 
2.40.1
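
The second hunk matters because a plain assignment in the HDR block
would overwrite the flag set by the new colorspace check. A minimal
sketch of the accumulate-with-OR idea, using hypothetical types and a
simplified pointer comparison in place of
drm_connector_atomic_hdr_metadata_equal():

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced connector state. */
struct demo_conn_state {
	int colorspace;
	const void *hdr_metadata;	/* NULL means "no HDR metadata" */
};

static bool demo_needs_modeset(const struct demo_conn_state *old_s,
			       const struct demo_conn_state *new_s)
{
	bool mode_changed = false;

	assert(old_s && new_s);

	/* First reason for a modeset: the colorspace changed. */
	if (new_s->colorspace != old_s->colorspace)
		mode_changed = true;

	/* Simplification: pointer inequality stands in for "metadata differs". */
	if (old_s->hdr_metadata != new_s->hdr_metadata) {
		/*
		 * OR in the HDR decision rather than assigning it, so an
		 * earlier "true" from the colorspace check survives.
		 */
		mode_changed = mode_changed ||
			       !old_s->hdr_metadata ||
			       !new_s->hdr_metadata;
	}

	return mode_changed;
}
```

With a plain assignment, a colorspace change combined with a
static-metadata-only HDR update would end up reporting no modeset.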

[PATCH v4 07/13] drm/amd/display: Always pass connector_state to stream validation

2023-05-25 Thread Harry Wentland

We need the connector_state for colorspace and scaling information
and can get it from connector->state.

Signed-off-by: Harry Wentland 
Cc: Pekka Paalanen 
Cc: Sebastian Wick 
Cc: vitaly.pros...@amd.com
Cc: Joshua Ashton 
Cc: Simon Ser 
Cc: Melissa Wen 
Cc: dri-de...@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index a69f4a39d92a..ca093396d1ac 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -5946,15 +5946,14 @@ create_stream_for_sink(struct amdgpu_dm_connector 
*aconnector,
 {
struct drm_display_mode *preferred_mode = NULL;
struct drm_connector *drm_connector;
-   const struct drm_connector_state *con_state =
-   dm_state ? &dm_state->base : NULL;
+   const struct drm_connector_state *con_state = &dm_state->base;
struct dc_stream_state *stream = NULL;
struct drm_display_mode mode;
struct drm_display_mode saved_mode;
struct drm_display_mode *freesync_mode = NULL;
bool native_mode_found = false;
bool recalculate_timing = false;
-   bool scale = dm_state ? (dm_state->scaling != RMX_OFF) : false;
+   bool scale = dm_state->scaling != RMX_OFF;
int mode_refresh;
int preferred_refresh = 0;
enum color_transfer_func tf = TRANSFER_FUNC_UNKNOWN;
@@ -6596,7 +6595,9 @@ enum drm_mode_status 
amdgpu_dm_connector_mode_valid(struct drm_connector *connec
goto fail;
}
 
-   stream = create_validate_stream_for_sink(aconnector, mode, NULL, NULL);
+   stream = create_validate_stream_for_sink(aconnector, mode,
+
to_dm_connector_state(connector->state),
+NULL);
if (stream) {
dc_stream_release(stream);
result = MODE_OK;
-- 
2.40.1

[PATCH v4 03/13] drm/connector: Pull out common create_colorspace_property code

2023-05-25 Thread Harry Wentland

Signed-off-by: Harry Wentland 
Cc: Pekka Paalanen 
Cc: Sebastian Wick 
Cc: vitaly.pros...@amd.com
Cc: Uma Shankar 
Cc: Ville Syrjälä 
Cc: Joshua Ashton 
Cc: Jani Nikula 
Cc: Simon Ser 
Cc: Ville Syrjälä 
Cc: Melissa Wen 
Cc: dri-de...@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
---
 drivers/gpu/drm/drm_connector.c | 54 -
 1 file changed, 27 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index 547356e00341..9c087d6f5691 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -1975,33 +1975,44 @@ EXPORT_SYMBOL(drm_mode_create_aspect_ratio_property);
  * drm_mode_create_dp_colorspace_property() is used for DP connector.
  */
 
-/**
- * drm_mode_create_hdmi_colorspace_property - create hdmi colorspace property
- * @connector: connector to create the Colorspace property on.
- *
- * Called by a driver the first time it's needed, must be attached to desired
- * HDMI connectors.
- *
- * Returns:
- * Zero on success, negative errno on failure.
- */
-int drm_mode_create_hdmi_colorspace_property(struct drm_connector *connector)
+static int drm_mode_create_colorspace_property(struct drm_connector *connector,
+   const struct drm_prop_enum_list 
*colorspaces,
+   int size)
 {
struct drm_device *dev = connector->dev;
 
if (connector->colorspace_property)
return 0;
 
+   if (!colorspaces)
+   return 0;
+
connector->colorspace_property =
drm_property_create_enum(dev, DRM_MODE_PROP_ENUM, "Colorspace",
-hdmi_colorspaces,
-ARRAY_SIZE(hdmi_colorspaces));
+   colorspaces,
+   size);
 
if (!connector->colorspace_property)
return -ENOMEM;
 
return 0;
 }
+/**
+ * drm_mode_create_hdmi_colorspace_property - create hdmi colorspace property
+ * @connector: connector to create the Colorspace property on.
+ *
+ * Called by a driver the first time it's needed, must be attached to desired
+ * HDMI connectors.
+ *
+ * Returns:
+ * Zero on success, negative errno on failure.
+ */
+int drm_mode_create_hdmi_colorspace_property(struct drm_connector *connector)
+{
+   return drm_mode_create_colorspace_property(connector,
+  hdmi_colorspaces,
+  
ARRAY_SIZE(hdmi_colorspaces));
+}
 EXPORT_SYMBOL(drm_mode_create_hdmi_colorspace_property);
 
 /**
@@ -2016,20 +2027,9 @@ EXPORT_SYMBOL(drm_mode_create_hdmi_colorspace_property);
  */
 int drm_mode_create_dp_colorspace_property(struct drm_connector *connector)
 {
-   struct drm_device *dev = connector->dev;
-
-   if (connector->colorspace_property)
-   return 0;
-
-   connector->colorspace_property =
-   drm_property_create_enum(dev, DRM_MODE_PROP_ENUM, "Colorspace",
-dp_colorspaces,
-ARRAY_SIZE(dp_colorspaces));
-
-   if (!connector->colorspace_property)
-   return -ENOMEM;
-
-   return 0;
+   return drm_mode_create_colorspace_property(connector,
+  dp_colorspaces,
+  ARRAY_SIZE(dp_colorspaces));
 }
 EXPORT_SYMBOL(drm_mode_create_dp_colorspace_property);
 
-- 
2.40.1

[PATCH v4 06/13] drm/connector: Allow drivers to pass list of supported colorspaces

2023-05-25 Thread Harry Wentland

Drivers might not support all colorspaces defined in
dp_colorspaces and hdmi_colorspaces. This results in
undefined behavior when userspace is setting an
unsupported colorspace.

Allow drivers to pass the list of supported colorspaces
when creating the colorspace property.

v2:
 - Use 0 to indicate support for all colorspaces (Jani)
 - Print drm_dbg_kms message when drivers pass 0
   to signal that drivers should specify supported
   colorspaces explicitly (Jani)
v3:
 - Move changes to create a common colorspace_names array
   to separate patch

Signed-off-by: Harry Wentland 
Cc: Pekka Paalanen 
Cc: Sebastian Wick 
Cc: vitaly.pros...@amd.com
Cc: Uma Shankar 
Cc: Ville Syrjälä 
Cc: Joshua Ashton 
Cc: Jani Nikula 
Cc: Simon Ser 
Cc: Ville Syrjälä 
Cc: Melissa Wen 
Cc: dri-de...@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
---
 drivers/gpu/drm/drm_connector.c| 14 ++
 drivers/gpu/drm/i915/display/intel_connector.c |  4 ++--
 drivers/gpu/drm/vc4/vc4_hdmi.c |  2 +-
 include/drm/drm_connector.h|  7 +--
 4 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index 69480385eaf3..b63b3e3168a1 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -2045,9 +2045,12 @@ static int drm_mode_create_colorspace_property(struct 
drm_connector *connector,
  * Returns:
  * Zero on success, negative errno on failure.
  */
-int drm_mode_create_hdmi_colorspace_property(struct drm_connector *connector)
+int drm_mode_create_hdmi_colorspace_property(struct drm_connector *connector,
+u32 supported_colorspaces)
 {
-   return drm_mode_create_colorspace_property(connector, hdmi_colorspaces);
+   u32 colorspaces = supported_colorspaces & hdmi_colorspaces;
+
+   return drm_mode_create_colorspace_property(connector, colorspaces);
 }
 EXPORT_SYMBOL(drm_mode_create_hdmi_colorspace_property);
 
@@ -2061,9 +2064,12 @@ EXPORT_SYMBOL(drm_mode_create_hdmi_colorspace_property);
  * Returns:
  * Zero on success, negative errno on failure.
  */
-int drm_mode_create_dp_colorspace_property(struct drm_connector *connector)
+int drm_mode_create_dp_colorspace_property(struct drm_connector *connector,
+  u32 supported_colorspaces)
 {
-   return drm_mode_create_colorspace_property(connector, dp_colorspaces);
+   u32 colorspaces = supported_colorspaces & dp_colorspaces;
+
+   return drm_mode_create_colorspace_property(connector, colorspaces);
 }
 EXPORT_SYMBOL(drm_mode_create_dp_colorspace_property);
 
diff --git a/drivers/gpu/drm/i915/display/intel_connector.c b/drivers/gpu/drm/i915/display/intel_connector.c
index 6205ddd3ded0..e8b4a352a7a6 100644
--- a/drivers/gpu/drm/i915/display/intel_connector.c
+++ b/drivers/gpu/drm/i915/display/intel_connector.c
@@ -283,14 +283,14 @@ intel_attach_aspect_ratio_property(struct drm_connector *connector)
 void
 intel_attach_hdmi_colorspace_property(struct drm_connector *connector)
 {
-   if (!drm_mode_create_hdmi_colorspace_property(connector))
+   if (!drm_mode_create_hdmi_colorspace_property(connector, 0))
drm_connector_attach_colorspace_property(connector);
 }
 
 void
 intel_attach_dp_colorspace_property(struct drm_connector *connector)
 {
-   if (!drm_mode_create_dp_colorspace_property(connector))
+   if (!drm_mode_create_dp_colorspace_property(connector, 0))
drm_connector_attach_colorspace_property(connector);
 }
 
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index 55744216392b..eee53e841701 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -618,7 +618,7 @@ static int vc4_hdmi_connector_init(struct drm_device *dev,
if (ret)
return ret;
 
-   ret = drm_mode_create_hdmi_colorspace_property(connector);
+   ret = drm_mode_create_hdmi_colorspace_property(connector, 0);
if (ret)
return ret;
 
diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h
index ae0b1ee5b99a..abe775e1382f 100644
--- a/include/drm/drm_connector.h
+++ b/include/drm/drm_connector.h
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -1896,8 +1897,10 @@ int drm_connector_attach_hdr_output_metadata_property(struct drm_connector *conn
 bool drm_connector_atomic_hdr_metadata_equal(struct drm_connector_state *old_state,
 struct drm_connector_state *new_state);
 int drm_mode_create_aspect_ratio_property(struct drm_device *dev);
-int drm_mode_create_hdmi_colorspace_property(struct drm_connector *connector);
-int drm_mode_create_dp_colorspace_property(struct drm_connector *connector);
+int drm_mode_create_hdmi_colorspace_property(struct drm_connector *connector,
+u32 supp

[PATCH v4 04/13] drm/connector: Use common colorspace_names array

2023-05-25 Thread Harry Wentland

We can use bitfields to track the supported ones for HDMI
and DP. This allows us to print colorspaces in a consistent
manner without needing to know whether we're dealing with
DP or HDMI.
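
As a rough illustration of the idea (not the code from this patch), the
sketch below keeps one name table indexed by enum value and describes each
connector type with a bitmask over that enum. All demo_* and DEMO_* names are
hypothetical stand-ins; only the overall shape (shared names array plus
per-protocol u32 masks) is taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for a few enum drm_colorspace values. */
enum demo_colorspace {
	DEMO_COLORIMETRY_DEFAULT = 0,
	DEMO_COLORIMETRY_BT709_YCC = 1,
	DEMO_COLORIMETRY_OPRGB = 2,
	DEMO_COLORIMETRY_RGB_WIDE_FIXED = 3,
	DEMO_COLORIMETRY_COUNT /* not a valid value, only the table size */
};

#define DEMO_BIT(n) (UINT32_C(1) << (n))

/* One shared name table, indexed by enum value. */
static const char * const demo_colorspace_names[DEMO_COLORIMETRY_COUNT] = {
	[DEMO_COLORIMETRY_DEFAULT] = "Default",
	[DEMO_COLORIMETRY_BT709_YCC] = "BT709_YCC",
	[DEMO_COLORIMETRY_OPRGB] = "opRGB",
	[DEMO_COLORIMETRY_RGB_WIDE_FIXED] = "RGB_WIDE_FIXED",
};

/* Per-protocol support is a bitmask over the same enum (assumed contents). */
static const uint32_t demo_hdmi_colorspaces =
	DEMO_BIT(DEMO_COLORIMETRY_BT709_YCC) |
	DEMO_BIT(DEMO_COLORIMETRY_OPRGB);

static const uint32_t demo_dp_colorspaces =
	DEMO_BIT(DEMO_COLORIMETRY_OPRGB) |
	DEMO_BIT(DEMO_COLORIMETRY_RGB_WIDE_FIXED);

/* Look up a name without knowing whether the connector is DP or HDMI. */
static const char *demo_colorspace_name(enum demo_colorspace cs)
{
	if ((unsigned int)cs >= DEMO_COLORIMETRY_COUNT)
		return "(unknown)";
	return demo_colorspace_names[cs];
}

/*
 * Write the names selected by mask into buf, comma separated.
 * Returns the number of names written; stops early if buf is too small.
 */
static int demo_list_colorspaces(uint32_t mask, char *buf, size_t len)
{
	size_t used = 0;
	int count = 0;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	for (unsigned int i = 0; i < DEMO_COLORIMETRY_COUNT; i++) {
		int n;

		if (!(mask & DEMO_BIT(i)))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     count ? "," : "", demo_colorspace_names[i]);
		if (n < 0 || (size_t)n >= len - used)
			break;
		used += (size_t)n;
		count++;
	}
	return count;
}
```

Because names are looked up by enum value, the same helper can serve both
connector types; only the mask differs.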

v4:
- Rename _MAX to _COUNT and leave comment to indicate
  it's not a valid value
- Fix misplaced function doc

Signed-off-by: Harry Wentland 
Cc: Pekka Paalanen 
Cc: Sebastian Wick 
Cc: vitaly.pros...@amd.com
Cc: Uma Shankar 
Cc: Ville Syrjälä 
Cc: Joshua Ashton 
Cc: Jani Nikula 
Cc: Simon Ser 
Cc: Ville Syrjälä 
Cc: Melissa Wen 
Cc: dri-de...@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
---
 drivers/gpu/drm/drm_connector.c | 125 ++--
 include/drm/drm_connector.h |   2 +
 2 files changed, 74 insertions(+), 53 deletions(-)

diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index 9c087d6f5691..8d24a5da4076 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -1016,64 +1016,70 @@ static const struct drm_prop_enum_list drm_dp_subconnector_enum_list[] = {
 DRM_ENUM_NAME_FN(drm_get_dp_subconnector_name,
 drm_dp_subconnector_enum_list)
 
-static const struct drm_prop_enum_list hdmi_colorspaces[] = {
+
+static const char * const colorspace_names[] = {
/* For Default case, driver will set the colorspace */
-   { DRM_MODE_COLORIMETRY_DEFAULT, "Default" },
+   [DRM_MODE_COLORIMETRY_DEFAULT] = "Default",
/* Standard Definition Colorimetry based on CEA 861 */
-   { DRM_MODE_COLORIMETRY_SMPTE_170M_YCC, "SMPTE_170M_YCC" },
-   { DRM_MODE_COLORIMETRY_BT709_YCC, "BT709_YCC" },
+   [DRM_MODE_COLORIMETRY_SMPTE_170M_YCC] = "SMPTE_170M_YCC",
+   [DRM_MODE_COLORIMETRY_BT709_YCC] = "BT709_YCC",
/* Standard Definition Colorimetry based on IEC 61966-2-4 */
-   { DRM_MODE_COLORIMETRY_XVYCC_601, "XVYCC_601" },
+   [DRM_MODE_COLORIMETRY_XVYCC_601] = "XVYCC_601",
/* High Definition Colorimetry based on IEC 61966-2-4 */
-   { DRM_MODE_COLORIMETRY_XVYCC_709, "XVYCC_709" },
+   [DRM_MODE_COLORIMETRY_XVYCC_709] = "XVYCC_709",
/* Colorimetry based on IEC 61966-2-1/Amendment 1 */
-   { DRM_MODE_COLORIMETRY_SYCC_601, "SYCC_601" },
+   [DRM_MODE_COLORIMETRY_SYCC_601] = "SYCC_601",
/* Colorimetry based on IEC 61966-2-5 [33] */
-   { DRM_MODE_COLORIMETRY_OPYCC_601, "opYCC_601" },
+   [DRM_MODE_COLORIMETRY_OPYCC_601] = "opYCC_601",
/* Colorimetry based on IEC 61966-2-5 */
-   { DRM_MODE_COLORIMETRY_OPRGB, "opRGB" },
+   [DRM_MODE_COLORIMETRY_OPRGB] = "opRGB",
/* Colorimetry based on ITU-R BT.2020 */
-   { DRM_MODE_COLORIMETRY_BT2020_CYCC, "BT2020_CYCC" },
+   [DRM_MODE_COLORIMETRY_BT2020_CYCC] = "BT2020_CYCC",
/* Colorimetry based on ITU-R BT.2020 */
-   { DRM_MODE_COLORIMETRY_BT2020_RGB, "BT2020_RGB" },
+   [DRM_MODE_COLORIMETRY_BT2020_RGB] = "BT2020_RGB",
/* Colorimetry based on ITU-R BT.2020 */
-   { DRM_MODE_COLORIMETRY_BT2020_YCC, "BT2020_YCC" },
+   [DRM_MODE_COLORIMETRY_BT2020_YCC] = "BT2020_YCC",
/* Added as part of Additional Colorimetry Extension in 861.G */
-   { DRM_MODE_COLORIMETRY_DCI_P3_RGB_D65, "DCI-P3_RGB_D65" },
-   { DRM_MODE_COLORIMETRY_DCI_P3_RGB_THEATER, "DCI-P3_RGB_Theater" },
+   [DRM_MODE_COLORIMETRY_DCI_P3_RGB_D65] = "DCI-P3_RGB_D65",
+   [DRM_MODE_COLORIMETRY_DCI_P3_RGB_THEATER] = "DCI-P3_RGB_Theater",
+   [DRM_MODE_COLORIMETRY_RGB_WIDE_FIXED] = "RGB_WIDE_FIXED",
+   /* Colorimetry based on scRGB (IEC 61966-2-2) */
+   [DRM_MODE_COLORIMETRY_RGB_WIDE_FLOAT] = "RGB_WIDE_FLOAT",
+   [DRM_MODE_COLORIMETRY_BT601_YCC] = "BT601_YCC",
 };
 
+static const u32 hdmi_colorspaces =
+   BIT(DRM_MODE_COLORIMETRY_SMPTE_170M_YCC) |
+   BIT(DRM_MODE_COLORIMETRY_BT709_YCC) |
+   BIT(DRM_MODE_COLORIMETRY_XVYCC_601) |
+   BIT(DRM_MODE_COLORIMETRY_XVYCC_709) |
+   BIT(DRM_MODE_COLORIMETRY_SYCC_601) |
+   BIT(DRM_MODE_COLORIMETRY_OPYCC_601) |
+   BIT(DRM_MODE_COLORIMETRY_OPRGB) |
+   BIT(DRM_MODE_COLORIMETRY_BT2020_CYCC) |
+   BIT(DRM_MODE_COLORIMETRY_BT2020_RGB) |
+   BIT(DRM_MODE_COLORIMETRY_BT2020_YCC) |
+   BIT(DRM_MODE_COLORIMETRY_DCI_P3_RGB_D65) |
+   BIT(DRM_MODE_COLORIMETRY_DCI_P3_RGB_THEATER);
+
 /*
  * As per DP 1.4a spec, 2.2.5.7.5 VSC SDP Payload for Pixel Encoding/Colorimetry
  * Format Table 2-120
  */
-static const struct drm_prop_enum_list dp_colorspaces[] = {
-   /* For Default case, driver will set the colorspace */
-   { DRM_MODE_COLORIMETRY_DEFAULT, "Default" },
-   { DRM_MODE_COLORIMETRY_RGB_WIDE_FIXED, "RGB_Wide_Gamut_Fixed_Point" },
-   /* Colorimetry based on scRGB (IEC 61966-2-2) */
-   { DRM_MODE_COLORIMETRY_RGB_WIDE_FLOAT, "RGB_Wide_Gamut_Floating_Point" },
-   /* Colorimetry based on IEC 61966-2-5 */
-   { DRM_MODE_COLORIMETRY_OPRGB, "opRGB" },
-   /* Colorimetry based on SMPTE RP 431-2 */
-   { DRM_MODE_COLORIMETRY_DCI_P

[PATCH v4 02/13] drm/connector: Add enum documentation to drm_colorspace

2023-05-25 Thread Harry Wentland

From: Joshua Ashton 

To match the other enums, and add more information about these values.

v2:
 - Specify where an enum entry comes from
 - Clarify DEFAULT and NO_DATA behavior
 - BT.2020 CYCC is "constant luminance"
 - correct type for BT.601

v4:
- drop DP/HDMI clarifications that might create
  more questions than answers

Signed-off-by: Joshua Ashton 
Signed-off-by: Harry Wentland 
Reviewed-by: Harry Wentland 

Cc: Pekka Paalanen 
Cc: Sebastian Wick 
Cc: vitaly.pros...@amd.com
Cc: Uma Shankar 
Cc: Ville Syrjälä 
Cc: Joshua Ashton 
Cc: Simon Ser 
Cc: Ville Syrjälä 
Cc: Melissa Wen 
Cc: dri-de...@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
---
 include/drm/drm_connector.h | 62 +++--
 1 file changed, 60 insertions(+), 2 deletions(-)

diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h
index 77401e425341..ee597593d7e6 100644
--- a/include/drm/drm_connector.h
+++ b/include/drm/drm_connector.h
@@ -363,13 +363,71 @@ enum drm_privacy_screen_status {
PRIVACY_SCREEN_ENABLED_LOCKED,
 };
 
-/*
- * This is a consolidated colorimetry list supported by HDMI and
+/**
+ * enum drm_colorspace - color space
+ *
+ * This enum is a consolidated colorimetry list supported by HDMI and
  * DP protocol standard. The respective connectors will register
  * a property with the subset of this list (supported by that
  * respective protocol). Userspace will set the colorspace through
  * a colorspace property which will be created and exposed to
  * userspace.
+ *
+ * DP definitions come from the DP v2.0 spec
+ * HDMI definitions come from the CTA-861-H spec
+ *
+ * @DRM_MODE_COLORIMETRY_DEFAULT:
+ *   Driver specific behavior.
+ * @DRM_MODE_COLORIMETRY_NO_DATA:
+ *   Driver specific behavior.
+ * @DRM_MODE_COLORIMETRY_SMPTE_170M_YCC:
+ *   (HDMI)
+ *   SMPTE ST 170M colorimetry format
+ * @DRM_MODE_COLORIMETRY_BT709_YCC:
+ *   (HDMI, DP)
+ *   ITU-R BT.709 colorimetry format
+ * @DRM_MODE_COLORIMETRY_XVYCC_601:
+ *   (HDMI, DP)
+ *   xvYCC601 colorimetry format
+ * @DRM_MODE_COLORIMETRY_XVYCC_709:
+ *   (HDMI, DP)
+ *   xvYCC709 colorimetry format
+ * @DRM_MODE_COLORIMETRY_SYCC_601:
+ *   (HDMI, DP)
+ *   sYCC601 colorimetry format
+ * @DRM_MODE_COLORIMETRY_OPYCC_601:
+ *   (HDMI, DP)
+ *   opYCC601 colorimetry format
+ * @DRM_MODE_COLORIMETRY_OPRGB:
+ *   (HDMI, DP)
+ *   opRGB colorimetry format
+ * @DRM_MODE_COLORIMETRY_BT2020_CYCC:
+ *   (HDMI, DP)
+ *   ITU-R BT.2020 Y'c C'bc C'rc (constant luminance) colorimetry format
+ * @DRM_MODE_COLORIMETRY_BT2020_RGB:
+ *   (HDMI, DP)
+ *   ITU-R BT.2020 R' G' B' colorimetry format
+ * @DRM_MODE_COLORIMETRY_BT2020_YCC:
+ *   (HDMI, DP)
+ *   ITU-R BT.2020 Y' C'b C'r colorimetry format
+ * @DRM_MODE_COLORIMETRY_DCI_P3_RGB_D65:
+ *   (HDMI)
+ *   SMPTE ST 2113 P3D65 colorimetry format
+ * @DRM_MODE_COLORIMETRY_DCI_P3_RGB_THEATER:
+ *   (HDMI)
+ *   SMPTE ST 2113 P3DCI colorimetry format
+ * @DRM_MODE_COLORIMETRY_RGB_WIDE_FIXED:
+ *   (DP)
+ *   RGB wide gamut fixed point colorimetry format
+ * @DRM_MODE_COLORIMETRY_RGB_WIDE_FLOAT:
+ *   (DP)
+ *   RGB wide gamut floating point
+ *   (scRGB (IEC 61966-2-2)) colorimetry format
+ * @DRM_MODE_COLORIMETRY_BT601_YCC:
+ *   (DP)
+ *   ITU-R BT.601 colorimetry format
+ *   The DP spec does not say whether this is the 525 or the 625
+ *   line version.
  */
 enum drm_colorspace {
/* For Default case, driver will set the colorspace */
-- 
2.40.1

[PATCH v4 01/13] drm/connector: Convert DRM_MODE_COLORIMETRY to enum

2023-05-25 Thread Harry Wentland

This allows us to use strongly typed arguments.

v2:
 - Bring NO_DATA back
 - Provide explicit enum values

v3:
- Drop unnecessary '&' from kerneldoc (emersion)

v4:
- Fix Normal Colorimetry comment

Signed-off-by: Harry Wentland 
Reviewed-by: Simon Ser 

Cc: Pekka Paalanen 
Cc: Sebastian Wick 
Cc: vitaly.pros...@amd.com
Cc: Uma Shankar 
Cc: Ville Syrjälä 
Cc: Joshua Ashton 
Cc: Simon Ser 
Cc: Ville Syrjälä 
Cc: Melissa Wen 
Cc: dri-de...@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
Reviewed-by: Pekka Paalanen 
---
 include/drm/display/drm_dp.h |  2 +-
 include/drm/drm_connector.h  | 49 ++--
 2 files changed, 26 insertions(+), 25 deletions(-)

diff --git a/include/drm/display/drm_dp.h b/include/drm/display/drm_dp.h
index f1be179c5f1f..7f858352cb43 100644
--- a/include/drm/display/drm_dp.h
+++ b/include/drm/display/drm_dp.h
@@ -1626,7 +1626,7 @@ enum dp_pixelformat {
  *
  * This enum is used to indicate DP VSC SDP Colorimetry formats.
  * It is based on DP 1.4 spec [Table 2-117: VSC SDP Payload for DB16 through
- * DB18] and a name of enum member follows DRM_MODE_COLORIMETRY definition.
+ * DB18] and a name of enum member follows enum drm_colorimetry definition.
  *
  * @DP_COLORIMETRY_DEFAULT: sRGB (IEC 61966-2-1) or
  *  ITU-R BT.601 colorimetry format
diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h
index 565cf9d3c550..77401e425341 100644
--- a/include/drm/drm_connector.h
+++ b/include/drm/drm_connector.h
@@ -371,29 +371,30 @@ enum drm_privacy_screen_status {
  * a colorspace property which will be created and exposed to
  * userspace.
  */
-
-/* For Default case, driver will set the colorspace */
-#define DRM_MODE_COLORIMETRY_DEFAULT   0
-/* CEA 861 Normal Colorimetry options */
-#define DRM_MODE_COLORIMETRY_NO_DATA   0
-#define DRM_MODE_COLORIMETRY_SMPTE_170M_YCC1
-#define DRM_MODE_COLORIMETRY_BT709_YCC 2
-/* CEA 861 Extended Colorimetry Options */
-#define DRM_MODE_COLORIMETRY_XVYCC_601 3
-#define DRM_MODE_COLORIMETRY_XVYCC_709 4
-#define DRM_MODE_COLORIMETRY_SYCC_601  5
-#define DRM_MODE_COLORIMETRY_OPYCC_601 6
-#define DRM_MODE_COLORIMETRY_OPRGB 7
-#define DRM_MODE_COLORIMETRY_BT2020_CYCC   8
-#define DRM_MODE_COLORIMETRY_BT2020_RGB9
-#define DRM_MODE_COLORIMETRY_BT2020_YCC10
-/* Additional Colorimetry extension added as part of CTA 861.G */
-#define DRM_MODE_COLORIMETRY_DCI_P3_RGB_D6511
-#define DRM_MODE_COLORIMETRY_DCI_P3_RGB_THEATER12
-/* Additional Colorimetry Options added for DP 1.4a VSC Colorimetry Format */
-#define DRM_MODE_COLORIMETRY_RGB_WIDE_FIXED13
-#define DRM_MODE_COLORIMETRY_RGB_WIDE_FLOAT14
-#define DRM_MODE_COLORIMETRY_BT601_YCC 15
+enum drm_colorspace {
+   /* For Default case, driver will set the colorspace */
+   DRM_MODE_COLORIMETRY_DEFAULT= 0,
+   /* CEA 861 Normal Colorimetry options */
+   DRM_MODE_COLORIMETRY_NO_DATA= 0,
+   DRM_MODE_COLORIMETRY_SMPTE_170M_YCC = 1,
+   DRM_MODE_COLORIMETRY_BT709_YCC  = 2,
+   /* CEA 861 Extended Colorimetry Options */
+   DRM_MODE_COLORIMETRY_XVYCC_601  = 3,
+   DRM_MODE_COLORIMETRY_XVYCC_709  = 4,
+   DRM_MODE_COLORIMETRY_SYCC_601   = 5,
+   DRM_MODE_COLORIMETRY_OPYCC_601  = 6,
+   DRM_MODE_COLORIMETRY_OPRGB  = 7,
+   DRM_MODE_COLORIMETRY_BT2020_CYCC= 8,
+   DRM_MODE_COLORIMETRY_BT2020_RGB = 9,
+   DRM_MODE_COLORIMETRY_BT2020_YCC = 10,
+   /* Additional Colorimetry extension added as part of CTA 861.G */
+   DRM_MODE_COLORIMETRY_DCI_P3_RGB_D65 = 11,
+   DRM_MODE_COLORIMETRY_DCI_P3_RGB_THEATER = 12,
+   /* Additional Colorimetry Options added for DP 1.4a VSC Colorimetry Format */
+   DRM_MODE_COLORIMETRY_RGB_WIDE_FIXED = 13,
+   DRM_MODE_COLORIMETRY_RGB_WIDE_FLOAT = 14,
+   DRM_MODE_COLORIMETRY_BT601_YCC  = 15,
+};
 
 /**
  * enum drm_bus_flags - bus_flags info for &drm_display_info
@@ -828,7 +829,7 @@ struct drm_connector_state {
 * colorspace change on Sink. This is most commonly used to switch
 * to wider color gamuts like BT2020.
 */
-   u32 colorspace;
+   enum drm_colorspace colorspace;
 
/**
 * @writeback_job: Writeback job for writeback connectors
-- 
2.40.1

[PATCH v4 00/13] Enable Colorspace connector property in amdgpu

2023-05-25 Thread Harry Wentland

This patchset is based on Joshua's previous patchset [1], as well
as my previous patchset [2].

It is
- enabling support for the colorspace property in amdgpu, and
- allowing drivers to specify the supported set of colorspaces.

Colorspace, Infoframes, and YCbCr matrix
---

Even though the initial intent of the colorspace property was to set the
colorspace field in the respective HDMI AVI and DP SDP infoframes, that
is not sufficient in all scenarios. For DP, the colorspace information
also affects the MSA (main stream attribute) packet. For YUV output, the
colorspace affects the RGB-to-YCbCr conversion matrix. The colorspace
field of the infopackets also depends on the encoding used, which is
something that is decided by the driver and not known to userspace.

For these reasons a driver will need to be able to select the supported
colorspaces at property creation.
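
A minimal sketch of what "selecting the supported colorspaces at property
creation" can look like, assuming the driver passes a bitmask that is
intersected with the set the protocol allows. The demo_* and DEMO_* names and
the mask contents are hypothetical; this is not the DRM API itself.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_BIT(n) (UINT32_C(1) << (n))

/* Hypothetical colorspace indices, standing in for enum drm_colorspace. */
enum {
	DEMO_CS_BT709_YCC = 1,
	DEMO_CS_OPRGB = 2,
	DEMO_CS_BT2020_RGB = 3,
	DEMO_CS_RGB_WIDE_FIXED = 4,
};

/* What the (assumed) HDMI protocol set allows. */
static const uint32_t demo_hdmi_colorspaces =
	DEMO_BIT(DEMO_CS_BT709_YCC) |
	DEMO_BIT(DEMO_CS_OPRGB) |
	DEMO_BIT(DEMO_CS_BT2020_RGB);

/*
 * Only colorspaces that the driver claims AND the protocol allows
 * end up in the property.
 */
static uint32_t demo_hdmi_property_mask(uint32_t driver_supported)
{
	return driver_supported & demo_hdmi_colorspaces;
}

/* True if colorspace index cs would be exposed to userspace. */
static bool demo_colorspace_exposed(uint32_t property_mask, unsigned int cs)
{
	return cs < 32 && (property_mask & DEMO_BIT(cs)) != 0;
}
```

With this shape, a driver that lists a DP-only colorspace for an HDMI
connector simply does not get that entry in the property.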

Note: There seems to be an understanding that the colorspace property
should ONLY modify the infoframe. While this is the current behavior and
sufficient in some cases, it is nowhere specified that this should be the
only use of this property. As outlined above, this limitation is not
going to work in all cases.

This patchset does not affect current behavior for the drivers that
implement this property: i915 and vc4.

In the future we might want to give userspace control over the encoding
format on the wire, in particular to avoid use of YUV420 when image
fidelity is important. This work would likely go hand in hand with a
min_bpc property and wouldn't conflict with the work done in this
patchset. I would expect this future work to tag along with a drm_crtc
or drm_connector's Color Pipeline, similar to the one proposed for
drm_plane [3].

Colorspace on crtc or connector?
---

There have been suggestions of programming 'colorspace' on the drm_crtc
but I don't think the crtc is the right place for this property. The
drm_plane and drm_crtc will be used to offload color processing that
would normally be done via the GFX or other pipelines. The drm_connector
controls the signalling with the display and ensures the wire format is
appropriate for the encoding by programming the RGB-to-YCbCr matrix.

[1] https://patchwork.freedesktop.org/series/113632/
[2] https://patchwork.freedesktop.org/series/111865/
[3] https://lists.freedesktop.org/archives/dri-devel/2023-May/403173.html

v2:
- Tested with DP and HDMI analyzers
- Confirmed driver will fall back to lower bpc when needed
- Dropped hunk to set HDMI AVI infoframe as it was a no-op
- Fixed BT.2020 YCbCr colorimetry (JoshuaAshton)
- Simplify initialization of supported colorspaces (Jani)
- Fix kerneldoc (kernel test robot)

v3:
- Added documentation for colorspaces (Pekka, Joshua)
- Split 'Allow drivers to pass list of supported colorspaces' patch
  to pull out code to create common colorspace array and keep it separate
  from change to create only supported colorspaces

v4:
- Don't "deprecate" existing enum values
- Fixes based on review comments throughout
- Dropped Josh's RBs

Cc: Pekka Paalanen 
Cc: Sebastian Wick 
Cc: vitaly.pros...@amd.com
Cc: Uma Shankar 
Cc: Ville Syrjälä 
Cc: Joshua Ashton 
Cc: Jani Nikula 
Cc: Michel Dänzer 
Cc: Simon Ser 
Cc: Melissa Wen 
Cc: dri-de...@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org

Harry Wentland (10):
  drm/connector: Convert DRM_MODE_COLORIMETRY to enum
  drm/connector: Pull out common create_colorspace_property code
  drm/connector: Use common colorspace_names array
  drm/connector: Print connector colorspace in state debugfs
  drm/connector: Allow drivers to pass list of supported colorspaces
  drm/amd/display: Always pass connector_state to stream validation
  drm/amd/display: Register Colorspace property for DP and HDMI
  drm/amd/display: Signal mode_changed if colorspace changed
  drm/amd/display: Send correct DP colorspace infopacket
  drm/amd/display: Add debugfs for testing output colorspace

Joshua Ashton (3):
  drm/connector: Add enum documentation to drm_colorspace
  drm/amd/display: Always set crtcinfo from create_stream_for_sink
  drm/amd/display: Refactor avi_info_frame colorimetry determination

 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  84 ++---
 .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c |  57 ++
 .../gpu/drm/amd/display/dc/core/dc_resource.c |  28 +--
 drivers/gpu/drm/drm_atomic.c  |   1 +
 drivers/gpu/drm/drm_connector.c   | 176 +++---
 .../gpu/drm/i915/display/intel_connector.c|   4 +-
 drivers/gpu/drm/vc4/vc4_hdmi.c|   2 +-
 include/drm/display/drm_dp.h  |   2 +-
 include/drm/drm_connector.h   | 121 +---
 9 files changed, 341 insertions(+), 134 deletions(-)

--
2.40.1

[PATCH AUTOSEL 5.15 42/43] drm/amdgpu: skip disabling fence driver src_irqs when device is unplugged

2023-05-25 Thread Sasha Levin

From: Guchun Chen 

[ Upstream commit c1a322a7a4a96cd0a3dde32ce37af437a78bf8cd ]

When performing device unbind or halt, we have disabled all irqs at the
very beginning, as in amdgpu_pci_remove or amdgpu_device_halt. So
amdgpu_irq_put for irqs stored in the fence driver should not be called
any more; otherwise, the call trace below is triggered.

[  139.114088] WARNING: CPU: 2 PID: 1550 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:616 amdgpu_irq_put+0xf6/0x110 [amdgpu]
[  139.114655] Call Trace:
[  139.114655]  
[  139.114657]  amdgpu_fence_driver_hw_fini+0x93/0x130 [amdgpu]
[  139.114836]  amdgpu_device_fini_hw+0xb6/0x350 [amdgpu]
[  139.114955]  amdgpu_driver_unload_kms+0x51/0x70 [amdgpu]
[  139.115075]  amdgpu_pci_remove+0x63/0x160 [amdgpu]
[  139.115193]  ? __pm_runtime_resume+0x64/0x90
[  139.115195]  pci_device_remove+0x3a/0xb0
[  139.115197]  device_remove+0x43/0x70
[  139.115198]  device_release_driver_internal+0xbd/0x140
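
The situation can be modelled with a toy example, assuming the warning comes
from releasing a reference on an interrupt source that was already disabled
during removal. Every name here (demo_device, demo_irq_src, demo_irq_put and
so on) is hypothetical and only mirrors the shape of the fix, not the amdgpu
code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted interrupt source. */
struct demo_irq_src {
	int enabled_refs;
};

/* Hypothetical device state; warnings counts simulated WARN hits. */
struct demo_device {
	bool unplugged;
	int warnings;
};

/* Dropping a reference on an already-disabled source is a bug: warn. */
static int demo_irq_put(struct demo_device *dev, struct demo_irq_src *src)
{
	if (src->enabled_refs == 0) {
		dev->warnings++;
		return -1;
	}
	src->enabled_refs--;
	return 0;
}

/* Models the early "disable all irqs" step of unbind or halt. */
static void demo_disable_all_irqs(struct demo_irq_src *srcs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		srcs[i].enabled_refs = 0;
}

/* Teardown path: skip the put when the device is already unplugged. */
static void demo_fence_hw_fini(struct demo_device *dev,
			       struct demo_irq_src *src)
{
	if (!dev->unplugged && src)
		demo_irq_put(dev, src);
}
```

In this model the unplugged check is what keeps the teardown path from
touching a source whose references were already dropped.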

Signed-off-by: Guchun Chen 
Acked-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index bbd6f7a123033..8599e0ffa8292 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -561,7 +561,8 @@ void amdgpu_fence_driver_hw_fini(struct amdgpu_device *adev)
if (r)
amdgpu_fence_driver_force_completion(ring);
 
-   if (ring->fence_drv.irq_src)
+   if (!drm_dev_is_unplugged(adev_to_drm(adev)) &&
+   ring->fence_drv.irq_src)
amdgpu_irq_put(adev, ring->fence_drv.irq_src,
   ring->fence_drv.irq_type);
 
-- 
2.39.2

[PATCH AUTOSEL 6.1 54/57] drm/amdgpu: skip disabling fence driver src_irqs when device is unplugged

2023-05-25 Thread Sasha Levin

From: Guchun Chen 

[ Upstream commit c1a322a7a4a96cd0a3dde32ce37af437a78bf8cd ]

When performing device unbind or halt, we have disabled all irqs at the
very beginning, as in amdgpu_pci_remove or amdgpu_device_halt. So
amdgpu_irq_put for irqs stored in the fence driver should not be called
any more; otherwise, the call trace below is triggered.

[  139.114088] WARNING: CPU: 2 PID: 1550 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:616 amdgpu_irq_put+0xf6/0x110 [amdgpu]
[  139.114655] Call Trace:
[  139.114655]  
[  139.114657]  amdgpu_fence_driver_hw_fini+0x93/0x130 [amdgpu]
[  139.114836]  amdgpu_device_fini_hw+0xb6/0x350 [amdgpu]
[  139.114955]  amdgpu_driver_unload_kms+0x51/0x70 [amdgpu]
[  139.115075]  amdgpu_pci_remove+0x63/0x160 [amdgpu]
[  139.115193]  ? __pm_runtime_resume+0x64/0x90
[  139.115195]  pci_device_remove+0x3a/0xb0
[  139.115197]  device_remove+0x43/0x70
[  139.115198]  device_release_driver_internal+0xbd/0x140

Signed-off-by: Guchun Chen 
Acked-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 3cc1929285fc0..ed6878d5b3ce3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -528,7 +528,8 @@ void amdgpu_fence_driver_hw_fini(struct amdgpu_device *adev)
if (r)
amdgpu_fence_driver_force_completion(ring);
 
-   if (ring->fence_drv.irq_src)
+   if (!drm_dev_is_unplugged(adev_to_drm(adev)) &&
+   ring->fence_drv.irq_src)
amdgpu_irq_put(adev, ring->fence_drv.irq_src,
   ring->fence_drv.irq_type);
 
-- 
2.39.2

[PATCH AUTOSEL 6.3 64/67] drm/amdgpu: skip disabling fence driver src_irqs when device is unplugged

2023-05-25 Thread Sasha Levin

From: Guchun Chen 

[ Upstream commit c1a322a7a4a96cd0a3dde32ce37af437a78bf8cd ]

When performing device unbind or halt, we have disabled all irqs at the
very beginning, as in amdgpu_pci_remove or amdgpu_device_halt. So
amdgpu_irq_put for irqs stored in the fence driver should not be called
any more; otherwise, the call trace below is triggered.

[  139.114088] WARNING: CPU: 2 PID: 1550 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:616 amdgpu_irq_put+0xf6/0x110 [amdgpu]
[  139.114655] Call Trace:
[  139.114655]  
[  139.114657]  amdgpu_fence_driver_hw_fini+0x93/0x130 [amdgpu]
[  139.114836]  amdgpu_device_fini_hw+0xb6/0x350 [amdgpu]
[  139.114955]  amdgpu_driver_unload_kms+0x51/0x70 [amdgpu]
[  139.115075]  amdgpu_pci_remove+0x63/0x160 [amdgpu]
[  139.115193]  ? __pm_runtime_resume+0x64/0x90
[  139.115195]  pci_device_remove+0x3a/0xb0
[  139.115197]  device_remove+0x43/0x70
[  139.115198]  device_release_driver_internal+0xbd/0x140

Signed-off-by: Guchun Chen 
Acked-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index f52d0ba91a770..a7d250809da99 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -582,7 +582,8 @@ void amdgpu_fence_driver_hw_fini(struct amdgpu_device *adev)
if (r)
amdgpu_fence_driver_force_completion(ring);
 
-   if (ring->fence_drv.irq_src)
+   if (!drm_dev_is_unplugged(adev_to_drm(adev)) &&
+   ring->fence_drv.irq_src)
amdgpu_irq_put(adev, ring->fence_drv.irq_src,
   ring->fence_drv.irq_type);
 
-- 
2.39.2

[PATCH] drm/amdgpu: Fix up kdoc in amdgpu_acpi.c

2023-05-25 Thread Srinivasan Shanmugam

Fix these warnings by documenting the missing argument and deleting the stale one.

gcc with W=1
drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:906: warning: Function parameter or member 'numa_info' not described in 'amdgpu_acpi_get_node_id'
drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:906: warning: Excess function parameter 'nid' description in 'amdgpu_acpi_get_node_id'

Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
index b050d462b2f3..3a6b2e2089f6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
@@ -894,7 +894,7 @@ static struct amdgpu_numa_info *amdgpu_acpi_get_numa_info(uint32_t pxm)
  * acpi device handle
  *
  * @handle: acpi handle
- * @nid: NUMA Node id returned by the platform firmware
+ * @numa_info: amdgpu_numa_info structure holding numa information
  *
  * Queries the ACPI interface to fetch the corresponding NUMA Node ID for a
  * given amdgpu acpi device.
-- 
2.25.1

Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused

2023-05-25 Thread Nick Desaulniers

On Thu, May 25, 2023 at 9:42 AM Alex Deucher  wrote:
>
> On Thu, May 25, 2023 at 12:29 PM Nathan Chancellor  wrote:
> >
> > On Thu, May 25, 2023 at 12:26:56PM -0400, Luben Tuikov wrote:
> > > On 2023-05-25 11:22, Nathan Chancellor wrote:
> > > > On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote:
> > > >> Silencing the compiler from below compilation error:
> > > >>
> > > >> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c:704:23: error: variable 'mmhub_v1_8_mmea_err_status_reg' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration]
> > > >> static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = {
> > > >>   ^
> > > >> 1 error generated.
> > > >>
> > > >> Mark the variable as __maybe_unused to make it clear to clang that this
> > > >> is expected, so there is no more warning.
> > > >>
> > > >> Cc: Christian König 
> > > >> Cc: Lijo Lazar 
> > > >> Cc: Luben Tuikov 
> > > >> Cc: Alex Deucher 
> > > >> Signed-off-by: Srinivasan Shanmugam 
> > > >
> > > > Traditionally, this attribute would go between the [] and =, but that is
> > > > a nit. Can someone please pick this up to unblock our builds on -next?
> > > >
> > > > Reviewed-by: Nathan Chancellor 
> > >
> > > I'll pick this up, fix it, and submit to amd-staging-drm-next.
> >
> > Thanks a lot :)
> >
> > > Which -next are you referring to, Nathan?
> >
> > linux-next, this warning breaks the build when -Werror is enabled, such
> > as with allmodconfig:
> >
> > https://storage.tuxsuite.com/public/clangbuiltlinux/continuous-integration2/builds/2QHtlCTz2JL0yXNpRB5hVmiP9lq/build.log
> >
>
> Srinivasan has already pushed it.  I'll push it out once CI has
> completed.  We are trying to figure out the best way to enable -WERROR
> in our CI system as it is almost always broken depending on what
> compiler you are using.  Also, I'm not sure fixing these is always
> better.  A lot of these warnings seem spurious and in a lot of cases
> the "fix" doesn't really improve the code, it just silences a warning.
> As one of my coworkers put it, there is a reason warnings are not
> errors.

https://www.theregister.com/2021/09/08/compromise_linux_kernel_compiler_warnings/

>
> Alex
>
>
> > Cheers,
> > Nathan
> >
> > > >> ---
> > > >>  drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 1 +
> > > >>  1 file changed, 1 insertion(+)
> > > >>
> > > >> diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
> > > >> index 3648994724c2..cba087e529c0 100644
> > > >> --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
> > > >> +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
> > > >> @@ -701,6 +701,7 @@ static void mmhub_v1_8_reset_ras_error_count(struct amdgpu_device *adev)
> > > >>mmhub_v1_8_inst_reset_ras_error_count(adev, i);
> > > >>  }
> > > >>
> > > >> +__maybe_unused
> > > >>  static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = {
> > > >>regMMEA0_ERR_STATUS,
> > > >>regMMEA1_ERR_STATUS,
> > > >> --
> > > >> 2.25.1
> > > >>
> > >
>


-- 
Thanks,
~Nick Desaulniers

Re: [PATCH] drm/amdgpu: Fix up kdoc in sdma_v4_4_2.c

2023-05-25 Thread Alex Deucher

On Thu, May 25, 2023 at 1:08 PM Srinivasan Shanmugam
 wrote:
>
> Address a bunch of kdoc warnings:
>
> gcc with W=1
> drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:426: warning: Function parameter or member 'inst_mask' not described in 'sdma_v4_4_2_inst_gfx_stop'
> drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:457: warning: Function parameter or member 'inst_mask' not described in 'sdma_v4_4_2_inst_rlc_stop'
> drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:470: warning: Function parameter or member 'inst_mask' not described in 'sdma_v4_4_2_inst_page_stop'
> drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:506: warning: Function parameter or member 'inst_mask' not described in 'sdma_v4_4_2_inst_ctx_switch_enable'
> drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:794: warning: Function parameter or member 'inst_mask' not described in 'sdma_v4_4_2_inst_rlc_resume'
> drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:810: warning: Function parameter or member 'inst_mask' not described in 'sdma_v4_4_2_inst_load_microcode'
> drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:854: warning: Function parameter or member 'inst_mask' not described in 'sdma_v4_4_2_inst_start'

I thought someone already landed a patch for this.  If not,

Reviewed-by: Alex Deucher 

>
> Cc: Christian König 
> Cc: Alex Deucher 
> Signed-off-by: Srinivasan Shanmugam 
> ---
>  drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
> index ff41fb577cdd..8eebf9c2bbcd 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
> @@ -418,6 +418,7 @@ static void sdma_v4_4_2_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u64
>   * sdma_v4_4_2_inst_gfx_stop - stop the gfx async dma engines
>   *
>   * @adev: amdgpu_device pointer
> + * @inst_mask: mask of dma engine instances to be disabled
>   *
>   * Stop the gfx async dma ring buffers.
>   */
> @@ -449,6 +450,7 @@ static void sdma_v4_4_2_inst_gfx_stop(struct amdgpu_device *adev,
>   * sdma_v4_4_2_inst_rlc_stop - stop the compute async dma engines
>   *
>   * @adev: amdgpu_device pointer
> + * @inst_mask: mask of dma engine instances to be disabled
>   *
>   * Stop the compute async dma queues.
>   */
> @@ -462,6 +464,7 @@ static void sdma_v4_4_2_inst_rlc_stop(struct amdgpu_device *adev,
>   * sdma_v4_4_2_inst_page_stop - stop the page async dma engines
>   *
>   * @adev: amdgpu_device pointer
> + * @inst_mask: mask of dma engine instances to be disabled
>   *
>   * Stop the page async dma ring buffers.
>   */
> @@ -498,6 +501,7 @@ static void sdma_v4_4_2_inst_page_stop(struct amdgpu_device *adev,
>   *
>   * @adev: amdgpu_device pointer
>   * @enable: enable/disable the DMA MEs context switch.
> + * @inst_mask: mask of dma engine instances to be enabled
>   *
>   * Halt or unhalt the async dma engines context switch.
>   */
> @@ -785,6 +789,7 @@ static void sdma_v4_4_2_init_pg(struct amdgpu_device 
> *adev)
>   * sdma_v4_4_2_inst_rlc_resume - setup and start the async dma engines
>   *
>   * @adev: amdgpu_device pointer
> + * @inst_mask: mask of dma engine instances to be enabled
>   *
>   * Set up the compute DMA queues and enable them.
>   * Returns 0 for success, error for failure.
> @@ -801,6 +806,7 @@ static int sdma_v4_4_2_inst_rlc_resume(struct 
> amdgpu_device *adev,
>   * sdma_v4_4_2_inst_load_microcode - load the sDMA ME ucode
>   *
>   * @adev: amdgpu_device pointer
> + * @inst_mask: mask of dma engine instances to be enabled
>   *
>   * Loads the sDMA0/1 ucode.
>   * Returns 0 for success, -EINVAL if the ucode is not available.
> @@ -845,6 +851,7 @@ static int sdma_v4_4_2_inst_load_microcode(struct 
> amdgpu_device *adev,
>   * sdma_v4_4_2_inst_start - setup and start the async dma engines
>   *
>   * @adev: amdgpu_device pointer
> + * @inst_mask: mask of dma engine instances to be enabled
>   *
>   * Set up the DMA engines and enable them.
>   * Returns 0 for success, error for failure.
> --
> 2.25.1
>

[PATCH] drm/amdgpu: Fix up kdoc in amdgpu_device.c

2023-05-25 Thread Srinivasan Shanmugam

Fix these warnings by deleting the excess parameter descriptions.

gcc with W=1
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:799: warning: Excess function 
parameter 'pcie_index' description in 'amdgpu_device_indirect_wreg'
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:799: warning: Excess function 
parameter 'pcie_data' description in 'amdgpu_device_indirect_wreg'
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:870: warning: Excess function 
parameter 'pcie_index' description in 'amdgpu_device_indirect_wreg64'
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:870: warning: Excess function 
parameter 'pcie_data' description in 'amdgpu_device_indirect_wreg64'
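
For readers unfamiliar with these warnings, here is a minimal sketch of a
kernel-doc comment that stays in sync with its prototype. The function
fake_indirect_wreg and its simulated register file are hypothetical and are
not part of amdgpu; only the comment layout is the point.

```c
#include <stdint.h>

/**
 * fake_indirect_wreg - write a value through a simulated indirect register
 *
 * @regs: simulated register file (hypothetical, for illustration only)
 * @reg_addr: indirect register offset, used as an index into @regs
 * @reg_data: indirect register data
 *
 * Every @name line above matches one parameter of the prototype below,
 * so kernel-doc has nothing to flag as excess or undescribed.
 */
static void fake_indirect_wreg(uint32_t *regs, uint32_t reg_addr,
			       uint32_t reg_data)
{
	regs[reg_addr] = reg_data;
}
```

Describing a parameter that is no longer in the prototype gives the "Excess
function parameter" warning; leaving one out gives "not described".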

Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index c1e9ed26b7bf..301abfb7a0d3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -788,8 +788,6 @@ u64 amdgpu_device_indirect_rreg64(struct amdgpu_device 
*adev,
  * amdgpu_device_indirect_wreg - write an indirect register address
  *
  * @adev: amdgpu_device pointer
- * @pcie_index: mmio register offset
- * @pcie_data: mmio register offset
  * @reg_addr: indirect register offset
  * @reg_data: indirect register data
  *
@@ -859,8 +857,6 @@ void amdgpu_device_indirect_wreg_ext(struct amdgpu_device 
*adev,
  * amdgpu_device_indirect_wreg64 - write a 64bits indirect register address
  *
  * @adev: amdgpu_device pointer
- * @pcie_index: mmio register offset
- * @pcie_data: mmio register offset
  * @reg_addr: indirect register offset
  * @reg_data: indirect register data
  *
-- 
2.25.1

[PATCH 31/33] drm/amdkfd: add debug queue snapshot operation

2023-05-25 Thread Jonathan Kim

Allow the debugger to get a snapshot of a specified number of queues
containing various queue property information that is copied to the
debugger.

Since the debugger doesn't know how many queues exist at any given time,
allow the debugger to pass the requested number of snapshots as 0 to get
the actual number of potential snapshots to use for a subsequent snapshot
request for actual information.

To prevent future ABI breakage, pass in the requested entry_size.
The KFD will return its own entry_size in case the debugger still wants
to log the information in a core dump on sizing failure.

Also allow the debugger to clear exceptions when doing a snapshot.
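
The sizing handshake described above can be sketched from the caller's side.
This is a self-contained simulation and not the real ioctl path:
fake_snapshot, struct fake_entry and the fixed queue table are hypothetical
stand-ins, and stepping through the buffer by the caller's entry_size is an
assumption made for the sketch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one snapshot entry; not the real uAPI layout. */
struct fake_entry {
	uint32_t queue_id;
	uint32_t ring_size;
};

/* Hypothetical queue table standing in for the kernel-side state. */
static const struct fake_entry fake_queues[3] = {
	{ 1, 4096 }, { 2, 8192 }, { 5, 4096 },
};

/*
 * Simulated snapshot call. On entry *num_queues is the number of entries
 * the caller's buffer can hold (0 means "only report how many exist") and
 * *entry_size is the caller's idea of the entry size. On return
 * *num_queues is the number of queues that exist and *entry_size is
 * clamped to the size this side understands.
 * Returns the number of entries copied into buf.
 */
static uint32_t fake_snapshot(void *buf, uint32_t *num_queues,
			      uint32_t *entry_size)
{
	uint32_t total = sizeof(fake_queues) / sizeof(fake_queues[0]);
	uint32_t want = *num_queues < total ? *num_queues : total;
	uint32_t stride = *entry_size;
	uint32_t copy = stride < sizeof(struct fake_entry)
			? stride : (uint32_t)sizeof(struct fake_entry);
	uint32_t i;

	for (i = 0; i < want; i++)
		memcpy((char *)buf + (size_t)i * stride,
		       &fake_queues[i], copy);

	*num_queues = total;
	*entry_size = copy;
	return want;
}
```

A caller would first pass num_queues = 0 with a NULL buffer to learn the
count, allocate that many entries, then call again to fetch the data.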

Signed-off-by: Jonathan Kim 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |  6 +++
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 36 +
 .../drm/amd/amdkfd/kfd_device_queue_manager.h |  3 ++
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  5 +++
 .../amd/amdkfd/kfd_process_queue_manager.c| 40 +++
 5 files changed, 90 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 00aa844762b0..b24a73fd53af 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -3053,6 +3053,12 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, 
struct kfd_process *p, v
&args->query_exception_info.info_size);
break;
case KFD_IOC_DBG_TRAP_GET_QUEUE_SNAPSHOT:
+   r = pqm_get_queue_snapshot(&target->pqm,
+   args->queue_snapshot.exception_mask,
+   (void __user 
*)args->queue_snapshot.snapshot_buf_ptr,
+   &args->queue_snapshot.num_queues,
+   &args->queue_snapshot.entry_size);
+   break;
case KFD_IOC_DBG_TRAP_GET_DEVICE_SNAPSHOT:
pr_warn("Debug op %i not supported yet\n", args->op);
r = -EACCES;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 03fabe6e9cdb..9f52f8426ed1 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -3040,6 +3040,42 @@ int suspend_queues(struct kfd_process *p,
return total_suspended;
 }
 
+static uint32_t set_queue_type_for_user(struct queue_properties *q_props)
+{
+   switch (q_props->type) {
+   case KFD_QUEUE_TYPE_COMPUTE:
+   return q_props->format == KFD_QUEUE_FORMAT_PM4
+   ? KFD_IOC_QUEUE_TYPE_COMPUTE
+   : KFD_IOC_QUEUE_TYPE_COMPUTE_AQL;
+   case KFD_QUEUE_TYPE_SDMA:
+   return KFD_IOC_QUEUE_TYPE_SDMA;
+   case KFD_QUEUE_TYPE_SDMA_XGMI:
+   return KFD_IOC_QUEUE_TYPE_SDMA_XGMI;
+   default:
+   WARN_ONCE(true, "queue type not recognized!");
+   return 0x;
+   };
+}
+
+void set_queue_snapshot_entry(struct queue *q,
+ uint64_t exception_clear_mask,
+ struct kfd_queue_snapshot_entry *qss_entry)
+{
+   qss_entry->ring_base_address = q->properties.queue_address;
+   qss_entry->write_pointer_address = (uint64_t)q->properties.write_ptr;
+   qss_entry->read_pointer_address = (uint64_t)q->properties.read_ptr;
+   qss_entry->ctx_save_restore_address =
+   q->properties.ctx_save_restore_area_address;
+   qss_entry->ctx_save_restore_area_size =
+   q->properties.ctx_save_restore_area_size;
+   qss_entry->exception_status = q->properties.exception_status;
+   qss_entry->queue_id = q->properties.queue_id;
+   qss_entry->gpu_id = q->device->id;
+   qss_entry->ring_size = (uint32_t)q->properties.queue_size;
+   qss_entry->queue_type = set_queue_type_for_user(&q->properties);
+   q->properties.exception_status &= ~exception_clear_mask;
+}
+
 int debug_lock_and_unmap(struct device_queue_manager *dqm)
 {
int r;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
index d4e6dbffe8c2..7dd4b177219d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
@@ -300,6 +300,9 @@ int suspend_queues(struct kfd_process *p,
 int resume_queues(struct kfd_process *p,
uint32_t num_queues,
uint32_t *usr_queue_id_array);
+void set_queue_snapshot_entry(struct queue *q,
+ uint64_t exception_clear_mask,
+ struct kfd_queue_snapshot_entry *qss_entry);
 int debug_lock_and_unmap(struct device_queue_manager *dqm);
 int debug_map_and_unlock(struct device_queue_manager *dqm

[PATCH 30/33] drm/amdkfd: add debug query exception info operation

2023-05-25 Thread Jonathan Kim

Allow the debugger to query additional info based on an exception code.
For device exceptions, it's currently only memory violation information.
For process exceptions, it's currently only runtime information.
Queue exceptions only report the queue exception status.

The debugger has the option of clearing the target exception on query.
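
The query-then-optionally-clear behaviour can be sketched on a plain status
word. This is only an illustration: EC_BIT and query_exception are
hypothetical names, and mapping exception code N to bit N - 1 is an
assumption made for the sketch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding: exception code N (N >= 1) occupies bit N - 1. */
#define EC_BIT(code) (1ULL << ((code) - 1))

/*
 * Check whether `code` is raised in *status. Returns -ENODATA when it is
 * not raised; otherwise returns 0 and clears the bit if `clear` is set.
 */
static int query_exception(uint64_t *status, uint32_t code, bool clear)
{
	if (!(*status & EC_BIT(code)))
		return -ENODATA;

	if (clear)
		*status &= ~EC_BIT(code);

	return 0;
}
```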

Signed-off-by: Jonathan Kim 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |   7 ++
 drivers/gpu/drm/amd/amdkfd/kfd_debug.c   | 120 +++
 drivers/gpu/drm/amd/amdkfd/kfd_debug.h   |   6 ++
 3 files changed, 133 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index ebb2088d12fa..00aa844762b0 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -3045,6 +3045,13 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, 
struct kfd_process *p, v
&args->query_debug_event.exception_mask);
break;
case KFD_IOC_DBG_TRAP_QUERY_EXCEPTION_INFO:
+   r = kfd_dbg_trap_query_exception_info(target,
+   args->query_exception_info.source_id,
+   args->query_exception_info.exception_code,
+   args->query_exception_info.clear_exception,
+   (void __user 
*)args->query_exception_info.info_ptr,
+   &args->query_exception_info.info_size);
+   break;
case KFD_IOC_DBG_TRAP_GET_QUEUE_SNAPSHOT:
case KFD_IOC_DBG_TRAP_GET_DEVICE_SNAPSHOT:
pr_warn("Debug op %i not supported yet\n", args->op);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
index e9530e682e85..24e2b285448a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
@@ -890,6 +890,126 @@ int kfd_dbg_trap_set_wave_launch_mode(struct kfd_process 
*target,
return r;
 }
 
+int kfd_dbg_trap_query_exception_info(struct kfd_process *target,
+   uint32_t source_id,
+   uint32_t exception_code,
+   bool clear_exception,
+   void __user *info,
+   uint32_t *info_size)
+{
+   bool found = false;
+   int r = 0;
+   uint32_t copy_size, actual_info_size = 0;
+   uint64_t *exception_status_ptr = NULL;
+
+   if (!target)
+   return -EINVAL;
+
+   if (!info || !info_size)
+   return -EINVAL;
+
+   mutex_lock(&target->event_mutex);
+
+   if (KFD_DBG_EC_TYPE_IS_QUEUE(exception_code)) {
+   /* Per queue exceptions */
+   struct queue *queue = NULL;
+   int i;
+
+   for (i = 0; i < target->n_pdds; i++) {
+   struct kfd_process_device *pdd = target->pdds[i];
+   struct qcm_process_device *qpd = &pdd->qpd;
+
+   list_for_each_entry(queue, &qpd->queues_list, list) {
+   if (!found && queue->properties.queue_id == 
source_id) {
+   found = true;
+   break;
+   }
+   }
+   if (found)
+   break;
+   }
+
+   if (!found) {
+   r = -EINVAL;
+   goto out;
+   }
+
+   if (!(queue->properties.exception_status & 
KFD_EC_MASK(exception_code))) {
+   r = -ENODATA;
+   goto out;
+   }
+   exception_status_ptr = &queue->properties.exception_status;
+   } else if (KFD_DBG_EC_TYPE_IS_DEVICE(exception_code)) {
+   /* Per device exceptions */
+   struct kfd_process_device *pdd = NULL;
+   int i;
+
+   for (i = 0; i < target->n_pdds; i++) {
+   pdd = target->pdds[i];
+   if (pdd->dev->id == source_id) {
+   found = true;
+   break;
+   }
+   }
+
+   if (!found) {
+   r = -EINVAL;
+   goto out;
+   }
+
+   if (!(pdd->exception_status & KFD_EC_MASK(exception_code))) {
+   r = -ENODATA;
+   goto out;
+   }
+
+   if (exception_code == EC_DEVICE_MEMORY_VIOLATION) {
+   copy_size = min((size_t)(*info_size), 
pdd->vm_fault_exc_data_size);
+
+   if (copy_to_user(info, pdd->vm_fault_exc_data, 
copy_size)) {
+   r = -EFAULT;
+   goto out;
+   }
+   actual_info_size = pdd->vm_fault_exc_d

[PATCH 32/33] drm/amdkfd: add debug device snapshot operation

2023-05-25 Thread Jonathan Kim

Similar to queue snapshot, return an array of device information using
an entry_size check and return.
Unlike queue snapshots, the debugger needs to pass the correct number of
devices that exist.  If it fails to do so, the KFD will return the
number of actual devices so that the debugger can make a subsequent
successful call.

v2: add num_xcc to device snapshot and fixup new kfd_node reference

Signed-off-by: Jonathan Kim 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |  7 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_debug.c   | 73 
 drivers/gpu/drm/amd/amdkfd/kfd_debug.h   |  5 ++
 3 files changed, 83 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index b24a73fd53af..f522325b409b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -3060,8 +3060,11 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, 
struct kfd_process *p, v
&args->queue_snapshot.entry_size);
break;
case KFD_IOC_DBG_TRAP_GET_DEVICE_SNAPSHOT:
-   pr_warn("Debug op %i not supported yet\n", args->op);
-   r = -EACCES;
+   r = kfd_dbg_trap_device_snapshot(target,
+   args->device_snapshot.exception_mask,
+   (void __user 
*)args->device_snapshot.snapshot_buf_ptr,
+   &args->device_snapshot.num_devices,
+   &args->device_snapshot.entry_size);
break;
default:
pr_err("Invalid option: %i\n", args->op);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
index 24e2b285448a..125274445f43 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
@@ -22,6 +22,7 @@
 
 #include "kfd_debug.h"
 #include "kfd_device_queue_manager.h"
+#include "kfd_topology.h"
 #include 
 #include 
 
@@ -1010,6 +1011,78 @@ int kfd_dbg_trap_query_exception_info(struct kfd_process 
*target,
return r;
 }
 
+int kfd_dbg_trap_device_snapshot(struct kfd_process *target,
+   uint64_t exception_clear_mask,
+   void __user *user_info,
+   uint32_t *number_of_device_infos,
+   uint32_t *entry_size)
+{
+   struct kfd_dbg_device_info_entry device_info;
+   uint32_t tmp_entry_size = *entry_size, tmp_num_devices;
+   int i, r = 0;
+
+   if (!(target && user_info && number_of_device_infos && entry_size))
+   return -EINVAL;
+
+   tmp_num_devices = min_t(size_t, *number_of_device_infos, 
target->n_pdds);
+   *number_of_device_infos = target->n_pdds;
+   *entry_size = min_t(size_t, *entry_size, sizeof(device_info));
+
+   if (!tmp_num_devices)
+   return 0;
+
+   memset(&device_info, 0, sizeof(device_info));
+
+   mutex_lock(&target->event_mutex);
+
+   /* Run over all pdd of the process */
+   for (i = 0; i < tmp_num_devices; i++) {
+   struct kfd_process_device *pdd = target->pdds[i];
+   struct kfd_topology_device *topo_dev = 
kfd_topology_device_by_id(pdd->dev->id);
+
+   device_info.gpu_id = pdd->dev->id;
+   device_info.exception_status = pdd->exception_status;
+   device_info.lds_base = pdd->lds_base;
+   device_info.lds_limit = pdd->lds_limit;
+   device_info.scratch_base = pdd->scratch_base;
+   device_info.scratch_limit = pdd->scratch_limit;
+   device_info.gpuvm_base = pdd->gpuvm_base;
+   device_info.gpuvm_limit = pdd->gpuvm_limit;
+   device_info.location_id = topo_dev->node_props.location_id;
+   device_info.vendor_id = topo_dev->node_props.vendor_id;
+   device_info.device_id = topo_dev->node_props.device_id;
+   device_info.revision_id = pdd->dev->adev->pdev->revision;
+   device_info.subsystem_vendor_id = 
pdd->dev->adev->pdev->subsystem_vendor;
+   device_info.subsystem_device_id = 
pdd->dev->adev->pdev->subsystem_device;
+   device_info.fw_version = pdd->dev->kfd->mec_fw_version;
+   device_info.gfx_target_version =
+   topo_dev->node_props.gfx_target_version;
+   device_info.simd_count = topo_dev->node_props.simd_count;
+   device_info.max_waves_per_simd =
+   topo_dev->node_props.max_waves_per_simd;
+   device_info.array_count = topo_dev->node_props.array_count;
+   device_info.simd_arrays_per_engine =
+   topo_dev->node_props.simd_arrays_per_engine;
+   device_info.num_xcc = NUM_XCC(pdd->dev->xcc_mask);
+   device_info.capability = topo_dev->node_props.capability;
+   device_info.debug_prop = topo_dev->node_props.debug_

[PATCH 29/33] drm/amdkfd: add debug query event operation

2023-05-25 Thread Jonathan Kim

Allow the debugger to query a single queue, device and process
exception.
The KFD should also return the GPU or Queue id of the exception.
The debugger also has the option of clearing exceptions after
being queried.
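
The lookup order used by the patch (queues first, then devices, then the
process itself) can be sketched on a flattened state. struct fake_process,
query_event and the fixed array sizes are hypothetical; locking and the real
queue and device lists are left out.

```c
#include <stddef.h>
#include <stdint.h>

enum event_source { SRC_NONE, SRC_QUEUE, SRC_DEVICE, SRC_PROCESS };

/* Hypothetical, flattened view of the per-process exception state. */
struct fake_process {
	uint64_t enable_mask;      /* exceptions the debugger subscribed to */
	uint64_t queue_status[2];
	uint64_t device_status[2];
	uint64_t process_status;
};

/*
 * Report the first pending, subscribed event, checking queues, then
 * devices, then the process. The full status word of the winning source
 * is returned through *status and the bits in clear_mask are cleared
 * from that source only. *index is the queue or device slot.
 */
static enum event_source query_event(struct fake_process *p,
				     uint64_t clear_mask,
				     uint64_t *status, size_t *index)
{
	size_t i;

	*status = 0;
	*index = 0;

	for (i = 0; i < 2; i++) {
		if (!(p->enable_mask & p->queue_status[i]))
			continue;
		*status = p->queue_status[i];
		*index = i;
		p->queue_status[i] &= ~clear_mask;
		return SRC_QUEUE;
	}

	for (i = 0; i < 2; i++) {
		if (!(p->enable_mask & p->device_status[i]))
			continue;
		*status = p->device_status[i];
		*index = i;
		p->device_status[i] &= ~clear_mask;
		return SRC_DEVICE;
	}

	if (p->enable_mask & p->process_status) {
		*status = p->process_status;
		p->process_status &= ~clear_mask;
		return SRC_PROCESS;
	}

	return SRC_NONE;
}
```

Only one event is reported per call, so a debugger would call this in a loop
until nothing is pending.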

Signed-off-by: Jonathan Kim 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |  6 +++
 drivers/gpu/drm/amd/amdkfd/kfd_debug.c   | 64 
 drivers/gpu/drm/amd/amdkfd/kfd_debug.h   |  5 ++
 3 files changed, 75 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index e5d95b144dcd..ebb2088d12fa 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -3038,6 +3038,12 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, 
struct kfd_process *p, v
r = kfd_dbg_trap_set_flags(target, &args->set_flags.flags);
break;
case KFD_IOC_DBG_TRAP_QUERY_DEBUG_EVENT:
+   r = kfd_dbg_ev_query_debug_event(target,
+   &args->query_debug_event.queue_id,
+   &args->query_debug_event.gpu_id,
+   args->query_debug_event.exception_mask,
+   &args->query_debug_event.exception_mask);
+   break;
case KFD_IOC_DBG_TRAP_QUERY_EXCEPTION_INFO:
case KFD_IOC_DBG_TRAP_GET_QUEUE_SNAPSHOT:
case KFD_IOC_DBG_TRAP_GET_DEVICE_SNAPSHOT:
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
index 43c3170998d3..e9530e682e85 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
@@ -27,6 +27,70 @@
 
 #define MAX_WATCH_ADDRESSES4
 
+int kfd_dbg_ev_query_debug_event(struct kfd_process *process,
+ unsigned int *queue_id,
+ unsigned int *gpu_id,
+ uint64_t exception_clear_mask,
+ uint64_t *event_status)
+{
+   struct process_queue_manager *pqm;
+   struct process_queue_node *pqn;
+   int i;
+
+   if (!(process && process->debug_trap_enabled))
+   return -ENODATA;
+
+   mutex_lock(&process->event_mutex);
+   *event_status = 0;
+   *queue_id = 0;
+   *gpu_id = 0;
+
+   /* find and report queue events */
+   pqm = &process->pqm;
+   list_for_each_entry(pqn, &pqm->queues, process_queue_list) {
+   uint64_t tmp = process->exception_enable_mask;
+
+   if (!pqn->q)
+   continue;
+
+   tmp &= pqn->q->properties.exception_status;
+
+   if (!tmp)
+   continue;
+
+   *event_status = pqn->q->properties.exception_status;
+   *queue_id = pqn->q->properties.queue_id;
+   *gpu_id = pqn->q->device->id;
+   pqn->q->properties.exception_status &= ~exception_clear_mask;
+   goto out;
+   }
+
+   /* find and report device events */
+   for (i = 0; i < process->n_pdds; i++) {
+   struct kfd_process_device *pdd = process->pdds[i];
+   uint64_t tmp = process->exception_enable_mask
+   & pdd->exception_status;
+
+   if (!tmp)
+   continue;
+
+   *event_status = pdd->exception_status;
+   *gpu_id = pdd->dev->id;
+   pdd->exception_status &= ~exception_clear_mask;
+   goto out;
+   }
+
+   /* report process events */
+   if (process->exception_enable_mask & process->exception_status) {
+   *event_status = process->exception_status;
+   process->exception_status &= ~exception_clear_mask;
+   }
+
+out:
+   mutex_unlock(&process->event_mutex);
+   return *event_status ? 0 : -EAGAIN;
+}
+
 void debug_event_write_work_handler(struct work_struct *work)
 {
struct kfd_process *process;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
index ef8e9f7f1716..e78f954c0684 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
@@ -27,6 +27,11 @@
 
 void kfd_dbg_trap_deactivate(struct kfd_process *target, bool unwind, int 
unwind_count);
 int kfd_dbg_trap_activate(struct kfd_process *target);
+int kfd_dbg_ev_query_debug_event(struct kfd_process *process,
+   unsigned int *queue_id,
+   unsigned int *gpu_id,
+   uint64_t exception_clear_mask,
+   uint64_t *event_status);
 bool kfd_set_dbg_ev_from_interrupt(struct kfd_node *dev,
   unsigned int pasid,
   uint32_t doorbell_id,
-- 
2.25.1

[PATCH 22/33] drm/amdkfd: update process interrupt handling for debug events

2023-05-25 Thread Jonathan Kim

The debugger must be notified of any debugger-subscribed exception
that comes from hardware interrupts.

If a debugger session exits, any exceptions it subscribed to may still
have interrupts in the interrupt ring buffer or KGD/KFD pipeline.
To prevent a new session from inheriting stale interrupts, when a new
queue is created, open an interrupt drain and allow the IH ring to drain
from a timestamped checkpoint.  Then inject a custom IV so that once
the custom IV is picked up by the KFD, it's safe to close the drain
and proceed with queue creation.

The drain must also happen on debug disable, as SW interrupts may still
be processed.  Drain at this time and clear all exception status.

The debugger may also not be attached, or not subscribed to certain
exceptions, so forward those directly to the runtime.

GFX10 also requires its own IV processing, hence the creation of
kfd_int_process_v10.c.  This is because the IVs from SQ interrupts are
packed into a new contiguous format, unlike GFX9.  To make this clear,
a separate interrupt handling file was created.

v2: use new kfd_node struct in prototypes.

Signed-off-by: Jonathan Kim 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c|  16 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|   2 +
 drivers/gpu/drm/amd/amdkfd/Makefile   |   1 +
 drivers/gpu/drm/amd/amdkfd/kfd_debug.c|  84 
 drivers/gpu/drm/amd/amdkfd/kfd_debug.h|   6 +
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   |   4 +-
 .../gpu/drm/amd/amdkfd/kfd_int_process_v10.c  | 405 ++
 .../gpu/drm/amd/amdkfd/kfd_int_process_v11.c  |  21 +-
 .../gpu/drm/amd/amdkfd/kfd_int_process_v9.c   |  98 -
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  12 +
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  |  47 ++
 .../amd/amdkfd/kfd_process_queue_manager.c|   4 +
 12 files changed, 680 insertions(+), 20 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/amdkfd/kfd_int_process_v10.c

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 66f80b9ab0c5..98cd52bb005f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -777,6 +777,22 @@ void amdgpu_amdkfd_ras_poison_consumption_handler(struct 
amdgpu_device *adev, bo
amdgpu_umc_poison_handler(adev, reset);
 }
 
+int amdgpu_amdkfd_send_close_event_drain_irq(struct amdgpu_device *adev,
+   uint32_t *payload)
+{
+   int ret;
+
+   /* Device or IH ring is not ready so bail. */
+   ret = amdgpu_ih_wait_on_checkpoint_process_ts(adev, &adev->irq.ih);
+   if (ret)
+   return ret;
+
+   /* Send payload to fence KFD interrupts */
+   amdgpu_amdkfd_interrupt(adev, payload);
+
+   return 0;
+}
+
 bool amdgpu_amdkfd_ras_query_utcl2_poison_status(struct amdgpu_device *adev)
 {
if (adev->gfx.ras && adev->gfx.ras->query_utcl2_poison_status)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 94cc456761e5..dd740e64e6e1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -250,6 +250,8 @@ int amdgpu_amdkfd_get_xgmi_bandwidth_mbytes(struct 
amdgpu_device *dst,
struct amdgpu_device *src,
bool is_min);
 int amdgpu_amdkfd_get_pcie_bandwidth_mbytes(struct amdgpu_device *adev, bool 
is_min);
+int amdgpu_amdkfd_send_close_event_drain_irq(struct amdgpu_device *adev,
+   uint32_t *payload);
 
 /* Read user wptr from a specified user address space with page fault
  * disabled. The memory must be pinned and mapped to the hardware when
diff --git a/drivers/gpu/drm/amd/amdkfd/Makefile 
b/drivers/gpu/drm/amd/amdkfd/Makefile
index 747754428073..2ec8f27c5366 100644
--- a/drivers/gpu/drm/amd/amdkfd/Makefile
+++ b/drivers/gpu/drm/amd/amdkfd/Makefile
@@ -53,6 +53,7 @@ AMDKFD_FILES  := $(AMDKFD_PATH)/kfd_module.o \
$(AMDKFD_PATH)/kfd_events.o \
$(AMDKFD_PATH)/cik_event_interrupt.o \
$(AMDKFD_PATH)/kfd_int_process_v9.o \
+   $(AMDKFD_PATH)/kfd_int_process_v10.o \
$(AMDKFD_PATH)/kfd_int_process_v11.o \
$(AMDKFD_PATH)/kfd_smi_events.o \
$(AMDKFD_PATH)/kfd_crat.o \
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
index 17e8e9edccbf..68b657398d41 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
@@ -125,6 +125,64 @@ bool kfd_dbg_ev_raise(uint64_t event_mask,
return is_subscribed;
 }
 
+/* set pending event queue entry from ring entry  */
+bool kfd_set_dbg_ev_from_interrupt(struct kfd_node *dev,
+  unsigned int pasid,
+  uint32_t doorbell_id,
+

[PATCH 23/33] drm/amdkfd: add debug set exceptions enabled operation

2023-05-25 Thread Jonathan Kim

The debugger subscribes to notification for requested exceptions on attach.
Allow the debugger to change its subscription later on.
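
A minimal sketch of the mask update follows: if any exception that is already
pending falls inside the new subscription, the debugger needs to be woken.
update_subscription and the flat `pending` array are hypothetical; the patch
itself walks the process, queue and device status words under event_mutex.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Replace *enable_mask with new_mask. Returns true when at least one
 * already-pending exception (the OR of pending[0..n-1]) is covered by
 * the new mask, meaning the debugger should be notified right away.
 */
static bool update_subscription(uint64_t *enable_mask, uint64_t new_mask,
				const uint64_t *pending, size_t n)
{
	uint64_t found = 0;
	size_t i;

	for (i = 0; i < n; i++)
		found |= pending[i];

	*enable_mask = new_mask;

	return (new_mask & found) != 0;
}
```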

Signed-off-by: Jonathan Kim 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |  3 ++
 drivers/gpu/drm/amd/amdkfd/kfd_debug.c   | 36 
 drivers/gpu/drm/amd/amdkfd/kfd_debug.h   |  2 ++
 3 files changed, 41 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 73cb5abce431..80d354eade35 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2980,6 +2980,9 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, 
struct kfd_process *p, v
args->send_runtime_event.exception_mask);
break;
case KFD_IOC_DBG_TRAP_SET_EXCEPTIONS_ENABLED:
+   kfd_dbg_set_enabled_debug_exception_mask(target,
+   args->set_exceptions_enabled.exception_mask);
+   break;
case KFD_IOC_DBG_TRAP_SET_WAVE_LAUNCH_OVERRIDE:
case KFD_IOC_DBG_TRAP_SET_WAVE_LAUNCH_MODE:
case KFD_IOC_DBG_TRAP_SUSPEND_QUEUES:
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
index 68b657398d41..48a4e3cc2234 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
@@ -521,3 +521,39 @@ int kfd_dbg_trap_enable(struct kfd_process *target, 
uint32_t fd,
 
return r;
 }
+
+void kfd_dbg_set_enabled_debug_exception_mask(struct kfd_process *target,
+   uint64_t exception_set_mask)
+{
+   uint64_t found_mask = 0;
+   struct process_queue_manager *pqm;
+   struct process_queue_node *pqn;
+   static const char write_data = '.';
+   loff_t pos = 0;
+   int i;
+
+   mutex_lock(&target->event_mutex);
+
+   found_mask |= target->exception_status;
+
+   pqm = &target->pqm;
+   list_for_each_entry(pqn, &pqm->queues, process_queue_list) {
+   if (!pqn)
+   continue;
+
+   found_mask |= pqn->q->properties.exception_status;
+   }
+
+   for (i = 0; i < target->n_pdds; i++) {
+   struct kfd_process_device *pdd = target->pdds[i];
+
+   found_mask |= pdd->exception_status;
+   }
+
+   if (exception_set_mask & found_mask)
+   kernel_write(target->dbg_ev_file, &write_data, 1, &pos);
+
+   target->exception_enable_mask = exception_set_mask;
+
+   mutex_unlock(&target->event_mutex);
+}
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
index 5153ccbd7fd1..6c1054a08872 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
@@ -56,6 +56,8 @@ static inline bool kfd_dbg_is_per_vmid_supported(struct 
kfd_node *dev)
 
 void debug_event_write_work_handler(struct work_struct *work);
 
+void kfd_dbg_set_enabled_debug_exception_mask(struct kfd_process *target,
+   uint64_t exception_set_mask);
 /*
  * If GFX off is enabled, chips that do not support RLC restore for the debug
  * registers will disable GFX off temporarily for the entire debug session.
-- 
2.25.1

[PATCH 25/33] drm/amdkfd: add debug wave launch mode operation

2023-05-25 Thread Jonathan Kim

Allow the debugger to set wave behaviour to one of: operate normally,
halt at launch, trap on every instruction, terminate immediately or
stall on allocation.
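
The GFX10 path in this patch packs a per-VMID mask and the mode into a single
register value. The sketch below shows that packing only. The FAKE_* field
positions and widths are invented for illustration and do not reflect the
real SPI_GDBG_WAVE_CNTL2 layout, and set_field is a hypothetical helper
standing in for REG_SET_FIELD.

```c
#include <stdint.h>

/* Hypothetical field layout; the real one comes from the hardware headers. */
#define FAKE_VMID_MASK_SHIFT 0
#define FAKE_VMID_MASK_MASK  0x0000ffffu
#define FAKE_MODE_SHIFT      16
#define FAKE_MODE_MASK       0x00030000u

/* Replace one field of reg with val, leaving the other bits untouched. */
static uint32_t set_field(uint32_t reg, uint32_t mask, unsigned int shift,
			  uint32_t val)
{
	return (reg & ~mask) | ((val << shift) & mask);
}

/*
 * Build the register value for a launch mode. Mode 0 is treated as
 * "normal": both the VMID mask and the mode field are left at zero.
 * Any other mode selects the given vmid and stores the mode.
 */
static uint32_t pack_wave_launch_mode(uint8_t mode, uint32_t vmid)
{
	uint32_t data = 0;

	data = set_field(data, FAKE_VMID_MASK_MASK, FAKE_VMID_MASK_SHIFT,
			 mode ? 1u << vmid : 0);
	data = set_field(data, FAKE_MODE_MASK, FAKE_MODE_SHIFT, mode);

	return data;
}
```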

v2: fixup with new kfd_node struct reference for mes check

Signed-off-by: Jonathan Kim 
---
 .../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c  | 12 +++
 .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c   |  1 +
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 25 +
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.h|  3 ++
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c  |  3 +-
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c| 14 +++-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 25 +
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.h |  3 ++
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |  3 ++
 drivers/gpu/drm/amd/amdkfd/kfd_debug.c| 36 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_debug.h|  2 ++
 11 files changed, 124 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
index d7881bbd828d..774ecfc3451a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
@@ -107,6 +107,17 @@ static uint32_t 
kgd_aldebaran_set_wave_launch_trap_override(struct amdgpu_device
return data;
 }
 
+static uint32_t kgd_aldebaran_set_wave_launch_mode(struct amdgpu_device *adev,
+   uint8_t wave_launch_mode,
+   uint32_t vmid)
+{
+   uint32_t data = 0;
+
+   data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, LAUNCH_MODE, 
wave_launch_mode);
+
+   return data;
+}
+
 const struct kfd2kgd_calls aldebaran_kfd2kgd = {
.program_sh_mem_settings = kgd_gfx_v9_program_sh_mem_settings,
.set_pasid_vmid_mapping = kgd_gfx_v9_set_pasid_vmid_mapping,
@@ -129,6 +140,7 @@ const struct kfd2kgd_calls aldebaran_kfd2kgd = {
.disable_debug_trap = kgd_aldebaran_disable_debug_trap,
.validate_trap_override_request = 
kgd_aldebaran_validate_trap_override_request,
.set_wave_launch_trap_override = 
kgd_aldebaran_set_wave_launch_trap_override,
+   .set_wave_launch_mode = kgd_aldebaran_set_wave_launch_mode,
.get_iq_wait_times = kgd_gfx_v9_get_iq_wait_times,
.build_grace_period_packet_info = 
kgd_gfx_v9_build_grace_period_packet_info,
.program_trap_handler_settings = 
kgd_gfx_v9_program_trap_handler_settings,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
index ec2587664001..fbdc1b7b1e42 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
@@ -412,6 +412,7 @@ const struct kfd2kgd_calls arcturus_kfd2kgd = {
.disable_debug_trap = kgd_arcturus_disable_debug_trap,
.validate_trap_override_request = 
kgd_gfx_v9_validate_trap_override_request,
.set_wave_launch_trap_override = 
kgd_gfx_v9_set_wave_launch_trap_override,
+   .set_wave_launch_mode = kgd_gfx_v9_set_wave_launch_mode,
.get_iq_wait_times = kgd_gfx_v9_get_iq_wait_times,
.build_grace_period_packet_info = 
kgd_gfx_v9_build_grace_period_packet_info,
.get_cu_occupancy = kgd_gfx_v9_get_cu_occupancy,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
index 7ea0362dcab3..a7a6edda557f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
@@ -856,6 +856,30 @@ uint32_t kgd_gfx_v10_set_wave_launch_trap_override(struct 
amdgpu_device *adev,
return 0;
 }
 
+uint32_t kgd_gfx_v10_set_wave_launch_mode(struct amdgpu_device *adev,
+   uint8_t wave_launch_mode,
+   uint32_t vmid)
+{
+   uint32_t data = 0;
+   bool is_mode_set = !!wave_launch_mode;
+
+   mutex_lock(&adev->grbm_idx_mutex);
+
+   kgd_gfx_v10_set_wave_launch_stall(adev, vmid, true);
+
+   data = REG_SET_FIELD(data, SPI_GDBG_WAVE_CNTL2,
+   VMID_MASK, is_mode_set ? 1 << vmid : 0);
+   data = REG_SET_FIELD(data, SPI_GDBG_WAVE_CNTL2,
+   MODE, is_mode_set ? wave_launch_mode : 0);
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_WAVE_CNTL2), data);
+
+   kgd_gfx_v10_set_wave_launch_stall(adev, vmid, false);
+
+   mutex_unlock(&adev->grbm_idx_mutex);
+
+   return 0;
+}
+
 /* kgd_gfx_v10_get_iq_wait_times: Returns the mmCP_IQ_WAIT_TIME1/2 values
  * The values read are:
  * ib_offload_wait_time -- Wait Count for Indirect Buffer Offloads.
@@ -944,6 +968,7 @@ const struct kfd2kgd_calls gfx_v10_kfd2kgd = {
.disable_debug_trap = kgd_gfx_v10_disable_debug_trap,
.validate_trap_override_request = 
kgd_gfx_v10_validate_trap_override_request,
.set_wav

[PATCH 27/33] drm/amdkfd: add debug set and clear address watch points operation

2023-05-25 Thread Jonathan Kim

Shader read, write and atomic memory operations can be reported to the
debugger as an address watch exception.

Allow the debugger to pass in a watch point to a particular memory
address per device.

Note that only 4 watch points exist per device to date, so have
the KFD keep track of which watch points are allocated.
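
The bookkeeping described above can be sketched as a small bitmask
allocator. This is only an illustration, not the code from this patch:
the struct and function names (watch_alloc, watch_point_get,
watch_point_put) are hypothetical, and locking is left out.

```c
#include <stdint.h>

#define MAX_WATCH_POINTS 4 /* per the commit message: 4 watch points per device */

/* Hypothetical per-device bookkeeping: bit i set means watch point i is in use. */
struct watch_alloc {
	uint32_t alloc_mask;
};

/* Claim the lowest free watch point; returns its id, or -1 if all are taken. */
int watch_point_get(struct watch_alloc *wa)
{
	for (int i = 0; i < MAX_WATCH_POINTS; i++) {
		if (!(wa->alloc_mask & (1u << i))) {
			wa->alloc_mask |= 1u << i;
			return i;
		}
	}
	return -1;
}

/* Release a watch point; returns -1 if the id is out of range or not allocated. */
int watch_point_put(struct watch_alloc *wa, uint32_t id)
{
	if (id >= MAX_WATCH_POINTS || !(wa->alloc_mask & (1u << id)))
		return -1;
	wa->alloc_mask &= ~(1u << id);
	return 0;
}
```

A fifth request fails until an earlier watch point is released, which
is the condition the debugger would see as "no watch point available".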

v2: fixup with new kfd_node struct reference for mes and watch point
checks

Signed-off-by: Jonathan Kim 
---
 .../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c  |  51 +++
 .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c   |   2 +
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c|  78 ++
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.h|   8 ++
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c  |   5 +-
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c|  52 ++-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |  77 ++
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.h |   8 ++
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |  24 
 drivers/gpu/drm/amd/amdkfd/kfd_debug.c| 136 ++
 drivers/gpu/drm/amd/amdkfd/kfd_debug.h|   8 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   |   2 +
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |   6 +-
 13 files changed, 452 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
index 774ecfc3451a..efd6a72aab4e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
@@ -118,6 +118,55 @@ static uint32_t kgd_aldebaran_set_wave_launch_mode(struct 
amdgpu_device *adev,
return data;
 }
 
+#define TCP_WATCH_STRIDE (regTCP_WATCH1_ADDR_H - regTCP_WATCH0_ADDR_H)
+static uint32_t kgd_gfx_aldebaran_set_address_watch(
+   struct amdgpu_device *adev,
+   uint64_t watch_address,
+   uint32_t watch_address_mask,
+   uint32_t watch_id,
+   uint32_t watch_mode,
+   uint32_t debug_vmid)
+{
+   uint32_t watch_address_high;
+   uint32_t watch_address_low;
+   uint32_t watch_address_cntl;
+
+   watch_address_cntl = 0;
+   watch_address_low = lower_32_bits(watch_address);
+   watch_address_high = upper_32_bits(watch_address) & 0x;
+
+   watch_address_cntl = REG_SET_FIELD(watch_address_cntl,
+   TCP_WATCH0_CNTL,
+   MODE,
+   watch_mode);
+
+   watch_address_cntl = REG_SET_FIELD(watch_address_cntl,
+   TCP_WATCH0_CNTL,
+   MASK,
+   watch_address_mask >> 6);
+
+   watch_address_cntl = REG_SET_FIELD(watch_address_cntl,
+   TCP_WATCH0_CNTL,
+   VALID,
+   1);
+
+   WREG32_RLC((SOC15_REG_OFFSET(GC, 0, regTCP_WATCH0_ADDR_H) +
+   (watch_id * TCP_WATCH_STRIDE)),
+   watch_address_high);
+
+   WREG32_RLC((SOC15_REG_OFFSET(GC, 0, regTCP_WATCH0_ADDR_L) +
+   (watch_id * TCP_WATCH_STRIDE)),
+   watch_address_low);
+
+   return watch_address_cntl;
+}
+
+uint32_t kgd_gfx_aldebaran_clear_address_watch(struct amdgpu_device *adev,
+   uint32_t watch_id)
+{
+   return 0;
+}
+
 const struct kfd2kgd_calls aldebaran_kfd2kgd = {
.program_sh_mem_settings = kgd_gfx_v9_program_sh_mem_settings,
.set_pasid_vmid_mapping = kgd_gfx_v9_set_pasid_vmid_mapping,
@@ -141,6 +190,8 @@ const struct kfd2kgd_calls aldebaran_kfd2kgd = {
.validate_trap_override_request = 
kgd_aldebaran_validate_trap_override_request,
.set_wave_launch_trap_override = 
kgd_aldebaran_set_wave_launch_trap_override,
.set_wave_launch_mode = kgd_aldebaran_set_wave_launch_mode,
+   .set_address_watch = kgd_gfx_aldebaran_set_address_watch,
+   .clear_address_watch = kgd_gfx_aldebaran_clear_address_watch,
.get_iq_wait_times = kgd_gfx_v9_get_iq_wait_times,
.build_grace_period_packet_info = 
kgd_gfx_v9_build_grace_period_packet_info,
.program_trap_handler_settings = 
kgd_gfx_v9_program_trap_handler_settings,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
index fbdc1b7b1e42..6df215aba4c4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
@@ -413,6 +413,8 @@ const struct kfd2kgd_calls arcturus_kfd2kgd = {
.validate_trap_override_request = 
kgd_gfx_v9_validate_trap_override_request,
.set_wave_launch_trap_override = 
kgd_gfx_v9_set_wave_launch_trap_override,
.set_wave_launch_mode = kgd_gfx_v9_set_wave_launch_mode,
+   .set_address_watch =

[PATCH 33/33] drm/amdkfd: bump kfd ioctl minor version for debug api availability

2023-05-25 Thread Jonathan Kim

Bump the minor version to declare that debugging capability is now
available.
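
As an illustration of how a user-space tool might consume the bump, the
sketch below compares a reported version against 1.13. The helper name
is hypothetical, and requiring an exact major version match is an
assumption of this sketch, not something the patch states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Version at which this patch declares the debugger API available. */
#define DEBUG_API_MAJOR 1u
#define DEBUG_API_MINOR 13u

/*
 * Hypothetical helper: decide from the major/minor pair reported by the
 * KFD whether the debugger API can be used. Assumes (for this sketch only)
 * that the major version must match exactly.
 */
bool kfd_version_has_debug_api(uint32_t major, uint32_t minor)
{
	return major == DEBUG_API_MAJOR && minor >= DEBUG_API_MINOR;
}
```

The two arguments would come from the major_version and minor_version
fields of struct kfd_ioctl_get_version_args.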

Signed-off-by: Jonathan Kim 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 1 -
 include/uapi/linux/kfd_ioctl.h   | 3 ++-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index f522325b409b..56f55da482e2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2984,7 +2984,6 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, 
struct kfd_process *p, v
if (!r)
target->exception_enable_mask = 
args->enable.exception_mask;
 
-   pr_warn("Debug functions limited\n");
break;
case KFD_IOC_DBG_TRAP_DISABLE:
r = kfd_dbg_trap_disable(target);
diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index dfe745ee427e..ea0d50955eac 100644
--- a/include/uapi/linux/kfd_ioctl.h
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -38,9 +38,10 @@
  * - 1.10 - Add SMI profiler event log
  * - 1.11 - Add unified memory for ctx save/restore area
  * - 1.12 - Add DMA buf export ioctl
+ * - 1.13 - Add debugger API
  */
 #define KFD_IOCTL_MAJOR_VERSION 1
-#define KFD_IOCTL_MINOR_VERSION 12
+#define KFD_IOCTL_MINOR_VERSION 13
 
 struct kfd_ioctl_get_version_args {
__u32 major_version;/* from KFD */
-- 
2.25.1

[PATCH 17/33] drm/amdkfd: apply trap workaround for gfx11

2023-05-25 Thread Jonathan Kim

Due to a HW bug, waves in only half the shader arrays can enter trap.

When starting a debug session, relocate all waves to the first shader
array of each shader engine and mask off the 2nd shader array as
unavailable.

When ending a debug session, re-enable the 2nd shader array per
shader engine.

User CU masking per queue cannot be guaranteed to remain functional
if requested during debugging (e.g. user cu mask requests only 2nd shader
array as an available resource leading to zero HW resources available)
nor can runtime be alerted of any of these changes during execution.

Make user CU masking and debugging mutually exclusive with respect to
availability.

If the debugger tries to attach to a process with a user cu masked
queue, return the runtime status as enabled but busy.

If the debugger tries to attach and fails to reallocate queue waves to
the first shader array of each shader engine, return the runtime status
as enabled but with an error.
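
The two attach outcomes above can be modelled roughly as follows. This
is a sketch under stated assumptions: the enum, struct and function
names are hypothetical, and checking every queue for a user CU mask
before attempting any relocation is a choice made for the sketch, not
necessarily the order used by the driver.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the runtime status values named above. */
enum attach_status {
	ATTACH_ENABLED,
	ATTACH_ENABLED_BUSY,  /* a queue has a user CU mask */
	ATTACH_ENABLED_ERROR, /* waves could not be moved to the first shader array */
};

/* Hypothetical per-queue model; relocate_ok stands in for the HW operation. */
struct queue_model {
	bool has_user_cu_mask;
	bool relocate_ok;
};

enum attach_status debug_attach_status(const struct queue_model *q, size_t n)
{
	size_t i;

	/* User CU masking and debugging are mutually exclusive. */
	for (i = 0; i < n; i++)
		if (q[i].has_user_cu_mask)
			return ATTACH_ENABLED_BUSY;

	/* Relocate each queue's waves to the first shader array per engine. */
	for (i = 0; i < n; i++)
		if (!q[i].relocate_ok)
			return ATTACH_ENABLED_ERROR;

	return ATTACH_ENABLED;
}
```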

In addition, as on any other device that supports multi-process debug,
disable trap temporary setup per-process to avoid the performance impact
of setup overhead.

Signed-off-by: Jonathan Kim 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h   |  2 +
 drivers/gpu/drm/amd/amdgpu/mes_v11_0.c|  7 +--
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |  2 -
 drivers/gpu/drm/amd/amdkfd/kfd_debug.c| 57 +++
 drivers/gpu/drm/amd/amdkfd/kfd_debug.h|  3 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager.c |  7 +++
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  |  3 +-
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c  |  3 +-
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c  | 42 ++
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   |  3 +-
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   |  3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  5 +-
 .../amd/amdkfd/kfd_process_queue_manager.c|  9 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c |  7 ++-
 14 files changed, 122 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index d20df0cf0d88..b5f5eed2b5ef 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -219,6 +219,8 @@ struct mes_add_queue_input {
uint32_t gws_size;
uint64_t tba_addr;
uint64_t tma_addr;
+   uint32_t trap_en;
+   uint32_t skip_process_ctx_clear;
uint32_t is_kfd_process;
uint32_t is_aql_queue;
uint32_t queue_size;
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
index 861910a6662d..c4e3cb8d44de 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
@@ -202,17 +202,14 @@ static int mes_v11_0_add_hw_queue(struct amdgpu_mes *mes,
mes_add_queue_pkt.gws_size = input->gws_size;
mes_add_queue_pkt.trap_handler_addr = input->tba_addr;
mes_add_queue_pkt.tma_addr = input->tma_addr;
+   mes_add_queue_pkt.trap_en = input->trap_en;
+   mes_add_queue_pkt.skip_process_ctx_clear = 
input->skip_process_ctx_clear;
mes_add_queue_pkt.is_kfd_process = input->is_kfd_process;
 
/* For KFD, gds_size is re-used for queue size (needed in MES for AQL 
queues) */
mes_add_queue_pkt.is_aql_queue = input->is_aql_queue;
mes_add_queue_pkt.gds_size = input->queue_size;
 
-   if (!(((adev->mes.sched_version & AMDGPU_MES_VERSION_MASK) >= 4) &&
- (adev->ip_versions[GC_HWIP][0] >= IP_VERSION(11, 0, 0)) &&
- (adev->ip_versions[GC_HWIP][0] <= IP_VERSION(11, 0, 3))))
-   mes_add_queue_pkt.trap_en = 1;
-
/* For KFD, gds_size is re-used for queue size (needed in MES for AQL 
queues) */
mes_add_queue_pkt.is_aql_queue = input->is_aql_queue;
mes_add_queue_pkt.gds_size = input->queue_size;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 9d0c247f80fe..a5c457863048 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -537,8 +537,6 @@ static int kfd_ioctl_set_cu_mask(struct file *filp, struct 
kfd_process *p,
goto out;
}
 
-   minfo.update_flag = UPDATE_FLAG_CU_MASK;
-
mutex_lock(&p->mutex);
 
retval = pqm_update_mqd(&p->pqm, args->queue_id, &minfo);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
index 73b07b5f17f1..5e2ee2d1acc4 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
@@ -24,6 +24,57 @@
 #include "kfd_device_queue_manager.h"
 #include 
 
+static int kfd_dbg_set_queue_workaround(struct queue *q, bool enable)
+{
+   struct mqd_update_info minfo = {0};
+   int err;
+
+   if (!q)
+   return 0;
+
+   if (KFD_GC_VE

[PATCH 21/33] drm/amdkfd: add debug trap enabled flag to tma

2023-05-25 Thread Jonathan Kim

From: Jay Cornwall 

Trap handler behavior will differ when a debugger is attached.

Make the debug trap flag available in the trap handler TMA.
Update it when the debug trap ioctl is invoked.

Signed-off-by: Jay Cornwall 
Reviewed-by: Felix Kuehling 
Signed-off-by: Jonathan Kim 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_debug.c   | 11 +++
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  2 ++
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 15 +++
 3 files changed, 28 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
index a19c21d04438..17e8e9edccbf 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
@@ -256,6 +256,8 @@ void kfd_dbg_trap_deactivate(struct kfd_process *target, 
bool unwind, int unwind
if (unwind && i == unwind_count)
break;
 
+   kfd_process_set_trap_debug_flag(&pdd->qpd, false);
+
/* GFX off is already disabled by debug activate if not RLC 
restore supported. */
if (kfd_dbg_is_rlc_restore_supported(pdd->dev))
amdgpu_gfx_off_ctrl(pdd->dev->adev, false);
@@ -351,6 +353,15 @@ int kfd_dbg_trap_activate(struct kfd_process *target)
if (kfd_dbg_is_rlc_restore_supported(pdd->dev))
amdgpu_gfx_off_ctrl(pdd->dev->adev, true);
 
+   /*
+* Setting the debug flag in the trap handler requires that the 
TMA has been
+* allocated, which occurs during CWSR initialization.
+* In the event that CWSR has not been initialized at this 
point, setting the
+* flag will be called again during CWSR initialization if the 
target process
+* is still debug enabled.
+*/
+   kfd_process_set_trap_debug_flag(&pdd->qpd, true);
+
if (!pdd->dev->kfd->shared_resources.enable_mes)
r = debug_refresh_runlist(pdd->dev->dqm);
else
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 4b80f74b9de0..a02fb939614a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -1157,6 +1157,8 @@ int kfd_init_apertures(struct kfd_process *process);
 void kfd_process_set_trap_handler(struct qcm_process_device *qpd,
  uint64_t tba_addr,
  uint64_t tma_addr);
+void kfd_process_set_trap_debug_flag(struct qcm_process_device *qpd,
+bool enabled);
 
 /* CWSR initialization */
 int kfd_process_init_cwsr_apu(struct kfd_process *process, struct file *filep);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 8bfd0c91fb92..2a60c630ab5d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1309,6 +1309,8 @@ int kfd_process_init_cwsr_apu(struct kfd_process *p, 
struct file *filep)
 
memcpy(qpd->cwsr_kaddr, dev->kfd->cwsr_isa, 
dev->kfd->cwsr_isa_size);
 
+   kfd_process_set_trap_debug_flag(qpd, p->debug_trap_enabled);
+
qpd->tma_addr = qpd->tba_addr + KFD_CWSR_TMA_OFFSET;
pr_debug("set tba :0x%llx, tma:0x%llx, cwsr_kaddr:%p for 
pqm.\n",
qpd->tba_addr, qpd->tma_addr, qpd->cwsr_kaddr);
@@ -1345,6 +1347,9 @@ static int kfd_process_device_init_cwsr_dgpu(struct 
kfd_process_device *pdd)
 
memcpy(qpd->cwsr_kaddr, dev->kfd->cwsr_isa, dev->kfd->cwsr_isa_size);
 
+   kfd_process_set_trap_debug_flag(&pdd->qpd,
+   pdd->process->debug_trap_enabled);
+
qpd->tma_addr = qpd->tba_addr + KFD_CWSR_TMA_OFFSET;
pr_debug("set tba :0x%llx, tma:0x%llx, cwsr_kaddr:%p for pqm.\n",
 qpd->tba_addr, qpd->tma_addr, qpd->cwsr_kaddr);
@@ -1431,6 +1436,16 @@ bool kfd_process_xnack_mode(struct kfd_process *p, bool 
supported)
return true;
 }
 
+void kfd_process_set_trap_debug_flag(struct qcm_process_device *qpd,
+bool enabled)
+{
+   if (qpd->cwsr_kaddr) {
+   uint64_t *tma =
+   (uint64_t *)(qpd->cwsr_kaddr + KFD_CWSR_TMA_OFFSET);
+   tma[2] = enabled;
+   }
+}
+
 /*
  * On return the kfd_process is fully operational and will be freed when the
  * mm is released
-- 
2.25.1

[PATCH 15/33] drm/amdgpu: expose debug api for mes

2023-05-25 Thread Jonathan Kim

Similar to the F32 HWS, the RS64 HWS for GFX11 now supports a multi-process
debug API.

The skip_process_ctx_clear ADD_QUEUE requirement is to prevent the MES
from clearing the process context when the first queue is added to the
scheduler in order to maintain debug mode settings during queue preemption
and restore.  The MES clears the process context in this case due to an
unresolved FW caching bug during normal mode operations.
During debug mode, the KFD will hold a reference to the target process
so the process context should never go stale and MES can afford to skip
this requirement.

Signed-off-by: Jonathan Kim 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c   | 32 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h   | 20 
 drivers/gpu/drm/amd/amdgpu/mes_v11_0.c| 12 +++
 drivers/gpu/drm/amd/include/mes_v11_api_def.h | 21 +++-
 4 files changed, 84 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 49bb6c03d606..20cc3fffe921 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -924,6 +924,38 @@ int amdgpu_mes_reg_wait(struct amdgpu_device *adev, 
uint32_t reg,
return r;
 }
 
+int amdgpu_mes_set_shader_debugger(struct amdgpu_device *adev,
+   uint64_t process_context_addr,
+   uint32_t spi_gdbg_per_vmid_cntl,
+   const uint32_t *tcp_watch_cntl,
+   uint32_t flags)
+{
+   struct mes_misc_op_input op_input = {0};
+   int r;
+
+   if (!adev->mes.funcs->misc_op) {
+   DRM_ERROR("mes set shader debugger is not supported!\n");
+   return -EINVAL;
+   }
+
+   op_input.op = MES_MISC_OP_SET_SHADER_DEBUGGER;
+   op_input.set_shader_debugger.process_context_addr = 
process_context_addr;
+   op_input.set_shader_debugger.flags.u32all = flags;
+   op_input.set_shader_debugger.spi_gdbg_per_vmid_cntl = 
spi_gdbg_per_vmid_cntl;
+   memcpy(op_input.set_shader_debugger.tcp_watch_cntl, tcp_watch_cntl,
+   sizeof(op_input.set_shader_debugger.tcp_watch_cntl));
+
+   amdgpu_mes_lock(&adev->mes);
+
+   r = adev->mes.funcs->misc_op(&adev->mes, &op_input);
+   if (r)
+   DRM_ERROR("failed to set_shader_debugger\n");
+
+   amdgpu_mes_unlock(&adev->mes);
+
+   return r;
+}
+
 static void
 amdgpu_mes_ring_to_queue_props(struct amdgpu_device *adev,
   struct amdgpu_ring *ring,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
index 547ec35691fa..d20df0cf0d88 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h
@@ -256,6 +256,7 @@ enum mes_misc_opcode {
MES_MISC_OP_READ_REG,
MES_MISC_OP_WRM_REG_WAIT,
MES_MISC_OP_WRM_REG_WR_WAIT,
+   MES_MISC_OP_SET_SHADER_DEBUGGER,
 };
 
 struct mes_misc_op_input {
@@ -278,6 +279,20 @@ struct mes_misc_op_input {
uint32_t   reg0;
uint32_t   reg1;
} wrm_reg;
+
+   struct {
+   uint64_t process_context_addr;
+   union {
+   struct {
+   uint64_t single_memop : 1;
+   uint64_t single_alu_op : 1;
+   uint64_t reserved: 30;
+   };
+   uint32_t u32all;
+   } flags;
+   uint32_t spi_gdbg_per_vmid_cntl;
+   uint32_t tcp_watch_cntl[4];
+   } set_shader_debugger;
};
 };
 
@@ -340,6 +355,11 @@ int amdgpu_mes_reg_wait(struct amdgpu_device *adev, 
uint32_t reg,
 int amdgpu_mes_reg_write_reg_wait(struct amdgpu_device *adev,
  uint32_t reg0, uint32_t reg1,
  uint32_t ref, uint32_t mask);
+int amdgpu_mes_set_shader_debugger(struct amdgpu_device *adev,
+   uint64_t process_context_addr,
+   uint32_t spi_gdbg_per_vmid_cntl,
+   const uint32_t *tcp_watch_cntl,
+   uint32_t flags);
 
 int amdgpu_mes_add_ring(struct amdgpu_device *adev, int gang_id,
int queue_type, int idx,
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
index 90b4a74ccf01..861910a6662d 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
@@ -339,6 +339,18 @@ static int mes_v11_0_misc_op(struct amdgpu_mes *mes,
misc_pkt.wait_reg_mem.reg_offset1 = input->wrm_reg.reg0;

[PATCH 28/33] drm/amdkfd: add debug set flags operation

2023-05-25 Thread Jonathan Kim

Allow the debugger to set single memory and single ALU operations.

Some exceptions are imprecise (memory violations, address watch) in the
sense that a trap occurs only when the exception interrupt occurs and
not at the non-halting faulty instruction.  Trap temporaries 0 & 1 save
the program counter address, which means that these values will not point
to the faulty instruction address but to whenever the interrupt was
raised.

Setting the Single Memory Operations flag will inject an automatic wait
on every memory operation instruction forcing imprecise memory exceptions
to become precise at the cost of performance.  This setting is not
permitted on debug devices that support only a global setting of this
option.

Return the previously set flags to the debugger as well.
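
The "set new flags, hand back the old ones, undo on failure" behaviour
can be sketched in isolation like this. It is a simplified user-space
model, not the driver code: the structs, error codes and dev_apply()
are hypothetical stand-ins for the per-device refresh calls.

```c
#include <stdint.h>

#define FLAG_SINGLE_MEM_OP 1u /* hypothetical flag bit */
#define ERR_NOT_PERMITTED (-1) /* hypothetical error codes */
#define ERR_APPLY_FAILED  (-2)

/* Hypothetical device model: apply_fails simulates a failing HW update. */
struct dev_model {
	int per_vmid_supported;
	int apply_fails;
	uint32_t applied_flags;
};

struct proc_model {
	uint32_t dbg_flags;
	struct dev_model *devs;
	int n_devs;
};

static int dev_apply(struct dev_model *d, uint32_t flags)
{
	if (d->apply_fails)
		return -1;
	d->applied_flags = flags;
	return 0;
}

/* On return, *flags always holds the flags that were set before the call. */
int proc_set_flags(struct proc_model *p, uint32_t *flags)
{
	uint32_t prev = p->dbg_flags;
	int i, failed_at = -1;

	/* Reject the request if any device only has a global setting. */
	for (i = 0; i < p->n_devs; i++) {
		if (!p->devs[i].per_vmid_supported &&
		    (*flags & FLAG_SINGLE_MEM_OP)) {
			*flags = prev;
			return ERR_NOT_PERMITTED;
		}
	}

	p->dbg_flags = *flags & FLAG_SINGLE_MEM_OP;
	*flags = prev;

	for (i = 0; i < p->n_devs; i++) {
		if (!p->devs[i].per_vmid_supported)
			continue;
		if (dev_apply(&p->devs[i], p->dbg_flags)) {
			failed_at = i;
			break;
		}
	}
	if (failed_at < 0)
		return 0;

	/* Rewind: restore the old flags on every device updated so far. */
	p->dbg_flags = prev;
	for (i = 0; i < failed_at; i++) {
		if (p->devs[i].per_vmid_supported)
			dev_apply(&p->devs[i], prev);
	}
	return ERR_APPLY_FAILED;
}
```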

v2: fixup with new kfd_node struct reference mes checks

Signed-off-by: Jonathan Kim 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |  2 +
 drivers/gpu/drm/amd/amdkfd/kfd_debug.c   | 58 
 drivers/gpu/drm/amd/amdkfd/kfd_debug.h   |  1 +
 3 files changed, 61 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index e88be582d44d..e5d95b144dcd 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -3035,6 +3035,8 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, 
struct kfd_process *p, v
args->clear_node_address_watch.id);
break;
case KFD_IOC_DBG_TRAP_SET_FLAGS:
+   r = kfd_dbg_trap_set_flags(target, &args->set_flags.flags);
+   break;
case KFD_IOC_DBG_TRAP_QUERY_DEBUG_EVENT:
case KFD_IOC_DBG_TRAP_QUERY_EXCEPTION_INFO:
case KFD_IOC_DBG_TRAP_GET_QUEUE_SNAPSHOT:
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
index 4b36cc8b5fb7..43c3170998d3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
@@ -23,6 +23,7 @@
 #include "kfd_debug.h"
 #include "kfd_device_queue_manager.h"
 #include 
+#include 
 
 #define MAX_WATCH_ADDRESSES 4
 
@@ -423,6 +424,59 @@ static void kfd_dbg_clear_process_address_watch(struct 
kfd_process *target)
kfd_dbg_trap_clear_dev_address_watch(target->pdds[i], 
j);
 }
 
+int kfd_dbg_trap_set_flags(struct kfd_process *target, uint32_t *flags)
+{
+   uint32_t prev_flags = target->dbg_flags;
+   int i, r = 0, rewind_count = 0;
+
+   for (i = 0; i < target->n_pdds; i++) {
+   if (!kfd_dbg_is_per_vmid_supported(target->pdds[i]->dev) &&
+   (*flags & KFD_DBG_TRAP_FLAG_SINGLE_MEM_OP)) {
+   *flags = prev_flags;
+   return -EACCES;
+   }
+   }
+
+   target->dbg_flags = *flags & KFD_DBG_TRAP_FLAG_SINGLE_MEM_OP;
+   *flags = prev_flags;
+   for (i = 0; i < target->n_pdds; i++) {
+   struct kfd_process_device *pdd = target->pdds[i];
+
+   if (!kfd_dbg_is_per_vmid_supported(pdd->dev))
+   continue;
+
+   if (!pdd->dev->kfd->shared_resources.enable_mes)
+   r = debug_refresh_runlist(pdd->dev->dqm);
+   else
+   r = kfd_dbg_set_mes_debug_mode(pdd);
+
+   if (r) {
+   target->dbg_flags = prev_flags;
+   break;
+   }
+
+   rewind_count++;
+   }
+
+   /* Rewind flags */
+   if (r) {
+   target->dbg_flags = prev_flags;
+
+   for (i = 0; i < rewind_count; i++) {
+   struct kfd_process_device *pdd = target->pdds[i];
+
+   if (!kfd_dbg_is_per_vmid_supported(pdd->dev))
+   continue;
+
+   if (!pdd->dev->kfd->shared_resources.enable_mes)
+   debug_refresh_runlist(pdd->dev->dqm);
+   else
+   kfd_dbg_set_mes_debug_mode(pdd);
+   }
+   }
+
+   return r;
+}
 
 /* kfd_dbg_trap_deactivate:
  * target: target process
@@ -437,9 +491,13 @@ void kfd_dbg_trap_deactivate(struct kfd_process *target, 
bool unwind, int unwind
int i;
 
if (!unwind) {
+   uint32_t flags = 0;
+
cancel_work_sync(&target->debug_event_workarea);
kfd_dbg_clear_process_address_watch(target);
kfd_dbg_trap_set_wave_launch_mode(target, 0);
+
+   kfd_dbg_trap_set_flags(target, &flags);
}
 
for (i = 0; i < target->n_pdds; i++) {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
index 7f0757c2af2c..ef8e9f7f1716 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
@@ -57,6 +57,7 @@ int kfd_dbg_trap_set_dev_address_watch(struct 
kfd_process_d

[PATCH 11/33] drm/amdgpu: add gfx11 hw debug mode enable and disable calls

2023-05-25 Thread Jonathan Kim

Implement the per-device calls to enable or disable HW debug mode
for GFX11.

Signed-off-by: Jonathan Kim 
Reviewed-by: Felix Kuehling 
---
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c| 38 +++
 1 file changed, 38 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c
index 7deff8a547fb..cc954cf248ca 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c
@@ -607,6 +607,42 @@ static void set_vm_context_page_table_base_v11(struct 
amdgpu_device *adev,
adev->gfxhub.funcs->setup_vm_pt_regs(adev, vmid, page_table_base);
 }
 
+/*
+ * Returns TRAP_EN, EXCP_EN and EXCP_REPLACE.
+ *
+ * restore_dbg_registers is ignored here but is a general interface requirement
+ * for devices that support GFXOFF and where the RLC save/restore list
+ * does not support hw registers for debugging i.e. the driver has to manually
+ * initialize the debug mode registers after it has disabled GFX off during the
+ * debug session.
+ */
+static uint32_t kgd_gfx_v11_enable_debug_trap(struct amdgpu_device *adev,
+   bool restore_dbg_registers,
+   uint32_t vmid)
+{
+   uint32_t data = 0;
+
+   data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, TRAP_EN, 1);
+   data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, EXCP_EN, 0);
+   data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, EXCP_REPLACE, 0);
+
+   return data;
+}
+
+/* Returns TRAP_EN, EXCP_EN and EXCP_REPLACE. */
+static uint32_t kgd_gfx_v11_disable_debug_trap(struct amdgpu_device *adev,
+   bool keep_trap_enabled,
+   uint32_t vmid)
+{
+   uint32_t data = 0;
+
+   data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, TRAP_EN, 
keep_trap_enabled);
+   data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, EXCP_EN, 0);
+   data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, EXCP_REPLACE, 0);
+
+   return data;
+}
+
 const struct kfd2kgd_calls gfx_v11_kfd2kgd = {
.program_sh_mem_settings = program_sh_mem_settings_v11,
.set_pasid_vmid_mapping = set_pasid_vmid_mapping_v11,
@@ -623,4 +659,6 @@ const struct kfd2kgd_calls gfx_v11_kfd2kgd = {
.wave_control_execute = wave_control_execute_v11,
.get_atc_vmid_pasid_mapping_info = NULL,
.set_vm_context_page_table_base = set_vm_context_page_table_base_v11,
+   .enable_debug_trap = kgd_gfx_v11_enable_debug_trap,
+   .disable_debug_trap = kgd_gfx_v11_disable_debug_trap
 };
-- 
2.25.1

[PATCH 24/33] drm/amdkfd: add debug wave launch override operation

2023-05-25 Thread Jonathan Kim

This operation allows the debugger to override the enabled HW
exceptions on the device.

On debug devices that only support the debugging of a single process,
the HW exceptions are global and set through the SPI_GDBG_TRAP_MASK
register.
Because they are global, only address watch exceptions are allowed to
be enabled.  In other words, the debugger must preserve all non-address
watch exception states in normal mode operation by barring a full
replacement override or a non-address watch override request.

For multi-process debugging, all HW exception overrides are per-VMID so
all exceptions can be overridden or fully replaced.

In order for the debugger to know what is permissible, return the
supported override mask to the debugger along with the previously
enabled overrides.
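
The way a requested override is combined with the previously enabled
exceptions can be shown as a standalone helper. The expression matches
the one used in this patch's set_wave_launch_trap_override callbacks;
the function name is hypothetical and the register field handling is
omitted.

```c
#include <stdint.h>

/*
 * Bits selected by mask_request take their value from mask_bits;
 * all other bits keep their value from prev_mask.
 */
uint32_t merge_trap_mask(uint32_t prev_mask, uint32_t mask_bits,
			 uint32_t mask_request)
{
	return (mask_bits & mask_request) | (prev_mask & ~mask_request);
}
```

With an empty request mask the previous state is returned unchanged,
and with an all-ones request mask the new bits fully replace it.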

v2: fixup with new kfd_node struct reference for mes check

Signed-off-by: Jonathan Kim 
---
 .../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c  | 47 ++
 .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c   |  2 +
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 55 
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.h| 10 +++
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c  |  5 +-
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c| 87 ++-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 55 
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.h | 10 +++
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |  7 ++
 drivers/gpu/drm/amd/amdkfd/kfd_debug.c| 69 +++
 drivers/gpu/drm/amd/amdkfd/kfd_debug.h|  6 ++
 11 files changed, 351 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
index b811a0985050..d7881bbd828d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
@@ -25,6 +25,7 @@
 #include "amdgpu_amdkfd_gfx_v9.h"
 #include "gc/gc_9_4_2_offset.h"
 #include "gc/gc_9_4_2_sh_mask.h"
+#include 
 
 /*
  * Returns TRAP_EN, EXCP_EN and EXCP_REPLACE.
@@ -62,6 +63,50 @@ static uint32_t kgd_aldebaran_disable_debug_trap(struct 
amdgpu_device *adev,
return data;
 }
 
+static int kgd_aldebaran_validate_trap_override_request(struct amdgpu_device 
*adev,
+   uint32_t trap_override,
+   uint32_t 
*trap_mask_supported)
+{
+   *trap_mask_supported &= KFD_DBG_TRAP_MASK_FP_INVALID |
+   KFD_DBG_TRAP_MASK_FP_INPUT_DENORMAL |
+   KFD_DBG_TRAP_MASK_FP_DIVIDE_BY_ZERO |
+   KFD_DBG_TRAP_MASK_FP_OVERFLOW |
+   KFD_DBG_TRAP_MASK_FP_UNDERFLOW |
+   KFD_DBG_TRAP_MASK_FP_INEXACT |
+   KFD_DBG_TRAP_MASK_INT_DIVIDE_BY_ZERO |
+   KFD_DBG_TRAP_MASK_DBG_ADDRESS_WATCH |
+   KFD_DBG_TRAP_MASK_DBG_MEMORY_VIOLATION;
+
+   if (trap_override != KFD_DBG_TRAP_OVERRIDE_OR &&
+   trap_override != KFD_DBG_TRAP_OVERRIDE_REPLACE)
+   return -EPERM;
+
+   return 0;
+}
+
+/* Returns TRAP_EN, EXCP_EN and EXCP_REPLACE. */
+static uint32_t kgd_aldebaran_set_wave_launch_trap_override(struct 
amdgpu_device *adev,
+   uint32_t vmid,
+   uint32_t trap_override,
+   uint32_t trap_mask_bits,
+   uint32_t trap_mask_request,
+   uint32_t *trap_mask_prev,
+   uint32_t kfd_dbg_trap_cntl_prev)
+
+{
+   uint32_t data = 0;
+
+   *trap_mask_prev = REG_GET_FIELD(kfd_dbg_trap_cntl_prev, 
SPI_GDBG_PER_VMID_CNTL, EXCP_EN);
+   trap_mask_bits = (trap_mask_bits & trap_mask_request) |
+   (*trap_mask_prev & ~trap_mask_request);
+
+   data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, TRAP_EN, 1);
+   data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, EXCP_EN, 
trap_mask_bits);
+   data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, EXCP_REPLACE, 
trap_override);
+
+   return data;
+}
+
 const struct kfd2kgd_calls aldebaran_kfd2kgd = {
.program_sh_mem_settings = kgd_gfx_v9_program_sh_mem_settings,
.set_pasid_vmid_mapping = kgd_gfx_v9_set_pasid_vmid_mapping,
@@ -82,6 +127,8 @@ const struct kfd2kgd_calls aldebaran_kfd2kgd = {
.get_cu_occupancy = kgd_gfx_v9_get_cu_occupancy,
.enable_debug_trap = kgd_aldebaran_enable_debug_trap,
.disable_debug_trap = kgd_aldebaran_disable_debug_trap,
+   .validate_trap_override_request = 
kgd_aldebaran_validate_trap_override_request,
+   .set_wave_launch_trap_override = 
kgd_aldebaran_set_wave_launch_trap_override,
.get_iq_wait_times = kgd_gfx_v9_get_iq_wait_times,
.build_grace_

[PATCH 26/33] drm/amdkfd: add debug suspend and resume process queues operation

2023-05-25 Thread Jonathan Kim

In order to inspect waves from the saved context at any point during a
debug session, the debugger must be able to preempt queues to trigger
context save by suspending them.

On queue suspend, the KFD will copy the context save header information
so that the debugger can correctly crawl the appropriate size of the saved
context. The debugger must then also be allowed to resume suspended queues.

A newly created queue cannot be suspended: queue ids are recycled after
destruction, so the debugger needs to know that a new queue has been
created.  Query functions that clear a given queue of its new queue
status will be added later.

A queue cannot be destroyed while it is suspended to preserve its saved
context during debugger inspection.  Have queue destruction block while
a queue is suspended and unblock when it is resumed.  Likewise, if a
queue is about to be destroyed, it cannot be suspended.

Return the number of queues successfully suspended or resumed along with
a per queue status array where the upper bits per queue status show that
the request was invalid (new/destroyed queue suspend request, missing
queue) or an error occurred (HWS in a fatal state so it can't suspend or
resume queues).
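
A debugger could decode such a per-queue status array along these
lines. This is a sketch only: the bit positions (30 for error, 31 for
invalid) and all names here are assumptions made for illustration and
should be checked against the uapi header.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed bit positions for the per-queue status; not taken from this patch. */
#define QUEUE_ERROR_BIT    30
#define QUEUE_INVALID_BIT  31
#define QUEUE_ERROR_MASK   (1u << QUEUE_ERROR_BIT)
#define QUEUE_INVALID_MASK (1u << QUEUE_INVALID_BIT)
#define QUEUE_ID_MASK      (~(QUEUE_ERROR_MASK | QUEUE_INVALID_MASK))

/* Strip the status bits to recover the queue id from an array entry. */
uint32_t queue_entry_id(uint32_t entry)
{
	return entry & QUEUE_ID_MASK;
}

/* Count the entries that carry neither the error nor the invalid bit. */
size_t count_queues_ok(const uint32_t *entries, size_t n)
{
	size_t ok = 0;

	for (size_t i = 0; i < n; i++)
		if (!(entries[i] & (QUEUE_ERROR_MASK | QUEUE_INVALID_MASK)))
			ok++;
	return ok;
}
```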

v2: fixup new kfd_node struct reference for mes fw check.
also fixup missing EC_QUEUE_NEW flagging on newly created queue.

Signed-off-by: Jonathan Kim 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c|   5 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|   1 +
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |  11 +
 drivers/gpu/drm/amd/amdkfd/kfd_debug.c|   7 +
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 447 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager.h |  10 +
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c  |  10 +
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c  |  15 +-
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   |  14 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |   5 +-
 .../amd/amdkfd/kfd_process_queue_manager.c|   1 +
 11 files changed, 512 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 98cd52bb005f..b4fcad0e62f7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -772,6 +772,11 @@ bool amdgpu_amdkfd_have_atomics_support(struct 
amdgpu_device *adev)
return adev->have_atomics_support;
 }
 
+void amdgpu_amdkfd_debug_mem_fence(struct amdgpu_device *adev)
+{
+   amdgpu_device_flush_hdp(adev, NULL);
+}
+
 void amdgpu_amdkfd_ras_poison_consumption_handler(struct amdgpu_device *adev, 
bool reset)
 {
amdgpu_umc_poison_handler(adev, reset);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index dd740e64e6e1..2d0406bff84e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -322,6 +322,7 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device 
*adev,
  uint64_t *mmap_offset);
 int amdgpu_amdkfd_gpuvm_export_dmabuf(struct kgd_mem *mem,
  struct dma_buf **dmabuf);
+void amdgpu_amdkfd_debug_mem_fence(struct amdgpu_device *adev);
 int amdgpu_amdkfd_get_tile_config(struct amdgpu_device *adev,
struct tile_config *config);
 void amdgpu_amdkfd_ras_poison_consumption_handler(struct amdgpu_device *adev,
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 4b45d4539d48..adda60273456 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -410,6 +410,7 @@ static int kfd_ioctl_create_queue(struct file *filep, 
struct kfd_process *p,
pr_debug("Write ptr address   == 0x%016llX\n",
args->write_pointer_address);
 
+   kfd_dbg_ev_raise(KFD_EC_MASK(EC_QUEUE_NEW), p, dev, queue_id, false, 
NULL, 0);
return 0;
 
 err_create_queue:
@@ -2996,7 +2997,17 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, 
struct kfd_process *p, v
args->launch_mode.launch_mode);
break;
case KFD_IOC_DBG_TRAP_SUSPEND_QUEUES:
+   r = suspend_queues(target,
+   args->suspend_queues.num_queues,
+   args->suspend_queues.grace_period,
+   args->suspend_queues.exception_mask,
+   (uint32_t 
*)args->suspend_queues.queue_array_ptr);
+
+   break;
case KFD_IOC_DBG_TRAP_RESUME_QUEUES:
+   r = resume_queues(target, args->resume_queues.num_queues,
+   (uint32_t 
*)args->resume_queues.queue_array_ptr);
+   break;
case KFD_IOC_DBG_TRAP_SET_NODE_ADDRESS_WATCH:
case KFD_IOC_DBG_TRAP_CLEAR_NODE_ADDRESS_WATCH:
case KFD_IOC_DBG_T

[PATCH 19/33] drm/amdkfd: add send exception operation

2023-05-25 Thread Jonathan Kim

Add a debug operation that allows the debugger to send an exception
directly to runtime through a payload address.

For memory violations, the runtime is instead notified through the normal
vmfault signal, which is passed the exception data that was saved when
the memory violation was raised to the debugger.

For runtime exceptions, this will unblock the runtime enable
function which will be explained and implemented in a follow up
patch.

Signed-off-by: Jonathan Kim 
Reviewed-by: Felix Kuehling 
---
 .../gpu/drm/amd/amdkfd/cik_event_interrupt.c  |  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |  5 ++
 drivers/gpu/drm/amd/amdkfd/kfd_debug.c| 43 +++
 drivers/gpu/drm/amd/amdkfd/kfd_debug.h|  6 ++
 drivers/gpu/drm/amd/amdkfd/kfd_events.c   |  3 +-
 .../gpu/drm/amd/amdkfd/kfd_int_process_v9.c   |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  7 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 71 ++-
 8 files changed, 135 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c 
b/drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c
index 4ebfff6b6c55..795382b55e0a 100644
--- a/drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c
+++ b/drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c
@@ -118,9 +118,9 @@ static void cik_event_interrupt_wq(struct kfd_node *dev,
return;
 
if (info.vmid == vmid)
-   kfd_signal_vm_fault_event(dev, pasid, &info);
+   kfd_signal_vm_fault_event(dev, pasid, &info, NULL);
else
-   kfd_signal_vm_fault_event(dev, pasid, NULL);
+   kfd_signal_vm_fault_event(dev, pasid, NULL, NULL);
}
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index a5c457863048..ec5a85454192 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2833,6 +2833,11 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, 
struct kfd_process *p, v
r = kfd_dbg_trap_disable(target);
break;
case KFD_IOC_DBG_TRAP_SEND_RUNTIME_EVENT:
+   r = kfd_dbg_send_exception_to_runtime(target,
+   args->send_runtime_event.gpu_id,
+   args->send_runtime_event.queue_id,
+   args->send_runtime_event.exception_mask);
+   break;
case KFD_IOC_DBG_TRAP_SET_EXCEPTIONS_ENABLED:
case KFD_IOC_DBG_TRAP_SET_WAVE_LAUNCH_OVERRIDE:
case KFD_IOC_DBG_TRAP_SET_WAVE_LAUNCH_MODE:
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
index dccb27fc764b..61098975bb0e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
@@ -125,6 +125,49 @@ bool kfd_dbg_ev_raise(uint64_t event_mask,
return is_subscribed;
 }
 
+int kfd_dbg_send_exception_to_runtime(struct kfd_process *p,
+   unsigned int dev_id,
+   unsigned int queue_id,
+   uint64_t error_reason)
+{
+   if (error_reason & KFD_EC_MASK(EC_DEVICE_MEMORY_VIOLATION)) {
+   struct kfd_process_device *pdd = NULL;
+   struct kfd_hsa_memory_exception_data *data;
+   int i;
+
+   for (i = 0; i < p->n_pdds; i++) {
+   if (p->pdds[i]->dev->id == dev_id) {
+   pdd = p->pdds[i];
+   break;
+   }
+   }
+
+   if (!pdd)
+   return -ENODEV;
+
+   data = (struct kfd_hsa_memory_exception_data *)
+   pdd->vm_fault_exc_data;
+
+   kfd_dqm_evict_pasid(pdd->dev->dqm, p->pasid);
+   kfd_signal_vm_fault_event(pdd->dev, p->pasid, NULL, data);
+   error_reason &= ~KFD_EC_MASK(EC_DEVICE_MEMORY_VIOLATION);
+   }
+
+   if (error_reason & (KFD_EC_MASK(EC_PROCESS_RUNTIME))) {
+   /*
+* block should only happen after the debugger receives runtime
+* enable notice.
+*/
+   up(&p->runtime_enable_sema);
+   error_reason &= ~KFD_EC_MASK(EC_PROCESS_RUNTIME);
+   }
+
+   if (error_reason)
+   return kfd_send_exception_to_runtime(p, queue_id, error_reason);
+
+   return 0;
+}
+
 static int kfd_dbg_set_queue_workaround(struct queue *q, bool enable)
 {
struct mqd_update_info minfo = {0};
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
index 66ee7b95d08a..2c6866bb8850 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
@@ -34,6 +34,12 @@ int kfd_dbg_trap_disable(struct kfd_proc

[PATCH 18/33] drm/amdkfd: add raise exception event function

2023-05-25 Thread Jonathan Kim

Exception events can be generated from interrupts or queue activity.

The raise event function will save exception status of a queue, device
or process then notify the debugger of the status change by writing to
a debugger polled file descriptor that the debugger provides during
debug attach.

For memory violation exceptions, extra exception data will be saved.

The debugger will be able to query the saved exception states through a
query operation that will be provided by follow up patches.
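
The latch-then-notify behaviour described above can be sketched in user
space as follows.  This is a simplified model under stated assumptions: the
struct, the exception codes and the notification counter (standing in for
the write to the polled file descriptor) are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical exception codes; numbering starts at 1 so 0 means "none". */
enum ec_code_sketch { EC_QUEUE_NEW_SKETCH = 1, EC_DEVICE_FAULT_SKETCH = 2 };
#define EC_MASK_SKETCH(code) (UINT64_C(1) << ((code) - 1))

struct dbg_target {
	uint64_t exception_status;      /* latched until the debugger queries */
	uint64_t exception_enable_mask; /* events the debugger subscribed to */
	unsigned int notifications;     /* stands in for writes to the polled fd */
};

/* Always latch the status; notify only if the debugger subscribed. */
bool raise_event(struct dbg_target *t, uint64_t event_mask)
{
	assert(t != NULL);

	t->exception_status |= event_mask;

	if (!(t->exception_enable_mask & event_mask))
		return false;

	t->notifications++;
	return true;
}
```

In this model the status is saved even for unsubscribed events, so a later
query can still observe them.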

v2: use new kfd_node struct in prototype.

Signed-off-by: Jonathan Kim 
---
 drivers/gpu/drm/amd/amdkfd/kfd_debug.c   | 104 +++
 drivers/gpu/drm/amd/amdkfd/kfd_debug.h   |   7 ++
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  10 +++
 drivers/gpu/drm/amd/amdkfd/kfd_process.c |   2 +
 4 files changed, 123 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
index 5e2ee2d1acc4..dccb27fc764b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
@@ -24,6 +24,107 @@
 #include "kfd_device_queue_manager.h"
 #include 
 
+void debug_event_write_work_handler(struct work_struct *work)
+{
+   struct kfd_process *process;
+
+   static const char write_data = '.';
+   loff_t pos = 0;
+
+   process = container_of(work,
+   struct kfd_process,
+   debug_event_workarea);
+
+   kernel_write(process->dbg_ev_file, &write_data, 1, &pos);
+}
+
+/* update process/device/queue exception status, write to descriptor
+ * only if exception_status is enabled.
+ */
+bool kfd_dbg_ev_raise(uint64_t event_mask,
+   struct kfd_process *process, struct kfd_node *dev,
+   unsigned int source_id, bool use_worker,
+   void *exception_data, size_t exception_data_size)
+{
+   struct process_queue_manager *pqm;
+   struct process_queue_node *pqn;
+   int i;
+   static const char write_data = '.';
+   loff_t pos = 0;
+   bool is_subscribed = true;
+
+   if (!(process && process->debug_trap_enabled))
+   return false;
+
+   mutex_lock(&process->event_mutex);
+
+   if (event_mask & KFD_EC_MASK_DEVICE) {
+   for (i = 0; i < process->n_pdds; i++) {
+   struct kfd_process_device *pdd = process->pdds[i];
+
+   if (pdd->dev != dev)
+   continue;
+
+   pdd->exception_status |= event_mask & 
KFD_EC_MASK_DEVICE;
+
+   if (event_mask & 
KFD_EC_MASK(EC_DEVICE_MEMORY_VIOLATION)) {
+   if (!pdd->vm_fault_exc_data) {
+   pdd->vm_fault_exc_data = kmemdup(
+   exception_data,
+   exception_data_size,
+   GFP_KERNEL);
+   if (!pdd->vm_fault_exc_data)
+   pr_debug("Failed to allocate 
exception data memory");
+   } else {
+   pr_debug("Debugger exception data not 
saved\n");
+   print_hex_dump_bytes("exception data: ",
+   DUMP_PREFIX_OFFSET,
+   exception_data,
+   exception_data_size);
+   }
+   }
+   break;
+   }
+   } else if (event_mask & KFD_EC_MASK_PROCESS) {
+   process->exception_status |= event_mask & KFD_EC_MASK_PROCESS;
+   } else {
+   pqm = &process->pqm;
+   list_for_each_entry(pqn, &pqm->queues,
+   process_queue_list) {
+   int target_id;
+
+   if (!pqn->q)
+   continue;
+
+   target_id = event_mask & KFD_EC_MASK(EC_QUEUE_NEW) ?
+   pqn->q->properties.queue_id :
+   pqn->q->doorbell_id;
+
+   if (pqn->q->device != dev || target_id != source_id)
+   continue;
+
+   pqn->q->properties.exception_status |= event_mask;
+   break;
+   }
+   }
+
+   if (process->exception_enable_mask & event_mask) {
+   if (use_worker)
+   schedule_work(&process->debug_event_workarea);
+   else
+   kernel_write(process->dbg_ev_file,
+   &write_data,
+   1,
+

[PATCH 10/33] drm/amdgpu: add gfx9.4.2 hw debug mode enable and disable calls

2023-05-25 Thread Jonathan Kim

GFX9.4.2 now supports per-VMID debug mode control registers
(SPI_GDBG_PER_VMID_CNTL).

Because the KFD lets the HWS handle PASID-VMID mapping, the KFD will
forward all debug mode setting register writes to the HWS using a new
MAP_PROCESS API.  So instead of writing to the registers, return the
required register values that the HWS needs to write on debug enable
and disable.
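
A minimal sketch of "return the value instead of writing the register"
follows.  The field layout below (TRAP_EN at bit 0, EXCP_EN at bits 1-9,
EXCP_REPLACE at bit 10) is invented for illustration; the real shifts and
masks come from the GC 9.4.2 register headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical field layout, not the real SPI_GDBG_PER_VMID_CNTL layout. */
#define TRAP_EN_SHIFT       0u
#define TRAP_EN_MASK        0x00000001u
#define EXCP_EN_SHIFT       1u
#define EXCP_EN_MASK        0x000003feu
#define EXCP_REPLACE_SHIFT  10u
#define EXCP_REPLACE_MASK   0x00000400u

/* Replace one field of a register image, leaving the other bits alone. */
static uint32_t set_field(uint32_t reg, uint32_t mask, unsigned int shift,
			  uint32_t val)
{
	return (reg & ~mask) | ((val << shift) & mask);
}

/* Build the value the scheduler should write on debug enable. */
uint32_t debug_trap_enable_value(void)
{
	uint32_t data = 0;

	data = set_field(data, TRAP_EN_MASK, TRAP_EN_SHIFT, 1);
	data = set_field(data, EXCP_EN_MASK, EXCP_EN_SHIFT, 0);
	data = set_field(data, EXCP_REPLACE_MASK, EXCP_REPLACE_SHIFT, 0);
	return data;
}

/* Build the value the scheduler should write on debug disable. */
uint32_t debug_trap_disable_value(bool keep_trap_enabled)
{
	uint32_t data = 0;

	data = set_field(data, TRAP_EN_MASK, TRAP_EN_SHIFT, keep_trap_enabled);
	data = set_field(data, EXCP_EN_MASK, EXCP_EN_SHIFT, 0);
	data = set_field(data, EXCP_REPLACE_MASK, EXCP_REPLACE_SHIFT, 0);
	return data;
}
```

Because neither function touches hardware, the caller is free to hand the
computed value to whichever component owns the register write.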

Signed-off-by: Jonathan Kim 
Reviewed-by: Felix Kuehling 
---
 .../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c  | 42 ++-
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
index 4485bb29bec9..a6f98141c29c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
@@ -23,6 +23,44 @@
 #include "amdgpu_amdkfd.h"
 #include "amdgpu_amdkfd_arcturus.h"
 #include "amdgpu_amdkfd_gfx_v9.h"
+#include "gc/gc_9_4_2_offset.h"
+#include "gc/gc_9_4_2_sh_mask.h"
+
+/*
+ * Returns TRAP_EN, EXCP_EN and EXCP_REPLACE.
+ *
+ * restore_dbg_registers is ignored here but is a general interface requirement
+ * for devices that support GFXOFF and where the RLC save/restore list
+ * does not support hw registers for debugging i.e. the driver has to manually
+ * initialize the debug mode registers after it has disabled GFX off during the
+ * debug session.
+ */
+static uint32_t kgd_aldebaran_enable_debug_trap(struct amdgpu_device *adev,
+   bool restore_dbg_registers,
+   uint32_t vmid)
+{
+   uint32_t data = 0;
+
+   data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, TRAP_EN, 1);
+   data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, EXCP_EN, 0);
+   data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, EXCP_REPLACE, 0);
+
+   return data;
+}
+
+/* returns TRAP_EN, EXCP_EN and EXCP_REPLACE. */
+static uint32_t kgd_aldebaran_disable_debug_trap(struct amdgpu_device *adev,
+   bool keep_trap_enabled,
+   uint32_t vmid)
+{
+   uint32_t data = 0;
+
+   data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, TRAP_EN, 
keep_trap_enabled);
+   data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, EXCP_EN, 0);
+   data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, EXCP_REPLACE, 0);
+
+   return data;
+}
 
 const struct kfd2kgd_calls aldebaran_kfd2kgd = {
.program_sh_mem_settings = kgd_gfx_v9_program_sh_mem_settings,
@@ -42,5 +80,7 @@ const struct kfd2kgd_calls aldebaran_kfd2kgd = {
kgd_gfx_v9_get_atc_vmid_pasid_mapping_info,
.set_vm_context_page_table_base = 
kgd_gfx_v9_set_vm_context_page_table_base,
.get_cu_occupancy = kgd_gfx_v9_get_cu_occupancy,
-   .program_trap_handler_settings = 
kgd_gfx_v9_program_trap_handler_settings
+   .enable_debug_trap = kgd_aldebaran_enable_debug_trap,
+   .disable_debug_trap = kgd_aldebaran_disable_debug_trap,
+   .program_trap_handler_settings = 
kgd_gfx_v9_program_trap_handler_settings,
 };
-- 
2.25.1

[PATCH 16/33] drm/amdkfd: add per process hw trap enable and disable functions

2023-05-25 Thread Jonathan Kim

To enable HW debug mode per process, all devices must be debug enabled
successfully.  If a failure occurs, rewind the enablement of debug mode
on the enabled devices.
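
The all-or-nothing enablement with rewind can be sketched as below.  This is
a generic illustration of the pattern with a made-up device struct, not the
driver's actual enable path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device model: fail_on_enable simulates an enable failure. */
struct dev_sketch {
	int fail_on_enable;
	int enabled;
};

static int dev_enable(struct dev_sketch *d)
{
	if (d->fail_on_enable)
		return -1;
	d->enabled = 1;
	return 0;
}

static void dev_disable(struct dev_sketch *d)
{
	d->enabled = 0;
}

/* Enable every device; on failure, disable the ones already enabled. */
int enable_all(struct dev_sketch *devs, size_t n)
{
	assert(devs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		int r = dev_enable(&devs[i]);

		if (r) {
			/* Rewind only devices [0, i), in reverse order. */
			while (i-- > 0)
				dev_disable(&devs[i]);
			return r;
		}
	}

	return 0;
}
```

Bounding the rewind by the index of the failing device avoids touching
devices that were never enabled by this call.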

A power management scenario that needs to be considered is HW
debug mode setting during GFXOFF.  During GFXOFF, these registers
will be unreachable so we have to transiently disable GFXOFF when
setting.  Also, some devices don't support the RLC save restore
function for these debug registers so we have to disable GFXOFF
completely during a debug session.

Cooperative launch also has debugging restrictions based on HW/FW bugs.
If such bugs exist, the debugger cannot attach to a process that uses GWS
resources, nor can GWS resources be requested if a process is being
debugged.

Multi-process debug devices can only enable trap temporaries based
on certain runtime scenarios, which will be explained when the
runtime enable functions are implemented in a follow up patch.

v2: spot fix with new kfd_node references

Signed-off-by: Jonathan Kim 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |   5 +
 drivers/gpu/drm/amd/amdkfd/kfd_debug.c   | 148 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_debug.h   |  29 +
 drivers/gpu/drm/amd/amdkfd/kfd_process.c |  10 ++
 4 files changed, 190 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 7082d5d0f0e9..9d0c247f80fe 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1488,6 +1488,11 @@ static int kfd_ioctl_alloc_queue_gws(struct file *filep,
goto out_unlock;
}
 
+   if (!kfd_dbg_has_gws_support(dev) && p->debug_trap_enabled) {
+   retval = -EBUSY;
+   goto out_unlock;
+   }
+
retval = pqm_set_gws(&p->pqm, args->queue_id, args->num_gws ? dev->gws 
: NULL);
mutex_unlock(&p->mutex);
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
index 898cc1fe3d13..73b07b5f17f1 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
@@ -21,13 +21,78 @@
  */
 
 #include "kfd_debug.h"
+#include "kfd_device_queue_manager.h"
 #include 
 
+static int kfd_dbg_set_mes_debug_mode(struct kfd_process_device *pdd)
+{
+   uint32_t spi_dbg_cntl = pdd->spi_dbg_override | 
pdd->spi_dbg_launch_mode;
+   uint32_t flags = pdd->process->dbg_flags;
+
+   if (!kfd_dbg_is_per_vmid_supported(pdd->dev))
+   return 0;
+
+   return amdgpu_mes_set_shader_debugger(pdd->dev->adev, 
pdd->proc_ctx_gpu_addr, spi_dbg_cntl,
+   pdd->watch_points, flags);
+}
+
+/* kfd_dbg_trap_deactivate:
+ * target: target process
+ * unwind: If this is unwinding a failed kfd_dbg_trap_enable()
+ * unwind_count:
+ * If unwind == true, how far down the pdd list we need
+ * to unwind
+ * else: ignored
+ */
+static void kfd_dbg_trap_deactivate(struct kfd_process *target, bool unwind, 
int unwind_count)
+{
+   int i;
+
+   for (i = 0; i < target->n_pdds; i++) {
+   struct kfd_process_device *pdd = target->pdds[i];
+
+   /* If this is an unwind, and we have unwound the required
+* enable calls on the pdd list, we need to stop now
+* otherwise we may mess up another debugger session.
+*/
+   if (unwind && i == unwind_count)
+   break;
+
+   /* GFX off is already disabled by debug activate if not RLC 
restore supported. */
+   if (kfd_dbg_is_rlc_restore_supported(pdd->dev))
+   amdgpu_gfx_off_ctrl(pdd->dev->adev, false);
+   pdd->spi_dbg_override =
+   pdd->dev->kfd2kgd->disable_debug_trap(
+   pdd->dev->adev,
+   target->runtime_info.ttmp_setup,
+   pdd->dev->vm_info.last_vmid_kfd);
+   amdgpu_gfx_off_ctrl(pdd->dev->adev, true);
+
+   if (!kfd_dbg_is_per_vmid_supported(pdd->dev) &&
+   release_debug_trap_vmid(pdd->dev->dqm, 
&pdd->qpd))
+   pr_err("Failed to release debug vmid on [%i]\n", 
pdd->dev->id);
+
+   if (!pdd->dev->kfd->shared_resources.enable_mes)
+   debug_refresh_runlist(pdd->dev->dqm);
+   else
+   kfd_dbg_set_mes_debug_mode(pdd);
+   }
+}
+
 int kfd_dbg_trap_disable(struct kfd_process *target)
 {
if (!target->debug_trap_enabled)
return 0;
 
+   /*
+* Defer deactivation to runtime if runtime not enabled otherwise reset
+* attached running target runtime state to enable for re-attach.
+*/
+   if (target->runtime_info.runtime_state == DEBUG_RUNTIME_STATE_ENABLED)
+

[PATCH 20/33] drm/amdkfd: add runtime enable operation

2023-05-25 Thread Jonathan Kim

The debugger can attach to a process prior to HSA enablement (i.e.
inferior is spawned by the debugger and attached to immediately before
target process has been enabled for HSA dispatches) or it
can attach to a running target that is already HSA enabled.  Either
way, the debugger needs to know the enablement status to know when
it can inspect queues.

For the scenario where the debugger spawns the target process,
it will have to wait for ROCr's runtime enable request from the target.
The runtime enable request will be able to see that its process has been
debug attached.  ROCr raises an EC_PROCESS_RUNTIME signal to the
debugger then blocks the target process while waiting for the debugger's
response. Once the debugger has received the runtime signal, it will
unblock the target process.

For the scenario where the debugger attaches to a running target
process, ROCr will set the target process' runtime status as enabled so
that on an attach request, the debugger will be able to see this
status and will continue with debug enablement as normal.

A secondary requirement is to conditionally enable the trap temporaries only
if the user requests it (env var HSA_ENABLE_DEBUG=1) or if the debugger
attaches with HSA runtime enabled.  This is because setting up the trap
temporaries incurs a performance overhead that is unacceptable for
microbench performance in normal mode for certain customers.

In the scenario where the debugger spawns the target process, when ROCr
detects that the debugger has attached during the runtime enable
request, it will enable the trap temporaries before it blocks the target
process while waiting for the debugger to respond.

In the scenario where the debugger attaches to a running target process,
it will enable the trap temporaries itself.

Finally, there is an additional restriction that is required to be
enforced with runtime enable and HW debug mode setting. The debugger must
first ensure that HW debug mode has been enabled before permitting HW debug
mode operations.

With single process debug devices, allowing the debugger to set debug
HW modes prior to trap activation means that debug HW mode setting can
occur before the KFD has reserved the debug VMID (0xf) from the hardware
scheduler's VMID allocation resource pool.  This can result in the
hardware scheduler assigning VMID 0xf to a non-debugged process and
having that process inherit debug HW mode settings intended for the
debugged target process instead, which is both incorrect and potentially
fatal for normal mode operation.

With multi process debug devices, allowing the debugger to set debug
HW modes prior to trap activation means that non-debugged processes
migrating to a new VMID could inherit unintended debug settings.

All debug operations that touch HW settings must require trap activation
where trap activation is triggered by both debug attach and runtime
enablement (target has KFD opened and is ready to dispatch work).
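
The gating rule above can be sketched as a simple predicate: HW debug mode
operations are refused unless the target is both debug attached and runtime
enabled.  The types and function names are hypothetical and only model the
rule, not the driver's state machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum runtime_state_sketch { RUNTIME_DISABLED, RUNTIME_ENABLED };

struct target_sketch {
	bool debug_attached;
	enum runtime_state_sketch runtime;
};

/* Trap activation needs both debug attach and runtime enablement. */
bool trap_is_active(const struct target_sketch *t)
{
	assert(t != NULL);
	return t->debug_attached && t->runtime == RUNTIME_ENABLED;
}

/* Returns 0 on success, -1 if the request is refused. */
int set_hw_debug_mode(const struct target_sketch *t, unsigned int *hw_mode,
		      unsigned int requested)
{
	assert(hw_mode != NULL);

	if (!trap_is_active(t))
		return -1;

	*hw_mode = requested;
	return 0;
}
```

Refusing the request before activation means no HW setting can be written
while the VMID it would apply to may still belong to another process.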

v2: fixup with new kfd_node struct reference

Signed-off-by: Jonathan Kim 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 143 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_debug.c   |   6 +-
 drivers/gpu/drm/amd/amdkfd/kfd_debug.h   |   4 +
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|   1 +
 4 files changed, 150 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index ec5a85454192..73cb5abce431 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2738,11 +2738,140 @@ static int kfd_ioctl_criu(struct file *filep, struct 
kfd_process *p, void *data)
return ret;
 }
 
-static int kfd_ioctl_runtime_enable(struct file *filep, struct kfd_process *p, 
void *data)
+static int runtime_enable(struct kfd_process *p, uint64_t r_debug,
+   bool enable_ttmp_setup)
+{
+   int i = 0, ret = 0;
+
+   if (p->is_runtime_retry)
+   goto retry;
+
+   if (p->runtime_info.runtime_state != DEBUG_RUNTIME_STATE_DISABLED)
+   return -EBUSY;
+
+   for (i = 0; i < p->n_pdds; i++) {
+   struct kfd_process_device *pdd = p->pdds[i];
+
+   if (pdd->qpd.queue_count)
+   return -EEXIST;
+   }
+
+   p->runtime_info.runtime_state = DEBUG_RUNTIME_STATE_ENABLED;
+   p->runtime_info.r_debug = r_debug;
+   p->runtime_info.ttmp_setup = enable_ttmp_setup;
+
+   if (p->runtime_info.ttmp_setup) {
+   for (i = 0; i < p->n_pdds; i++) {
+   struct kfd_process_device *pdd = p->pdds[i];
+
+   if (!kfd_dbg_is_rlc_restore_supported(pdd->dev)) {
+   amdgpu_gfx_off_ctrl(pdd->dev->adev, false);
+   pdd->dev->kfd2kgd->enable_debug_trap(
+   pdd->dev->adev,
+   true,
+

[PATCH 13/33] drm/amdkfd: prepare map process for single process debug devices

2023-05-25 Thread Jonathan Kim

Older HW only supports debugging on a single process because the
SPI debug mode setting registers are device global.

The HWS has supplied a single pinned VMID (0xf) for MAP_PROCESS
for debug purposes. To pin the VMID, the KFD will remove the VMID from
the HWS dynamic VMID allocation via SET_RESOURCES so that a debugged
process will never migrate away from its pinned VMID.

The KFD is responsible for reserving and releasing this pinned VMID
accordingly whenever the debugger attaches and detaches respectively.
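
The reserve and release bookkeeping can be sketched with a plain bitmap, as
below.  The struct and function names are hypothetical, and the sketch omits
the queue unmap/remap and scheduler resource update that the driver performs
around the bitmap change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct vmid_pool_sketch {
	uint32_t bitmap;     /* VMIDs the scheduler may hand out dynamically */
	uint32_t debug_vmid; /* 0 means no debug VMID is reserved */
};

/* Remove vmid from the dynamic pool; fails if one is already reserved. */
int reserve_debug_vmid(struct vmid_pool_sketch *p, uint32_t vmid)
{
	assert(p != NULL);
	assert(vmid > 0 && vmid < 32);

	if (p->debug_vmid != 0)
		return -1;

	p->bitmap &= ~(UINT32_C(1) << vmid);
	p->debug_vmid = vmid;
	return 0;
}

/* Return the reserved VMID to the dynamic pool. */
int release_debug_vmid(struct vmid_pool_sketch *p)
{
	assert(p != NULL);

	if (p->debug_vmid == 0)
		return -1;

	p->bitmap |= UINT32_C(1) << p->debug_vmid;
	p->debug_vmid = 0;
	return 0;
}
```

Using 0 as the "not reserved" marker works here only because VMID 0 is never
a valid debug VMID in this model.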

v2: spot fix ups using new kfd_node references

Signed-off-by: Jonathan Kim 
---
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 93 +++
 .../drm/amd/amdkfd/kfd_device_queue_manager.h |  5 +
 .../drm/amd/amdkfd/kfd_packet_manager_v9.c|  9 ++
 .../gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h   |  5 +-
 4 files changed, 111 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index d1f44feb7084..c8519adc89ac 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1524,6 +1524,7 @@ static int initialize_cpsch(struct device_queue_manager 
*dqm)
dqm->gws_queue_count = 0;
dqm->active_runlist = false;
INIT_WORK(&dqm->hw_exception_work, kfd_process_hw_exception);
+   dqm->trap_debug_vmid = 0;
 
init_sdma_bitmaps(dqm);
 
@@ -2500,6 +2501,98 @@ static void kfd_process_hw_exception(struct work_struct 
*work)
amdgpu_amdkfd_gpu_reset(dqm->dev->adev);
 }
 
+int reserve_debug_trap_vmid(struct device_queue_manager *dqm,
+   struct qcm_process_device *qpd)
+{
+   int r;
+   int updated_vmid_mask;
+
+   if (dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS) {
+   pr_err("Unsupported on sched_policy: %i\n", dqm->sched_policy);
+   return -EINVAL;
+   }
+
+   dqm_lock(dqm);
+
+   if (dqm->trap_debug_vmid != 0) {
+   pr_err("Trap debug id already reserved\n");
+   r = -EBUSY;
+   goto out_unlock;
+   }
+
+   r = unmap_queues_cpsch(dqm, KFD_UNMAP_QUEUES_FILTER_ALL_QUEUES, 0,
+   USE_DEFAULT_GRACE_PERIOD, false);
+   if (r)
+   goto out_unlock;
+
+   updated_vmid_mask = dqm->dev->kfd->shared_resources.compute_vmid_bitmap;
+   updated_vmid_mask &= ~(1 << dqm->dev->vm_info.last_vmid_kfd);
+
+   dqm->dev->kfd->shared_resources.compute_vmid_bitmap = updated_vmid_mask;
+   dqm->trap_debug_vmid = dqm->dev->vm_info.last_vmid_kfd;
+   r = set_sched_resources(dqm);
+   if (r)
+   goto out_unlock;
+
+   r = map_queues_cpsch(dqm);
+   if (r)
+   goto out_unlock;
+
+   pr_debug("Reserved VMID for trap debug: %i\n", dqm->trap_debug_vmid);
+
+out_unlock:
+   dqm_unlock(dqm);
+   return r;
+}
+
+/*
+ * Releases vmid for the trap debugger
+ */
+int release_debug_trap_vmid(struct device_queue_manager *dqm,
+   struct qcm_process_device *qpd)
+{
+   int r;
+   int updated_vmid_mask;
+   uint32_t trap_debug_vmid;
+
+   if (dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS) {
+   pr_err("Unsupported on sched_policy: %i\n", dqm->sched_policy);
+   return -EINVAL;
+   }
+
+   dqm_lock(dqm);
+   trap_debug_vmid = dqm->trap_debug_vmid;
+   if (dqm->trap_debug_vmid == 0) {
+   pr_err("Trap debug id is not reserved\n");
+   r = -EINVAL;
+   goto out_unlock;
+   }
+
+   r = unmap_queues_cpsch(dqm, KFD_UNMAP_QUEUES_FILTER_ALL_QUEUES, 0,
+   USE_DEFAULT_GRACE_PERIOD, false);
+   if (r)
+   goto out_unlock;
+
+   updated_vmid_mask = dqm->dev->kfd->shared_resources.compute_vmid_bitmap;
+   updated_vmid_mask |= (1 << dqm->dev->vm_info.last_vmid_kfd);
+
+   dqm->dev->kfd->shared_resources.compute_vmid_bitmap = updated_vmid_mask;
+   dqm->trap_debug_vmid = 0;
+   r = set_sched_resources(dqm);
+   if (r)
+   goto out_unlock;
+
+   r = map_queues_cpsch(dqm);
+   if (r)
+   goto out_unlock;
+
+   pr_debug("Released VMID for trap debug: %i\n", trap_debug_vmid);
+
+out_unlock:
+   dqm_unlock(dqm);
+   return r;
+}
+
 #if defined(CONFIG_DEBUG_FS)
 
 static void seq_reg_dump(struct seq_file *m,
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
index d4dd3b4acbf0..bf7aa3f84182 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
@@ -250,6 +250,7 @@ struct device_queue_manager {
struct kfd_mem_obj  *fence_mem;
boolactive_runlist;
int sched_policy;
+   uint32_ttrap_debug_vmid

[PATCH 08/33] drm/amdkfd: fix kfd_suspend_all_processes

2023-05-25 Thread Jonathan Kim

Flush delayed restore work in kfd_suspend_all_processes instead of
cancelling. Cancelling the work before it runs results in the queues
becoming permanently disabled. Flushing the work ensures that the
queue suspend/resume state stays balanced.
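
The imbalance can be illustrated with a toy model.  It assumes that
evictions are reference counted and that queues only run when the count is
zero; that assumption, and every name below, is for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct proc_model {
	int evicted;          /* eviction count; queues run only at 0 */
	bool restore_pending; /* a delayed restore has been queued */
};

/* A restore undoes one eviction. */
static void run_restore(struct proc_model *p)
{
	if (p->evicted > 0)
		p->evicted--;
	p->restore_pending = false;
}

/* Suspend path: either flush or cancel the pending restore, then evict. */
void suspend_all(struct proc_model *p, bool flush)
{
	assert(p != NULL);

	if (p->restore_pending) {
		if (flush)
			run_restore(p);
		else
			p->restore_pending = false; /* the decrement is lost */
	}

	p->evicted++;
}

/* Resume path: restore once. */
void resume_all(struct proc_model *p)
{
	assert(p != NULL);
	run_restore(p);
}

bool queues_running(const struct proc_model *p)
{
	assert(p != NULL);
	return p->evicted == 0;
}
```

Starting from an evicted process with a restore pending, the cancel path
ends with a non-zero count after resume, while the flush path returns to
zero.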

Signed-off-by: Jonathan Kim 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index af0a4b5257cc..d63a764dafb9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -2014,7 +2014,7 @@ void kfd_suspend_all_processes(void)
WARN(debug_evictions, "Evicting all processes");
hash_for_each_rcu(kfd_processes_table, temp, p, kfd_processes) {
cancel_delayed_work_sync(&p->eviction_work);
-   cancel_delayed_work_sync(&p->restore_work);
+   flush_delayed_work(&p->restore_work);
 
if (kfd_process_evict_queues(p, 
KFD_QUEUE_EVICTION_TRIGGER_SUSPEND))
pr_err("Failed to suspend process 0x%x\n", p->pasid);
-- 
2.25.1

[PATCH 07/33] drm/amdgpu: add gfx9.4.1 hw debug mode enable and disable calls

2023-05-25 Thread Jonathan Kim

On GFX9.4.1, the implicit wait count instruction on s_barrier is
disabled by default in the driver during normal operation for
performance requirements.

There is a hardware bug in GFX9.4.1 where if the implicit wait count
instruction after an s_barrier instruction is disabled, any wave that
hits an exception may step over the s_barrier when returning from the
trap handler with the barrier logic having no ability to be
aware of this, thereby causing other waves to wait at the barrier
indefinitely resulting in a shader hang.  This bug has been corrected
for GFX9.4.2 and onward.

Since the debugger subscribes to hardware exceptions, in order to avoid
this bug, the debugger must enable implicit wait count on s_barrier
for a debug session and disable it on detach.

In order to change this setting in the device global SQ_CONFIG
register, the GFX pipeline must be idle.  GFX9.4.1 as a compute device
will either dispatch work through the compute ring buffers used for
image post processing or through the hardware scheduler by the KFD.

Have the KGD suspend and drain the compute ring buffer, then suspend the
hardware scheduler and block any future KFD process job requests before
changing the implicit wait count setting.  Once set, resume all work.

Signed-off-by: Jonathan Kim 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   |   3 +
 .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c   | 116 ++
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c |   4 +-
 3 files changed, 121 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 5af954abd5ba..7c49cdc37d95 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1065,6 +1065,9 @@ struct amdgpu_device {
struct pci_saved_state  *pci_state;
pci_channel_state_t pci_channel_state;
 
+   /* Track auto wait count on s_barrier settings */
+   boolbarrier_has_auto_waitcnt;
+
struct amdgpu_reset_control *reset_cntl;
uint32_t
ip_versions[MAX_HWIP][HWIP_MAX_INSTANCE];
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
index 4191af5a3f13..d2918e5c0dea 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
@@ -26,6 +26,7 @@
 #include "amdgpu.h"
 #include "amdgpu_amdkfd.h"
 #include "amdgpu_amdkfd_arcturus.h"
+#include "amdgpu_reset.h"
 #include "sdma0/sdma0_4_2_2_offset.h"
 #include "sdma0/sdma0_4_2_2_sh_mask.h"
 #include "sdma1/sdma1_4_2_2_offset.h"
@@ -48,6 +49,8 @@
 #include "amdgpu_amdkfd_gfx_v9.h"
 #include "gfxhub_v1_0.h"
 #include "mmhub_v9_4.h"
+#include "gc/gc_9_0_offset.h"
+#include "gc/gc_9_0_sh_mask.h"
 
 #define HQD_N_REGS 56
 #define DUMP_REG(addr) do {\
@@ -276,6 +279,117 @@ int kgd_arcturus_hqd_sdma_destroy(struct amdgpu_device 
*adev, void *mqd,
return 0;
 }
 
+/*
+ * Helper used to suspend/resume gfx pipe for image post process work to set
+ * barrier behaviour.
+ */
+static int suspend_resume_compute_scheduler(struct amdgpu_device *adev, bool 
suspend)
+{
+   int i, r = 0;
+
+   for (i = 0; i < adev->gfx.num_compute_rings; i++) {
+   struct amdgpu_ring *ring = &adev->gfx.compute_ring[i];
+
+   if (!(ring && ring->sched.thread))
+   continue;
+
+   /* stop scheduler and drain ring. */
+   if (suspend) {
+   drm_sched_stop(&ring->sched, NULL);
+   r = amdgpu_fence_wait_empty(ring);
+   if (r)
+   goto out;
+   } else {
+   drm_sched_start(&ring->sched, false);
+   }
+   }
+
+out:
+   /* return on resume or failure to drain rings. */
+   if (!suspend || r)
+   return r;
+
+   return amdgpu_device_ip_wait_for_idle(adev, GC_HWIP);
+}
+
+static void set_barrier_auto_waitcnt(struct amdgpu_device *adev, bool 
enable_waitcnt)
+{
+   uint32_t data;
+
+   WRITE_ONCE(adev->barrier_has_auto_waitcnt, enable_waitcnt);
+
+   if (!down_read_trylock(&adev->reset_domain->sem))
+   return;
+
+   amdgpu_amdkfd_suspend(adev, false);
+
+   if (suspend_resume_compute_scheduler(adev, true))
+   goto out;
+
+   data = RREG32(SOC15_REG_OFFSET(GC, 0, mmSQ_CONFIG));
+   data = REG_SET_FIELD(data, SQ_CONFIG, DISABLE_BARRIER_WAITCNT,
+   !enable_waitcnt);
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmSQ_CONFIG), data);
+
+out:
+   suspend_resume_compute_scheduler(adev, false);
+
+   amdgpu_amdkfd_resume(adev, false);
+
+   up_read(&adev->reset_domain->sem);
+}
+
+/*
+ * restore_dbg_registers is ignored here but is a general i

[PATCH 12/33] drm/amdgpu: add configurable grace period for unmap queues

2023-05-25 Thread Jonathan Kim

The HWS schedule allows a grace period for wave completion prior to
preemption for better performance by avoiding CWSR on waves that can
potentially complete quickly. The debugger, on the other hand, will
want to inspect wave status immediately after it actively triggers
preemption (a suspend function to be provided).

To minimize latency between preemption and debugger wave inspection, allow
immediate preemption by setting the grace period to 0.

Note that setting the preemption grace period to 0 will result in an
infinite grace period being set due to a CP FW bug so set it to 1 for now.
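
As a rough illustration of the workaround described above, the following
sketch bumps a requested grace period of 0 up to 1 before it would be
programmed. The helper name and the constant are hypothetical and are not
taken from the patch; only the "0 becomes 1" rule comes from this message.

```c
#include <stdint.h>

/* Hypothetical constant: smallest grace period that avoids the CP FW bug. */
#define MIN_SAFE_GRACE_PERIOD UINT32_C(1)

/*
 * Hypothetical helper: map a requested grace period to the value that would
 * actually be programmed. A request of 0 ("preempt immediately") is bumped
 * to 1 because, per the commit message, the CP firmware treats 0 as an
 * infinite grace period. All other values pass through unchanged.
 */
static uint32_t sanitize_grace_period(uint32_t requested)
{
	return requested == 0 ? MIN_SAFE_GRACE_PERIOD : requested;
}
```

With this shape the debugger path can always ask for 0 and let one place
own the firmware quirk, which keeps the workaround easy to remove later.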

v2: add null grace period function pointers to VI packet manager.

Signed-off-by: Jonathan Kim 
---
 .../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c  |  2 +
 .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c   |  2 +
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 43 
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.h|  6 ++
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c  |  2 +
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 43 
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.h |  8 ++-
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 63 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager.h |  3 +
 .../gpu/drm/amd/amdkfd/kfd_packet_manager.c   | 32 +
 .../drm/amd/amdkfd/kfd_packet_manager_v9.c| 39 +++
 .../drm/amd/amdkfd/kfd_packet_manager_vi.c|  2 +
 .../gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h   | 65 +++
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  5 ++
 14 files changed, 295 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
index a6f98141c29c..b811a0985050 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
@@ -82,5 +82,7 @@ const struct kfd2kgd_calls aldebaran_kfd2kgd = {
.get_cu_occupancy = kgd_gfx_v9_get_cu_occupancy,
.enable_debug_trap = kgd_aldebaran_enable_debug_trap,
.disable_debug_trap = kgd_aldebaran_disable_debug_trap,
+   .get_iq_wait_times = kgd_gfx_v9_get_iq_wait_times,
+   .build_grace_period_packet_info = 
kgd_gfx_v9_build_grace_period_packet_info,
.program_trap_handler_settings = 
kgd_gfx_v9_program_trap_handler_settings,
 };
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
index d2918e5c0dea..a62bd0068515 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
@@ -410,6 +410,8 @@ const struct kfd2kgd_calls arcturus_kfd2kgd = {
kgd_gfx_v9_set_vm_context_page_table_base,
.enable_debug_trap = kgd_arcturus_enable_debug_trap,
.disable_debug_trap = kgd_arcturus_disable_debug_trap,
+   .get_iq_wait_times = kgd_gfx_v9_get_iq_wait_times,
+   .build_grace_period_packet_info = 
kgd_gfx_v9_build_grace_period_packet_info,
.get_cu_occupancy = kgd_gfx_v9_get_cu_occupancy,
.program_trap_handler_settings = 
kgd_gfx_v9_program_trap_handler_settings
 };
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
index 240f5006e278..98006c7021dd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
@@ -803,6 +803,47 @@ uint32_t kgd_gfx_v10_disable_debug_trap(struct 
amdgpu_device *adev,
return 0;
 }
 
+/* kgd_gfx_v10_get_iq_wait_times: Returns the mmCP_IQ_WAIT_TIME1/2 values
+ * The values read are:
+ * ib_offload_wait_time     -- Wait Count for Indirect Buffer Offloads.
+ * atomic_offload_wait_time -- Wait Count for L2 and GDS Atomics Offloads.
+ * wrm_offload_wait_time    -- Wait Count for WAIT_REG_MEM Offloads.
+ * gws_wait_time            -- Wait Count for Global Wave Syncs.
+ * que_sleep_wait_time      -- Wait Count for Dequeue Retry.
+ * sch_wave_wait_time       -- Wait Count for Scheduling Wave Message.
+ * sem_rearm_wait_time      -- Wait Count for Semaphore re-arm.
+ * deq_retry_wait_time      -- Wait Count for Global Wave Syncs.
+ */
+void kgd_gfx_v10_get_iq_wait_times(struct amdgpu_device *adev,
+   uint32_t *wait_times)
+
+{
+   *wait_times = RREG32(SOC15_REG_OFFSET(GC, 0, mmCP_IQ_WAIT_TIME2));
+}
+
+void kgd_gfx_v10_build_grace_period_packet_info(struct amdgpu_device *adev,
+   uint32_t wait_times,
+   uint32_t grace_period,
+   uint32_t *reg_offset,
+   uint32_t *reg_data)
+{
+   *reg_data = wait_times;
+
+   /*
+* The CP cannot handle a 0 grace period input and will result in
+* an infinite grace period being set so set to

[PATCH 14/33] drm/amdgpu: prepare map process for multi-process debug devices

2023-05-25 Thread Jonathan Kim

Unlike single process debug devices, multi-process debug devices allow
debug mode setting per-VMID (non-device-global).

Because the HWS manages PASID-VMID mapping, the new MAP_PROCESS API allows
the KFD to forward the required SPI debug register write requests.

To request a new debug mode setting change, the KFD must be able to
preempt all queues then remap all queues with these new setting
requests for MAP_PROCESS to take effect.
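
The preempt-then-remap sequence described above could be modelled roughly
as follows. This is a user-space sketch only: struct mock_dqm and every
mock_* function are hypothetical stand-ins, and the real locking and
runlist handling in the KFD are considerably more involved.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the device queue manager state. */
struct mock_dqm {
	bool locked;         /* models the dqm lock being held */
	bool queues_mapped;  /* models whether the runlist is active */
	int  pending_setting; /* setting to forward on the next map */
	int  active_setting;  /* setting the "hardware" currently uses */
};

/* Take the lock and preempt all queues; returns 0 on success. */
static int mock_lock_and_unmap(struct mock_dqm *dqm)
{
	dqm->locked = true;
	dqm->queues_mapped = false;
	return 0;
}

/* Remap queues so the pending setting takes effect, then drop the lock. */
static int mock_map_and_unlock(struct mock_dqm *dqm)
{
	dqm->active_setting = dqm->pending_setting;
	dqm->queues_mapped = true;
	dqm->locked = false;
	return 0;
}

/* Change a debug setting: it only becomes active once queues are remapped. */
static int mock_apply_debug_setting(struct mock_dqm *dqm, int setting)
{
	int r = mock_lock_and_unmap(dqm);

	if (r)
		return r;

	dqm->pending_setting = setting;
	return mock_map_and_unlock(dqm);
}
```

Splitting the operation into a lock-and-unmap half and a map-and-unlock
half lets a caller update per-process state in between while no queue is
running.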

Note that by default, trap enablement in non-debug mode must be disabled
for performance reasons for multi-process debug devices due to setup
overhead in FW.

v2: spot fixup new kfd_node references

Signed-off-by: Jonathan Kim 
---
 drivers/gpu/drm/amd/amdkfd/kfd_debug.h|  5 ++
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 51 +++
 .../drm/amd/amdkfd/kfd_device_queue_manager.h |  3 ++
 .../drm/amd/amdkfd/kfd_packet_manager_v9.c| 14 +
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  9 
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  |  5 ++
 6 files changed, 87 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
index a8abfe2a0a14..db6d72e7930f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
@@ -29,4 +29,9 @@ int kfd_dbg_trap_disable(struct kfd_process *target);
 int kfd_dbg_trap_enable(struct kfd_process *target, uint32_t fd,
void __user *runtime_info,
uint32_t *runtime_info_size);
+static inline bool kfd_dbg_is_per_vmid_supported(struct kfd_node *dev)
+{
+   return KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 2);
+}
+
 #endif
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index c8519adc89ac..badfe1210bc4 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -36,6 +36,7 @@
 #include "kfd_kernel_queue.h"
 #include "amdgpu_amdkfd.h"
 #include "mes_api_def.h"
+#include "kfd_debug.h"
 
 /* Size of the per-pipe EOP queue */
 #define CIK_HPD_EOP_BYTES_LOG2 11
@@ -2593,6 +2594,56 @@ int release_debug_trap_vmid(struct device_queue_manager 
*dqm,
return r;
 }
 
+int debug_lock_and_unmap(struct device_queue_manager *dqm)
+{
+   int r;
+
+   if (dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS) {
+   pr_err("Unsupported on sched_policy: %i\n", dqm->sched_policy);
+   return -EINVAL;
+   }
+
+   if (!kfd_dbg_is_per_vmid_supported(dqm->dev))
+   return 0;
+
+   dqm_lock(dqm);
+
+   r = unmap_queues_cpsch(dqm, KFD_UNMAP_QUEUES_FILTER_ALL_QUEUES, 0, 0, 
false);
+   if (r)
+   dqm_unlock(dqm);
+
+   return r;
+}
+
+int debug_map_and_unlock(struct device_queue_manager *dqm)
+{
+   int r;
+
+   if (dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS) {
+   pr_err("Unsupported on sched_policy: %i\n", dqm->sched_policy);
+   return -EINVAL;
+   }
+
+   if (!kfd_dbg_is_per_vmid_supported(dqm->dev))
+   return 0;
+
+   r = map_queues_cpsch(dqm);
+
+   dqm_unlock(dqm);
+
+   return r;
+}
+
+int debug_refresh_runlist(struct device_queue_manager *dqm)
+{
+   int r = debug_lock_and_unmap(dqm);
+
+   if (r)
+   return r;
+
+   return debug_map_and_unlock(dqm);
+}
+
 #if defined(CONFIG_DEBUG_FS)
 
 static void seq_reg_dump(struct seq_file *m,
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
index bf7aa3f84182..bb75d93712eb 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
@@ -290,6 +290,9 @@ int reserve_debug_trap_vmid(struct device_queue_manager 
*dqm,
struct qcm_process_device *qpd);
 int release_debug_trap_vmid(struct device_queue_manager *dqm,
struct qcm_process_device *qpd);
+int debug_lock_and_unmap(struct device_queue_manager *dqm);
+int debug_map_and_unlock(struct device_queue_manager *dqm);
+int debug_refresh_runlist(struct device_queue_manager *dqm);
 
 static inline unsigned int get_sh_mem_bases_32(struct kfd_process_device *pdd)
 {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c
index 0fe73dbd28af..29a2d0499b67 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c
@@ -88,6 +88,10 @@ static int pm_map_process_aldebaran(struct packet_manager 
*pm,
 {
struct pm4_mes_map_process_aldebaran *packet;
uint64_t vm_page_table_base_addr = qpd->page_table_base;
+   struct kfd_dev *kfd = pm->dqm->dev->kfd;
+   struct kfd_process_device *pdd =
+   container_of(qpd, struct kfd_process_device, qpd);
+   int i;
 
packet = (s

[PATCH 09/33] drm/amdgpu: add gfx10 hw debug mode enable and disable calls

2023-05-25 Thread Jonathan Kim

Similar to GFX9 debug devices, set the hardware debug mode by draining
the SPI appropriately prior to the mode setting request.

Because GFX10 has waves allocated by the work group boundary and each
SE's SPI instances do not communicate, the SPI drain time is much longer.
This long drain time will be fixed for GFX11 onwards.

Also remove a bunch of deprecated misplaced references for GFX10.3.

Signed-off-by: Jonathan Kim 
Reviewed-by: Felix Kuehling 
---
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c|  96 
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.h|  28 
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c  | 148 +-
 3 files changed, 127 insertions(+), 145 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.h

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
index 7b60268d93c0..240f5006e278 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
@@ -21,6 +21,7 @@
  */
 #include "amdgpu.h"
 #include "amdgpu_amdkfd.h"
+#include "amdgpu_amdkfd_gfx_v10.h"
 #include "gc/gc_10_1_0_offset.h"
 #include "gc/gc_10_1_0_sh_mask.h"
 #include "athub/athub_2_0_0_offset.h"
@@ -709,6 +710,99 @@ static void set_vm_context_page_table_base(struct 
amdgpu_device *adev,
adev->gfxhub.funcs->setup_vm_pt_regs(adev, vmid, page_table_base);
 }
 
+/*
+ * GFX10 helper for wave launch stall requirements on debug trap setting.
+ *
+ * vmid:
+ *   Target VMID to stall/unstall.
+ *
+ * stall:
+ *   0-unstall wave launch (enable), 1-stall wave launch (disable).
+ *   After wavefront launch has been stalled, allocated waves must drain from
+ *   SPI in order for debug trap settings to take effect on those waves.
+ *   This is roughly a ~3500 clock cycle wait on SPI where a read on
+ *   SPI_GDBG_WAVE_CNTL translates to ~32 clock cycles.
+ *   KGD_GFX_V10_WAVE_LAUNCH_SPI_DRAIN_LATENCY indicates the number of reads 
required.
+ *
+ *   NOTE: We can afford to clear the entire STALL_VMID field on unstall
+ *   because current GFX10 chips cannot support multi-process debugging due to
+ *   trap configuration and masking being limited to global scope.  Always
+ *   assume single process conditions.
+ *
+ */
+
+#define KGD_GFX_V10_WAVE_LAUNCH_SPI_DRAIN_LATENCY  110
+static void kgd_gfx_v10_set_wave_launch_stall(struct amdgpu_device *adev, 
uint32_t vmid, bool stall)
+{
+   uint32_t data = RREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_WAVE_CNTL));
+   int i;
+
+   data = REG_SET_FIELD(data, SPI_GDBG_WAVE_CNTL, STALL_VMID,
+   stall ? 1 << vmid : 0);
+
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_WAVE_CNTL), data);
+
+   if (!stall)
+   return;
+
+   for (i = 0; i < KGD_GFX_V10_WAVE_LAUNCH_SPI_DRAIN_LATENCY; i++)
+   RREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_WAVE_CNTL));
+}
+
+uint32_t kgd_gfx_v10_enable_debug_trap(struct amdgpu_device *adev,
+   bool restore_dbg_registers,
+   uint32_t vmid)
+{
+
+   mutex_lock(&adev->grbm_idx_mutex);
+
+   kgd_gfx_v10_set_wave_launch_stall(adev, vmid, true);
+
+   /* assume gfx off is disabled for the debug session if rlc restore not 
supported. */
+   if (restore_dbg_registers) {
+   uint32_t data = 0;
+
+   data = REG_SET_FIELD(data, SPI_GDBG_TRAP_CONFIG,
+   VMID_SEL, 1 << vmid);
+   data = REG_SET_FIELD(data, SPI_GDBG_TRAP_CONFIG,
+   TRAP_EN, 1);
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_CONFIG), data);
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_DATA0), 0);
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_DATA1), 0);
+
+   kgd_gfx_v10_set_wave_launch_stall(adev, vmid, false);
+
+   mutex_unlock(&adev->grbm_idx_mutex);
+
+   return 0;
+   }
+
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_MASK), 0);
+
+   kgd_gfx_v10_set_wave_launch_stall(adev, vmid, false);
+
+   mutex_unlock(&adev->grbm_idx_mutex);
+
+   return 0;
+}
+
+uint32_t kgd_gfx_v10_disable_debug_trap(struct amdgpu_device *adev,
+   bool keep_trap_enabled,
+   uint32_t vmid)
+{
+   mutex_lock(&adev->grbm_idx_mutex);
+
+   kgd_gfx_v10_set_wave_launch_stall(adev, vmid, true);
+
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_MASK), 0);
+
+   kgd_gfx_v10_set_wave_launch_stall(adev, vmid, false);
+
+   mutex_unlock(&adev->grbm_idx_mutex);
+
+   return 0;
+}
+
 static void program_trap_handler_settings(struct amdgpu_device *adev,
uint32_t vmid, uint64_t tba_addr, uint64_t tma_addr,
uint32_t inst)
@@ -752,5 +846,7 @@ const struct kfd2kgd_calls gfx_v10_kfd2kgd

[PATCH 05/33] drm/amdgpu: setup hw debug registers on driver initialization

2023-05-25 Thread Jonathan Kim

Add missing debug trap registers references and initialize all debug
registers on boot by clearing the hardware exception overrides and the
wave allocation ID index.
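
The init helpers in this patch build a VMID_SEL mask with one bit per VMID
in a half-open range. A standalone sketch of that computation is shown
below; the function name is illustrative, the guard against shifts of 32
or more is an addition of this sketch, and the example range of 8 to 16 is
an assumption rather than something stated in this message.

```c
#include <stdint.h>

/*
 * Build a bitmask with one bit set for every VMID in [first_vmid, last_vmid).
 * The "i < 32" guard is not in the patch; it only keeps this standalone
 * sketch free of undefined shifts if it is called with a large range.
 */
static uint32_t trap_config_vmid_mask(uint32_t first_vmid, uint32_t last_vmid)
{
	uint32_t mask = 0;
	uint32_t i;

	for (i = first_vmid; i < last_vmid && i < 32; i++)
		mask |= UINT32_C(1) << i;

	return mask;
}
```

For an assumed range of VMIDs 8 through 15 this yields 0xFF00, and an
empty range yields 0.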

The debugger requires that TTMPs 6 & 7 save the dispatch ID to map
waves onto dispatch during compute context inspection.
In order to correctly set this up, set the special reserved CP bit by
default whenever the MQD is initialized.

v2: add missing 0-init of SPI_GDBG_TRAP_DATA0/1

Signed-off-by: Jonathan Kim 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c| 26 +++
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c|  1 +
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 30 
 drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c   |  3 +
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c  |  5 ++
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c  |  5 ++
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   |  5 ++
 .../include/asic_reg/gc/gc_10_1_0_offset.h| 14 
 .../include/asic_reg/gc/gc_10_1_0_sh_mask.h   | 69 +++
 .../include/asic_reg/gc/gc_10_3_0_offset.h| 10 +++
 .../include/asic_reg/gc/gc_10_3_0_sh_mask.h   |  4 ++
 .../include/asic_reg/gc/gc_11_0_0_sh_mask.h   |  4 ++
 12 files changed, 176 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index f7ad883a70fa..be984f8c71c7 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -4825,6 +4825,29 @@ static u32 
gfx_v10_0_init_pa_sc_tile_steering_override(struct amdgpu_device *ade
 
 #define DEFAULT_SH_MEM_BASES   (0x6000)
 
+static void gfx_v10_0_debug_trap_config_init(struct amdgpu_device *adev,
+   uint32_t first_vmid,
+   uint32_t last_vmid)
+{
+   uint32_t data;
+   uint32_t trap_config_vmid_mask = 0;
+   int i;
+
+   /* Calculate trap config vmid mask */
+   for (i = first_vmid; i < last_vmid; i++)
+   trap_config_vmid_mask |= (1 << i);
+
+   data = REG_SET_FIELD(0, SPI_GDBG_TRAP_CONFIG,
+   VMID_SEL, trap_config_vmid_mask);
+   data = REG_SET_FIELD(data, SPI_GDBG_TRAP_CONFIG,
+   TRAP_EN, 1);
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_CONFIG), data);
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_MASK), 0);
+
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_DATA0), 0);
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_DATA1), 0);
+}
+
 static void gfx_v10_0_init_compute_vmid(struct amdgpu_device *adev)
 {
int i;
@@ -4856,6 +4879,9 @@ static void gfx_v10_0_init_compute_vmid(struct 
amdgpu_device *adev)
WREG32_SOC15_OFFSET(GC, 0, mmGDS_GWS_VMID0, i, 0);
WREG32_SOC15_OFFSET(GC, 0, mmGDS_OA_VMID0, i, 0);
}
+
+   gfx_v10_0_debug_trap_config_init(adev, adev->vm_manager.first_kfd_vmid,
+   AMDGPU_NUM_VMID);
 }
 
 static void gfx_v10_0_init_gds_vmid(struct amdgpu_device *adev)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index da21bf868080..690e121d9dda 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -1638,6 +1638,7 @@ static void gfx_v11_0_init_compute_vmid(struct 
amdgpu_device *adev)
/* Enable trap for each kfd vmid. */
data = RREG32_SOC15(GC, 0, regSPI_GDBG_PER_VMID_CNTL);
data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, TRAP_EN, 1);
+   WREG32_SOC15(GC, 0, regSPI_GDBG_PER_VMID_CNTL, data);
}
soc21_grbm_select(adev, 0, 0, 0, 0);
mutex_unlock(&adev->srbm_mutex);
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 0189e50bd89f..7f17e0061027 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -2303,6 +2303,29 @@ static void gfx_v9_0_setup_rb(struct amdgpu_device *adev)
adev->gfx.config.num_rbs = hweight32(active_rbs);
 }
 
+static void gfx_v9_0_debug_trap_config_init(struct amdgpu_device *adev,
+   uint32_t first_vmid,
+   uint32_t last_vmid)
+{
+   uint32_t data;
+   uint32_t trap_config_vmid_mask = 0;
+   int i;
+
+   /* Calculate trap config vmid mask */
+   for (i = first_vmid; i < last_vmid; i++)
+   trap_config_vmid_mask |= (1 << i);
+
+   data = REG_SET_FIELD(0, SPI_GDBG_TRAP_CONFIG,
+   VMID_SEL, trap_config_vmid_mask);
+   data = REG_SET_FIELD(data, SPI_GDBG_TRAP_CONFIG,
+   TRAP_EN, 1);
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_CONFIG), data);
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_MASK), 0);
+
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_DATA0), 0);
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_DATA1), 0);
+}
+
 #define DEFAULT_SH_MEM_BASES   (0x6000)

[PATCH 06/33] drm/amdgpu: add gfx9 hw debug mode enable and disable calls

2023-05-25 Thread Jonathan Kim

Implement the per-device calls to enable or disable HW debug mode for
GFX9 prior to GFX9.4.1.

GFX9.4.1 and onward will require their own enable/disable sequence as
follow-on patches.

When hardware debug mode setting is requested, waves will inherit
these settings in the Shader Processor Input's (SPI) Sequencer Global
Block (SQG). This means that the KGD must drain all waves from the SPI
into SQG (approximately 96 SPI clock cycles) prior to debug mode setting
to ensure that the order of operations that the debugger expects with
regard to debug mode setting transaction requests and wave inheritance
of that mode is upheld.
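
The drain wait is implemented as a fixed number of dummy register reads.
The arithmetic behind that count can be sketched as below, using the
figure of about 32 SPI clock cycles per SPI_GDBG_WAVE_CNTL read quoted in
the code comments of this series. The helper name is hypothetical, and the
cycle figures are approximations taken from those comments.

```c
#include <stdint.h>

/* Approximate cost of one SPI_GDBG_WAVE_CNTL read, per the patch comments. */
#define SPI_CYCLES_PER_READ UINT32_C(32)

/*
 * Hypothetical helper: number of dummy reads needed to cover a drain time
 * given in SPI clock cycles, rounded up so the wait is never too short.
 */
static uint32_t spi_drain_reads(uint32_t drain_cycles)
{
	return (drain_cycles + SPI_CYCLES_PER_READ - 1) / SPI_CYCLES_PER_READ;
}
```

For the roughly 96 cycles quoted here this gives 3 reads, and for the
roughly 3500 cycles quoted in the GFX10 patch it gives 110, which agrees
with the two drain latency constants defined in this series.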

Also ensure that exception overrides are reset to their original state
prior to debug enable or disable.

Signed-off-by: Jonathan Kim 
Reviewed-by: Felix Kuehling 
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 92 +++
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.h |  9 ++
 2 files changed, 101 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
index 9fa9aab22fe9..64da5995170c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
@@ -649,6 +649,96 @@ int kgd_gfx_v9_wave_control_execute(struct amdgpu_device 
*adev,
return 0;
 }
 
+/*
+ * GFX9 helper for wave launch stall requirements on debug trap setting.
+ *
+ * vmid:
+ *   Target VMID to stall/unstall.
+ *
+ * stall:
+ *   0-unstall wave launch (enable), 1-stall wave launch (disable).
+ *   After wavefront launch has been stalled, allocated waves must drain from
+ *   SPI in order for debug trap settings to take effect on those waves.
+ *   This is roughly a ~96 clock cycle wait on SPI where a read on
+ *   SPI_GDBG_WAVE_CNTL translates to ~32 clock cycles.
+ *   KGD_GFX_V9_WAVE_LAUNCH_SPI_DRAIN_LATENCY indicates the number of reads 
required.
+ *
+ *   NOTE: We can afford to clear the entire STALL_VMID field on unstall
+ *   because GFX9.4.1 cannot support multi-process debugging due to trap
+ *   configuration and masking being limited to global scope.  Always assume
+ *   single process conditions.
+ */
+#define KGD_GFX_V9_WAVE_LAUNCH_SPI_DRAIN_LATENCY   3
+void kgd_gfx_v9_set_wave_launch_stall(struct amdgpu_device *adev,
+   uint32_t vmid,
+   bool stall)
+{
+   int i;
+   uint32_t data = RREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_WAVE_CNTL));
+
+   if (adev->ip_versions[GC_HWIP][0] == IP_VERSION(9, 4, 1))
+   data = REG_SET_FIELD(data, SPI_GDBG_WAVE_CNTL, STALL_VMID,
+   stall ? 1 << vmid : 0);
+   else
+   data = REG_SET_FIELD(data, SPI_GDBG_WAVE_CNTL, STALL_RA,
+   stall ? 1 : 0);
+
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_WAVE_CNTL), data);
+
+   if (!stall)
+   return;
+
+   for (i = 0; i < KGD_GFX_V9_WAVE_LAUNCH_SPI_DRAIN_LATENCY; i++)
+   RREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_WAVE_CNTL));
+}
+
+/*
+ * restore_dbg_registers is ignored here but is a general interface requirement
+ * for devices that support GFXOFF and where the RLC save/restore list
+ * does not support hw registers for debugging i.e. the driver has to manually
+ * initialize the debug mode registers after it has disabled GFX off during the
+ * debug session.
+ */
+uint32_t kgd_gfx_v9_enable_debug_trap(struct amdgpu_device *adev,
+   bool restore_dbg_registers,
+   uint32_t vmid)
+{
+   mutex_lock(&adev->grbm_idx_mutex);
+
+   kgd_gfx_v9_set_wave_launch_stall(adev, vmid, true);
+
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_MASK), 0);
+
+   kgd_gfx_v9_set_wave_launch_stall(adev, vmid, false);
+
+   mutex_unlock(&adev->grbm_idx_mutex);
+
+   return 0;
+}
+
+/*
+ * keep_trap_enabled is ignored here but is a general interface requirement
+ * for devices that support multi-process debugging where the performance
+ * overhead from trap temporary setup needs to be bypassed when the debug
+ * session has ended.
+ */
+uint32_t kgd_gfx_v9_disable_debug_trap(struct amdgpu_device *adev,
+   bool keep_trap_enabled,
+   uint32_t vmid)
+{
+   mutex_lock(&adev->grbm_idx_mutex);
+
+   kgd_gfx_v9_set_wave_launch_stall(adev, vmid, true);
+
+   WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_MASK), 0);
+
+   kgd_gfx_v9_set_wave_launch_stall(adev, vmid, false);
+
+   mutex_unlock(&adev->grbm_idx_mutex);
+
+   return 0;
+}
+
 void kgd_gfx_v9_set_vm_context_page_table_base(struct amdgpu_device *adev,
uint32_t vmid, uint64_t page_table_base)
 {
@@ -875,6 +965,8 @@ const struct kfd2kgd_calls gfx_v9_kfd2kgd = {
.get_atc_vmid_pasid_mapping_info =

[PATCH 04/33] drm/amdgpu: add kgd hw debug mode setting interface

2023-05-25 Thread Jonathan Kim

Introduce the required KGD debug calls that will execute hardware debug
mode setting.
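
To show how a caller might use a per-ASIC callback table of this kind,
here is a cut-down user-space sketch. Every mock_* name is hypothetical,
the callbacks are reduced to two arguments, and returning -EOPNOTSUPP for
a missing callback is an assumption of the sketch, not behaviour defined
by this patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct amdgpu_device. */
struct mock_device {
	uint32_t debug_vmid_mask; /* bit n set: debug trap enabled for VMID n */
};

/* Hypothetical, cut-down callback table with only two of the calls. */
struct mock_debug_calls {
	uint32_t (*enable_debug_trap)(struct mock_device *dev, uint32_t vmid);
	uint32_t (*disable_debug_trap)(struct mock_device *dev, uint32_t vmid);
};

/* Mock callbacks; vmid is assumed to be below 32. */
static uint32_t mock_enable_debug_trap(struct mock_device *dev, uint32_t vmid)
{
	dev->debug_vmid_mask |= UINT32_C(1) << vmid;
	return 0;
}

static uint32_t mock_disable_debug_trap(struct mock_device *dev, uint32_t vmid)
{
	dev->debug_vmid_mask &= ~(UINT32_C(1) << vmid);
	return 0;
}

/* A table for a device that implements the calls, and one that does not. */
static const struct mock_debug_calls mock_debug_capable = {
	.enable_debug_trap = mock_enable_debug_trap,
	.disable_debug_trap = mock_disable_debug_trap,
};

static const struct mock_debug_calls mock_debug_incapable = { NULL, NULL };

/* Dispatch through the table, refusing cleanly when a callback is absent. */
static int mock_set_debug(const struct mock_debug_calls *calls,
			  struct mock_device *dev, uint32_t vmid, int enable)
{
	uint32_t (*fn)(struct mock_device *, uint32_t) =
		enable ? calls->enable_debug_trap : calls->disable_debug_trap;

	if (!fn)
		return -EOPNOTSUPP;

	fn(dev, vmid);
	return 0;
}
```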

Signed-off-by: Jonathan Kim 
Reviewed-by: Felix Kuehling 
---
 .../gpu/drm/amd/include/kgd_kfd_interface.h   | 34 +++
 1 file changed, 34 insertions(+)

diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h 
b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
index 8cb3dbcae3e4..d0df3381539f 100644
--- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
+++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
@@ -291,6 +291,40 @@ struct kfd2kgd_calls {
uint32_t vmid, uint64_t page_table_base);
uint32_t (*read_vmid_from_vmfault_reg)(struct amdgpu_device *adev);
 
+   uint32_t (*enable_debug_trap)(struct amdgpu_device *adev,
+   bool restore_dbg_registers,
+   uint32_t vmid);
+   uint32_t (*disable_debug_trap)(struct amdgpu_device *adev,
+   bool keep_trap_enabled,
+   uint32_t vmid);
+   int (*validate_trap_override_request)(struct amdgpu_device *adev,
+   uint32_t trap_override,
+   uint32_t *trap_mask_supported);
+   uint32_t (*set_wave_launch_trap_override)(struct amdgpu_device *adev,
+uint32_t vmid,
+uint32_t trap_override,
+uint32_t trap_mask_bits,
+uint32_t trap_mask_request,
+uint32_t *trap_mask_prev,
+uint32_t kfd_dbg_trap_cntl_prev);
+   uint32_t (*set_wave_launch_mode)(struct amdgpu_device *adev,
+   uint8_t wave_launch_mode,
+   uint32_t vmid);
+   uint32_t (*set_address_watch)(struct amdgpu_device *adev,
+   uint64_t watch_address,
+   uint32_t watch_address_mask,
+   uint32_t watch_id,
+   uint32_t watch_mode,
+   uint32_t debug_vmid);
+   uint32_t (*clear_address_watch)(struct amdgpu_device *adev,
+   uint32_t watch_id);
+   void (*get_iq_wait_times)(struct amdgpu_device *adev,
+   uint32_t *wait_times);
+   void (*build_grace_period_packet_info)(struct amdgpu_device *adev,
+   uint32_t wait_times,
+   uint32_t grace_period,
+   uint32_t *reg_offset,
+   uint32_t *reg_data);
void (*get_cu_occupancy)(struct amdgpu_device *adev, int pasid,
int *wave_cnt, int *max_waves_per_cu, uint32_t inst);
void (*program_trap_handler_settings)(struct amdgpu_device *adev,
-- 
2.25.1

[PATCH 03/33] drm/amdkfd: prepare per-process debug enable and disable

2023-05-25 Thread Jonathan Kim

The ROCm debugger will attach to a process to debug by PTRACE and will
expect the KFD to prepare a process for the target PID, whether the
target PID has opened the KFD device or not.

This patch is to explicitly handle this requirement.  Further HW mode
setting and runtime coordination requirements will be handled in
following patches.

In the case where the target process has not opened the KFD device,
a new KFD process must be created for the target PID.
The debugger as well as the target process for this case will not have
acquired any VMs so handle process restoration to correctly account for
this.
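
The choice between creating a new KFD process and looking up an existing
one follows the check made in the debug trap ioctl in this patch: the
target must differ from the caller and must be ptrace-attached by the
caller. A user-space model of that predicate, with hypothetical mock
types in place of task_struct, might look like this:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; only the fields the check needs. */
struct mock_task {
	int pid;
	const struct mock_task *ptrace_parent; /* tracer, or NULL if untraced */
};

/*
 * Return true when the caller should create a KFD process on behalf of the
 * target: the target exists, is not the caller itself, and is currently
 * traced by the caller. Otherwise an existing process would be looked up.
 */
static bool should_create_process(const struct mock_task *caller,
				  const struct mock_task *target)
{
	return target && target != caller && target->ptrace_parent == caller;
}
```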

To coordinate with HSA runtime, the debugger must be aware of the target
process' runtime enablement status and will copy the runtime status
information into the debugged KFD process for later query.

On enablement, the debugger will subscribe to a set of exceptions where
each exception event will notify the debugger through a pollable FIFO
file descriptor that the debugger provides to the KFD to manage.

Finally, on process termination of either the debugger or the target,
debugging must be disabled if this has not already been done.

Signed-off-by: Jonathan Kim 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/Makefile   |   3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 102 +-
 drivers/gpu/drm/amd/amdkfd/kfd_debug.c|  80 ++
 drivers/gpu/drm/amd/amdkfd/kfd_debug.h|  32 ++
 .../drm/amd/amdkfd/kfd_device_queue_manager.c |  26 +++--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  31 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  |  60 +++
 7 files changed, 304 insertions(+), 30 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/amdkfd/kfd_debug.c
 create mode 100644 drivers/gpu/drm/amd/amdkfd/kfd_debug.h

diff --git a/drivers/gpu/drm/amd/amdkfd/Makefile 
b/drivers/gpu/drm/amd/amdkfd/Makefile
index e758c2a24cd0..747754428073 100644
--- a/drivers/gpu/drm/amd/amdkfd/Makefile
+++ b/drivers/gpu/drm/amd/amdkfd/Makefile
@@ -55,7 +55,8 @@ AMDKFD_FILES  := $(AMDKFD_PATH)/kfd_module.o \
$(AMDKFD_PATH)/kfd_int_process_v9.o \
$(AMDKFD_PATH)/kfd_int_process_v11.o \
$(AMDKFD_PATH)/kfd_smi_events.o \
-   $(AMDKFD_PATH)/kfd_crat.o
+   $(AMDKFD_PATH)/kfd_crat.o \
+   $(AMDKFD_PATH)/kfd_debug.o
 
 ifneq ($(CONFIG_AMD_IOMMU_V2),)
 AMDKFD_FILES += $(AMDKFD_PATH)/kfd_iommu.o
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index f4b50b74818e..7082d5d0f0e9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -44,6 +44,7 @@
 #include "amdgpu_amdkfd.h"
 #include "kfd_smi_events.h"
 #include "amdgpu_dma_buf.h"
+#include "kfd_debug.h"
 
 static long kfd_ioctl(struct file *, unsigned int, unsigned long);
 static int kfd_open(struct inode *, struct file *);
@@ -142,10 +143,15 @@ static int kfd_open(struct inode *inode, struct file 
*filep)
return -EPERM;
}
 
-   process = kfd_create_process(filep);
+   process = kfd_create_process(current);
if (IS_ERR(process))
return PTR_ERR(process);
 
+   if (kfd_process_init_cwsr_apu(process, filep)) {
+   kfd_unref_process(process);
+   return -EFAULT;
+   }
+
/* filep now owns the reference returned by kfd_create_process */
filep->private_data = process;
 
@@ -2737,6 +2743,10 @@ static int kfd_ioctl_runtime_enable(struct file *filep, 
struct kfd_process *p, v
 static int kfd_ioctl_set_debug_trap(struct file *filep, struct kfd_process *p, 
void *data)
 {
struct kfd_ioctl_dbg_trap_args *args = data;
+   struct task_struct *thread = NULL;
+   struct mm_struct *mm = NULL;
+   struct pid *pid = NULL;
+   struct kfd_process *target = NULL;
int r = 0;
 
if (sched_policy == KFD_SCHED_POLICY_NO_HWS) {
@@ -2744,9 +2754,81 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, 
struct kfd_process *p, v
return -EINVAL;
}
 
+   pid = find_get_pid(args->pid);
+   if (!pid) {
+   pr_debug("Cannot find pid info for %i\n", args->pid);
+   r = -ESRCH;
+   goto out;
+   }
+
+   thread = get_pid_task(pid, PIDTYPE_PID);
+   if (!thread) {
+   r = -ESRCH;
+   goto out;
+   }
+
+   mm = get_task_mm(thread);
+   if (!mm) {
+   r = -ESRCH;
+   goto out;
+   }
+
+   if (args->op == KFD_IOC_DBG_TRAP_ENABLE) {
+   bool create_process;
+
+   rcu_read_lock();
+   create_process = thread && thread != current && 
ptrace_parent(thread) == current;
+   rcu_read_unlock();
+
+   target = create_process ? kfd_create_process(thread) :
+   kfd_lookup_process_by_pid(pid);
+   } els

[PATCH 01/33] drm/amdkfd: add debug and runtime enable interface

2023-05-25 Thread Jonathan Kim

Introduce the GPU debug operations interface.

For ROCm-GDB to extend the GNU Debugger's ability to inspect the AMD GPU
instruction set, provide the necessary interface to allow the debugger
to set the HW debug mode and query exceptions per HSA queue, process or
device.

The runtime_enable interface coordinates exception handling with the
HSA runtime.

Usage is available in the kernel docs at uapi/linux/kfd_ioctl.h.

v2: add num_xcc to device snapshot entry.
fixup missing EC_QUEUE_PACKET_RESERVED mask.

Signed-off-by: Jonathan Kim 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |  48 ++
 include/uapi/linux/kfd_ioctl.h   | 668 ++-
 2 files changed, 715 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 88fe1f31739d..f4b50b74818e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2729,6 +2729,48 @@ static int kfd_ioctl_criu(struct file *filep, struct 
kfd_process *p, void *data)
return ret;
 }
 
+static int kfd_ioctl_runtime_enable(struct file *filep, struct kfd_process *p, 
void *data)
+{
+   return 0;
+}
+
+static int kfd_ioctl_set_debug_trap(struct file *filep, struct kfd_process *p, 
void *data)
+{
+   struct kfd_ioctl_dbg_trap_args *args = data;
+   int r = 0;
+
+   if (sched_policy == KFD_SCHED_POLICY_NO_HWS) {
+   pr_err("Debugging does not support sched_policy %i", 
sched_policy);
+   return -EINVAL;
+   }
+
+   switch (args->op) {
+   case KFD_IOC_DBG_TRAP_ENABLE:
+   case KFD_IOC_DBG_TRAP_DISABLE:
+   case KFD_IOC_DBG_TRAP_SEND_RUNTIME_EVENT:
+   case KFD_IOC_DBG_TRAP_SET_EXCEPTIONS_ENABLED:
+   case KFD_IOC_DBG_TRAP_SET_WAVE_LAUNCH_OVERRIDE:
+   case KFD_IOC_DBG_TRAP_SET_WAVE_LAUNCH_MODE:
+   case KFD_IOC_DBG_TRAP_SUSPEND_QUEUES:
+   case KFD_IOC_DBG_TRAP_RESUME_QUEUES:
+   case KFD_IOC_DBG_TRAP_SET_NODE_ADDRESS_WATCH:
+   case KFD_IOC_DBG_TRAP_CLEAR_NODE_ADDRESS_WATCH:
+   case KFD_IOC_DBG_TRAP_SET_FLAGS:
+   case KFD_IOC_DBG_TRAP_QUERY_DEBUG_EVENT:
+   case KFD_IOC_DBG_TRAP_QUERY_EXCEPTION_INFO:
+   case KFD_IOC_DBG_TRAP_GET_QUEUE_SNAPSHOT:
+   case KFD_IOC_DBG_TRAP_GET_DEVICE_SNAPSHOT:
+   pr_warn("Debugging not supported yet\n");
+   r = -EACCES;
+   break;
+   default:
+   pr_err("Invalid option: %i\n", args->op);
+   r = -EINVAL;
+   }
+
+   return r;
+}
+
 #define AMDKFD_IOCTL_DEF(ioctl, _func, _flags) \
[_IOC_NR(ioctl)] = {.cmd = ioctl, .func = _func, .flags = _flags, \
.cmd_drv = 0, .name = #ioctl}
@@ -2841,6 +2883,12 @@ static const struct amdkfd_ioctl_desc amdkfd_ioctls[] = {
 
AMDKFD_IOCTL_DEF(AMDKFD_IOC_EXPORT_DMABUF,
kfd_ioctl_export_dmabuf, 0),
+
+   AMDKFD_IOCTL_DEF(AMDKFD_IOC_RUNTIME_ENABLE,
+   kfd_ioctl_runtime_enable, 0),
+
+   AMDKFD_IOCTL_DEF(AMDKFD_IOC_DBG_TRAP,
+   kfd_ioctl_set_debug_trap, 0),
 };
 
 #define AMDKFD_CORE_IOCTL_COUNT ARRAY_SIZE(amdkfd_ioctls)
diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index 2a9671e1ddb5..dfe745ee427e 100644
--- a/include/uapi/linux/kfd_ioctl.h
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -110,6 +110,32 @@ struct kfd_ioctl_get_available_memory_args {
__u32 pad;
 };
 
+struct kfd_dbg_device_info_entry {
+   __u64 exception_status;
+   __u64 lds_base;
+   __u64 lds_limit;
+   __u64 scratch_base;
+   __u64 scratch_limit;
+   __u64 gpuvm_base;
+   __u64 gpuvm_limit;
+   __u32 gpu_id;
+   __u32 location_id;
+   __u32 vendor_id;
+   __u32 device_id;
+   __u32 revision_id;
+   __u32 subsystem_vendor_id;
+   __u32 subsystem_device_id;
+   __u32 fw_version;
+   __u32 gfx_target_version;
+   __u32 simd_count;
+   __u32 max_waves_per_simd;
+   __u32 array_count;
+   __u32 simd_arrays_per_engine;
+   __u32 num_xcc;
+   __u32 capability;
+   __u32 debug_prop;
+};
+
 /* For kfd_ioctl_set_memory_policy_args.default_policy and alternate_policy */
 #define KFD_IOC_CACHE_POLICY_COHERENT 0
 #define KFD_IOC_CACHE_POLICY_NONCOHERENT 1
@@ -775,6 +801,640 @@ struct kfd_ioctl_set_xnack_mode_args {
__s32 xnack_enabled;
 };
 
+/* Wave launch override modes */
+enum kfd_dbg_trap_override_mode {
+   KFD_DBG_TRAP_OVERRIDE_OR = 0,
+   KFD_DBG_TRAP_OVERRIDE_REPLACE = 1
+};
+
+/* Wave launch overrides */
+enum kfd_dbg_trap_mask {
+   KFD_DBG_TRAP_MASK_FP_INVALID = 1,
+   KFD_DBG_TRAP_MASK_FP_INPUT_DENORMAL = 2,
+   KFD_DBG_TRAP_MASK_FP_DIVIDE_BY_ZERO = 4,
+   KFD_DBG_TRAP_MASK_FP_OVERFLOW = 8,
+   KFD_DBG_TRAP_MASK_FP_UNDERFLOW = 16,
+   KFD_DBG_TRAP_MASK_FP_INEXACT = 32,
+   KFD_DBG_TRAP_MASK_INT_DIVIDE_BY_ZERO = 64,
+   KFD_DBG_TR

[PATCH 02/33] drm/amdkfd: display debug capabilities

2023-05-25 Thread Jonathan Kim

Expose debug capabilities in the KFD topology node's HSA capabilities and
debug properties flags.

Ensure correct capabilities are exposed based on firmware support.

Flag definitions can be referenced in uapi/linux/kfd_sysfs.h.
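
The firmware gating can be pictured with the small sketch below. It covers
only a subset of the cases in the diff (the 9.4.1 and 9.4.2 thresholds and
the two unsupported 10.x parts) and leaves out the GFX11 MES version check.
EX_IP_VERSION is assumed to pack major/minor/revision the way the driver's
IP_VERSION macro does; all ex_ names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed packing: major in bits 16+, minor in bits 8-15, revision in 0-7. */
#define EX_IP_VERSION(mj, mn, rv) (((mj) << 16) | ((mn) << 8) | (rv))

/*
 * Return true when the MEC firmware is new enough for exception handling.
 * Versions not listed are assumed to be supported, as in the patch.
 */
bool ex_dbg_firmware_supported(uint32_t gc_version, uint32_t mec_fw_version)
{
	switch (gc_version) {
	case EX_IP_VERSION(9, 4, 1):
		return mec_fw_version >= 60;
	case EX_IP_VERSION(9, 4, 2):
		return mec_fw_version >= 51;
	case EX_IP_VERSION(10, 1, 3):
	case EX_IP_VERSION(10, 3, 3):
		/* No firmware support on these parts. */
		return false;
	default:
		return true;
	}
}
```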

v2: rebase topology fw check fix with kfd_node struct update

Signed-off-by: Jonathan Kim 
---
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 101 --
 drivers/gpu/drm/amd/amdkfd/kfd_topology.h |   6 ++
 include/uapi/linux/kfd_sysfs.h|  15 
 3 files changed, 117 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index 8302d8967158..3def25b2bdbb 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -535,6 +535,8 @@ static ssize_t node_show(struct kobject *kobj, struct 
attribute *attr,
  dev->gpu->kfd->mec_fw_version);
sysfs_show_32bit_prop(buffer, offs, "capability",
  dev->node_props.capability);
+   sysfs_show_64bit_prop(buffer, offs, "debug_prop",
+ dev->node_props.debug_prop);
sysfs_show_32bit_prop(buffer, offs, "sdma_fw_version",
  dev->gpu->kfd->sdma_fw_version);
sysfs_show_64bit_prop(buffer, offs, "unique_id",
@@ -1857,6 +1859,97 @@ static int kfd_topology_add_device_locked(struct 
kfd_node *gpu, uint32_t gpu_id,
return res;
 }
 
+static void kfd_topology_set_dbg_firmware_support(struct kfd_topology_device 
*dev)
+{
+   bool firmware_supported = true;
+
+   if (KFD_GC_VERSION(dev->gpu) >= IP_VERSION(11, 0, 0) &&
+   KFD_GC_VERSION(dev->gpu) < IP_VERSION(12, 0, 0)) {
+   firmware_supported =
+   (dev->gpu->adev->mes.sched_version & 
AMDGPU_MES_VERSION_MASK) >= 9;
+   goto out;
+   }
+
+   /*
+* Note: Any unlisted devices here are assumed to support exception 
handling.
+* Add additional checks here as needed.
+*/
+   switch (KFD_GC_VERSION(dev->gpu)) {
+   case IP_VERSION(9, 0, 1):
+   firmware_supported = dev->gpu->kfd->mec_fw_version >= 459 + 
32768;
+   break;
+   case IP_VERSION(9, 1, 0):
+   case IP_VERSION(9, 2, 1):
+   case IP_VERSION(9, 2, 2):
+   case IP_VERSION(9, 3, 0):
+   case IP_VERSION(9, 4, 0):
+   firmware_supported = dev->gpu->kfd->mec_fw_version >= 459;
+   break;
+   case IP_VERSION(9, 4, 1):
+   firmware_supported = dev->gpu->kfd->mec_fw_version >= 60;
+   break;
+   case IP_VERSION(9, 4, 2):
+   firmware_supported = dev->gpu->kfd->mec_fw_version >= 51;
+   break;
+   case IP_VERSION(10, 1, 10):
+   case IP_VERSION(10, 1, 2):
+   case IP_VERSION(10, 1, 1):
+   firmware_supported = dev->gpu->kfd->mec_fw_version >= 144;
+   break;
+   case IP_VERSION(10, 3, 0):
+   case IP_VERSION(10, 3, 2):
+   case IP_VERSION(10, 3, 1):
+   case IP_VERSION(10, 3, 4):
+   case IP_VERSION(10, 3, 5):
+   firmware_supported = dev->gpu->kfd->mec_fw_version >= 89;
+   break;
+   case IP_VERSION(10, 1, 3):
+   case IP_VERSION(10, 3, 3):
+   firmware_supported = false;
+   break;
+   default:
+   break;
+   }
+
+out:
+   if (firmware_supported)
+   dev->node_props.capability |= 
HSA_CAP_TRAP_DEBUG_FIRMWARE_SUPPORTED;
+}
+
+static void kfd_topology_set_capabilities(struct kfd_topology_device *dev)
+{
+   dev->node_props.capability |= ((HSA_CAP_DOORBELL_TYPE_2_0 <<
+   HSA_CAP_DOORBELL_TYPE_TOTALBITS_SHIFT) &
+   HSA_CAP_DOORBELL_TYPE_TOTALBITS_MASK);
+
+   dev->node_props.capability |= HSA_CAP_TRAP_DEBUG_SUPPORT |
+   HSA_CAP_TRAP_DEBUG_WAVE_LAUNCH_TRAP_OVERRIDE_SUPPORTED |
+   HSA_CAP_TRAP_DEBUG_WAVE_LAUNCH_MODE_SUPPORTED;
+
+   if (KFD_GC_VERSION(dev->gpu) < IP_VERSION(10, 0, 0)) {
+   dev->node_props.debug_prop |= 
HSA_DBG_WATCH_ADDR_MASK_LO_BIT_GFX9 |
+   HSA_DBG_WATCH_ADDR_MASK_HI_BIT;
+
+   if (KFD_GC_VERSION(dev->gpu) < IP_VERSION(9, 4, 2))
+   dev->node_props.debug_prop |=
+   HSA_DBG_DISPATCH_INFO_ALWAYS_VALID;
+   else
+   dev->node_props.capability |=
+   
HSA_CAP_TRAP_DEBUG_PRECISE_MEMORY_OPERATIONS_SUPPORTED;
+   } else {
+   dev->node_props.debug_prop |= 
HSA_DBG_WATCH_ADDR_MASK_LO_BIT_GFX10 |
+   HSA_DBG_WATCH_ADDR_MASK_HI_BIT;
+
+   if (KFD_GC_VERSION(dev->gpu) < IP_VERSION(11,

[PATCH] drm/amdgpu: Fix up kdoc in sdma_v4_4_2.c

2023-05-25 Thread Srinivasan Shanmugam

Address a bunch of kdoc warnings:

gcc with W=1
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:426: warning: Function parameter or 
member 'inst_mask' not described in 'sdma_v4_4_2_inst_gfx_stop'
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:457: warning: Function parameter or 
member 'inst_mask' not described in 'sdma_v4_4_2_inst_rlc_stop'
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:470: warning: Function parameter or 
member 'inst_mask' not described in 'sdma_v4_4_2_inst_page_stop'
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:506: warning: Function parameter or 
member 'inst_mask' not described in 'sdma_v4_4_2_inst_ctx_switch_enable'
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:794: warning: Function parameter or 
member 'inst_mask' not described in 'sdma_v4_4_2_inst_rlc_resume'
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:810: warning: Function parameter or 
member 'inst_mask' not described in 'sdma_v4_4_2_inst_load_microcode'
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:854: warning: Function parameter or 
member 'inst_mask' not described in 'sdma_v4_4_2_inst_start'
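
For reference, a kernel-doc comment that would not trigger this warning
describes every parameter with an @name line. The function below is a
hypothetical stand-alone example, not code from sdma_v4_4_2.c.

```c
#include <stdint.h>

/**
 * ex_inst_count - count the engine instances selected by a mask
 *
 * @inst_mask: mask of engine instances to count
 *
 * Returns the number of bits set in @inst_mask.
 */
unsigned int ex_inst_count(uint32_t inst_mask)
{
	unsigned int n = 0;

	/* Clear the lowest set bit on each pass. */
	while (inst_mask) {
		inst_mask &= inst_mask - 1;
		n++;
	}

	return n;
}
```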

Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
index ff41fb577cdd..8eebf9c2bbcd 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
@@ -418,6 +418,7 @@ static void sdma_v4_4_2_ring_emit_fence(struct amdgpu_ring 
*ring, u64 addr, u64
  * sdma_v4_4_2_inst_gfx_stop - stop the gfx async dma engines
  *
  * @adev: amdgpu_device pointer
+ * @inst_mask: mask of dma engine instances to be disabled
  *
  * Stop the gfx async dma ring buffers.
  */
@@ -449,6 +450,7 @@ static void sdma_v4_4_2_inst_gfx_stop(struct amdgpu_device 
*adev,
  * sdma_v4_4_2_inst_rlc_stop - stop the compute async dma engines
  *
  * @adev: amdgpu_device pointer
+ * @inst_mask: mask of dma engine instances to be disabled
  *
  * Stop the compute async dma queues.
  */
@@ -462,6 +464,7 @@ static void sdma_v4_4_2_inst_rlc_stop(struct amdgpu_device 
*adev,
  * sdma_v4_4_2_inst_page_stop - stop the page async dma engines
  *
  * @adev: amdgpu_device pointer
+ * @inst_mask: mask of dma engine instances to be disabled
  *
  * Stop the page async dma ring buffers.
  */
@@ -498,6 +501,7 @@ static void sdma_v4_4_2_inst_page_stop(struct amdgpu_device 
*adev,
  *
  * @adev: amdgpu_device pointer
  * @enable: enable/disable the DMA MEs context switch.
+ * @inst_mask: mask of dma engine instances to be enabled
  *
  * Halt or unhalt the async dma engines context switch.
  */
@@ -785,6 +789,7 @@ static void sdma_v4_4_2_init_pg(struct amdgpu_device *adev)
  * sdma_v4_4_2_inst_rlc_resume - setup and start the async dma engines
  *
  * @adev: amdgpu_device pointer
+ * @inst_mask: mask of dma engine instances to be enabled
  *
  * Set up the compute DMA queues and enable them.
  * Returns 0 for success, error for failure.
@@ -801,6 +806,7 @@ static int sdma_v4_4_2_inst_rlc_resume(struct amdgpu_device 
*adev,
  * sdma_v4_4_2_inst_load_microcode - load the sDMA ME ucode
  *
  * @adev: amdgpu_device pointer
+ * @inst_mask: mask of dma engine instances to be enabled
  *
  * Loads the sDMA0/1 ucode.
  * Returns 0 for success, -EINVAL if the ucode is not available.
@@ -845,6 +851,7 @@ static int sdma_v4_4_2_inst_load_microcode(struct 
amdgpu_device *adev,
  * sdma_v4_4_2_inst_start - setup and start the async dma engines
  *
  * @adev: amdgpu_device pointer
+ * @inst_mask: mask of dma engine instances to be enabled
  *
  * Set up the DMA engines and enable them.
  * Returns 0 for success, error for failure.
-- 
2.25.1

Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused

2023-05-25 Thread Nathan Chancellor

On Thu, May 25, 2023 at 12:45:13PM -0400, Luben Tuikov wrote:
> On 2023-05-25 12:29, Nathan Chancellor wrote:
> > On Thu, May 25, 2023 at 12:26:56PM -0400, Luben Tuikov wrote:
> >> On 2023-05-25 11:22, Nathan Chancellor wrote:
> >>> On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote:
> >>>> Silencing the compiler from below compilation error:
> >>>>
> >>>> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c:704:23: error: variable 
> >>>> 'mmhub_v1_8_mmea_err_status_reg' is not needed and will not be emitted 
> >>>> [-Werror,-Wunneeded-internal-declaration]
> >>>> static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = {
> >>>>                       ^
> >>>> 1 error generated.
> >>>>
> >>>> Mark the variable as __maybe_unused to make it clear to clang that this
> >>>> is expected, so there is no more warning.
> >>>>
> >>>> Cc: Christian König 
> >>>> Cc: Lijo Lazar 
> >>>> Cc: Luben Tuikov 
> >>>> Cc: Alex Deucher 
> >>>> Signed-off-by: Srinivasan Shanmugam 
> >>>
> >>> Traditionally, this attribute would go between the [] and =, but that is
> >>> a nit. Can someone please pick this up to unblock our builds on -next?
> >>>
> >>> Reviewed-by: Nathan Chancellor 
> >>
> >> I'll pick this up, fix it, and submit to amd-staging-drm-next.
> > 
> > Thanks a lot :)
> > 
> >> Which -next are you referring to, Nathan?
> > 
> > linux-next, this warning breaks the build when -Werror is enabled, such
> > as with allmodconfig:
> > 
> > https://storage.tuxsuite.com/public/clangbuiltlinux/continuous-integration2/builds/2QHtlCTz2JL0yXNpRB5hVmiP9lq/build.log
> > 
> 
> Hi Nathan,
> 
> Thanks for the pointers.
> 
> Srinivasan has already submitted it to amd-staging-drm-next.
> 
> Seems Alex will push it upstream.
> 
> Not sure how fast you need it, we can send you the commit itself
> for you to git-am if you cannot wait.

Thanks for that extra info. We can just wait for the patch to end up in
-next naturally, we try to avoid applying extra patches when possible.

Cheers,
Nathan

Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused

2023-05-25 Thread Nathan Chancellor

On Thu, May 25, 2023 at 12:42:05PM -0400, Alex Deucher wrote:
> On Thu, May 25, 2023 at 12:29 PM Nathan Chancellor  wrote:
> >
> > On Thu, May 25, 2023 at 12:26:56PM -0400, Luben Tuikov wrote:
> > > On 2023-05-25 11:22, Nathan Chancellor wrote:
> > > > On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote:
> > > >> Silencing the compiler from below compilation error:
> > > >>
> > > >> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c:704:23: error: variable 
> > > >> 'mmhub_v1_8_mmea_err_status_reg' is not needed and will not be emitted 
> > > >> [-Werror,-Wunneeded-internal-declaration]
> > > >> static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = {
> > > >>   ^
> > > >> 1 error generated.
> > > >>
> > > >> Mark the variable as __maybe_unused to make it clear to clang that this
> > > >> is expected, so there is no more warning.
> > > >>
> > > >> Cc: Christian König 
> > > >> Cc: Lijo Lazar 
> > > >> Cc: Luben Tuikov 
> > > >> Cc: Alex Deucher 
> > > >> Signed-off-by: Srinivasan Shanmugam 
> > > >
> > > > Traditionally, this attribute would go between the [] and =, but that is
> > > > a nit. Can someone please pick this up to unblock our builds on -next?
> > > >
> > > > Reviewed-by: Nathan Chancellor 
> > >
> > > I'll pick this up, fix it, and submit to amd-staging-drm-next.
> >
> > Thanks a lot :)
> >
> > > Which -next are you referring to, Nathan?
> >
> > linux-next, this warning breaks the build when -Werror is enabled, such
> > as with allmodconfig:
> >
> > https://storage.tuxsuite.com/public/clangbuiltlinux/continuous-integration2/builds/2QHtlCTz2JL0yXNpRB5hVmiP9lq/build.log
> >
> 
> Srinivasan has already pushed it.  I'll push it out once CI has
> completed.  We are trying to figure out the best way to enable -WERROR
> in our CI system as it is almost always broken depending on what
> compiler you are using.  Also, I'm not sure fixing these is always
> better.  A lot of these warnings seem spurious and in a lot of cases
> the "fix" doesn't really improve the code, it just silences a warning.
> As one of my coworkers put it, there is a reason warnings are not
> errors.

I do not necessarily disagree with that final sentiment but at the end
of the day, it is pointing out a potential problem ("this variable is
only used in a compile time context, is that what you intended or not?")
and the solution is either to fix the code so that it works as initially
intended or you silence the warning because you know it is not actually
a problem. There are always going to be false positives, otherwise they
would just always be hard errors, but that does not mean that they are
not worth listening to, which is why Linus insists on -Werror being a
thing. We can opt out of -Werror for our CI but that does not change the
fact it is default enabled with allmodconfig, so that is how most people
will test.
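
To make the "compile time context" case concrete, here is a minimal sketch
of the pattern clang flags: a static array that is only ever referenced
inside sizeof, so nothing needs to be emitted for it. EX_MAYBE_UNUSED is
assumed to be equivalent to the kernel's __maybe_unused on GCC and clang,
and the register values are placeholders.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed equivalent of the kernel's __maybe_unused. */
#define EX_MAYBE_UNUSED __attribute__((__unused__))
#define EX_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Placeholder register offsets, not real hardware values. */
static const uint32_t ex_err_status_reg[] EX_MAYBE_UNUSED = {
	0x10, 0x20, 0x30,
};

/*
 * The array is used only through sizeof, an unevaluated context, which is
 * the situation -Wunneeded-internal-declaration reports.
 */
size_t ex_err_status_reg_count(void)
{
	return EX_ARRAY_SIZE(ex_err_status_reg);
}
```

The attribute sits between the [] and the = here, which is the placement
mentioned earlier in the thread.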

Cheers,
Nathan

Re: [PATCH] drm/amdgpu: Fix up missing kdoc in sdma_v6_0.c

2023-05-25 Thread Alex Deucher

Reviewed-by: Alex Deucher 

On Thu, May 25, 2023 at 12:15 PM Srinivasan Shanmugam
 wrote:
>
> Address a bunch of kdoc warnings:
>
> gcc with W=1
> drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:248: warning: Function parameter or 
> member 'job' not described in 'sdma_v6_0_ring_emit_ib'
> drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:248: warning: Function parameter or 
> member 'flags' not described in 'sdma_v6_0_ring_emit_ib'
> drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:946: warning: Function parameter or 
> member 'timeout' not described in 'sdma_v6_0_ring_test_ib'
> drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:1125: warning: Function parameter or 
> member 'ring' not described in 'sdma_v6_0_ring_pad_ib'
> drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:1176: warning: Function parameter or 
> member 'vmid' not described in 'sdma_v6_0_ring_emit_vm_flush'
> drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:1176: warning: Function parameter or 
> member 'pd_addr' not described in 'sdma_v6_0_ring_emit_vm_flush'
>
> Cc: Christian König 
> Cc: Alex Deucher 
> Signed-off-by: Srinivasan Shanmugam 
> ---
>  drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c 
> b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c
> index 1c90b5c661fb..967849c59ebe 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c
> @@ -238,6 +238,8 @@ static void sdma_v6_0_ring_insert_nop(struct amdgpu_ring 
> *ring, uint32_t count)
>   *
>   * @ring: amdgpu ring pointer
>   * @ib: IB object to schedule
> + * @flags: unused
> + * @job: job to retrieve vmid from
>   *
>   * Schedule an IB in the DMA ring.
>   */
> @@ -938,6 +940,7 @@ static int sdma_v6_0_ring_test_ring(struct amdgpu_ring 
> *ring)
>   * sdma_v6_0_ring_test_ib - test an IB on the DMA engine
>   *
>   * @ring: amdgpu_ring structure holding ring information
> + * @timeout: timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT
>   *
>   * Test a simple IB in the DMA ring.
>   * Returns 0 on success, error on failure.
> @@ -1167,6 +1170,8 @@ static void sdma_v6_0_ring_emit_pipeline_sync(struct 
> amdgpu_ring *ring)
>   * sdma_v6_0_ring_emit_vm_flush - vm flush using sDMA
>   *
>   * @ring: amdgpu_ring pointer
> + * @vmid: vmid number to use
> + * @pd_addr: address
>   *
>   * Update the page table base and flush the VM TLB
>   * using sDMA.
> --
> 2.25.1
>

[PATCH 1/3] drm/amdgpu: add cached GPU fault structure to vm struct

2023-05-25 Thread Alex Deucher

When we get a GPU page fault, cache the fault for later
analysis.
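
A user-space sketch of the idea, assuming a small fixed table in place of
the driver's PASID xarray and with locking left out (the patch itself takes
the xarray lock with interrupts disabled). All ex_ names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct ex_fault_info {
	uint64_t addr;		/* fault address */
	uint32_t status;	/* fault status register */
	unsigned int vmhub;	/* which hub reported the fault */
};

struct ex_vm {
	unsigned int pasid;
	int in_use;
	struct ex_fault_info fault_info;
};

#define EX_MAX_VMS 4
static struct ex_vm ex_vms[EX_MAX_VMS];

/* Find the VM registered for a PASID, or NULL if there is none. */
struct ex_vm *ex_vm_lookup(unsigned int pasid)
{
	size_t i;

	for (i = 0; i < EX_MAX_VMS; i++)
		if (ex_vms[i].in_use && ex_vms[i].pasid == pasid)
			return &ex_vms[i];
	return NULL;
}

/* Record the most recent fault; faults for unknown PASIDs are dropped. */
void ex_update_fault_cache(unsigned int pasid, uint64_t addr,
			   uint32_t status, unsigned int vmhub)
{
	struct ex_vm *vm = ex_vm_lookup(pasid);

	if (!vm)
		return;

	vm->fault_info.addr = addr;
	vm->fault_info.status = status;
	vm->fault_info.vmhub = vmhub;
}
```

Only the latest fault is kept per VM, so a later fault overwrites an
earlier one.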

Cc: samuel.pitoi...@gmail.com
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 31 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 18 +++
 2 files changed, 49 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 22f9a65ca0fc..73e022f3daa4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2631,3 +2631,34 @@ void amdgpu_debugfs_vm_bo_info(struct amdgpu_vm *vm, 
struct seq_file *m)
   total_done_objs);
 }
 #endif
+
+/**
+ * amdgpu_vm_update_fault_cache - update cached fault info.
+ * @adev: amdgpu device pointer
+ * @pasid: PASID of the VM
+ * @addr: Address of the fault
+ * @status: fault status register
+ * @vmhub: which vmhub got the fault
+ *
+ * Cache the fault info for later use by userspace in debugging.
+ */
+void amdgpu_vm_update_fault_cache(struct amdgpu_device *adev,
+ unsigned int pasid,
+ uint64_t addr,
+ uint32_t status,
+ unsigned int vmhub)
+{
+   struct amdgpu_vm *vm;
+   unsigned long flags;
+
+   xa_lock_irqsave(&adev->vm_manager.pasids, flags);
+
+   vm = xa_load(&adev->vm_manager.pasids, pasid);
+   if (vm) {
+   vm->fault_info.addr = addr;
+   vm->fault_info.status = status;
+   vm->fault_info.vmhub = vmhub;
+   }
+   xa_unlock_irqrestore(&adev->vm_manager.pasids, flags);
+}
+
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 14f9a2bf3acb..fb66a413110c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -244,6 +244,15 @@ struct amdgpu_vm_update_funcs {
  struct dma_fence **fence);
 };
 
+struct amdgpu_vm_fault_info {
+   /* fault address */
+   uint64_t addr;
+   /* fault status register */
+   uint32_t status;
+   /* which vmhub? gfxhub, mmhub, etc. */
+   unsigned int vmhub;
+};
+
 struct amdgpu_vm {
/* tree of virtual addresses mapped */
struct rb_root_cached   va;
@@ -332,6 +341,9 @@ struct amdgpu_vm {
 
/* Memory partition number, -1 means any partition */
int8_t  mem_id;
+
+   /* cached fault info */
+   struct amdgpu_vm_fault_info fault_info;
 };
 
 struct amdgpu_vm_manager {
@@ -540,4 +552,10 @@ static inline void amdgpu_vm_eviction_unlock(struct 
amdgpu_vm *vm)
mutex_unlock(&vm->eviction_lock);
 }
 
+void amdgpu_vm_update_fault_cache(struct amdgpu_device *adev,
+ unsigned int pasid,
+ uint64_t addr,
+ uint32_t status,
+ unsigned int vmhub);
+
 #endif
-- 
2.40.1

[PATCH 0/3] Add GPU page fault query interface

2023-05-25 Thread Alex Deucher

This patch set adds support for an application to query GPU
page faults.  It's useful for debugging and there are
vulkan extensions that could make use of this.  Preliminary
user space code which uses this can be found here:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238
https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/298

Note that I made a small change to the vmhub definition to
decouple it from how the kernel tracks vmhubs so that we have
a consistent user view even if we decide to add more vmhubs
like we recently did for gfx 9.4.3.
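
The decoupling can be sketched as a translation from the kernel's flat hub
index to a (type, index) pair. The index ranges follow the layout used in
patch 3/3 (8 GFXHUB, then 4 MMHUB0, then 1 MMHUB1); the type values and the
shift below are assumptions made for illustration, not the uapi values.

```c
#include <stdint.h>

/* Kernel-side flat index layout: 8 GFXHUB + 4 MMHUB0 + 1 MMHUB1. */
#define EX_GFXHUB_START	0
#define EX_MMHUB0_START	8
#define EX_MMHUB1_START	12
#define EX_MAX_VMHUBS	13

/* Assumed user-visible encoding: type in the low byte, index above it. */
#define EX_VMHUB_TYPE_GFX	0u
#define EX_VMHUB_TYPE_MM0	1u
#define EX_VMHUB_TYPE_MM1	2u
#define EX_VMHUB_IDX_SHIFT	8

#define EX_VMHUB_INVALID	UINT32_MAX

/* Translate a kernel hub index into the user-visible encoding. */
uint32_t ex_encode_vmhub(unsigned int vmhub)
{
	if (vmhub < EX_MMHUB0_START)
		return EX_VMHUB_TYPE_GFX |
		       ((vmhub - EX_GFXHUB_START) << EX_VMHUB_IDX_SHIFT);
	if (vmhub < EX_MMHUB1_START)
		return EX_VMHUB_TYPE_MM0 |
		       ((vmhub - EX_MMHUB0_START) << EX_VMHUB_IDX_SHIFT);
	if (vmhub < EX_MAX_VMHUBS)
		return EX_VMHUB_TYPE_MM1 |
		       ((vmhub - EX_MMHUB1_START) << EX_VMHUB_IDX_SHIFT);
	return EX_VMHUB_INVALID;
}
```

With this split, adding more hubs of an existing type changes only the
kernel-side start offsets, not the values user space sees.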

I've also pushed the changes to:
https://gitlab.freedesktop.org/agd5f/linux/-/commits/gpu_fault_info_ioctl


Alex Deucher (3):
  drm/amdgpu: add cached GPU fault structure to vm struct
  drm/amdgpu: cache gpuvm fault information for gmc7+
  drm/amdgpu: add new INFO ioctl query for the last GPU page fault

 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 16 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  | 45 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  | 31 +++--
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  |  3 ++
 drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c  |  3 ++
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c   |  3 ++
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c   |  3 ++
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c   | 11 +++---
 include/uapi/drm/amdgpu_drm.h   | 16 +
 10 files changed, 126 insertions(+), 8 deletions(-)

-- 
2.40.1

[PATCH 3/3] drm/amdgpu: add new INFO ioctl query for the last GPU page fault

2023-05-25 Thread Alex Deucher

Add an interface to query the last GPU page fault for the process.
Useful for debugging context lost errors.
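
The handler's reply logic can be sketched in user space as below: zero a
reply structure, fill it from the cached values, and copy out no more than
the caller asked for. The structure layout and the ex_ names are
assumptions for illustration, and memcpy stands in for copy_to_user.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reply layout: address, status register, encoded hub. */
struct ex_gpuvm_fault {
	uint64_t addr;
	uint32_t status;
	uint32_t vmhub;
};

/*
 * Copy the cached fault into the caller's buffer, truncated to the
 * smaller of the buffer size and the structure size. Returns the number
 * of bytes written.
 */
size_t ex_query_fault(const struct ex_gpuvm_fault *cached, void *out,
		      size_t size)
{
	struct ex_gpuvm_fault reply;
	size_t n = size < sizeof(reply) ? size : sizeof(reply);

	/* Zero first so padding never leaks stale data. */
	memset(&reply, 0, sizeof(reply));
	reply.addr = cached->addr;
	reply.status = cached->status;
	reply.vmhub = cached->vmhub;

	memcpy(out, &reply, n);
	return n;
}
```

Truncating to the caller's size is what allows the structure to grow later
without breaking older user space.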

v2: split vmhub representation between kernel and userspace

Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238
libdrm MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/298

Cc: samuel.pitoi...@gmail.com
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 16 
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  | 16 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  | 13 ++---
 include/uapi/drm/amdgpu_drm.h   | 16 
 5 files changed, 59 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 7300df2a342c..7e17b285decc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -112,9 +112,10 @@
  *gl1c_cache_size, gl2c_cache_size, mall_size, 
enabled_rb_pipes_mask_hi
  *   3.53.0 - Support for GFX11 CP GFX shadowing
  *   3.54.0 - Add AMDGPU_CTX_QUERY2_FLAGS_RESET_IN_PROGRESS support
+ * - 3.55.0 - Add AMDGPU_INFO_GPUVM_FAULT query
  */
 #define KMS_DRIVER_MAJOR   3
-#define KMS_DRIVER_MINOR   54
+#define KMS_DRIVER_MINOR   55
 #define KMS_DRIVER_PATCHLEVEL  0
 
 unsigned int amdgpu_vram_limit = UINT_MAX;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 41d047e5de69..bca2a56046ae 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -1163,6 +1163,22 @@ int amdgpu_info_ioctl(struct drm_device *dev, void 
*data, struct drm_file *filp)
return copy_to_user(out, max_ibs,
min((size_t)size, sizeof(max_ibs))) ? 
-EFAULT : 0;
}
+   case AMDGPU_INFO_GPUVM_FAULT: {
+   struct amdgpu_fpriv *fpriv = filp->driver_priv;
+   struct amdgpu_vm *vm = &fpriv->vm;
+   struct drm_amdgpu_info_gpuvm_fault gpuvm_fault;
+
+   if (!vm)
+   return -EINVAL;
+
+   memset(&gpuvm_fault, 0, sizeof(gpuvm_fault));
+   gpuvm_fault.addr = vm->fault_info.addr;
+   gpuvm_fault.status = vm->fault_info.status;
+   gpuvm_fault.vmhub = vm->fault_info.vmhub;
+
+   return copy_to_user(out, &gpuvm_fault,
+   min((size_t)size, sizeof(gpuvm_fault))) ? 
-EFAULT : 0;
+   }
default:
DRM_DEBUG_KMS("Invalid request %d\n", info->query);
return -EINVAL;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 73e022f3daa4..c1b0c5f3c1f8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2657,7 +2657,21 @@ void amdgpu_vm_update_fault_cache(struct amdgpu_device 
*adev,
if (vm) {
vm->fault_info.addr = addr;
vm->fault_info.status = status;
-   vm->fault_info.vmhub = vmhub;
+   if (AMDGPU_IS_GFXHUB(vmhub)) {
+   vm->fault_info.vmhub = AMDGPU_VMHUB_TYPE_GFX;
+   vm->fault_info.vmhub |=
+   (vmhub - AMDGPU_GFXHUB_START) << 
AMDGPU_VMHUB_IDX_SHIFT;
+   } else if (AMDGPU_IS_MMHUB0(vmhub)) {
+   vm->fault_info.vmhub = AMDGPU_VMHUB_TYPE_MM0;
+   vm->fault_info.vmhub |=
+   (vmhub - AMDGPU_MMHUB0_START) << 
AMDGPU_VMHUB_IDX_SHIFT;
+   } else if (AMDGPU_IS_MMHUB1(vmhub)) {
+   vm->fault_info.vmhub = AMDGPU_VMHUB_TYPE_MM1;
+   vm->fault_info.vmhub |=
+   (vmhub - AMDGPU_MMHUB1_START) << 
AMDGPU_VMHUB_IDX_SHIFT;
+   } else {
+   WARN_ONCE(1, "Invalid vmhub %u\n", vmhub);
+   }
}
xa_unlock_irqrestore(&adev->vm_manager.pasids, flags);
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index fb66a413110c..1a34fea9acb9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -116,9 +116,16 @@ struct amdgpu_mem_stats;
  * layout: max 8 GFXHUB + 4 MMHUB0 + 1 MMHUB1
  */
 #define AMDGPU_MAX_VMHUBS  13
-#define AMDGPU_GFXHUB(x)   (x)
-#define AMDGPU_MMHUB0(x)   (8 + x)
-#define AMDGPU_MMHUB1(x)   (8 + 4 + x)
+#define AMDGPU_GFXHUB_START 0
+#define AMDGPU_MMHUB0_START 8
+#define AMDGPU_MMHUB1_START 12
+#define AMDGPU_GFXHUB(x)   (AMDGPU_GFXHUB_START + (x))
+#define AMDGPU_MMHUB0(x)   (AMDGPU_MMHUB0_START + (x))
+#define AMDGPU_MM

[PATCH 2/3] drm/amdgpu: cache gpuvm fault information for gmc7+

2023-05-25 Thread Alex Deucher

Cache the current fault info in the vm struct.  This can be queried
by userspace later to help debug UMDs.

Cc: samuel.pitoi...@gmail.com
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c |  3 +++
 drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c |  3 +++
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c  |  3 +++
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  |  3 +++
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 11 +++
 5 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index 01bd45651382..5f88db5432b6 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -155,6 +155,9 @@ static int gmc_v10_0_process_interrupt(struct amdgpu_device 
*adev,
 
status = RREG32(hub->vm_l2_pro_fault_status);
WREG32_P(hub->vm_l2_pro_fault_cntl, 1, ~1);
+
+   amdgpu_vm_update_fault_cache(adev, entry->pasid, addr, status,
+entry->vmid_src ? AMDGPU_MMHUB0(0) 
: AMDGPU_GFXHUB(0));
}
 
if (!printk_ratelimit())
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
index 4bf807d825c0..087f1ec3cf54 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
@@ -115,6 +115,9 @@ static int gmc_v11_0_process_interrupt(struct amdgpu_device 
*adev,
 
status = RREG32(hub->vm_l2_pro_fault_status);
WREG32_P(hub->vm_l2_pro_fault_cntl, 1, ~1);
+
+   amdgpu_vm_update_fault_cache(adev, entry->pasid, addr, status,
+entry->vmid_src ? AMDGPU_MMHUB0(0) 
: AMDGPU_GFXHUB(0));
}
 
if (printk_ratelimit()) {
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
index 6f53049619cd..1386e2b2e773 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@@ -1272,6 +1272,9 @@ static int gmc_v7_0_process_interrupt(struct 
amdgpu_device *adev,
if (!addr && !status)
return 0;
 
+   amdgpu_vm_update_fault_cache(adev, entry->pasid,
+((u64)addr) << AMDGPU_GPU_PAGE_SHIFT, 
status, AMDGPU_GFXHUB(0));
+
if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_FIRST)
gmc_v7_0_set_fault_enable_default(adev, false);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index 48475077ca92..6d46390ee9e3 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -1447,6 +1447,9 @@ static int gmc_v8_0_process_interrupt(struct 
amdgpu_device *adev,
if (!addr && !status)
return 0;
 
+   amdgpu_vm_update_fault_cache(adev, entry->pasid,
+((u64)addr) << AMDGPU_GPU_PAGE_SHIFT, 
status, AMDGPU_GFXHUB(0));
+
if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_FIRST)
gmc_v8_0_set_fault_enable_default(adev, false);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 1e8b2aaa48c1..28a66aa377f3 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -555,6 +555,7 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device 
*adev,
struct amdgpu_vmhub *hub;
const char *mmhub_cid;
const char *hub_name;
+   unsigned int vmhub;
u64 addr;
uint32_t cam_index = 0;
int ret, xcc_id = 0;
@@ -567,10 +568,10 @@ static int gmc_v9_0_process_interrupt(struct 
amdgpu_device *adev,
 
if (entry->client_id == SOC15_IH_CLIENTID_VMC) {
hub_name = "mmhub0";
-   hub = &adev->vmhub[AMDGPU_MMHUB0(node_id / 4)];
+   vmhub = AMDGPU_MMHUB0(node_id / 4);
} else if (entry->client_id == SOC15_IH_CLIENTID_VMC1) {
hub_name = "mmhub1";
-   hub = &adev->vmhub[AMDGPU_MMHUB1(0)];
+   vmhub = AMDGPU_MMHUB1(0);
} else {
hub_name = "gfxhub0";
if (adev->gfx.funcs->ih_node_to_logical_xcc) {
@@ -579,8 +580,9 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device 
*adev,
if (xcc_id < 0)
xcc_id = 0;
}
-   hub = &adev->vmhub[xcc_id];
+   vmhub = xcc_id;
}
+   hub = &adev->vmhub[vmhub];
 
if (retry_fault) {
if (adev->irq.retry_cam_enabled) {
@@ -626,7 +628,6 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device 
*adev,
if (!printk_ratelimit())
return 0;
 
-
memset(&task_info, 0, sizeof(struct amdgpu_task_info));
amdgpu_vm_get_task_info(adev, entry->pasid, &task_info);
 
@@ -663,6 +664,8 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device

Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused

2023-05-25 Thread Luben Tuikov

On 2023-05-25 12:29, Nathan Chancellor wrote:
> On Thu, May 25, 2023 at 12:26:56PM -0400, Luben Tuikov wrote:
>> On 2023-05-25 11:22, Nathan Chancellor wrote:
>>> On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote:
>>>> Silencing the compiler from below compilation error:
>>>>
>>>> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c:704:23: error: variable 
>>>> 'mmhub_v1_8_mmea_err_status_reg' is not needed and will not be emitted 
>>>> [-Werror,-Wunneeded-internal-declaration]
>>>> static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = {
>>>>                       ^
>>>> 1 error generated.
>>>>
>>>> Mark the variable as __maybe_unused to make it clear to clang that this
>>>> is expected, so there is no more warning.
>>>>
>>>> Cc: Christian König 
>>>> Cc: Lijo Lazar 
>>>> Cc: Luben Tuikov 
>>>> Cc: Alex Deucher 
>>>> Signed-off-by: Srinivasan Shanmugam 
>>>
>>> Traditionally, this attribute would go between the [] and =, but that is
>>> a nit. Can someone please pick this up to unblock our builds on -next?
>>>
>>> Reviewed-by: Nathan Chancellor 
>>
>> I'll pick this up, fix it, and submit to amd-staging-drm-next.
> 
> Thanks a lot :)
> 
>> Which -next are you referring to, Nathan?
> 
> linux-next, this warning breaks the build when -Werror is enabled, such
> as with allmodconfig:
> 
> https://storage.tuxsuite.com/public/clangbuiltlinux/continuous-integration2/builds/2QHtlCTz2JL0yXNpRB5hVmiP9lq/build.log
> 

Hi Nathan,

Thanks for the pointers.

Srinivasan has already submitted it to amd-staging-drm-next.

Seems Alex will push it upstream.

Not sure how fast you need it; we can send you the commit itself
for you to git-am if you cannot wait.

Regards,
Luben

> Cheers,
> Nathan
> 
>>>> ---
>>>>  drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 1 +
>>>>  1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
>>>> index 3648994724c2..cba087e529c0 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
>>>> @@ -701,6 +701,7 @@ static void mmhub_v1_8_reset_ras_error_count(struct 
>>>> amdgpu_device *adev)
>>>>  mmhub_v1_8_inst_reset_ras_error_count(adev, i);
>>>>  }
>>>>  
>>>> +__maybe_unused
>>>>  static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = {
>>>>  regMMEA0_ERR_STATUS,
>>>>  regMMEA1_ERR_STATUS,
>>>> -- 
>>>> 2.25.1
>>>>
>>

Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused

2023-05-25 Thread Alex Deucher

On Thu, May 25, 2023 at 12:29 PM Nathan Chancellor  wrote:
>
> On Thu, May 25, 2023 at 12:26:56PM -0400, Luben Tuikov wrote:
> > On 2023-05-25 11:22, Nathan Chancellor wrote:
> > > On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote:
> > >> Silencing the compiler from below compilation error:
> > >>
> > >> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c:704:23: error: variable 
> > >> 'mmhub_v1_8_mmea_err_status_reg' is not needed and will not be emitted 
> > >> [-Werror,-Wunneeded-internal-declaration]
> > >> static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = {
> > >>   ^
> > >> 1 error generated.
> > >>
> > >> Mark the variable as __maybe_unused to make it clear to clang that this
> > >> is expected, so there is no more warning.
> > >>
> > >> Cc: Christian König 
> > >> Cc: Lijo Lazar 
> > >> Cc: Luben Tuikov 
> > >> Cc: Alex Deucher 
> > >> Signed-off-by: Srinivasan Shanmugam 
> > >
> > > Traditionally, this attribute would go between the [] and =, but that is
> > > a nit. Can someone please pick this up to unblock our builds on -next?
> > >
> > > Reviewed-by: Nathan Chancellor 
> >
> > I'll pick this up, fix it, and submit to amd-staging-drm-next.
>
> Thanks a lot :)
>
> > Which -next are you referring to, Nathan?
>
> linux-next, this warning breaks the build when -Werror is enabled, such
> as with allmodconfig:
>
> https://storage.tuxsuite.com/public/clangbuiltlinux/continuous-integration2/builds/2QHtlCTz2JL0yXNpRB5hVmiP9lq/build.log
>

Srinivasan has already pushed it.  I'll push it out once CI has
completed.  We are trying to figure out the best way to enable -Werror
in our CI system as it is almost always broken depending on what
compiler you are using.  Also, I'm not sure fixing these is always
better.  A lot of these warnings seem spurious and in a lot of cases
the "fix" doesn't really improve the code, it just silences a warning.
As one of my coworkers put it, there is a reason warnings are not
errors.

Alex


> Cheers,
> Nathan
>
> > >> ---
> > >>  drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 1 +
> > >>  1 file changed, 1 insertion(+)
> > >>
> > >> diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c 
> > >> b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
> > >> index 3648994724c2..cba087e529c0 100644
> > >> --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
> > >> +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
> > >> @@ -701,6 +701,7 @@ static void mmhub_v1_8_reset_ras_error_count(struct 
> > >> amdgpu_device *adev)
> > >>mmhub_v1_8_inst_reset_ras_error_count(adev, i);
> > >>  }
> > >>
> > >> +__maybe_unused
> > >>  static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = {
> > >>regMMEA0_ERR_STATUS,
> > >>regMMEA1_ERR_STATUS,
> > >> --
> > >> 2.25.1
> > >>
> >

[PATCH] drm/amd/amdgpu: introduce DRM_AMDGPU_WERROR

2023-05-25 Thread Hamza Mahfooz

We want to do -Werror builds on our CI. However, non-amdgpu breakages
have prevented us from doing so thus far. Also, there are a number of
additional checks that we should enable, that the community cares about
and are hidden behind -Wextra. So, define DRM_AMDGPU_WERROR to only
enable -Werror for the amdgpu kernel module and enable -Wextra while
disabling all of the checks that are too noisy.

Cc: Alex Deucher 
Cc: Kenny Ho 
Suggested-by: Jani Nikula 
Signed-off-by: Hamza Mahfooz 
---
 drivers/gpu/drm/amd/amdgpu/Kconfig  | 10 ++
 drivers/gpu/drm/amd/amdgpu/Makefile |  9 +
 2 files changed, 19 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/Kconfig 
b/drivers/gpu/drm/amd/amdgpu/Kconfig
index 07135ffa6d24..334511f331e3 100644
--- a/drivers/gpu/drm/amd/amdgpu/Kconfig
+++ b/drivers/gpu/drm/amd/amdgpu/Kconfig
@@ -66,6 +66,16 @@ config DRM_AMDGPU_USERPTR
  This option selects CONFIG_HMM and CONFIG_HMM_MIRROR if it
  isn't already selected to enabled full userptr support.
 
+config DRM_AMDGPU_WERROR
+   bool "Force the compiler to throw an error instead of a warning when 
compiling"
+   depends on DRM_AMDGPU
+   depends on EXPERT
+   depends on !COMPILE_TEST
+   default n
+   help
+ Add -Werror to the build flags for amdgpu.ko.
+ Only enable this if you are writing code for amdgpu.ko.
+
 source "drivers/gpu/drm/amd/acp/Kconfig"
 source "drivers/gpu/drm/amd/display/Kconfig"
 source "drivers/gpu/drm/amd/amdkfd/Kconfig"
diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 74a9aa6fe18c..7ee68b1bbfed 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -39,6 +39,15 @@ ccflags-y := -I$(FULL_AMD_PATH)/include/asic_reg \
-I$(FULL_AMD_DISPLAY_PATH)/amdgpu_dm \
-I$(FULL_AMD_PATH)/amdkfd
 
+subdir-ccflags-y := -Wextra
+subdir-ccflags-y += -Wunused-but-set-variable
+subdir-ccflags-y += -Wno-unused-parameter
+subdir-ccflags-y += -Wno-type-limits
+subdir-ccflags-y += -Wno-sign-compare
+subdir-ccflags-y += -Wno-missing-field-initializers
+subdir-ccflags-y += -Wno-override-init
+subdir-ccflags-$(CONFIG_DRM_AMDGPU_WERROR) += -Werror
+
 amdgpu-y := amdgpu_drv.o
 
 # add KMS driver
-- 
2.40.1

Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused

2023-05-25 Thread Nathan Chancellor

On Thu, May 25, 2023 at 12:26:56PM -0400, Luben Tuikov wrote:
> On 2023-05-25 11:22, Nathan Chancellor wrote:
> > On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote:
> >> Silencing the compiler from below compilation error:
> >>
> >> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c:704:23: error: variable 
> >> 'mmhub_v1_8_mmea_err_status_reg' is not needed and will not be emitted 
> >> [-Werror,-Wunneeded-internal-declaration]
> >> static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = {
> >>   ^
> >> 1 error generated.
> >>
> >> Mark the variable as __maybe_unused to make it clear to clang that this
> >> is expected, so there is no more warning.
> >>
> >> Cc: Christian König 
> >> Cc: Lijo Lazar 
> >> Cc: Luben Tuikov 
> >> Cc: Alex Deucher 
> >> Signed-off-by: Srinivasan Shanmugam 
> > 
> > Traditionally, this attribute would go between the [] and =, but that is
> > a nit. Can someone please pick this up to unblock our builds on -next?
> > 
> > Reviewed-by: Nathan Chancellor 
> 
> I'll pick this up, fix it, and submit to amd-staging-drm-next.

Thanks a lot :)

> Which -next are you referring to, Nathan?

linux-next, this warning breaks the build when -Werror is enabled, such
as with allmodconfig:

https://storage.tuxsuite.com/public/clangbuiltlinux/continuous-integration2/builds/2QHtlCTz2JL0yXNpRB5hVmiP9lq/build.log

Cheers,
Nathan

> >> ---
> >>  drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 1 +
> >>  1 file changed, 1 insertion(+)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c 
> >> b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
> >> index 3648994724c2..cba087e529c0 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
> >> @@ -701,6 +701,7 @@ static void mmhub_v1_8_reset_ras_error_count(struct 
> >> amdgpu_device *adev)
> >>mmhub_v1_8_inst_reset_ras_error_count(adev, i);
> >>  }
> >>  
> >> +__maybe_unused
> >>  static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = {
> >>regMMEA0_ERR_STATUS,
> >>regMMEA1_ERR_STATUS,
> >> -- 
> >> 2.25.1
> >>
>

Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused

2023-05-25 Thread Luben Tuikov

On 2023-05-25 11:22, Nathan Chancellor wrote:
> On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote:
>> Silencing the compiler from below compilation error:
>>
>> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c:704:23: error: variable 
>> 'mmhub_v1_8_mmea_err_status_reg' is not needed and will not be emitted 
>> [-Werror,-Wunneeded-internal-declaration]
>> static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = {
>>   ^
>> 1 error generated.
>>
>> Mark the variable as __maybe_unused to make it clear to clang that this
>> is expected, so there is no more warning.
>>
>> Cc: Christian König 
>> Cc: Lijo Lazar 
>> Cc: Luben Tuikov 
>> Cc: Alex Deucher 
>> Signed-off-by: Srinivasan Shanmugam 
> 
> Traditionally, this attribute would go between the [] and =, but that is
> a nit. Can someone please pick this up to unblock our builds on -next?
> 
> Reviewed-by: Nathan Chancellor 

I'll pick this up, fix it, and submit to amd-staging-drm-next.

Which -next are you referring to, Nathan?

Regards,
Luben


> 
>> ---
>>  drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c 
>> b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
>> index 3648994724c2..cba087e529c0 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
>> @@ -701,6 +701,7 @@ static void mmhub_v1_8_reset_ras_error_count(struct 
>> amdgpu_device *adev)
>>  mmhub_v1_8_inst_reset_ras_error_count(adev, i);
>>  }
>>  
>> +__maybe_unused
>>  static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = {
>>  regMMEA0_ERR_STATUS,
>>  regMMEA1_ERR_STATUS,
>> -- 
>> 2.25.1
>>

Re: [PATCH v2] drm/amd/display: enable more strict compile checks

2023-05-25 Thread Nathan Chancellor

On Thu, May 25, 2023 at 08:37:07AM -0700, Kees Cook wrote:
> Hi!
> 
> On Wed, May 24, 2023 at 04:27:31PM -0400, Hamza Mahfooz wrote:
> > + Kees
> > 
> > On 5/24/23 15:50, Alex Deucher wrote:
> > > On Wed, May 24, 2023 at 3:46 PM Felix Kuehling  
> > > wrote:
> > > > 
> > > > Sure, I think we tried enabling warnings as errors before and had to
> > > > revert it because of weird compiler quirks or the variety of compiler
> > > > versions that need to be supported.
> > > > 
> > > > Alex, are you planning to upstream this, or is this only to enforce more
> > > > internal discipline about not ignoring warnings?
> > > 
> > > I'd like to upstream it.  Upstream already has CONFIG_WERROR as a
> > > config option, but it's been problematic to enable in CI because of
> > > various breakages outside of the driver and in different compilers.
> > > That said, I don't know how much trouble enabling it will cause with
> > > various compilers in the wild.
> 
> -Wmisleading-indentation is already part of -Wall, so this is globally
> enabled already.
> 
> -Wunused is enabled under W=1, and it's pretty noisy still. If you can
> get builds clean in drm, that'll be a good step towards getting it
> enabled globally. (A middle ground with less to clean up might be
> -Wunused-but-set-variable)
> 
> I agree about -Werror: just stick with CONFIG_WERROR instead.

There is also W=e, added by commit c77d06e70d59 ("kbuild: support W=e
to make build abort in case of warning") in 5.19, which works well for
building with configurations that do not have CONFIG_WERROR enabled and
avoiding dipping into menuconfig.

Unconditionally enabling -Werror with no way to turn it off is just
asking for problems over time with new compiler versions, either due to
new warnings in -Wall or warnings that have been improved or changed.
Should that still be desired, consider doing what i915 and PowerPC have
done and add a Kconfig option that can be disabled.

Cheers,
Nathan

[PATCH] drm/amdgpu: Fix up missing kdoc in sdma_v6_0.c

2023-05-25 Thread Srinivasan Shanmugam

Address a bunch of kdoc warnings:

gcc with W=1
drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:248: warning: Function parameter or 
member 'job' not described in 'sdma_v6_0_ring_emit_ib'
drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:248: warning: Function parameter or 
member 'flags' not described in 'sdma_v6_0_ring_emit_ib'
drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:946: warning: Function parameter or 
member 'timeout' not described in 'sdma_v6_0_ring_test_ib'
drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:1125: warning: Function parameter or 
member 'ring' not described in 'sdma_v6_0_ring_pad_ib'
drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:1176: warning: Function parameter or 
member 'vmid' not described in 'sdma_v6_0_ring_emit_vm_flush'
drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:1176: warning: Function parameter or 
member 'pd_addr' not described in 'sdma_v6_0_ring_emit_vm_flush'

Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c
index 1c90b5c661fb..967849c59ebe 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c
@@ -238,6 +238,8 @@ static void sdma_v6_0_ring_insert_nop(struct amdgpu_ring 
*ring, uint32_t count)
  *
  * @ring: amdgpu ring pointer
  * @ib: IB object to schedule
+ * @flags: unused
+ * @job: job to retrieve vmid from
  *
  * Schedule an IB in the DMA ring.
  */
@@ -938,6 +940,7 @@ static int sdma_v6_0_ring_test_ring(struct amdgpu_ring 
*ring)
  * sdma_v6_0_ring_test_ib - test an IB on the DMA engine
  *
  * @ring: amdgpu_ring structure holding ring information
+ * @timeout: timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT
  *
  * Test a simple IB in the DMA ring.
  * Returns 0 on success, error on failure.
@@ -1167,6 +1170,8 @@ static void sdma_v6_0_ring_emit_pipeline_sync(struct 
amdgpu_ring *ring)
  * sdma_v6_0_ring_emit_vm_flush - vm flush using sDMA
  *
  * @ring: amdgpu_ring pointer
+ * @vmid: vmid number to use
+ * @pd_addr: address
  *
  * Update the page table base and flush the VM TLB
  * using sDMA.
-- 
2.25.1

[PATCH 2/2] drm/amdgpu: Remove duplicate fdinfo fields

2023-05-25 Thread Rob Clark

From: Rob Clark 

Some of the fields that are handled by drm_show_fdinfo() crept back in
when rebasing the patch.  Remove them again.

Fixes: 376c25f8ca47 ("drm/amdgpu: Switch to fdinfo helper")
Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
index 13d7413d4ca3..a93e5627901a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
@@ -80,23 +80,20 @@ void amdgpu_show_fdinfo(struct drm_printer *p, struct 
drm_file *file)
 
amdgpu_ctx_mgr_usage(&fpriv->ctx_mgr, usage);
 
/*
 * **
 * For text output format description please see drm-usage-stats.rst!
 * **
 */
 
drm_printf(p, "pasid:\t%u\n", fpriv->vm.pasid);
-   drm_printf(p, "drm-driver:\t%s\n", file->minor->dev->driver->name);
-   drm_printf(p, "drm-pdev:\t%04x:%02x:%02x.%d\n", domain, bus, dev, fn);
-   drm_printf(p, "drm-client-id:\t%Lu\n", vm->immediate.fence_context);
drm_printf(p, "drm-memory-vram:\t%llu KiB\n", stats.vram/1024UL);
drm_printf(p, "drm-memory-gtt: \t%llu KiB\n", stats.gtt/1024UL);
drm_printf(p, "drm-memory-cpu: \t%llu KiB\n", stats.cpu/1024UL);
drm_printf(p, "amd-memory-visible-vram:\t%llu KiB\n",
   stats.visible_vram/1024UL);
drm_printf(p, "amd-evicted-vram:\t%llu KiB\n",
   stats.evicted_vram/1024UL);
drm_printf(p, "amd-evicted-visible-vram:\t%llu KiB\n",
   stats.evicted_visible_vram/1024UL);
drm_printf(p, "amd-requested-vram:\t%llu KiB\n",
-- 
2.40.1

[PATCH 1/2] drm/amdgpu: Fix no-procfs build

2023-05-25 Thread Rob Clark

From: Rob Clark 

Fixes undefined symbol when PROC_FS is not enabled.

Reported-by: kernel test robot 
Closes: https://lore.kernel.org/oe-kbuild-all/202305251510.u0r2as7k-...@intel.com/
Fixes: 376c25f8ca47 ("drm/amdgpu: Switch to fdinfo helper")
Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 1b46e7ac7cb0..c9a41c997c6c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2795,21 +2795,23 @@ static const struct drm_driver amdgpu_kms_driver = {
DRIVER_SYNCOBJ_TIMELINE,
.open = amdgpu_driver_open_kms,
.postclose = amdgpu_driver_postclose_kms,
.lastclose = amdgpu_driver_lastclose_kms,
.ioctls = amdgpu_ioctls_kms,
.num_ioctls = ARRAY_SIZE(amdgpu_ioctls_kms),
.dumb_create = amdgpu_mode_dumb_create,
.dumb_map_offset = amdgpu_mode_dumb_mmap,
.fops = &amdgpu_driver_kms_fops,
.release = &amdgpu_driver_release_kms,
+#ifdef CONFIG_PROC_FS
.show_fdinfo = amdgpu_show_fdinfo,
+#endif
 
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
.gem_prime_import = amdgpu_gem_prime_import,
.gem_prime_mmap = drm_gem_prime_mmap,
 
.name = DRIVER_NAME,
.desc = DRIVER_DESC,
.date = DRIVER_DATE,
.major = KMS_DRIVER_MAJOR,
-- 
2.40.1

Re: [PATCH v2] drm/amd/display: enable more strict compile checks

2023-05-25 Thread Christoph Hellwig

> +subdir-ccflags-y += -Werror -Wunused -Wmisleading-indentation

We have a config option for -Werror.  Blindly adding this will create
problems with too new (or sometimes too old, or just too weird)
compilers all the time.  Don't do this.

[PATCH] drm/amd/amdgpu: Fix up locking etc in amdgpu_debugfs_gprwave_ioctl()

2023-05-25 Thread Dan Carpenter

There are two bugs here.
1) The lock is not dropped if copy_from_user() fails.
2) If the copy fails then the correct error code is -EFAULT instead of
   -EINVAL.

I also broke up the long line and changed "sizeof rd->id" to
"sizeof(rd->id)".

Fixes: 164fb2940933 ("drm/amd/amdgpu: Update debugfs for XCC support (v3)")
Signed-off-by: Dan Carpenter 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index c657bed350ac..56e89e76ff17 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -478,15 +478,16 @@ static ssize_t amdgpu_debugfs_gprwave_read(struct file 
*f, char __user *buf, siz
 static long amdgpu_debugfs_gprwave_ioctl(struct file *f, unsigned int cmd, 
unsigned long data)
 {
struct amdgpu_debugfs_gprwave_data *rd = f->private_data;
-   int r;
+   int r = 0;
 
mutex_lock(&rd->lock);
 
switch (cmd) {
case AMDGPU_DEBUGFS_GPRWAVE_IOC_SET_STATE:
-   r = copy_from_user(&rd->id, (struct 
amdgpu_debugfs_gprwave_iocdata *)data, sizeof rd->id);
-   if (r)
-   return r ? -EINVAL : 0;
+   if (copy_from_user(&rd->id,
+  (struct amdgpu_debugfs_gprwave_iocdata 
*)data,
+  sizeof(rd->id)))
+   r = -EFAULT;
goto done;
default:
r = -EINVAL;
-- 
2.39.2
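
The two fixes above (always release the lock, and report a failed copy as
-EFAULT) can be sketched in userspace C. This is only an illustration under
stated assumptions, not the driver code: `demo_lock`, `demo_copy()`,
`demo_ioctl()` and `DEMO_IOC_SET_STATE` are hypothetical stand-ins for the
kernel mutex, copy_from_user() and the real ioctl.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lock that only tracks whether it is currently held. */
struct demo_lock { int held; };

static void demo_lock_acquire(struct demo_lock *l)
{
	assert(!l->held);
	l->held = 1;
}

static void demo_lock_release(struct demo_lock *l)
{
	assert(l->held);
	l->held = 0;
}

struct demo_state {
	struct demo_lock lock;
	int id;
};

/* Stand-in for copy_from_user(): returns the number of bytes NOT copied. */
static unsigned long demo_copy(void *dst, const void *src, unsigned long n)
{
	if (src == NULL)
		return n;
	memcpy(dst, src, n);
	return 0;
}

#define DEMO_IOC_SET_STATE 1u

long demo_ioctl(struct demo_state *st, unsigned int cmd, const int *arg)
{
	long r = 0;

	demo_lock_acquire(&st->lock);

	switch (cmd) {
	case DEMO_IOC_SET_STATE:
		/* A non-zero "bytes not copied" count means a bad address. */
		if (demo_copy(&st->id, arg, sizeof(st->id)))
			r = -EFAULT;
		break;
	default:
		r = -EINVAL;
		break;
	}

	/* Single exit path: the lock is released on success and on error. */
	demo_lock_release(&st->lock);
	return r;
}
```

Routing every case through one unlock site is what makes an early
`return` with the lock held impossible in this sketch.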

Re: [PATCH v2] drm/amd/display: enable more strict compile checks

2023-05-25 Thread Kees Cook

Hi!

On Wed, May 24, 2023 at 04:27:31PM -0400, Hamza Mahfooz wrote:
> + Kees
> 
> On 5/24/23 15:50, Alex Deucher wrote:
> > On Wed, May 24, 2023 at 3:46 PM Felix Kuehling  
> > wrote:
> > > 
> > > Sure, I think we tried enabling warnings as errors before and had to
> > > revert it because of weird compiler quirks or the variety of compiler
> > > versions that need to be supported.
> > > 
> > > Alex, are you planning to upstream this, or is this only to enforce more
> > > internal discipline about not ignoring warnings?
> > 
> > I'd like to upstream it.  Upstream already has CONFIG_WERROR as a
> > config option, but it's been problematic to enable in CI because of
> > various breakages outside of the driver and in different compilers.
> > That said, I don't know how much trouble enabling it will cause with
> > various compilers in the wild.

-Wmisleading-indentation is already part of -Wall, so this is globally
enabled already.

-Wunused is enabled under W=1, and it's pretty noisy still. If you can
get builds clean in drm, that'll be a good step towards getting it
enabled globally. (A middle ground with less to clean up might be
-Wunused-but-set-variable)

I agree about -Werror: just stick with CONFIG_WERROR instead.

-Kees

> > 
> > Alex
> > 
> > > 
> > > Regards,
> > > Felix
> > > 
> > > 
> > > On 2023-05-24 15:41, Russell, Kent wrote:
> > > > 
> > > > [AMD Official Use Only - General]
> > > > 
> > > > 
> > > > (Adding Felix in CC)
> > > > 
> > > > I’m a fan of adding it to KFD as well. Felix, can you foresee any
> > > > issues here?
> > > > 
> > > > Kent
> > > > 
> > > > *From:* amd-gfx  *On Behalf Of
> > > > *Ho, Kenny
> > > > *Sent:* Wednesday, May 24, 2023 3:23 PM
> > > > *To:* Alex Deucher ; Mahfooz, Hamza
> > > > 
> > > > *Cc:* Li, Sun peng (Leo) ; Wentland, Harry
> > > > ; Pan, Xinhui ; Siqueira,
> > > > Rodrigo ; linux-ker...@vger.kernel.org;
> > > > dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Daniel
> > > > Vetter ; Deucher, Alexander
> > > > ; David Airlie ; Koenig,
> > > > Christian 
> > > > *Subject:* Re: [PATCH v2] drm/amd/display: enable more strict compile
> > > > checks
> > > > 
> > > > [AMD Official Use Only - General]
> > > > 
> > > > [AMD Official Use Only - General]
> > > > 
> > > > (+ Felix)
> > > > 
> > > > Should we do the same for other modules under amd (amdkfd)?  I was
> > > > going to enable full kernel werror in the kconfig used by my CI but
> > > > this is probably better.
> > > > 
> > > > Kenny
> > > > 
> > > > 
> > > > 
> > > > *From:*Alex Deucher 
> > > > *Sent:* Wednesday, May 24, 2023 3:22 PM
> > > > *To:* Mahfooz, Hamza 
> > > > *Cc:* amd-gfx@lists.freedesktop.org ;
> > > > Li, Sun peng (Leo) ; Ho, Kenny ;
> > > > Pan, Xinhui ; Siqueira, Rodrigo
> > > > ; linux-ker...@vger.kernel.org
> > > > ; dri-de...@lists.freedesktop.org
> > > > ; Daniel Vetter ;
> > > > Deucher, Alexander ; David Airlie
> > > > ; Wentland, Harry ; Koenig,
> > > > Christian 
> > > > *Subject:* Re: [PATCH v2] drm/amd/display: enable more strict compile
> > > > checks
> > > > 
> > > > On Wed, May 24, 2023 at 3:20 PM Hamza Mahfooz 
> > > > wrote:
> > > > > 
> > > > > Currently, there are quite a number of issues that are quite easy for
> > > > > the CI to catch, that slip through the cracks. Among them, there are
> > > > > unused variable and indentation issues. Also, we should consider all
> > > > > warnings to be compile errors, since the community will eventually end
> > > > > up complaining about them. So, enable -Werror, -Wunused and
> > > > > -Wmisleading-indentation for all kernel builds.
> > > > > 
> > > > > Cc: Alex Deucher 
> > > > > Cc: Harry Wentland 
> > > > > Cc: Kenny Ho 
> > > > > Signed-off-by: Hamza Mahfooz 
> > > > > ---
> > > > > v2: fix grammatical error
> > > > > ---
> > > > >   drivers/gpu/drm/amd/display/Makefile | 2 ++
> > > > >   1 file changed, 2 insertions(+)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/amd/display/Makefile
> > > > b/drivers/gpu/drm/amd/display/Makefile
> > > > > index 0d610cb376bb..3c44162ebe21 100644
> > > > > --- a/drivers/gpu/drm/amd/display/Makefile
> > > > > +++ b/drivers/gpu/drm/amd/display/Makefile
> > > > > @@ -26,6 +26,8 @@
> > > > > 
> > > > >   AMDDALPATH = $(RELATIVE_AMD_DISPLAY_PATH)
> > > > > 
> > > > > +subdir-ccflags-y += -Werror -Wunused -Wmisleading-indentation
> > > > > +
> > > > 
> > > > Care to enable this for the rest of amdgpu as well?  Or send out an
> > > > additional patch to do that?  Either way:
> > > > Reviewed-by: Alex Deucher 
> > > > 
> > > > Alex
> > > > 
> > > > >   subdir-ccflags-y += -I$(FULL_AMD_DISPLAY_PATH)/dc/inc/
> > > > >   subdir-ccflags-y += -I$(FULL_AMD_DISPLAY_PATH)/dc/inc/hw
> > > > >   subdir-ccflags-y += -I$(FULL_AMD_DISPLAY_PATH)/dc/clk_mgr
> > > > > --
> > > > > 2.40.1
> > > > > 
> > > > 
> -- 
> Hamza
> 

-- 
Kees Cook

Re: [PATCH] drm/amdgpu: add the accelerator pcie class

2023-05-25 Thread Christoph Hellwig

On Tue, May 23, 2023 at 10:02:32AM -0400, Alex Deucher wrote:
> On Tue, May 23, 2023 at 5:25 AM Christoph Hellwig  wrote:
> >
> > On Tue, May 23, 2023 at 12:02:32PM +0800, Shiwu Zhang wrote:
> > > + { PCI_DEVICE(0x1002, PCI_ANY_ID),
> > > +   .class = PCI_CLASS_ACCELERATOR_PROCESSING << 8,
> > > +   .class_mask = 0xff,
> > > +   .driver_data = CHIP_IP_DISCOVERY },
> >
> > Probing for every single device of a given class for a single vendor
> > > to a driver is just fundamentally wrong.  Please list the actual IDs
> > that the driver can handle.
> 
> How so?  The driver handles all devices of that class.  We already do
> that for PCI_CLASS_DISPLAY_VGA and PCI_CLASS_DISPLAY_OTHER.  Other
> drivers do similar things.

How is that going to work in the long run?  The chance of totally
incompatible devices from the same vendor appearing is absolutely given.

> The hda audio driver does the same thing
> for PCI_CLASS_MULTIMEDIA_HD_AUDIO for example.
>

That, just like PCI_CLASS_STORAGE_EXPRESS is a different case, as
the class is associated with an actual documented programming interface.
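
For readers following the class-match argument, here is a rough userspace
sketch of how a vendor-plus-class table entry selects devices: the entry
matches when the vendor is equal and the class bits selected by the mask are
equal. The struct and function names are hypothetical, and the mask used in
the usage note is chosen for illustration only (compare base class and
subclass, ignore the programming-interface byte); it is not taken from the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified table entry: vendor plus a masked class code. */
struct demo_pci_id {
	uint32_t vendor;
	uint32_t class;      /* (base class << 16) | (subclass << 8) | prog-if */
	uint32_t class_mask; /* which bits of the class code must match */
};

/* Hypothetical, simplified view of a device's identity. */
struct demo_pci_dev {
	uint32_t vendor;
	uint32_t class;
};

/* True when the vendor matches and the masked class bits are identical. */
bool demo_pci_match(const struct demo_pci_id *id,
		    const struct demo_pci_dev *dev)
{
	assert(id != NULL && dev != NULL);

	if (id->vendor != dev->vendor)
		return false;

	return ((id->class ^ dev->class) & id->class_mask) == 0;
}
```

With an entry of `{ 0x1002, 0x1200 << 8, 0xffff << 8 }`, any device from
vendor 0x1002 whose class code starts with 0x1200 matches, whatever its
device ID or programming interface. That breadth is the property being
debated in this thread.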

Re: [PATCH 06/36] drm/amd/display: add CRTC driver-specific property for gamma TF

2023-05-25 Thread Harry Wentland




On 5/24/23 04:24, Pekka Paalanen wrote:
> On Tue, 23 May 2023 21:14:50 -0100
> Melissa Wen  wrote:
> 
>> Hook up driver-specific atomic operations for managing AMD color
>> properties and create AMD driver-specific color management properties
>> and attach them according to HW capabilities defined by `struct
>> dc_color_caps`. Add enumerated transfer function property to DRM CRTC
>> gamma to convert to wire encoding with or without a user gamma LUT.
>> Enumerated TFs are not supported yet by the DRM color pipeline,
>> therefore, create a DRM enum list with the predefined TFs supported by
>> the AMD display driver.
>>
>> Co-developed-by: Joshua Ashton 
>> Signed-off-by: Joshua Ashton 
>> Signed-off-by: Melissa Wen 
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   | 36 ++
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h  |  8 +++
>>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 22 ++
>>  .../amd/display/amdgpu_dm/amdgpu_dm_crtc.c| 72 ++-
>>  4 files changed, 137 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
>> index 389396eac222..88af075e6c18 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
>> @@ -1247,6 +1247,38 @@ amdgpu_display_user_framebuffer_create(struct 
>> drm_device *dev,
>>  return &amdgpu_fb->base;
>>  }
>>  
>> +static const struct drm_prop_enum_list drm_transfer_function_enum_list[] = {
>> +{ DRM_TRANSFER_FUNCTION_DEFAULT, "Default" },
>> +{ DRM_TRANSFER_FUNCTION_SRGB, "sRGB" },
>> +{ DRM_TRANSFER_FUNCTION_BT709, "BT.709" },
>> +{ DRM_TRANSFER_FUNCTION_PQ, "PQ (Perceptual Quantizer)" },
>> +{ DRM_TRANSFER_FUNCTION_LINEAR, "Linear" },
>> +{ DRM_TRANSFER_FUNCTION_UNITY, "Unity" },
>> +{ DRM_TRANSFER_FUNCTION_HLG, "HLG (Hybrid Log Gamma)" },
>> +{ DRM_TRANSFER_FUNCTION_GAMMA22, "Gamma 2.2" },
>> +{ DRM_TRANSFER_FUNCTION_GAMMA24, "Gamma 2.4" },
>> +{ DRM_TRANSFER_FUNCTION_GAMMA26, "Gamma 2.6" },
>> +};
>> +
>> +#ifdef AMD_PRIVATE_COLOR
>> +static int
>> +amdgpu_display_create_color_properties(struct amdgpu_device *adev)
>> +{
>> +struct drm_property *prop;
>> +
>> +prop = drm_property_create_enum(adev_to_drm(adev),
>> +DRM_MODE_PROP_ENUM,
>> +"AMD_REGAMMA_TF",
> 
> Hi,
> 
> is this REGAMMA element capable of applying only optical-to-electrical
> direction of the listed TFs?
> 
> I was expecting that the listed TF names would include an explanation
> of the direction, for example "PQ EOTF" vs. "inverse PQ EOTF".
> 
> Very specifically "inverse EOTF" and not "OETF" because they
> generally are not the same concept.
> 
> PQ defines only EOTF while HLG for example defines OETF (HLG defines
> its EOTF as a combination of inverse HLG OETF and a parameterised HLG
> OOTF). So if you say "PQ TF" I will assume it means
> electrical-to-optical and if you say HLG TF I might assume
> optical-to-electrical. I think these enum names should be more explicit
> about what they refer to, to avoid any ambiguity.
> 
> What kind of TF is "Unity"?
> 
> This patch is not adding any docs for any of these. Is there another
> patch that does?
> 
> I'm still confused about how this "private" API should be thought of.
> Should it be documented at all? Is it free to use for userspace?
> Was the expectation that only the Steam Deck distribution would enable
> these in the kernel, and no-one else? And if anyone builds their own
> kernel with these enabled? So my ask for docs may or may not be
> warranted.
> 

The current plan is to put the API bits behind a #ifdef AMD_PRIVATE_COLOR
(or a similar name) and not make it configurable via Kconfig. Anyone
that wants them would have to build the kernel with -DAMD_PRIVATE_COLOR.

Thanks for re-iterating your naming and documentation concerns. It would
be good to still fix that up, even if this doesn't become upstream API
as-is.

Harry

> I don't like the names degamma/regamma/gamma at all. I don't like
> calling something a LUT when it can have a parametric or enumerated
> curve. I don't like calling an element a transfer function if it could
> be a shaper or a combination of TF and shaper and maybe something else
> (i.e. a LUT).
> 
> But that's nothing new. If the expectation is that no-one should use
> these, then it's fine, and you don't need to CC me. You know I will
> always respond with similar comments about documenting things, having
> good names, etc. that is important for generic userspace, which is just
> not needed for "no-users UAPI". ;-)
> 
> 
> Thanks,
> pq
> 
>> +drm_transfer_function_enum_list,
>> +
>> ARRAY_SIZE(drm_transfer_function_enum_list));
>> +if (!prop)
>> +return -ENOMEM;
>> +adev->mode_info.regamma_tf_property = prop;
>> +
>> +return 0;
>>
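
To make the direction question concrete, the PQ curve can be written out in
both directions. This is a reference sketch from the published SMPTE ST 2084
definition, not something stated in the patch; E' is the non-linear signal in
[0, 1], F_D is display luminance in cd/m^2, and which direction the
AMD_REGAMMA_TF "PQ" entry applies is exactly what the review asks to have
documented.

```latex
% PQ EOTF (electrical-to-optical): signal E' -> luminance F_D
F_D = 10000 \left( \frac{\max\!\left(E'^{\,1/m_2} - c_1,\; 0\right)}
                        {c_2 - c_3\, E'^{\,1/m_2}} \right)^{1/m_1}

% Inverse PQ EOTF (optical-to-electrical): luminance F_D -> signal E'
E' = \left( \frac{c_1 + c_2\, Y^{m_1}}{1 + c_3\, Y^{m_1}} \right)^{m_2},
\qquad Y = \frac{F_D}{10000}

% Constants
m_1 = \tfrac{2610}{16384} = 0.1593017578125, \qquad
m_2 = \tfrac{2523}{4096} \times 128 = 78.84375,
c_1 = \tfrac{3424}{4096} = 0.8359375 = c_3 - c_2 + 1, \qquad
c_2 = \tfrac{2413}{4096} \times 32 = 18.8515625, \qquad
c_3 = \tfrac{2392}{4096} \times 32 = 18.6875
```

The two expressions are exact inverses of each other, which is why an enum
name that says only "PQ" leaves the applied direction ambiguous.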

Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused

2023-05-25 Thread Nathan Chancellor

On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote:
> Silencing the compiler from below compilation error:
> 
> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c:704:23: error: variable 
> 'mmhub_v1_8_mmea_err_status_reg' is not needed and will not be emitted 
> [-Werror,-Wunneeded-internal-declaration]
> static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = {
>   ^
> 1 error generated.
> 
> Mark the variable as __maybe_unused to make it clear to clang that this
> is expected, so there is no more warning.
> 
> Cc: Christian König 
> Cc: Lijo Lazar 
> Cc: Luben Tuikov 
> Cc: Alex Deucher 
> Signed-off-by: Srinivasan Shanmugam 

Traditionally, this attribute would go between the [] and =, but that is
a nit. Can someone please pick this up to unblock our builds on -next?

Reviewed-by: Nathan Chancellor 

> ---
>  drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c 
> b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
> index 3648994724c2..cba087e529c0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
> @@ -701,6 +701,7 @@ static void mmhub_v1_8_reset_ras_error_count(struct amdgpu_device *adev)
>   mmhub_v1_8_inst_reset_ras_error_count(adev, i);
>  }
>  
> +__maybe_unused
>  static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = {
>   regMMEA0_ERR_STATUS,
>   regMMEA1_ERR_STATUS,
> -- 
> 2.25.1
>

Re: [PATCH 02/36] drm/drm_property: make replace_property_blob_from_id a DRM helper

2023-05-25 Thread Liviu Dudau

On Tue, May 23, 2023 at 09:14:46PM -0100, Melissa Wen wrote:
> Place it in drm_property where drm_property_replace_blob and
> drm_property_lookup_blob live. Then we can use the DRM helper for
> driver-specific KMS properties too.
> 
> Signed-off-by: Melissa Wen 

I know that I've got Cc-ed because of a comment, but I did have a look at the
whole patch. If it is useful, then you can add

Reviewed-by: Liviu Dudau 

Best regards,
Liviu
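
The size checks performed by the helper being moved can be sketched in userspace as follows. This is only a model under stated assumptions: `struct blob`, `blob_check_size()` and `blob_check_size_examples()` are hypothetical names, the blob is reduced to its length, and lookup and reference counting are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for struct drm_property_blob: only the length matters here. */
struct blob {
	size_t length;
};

/*
 * Mirrors the two checks in the quoted helper:
 *  - expected_size > 0:      the blob length must match exactly
 *  - expected_elem_size > 0: the blob length must be a multiple of the element size
 * A value <= 0 (the callers pass -1) skips the corresponding check.
 */
int blob_check_size(const struct blob *b, ssize_t expected_size,
		    ssize_t expected_elem_size)
{
	if (!b)
		return 0;	/* models blob_id == 0: nothing to validate */
	if (expected_size > 0 && b->length != (size_t)expected_size)
		return -EINVAL;
	if (expected_elem_size > 0 &&
	    b->length % (size_t)expected_elem_size != 0)
		return -EINVAL;
	return 0;
}

/* Usage example with made-up sizes: an 8-byte element LUT and a fixed 72-byte matrix. */
void blob_check_size_examples(void)
{
	struct blob lut = { .length = 64 };

	assert(blob_check_size(&lut, -1, 8) == 0);	/* 64 is a multiple of 8 */
	assert(blob_check_size(&lut, 72, -1) == -EINVAL);	/* exact size mismatch */
}
```

This is why the LUT properties in the quoted diff pass `-1, sizeof(struct drm_color_lut)` while the CTM property passes `sizeof(struct drm_color_ctm), -1`: the former accepts any whole number of entries, the latter exactly one structure.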

> ---
>  drivers/gpu/drm/arm/malidp_crtc.c |  2 +-
>  drivers/gpu/drm/drm_atomic_uapi.c | 43 ---
>  drivers/gpu/drm/drm_property.c| 49 +++
>  include/drm/drm_property.h|  6 
>  4 files changed, 61 insertions(+), 39 deletions(-)
> 
> diff --git a/drivers/gpu/drm/arm/malidp_crtc.c b/drivers/gpu/drm/arm/malidp_crtc.c
> index dc01c43f6193..d72c22dcf685 100644
> --- a/drivers/gpu/drm/arm/malidp_crtc.c
> +++ b/drivers/gpu/drm/arm/malidp_crtc.c
> @@ -221,7 +221,7 @@ static int malidp_crtc_atomic_check_ctm(struct drm_crtc *crtc,
>  
>   /*
>* The size of the ctm is checked in
> -  * drm_atomic_replace_property_blob_from_id.
> +  * drm_property_replace_blob_from_id.
>*/
>   ctm = (struct drm_color_ctm *)state->ctm->data;
>   for (i = 0; i < ARRAY_SIZE(ctm->matrix); ++i) {
> diff --git a/drivers/gpu/drm/drm_atomic_uapi.c b/drivers/gpu/drm/drm_atomic_uapi.c
> index c06d0639d552..b76d50ae244c 100644
> --- a/drivers/gpu/drm/drm_atomic_uapi.c
> +++ b/drivers/gpu/drm/drm_atomic_uapi.c
> @@ -362,39 +362,6 @@ static s32 __user *get_out_fence_for_connector(struct drm_atomic_state *state,
>   return fence_ptr;
>  }
>  
> -static int
> -drm_atomic_replace_property_blob_from_id(struct drm_device *dev,
> -  struct drm_property_blob **blob,
> -  uint64_t blob_id,
> -  ssize_t expected_size,
> -  ssize_t expected_elem_size,
> -  bool *replaced)
> -{
> - struct drm_property_blob *new_blob = NULL;
> -
> - if (blob_id != 0) {
> - new_blob = drm_property_lookup_blob(dev, blob_id);
> - if (new_blob == NULL)
> - return -EINVAL;
> -
> - if (expected_size > 0 &&
> - new_blob->length != expected_size) {
> - drm_property_blob_put(new_blob);
> - return -EINVAL;
> - }
> - if (expected_elem_size > 0 &&
> - new_blob->length % expected_elem_size != 0) {
> - drm_property_blob_put(new_blob);
> - return -EINVAL;
> - }
> - }
> -
> - *replaced |= drm_property_replace_blob(blob, new_blob);
> - drm_property_blob_put(new_blob);
> -
> - return 0;
> -}
> -
>  static int drm_atomic_crtc_set_property(struct drm_crtc *crtc,
>   struct drm_crtc_state *state, struct drm_property *property,
>   uint64_t val)
> @@ -415,7 +382,7 @@ static int drm_atomic_crtc_set_property(struct drm_crtc *crtc,
>   } else if (property == config->prop_vrr_enabled) {
>   state->vrr_enabled = val;
>   } else if (property == config->degamma_lut_property) {
> - ret = drm_atomic_replace_property_blob_from_id(dev,
> + ret = drm_property_replace_blob_from_id(dev,
>   &state->degamma_lut,
>   val,
>   -1, sizeof(struct drm_color_lut),
> @@ -423,7 +390,7 @@ static int drm_atomic_crtc_set_property(struct drm_crtc *crtc,
>   state->color_mgmt_changed |= replaced;
>   return ret;
>   } else if (property == config->ctm_property) {
> - ret = drm_atomic_replace_property_blob_from_id(dev,
> + ret = drm_property_replace_blob_from_id(dev,
>   &state->ctm,
>   val,
>   sizeof(struct drm_color_ctm), -1,
> @@ -431,7 +398,7 @@ static int drm_atomic_crtc_set_property(struct drm_crtc *crtc,
>   state->color_mgmt_changed |= replaced;
>   return ret;
>   } else if (property == config->gamma_lut_property) {
> - ret = drm_atomic_replace_property_blob_from_id(dev,
> + ret = drm_property_replace_blob_from_id(dev,
>   &state->gamma_lut,
>   val,
>   -1, sizeof(struct drm_color_lut),
> @@ -563,7 +530,7 @@ static int drm_atomic_plane_set_property(struct drm_plane *plane,
>   } else if (property == plane->color_range_property) {
>   state->color_range = val;
>   } else if (property == config->prop_fb_damage_clips) {
> - ret = drm_atomic

Re: [PATCH 1/2] Revert "drm/amd/display: Block optimize on consecutive FAMS enables"

2023-05-25 Thread Alex Deucher

On Thu, May 25, 2023 at 6:27 AM Michel Dänzer  wrote:
>
> On 5/23/23 18:09, Hamza Mahfooz wrote:
> > On 5/22/23 09:08, Michel Dänzer wrote:
> >> From: Michel Dänzer 
> >>
> >> This reverts commit ce560ac40272a5c8b5b68a9d63a75edd9e66aed2.
> >>
> >> It depends on its parent commit, which we want to revert.
> >>
> >> Signed-off-by: Michel Dänzer 
> >
> > I have applied the series, thanks!
>
> Thank you. Note that these need to be merged for 6.4; they weren't in Alex's 
> 6.4 fixes PR yesterday.

Yes, I plan to include it in next week's PR.

Alex

Re: [PATCH] drm/amdgpu: keep irq count in amdgpu_irq_disable_all

2023-05-25 Thread Christian König


On 25.05.23 at 11:28, Guchun Chen wrote:

This cleans up all irq warnings caused by unbalanced
amdgpu_irq_get/put calls when unplugging or unbinding the device, and leaves
the irq count decrement to each IP fini function.

Signed-off-by: Guchun Chen 


Reviewed-by: Christian König 
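
The imbalance described in the commit message can be illustrated with a small userspace model. This is a sketch under assumptions, not the driver implementation: `struct irq_src`, `irq_get()`, `irq_put()` and `irq_disable_all()` are hypothetical simplifications, and the `reset_counts` flag exists only to contrast the behaviour before and after the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NUM_TYPES 2	/* hypothetical number of interrupt types per source */

/* Hypothetical, heavily simplified stand-in for struct amdgpu_irq_src. */
struct irq_src {
	atomic_int enabled_types[NUM_TYPES];	/* get/put reference counts */
	bool hw_enabled[NUM_TYPES];		/* modelled hardware state */
};

/* Take a reference; the first reference enables the interrupt. */
int irq_get(struct irq_src *src, unsigned int type)
{
	assert(type < NUM_TYPES);
	if (atomic_fetch_add(&src->enabled_types[type], 1) == 0)
		src->hw_enabled[type] = true;
	return 0;
}

/* Drop a reference; a put without a matching get is reported as an error. */
int irq_put(struct irq_src *src, unsigned int type)
{
	assert(type < NUM_TYPES);
	if (atomic_load(&src->enabled_types[type]) == 0)
		return -EINVAL;	/* the driver would emit a warning here */
	if (atomic_fetch_sub(&src->enabled_types[type], 1) == 1)
		src->hw_enabled[type] = false;
	return 0;
}

/* Disable every type in hardware; optionally also zero the counts (old behaviour). */
void irq_disable_all(struct irq_src *src, bool reset_counts)
{
	for (unsigned int k = 0; k < NUM_TYPES; ++k) {
		if (reset_counts)
			atomic_store(&src->enabled_types[k], 0);
		src->hw_enabled[k] = false;
	}
}
```

In this model, calling `irq_disable_all(&src, true)` after a successful `irq_get()` makes the later `irq_put()` from the fini path fail, whereas `irq_disable_all(&src, false)` leaves the count in place so the put stays balanced.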


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 1 -
  1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
index 00f2106c17b9..f90920fbd340 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
@@ -140,7 +140,6 @@ void amdgpu_irq_disable_all(struct amdgpu_device *adev)
 				continue;
 
 			for (k = 0; k < src->num_types; ++k) {
-				atomic_set(&src->enabled_types[k], 0);
 				r = src->funcs->set(adev, src, k,
 						    AMDGPU_IRQ_STATE_DISABLE);
 				if (r)

Re: [PATCH] drm/amdgpu: enable tmz by default for GC 11.0.1

2023-05-25 Thread Alex Deucher

Reviewed-by: Alex Deucher 

On Thu, May 25, 2023 at 3:22 AM Ikshwaku Chauhan
 wrote:
>
> Add GC 11.0.1 to the list of targets that have
> TMZ enabled by default.
>
> Signed-off-by: Ikshwaku Chauhan 
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> index 3f5dd9e32e08..348d856626c6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> @@ -591,6 +591,8 @@ void amdgpu_gmc_tmz_set(struct amdgpu_device *adev)
> case IP_VERSION(9, 3, 0):
> /* GC 10.3.7 */
> case IP_VERSION(10, 3, 7):
> +   /* GC 11.0.1 */
> +   case IP_VERSION(11, 0, 1):
> if (amdgpu_tmz == 0) {
> adev->gmc.tmz_enabled = false;
> dev_info(adev->dev,
> @@ -614,7 +616,6 @@ void amdgpu_gmc_tmz_set(struct amdgpu_device *adev)
> case IP_VERSION(10, 3, 1):
> /* YELLOW_CARP*/
> case IP_VERSION(10, 3, 3):
> -   case IP_VERSION(11, 0, 1):
> case IP_VERSION(11, 0, 4):
> /* Don't enable it by default yet.
>  */
> --
> 2.25.1
>
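
The effect of moving a `case` label between the two groups in the quoted diff can be sketched as below. This is a userspace model with assumptions spelled out: the `IP_VERSION()` packing is assumed to match the driver's macro, `tmz_param` models the module parameter with -1 meaning auto, the opt-in condition for the second group is assumed rather than shown in the quoted hunk, and `tmz_enabled()` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: same major/minor/revision packing as the driver's macro. */
#define IP_VERSION(mj, mn, rv) (((mj) << 16) | ((mn) << 8) | (rv))

/* tmz_param models the module parameter: -1 = auto, 0 = off, 1 = on (assumed). */
bool tmz_enabled(uint32_t gc_version, int tmz_param)
{
	switch (gc_version) {
	case IP_VERSION(10, 3, 7):
	case IP_VERSION(11, 0, 1):	/* moved into this group by the patch */
		/* Default-on group: only an explicit 0 disables TMZ. */
		return tmz_param != 0;
	case IP_VERSION(10, 3, 3):
	case IP_VERSION(11, 0, 4):
		/* Opt-in group: "auto" leaves TMZ off (assumed condition). */
		return tmz_param >= 1;
	default:
		return false;
	}
}

/* Usage example: GC 11.0.1 is now on by default, GC 11.0.4 still is not. */
void tmz_examples(void)
{
	assert(tmz_enabled(IP_VERSION(11, 0, 1), -1));
	assert(!tmz_enabled(IP_VERSION(11, 0, 4), -1));
}
```

Only the group membership of GC 11.0.1 changes; an explicit off setting still disables TMZ in both groups.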

[PATCH] drm/amdgpu: Fix unused sq_int_priv variable in event_interrupt_wq_v11

2023-05-25 Thread Srinivasan Shanmugam

gcc with W=1
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_int_process_v11.c: In function ‘event_interrupt_wq_v11’:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_int_process_v11.c:282:38: warning: variable ‘sq_int_priv’ set but not used [-Wunused-but-set-variable]
  282 |  uint8_t sq_int_enc, sq_int_errtype, sq_int_priv;
  |

Remove unused sq_int_priv variable.

Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
---
 drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c
index 0f0fdea4cd8a..fa0cf6d17baa 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c
@@ -279,7 +279,7 @@ static void event_interrupt_wq_v11(struct kfd_node *dev,
 {
uint16_t source_id, client_id, ring_id, pasid, vmid;
uint32_t context_id0, context_id1;
-   uint8_t sq_int_enc, sq_int_errtype, sq_int_priv;
+   u8 sq_int_enc, sq_int_errtype;
struct kfd_vm_fault_info info = {0};
struct kfd_hsa_memory_exception_data exception_data;
 
@@ -348,13 +348,6 @@ static void event_interrupt_wq_v11(struct kfd_node *dev,
 				break;
 			case SQ_INTERRUPT_WORD_ENCODING_INST:
 				print_sq_intr_info_inst(context_id0, context_id1);
-				sq_int_priv = REG_GET_FIELD(context_id0,
-						SQ_INTERRUPT_WORD_WAVE_CTXID0, PRIV);
-				/*if (sq_int_priv && (kfd_set_dbg_ev_from_interrupt(dev, pasid,
-						KFD_CTXID0_DOORBELL_ID(context_id0),
-						KFD_CTXID0_TRAP_CODE(context_id0),
-						NULL, 0)))
-					return;*/
 				break;
 			case SQ_INTERRUPT_WORD_ENCODING_ERROR:
 				print_sq_intr_info_error(context_id0, context_id1);
-- 
2.25.1

Re: [PATCH 1/2] Revert "drm/amd/display: Block optimize on consecutive FAMS enables"

2023-05-25 Thread Michel Dänzer

On 5/23/23 18:09, Hamza Mahfooz wrote:
> On 5/22/23 09:08, Michel Dänzer wrote:
>> From: Michel Dänzer 
>>
>> This reverts commit ce560ac40272a5c8b5b68a9d63a75edd9e66aed2.
>>
>> It depends on its parent commit, which we want to revert.
>>
>> Signed-off-by: Michel Dänzer 
> 
> I have applied the series, thanks!

Thank you. Note that these need to be merged for 6.4; they weren't in Alex's 
6.4 fixes PR yesterday.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer
