RE: [PATCH] drm/amdgpu: add a flag to indicate if a VM is attached to fpriv
[Public] Ping.. > -Original Message- > From: Chen, Guchun > Sent: Wednesday, May 24, 2023 5:23 PM > To: amd-gfx@lists.freedesktop.org; Deucher, Alexander > ; Zhang, Hawking > ; Lazar, Lijo ; Yang, Philip > ; Kuehling, Felix > Cc: Chen, Guchun > Subject: [PATCH] drm/amdgpu: add a flag to indicate if a VM is attached to > fpriv > > Recent code stores xcp_id to amdgpu bo for accounting memory usage or > find correct KFD node, and this xcp_id is from file private data after opening > device. However, not all VMs are attached to this fpriv structure like the > case > in amdgpu_mes_self_test. > So add a flag to differentiate the cases. Otherwise, KASAN will complain out > of bound access. > > [ 77.292314] BUG: KASAN: slab-out-of-bounds in > amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu] > [ 77.293845] Read of size 4 at addr 888102c48a48 by task > modprobe/1069 > [ 77.294146] Call Trace: > [ 77.294178] > [ 77.294208] dump_stack_lvl+0x49/0x63 > [ 77.294260] print_report+0x16f/0x4a6 > [ 77.294307] ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu] > [ 77.295979] ? kasan_complete_mode_report_info+0x3c/0x200 > [ 77.296057] ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu] > [ 77.297556] kasan_report+0xb4/0x130 > [ 77.297609] ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu] > [ 77.299202] __asan_load4+0x6f/0x90 > [ 77.299272] amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu] > [ 77.300796] ? amdgpu_init+0x6e/0x1000 [amdgpu] > [ 77.30] ? amdgpu_vm_pt_clear+0x750/0x750 [amdgpu] > [ 77.303721] ? preempt_count_sub+0x18/0xc0 > [ 77.303786] amdgpu_vm_init+0x39e/0x870 [amdgpu] > [ 77.305186] ? amdgpu_vm_wait_idle+0x90/0x90 [amdgpu] > [ 77.306683] ? kasan_set_track+0x25/0x30 > [ 77.306737] ? kasan_save_alloc_info+0x1b/0x30 > [ 77.306795] ? __kasan_kmalloc+0x87/0xa0 > [ 77.306852] amdgpu_mes_self_test+0x169/0x620 [amdgpu] > > Fixes: ffc6deb773f7("drm/amdkfd: Store xcp partition id to amdgpu bo") > Signed-off-by: Guchun Chen > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 2 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 2 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c| 5 - > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h| 5 - > drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c | 12 +--- > 5 files changed, 19 insertions(+), 7 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c > index 41d047e5de69..79b80f9233db 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c > @@ -1229,7 +1229,7 @@ int amdgpu_driver_open_kms(struct drm_device > *dev, struct drm_file *file_priv) > pasid = 0; > } > > - r = amdgpu_vm_init(adev, &fpriv->vm); > + r = amdgpu_vm_init(adev, &fpriv->vm, true); > if (r) > goto error_pasid; > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c > index 49bb6c03d606..3be5219edf88 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c > @@ -1345,7 +1345,7 @@ int amdgpu_mes_self_test(struct amdgpu_device > *adev) > goto error_pasid; > } > > - r = amdgpu_vm_init(adev, vm); > + r = amdgpu_vm_init(adev, vm, false); > if (r) { > DRM_ERROR("failed to initialize vm\n"); > goto error_pasid; > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > index 37b9d8a8dbec..47ffaa1526a0 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > @@ -2099,13 +2099,15 @@ long amdgpu_vm_wait_idle(struct amdgpu_vm > *vm, long timeout) > * > * @adev: amdgpu_device pointer > * @vm: requested vm > + * @vm_attach_to_fpriv: flag to tell if vm is attached to file private > + data > * > * Init @vm fields. > * > * Returns: > * 0 for success, error for failure. > */ > -int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm) > +int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm, > +bool vm_attach_to_fpriv) > { > struct amdgpu_bo *root_bo; > struct amdgpu_bo_vm *root; > @@ -2131,6 +2133,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, > struct amdgpu_vm *vm) > > vm->pte_support_ats = false; > vm->is_compute_context = false; > + vm->vm_attach_to_fpriv = vm_attach_to_fpriv; > > vm->use_cpu_for_update = !!(adev- > >vm_manager.vm_update_mode & > AMDGPU_VM_USE_CPU_FOR_GFX); > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h > index d551fca1780e..62ed14b1fc16 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h > @@ -333,6 +333,9 @@ struct amdgpu_vm { > /* Flag to indicate if VM is used for compute */ > boolis_compute_context; > > +
RE: [PATCH] drm/amdkfd: remove unused function get_reserved_sdma_queues_bitmap
[AMD Official Use Only - General] > -Original Message- > From: Kuehling, Felix > Sent: Thursday, May 25, 2023 5:10 PM > To: Tom Rix ; Deucher, Alexander > ; Koenig, Christian > ; Pan, Xinhui ; > airl...@gmail.com; dan...@ffwll.ch; nat...@kernel.org; > ndesaulni...@google.com; Joshi, Mukul > Cc: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org; linux- > ker...@vger.kernel.org; l...@lists.linux.dev > Subject: Re: [PATCH] drm/amdkfd: remove unused function > get_reserved_sdma_queues_bitmap > > [+Mukul] > > Looks like this problem was introduced by Mukul's patch "drm/amdkfd: > Update SDMA queue management for GFX9.4.3". Could this be a merge > error between GFX 9.4.3 and GFX11 branches? I think the > reserved_sdma_queues_bitmap was introduced after the 9.4.3 branch was > created. Mukul, you worked on both, so you're probably in the best position > to resolve this. > Yes my patch introduced this regression. We need the get_reserved_sdma_queues_bitmap function. I will fix this regression and send out a new patch. Thanks for noticing/catching this. Regards, Mukul > Regards, >Felix > > > On 2023-05-25 16:07, Tom Rix wrote: > > clang with W=1 reports > > > drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager.c:12 > 2:24: error: > >unused function 'get_reserved_sdma_queues_bitmap' > > [-Werror,-Wunused-function] static inline uint64_t > get_reserved_sdma_queues_bitmap(struct device_queue_manager *dqm) > > ^ > > This function is not used so remove it. > > > > Signed-off-by: Tom Rix > > --- > > drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 5 - > > 1 file changed, 5 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c > > b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c > > index 493b4b66f180..2fbd0a96424f 100644 > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c > > @@ -119,11 +119,6 @@ unsigned int get_num_xgmi_sdma_queues(struct > device_queue_manager *dqm) > > dqm->dev->kfd- > >device_info.num_sdma_queues_per_engine; > > } > > > > -static inline uint64_t get_reserved_sdma_queues_bitmap(struct > > device_queue_manager *dqm) -{ > > - return dqm->dev->kfd- > >device_info.reserved_sdma_queues_bitmap; > > -} > > - > > static void init_sdma_bitmaps(struct device_queue_manager *dqm) > > { > > bitmap_zero(dqm->sdma_bitmap, KFD_MAX_SDMA_QUEUES);
[PATCH 1/2] drm/amdgpu: Modify indirect buffer packages for resubmission
From: Jiadong Zhu When the preempted IB frame resubmitted to cp, we need to modify the frame data including: 1. set PRE_RESUME 1 in CONTEXT_CONTROL. 2. use meta data(DE and CE) read from CSA in WRITE_DATA. Add functions to save the location the first time IBs emitted and callback to patch the package when resubmission happens. Signed-off-by: Jiadong Zhu --- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 18 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 9 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c | 60 drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.h | 15 + 4 files changed, 102 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c index 7429b20257a6..12ba863e69f4 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c @@ -692,3 +692,21 @@ void amdgpu_ring_ib_end(struct amdgpu_ring *ring) if (ring->is_sw_ring) amdgpu_sw_ring_ib_end(ring); } + +void amdgpu_ring_ib_on_emit_cntl(struct amdgpu_ring *ring) +{ + if (ring->is_sw_ring) + amdgpu_sw_ring_ib_mark_offset(ring, AMDGPU_MUX_OFFSET_TYPE_CONTROL); +} + +void amdgpu_ring_ib_on_emit_ce(struct amdgpu_ring *ring) +{ + if (ring->is_sw_ring) + amdgpu_sw_ring_ib_mark_offset(ring, AMDGPU_MUX_OFFSET_TYPE_CE); +} + +void amdgpu_ring_ib_on_emit_de(struct amdgpu_ring *ring) +{ + if (ring->is_sw_ring) + amdgpu_sw_ring_ib_mark_offset(ring, AMDGPU_MUX_OFFSET_TYPE_DE); +} diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h index baa03527bf8b..702ce55b962a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h @@ -229,6 +229,9 @@ struct amdgpu_ring_funcs { int (*preempt_ib)(struct amdgpu_ring *ring); void (*emit_mem_sync)(struct amdgpu_ring *ring); void (*emit_wave_limit)(struct amdgpu_ring *ring, bool enable); + void (*patch_cntl)(struct amdgpu_ring *ring, unsigned offset); + void (*patch_ce)(struct amdgpu_ring *ring, unsigned offset); + void (*patch_de)(struct amdgpu_ring *ring, unsigned offset); }; struct amdgpu_ring { @@ -323,11 +326,17 @@ struct amdgpu_ring { #define amdgpu_ring_init_cond_exec(r) (r)->funcs->init_cond_exec((r)) #define amdgpu_ring_patch_cond_exec(r,o) (r)->funcs->patch_cond_exec((r),(o)) #define amdgpu_ring_preempt_ib(r) (r)->funcs->preempt_ib(r) +#define amdgpu_ring_patch_cntl(r, o) ((r)->funcs->patch_cntl((r), (o))) +#define amdgpu_ring_patch_ce(r, o) ((r)->funcs->patch_ce((r), (o))) +#define amdgpu_ring_patch_de(r, o) ((r)->funcs->patch_de((r), (o))) unsigned int amdgpu_ring_max_ibs(enum amdgpu_ring_type type); int amdgpu_ring_alloc(struct amdgpu_ring *ring, unsigned ndw); void amdgpu_ring_ib_begin(struct amdgpu_ring *ring); void amdgpu_ring_ib_end(struct amdgpu_ring *ring); +void amdgpu_ring_ib_on_emit_cntl(struct amdgpu_ring *ring); +void amdgpu_ring_ib_on_emit_ce(struct amdgpu_ring *ring); +void amdgpu_ring_ib_on_emit_de(struct amdgpu_ring *ring); void amdgpu_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count); void amdgpu_ring_generic_pad_ib(struct amdgpu_ring *ring, struct amdgpu_ib *ib); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c index 62079f0e3ee8..73516abef662 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c @@ -105,6 +105,16 @@ static void amdgpu_mux_resubmit_chunks(struct amdgpu_ring_mux *mux) amdgpu_fence_update_start_timestamp(e->ring, chunk->sync_seq, ktime_get()); + if (chunk->sync_seq == + le32_to_cpu(*(e->ring->fence_drv.cpu_addr + 2))) { + if (chunk->cntl_offset <= e->ring->buf_mask) + amdgpu_ring_patch_cntl(e->ring, + chunk->cntl_offset); + if (chunk->ce_offset <= e->ring->buf_mask) + amdgpu_ring_patch_ce(e->ring, chunk->ce_offset); + if (chunk->de_offset <= e->ring->buf_mask) + amdgpu_ring_patch_de(e->ring, chunk->de_offset); + } amdgpu_ring_mux_copy_pkt_from_sw_ring(mux, e->ring, chunk->start, chunk->end); @@ -407,6 +417,17 @@ void amdgpu_sw_ring_ib_end(struct amdgpu_
[PATCH 2/2] drm/amdgpu: Implement gfx9 patch functions for resubmission
From: Jiadong Zhu Patch the packages including CONTEXT_CONTROL and WRITE_DATA for gfx9 during the resubmission scenario. Signed-off-by: Jiadong Zhu --- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 80 +++ 1 file changed, 80 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c index cbcf6126cce5..4fbeb9b5752c 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c @@ -5172,9 +5172,83 @@ static void gfx_v9_0_ring_emit_ib_gfx(struct amdgpu_ring *ring, #endif lower_32_bits(ib->gpu_addr)); amdgpu_ring_write(ring, upper_32_bits(ib->gpu_addr)); + amdgpu_ring_ib_on_emit_cntl(ring); amdgpu_ring_write(ring, control); } +static void gfx_v9_0_ring_patch_cntl(struct amdgpu_ring *ring, +unsigned offset) +{ + u32 control = ring->ring[offset]; + + control |= INDIRECT_BUFFER_PRE_RESUME(1); + ring->ring[offset] = control; +} + +static void gfx_v9_0_ring_patch_ce_meta(struct amdgpu_ring *ring, + unsigned offset) +{ + struct amdgpu_device *adev = ring->adev; + void *ce_payload_cpu_addr; + uint64_t payload_offset, payload_size; + + payload_size = sizeof(struct v9_ce_ib_state); + + if (ring->is_mes_queue) { + payload_offset = offsetof(struct amdgpu_mes_ctx_meta_data, + gfx[0].gfx_meta_data) + + offsetof(struct v9_gfx_meta_data, ce_payload); + ce_payload_cpu_addr = + amdgpu_mes_ctx_get_offs_cpu_addr(ring, payload_offset); + } else { + payload_offset = offsetof(struct v9_gfx_meta_data, ce_payload); + ce_payload_cpu_addr = adev->virt.csa_cpu_addr + payload_offset; + } + + if (offset + (payload_size >> 2) <= ring->buf_mask + 1) { + memcpy((void *)&ring->ring[offset], ce_payload_cpu_addr, payload_size); + } else { + memcpy((void *)&ring->ring[offset], ce_payload_cpu_addr, + (ring->buf_mask + 1 - offset) << 2); + payload_size -= (ring->buf_mask + 1 - offset) << 2; + memcpy((void *)&ring->ring[0], + ce_payload_cpu_addr + ((ring->buf_mask + 1 - offset) << 2), + payload_size); + } +} + +static void gfx_v9_0_ring_patch_de_meta(struct amdgpu_ring *ring, + unsigned offset) +{ + struct amdgpu_device *adev = ring->adev; + void *de_payload_cpu_addr; + uint64_t payload_offset, payload_size; + + payload_size = sizeof(struct v9_de_ib_state); + + if (ring->is_mes_queue) { + payload_offset = offsetof(struct amdgpu_mes_ctx_meta_data, + gfx[0].gfx_meta_data) + + offsetof(struct v9_gfx_meta_data, de_payload); + de_payload_cpu_addr = + amdgpu_mes_ctx_get_offs_cpu_addr(ring, payload_offset); + } else { + payload_offset = offsetof(struct v9_gfx_meta_data, de_payload); + de_payload_cpu_addr = adev->virt.csa_cpu_addr + payload_offset; + } + + if (offset + (payload_size >> 2) <= ring->buf_mask + 1) { + memcpy((void *)&ring->ring[offset], de_payload_cpu_addr, payload_size); + } else { + memcpy((void *)&ring->ring[offset], de_payload_cpu_addr, + (ring->buf_mask + 1 - offset) << 2); + payload_size -= (ring->buf_mask + 1 - offset) << 2; + memcpy((void *)&ring->ring[0], + de_payload_cpu_addr + ((ring->buf_mask + 1 - offset) << 2), + payload_size); + } +} + static void gfx_v9_0_ring_emit_ib_compute(struct amdgpu_ring *ring, struct amdgpu_job *job, struct amdgpu_ib *ib, @@ -5370,6 +5444,8 @@ static void gfx_v9_0_ring_emit_ce_meta(struct amdgpu_ring *ring, bool resume) amdgpu_ring_write(ring, lower_32_bits(ce_payload_gpu_addr)); amdgpu_ring_write(ring, upper_32_bits(ce_payload_gpu_addr)); + amdgpu_ring_ib_on_emit_ce(ring); + if (resume) amdgpu_ring_write_multiple(ring, ce_payload_cpu_addr, sizeof(ce_payload) >> 2); @@ -5481,6 +5557,7 @@ static void gfx_v9_0_ring_emit_de_meta(struct amdgpu_ring *ring, bool resume, bo amdgpu_ring_write(ring, lower_32_bits(de_payload_gpu_addr)); amdgpu_ring_write(ring, upper_32_bits(de_payload_gpu_addr)); + amdgpu_ring_ib_on_emit_de(ring); if (resume) amdgpu_ring_write_multiple(ring, de_payload_cpu_addr, sizeof(de_payload) >> 2); @@ -6891,6 +6968,9 @@ stat
Re: [PATCH] drm/amdkfd: fix gfx_target_version for certain 11.0.3 devices
On 2023-05-25 16:12, Alex Deucher wrote: Certain boards with GC IP 11.0.3 need slightly different handling in the shader compiler due to board specific bounding box optimizations. Signed-off-by: Alex Deucher Acked-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_device.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c index 862a50f7b490..ebc3c3f965f9 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c @@ -411,8 +411,15 @@ struct kfd_dev *kgd2kfd_probe(struct amdgpu_device *adev, bool vf) f2g = &gfx_v11_kfd2kgd; break; case IP_VERSION(11, 0, 3): - /* Note: Compiler version is 11.0.1 while HW version is 11.0.3 */ - gfx_target_version = 110001; + if ((adev->pdev->device == 0x7460 && +adev->pdev->revision == 0x00) || + (adev->pdev->device == 0x7461 && +adev->pdev->revision == 0x00)) + /* Note: Compiler version is 11.0.5 while HW version is 11.0.3 */ + gfx_target_version = 110005; + else + /* Note: Compiler version is 11.0.1 while HW version is 11.0.3 */ + gfx_target_version = 110001; f2g = &gfx_v11_kfd2kgd; break; default:
[PATCH] drm/radeon: remove unused variable rbo
gcc with W=1 reports drivers/gpu/drm/radeon/radeon_ttm.c:200:27: error: variable ‘rbo’ set but not used [-Werror=unused-but-set-variable] 200 | struct radeon_bo *rbo; | ^~~ This variable is not used so remove it. Signed-off-by: Tom Rix --- drivers/gpu/drm/radeon/radeon_ttm.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c index 4eb83ccc4906..de4e6d78f1e1 100644 --- a/drivers/gpu/drm/radeon/radeon_ttm.c +++ b/drivers/gpu/drm/radeon/radeon_ttm.c @@ -197,7 +197,6 @@ static int radeon_bo_move(struct ttm_buffer_object *bo, bool evict, { struct ttm_resource *old_mem = bo->resource; struct radeon_device *rdev; - struct radeon_bo *rbo; int r; if (new_mem->mem_type == TTM_PL_TT) { @@ -210,7 +209,6 @@ static int radeon_bo_move(struct ttm_buffer_object *bo, bool evict, if (r) return r; - rbo = container_of(bo, struct radeon_bo, tbo); rdev = radeon_get_rdev(bo->bdev); if (!old_mem || (old_mem->mem_type == TTM_PL_SYSTEM && bo->ttm == NULL)) { -- 2.27.0
Re: [PATCH] drm/amdkfd: remove unused function get_reserved_sdma_queues_bitmap
[+Mukul] Looks like this problem was introduced by Mukul's patch "drm/amdkfd: Update SDMA queue management for GFX9.4.3". Could this be a merge error between GFX 9.4.3 and GFX11 branches? I think the reserved_sdma_queues_bitmap was introduced after the 9.4.3 branch was created. Mukul, you worked on both, so you're probably in the best position to resolve this. Regards, Felix On 2023-05-25 16:07, Tom Rix wrote: clang with W=1 reports drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager.c:122:24: error: unused function 'get_reserved_sdma_queues_bitmap' [-Werror,-Wunused-function] static inline uint64_t get_reserved_sdma_queues_bitmap(struct device_queue_manager *dqm) ^ This function is not used so remove it. Signed-off-by: Tom Rix --- drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 5 - 1 file changed, 5 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c index 493b4b66f180..2fbd0a96424f 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c @@ -119,11 +119,6 @@ unsigned int get_num_xgmi_sdma_queues(struct device_queue_manager *dqm) dqm->dev->kfd->device_info.num_sdma_queues_per_engine; } -static inline uint64_t get_reserved_sdma_queues_bitmap(struct device_queue_manager *dqm) -{ - return dqm->dev->kfd->device_info.reserved_sdma_queues_bitmap; -} - static void init_sdma_bitmaps(struct device_queue_manager *dqm) { bitmap_zero(dqm->sdma_bitmap, KFD_MAX_SDMA_QUEUES);
RE: [PATCH] drm/amdgpu: add the accelerator pcie class
[AMD Official Use Only - General] > -Original Message- > From: amd-gfx On Behalf Of > Christoph Hellwig > Sent: Thursday, May 25, 2023 5:47 AM > To: Alex Deucher > Cc: Christoph Hellwig ; bhelg...@google.com; amd- > g...@lists.freedesktop.org; Zhang, Morris ; linux- > p...@vger.kernel.org > Subject: Re: [PATCH] drm/amdgpu: add the accelerator pcie class > > On Tue, May 23, 2023 at 10:02:32AM -0400, Alex Deucher wrote: > > On Tue, May 23, 2023 at 5:25 AM Christoph Hellwig > wrote: > > > > > > On Tue, May 23, 2023 at 12:02:32PM +0800, Shiwu Zhang wrote: > > > > + { PCI_DEVICE(0x1002, PCI_ANY_ID), > > > > + .class = PCI_CLASS_ACCELERATOR_PROCESSING << 8, > > > > + .class_mask = 0xff, > > > > + .driver_data = CHIP_IP_DISCOVERY }, > > > > > > Probing for every single device of a given class for a single vendor > > > to a driver is just fundamentaly wrong. Please list the actual IDs > > > that the driver can handle. > > > > How so? The driver handles all devices of that class. We already do > > that for PCI_CLASS_DISPLAY_VGA and PCI_CLASS_DISPLAY_OTHER. Other > > drivers do similar things. > > How is that going to work in the long run? The chances of totally > incompatbile devices from the same vendor appearing is absolutely given. > We already handle this today for CLASS_DISPLAY via a data table provided on our hardware that details the components on the board. The driver can then determine whether or not that combination of components is supported. If the data table doesn't exist or isn’t parse-able, or the components enumerated are not supported, the driver doesn't load. Alex
Re: [PATCH 01/13] drm: execution context for GEM buffers v4
On 5/4/23 13:51, Christian König wrote: This adds the infrastructure for an execution context for GEM buffers which is similar to the existing TTMs execbuf util and intended to replace it in the long term. The basic functionality is that we abstracts the necessary loop to lock many different GEM buffers with automated deadlock and duplicate handling. v2: drop xarray and use dynamic resized array instead, the locking overhead is unecessary and measurable. v3: drop duplicate tracking, radeon is really the only one needing that. v4: fixes issues pointed out by Danilo, some typos in comments and a helper for lock arrays of GEM objects. Signed-off-by: Christian König Reviewed-by: Danilo Krummrich --- Documentation/gpu/drm-mm.rst | 12 ++ drivers/gpu/drm/Kconfig | 6 + drivers/gpu/drm/Makefile | 2 + drivers/gpu/drm/drm_exec.c | 278 +++ include/drm/drm_exec.h | 119 +++ 5 files changed, 417 insertions(+) create mode 100644 drivers/gpu/drm/drm_exec.c create mode 100644 include/drm/drm_exec.h diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst index a79fd3549ff8..a52e6f4117d6 100644 --- a/Documentation/gpu/drm-mm.rst +++ b/Documentation/gpu/drm-mm.rst @@ -493,6 +493,18 @@ DRM Sync Objects .. kernel-doc:: drivers/gpu/drm/drm_syncobj.c :export: +DRM Execution context += + +.. kernel-doc:: drivers/gpu/drm/drm_exec.c + :doc: Overview + +.. kernel-doc:: include/drm/drm_exec.h + :internal: + +.. kernel-doc:: drivers/gpu/drm/drm_exec.c + :export: + GPU Scheduler = diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index ba3fb04bb691..2dc81eb062eb 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -201,6 +201,12 @@ config DRM_TTM GPU memory types. Will be enabled automatically if a device driver uses it. +config DRM_EXEC + tristate + depends on DRM + help + Execution context for command submissions + config DRM_BUDDY tristate depends on DRM diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index a33257d2bc7f..9c6446eb3c83 100644 --- a/drivers/gpu/drm/Makefile +++ b/drivers/gpu/drm/Makefile @@ -78,6 +78,8 @@ obj-$(CONFIG_DRM_PANEL_ORIENTATION_QUIRKS) += drm_panel_orientation_quirks.o # # Memory-management helpers # +# +obj-$(CONFIG_DRM_EXEC) += drm_exec.o obj-$(CONFIG_DRM_BUDDY) += drm_buddy.o diff --git a/drivers/gpu/drm/drm_exec.c b/drivers/gpu/drm/drm_exec.c new file mode 100644 index ..18071bff20f4 --- /dev/null +++ b/drivers/gpu/drm/drm_exec.c @@ -0,0 +1,278 @@ +/* SPDX-License-Identifier: GPL-2.0 OR MIT */ + +#include +#include +#include + +/** + * DOC: Overview + * + * This component mainly abstracts the retry loop necessary for locking + * multiple GEM objects while preparing hardware operations (e.g. command + * submissions, page table updates etc..). + * + * If a contention is detected while locking a GEM object the cleanup procedure + * unlocks all previously locked GEM objects and locks the contended one first + * before locking any further objects. + * + * After an object is locked fences slots can optionally be reserved on the + * dma_resv object inside the GEM object. + * + * A typical usage pattern should look like this:: + * + * struct drm_gem_object *obj; + * struct drm_exec exec; + * unsigned long index; + * int ret; + * + * drm_exec_init(&exec, true); + * drm_exec_while_not_all_locked(&exec) { + * ret = drm_exec_prepare_obj(&exec, boA, 1); + * drm_exec_continue_on_contention(&exec); + * if (ret) + * goto error; + * + * ret = drm_exec_prepare_obj(&exec, boB, 1); + * drm_exec_continue_on_contention(&exec); + * if (ret) + * goto error; + * } + * + * drm_exec_for_each_locked_object(&exec, index, obj) { + * dma_resv_add_fence(obj->resv, fence, DMA_RESV_USAGE_READ); + * ... + * } + * drm_exec_fini(&exec); + * + * See struct dma_exec for more details. + */ + +/* Dummy value used to initially enter the retry loop */ +#define DRM_EXEC_DUMMY (void*)~0 + +/* Unlock all objects and drop references */ +static void drm_exec_unlock_all(struct drm_exec *exec) +{ + struct drm_gem_object *obj; + unsigned long index; + + drm_exec_for_each_locked_object(exec, index, obj) { + dma_resv_unlock(obj->resv); + drm_gem_object_put(obj); + } + + drm_gem_object_put(exec->prelocked); + exec->prelocked = NULL; +} + +/** + * drm_exec_init - initialize a drm_exec object + * @exec: the drm_exec object to initialize + * @interruptible: if locks should be acquired interruptible + * + * Initialize the object and make sure that we can track locked objects. + */ +void drm_exec_init(struct d
Re: [PATCH] drm/amdgpu: move gfx9_cs_data definition
On Thu, May 25, 2023 at 4:35 PM Tom Rix wrote: > > gcc with W=1 reports > In file included from drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:32: > drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h:939:36: error: > ‘gfx9_cs_data’ defined but not used [-Werror=unused-const-variable=] > 939 | static const struct cs_section_def gfx9_cs_data[] = { > |^~~~ > > gfx9_cs_data is only used in gfx_v9_0.c, so move its definition there. > > Signed-off-by: Tom Rix Already fixed with: https://patchwork.freedesktop.org/patch/539234/ which will show up in my tree momentarily. Alex > --- > drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h | 4 > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c| 5 + > 2 files changed, 5 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h > b/drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h > index 567a904804bc..6de4778789ed 100644 > --- a/drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h > +++ b/drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h > @@ -936,7 +936,3 @@ static const struct cs_extent_def > gfx9_SECT_CONTEXT_defs[] = > {gfx9_SECT_CONTEXT_def_8, 0xa2f5, 155 }, > { 0, 0, 0 } > }; > -static const struct cs_section_def gfx9_cs_data[] = { > -{ gfx9_SECT_CONTEXT_defs, SECT_CONTEXT }, > -{ 0, SECT_NONE } > -}; > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > index 8bf95a6b0767..c97a68a39d93 100644 > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > @@ -56,6 +56,11 @@ > #include "asic_reg/pwr/pwr_10_0_sh_mask.h" > #include "asic_reg/gc/gc_9_0_default.h" > > +static const struct cs_section_def gfx9_cs_data[] = { > +{ gfx9_SECT_CONTEXT_defs, SECT_CONTEXT }, > +{ 0, SECT_NONE } > +}; > + > #define GFX9_NUM_GFX_RINGS 1 > #define GFX9_NUM_SW_GFX_RINGS 2 > #define GFX9_MEC_HPD_SIZE 4096 > -- > 2.27.0 >
[PATCH] drm/amdgpu: move gfx9_cs_data definition
gcc with W=1 reports In file included from drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:32: drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h:939:36: error: ‘gfx9_cs_data’ defined but not used [-Werror=unused-const-variable=] 939 | static const struct cs_section_def gfx9_cs_data[] = { |^~~~ gfx9_cs_data is only used in gfx_v9_0.c, so move its definition there. Signed-off-by: Tom Rix --- drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h | 4 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c| 5 + 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h b/drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h index 567a904804bc..6de4778789ed 100644 --- a/drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h +++ b/drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h @@ -936,7 +936,3 @@ static const struct cs_extent_def gfx9_SECT_CONTEXT_defs[] = {gfx9_SECT_CONTEXT_def_8, 0xa2f5, 155 }, { 0, 0, 0 } }; -static const struct cs_section_def gfx9_cs_data[] = { -{ gfx9_SECT_CONTEXT_defs, SECT_CONTEXT }, -{ 0, SECT_NONE } -}; diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c index 8bf95a6b0767..c97a68a39d93 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c @@ -56,6 +56,11 @@ #include "asic_reg/pwr/pwr_10_0_sh_mask.h" #include "asic_reg/gc/gc_9_0_default.h" +static const struct cs_section_def gfx9_cs_data[] = { +{ gfx9_SECT_CONTEXT_defs, SECT_CONTEXT }, +{ 0, SECT_NONE } +}; + #define GFX9_NUM_GFX_RINGS 1 #define GFX9_NUM_SW_GFX_RINGS 2 #define GFX9_MEC_HPD_SIZE 4096 -- 2.27.0
Re: [PATCH 2/2] drm/amdgpu: Remove duplicate fdinfo fields
On Thu, May 25, 2023 at 11:52 AM Rob Clark wrote: > > From: Rob Clark > > Some of the fields that are handled by drm_show_fdinfo() crept back in > when rebasing the patch. Remove them again. > > Fixes: 376c25f8ca47 ("drm/amdgpu: Switch to fdinfo helper") > Signed-off-by: Rob Clark Series is: Reviewed-by: > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 3 --- > 1 file changed, 3 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c > index 13d7413d4ca3..a93e5627901a 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c > @@ -80,23 +80,20 @@ void amdgpu_show_fdinfo(struct drm_printer *p, struct > drm_file *file) > > amdgpu_ctx_mgr_usage(&fpriv->ctx_mgr, usage); > > /* > * ** > * For text output format description please see drm-usage-stats.rst! > * ** > */ > > drm_printf(p, "pasid:\t%u\n", fpriv->vm.pasid); > - drm_printf(p, "drm-driver:\t%s\n", file->minor->dev->driver->name); > - drm_printf(p, "drm-pdev:\t%04x:%02x:%02x.%d\n", domain, bus, dev, fn); > - drm_printf(p, "drm-client-id:\t%Lu\n", vm->immediate.fence_context); > drm_printf(p, "drm-memory-vram:\t%llu KiB\n", stats.vram/1024UL); > drm_printf(p, "drm-memory-gtt: \t%llu KiB\n", stats.gtt/1024UL); > drm_printf(p, "drm-memory-cpu: \t%llu KiB\n", stats.cpu/1024UL); > drm_printf(p, "amd-memory-visible-vram:\t%llu KiB\n", >stats.visible_vram/1024UL); > drm_printf(p, "amd-evicted-vram:\t%llu KiB\n", >stats.evicted_vram/1024UL); > drm_printf(p, "amd-evicted-visible-vram:\t%llu KiB\n", >stats.evicted_visible_vram/1024UL); > drm_printf(p, "amd-requested-vram:\t%llu KiB\n", > -- > 2.40.1 >
Re: [PATCH] drm/amdgpu: Fix up kdoc in amdgpu_acpi.c
On Thu, May 25, 2023 at 2:03 PM Srinivasan Shanmugam wrote: > > Fix these warnings by adding & deleting the deviant arguments. > > gcc with W=1 > drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:906: warning: Function parameter or > member 'numa_info' not described in 'amdgpu_acpi_get_node_id' > drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:906: warning: Excess function > parameter 'nid' description in 'amdgpu_acpi_get_node_id' > > Cc: Christian König > Cc: Alex Deucher > Signed-off-by: Srinivasan Shanmugam Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c > index b050d462b2f3..3a6b2e2089f6 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c > @@ -894,7 +894,7 @@ static struct amdgpu_numa_info > *amdgpu_acpi_get_numa_info(uint32_t pxm) > * acpi device handle > * > * @handle: acpi handle > - * @nid: NUMA Node id returned by the platform firmware > + * @numa_info: amdgpu_numa_info structure holding numa information > * > * Queries the ACPI interface to fetch the corresponding NUMA Node ID for a > * given amdgpu acpi device. > -- > 2.25.1 >
Re: [PATCH] drm/amdkfd: remove unused function get_reserved_sdma_queues_bitmap
On Thu, May 25, 2023 at 04:07:59PM -0400, Tom Rix wrote: > clang with W=1 reports > drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager.c:122:24: error: > unused function 'get_reserved_sdma_queues_bitmap' > [-Werror,-Wunused-function] > static inline uint64_t get_reserved_sdma_queues_bitmap(struct > device_queue_manager *dqm) >^ > This function is not used so remove it. > > Signed-off-by: Tom Rix Caused by commit 09a95a85cf3e ("drm/amdkfd: Update SDMA queue management for GFX9.4.3") it seems. You can actually go a step farther and remove the reserved_sdma_queues_bitmap member from 'struct kfd_device_info' because it is now only assigned, never read. $ git grep reserved_sdma_queues_bitmap next-20230525 next:20230525:drivers/gpu/drm/amd/amdkfd/kfd_device.c: kfd->device_info.reserved_sdma_queues_bitmap = 0xFULL; next:20230525:drivers/gpu/drm/amd/amdkfd/kfd_device.c: kfd->device_info.reserved_sdma_queues_bitmap = 0x3ULL; next:20230525:drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c:static inline uint64_t get_reserved_sdma_queues_bitmap(struct device_queue_manager *dqm) next:20230525:drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c:return dqm->dev->kfd->device_info.reserved_sdma_queues_bitmap; next:20230525:drivers/gpu/drm/amd/amdkfd/kfd_priv.h:uint64_t reserved_sdma_queues_bitmap; > --- > drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 5 - > 1 file changed, 5 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c > b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c > index 493b4b66f180..2fbd0a96424f 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c > @@ -119,11 +119,6 @@ unsigned int get_num_xgmi_sdma_queues(struct > device_queue_manager *dqm) > dqm->dev->kfd->device_info.num_sdma_queues_per_engine; > } > > -static inline uint64_t get_reserved_sdma_queues_bitmap(struct > device_queue_manager *dqm) > -{ > - return dqm->dev->kfd->device_info.reserved_sdma_queues_bitmap; > -} > - > static void init_sdma_bitmaps(struct device_queue_manager *dqm) > { > bitmap_zero(dqm->sdma_bitmap, KFD_MAX_SDMA_QUEUES); > -- > 2.27.0 >
[PATCH] drm/amdkfd: remove unused function get_reserved_sdma_queues_bitmap
clang with W=1 reports drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager.c:122:24: error: unused function 'get_reserved_sdma_queues_bitmap' [-Werror,-Wunused-function] static inline uint64_t get_reserved_sdma_queues_bitmap(struct device_queue_manager *dqm) ^ This function is not used so remove it. Signed-off-by: Tom Rix --- drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 5 - 1 file changed, 5 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c index 493b4b66f180..2fbd0a96424f 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c @@ -119,11 +119,6 @@ unsigned int get_num_xgmi_sdma_queues(struct device_queue_manager *dqm) dqm->dev->kfd->device_info.num_sdma_queues_per_engine; } -static inline uint64_t get_reserved_sdma_queues_bitmap(struct device_queue_manager *dqm) -{ - return dqm->dev->kfd->device_info.reserved_sdma_queues_bitmap; -} - static void init_sdma_bitmaps(struct device_queue_manager *dqm) { bitmap_zero(dqm->sdma_bitmap, KFD_MAX_SDMA_QUEUES); -- 2.27.0
[PATCH] drm/amdkfd: fix gfx_target_version for certain 11.0.3 devices
Certain boards with GC IP 11.0.3 need slightly different handling in the shader compiler due to board specific bounding box optimizations. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdkfd/kfd_device.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c index 862a50f7b490..ebc3c3f965f9 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c @@ -411,8 +411,15 @@ struct kfd_dev *kgd2kfd_probe(struct amdgpu_device *adev, bool vf) f2g = &gfx_v11_kfd2kgd; break; case IP_VERSION(11, 0, 3): - /* Note: Compiler version is 11.0.1 while HW version is 11.0.3 */ - gfx_target_version = 110001; + if ((adev->pdev->device == 0x7460 && +adev->pdev->revision == 0x00) || + (adev->pdev->device == 0x7461 && +adev->pdev->revision == 0x00)) + /* Note: Compiler version is 11.0.5 while HW version is 11.0.3 */ + gfx_target_version = 110005; + else + /* Note: Compiler version is 11.0.1 while HW version is 11.0.3 */ + gfx_target_version = 110001; f2g = &gfx_v11_kfd2kgd; break; default: -- 2.40.1
Re: [PATCH] drm/amd/amdgpu: Fix up locking etc in amdgpu_debugfs_gprwave_ioctl()
Applied. Thanks! Alex On Thu, May 25, 2023 at 4:05 AM Dan Carpenter wrote: > > There are two bugs here. > 1) Drop the lock if copy_from_user() fails. > 2) If the copy fails then the correct error code is -EFAULT instead of >-EINVAL. > > I also broke up the long line and changed "sizeof rd->id" to > "sizeof(rd->id)". > > Fixes: 164fb2940933 ("drm/amd/amdgpu: Update debugfs for XCC support (v3)") > Signed-off-by: Dan Carpenter > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 9 + > 1 file changed, 5 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c > index c657bed350ac..56e89e76ff17 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c > @@ -478,15 +478,16 @@ static ssize_t amdgpu_debugfs_gprwave_read(struct file > *f, char __user *buf, siz > static long amdgpu_debugfs_gprwave_ioctl(struct file *f, unsigned int cmd, > unsigned long data) > { > struct amdgpu_debugfs_gprwave_data *rd = f->private_data; > - int r; > + int r = 0; > > mutex_lock(&rd->lock); > > switch (cmd) { > case AMDGPU_DEBUGFS_GPRWAVE_IOC_SET_STATE: > - r = copy_from_user(&rd->id, (struct > amdgpu_debugfs_gprwave_iocdata *)data, sizeof rd->id); > - if (r) > - return r ? -EINVAL : 0; > + if (copy_from_user(&rd->id, > + (struct amdgpu_debugfs_gprwave_iocdata > *)data, > + sizeof(rd->id))) > + r = -EFAULT; > goto done; > default: > r = -EINVAL; > -- > 2.39.2 >
[PATCH v4 05/13] drm/connector: Print connector colorspace in state debugfs
v3: Fix kerneldocs (kernel test robot) v4: Avoid returning NULL from drm_get_colorspace_name Signed-off-by: Harry Wentland Cc: Pekka Paalanen Cc: Sebastian Wick Cc: vitaly.pros...@amd.com Cc: Uma Shankar Cc: Ville Syrjälä Cc: Joshua Ashton Cc: Jani Nikula Cc: Simon Ser Cc: Ville Syrjälä Cc: Melissa Wen Cc: dri-de...@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org --- drivers/gpu/drm/drm_atomic.c| 1 + drivers/gpu/drm/drm_connector.c | 15 +++ include/drm/drm_connector.h | 1 + 3 files changed, 17 insertions(+) diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c index c0dc5858a723..d6d04c4ccfc0 100644 --- a/drivers/gpu/drm/drm_atomic.c +++ b/drivers/gpu/drm/drm_atomic.c @@ -1071,6 +1071,7 @@ static void drm_atomic_connector_print_state(struct drm_printer *p, drm_printf(p, "\tcrtc=%s\n", state->crtc ? state->crtc->name : "(null)"); drm_printf(p, "\tself_refresh_aware=%d\n", state->self_refresh_aware); drm_printf(p, "\tmax_requested_bpc=%d\n", state->max_requested_bpc); + drm_printf(p, "\tcolorspace=%s\n", drm_get_colorspace_name(state->colorspace)); if (connector->connector_type == DRM_MODE_CONNECTOR_WRITEBACK) if (state->writeback_job && state->writeback_job->fb) diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c index 8d24a5da4076..69480385eaf3 100644 --- a/drivers/gpu/drm/drm_connector.c +++ b/drivers/gpu/drm/drm_connector.c @@ -1048,6 +1048,21 @@ static const char * const colorspace_names[] = { [DRM_MODE_COLORIMETRY_BT601_YCC] = "BT601_YCC", }; +/** + * drm_get_colorspace_name - return a string for color encoding + * @colorspace: color space to compute name of + * + * In contrast to the other drm_get_*_name functions this one here returns a + * const pointer and hence is threadsafe. + */ +const char *drm_get_colorspace_name(enum drm_colorspace colorspace) +{ + if (colorspace < ARRAY_SIZE(colorspace_names) && colorspace_names[colorspace]) + return colorspace_names[colorspace]; + else + return "(null)"; +} + static const u32 hdmi_colorspaces = BIT(DRM_MODE_COLORIMETRY_SMPTE_170M_YCC) | BIT(DRM_MODE_COLORIMETRY_BT709_YCC) | diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h index 565311e194da..ae0b1ee5b99a 100644 --- a/include/drm/drm_connector.h +++ b/include/drm/drm_connector.h @@ -1980,6 +1980,7 @@ void drm_connector_list_iter_end(struct drm_connector_list_iter *iter); bool drm_connector_has_possible_encoder(struct drm_connector *connector, struct drm_encoder *encoder); +const char *drm_get_colorspace_name(enum drm_colorspace colorspace); /** * drm_for_each_connector_iter - connector_list iterator macro -- 2.40.1
[PATCH v4 11/13] drm/amd/display: Always set crtcinfo from create_stream_for_sink
From: Joshua Ashton Given that we always pass dm_state into here now, this won't ever trigger anymore. This is needed for we will always fail mode validation with invalid clocks or link bandwidth errors. Signed-off-by: Joshua Ashton Signed-off-by: Harry Wentland Cc: Pekka Paalanen Cc: Sebastian Wick Cc: vitaly.pros...@amd.com Cc: Joshua Ashton Cc: Simon Ser Cc: Melissa Wen Cc: dri-de...@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org Reviewed-By: Harry Wentland --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index a8de26f09806..4e96a34148cc 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -6046,7 +6046,7 @@ create_stream_for_sink(struct amdgpu_dm_connector *aconnector, if (recalculate_timing) drm_mode_set_crtcinfo(&saved_mode, 0); - else if (!dm_state) + else drm_mode_set_crtcinfo(&mode, 0); /* -- 2.40.1
[PATCH v4 13/13] drm/amd/display: Refactor avi_info_frame colorimetry determination
From: Joshua Ashton Replace the messy two if-else chains here that were on the same value with a switch on the enum. Signed-off-by: Joshua Ashton Signed-off-by: Harry Wentland Cc: Pekka Paalanen Cc: Sebastian Wick Cc: vitaly.pros...@amd.com Cc: Joshua Ashton Cc: Simon Ser Cc: Melissa Wen Cc: dri-de...@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org Reviewed-by: Harry Wentland --- .../gpu/drm/amd/display/dc/core/dc_resource.c | 28 +++ 1 file changed, 17 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c index 7e1e5532f88f..ac3062abec51 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c @@ -3015,23 +3015,29 @@ static void set_avi_info_frame( hdmi_info.bits.S0_S1 = scan_type; /* C0, C1 : Colorimetry */ - if (color_space == COLOR_SPACE_YCBCR709 || - color_space == COLOR_SPACE_YCBCR709_LIMITED) + switch (color_space) { + case COLOR_SPACE_YCBCR709: + case COLOR_SPACE_YCBCR709_LIMITED: hdmi_info.bits.C0_C1 = COLORIMETRY_ITU709; - else if (color_space == COLOR_SPACE_YCBCR601 || - color_space == COLOR_SPACE_YCBCR601_LIMITED) + break; + case COLOR_SPACE_YCBCR601: + case COLOR_SPACE_YCBCR601_LIMITED: hdmi_info.bits.C0_C1 = COLORIMETRY_ITU601; - else { - hdmi_info.bits.C0_C1 = COLORIMETRY_NO_DATA; - } - if (color_space == COLOR_SPACE_2020_RGB_FULLRANGE || - color_space == COLOR_SPACE_2020_RGB_LIMITEDRANGE || - color_space == COLOR_SPACE_2020_YCBCR) { + break; + case COLOR_SPACE_2020_RGB_FULLRANGE: + case COLOR_SPACE_2020_RGB_LIMITEDRANGE: + case COLOR_SPACE_2020_YCBCR: hdmi_info.bits.EC0_EC2 = COLORIMETRYEX_BT2020RGBYCBCR; hdmi_info.bits.C0_C1 = COLORIMETRY_EXTENDED; - } else if (color_space == COLOR_SPACE_ADOBERGB) { + break; + case COLOR_SPACE_ADOBERGB: hdmi_info.bits.EC0_EC2 = COLORIMETRYEX_ADOBERGB; hdmi_info.bits.C0_C1 = COLORIMETRY_EXTENDED; + break; + case COLOR_SPACE_SRGB: + default: + hdmi_info.bits.C0_C1 = COLORIMETRY_NO_DATA; + break; } if (pixel_encoding && color_space == COLOR_SPACE_2020_YCBCR && -- 2.40.1
[PATCH v4 10/13] drm/amd/display: Send correct DP colorspace infopacket
Look at connector->colorimetry to determine output colorspace. We don't want to impact current SDR behavior, so DRM_MODE_COLORIMETRY_DEFAULT preserves current behavior. Also add support to explicitly set BT601 and BT709. v4: - Roll support for BT709 and BT601 into this patch - Add default case to avoid warnings for unhandled enum values Signed-off-by: Harry Wentland Cc: Pekka Paalanen Cc: Sebastian Wick Cc: vitaly.pros...@amd.com Cc: Joshua Ashton Cc: Simon Ser Cc: Melissa Wen Cc: dri-de...@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org --- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 48 --- 1 file changed, 31 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 5c290e6aac46..a8de26f09806 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -5330,21 +5330,44 @@ get_aspect_ratio(const struct drm_display_mode *mode_in) } static enum dc_color_space -get_output_color_space(const struct dc_crtc_timing *dc_crtc_timing) +get_output_color_space(const struct dc_crtc_timing *dc_crtc_timing, + const struct drm_connector_state *connector_state) { enum dc_color_space color_space = COLOR_SPACE_SRGB; - switch (dc_crtc_timing->pixel_encoding) { - case PIXEL_ENCODING_YCBCR422: - case PIXEL_ENCODING_YCBCR444: - case PIXEL_ENCODING_YCBCR420: - { + switch (connector_state->colorspace) { + case DRM_MODE_COLORIMETRY_BT601_YCC: + if (dc_crtc_timing->flags.Y_ONLY) + color_space = COLOR_SPACE_YCBCR601_LIMITED; + else + color_space = COLOR_SPACE_YCBCR601; + break; + case DRM_MODE_COLORIMETRY_BT709_YCC: + if (dc_crtc_timing->flags.Y_ONLY) + color_space = COLOR_SPACE_YCBCR709_LIMITED; + else + color_space = COLOR_SPACE_YCBCR709; + break; + case DRM_MODE_COLORIMETRY_OPRGB: + color_space = COLOR_SPACE_ADOBERGB; + break; + case DRM_MODE_COLORIMETRY_BT2020_RGB: + case DRM_MODE_COLORIMETRY_BT2020_YCC: + if (dc_crtc_timing->pixel_encoding == PIXEL_ENCODING_RGB) + color_space = COLOR_SPACE_2020_RGB_FULLRANGE; + else + color_space = COLOR_SPACE_2020_YCBCR; + break; + case DRM_MODE_COLORIMETRY_DEFAULT: // ITU601 + default: + if (dc_crtc_timing->pixel_encoding == PIXEL_ENCODING_RGB) { + color_space = COLOR_SPACE_SRGB; /* * 27030khz is the separation point between HDTV and SDTV * according to HDMI spec, we use YCbCr709 and YCbCr601 * respectively */ - if (dc_crtc_timing->pix_clk_100hz > 270300) { + } else if (dc_crtc_timing->pix_clk_100hz > 270300) { if (dc_crtc_timing->flags.Y_ONLY) color_space = COLOR_SPACE_YCBCR709_LIMITED; @@ -5357,15 +5380,6 @@ get_output_color_space(const struct dc_crtc_timing *dc_crtc_timing) else color_space = COLOR_SPACE_YCBCR601; } - - } - break; - case PIXEL_ENCODING_RGB: - color_space = COLOR_SPACE_SRGB; - break; - - default: - WARN_ON(1); break; } @@ -5504,7 +5518,7 @@ static void fill_stream_properties_from_drm_display_mode( } } - stream->output_color_space = get_output_color_space(timing_out); + stream->output_color_space = get_output_color_space(timing_out, connector_state); } static void fill_audio_info(struct audio_info *audio_info, -- 2.40.1
[PATCH v4 12/13] drm/amd/display: Add debugfs for testing output colorspace
In order to IGT test colorspace we'll want to print the currently enabled colorspace on a stream. We add a new debugfs to do so, using the same scheme as current bpc reporting. This might also come in handy when debugging display issues. v4: - Fix function doc comment - Fix sRGB debug print Signed-off-by: Harry Wentland Cc: Pekka Paalanen Cc: Sebastian Wick Cc: vitaly.pros...@amd.com Cc: Joshua Ashton Cc: Simon Ser Cc: Melissa Wen Cc: dri-de...@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org --- .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 57 +++ 1 file changed, 57 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c index 827fcb4fb3b3..9a885e2effec 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c @@ -906,6 +906,61 @@ static int amdgpu_current_bpc_show(struct seq_file *m, void *data) } DEFINE_SHOW_ATTRIBUTE(amdgpu_current_bpc); +/* + * Returns the current colorspace for the crtc. + * Example usage: cat /sys/kernel/debug/dri/0/crtc-0/amdgpu_current_colorspace + */ +static int amdgpu_current_colorspace_show(struct seq_file *m, void *data) +{ + struct drm_crtc *crtc = m->private; + struct drm_device *dev = crtc->dev; + struct dm_crtc_state *dm_crtc_state = NULL; + int res = -ENODEV; + + mutex_lock(&dev->mode_config.mutex); + drm_modeset_lock(&crtc->mutex, NULL); + if (crtc->state == NULL) + goto unlock; + + dm_crtc_state = to_dm_crtc_state(crtc->state); + if (dm_crtc_state->stream == NULL) + goto unlock; + + switch (dm_crtc_state->stream->output_color_space) { + case COLOR_SPACE_SRGB: + seq_printf(m, "sRGB"); + break; + case COLOR_SPACE_YCBCR601: + case COLOR_SPACE_YCBCR601_LIMITED: + seq_printf(m, "BT601_YCC"); + break; + case COLOR_SPACE_YCBCR709: + case COLOR_SPACE_YCBCR709_LIMITED: + seq_printf(m, "BT709_YCC"); + break; + case COLOR_SPACE_ADOBERGB: + seq_printf(m, "opRGB"); + break; + case COLOR_SPACE_2020_RGB_FULLRANGE: + seq_printf(m, "BT2020_RGB"); + break; + case COLOR_SPACE_2020_YCBCR: + seq_printf(m, "BT2020_YCC"); + break; + default: + goto unlock; + } + res = 0; + +unlock: + drm_modeset_unlock(&crtc->mutex); + mutex_unlock(&dev->mode_config.mutex); + + return res; +} +DEFINE_SHOW_ATTRIBUTE(amdgpu_current_colorspace); + + /* * Example usage: * Disable dsc passthrough, i.e.,: have dsc decoding at converver, not external RX @@ -3246,6 +3301,8 @@ void crtc_debugfs_init(struct drm_crtc *crtc) #endif debugfs_create_file("amdgpu_current_bpc", 0644, crtc->debugfs_entry, crtc, &amdgpu_current_bpc_fops); + debugfs_create_file("amdgpu_current_colorspace", 0644, crtc->debugfs_entry, + crtc, &amdgpu_current_colorspace_fops); } /* -- 2.40.1
[PATCH v4 08/13] drm/amd/display: Register Colorspace property for DP and HDMI
We want compositors to be able to set the output colorspace on DP and HDMI outputs, based on the caps reported from the receiver via EDID. Signed-off-by: Harry Wentland Cc: Pekka Paalanen Cc: Sebastian Wick Cc: vitaly.pros...@amd.com Cc: Joshua Ashton Cc: Simon Ser Cc: Ville Syrjälä Cc: Melissa Wen Cc: dri-de...@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 15 +++ 1 file changed, 15 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index ca093396d1ac..dc99a8ffac70 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -7238,6 +7238,12 @@ static int amdgpu_dm_connector_get_modes(struct drm_connector *connector) return amdgpu_dm_connector->num_modes; } +static const u32 supported_colorspaces = + BIT(DRM_MODE_COLORIMETRY_BT709_YCC) | + BIT(DRM_MODE_COLORIMETRY_OPRGB) | + BIT(DRM_MODE_COLORIMETRY_BT2020_RGB) | + BIT(DRM_MODE_COLORIMETRY_BT2020_YCC); + void amdgpu_dm_connector_init_helper(struct amdgpu_display_manager *dm, struct amdgpu_dm_connector *aconnector, int connector_type, @@ -7318,6 +7324,15 @@ void amdgpu_dm_connector_init_helper(struct amdgpu_display_manager *dm, adev->mode_info.abm_level_property, 0); } + if (connector_type == DRM_MODE_CONNECTOR_HDMIA) { + if (!drm_mode_create_hdmi_colorspace_property(&aconnector->base, supported_colorspaces)) + drm_connector_attach_colorspace_property(&aconnector->base); + } else if (connector_type == DRM_MODE_CONNECTOR_DisplayPort || + connector_type == DRM_MODE_CONNECTOR_eDP) { + if (!drm_mode_create_dp_colorspace_property(&aconnector->base, supported_colorspaces)) + drm_connector_attach_colorspace_property(&aconnector->base); + } + if (connector_type == DRM_MODE_CONNECTOR_HDMIA || connector_type == DRM_MODE_CONNECTOR_DisplayPort || connector_type == DRM_MODE_CONNECTOR_eDP) { -- 2.40.1
[PATCH v4 09/13] drm/amd/display: Signal mode_changed if colorspace changed
We need to signal mode_changed to make sure we update the output colorspace. v2: No need to call drm_hdmi_avi_infoframe_colorimetry as DC does its own infoframe packing. Signed-off-by: Harry Wentland Cc: Pekka Paalanen Cc: Sebastian Wick Cc: vitaly.pros...@amd.com Cc: Uma Shankar Cc: Joshua Ashton Cc: Simon Ser Cc: Melissa Wen Cc: dri-de...@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org Reviewed-by: Leo Li --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index dc99a8ffac70..5c290e6aac46 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -6691,6 +6691,14 @@ amdgpu_dm_connector_atomic_check(struct drm_connector *conn, if (!crtc) return 0; + if (new_con_state->colorspace != old_con_state->colorspace) { + new_crtc_state = drm_atomic_get_crtc_state(state, crtc); + if (IS_ERR(new_crtc_state)) + return PTR_ERR(new_crtc_state); + + new_crtc_state->mode_changed = true; + } + if (!drm_connector_atomic_hdr_metadata_equal(old_con_state, new_con_state)) { struct dc_info_packet hdr_infopacket; @@ -6713,7 +6721,7 @@ amdgpu_dm_connector_atomic_check(struct drm_connector *conn, * set is permissible, however. So only force a * modeset if we're entering or exiting HDR. */ - new_crtc_state->mode_changed = + new_crtc_state->mode_changed = new_crtc_state->mode_changed || !old_con_state->hdr_output_metadata || !new_con_state->hdr_output_metadata; } -- 2.40.1
[PATCH v4 07/13] drm/amd/display: Always pass connector_state to stream validation
We need the connector_state for colorspace and scaling information and can get it from connector->state. Signed-off-by: Harry Wentland Cc: Pekka Paalanen Cc: Sebastian Wick Cc: vitaly.pros...@amd.com Cc: Joshua Ashton Cc: Simon Ser Cc: Melissa Wen Cc: dri-de...@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index a69f4a39d92a..ca093396d1ac 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -5946,15 +5946,14 @@ create_stream_for_sink(struct amdgpu_dm_connector *aconnector, { struct drm_display_mode *preferred_mode = NULL; struct drm_connector *drm_connector; - const struct drm_connector_state *con_state = - dm_state ? &dm_state->base : NULL; + const struct drm_connector_state *con_state = &dm_state->base; struct dc_stream_state *stream = NULL; struct drm_display_mode mode; struct drm_display_mode saved_mode; struct drm_display_mode *freesync_mode = NULL; bool native_mode_found = false; bool recalculate_timing = false; - bool scale = dm_state ? (dm_state->scaling != RMX_OFF) : false; + bool scale = dm_state->scaling != RMX_OFF; int mode_refresh; int preferred_refresh = 0; enum color_transfer_func tf = TRANSFER_FUNC_UNKNOWN; @@ -6596,7 +6595,9 @@ enum drm_mode_status amdgpu_dm_connector_mode_valid(struct drm_connector *connec goto fail; } - stream = create_validate_stream_for_sink(aconnector, mode, NULL, NULL); + stream = create_validate_stream_for_sink(aconnector, mode, + to_dm_connector_state(connector->state), +NULL); if (stream) { dc_stream_release(stream); result = MODE_OK; -- 2.40.1
[PATCH v4 03/13] drm/connector: Pull out common create_colorspace_property code
Signed-off-by: Harry Wentland Cc: Pekka Paalanen Cc: Sebastian Wick Cc: vitaly.pros...@amd.com Cc: Uma Shankar Cc: Ville Syrjälä Cc: Joshua Ashton Cc: Jani Nikula Cc: Simon Ser Cc: Ville Syrjälä Cc: Melissa Wen Cc: dri-de...@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org --- drivers/gpu/drm/drm_connector.c | 54 - 1 file changed, 27 insertions(+), 27 deletions(-) diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c index 547356e00341..9c087d6f5691 100644 --- a/drivers/gpu/drm/drm_connector.c +++ b/drivers/gpu/drm/drm_connector.c @@ -1975,33 +1975,44 @@ EXPORT_SYMBOL(drm_mode_create_aspect_ratio_property); * drm_mode_create_dp_colorspace_property() is used for DP connector. */ -/** - * drm_mode_create_hdmi_colorspace_property - create hdmi colorspace property - * @connector: connector to create the Colorspace property on. - * - * Called by a driver the first time it's needed, must be attached to desired - * HDMI connectors. - * - * Returns: - * Zero on success, negative errno on failure. - */ -int drm_mode_create_hdmi_colorspace_property(struct drm_connector *connector) +static int drm_mode_create_colorspace_property(struct drm_connector *connector, + const struct drm_prop_enum_list *colorspaces, + int size) { struct drm_device *dev = connector->dev; if (connector->colorspace_property) return 0; + if (!colorspaces) + return 0; + connector->colorspace_property = drm_property_create_enum(dev, DRM_MODE_PROP_ENUM, "Colorspace", -hdmi_colorspaces, -ARRAY_SIZE(hdmi_colorspaces)); + colorspaces, + size); if (!connector->colorspace_property) return -ENOMEM; return 0; } +/** + * drm_mode_create_hdmi_colorspace_property - create hdmi colorspace property + * @connector: connector to create the Colorspace property on. + * + * Called by a driver the first time it's needed, must be attached to desired + * HDMI connectors. + * + * Returns: + * Zero on success, negative errno on failure. + */ +int drm_mode_create_hdmi_colorspace_property(struct drm_connector *connector) +{ + return drm_mode_create_colorspace_property(connector, + hdmi_colorspaces, + ARRAY_SIZE(hdmi_colorspaces)); +} EXPORT_SYMBOL(drm_mode_create_hdmi_colorspace_property); /** @@ -2016,20 +2027,9 @@ EXPORT_SYMBOL(drm_mode_create_hdmi_colorspace_property); */ int drm_mode_create_dp_colorspace_property(struct drm_connector *connector) { - struct drm_device *dev = connector->dev; - - if (connector->colorspace_property) - return 0; - - connector->colorspace_property = - drm_property_create_enum(dev, DRM_MODE_PROP_ENUM, "Colorspace", -dp_colorspaces, -ARRAY_SIZE(dp_colorspaces)); - - if (!connector->colorspace_property) - return -ENOMEM; - - return 0; + return drm_mode_create_colorspace_property(connector, + dp_colorspaces, + ARRAY_SIZE(dp_colorspaces)); } EXPORT_SYMBOL(drm_mode_create_dp_colorspace_property); -- 2.40.1
[PATCH v4 06/13] drm/connector: Allow drivers to pass list of supported colorspaces
Drivers might not support all colorspaces defined in dp_colorspaces and hdmi_colorspaces. This results in undefined behavior when userspace is setting an unsupported colorspace. Allow drivers to pass the list of supported colorspaces when creating the colorspace property. v2: - Use 0 to indicate support for all colorspaces (Jani) - Print drm_dbg_kms message when drivers pass 0 to signal that drivers should specify supported colorspaecs explicity (Jani) v3: - Move changes to create a common colorspace_names array to separate patch Signed-off-by: Harry Wentland Cc: Pekka Paalanen Cc: Sebastian Wick Cc: vitaly.pros...@amd.com Cc: Uma Shankar Cc: Ville Syrjälä Cc: Joshua Ashton Cc: Jani Nikula Cc: Simon Ser Cc: Ville Syrjälä Cc: Melissa Wen Cc: dri-de...@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org --- drivers/gpu/drm/drm_connector.c| 14 ++ drivers/gpu/drm/i915/display/intel_connector.c | 4 ++-- drivers/gpu/drm/vc4/vc4_hdmi.c | 2 +- include/drm/drm_connector.h| 7 +-- 4 files changed, 18 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c index 69480385eaf3..b63b3e3168a1 100644 --- a/drivers/gpu/drm/drm_connector.c +++ b/drivers/gpu/drm/drm_connector.c @@ -2045,9 +2045,12 @@ static int drm_mode_create_colorspace_property(struct drm_connector *connector, * Returns: * Zero on success, negative errno on failure. */ -int drm_mode_create_hdmi_colorspace_property(struct drm_connector *connector) +int drm_mode_create_hdmi_colorspace_property(struct drm_connector *connector, +u32 supported_colorspaces) { - return drm_mode_create_colorspace_property(connector, hdmi_colorspaces); + u32 colorspaces = supported_colorspaces & hdmi_colorspaces; + + return drm_mode_create_colorspace_property(connector, colorspaces); } EXPORT_SYMBOL(drm_mode_create_hdmi_colorspace_property); @@ -2061,9 +2064,12 @@ EXPORT_SYMBOL(drm_mode_create_hdmi_colorspace_property); * Returns: * Zero on success, negative errno on failure. */ -int drm_mode_create_dp_colorspace_property(struct drm_connector *connector) +int drm_mode_create_dp_colorspace_property(struct drm_connector *connector, + u32 supported_colorspaces) { - return drm_mode_create_colorspace_property(connector, dp_colorspaces); + u32 colorspaces = supported_colorspaces & dp_colorspaces; + + return drm_mode_create_colorspace_property(connector, colorspaces); } EXPORT_SYMBOL(drm_mode_create_dp_colorspace_property); diff --git a/drivers/gpu/drm/i915/display/intel_connector.c b/drivers/gpu/drm/i915/display/intel_connector.c index 6205ddd3ded0..e8b4a352a7a6 100644 --- a/drivers/gpu/drm/i915/display/intel_connector.c +++ b/drivers/gpu/drm/i915/display/intel_connector.c @@ -283,14 +283,14 @@ intel_attach_aspect_ratio_property(struct drm_connector *connector) void intel_attach_hdmi_colorspace_property(struct drm_connector *connector) { - if (!drm_mode_create_hdmi_colorspace_property(connector)) + if (!drm_mode_create_hdmi_colorspace_property(connector, 0)) drm_connector_attach_colorspace_property(connector); } void intel_attach_dp_colorspace_property(struct drm_connector *connector) { - if (!drm_mode_create_dp_colorspace_property(connector)) + if (!drm_mode_create_dp_colorspace_property(connector, 0)) drm_connector_attach_colorspace_property(connector); } diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c index 55744216392b..eee53e841701 100644 --- a/drivers/gpu/drm/vc4/vc4_hdmi.c +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c @@ -618,7 +618,7 @@ static int vc4_hdmi_connector_init(struct drm_device *dev, if (ret) return ret; - ret = drm_mode_create_hdmi_colorspace_property(connector); + ret = drm_mode_create_hdmi_colorspace_property(connector, 0); if (ret) return ret; diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h index ae0b1ee5b99a..abe775e1382f 100644 --- a/include/drm/drm_connector.h +++ b/include/drm/drm_connector.h @@ -30,6 +30,7 @@ #include #include #include +#include #include @@ -1896,8 +1897,10 @@ int drm_connector_attach_hdr_output_metadata_property(struct drm_connector *conn bool drm_connector_atomic_hdr_metadata_equal(struct drm_connector_state *old_state, struct drm_connector_state *new_state); int drm_mode_create_aspect_ratio_property(struct drm_device *dev); -int drm_mode_create_hdmi_colorspace_property(struct drm_connector *connector); -int drm_mode_create_dp_colorspace_property(struct drm_connector *connector); +int drm_mode_create_hdmi_colorspace_property(struct drm_connector *connector, +u32 supp
[PATCH v4 04/13] drm/connector: Use common colorspace_names array
We an use bitfields to track the support ones for HDMI and DP. This allows us to print colorspaces in a consistent manner without needing to know whether we're dealing with DP or HDMI. v4: - Rename _MAX to _COUNT and leave comment to indicate it's not a valid value - Fix misplaced function doc Signed-off-by: Harry Wentland Cc: Pekka Paalanen Cc: Sebastian Wick Cc: vitaly.pros...@amd.com Cc: Uma Shankar Cc: Ville Syrjälä Cc: Joshua Ashton Cc: Jani Nikula Cc: Simon Ser Cc: Ville Syrjälä Cc: Melissa Wen Cc: dri-de...@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org --- drivers/gpu/drm/drm_connector.c | 125 ++-- include/drm/drm_connector.h | 2 + 2 files changed, 74 insertions(+), 53 deletions(-) diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c index 9c087d6f5691..8d24a5da4076 100644 --- a/drivers/gpu/drm/drm_connector.c +++ b/drivers/gpu/drm/drm_connector.c @@ -1016,64 +1016,70 @@ static const struct drm_prop_enum_list drm_dp_subconnector_enum_list[] = { DRM_ENUM_NAME_FN(drm_get_dp_subconnector_name, drm_dp_subconnector_enum_list) -static const struct drm_prop_enum_list hdmi_colorspaces[] = { + +static const char * const colorspace_names[] = { /* For Default case, driver will set the colorspace */ - { DRM_MODE_COLORIMETRY_DEFAULT, "Default" }, + [DRM_MODE_COLORIMETRY_DEFAULT] = "Default", /* Standard Definition Colorimetry based on CEA 861 */ - { DRM_MODE_COLORIMETRY_SMPTE_170M_YCC, "SMPTE_170M_YCC" }, - { DRM_MODE_COLORIMETRY_BT709_YCC, "BT709_YCC" }, + [DRM_MODE_COLORIMETRY_SMPTE_170M_YCC] = "SMPTE_170M_YCC", + [DRM_MODE_COLORIMETRY_BT709_YCC] = "BT709_YCC", /* Standard Definition Colorimetry based on IEC 61966-2-4 */ - { DRM_MODE_COLORIMETRY_XVYCC_601, "XVYCC_601" }, + [DRM_MODE_COLORIMETRY_XVYCC_601] = "XVYCC_601", /* High Definition Colorimetry based on IEC 61966-2-4 */ - { DRM_MODE_COLORIMETRY_XVYCC_709, "XVYCC_709" }, + [DRM_MODE_COLORIMETRY_XVYCC_709] = "XVYCC_709", /* Colorimetry based on IEC 61966-2-1/Amendment 1 */ - { DRM_MODE_COLORIMETRY_SYCC_601, "SYCC_601" }, + [DRM_MODE_COLORIMETRY_SYCC_601] = "SYCC_601", /* Colorimetry based on IEC 61966-2-5 [33] */ - { DRM_MODE_COLORIMETRY_OPYCC_601, "opYCC_601" }, + [DRM_MODE_COLORIMETRY_OPYCC_601] = "opYCC_601", /* Colorimetry based on IEC 61966-2-5 */ - { DRM_MODE_COLORIMETRY_OPRGB, "opRGB" }, + [DRM_MODE_COLORIMETRY_OPRGB] = "opRGB", /* Colorimetry based on ITU-R BT.2020 */ - { DRM_MODE_COLORIMETRY_BT2020_CYCC, "BT2020_CYCC" }, + [DRM_MODE_COLORIMETRY_BT2020_CYCC] = "BT2020_CYCC", /* Colorimetry based on ITU-R BT.2020 */ - { DRM_MODE_COLORIMETRY_BT2020_RGB, "BT2020_RGB" }, + [DRM_MODE_COLORIMETRY_BT2020_RGB] = "BT2020_RGB", /* Colorimetry based on ITU-R BT.2020 */ - { DRM_MODE_COLORIMETRY_BT2020_YCC, "BT2020_YCC" }, + [DRM_MODE_COLORIMETRY_BT2020_YCC] = "BT2020_YCC", /* Added as part of Additional Colorimetry Extension in 861.G */ - { DRM_MODE_COLORIMETRY_DCI_P3_RGB_D65, "DCI-P3_RGB_D65" }, - { DRM_MODE_COLORIMETRY_DCI_P3_RGB_THEATER, "DCI-P3_RGB_Theater" }, + [DRM_MODE_COLORIMETRY_DCI_P3_RGB_D65] = "DCI-P3_RGB_D65", + [DRM_MODE_COLORIMETRY_DCI_P3_RGB_THEATER] = "DCI-P3_RGB_Theater", + [DRM_MODE_COLORIMETRY_RGB_WIDE_FIXED] = "RGB_WIDE_FIXED", + /* Colorimetry based on scRGB (IEC 61966-2-2) */ + [DRM_MODE_COLORIMETRY_RGB_WIDE_FLOAT] = "RGB_WIDE_FLOAT", + [DRM_MODE_COLORIMETRY_BT601_YCC] = "BT601_YCC", }; +static const u32 hdmi_colorspaces = + BIT(DRM_MODE_COLORIMETRY_SMPTE_170M_YCC) | + BIT(DRM_MODE_COLORIMETRY_BT709_YCC) | + BIT(DRM_MODE_COLORIMETRY_XVYCC_601) | + BIT(DRM_MODE_COLORIMETRY_XVYCC_709) | + BIT(DRM_MODE_COLORIMETRY_SYCC_601) | + BIT(DRM_MODE_COLORIMETRY_OPYCC_601) | + BIT(DRM_MODE_COLORIMETRY_OPRGB) | + BIT(DRM_MODE_COLORIMETRY_BT2020_CYCC) | + BIT(DRM_MODE_COLORIMETRY_BT2020_RGB) | + BIT(DRM_MODE_COLORIMETRY_BT2020_YCC) | + BIT(DRM_MODE_COLORIMETRY_DCI_P3_RGB_D65) | + BIT(DRM_MODE_COLORIMETRY_DCI_P3_RGB_THEATER); + /* * As per DP 1.4a spec, 2.2.5.7.5 VSC SDP Payload for Pixel Encoding/Colorimetry * Format Table 2-120 */ -static const struct drm_prop_enum_list dp_colorspaces[] = { - /* For Default case, driver will set the colorspace */ - { DRM_MODE_COLORIMETRY_DEFAULT, "Default" }, - { DRM_MODE_COLORIMETRY_RGB_WIDE_FIXED, "RGB_Wide_Gamut_Fixed_Point" }, - /* Colorimetry based on scRGB (IEC 61966-2-2) */ - { DRM_MODE_COLORIMETRY_RGB_WIDE_FLOAT, "RGB_Wide_Gamut_Floating_Point" }, - /* Colorimetry based on IEC 61966-2-5 */ - { DRM_MODE_COLORIMETRY_OPRGB, "opRGB" }, - /* Colorimetry based on SMPTE RP 431-2 */ - { DRM_MODE_COLORIMETRY_DCI_P
[PATCH v4 02/13] drm/connector: Add enum documentation to drm_colorspace
From: Joshua Ashton To match the other enums, and add more information about these values. v2: - Specify where an enum entry comes from - Clarify DEFAULT and NO_DATA behavior - BT.2020 CYCC is "constant luminance" - correct type for BT.601 v4: - drop DP/HDMI clarifications that might create more questions than answers Signed-off-by: Joshua Ashton Signed-off-by: Harry Wentland Reviewed-by: Harry Wentland Cc: Pekka Paalanen Cc: Sebastian Wick Cc: vitaly.pros...@amd.com Cc: Uma Shankar Cc: Ville Syrjälä Cc: Joshua Ashton Cc: Simon Ser Cc: Ville Syrjälä Cc: Melissa Wen Cc: dri-de...@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org --- include/drm/drm_connector.h | 62 +++-- 1 file changed, 60 insertions(+), 2 deletions(-) diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h index 77401e425341..ee597593d7e6 100644 --- a/include/drm/drm_connector.h +++ b/include/drm/drm_connector.h @@ -363,13 +363,71 @@ enum drm_privacy_screen_status { PRIVACY_SCREEN_ENABLED_LOCKED, }; -/* - * This is a consolidated colorimetry list supported by HDMI and +/** + * enum drm_colorspace - color space + * + * This enum is a consolidated colorimetry list supported by HDMI and * DP protocol standard. The respective connectors will register * a property with the subset of this list (supported by that * respective protocol). Userspace will set the colorspace through * a colorspace property which will be created and exposed to * userspace. + * + * DP definitions come from the DP v2.0 spec + * HDMI definitions come from the CTA-861-H spec + * + * @DRM_MODE_COLORIMETRY_DEFAULT: + * Driver specific behavior. + * @DRM_MODE_COLORIMETRY_NO_DATA: + * Driver specific behavior. + * @DRM_MODE_COLORIMETRY_SMPTE_170M_YCC: + * (HDMI) + * SMPTE ST 170M colorimetry format + * @DRM_MODE_COLORIMETRY_BT709_YCC: + * (HDMI, DP) + * ITU-R BT.709 colorimetry format + * @DRM_MODE_COLORIMETRY_XVYCC_601: + * (HDMI, DP) + * xvYCC601 colorimetry format + * @DRM_MODE_COLORIMETRY_XVYCC_709: + * (HDMI, DP) + * xvYCC709 colorimetry format + * @DRM_MODE_COLORIMETRY_SYCC_601: + * (HDMI, DP) + * sYCC601 colorimetry format + * @DRM_MODE_COLORIMETRY_OPYCC_601: + * (HDMI, DP) + * opYCC601 colorimetry format + * @DRM_MODE_COLORIMETRY_OPRGB: + * (HDMI, DP) + * opRGB colorimetry format + * @DRM_MODE_COLORIMETRY_BT2020_CYCC: + * (HDMI, DP) + * ITU-R BT.2020 Y'c C'bc C'rc (constant luminance) colorimetry format + * @DRM_MODE_COLORIMETRY_BT2020_RGB: + * (HDMI, DP) + * ITU-R BT.2020 R' G' B' colorimetry format + * @DRM_MODE_COLORIMETRY_BT2020_YCC: + * (HDMI, DP) + * ITU-R BT.2020 Y' C'b C'r colorimetry format + * @DRM_MODE_COLORIMETRY_DCI_P3_RGB_D65: + * (HDMI) + * SMPTE ST 2113 P3D65 colorimetry format + * @DRM_MODE_COLORIMETRY_DCI_P3_RGB_THEATER: + * (HDMI) + * SMPTE ST 2113 P3DCI colorimetry format + * @DRM_MODE_COLORIMETRY_RGB_WIDE_FIXED: + * (DP) + * RGB wide gamut fixed point colorimetry format + * @DRM_MODE_COLORIMETRY_RGB_WIDE_FLOAT: + * (DP) + * RGB wide gamut floating point + * (scRGB (IEC 61966-2-2)) colorimetry format + * @DRM_MODE_COLORIMETRY_BT601_YCC: + * (DP) + * ITU-R BT.601 colorimetry format + * The DP spec does not say whether this is the 525 or the 625 + * line version. */ enum drm_colorspace { /* For Default case, driver will set the colorspace */ -- 2.40.1
[PATCH v4 01/13] drm/connector: Convert DRM_MODE_COLORIMETRY to enum
This allows us to use strongly typed arguments. v2: - Bring NO_DATA back - Provide explicit enum values v3: - Drop unnecessary '&' from kerneldoc (emersion) v4: - Fix Normal Colorimetry comment Signed-off-by: Harry Wentland Reviewed-by: Simon Ser Cc: Pekka Paalanen Cc: Sebastian Wick Cc: vitaly.pros...@amd.com Cc: Uma Shankar Cc: Ville Syrjälä Cc: Joshua Ashton Cc: Simon Ser Cc: Ville Syrjälä Cc: Melissa Wen Cc: dri-de...@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org Reviewed-by: Pekka Paalanen --- include/drm/display/drm_dp.h | 2 +- include/drm/drm_connector.h | 49 ++-- 2 files changed, 26 insertions(+), 25 deletions(-) diff --git a/include/drm/display/drm_dp.h b/include/drm/display/drm_dp.h index f1be179c5f1f..7f858352cb43 100644 --- a/include/drm/display/drm_dp.h +++ b/include/drm/display/drm_dp.h @@ -1626,7 +1626,7 @@ enum dp_pixelformat { * * This enum is used to indicate DP VSC SDP Colorimetry formats. * It is based on DP 1.4 spec [Table 2-117: VSC SDP Payload for DB16 through - * DB18] and a name of enum member follows DRM_MODE_COLORIMETRY definition. + * DB18] and a name of enum member follows enum drm_colorimetry definition. * * @DP_COLORIMETRY_DEFAULT: sRGB (IEC 61966-2-1) or * ITU-R BT.601 colorimetry format diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h index 565cf9d3c550..77401e425341 100644 --- a/include/drm/drm_connector.h +++ b/include/drm/drm_connector.h @@ -371,29 +371,30 @@ enum drm_privacy_screen_status { * a colorspace property which will be created and exposed to * userspace. */ - -/* For Default case, driver will set the colorspace */ -#define DRM_MODE_COLORIMETRY_DEFAULT 0 -/* CEA 861 Normal Colorimetry options */ -#define DRM_MODE_COLORIMETRY_NO_DATA 0 -#define DRM_MODE_COLORIMETRY_SMPTE_170M_YCC1 -#define DRM_MODE_COLORIMETRY_BT709_YCC 2 -/* CEA 861 Extended Colorimetry Options */ -#define DRM_MODE_COLORIMETRY_XVYCC_601 3 -#define DRM_MODE_COLORIMETRY_XVYCC_709 4 -#define DRM_MODE_COLORIMETRY_SYCC_601 5 -#define DRM_MODE_COLORIMETRY_OPYCC_601 6 -#define DRM_MODE_COLORIMETRY_OPRGB 7 -#define DRM_MODE_COLORIMETRY_BT2020_CYCC 8 -#define DRM_MODE_COLORIMETRY_BT2020_RGB9 -#define DRM_MODE_COLORIMETRY_BT2020_YCC10 -/* Additional Colorimetry extension added as part of CTA 861.G */ -#define DRM_MODE_COLORIMETRY_DCI_P3_RGB_D6511 -#define DRM_MODE_COLORIMETRY_DCI_P3_RGB_THEATER12 -/* Additional Colorimetry Options added for DP 1.4a VSC Colorimetry Format */ -#define DRM_MODE_COLORIMETRY_RGB_WIDE_FIXED13 -#define DRM_MODE_COLORIMETRY_RGB_WIDE_FLOAT14 -#define DRM_MODE_COLORIMETRY_BT601_YCC 15 +enum drm_colorspace { + /* For Default case, driver will set the colorspace */ + DRM_MODE_COLORIMETRY_DEFAULT= 0, + /* CEA 861 Normal Colorimetry options */ + DRM_MODE_COLORIMETRY_NO_DATA= 0, + DRM_MODE_COLORIMETRY_SMPTE_170M_YCC = 1, + DRM_MODE_COLORIMETRY_BT709_YCC = 2, + /* CEA 861 Extended Colorimetry Options */ + DRM_MODE_COLORIMETRY_XVYCC_601 = 3, + DRM_MODE_COLORIMETRY_XVYCC_709 = 4, + DRM_MODE_COLORIMETRY_SYCC_601 = 5, + DRM_MODE_COLORIMETRY_OPYCC_601 = 6, + DRM_MODE_COLORIMETRY_OPRGB = 7, + DRM_MODE_COLORIMETRY_BT2020_CYCC= 8, + DRM_MODE_COLORIMETRY_BT2020_RGB = 9, + DRM_MODE_COLORIMETRY_BT2020_YCC = 10, + /* Additional Colorimetry extension added as part of CTA 861.G */ + DRM_MODE_COLORIMETRY_DCI_P3_RGB_D65 = 11, + DRM_MODE_COLORIMETRY_DCI_P3_RGB_THEATER = 12, + /* Additional Colorimetry Options added for DP 1.4a VSC Colorimetry Format */ + DRM_MODE_COLORIMETRY_RGB_WIDE_FIXED = 13, + DRM_MODE_COLORIMETRY_RGB_WIDE_FLOAT = 14, + DRM_MODE_COLORIMETRY_BT601_YCC = 15, +}; /** * enum drm_bus_flags - bus_flags info for &drm_display_info @@ -828,7 +829,7 @@ struct drm_connector_state { * colorspace change on Sink. This is most commonly used to switch * to wider color gamuts like BT2020. */ - u32 colorspace; + enum drm_colorspace colorspace; /** * @writeback_job: Writeback job for writeback connectors -- 2.40.1
[PATCH v4 00/13] Enable Colorspace connector property in amdgpu
This patchset is based on Joshua's previous patchset [1], as well as my previous patchset [2]. It is - enabling support for the colorspace property in amdgpu, as well as - allowing drivers to specify the supported set of colorspaces, and Colorspace, Infoframes, and YCbCr matrix --- Even though the initial intent of the colorspace property was to set the colorspace field in the respective HDMI AVI and DP SDP infoframes that is not sufficient in all scenarios. For DP the colorspace information also affects the MSA (main stream attribute) packet. For YUV output the colorspace affects the RGB-to-YCbCr conversion matrix. The colorspace field of the infopackets also depends on the encoding used, which is something that is decided by the driver and not known to userspace. For these reasons a driver will need to be able to select the supported colorspaces at property creation. Note: There seems to be an understanding that the colorspace property should ONLY modify the infoframe. While this is current behavior and sufficient in some cases it is nowhere specified that this should be the only use of this property. As outlined above this limitation is not going to work in all cases. This patchset does not affect current behavior for the drivers that implement this property: i915 and vc4. In the future we might want to give userspace control over the encoding format on the wire, in particular to avoid use of YUV420 when image fidelity is important. This work would likely go hand in hand with a min_bpc property and wouldn't conflict with the work done in this patchset. I would expect this future work to tag along with a drm_crtc or drm_connector's Color Pipeline, similar to the one propsed for drm_plane [3]. Colorspace on crtc or connector? There have been suggestions of programming 'colorspace' on the drm_crtc but I don't think the crtc is the right place for this property. The drm_plane and drm_crtc will be used to offload color processing that would normally be done via the GFX or other pipelines. The drm_connector controls the signalling with the display and ensures the wire format is appropriate for the encoding by programming the RGB-to-YCbCr matrix. [1] https://patchwork.freedesktop.org/series/113632/ [2] https://patchwork.freedesktop.org/series/111865/ [3] https://lists.freedesktop.org/archives/dri-devel/2023-May/403173.html v2: - Tested with DP and HDMI analyzers - Confirmed driver will fallback to lower bpc when needed - Dropped hunk to set HDMI AVI infoframe as it was a no-op - Fixed BT.2020 YCbCr colorimetry (JoshuaAshton) - Simplify initialization of supported colorspaces (Jani) - Fix kerneldoc (kernel test robot) v3: - Added documentation for colorspaces (Pekka, Joshua) - Split 'Allow drivers to pass list of supported colorspaces' patch to pull out code to create common colorspace array and keep it separate from change to create only supported colorspaces v4: - Don't "deprecate" existing enum values - Fixes based on review comments throughout - Dropped Josh's RBs Cc: Pekka Paalanen Cc: Sebastian Wick Cc: vitaly.pros...@amd.com Cc: Uma Shankar Cc: Ville Syrjälä Cc: Joshua Ashton Cc: Jani Nikula Cc: Michel Dänzer Cc: Simon Ser Cc: Melissa Wen Cc: dri-de...@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org Harry Wentland (10): drm/connector: Convert DRM_MODE_COLORIMETRY to enum drm/connector: Pull out common create_colorspace_property code drm/connector: Use common colorspace_names array drm/connector: Print connector colorspace in state debugfs drm/connector: Allow drivers to pass list of supported colorspaces drm/amd/display: Always pass connector_state to stream validation drm/amd/display: Register Colorspace property for DP and HDMI drm/amd/display: Signal mode_changed if colorspace changed drm/amd/display: Send correct DP colorspace infopacket drm/amd/display: Add debugfs for testing output colorspace Joshua Ashton (3): drm/connector: Add enum documentation to drm_colorspace drm/amd/display: Always set crtcinfo from create_stream_for_sink drm/amd/display: Refactor avi_info_frame colorimetry determination .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 84 ++--- .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 57 ++ .../gpu/drm/amd/display/dc/core/dc_resource.c | 28 +-- drivers/gpu/drm/drm_atomic.c | 1 + drivers/gpu/drm/drm_connector.c | 176 +++--- .../gpu/drm/i915/display/intel_connector.c| 4 +- drivers/gpu/drm/vc4/vc4_hdmi.c| 2 +- include/drm/display/drm_dp.h | 2 +- include/drm/drm_connector.h | 121 +--- 9 files changed, 341 insertions(+), 134 deletions(-) -- 2.40.1
[PATCH AUTOSEL 5.15 42/43] drm/amdgpu: skip disabling fence driver src_irqs when device is unplugged
From: Guchun Chen [ Upstream commit c1a322a7a4a96cd0a3dde32ce37af437a78bf8cd ] When performing device unbind or halt, we have disabled all irqs at the very begining like amdgpu_pci_remove or amdgpu_device_halt. So amdgpu_irq_put for irqs stored in fence driver should not be called any more, otherwise, below calltrace will arrive. [ 139.114088] WARNING: CPU: 2 PID: 1550 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:616 amdgpu_irq_put+0xf6/0x110 [amdgpu] [ 139.114655] Call Trace: [ 139.114655] [ 139.114657] amdgpu_fence_driver_hw_fini+0x93/0x130 [amdgpu] [ 139.114836] amdgpu_device_fini_hw+0xb6/0x350 [amdgpu] [ 139.114955] amdgpu_driver_unload_kms+0x51/0x70 [amdgpu] [ 139.115075] amdgpu_pci_remove+0x63/0x160 [amdgpu] [ 139.115193] ? __pm_runtime_resume+0x64/0x90 [ 139.115195] pci_device_remove+0x3a/0xb0 [ 139.115197] device_remove+0x43/0x70 [ 139.115198] device_release_driver_internal+0xbd/0x140 Signed-off-by: Guchun Chen Acked-by: Alex Deucher Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index bbd6f7a123033..8599e0ffa8292 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c @@ -561,7 +561,8 @@ void amdgpu_fence_driver_hw_fini(struct amdgpu_device *adev) if (r) amdgpu_fence_driver_force_completion(ring); - if (ring->fence_drv.irq_src) + if (!drm_dev_is_unplugged(adev_to_drm(adev)) && + ring->fence_drv.irq_src) amdgpu_irq_put(adev, ring->fence_drv.irq_src, ring->fence_drv.irq_type); -- 2.39.2
[PATCH AUTOSEL 6.1 54/57] drm/amdgpu: skip disabling fence driver src_irqs when device is unplugged
From: Guchun Chen [ Upstream commit c1a322a7a4a96cd0a3dde32ce37af437a78bf8cd ] When performing device unbind or halt, we have disabled all irqs at the very begining like amdgpu_pci_remove or amdgpu_device_halt. So amdgpu_irq_put for irqs stored in fence driver should not be called any more, otherwise, below calltrace will arrive. [ 139.114088] WARNING: CPU: 2 PID: 1550 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:616 amdgpu_irq_put+0xf6/0x110 [amdgpu] [ 139.114655] Call Trace: [ 139.114655] [ 139.114657] amdgpu_fence_driver_hw_fini+0x93/0x130 [amdgpu] [ 139.114836] amdgpu_device_fini_hw+0xb6/0x350 [amdgpu] [ 139.114955] amdgpu_driver_unload_kms+0x51/0x70 [amdgpu] [ 139.115075] amdgpu_pci_remove+0x63/0x160 [amdgpu] [ 139.115193] ? __pm_runtime_resume+0x64/0x90 [ 139.115195] pci_device_remove+0x3a/0xb0 [ 139.115197] device_remove+0x43/0x70 [ 139.115198] device_release_driver_internal+0xbd/0x140 Signed-off-by: Guchun Chen Acked-by: Alex Deucher Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index 3cc1929285fc0..ed6878d5b3ce3 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c @@ -528,7 +528,8 @@ void amdgpu_fence_driver_hw_fini(struct amdgpu_device *adev) if (r) amdgpu_fence_driver_force_completion(ring); - if (ring->fence_drv.irq_src) + if (!drm_dev_is_unplugged(adev_to_drm(adev)) && + ring->fence_drv.irq_src) amdgpu_irq_put(adev, ring->fence_drv.irq_src, ring->fence_drv.irq_type); -- 2.39.2
[PATCH AUTOSEL 6.3 64/67] drm/amdgpu: skip disabling fence driver src_irqs when device is unplugged
From: Guchun Chen [ Upstream commit c1a322a7a4a96cd0a3dde32ce37af437a78bf8cd ] When performing device unbind or halt, we have disabled all irqs at the very begining like amdgpu_pci_remove or amdgpu_device_halt. So amdgpu_irq_put for irqs stored in fence driver should not be called any more, otherwise, below calltrace will arrive. [ 139.114088] WARNING: CPU: 2 PID: 1550 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:616 amdgpu_irq_put+0xf6/0x110 [amdgpu] [ 139.114655] Call Trace: [ 139.114655] [ 139.114657] amdgpu_fence_driver_hw_fini+0x93/0x130 [amdgpu] [ 139.114836] amdgpu_device_fini_hw+0xb6/0x350 [amdgpu] [ 139.114955] amdgpu_driver_unload_kms+0x51/0x70 [amdgpu] [ 139.115075] amdgpu_pci_remove+0x63/0x160 [amdgpu] [ 139.115193] ? __pm_runtime_resume+0x64/0x90 [ 139.115195] pci_device_remove+0x3a/0xb0 [ 139.115197] device_remove+0x43/0x70 [ 139.115198] device_release_driver_internal+0xbd/0x140 Signed-off-by: Guchun Chen Acked-by: Alex Deucher Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index f52d0ba91a770..a7d250809da99 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c @@ -582,7 +582,8 @@ void amdgpu_fence_driver_hw_fini(struct amdgpu_device *adev) if (r) amdgpu_fence_driver_force_completion(ring); - if (ring->fence_drv.irq_src) + if (!drm_dev_is_unplugged(adev_to_drm(adev)) && + ring->fence_drv.irq_src) amdgpu_irq_put(adev, ring->fence_drv.irq_src, ring->fence_drv.irq_type); -- 2.39.2
[PATCH] drm/amdgpu: Fix up kdoc in amdgpu_acpi.c
Fix these warnings by adding & deleting the deviant arguments. gcc with W=1 drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:906: warning: Function parameter or member 'numa_info' not described in 'amdgpu_acpi_get_node_id' drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:906: warning: Excess function parameter 'nid' description in 'amdgpu_acpi_get_node_id' Cc: Christian König Cc: Alex Deucher Signed-off-by: Srinivasan Shanmugam --- drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c index b050d462b2f3..3a6b2e2089f6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c @@ -894,7 +894,7 @@ static struct amdgpu_numa_info *amdgpu_acpi_get_numa_info(uint32_t pxm) * acpi device handle * * @handle: acpi handle - * @nid: NUMA Node id returned by the platform firmware + * @numa_info: amdgpu_numa_info structure holding numa information * * Queries the ACPI interface to fetch the corresponding NUMA Node ID for a * given amdgpu acpi device. -- 2.25.1
Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused
On Thu, May 25, 2023 at 9:42 AM Alex Deucher wrote: > > On Thu, May 25, 2023 at 12:29 PM Nathan Chancellor wrote: > > > > On Thu, May 25, 2023 at 12:26:56PM -0400, Luben Tuikov wrote: > > > On 2023-05-25 11:22, Nathan Chancellor wrote: > > > > On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote: > > > >> Silencing the compiler from below compilation error: > > > >> > > > >> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c:704:23: error: variable > > > >> 'mmhub_v1_8_mmea_err_status_reg' is not needed and will not be emitted > > > >> [-Werror,-Wunneeded-internal-declaration] > > > >> static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = { > > > >> ^ > > > >> 1 error generated. > > > >> > > > >> Mark the variable as __maybe_unused to make it clear to clang that this > > > >> is expected, so there is no more warning. > > > >> > > > >> Cc: Christian König > > > >> Cc: Lijo Lazar > > > >> Cc: Luben Tuikov > > > >> Cc: Alex Deucher > > > >> Signed-off-by: Srinivasan Shanmugam > > > > > > > > Traditionally, this attribute would go between the [] and =, but that is > > > > a nit. Can someone please pick this up to unblock our builds on -next? > > > > > > > > Reviewed-by: Nathan Chancellor > > > > > > I'll pick this up, fix it, and submit to amd-staging-drm-next. > > > > Thanks a lot :) > > > > > Which -next are you referring to, Nathan? > > > > linux-next, this warning breaks the build when -Werror is enabled, such > > as with allmodconfig: > > > > https://storage.tuxsuite.com/public/clangbuiltlinux/continuous-integration2/builds/2QHtlCTz2JL0yXNpRB5hVmiP9lq/build.log > > > > Srinivasan has already pushed it. I'll push it out once CI has > completed. We are trying to figure out the best way to enable -WERROR > in our CI system as it is almost always broken depending on what > compiler you are using. Also, I'm not sure fixing these is always > better. A lot of these warnings seem spurious and in a lot of cases > the "fix" doesn't really improve the code, it just silences a warning. > As one of my coworkers put it, there is a reason warnings are not > errors. https://www.theregister.com/2021/09/08/compromise_linux_kernel_compiler_warnings/ > > Alex > > > > Cheers, > > Nathan > > > > > >> --- > > > >> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 1 + > > > >> 1 file changed, 1 insertion(+) > > > >> > > > >> diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c > > > >> b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c > > > >> index 3648994724c2..cba087e529c0 100644 > > > >> --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c > > > >> +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c > > > >> @@ -701,6 +701,7 @@ static void > > > >> mmhub_v1_8_reset_ras_error_count(struct amdgpu_device *adev) > > > >>mmhub_v1_8_inst_reset_ras_error_count(adev, i); > > > >> } > > > >> > > > >> +__maybe_unused > > > >> static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = { > > > >>regMMEA0_ERR_STATUS, > > > >>regMMEA1_ERR_STATUS, > > > >> -- > > > >> 2.25.1 > > > >> > > > > -- Thanks, ~Nick Desaulniers
Re: [PATCH] drm/amdgpu: Fix up kdoc in sdma_v4_4_2.c
On Thu, May 25, 2023 at 1:08 PM Srinivasan Shanmugam wrote: > > Address a bunch of kdoc warnings: > > gcc with W=1 > drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:426: warning: Function parameter or > member 'inst_mask' not described in 'sdma_v4_4_2_inst_gfx_stop' > drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:457: warning: Function parameter or > member 'inst_mask' not described in 'sdma_v4_4_2_inst_rlc_stop' > drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:470: warning: Function parameter or > member 'inst_mask' not described in 'sdma_v4_4_2_inst_page_stop' > drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:506: warning: Function parameter or > member 'inst_mask' not described in 'sdma_v4_4_2_inst_ctx_switch_enable' > drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:794: warning: Function parameter or > member 'inst_mask' not described in 'sdma_v4_4_2_inst_rlc_resume' > drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:810: warning: Function parameter or > member 'inst_mask' not described in 'sdma_v4_4_2_inst_load_microcode' > drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:854: warning: Function parameter or > member 'inst_mask' not described in 'sdma_v4_4_2_inst_start' I thought someone already landed a patch for this. If not, Reviewed-by: Alex Deucher > > Cc: Christian König > Cc: Alex Deucher > Signed-off-by: Srinivasan Shanmugam > --- > drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 7 +++ > 1 file changed, 7 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c > b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c > index ff41fb577cdd..8eebf9c2bbcd 100644 > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c > @@ -418,6 +418,7 @@ static void sdma_v4_4_2_ring_emit_fence(struct > amdgpu_ring *ring, u64 addr, u64 > * sdma_v4_4_2_inst_gfx_stop - stop the gfx async dma engines > * > * @adev: amdgpu_device pointer > + * @inst_mask: mask of dma engine instances to be disabled > * > * Stop the gfx async dma ring buffers. > */ > @@ -449,6 +450,7 @@ static void sdma_v4_4_2_inst_gfx_stop(struct > amdgpu_device *adev, > * sdma_v4_4_2_inst_rlc_stop - stop the compute async dma engines > * > * @adev: amdgpu_device pointer > + * @inst_mask: mask of dma engine instances to be disabled > * > * Stop the compute async dma queues. > */ > @@ -462,6 +464,7 @@ static void sdma_v4_4_2_inst_rlc_stop(struct > amdgpu_device *adev, > * sdma_v4_4_2_inst_page_stop - stop the page async dma engines > * > * @adev: amdgpu_device pointer > + * @inst_mask: mask of dma engine instances to be disabled > * > * Stop the page async dma ring buffers. > */ > @@ -498,6 +501,7 @@ static void sdma_v4_4_2_inst_page_stop(struct > amdgpu_device *adev, > * > * @adev: amdgpu_device pointer > * @enable: enable/disable the DMA MEs context switch. > + * @inst_mask: mask of dma engine instances to be enabled > * > * Halt or unhalt the async dma engines context switch. > */ > @@ -785,6 +789,7 @@ static void sdma_v4_4_2_init_pg(struct amdgpu_device > *adev) > * sdma_v4_4_2_inst_rlc_resume - setup and start the async dma engines > * > * @adev: amdgpu_device pointer > + * @inst_mask: mask of dma engine instances to be enabled > * > * Set up the compute DMA queues and enable them. > * Returns 0 for success, error for failure. > @@ -801,6 +806,7 @@ static int sdma_v4_4_2_inst_rlc_resume(struct > amdgpu_device *adev, > * sdma_v4_4_2_inst_load_microcode - load the sDMA ME ucode > * > * @adev: amdgpu_device pointer > + * @inst_mask: mask of dma engine instances to be enabled > * > * Loads the sDMA0/1 ucode. > * Returns 0 for success, -EINVAL if the ucode is not available. > @@ -845,6 +851,7 @@ static int sdma_v4_4_2_inst_load_microcode(struct > amdgpu_device *adev, > * sdma_v4_4_2_inst_start - setup and start the async dma engines > * > * @adev: amdgpu_device pointer > + * @inst_mask: mask of dma engine instances to be enabled > * > * Set up the DMA engines and enable them. > * Returns 0 for success, error for failure. > -- > 2.25.1 >
[PATCH] drm/amdgpu: Fix up kdoc in amdgpu_device.c
Fix these warnings by deleting the deviant arguments. gcc with W=1 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:799: warning: Excess function parameter 'pcie_index' description in 'amdgpu_device_indirect_wreg' drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:799: warning: Excess function parameter 'pcie_data' description in 'amdgpu_device_indirect_wreg' drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:870: warning: Excess function parameter 'pcie_index' description in 'amdgpu_device_indirect_wreg64' drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:870: warning: Excess function parameter 'pcie_data' description in 'amdgpu_device_indirect_wreg64' Cc: Christian König Cc: Alex Deucher Signed-off-by: Srinivasan Shanmugam --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 1 file changed, 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index c1e9ed26b7bf..301abfb7a0d3 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -788,8 +788,6 @@ u64 amdgpu_device_indirect_rreg64(struct amdgpu_device *adev, * amdgpu_device_indirect_wreg - write an indirect register address * * @adev: amdgpu_device pointer - * @pcie_index: mmio register offset - * @pcie_data: mmio register offset * @reg_addr: indirect register offset * @reg_data: indirect register data * @@ -859,8 +857,6 @@ void amdgpu_device_indirect_wreg_ext(struct amdgpu_device *adev, * amdgpu_device_indirect_wreg64 - write a 64bits indirect register address * * @adev: amdgpu_device pointer - * @pcie_index: mmio register offset - * @pcie_data: mmio register offset * @reg_addr: indirect register offset * @reg_data: indirect register data * -- 2.25.1
[PATCH 31/33] drm/amdkfd: add debug queue snapshot operation
Allow the debugger to get a snapshot of a specified number of queues containing various queue property information that is copied to the debugger. Since the debugger doesn't know how many queues exist at any given time, allow the debugger to pass the requested number of snapshots as 0 to get the actual number of potential snapshots to use for a subsequent snapshot request for actual information. To prevent future ABI breakage, pass in the requested entry_size. The KFD will return it's own entry_size in case the debugger still wants log the information in a core dump on sizing failure. Also allow the debugger to clear exceptions when doing a snapshot. Signed-off-by: Jonathan Kim Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 6 +++ .../drm/amd/amdkfd/kfd_device_queue_manager.c | 36 + .../drm/amd/amdkfd/kfd_device_queue_manager.h | 3 ++ drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 5 +++ .../amd/amdkfd/kfd_process_queue_manager.c| 40 +++ 5 files changed, 90 insertions(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 00aa844762b0..b24a73fd53af 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -3053,6 +3053,12 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, struct kfd_process *p, v &args->query_exception_info.info_size); break; case KFD_IOC_DBG_TRAP_GET_QUEUE_SNAPSHOT: + r = pqm_get_queue_snapshot(&target->pqm, + args->queue_snapshot.exception_mask, + (void __user *)args->queue_snapshot.snapshot_buf_ptr, + &args->queue_snapshot.num_queues, + &args->queue_snapshot.entry_size); + break; case KFD_IOC_DBG_TRAP_GET_DEVICE_SNAPSHOT: pr_warn("Debug op %i not supported yet\n", args->op); r = -EACCES; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c index 03fabe6e9cdb..9f52f8426ed1 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c @@ -3040,6 +3040,42 @@ int suspend_queues(struct kfd_process *p, return total_suspended; } +static uint32_t set_queue_type_for_user(struct queue_properties *q_props) +{ + switch (q_props->type) { + case KFD_QUEUE_TYPE_COMPUTE: + return q_props->format == KFD_QUEUE_FORMAT_PM4 + ? KFD_IOC_QUEUE_TYPE_COMPUTE + : KFD_IOC_QUEUE_TYPE_COMPUTE_AQL; + case KFD_QUEUE_TYPE_SDMA: + return KFD_IOC_QUEUE_TYPE_SDMA; + case KFD_QUEUE_TYPE_SDMA_XGMI: + return KFD_IOC_QUEUE_TYPE_SDMA_XGMI; + default: + WARN_ONCE(true, "queue type not recognized!"); + return 0x; + }; +} + +void set_queue_snapshot_entry(struct queue *q, + uint64_t exception_clear_mask, + struct kfd_queue_snapshot_entry *qss_entry) +{ + qss_entry->ring_base_address = q->properties.queue_address; + qss_entry->write_pointer_address = (uint64_t)q->properties.write_ptr; + qss_entry->read_pointer_address = (uint64_t)q->properties.read_ptr; + qss_entry->ctx_save_restore_address = + q->properties.ctx_save_restore_area_address; + qss_entry->ctx_save_restore_area_size = + q->properties.ctx_save_restore_area_size; + qss_entry->exception_status = q->properties.exception_status; + qss_entry->queue_id = q->properties.queue_id; + qss_entry->gpu_id = q->device->id; + qss_entry->ring_size = (uint32_t)q->properties.queue_size; + qss_entry->queue_type = set_queue_type_for_user(&q->properties); + q->properties.exception_status &= ~exception_clear_mask; +} + int debug_lock_and_unmap(struct device_queue_manager *dqm) { int r; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h index d4e6dbffe8c2..7dd4b177219d 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h @@ -300,6 +300,9 @@ int suspend_queues(struct kfd_process *p, int resume_queues(struct kfd_process *p, uint32_t num_queues, uint32_t *usr_queue_id_array); +void set_queue_snapshot_entry(struct queue *q, + uint64_t exception_clear_mask, + struct kfd_queue_snapshot_entry *qss_entry); int debug_lock_and_unmap(struct device_queue_manager *dqm); int debug_map_and_unlock(struct device_queue_manager *dqm
[PATCH 30/33] drm/amdkfd: add debug query exception info operation
Allow the debugger to query additional info based on an exception code. For device exceptions, it's currently only memory violation information. For process exceptions, it's currently only runtime information. Queue exception only report the queue exception status. The debugger has the option of clearing the target exception on query. Signed-off-by: Jonathan Kim Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 7 ++ drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 120 +++ drivers/gpu/drm/amd/amdkfd/kfd_debug.h | 6 ++ 3 files changed, 133 insertions(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index ebb2088d12fa..00aa844762b0 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -3045,6 +3045,13 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, struct kfd_process *p, v &args->query_debug_event.exception_mask); break; case KFD_IOC_DBG_TRAP_QUERY_EXCEPTION_INFO: + r = kfd_dbg_trap_query_exception_info(target, + args->query_exception_info.source_id, + args->query_exception_info.exception_code, + args->query_exception_info.clear_exception, + (void __user *)args->query_exception_info.info_ptr, + &args->query_exception_info.info_size); + break; case KFD_IOC_DBG_TRAP_GET_QUEUE_SNAPSHOT: case KFD_IOC_DBG_TRAP_GET_DEVICE_SNAPSHOT: pr_warn("Debug op %i not supported yet\n", args->op); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c index e9530e682e85..24e2b285448a 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c @@ -890,6 +890,126 @@ int kfd_dbg_trap_set_wave_launch_mode(struct kfd_process *target, return r; } +int kfd_dbg_trap_query_exception_info(struct kfd_process *target, + uint32_t source_id, + uint32_t exception_code, + bool clear_exception, + void __user *info, + uint32_t *info_size) +{ + bool found = false; + int r = 0; + uint32_t copy_size, actual_info_size = 0; + uint64_t *exception_status_ptr = NULL; + + if (!target) + return -EINVAL; + + if (!info || !info_size) + return -EINVAL; + + mutex_lock(&target->event_mutex); + + if (KFD_DBG_EC_TYPE_IS_QUEUE(exception_code)) { + /* Per queue exceptions */ + struct queue *queue = NULL; + int i; + + for (i = 0; i < target->n_pdds; i++) { + struct kfd_process_device *pdd = target->pdds[i]; + struct qcm_process_device *qpd = &pdd->qpd; + + list_for_each_entry(queue, &qpd->queues_list, list) { + if (!found && queue->properties.queue_id == source_id) { + found = true; + break; + } + } + if (found) + break; + } + + if (!found) { + r = -EINVAL; + goto out; + } + + if (!(queue->properties.exception_status & KFD_EC_MASK(exception_code))) { + r = -ENODATA; + goto out; + } + exception_status_ptr = &queue->properties.exception_status; + } else if (KFD_DBG_EC_TYPE_IS_DEVICE(exception_code)) { + /* Per device exceptions */ + struct kfd_process_device *pdd = NULL; + int i; + + for (i = 0; i < target->n_pdds; i++) { + pdd = target->pdds[i]; + if (pdd->dev->id == source_id) { + found = true; + break; + } + } + + if (!found) { + r = -EINVAL; + goto out; + } + + if (!(pdd->exception_status & KFD_EC_MASK(exception_code))) { + r = -ENODATA; + goto out; + } + + if (exception_code == EC_DEVICE_MEMORY_VIOLATION) { + copy_size = min((size_t)(*info_size), pdd->vm_fault_exc_data_size); + + if (copy_to_user(info, pdd->vm_fault_exc_data, copy_size)) { + r = -EFAULT; + goto out; + } + actual_info_size = pdd->vm_fault_exc_d
[PATCH 32/33] drm/amdkfd: add debug device snapshot operation
Similar to queue snapshot, return an array of device information using an entry_size check and return. Unlike queue snapshots, the debugger needs to pass to correct number of devices that exist. If it fails to do so, the KFD will return the number of actual devices so that the debugger can make a subsequent successful call. v2: add num_xcc to device snapshot and fixup new kfd_node reference Signed-off-by: Jonathan Kim --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 7 ++- drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 73 drivers/gpu/drm/amd/amdkfd/kfd_debug.h | 5 ++ 3 files changed, 83 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index b24a73fd53af..f522325b409b 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -3060,8 +3060,11 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, struct kfd_process *p, v &args->queue_snapshot.entry_size); break; case KFD_IOC_DBG_TRAP_GET_DEVICE_SNAPSHOT: - pr_warn("Debug op %i not supported yet\n", args->op); - r = -EACCES; + r = kfd_dbg_trap_device_snapshot(target, + args->device_snapshot.exception_mask, + (void __user *)args->device_snapshot.snapshot_buf_ptr, + &args->device_snapshot.num_devices, + &args->device_snapshot.entry_size); break; default: pr_err("Invalid option: %i\n", args->op); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c index 24e2b285448a..125274445f43 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c @@ -22,6 +22,7 @@ #include "kfd_debug.h" #include "kfd_device_queue_manager.h" +#include "kfd_topology.h" #include #include @@ -1010,6 +1011,78 @@ int kfd_dbg_trap_query_exception_info(struct kfd_process *target, return r; } +int kfd_dbg_trap_device_snapshot(struct kfd_process *target, + uint64_t exception_clear_mask, + void __user *user_info, + uint32_t *number_of_device_infos, + uint32_t *entry_size) +{ + struct kfd_dbg_device_info_entry device_info; + uint32_t tmp_entry_size = *entry_size, tmp_num_devices; + int i, r = 0; + + if (!(target && user_info && number_of_device_infos && entry_size)) + return -EINVAL; + + tmp_num_devices = min_t(size_t, *number_of_device_infos, target->n_pdds); + *number_of_device_infos = target->n_pdds; + *entry_size = min_t(size_t, *entry_size, sizeof(device_info)); + + if (!tmp_num_devices) + return 0; + + memset(&device_info, 0, sizeof(device_info)); + + mutex_lock(&target->event_mutex); + + /* Run over all pdd of the process */ + for (i = 0; i < tmp_num_devices; i++) { + struct kfd_process_device *pdd = target->pdds[i]; + struct kfd_topology_device *topo_dev = kfd_topology_device_by_id(pdd->dev->id); + + device_info.gpu_id = pdd->dev->id; + device_info.exception_status = pdd->exception_status; + device_info.lds_base = pdd->lds_base; + device_info.lds_limit = pdd->lds_limit; + device_info.scratch_base = pdd->scratch_base; + device_info.scratch_limit = pdd->scratch_limit; + device_info.gpuvm_base = pdd->gpuvm_base; + device_info.gpuvm_limit = pdd->gpuvm_limit; + device_info.location_id = topo_dev->node_props.location_id; + device_info.vendor_id = topo_dev->node_props.vendor_id; + device_info.device_id = topo_dev->node_props.device_id; + device_info.revision_id = pdd->dev->adev->pdev->revision; + device_info.subsystem_vendor_id = pdd->dev->adev->pdev->subsystem_vendor; + device_info.subsystem_device_id = pdd->dev->adev->pdev->subsystem_device; + device_info.fw_version = pdd->dev->kfd->mec_fw_version; + device_info.gfx_target_version = + topo_dev->node_props.gfx_target_version; + device_info.simd_count = topo_dev->node_props.simd_count; + device_info.max_waves_per_simd = + topo_dev->node_props.max_waves_per_simd; + device_info.array_count = topo_dev->node_props.array_count; + device_info.simd_arrays_per_engine = + topo_dev->node_props.simd_arrays_per_engine; + device_info.num_xcc = NUM_XCC(pdd->dev->xcc_mask); + device_info.capability = topo_dev->node_props.capability; + device_info.debug_prop = topo_dev->node_props.debug_
[PATCH 29/33] drm/amdkfd: add debug query event operation
Allow the debugger to query a single queue, device and process exception. The KFD should also return the GPU or Queue id of the exception. The debugger also has the option of clearing exceptions after being queried. Signed-off-by: Jonathan Kim Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 6 +++ drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 64 drivers/gpu/drm/amd/amdkfd/kfd_debug.h | 5 ++ 3 files changed, 75 insertions(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index e5d95b144dcd..ebb2088d12fa 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -3038,6 +3038,12 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, struct kfd_process *p, v r = kfd_dbg_trap_set_flags(target, &args->set_flags.flags); break; case KFD_IOC_DBG_TRAP_QUERY_DEBUG_EVENT: + r = kfd_dbg_ev_query_debug_event(target, + &args->query_debug_event.queue_id, + &args->query_debug_event.gpu_id, + args->query_debug_event.exception_mask, + &args->query_debug_event.exception_mask); + break; case KFD_IOC_DBG_TRAP_QUERY_EXCEPTION_INFO: case KFD_IOC_DBG_TRAP_GET_QUEUE_SNAPSHOT: case KFD_IOC_DBG_TRAP_GET_DEVICE_SNAPSHOT: diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c index 43c3170998d3..e9530e682e85 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c @@ -27,6 +27,70 @@ #define MAX_WATCH_ADDRESSES4 +int kfd_dbg_ev_query_debug_event(struct kfd_process *process, + unsigned int *queue_id, + unsigned int *gpu_id, + uint64_t exception_clear_mask, + uint64_t *event_status) +{ + struct process_queue_manager *pqm; + struct process_queue_node *pqn; + int i; + + if (!(process && process->debug_trap_enabled)) + return -ENODATA; + + mutex_lock(&process->event_mutex); + *event_status = 0; + *queue_id = 0; + *gpu_id = 0; + + /* find and report queue events */ + pqm = &process->pqm; + list_for_each_entry(pqn, &pqm->queues, process_queue_list) { + uint64_t tmp = process->exception_enable_mask; + + if (!pqn->q) + continue; + + tmp &= pqn->q->properties.exception_status; + + if (!tmp) + continue; + + *event_status = pqn->q->properties.exception_status; + *queue_id = pqn->q->properties.queue_id; + *gpu_id = pqn->q->device->id; + pqn->q->properties.exception_status &= ~exception_clear_mask; + goto out; + } + + /* find and report device events */ + for (i = 0; i < process->n_pdds; i++) { + struct kfd_process_device *pdd = process->pdds[i]; + uint64_t tmp = process->exception_enable_mask + & pdd->exception_status; + + if (!tmp) + continue; + + *event_status = pdd->exception_status; + *gpu_id = pdd->dev->id; + pdd->exception_status &= ~exception_clear_mask; + goto out; + } + + /* report process events */ + if (process->exception_enable_mask & process->exception_status) { + *event_status = process->exception_status; + process->exception_status &= ~exception_clear_mask; + } + +out: + mutex_unlock(&process->event_mutex); + return *event_status ? 0 : -EAGAIN; +} + void debug_event_write_work_handler(struct work_struct *work) { struct kfd_process *process; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h index ef8e9f7f1716..e78f954c0684 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h @@ -27,6 +27,11 @@ void kfd_dbg_trap_deactivate(struct kfd_process *target, bool unwind, int unwind_count); int kfd_dbg_trap_activate(struct kfd_process *target); +int kfd_dbg_ev_query_debug_event(struct kfd_process *process, + unsigned int *queue_id, + unsigned int *gpu_id, + uint64_t exception_clear_mask, + uint64_t *event_status); bool kfd_set_dbg_ev_from_interrupt(struct kfd_node *dev, unsigned int pasid, uint32_t doorbell_id, -- 2.25.1
[PATCH 22/33] drm/amdkfd: update process interrupt handling for debug events
The debugger must be notified by any debugger subscribed exception that comes from hardware interrupts. If a debugger session exits, any exceptions it subscribed to may still have interrupts in the interrupt ring buffer or KGD/KFD pipeline. To prevent a new session from inheriting stale interrupts, when a new queue is created, open an interrupt drain and allow the IH ring to drain from a timestamped checkpoint. Then inject a custom IV so that once the custom IV is picked up by the KFD, it's safe to close the drain and proceed with queue creation. The drain must also be on debug disable as SW interrupts may still be processed. Drain at this time and clear all the exception status. The debugger may also not be attached nor subscibed to certain exceptions so forward them directly to the runtime. GFX10 also requires its own IV processing, hence the creation of kfd_int_process_v10.c. This is because the IV from SQ interrupts are packed into a new continguous format unlike GFX9. To make this clear, a separate interrupting handling code file was created. v2: use new kfd_node struct in prototypes. Signed-off-by: Jonathan Kim --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c| 16 + drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h| 2 + drivers/gpu/drm/amd/amdkfd/Makefile | 1 + drivers/gpu/drm/amd/amdkfd/kfd_debug.c| 84 drivers/gpu/drm/amd/amdkfd/kfd_debug.h| 6 + drivers/gpu/drm/amd/amdkfd/kfd_device.c | 4 +- .../gpu/drm/amd/amdkfd/kfd_int_process_v10.c | 405 ++ .../gpu/drm/amd/amdkfd/kfd_int_process_v11.c | 21 +- .../gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 98 - drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 12 + drivers/gpu/drm/amd/amdkfd/kfd_process.c | 47 ++ .../amd/amdkfd/kfd_process_queue_manager.c| 4 + 12 files changed, 680 insertions(+), 20 deletions(-) create mode 100644 drivers/gpu/drm/amd/amdkfd/kfd_int_process_v10.c diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c index 66f80b9ab0c5..98cd52bb005f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c @@ -777,6 +777,22 @@ void amdgpu_amdkfd_ras_poison_consumption_handler(struct amdgpu_device *adev, bo amdgpu_umc_poison_handler(adev, reset); } +int amdgpu_amdkfd_send_close_event_drain_irq(struct amdgpu_device *adev, + uint32_t *payload) +{ + int ret; + + /* Device or IH ring is not ready so bail. */ + ret = amdgpu_ih_wait_on_checkpoint_process_ts(adev, &adev->irq.ih); + if (ret) + return ret; + + /* Send payload to fence KFD interrupts */ + amdgpu_amdkfd_interrupt(adev, payload); + + return 0; +} + bool amdgpu_amdkfd_ras_query_utcl2_poison_status(struct amdgpu_device *adev) { if (adev->gfx.ras && adev->gfx.ras->query_utcl2_poison_status) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h index 94cc456761e5..dd740e64e6e1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h @@ -250,6 +250,8 @@ int amdgpu_amdkfd_get_xgmi_bandwidth_mbytes(struct amdgpu_device *dst, struct amdgpu_device *src, bool is_min); int amdgpu_amdkfd_get_pcie_bandwidth_mbytes(struct amdgpu_device *adev, bool is_min); +int amdgpu_amdkfd_send_close_event_drain_irq(struct amdgpu_device *adev, + uint32_t *payload); /* Read user wptr from a specified user address space with page fault * disabled. The memory must be pinned and mapped to the hardware when diff --git a/drivers/gpu/drm/amd/amdkfd/Makefile b/drivers/gpu/drm/amd/amdkfd/Makefile index 747754428073..2ec8f27c5366 100644 --- a/drivers/gpu/drm/amd/amdkfd/Makefile +++ b/drivers/gpu/drm/amd/amdkfd/Makefile @@ -53,6 +53,7 @@ AMDKFD_FILES := $(AMDKFD_PATH)/kfd_module.o \ $(AMDKFD_PATH)/kfd_events.o \ $(AMDKFD_PATH)/cik_event_interrupt.o \ $(AMDKFD_PATH)/kfd_int_process_v9.o \ + $(AMDKFD_PATH)/kfd_int_process_v10.o \ $(AMDKFD_PATH)/kfd_int_process_v11.o \ $(AMDKFD_PATH)/kfd_smi_events.o \ $(AMDKFD_PATH)/kfd_crat.o \ diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c index 17e8e9edccbf..68b657398d41 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c @@ -125,6 +125,64 @@ bool kfd_dbg_ev_raise(uint64_t event_mask, return is_subscribed; } +/* set pending event queue entry from ring entry */ +bool kfd_set_dbg_ev_from_interrupt(struct kfd_node *dev, + unsigned int pasid, + uint32_t doorbell_id, +
[PATCH 23/33] drm/amdkfd: add debug set exceptions enabled operation
The debugger subscibes to nofication for requested exceptions on attach. Allow the debugger to change its subsciption later on. Signed-off-by: Jonathan Kim Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 3 ++ drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 36 drivers/gpu/drm/amd/amdkfd/kfd_debug.h | 2 ++ 3 files changed, 41 insertions(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 73cb5abce431..80d354eade35 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -2980,6 +2980,9 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, struct kfd_process *p, v args->send_runtime_event.exception_mask); break; case KFD_IOC_DBG_TRAP_SET_EXCEPTIONS_ENABLED: + kfd_dbg_set_enabled_debug_exception_mask(target, + args->set_exceptions_enabled.exception_mask); + break; case KFD_IOC_DBG_TRAP_SET_WAVE_LAUNCH_OVERRIDE: case KFD_IOC_DBG_TRAP_SET_WAVE_LAUNCH_MODE: case KFD_IOC_DBG_TRAP_SUSPEND_QUEUES: diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c index 68b657398d41..48a4e3cc2234 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c @@ -521,3 +521,39 @@ int kfd_dbg_trap_enable(struct kfd_process *target, uint32_t fd, return r; } + +void kfd_dbg_set_enabled_debug_exception_mask(struct kfd_process *target, + uint64_t exception_set_mask) +{ + uint64_t found_mask = 0; + struct process_queue_manager *pqm; + struct process_queue_node *pqn; + static const char write_data = '.'; + loff_t pos = 0; + int i; + + mutex_lock(&target->event_mutex); + + found_mask |= target->exception_status; + + pqm = &target->pqm; + list_for_each_entry(pqn, &pqm->queues, process_queue_list) { + if (!pqn) + continue; + + found_mask |= pqn->q->properties.exception_status; + } + + for (i = 0; i < target->n_pdds; i++) { + struct kfd_process_device *pdd = target->pdds[i]; + + found_mask |= pdd->exception_status; + } + + if (exception_set_mask & found_mask) + kernel_write(target->dbg_ev_file, &write_data, 1, &pos); + + target->exception_enable_mask = exception_set_mask; + + mutex_unlock(&target->event_mutex); +} diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h index 5153ccbd7fd1..6c1054a08872 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h @@ -56,6 +56,8 @@ static inline bool kfd_dbg_is_per_vmid_supported(struct kfd_node *dev) void debug_event_write_work_handler(struct work_struct *work); +void kfd_dbg_set_enabled_debug_exception_mask(struct kfd_process *target, + uint64_t exception_set_mask); /* * If GFX off is enabled, chips that do not support RLC restore for the debug * registers will disable GFX off temporarily for the entire debug session. -- 2.25.1
[PATCH 25/33] drm/amdkfd: add debug wave launch mode operation
Allow the debugger to set wave behaviour on to either normally operate, halt at launch, trap on every instruction, terminate immediately or stall on allocation. v2: fixup with new kfd_node struct reference for mes check Signed-off-by: Jonathan Kim --- .../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c | 12 +++ .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 1 + .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 25 + .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.h| 3 ++ .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c | 3 +- .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c| 14 +++- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 25 + .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.h | 3 ++ drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 3 ++ drivers/gpu/drm/amd/amdkfd/kfd_debug.c| 36 ++- drivers/gpu/drm/amd/amdkfd/kfd_debug.h| 2 ++ 11 files changed, 124 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c index d7881bbd828d..774ecfc3451a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c @@ -107,6 +107,17 @@ static uint32_t kgd_aldebaran_set_wave_launch_trap_override(struct amdgpu_device return data; } +static uint32_t kgd_aldebaran_set_wave_launch_mode(struct amdgpu_device *adev, + uint8_t wave_launch_mode, + uint32_t vmid) +{ + uint32_t data = 0; + + data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, LAUNCH_MODE, wave_launch_mode); + + return data; +} + const struct kfd2kgd_calls aldebaran_kfd2kgd = { .program_sh_mem_settings = kgd_gfx_v9_program_sh_mem_settings, .set_pasid_vmid_mapping = kgd_gfx_v9_set_pasid_vmid_mapping, @@ -129,6 +140,7 @@ const struct kfd2kgd_calls aldebaran_kfd2kgd = { .disable_debug_trap = kgd_aldebaran_disable_debug_trap, .validate_trap_override_request = kgd_aldebaran_validate_trap_override_request, .set_wave_launch_trap_override = kgd_aldebaran_set_wave_launch_trap_override, + .set_wave_launch_mode = kgd_aldebaran_set_wave_launch_mode, .get_iq_wait_times = kgd_gfx_v9_get_iq_wait_times, .build_grace_period_packet_info = kgd_gfx_v9_build_grace_period_packet_info, .program_trap_handler_settings = kgd_gfx_v9_program_trap_handler_settings, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c index ec2587664001..fbdc1b7b1e42 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c @@ -412,6 +412,7 @@ const struct kfd2kgd_calls arcturus_kfd2kgd = { .disable_debug_trap = kgd_arcturus_disable_debug_trap, .validate_trap_override_request = kgd_gfx_v9_validate_trap_override_request, .set_wave_launch_trap_override = kgd_gfx_v9_set_wave_launch_trap_override, + .set_wave_launch_mode = kgd_gfx_v9_set_wave_launch_mode, .get_iq_wait_times = kgd_gfx_v9_get_iq_wait_times, .build_grace_period_packet_info = kgd_gfx_v9_build_grace_period_packet_info, .get_cu_occupancy = kgd_gfx_v9_get_cu_occupancy, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c index 7ea0362dcab3..a7a6edda557f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c @@ -856,6 +856,30 @@ uint32_t kgd_gfx_v10_set_wave_launch_trap_override(struct amdgpu_device *adev, return 0; } +uint32_t kgd_gfx_v10_set_wave_launch_mode(struct amdgpu_device *adev, + uint8_t wave_launch_mode, + uint32_t vmid) +{ + uint32_t data = 0; + bool is_mode_set = !!wave_launch_mode; + + mutex_lock(&adev->grbm_idx_mutex); + + kgd_gfx_v10_set_wave_launch_stall(adev, vmid, true); + + data = REG_SET_FIELD(data, SPI_GDBG_WAVE_CNTL2, + VMID_MASK, is_mode_set ? 1 << vmid : 0); + data = REG_SET_FIELD(data, SPI_GDBG_WAVE_CNTL2, + MODE, is_mode_set ? wave_launch_mode : 0); + WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_WAVE_CNTL2), data); + + kgd_gfx_v10_set_wave_launch_stall(adev, vmid, false); + + mutex_unlock(&adev->grbm_idx_mutex); + + return 0; +} + /* kgd_gfx_v10_get_iq_wait_times: Returns the mmCP_IQ_WAIT_TIME1/2 values * The values read are: * ib_offload_wait_time -- Wait Count for Indirect Buffer Offloads. @@ -944,6 +968,7 @@ const struct kfd2kgd_calls gfx_v10_kfd2kgd = { .disable_debug_trap = kgd_gfx_v10_disable_debug_trap, .validate_trap_override_request = kgd_gfx_v10_validate_trap_override_request, .set_wav
[PATCH 27/33] drm/amdkfd: add debug set and clear address watch points operation
Shader read, write and atomic memory operations can be alerted to the debugger as an address watch exception. Allow the debugger to pass in a watch point to a particular memory address per device. Note that there exists only 4 watch points per devices to date, so have the KFD keep track of what watch points are allocated or not. v2: fixup with new kfd_node struct reference for mes and watch point checks Signed-off-by: Jonathan Kim --- .../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c | 51 +++ .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 2 + .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 78 ++ .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.h| 8 ++ .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c | 5 +- .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c| 52 ++- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 77 ++ .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.h | 8 ++ drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 24 drivers/gpu/drm/amd/amdkfd/kfd_debug.c| 136 ++ drivers/gpu/drm/amd/amdkfd/kfd_debug.h| 8 +- drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 + drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 6 +- 13 files changed, 452 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c index 774ecfc3451a..efd6a72aab4e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c @@ -118,6 +118,55 @@ static uint32_t kgd_aldebaran_set_wave_launch_mode(struct amdgpu_device *adev, return data; } +#define TCP_WATCH_STRIDE (regTCP_WATCH1_ADDR_H - regTCP_WATCH0_ADDR_H) +static uint32_t kgd_gfx_aldebaran_set_address_watch( + struct amdgpu_device *adev, + uint64_t watch_address, + uint32_t watch_address_mask, + uint32_t watch_id, + uint32_t watch_mode, + uint32_t debug_vmid) +{ + uint32_t watch_address_high; + uint32_t watch_address_low; + uint32_t watch_address_cntl; + + watch_address_cntl = 0; + watch_address_low = lower_32_bits(watch_address); + watch_address_high = upper_32_bits(watch_address) & 0x; + + watch_address_cntl = REG_SET_FIELD(watch_address_cntl, + TCP_WATCH0_CNTL, + MODE, + watch_mode); + + watch_address_cntl = REG_SET_FIELD(watch_address_cntl, + TCP_WATCH0_CNTL, + MASK, + watch_address_mask >> 6); + + watch_address_cntl = REG_SET_FIELD(watch_address_cntl, + TCP_WATCH0_CNTL, + VALID, + 1); + + WREG32_RLC((SOC15_REG_OFFSET(GC, 0, regTCP_WATCH0_ADDR_H) + + (watch_id * TCP_WATCH_STRIDE)), + watch_address_high); + + WREG32_RLC((SOC15_REG_OFFSET(GC, 0, regTCP_WATCH0_ADDR_L) + + (watch_id * TCP_WATCH_STRIDE)), + watch_address_low); + + return watch_address_cntl; +} + +uint32_t kgd_gfx_aldebaran_clear_address_watch(struct amdgpu_device *adev, + uint32_t watch_id) +{ + return 0; +} + const struct kfd2kgd_calls aldebaran_kfd2kgd = { .program_sh_mem_settings = kgd_gfx_v9_program_sh_mem_settings, .set_pasid_vmid_mapping = kgd_gfx_v9_set_pasid_vmid_mapping, @@ -141,6 +190,8 @@ const struct kfd2kgd_calls aldebaran_kfd2kgd = { .validate_trap_override_request = kgd_aldebaran_validate_trap_override_request, .set_wave_launch_trap_override = kgd_aldebaran_set_wave_launch_trap_override, .set_wave_launch_mode = kgd_aldebaran_set_wave_launch_mode, + .set_address_watch = kgd_gfx_aldebaran_set_address_watch, + .clear_address_watch = kgd_gfx_aldebaran_clear_address_watch, .get_iq_wait_times = kgd_gfx_v9_get_iq_wait_times, .build_grace_period_packet_info = kgd_gfx_v9_build_grace_period_packet_info, .program_trap_handler_settings = kgd_gfx_v9_program_trap_handler_settings, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c index fbdc1b7b1e42..6df215aba4c4 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c @@ -413,6 +413,8 @@ const struct kfd2kgd_calls arcturus_kfd2kgd = { .validate_trap_override_request = kgd_gfx_v9_validate_trap_override_request, .set_wave_launch_trap_override = kgd_gfx_v9_set_wave_launch_trap_override, .set_wave_launch_mode = kgd_gfx_v9_set_wave_launch_mode, + .set_address_watch =
[PATCH 33/33] drm/amdkfd: bump kfd ioctl minor version for debug api availability
Bump the minor version to declare debugging capability is now available. Signed-off-by: Jonathan Kim Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 1 - include/uapi/linux/kfd_ioctl.h | 3 ++- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index f522325b409b..56f55da482e2 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -2984,7 +2984,6 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, struct kfd_process *p, v if (!r) target->exception_enable_mask = args->enable.exception_mask; - pr_warn("Debug functions limited\n"); break; case KFD_IOC_DBG_TRAP_DISABLE: r = kfd_dbg_trap_disable(target); diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h index dfe745ee427e..ea0d50955eac 100644 --- a/include/uapi/linux/kfd_ioctl.h +++ b/include/uapi/linux/kfd_ioctl.h @@ -38,9 +38,10 @@ * - 1.10 - Add SMI profiler event log * - 1.11 - Add unified memory for ctx save/restore area * - 1.12 - Add DMA buf export ioctl + * - 1.13 - Add debugger API */ #define KFD_IOCTL_MAJOR_VERSION 1 -#define KFD_IOCTL_MINOR_VERSION 12 +#define KFD_IOCTL_MINOR_VERSION 13 struct kfd_ioctl_get_version_args { __u32 major_version;/* from KFD */ -- 2.25.1
[PATCH 17/33] drm/amdkfd: apply trap workaround for gfx11
Due to a HW bug, waves in only half the shader arrays can enter trap. When starting a debug session, relocate all waves to the first shader array of each shader engine and mask off the 2nd shader array as unavailable. When ending a debug session, re-enable the 2nd shader array per shader engine. User CU masking per queue cannot be guaranteed to remain functional if requested during debugging (e.g. user cu mask requests only 2nd shader array as an available resource leading to zero HW resources available) nor can runtime be alerted of any of these changes during execution. Make user CU masking and debugging mutual exclusive with respect to availability. If the debugger tries to attach to a process with a user cu masked queue, return the runtime status as enabled but busy. If the debugger tries to attach and fails to reallocate queue waves to the first shader array of each shader engine, return the runtime status as enabled but with an error. In addition, like any other mutli-process debug supported devices, disable trap temporary setup per-process to avoid performance impact from setup overhead. Signed-off-by: Jonathan Kim Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 2 + drivers/gpu/drm/amd/amdgpu/mes_v11_0.c| 7 +-- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 - drivers/gpu/drm/amd/amdkfd/kfd_debug.c| 57 +++ drivers/gpu/drm/amd/amdkfd/kfd_debug.h| 3 +- .../drm/amd/amdkfd/kfd_device_queue_manager.c | 7 +++ .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c | 3 +- .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c | 3 +- .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c | 42 ++ .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c | 3 +- .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c | 3 +- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 5 +- .../amd/amdkfd/kfd_process_queue_manager.c| 9 ++- drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 7 ++- 14 files changed, 122 insertions(+), 31 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h index d20df0cf0d88..b5f5eed2b5ef 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h @@ -219,6 +219,8 @@ struct mes_add_queue_input { uint32_tgws_size; uint64_ttba_addr; uint64_ttma_addr; + uint32_ttrap_en; + uint32_tskip_process_ctx_clear; uint32_tis_kfd_process; uint32_tis_aql_queue; uint32_tqueue_size; diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c index 861910a6662d..c4e3cb8d44de 100644 --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c @@ -202,17 +202,14 @@ static int mes_v11_0_add_hw_queue(struct amdgpu_mes *mes, mes_add_queue_pkt.gws_size = input->gws_size; mes_add_queue_pkt.trap_handler_addr = input->tba_addr; mes_add_queue_pkt.tma_addr = input->tma_addr; + mes_add_queue_pkt.trap_en = input->trap_en; + mes_add_queue_pkt.skip_process_ctx_clear = input->skip_process_ctx_clear; mes_add_queue_pkt.is_kfd_process = input->is_kfd_process; /* For KFD, gds_size is re-used for queue size (needed in MES for AQL queues) */ mes_add_queue_pkt.is_aql_queue = input->is_aql_queue; mes_add_queue_pkt.gds_size = input->queue_size; - if (!(((adev->mes.sched_version & AMDGPU_MES_VERSION_MASK) >= 4) && - (adev->ip_versions[GC_HWIP][0] >= IP_VERSION(11, 0, 0)) && - (adev->ip_versions[GC_HWIP][0] <= IP_VERSION(11, 0, 3 - mes_add_queue_pkt.trap_en = 1; - /* For KFD, gds_size is re-used for queue size (needed in MES for AQL queues) */ mes_add_queue_pkt.is_aql_queue = input->is_aql_queue; mes_add_queue_pkt.gds_size = input->queue_size; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 9d0c247f80fe..a5c457863048 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -537,8 +537,6 @@ static int kfd_ioctl_set_cu_mask(struct file *filp, struct kfd_process *p, goto out; } - minfo.update_flag = UPDATE_FLAG_CU_MASK; - mutex_lock(&p->mutex); retval = pqm_update_mqd(&p->pqm, args->queue_id, &minfo); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c index 73b07b5f17f1..5e2ee2d1acc4 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c @@ -24,6 +24,57 @@ #include "kfd_device_queue_manager.h" #include +static int kfd_dbg_set_queue_workaround(struct queue *q, bool enable) +{ + struct mqd_update_info minfo = {0}; + int err; + + if (!q) + return 0; + + if (KFD_GC_VE
[PATCH 21/33] drm/amdkfd: add debug trap enabled flag to tma
From: Jay Cornwall Trap handler behavior will differ when a debugger is attached. Make the debug trap flag available in the trap handler TMA. Update it when the debug trap ioctl is invoked. Signed-off-by: Jay Cornwall Reviewed-by: Felix Kuehling Signed-off-by: Jonathan Kim Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 11 +++ drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 2 ++ drivers/gpu/drm/amd/amdkfd/kfd_process.c | 15 +++ 3 files changed, 28 insertions(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c index a19c21d04438..17e8e9edccbf 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c @@ -256,6 +256,8 @@ void kfd_dbg_trap_deactivate(struct kfd_process *target, bool unwind, int unwind if (unwind && i == unwind_count) break; + kfd_process_set_trap_debug_flag(&pdd->qpd, false); + /* GFX off is already disabled by debug activate if not RLC restore supported. */ if (kfd_dbg_is_rlc_restore_supported(pdd->dev)) amdgpu_gfx_off_ctrl(pdd->dev->adev, false); @@ -351,6 +353,15 @@ int kfd_dbg_trap_activate(struct kfd_process *target) if (kfd_dbg_is_rlc_restore_supported(pdd->dev)) amdgpu_gfx_off_ctrl(pdd->dev->adev, true); + /* +* Setting the debug flag in the trap handler requires that the TMA has been +* allocated, which occurs during CWSR initialization. +* In the event that CWSR has not been initialized at this point, setting the +* flag will be called again during CWSR initialization if the target process +* is still debug enabled. +*/ + kfd_process_set_trap_debug_flag(&pdd->qpd, true); + if (!pdd->dev->kfd->shared_resources.enable_mes) r = debug_refresh_runlist(pdd->dev->dqm); else diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index 4b80f74b9de0..a02fb939614a 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -1157,6 +1157,8 @@ int kfd_init_apertures(struct kfd_process *process); void kfd_process_set_trap_handler(struct qcm_process_device *qpd, uint64_t tba_addr, uint64_t tma_addr); +void kfd_process_set_trap_debug_flag(struct qcm_process_device *qpd, +bool enabled); /* CWSR initialization */ int kfd_process_init_cwsr_apu(struct kfd_process *process, struct file *filep); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c index 8bfd0c91fb92..2a60c630ab5d 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c @@ -1309,6 +1309,8 @@ int kfd_process_init_cwsr_apu(struct kfd_process *p, struct file *filep) memcpy(qpd->cwsr_kaddr, dev->kfd->cwsr_isa, dev->kfd->cwsr_isa_size); + kfd_process_set_trap_debug_flag(qpd, p->debug_trap_enabled); + qpd->tma_addr = qpd->tba_addr + KFD_CWSR_TMA_OFFSET; pr_debug("set tba :0x%llx, tma:0x%llx, cwsr_kaddr:%p for pqm.\n", qpd->tba_addr, qpd->tma_addr, qpd->cwsr_kaddr); @@ -1345,6 +1347,9 @@ static int kfd_process_device_init_cwsr_dgpu(struct kfd_process_device *pdd) memcpy(qpd->cwsr_kaddr, dev->kfd->cwsr_isa, dev->kfd->cwsr_isa_size); + kfd_process_set_trap_debug_flag(&pdd->qpd, + pdd->process->debug_trap_enabled); + qpd->tma_addr = qpd->tba_addr + KFD_CWSR_TMA_OFFSET; pr_debug("set tba :0x%llx, tma:0x%llx, cwsr_kaddr:%p for pqm.\n", qpd->tba_addr, qpd->tma_addr, qpd->cwsr_kaddr); @@ -1431,6 +1436,16 @@ bool kfd_process_xnack_mode(struct kfd_process *p, bool supported) return true; } +void kfd_process_set_trap_debug_flag(struct qcm_process_device *qpd, +bool enabled) +{ + if (qpd->cwsr_kaddr) { + uint64_t *tma = + (uint64_t *)(qpd->cwsr_kaddr + KFD_CWSR_TMA_OFFSET); + tma[2] = enabled; + } +} + /* * On return the kfd_process is fully operational and will be freed when the * mm is released -- 2.25.1
[PATCH 15/33] drm/amdgpu: expose debug api for mes
Similar to the F32 HWS, the RS64 HWS for GFX11 now supports a multi-process debug API. The skip_process_ctx_clear ADD_QUEUE requirement is to prevent the MES from clearing the process context when the first queue is added to the scheduler in order to maintain debug mode settings during queue preemption and restore. The MES clears the process context in this case due to an unresolved FW caching bug during normal mode operations. During debug mode, the KFD will hold a reference to the target process so the process context should never go stale and MES can afford to skip this requirement. Signed-off-by: Jonathan Kim Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 32 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 20 drivers/gpu/drm/amd/amdgpu/mes_v11_0.c| 12 +++ drivers/gpu/drm/amd/include/mes_v11_api_def.h | 21 +++- 4 files changed, 84 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c index 49bb6c03d606..20cc3fffe921 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c @@ -924,6 +924,38 @@ int amdgpu_mes_reg_wait(struct amdgpu_device *adev, uint32_t reg, return r; } +int amdgpu_mes_set_shader_debugger(struct amdgpu_device *adev, + uint64_t process_context_addr, + uint32_t spi_gdbg_per_vmid_cntl, + const uint32_t *tcp_watch_cntl, + uint32_t flags) +{ + struct mes_misc_op_input op_input = {0}; + int r; + + if (!adev->mes.funcs->misc_op) { + DRM_ERROR("mes set shader debugger is not supported!\n"); + return -EINVAL; + } + + op_input.op = MES_MISC_OP_SET_SHADER_DEBUGGER; + op_input.set_shader_debugger.process_context_addr = process_context_addr; + op_input.set_shader_debugger.flags.u32all = flags; + op_input.set_shader_debugger.spi_gdbg_per_vmid_cntl = spi_gdbg_per_vmid_cntl; + memcpy(op_input.set_shader_debugger.tcp_watch_cntl, tcp_watch_cntl, + sizeof(op_input.set_shader_debugger.tcp_watch_cntl)); + + amdgpu_mes_lock(&adev->mes); + + r = adev->mes.funcs->misc_op(&adev->mes, &op_input); + if (r) + DRM_ERROR("failed to set_shader_debugger\n"); + + amdgpu_mes_unlock(&adev->mes); + + return r; +} + static void amdgpu_mes_ring_to_queue_props(struct amdgpu_device *adev, struct amdgpu_ring *ring, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h index 547ec35691fa..d20df0cf0d88 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h @@ -256,6 +256,7 @@ enum mes_misc_opcode { MES_MISC_OP_READ_REG, MES_MISC_OP_WRM_REG_WAIT, MES_MISC_OP_WRM_REG_WR_WAIT, + MES_MISC_OP_SET_SHADER_DEBUGGER, }; struct mes_misc_op_input { @@ -278,6 +279,20 @@ struct mes_misc_op_input { uint32_t reg0; uint32_t reg1; } wrm_reg; + + struct { + uint64_t process_context_addr; + union { + struct { + uint64_t single_memop : 1; + uint64_t single_alu_op : 1; + uint64_t reserved: 30; + }; + uint32_t u32all; + } flags; + uint32_t spi_gdbg_per_vmid_cntl; + uint32_t tcp_watch_cntl[4]; + } set_shader_debugger; }; }; @@ -340,6 +355,11 @@ int amdgpu_mes_reg_wait(struct amdgpu_device *adev, uint32_t reg, int amdgpu_mes_reg_write_reg_wait(struct amdgpu_device *adev, uint32_t reg0, uint32_t reg1, uint32_t ref, uint32_t mask); +int amdgpu_mes_set_shader_debugger(struct amdgpu_device *adev, + uint64_t process_context_addr, + uint32_t spi_gdbg_per_vmid_cntl, + const uint32_t *tcp_watch_cntl, + uint32_t flags); int amdgpu_mes_add_ring(struct amdgpu_device *adev, int gang_id, int queue_type, int idx, diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c index 90b4a74ccf01..861910a6662d 100644 --- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c @@ -339,6 +339,18 @@ static int mes_v11_0_misc_op(struct amdgpu_mes *mes, misc_pkt.wait_reg_mem.reg_offset1 = input->wrm_reg.reg0;
[PATCH 28/33] drm/amdkfd: add debug set flags operation
Allow the debugger to set single memory and single ALU operations. Some exceptions are imprecise (memory violations, address watch) in the sense that a trap occurs only when the exception interrupt occurs and not at the non-halting faulty instruction. Trap temporaries 0 & 1 save the program counter address, which means that these values will not point to the faulty instruction address but to whenever the interrupt was raised. Setting the Single Memory Operations flag will inject an automatic wait on every memory operation instruction forcing imprecise memory exceptions to become precise at the cost of performance. This setting is not permitted on debug devices that support only a global setting of this option. Return the previous set flags to the debugger as well. v2: fixup with new kfd_node struct reference mes checks Signed-off-by: Jonathan Kim --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 + drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 58 drivers/gpu/drm/amd/amdkfd/kfd_debug.h | 1 + 3 files changed, 61 insertions(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index e88be582d44d..e5d95b144dcd 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -3035,6 +3035,8 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, struct kfd_process *p, v args->clear_node_address_watch.id); break; case KFD_IOC_DBG_TRAP_SET_FLAGS: + r = kfd_dbg_trap_set_flags(target, &args->set_flags.flags); + break; case KFD_IOC_DBG_TRAP_QUERY_DEBUG_EVENT: case KFD_IOC_DBG_TRAP_QUERY_EXCEPTION_INFO: case KFD_IOC_DBG_TRAP_GET_QUEUE_SNAPSHOT: diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c index 4b36cc8b5fb7..43c3170998d3 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c @@ -23,6 +23,7 @@ #include "kfd_debug.h" #include "kfd_device_queue_manager.h" #include +#include #define MAX_WATCH_ADDRESSES4 @@ -423,6 +424,59 @@ static void kfd_dbg_clear_process_address_watch(struct kfd_process *target) kfd_dbg_trap_clear_dev_address_watch(target->pdds[i], j); } +int kfd_dbg_trap_set_flags(struct kfd_process *target, uint32_t *flags) +{ + uint32_t prev_flags = target->dbg_flags; + int i, r = 0, rewind_count = 0; + + for (i = 0; i < target->n_pdds; i++) { + if (!kfd_dbg_is_per_vmid_supported(target->pdds[i]->dev) && + (*flags & KFD_DBG_TRAP_FLAG_SINGLE_MEM_OP)) { + *flags = prev_flags; + return -EACCES; + } + } + + target->dbg_flags = *flags & KFD_DBG_TRAP_FLAG_SINGLE_MEM_OP; + *flags = prev_flags; + for (i = 0; i < target->n_pdds; i++) { + struct kfd_process_device *pdd = target->pdds[i]; + + if (!kfd_dbg_is_per_vmid_supported(pdd->dev)) + continue; + + if (!pdd->dev->kfd->shared_resources.enable_mes) + r = debug_refresh_runlist(pdd->dev->dqm); + else + r = kfd_dbg_set_mes_debug_mode(pdd); + + if (r) { + target->dbg_flags = prev_flags; + break; + } + + rewind_count++; + } + + /* Rewind flags */ + if (r) { + target->dbg_flags = prev_flags; + + for (i = 0; i < rewind_count; i++) { + struct kfd_process_device *pdd = target->pdds[i]; + + if (!kfd_dbg_is_per_vmid_supported(pdd->dev)) + continue; + + if (!pdd->dev->kfd->shared_resources.enable_mes) + debug_refresh_runlist(pdd->dev->dqm); + else + kfd_dbg_set_mes_debug_mode(pdd); + } + } + + return r; +} /* kfd_dbg_trap_deactivate: * target: target process @@ -437,9 +491,13 @@ void kfd_dbg_trap_deactivate(struct kfd_process *target, bool unwind, int unwind int i; if (!unwind) { + uint32_t flags = 0; + cancel_work_sync(&target->debug_event_workarea); kfd_dbg_clear_process_address_watch(target); kfd_dbg_trap_set_wave_launch_mode(target, 0); + + kfd_dbg_trap_set_flags(target, &flags); } for (i = 0; i < target->n_pdds; i++) { diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h index 7f0757c2af2c..ef8e9f7f1716 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h @@ -57,6 +57,7 @@ int kfd_dbg_trap_set_dev_address_watch(struct kfd_process_d
[PATCH 11/33] drm/amdgpu: add gfx11 hw debug mode enable and disable calls
Implement the per-device calls to enable or disable HW debug mode for GFX11. Signed-off-by: Jonathan Kim Reviewed-by: Felix Kuehling --- .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c| 38 +++ 1 file changed, 38 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c index 7deff8a547fb..cc954cf248ca 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c @@ -607,6 +607,42 @@ static void set_vm_context_page_table_base_v11(struct amdgpu_device *adev, adev->gfxhub.funcs->setup_vm_pt_regs(adev, vmid, page_table_base); } +/* + * Returns TRAP_EN, EXCP_EN and EXCP_REPLACE. + * + * restore_dbg_registers is ignored here but is a general interface requirement + * for devices that support GFXOFF and where the RLC save/restore list + * does not support hw registers for debugging i.e. the driver has to manually + * initialize the debug mode registers after it has disabled GFX off during the + * debug session. + */ +static uint32_t kgd_gfx_v11_enable_debug_trap(struct amdgpu_device *adev, + bool restore_dbg_registers, + uint32_t vmid) +{ + uint32_t data = 0; + + data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, TRAP_EN, 1); + data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, EXCP_EN, 0); + data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, EXCP_REPLACE, 0); + + return data; +} + +/* Returns TRAP_EN, EXCP_EN and EXCP_REPLACE. */ +static uint32_t kgd_gfx_v11_disable_debug_trap(struct amdgpu_device *adev, + bool keep_trap_enabled, + uint32_t vmid) +{ + uint32_t data = 0; + + data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, TRAP_EN, keep_trap_enabled); + data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, EXCP_EN, 0); + data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, EXCP_REPLACE, 0); + + return data; +} + const struct kfd2kgd_calls gfx_v11_kfd2kgd = { .program_sh_mem_settings = program_sh_mem_settings_v11, .set_pasid_vmid_mapping = set_pasid_vmid_mapping_v11, @@ -623,4 +659,6 @@ const struct kfd2kgd_calls gfx_v11_kfd2kgd = { .wave_control_execute = wave_control_execute_v11, .get_atc_vmid_pasid_mapping_info = NULL, .set_vm_context_page_table_base = set_vm_context_page_table_base_v11, + .enable_debug_trap = kgd_gfx_v11_enable_debug_trap, + .disable_debug_trap = kgd_gfx_v11_disable_debug_trap }; -- 2.25.1
[PATCH 24/33] drm/amdkfd: add debug wave launch override operation
This operation allows the debugger to override the enabled HW exceptions on the device. On debug devices that only support the debugging of a single process, the HW exceptions are global and set through the SPI_GDBG_TRAP_MASK register. Because they are global, only address watch exceptions are allowed to be enabled. In other words, the debugger must preserve all non-address watch exception states in normal mode operation by barring a full replacement override or a non-address watch override request. For multi-process debugging, all HW exception overrides are per-VMID so all exceptions can be overridden or fully replaced. In order for the debugger to know what is permissible, returned the supported override mask back to the debugger along with the previously enable overrides. v2: fixup with new kfd_node struct reference for mes check Signed-off-by: Jonathan Kim --- .../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c | 47 ++ .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 2 + .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 55 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.h| 10 +++ .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c | 5 +- .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c| 87 ++- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 55 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.h | 10 +++ drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 7 ++ drivers/gpu/drm/amd/amdkfd/kfd_debug.c| 69 +++ drivers/gpu/drm/amd/amdkfd/kfd_debug.h| 6 ++ 11 files changed, 351 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c index b811a0985050..d7881bbd828d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c @@ -25,6 +25,7 @@ #include "amdgpu_amdkfd_gfx_v9.h" #include "gc/gc_9_4_2_offset.h" #include "gc/gc_9_4_2_sh_mask.h" +#include /* * Returns TRAP_EN, EXCP_EN and EXCP_REPLACE. @@ -62,6 +63,50 @@ static uint32_t kgd_aldebaran_disable_debug_trap(struct amdgpu_device *adev, return data; } +static int kgd_aldebaran_validate_trap_override_request(struct amdgpu_device *adev, + uint32_t trap_override, + uint32_t *trap_mask_supported) +{ + *trap_mask_supported &= KFD_DBG_TRAP_MASK_FP_INVALID | + KFD_DBG_TRAP_MASK_FP_INPUT_DENORMAL | + KFD_DBG_TRAP_MASK_FP_DIVIDE_BY_ZERO | + KFD_DBG_TRAP_MASK_FP_OVERFLOW | + KFD_DBG_TRAP_MASK_FP_UNDERFLOW | + KFD_DBG_TRAP_MASK_FP_INEXACT | + KFD_DBG_TRAP_MASK_INT_DIVIDE_BY_ZERO | + KFD_DBG_TRAP_MASK_DBG_ADDRESS_WATCH | + KFD_DBG_TRAP_MASK_DBG_MEMORY_VIOLATION; + + if (trap_override != KFD_DBG_TRAP_OVERRIDE_OR && + trap_override != KFD_DBG_TRAP_OVERRIDE_REPLACE) + return -EPERM; + + return 0; +} + +/* returns TRAP_EN, EXCP_EN and EXCP_RPLACE. */ +static uint32_t kgd_aldebaran_set_wave_launch_trap_override(struct amdgpu_device *adev, + uint32_t vmid, + uint32_t trap_override, + uint32_t trap_mask_bits, + uint32_t trap_mask_request, + uint32_t *trap_mask_prev, + uint32_t kfd_dbg_trap_cntl_prev) + +{ + uint32_t data = 0; + + *trap_mask_prev = REG_GET_FIELD(kfd_dbg_trap_cntl_prev, SPI_GDBG_PER_VMID_CNTL, EXCP_EN); + trap_mask_bits = (trap_mask_bits & trap_mask_request) | + (*trap_mask_prev & ~trap_mask_request); + + data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, TRAP_EN, 1); + data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, EXCP_EN, trap_mask_bits); + data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, EXCP_REPLACE, trap_override); + + return data; +} + const struct kfd2kgd_calls aldebaran_kfd2kgd = { .program_sh_mem_settings = kgd_gfx_v9_program_sh_mem_settings, .set_pasid_vmid_mapping = kgd_gfx_v9_set_pasid_vmid_mapping, @@ -82,6 +127,8 @@ const struct kfd2kgd_calls aldebaran_kfd2kgd = { .get_cu_occupancy = kgd_gfx_v9_get_cu_occupancy, .enable_debug_trap = kgd_aldebaran_enable_debug_trap, .disable_debug_trap = kgd_aldebaran_disable_debug_trap, + .validate_trap_override_request = kgd_aldebaran_validate_trap_override_request, + .set_wave_launch_trap_override = kgd_aldebaran_set_wave_launch_trap_override, .get_iq_wait_times = kgd_gfx_v9_get_iq_wait_times, .build_grace_
[PATCH 26/33] drm/amdkfd: add debug suspend and resume process queues operation
In order to inspect waves from the saved context at any point during a debug session, the debugger must be able to preempt queues to trigger context save by suspending them. On queue suspend, the KFD will copy the context save header information so that the debugger can correctly crawl the appropriate size of the saved context. The debugger must then also be allowed to resume suspended queues. A queue that is newly created cannot be suspended because queue ids are recycled after destruction so the debugger needs to know that this has occurred. Query functions will be later added that will clear a given queue of its new queue status. A queue cannot be destroyed while it is suspended to preserve its saved context during debugger inspection. Have queue destruction block while a queue is suspended and unblocked when it is resumed. Likewise, if a queue is about to be destroyed, it cannot be suspended. Return the number of queues successfully suspended or resumed along with a per queue status array where the upper bits per queue status show that the request was invalid (new/destroyed queue suspend request, missing queue) or an error occurred (HWS in a fatal state so it can't suspend or resume queues). v2: fixup new kfd_node struct reference for mes fw check. also fixup missing EC_QUEUE_NEW flagging on newly created queue. Signed-off-by: Jonathan Kim --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c| 5 + drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h| 1 + drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 11 + drivers/gpu/drm/amd/amdkfd/kfd_debug.c| 7 + .../drm/amd/amdkfd/kfd_device_queue_manager.c | 447 +- .../drm/amd/amdkfd/kfd_device_queue_manager.h | 10 + .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c | 10 + .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c | 15 +- .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c | 14 +- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 5 +- .../amd/amdkfd/kfd_process_queue_manager.c| 1 + 11 files changed, 512 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c index 98cd52bb005f..b4fcad0e62f7 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c @@ -772,6 +772,11 @@ bool amdgpu_amdkfd_have_atomics_support(struct amdgpu_device *adev) return adev->have_atomics_support; } +void amdgpu_amdkfd_debug_mem_fence(struct amdgpu_device *adev) +{ + amdgpu_device_flush_hdp(adev, NULL); +} + void amdgpu_amdkfd_ras_poison_consumption_handler(struct amdgpu_device *adev, bool reset) { amdgpu_umc_poison_handler(adev, reset); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h index dd740e64e6e1..2d0406bff84e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h @@ -322,6 +322,7 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev, uint64_t *mmap_offset); int amdgpu_amdkfd_gpuvm_export_dmabuf(struct kgd_mem *mem, struct dma_buf **dmabuf); +void amdgpu_amdkfd_debug_mem_fence(struct amdgpu_device *adev); int amdgpu_amdkfd_get_tile_config(struct amdgpu_device *adev, struct tile_config *config); void amdgpu_amdkfd_ras_poison_consumption_handler(struct amdgpu_device *adev, diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 4b45d4539d48..adda60273456 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -410,6 +410,7 @@ static int kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p, pr_debug("Write ptr address == 0x%016llX\n", args->write_pointer_address); + kfd_dbg_ev_raise(KFD_EC_MASK(EC_QUEUE_NEW), p, dev, queue_id, false, NULL, 0); return 0; err_create_queue: @@ -2996,7 +2997,17 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, struct kfd_process *p, v args->launch_mode.launch_mode); break; case KFD_IOC_DBG_TRAP_SUSPEND_QUEUES: + r = suspend_queues(target, + args->suspend_queues.num_queues, + args->suspend_queues.grace_period, + args->suspend_queues.exception_mask, + (uint32_t *)args->suspend_queues.queue_array_ptr); + + break; case KFD_IOC_DBG_TRAP_RESUME_QUEUES: + r = resume_queues(target, args->resume_queues.num_queues, + (uint32_t *)args->resume_queues.queue_array_ptr); + break; case KFD_IOC_DBG_TRAP_SET_NODE_ADDRESS_WATCH: case KFD_IOC_DBG_TRAP_CLEAR_NODE_ADDRESS_WATCH: case KFD_IOC_DBG_T
[PATCH 19/33] drm/amdkfd: add send exception operation
Add a debug operation that allows the debugger to send an exception directly to runtime through a payload address. For memory violations, normal vmfault signals will be applied to notify runtime instead after passing in the saved exception data when a memory violation was raised to the debugger. For runtime exceptions, this will unblock the runtime enable function which will be explained and implemented in a follow up patch. Signed-off-by: Jonathan Kim Reviewed-by: Felix Kuehling --- .../gpu/drm/amd/amdkfd/cik_event_interrupt.c | 4 +- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 5 ++ drivers/gpu/drm/amd/amdkfd/kfd_debug.c| 43 +++ drivers/gpu/drm/amd/amdkfd/kfd_debug.h| 6 ++ drivers/gpu/drm/amd/amdkfd/kfd_events.c | 3 +- .../gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 2 +- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 7 +- drivers/gpu/drm/amd/amdkfd/kfd_process.c | 71 ++- 8 files changed, 135 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c b/drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c index 4ebfff6b6c55..795382b55e0a 100644 --- a/drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c +++ b/drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c @@ -118,9 +118,9 @@ static void cik_event_interrupt_wq(struct kfd_node *dev, return; if (info.vmid == vmid) - kfd_signal_vm_fault_event(dev, pasid, &info); + kfd_signal_vm_fault_event(dev, pasid, &info, NULL); else - kfd_signal_vm_fault_event(dev, pasid, NULL); + kfd_signal_vm_fault_event(dev, pasid, NULL, NULL); } } diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index a5c457863048..ec5a85454192 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -2833,6 +2833,11 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, struct kfd_process *p, v r = kfd_dbg_trap_disable(target); break; case KFD_IOC_DBG_TRAP_SEND_RUNTIME_EVENT: + r = kfd_dbg_send_exception_to_runtime(target, + args->send_runtime_event.gpu_id, + args->send_runtime_event.queue_id, + args->send_runtime_event.exception_mask); + break; case KFD_IOC_DBG_TRAP_SET_EXCEPTIONS_ENABLED: case KFD_IOC_DBG_TRAP_SET_WAVE_LAUNCH_OVERRIDE: case KFD_IOC_DBG_TRAP_SET_WAVE_LAUNCH_MODE: diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c index dccb27fc764b..61098975bb0e 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c @@ -125,6 +125,49 @@ bool kfd_dbg_ev_raise(uint64_t event_mask, return is_subscribed; } +int kfd_dbg_send_exception_to_runtime(struct kfd_process *p, + unsigned int dev_id, + unsigned int queue_id, + uint64_t error_reason) +{ + if (error_reason & KFD_EC_MASK(EC_DEVICE_MEMORY_VIOLATION)) { + struct kfd_process_device *pdd = NULL; + struct kfd_hsa_memory_exception_data *data; + int i; + + for (i = 0; i < p->n_pdds; i++) { + if (p->pdds[i]->dev->id == dev_id) { + pdd = p->pdds[i]; + break; + } + } + + if (!pdd) + return -ENODEV; + + data = (struct kfd_hsa_memory_exception_data *) + pdd->vm_fault_exc_data; + + kfd_dqm_evict_pasid(pdd->dev->dqm, p->pasid); + kfd_signal_vm_fault_event(pdd->dev, p->pasid, NULL, data); + error_reason &= ~KFD_EC_MASK(EC_DEVICE_MEMORY_VIOLATION); + } + + if (error_reason & (KFD_EC_MASK(EC_PROCESS_RUNTIME))) { + /* +* block should only happen after the debugger receives runtime +* enable notice. +*/ + up(&p->runtime_enable_sema); + error_reason &= ~KFD_EC_MASK(EC_PROCESS_RUNTIME); + } + + if (error_reason) + return kfd_send_exception_to_runtime(p, queue_id, error_reason); + + return 0; +} + static int kfd_dbg_set_queue_workaround(struct queue *q, bool enable) { struct mqd_update_info minfo = {0}; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h index 66ee7b95d08a..2c6866bb8850 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h @@ -34,6 +34,12 @@ int kfd_dbg_trap_disable(struct kfd_proc
[PATCH 18/33] drm/amdkfd: add raise exception event function
Exception events can be generated from interrupts or queue activitity. The raise event function will save exception status of a queue, device or process then notify the debugger of the status change by writing to a debugger polled file descriptor that the debugger provides during debug attach. For memory violation exceptions, extra exception data will be saved. The debugger will be able to query the saved exception states by query operation that will be provided by follow up patches. v2: use new kfd_node struct in prototype. Signed-off-by: Jonathan Kim --- drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 104 +++ drivers/gpu/drm/amd/amdkfd/kfd_debug.h | 7 ++ drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 10 +++ drivers/gpu/drm/amd/amdkfd/kfd_process.c | 2 + 4 files changed, 123 insertions(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c index 5e2ee2d1acc4..dccb27fc764b 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c @@ -24,6 +24,107 @@ #include "kfd_device_queue_manager.h" #include +void debug_event_write_work_handler(struct work_struct *work) +{ + struct kfd_process *process; + + static const char write_data = '.'; + loff_t pos = 0; + + process = container_of(work, + struct kfd_process, + debug_event_workarea); + + kernel_write(process->dbg_ev_file, &write_data, 1, &pos); +} + +/* update process/device/queue exception status, write to descriptor + * only if exception_status is enabled. + */ +bool kfd_dbg_ev_raise(uint64_t event_mask, + struct kfd_process *process, struct kfd_node *dev, + unsigned int source_id, bool use_worker, + void *exception_data, size_t exception_data_size) +{ + struct process_queue_manager *pqm; + struct process_queue_node *pqn; + int i; + static const char write_data = '.'; + loff_t pos = 0; + bool is_subscribed = true; + + if (!(process && process->debug_trap_enabled)) + return false; + + mutex_lock(&process->event_mutex); + + if (event_mask & KFD_EC_MASK_DEVICE) { + for (i = 0; i < process->n_pdds; i++) { + struct kfd_process_device *pdd = process->pdds[i]; + + if (pdd->dev != dev) + continue; + + pdd->exception_status |= event_mask & KFD_EC_MASK_DEVICE; + + if (event_mask & KFD_EC_MASK(EC_DEVICE_MEMORY_VIOLATION)) { + if (!pdd->vm_fault_exc_data) { + pdd->vm_fault_exc_data = kmemdup( + exception_data, + exception_data_size, + GFP_KERNEL); + if (!pdd->vm_fault_exc_data) + pr_debug("Failed to allocate exception data memory"); + } else { + pr_debug("Debugger exception data not saved\n"); + print_hex_dump_bytes("exception data: ", + DUMP_PREFIX_OFFSET, + exception_data, + exception_data_size); + } + } + break; + } + } else if (event_mask & KFD_EC_MASK_PROCESS) { + process->exception_status |= event_mask & KFD_EC_MASK_PROCESS; + } else { + pqm = &process->pqm; + list_for_each_entry(pqn, &pqm->queues, + process_queue_list) { + int target_id; + + if (!pqn->q) + continue; + + target_id = event_mask & KFD_EC_MASK(EC_QUEUE_NEW) ? + pqn->q->properties.queue_id : + pqn->q->doorbell_id; + + if (pqn->q->device != dev || target_id != source_id) + continue; + + pqn->q->properties.exception_status |= event_mask; + break; + } + } + + if (process->exception_enable_mask & event_mask) { + if (use_worker) + schedule_work(&process->debug_event_workarea); + else + kernel_write(process->dbg_ev_file, + &write_data, + 1, +
[PATCH 10/33] drm/amdgpu: add gfx9.4.2 hw debug mode enable and disable calls
GFX9.4.2 now supports per-VMID debug mode controls registers (SPI_GDBG_PER_VMID_CNTL). Because the KFD lets the HWS handle PASID-VMID mapping, the KFD will forward all debug mode setting register writes to the HWS scheduler using a new MAP_PROCESS API, so instead of writing to registers, return the required register values that the HWS needs to write on debug enable and disable. Signed-off-by: Jonathan Kim Reviewed-by: Felix Kuehling --- .../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c | 42 ++- 1 file changed, 41 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c index 4485bb29bec9..a6f98141c29c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c @@ -23,6 +23,44 @@ #include "amdgpu_amdkfd.h" #include "amdgpu_amdkfd_arcturus.h" #include "amdgpu_amdkfd_gfx_v9.h" +#include "gc/gc_9_4_2_offset.h" +#include "gc/gc_9_4_2_sh_mask.h" + +/* + * Returns TRAP_EN, EXCP_EN and EXCP_REPLACE. + * + * restore_dbg_registers is ignored here but is a general interface requirement + * for devices that support GFXOFF and where the RLC save/restore list + * does not support hw registers for debugging i.e. the driver has to manually + * initialize the debug mode registers after it has disabled GFX off during the + * debug session. + */ +static uint32_t kgd_aldebaran_enable_debug_trap(struct amdgpu_device *adev, + bool restore_dbg_registers, + uint32_t vmid) +{ + uint32_t data = 0; + + data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, TRAP_EN, 1); + data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, EXCP_EN, 0); + data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, EXCP_REPLACE, 0); + + return data; +} + +/* returns TRAP_EN, EXCP_EN and EXCP_REPLACE. */ +static uint32_t kgd_aldebaran_disable_debug_trap(struct amdgpu_device *adev, + bool keep_trap_enabled, + uint32_t vmid) +{ + uint32_t data = 0; + + data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, TRAP_EN, keep_trap_enabled); + data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, EXCP_EN, 0); + data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, EXCP_REPLACE, 0); + + return data; +} const struct kfd2kgd_calls aldebaran_kfd2kgd = { .program_sh_mem_settings = kgd_gfx_v9_program_sh_mem_settings, @@ -42,5 +80,7 @@ const struct kfd2kgd_calls aldebaran_kfd2kgd = { kgd_gfx_v9_get_atc_vmid_pasid_mapping_info, .set_vm_context_page_table_base = kgd_gfx_v9_set_vm_context_page_table_base, .get_cu_occupancy = kgd_gfx_v9_get_cu_occupancy, - .program_trap_handler_settings = kgd_gfx_v9_program_trap_handler_settings + .enable_debug_trap = kgd_aldebaran_enable_debug_trap, + .disable_debug_trap = kgd_aldebaran_disable_debug_trap, + .program_trap_handler_settings = kgd_gfx_v9_program_trap_handler_settings, }; -- 2.25.1
[PATCH 16/33] drm/amdkfd: add per process hw trap enable and disable functions
To enable HW debug mode per process, all devices must be debug enabled successfully. If a failure occures, rewind the enablement of debug mode on the enabled devices. A power management scenario that needs to be considered is HW debug mode setting during GFXOFF. During GFXOFF, these registers will be unreachable so we have to transiently disable GFXOFF when setting. Also, some devices don't support the RLC save restore function for these debug registers so we have to disable GFXOFF completely during a debug session. Cooperative launch also has debugging restriction based on HW/FW bugs. If such bugs exists, the debugger cannot attach to a process that uses GWS resources nor can GWS resources be requested if a process is being debugged. Multi-process debug devices can only enable trap temporaries based on certain runtime scenerios, which will be explained when the runtime enable functions are implemented in a follow up patch. v2: spot fix with new kfd_node references Signed-off-by: Jonathan Kim --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 5 + drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 148 ++- drivers/gpu/drm/amd/amdkfd/kfd_debug.h | 29 + drivers/gpu/drm/amd/amdkfd/kfd_process.c | 10 ++ 4 files changed, 190 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 7082d5d0f0e9..9d0c247f80fe 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -1488,6 +1488,11 @@ static int kfd_ioctl_alloc_queue_gws(struct file *filep, goto out_unlock; } + if (!kfd_dbg_has_gws_support(dev) && p->debug_trap_enabled) { + retval = -EBUSY; + goto out_unlock; + } + retval = pqm_set_gws(&p->pqm, args->queue_id, args->num_gws ? dev->gws : NULL); mutex_unlock(&p->mutex); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c index 898cc1fe3d13..73b07b5f17f1 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c @@ -21,13 +21,78 @@ */ #include "kfd_debug.h" +#include "kfd_device_queue_manager.h" #include +static int kfd_dbg_set_mes_debug_mode(struct kfd_process_device *pdd) +{ + uint32_t spi_dbg_cntl = pdd->spi_dbg_override | pdd->spi_dbg_launch_mode; + uint32_t flags = pdd->process->dbg_flags; + + if (!kfd_dbg_is_per_vmid_supported(pdd->dev)) + return 0; + + return amdgpu_mes_set_shader_debugger(pdd->dev->adev, pdd->proc_ctx_gpu_addr, spi_dbg_cntl, + pdd->watch_points, flags); +} + +/* kfd_dbg_trap_deactivate: + * target: target process + * unwind: If this is unwinding a failed kfd_dbg_trap_enable() + * unwind_count: + * If unwind == true, how far down the pdd list we need + * to unwind + * else: ignored + */ +static void kfd_dbg_trap_deactivate(struct kfd_process *target, bool unwind, int unwind_count) +{ + int i; + + for (i = 0; i < target->n_pdds; i++) { + struct kfd_process_device *pdd = target->pdds[i]; + + /* If this is an unwind, and we have unwound the required +* enable calls on the pdd list, we need to stop now +* otherwise we may mess up another debugger session. +*/ + if (unwind && i == unwind_count) + break; + + /* GFX off is already disabled by debug activate if not RLC restore supported. */ + if (kfd_dbg_is_rlc_restore_supported(pdd->dev)) + amdgpu_gfx_off_ctrl(pdd->dev->adev, false); + pdd->spi_dbg_override = + pdd->dev->kfd2kgd->disable_debug_trap( + pdd->dev->adev, + target->runtime_info.ttmp_setup, + pdd->dev->vm_info.last_vmid_kfd); + amdgpu_gfx_off_ctrl(pdd->dev->adev, true); + + if (!kfd_dbg_is_per_vmid_supported(pdd->dev) && + release_debug_trap_vmid(pdd->dev->dqm, &pdd->qpd)) + pr_err("Failed to release debug vmid on [%i]\n", pdd->dev->id); + + if (!pdd->dev->kfd->shared_resources.enable_mes) + debug_refresh_runlist(pdd->dev->dqm); + else + kfd_dbg_set_mes_debug_mode(pdd); + } +} + int kfd_dbg_trap_disable(struct kfd_process *target) { if (!target->debug_trap_enabled) return 0; + /* +* Defer deactivation to runtime if runtime not enabled otherwise reset +* attached running target runtime state to enable for re-attach. +*/ + if (target->runtime_info.runtime_state == DEBUG_RUNTIME_STATE_ENABLED) +
[PATCH 20/33] drm/amdkfd: add runtime enable operation
The debugger can attach to a process prior to HSA enablement (i.e. inferior is spawned by the debugger and attached to immediately before target process has been enabled for HSA dispatches) or it can attach to a running target that is already HSA enabled. Either way, the debugger needs to know the enablement status to know when it can inspect queues. For the scenario where the debugger spawns the target process, it will have to wait for ROCr's runtime enable request from the target. The runtime enable request will be able to see that its process has been debug attached. ROCr raises an EC_PROCESS_RUNTIME signal to the debugger then blocks the target process while waiting the debugger's response. Once the debugger has received the runtime signal, it will unblock the target process. For the scenario where the debugger attaches to a running target process, ROCr will set the target process' runtime status as enabled so that on an attach request, the debugger will be able to see this status and will continue with debug enablement as normal. A secondary requirement is to conditionally enable the trap tempories only if the user requests it (env var HSA_ENABLE_DEBUG=1) or if the debugger attaches with HSA runtime enabled. This is because setting up the trap temporaries incurs a performance overhead that is unacceptable for microbench performance in normal mode for certain customers. In the scenario where the debugger spawns the target process, when ROCr detects that the debugger has attached during the runtime enable request, it will enable the trap temporaries before it blocks the target process while waiting for the debugger to respond. In the scenario where the debugger attaches to a running target process, it will enable to trap temporaries itself. Finally, there is an additional restriction that is required to be enforced with runtime enable and HW debug mode setting. The debugger must first ensure that HW debug mode has been enabled before permitting HW debug mode operations. With single process debug devices, allowing the debugger to set debug HW modes prior to trap activation means that debug HW mode setting can occur before the KFD has reserved the debug VMID (0xf) from the hardware scheduler's VMID allocation resource pool. This can result in the hardware scheduler assigning VMID 0xf to a non-debugged process and having that process inherit debug HW mode settings intended for the debugged target process instead, which is both incorrect and potentially fatal for normal mode operation. With multi process debug devices, allowing the debugger to set debug HW modes prior to trap activation means that non-debugged processes migrating to a new VMID could inherit unintended debug settings. All debug operations that touch HW settings must require trap activation where trap activation is triggered by both debug attach and runtime enablement (target has KFD opened and is ready to dispatch work). v2: fixup with new kfd_node struct reference Signed-off-by: Jonathan Kim --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 143 ++- drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 6 +- drivers/gpu/drm/amd/amdkfd/kfd_debug.h | 4 + drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 1 + 4 files changed, 150 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index ec5a85454192..73cb5abce431 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -2738,11 +2738,140 @@ static int kfd_ioctl_criu(struct file *filep, struct kfd_process *p, void *data) return ret; } -static int kfd_ioctl_runtime_enable(struct file *filep, struct kfd_process *p, void *data) +static int runtime_enable(struct kfd_process *p, uint64_t r_debug, + bool enable_ttmp_setup) +{ + int i = 0, ret = 0; + + if (p->is_runtime_retry) + goto retry; + + if (p->runtime_info.runtime_state != DEBUG_RUNTIME_STATE_DISABLED) + return -EBUSY; + + for (i = 0; i < p->n_pdds; i++) { + struct kfd_process_device *pdd = p->pdds[i]; + + if (pdd->qpd.queue_count) + return -EEXIST; + } + + p->runtime_info.runtime_state = DEBUG_RUNTIME_STATE_ENABLED; + p->runtime_info.r_debug = r_debug; + p->runtime_info.ttmp_setup = enable_ttmp_setup; + + if (p->runtime_info.ttmp_setup) { + for (i = 0; i < p->n_pdds; i++) { + struct kfd_process_device *pdd = p->pdds[i]; + + if (!kfd_dbg_is_rlc_restore_supported(pdd->dev)) { + amdgpu_gfx_off_ctrl(pdd->dev->adev, false); + pdd->dev->kfd2kgd->enable_debug_trap( + pdd->dev->adev, + true, +
[PATCH 13/33] drm/amdkfd: prepare map process for single process debug devices
Older HW only supports debugging on a single process because the SPI debug mode setting registers are device global. The HWS has supplied a single pinned VMID (0xf) for MAP_PROCESS for debug purposes. To pin the VMID, the KFD will remove the VMID from the HWS dynamic VMID allocation via SET_RESOUCES so that a debugged process will never migrate away from its pinned VMID. The KFD is responsible for reserving and releasing this pinned VMID accordingly whenever the debugger attaches and detaches respectively. v2: spot fix ups using new kfd_node references Signed-off-by: Jonathan Kim --- .../drm/amd/amdkfd/kfd_device_queue_manager.c | 93 +++ .../drm/amd/amdkfd/kfd_device_queue_manager.h | 5 + .../drm/amd/amdkfd/kfd_packet_manager_v9.c| 9 ++ .../gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h | 5 +- 4 files changed, 111 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c index d1f44feb7084..c8519adc89ac 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c @@ -1524,6 +1524,7 @@ static int initialize_cpsch(struct device_queue_manager *dqm) dqm->gws_queue_count = 0; dqm->active_runlist = false; INIT_WORK(&dqm->hw_exception_work, kfd_process_hw_exception); + dqm->trap_debug_vmid = 0; init_sdma_bitmaps(dqm); @@ -2500,6 +2501,98 @@ static void kfd_process_hw_exception(struct work_struct *work) amdgpu_amdkfd_gpu_reset(dqm->dev->adev); } +int reserve_debug_trap_vmid(struct device_queue_manager *dqm, + struct qcm_process_device *qpd) +{ + int r; + int updated_vmid_mask; + + if (dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS) { + pr_err("Unsupported on sched_policy: %i\n", dqm->sched_policy); + return -EINVAL; + } + + dqm_lock(dqm); + + if (dqm->trap_debug_vmid != 0) { + pr_err("Trap debug id already reserved\n"); + r = -EBUSY; + goto out_unlock; + } + + r = unmap_queues_cpsch(dqm, KFD_UNMAP_QUEUES_FILTER_ALL_QUEUES, 0, + USE_DEFAULT_GRACE_PERIOD, false); + if (r) + goto out_unlock; + + updated_vmid_mask = dqm->dev->kfd->shared_resources.compute_vmid_bitmap; + updated_vmid_mask &= ~(1 << dqm->dev->vm_info.last_vmid_kfd); + + dqm->dev->kfd->shared_resources.compute_vmid_bitmap = updated_vmid_mask; + dqm->trap_debug_vmid = dqm->dev->vm_info.last_vmid_kfd; + r = set_sched_resources(dqm); + if (r) + goto out_unlock; + + r = map_queues_cpsch(dqm); + if (r) + goto out_unlock; + + pr_debug("Reserved VMID for trap debug: %i\n", dqm->trap_debug_vmid); + +out_unlock: + dqm_unlock(dqm); + return r; +} + +/* + * Releases vmid for the trap debugger + */ +int release_debug_trap_vmid(struct device_queue_manager *dqm, + struct qcm_process_device *qpd) +{ + int r; + int updated_vmid_mask; + uint32_t trap_debug_vmid; + + if (dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS) { + pr_err("Unsupported on sched_policy: %i\n", dqm->sched_policy); + return -EINVAL; + } + + dqm_lock(dqm); + trap_debug_vmid = dqm->trap_debug_vmid; + if (dqm->trap_debug_vmid == 0) { + pr_err("Trap debug id is not reserved\n"); + r = -EINVAL; + goto out_unlock; + } + + r = unmap_queues_cpsch(dqm, KFD_UNMAP_QUEUES_FILTER_ALL_QUEUES, 0, + USE_DEFAULT_GRACE_PERIOD, false); + if (r) + goto out_unlock; + + updated_vmid_mask = dqm->dev->kfd->shared_resources.compute_vmid_bitmap; + updated_vmid_mask |= (1 << dqm->dev->vm_info.last_vmid_kfd); + + dqm->dev->kfd->shared_resources.compute_vmid_bitmap = updated_vmid_mask; + dqm->trap_debug_vmid = 0; + r = set_sched_resources(dqm); + if (r) + goto out_unlock; + + r = map_queues_cpsch(dqm); + if (r) + goto out_unlock; + + pr_debug("Released VMID for trap debug: %i\n", trap_debug_vmid); + +out_unlock: + dqm_unlock(dqm); + return r; +} + #if defined(CONFIG_DEBUG_FS) static void seq_reg_dump(struct seq_file *m, diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h index d4dd3b4acbf0..bf7aa3f84182 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h @@ -250,6 +250,7 @@ struct device_queue_manager { struct kfd_mem_obj *fence_mem; boolactive_runlist; int sched_policy; + uint32_ttrap_debug_vmid
[PATCH 08/33] drm/amdkfd: fix kfd_suspend_all_processes
Flush delayed restore work in kfd_suspend_all_queues instead of cancelling. Cancelling the work before it runs results in the queues becoming permanently disabled. Flushing the work ensures that the queue suspend/resume state stays balanced. Signed-off-by: Jonathan Kim Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_process.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c index af0a4b5257cc..d63a764dafb9 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c @@ -2014,7 +2014,7 @@ void kfd_suspend_all_processes(void) WARN(debug_evictions, "Evicting all processes"); hash_for_each_rcu(kfd_processes_table, temp, p, kfd_processes) { cancel_delayed_work_sync(&p->eviction_work); - cancel_delayed_work_sync(&p->restore_work); + flush_delayed_work(&p->restore_work); if (kfd_process_evict_queues(p, KFD_QUEUE_EVICTION_TRIGGER_SUSPEND)) pr_err("Failed to suspend process 0x%x\n", p->pasid); -- 2.25.1
[PATCH 07/33] drm/amdgpu: add gfx9.4.1 hw debug mode enable and disable calls
On GFX9.4.1, the implicit wait count instruction on s_barrier is disabled by default in the driver during normal operation for performance requirements. There is a hardware bug in GFX9.4.1 where if the implicit wait count instruction after an s_barrier instruction is disabled, any wave that hits an exception may step over the s_barrier when returning from the trap handler with the barrier logic having no ability to be aware of this, thereby causing other waves to wait at the barrier indefinitely resulting in a shader hang. This bug has been corrected for GFX9.4.2 and onward. Since the debugger subscribes to hardware exceptions, in order to avoid this bug, the debugger must enable implicit wait count on s_barrier for a debug session and disable it on detach. In order to change this setting in the in the device global SQ_CONFIG register, the GFX pipeline must be idle. GFX9.4.1 as a compute device will either dispatch work through the compute ring buffers used for image post processing or through the hardware scheduler by the KFD. Have the KGD suspend and drain the compute ring buffer, then suspend the hardware scheduler and block any future KFD process job requests before changing the implicit wait count setting. Once set, resume all work. Signed-off-by: Jonathan Kim Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 3 + .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 116 ++ drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 +- 3 files changed, 121 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 5af954abd5ba..7c49cdc37d95 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1065,6 +1065,9 @@ struct amdgpu_device { struct pci_saved_state *pci_state; pci_channel_state_t pci_channel_state; + /* Track auto wait count on s_barrier settings */ + boolbarrier_has_auto_waitcnt; + struct amdgpu_reset_control *reset_cntl; uint32_t ip_versions[MAX_HWIP][HWIP_MAX_INSTANCE]; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c index 4191af5a3f13..d2918e5c0dea 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c @@ -26,6 +26,7 @@ #include "amdgpu.h" #include "amdgpu_amdkfd.h" #include "amdgpu_amdkfd_arcturus.h" +#include "amdgpu_reset.h" #include "sdma0/sdma0_4_2_2_offset.h" #include "sdma0/sdma0_4_2_2_sh_mask.h" #include "sdma1/sdma1_4_2_2_offset.h" @@ -48,6 +49,8 @@ #include "amdgpu_amdkfd_gfx_v9.h" #include "gfxhub_v1_0.h" #include "mmhub_v9_4.h" +#include "gc/gc_9_0_offset.h" +#include "gc/gc_9_0_sh_mask.h" #define HQD_N_REGS 56 #define DUMP_REG(addr) do {\ @@ -276,6 +279,117 @@ int kgd_arcturus_hqd_sdma_destroy(struct amdgpu_device *adev, void *mqd, return 0; } +/* + * Helper used to suspend/resume gfx pipe for image post process work to set + * barrier behaviour. + */ +static int suspend_resume_compute_scheduler(struct amdgpu_device *adev, bool suspend) +{ + int i, r = 0; + + for (i = 0; i < adev->gfx.num_compute_rings; i++) { + struct amdgpu_ring *ring = &adev->gfx.compute_ring[i]; + + if (!(ring && ring->sched.thread)) + continue; + + /* stop secheduler and drain ring. */ + if (suspend) { + drm_sched_stop(&ring->sched, NULL); + r = amdgpu_fence_wait_empty(ring); + if (r) + goto out; + } else { + drm_sched_start(&ring->sched, false); + } + } + +out: + /* return on resume or failure to drain rings. */ + if (!suspend || r) + return r; + + return amdgpu_device_ip_wait_for_idle(adev, GC_HWIP); +} + +static void set_barrier_auto_waitcnt(struct amdgpu_device *adev, bool enable_waitcnt) +{ + uint32_t data; + + WRITE_ONCE(adev->barrier_has_auto_waitcnt, enable_waitcnt); + + if (!down_read_trylock(&adev->reset_domain->sem)) + return; + + amdgpu_amdkfd_suspend(adev, false); + + if (suspend_resume_compute_scheduler(adev, true)) + goto out; + + data = RREG32(SOC15_REG_OFFSET(GC, 0, mmSQ_CONFIG)); + data = REG_SET_FIELD(data, SQ_CONFIG, DISABLE_BARRIER_WAITCNT, + !enable_waitcnt); + WREG32(SOC15_REG_OFFSET(GC, 0, mmSQ_CONFIG), data); + +out: + suspend_resume_compute_scheduler(adev, false); + + amdgpu_amdkfd_resume(adev, false); + + up_read(&adev->reset_domain->sem); +} + +/* + * restore_dbg_registers is ignored here but is a general i
[PATCH 12/33] drm/amdgpu: add configurable grace period for unmap queues
The HWS schedule allows a grace period for wave completion prior to preemption for better performance by avoiding CWSR on waves that can potentially complete quickly. The debugger, on the other hand, will want to inspect wave status immediately after it actively triggers preemption (a suspend function to be provided). To minimize latency between preemption and debugger wave inspection, allow immediate preemption by setting the grace period to 0. Note that setting the preepmtion grace period to 0 will result in an infinite grace period being set due to a CP FW bug so set it to 1 for now. v2: add null grace period function pointers to VI packet manager. Signed-off-by: Jonathan Kim --- .../drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c | 2 + .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 2 + .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 43 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.h| 6 ++ .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c | 2 + .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 43 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.h | 8 ++- .../drm/amd/amdkfd/kfd_device_queue_manager.c | 63 +- .../drm/amd/amdkfd/kfd_device_queue_manager.h | 3 + .../gpu/drm/amd/amdkfd/kfd_packet_manager.c | 32 + .../drm/amd/amdkfd/kfd_packet_manager_v9.c| 39 +++ .../drm/amd/amdkfd/kfd_packet_manager_vi.c| 2 + .../gpu/drm/amd/amdkfd/kfd_pm4_headers_ai.h | 65 +++ drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 5 ++ 14 files changed, 295 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c index a6f98141c29c..b811a0985050 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c @@ -82,5 +82,7 @@ const struct kfd2kgd_calls aldebaran_kfd2kgd = { .get_cu_occupancy = kgd_gfx_v9_get_cu_occupancy, .enable_debug_trap = kgd_aldebaran_enable_debug_trap, .disable_debug_trap = kgd_aldebaran_disable_debug_trap, + .get_iq_wait_times = kgd_gfx_v9_get_iq_wait_times, + .build_grace_period_packet_info = kgd_gfx_v9_build_grace_period_packet_info, .program_trap_handler_settings = kgd_gfx_v9_program_trap_handler_settings, }; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c index d2918e5c0dea..a62bd0068515 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c @@ -410,6 +410,8 @@ const struct kfd2kgd_calls arcturus_kfd2kgd = { kgd_gfx_v9_set_vm_context_page_table_base, .enable_debug_trap = kgd_arcturus_enable_debug_trap, .disable_debug_trap = kgd_arcturus_disable_debug_trap, + .get_iq_wait_times = kgd_gfx_v9_get_iq_wait_times, + .build_grace_period_packet_info = kgd_gfx_v9_build_grace_period_packet_info, .get_cu_occupancy = kgd_gfx_v9_get_cu_occupancy, .program_trap_handler_settings = kgd_gfx_v9_program_trap_handler_settings }; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c index 240f5006e278..98006c7021dd 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c @@ -803,6 +803,47 @@ uint32_t kgd_gfx_v10_disable_debug_trap(struct amdgpu_device *adev, return 0; } +/* kgd_gfx_v10_get_iq_wait_times: Returns the mmCP_IQ_WAIT_TIME1/2 values + * The values read are: + * ib_offload_wait_time -- Wait Count for Indirect Buffer Offloads. + * atomic_offload_wait_time -- Wait Count for L2 and GDS Atomics Offloads. + * wrm_offload_wait_time-- Wait Count for WAIT_REG_MEM Offloads. + * gws_wait_time-- Wait Count for Global Wave Syncs. + * que_sleep_wait_time -- Wait Count for Dequeue Retry. + * sch_wave_wait_time -- Wait Count for Scheduling Wave Message. + * sem_rearm_wait_time -- Wait Count for Semaphore re-arm. + * deq_retry_wait_time -- Wait Count for Global Wave Syncs. + */ +void kgd_gfx_v10_get_iq_wait_times(struct amdgpu_device *adev, + uint32_t *wait_times) + +{ + *wait_times = RREG32(SOC15_REG_OFFSET(GC, 0, mmCP_IQ_WAIT_TIME2)); +} + +void kgd_gfx_v10_build_grace_period_packet_info(struct amdgpu_device *adev, + uint32_t wait_times, + uint32_t grace_period, + uint32_t *reg_offset, + uint32_t *reg_data) +{ + *reg_data = wait_times; + + /* +* The CP cannont handle a 0 grace period input and will result in +* an infinite grace period being set so set to
[PATCH 14/33] drm/amdgpu: prepare map process for multi-process debug devices
Unlike single process debug devices, multi-process debug devices allow debug mode setting per-VMID (non-device-global). Because the HWS manages PASID-VMID mapping, the new MAP_PROCESS API allows the KFD to forward the required SPI debug register write requests. To request a new debug mode setting change, the KFD must be able to preempt all queues then remap all queues with these new setting requests for MAP_PROCESS to take effect. Note that by default, trap enablement in non-debug mode must be disabled for performance reasons for multi-process debug devices due to setup overhead in FW. v2: spot fixup new kfd_node references Signed-off-by: Jonathan Kim --- drivers/gpu/drm/amd/amdkfd/kfd_debug.h| 5 ++ .../drm/amd/amdkfd/kfd_device_queue_manager.c | 51 +++ .../drm/amd/amdkfd/kfd_device_queue_manager.h | 3 ++ .../drm/amd/amdkfd/kfd_packet_manager_v9.c| 14 + drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 9 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 5 ++ 6 files changed, 87 insertions(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h index a8abfe2a0a14..db6d72e7930f 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h @@ -29,4 +29,9 @@ int kfd_dbg_trap_disable(struct kfd_process *target); int kfd_dbg_trap_enable(struct kfd_process *target, uint32_t fd, void __user *runtime_info, uint32_t *runtime_info_size); +static inline bool kfd_dbg_is_per_vmid_supported(struct kfd_node *dev) +{ + return KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 2); +} + #endif diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c index c8519adc89ac..badfe1210bc4 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c @@ -36,6 +36,7 @@ #include "kfd_kernel_queue.h" #include "amdgpu_amdkfd.h" #include "mes_api_def.h" +#include "kfd_debug.h" /* Size of the per-pipe EOP queue */ #define CIK_HPD_EOP_BYTES_LOG2 11 @@ -2593,6 +2594,56 @@ int release_debug_trap_vmid(struct device_queue_manager *dqm, return r; } +int debug_lock_and_unmap(struct device_queue_manager *dqm) +{ + int r; + + if (dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS) { + pr_err("Unsupported on sched_policy: %i\n", dqm->sched_policy); + return -EINVAL; + } + + if (!kfd_dbg_is_per_vmid_supported(dqm->dev)) + return 0; + + dqm_lock(dqm); + + r = unmap_queues_cpsch(dqm, KFD_UNMAP_QUEUES_FILTER_ALL_QUEUES, 0, 0, false); + if (r) + dqm_unlock(dqm); + + return r; +} + +int debug_map_and_unlock(struct device_queue_manager *dqm) +{ + int r; + + if (dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS) { + pr_err("Unsupported on sched_policy: %i\n", dqm->sched_policy); + return -EINVAL; + } + + if (!kfd_dbg_is_per_vmid_supported(dqm->dev)) + return 0; + + r = map_queues_cpsch(dqm); + + dqm_unlock(dqm); + + return r; +} + +int debug_refresh_runlist(struct device_queue_manager *dqm) +{ + int r = debug_lock_and_unmap(dqm); + + if (r) + return r; + + return debug_map_and_unlock(dqm); +} + #if defined(CONFIG_DEBUG_FS) static void seq_reg_dump(struct seq_file *m, diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h index bf7aa3f84182..bb75d93712eb 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h @@ -290,6 +290,9 @@ int reserve_debug_trap_vmid(struct device_queue_manager *dqm, struct qcm_process_device *qpd); int release_debug_trap_vmid(struct device_queue_manager *dqm, struct qcm_process_device *qpd); +int debug_lock_and_unmap(struct device_queue_manager *dqm); +int debug_map_and_unlock(struct device_queue_manager *dqm); +int debug_refresh_runlist(struct device_queue_manager *dqm); static inline unsigned int get_sh_mem_bases_32(struct kfd_process_device *pdd) { diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c index 0fe73dbd28af..29a2d0499b67 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c @@ -88,6 +88,10 @@ static int pm_map_process_aldebaran(struct packet_manager *pm, { struct pm4_mes_map_process_aldebaran *packet; uint64_t vm_page_table_base_addr = qpd->page_table_base; + struct kfd_dev *kfd = pm->dqm->dev->kfd; + struct kfd_process_device *pdd = + container_of(qpd, struct kfd_process_device, qpd); + int i; packet = (s
[PATCH 09/33] drm/amdgpu: add gfx10 hw debug mode enable and disable calls
Similar to GFX9 debug devices, set the hardware debug mode by draining the SPI appropriately prior the mode setting request. Because GFX10 has waves allocated by the work group boundary and each SE's SPI instances do not communicate, the SPI drain time is much longer. This long drain time will be fixed for GFX11 onwards. Also remove a bunch of deprecated misplaced references for GFX10.3. Signed-off-by: Jonathan Kim Reviewed-by: Felix Kuehling --- .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 96 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.h| 28 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c | 148 +- 3 files changed, 127 insertions(+), 145 deletions(-) create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.h diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c index 7b60268d93c0..240f5006e278 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c @@ -21,6 +21,7 @@ */ #include "amdgpu.h" #include "amdgpu_amdkfd.h" +#include "amdgpu_amdkfd_gfx_v10.h" #include "gc/gc_10_1_0_offset.h" #include "gc/gc_10_1_0_sh_mask.h" #include "athub/athub_2_0_0_offset.h" @@ -709,6 +710,99 @@ static void set_vm_context_page_table_base(struct amdgpu_device *adev, adev->gfxhub.funcs->setup_vm_pt_regs(adev, vmid, page_table_base); } +/* + * GFX10 helper for wave launch stall requirements on debug trap setting. + * + * vmid: + * Target VMID to stall/unstall. + * + * stall: + * 0-unstall wave launch (enable), 1-stall wave launch (disable). + * After wavefront launch has been stalled, allocated waves must drain from + * SPI in order for debug trap settings to take effect on those waves. + * This is roughly a ~3500 clock cycle wait on SPI where a read on + * SPI_GDBG_WAVE_CNTL translates to ~32 clock cycles. + * KGD_GFX_V10_WAVE_LAUNCH_SPI_DRAIN_LATENCY indicates the number of reads required. + * + * NOTE: We can afford to clear the entire STALL_VMID field on unstall + * because current GFX10 chips cannot support multi-process debugging due to + * trap configuration and masking being limited to global scope. Always + * assume single process conditions. + * + */ + +#define KGD_GFX_V10_WAVE_LAUNCH_SPI_DRAIN_LATENCY 110 +static void kgd_gfx_v10_set_wave_launch_stall(struct amdgpu_device *adev, uint32_t vmid, bool stall) +{ + uint32_t data = RREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_WAVE_CNTL)); + int i; + + data = REG_SET_FIELD(data, SPI_GDBG_WAVE_CNTL, STALL_VMID, + stall ? 1 << vmid : 0); + + WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_WAVE_CNTL), data); + + if (!stall) + return; + + for (i = 0; i < KGD_GFX_V10_WAVE_LAUNCH_SPI_DRAIN_LATENCY; i++) + RREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_WAVE_CNTL)); +} + +uint32_t kgd_gfx_v10_enable_debug_trap(struct amdgpu_device *adev, + bool restore_dbg_registers, + uint32_t vmid) +{ + + mutex_lock(&adev->grbm_idx_mutex); + + kgd_gfx_v10_set_wave_launch_stall(adev, vmid, true); + + /* assume gfx off is disabled for the debug session if rlc restore not supported. */ + if (restore_dbg_registers) { + uint32_t data = 0; + + data = REG_SET_FIELD(data, SPI_GDBG_TRAP_CONFIG, + VMID_SEL, 1 << vmid); + data = REG_SET_FIELD(data, SPI_GDBG_TRAP_CONFIG, + TRAP_EN, 1); + WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_CONFIG), data); + WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_DATA0), 0); + WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_DATA1), 0); + + kgd_gfx_v10_set_wave_launch_stall(adev, vmid, false); + + mutex_unlock(&adev->grbm_idx_mutex); + + return 0; + } + + WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_MASK), 0); + + kgd_gfx_v10_set_wave_launch_stall(adev, vmid, false); + + mutex_unlock(&adev->grbm_idx_mutex); + + return 0; +} + +uint32_t kgd_gfx_v10_disable_debug_trap(struct amdgpu_device *adev, + bool keep_trap_enabled, + uint32_t vmid) +{ + mutex_lock(&adev->grbm_idx_mutex); + + kgd_gfx_v10_set_wave_launch_stall(adev, vmid, true); + + WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_MASK), 0); + + kgd_gfx_v10_set_wave_launch_stall(adev, vmid, false); + + mutex_unlock(&adev->grbm_idx_mutex); + + return 0; +} + static void program_trap_handler_settings(struct amdgpu_device *adev, uint32_t vmid, uint64_t tba_addr, uint64_t tma_addr, uint32_t inst) @@ -752,5 +846,7 @@ const struct kfd2kgd_calls gfx_v10_kfd2kgd
[PATCH 05/33] drm/amdgpu: setup hw debug registers on driver initialization
Add missing debug trap registers references and initialize all debug registers on boot by clearing the hardware exception overrides and the wave allocation ID index. The debugger requires that TTMPs 6 & 7 save the dispatch ID to map waves onto dispatch during compute context inspection. In order to correctly set this up, set the special reserved CP bit by default whenever the MQD is initailized. v2: add missing 0-init of SPI_GDBG_TRAP_DATA0/1 Signed-off-by: Jonathan Kim --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c| 26 +++ drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c| 1 + drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 30 drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c | 3 + .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c | 5 ++ .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c | 5 ++ .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c | 5 ++ .../include/asic_reg/gc/gc_10_1_0_offset.h| 14 .../include/asic_reg/gc/gc_10_1_0_sh_mask.h | 69 +++ .../include/asic_reg/gc/gc_10_3_0_offset.h| 10 +++ .../include/asic_reg/gc/gc_10_3_0_sh_mask.h | 4 ++ .../include/asic_reg/gc/gc_11_0_0_sh_mask.h | 4 ++ 12 files changed, 176 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c index f7ad883a70fa..be984f8c71c7 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c @@ -4825,6 +4825,29 @@ static u32 gfx_v10_0_init_pa_sc_tile_steering_override(struct amdgpu_device *ade #define DEFAULT_SH_MEM_BASES (0x6000) +static void gfx_v10_0_debug_trap_config_init(struct amdgpu_device *adev, + uint32_t first_vmid, + uint32_t last_vmid) +{ + uint32_t data; + uint32_t trap_config_vmid_mask = 0; + int i; + + /* Calculate trap config vmid mask */ + for (i = first_vmid; i < last_vmid; i++) + trap_config_vmid_mask |= (1 << i); + + data = REG_SET_FIELD(0, SPI_GDBG_TRAP_CONFIG, + VMID_SEL, trap_config_vmid_mask); + data = REG_SET_FIELD(data, SPI_GDBG_TRAP_CONFIG, + TRAP_EN, 1); + WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_CONFIG), data); + WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_MASK), 0); + + WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_DATA0), 0); + WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_DATA1), 0); +} + static void gfx_v10_0_init_compute_vmid(struct amdgpu_device *adev) { int i; @@ -4856,6 +4879,9 @@ static void gfx_v10_0_init_compute_vmid(struct amdgpu_device *adev) WREG32_SOC15_OFFSET(GC, 0, mmGDS_GWS_VMID0, i, 0); WREG32_SOC15_OFFSET(GC, 0, mmGDS_OA_VMID0, i, 0); } + + gfx_v10_0_debug_trap_config_init(adev, adev->vm_manager.first_kfd_vmid, + AMDGPU_NUM_VMID); } static void gfx_v10_0_init_gds_vmid(struct amdgpu_device *adev) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c index da21bf868080..690e121d9dda 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c @@ -1638,6 +1638,7 @@ static void gfx_v11_0_init_compute_vmid(struct amdgpu_device *adev) /* Enable trap for each kfd vmid. */ data = RREG32_SOC15(GC, 0, regSPI_GDBG_PER_VMID_CNTL); data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, TRAP_EN, 1); + WREG32_SOC15(GC, 0, regSPI_GDBG_PER_VMID_CNTL, data); } soc21_grbm_select(adev, 0, 0, 0, 0); mutex_unlock(&adev->srbm_mutex); diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c index 0189e50bd89f..7f17e0061027 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c @@ -2303,6 +2303,29 @@ static void gfx_v9_0_setup_rb(struct amdgpu_device *adev) adev->gfx.config.num_rbs = hweight32(active_rbs); } +static void gfx_v9_0_debug_trap_config_init(struct amdgpu_device *adev, + uint32_t first_vmid, + uint32_t last_vmid) +{ + uint32_t data; + uint32_t trap_config_vmid_mask = 0; + int i; + + /* Calculate trap config vmid mask */ + for (i = first_vmid; i < last_vmid; i++) + trap_config_vmid_mask |= (1 << i); + + data = REG_SET_FIELD(0, SPI_GDBG_TRAP_CONFIG, + VMID_SEL, trap_config_vmid_mask); + data = REG_SET_FIELD(data, SPI_GDBG_TRAP_CONFIG, + TRAP_EN, 1); + WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_CONFIG), data); + WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_MASK), 0); + + WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_DATA0), 0); + WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_DATA1), 0); +} + #define DEFAULT_SH_MEM_BASES (0x6000)
[PATCH 06/33] drm/amdgpu: add gfx9 hw debug mode enable and disable calls
Implement the per-device calls to enable or disable HW debug mode for GFX9 prior to GFX9.4.1. GFX9.4.1 and onward will require their own enable/disable sequence as follow on patches. When hardware debug mode setting is requested, waves will inherit these settings in the Shader Processor Input's (SPI) Sequencer Global Block (SQG). This means that the KGD must drain all waves from the SPI into SQG (approximately 96 SPI clock cycles) prior to debug mode setting to ensure that the order of operations that the debugger expects with regards to debug mode setting transaction requests and wave inheritence of that mode is upheld. Also ensure that exception overrides are reset to their original state prior to debug enable or disable. Signed-off-by: Jonathan Kim Reviewed-by: Felix Kuehling --- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 92 +++ .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.h | 9 ++ 2 files changed, 101 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c index 9fa9aab22fe9..64da5995170c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c @@ -649,6 +649,96 @@ int kgd_gfx_v9_wave_control_execute(struct amdgpu_device *adev, return 0; } +/* + * GFX9 helper for wave launch stall requirements on debug trap setting. + * + * vmid: + * Target VMID to stall/unstall. + * + * stall: + * 0-unstall wave launch (enable), 1-stall wave launch (disable). + * After wavefront launch has been stalled, allocated waves must drain from + * SPI in order for debug trap settings to take effect on those waves. + * This is roughly a ~96 clock cycle wait on SPI where a read on + * SPI_GDBG_WAVE_CNTL translates to ~32 clock cycles. + * KGD_GFX_V9_WAVE_LAUNCH_SPI_DRAIN_LATENCY indicates the number of reads required. + * + * NOTE: We can afford to clear the entire STALL_VMID field on unstall + * because GFX9.4.1 cannot support multi-process debugging due to trap + * configuration and masking being limited to global scope. Always assume + * single process conditions. + */ +#define KGD_GFX_V9_WAVE_LAUNCH_SPI_DRAIN_LATENCY 3 +void kgd_gfx_v9_set_wave_launch_stall(struct amdgpu_device *adev, + uint32_t vmid, + bool stall) +{ + int i; + uint32_t data = RREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_WAVE_CNTL)); + + if (adev->ip_versions[GC_HWIP][0] == IP_VERSION(9, 4, 1)) + data = REG_SET_FIELD(data, SPI_GDBG_WAVE_CNTL, STALL_VMID, + stall ? 1 << vmid : 0); + else + data = REG_SET_FIELD(data, SPI_GDBG_WAVE_CNTL, STALL_RA, + stall ? 1 : 0); + + WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_WAVE_CNTL), data); + + if (!stall) + return; + + for (i = 0; i < KGD_GFX_V9_WAVE_LAUNCH_SPI_DRAIN_LATENCY; i++) + RREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_WAVE_CNTL)); +} + +/* + * restore_dbg_registers is ignored here but is a general interface requirement + * for devices that support GFXOFF and where the RLC save/restore list + * does not support hw registers for debugging i.e. the driver has to manually + * initialize the debug mode registers after it has disabled GFX off during the + * debug session. + */ +uint32_t kgd_gfx_v9_enable_debug_trap(struct amdgpu_device *adev, + bool restore_dbg_registers, + uint32_t vmid) +{ + mutex_lock(&adev->grbm_idx_mutex); + + kgd_gfx_v9_set_wave_launch_stall(adev, vmid, true); + + WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_MASK), 0); + + kgd_gfx_v9_set_wave_launch_stall(adev, vmid, false); + + mutex_unlock(&adev->grbm_idx_mutex); + + return 0; +} + +/* + * keep_trap_enabled is ignored here but is a general interface requirement + * for devices that support multi-process debugging where the performance + * overhead from trap temporary setup needs to be bypassed when the debug + * session has ended. + */ +uint32_t kgd_gfx_v9_disable_debug_trap(struct amdgpu_device *adev, + bool keep_trap_enabled, + uint32_t vmid) +{ + mutex_lock(&adev->grbm_idx_mutex); + + kgd_gfx_v9_set_wave_launch_stall(adev, vmid, true); + + WREG32(SOC15_REG_OFFSET(GC, 0, mmSPI_GDBG_TRAP_MASK), 0); + + kgd_gfx_v9_set_wave_launch_stall(adev, vmid, false); + + mutex_unlock(&adev->grbm_idx_mutex); + + return 0; +} + void kgd_gfx_v9_set_vm_context_page_table_base(struct amdgpu_device *adev, uint32_t vmid, uint64_t page_table_base) { @@ -875,6 +965,8 @@ const struct kfd2kgd_calls gfx_v9_kfd2kgd = { .get_atc_vmid_pasid_mapping_info =
[PATCH 04/33] drm/amdgpu: add kgd hw debug mode setting interface
Introduce the require KGD debug calls that will execute hardware debug mode setting. Signed-off-by: Jonathan Kim Reviewed-by: Felix Kuehling --- .../gpu/drm/amd/include/kgd_kfd_interface.h | 34 +++ 1 file changed, 34 insertions(+) diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h index 8cb3dbcae3e4..d0df3381539f 100644 --- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h +++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h @@ -291,6 +291,40 @@ struct kfd2kgd_calls { uint32_t vmid, uint64_t page_table_base); uint32_t (*read_vmid_from_vmfault_reg)(struct amdgpu_device *adev); + uint32_t (*enable_debug_trap)(struct amdgpu_device *adev, + bool restore_dbg_registers, + uint32_t vmid); + uint32_t (*disable_debug_trap)(struct amdgpu_device *adev, + bool keep_trap_enabled, + uint32_t vmid); + int (*validate_trap_override_request)(struct amdgpu_device *adev, + uint32_t trap_override, + uint32_t *trap_mask_supported); + uint32_t (*set_wave_launch_trap_override)(struct amdgpu_device *adev, +uint32_t vmid, +uint32_t trap_override, +uint32_t trap_mask_bits, +uint32_t trap_mask_request, +uint32_t *trap_mask_prev, +uint32_t kfd_dbg_trap_cntl_prev); + uint32_t (*set_wave_launch_mode)(struct amdgpu_device *adev, + uint8_t wave_launch_mode, + uint32_t vmid); + uint32_t (*set_address_watch)(struct amdgpu_device *adev, + uint64_t watch_address, + uint32_t watch_address_mask, + uint32_t watch_id, + uint32_t watch_mode, + uint32_t debug_vmid); + uint32_t (*clear_address_watch)(struct amdgpu_device *adev, + uint32_t watch_id); + void (*get_iq_wait_times)(struct amdgpu_device *adev, + uint32_t *wait_times); + void (*build_grace_period_packet_info)(struct amdgpu_device *adev, + uint32_t wait_times, + uint32_t grace_period, + uint32_t *reg_offset, + uint32_t *reg_data); void (*get_cu_occupancy)(struct amdgpu_device *adev, int pasid, int *wave_cnt, int *max_waves_per_cu, uint32_t inst); void (*program_trap_handler_settings)(struct amdgpu_device *adev, -- 2.25.1
[PATCH 03/33] drm/amdkfd: prepare per-process debug enable and disable
The ROCm debugger will attach to a process to debug by PTRACE and will expect the KFD to prepare a process for the target PID, whether the target PID has opened the KFD device or not. This patch is to explicity handle this requirement. Further HW mode setting and runtime coordination requirements will be handled in following patches. In the case where the target process has not opened the KFD device, a new KFD process must be created for the target PID. The debugger as well as the target process for this case will have not acquired any VMs so handle process restoration to correctly account for this. To coordinate with HSA runtime, the debugger must be aware of the target process' runtime enablement status and will copy the runtime status information into the debugged KFD process for later query. On enablement, the debugger will subscribe to a set of exceptions where each exception events will notify the debugger through a pollable FIFO file descriptor that the debugger provides to the KFD to manage. Finally on process termination of either the debugger or the target, debugging must be disabled if it has not been done so. Signed-off-by: Jonathan Kim Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/Makefile | 3 +- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 102 +- drivers/gpu/drm/amd/amdkfd/kfd_debug.c| 80 ++ drivers/gpu/drm/amd/amdkfd/kfd_debug.h| 32 ++ .../drm/amd/amdkfd/kfd_device_queue_manager.c | 26 +++-- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 31 +- drivers/gpu/drm/amd/amdkfd/kfd_process.c | 60 +++ 7 files changed, 304 insertions(+), 30 deletions(-) create mode 100644 drivers/gpu/drm/amd/amdkfd/kfd_debug.c create mode 100644 drivers/gpu/drm/amd/amdkfd/kfd_debug.h diff --git a/drivers/gpu/drm/amd/amdkfd/Makefile b/drivers/gpu/drm/amd/amdkfd/Makefile index e758c2a24cd0..747754428073 100644 --- a/drivers/gpu/drm/amd/amdkfd/Makefile +++ b/drivers/gpu/drm/amd/amdkfd/Makefile @@ -55,7 +55,8 @@ AMDKFD_FILES := $(AMDKFD_PATH)/kfd_module.o \ $(AMDKFD_PATH)/kfd_int_process_v9.o \ $(AMDKFD_PATH)/kfd_int_process_v11.o \ $(AMDKFD_PATH)/kfd_smi_events.o \ - $(AMDKFD_PATH)/kfd_crat.o + $(AMDKFD_PATH)/kfd_crat.o \ + $(AMDKFD_PATH)/kfd_debug.o ifneq ($(CONFIG_AMD_IOMMU_V2),) AMDKFD_FILES += $(AMDKFD_PATH)/kfd_iommu.o diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index f4b50b74818e..7082d5d0f0e9 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -44,6 +44,7 @@ #include "amdgpu_amdkfd.h" #include "kfd_smi_events.h" #include "amdgpu_dma_buf.h" +#include "kfd_debug.h" static long kfd_ioctl(struct file *, unsigned int, unsigned long); static int kfd_open(struct inode *, struct file *); @@ -142,10 +143,15 @@ static int kfd_open(struct inode *inode, struct file *filep) return -EPERM; } - process = kfd_create_process(filep); + process = kfd_create_process(current); if (IS_ERR(process)) return PTR_ERR(process); + if (kfd_process_init_cwsr_apu(process, filep)) { + kfd_unref_process(process); + return -EFAULT; + } + /* filep now owns the reference returned by kfd_create_process */ filep->private_data = process; @@ -2737,6 +2743,10 @@ static int kfd_ioctl_runtime_enable(struct file *filep, struct kfd_process *p, v static int kfd_ioctl_set_debug_trap(struct file *filep, struct kfd_process *p, void *data) { struct kfd_ioctl_dbg_trap_args *args = data; + struct task_struct *thread = NULL; + struct mm_struct *mm = NULL; + struct pid *pid = NULL; + struct kfd_process *target = NULL; int r = 0; if (sched_policy == KFD_SCHED_POLICY_NO_HWS) { @@ -2744,9 +2754,81 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, struct kfd_process *p, v return -EINVAL; } + pid = find_get_pid(args->pid); + if (!pid) { + pr_debug("Cannot find pid info for %i\n", args->pid); + r = -ESRCH; + goto out; + } + + thread = get_pid_task(pid, PIDTYPE_PID); + if (!thread) { + r = -ESRCH; + goto out; + } + + mm = get_task_mm(thread); + if (!mm) { + r = -ESRCH; + goto out; + } + + if (args->op == KFD_IOC_DBG_TRAP_ENABLE) { + bool create_process; + + rcu_read_lock(); + create_process = thread && thread != current && ptrace_parent(thread) == current; + rcu_read_unlock(); + + target = create_process ? kfd_create_process(thread) : + kfd_lookup_process_by_pid(pid); + } els
[PATCH 01/33] drm/amdkfd: add debug and runtime enable interface
Introduce the GPU debug operations interface. For ROCm-GDB to extend the GNU Debugger's ability to inspect the AMD GPU instruction set, provide the necessary interface to allow the debugger to HW debug-mode set and query exceptions per HSA queue, process or device. The runtime_enable interface coordinates exception handling with the HSA runtime. Usage is available in the kern docs at uapi/linux/kfd_ioctl.h. v2: add num_xcc to device snapshot entry. fixup missing EC_QUEUE_PACKET_RESERVED mask. Signed-off-by: Jonathan Kim --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 48 ++ include/uapi/linux/kfd_ioctl.h | 668 ++- 2 files changed, 715 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 88fe1f31739d..f4b50b74818e 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -2729,6 +2729,48 @@ static int kfd_ioctl_criu(struct file *filep, struct kfd_process *p, void *data) return ret; } +static int kfd_ioctl_runtime_enable(struct file *filep, struct kfd_process *p, void *data) +{ + return 0; +} + +static int kfd_ioctl_set_debug_trap(struct file *filep, struct kfd_process *p, void *data) +{ + struct kfd_ioctl_dbg_trap_args *args = data; + int r = 0; + + if (sched_policy == KFD_SCHED_POLICY_NO_HWS) { + pr_err("Debugging does not support sched_policy %i", sched_policy); + return -EINVAL; + } + + switch (args->op) { + case KFD_IOC_DBG_TRAP_ENABLE: + case KFD_IOC_DBG_TRAP_DISABLE: + case KFD_IOC_DBG_TRAP_SEND_RUNTIME_EVENT: + case KFD_IOC_DBG_TRAP_SET_EXCEPTIONS_ENABLED: + case KFD_IOC_DBG_TRAP_SET_WAVE_LAUNCH_OVERRIDE: + case KFD_IOC_DBG_TRAP_SET_WAVE_LAUNCH_MODE: + case KFD_IOC_DBG_TRAP_SUSPEND_QUEUES: + case KFD_IOC_DBG_TRAP_RESUME_QUEUES: + case KFD_IOC_DBG_TRAP_SET_NODE_ADDRESS_WATCH: + case KFD_IOC_DBG_TRAP_CLEAR_NODE_ADDRESS_WATCH: + case KFD_IOC_DBG_TRAP_SET_FLAGS: + case KFD_IOC_DBG_TRAP_QUERY_DEBUG_EVENT: + case KFD_IOC_DBG_TRAP_QUERY_EXCEPTION_INFO: + case KFD_IOC_DBG_TRAP_GET_QUEUE_SNAPSHOT: + case KFD_IOC_DBG_TRAP_GET_DEVICE_SNAPSHOT: + pr_warn("Debugging not supported yet\n"); + r = -EACCES; + break; + default: + pr_err("Invalid option: %i\n", args->op); + r = -EINVAL; + } + + return r; +} + #define AMDKFD_IOCTL_DEF(ioctl, _func, _flags) \ [_IOC_NR(ioctl)] = {.cmd = ioctl, .func = _func, .flags = _flags, \ .cmd_drv = 0, .name = #ioctl} @@ -2841,6 +2883,12 @@ static const struct amdkfd_ioctl_desc amdkfd_ioctls[] = { AMDKFD_IOCTL_DEF(AMDKFD_IOC_EXPORT_DMABUF, kfd_ioctl_export_dmabuf, 0), + + AMDKFD_IOCTL_DEF(AMDKFD_IOC_RUNTIME_ENABLE, + kfd_ioctl_runtime_enable, 0), + + AMDKFD_IOCTL_DEF(AMDKFD_IOC_DBG_TRAP, + kfd_ioctl_set_debug_trap, 0), }; #define AMDKFD_CORE_IOCTL_COUNTARRAY_SIZE(amdkfd_ioctls) diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h index 2a9671e1ddb5..dfe745ee427e 100644 --- a/include/uapi/linux/kfd_ioctl.h +++ b/include/uapi/linux/kfd_ioctl.h @@ -110,6 +110,32 @@ struct kfd_ioctl_get_available_memory_args { __u32 pad; }; +struct kfd_dbg_device_info_entry { + __u64 exception_status; + __u64 lds_base; + __u64 lds_limit; + __u64 scratch_base; + __u64 scratch_limit; + __u64 gpuvm_base; + __u64 gpuvm_limit; + __u32 gpu_id; + __u32 location_id; + __u32 vendor_id; + __u32 device_id; + __u32 revision_id; + __u32 subsystem_vendor_id; + __u32 subsystem_device_id; + __u32 fw_version; + __u32 gfx_target_version; + __u32 simd_count; + __u32 max_waves_per_simd; + __u32 array_count; + __u32 simd_arrays_per_engine; + __u32 num_xcc; + __u32 capability; + __u32 debug_prop; +}; + /* For kfd_ioctl_set_memory_policy_args.default_policy and alternate_policy */ #define KFD_IOC_CACHE_POLICY_COHERENT 0 #define KFD_IOC_CACHE_POLICY_NONCOHERENT 1 @@ -775,6 +801,640 @@ struct kfd_ioctl_set_xnack_mode_args { __s32 xnack_enabled; }; +/* Wave launch override modes */ +enum kfd_dbg_trap_override_mode { + KFD_DBG_TRAP_OVERRIDE_OR = 0, + KFD_DBG_TRAP_OVERRIDE_REPLACE = 1 +}; + +/* Wave launch overrides */ +enum kfd_dbg_trap_mask { + KFD_DBG_TRAP_MASK_FP_INVALID = 1, + KFD_DBG_TRAP_MASK_FP_INPUT_DENORMAL = 2, + KFD_DBG_TRAP_MASK_FP_DIVIDE_BY_ZERO = 4, + KFD_DBG_TRAP_MASK_FP_OVERFLOW = 8, + KFD_DBG_TRAP_MASK_FP_UNDERFLOW = 16, + KFD_DBG_TRAP_MASK_FP_INEXACT = 32, + KFD_DBG_TRAP_MASK_INT_DIVIDE_BY_ZERO = 64, + KFD_DBG_TR
[PATCH 02/33] drm/amdkfd: display debug capabilities
Expose debug capabilities in the KFD topology node's HSA capabilities and debug properties flags. Ensure correct capabilities are exposed based on firmware support. Flag definitions can be referenced in uapi/linux/kfd_sysfs.h. v2: rebase topology fw check fix with kfd_node struct update Signed-off-by: Jonathan Kim --- drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 101 -- drivers/gpu/drm/amd/amdkfd/kfd_topology.h | 6 ++ include/uapi/linux/kfd_sysfs.h| 15 3 files changed, 117 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c index 8302d8967158..3def25b2bdbb 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c @@ -535,6 +535,8 @@ static ssize_t node_show(struct kobject *kobj, struct attribute *attr, dev->gpu->kfd->mec_fw_version); sysfs_show_32bit_prop(buffer, offs, "capability", dev->node_props.capability); + sysfs_show_64bit_prop(buffer, offs, "debug_prop", + dev->node_props.debug_prop); sysfs_show_32bit_prop(buffer, offs, "sdma_fw_version", dev->gpu->kfd->sdma_fw_version); sysfs_show_64bit_prop(buffer, offs, "unique_id", @@ -1857,6 +1859,97 @@ static int kfd_topology_add_device_locked(struct kfd_node *gpu, uint32_t gpu_id, return res; } +static void kfd_topology_set_dbg_firmware_support(struct kfd_topology_device *dev) +{ + bool firmware_supported = true; + + if (KFD_GC_VERSION(dev->gpu) >= IP_VERSION(11, 0, 0) && + KFD_GC_VERSION(dev->gpu) < IP_VERSION(12, 0, 0)) { + firmware_supported = + (dev->gpu->adev->mes.sched_version & AMDGPU_MES_VERSION_MASK) >= 9; + goto out; + } + + /* +* Note: Any unlisted devices here are assumed to support exception handling. +* Add additional checks here as needed. +*/ + switch (KFD_GC_VERSION(dev->gpu)) { + case IP_VERSION(9, 0, 1): + firmware_supported = dev->gpu->kfd->mec_fw_version >= 459 + 32768; + break; + case IP_VERSION(9, 1, 0): + case IP_VERSION(9, 2, 1): + case IP_VERSION(9, 2, 2): + case IP_VERSION(9, 3, 0): + case IP_VERSION(9, 4, 0): + firmware_supported = dev->gpu->kfd->mec_fw_version >= 459; + break; + case IP_VERSION(9, 4, 1): + firmware_supported = dev->gpu->kfd->mec_fw_version >= 60; + break; + case IP_VERSION(9, 4, 2): + firmware_supported = dev->gpu->kfd->mec_fw_version >= 51; + break; + case IP_VERSION(10, 1, 10): + case IP_VERSION(10, 1, 2): + case IP_VERSION(10, 1, 1): + firmware_supported = dev->gpu->kfd->mec_fw_version >= 144; + break; + case IP_VERSION(10, 3, 0): + case IP_VERSION(10, 3, 2): + case IP_VERSION(10, 3, 1): + case IP_VERSION(10, 3, 4): + case IP_VERSION(10, 3, 5): + firmware_supported = dev->gpu->kfd->mec_fw_version >= 89; + break; + case IP_VERSION(10, 1, 3): + case IP_VERSION(10, 3, 3): + firmware_supported = false; + break; + default: + break; + } + +out: + if (firmware_supported) + dev->node_props.capability |= HSA_CAP_TRAP_DEBUG_FIRMWARE_SUPPORTED; +} + +static void kfd_topology_set_capabilities(struct kfd_topology_device *dev) +{ + dev->node_props.capability |= ((HSA_CAP_DOORBELL_TYPE_2_0 << + HSA_CAP_DOORBELL_TYPE_TOTALBITS_SHIFT) & + HSA_CAP_DOORBELL_TYPE_TOTALBITS_MASK); + + dev->node_props.capability |= HSA_CAP_TRAP_DEBUG_SUPPORT | + HSA_CAP_TRAP_DEBUG_WAVE_LAUNCH_TRAP_OVERRIDE_SUPPORTED | + HSA_CAP_TRAP_DEBUG_WAVE_LAUNCH_MODE_SUPPORTED; + + if (KFD_GC_VERSION(dev->gpu) < IP_VERSION(10, 0, 0)) { + dev->node_props.debug_prop |= HSA_DBG_WATCH_ADDR_MASK_LO_BIT_GFX9 | + HSA_DBG_WATCH_ADDR_MASK_HI_BIT; + + if (KFD_GC_VERSION(dev->gpu) < IP_VERSION(9, 4, 2)) + dev->node_props.debug_prop |= + HSA_DBG_DISPATCH_INFO_ALWAYS_VALID; + else + dev->node_props.capability |= + HSA_CAP_TRAP_DEBUG_PRECISE_MEMORY_OPERATIONS_SUPPORTED; + } else { + dev->node_props.debug_prop |= HSA_DBG_WATCH_ADDR_MASK_LO_BIT_GFX10 | + HSA_DBG_WATCH_ADDR_MASK_HI_BIT; + + if (KFD_GC_VERSION(dev->gpu) < IP_VERSION(11,
[PATCH] drm/amdgpu: Fix up kdoc in sdma_v4_4_2.c
Address a bunch of kdoc warnings: gcc with W=1 drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:426: warning: Function parameter or member 'inst_mask' not described in 'sdma_v4_4_2_inst_gfx_stop' drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:457: warning: Function parameter or member 'inst_mask' not described in 'sdma_v4_4_2_inst_rlc_stop' drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:470: warning: Function parameter or member 'inst_mask' not described in 'sdma_v4_4_2_inst_page_stop' drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:506: warning: Function parameter or member 'inst_mask' not described in 'sdma_v4_4_2_inst_ctx_switch_enable' drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:794: warning: Function parameter or member 'inst_mask' not described in 'sdma_v4_4_2_inst_rlc_resume' drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:810: warning: Function parameter or member 'inst_mask' not described in 'sdma_v4_4_2_inst_load_microcode' drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c:854: warning: Function parameter or member 'inst_mask' not described in 'sdma_v4_4_2_inst_start' Cc: Christian König Cc: Alex Deucher Signed-off-by: Srinivasan Shanmugam --- drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c index ff41fb577cdd..8eebf9c2bbcd 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c @@ -418,6 +418,7 @@ static void sdma_v4_4_2_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u64 * sdma_v4_4_2_inst_gfx_stop - stop the gfx async dma engines * * @adev: amdgpu_device pointer + * @inst_mask: mask of dma engine instances to be disabled * * Stop the gfx async dma ring buffers. */ @@ -449,6 +450,7 @@ static void sdma_v4_4_2_inst_gfx_stop(struct amdgpu_device *adev, * sdma_v4_4_2_inst_rlc_stop - stop the compute async dma engines * * @adev: amdgpu_device pointer + * @inst_mask: mask of dma engine instances to be disabled * * Stop the compute async dma queues. */ @@ -462,6 +464,7 @@ static void sdma_v4_4_2_inst_rlc_stop(struct amdgpu_device *adev, * sdma_v4_4_2_inst_page_stop - stop the page async dma engines * * @adev: amdgpu_device pointer + * @inst_mask: mask of dma engine instances to be disabled * * Stop the page async dma ring buffers. */ @@ -498,6 +501,7 @@ static void sdma_v4_4_2_inst_page_stop(struct amdgpu_device *adev, * * @adev: amdgpu_device pointer * @enable: enable/disable the DMA MEs context switch. + * @inst_mask: mask of dma engine instances to be enabled * * Halt or unhalt the async dma engines context switch. */ @@ -785,6 +789,7 @@ static void sdma_v4_4_2_init_pg(struct amdgpu_device *adev) * sdma_v4_4_2_inst_rlc_resume - setup and start the async dma engines * * @adev: amdgpu_device pointer + * @inst_mask: mask of dma engine instances to be enabled * * Set up the compute DMA queues and enable them. * Returns 0 for success, error for failure. @@ -801,6 +806,7 @@ static int sdma_v4_4_2_inst_rlc_resume(struct amdgpu_device *adev, * sdma_v4_4_2_inst_load_microcode - load the sDMA ME ucode * * @adev: amdgpu_device pointer + * @inst_mask: mask of dma engine instances to be enabled * * Loads the sDMA0/1 ucode. * Returns 0 for success, -EINVAL if the ucode is not available. @@ -845,6 +851,7 @@ static int sdma_v4_4_2_inst_load_microcode(struct amdgpu_device *adev, * sdma_v4_4_2_inst_start - setup and start the async dma engines * * @adev: amdgpu_device pointer + * @inst_mask: mask of dma engine instances to be enabled * * Set up the DMA engines and enable them. * Returns 0 for success, error for failure. -- 2.25.1
Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused
On Thu, May 25, 2023 at 12:45:13PM -0400, Luben Tuikov wrote: > On 2023-05-25 12:29, Nathan Chancellor wrote: > > On Thu, May 25, 2023 at 12:26:56PM -0400, Luben Tuikov wrote: > >> On 2023-05-25 11:22, Nathan Chancellor wrote: > >>> On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote: > Silencing the compiler from below compilation error: > > drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c:704:23: error: variable > 'mmhub_v1_8_mmea_err_status_reg' is not needed and will not be emitted > [-Werror,-Wunneeded-internal-declaration] > static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = { > ^ > 1 error generated. > > Mark the variable as __maybe_unused to make it clear to clang that this > is expected, so there is no more warning. > > Cc: Christian König > Cc: Lijo Lazar > Cc: Luben Tuikov > Cc: Alex Deucher > Signed-off-by: Srinivasan Shanmugam > >>> > >>> Traditionally, this attribute would go between the [] and =, but that is > >>> a nit. Can someone please pick this up to unblock our builds on -next? > >>> > >>> Reviewed-by: Nathan Chancellor > >> > >> I'll pick this up, fix it, and submit to amd-staging-drm-next. > > > > Thanks a lot :) > > > >> Which -next are you referring to, Nathan? > > > > linux-next, this warning breaks the build when -Werror is enabled, such > > as with allmodconfig: > > > > https://storage.tuxsuite.com/public/clangbuiltlinux/continuous-integration2/builds/2QHtlCTz2JL0yXNpRB5hVmiP9lq/build.log > > > > Hi Nathan, > > Thanks for the pointers. > > Srinivasan has already submitted it to amd-staging-drm-next. > > Seems Alex will push it upstream. > > Not sure who fast you need it, we can send you the commit itself > for you to git-am if you cannot wait. Thanks for that extra info. We can just wait for the patch to end up in -next naturally, we try to avoid applying extra patches when possible. Cheers, Nathan
Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused
On Thu, May 25, 2023 at 12:42:05PM -0400, Alex Deucher wrote: > On Thu, May 25, 2023 at 12:29 PM Nathan Chancellor wrote: > > > > On Thu, May 25, 2023 at 12:26:56PM -0400, Luben Tuikov wrote: > > > On 2023-05-25 11:22, Nathan Chancellor wrote: > > > > On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote: > > > >> Silencing the compiler from below compilation error: > > > >> > > > >> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c:704:23: error: variable > > > >> 'mmhub_v1_8_mmea_err_status_reg' is not needed and will not be emitted > > > >> [-Werror,-Wunneeded-internal-declaration] > > > >> static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = { > > > >> ^ > > > >> 1 error generated. > > > >> > > > >> Mark the variable as __maybe_unused to make it clear to clang that this > > > >> is expected, so there is no more warning. > > > >> > > > >> Cc: Christian König > > > >> Cc: Lijo Lazar > > > >> Cc: Luben Tuikov > > > >> Cc: Alex Deucher > > > >> Signed-off-by: Srinivasan Shanmugam > > > > > > > > Traditionally, this attribute would go between the [] and =, but that is > > > > a nit. Can someone please pick this up to unblock our builds on -next? > > > > > > > > Reviewed-by: Nathan Chancellor > > > > > > I'll pick this up, fix it, and submit to amd-staging-drm-next. > > > > Thanks a lot :) > > > > > Which -next are you referring to, Nathan? > > > > linux-next, this warning breaks the build when -Werror is enabled, such > > as with allmodconfig: > > > > https://storage.tuxsuite.com/public/clangbuiltlinux/continuous-integration2/builds/2QHtlCTz2JL0yXNpRB5hVmiP9lq/build.log > > > > Srinivasan has already pushed it. I'll push it out once CI has > completed. We are trying to figure out the best way to enable -WERROR > in our CI system as it is almost always broken depending on what > compiler you are using. Also, I'm not sure fixing these is always > better. A lot of these warnings seem spurious and in a lot of cases > the "fix" doesn't really improve the code, it just silences a warning. > As one of my coworkers put it, there is a reason warnings are not > errors. I do not necessarily disagree with that final sentiment but at the end of the day, it is pointing out a potential problem ("this variable is only used in a compile time context, is that what you intended or not?") and the solution is either to fix the code so that it works as initially intended or you silence the warning because you know it is not actually a problem. There are always going to be false positives, otherwise they would just always be hard errors, but that does not mean that they are not worth listening to, which is why Linus insists on -Werror being a thing. We can opt out of -Werror for our CI but that does not change the fact it is default enabled with allmodconfig, so that is how most people will test. Cheers, Nathan
Re: [PATCH] drm/amdgpu: Fix up missing kdoc in sdma_v6_0.c
Reviewed-by: Alex Deucher On Thu, May 25, 2023 at 12:15 PM Srinivasan Shanmugam wrote: > > Address a bunch of kdoc warnings: > > gcc with W=1 > drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:248: warning: Function parameter or > member 'job' not described in 'sdma_v6_0_ring_emit_ib' > drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:248: warning: Function parameter or > member 'flags' not described in 'sdma_v6_0_ring_emit_ib' > drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:946: warning: Function parameter or > member 'timeout' not described in 'sdma_v6_0_ring_test_ib' > drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:1125: warning: Function parameter or > member 'ring' not described in 'sdma_v6_0_ring_pad_ib' > drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:1176: warning: Function parameter or > member 'vmid' not described in 'sdma_v6_0_ring_emit_vm_flush' > drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:1176: warning: Function parameter or > member 'pd_addr' not described in 'sdma_v6_0_ring_emit_vm_flush' > > Cc: Christian König > Cc: Alex Deucher > Signed-off-by: Srinivasan Shanmugam > --- > drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c > b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c > index 1c90b5c661fb..967849c59ebe 100644 > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c > @@ -238,6 +238,8 @@ static void sdma_v6_0_ring_insert_nop(struct amdgpu_ring > *ring, uint32_t count) > * > * @ring: amdgpu ring pointer > * @ib: IB object to schedule > + * @flags: unused > + * @job: job to retrieve vmid from > * > * Schedule an IB in the DMA ring. > */ > @@ -938,6 +940,7 @@ static int sdma_v6_0_ring_test_ring(struct amdgpu_ring > *ring) > * sdma_v6_0_ring_test_ib - test an IB on the DMA engine > * > * @ring: amdgpu_ring structure holding ring information > + * @timeout: timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT > * > * Test a simple IB in the DMA ring. > * Returns 0 on success, error on failure. > @@ -1167,6 +1170,8 @@ static void sdma_v6_0_ring_emit_pipeline_sync(struct > amdgpu_ring *ring) > * sdma_v6_0_ring_emit_vm_flush - vm flush using sDMA > * > * @ring: amdgpu_ring pointer > + * @vmid: vmid number to use > + * @pd_addr: address > * > * Update the page table base and flush the VM TLB > * using sDMA. > -- > 2.25.1 >
[PATCH 1/3] drm/amdgpu: add cached GPU fault structure to vm struct
When we get a GPU pge fault, cache the fault for later analysis. Cc: samuel.pitoi...@gmail.com Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 31 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 18 +++ 2 files changed, 49 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 22f9a65ca0fc..73e022f3daa4 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -2631,3 +2631,34 @@ void amdgpu_debugfs_vm_bo_info(struct amdgpu_vm *vm, struct seq_file *m) total_done_objs); } #endif + +/** + * amdgpu_vm_update_fault_cache - update cached fault into. + * @adev: amdgpu device pointer + * @pasid: PASID of the VM + * @addr: Address of the fault + * @status: fault status register + * @vmhub: which vmhub got the fault + * + * Cache the fault info for later use by userspace in debuggging. + */ +void amdgpu_vm_update_fault_cache(struct amdgpu_device *adev, + unsigned int pasid, + uint64_t addr, + uint32_t status, + unsigned int vmhub) +{ + struct amdgpu_vm *vm; + unsigned long flags; + + xa_lock_irqsave(&adev->vm_manager.pasids, flags); + + vm = xa_load(&adev->vm_manager.pasids, pasid); + if (vm) { + vm->fault_info.addr = addr; + vm->fault_info.status = status; + vm->fault_info.vmhub = vmhub; + } + xa_unlock_irqrestore(&adev->vm_manager.pasids, flags); +} + diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h index 14f9a2bf3acb..fb66a413110c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h @@ -244,6 +244,15 @@ struct amdgpu_vm_update_funcs { struct dma_fence **fence); }; +struct amdgpu_vm_fault_info { + /* fault address */ + uint64_taddr; + /* fault status register */ + uint32_tstatus; + /* which vmhub? gfxhub, mmhub, etc. */ + unsigned intvmhub; +}; + struct amdgpu_vm { /* tree of virtual addresses mapped */ struct rb_root_cached va; @@ -332,6 +341,9 @@ struct amdgpu_vm { /* Memory partition number, -1 means any partition */ int8_t mem_id; + + /* cached fault info */ + struct amdgpu_vm_fault_info fault_info; }; struct amdgpu_vm_manager { @@ -540,4 +552,10 @@ static inline void amdgpu_vm_eviction_unlock(struct amdgpu_vm *vm) mutex_unlock(&vm->eviction_lock); } +void amdgpu_vm_update_fault_cache(struct amdgpu_device *adev, + unsigned int pasid, + uint64_t addr, + uint32_t status, + unsigned int vmhub); + #endif -- 2.40.1
[PATCH 0/3] Add GPU page fault query interface
This patch set adds support for an application to query GPU page faults. It's useful for debugging and there are vulkan extensions that could make use of this. Preliminary user space code which uses this can be found here: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238 https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/298 Note, that I made a small change to the vmhub definition to decouple it from how the kernel tracks vmhubs so that we have a consistent user view even if we decide to add more vmhubs like we recently did for gfx 9.4.3. I've also pushed the changed to: https://gitlab.freedesktop.org/agd5f/linux/-/commits/gpu_fault_info_ioctl Alex Deucher (3): drm/amdgpu: add cached GPU fault structure to vm struct drm/amdgpu: cache gpuvm fault information for gmc7+ drm/amdgpu: add new INFO ioctl query for the last GPU page fault drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 +- drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 16 + drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 45 + drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 31 +++-- drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 3 ++ drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 3 ++ drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 3 ++ drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 3 ++ drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 11 +++--- include/uapi/drm/amdgpu_drm.h | 16 + 10 files changed, 126 insertions(+), 8 deletions(-) -- 2.40.1
[PATCH 3/3] drm/amdgpu: add new INFO ioctl query for the last GPU page fault
Add a interface to query the last GPU page fault for the process. Useful for debugging context lost errors. v2: split vmhub representation between kernel and userspace Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238 libdrm MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238 Cc: samuel.pitoi...@gmail.com Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 16 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 16 +++- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 13 ++--- include/uapi/drm/amdgpu_drm.h | 16 5 files changed, 59 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 7300df2a342c..7e17b285decc 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -112,9 +112,10 @@ *gl1c_cache_size, gl2c_cache_size, mall_size, enabled_rb_pipes_mask_hi * 3.53.0 - Support for GFX11 CP GFX shadowing * 3.54.0 - Add AMDGPU_CTX_QUERY2_FLAGS_RESET_IN_PROGRESS support + * - 3.55.0 - Add AMDGPU_INFO_GPUVM_FAULT query */ #define KMS_DRIVER_MAJOR 3 -#define KMS_DRIVER_MINOR 54 +#define KMS_DRIVER_MINOR 55 #define KMS_DRIVER_PATCHLEVEL 0 unsigned int amdgpu_vram_limit = UINT_MAX; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c index 41d047e5de69..bca2a56046ae 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c @@ -1163,6 +1163,22 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp) return copy_to_user(out, max_ibs, min((size_t)size, sizeof(max_ibs))) ? -EFAULT : 0; } + case AMDGPU_INFO_GPUVM_FAULT: { + struct amdgpu_fpriv *fpriv = filp->driver_priv; + struct amdgpu_vm *vm = &fpriv->vm; + struct drm_amdgpu_info_gpuvm_fault gpuvm_fault; + + if (!vm) + return -EINVAL; + + memset(&gpuvm_fault, 0, sizeof(gpuvm_fault)); + gpuvm_fault.addr = vm->fault_info.addr; + gpuvm_fault.status = vm->fault_info.status; + gpuvm_fault.vmhub = vm->fault_info.vmhub; + + return copy_to_user(out, &gpuvm_fault, + min((size_t)size, sizeof(gpuvm_fault))) ? -EFAULT : 0; + } default: DRM_DEBUG_KMS("Invalid request %d\n", info->query); return -EINVAL; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 73e022f3daa4..c1b0c5f3c1f8 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -2657,7 +2657,21 @@ void amdgpu_vm_update_fault_cache(struct amdgpu_device *adev, if (vm) { vm->fault_info.addr = addr; vm->fault_info.status = status; - vm->fault_info.vmhub = vmhub; + if (AMDGPU_IS_GFXHUB(vmhub)) { + vm->fault_info.vmhub = AMDGPU_VMHUB_TYPE_GFX; + vm->fault_info.vmhub |= + (vmhub - AMDGPU_GFXHUB_START) << AMDGPU_VMHUB_IDX_SHIFT; + } else if (AMDGPU_IS_MMHUB0(vmhub)) { + vm->fault_info.vmhub = AMDGPU_VMHUB_TYPE_MM0; + vm->fault_info.vmhub |= + (vmhub - AMDGPU_MMHUB0_START) << AMDGPU_VMHUB_IDX_SHIFT; + } else if (AMDGPU_IS_MMHUB1(vmhub)) { + vm->fault_info.vmhub = AMDGPU_VMHUB_TYPE_MM1; + vm->fault_info.vmhub |= + (vmhub - AMDGPU_MMHUB1_START) << AMDGPU_VMHUB_IDX_SHIFT; + } else { + WARN_ONCE(1, "Invalid vmhub %u\n", vmhub); + } } xa_unlock_irqrestore(&adev->vm_manager.pasids, flags); } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h index fb66a413110c..1a34fea9acb9 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h @@ -116,9 +116,16 @@ struct amdgpu_mem_stats; * layout: max 8 GFXHUB + 4 MMHUB0 + 1 MMHUB1 */ #define AMDGPU_MAX_VMHUBS 13 -#define AMDGPU_GFXHUB(x) (x) -#define AMDGPU_MMHUB0(x) (8 + x) -#define AMDGPU_MMHUB1(x) (8 + 4 + x) +#define AMDGPU_GFXHUB_START0 +#define AMDGPU_MMHUB0_START8 +#define AMDGPU_MMHUB1_START12 +#define AMDGPU_GFXHUB(x) (AMDGPU_GFXHUB_START + (x)) +#define AMDGPU_MMHUB0(x) (AMDGPU_MMHUB0_START + (x)) +#define AMDGPU_MM
[PATCH 2/3] drm/amdgpu: cache gpuvm fault information for gmc7+
Cache the current fault info in the vm struct. This can be queried by userspace later to help debug UMDs. Cc: samuel.pitoi...@gmail.com Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 3 +++ drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 3 +++ drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 3 +++ drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 3 +++ drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 11 +++ 5 files changed, 19 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c index 01bd45651382..5f88db5432b6 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c @@ -155,6 +155,9 @@ static int gmc_v10_0_process_interrupt(struct amdgpu_device *adev, status = RREG32(hub->vm_l2_pro_fault_status); WREG32_P(hub->vm_l2_pro_fault_cntl, 1, ~1); + + amdgpu_vm_update_fault_cache(adev, entry->pasid, addr, status, +entry->vmid_src ? AMDGPU_MMHUB0(0) : AMDGPU_GFXHUB(0)); } if (!printk_ratelimit()) diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c index 4bf807d825c0..087f1ec3cf54 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c @@ -115,6 +115,9 @@ static int gmc_v11_0_process_interrupt(struct amdgpu_device *adev, status = RREG32(hub->vm_l2_pro_fault_status); WREG32_P(hub->vm_l2_pro_fault_cntl, 1, ~1); + + amdgpu_vm_update_fault_cache(adev, entry->pasid, addr, status, +entry->vmid_src ? AMDGPU_MMHUB0(0) : AMDGPU_GFXHUB(0)); } if (printk_ratelimit()) { diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c index 6f53049619cd..1386e2b2e773 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c @@ -1272,6 +1272,9 @@ static int gmc_v7_0_process_interrupt(struct amdgpu_device *adev, if (!addr && !status) return 0; + amdgpu_vm_update_fault_cache(adev, entry->pasid, +((u64)addr) << AMDGPU_GPU_PAGE_SHIFT, status, AMDGPU_GFXHUB(0)); + if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_FIRST) gmc_v7_0_set_fault_enable_default(adev, false); diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c index 48475077ca92..6d46390ee9e3 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c @@ -1447,6 +1447,9 @@ static int gmc_v8_0_process_interrupt(struct amdgpu_device *adev, if (!addr && !status) return 0; + amdgpu_vm_update_fault_cache(adev, entry->pasid, +((u64)addr) << AMDGPU_GPU_PAGE_SHIFT, status, AMDGPU_GFXHUB(0)); + if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_FIRST) gmc_v8_0_set_fault_enable_default(adev, false); diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c index 1e8b2aaa48c1..28a66aa377f3 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c @@ -555,6 +555,7 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device *adev, struct amdgpu_vmhub *hub; const char *mmhub_cid; const char *hub_name; + unsigned int vmhub; u64 addr; uint32_t cam_index = 0; int ret, xcc_id = 0; @@ -567,10 +568,10 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device *adev, if (entry->client_id == SOC15_IH_CLIENTID_VMC) { hub_name = "mmhub0"; - hub = &adev->vmhub[AMDGPU_MMHUB0(node_id / 4)]; + vmhub = AMDGPU_MMHUB0(node_id / 4); } else if (entry->client_id == SOC15_IH_CLIENTID_VMC1) { hub_name = "mmhub1"; - hub = &adev->vmhub[AMDGPU_MMHUB1(0)]; + vmhub = AMDGPU_MMHUB1(0); } else { hub_name = "gfxhub0"; if (adev->gfx.funcs->ih_node_to_logical_xcc) { @@ -579,8 +580,9 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device *adev, if (xcc_id < 0) xcc_id = 0; } - hub = &adev->vmhub[xcc_id]; + vmhub = xcc_id; } + hub = &adev->vmhub[vmhub]; if (retry_fault) { if (adev->irq.retry_cam_enabled) { @@ -626,7 +628,6 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device *adev, if (!printk_ratelimit()) return 0; - memset(&task_info, 0, sizeof(struct amdgpu_task_info)); amdgpu_vm_get_task_info(adev, entry->pasid, &task_info); @@ -663,6 +664,8 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device
Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused
On 2023-05-25 12:29, Nathan Chancellor wrote: > On Thu, May 25, 2023 at 12:26:56PM -0400, Luben Tuikov wrote: >> On 2023-05-25 11:22, Nathan Chancellor wrote: >>> On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote: Silencing the compiler from below compilation error: drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c:704:23: error: variable 'mmhub_v1_8_mmea_err_status_reg' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration] static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = { ^ 1 error generated. Mark the variable as __maybe_unused to make it clear to clang that this is expected, so there is no more warning. Cc: Christian König Cc: Lijo Lazar Cc: Luben Tuikov Cc: Alex Deucher Signed-off-by: Srinivasan Shanmugam >>> >>> Traditionally, this attribute would go between the [] and =, but that is >>> a nit. Can someone please pick this up to unblock our builds on -next? >>> >>> Reviewed-by: Nathan Chancellor >> >> I'll pick this up, fix it, and submit to amd-staging-drm-next. > > Thanks a lot :) > >> Which -next are you referring to, Nathan? > > linux-next, this warning breaks the build when -Werror is enabled, such > as with allmodconfig: > > https://storage.tuxsuite.com/public/clangbuiltlinux/continuous-integration2/builds/2QHtlCTz2JL0yXNpRB5hVmiP9lq/build.log > Hi Nathan, Thanks for the pointers. Srinivasan has already submitted it to amd-staging-drm-next. Seems Alex will push it upstream. Not sure who fast you need it, we can send you the commit itself for you to git-am if you cannot wait. Regards, Luben > Cheers, > Nathan > --- drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c index 3648994724c2..cba087e529c0 100644 --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c @@ -701,6 +701,7 @@ static void mmhub_v1_8_reset_ras_error_count(struct amdgpu_device *adev) mmhub_v1_8_inst_reset_ras_error_count(adev, i); } +__maybe_unused static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = { regMMEA0_ERR_STATUS, regMMEA1_ERR_STATUS, -- 2.25.1 >>
Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused
On Thu, May 25, 2023 at 12:29 PM Nathan Chancellor wrote: > > On Thu, May 25, 2023 at 12:26:56PM -0400, Luben Tuikov wrote: > > On 2023-05-25 11:22, Nathan Chancellor wrote: > > > On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote: > > >> Silencing the compiler from below compilation error: > > >> > > >> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c:704:23: error: variable > > >> 'mmhub_v1_8_mmea_err_status_reg' is not needed and will not be emitted > > >> [-Werror,-Wunneeded-internal-declaration] > > >> static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = { > > >> ^ > > >> 1 error generated. > > >> > > >> Mark the variable as __maybe_unused to make it clear to clang that this > > >> is expected, so there is no more warning. > > >> > > >> Cc: Christian König > > >> Cc: Lijo Lazar > > >> Cc: Luben Tuikov > > >> Cc: Alex Deucher > > >> Signed-off-by: Srinivasan Shanmugam > > > > > > Traditionally, this attribute would go between the [] and =, but that is > > > a nit. Can someone please pick this up to unblock our builds on -next? > > > > > > Reviewed-by: Nathan Chancellor > > > > I'll pick this up, fix it, and submit to amd-staging-drm-next. > > Thanks a lot :) > > > Which -next are you referring to, Nathan? > > linux-next, this warning breaks the build when -Werror is enabled, such > as with allmodconfig: > > https://storage.tuxsuite.com/public/clangbuiltlinux/continuous-integration2/builds/2QHtlCTz2JL0yXNpRB5hVmiP9lq/build.log > Srinivasan has already pushed it. I'll push it out once CI has completed. We are trying to figure out the best way to enable -WERROR in our CI system as it is almost always broken depending on what compiler you are using. Also, I'm not sure fixing these is always better. A lot of these warnings seem spurious and in a lot of cases the "fix" doesn't really improve the code, it just silences a warning. As one of my coworkers put it, there is a reason warnings are not errors. Alex > Cheers, > Nathan > > > >> --- > > >> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 1 + > > >> 1 file changed, 1 insertion(+) > > >> > > >> diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c > > >> b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c > > >> index 3648994724c2..cba087e529c0 100644 > > >> --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c > > >> +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c > > >> @@ -701,6 +701,7 @@ static void mmhub_v1_8_reset_ras_error_count(struct > > >> amdgpu_device *adev) > > >>mmhub_v1_8_inst_reset_ras_error_count(adev, i); > > >> } > > >> > > >> +__maybe_unused > > >> static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = { > > >>regMMEA0_ERR_STATUS, > > >>regMMEA1_ERR_STATUS, > > >> -- > > >> 2.25.1 > > >> > >
[PATCH] drm/amd/amdgpu: introduce DRM_AMDGPU_WERROR
We want to do -Werror builds on our CI. However, non-amdgpu breakages have prevented us from doing so thus far. Also, there are a number of additional checks that we should enable, that the community cares about and are hidden behind -Wextra. So, define DRM_AMDGPU_WERROR to only enable -Werror for the amdgpu kernel module and enable -Wextra while disabling all of the checks that are too noisy. Cc: Alex Deucher Cc: Kenny Ho Suggested-by: Jani Nikula Signed-off-by: Hamza Mahfooz --- drivers/gpu/drm/amd/amdgpu/Kconfig | 10 ++ drivers/gpu/drm/amd/amdgpu/Makefile | 9 + 2 files changed, 19 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/Kconfig b/drivers/gpu/drm/amd/amdgpu/Kconfig index 07135ffa6d24..334511f331e3 100644 --- a/drivers/gpu/drm/amd/amdgpu/Kconfig +++ b/drivers/gpu/drm/amd/amdgpu/Kconfig @@ -66,6 +66,16 @@ config DRM_AMDGPU_USERPTR This option selects CONFIG_HMM and CONFIG_HMM_MIRROR if it isn't already selected to enabled full userptr support. +config DRM_AMDGPU_WERROR + bool "Force the compiler to throw an error instead of a warning when compiling" + depends on DRM_AMDGPU + depends on EXPERT + depends on !COMPILE_TEST + default n + help + Add -Werror to the build flags for amdgpu.ko. + Only enable this if you are warning code for amdgpu.ko. + source "drivers/gpu/drm/amd/acp/Kconfig" source "drivers/gpu/drm/amd/display/Kconfig" source "drivers/gpu/drm/amd/amdkfd/Kconfig" diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile index 74a9aa6fe18c..7ee68b1bbfed 100644 --- a/drivers/gpu/drm/amd/amdgpu/Makefile +++ b/drivers/gpu/drm/amd/amdgpu/Makefile @@ -39,6 +39,15 @@ ccflags-y := -I$(FULL_AMD_PATH)/include/asic_reg \ -I$(FULL_AMD_DISPLAY_PATH)/amdgpu_dm \ -I$(FULL_AMD_PATH)/amdkfd +subdir-ccflags-y := -Wextra +subdir-ccflags-y += -Wunused-but-set-variable +subdir-ccflags-y += -Wno-unused-parameter +subdir-ccflags-y += -Wno-type-limits +subdir-ccflags-y += -Wno-sign-compare +subdir-ccflags-y += -Wno-missing-field-initializers +subdir-ccflags-y += -Wno-override-init +subdir-ccflags-$(CONFIG_DRM_AMDGPU_WERROR) += -Werror + amdgpu-y := amdgpu_drv.o # add KMS driver -- 2.40.1
Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused
On Thu, May 25, 2023 at 12:26:56PM -0400, Luben Tuikov wrote: > On 2023-05-25 11:22, Nathan Chancellor wrote: > > On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote: > >> Silencing the compiler from below compilation error: > >> > >> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c:704:23: error: variable > >> 'mmhub_v1_8_mmea_err_status_reg' is not needed and will not be emitted > >> [-Werror,-Wunneeded-internal-declaration] > >> static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = { > >> ^ > >> 1 error generated. > >> > >> Mark the variable as __maybe_unused to make it clear to clang that this > >> is expected, so there is no more warning. > >> > >> Cc: Christian König > >> Cc: Lijo Lazar > >> Cc: Luben Tuikov > >> Cc: Alex Deucher > >> Signed-off-by: Srinivasan Shanmugam > > > > Traditionally, this attribute would go between the [] and =, but that is > > a nit. Can someone please pick this up to unblock our builds on -next? > > > > Reviewed-by: Nathan Chancellor > > I'll pick this up, fix it, and submit to amd-staging-drm-next. Thanks a lot :) > Which -next are you referring to, Nathan? linux-next, this warning breaks the build when -Werror is enabled, such as with allmodconfig: https://storage.tuxsuite.com/public/clangbuiltlinux/continuous-integration2/builds/2QHtlCTz2JL0yXNpRB5hVmiP9lq/build.log Cheers, Nathan > >> --- > >> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 1 + > >> 1 file changed, 1 insertion(+) > >> > >> diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c > >> b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c > >> index 3648994724c2..cba087e529c0 100644 > >> --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c > >> +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c > >> @@ -701,6 +701,7 @@ static void mmhub_v1_8_reset_ras_error_count(struct > >> amdgpu_device *adev) > >>mmhub_v1_8_inst_reset_ras_error_count(adev, i); > >> } > >> > >> +__maybe_unused > >> static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = { > >>regMMEA0_ERR_STATUS, > >>regMMEA1_ERR_STATUS, > >> -- > >> 2.25.1 > >> >
Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused
On 2023-05-25 11:22, Nathan Chancellor wrote: > On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote: >> Silencing the compiler from below compilation error: >> >> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c:704:23: error: variable >> 'mmhub_v1_8_mmea_err_status_reg' is not needed and will not be emitted >> [-Werror,-Wunneeded-internal-declaration] >> static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = { >> ^ >> 1 error generated. >> >> Mark the variable as __maybe_unused to make it clear to clang that this >> is expected, so there is no more warning. >> >> Cc: Christian König >> Cc: Lijo Lazar >> Cc: Luben Tuikov >> Cc: Alex Deucher >> Signed-off-by: Srinivasan Shanmugam > > Traditionally, this attribute would go between the [] and =, but that is > a nit. Can someone please pick this up to unblock our builds on -next? > > Reviewed-by: Nathan Chancellor I'll pick this up, fix it, and submit to amd-staging-drm-next. Which -next are you referring to, Nathan? Regards, Luben > >> --- >> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c >> b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c >> index 3648994724c2..cba087e529c0 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c >> +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c >> @@ -701,6 +701,7 @@ static void mmhub_v1_8_reset_ras_error_count(struct >> amdgpu_device *adev) >> mmhub_v1_8_inst_reset_ras_error_count(adev, i); >> } >> >> +__maybe_unused >> static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = { >> regMMEA0_ERR_STATUS, >> regMMEA1_ERR_STATUS, >> -- >> 2.25.1 >>
Re: [PATCH v2] drm/amd/display: enable more strict compile checks
On Thu, May 25, 2023 at 08:37:07AM -0700, Kees Cook wrote: > Hi! > > On Wed, May 24, 2023 at 04:27:31PM -0400, Hamza Mahfooz wrote: > > + Kees > > > > On 5/24/23 15:50, Alex Deucher wrote: > > > On Wed, May 24, 2023 at 3:46 PM Felix Kuehling > > > wrote: > > > > > > > > Sure, I think we tried enabling warnings as errors before and had to > > > > revert it because of weird compiler quirks or the variety of compiler > > > > versions that need to be supported. > > > > > > > > Alex, are you planning to upstream this, or is this only to enforce more > > > > internal discipline about not ignoring warnings? > > > > > > I'd like to upstream it. Upstream already has CONFIG_WERROR as a > > > config option, but it's been problematic to enable in CI because of > > > various breakages outside of the driver and in different compilers. > > > That said, I don't know how much trouble enabling it will cause with > > > various compilers in the wild. > > -Wmisleading-indentation is already part of -Wall, so this is globally > enabled already. > > -Wunused is enabled under W=1, and it's pretty noisy still. If you can > get builds clean in drm, that'll be a good step towards getting it > enabled globally. (A middle ground with less to clean up might be > -Wunused-but-set-variable) > > I agree about -Werror: just stick with CONFIG_WERROR instead. There is also W=e, added by commit c77d06e70d59 ("kbuild: support W=e to make build abort in case of warning") in 5.19, which works well for building with configurations that do not have CONFIG_WERROR enabled and avoiding dipping into menuconfig. Unconditionally enabling -Werror with no way to turn it off is just asking for problems over time with new compiler versions, either due to new warnings in -Wall or warnings that have been improved or changed. Should that still be desired, consider doing what i915 and PowerPC have done and add a Kconfig option that can be disabled. Cheers, Nathan
[PATCH] drm/amdgpu: Fix up missing kdoc in sdma_v6_0.c
Address a bunch of kdoc warnings: gcc with W=1 drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:248: warning: Function parameter or member 'job' not described in 'sdma_v6_0_ring_emit_ib' drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:248: warning: Function parameter or member 'flags' not described in 'sdma_v6_0_ring_emit_ib' drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:946: warning: Function parameter or member 'timeout' not described in 'sdma_v6_0_ring_test_ib' drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:1125: warning: Function parameter or member 'ring' not described in 'sdma_v6_0_ring_pad_ib' drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:1176: warning: Function parameter or member 'vmid' not described in 'sdma_v6_0_ring_emit_vm_flush' drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c:1176: warning: Function parameter or member 'pd_addr' not described in 'sdma_v6_0_ring_emit_vm_flush' Cc: Christian König Cc: Alex Deucher Signed-off-by: Srinivasan Shanmugam --- drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c index 1c90b5c661fb..967849c59ebe 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c @@ -238,6 +238,8 @@ static void sdma_v6_0_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count) * * @ring: amdgpu ring pointer * @ib: IB object to schedule + * @flags: unused + * @job: job to retrieve vmid from * * Schedule an IB in the DMA ring. */ @@ -938,6 +940,7 @@ static int sdma_v6_0_ring_test_ring(struct amdgpu_ring *ring) * sdma_v6_0_ring_test_ib - test an IB on the DMA engine * * @ring: amdgpu_ring structure holding ring information + * @timeout: timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT * * Test a simple IB in the DMA ring. * Returns 0 on success, error on failure. @@ -1167,6 +1170,8 @@ static void sdma_v6_0_ring_emit_pipeline_sync(struct amdgpu_ring *ring) * sdma_v6_0_ring_emit_vm_flush - vm flush using sDMA * * @ring: amdgpu_ring pointer + * @vmid: vmid number to use + * @pd_addr: address * * Update the page table base and flush the VM TLB * using sDMA. -- 2.25.1
[PATCH 2/2] drm/amdgpu: Remove duplicate fdinfo fields
From: Rob Clark Some of the fields that are handled by drm_show_fdinfo() crept back in when rebasing the patch. Remove them again. Fixes: 376c25f8ca47 ("drm/amdgpu: Switch to fdinfo helper") Signed-off-by: Rob Clark --- drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c index 13d7413d4ca3..a93e5627901a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c @@ -80,23 +80,20 @@ void amdgpu_show_fdinfo(struct drm_printer *p, struct drm_file *file) amdgpu_ctx_mgr_usage(&fpriv->ctx_mgr, usage); /* * ** * For text output format description please see drm-usage-stats.rst! * ** */ drm_printf(p, "pasid:\t%u\n", fpriv->vm.pasid); - drm_printf(p, "drm-driver:\t%s\n", file->minor->dev->driver->name); - drm_printf(p, "drm-pdev:\t%04x:%02x:%02x.%d\n", domain, bus, dev, fn); - drm_printf(p, "drm-client-id:\t%Lu\n", vm->immediate.fence_context); drm_printf(p, "drm-memory-vram:\t%llu KiB\n", stats.vram/1024UL); drm_printf(p, "drm-memory-gtt: \t%llu KiB\n", stats.gtt/1024UL); drm_printf(p, "drm-memory-cpu: \t%llu KiB\n", stats.cpu/1024UL); drm_printf(p, "amd-memory-visible-vram:\t%llu KiB\n", stats.visible_vram/1024UL); drm_printf(p, "amd-evicted-vram:\t%llu KiB\n", stats.evicted_vram/1024UL); drm_printf(p, "amd-evicted-visible-vram:\t%llu KiB\n", stats.evicted_visible_vram/1024UL); drm_printf(p, "amd-requested-vram:\t%llu KiB\n", -- 2.40.1
[PATCH 1/2] drm/amdgpu: Fix no-procfs build
From: Rob Clark Fixes undefined symbol when PROC_FS is not enabled. Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202305251510.u0r2as7k-...@intel.com/ Fixes: 376c25f8ca47 ("drm/amdgpu: Switch to fdinfo helper") Signed-off-by: Rob Clark --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 1b46e7ac7cb0..c9a41c997c6c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -2795,21 +2795,23 @@ static const struct drm_driver amdgpu_kms_driver = { DRIVER_SYNCOBJ_TIMELINE, .open = amdgpu_driver_open_kms, .postclose = amdgpu_driver_postclose_kms, .lastclose = amdgpu_driver_lastclose_kms, .ioctls = amdgpu_ioctls_kms, .num_ioctls = ARRAY_SIZE(amdgpu_ioctls_kms), .dumb_create = amdgpu_mode_dumb_create, .dumb_map_offset = amdgpu_mode_dumb_mmap, .fops = &amdgpu_driver_kms_fops, .release = &amdgpu_driver_release_kms, +#ifdef CONFIG_PROC_FS .show_fdinfo = amdgpu_show_fdinfo, +#endif .prime_handle_to_fd = drm_gem_prime_handle_to_fd, .prime_fd_to_handle = drm_gem_prime_fd_to_handle, .gem_prime_import = amdgpu_gem_prime_import, .gem_prime_mmap = drm_gem_prime_mmap, .name = DRIVER_NAME, .desc = DRIVER_DESC, .date = DRIVER_DATE, .major = KMS_DRIVER_MAJOR, -- 2.40.1
Re: [PATCH v2] drm/amd/display: enable more strict compile checks
> +subdir-ccflags-y += -Werror -Wunused -Wmisleading-indentation We have a config option for -Werror. Blindly adding this will create problems with too new (or sometimes too old, or just too weird) compilers all the time. Don't do this.
[PATCH] drm/amd/amdgpu: Fix up locking etc in amdgpu_debugfs_gprwave_ioctl()
There are two bugs here. 1) Drop the lock if copy_from_user() fails. 2) If the copy fails then the correct error code is -EFAULT instead of -EINVAL. I also broke up the long line and changed "sizeof rd->id" to "sizeof(rd->id)". Fixes: 164fb2940933 ("drm/amd/amdgpu: Update debugfs for XCC support (v3)") Signed-off-by: Dan Carpenter --- drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c index c657bed350ac..56e89e76ff17 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c @@ -478,15 +478,16 @@ static ssize_t amdgpu_debugfs_gprwave_read(struct file *f, char __user *buf, siz static long amdgpu_debugfs_gprwave_ioctl(struct file *f, unsigned int cmd, unsigned long data) { struct amdgpu_debugfs_gprwave_data *rd = f->private_data; - int r; + int r = 0; mutex_lock(&rd->lock); switch (cmd) { case AMDGPU_DEBUGFS_GPRWAVE_IOC_SET_STATE: - r = copy_from_user(&rd->id, (struct amdgpu_debugfs_gprwave_iocdata *)data, sizeof rd->id); - if (r) - return r ? -EINVAL : 0; + if (copy_from_user(&rd->id, + (struct amdgpu_debugfs_gprwave_iocdata *)data, + sizeof(rd->id))) + r = -EFAULT; goto done; default: r = -EINVAL; -- 2.39.2
Re: [PATCH v2] drm/amd/display: enable more strict compile checks
Hi! On Wed, May 24, 2023 at 04:27:31PM -0400, Hamza Mahfooz wrote: > + Kees > > On 5/24/23 15:50, Alex Deucher wrote: > > On Wed, May 24, 2023 at 3:46 PM Felix Kuehling > > wrote: > > > > > > Sure, I think we tried enabling warnings as errors before and had to > > > revert it because of weird compiler quirks or the variety of compiler > > > versions that need to be supported. > > > > > > Alex, are you planning to upstream this, or is this only to enforce more > > > internal discipline about not ignoring warnings? > > > > I'd like to upstream it. Upstream already has CONFIG_WERROR as a > > config option, but it's been problematic to enable in CI because of > > various breakages outside of the driver and in different compilers. > > That said, I don't know how much trouble enabling it will cause with > > various compilers in the wild. -Wmisleading-indentation is already part of -Wall, so this is globally enabled already. -Wunused is enabled under W=1, and it's pretty noisy still. If you can get builds clean in drm, that'll be a good step towards getting it enabled globally. (A middle ground with less to clean up might be -Wunused-but-set-variable) I agree about -Werror: just stick with CONFIG_WERROR instead. -Kees > > > > Alex > > > > > > > > Regards, > > > Felix > > > > > > > > > On 2023-05-24 15:41, Russell, Kent wrote: > > > > > > > > [AMD Official Use Only - General] > > > > > > > > > > > > (Adding Felix in CC) > > > > > > > > I’m a fan of adding it to KFD as well. Felix, can you foresee any > > > > issues here? > > > > > > > > Kent > > > > > > > > *From:* amd-gfx *On Behalf Of > > > > *Ho, Kenny > > > > *Sent:* Wednesday, May 24, 2023 3:23 PM > > > > *To:* Alex Deucher ; Mahfooz, Hamza > > > > > > > > *Cc:* Li, Sun peng (Leo) ; Wentland, Harry > > > > ; Pan, Xinhui ; Siqueira, > > > > Rodrigo ; linux-ker...@vger.kernel.org; > > > > dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Daniel > > > > Vetter ; Deucher, Alexander > > > > ; David Airlie ; Koenig, > > > > Christian > > > > *Subject:* Re: [PATCH v2] drm/amd/display: enable more strict compile > > > > checks > > > > > > > > [AMD Official Use Only - General] > > > > > > > > [AMD Official Use Only - General] > > > > > > > > (+ Felix) > > > > > > > > Should we do the same for other modules under amd (amdkfd)? I was > > > > going to enable full kernel werror in the kconfig used by my CI but > > > > this is probably better. > > > > > > > > Kenny > > > > > > > > > > > > > > > > *From:*Alex Deucher > > > > *Sent:* Wednesday, May 24, 2023 3:22 PM > > > > *To:* Mahfooz, Hamza > > > > *Cc:* amd-gfx@lists.freedesktop.org ; > > > > Li, Sun peng (Leo) ; Ho, Kenny ; > > > > Pan, Xinhui ; Siqueira, Rodrigo > > > > ; linux-ker...@vger.kernel.org > > > > ; dri-de...@lists.freedesktop.org > > > > ; Daniel Vetter ; > > > > Deucher, Alexander ; David Airlie > > > > ; Wentland, Harry ; Koenig, > > > > Christian > > > > *Subject:* Re: [PATCH v2] drm/amd/display: enable more strict compile > > > > checks > > > > > > > > On Wed, May 24, 2023 at 3:20 PM Hamza Mahfooz > > > > wrote: > > > > > > > > > > Currently, there are quite a number of issues that are quite easy for > > > > > the CI to catch, that slip through the cracks. Among them, there are > > > > > unused variable and indentation issues. Also, we should consider all > > > > > warnings to be compile errors, since the community will eventually end > > > > > up complaining about them. So, enable -Werror, -Wunused and > > > > > -Wmisleading-indentation for all kernel builds. > > > > > > > > > > Cc: Alex Deucher > > > > > Cc: Harry Wentland > > > > > Cc: Kenny Ho > > > > > Signed-off-by: Hamza Mahfooz > > > > > --- > > > > > v2: fix grammatical error > > > > > --- > > > > > drivers/gpu/drm/amd/display/Makefile | 2 ++ > > > > > 1 file changed, 2 insertions(+) > > > > > > > > > > diff --git a/drivers/gpu/drm/amd/display/Makefile > > > > b/drivers/gpu/drm/amd/display/Makefile > > > > > index 0d610cb376bb..3c44162ebe21 100644 > > > > > --- a/drivers/gpu/drm/amd/display/Makefile > > > > > +++ b/drivers/gpu/drm/amd/display/Makefile > > > > > @@ -26,6 +26,8 @@ > > > > > > > > > > AMDDALPATH = $(RELATIVE_AMD_DISPLAY_PATH) > > > > > > > > > > +subdir-ccflags-y += -Werror -Wunused -Wmisleading-indentation > > > > > + > > > > > > > > Care to enable this for the rest of amdgpu as well? Or send out an > > > > additional patch to do that? Either way: > > > > Reviewed-by: Alex Deucher > > > > > > > > Alex > > > > > > > > > subdir-ccflags-y += -I$(FULL_AMD_DISPLAY_PATH)/dc/inc/ > > > > > subdir-ccflags-y += -I$(FULL_AMD_DISPLAY_PATH)/dc/inc/hw > > > > > subdir-ccflags-y += -I$(FULL_AMD_DISPLAY_PATH)/dc/clk_mgr > > > > > -- > > > > > 2.40.1 > > > > > > > > > > -- > Hamza > -- Kees Cook
Re: [PATCH] drm/amdgpu: add the accelerator pcie class
On Tue, May 23, 2023 at 10:02:32AM -0400, Alex Deucher wrote: > On Tue, May 23, 2023 at 5:25 AM Christoph Hellwig wrote: > > > > On Tue, May 23, 2023 at 12:02:32PM +0800, Shiwu Zhang wrote: > > > + { PCI_DEVICE(0x1002, PCI_ANY_ID), > > > + .class = PCI_CLASS_ACCELERATOR_PROCESSING << 8, > > > + .class_mask = 0xff, > > > + .driver_data = CHIP_IP_DISCOVERY }, > > > > Probing for every single device of a given class for a single vendor > > to a driver is just fundamentaly wrong. Please list the actual IDs > > that the driver can handle. > > How so? The driver handles all devices of that class. We already do > that for PCI_CLASS_DISPLAY_VGA and PCI_CLASS_DISPLAY_OTHER. Other > drivers do similar things. How is that going to work in the long run? The chances of totally incompatbile devices from the same vendor appearing is absolutely given. > The hda audio driver does the same thing > for PCI_CLASS_MULTIMEDIA_HD_AUDIO for example. > That, just like PCI_CLASS_STORAGE_EXPRESS is a different case, as the class is associated with an actual documented programming interface.
Re: [PATCH 06/36] drm/amd/display: add CRTC driver-specific property for gamma TF
On 5/24/23 04:24, Pekka Paalanen wrote: > On Tue, 23 May 2023 21:14:50 -0100 > Melissa Wen wrote: > >> Hook up driver-specific atomic operations for managing AMD color >> properties and create AMD driver-specific color management properties >> and attach them according to HW capabilities defined by `struct >> dc_color_caps`. Add enumerated transfer function property to DRM CRTC >> gamma to convert to wire encoding with or without a user gamma LUT. >> Enumerated TFs are not supported yet by the DRM color pipeline, >> therefore, create a DRM enum list with the predefined TFs supported by >> the AMD display driver. >> >> Co-developed-by: Joshua Ashton >> Signed-off-by: Joshua Ashton >> Signed-off-by: Melissa Wen >> --- >> drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 36 ++ >> drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h | 8 +++ >> .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 22 ++ >> .../amd/display/amdgpu_dm/amdgpu_dm_crtc.c| 72 ++- >> 4 files changed, 137 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c >> index 389396eac222..88af075e6c18 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c >> @@ -1247,6 +1247,38 @@ amdgpu_display_user_framebuffer_create(struct >> drm_device *dev, >> return &amdgpu_fb->base; >> } >> >> +static const struct drm_prop_enum_list drm_transfer_function_enum_list[] = { >> +{ DRM_TRANSFER_FUNCTION_DEFAULT, "Default" }, >> +{ DRM_TRANSFER_FUNCTION_SRGB, "sRGB" }, >> +{ DRM_TRANSFER_FUNCTION_BT709, "BT.709" }, >> +{ DRM_TRANSFER_FUNCTION_PQ, "PQ (Perceptual Quantizer)" }, >> +{ DRM_TRANSFER_FUNCTION_LINEAR, "Linear" }, >> +{ DRM_TRANSFER_FUNCTION_UNITY, "Unity" }, >> +{ DRM_TRANSFER_FUNCTION_HLG, "HLG (Hybrid Log Gamma)" }, >> +{ DRM_TRANSFER_FUNCTION_GAMMA22, "Gamma 2.2" }, >> +{ DRM_TRANSFER_FUNCTION_GAMMA24, "Gamma 2.4" }, >> +{ DRM_TRANSFER_FUNCTION_GAMMA26, "Gamma 2.6" }, >> +}; >> + >> +#ifdef AMD_PRIVATE_COLOR >> +static int >> +amdgpu_display_create_color_properties(struct amdgpu_device *adev) >> +{ >> +struct drm_property *prop; >> + >> +prop = drm_property_create_enum(adev_to_drm(adev), >> +DRM_MODE_PROP_ENUM, >> +"AMD_REGAMMA_TF", > > Hi, > > is this REGAMMA element capable of applying only optical-to-electrical > direction of the listed TFs? > > I was expecting that the listed TF names would include an explanation > of the direction, for example "PQ EOTF" vs. "inverse PQ EOTF". > > Very specifically "inverse EOTF" and not "OETF" because they > generally are not the same concept. > > PQ defines only EOTF while HLG for example defines OETF (HLG defines > its EOTF as a combination of inverse HLG OETF and a parameterised HLG > OOTF). So if you say "PQ TF" I will assume it means > electrical-to-optical and if you say HLG TF I might assume > optical-to-electrical. I think these enum names should be more explicit > about what they refer to, to avoid any ambiguity. > > What kind of TF is "Unity"? > > This patch is not adding any docs for any of these. Is there another > patch that does? > > I'm still confused about how this "private" API should be thought of. > Should it be documented at all? Is it free to use for userspace? > Was the expectation that only the Steam Deck distribution would enable > these in the kernel, and no-one else? And if anyone builds their own > kernel with these enabled? So my ask for docs may or may not be > warranted. > The current plan is to put the API bits behind a #ifdef AMD_PRIVATE_COLOR (or a similar name) and not making it configurable via KConfig. Anyone that wants them would have to build the kernel with -DAMD_PRIVATE_COLOR. Thanks for re-iterating your naming and documentation concerns. It would be good to still fix that up, even if this doesn't become upstream API as-is. Harry > I don't like the names degamma/regamma/gamma at all. I don't like > calling something a LUT when it can have a parametric or enumerated > curve. I don't like calling an element a transfer function if it could > be a shaper or a combination of TF and shaper and maybe something else > (i.e. a LUT). > > But that's nothing new. If the expectation is that no-one should use > these, then it's fine, and you don't need to CC me. You know I will > always respond with similar comments about documenting things, having > good names, etc. that is important for generic userspace, which is just > not needed for "no-users UAPI". ;-) > > > Thanks, > pq > >> +drm_transfer_function_enum_list, >> + >> ARRAY_SIZE(drm_transfer_function_enum_list)); >> +if (!prop) >> +return -ENOMEM; >> +adev->mode_info.regamma_tf_property = prop; >> + >> +return 0; >>
Re: [PATCH] drm/amdgpu: Mark mmhub_v1_8_mmea_err_status_reg as __maybe_unused
On Fri, May 19, 2023 at 06:14:38PM +0530, Srinivasan Shanmugam wrote: > Silencing the compiler from below compilation error: > > drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c:704:23: error: variable > 'mmhub_v1_8_mmea_err_status_reg' is not needed and will not be emitted > [-Werror,-Wunneeded-internal-declaration] > static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = { > ^ > 1 error generated. > > Mark the variable as __maybe_unused to make it clear to clang that this > is expected, so there is no more warning. > > Cc: Christian König > Cc: Lijo Lazar > Cc: Luben Tuikov > Cc: Alex Deucher > Signed-off-by: Srinivasan Shanmugam Traditionally, this attribute would go between the [] and =, but that is a nit. Can someone please pick this up to unblock our builds on -next? Reviewed-by: Nathan Chancellor > --- > drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c > b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c > index 3648994724c2..cba087e529c0 100644 > --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c > +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c > @@ -701,6 +701,7 @@ static void mmhub_v1_8_reset_ras_error_count(struct > amdgpu_device *adev) > mmhub_v1_8_inst_reset_ras_error_count(adev, i); > } > > +__maybe_unused > static const uint32_t mmhub_v1_8_mmea_err_status_reg[] = { > regMMEA0_ERR_STATUS, > regMMEA1_ERR_STATUS, > -- > 2.25.1 >
Re: [PATCH 02/36] drm/drm_property: make replace_property_blob_from_id a DRM helper
On Tue, May 23, 2023 at 09:14:46PM -0100, Melissa Wen wrote: > Place it in drm_property where drm_property_replace_blob and > drm_property_lookup_blob live. Then we can use the DRM helper for > driver-specific KMS properties too. > > Signed-off-by: Melissa Wen I know that I've got Cc-ed because of a comment, but I did have a look at the whole patch. If it is useful, then you can add Reviewed-by: Liviu Dudau Best regards, Liviu > --- > drivers/gpu/drm/arm/malidp_crtc.c | 2 +- > drivers/gpu/drm/drm_atomic_uapi.c | 43 --- > drivers/gpu/drm/drm_property.c| 49 +++ > include/drm/drm_property.h| 6 > 4 files changed, 61 insertions(+), 39 deletions(-) > > diff --git a/drivers/gpu/drm/arm/malidp_crtc.c > b/drivers/gpu/drm/arm/malidp_crtc.c > index dc01c43f6193..d72c22dcf685 100644 > --- a/drivers/gpu/drm/arm/malidp_crtc.c > +++ b/drivers/gpu/drm/arm/malidp_crtc.c > @@ -221,7 +221,7 @@ static int malidp_crtc_atomic_check_ctm(struct drm_crtc > *crtc, > > /* >* The size of the ctm is checked in > - * drm_atomic_replace_property_blob_from_id. > + * drm_property_replace_blob_from_id. >*/ > ctm = (struct drm_color_ctm *)state->ctm->data; > for (i = 0; i < ARRAY_SIZE(ctm->matrix); ++i) { > diff --git a/drivers/gpu/drm/drm_atomic_uapi.c > b/drivers/gpu/drm/drm_atomic_uapi.c > index c06d0639d552..b76d50ae244c 100644 > --- a/drivers/gpu/drm/drm_atomic_uapi.c > +++ b/drivers/gpu/drm/drm_atomic_uapi.c > @@ -362,39 +362,6 @@ static s32 __user *get_out_fence_for_connector(struct > drm_atomic_state *state, > return fence_ptr; > } > > -static int > -drm_atomic_replace_property_blob_from_id(struct drm_device *dev, > - struct drm_property_blob **blob, > - uint64_t blob_id, > - ssize_t expected_size, > - ssize_t expected_elem_size, > - bool *replaced) > -{ > - struct drm_property_blob *new_blob = NULL; > - > - if (blob_id != 0) { > - new_blob = drm_property_lookup_blob(dev, blob_id); > - if (new_blob == NULL) > - return -EINVAL; > - > - if (expected_size > 0 && > - new_blob->length != expected_size) { > - drm_property_blob_put(new_blob); > - return -EINVAL; > - } > - if (expected_elem_size > 0 && > - new_blob->length % expected_elem_size != 0) { > - drm_property_blob_put(new_blob); > - return -EINVAL; > - } > - } > - > - *replaced |= drm_property_replace_blob(blob, new_blob); > - drm_property_blob_put(new_blob); > - > - return 0; > -} > - > static int drm_atomic_crtc_set_property(struct drm_crtc *crtc, > struct drm_crtc_state *state, struct drm_property *property, > uint64_t val) > @@ -415,7 +382,7 @@ static int drm_atomic_crtc_set_property(struct drm_crtc > *crtc, > } else if (property == config->prop_vrr_enabled) { > state->vrr_enabled = val; > } else if (property == config->degamma_lut_property) { > - ret = drm_atomic_replace_property_blob_from_id(dev, > + ret = drm_property_replace_blob_from_id(dev, > &state->degamma_lut, > val, > -1, sizeof(struct drm_color_lut), > @@ -423,7 +390,7 @@ static int drm_atomic_crtc_set_property(struct drm_crtc > *crtc, > state->color_mgmt_changed |= replaced; > return ret; > } else if (property == config->ctm_property) { > - ret = drm_atomic_replace_property_blob_from_id(dev, > + ret = drm_property_replace_blob_from_id(dev, > &state->ctm, > val, > sizeof(struct drm_color_ctm), -1, > @@ -431,7 +398,7 @@ static int drm_atomic_crtc_set_property(struct drm_crtc > *crtc, > state->color_mgmt_changed |= replaced; > return ret; > } else if (property == config->gamma_lut_property) { > - ret = drm_atomic_replace_property_blob_from_id(dev, > + ret = drm_property_replace_blob_from_id(dev, > &state->gamma_lut, > val, > -1, sizeof(struct drm_color_lut), > @@ -563,7 +530,7 @@ static int drm_atomic_plane_set_property(struct drm_plane > *plane, > } else if (property == plane->color_range_property) { > state->color_range = val; > } else if (property == config->prop_fb_damage_clips) { > - ret = drm_atomic
Re: [PATCH 1/2] Revert "drm/amd/display: Block optimize on consecutive FAMS enables"
On Thu, May 25, 2023 at 6:27 AM Michel Dänzer wrote: > > On 5/23/23 18:09, Hamza Mahfooz wrote: > > On 5/22/23 09:08, Michel Dänzer wrote: > >> From: Michel Dänzer > >> > >> This reverts commit ce560ac40272a5c8b5b68a9d63a75edd9e66aed2. > >> > >> It depends on its parent commit, which we want to revert. > >> > >> Signed-off-by: Michel Dänzer > > > > I have applied the series, thanks! > > Thank you. Note that these need to be merged for 6.4; they weren't in Alex's > 6.4 fixes PR yesterday. Yes, I plan to include it in next week's PR. Alex
Re: [PATCH] drm/amdgpu: keep irq count in amdgpu_irq_disable_all
Am 25.05.23 um 11:28 schrieb Guchun Chen: This can clean up all irq warnings because of unbalanced amdgpu_irq_get/put when unplugging/unbind device, and leave irq count decrease in each ip fini function. Signed-off-by: Guchun Chen Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c index 00f2106c17b9..f90920fbd340 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c @@ -140,7 +140,6 @@ void amdgpu_irq_disable_all(struct amdgpu_device *adev) continue; for (k = 0; k < src->num_types; ++k) { - atomic_set(&src->enabled_types[k], 0); r = src->funcs->set(adev, src, k, AMDGPU_IRQ_STATE_DISABLE); if (r)
Re: [PATCH] drm/amdgpu: enable tmz by default for GC 11.0.1
Reviewed-by: Alex Deucher On Thu, May 25, 2023 at 3:22 AM Ikshwaku Chauhan wrote: > > Add IP GC 11.0.1 in the list of target to have > tmz enabled by default. > > Signed-off-by: Ikshwaku Chauhan > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c > index 3f5dd9e32e08..348d856626c6 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c > @@ -591,6 +591,8 @@ void amdgpu_gmc_tmz_set(struct amdgpu_device *adev) > case IP_VERSION(9, 3, 0): > /* GC 10.3.7 */ > case IP_VERSION(10, 3, 7): > + /* GC 11.0.1 */ > + case IP_VERSION(11, 0, 1): > if (amdgpu_tmz == 0) { > adev->gmc.tmz_enabled = false; > dev_info(adev->dev, > @@ -614,7 +616,6 @@ void amdgpu_gmc_tmz_set(struct amdgpu_device *adev) > case IP_VERSION(10, 3, 1): > /* YELLOW_CARP*/ > case IP_VERSION(10, 3, 3): > - case IP_VERSION(11, 0, 1): > case IP_VERSION(11, 0, 4): > /* Don't enable it by default yet. > */ > -- > 2.25.1 >
[PATCH] drm/amdgpu: Fix unused sq_int_priv variable in event_interrupt_wq_v11
gcc with W=1 drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_int_process_v11.c: In function ‘event_interrupt_wq_v11’: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_int_process_v11.c:282:38: warning: variable ‘sq_int_priv’ set but not used [-Wunused-but-set-variable] 282 | uint8_t sq_int_enc, sq_int_errtype, sq_int_priv; | Remove unused sq_int_priv variable. Cc: Christian König Cc: Alex Deucher Signed-off-by: Srinivasan Shanmugam --- drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c | 9 + 1 file changed, 1 insertion(+), 8 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c index 0f0fdea4cd8a..fa0cf6d17baa 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c @@ -279,7 +279,7 @@ static void event_interrupt_wq_v11(struct kfd_node *dev, { uint16_t source_id, client_id, ring_id, pasid, vmid; uint32_t context_id0, context_id1; - uint8_t sq_int_enc, sq_int_errtype, sq_int_priv; + u8 sq_int_enc, sq_int_errtype; struct kfd_vm_fault_info info = {0}; struct kfd_hsa_memory_exception_data exception_data; @@ -348,13 +348,6 @@ static void event_interrupt_wq_v11(struct kfd_node *dev, break; case SQ_INTERRUPT_WORD_ENCODING_INST: print_sq_intr_info_inst(context_id0, context_id1); - sq_int_priv = REG_GET_FIELD(context_id0, - SQ_INTERRUPT_WORD_WAVE_CTXID0, PRIV); - /*if (sq_int_priv && (kfd_set_dbg_ev_from_interrupt(dev, pasid, - KFD_CTXID0_DOORBELL_ID(context_id0), - KFD_CTXID0_TRAP_CODE(context_id0), - NULL, 0))) - return;*/ break; case SQ_INTERRUPT_WORD_ENCODING_ERROR: print_sq_intr_info_error(context_id0, context_id1); -- 2.25.1
Re: [PATCH 1/2] Revert "drm/amd/display: Block optimize on consecutive FAMS enables"
On 5/23/23 18:09, Hamza Mahfooz wrote: > On 5/22/23 09:08, Michel Dänzer wrote: >> From: Michel Dänzer >> >> This reverts commit ce560ac40272a5c8b5b68a9d63a75edd9e66aed2. >> >> It depends on its parent commit, which we want to revert. >> >> Signed-off-by: Michel Dänzer > > I have applied the series, thanks! Thank you. Note that these need to be merged for 6.4; they weren't in Alex's 6.4 fixes PR yesterday. -- Earthling Michel Dänzer| https://redhat.com Libre software enthusiast | Mesa and Xwayland developer