On 11/7/25 14:31, Timur Kristóf wrote:
> On Fri, 2025-11-07 at 14:14 +0100, Christian König wrote:
>> On 11/7/25 11:47, Timur Kristóf wrote:
>>> On Fri, 2025-11-07 at 11:01 +0100, Christian König wrote:
>>>> On 11/7/25 10:53, Timur Kristóf wrote:
>>>>> On Fri, 2025-11-07 at 10:49 +0100, Christian König wrote:
>>>>>> On 11/6/25 19:44, Timur Kristóf wrote:
>>>>>>> VCE uses the VCPU BO to keep track of currently active
>>>>>>> encoding sessions. To make sure the VCE functions correctly
>>>>>>> after suspend/resume, save the VCPU BO to system RAM on
>>>>>>> suspend and restore it on resume.
>>>>>>>
>>>>>>> Additionally, make sure to keep the VCPU BO pinned.
>>>>>>> The VCPU BO needs to stay at the same location before and
>>>>>>> after sleep/resume because the FW code is not relocatable
>>>>>>> once it's started.
>>>>>>>
>>>>>>> Unfortunately this is not enough to make VCE suspend work
>>>>>>> when there are encoding sessions active, so don't allow
>>>>>>> that yet.
>>>>>>
>>>>>> The question of whether this is the right approach can only be
>>>>>> answered by Leo and Ruijing.
>>>>>>
>>>>>> If we can get VCE1-3 to keep sessions working after
>>>>>> suspend/resume, we should make this change; otherwise we should
>>>>>> rather invalidate all old sessions after suspend/resume and
>>>>>> only re-load the FW.
>>>>>>
>>>>>> Anyway, I think you should make the removal of
>>>>>> "amdgpu_bo_kmap(adev->vce.vcpu_bo, &cpu_addr);" a separate
>>>>>> patch, because that seems to be a good cleanup no matter what.
>>>>>>
>>>>>
>>>>> This change is necessary for the VCE1 code when it loads the
>>>>> firmware signature. Without this patch, we would need to call
>>>>> reserve(), kmap(), kunmap(), unreserve() in
>>>>> vce_v1_0_load_fw_signature().
>>>>>
>>>>> Let me know which approach you prefer.
>>>>
>>>> In this case, definitely make the removal of
>>>> amdgpu_bo_kmap(adev->vce.vcpu_bo, &cpu_addr) a separate patch.
>>>
>>> Sorry, can you clarify what you mean?
>>> Should I just go back to the way things were in the first version
>>> of the series, and then clean it up later?
>>
>> Just add a patch (early in the series, probably as the first patch)
>> to remove amdgpu_bo_kmap(adev->vce.vcpu_bo, &cpu_addr).
>
> Is it allowed to keep a BO mapped while it is unreserved?
Yes, as long as it is also pinned.

IIRC the VCE BO is allocated by amdgpu_bo_create_kernel(), which both
pins and maps the BO.

Regards,
Christian.

>
>>
>> Then put the memset_io() into amdgpu_vce_resume() like you had in the
>> first version of the series, so that VCE1 behaves the same as VCE2-3.
>
> Ok
>
>>
>> And when the series has landed we can clean everything up depending
>> on what Leo/Ruijing have decided to do with suspend/resume on VCE1-3.
>
> Sounds good.
>
>
>
>>
>>
>>>
>>>>
>>>> I want to get initial VCE1 working and landed independent of what
>>>> Leo/Ruijing say about suspend/resume.
>>>
>>> Agreed.
>>>
>>>>
>>>> It may be that suspend/resume is then still broken, but that is
>>>> also the case for VCE2-3 then.
>>>
>>> Yes, some extra work is going to be needed on top of this patch to
>>> make suspend/resume work (if it's possible at all).
>>>
>>>>
>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>> Signed-off-by: Timur Kristóf <[email protected]>
>>>>>>> ---
>>>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 44 ++++++++-----------------
>>>>>>>  drivers/gpu/drm/amd/amdgpu/vce_v4_0.c   | 52 ++++---------------------
>>>>>>>  2 files changed, 24 insertions(+), 72 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>>>>>> index 2297608c5191..4beec5b56c4f 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
>>>>>>> @@ -206,6 +206,10 @@ int amdgpu_vce_sw_init(struct amdgpu_device *adev, unsigned long size)
>>>>>>>  	if (!adev->vce.fw)
>>>>>>>  		return -ENOENT;
>>>>>>>
>>>>>>> +	adev->vce.saved_bo = kvmalloc(size, GFP_KERNEL);
>>>>>>> +	if (!adev->vce.saved_bo)
>>>>>>> +		return -ENOMEM;
>>>>>>> +
>>>>>>>  	r = amdgpu_bo_create_kernel(adev, size, PAGE_SIZE,
>>>>>>>  				    AMDGPU_GEM_DOMAIN_VRAM |
>>>>>>>  				    AMDGPU_GEM_DOMAIN_GTT,
>>>>>>> @@ -254,6 +258,9 @@ int amdgpu_vce_sw_fini(struct amdgpu_device *adev)
>>>>>>>  	amdgpu_bo_free_kernel(&adev->vce.vcpu_bo, &adev->vce.gpu_addr,
>>>>>>>  			      (void **)&adev->vce.cpu_addr);
>>>>>>>
>>>>>>> +	kvfree(adev->vce.saved_bo);
>>>>>>> +	adev->vce.saved_bo = NULL;
>>>>>>> +
>>>>>>>  	return 0;
>>>>>>>  }
>>>>>>>
>>>>>>> @@ -290,13 +297,19 @@ int amdgpu_vce_entity_init(struct amdgpu_device *adev, struct amdgpu_ring *ring)
>>>>>>>   */
>>>>>>>  int amdgpu_vce_suspend(struct amdgpu_device *adev)
>>>>>>>  {
>>>>>>> -	int i;
>>>>>>> +	int i, idx;
>>>>>>>
>>>>>>>  	cancel_delayed_work_sync(&adev->vce.idle_work);
>>>>>>>
>>>>>>>  	if (adev->vce.vcpu_bo == NULL)
>>>>>>>  		return 0;
>>>>>>>
>>>>>>> +	if (drm_dev_enter(adev_to_drm(adev), &idx)) {
>>>>>>> +		memcpy_fromio(adev->vce.saved_bo, adev->vce.cpu_addr,
>>>>>>> +			      amdgpu_bo_size(adev->vce.vcpu_bo));
>>>>>>> +		drm_dev_exit(idx);
>>>>>>> +	}
>>>>>>> +
>>>>>>>  	for (i = 0; i < AMDGPU_MAX_VCE_HANDLES; ++i)
>>>>>>>  		if (atomic_read(&adev->vce.handles[i]))
>>>>>>>  			break;
>>>>>>> @@ -316,40 +329,17 @@ int amdgpu_vce_suspend(struct amdgpu_device *adev)
>>>>>>>   */
>>>>>>>  int amdgpu_vce_resume(struct amdgpu_device *adev)
>>>>>>>  {
>>>>>>> -	void *cpu_addr;
>>>>>>> -	const struct common_firmware_header *hdr;
>>>>>>> -	unsigned int offset;
>>>>>>> -	int r, idx;
>>>>>>> +	int idx;
>>>>>>>
>>>>>>>  	if (adev->vce.vcpu_bo == NULL)
>>>>>>>  		return -EINVAL;
>>>>>>>
>>>>>>> -	r = amdgpu_bo_reserve(adev->vce.vcpu_bo, false);
>>>>>>> -	if (r) {
>>>>>>> -		dev_err(adev->dev, "(%d) failed to reserve VCE bo\n", r);
>>>>>>> -		return r;
>>>>>>> -	}
>>>>>>> -
>>>>>>> -	r = amdgpu_bo_kmap(adev->vce.vcpu_bo, &cpu_addr);
>>>>>>> -	if (r) {
>>>>>>> -		amdgpu_bo_unreserve(adev->vce.vcpu_bo);
>>>>>>> -		dev_err(adev->dev, "(%d) VCE map failed\n", r);
>>>>>>> -		return r;
>>>>>>> -	}
>>>>>>> -
>>>>>>> -	hdr = (const struct common_firmware_header *)adev->vce.fw->data;
>>>>>>> -	offset = le32_to_cpu(hdr->ucode_array_offset_bytes);
>>>>>>> -
>>>>>>>  	if (drm_dev_enter(adev_to_drm(adev), &idx)) {
>>>>>>> -		memcpy_toio(cpu_addr, adev->vce.fw->data + offset,
>>>>>>> -			    adev->vce.fw->size - offset);
>>>>>>> +		memcpy_toio(adev->vce.cpu_addr, adev->vce.saved_bo,
>>>>>>> +			    amdgpu_bo_size(adev->vce.vcpu_bo));
>>>>>>>  		drm_dev_exit(idx);
>>>>>>>  	}
>>>>>>>
>>>>>>> -	amdgpu_bo_kunmap(adev->vce.vcpu_bo);
>>>>>>> -
>>>>>>> -	amdgpu_bo_unreserve(adev->vce.vcpu_bo);
>>>>>>> -
>>>>>>>  	return 0;
>>>>>>>  }
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>>>>>> index 2d64002bed61..21b6656b2f41 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>>>>>> @@ -448,14 +448,8 @@ static int vce_v4_0_sw_init(struct amdgpu_ip_block *ip_block)
>>>>>>>  		return r;
>>>>>>>
>>>>>>>  	if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>>>>> -		const struct common_firmware_header *hdr;
>>>>>>> -		unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>>>>> -
>>>>>>> -		adev->vce.saved_bo = kvmalloc(size, GFP_KERNEL);
>>>>>>> -		if (!adev->vce.saved_bo)
>>>>>>> -			return -ENOMEM;
>>>>>>> -
>>>>>>> -		hdr = (const struct common_firmware_header *)adev->vce.fw->data;
>>>>>>> +		const struct common_firmware_header *hdr =
>>>>>>> +			(const struct common_firmware_header *)adev->vce.fw->data;
>>>>>>>  		adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].ucode_id = AMDGPU_UCODE_ID_VCE;
>>>>>>>  		adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].fw = adev->vce.fw;
>>>>>>>  		adev->firmware.fw_size +=
>>>>>>> @@ -506,11 +500,6 @@ static int vce_v4_0_sw_fini(struct amdgpu_ip_block *ip_block)
>>>>>>>  	/* free MM table */
>>>>>>>  	amdgpu_virt_free_mm_table(adev);
>>>>>>>
>>>>>>> -	if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>>>>> -		kvfree(adev->vce.saved_bo);
>>>>>>> -		adev->vce.saved_bo = NULL;
>>>>>>> -	}
>>>>>>> -
>>>>>>>  	r = amdgpu_vce_suspend(adev);
>>>>>>>  	if (r)
>>>>>>>  		return r;
>>>>>>> @@ -561,20 +550,7 @@ static int vce_v4_0_hw_fini(struct amdgpu_ip_block *ip_block)
>>>>>>>  static int vce_v4_0_suspend(struct amdgpu_ip_block *ip_block)
>>>>>>>  {
>>>>>>>  	struct amdgpu_device *adev = ip_block->adev;
>>>>>>> -	int r, idx;
>>>>>>> -
>>>>>>> -	if (adev->vce.vcpu_bo == NULL)
>>>>>>> -		return 0;
>>>>>>> -
>>>>>>> -	if (drm_dev_enter(adev_to_drm(adev), &idx)) {
>>>>>>> -		if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>>>>> -			unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>>>>> -			void *ptr = adev->vce.cpu_addr;
>>>>>>> -
>>>>>>> -			memcpy_fromio(adev->vce.saved_bo, ptr, size);
>>>>>>> -		}
>>>>>>> -		drm_dev_exit(idx);
>>>>>>> -	}
>>>>>>> +	int r;
>>>>>>>
>>>>>>>  	/*
>>>>>>>  	 * Proper cleanups before halting the HW engine:
>>>>>>> @@ -609,25 +585,11 @@ static int vce_v4_0_suspend(struct amdgpu_ip_block *ip_block)
>>>>>>>  static int vce_v4_0_resume(struct amdgpu_ip_block *ip_block)
>>>>>>>  {
>>>>>>>  	struct amdgpu_device *adev = ip_block->adev;
>>>>>>> -	int r, idx;
>>>>>>> -
>>>>>>> -	if (adev->vce.vcpu_bo == NULL)
>>>>>>> -		return -EINVAL;
>>>>>>>
>>>>>>> -	if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>>>>>>> -
>>>>>>> -		if (drm_dev_enter(adev_to_drm(adev), &idx)) {
>>>>>>> -			unsigned size = amdgpu_bo_size(adev->vce.vcpu_bo);
>>>>>>> -			void *ptr = adev->vce.cpu_addr;
>>>>>>> +	int r;
>>>>>>>
>>>>>>> -			memcpy_toio(ptr, adev->vce.saved_bo, size);
>>>>>>> -			drm_dev_exit(idx);
>>>>>>> -		}
>>>>>>> -	} else {
>>>>>>> -		r = amdgpu_vce_resume(adev);
>>>>>>> -		if (r)
>>>>>>> -			return r;
>>>>>>> -	}
>>>>>>> +	r = amdgpu_vce_resume(adev);
>>>>>>> +	if (r)
>>>>>>> +		return r;
>>>>>>>
>>>>>>>  	return vce_v4_0_hw_init(ip_block);
>>>>>>>  }
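[Editor's note: for readers outside the driver, the save/restore flow the patch implements can be sketched as plain userspace C. All names here (vce_ctx, vce_sw_init, vce_suspend, vce_resume) are hypothetical, and memcpy() stands in for the kernel's memcpy_fromio()/memcpy_toio(); this is only an illustration of the pattern, not driver code.]

```c
#include <stdlib.h>
#include <string.h>

/* Simulated driver state: in the real code the VCPU BO lives in VRAM/GTT
 * and is accessed through memcpy_fromio()/memcpy_toio(). */
struct vce_ctx {
	unsigned char *vcpu_bo;  /* simulated device-visible buffer */
	unsigned char *saved_bo; /* system-RAM backup (kvmalloc() in the driver) */
	size_t size;
};

/* Allocate both buffers up front, as the patch does in amdgpu_vce_sw_init(),
 * so that suspend cannot fail on a late allocation. */
static int vce_sw_init(struct vce_ctx *ctx, size_t size)
{
	ctx->size = size;
	ctx->vcpu_bo = calloc(1, size);
	ctx->saved_bo = malloc(size);
	if (!ctx->vcpu_bo || !ctx->saved_bo) {
		free(ctx->vcpu_bo);
		free(ctx->saved_bo);
		return -1;
	}
	return 0;
}

/* Save the whole BO to system RAM before the device powers down. */
static void vce_suspend(struct vce_ctx *ctx)
{
	memcpy(ctx->saved_bo, ctx->vcpu_bo, ctx->size);
}

/* Restore the BO contents after power-up; because the BO stays pinned at
 * the same address, the non-relocatable firmware state remains usable. */
static void vce_resume(struct vce_ctx *ctx)
{
	memcpy(ctx->vcpu_bo, ctx->saved_bo, ctx->size);
}
```

The key design point mirrored from the patch is that the backup buffer is allocated at sw_init time rather than at suspend time, and the restore copies the saved image back rather than re-loading the firmware blob.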
