On 11/6/25 19:44, Timur Kristóf wrote:
> The VCPU BO doesn't only contain the VCE firmware but also other
> ranges that the VCE uses for its stack and data. Let's initialize
> this to zero to avoid having garbage in the VCPU BO.
> 
> v2:
> - Only clear BO after creation, not on resume.
> 
> Fixes: d38ceaf99ed0 ("drm/amdgpu: add core driver (v4)")
> Signed-off-by: Timur Kristóf <[email protected]>

For now this patch here is Reviewed-by: Christian König 
<[email protected]> since it addresses a clear problem and potentially 
even need to be back-ported to older kernels.

But I think we should clean that up more full after we landed VCE1 support.

Assuming that it hold true that VCE1-3 can't continue with sessions after 
suspend resume we should do something like this:

1. Remove all amdgpu_bo_kmap(adev->vce.vcpu_bo, &cpu_addr).
        As kernel BO the VCE FW BO is pinned and mapped on creation time.

2. Rename amdgpu_vce_resume() into amdgpu_vce_reload_fw() and add the 
memset_io() there like you originally planned.

3. Also add resetting the VCE FW handles into amdgpu_vce_reload_fw().

   E.g. something like this:
        for (i = 0; i < AMDGPU_MAX_VCE_HANDLES; ++i) {
                atomic_set(&adev->vce.handles[i], 0);
                adev->vce.filp[i] = NULL;
        }

   This way the kernel will reject submissions when userspace tries to use the 
same FW handles as before the suspend/resume and prevent the HW from crashing.

Does that sounds like a plan to you?

Regards,
Christian.

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> index b9060bcd4806..e028ad0d3b7a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> @@ -187,6 +187,8 @@ int amdgpu_vce_sw_init(struct amdgpu_device *adev, 
> unsigned long size)
>               return r;
>       }
>  
> +     memset_io(adev->vce.cpu_addr, 0, size);
> +
>       for (i = 0; i < AMDGPU_MAX_VCE_HANDLES; ++i) {
>               atomic_set(&adev->vce.handles[i], 0);
>               adev->vce.filp[i] = NULL;

Reply via email to