On 5/19/26 10:22, Timur Kristóf wrote:
> UVD 4.x and older can only access FB and MSG buffers from a
> specific 256M VRAM segment that the VCPU BO is also located in.
> We already modify all placements of the given BO to ensure
> the BO is placed within this segment.
>
> Previously, amdgpu_uvd_force_into_uvd_segment() always assumed
> that the UVD segment is the first 256M of VRAM, even though
> under some conditions the VCPU BO could be allocated outside
> this segment, which made UVD non-functional as the BOs were
> not inside the same segment as the UVD VCPU BO.
>
> Solve that by using the segment where the VCPU BO actually is.
>
> This fixes an issue with UVD failing to initialize on SI/CIK
> when resizable BAR is enabled and the VCPU BO is allocated
> in a different segment.
>
> Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/3851
> Signed-off-by: Timur Kristóf <[email protected]>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 36 +++++++++++++++----------
> 1 file changed, 22 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> index 1e59ca924abe..993957927782 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> @@ -550,16 +550,29 @@ void amdgpu_uvd_free_handles(struct amdgpu_device
> *adev, struct drm_file *filp)
> }
> }
>
> +/**
> + * amdgpu_uvd_force_into_uvd_segment() - Forces placement of a BO into the
> UVD segment
> + *
> + * @abo: buffer object whose placement is forced
> + *
> + * UVD 4.x and older can only access FB and MSG buffers from a specific 256M
> VRAM segment
> + * that the VCPU BO is also located in. Force the BO into that segment.
> + */
> static void amdgpu_uvd_force_into_uvd_segment(struct amdgpu_bo *abo)
> {
> - int i;
> + struct amdgpu_device *adev = amdgpu_ttm_adev(abo->tbo.bdev);
> + struct amdgpu_bo *vcpu_bo = adev->uvd.inst[0].vcpu_bo;
> + struct amdgpu_res_cursor vcpu_cur;
>
> - for (i = 0; i < abo->placement.num_placement; ++i) {
> - abo->placements[i].fpfn = 0 >> PAGE_SHIFT;
> - abo->placements[i].lpfn = (256 * 1024 * 1024) >> PAGE_SHIFT;
> - if (abo->placements[i].mem_type == TTM_PL_VRAM)
> - abo->placements[i].flags |= TTM_PL_FLAG_CONTIGUOUS;
> - }
> + amdgpu_res_first(vcpu_bo->tbo.resource, 0, amdgpu_bo_size(vcpu_bo),
> &vcpu_cur);
> +
> + abo->placement.num_placement = 1;
> + abo->placements[0].fpfn = ALIGN_DOWN(vcpu_cur.start, SZ_256M) >>
> PAGE_SHIFT;
> + abo->placements[0].lpfn = abo->placements[0].fpfn + (SZ_256M >>
> PAGE_SHIFT);
> + abo->placements[0].mem_type =
> adev->uvd.inst[0].vcpu_bo->tbo.resource->mem_type;
This should clearly be applied to all placements, it's just that VRAM should
use the vcpu segment and GTT the first one.
> +
> + if (abo->placements[0].mem_type == TTM_PL_VRAM)
> + abo->placements[0].flags |= TTM_PL_FLAG_CONTIGUOUS;
> }
>
> static u64 amdgpu_uvd_get_addr_from_ctx(struct amdgpu_uvd_cs_ctx *ctx)
> @@ -600,13 +613,8 @@ static int amdgpu_uvd_cs_pass1(struct amdgpu_uvd_cs_ctx
> *ctx)
> if (!ctx->parser->adev->uvd.address_64_bit) {
> /* check if it's a message or feedback command */
> cmd = amdgpu_ib_get_value(ctx->ib, ctx->idx) >> 1;
> - if (cmd == 0x0 || cmd == 0x3) {
> - /* yes, force it into VRAM */
> - uint32_t domain = AMDGPU_GEM_DOMAIN_VRAM;
> -
> - amdgpu_bo_placement_from_domain(bo, domain);
> - }
> - amdgpu_uvd_force_into_uvd_segment(bo);
> + if (cmd == 0x0 || cmd == 0x3)
> + amdgpu_uvd_force_into_uvd_segment(bo);
The existing code was already correct. We just messed up the GTT handling by
not having the correct check in the manager and not supporting GTT->GTT moves.
Regards,
Christian.
>
> r = ttm_bo_validate(&bo->tbo, &bo->placement, &tctx);
> }