On Tuesday, May 19, 2026 11:06:51 AM Central European Summer Time Christian 
König wrote:
> On 5/19/26 10:22, Timur Kristóf wrote:
> > UVD 4.x and older can only access FB and MSG buffers from a
> > specific 256M VRAM segment that the VCPU BO is also located in.
> > We already modify all placements of the given BO to ensure
> > the BO is placed within this segment.
> > 
> > Previously, amdgpu_uvd_force_into_uvd_segment() always assumed
> > that the UVD segment is the first 256M of VRAM, even though
> > under some conditions the VCPU BO could be allocated outside
> > this segment, which made UVD non-functional as the BOs were
> > not inside the same segment as the UVD VCPU BO.
> > 
> > Solve that by using the segment where the VCPU BO actually is.
> > 
> > This fixes an issue with UVD failing to initialize on SI/CIK
> > when resizable BAR is enabled and the VCPU BO is allocated
> > in a different segment.
> > 
> > Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/3851
> > Signed-off-by: Timur Kristóf <[email protected]>
> > ---
> > 
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 36 +++++++++++++++----------
> >  1 file changed, 22 insertions(+), 14 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c index
> > 1e59ca924abe..993957927782 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> > @@ -550,16 +550,29 @@ void amdgpu_uvd_free_handles(struct amdgpu_device
> > *adev, struct drm_file *filp)> 
> >     }
> >  
> >  }
> > 
> > +/**
> > + * amdgpu_uvd_force_into_uvd_segment() - Forces placement of a BO into
> > the UVD segment + *
> > + * @abo: buffer object whose placement is forced
> > + *
> > + * UVD 4.x and older can only access FB and MSG buffers from a specific
> > 256M VRAM segment + * that the VCPU BO is also located in. Force the BO
> > into that segment. + */
> > 
> >  static void amdgpu_uvd_force_into_uvd_segment(struct amdgpu_bo *abo)
> >  {
> > 
> > -   int i;
> > +   struct amdgpu_device *adev = amdgpu_ttm_adev(abo->tbo.bdev);
> > +   struct amdgpu_bo *vcpu_bo = adev->uvd.inst[0].vcpu_bo;
> > +   struct amdgpu_res_cursor vcpu_cur;
> > 
> > -   for (i = 0; i < abo->placement.num_placement; ++i) {
> > -           abo->placements[i].fpfn = 0 >> PAGE_SHIFT;
> > -           abo->placements[i].lpfn = (256 * 1024 * 1024) >> 
PAGE_SHIFT;
> > -           if (abo->placements[i].mem_type == TTM_PL_VRAM)
> > -                   abo->placements[i].flags |= 
TTM_PL_FLAG_CONTIGUOUS;
> > -   }
> > +   amdgpu_res_first(vcpu_bo->tbo.resource, 0, amdgpu_bo_size(vcpu_bo),
> > &vcpu_cur); +
> > +   abo->placement.num_placement = 1;
> > +   abo->placements[0].fpfn = ALIGN_DOWN(vcpu_cur.start, SZ_256M) >>
> > PAGE_SHIFT; +       abo->placements[0].lpfn = abo->placements[0].fpfn +
> > (SZ_256M >> PAGE_SHIFT); +  abo->placements[0].mem_type =
> > adev->uvd.inst[0].vcpu_bo->tbo.resource->mem_type;
> This should clearly be applied to all placements, it's just that VRAM should
> use the vcpu segment and GTT the first one.

Previously, the function was trying to do two things at once:

- Forcing some BOs into the UVD segment, as the name of the function suggests, 
but it was buggy because it assumed the UVD segment is always segment 0
- Forcing GTT placements for other BOs, which is not what its name suggests, 
and was actually wrong and not working.

With this patch, the function is now only responsible for forcing MSG/FB BOs 
into the UVD segment, and is no longer responsible for doing anything with the 
other BOs, which was incorrect anyway.

This is also beneficial because it reduces contention for the UVD segment for 
the BOs that don't actually need to be located in that segment.

For non-MSG/FB BOs, this function should no longer be called.
When those BOs are located in GTT, patch 2 should already make sure they are 
placed correctly. When they are in VRAM, patch 5 will make sure to correct 
their placement if it was wrong.

> > +
> > +   if (abo->placements[0].mem_type == TTM_PL_VRAM)
> > +           abo->placements[0].flags |= TTM_PL_FLAG_CONTIGUOUS;
> > 
> >  }
> >  
> >  static u64 amdgpu_uvd_get_addr_from_ctx(struct amdgpu_uvd_cs_ctx *ctx)
> > 
> > @@ -600,13 +613,8 @@ static int amdgpu_uvd_cs_pass1(struct
> > amdgpu_uvd_cs_ctx *ctx)> 
> >     if (!ctx->parser->adev->uvd.address_64_bit) {
> >     
> >             /* check if it's a message or feedback command */
> >             cmd = amdgpu_ib_get_value(ctx->ib, ctx->idx) >> 1;
> > 
> > -           if (cmd == 0x0 || cmd == 0x3) {
> > -                   /* yes, force it into VRAM */
> > -                   uint32_t domain = AMDGPU_GEM_DOMAIN_VRAM;
> > -
> > -                   amdgpu_bo_placement_from_domain(bo, domain);
> > -           }
> > -           amdgpu_uvd_force_into_uvd_segment(bo);
> > +           if (cmd == 0x0 || cmd == 0x3)
> > +                   amdgpu_uvd_force_into_uvd_segment(bo);
> 
> The existing code was already correct. We just messed up the GTT handling by
> not having the correct check in the manager and not supporting GTT->GTT
> moves.

See above for an explanation of why I changed this part.

> 
> Regards,
> Christian.
> 
> >             r = ttm_bo_validate(&bo->tbo, &bo->placement, &tctx);
> >     
> >     }




Reply via email to