Re: [patch] problems with "fix visible VRAM handling during faults"
Christian König wrote:
> On 08.05.24 at 12:17, Michel Dänzer wrote:
> > Does this instead of your patch help by any chance?
> >
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > index 109fe557a02b..29c197c00018 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > @@ -427,7 +427,7 @@ bool amdgpu_res_cpu_visible(struct amdgpu_device *adev,
> >
> >  	amdgpu_res_first(res, 0, res->size, &cursor);
> >  	while (cursor.remaining) {
> > -		if ((cursor.start + cursor.size) >= adev->gmc.visible_vram_size)
> > +		if ((cursor.start + cursor.size) > adev->gmc.visible_vram_size)
>
> Oh, good catch. Yes that might be it.

Yes, that does it. Thanks!

Re: [patch] problems with "fix visible VRAM handling during faults"
On 08.05.24 at 12:17, Michel Dänzer wrote:
> On 2024-05-07 18:39, Jeremy Day wrote:
>> This is just to report that I've had usually well-behaved applications
>> sometimes having problems with memory access violations since kernel
>> version 6.9-rc5. This past weekend I stumbled across a way to reliably
>> reproduce the problem in the form of a Skyrim save file which causes a
>> crash shortly after loading the game on affected kernels.
>>
>> Things go back to running smoothly only if I revert one of the changes
>> in 5th April's "[PATCH] drm/amdgpu: fix visible VRAM handling during
>> faults" as follows.
>>
>> Patch is against v6.9-rc7. It restores the check for partially
>> visible-to-cpu memory in amdgpu_bo_fault_reserve_notify. Things
>> seem stable again with this change.
>
> Does this instead of your patch help by any chance?
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 109fe557a02b..29c197c00018 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -427,7 +427,7 @@ bool amdgpu_res_cpu_visible(struct amdgpu_device *adev,
>
>  	amdgpu_res_first(res, 0, res->size, &cursor);
>  	while (cursor.remaining) {
> -		if ((cursor.start + cursor.size) >= adev->gmc.visible_vram_size)
> +		if ((cursor.start + cursor.size) > adev->gmc.visible_vram_size)

Oh, good catch. Yes that might be it.

Thanks a lot,
Christian.

>  			return false;
>  		amdgpu_res_next(&cursor, cursor.size);
>  	}

Re: [patch] problems with "fix visible VRAM handling during faults"
On 2024-05-07 18:39, Jeremy Day wrote:
> This is just to report that I've had usually well-behaved applications
> sometimes having problems with memory access violations since kernel
> version 6.9-rc5. This past weekend I stumbled across a way to reliably
> reproduce the problem in the form of a Skyrim save file which causes a
> crash shortly after loading the game on affected kernels.
>
> Things go back to running smoothly only if I revert one of the changes
> in 5th April's "[PATCH] drm/amdgpu: fix visible VRAM handling during
> faults" as follows.
>
> Patch is against v6.9-rc7. It restores the check for partially
> visible-to-cpu memory in amdgpu_bo_fault_reserve_notify. Things
> seem stable again with this change.

Does this instead of your patch help by any chance?

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 109fe557a02b..29c197c00018 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -427,7 +427,7 @@ bool amdgpu_res_cpu_visible(struct amdgpu_device *adev,

 	amdgpu_res_first(res, 0, res->size, &cursor);
 	while (cursor.remaining) {
-		if ((cursor.start + cursor.size) >= adev->gmc.visible_vram_size)
+		if ((cursor.start + cursor.size) > adev->gmc.visible_vram_size)
 			return false;
 		amdgpu_res_next(&cursor, cursor.size);
 	}

-- 
Earthling Michel Dänzer   | https://redhat.com
Libre software enthusiast | Mesa and Xwayland developer
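
For readers following the thread, the one-character change can be illustrated with a small standalone sketch. This is not the driver code: struct vram_block and both functions are hypothetical stand-ins, and the only thing carried over from the diff is the comparison itself. The assumption is that a block covers the half-open byte range [start, start + size), so a block whose end equals the visible size still lies entirely inside the CPU-visible window. The old >= test rejects exactly that block, while > accepts it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one contiguous chunk of a VRAM allocation. */
struct vram_block {
	uint64_t start;	/* byte offset of the chunk within VRAM */
	uint64_t size;	/* length of the chunk in bytes */
};

/* Comparison as it was before the diff: a chunk ending exactly at the
 * visible limit is treated as not CPU-visible. */
bool visible_with_ge(const struct vram_block *blocks, size_t n,
		     uint64_t visible_size)
{
	for (size_t i = 0; i < n; i++)
		if (blocks[i].start + blocks[i].size >= visible_size)
			return false;
	return true;
}

/* Comparison as in the diff: only a chunk that extends past the
 * visible limit is treated as not CPU-visible. */
bool visible_with_gt(const struct vram_block *blocks, size_t n,
		     uint64_t visible_size)
{
	for (size_t i = 0; i < n; i++)
		if (blocks[i].start + blocks[i].size > visible_size)
			return false;
	return true;
}

/* Boundary cases for an assumed 256 MiB visible window. */
void boundary_self_check(void)
{
	const uint64_t mib = UINT64_C(1) << 20;
	const uint64_t visible = 256 * mib;

	/* Last byte used is visible - 1, so the chunk fits entirely. */
	struct vram_block fits = { 192 * mib, 64 * mib };
	/* One byte longer: the final byte lies outside the window. */
	struct vram_block spills = { 192 * mib, 64 * mib + 1 };

	assert(!visible_with_ge(&fits, 1, visible));	/* false negative */
	assert(visible_with_gt(&fits, 1, visible));	/* accepted */
	assert(!visible_with_gt(&spills, 1, visible));	/* still rejected */
}
```

Calling boundary_self_check() exercises the three cases. The first assertion shows the kind of fully visible buffer that the >= comparison would misclassify, which is consistent with the fix being confirmed above, though the sketch says nothing about how the driver reacts to that misclassification.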