On Wednesday, May 13, 2026 6:36:02 PM Central European Summer Time Alex Deucher wrote:
> + Amir
>
> Amir may have some insights on navi4x as he was looking at this recently.
>
> Alex

Hi Alex, Amir,

I think we are very close to enabling retry faults by default on Navi 3.
I'd be happy to receive feedback on this series.

With regard to Navi 4:
I also attempted to get it working on Navi 48. I managed to get retry
faults enabled, but it seems that amdgpu_vm_handle_fault() can't actually
resolve the page fault on Navi 48. It just keeps retrying until it times out.
Christian suggested this may be due to an invalid page being stuck in the
cache. I tried adding a TLB flush, but unfortunately that just made it worse
(it hangs irrecoverably).

Any insight is appreciated!

Thanks & best regards,
Timur

>
> On Wed, May 13, 2026 at 12:30 PM Timur Kristóf <[email protected]> wrote:
> > Fix some issues regarding retry fault handling,
> > such as enabling the retry fault interrupt (necessary
> > for retry faults to work) and such.
> >
> > Improve retry faults on Navi 3 dGPUs by enabling
> > the filter CAM, which can filter the repeated page
> > fault interrupts that happen when retry faults are
> > enabled, making the handling more efficient.
> >
> > With this series, the kernel is able to mitigate
> > most page faults on Navi 3 without causing a hang
> > and without a need to reset the GPU, when the
> > amdgpu.noretry=0 module parameter is set.
> >
> > Timur Kristóf (6):
> >   drm/amdgpu: Use gmc->noretry instead of amdgpu_noretry directly
> >   drm/amdgpu/gfxhub: Enable retry fault interrupts when needed
> >   drm/amdgpu/gfxhub: Program CRASH_ON_*_FAULT bits to 0 as needed
> >   drm/amdgpu/gmc: Don't compare page fault timestamps with other interrupts
> >   drm/amdgpu/ih: Add retry_cam_ack IH function pointer
> >   drm/amdgpu: Enable retry CAM on Navi 3 dGPUs
> >
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c     |  7 +++++--
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h     |  1 +
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h      |  1 +
> >  drivers/gpu/drm/amd/amdgpu/gfxhub_v11_5_0.c | 17 ++++++++++-------
> >  drivers/gpu/drm/amd/amdgpu/gfxhub_v12_0.c   | 17 ++++++++++-------
> >  drivers/gpu/drm/amd/amdgpu/gfxhub_v12_1.c   | 19 +++++++++++--------
> >  drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c    | 15 +++++++++------
> >  drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c    | 15 +++++++++------
> >  drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c    | 15 +++++++++------
> >  drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c    | 15 +++++++++------
> >  drivers/gpu/drm/amd/amdgpu/gfxhub_v3_0.c    | 17 ++++++++++-------
> >  drivers/gpu/drm/amd/amdgpu/gfxhub_v3_0_3.c  | 17 ++++++++++-------
> >  drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c      |  5 ++++-
> >  drivers/gpu/drm/amd/amdgpu/ih_v6_0.c        | 18 +++++++++++++++++-
> >  drivers/gpu/drm/amd/amdgpu/ih_v7_0.c        |  6 ++++++
> >  drivers/gpu/drm/amd/amdgpu/mmhub_v3_0.c     |  2 +-
> >  drivers/gpu/drm/amd/amdgpu/mmhub_v3_0_1.c   |  2 +-
> >  drivers/gpu/drm/amd/amdgpu/mmhub_v3_0_2.c   |  2 +-
> >  drivers/gpu/drm/amd/amdgpu/mmhub_v3_3.c     |  2 +-
> >  drivers/gpu/drm/amd/amdgpu/mmhub_v4_1_0.c   |  2 +-
> >  drivers/gpu/drm/amd/amdgpu/mmhub_v4_2_0.c   |  2 +-
> >  drivers/gpu/drm/amd/amdgpu/vega20_ih.c      |  8 +++++++-
> >  22 files changed, 134 insertions(+), 71 deletions(-)
> >
> > --
> > 2.54.0
