On 2026-02-11 07:36, Christian König wrote:
It turned out that using 4 level page tables on GMC generations which
support 57bit VAs actually doesn't work at all.
Background is that the GMC actually can't switch between 4 and 5 levels,
but rather just uses a subset of the address space when less than 5 levels
are selected.
Philip already removed the automatic switch to 4 levels, now fix it as
well for the case where it is enabled by module parameters.
Signed-off-by: Christian König <[email protected]>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 4 ++--
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 1 +
3 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
index e8e8bfa098c3..3b9ca5667de4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
@@ -33,9 +33,9 @@
#include "amdgpu_ras.h"
/* VA hole for 48bit and 57bit addresses */
-#define AMDGPU_GMC_HOLE_START (adev->vm_manager.root_level == AMDGPU_VM_PDB3 ?\
+#define AMDGPU_GMC_HOLE_START (adev->vm_manager.max_level == 5 ?\
max_level is 4 for 5-level paging (VM levels 0 - 4), so max_level never
equals 5 and AMDGPU_GMC_HOLE_START still uses the 48-bit value, not the
57-bit one. This looks like incorrect usage. The module parameter
amdgpu_vm_size can be set to a smaller size to select a page table with
fewer levels, but that cannot change max_level to 5, so the condition
adev->vm_manager.max_level == 5 is always false. Do I misunderstand the
change, or did I miss something?
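To make the concern concrete, here is a minimal sketch, not driver code.
The struct and function names are hypothetical, and it assumes max_level
is 4 for 5-level paging as described above:

```c
#include <stdint.h>

/* Hypothetical stand-in for the one amdgpu_vm_manager field used here. */
struct vm_manager_sketch {
	uint32_t max_level;	/* assumed: 3 = 4-level paging, 4 = 5-level paging */
};

/* The check as posted: only max_level == 5 selects the 57-bit hole start. */
uint64_t hole_start_as_posted(const struct vm_manager_sketch *m)
{
	return m->max_level == 5 ? 0x0100000000000000ULL : 0x0000800000000000ULL;
}

/* The same check comparing against 4, for contrast under the assumption above. */
uint64_t hole_start_cmp4(const struct vm_manager_sketch *m)
{
	return m->max_level == 4 ? 0x0100000000000000ULL : 0x0000800000000000ULL;
}
```

With max_level = 4, hole_start_as_posted() returns the 48-bit value
0x0000800000000000, while hole_start_cmp4() returns the 57-bit value
0x0100000000000000.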
Thanks,
Philip
0x0100000000000000ULL : 0x0000800000000000ULL)
-#define AMDGPU_GMC_HOLE_END (adev->vm_manager.root_level == AMDGPU_VM_PDB3 ?\
+#define AMDGPU_GMC_HOLE_END (adev->vm_manager.max_level == 5 ?\
0xff00000000000000ULL : 0xffff800000000000ULL)
/*
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index dfad7d11826c..c6fd3a091613 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2409,6 +2409,7 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t min_vm_size,
}
adev->vm_manager.max_pfn = (uint64_t)vm_size << 18;
+ adev->vm_manager.max_level = max_level;
tmp = roundup_pow_of_two(adev->vm_manager.max_pfn);
if (amdgpu_vm_block_size != -1)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 139642eacdd0..806d62ed61ef 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -456,6 +456,7 @@ struct amdgpu_vm_manager {
bool concurrent_flush;
uint64_t max_pfn;
+ uint32_t max_level;
uint32_t num_level;
uint32_t block_size;
uint32_t fragment_size;