On 2022-10-04 12:41, Philip Yang wrote:
amdgpu_amdkfd_total_mem_size is the total VRAM of all GPUs plus system
memory. It is used to estimate page table memory usage and to leave
enough VRAM for page table allocation.

Calculating amdgpu_amdkfd_total_mem_size in amdgpu_amdkfd_device_probe
is incorrect because adev->gmc.real_vram_size is still 0 at that point:
the probe is called from amdgpu_device_ip_early_init. Move the
calculation to amdgpu_amdkfd_device_init to get the correct VRAM size.

Signed-off-by: Philip Yang <philip.y...@amd.com>

Reviewed-by: Felix Kuehling <felix.kuehl...@amd.com>

Semi-related to this, there should probably be a reverse calculation in amdgpu_amdkfd_device_fini_sw to support hot-unplugging GPUs.
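
A rough sketch of what such a reverse calculation could look like, written as a small user-space model rather than driver code. The struct mock_adev type, the kfd_accounted flag, and both mock_* functions are hypothetical names made up for illustration. Only real_vram_size and the running total mirror names from the patch, and the system-memory part of the total is left out.

```c
#include <stdint.h>

/* Hypothetical stand-in for the relevant fields of struct amdgpu_device. */
struct mock_adev {
	uint64_t real_vram_size;  /* stays 0 until VRAM has been sized */
	int kfd_accounted;        /* assumption: tracks whether this device was added */
};

/* Models amdgpu_amdkfd_total_mem_size (VRAM part only). */
static uint64_t total_mem_size;

/* Models the accounting done at device init time after this patch. */
static void mock_kfd_device_init(struct mock_adev *adev)
{
	if (adev->kfd_accounted)
		return;
	total_mem_size += adev->real_vram_size;
	adev->kfd_accounted = 1;
}

/* Assumed reverse step for a teardown path such as a fini_sw hook. */
static void mock_kfd_device_fini_sw(struct mock_adev *adev)
{
	if (!adev->kfd_accounted)
		return;
	total_mem_size -= adev->real_vram_size;
	adev->kfd_accounted = 0;
}
```

The flag keeps a repeated teardown from subtracting the same VRAM twice. Whether the real driver needs such a guard, and what locking the global total would need during hot-unplug, is left open here.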

Regards,
  Felix


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 5 ++---
  1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 9e98f3866edc..049d192c7cdf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -75,9 +75,6 @@ void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev)
                return;
        adev->kfd.dev = kgd2kfd_probe(adev, vf);
-
-       if (adev->kfd.dev)
-               amdgpu_amdkfd_total_mem_size += adev->gmc.real_vram_size;
  }
 /**
@@ -201,6 +198,8 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
                adev->kfd.init_complete = kgd2kfd_device_init(adev->kfd.dev,
                                                adev_to_drm(adev), &gpu_resources);
+               amdgpu_amdkfd_total_mem_size += adev->gmc.real_vram_size;
+
                INIT_WORK(&adev->kfd.reset_work, amdgpu_amdkfd_reset_work);
        }
  }
