On 2021-10-14 at 2:12 p.m., Jonathan Kim wrote:
> ROCr needs to be able to identify all devices that have direct access to
> fine grain memory, which should include CPUs that are connected to GPUs
> over xGMI. The GPU hive ID can be mapped onto the CPU hive ID since the
> CPU is part of the hive.
>
> v2: fixup to ensure all numa nodes get the hive id mapped
>
> Signed-off-by: Jonathan Kim <jonathan....@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 21 ++++++++++++++++++++-
>  1 file changed, 20 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> index 98cca5f2b27f..9fda4ee03813 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> @@ -1296,6 +1296,26 @@ int kfd_topology_add_device(struct kfd_dev *gpu)
>  
>       proximity_domain = atomic_inc_return(&topology_crat_proximity_domain);
>  
> +     adev = (struct amdgpu_device *)(gpu->kgd);
> +
> +     /* Include the CPU in xGMI hive if xGMI connected by assigning it the hive ID. */
> +     if (gpu->hive_id && adev->gmc.xgmi.connected_to_cpu) {
> +             int i;
> +
> +             for (i = 0; i < proximity_domain; i++) {
> +                     struct kfd_topology_device *to_dev =
> +                                             kfd_topology_device_by_proximity_domain(i);

Sorry, one more nit-pick. This loop is pretty inefficient (O(n^2))
because kfd_topology_device_by_proximity_domain does a linear search
itself. It would be more efficient to just loop over the
topology_device_list directly here (while holding the read lock):

>         down_read(&topology_lock);
>
>         list_for_each_entry(top_dev, &topology_device_list, list) {
>                 ...
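
To make the complexity argument concrete, here is a minimal userspace
sketch of the single-pass variant. It is not the kernel code: struct
topo_dev, its fields, and assign_cpu_hive_id are hypothetical stand-ins
for kfd_topology_device, to_dev->gpu and node_props.hive_id, and the
list is a plain singly linked list rather than the kernel's list_head.
In the driver the same walk would sit between down_read(&topology_lock)
and the matching up_read().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct kfd_topology_device. */
struct topo_dev {
	struct topo_dev *next;	/* stands in for the kernel list linkage */
	int is_gpu;		/* stands in for to_dev->gpu != NULL */
	uint64_t hive_id;	/* stands in for node_props.hive_id */
};

/*
 * Walk the list once, O(n), instead of looking up every proximity
 * domain with a separate linear search, O(n^2). CPU nodes ahead of the
 * first GPU node get the hive ID; the walk stops at the first GPU node,
 * mirroring the "if (to_dev->gpu) break;" in the quoted patch.
 * Returns the number of nodes that were updated.
 */
static size_t assign_cpu_hive_id(struct topo_dev *head, uint64_t hive_id)
{
	size_t updated = 0;
	struct topo_dev *dev;

	for (dev = head; dev; dev = dev->next) {
		if (dev->is_gpu)
			break;

		dev->hive_id = hive_id;
		updated++;
	}

	return updated;
}
```

The sketch assumes, as the quoted loop does, that the CPU nodes come
before the GPU nodes in the list.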

Regards,
  Felix


> +
> +                     if (!to_dev)
> +                             continue;
> +
> +                     if (to_dev->gpu)
> +                             break;
> +
> +                     to_dev->node_props.hive_id = gpu->hive_id;
> +             }
> +     }
> +
>       /* Check to see if this gpu device exists in the topology_device_list.
>        * If so, assign the gpu to that device,
>        * else create a Virtual CRAT for this gpu device and then parse that
> @@ -1457,7 +1477,6 @@ int kfd_topology_add_device(struct kfd_dev *gpu)
>               dev->node_props.max_waves_per_simd = 10;
>       }
>  
> -     adev = (struct amdgpu_device *)(dev->gpu->kgd);
>       /* kfd only concerns sram ecc on GFX and HBM ecc on UMC */
>       dev->node_props.capability |=
>               ((adev->ras_enabled & BIT(AMDGPU_RAS_BLOCK__GFX)) != 0) ?
