Re: [PATCH 2/2] drm/amdgpu: Permit PCIe transfer over links with XGMI

2023-10-11 Thread Felix Kuehling

On 2023-10-11 14:22, David Francis wrote:

When the CPU is XGMI connected, the PCIe links should
not be enumerated for topology purposes. However, PCIe
transfer should still be a valid option for memory attachment
that requires it.


You could be more specific here. This is for remote doorbells and MMIO 
mappings.

Move the XGMI connection check out of the shared helper
function amdgpu_device_is_peer_accessible and into the
topology path.
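
The split described above can be pictured with a small standalone model. This is only a sketch, not driver code: struct model_gpu and both model_* functions are hypothetical names, and the DMA-mask and aperture checks of the real helper are collapsed into a single flag.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the fields the patch touches; not struct amdgpu_device. */
struct model_gpu {
	bool xgmi_connected_to_cpu; /* models adev->gmc.xgmi.connected_to_cpu */
	bool pcie_p2p_reachable;    /* models pci_p2pdma_distance(...) >= 0 */
	bool large_bar;             /* models the visible-VRAM and aperture checks */
};

/* Shared helper after the patch: PCIe reachability only, no XGMI check. */
static bool model_is_peer_accessible(const struct model_gpu *adev)
{
	return adev->pcie_p2p_reachable && adev->large_bar;
}

/* Topology path after the patch: the XGMI check now lives here. */
static bool model_topology_adds_pcie_link(const struct model_gpu *adev)
{
	if (!model_is_peer_accessible(adev))
		return false;
	if (adev->xgmi_connected_to_cpu)
		return false; /* PCIe link is not enumerated in the topology */
	return true;
}
```

With the CPU connected over XGMI, the helper in this model still reports the peer as accessible for memory attachment, while the topology path declines to add a PCIe link.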

Signed-off-by: David Francis 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +---
  drivers/gpu/drm/amd/amdkfd/kfd_topology.c  | 3 +++
  2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index bad2b5577e96..b47cb7f8cfbd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5753,9 +5753,7 @@ bool amdgpu_device_is_peer_accessible(struct amdgpu_device *adev,
 		~*peer_adev->dev->dma_mask : ~((1ULL << 32) - 1);
 	resource_size_t aper_limit =
 		adev->gmc.aper_base + adev->gmc.aper_size - 1;
-	bool p2p_access =
-		!adev->gmc.xgmi.connected_to_cpu &&
-		!(pci_p2pdma_distance(adev->pdev, peer_adev->dev, false) < 0);
+	bool p2p_access = !(pci_p2pdma_distance(adev->pdev, peer_adev->dev, false) < 0);
 
 	return pcie_p2p && p2p_access && (adev->gmc.visible_vram_size &&
 		adev->gmc.real_vram_size == adev->gmc.visible_vram_size &&
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index 4e530791507e..f0cff5072736 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -1423,6 +1423,9 @@ static int kfd_add_peer_prop(struct kfd_topology_device *kdev,
 						peer->gpu->adev))
 		return ret;
 
+	if (kdev->gpu->adev->gmc.xgmi.connected_to_cpu)
+		return ret;
+

I believe this is only needed for the case that XGMI is disabled via the 
module param. When XGMI is enabled, you shouldn't get here because 
kfd_dev_create_p2p_links doesn't call kfd_add_peer_prop if the GPUs are 
themselves in an XGMI hive. In fact, it may be clearer to move this 
condition into kfd_dev_create_p2p_links.
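
A rough standalone sketch of what moving the condition into the caller could look like. Everything here is a hypothetical model, not the KFD code: struct model_node and model_count_p2p_links are made-up names, hive_id == 0 is assumed to mean "not in an XGMI hive", and incrementing a counter stands in for each kfd_add_peer_prop call (one per direction).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a GPU topology node; not struct kfd_topology_device. */
struct model_node {
	unsigned long hive_id;      /* assumption: 0 means not in an XGMI hive */
	bool xgmi_connected_to_cpu; /* models adev->gmc.xgmi.connected_to_cpu */
};

/*
 * Models the link-creation loop with the connected_to_cpu condition
 * hoisted into the caller. Returns the number of PCIe peer links that
 * would be added for new_node against the existing peers.
 */
static size_t model_count_p2p_links(const struct model_node *new_node,
				    const struct model_node *peers, size_t n)
{
	size_t links = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* GPUs in the same hive talk over XGMI: no PCIe peer link. */
		if (new_node->hive_id && new_node->hive_id == peers[i].hive_id)
			continue;

		/* Source end is XGMI-connected to the CPU: skip that direction. */
		if (!new_node->xgmi_connected_to_cpu)
			links++; /* stands in for the new_node -> peer call */
		if (!peers[i].xgmi_connected_to_cpu)
			links++; /* stands in for the peer -> new_node call */
	}
	return links;
}
```

In this model the same-hive skip and the connected_to_cpu skip sit side by side in the loop, which is the readability argument for doing the check in the caller.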


Regards,
  Felix



 	iolink1 = list_first_entry(&kdev->io_link_props,
 				   struct kfd_iolink_properties, list);
 	if (!iolink1)


[PATCH 2/2] drm/amdgpu: Permit PCIe transfer over links with XGMI

2023-10-11 Thread David Francis
When the CPU is XGMI connected, the PCIe links should
not be enumerated for topology purposes. However, PCIe
transfer should still be a valid option for memory attachment
that requires it.

Move the XGMI connection check out of the shared helper
function amdgpu_device_is_peer_accessible and into the
topology path.

Signed-off-by: David Francis 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +---
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c  | 3 +++
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index bad2b5577e96..b47cb7f8cfbd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5753,9 +5753,7 @@ bool amdgpu_device_is_peer_accessible(struct amdgpu_device *adev,
 		~*peer_adev->dev->dma_mask : ~((1ULL << 32) - 1);
 	resource_size_t aper_limit =
 		adev->gmc.aper_base + adev->gmc.aper_size - 1;
-	bool p2p_access =
-		!adev->gmc.xgmi.connected_to_cpu &&
-		!(pci_p2pdma_distance(adev->pdev, peer_adev->dev, false) < 0);
+	bool p2p_access = !(pci_p2pdma_distance(adev->pdev, peer_adev->dev, false) < 0);
 
 	return pcie_p2p && p2p_access && (adev->gmc.visible_vram_size &&
 		adev->gmc.real_vram_size == adev->gmc.visible_vram_size &&
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index 4e530791507e..f0cff5072736 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -1423,6 +1423,9 @@ static int kfd_add_peer_prop(struct kfd_topology_device *kdev,
 						peer->gpu->adev))
 		return ret;
 
+	if (kdev->gpu->adev->gmc.xgmi.connected_to_cpu)
+		return ret;
+
 	iolink1 = list_first_entry(&kdev->io_link_props,
 				   struct kfd_iolink_properties, list);
 	if (!iolink1)
-- 
2.34.1