On 2026-04-07 15:40, Chen, Xiaogang wrote:


On 4/7/2026 8:38 AM, Philip Yang wrote:
On multi-socket MI300A APU systems, system memory pages mapped to the
closest GPU must use MTYPE_RW instead of MTYPE_NC to maintain correct
cache coherence. The existing mtype override in amdgpu_vm_pte_update_flags()
skipped non-contiguous page mappings, leaving scattered local pages with the
incorrect MTYPE_NC and causing cache coherence issues.

Apply the override to both contiguous and non-contiguous mappings.
When pages_addr is set, resolve the physical address via
pages_addr[addr >> PAGE_SHIFT] before passing it to the override
callback for NUMA node lookup.

Introduce amdgpu_vm_addr_contiguous() helper that, on MI300A, treats
pages on different NUMA nodes as non-contiguous even if their DMA
addresses are adjacent. This ensures amdgpu_vm_update_range() splits
page table updates at NUMA node boundaries so each batch gets the
correct mtype override.

Signed-off-by: Philip Yang <[email protected]>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c    | 48 +++++++++++++++++++----
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c | 14 +++++--
  2 files changed, 50 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 63156289ae7f..f8fcbf079bf4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1099,6 +1099,34 @@ amdgpu_vm_tlb_flush(struct amdgpu_vm_update_params *params,
        }
  }
+/**
+ * amdgpu_vm_addr_contiguous - check if two DMA addresses are contiguous
+ *
+ * @adev: amdgpu_device pointer
+ * @addr: current DMA address
+ * @addr_next: next DMA address to check against
+ * @contiguous: current contiguity state of the range being built
+ *
+ * Check whether @addr and @addr_next are physically contiguous. On APU
+ * platforms with multiple NUMA nodes (e.g. MI300A), a NUMA node boundary
+ * also breaks contiguity so that each contiguous batch stays within a
+ * single NUMA node for correct MTYPE override selection.
+ *
+ * Returns:
+ * true if @addr_next continues the current contiguous range, false otherwise.
+ */

We can use pfn_to_nid or page_to_nid to get which NUMA node the backing memory is on. pfn_to_nid takes a pfn derived from a physical address, but you use a dma_addr_t, which is device dependent and not always the same as the physical address of the RAM.

ttm_tt also has

/** @pages: Array of pages backing the data. */
struct page **pages;

I think using these pages to get the NUMA node id via page_to_nid is more appropriate.

For the MI300A APU the IOMMU is not enabled, so the dma_address (from dma_map_page) is the same as the physical address (page_to_pfn); otherwise the current override path would not work correctly with pfn_to_nid.

Regards,
Philip

Regards

Xiaogang

+static inline bool amdgpu_vm_addr_contiguous(struct amdgpu_device *adev,
+                                            dma_addr_t addr, dma_addr_t addr_next,
+                                            bool contiguous)
+{
+       if (!adev->gmc.is_app_apu || !page_is_ram(addr >> PAGE_SHIFT))
+               return (addr + PAGE_SIZE) == addr_next;
+
+       if (pfn_to_nid(addr >> PAGE_SHIFT) != pfn_to_nid(addr_next >> PAGE_SHIFT))
+               return !contiguous;
+
+       return (addr + PAGE_SIZE) == addr_next;
+}
+
  /**
   * amdgpu_vm_update_range - update a range in the vm page table
   *
@@ -1198,22 +1226,26 @@ int amdgpu_vm_update_range(struct amdgpu_device *adev, struct amdgpu_vm *vm,
                                uint64_t pfn = cursor.start >> PAGE_SHIFT;
                                uint64_t count;
-                               contiguous = pages_addr[pfn + 1] ==
-                                       pages_addr[pfn] + PAGE_SIZE;
+                               contiguous = amdgpu_vm_addr_contiguous(adev,
+                                                                      pages_addr[pfn],
+                                                                      pages_addr[pfn + 1],
+                                                                      contiguous);
-                               tmp = num_entries /
-                                       AMDGPU_GPU_PAGES_IN_CPU_PAGE;
+                               tmp = num_entries / AMDGPU_GPU_PAGES_IN_CPU_PAGE;
                                for (count = 2; count < tmp; ++count) {
                                        uint64_t idx = pfn + count;
-                                       if (contiguous != (pages_addr[idx] ==
-                                           pages_addr[idx - 1] + PAGE_SIZE))
+                                       if (contiguous != amdgpu_vm_addr_contiguous(adev,
+                                                                       pages_addr[idx - 1],
+                                                                       pages_addr[idx],
+                                                                       contiguous))
                                                break;
                                }
+
                                if (!contiguous)
                                        count--;
-                               num_entries = count *
-                                       AMDGPU_GPU_PAGES_IN_CPU_PAGE;
+
+                               num_entries = count * AMDGPU_GPU_PAGES_IN_CPU_PAGE;
                        }

                        if (!contiguous) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
index 31a437ce9570..9e1607fb3b2e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
@@ -708,13 +708,19 @@ static void amdgpu_vm_pte_update_flags(struct amdgpu_vm_update_params *params,
                amdgpu_vm_pte_update_noretry_flags(adev, &flags);

        /* APUs mapping system memory may need different MTYPEs on different
-        * NUMA nodes. Only do this for contiguous ranges that can be assumed
-        * to be on the same NUMA node.
+        * NUMA nodes. Both contiguous and non-contiguous ranges are handled
+        * since amdgpu_vm_update_range ensures updates don't span NUMA
+        * node boundaries.
         */
        if ((flags & AMDGPU_PTE_SYSTEM) && (adev->flags & AMD_IS_APU) &&
            adev->gmc.gmc_funcs->override_vm_pte_flags &&
-           num_possible_nodes() > 1 && !params->pages_addr && params->allow_override)
-               amdgpu_gmc_override_vm_pte_flags(adev, params->vm, addr, &flags);
+           num_possible_nodes() > 1 && params->allow_override) {
+               if (params->pages_addr)
+                       amdgpu_gmc_override_vm_pte_flags(adev, params->vm,
+                                       params->pages_addr[addr >> PAGE_SHIFT], &flags);
+               else
+                       amdgpu_gmc_override_vm_pte_flags(adev, params->vm, addr, &flags);
+       }

        params->vm->update_funcs->update(params, pt, pe, addr, count, incr,
                                         flags);
