On 2026-04-07 15:40, Chen, Xiaogang wrote:
> On 4/7/2026 8:38 AM, Philip Yang wrote:
>> On multi-socket MI300A APU systems, system memory pages mapped to the
>> closest GPU must use MTYPE_RW instead of MTYPE_NC to maintain correct
>> cache coherence. The existing mtype override in
>> amdgpu_vm_pte_update_flags() excluded non-contiguous page mappings
>> from the override. This caused incorrect MTYPE_NC for scattered local
>> pages, leading to cache coherence issues.
>>
>> The override applies to both contiguous and non-contiguous mappings.
>> When pages_addr is set, resolve the physical address via
>> pages_addr[addr >> PAGE_SHIFT] before passing it to the override
>> callback for NUMA node lookup.
>>
>> Introduce an amdgpu_vm_addr_contiguous() helper that, on MI300A,
>> treats pages on different NUMA nodes as non-contiguous even if their
>> DMA addresses are adjacent. This ensures amdgpu_vm_update_range()
>> splits page table updates at NUMA node boundaries so each batch gets
>> the correct mtype override.
>>
>> Signed-off-by: Philip Yang <[email protected]>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c    | 48 +++++++++++++++++++----
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c | 14 +++++--
>>  2 files changed, 50 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 63156289ae7f..f8fcbf079bf4 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -1099,6 +1099,34 @@ amdgpu_vm_tlb_flush(struct amdgpu_vm_update_params *params,
>>  	}
>>  }
>>  
>> +/**
>> + * amdgpu_vm_addr_contiguous - check if two DMA addresses are contiguous
>> + *
>> + * @adev: amdgpu_device pointer
>> + * @addr: current DMA address
>> + * @addr_next: next DMA address to check against
>> + * @contiguous: current contiguity state of the range being built
>> + *
>> + * Check whether @addr and @addr_next are physically contiguous. On APU
>> + * platforms with multiple NUMA nodes (e.g. MI300A), a NUMA node boundary
>> + * also breaks contiguity so that each contiguous batch stays within a
>> + * single NUMA node for correct MTYPE override selection.
>> + *
>> + * Returns:
>> + * true if @addr_next continues the current contiguous range, false otherwise.
>> + */
>
> We can use pfn_to_nid or page_to_nid to get which NUMA node (id) the
> backing memory is on. pfn_to_nid uses the pfn from the physical
> address, but you use dma_addr_t, which is device dependent; it is not
> always the same as the physical address of RAM.
>
> ttm_tt also has:
>
> 	/** @pages: Array of pages backing the data. */
> 	struct page **pages;
>
> I think using these pages to get the NUMA id via page_to_nid is more
> appropriate.

For the MI300A APU, the iommu is not enabled, so the dma_address (from
dma_map_page) is the same as the pfn (page_to_pfn); otherwise the
current override path would not work correctly with pfn_to_nid.

Regards,
Philip
> Regards,
> Xiaogang
>
>> +static inline bool amdgpu_vm_addr_contiguous(struct amdgpu_device *adev, dma_addr_t addr,
>> +					     dma_addr_t addr_next, bool contiguous)
>> +{
>> +	if (!adev->gmc.is_app_apu || !page_is_ram(addr >> PAGE_SHIFT))
>> +		return (addr + PAGE_SIZE) == addr_next;
>> +
>> +	if (pfn_to_nid(addr >> PAGE_SHIFT) != pfn_to_nid(addr_next >> PAGE_SHIFT))
>> +		return !contiguous;
>> +
>> +	return (addr + PAGE_SIZE) == addr_next;
>> +}
>> +
>>  /**
>>   * amdgpu_vm_update_range - update a range in the vm page table
>>   *
>> @@ -1198,22 +1226,26 @@ int amdgpu_vm_update_range(struct amdgpu_device *adev, struct amdgpu_vm *vm,
>>  			uint64_t pfn = cursor.start >> PAGE_SHIFT;
>>  			uint64_t count;
>>  
>> -			contiguous = pages_addr[pfn + 1] ==
>> -				pages_addr[pfn] + PAGE_SIZE;
>> +			contiguous = amdgpu_vm_addr_contiguous(adev,
>> +							       pages_addr[pfn],
>> +							       pages_addr[pfn + 1],
>> +							       contiguous);
>>  
>> -			tmp = num_entries /
>> -				AMDGPU_GPU_PAGES_IN_CPU_PAGE;
>> +			tmp = num_entries / AMDGPU_GPU_PAGES_IN_CPU_PAGE;
>>  			for (count = 2; count < tmp; ++count) {
>>  				uint64_t idx = pfn + count;
>>  
>> -				if (contiguous != (pages_addr[idx] ==
>> -				    pages_addr[idx - 1] + PAGE_SIZE))
>> +				if (contiguous != amdgpu_vm_addr_contiguous(adev,
>> +							pages_addr[idx - 1],
>> +							pages_addr[idx],
>> +							contiguous))
>>  					break;
>>  			}
>> +
>>  			if (!contiguous)
>>  				count--;
>> -			num_entries = count *
>> -				AMDGPU_GPU_PAGES_IN_CPU_PAGE;
>> +
>> +			num_entries = count * AMDGPU_GPU_PAGES_IN_CPU_PAGE;
>>  		}
>>  
>>  		if (!contiguous) {
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
>> index 31a437ce9570..9e1607fb3b2e 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
>> @@ -708,13 +708,19 @@ static void amdgpu_vm_pte_update_flags(struct amdgpu_vm_update_params *params,
>>  		amdgpu_vm_pte_update_noretry_flags(adev, &flags);
>>  
>>  	/* APUs mapping system memory may need different MTYPEs on different
>> -	 * NUMA nodes. Only do this for contiguous ranges that can be assumed
>> -	 * to be on the same NUMA node.
>> +	 * NUMA nodes. Both contiguous and non-contiguous ranges are handled
>> +	 * since amdgpu_vm_update_range ensures updates don't span NUMA
>> +	 * node boundaries.
>>  	 */
>>  	if ((flags & AMDGPU_PTE_SYSTEM) && (adev->flags & AMD_IS_APU) &&
>>  	    adev->gmc.gmc_funcs->override_vm_pte_flags &&
>> -	    num_possible_nodes() > 1 && !params->pages_addr && params->allow_override)
>> -		amdgpu_gmc_override_vm_pte_flags(adev, params->vm, addr, &flags);
>> +	    num_possible_nodes() > 1 && params->allow_override) {
>> +		if (params->pages_addr)
>> +			amdgpu_gmc_override_vm_pte_flags(adev, params->vm,
>> +				params->pages_addr[addr >> PAGE_SHIFT], &flags);
>> +		else
>> +			amdgpu_gmc_override_vm_pte_flags(adev, params->vm, addr, &flags);
>> +	}
>>  
>>  	params->vm->update_funcs->update(params, pt, pe, addr, count, incr,
>>  					 flags);
