Re: [PATCH v2] drm/amdkfd: Fix GPU mappings for APU after prefetch

Chen, Xiaogang Fri, 31 Oct 2025 07:53:51 -0700


On 10/31/2025 8:41 AM, Philip Yang wrote:

On 2025-10-30 18:12, Chen, Xiaogang wrote:
On 10/30/2025 3:14 PM, Philip Yang wrote:
On 2025-10-30 10:20, Alex Deucher wrote:
On Wed, Oct 29, 2025 at 9:36 PM Harish Kasiviswanathan
<[email protected]> wrote:
Fix the following corner case:-
Consider a 2M huge page SVM allocation, followed by prefetch call for
the first 4K page. The whole range is initially mapped with single PTE.
After the prefetch, this range gets split to first page + rest of the
pages. Currently, the first page mapping is not updated on MI300A (APU)
since page hasn't migrated. However, after range split PTE mapping is not
valid.
Fix this by forcing page table update for the whole range when prefetch
is called. Calling prefetch on APU doesn't improve performance. If anything,
it deteriorates. However, functionality has to be supported.
v2: Use apu_prefer_gtt as this issue doesn't apply to APUs with carveout
VRAM
apu_prefer_gtt is used by small APUs as well.  It depends on how much
VRAM vs GTT is available on the system.

         if (adev->flags & AMD_IS_APU) {
                 if (adev->gmc.real_vram_size < gtt_size)
                         adev->apu_prefer_gtt = true;
         }
yes, if apu_prefer_gtt is true, then there is no page migration because best_prefetch_location is always CPU. A small APU will have the same issue if KFD is used: the page table of the range split by prefetch is not updated because there is no migration. This patch can fix the issue on both small APU and APP APU.
Reviewed-by: Philip Yang <[email protected]>
Is the case like this: the svm range got split; the pages are not migrated and the attributes for the pages are not changed. Then why do we need to update the PTEs, as page physical locations and attributes are not changed? Basically it used a huge page PTE, and now you split the PTE into smaller ranges. Or did I misunderstand the scenario?
yes, the range is mapped as a huge page, using the 2MB PDE entry as a PTE. After splitting, the tail range mapping is updated; it is no longer a 2MB-aligned huge page, so a pt bo is allocated for the PDE entry and the tail PTEs are updated, while the head PTEs are invalid entries.
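
To make the sequence above concrete, here is a toy userspace model of a single 2MB PDE slot. It is only a sketch under stated assumptions: struct toy_pde, toy_map() and toy_is_mapped() are hypothetical names invented for illustration and are not amdgpu or amdkfd code. It shows that when only the tail sub-range is rewritten with 4K PTEs, the head page is left without a valid entry until its sub-range is updated too.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define PAGES_PER_2M 512u /* 2MB / 4KB */

/* Toy model of one 2MB-aligned PDE slot (hypothetical, not a driver struct). */
struct toy_pde {
	bool huge;                    /* PDE itself is used as a 2MB PTE */
	bool has_pt;                  /* a page table hangs below the PDE */
	bool pte_valid[PAGES_PER_2M]; /* 4K PTEs, meaningful only if has_pt */
};

/* Map pages [first, last] (inclusive) of the 2MB slot. */
void toy_map(struct toy_pde *pde, unsigned int first, unsigned int last)
{
	unsigned int i;

	assert(first <= last && last < PAGES_PER_2M);

	if (first == 0 && last == PAGES_PER_2M - 1) {
		/* Full, aligned 2MB: one huge entry covers every page. */
		pde->huge = true;
		pde->has_pt = false;
		return;
	}

	/*
	 * Partial update: the huge entry is replaced by a new page table
	 * whose PTEs all start out invalid; only [first, last] are written.
	 */
	if (!pde->has_pt) {
		pde->huge = false;
		pde->has_pt = true;
		memset(pde->pte_valid, 0, sizeof(pde->pte_valid));
	}
	for (i = first; i <= last; i++)
		pde->pte_valid[i] = true;
}

/* Report whether a 4K page inside the slot currently has a valid mapping. */
bool toy_is_mapped(const struct toy_pde *pde, unsigned int page)
{
	assert(page < PAGES_PER_2M);
	return pde->huge || (pde->has_pt && pde->pte_valid[page]);
}
```

In this model, calling toy_map(&pde, 0, 511) and then toy_map(&pde, 1, 511) leaves page 0 unmapped; a further toy_map(&pde, 0, 0), which stands in for the forced update of the head sub-range, maps it again.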

My concern is that since it is an APU, no page got migrated and access attributes are not changed; just the svm range got split. Then neither the tail nor the head sub-range would get its mapping updated: the original 2MB PTE mapping can still be used for both sub-ranges, which gives better performance. This patch spoils the 2MB PTE vm mapping, which is not necessary for APU.


Regards

Xiaogang

Regards,

Philip
Regards

Xiaogang
Alex
Suggested-by: Philip Yang <[email protected]>
Signed-off-by: Harish Kasiviswanathan <[email protected]>
---
  drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 7 +++++++
  1 file changed, 7 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index c30dfb8ec236..76cab1c8aaa2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -766,14 +766,21 @@ svm_range_apply_attrs(struct kfd_process *p, struct svm_range *prange,
  {
         uint32_t i;
         int gpuidx;
+       struct kfd_node *node;

         for (i = 0; i < nattr; i++) {
                 switch (attrs[i].type) {
                 case KFD_IOCTL_SVM_ATTR_PREFERRED_LOC:
                         prange->preferred_loc = attrs[i].value;
+                       node = svm_range_get_node_by_id(prange, attrs[i].value);
+                       if (node && node->adev->apu_prefer_gtt && !p->xnack_enabled)
+                               *update_mapping = true;
                         break;
                 case KFD_IOCTL_SVM_ATTR_PREFETCH_LOC:
                         prange->prefetch_loc = attrs[i].value;
+                       node = svm_range_get_node_by_id(prange, attrs[i].value);
+                       if (node && node->adev->apu_prefer_gtt && !p->xnack_enabled)
+                               *update_mapping = true;
                         break;
                 case KFD_IOCTL_SVM_ATTR_ACCESS:
                 case KFD_IOCTL_SVM_ATTR_ACCESS_IN_PLACE:
--
2.34.1
