On 12.06.25 13:37, Baolin Wang wrote:
On 2025/6/12 18:08, David Hildenbrand wrote:
On 12.06.25 05:54, Baolin Wang wrote:
When running the khugepaged selftest for shmem (./khugepaged all:shmem),
Hmm, this combination is not run automatically through run_tests.sh,
right? IIUC, it only runs "./khugepaged" which tests anon only ...
Should we add it there? Then I would probably have noticed that myself
earlier :)
Yes, see patch 2.
Yes, was pleasantly surprised when I found that :)
I encountered the following test failures:
"
Run test: collapse_full (khugepaged:shmem)
Collapse multiple fully populated PTE table.... Fail
...
Run test: collapse_single_pte_entry (khugepaged:shmem)
Collapse PTE table with single PTE entry present.... Fail
...
Run test: collapse_full_of_compound (khugepaged:shmem)
Allocate huge page... OK
Split huge page leaving single PTE page table full of compound
pages... OK
Collapse PTE table full of compound pages.... Fail
"
The reason for the failure is that, it will set MADV_NOHUGEPAGE to
prevent
khugepaged from continuing to scan shmem VMA after khugepaged finishes
scanning in the wait_for_scan() function. Moreover, shmem requires a
refault
to establish PMD mappings.
However, after commit 2b0f922323cc, PMD mappings are prevented if the
VMA is
set with MADV_NOHUGEPAGE flag, so shmem cannot establish PMD mappings
during
refault.
Right. It's always problematic when we have some contradicting
information in the VMA vs. pagecache.
To fix this issue, we can set the MADV_NOHUGEPAGE flag after the shmem
refault.
With this fix, the shmem test case passes.
Fixes: 2b0f922323cc ("mm: don't install PMD mappings when THPs are
disabled by the hw/process/vma")
Signed-off-by: Baolin Wang <baolin.w...@linux.alibaba.com>
---
tools/testing/selftests/mm/khugepaged.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/tools/testing/selftests/mm/khugepaged.c
b/tools/testing/selftests/mm/khugepaged.c
index 8a4d34cce36b..d462f62d8116 100644
--- a/tools/testing/selftests/mm/khugepaged.c
+++ b/tools/testing/selftests/mm/khugepaged.c
@@ -561,8 +561,6 @@ static bool wait_for_scan(const char *msg, char
*p, int nr_hpages,
usleep(TICK);
}
- madvise(p, nr_hpages * hpage_pmd_size, MADV_NOHUGEPAGE);
-
return timeout == -1;
}
@@ -585,6 +583,7 @@ static void khugepaged_collapse(const char *msg,
char *p, int nr_hpages,
if (ops != &__anon_ops)
ops->fault(p, 0, nr_hpages * hpage_pmd_size);
+ madvise(p, nr_hpages * hpage_pmd_size, MADV_NOHUGEPAGE);
if (ops->check_huge(p, expect ? nr_hpages : 0))
success("OK");
else
It's a shame we have this weird interface: there is no way we can clear
VM_HUGEPAGE without setting VM_NOHUGEPAGE :(
Right.
But, do we even care about setting MADV_NOHUGEPAGE at all? IIUC, we'll
almost immediately later call cleanup_area() where we munmap(), right?
I tested removing the MADV_NOHUGEPAGE setting, and the khugepaged test
cases all passed.
However, a potential impact of removing MADV_NOHUGEPAGE is that,
khugepaged might report 'timeout', but check_huge() would still report
'success' (assuming khugepaged tries to scan the VMA and successfully
collapses it after the timeout). Such test result could be confusing.
If we run into the timeout, we return "true" from wait_for_scan(), and
in khugepaged_collapse() returns immediately.
So we wouldn't issue another check_huge() call in khugepaged_collapse().
Did I miss something?
--
Cheers,
David / dhildenb