On 5/11/26 20:58, Nico Pache wrote:
> Currently the collapse_huge_page function requires the mmap_read_lock to
> be held on entry, and drops it before returning. This patch moves the
> unlock into its parent caller, changing the semantics so the function
> always enters and exits with the lock dropped.
>
> In future patches, we need this expectation, as in mTHP collapse we may
> already have dropped the lock, and do not want to conditionally check for
> this by passing through the lock_dropped variable.
>
> No functional change is expected as one of the first things the
> collapse_huge_page function does is drop this lock before allocating the
> hugepage.
>
> Signed-off-by: Nico Pache <[email protected]>
> ---
> mm/khugepaged.c | 18 ++++++++++--------
> 1 file changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 27465161fa6d..37a5f6791816 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -1199,6 +1199,14 @@ static enum scan_result alloc_charge_folio(struct folio **foliop, struct mm_stru
> return SCAN_SUCCEED;
> }
>
> +/*
> + * collapse_huge_page expects the mmap_read_lock to be dropped before
> + * entering this function.

"before entering." Talking about "this function" after naming it sounds odd.
Also, there is only an "mmap_lock".

> + * The function will also always return with the lock
> + * dropped.

"collapse_huge_page expects the mmap_lock to be unlocked before entering and
will always return with the lock unlocked."

Or something simple like that?

> + * The function starts by allocation a folio, which can potentially
> + * take a long time if it involves sync compaction, and we do not need to hold
> + * the mmap_lock during that. We must recheck the vma after taking it again in
> + * write mode.
> + */

"... to avoid holding the mmap_lock while allocating a THP, as that could
trigger direct reclaim/compaction. Note that the VMA must be rechecked after
grabbing the mmap_lock again."

?

Ending up with something like

"collapse_huge_page expects the mmap_lock to be unlocked before entering and
will always return with the lock unlocked, to avoid holding the mmap_lock while
allocating a THP, as that could trigger direct reclaim/compaction. Note that
the VMA must be rechecked after grabbing the mmap_lock again."
> static enum scan_result collapse_huge_page(struct mm_struct *mm, unsigned long address,
> int referenced, int unmapped, struct collapse_control *cc)
> {
> @@ -1214,14 +1222,6 @@ static enum scan_result collapse_huge_page(struct mm_struct *mm, unsigned long a
>
> VM_BUG_ON(address & ~HPAGE_PMD_MASK);
>
> - /*
> - * Before allocating the hugepage, release the mmap_lock read lock.
> - * The allocation can take potentially a long time if it involves
> - * sync compaction, and we do not need to hold the mmap_lock during
> - * that. We will recheck the vma after taking it again in write mode.
> - */
> - mmap_read_unlock(mm);
> -
> result = alloc_charge_folio(&folio, mm, cc, HPAGE_PMD_ORDER);
> if (result != SCAN_SUCCEED)
> goto out_nolock;
> @@ -1526,6 +1526,8 @@ static enum scan_result collapse_scan_pmd(struct mm_struct *mm,
> out_unmap:
> pte_unmap_unlock(pte, ptl);
> if (result == SCAN_SUCCEED) {
> + /* collapse_huge_page expects the lock to be dropped before calling */
> + mmap_read_unlock(mm);
> result = collapse_huge_page(mm, start_addr, referenced,
> unmapped, cc);
> /* collapse_huge_page will return with the mmap_lock released */
Acked-by: David Hildenbrand (Arm) <[email protected]>
--
Cheers,
David