MADV_COLLAPSE on file-backed mappings fails with -EINVAL when text pages are dirty. This affects scenarios such as package or container updates, or executing a binary immediately after writing it.
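
For illustration only, here is a minimal userspace sketch of a caller that copes with this failure by retrying. The helper names collapse_should_retry() and collapse_with_retry(), the retry_einval switch, and the 100 ms back-off are hypothetical and not part of this series. It assumes EAGAIN is the retry signal (as this series proposes for dirty/writeback pages) and optionally treats EINVAL as retryable to cover kernels without the change.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25 /* value from the Linux uapi headers (6.1+) */
#endif

/*
 * Hypothetical helper: decide whether a failed MADV_COLLAPSE is worth
 * retrying. EAGAIN is the retry signal proposed for dirty/writeback
 * pages; retry_einval covers kernels that report that case as EINVAL.
 */
static bool collapse_should_retry(int err, bool retry_einval)
{
	if (err == EAGAIN)
		return true;
	return retry_einval && err == EINVAL;
}

/*
 * Hypothetical helper: call MADV_COLLAPSE up to max_tries times.
 * Returns 0 on success, or -errno of the last failure
 * (-EINVAL if max_tries is not positive).
 */
static int collapse_with_retry(void *addr, size_t len, int max_tries,
			       bool retry_einval)
{
	/* Arbitrary 100 ms back-off to give writeback time to complete. */
	const struct timespec delay = { .tv_sec = 0, .tv_nsec = 100000000L };
	int err = EINVAL;

	for (int i = 0; i < max_tries; i++) {
		if (madvise(addr, len, MADV_COLLAPSE) == 0)
			return 0;
		err = errno;
		if (!collapse_should_retry(err, retry_einval))
			break;
		nanosleep(&delay, NULL);
	}
	return -err;
}
```

A caller would pass the 2MB-aligned start of the mapped text range and its length, for example collapse_with_retry(text_start, text_len, 3, true), where text_start and text_len are placeholders for values the caller already has.
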

The issue is that collapse_file() triggers async writeback and returns
SCAN_FAIL (which maps to -EINVAL), expecting khugepaged to revisit the
range later. But MADV_COLLAPSE is synchronous, and userspace expects
either immediate success or a clear retry signal.

Reproduction:
 - Compile or copy a 2MB-aligned executable to an XFS/ext4 filesystem
 - Call MADV_COLLAPSE on the .text section
 - The first call fails with -EINVAL (text pages are dirty from the copy)
 - The second call succeeds (async writeback has completed)

Issue Report:
https://lore.kernel.org/all/[email protected]

Patch applies cleanly on mm-new
(https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/log/?h=mm-new)

Changelog:
V4:
- Rebase on mm-new
- Fix spurious blank line (Lance)

V3:
- https://lore.kernel.org/all/[email protected]
- Reordered patches: the enum definition comes first, as the retry logic
  depends on it
- Renamed SCAN_PAGE_NOT_CLEAN to SCAN_PAGE_DIRTY_OR_WRITEBACK (Dev, Lance,
  David)
- Changed writeback logic: only trigger synchronous writeback and retry if
  the initial collapse attempt failed specifically due to dirty/writeback
  pages, rather than blindly flushing all file-backed VMAs (David)
- Added proper file reference counting (get_file/fput) around the unlock
  window to prevent UAF (Lance)

V2:
- https://lore.kernel.org/all/[email protected]
- Move writeback to madvise_collapse() (better abstraction, proper
  mmap_lock handling, and VMA revalidation after I/O) (Lorenzo)
- Rename SCAN_PAGE_DIRTY to SCAN_PAGE_NOT_CLEAN and extend its use to all
  dirty/writeback folio cases that previously returned incorrect results
  (Dev)

V1:
https://lore.kernel.org/all/[email protected]

Thanks,

Shivank Garg (2):
  mm/khugepaged: map dirty/writeback pages failures to EAGAIN
  mm/khugepaged: retry with sync writeback for MADV_COLLAPSE

 include/trace/events/huge_memory.h |  3 +-
 mm/khugepaged.c                    | 48 ++++++++++++++++++++++++++++--
 2 files changed, 47 insertions(+), 4 deletions(-)

base-commit: d0a24447990a9d8212bfb3a692d59efa74ce9f86
--
2.43.0
