When a process is being killed, it might be in an uninterruptible sleep,
which leads to an unpredictable delay in reclaiming its memory. In
low-memory situations, when it is important to free memory quickly, such
a delay is problematic. The kernel solves this problem with the
oom-reaper thread, which performs memory reclaim even when the victim
process is not runnable. Userspace currently lacks such a mechanism; the
need and potential solutions were discussed before (see links below).

This patchset provides a mechanism to perform memory reclaim of an
external process using process_madvise(MADV_DONTNEED). The chosen
mechanism is the result of the latest discussion at [4].

The first patch adds the PMADV_FLAG_RANGE flag, which lets
process_madvise operate on large address ranges spanning multiple VMAs.
Currently the only supported range is the entire memory of a process.
This keeps things simple, and it is the only real use case we currently
know of. In the future this can be extended to support other large
ranges; one way to do that is suggested in [5].

The second patch enables MADV_DONTNEED behavior for process_madvise to
perform memory reclaim of an external process.
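
A minimal userspace sketch of the intended flow, assuming a kernel with
both patches applied. The PMADV_FLAG_RANGE value used here is a
placeholder, and passing a NULL vector with zero length for the
whole-process case is an assumption rather than something stated in this
cover letter. The helper names are hypothetical. On a kernel without
these patches the process_madvise call is expected to fail.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Fallback syscall numbers (generic table) for older libc headers. */
#ifndef SYS_pidfd_send_signal
#define SYS_pidfd_send_signal 424
#endif
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif
#ifndef SYS_process_madvise
#define SYS_process_madvise 440
#endif

/* ASSUMPTION: placeholder value. The real definition would come from
 * the uapi headers touched by the first patch. */
#ifndef PMADV_FLAG_RANGE
#define PMADV_FLAG_RANGE 0x1
#endif

/* Thin wrapper: obtain a pidfd for the target process. */
static int pidfd_open_raw(pid_t pid)
{
	return (int)syscall(SYS_pidfd_open, pid, 0U);
}

/* Thin wrapper around the process_madvise syscall. */
static long process_madvise_raw(int pidfd, const struct iovec *vec,
				size_t vlen, int advice, unsigned int flags)
{
	return syscall(SYS_process_madvise, pidfd, vec, vlen, advice, flags);
}

/*
 * Hypothetical helper: kill the victim, then ask the kernel to drop its
 * memory without waiting for the victim to run again.
 * Returns 0 on success, -1 on failure with errno set.
 */
static int kill_and_reap(pid_t pid)
{
	int saved_errno;
	long ret;
	int pidfd = pidfd_open_raw(pid);

	if (pidfd < 0)
		return -1;

	ret = syscall(SYS_pidfd_send_signal, pidfd, SIGKILL, NULL, 0U);
	if (ret == 0) {
		/* ASSUMPTION: with PMADV_FLAG_RANGE the whole address
		 * space is targeted, so no iovec is passed. */
		ret = process_madvise_raw(pidfd, NULL, 0, MADV_DONTNEED,
					  PMADV_FLAG_RANGE);
	}

	/* Preserve the error from the calls above across close(). */
	saved_errno = errno;
	close(pidfd);
	errno = saved_errno;
	return ret < 0 ? -1 : 0;
}
```

A low-memory killer daemon could call kill_and_reap(victim_pid) in place
of a plain kill(), and fall back to waiting for the victim to exit if the
call reports an error.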

1. https://patchwork.kernel.org/cover/10894999
2. https://lwn.net/Articles/787217
3. https://lore.kernel.org/linux-api/cajucfpgz1kpm3g1gzh+09z7aowkg05qsammisj7h5mdmrrr...@mail.gmail.com
4. https://lkml.org/lkml/2020/11/13/849
5. https://lkml.org/lkml/2020/11/18/1076

Suren Baghdasaryan (2):
  mm/madvise: allow process_madvise operations on entire memory range
  mm/madvise: add process_madvise MADV_DONTNEED support

 arch/alpha/include/uapi/asm/mman.h           |  4 +
 arch/mips/include/uapi/asm/mman.h            |  4 +
 arch/parisc/include/uapi/asm/mman.h          |  4 +
 arch/xtensa/include/uapi/asm/mman.h          |  4 +
 fs/io_uring.c                                |  2 +-
 include/linux/mm.h                           |  3 +-
 include/uapi/asm-generic/mman-common.h       |  4 +
 mm/madvise.c                                 | 81 ++++++++++++++++++--
 tools/include/uapi/asm-generic/mman-common.h |  4 +
 9 files changed, 101 insertions(+), 9 deletions(-)

--
2.29.2.454.gaff20da3a2-goog