Let's document what ZONE_MOVABLE means, how it's used, and which special
cases we have regarding unmovable pages (memory offlining vs. migration /
allocations).

Cc: Andrew Morton <a...@linux-foundation.org>
Cc: Michal Hocko <mho...@suse.com>
Cc: Michael S. Tsirkin <m...@redhat.com>
Cc: Mike Kravetz <mike.krav...@oracle.com>
Cc: Mike Rapoport <r...@kernel.org>
Cc: Pankaj Gupta <pankaj.gupta.li...@gmail.com>
Cc: Baoquan He <b...@redhat.com>
Signed-off-by: David Hildenbrand <da...@redhat.com>
---
 include/linux/mmzone.h | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index f6f884970511d..b8c49b2aff684 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -372,6 +372,40 @@ enum zone_type {
         */
        ZONE_HIGHMEM,
 #endif
+       /*
+        * ZONE_MOVABLE is similar to ZONE_NORMAL, except that it *primarily*
+        * only contains movable pages. Main use cases are to make memory
+        * offlining more likely to succeed, and to locally limit unmovable
+        * allocations - e.g., to increase the number of THP/huge pages.
+        * Notable special cases:
+        *
+        * 1. Pinned pages: (Long-term) pinning of movable pages might
+        *    essentially turn such pages unmovable. Memory offlining might
+        *    retry for a long time.
+        * 2. memblock allocations: kernelcore/movablecore setups might create
+        *    situations where ZONE_MOVABLE contains unmovable allocations
+        *    after boot. Memory offlining and allocations fail early.
+        * 3. Memory holes: Such pages cannot be allocated. Applies only to
+        *    boot memory, not hotplugged memory. Memory offlining and
+        *    allocations fail early.
+        * 4. PG_hwpoison pages: While poisoned pages can be skipped during
+        *    memory offlining, such pages cannot be allocated.
+        * 5. Unmovable PG_offline pages: In paravirtualized environments,
+        *    hotplugged memory blocks might only partially be managed by the
+        *    buddy (e.g., via XEN-balloon, Hyper-V balloon, virtio-mem). The
+        *    parts not managed by the buddy are unmovable PG_offline pages. In
+        *    some cases (virtio-mem), such pages can be skipped during
+        *    memory offlining but cannot be moved/allocated. These
+        *    techniques might use alloc_contig_range() to hide previously
+        *    exposed pages from the buddy again (e.g., to implement some sort
+        *    of memory unplug in virtio-mem).
+        *
+        * In general, no unmovable allocations that degrade memory offlining
+        * should end up in ZONE_MOVABLE. Allocators (like alloc_contig_range())
+        * have to expect that migrating pages in ZONE_MOVABLE can fail, even
+        * if has_unmovable_pages() states that there are no unmovable pages
+        * (there can be false negatives).
+        */
        ZONE_MOVABLE,
 #ifdef CONFIG_ZONE_DEVICE
        ZONE_DEVICE,
-- 
2.26.2
