On 14 May 2025, at 13:28, David Hildenbrand wrote:

>>>
>>> Note that PageOffline() is a bit confusing because it's "Memory block
>>> online but page is logically offline (e.g., has a memmap that can be
>>> touched, but the page content should not be touched)".
>>
>> So PageOffline() is before memory block offline, which is the first phase of
>> memory hotunplug.
>
> Yes.
>
>>
>>>
>>> (memory block offline -> all pages offline and have effectively no state
>>> because the memmap is stale)
>>
>> What do you mean by memmap is stale? When a memory block is offline, memmap
>> is still present, so pfn scanner can see these pages. pfn scanner checks
>> memmap to know that it should not touch these pages, right?
>
> See pfn_to_online_page() for exactly that use case.
>
> For an offline memory section (either because it was just added or because it
> was just offlined), the memmap is assumed to contain garbage and should not
> be touched.
>
> See remove_pfn_range_from_zone() -> page_init_poison().
>
>>
>>>
>>>> removed from page allocator.
>>>
>>> Usually, all pages are freed back to the buddy (isolated pageblock -> put
>>> onto the isolated list). Memory offlining code can then simply grab these
>>> "free" pages from the buddy -- no PageOffline involved.
>>>
>>> If something fails during memory offlining, these isolated pages are simply
>>> put back on the appropriate migratetype list and become ordinary free pages
>>> that can be allocated immediately.
>>
>> I am familiar with this part. Then, when is PageOffline used?
>>
>> From the comment in page-flags.h, I see two examples: inflated pages by
>> balloon driver and not onlined pages when onlining the section. These are
>> two different operations: 1) inflated pages are going to be offline,
>> 2) not onlined pages are going to be online.
>> But you mentioned above that memory offlining code does not involve
>> PageOffline, so inflating pages by balloon driver is not part of memory
>> offlining code, but a different way of offlining pages. Am I getting it
>> right?
>
> Yes. PageOffline means logically offline, for whatever reason someone decides
> to turn pages logically offline.
>
> Memory ballooning and virtio-mem are two users, there are more.
>
>>
>> I read a little bit more on memory ballooning and virtio-mem and understand
>> that memory ballooning still keeps the inflated page but guest cannot
>> allocate and use it, whereas virtio-mem and memory hotunplug remove the page
>> from Linux completely (i.e., Linux no longer sees the memory).
>
> In virtio-mem terms, they are considered "fake offline" -- memory behaves as
> if it would never have been onlined, but there is a memmap for it. Like a
> (current) memory hole.
>
>>
>> It seems that I am mixing memory offlining and memory hotunplug. IIUC,
>> memory offlining means no one can allocate and use the offlined memory, but
>> Linux still sees it; memory hotunplug means Linux no longer sees it (no
>> related memmap and other metadata). Am I getting it right?
>
> The doc has this "Phases of Memory Hotplug" description, where it is roughly
> divided into that, yes.
>
>>
>>>
>>> Some PageOffline pages can be migrated using the non-folio migration: this
>>> is done for memory ballooning (memory compaction). As they get migrated,
>>> they are freed back to the buddy, PageOffline() is cleared -- they become
>>> PageBuddy() -- and the above applies.
>>
>> After a PageOffline page is migrated, the destination page becomes
>> PageOffline, right?
>> OK, I see it in balloon_page_insert().
>
> Yes.
>
>>
>>>
>>> Other PageOffline pages can be skipped during memory offlining (virtio-mem
>>> use case, what we are doing here).
>>> We don't want them to ever go through the
>>> buddy, especially because if memory offlining fails they must definitely
>>> not be treated like free pages that can be allocated immediately.
>>
>> What do you mean by "skipped during memory offlining"? Are you implying when
>> virtio-mem is offlining some pages by marking them PageOffline and
>> PG_offline_skippable, someone else can do memory offlining in parallel?
>
> It could happen (e.g., manually offline a Linux memory block using sysfs),
> but that is not the primary use case.
>
> virtio-mem unplugs memory in the following sequence:
>
> 1) alloc_contig_range() small blocks (e.g., 2 MiB)
>
> 2) Report the blocks to the hypervisor
>
> 3) Mark them fake-offline: PageOffline (+ PageOfflineSkippable now)
>
> Once all small blocks that comprise a Linux memory block (e.g., 128 MiB) are
> fake-offline, offline the memory block and remove the memory using
> offline_and_remove_memory().
>
> In that operation -- offline_and_remove_memory() -- memory offlining code
> must be able to skip these PageOffline pages, otherwise
> offline_and_remove_memory() will just fail, saying that there are unmovable
> pages in there.
>
>>
>>>
>>>> Next, the page is removed from its memory
>>>> block. When will PG_offline_skippable be used? The second phase when
>>>> the page is being removed from its memory block?
>>>
>>> PG_offline_skippable is used during memory offlining, while we look for any
>>> pages that are not PageBuddy (... or hwpoisoned ...), to migrate them off
>>> the memory so they get converted to PageBuddy.
>>>
>>> PageOffline + PageOfflineSkippable are checked in that phase, such that
>>> they don't require any migration.
>>
>> Hmm, if you just do not want to get PageOffline migrated, not setting it
>> __PageMovable would work, right? PageOffline + __PageMovable is used by
>> ballooning, as these inflated pages can be migrated. PageOffline without
>> __PageMovable should be virtio-mem. Am I missing any other user?
>
> Sure.
> Just imagine !CONFIG_BALLOON_COMPACTION.
>
> In summary, we have
>
> 1) Migratable PageOffline pages (balloon compaction)
>
> 2) Unmigratable PageOffline pages (e.g., XEN balloon, hyper-v balloon,
> memtrace, in the future likely some memory holes, ... )
>
> 3) Skippable PageOffline pages (virtio-mem)
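
The three classes summarized above can be modeled with a small userspace C
sketch. This is not kernel code and not how the offlining scan is actually
implemented: struct model_page, enum scan_action, classify() and
can_offline_block() are hypothetical names made up for illustration, and the
order of the checks is an assumption. It only shows why a separate
"skippable" marker is needed on top of "not movable".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified page state; NOT the kernel's struct page. */
struct model_page {
	bool buddy;             /* free in the buddy (models PageBuddy) */
	bool offline;           /* logically offline (models PageOffline) */
	bool movable;           /* migratable, e.g. balloon compaction */
	bool offline_skippable; /* models PageOfflineSkippable */
};

/* What a (modeled) memory offlining scan does with one page. */
enum scan_action {
	SCAN_TAKE_FREE, /* already free: grab it from the buddy */
	SCAN_SKIP,      /* case 3: skippable PageOffline, leave as-is */
	SCAN_MIGRATE,   /* case 1 and ordinary movable pages */
	SCAN_FAIL,      /* case 2: unmovable, offlining must fail */
};

static enum scan_action classify(const struct model_page *p)
{
	if (p->buddy)
		return SCAN_TAKE_FREE;
	if (p->offline && p->offline_skippable)
		return SCAN_SKIP;
	if (p->movable)
		return SCAN_MIGRATE;
	/* Not free, not skippable, not movable: blocks offlining. */
	return SCAN_FAIL;
}

/* A modeled memory block can be offlined only if no page blocks it. */
static bool can_offline_block(const struct model_page *pages, size_t n)
{
	assert(pages != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (classify(&pages[i]) == SCAN_FAIL)
			return false;
	return true;
}
```

In this model, a PageOffline page with neither movable nor offline_skippable
set (case 2, e.g. a XEN balloon page) makes can_offline_block() return false,
while the same page with offline_skippable set (case 3) is simply passed
over. Merely leaving a page unmovable therefore cannot stand in for the
skippable marker.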

Thank you for all the explanation. Now I understand how memory offline and
memory hotunplug work and shall begin to check the patches. :)

--
Best Regards,
Yan, Zi