Note that PageOffline() is a bit confusing because it's "Memory block online but 
page is logically offline (e.g., has a memmap that can be touched, but the page content 
should not be touched)".

So PageOffline() comes before memory block offlining, which is the first phase of
memory hotunplug?

Yes.



(memory block offline -> all pages offline and have effectively no state 
because the memmap is stale)

What do you mean by "memmap is stale"? When a memory block is offline, the memmap is
still present, so a pfn scanner can see these pages. The pfn scanner checks the memmap
to know that it should not touch these pages, right?

See pfn_to_online_page() for exactly that use case.

For an offline memory section (either because it was just added or because it was just offlined), the memmap is assumed to contain garbage and should not be touched.

See remove_pfn_range_from_zone() -> page_init_poison().
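
To make the rule above concrete, here is a rough userspace sketch, not kernel code. It only models the idea behind pfn_to_online_page(): a scanner first asks whether the section holding a pfn is online, and only then looks at the per-page metadata. All names (toy_section, toy_pfn_to_online_page(), toy_scan()) and the tiny section size are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGES_PER_SECTION 4   /* hypothetical; real sections are far larger */
#define TOY_NR_SECTIONS       2

struct toy_page { unsigned long flags; };

struct toy_section {
    bool online;                                   /* may the memmap be trusted? */
    struct toy_page memmap[TOY_PAGES_PER_SECTION];
};

static struct toy_section toy_sections[TOY_NR_SECTIONS];

/* Return the page for pfn only if its section is online, else NULL. */
static struct toy_page *toy_pfn_to_online_page(unsigned long pfn)
{
    unsigned long nr = pfn / TOY_PAGES_PER_SECTION;

    if (nr >= TOY_NR_SECTIONS || !toy_sections[nr].online)
        return NULL;   /* offline section: memmap may be garbage, do not touch */
    return &toy_sections[nr].memmap[pfn % TOY_PAGES_PER_SECTION];
}

/* Count the pages in [start, end) that a scanner is allowed to inspect. */
static size_t toy_scan(unsigned long start, unsigned long end)
{
    size_t seen = 0;

    assert(start <= end);
    for (unsigned long pfn = start; pfn < end; pfn++)
        if (toy_pfn_to_online_page(pfn))
            seen++;
    return seen;
}
```

In this model the scanner never reads toy_page.flags of an offline section at all, which is the point: the online check lives outside the memmap.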



removed from page allocator.

Usually, all pages are freed back to the buddy (isolated pageblock -> put onto the 
isolated list). Memory offlining code can then simply grab these "free" pages from 
the buddy -- no PageOffline involved.

If something fails during memory offlining, these isolated pages are simply put 
back on the appropriate migratetype list and become ordinary free pages that 
can be allocated immediately.
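
As a rough illustration of the isolate/undo behaviour described above, here is a toy userspace model, not the kernel implementation. The names (toy_pageblock, toy_isolate(), toy_undo_isolation()) are hypothetical, and remembering the previous migratetype inside the pageblock is an assumption made to keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_migratetype { TOY_MT_MOVABLE, TOY_MT_UNMOVABLE, TOY_MT_ISOLATE };

struct toy_pageblock {
    enum toy_migratetype mt;     /* list the free pages currently sit on */
    enum toy_migratetype saved;  /* assumption: remembered for the undo path */
};

/* Start of offlining: free pages move to the isolated list. */
static void toy_isolate(struct toy_pageblock *b)
{
    assert(b->mt != TOY_MT_ISOLATE);
    b->saved = b->mt;
    b->mt = TOY_MT_ISOLATE;
}

/* Offlining failed: put the free pages back on their original list. */
static void toy_undo_isolation(struct toy_pageblock *b)
{
    assert(b->mt == TOY_MT_ISOLATE);
    b->mt = b->saved;
}

/* Free pages on the isolated list cannot be handed out by the allocator. */
static bool toy_allocatable(const struct toy_pageblock *b)
{
    return b->mt != TOY_MT_ISOLATE;
}
```

After toy_undo_isolation() the pages are ordinary free pages again, which is exactly why this path is not acceptable for pages that must never be allocated.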

I am familiar with this part. Then, when is PageOffline used?

From the comment in page-flags.h, I see two examples: pages inflated by the
balloon driver, and pages not onlined when onlining the section. These are two
different operations: 1) inflated pages are going to be offline, 2) not-onlined
pages are going to be online. But you mentioned above that memory offlining code
does not involve PageOffline, so pages inflated by the balloon driver are not
part of the memory offlining code, but a different way of offlining pages. Am I
getting it right?

Yes. PageOffline means logically offline, for whatever reason someone decides to turn pages logically offline.

Memory ballooning and virtio-mem are two users; there are more.


I read a little bit more on memory ballooning and virtio-mem and understand
that memory ballooning still keeps the inflated page but guest cannot allocate
and use it, whereas virtio-mem and memory hotunplug remove the page from
Linux completely (i.e., Linux no longer sees the memory).

In virtio-mem terms, they are considered "fake offline" -- memory behaves as if it would never have been onlined, but there is a memmap for it. Like a (current) memory hole.


It seems that I am mixing memory offlining and memory hotunplug. IIUC,
memory offlining means no one can allocate and use the offlined memory, but
Linux still sees it; memory hotunplug means Linux no longer sees it (no related
memmap and other metadata). Am I getting it right?

The doc has this "Phases of Memory Hotplug" description, where it is roughly divided into that, yes.



Some PageOffline pages can be migrated using the non-folio migration: this is 
done for memory ballooning (memory compaction). As they get migrated, they are 
freed back to the buddy, PageOffline() is cleared -- they become PageBuddy() -- 
and the above applies.

After a PageOffline page is migrated, the destination page becomes PageOffline, 
right?
OK, I see it in balloon_page_insert().

Yes.
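
A minimal userspace sketch of that hand-over, with made-up names (toy_state, toy_balloon_migrate()); it is only a model of the state change, not how balloon compaction is actually implemented:

```c
#include <assert.h>

enum toy_state { TOY_IN_USE, TOY_BUDDY, TOY_OFFLINE };

struct toy_page { enum toy_state state; };

/*
 * Migrate an inflated balloon page: the destination takes over the
 * "logically offline" marker, the source goes back to the free pool.
 */
static void toy_balloon_migrate(struct toy_page *src, struct toy_page *dst)
{
    assert(src->state == TOY_OFFLINE);  /* only inflated pages are handled here */
    assert(dst->state == TOY_IN_USE);   /* dst was allocated for the migration */

    dst->state = TOY_OFFLINE;  /* the role balloon_page_insert() plays in the kernel */
    src->state = TOY_BUDDY;    /* old page is freed back to the buddy */
}
```

The number of offline pages stays the same across the migration; only their location changes.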



Other PageOffline pages can be skipped during memory offlining (virtio-mem use 
case, what we are doing here). We don't want them to ever go through the buddy, 
especially because if memory offlining fails they must definitely not be 
treated like free pages that can be allocated immediately.

What do you mean by "skipped during memory offlining"? Are you implying that while
virtio-mem is offlining some pages by marking them PageOffline and
PG_offline_skippable, someone else can do memory offlining in parallel?

It could happen (e.g., manually offline a Linux memory block using sysfs), but that is not the primary use case.

virtio-mem unplugs memory in the following sequence:

1) alloc_contig_range() small blocks (e.g., 2 MiB)

2) Report the blocks to the hypervisor

3) Mark them fake-offline: PageOffline (+ PageOfflineSkippable now)

Once all small blocks that comprise a Linux memory block (e.g., 128 MiB) are fake-offline, offline the memory block and remove the memory using offline_and_remove_memory().

In that operation -- offline_and_remove_memory() -- memory offlining code must be able to skip these PageOffline pages, otherwise offline_and_remove_memory() will just fail, saying that there are unmovable pages in there.
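
The sequence above can be sketched as a toy userspace model. This is not virtio-mem or kernel code: the flag names and both functions are hypothetical, and the model deliberately ignores migration, so any page that is neither free nor skippable simply makes offlining fail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PG_BUDDY             (1u << 0)  /* free page in the allocator */
#define TOY_PG_OFFLINE           (1u << 1)  /* logically offline */
#define TOY_PG_OFFLINE_SKIPPABLE (1u << 2)  /* offlining may skip this page */

/* Step 3: mark one small, already allocated block as fake-offline. */
static void toy_fake_offline(unsigned int *flags, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(!(flags[i] & TOY_PG_BUDDY));  /* must be allocated first (step 1) */
        flags[i] = TOY_PG_OFFLINE | TOY_PG_OFFLINE_SKIPPABLE;
    }
}

/* Final step: can the whole memory block be offlined in this model? */
static bool toy_can_offline_block(const unsigned int *flags, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        bool free_page = (flags[i] & TOY_PG_BUDDY) != 0;
        bool skippable = (flags[i] & TOY_PG_OFFLINE) != 0 &&
                         (flags[i] & TOY_PG_OFFLINE_SKIPPABLE) != 0;

        if (!free_page && !skippable)
            return false;  /* would be reported as an unmovable page */
    }
    return true;
}
```

Note that a page with only the offline flag set still blocks offlining here; it is the extra skippable marker that lets the final step succeed.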



Next, the page is removed from its memory
block. When will PG_offline_skippable be used? The second phase when
the page is being removed from its memory block?

PG_offline_skippable is used during memory offlining, while we look for any 
pages that are not PageBuddy (... or hwpoisoned ...), to migrate them off the 
memory so they get converted to PageBuddy.

PageOffline + PageOfflineSkippable are checked in that phase, such that these 
pages don't require any migration.

Hmm, if you just do not want PageOffline pages to get migrated, not setting
__PageMovable on them would work, right? PageOffline + __PageMovable is used by
ballooning, as these inflated pages can be migrated. PageOffline without
__PageMovable should be virtio-mem. Am I missing any other user?

Sure. Just imagine !CONFIG_BALLOON_COMPACTION.

In summary, we have

1) Migratable PageOffline pages (balloon compaction)

2) Unmigratable PageOffline pages (e.g., XEN balloon, hyper-v balloon,
   memtrace, in the future likely some memory holes, ... )

3) Skippable PageOffline pages (virtio-mem)
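
For illustration only, the three cases can be written down as a small userspace classifier. The struct fields, the enum, and the choice to let "skippable" take precedence over "movable" are assumptions of this sketch, not something the kernel defines this way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_offline_kind {
    TOY_NOT_OFFLINE,
    TOY_MIGRATABLE,    /* 1) balloon compaction */
    TOY_UNMIGRATABLE,  /* 2) e.g., balloons without compaction */
    TOY_SKIPPABLE,     /* 3) virtio-mem */
};

struct toy_page {
    bool offline;    /* models PageOffline */
    bool movable;    /* models a page usable with non-folio migration */
    bool skippable;  /* models PageOfflineSkippable */
};

static enum toy_offline_kind toy_classify(const struct toy_page *p)
{
    assert(p != NULL);
    if (!p->offline)
        return TOY_NOT_OFFLINE;
    if (p->skippable)   /* assumption: skippable wins over movable */
        return TOY_SKIPPABLE;
    return p->movable ? TOY_MIGRATABLE : TOY_UNMIGRATABLE;
}

/* In this model only unmigratable offline pages make memory offlining fail. */
static bool toy_blocks_offlining(enum toy_offline_kind kind)
{
    return kind == TOY_UNMIGRATABLE;
}
```

Migratable pages are moved away and skippable pages are stepped over, so only the second category stands in the way of offlining a memory block.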

--
Cheers,

David / dhildenb

