On Mon, 2023-10-09 at 17:04 +0200, David Hildenbrand wrote: > On 07.10.23 10:55, Huang, Ying wrote: > > Vishal Verma <vishal.l.ve...@intel.com> writes: > > > > > @@ -2167,47 +2221,28 @@ static int __ref try_remove_memory(u64 start, u64 > > > size) > > > if (rc) > > > return rc; > > > > > > + mem_hotplug_begin(); > > > + > > > /* > > > - * We only support removing memory added with > > > MHP_MEMMAP_ON_MEMORY in > > > - * the same granularity it was added - a single memory block. > > > + * For memmap_on_memory, the altmaps could have been added on > > > + * a per-memblock basis. Loop through the entire range if so, > > > + * and remove each memblock and its altmap. > > > */ > > > if (mhp_memmap_on_memory()) { > > > > IIUC, even if mhp_memmap_on_memory() returns true, it's still possible > > that the memmap is put in DRAM after [2/2]. So that, > > arch_remove_memory() are called for each memory block unnecessarily. Can > > we detect this (via altmap?) and call remove_memory_block_and_altmap() > > for the whole range? > > Good point. We should handle memblock-per-memblock onny if we have to > handle the altmap. Otherwise, just call a separate function that doesn't > care about -- e.g., called remove_memory_blocks_no_altmap(). > > We could simply walk all memory blocks and make sure either all have an > altmap or none has an altmap. If there is a mix, we should bail out with > WARN_ON_ONCE(). > Ok I think I follow - based on both of these threads, here's my understanding in an incremental diff from the original patches (may not apply directly as I've already committed changes from the other bits of feedback - but this should provide an idea of the direction) -
--- diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 507291e44c0b..30addcb063b4 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -2201,6 +2201,40 @@ static void __ref remove_memory_block_and_altmap(u64 start, u64 size) } } +static bool memblocks_have_altmaps(u64 start, u64 size) +{ + unsigned long memblock_size = memory_block_size_bytes(); + u64 num_altmaps = 0, num_no_altmaps = 0; + struct memory_block *mem; + u64 cur_start; + int rc = 0; + + if (!mhp_memmap_on_memory()) + return false; + + for (cur_start = start; cur_start < start + size; + cur_start += memblock_size) { + if (walk_memory_blocks(cur_start, memblock_size, &mem, + test_has_altmap_cb)) + num_altmaps++; + else + num_no_altmaps++; + } + + if (!num_altmaps && num_no_altmaps > 0) + return false; + + if (!num_no_altmaps && num_altmaps > 0) + return true; + + /* + * If there is a mix of memblocks with and without altmaps, + * something has gone very wrong. WARN and bail. + */ + WARN_ONCE(1, "memblocks have a mix of missing and present altmaps"); + return false; +} + static int __ref try_remove_memory(u64 start, u64 size) { int rc, nid = NUMA_NO_NODE; @@ -2230,7 +2264,7 @@ static int __ref try_remove_memory(u64 start, u64 size) * a per-memblock basis. Loop through the entire range if so, * and remove each memblock and its altmap. */ - if (mhp_memmap_on_memory()) { + if (mhp_memmap_on_memory() && memblocks_have_altmaps(start, size)) { unsigned long memblock_size = memory_block_size_bytes(); u64 cur_start; @@ -2239,7 +2273,8 @@ static int __ref try_remove_memory(u64 start, u64 size) remove_memory_block_and_altmap(cur_start, memblock_size); } else { - remove_memory_block_and_altmap(start, size); + remove_memory_block_devices(start, size); + arch_remove_memory(start, size, NULL); } if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) {