On 7/11/23 9:14 PM, David Hildenbrand wrote:
> On 11.07.23 17:40, Aneesh Kumar K V wrote:
>> On 7/11/23 8:56 PM, David Hildenbrand wrote:
>>> On 11.07.23 06:48, Aneesh Kumar K.V wrote:
>>>> Radix vmemmap mapping can map things correctly at the PMD level or PTE
>>>> level based on different device boundary checks. Hence we skip the
>>>> restriction that the vmemmap size be a multiple of PMD_SIZE. This also
>>>> makes the feature more widely useful, because a PMD_SIZE vmemmap area
>>>> requires a memory block size of 2GiB.
>>>>
>>>> We can also use MHP_RESERVE_PAGES_MEMMAP_ON_MEMORY so that the feature
>>>> can work with a memory block size of 256MB, using the altmap.reserve
>>>> feature to align things correctly at pageblock granularity. We can end
>>>> up losing some pages in memory with this. For example, with a 256MiB
>>>> memory block size we require 4 pages to map the vmemmap pages; in order
>>>> to align things correctly we end up adding a reserve of 28 pages, i.e.,
>>>> for every 4096 pages 28 pages get reserved.
>>>>
>>>> Signed-off-by: Aneesh Kumar K.V <aneesh.ku...@linux.ibm.com>
>>>> ---
>>>>   arch/powerpc/Kconfig                          |  1 +
>>>>   arch/powerpc/include/asm/pgtable.h            | 28 +++++++++++++++++++
>>>>   .../platforms/pseries/hotplug-memory.c        |  3 +-
>>>>   mm/memory_hotplug.c                           |  2 ++
>>>>   4 files changed, 33 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>>>> index 116d6add0bb0..f890907e5bbf 100644
>>>> --- a/arch/powerpc/Kconfig
>>>> +++ b/arch/powerpc/Kconfig
>>>> @@ -157,6 +157,7 @@ config PPC
>>>>   	select ARCH_HAS_UBSAN_SANITIZE_ALL
>>>>   	select ARCH_HAVE_NMI_SAFE_CMPXCHG
>>>>   	select ARCH_KEEP_MEMBLOCK
>>>> +	select ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE if PPC_RADIX_MMU
>>>>   	select ARCH_MIGHT_HAVE_PC_PARPORT
>>>>   	select ARCH_MIGHT_HAVE_PC_SERIO
>>>>   	select ARCH_OPTIONAL_KERNEL_RWX if ARCH_HAS_STRICT_KERNEL_RWX
>>>> diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
>>>> index 68817ea7f994..8e6c92dde6ad 100644
>>>> --- a/arch/powerpc/include/asm/pgtable.h
>>>> +++ b/arch/powerpc/include/asm/pgtable.h
>>>> @@ -169,6 +169,34 @@ static inline bool is_ioremap_addr(const void *x)
>>>>   int __meminit vmemmap_populated(unsigned long vmemmap_addr, int vmemmap_map_size);
>>>>   bool altmap_cross_boundary(struct vmem_altmap *altmap, unsigned long start,
>>>>   			   unsigned long page_size);
>>>> +/*
>>>> + * mm/memory_hotplug.c:mhp_supports_memmap_on_memory goes into the
>>>> + * details of some of the restrictions. We don't check for PMD_SIZE
>>>> + * because our vmemmap allocation code can fall back correctly. The
>>>> + * pageblock alignment requirement is met using altmap->reserve blocks.
>>>> + */
>>>> +#define arch_supports_memmap_on_memory arch_supports_memmap_on_memory
>>>> +static inline bool arch_supports_memmap_on_memory(unsigned long size)
>>>> +{
>>>> +	unsigned long nr_pages = size >> PAGE_SHIFT;
>>>> +	unsigned long vmemmap_size = nr_pages * sizeof(struct page);
>>>> +
>>>> +	if (!radix_enabled())
>>>> +		return false;
>>>> +
>>>> +#ifdef CONFIG_PPC_4K_PAGES
>>>> +	return IS_ALIGNED(vmemmap_size, PMD_SIZE);
>>>> +#else
>>>> +	/*
>>>> +	 * Make sure the vmemmap allocation is fully contained
>>>> +	 * so that we always allocate vmemmap memory from the altmap area.
>>>> +	 * The pageblock alignment requirement is met by using
>>>> +	 * reserve blocks in the altmap.
>>>> +	 */
>>>> +	return IS_ALIGNED(vmemmap_size, PAGE_SIZE);
>>>
>>> Can we move that check into common code as well?
>>>
>>> If our (original) vmemmap size would not fit into a single page, we would
>>> be in trouble on any architecture. I did not check whether it would also
>>> be an issue for arm64 in case we eventually allow wasting memory.
>>>
>>
>> For x86 and arm we already do IS_ALIGNED(vmemmap_size, PMD_SIZE) in
>> arch_supports_memmap_on_memory(), and that should imply PAGE_SIZE
>> alignment. If arm64 allows the usage of altmap.reserve, I would expect
>> arch_supports_memmap_on_memory() to have the PAGE_SIZE check.
>>
>> Adding the PAGE_SIZE check in mhp_supports_memmap_on_memory() makes it a
>> redundant check for x86 and arm currently?
>
> IMHO not an issue. The common code check is a bit weaker and the arch check
> a bit stronger.
>
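(To make the weaker/stronger relationship concrete: since PMD_SIZE is a
multiple of PAGE_SIZE, the stronger arch check implies the weaker common-code
one. A minimal userspace sketch, with hypothetical 4K/2MiB x86-like constants
and a local stand-in for the kernel's IS_ALIGNED macro:)

#include <assert.h>

#define PAGE_SIZE 4096UL                /* hypothetical 4K base page */
#define PMD_SIZE  (2UL << 20)           /* hypothetical 2MiB PMD */
#define IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

int main(void)
{
        unsigned long vmemmap_size = 2 * PMD_SIZE;  /* PMD-aligned example */

        /*
         * The stronger arch check implies the weaker common-code check,
         * because PMD_SIZE is itself a multiple of PAGE_SIZE.
         */
        assert(IS_ALIGNED(vmemmap_size, PMD_SIZE));
        assert(IS_ALIGNED(vmemmap_size, PAGE_SIZE));
        return 0;
}

ppc64 radix is the odd one out here: it can only guarantee PAGE_SIZE
alignment, which is exactly what the new common-code check enforces.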
OK, will update.

>> modified mm/memory_hotplug.c
>> @@ -1293,6 +1293,13 @@ static bool mhp_supports_memmap_on_memory(unsigned long size)
>>   	 */
>>   	if (!mhp_memmap_on_memory() || size != memory_block_size_bytes())
>>   		return false;
>> +
>> +	/*
>> +	 * Make sure the vmemmap allocation is fully contianed
>
> s/contianed/contained/
>
>> +	 * so that we always allocate vmemmap memory from the altmap area.
>
> In theory, it's not only the vmemmap size, but also the vmemmap start (that
> it doesn't start somewhere in between a page, crossing a page). I suspect
> the start is always guaranteed to be aligned (if the vmemmap size is
> aligned), correct?
>

That is correct.

>> +	 */
>> +	if (!IS_ALIGNED(vmemmap_size, PAGE_SIZE))
>> +		return false;
>>   	/*
>>   	 * Without page reservation, remaining pages should be pageblock
>>   	 * aligned.
>>   	 */
>>
>> -aneesh
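For reference, the 28-of-4096 arithmetic from the changelog above, as a quick
userspace sketch (assuming 64K base pages, sizeof(struct page) == 64, and a
2MiB pageblock on radix; all three values are assumptions matching the
configuration discussed, not taken from the patch itself):

#include <stdio.h>

#define PAGE_SIZE        (64UL << 10)               /* 64K base pages */
#define STRUCT_PAGE_SIZE 64UL                       /* assumed sizeof(struct page) */
#define PAGEBLOCK_PAGES  ((2UL << 20) / PAGE_SIZE)  /* 2MiB pageblock = 32 pages */

int main(void)
{
        unsigned long block_bytes = 256UL << 20;            /* 256MiB memory block */
        unsigned long nr_pages = block_bytes / PAGE_SIZE;   /* 4096 pages */
        unsigned long vmemmap_pages =
                nr_pages * STRUCT_PAGE_SIZE / PAGE_SIZE;    /* 4 pages */
        /* Pad the altmap out to the next pageblock boundary. */
        unsigned long aligned = (vmemmap_pages + PAGEBLOCK_PAGES - 1) /
                                PAGEBLOCK_PAGES * PAGEBLOCK_PAGES;
        unsigned long reserve = aligned - vmemmap_pages;

        printf("%lu vmemmap + %lu reserve out of %lu pages\n",
               vmemmap_pages, reserve, nr_pages);   /* 4 + 28 out of 4096 */
        return 0;
}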