On 2020-09-10 03:50, Anshuman Khandual wrote:
On 09/10/2020 01:57 PM, Steven Price wrote:
On 10/09/2020 07:05, Sudarshan Rajagopalan wrote:
When section mappings are enabled, we allocate vmemmap pages from
physically
continuous memory of size PMD_SZIE using vmemmap_alloc_block_buf().
Section
mappings are good to reduce TLB pressure. But when system is highly
fragmented
and memory blocks are being hot-added at runtime, its possible that
such
physically continuous memory allocations can fail. Rather than
failing the
memory hot-add procedure, add a fallback option to allocate vmemmap
pages from
discontinuous pages using vmemmap_populate_basepages().
Signed-off-by: Sudarshan Rajagopalan <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Logan Gunthorpe <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Steven Price <[email protected]>
---
arch/arm64/mm/mmu.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 75df62f..a46c7d4 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1100,6 +1100,7 @@ int __meminit vmemmap_populate(unsigned long
start, unsigned long end, int node,
p4d_t *p4dp;
pud_t *pudp;
pmd_t *pmdp;
+ int ret = 0;
do {
next = pmd_addr_end(addr, end);
@@ -1121,15 +1122,23 @@ int __meminit vmemmap_populate(unsigned long
start, unsigned long end, int node,
void *p = NULL;
p = vmemmap_alloc_block_buf(PMD_SIZE, node, altmap);
- if (!p)
- return -ENOMEM;
+ if (!p) {
+#ifdef CONFIG_MEMORY_HOTPLUG
+ vmemmap_free(start, end, altmap);
+#endif
+ ret = -ENOMEM;
+ break;
+ }
pmd_set_huge(pmdp, __pa(p),
__pgprot(PROT_SECT_NORMAL));
} else
vmemmap_verify((pte_t *)pmdp, node, addr, next);
} while (addr = next, addr != end);
- return 0;
+ if (ret)
+ return vmemmap_populate_basepages(start, end, node, altmap);
+ else
+ return ret;
Style comment: I find this usage of 'ret' confusing. When we assign
-ENOMEM above that is never actually the return value of the function
(in that case vmemmap_populate_basepages() provides the actual return
value).
Right.
Also the "return ret" is misleading since we know by that point that
ret==0 (and the 'else' is redundant).
Right.
Can you not just move the call to vmemmap_populate_basepages() up to
just after the (possible) vmemmap_free() call and remove the 'ret'
variable?
Yes the usage of "return ret" is quite confusing and misleading here -
will clean this.
AFAICT the call to vmemmap_free() also doesn't need the #ifdef as the
function is a no-op if CONFIG_MEMORY_HOTPLUG isn't set. I also feel
you
Right, CONFIG_MEMORY_HOTPLUG is not required.
Not quite exactly - the vmemmap_free() declaration in include/linux/mm.h
header file is wrapped around CONFIG_MEMORY_HOTPLUG as well. And since
the function definition is below the place where this is called, it will
throw an implicit declaration compile error when CONFIG_MEMORY_HOTPLUG
is not enabled. We can move the function definition above so that we
don't have to place this #ifdef. But we can go with 1st approach that
Anshuman mentions below.
need at least a comment to explain Anshuman's point that it looks like
you're freeing an unmapped area. Although if I'm reading the code
correctly it seems like the unmapped area will just be skipped.
Proposed vmemmap_free() attempts to free the entire requested vmemmap
range
[start, end] when an intermediate PMD entry can not be allocated. Hence
even
if vmemap_free() could skip an unmapped area (will double check on
that), it
unnecessarily goes through large sections of unmapped range, which
could not
have been mapped.
So, basically there could be two different methods for doing this
fallback.
1. Call vmemmap_populate_basepages() for sections when PMD_SIZE
allocation fails
- vmemmap_free() need not be called
2. Abort at the first instance of PMD_SIZE allocation failure
- Call vmemmap_free() to unmap all sections mapped till that point
- Call vmemmap_populate_basepages() to map the entire request section
The proposed patch tried to mix both approaches. Regardless, the first
approach
here seems better and is the case in vmemmap_populate_hugepages()
implementation
on x86 as well.
The 1st approach looks more cleaner compared to bailing out in first
failure, unmapping all previously mapped sections and map entire request
with vmemmap_populate_basepages. Thanks for the review and suggestion -
will send over a cleaner patch soon.
Sudarshan
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
Linux Foundation Collaborative Project