Re: [PATCH 04/15] arm64: numa: simplify dummy_numa_init()

2020-07-30 Thread Catalin Marinas
On Tue, Jul 28, 2020 at 08:11:42AM +0300, Mike Rapoport wrote:
> From: Mike Rapoport 
> 
> dummy_numa_init() loops over memblock.memory and passes nid=0 to
> numa_add_memblk() which essentially wraps memblock_set_node(). However,
> memblock_set_node() can cope with entire memory span itself, so the loop
> over memblock.memory regions is redundant.
> 
> Replace the loop with a single call to memblock_set_node() to the entire
> memory.
> 
> Signed-off-by: Mike Rapoport 

Acked-by: Catalin Marinas 


Re: [PATCH 04/15] arm64: numa: simplify dummy_numa_init()

2020-07-29 Thread Jonathan Cameron
On Tue, 28 Jul 2020 08:11:42 +0300
Mike Rapoport  wrote:

> From: Mike Rapoport 
> 
> dummy_numa_init() loops over memblock.memory and passes nid=0 to
> numa_add_memblk() which essentially wraps memblock_set_node(). However,
> memblock_set_node() can cope with entire memory span itself, so the loop
> over memblock.memory regions is redundant.
> 
> Replace the loop with a single call to memblock_set_node() to the entire
> memory.

Hi Mike,

I had a similar patch I was going to post shortly so can add a bit more
on the advantages of this one.

Beyond cleaning up, it also fixes an issue with a buggy ACPI firmware in which
the SRAT table covers some but not all of the memory in the EFI memory map.
Stealing bits from the draft cover letter I had for that...

> This issue can be easily triggered by having an SRAT table which fails
> to cover all elements of the EFI memory map.
> 
> This firmware error is detected and a warning printed. e.g.
> "NUMA: Warning: invalid memblk node 64 [mem 0x24000-0x27fff]"
> At that point we fall back to dummy_numa_init().
> 
> However, the failed ACPI init has left us with our memblocks all broken
> up as we split them when trying to assign them to NUMA nodes.
> 
> We then iterate over the memblocks and add them to node 0.
> 
> for_each_memblock(memory, mblk) {
> 	ret = numa_add_memblk(0, mblk->base, mblk->base + mblk->size);
> 	if (!ret)
> 		continue;
> 	pr_err("NUMA init failed\n");
> 	return ret;
> }
> 
> numa_add_memblk() calls memblock_set_node() which merges regions that
> were previously split up during the earlier attempt to add them to different
> nodes during parsing of SRAT.
> 
> This means elements are moved in the memblock array and we can end up
> in a different memblock after the call to numa_add_memblk().
> Result is:
> 
> Unable to handle kernel paging request at virtual address 3a40
> Mem abort info:
>   ESR = 0x9604
>   EC = 0x25: DABT (current EL), IL = 32 bits
>   SET = 0, FnV = 0
>   EA = 0, S1PTW = 0
> Data abort info:
>   ISV = 0, ISS = 0x0004
>   CM = 0, WnR = 0
> [3a40] user address but active_mm is swapper
> Internal error: Oops: 9604 [#1] PREEMPT SMP
> 
> ...
> 
> Call trace:
>   sparse_init_nid+0x5c/0x2b0
>   sparse_init+0x138/0x170
>   bootmem_init+0x80/0xe0
>   setup_arch+0x2a0/0x5fc
>   start_kernel+0x8c/0x648
> 
> As an illustrative example:
> EFI table has one block of memory.
> memblks[0] = [0...0x2f]  so we start with a single memblock.
> 
> SRAT has
> [0x00...0x0f] in node 0
> [0x10...0x1f] in node 1
> but no entry covering 
> [0x20...0x2f].
> 
> Whilst parsing SRAT the single memblock is broken into 3.
> memblks[0] = [0x00...0x0f] in node 0
> memblks[1] = [0x10...0x1f] in node 1
> memblks[2] = [0x20...0x2f] in node MAX_NUM_NODES (invalid value)
> 
> A sanity check parse then detects the invalid section and acpi_numa_init
> fails.  We then fall back to the dummy path.
> 
> That iterates over the memblocks.  We'll use i as an index into the array of
> memblocks.
> 
> i = 0
> memblks[0] = [0x00...0x0f] set to node 0.
>   merge doesn't do anything because the neighbouring memblock is still in
>   node 1.
> 
> i = 1
> memblks[1] = [0x10...0x1f] set to node 0.
>   merge combines memblock 0 and 1 to give a new set of memblocks.
> 
> memblks[0] = [0x00..0x1f] in node 0
> memblks[1] = [0x20..0x2f] in node MAX_NUM_NODES.
> 
> i = 2 is off the end of the now-reduced array of memblocks, so we exit the
> loop without ever visiting [0x20..0x2f], which keeps its invalid node.
> (If we restarted the loop here, everything would be fine.)
> 
> Later sparse_init_nid() tries to use the node of the second memblock to index
> something, and boom.


> 
> Signed-off-by: Mike Rapoport 

Acked-by: Jonathan Cameron 

> ---
>  arch/arm64/mm/numa.c | 13 +++++--------
>  1 file changed, 5 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
> index aafcee3e3f7e..0cbdbcc885fb 100644
> --- a/arch/arm64/mm/numa.c
> +++ b/arch/arm64/mm/numa.c
> @@ -423,19 +423,16 @@ static int __init numa_init(int (*init_func)(void))
>   */
>  static int __init dummy_numa_init(void)
>  {
> +	phys_addr_t start = memblock_start_of_DRAM();
> +	phys_addr_t end = memblock_end_of_DRAM();
>  	int ret;
> -	struct memblock_region *mblk;
>  
>  	if (numa_off)
>  		pr_info("NUMA disabled\n"); /* Forced off on command line. */
> -	pr_info("Faking a node at [mem %#018Lx-%#018Lx]\n",
> -		memblock_start_of_DRAM(), memblock_end_of_DRAM() - 1);
> -
> -	for_each_memblock(memory, mblk) {
> -		ret = numa_add_memblk(0, mblk->base, mblk->base + mblk->size);
> -		if (!ret)
> -			continue;
> +	pr_info("Faking a node at [mem %#018Lx-%#018Lx]\n", start, end - 1);
>  
> +	ret = numa_add_memblk(0, start, end);
> +	if (ret) {
>  		pr_err("NUMA init failed\n");
>  		return ret;
>  	}




[PATCH 04/15] arm64: numa: simplify dummy_numa_init()

2020-07-27 Thread Mike Rapoport
From: Mike Rapoport 

dummy_numa_init() loops over memblock.memory and passes nid=0 to
numa_add_memblk() which essentially wraps memblock_set_node(). However,
memblock_set_node() can cope with entire memory span itself, so the loop
over memblock.memory regions is redundant.

Replace the loop with a single call to memblock_set_node() to the entire
memory.

Signed-off-by: Mike Rapoport 
---
 arch/arm64/mm/numa.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
index aafcee3e3f7e..0cbdbcc885fb 100644
--- a/arch/arm64/mm/numa.c
+++ b/arch/arm64/mm/numa.c
@@ -423,19 +423,16 @@ static int __init numa_init(int (*init_func)(void))
  */
 static int __init dummy_numa_init(void)
 {
+	phys_addr_t start = memblock_start_of_DRAM();
+	phys_addr_t end = memblock_end_of_DRAM();
 	int ret;
-	struct memblock_region *mblk;
 
 	if (numa_off)
 		pr_info("NUMA disabled\n"); /* Forced off on command line. */
-	pr_info("Faking a node at [mem %#018Lx-%#018Lx]\n",
-		memblock_start_of_DRAM(), memblock_end_of_DRAM() - 1);
-
-	for_each_memblock(memory, mblk) {
-		ret = numa_add_memblk(0, mblk->base, mblk->base + mblk->size);
-		if (!ret)
-			continue;
+	pr_info("Faking a node at [mem %#018Lx-%#018Lx]\n", start, end - 1);
 
+	ret = numa_add_memblk(0, start, end);
+	if (ret) {
 		pr_err("NUMA init failed\n");
 		return ret;
 	}
-- 
2.26.2