RE: [PATCH 11/37] xen/x86: abstract neutral code from acpi_numa_memory_affinity_init
Hi Jan,

> -----Original Message-----
> From: Jan Beulich
> Sent: January 25, 2022 0:51
> To: Wei Chen
> Cc: Bertrand Marquis; xen-de...@lists.xenproject.org;
> sstabell...@kernel.org; jul...@xen.org
> Subject: Re: [PATCH 11/37] xen/x86: abstract neutral code from
> acpi_numa_memory_affinity_init
>
> On 23.09.2021 14:02, Wei Chen wrote:
> > There is some code in acpi_numa_memory_affinity_init to update node
> > memory range and update node_memblk_range array. This code is not
> > ACPI specific, it can be shared by other NUMA implementation, like
> > device tree based NUMA implementation.
> >
> > So in this patch, we abstract this memory range and blocks relative
> > code to a new function. This will avoid exporting static variables
> > like node_memblk_range. And the PXM in neutral code print messages
> > have been replaced by NODE, as PXM is ACPI specific.
> >
> > Signed-off-by: Wei Chen
>
> SRAT is an ACPI concept, which I assume has no meaning with DT. Hence
> any generically usable logic here wants, I think, separating out into
> a file which is not SRAT-specific (peeking ahead, specifically not a
> file named "numa_srat.c"). This may in turn require some more thought

When I created the file, I wanted to place non-ACPI/DT specific code in
a new file. But I was confused about how to name it. I chose numa_srat.c
as the file name because I thought the device tree is also a static
resource table. But it seems this name is still misleading, because
ACPI SRAT is well known.

> regarding the proper split between the stuff remaining in srat.c and
> the stuff becoming kind of library code. In particular this may mean
> moving some of the static variables as well, and with them perhaps
> some further functions (while I did peek ahead, I didn't look closely
> at the later patch doing the actual movement). And it is then hard to
> see why the separation needs to happen in two steps - you could move
> the generically usable code to a new file right away.
>

OK, I will reduce the steps. And I think the "new file" can be
common/numa.c. Because the generically usable code consists of logic
functions that check NUMA memory blocks/ranges and update nodes, we
don't need a "numa_srat.c".

> > --- a/xen/arch/x86/srat.c
> > +++ b/xen/arch/x86/srat.c
> > @@ -104,6 +104,14 @@ nodeid_t setup_node(unsigned pxm)
> >      return node;
> >  }
> >
> > +bool __init numa_memblks_available(void)
> > +{
> > +    if (num_node_memblks < NR_NODE_MEMBLKS)
> > +        return true;
> > +
> > +    return false;
> > +}
>
> Please can you avoid expressing things in more complex than necessary
> ways? Here I don't see why it can't just be

OK, I will simplify it.

>
> bool __init numa_memblks_available(void)
> {
>     return num_node_memblks < NR_NODE_MEMBLKS;
> }
>
> > @@ -301,69 +309,35 @@ static bool __init is_node_memory_continuous(nodeid_t nid,
> >      return true;
> >  }
> >
> > -/* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
> > -void __init
> > -acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
> > +/* Neutral NUMA memory affinity init function for ACPI and DT */
> > +int __init numa_update_node_memblks(nodeid_t node,
> > +    paddr_t start, paddr_t size, bool hotplug)
>
> Indentation.

OK.

>
> >  {
> > -    paddr_t start, end;
> > -    unsigned pxm;
> > -    nodeid_t node;
> > +    paddr_t end = start + size;
> >      int i;
> >
> > -    if (srat_disabled())
> > -        return;
> > -    if (ma->header.length != sizeof(struct acpi_srat_mem_affinity)) {
> > -        bad_srat();
> > -        return;
> > -    }
> > -    if (!(ma->flags & ACPI_SRAT_MEM_ENABLED))
> > -        return;
> > -
> > -    start = ma->base_address;
> > -    end = start + ma->length;
> > -    /* Supplement the heuristics in l1tf_calculations(). */
> > -    l1tf_safe_maddr = max(l1tf_safe_maddr, ROUNDUP(end, PAGE_SIZE));
> > -
> > -    if (num_node_memblks >= NR_NODE_MEMBLKS)
> > -    {
> > -        dprintk(XENLOG_WARNING,
> > -                "Too many numa entry, try bigger NR_NODE_MEMBLKS \n");
> > -        bad_srat();
> > -        return;
> > -    }
> > -
> > -    pxm = ma->proximity_domain;
> > -    if (srat_rev < 2)
> > -        pxm &= 0xff;
> > -    node = setup_node(pxm);
> > -    if (node == NUMA_NO_NODE) {
> > -        bad_srat();
> > -        return
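
The simplification agreed on in the reply above can be checked in isolation. The following is only a stand-alone sketch, not the Xen code: `NR_NODE_MEMBLKS` and `num_node_memblks` are local stand-ins with an arbitrary limit, `__init` is dropped, and the `_verbose`/`_simple` suffixes and `forms_agree()` are hypothetical names introduced to compare the two forms.

```c
#include <stdbool.h>

/* Stand-ins for the Xen definitions; the limit of 4 is arbitrary. */
#define NR_NODE_MEMBLKS 4
static int num_node_memblks;

/* Form from the patch as posted. */
static bool numa_memblks_available_verbose(void)
{
    if (num_node_memblks < NR_NODE_MEMBLKS)
        return true;

    return false;
}

/* Form suggested in review: the comparison already yields a bool. */
static bool numa_memblks_available_simple(void)
{
    return num_node_memblks < NR_NODE_MEMBLKS;
}

/* Hypothetical helper: set the counter, then report whether both forms agree. */
static bool forms_agree(int count)
{
    num_node_memblks = count;
    return numa_memblks_available_verbose() == numa_memblks_available_simple();
}
```

Both forms return the same value for every counter value, so the shorter one loses nothing.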
Re: [PATCH 11/37] xen/x86: abstract neutral code from acpi_numa_memory_affinity_init
On 23.09.2021 14:02, Wei Chen wrote:
> There is some code in acpi_numa_memory_affinity_init to update node
> memory range and update node_memblk_range array. This code is not
> ACPI specific, it can be shared by other NUMA implementation, like
> device tree based NUMA implementation.
>
> So in this patch, we abstract this memory range and blocks relative
> code to a new function. This will avoid exporting static variables
> like node_memblk_range. And the PXM in neutral code print messages
> have been replaced by NODE, as PXM is ACPI specific.
>
> Signed-off-by: Wei Chen

SRAT is an ACPI concept, which I assume has no meaning with DT. Hence
any generically usable logic here wants, I think, separating out into
a file which is not SRAT-specific (peeking ahead, specifically not a
file named "numa_srat.c"). This may in turn require some more thought
regarding the proper split between the stuff remaining in srat.c and
the stuff becoming kind of library code. In particular this may mean
moving some of the static variables as well, and with them perhaps
some further functions (while I did peek ahead, I didn't look closely
at the later patch doing the actual movement). And it is then hard to
see why the separation needs to happen in two steps - you could move
the generically usable code to a new file right away.

> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -104,6 +104,14 @@ nodeid_t setup_node(unsigned pxm)
>      return node;
>  }
>
> +bool __init numa_memblks_available(void)
> +{
> +    if (num_node_memblks < NR_NODE_MEMBLKS)
> +        return true;
> +
> +    return false;
> +}

Please can you avoid expressing things in more complex than necessary
ways? Here I don't see why it can't just be

bool __init numa_memblks_available(void)
{
    return num_node_memblks < NR_NODE_MEMBLKS;
}

> @@ -301,69 +309,35 @@ static bool __init is_node_memory_continuous(nodeid_t nid,
>      return true;
>  }
>
> -/* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
> -void __init
> -acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
> +/* Neutral NUMA memory affinity init function for ACPI and DT */
> +int __init numa_update_node_memblks(nodeid_t node,
> +    paddr_t start, paddr_t size, bool hotplug)

Indentation.

>  {
> -    paddr_t start, end;
> -    unsigned pxm;
> -    nodeid_t node;
> +    paddr_t end = start + size;
>      int i;
>
> -    if (srat_disabled())
> -        return;
> -    if (ma->header.length != sizeof(struct acpi_srat_mem_affinity)) {
> -        bad_srat();
> -        return;
> -    }
> -    if (!(ma->flags & ACPI_SRAT_MEM_ENABLED))
> -        return;
> -
> -    start = ma->base_address;
> -    end = start + ma->length;
> -    /* Supplement the heuristics in l1tf_calculations(). */
> -    l1tf_safe_maddr = max(l1tf_safe_maddr, ROUNDUP(end, PAGE_SIZE));
> -
> -    if (num_node_memblks >= NR_NODE_MEMBLKS)
> -    {
> -        dprintk(XENLOG_WARNING,
> -                "Too many numa entry, try bigger NR_NODE_MEMBLKS \n");
> -        bad_srat();
> -        return;
> -    }
> -
> -    pxm = ma->proximity_domain;
> -    if (srat_rev < 2)
> -        pxm &= 0xff;
> -    node = setup_node(pxm);
> -    if (node == NUMA_NO_NODE) {
> -        bad_srat();
> -        return;
> -    }
> -    /* It is fine to add this area to the nodes data it will be used later*/
> +    /* It is fine to add this area to the nodes data it will be used later */
>      i = conflicting_memblks(start, end);
>      if (i < 0)
>          /* everything fine */;
>      else if (memblk_nodeid[i] == node) {
> -        bool mismatch = !(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) !=
> -                        !test_bit(i, memblk_hotplug);
> +        bool mismatch = !hotplug != !test_bit(i, memblk_hotplug);
>
> -        printk("%sSRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps with itself (%"PRIpaddr"-%"PRIpaddr")\n",
> -               mismatch ? KERN_ERR : KERN_WARNING, pxm, start, end,
> +        printk("%sSRAT: NODE %u (%"PRIpaddr"-%"PRIpaddr") overlaps with itself (%"PRIpaddr"-%"PRIpaddr")\n",

Nit: Unlike PXM, which is an acronym, "node" doesn't want to be all
upper case. Also did you check that the node <-> PXM association is
known to a reader of a log at this point in time?

> +               mismatch ? KERN_ERR : KERN_WARNING, node, start, end,
>                 node_memblk_range[i].start, node_memblk_range[i].end);
>          if (mismatch) {
> -            bad_srat();
> -            return;
> +            return -1;
>          }
>      } else {
>          printk(KERN_ERR
> -               "SRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps with PXM %u (%"PRIpaddr"-%"PRIpaddr")\n",
> -               pxm, start, end, node_to_pxm(memblk_nodeid[i]),
> +               "SR
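
The `mismatch` expression in the quoted hunk compares two truth values after negating each of them. A minimal stand-alone sketch of why that matters follows. It assumes a hypothetical bit test, `raw_test_bit()`, that returns the masked bit rather than exactly 0 or 1; this is not a statement about how Xen's `test_bit()` behaves. `hotplug_mismatch()` and `naive_mismatch()` are likewise illustrative names only.

```c
#include <stdbool.h>

/*
 * Hypothetical bit test: returns the masked bit itself, so a set bit
 * yields a non-zero value that is not necessarily 1.
 */
static unsigned long raw_test_bit(unsigned int nr, unsigned long map)
{
    return map & (1UL << nr);
}

/* Same shape as the patch: "!" normalises both sides to 0 or 1 first. */
static bool hotplug_mismatch(bool hotplug, unsigned int nr, unsigned long map)
{
    return !hotplug != !raw_test_bit(nr, map);
}

/* Comparing the raw values instead can report a mismatch that is not there. */
static bool naive_mismatch(bool hotplug, unsigned int nr, unsigned long map)
{
    return hotplug != raw_test_bit(nr, map);
}
```

With bit 2 set in the map and `hotplug` true, `hotplug_mismatch()` reports no mismatch, while `naive_mismatch()` compares 1 against 4 and reports one.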
Re: [PATCH 11/37] xen/x86: abstract neutral code from acpi_numa_memory_affinity_init
+x86 maintainers

On Thu, 23 Sep 2021, Wei Chen wrote:
> There is some code in acpi_numa_memory_affinity_init to update node
> memory range and update node_memblk_range array. This code is not
> ACPI specific, it can be shared by other NUMA implementation, like
> device tree based NUMA implementation.
>
> So in this patch, we abstract this memory range and blocks relative
> code to a new function. This will avoid exporting static variables
> like node_memblk_range. And the PXM in neutral code print messages
> have been replaced by NODE, as PXM is ACPI specific.
>
> Signed-off-by: Wei Chen
> ---
>  xen/arch/x86/srat.c        | 131 +
>  xen/include/asm-x86/numa.h |   3 +
>  2 files changed, 77 insertions(+), 57 deletions(-)
>
> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
> index 3334ede7a5..18bc6b19bb 100644
> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -104,6 +104,14 @@ nodeid_t setup_node(unsigned pxm)
>      return node;
>  }
>
> +bool __init numa_memblks_available(void)
> +{
> +    if (num_node_memblks < NR_NODE_MEMBLKS)
> +        return true;
> +
> +    return false;
> +}
> +
>  int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
>  {
>      int i;
> @@ -301,69 +309,35 @@ static bool __init is_node_memory_continuous(nodeid_t nid,
>      return true;
>  }
>
> -/* Callback for parsing of the Proximity Domain <-> Memory Area mappings */
> -void __init
> -acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
> +/* Neutral NUMA memory affinity init function for ACPI and DT */
> +int __init numa_update_node_memblks(nodeid_t node,
> +    paddr_t start, paddr_t size, bool hotplug)
>  {
> -    paddr_t start, end;
> -    unsigned pxm;
> -    nodeid_t node;
> +    paddr_t end = start + size;
>      int i;
>
> -    if (srat_disabled())
> -        return;
> -    if (ma->header.length != sizeof(struct acpi_srat_mem_affinity)) {
> -        bad_srat();
> -        return;
> -    }
> -    if (!(ma->flags & ACPI_SRAT_MEM_ENABLED))
> -        return;
> -
> -    start = ma->base_address;
> -    end = start + ma->length;
> -    /* Supplement the heuristics in l1tf_calculations(). */
> -    l1tf_safe_maddr = max(l1tf_safe_maddr, ROUNDUP(end, PAGE_SIZE));
> -
> -    if (num_node_memblks >= NR_NODE_MEMBLKS)
> -    {
> -        dprintk(XENLOG_WARNING,
> -                "Too many numa entry, try bigger NR_NODE_MEMBLKS \n");
> -        bad_srat();
> -        return;
> -    }
> -
> -    pxm = ma->proximity_domain;
> -    if (srat_rev < 2)
> -        pxm &= 0xff;
> -    node = setup_node(pxm);
> -    if (node == NUMA_NO_NODE) {
> -        bad_srat();
> -        return;
> -    }
> -    /* It is fine to add this area to the nodes data it will be used later*/
> +    /* It is fine to add this area to the nodes data it will be used later */
>      i = conflicting_memblks(start, end);
>      if (i < 0)
>          /* everything fine */;
>      else if (memblk_nodeid[i] == node) {
> -        bool mismatch = !(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) !=
> -                        !test_bit(i, memblk_hotplug);
> +        bool mismatch = !hotplug != !test_bit(i, memblk_hotplug);
>
> -        printk("%sSRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps with itself (%"PRIpaddr"-%"PRIpaddr")\n",
> -               mismatch ? KERN_ERR : KERN_WARNING, pxm, start, end,
> +        printk("%sSRAT: NODE %u (%"PRIpaddr"-%"PRIpaddr") overlaps with itself (%"PRIpaddr"-%"PRIpaddr")\n",
> +               mismatch ? KERN_ERR : KERN_WARNING, node, start, end,
>                 node_memblk_range[i].start, node_memblk_range[i].end);
>          if (mismatch) {
> -            bad_srat();
> -            return;
> +            return -1;
>          }
>      } else {
>          printk(KERN_ERR
> -               "SRAT: PXM %u (%"PRIpaddr"-%"PRIpaddr") overlaps with PXM %u (%"PRIpaddr"-%"PRIpaddr")\n",
> -               pxm, start, end, node_to_pxm(memblk_nodeid[i]),
> +               "SRAT: NODE %u (%"PRIpaddr"-%"PRIpaddr") overlaps with NODE %u (%"PRIpaddr"-%"PRIpaddr")\n",
> +               node, start, end, memblk_nodeid[i],
>                 node_memblk_range[i].start, node_memblk_range[i].end);
> -        bad_srat();
> -        return;
> +        return -1;
>      }
> -    if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)) {
> +
> +    if (!hotplug) {
>          struct node *nd = &nodes[node];
>
>          if (!node_test_and_set(node, memory_nodes_parsed)) {
> @@ -375,26 +349,69 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
>              if (nd->end < end)
>                  nd->end = end;
>
> -            /* Check whether this range contains memor
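
The quoted hunks replace `bad_srat(); return;` with `return -1;` in the neutral function, which suggests that error handling moves to the firmware-specific caller. The following is a rough, self-contained sketch of that split, not the patch itself. Every name in it is a simplified stand-in: `update_node_memblks()` plays the role of the neutral function, `acpi_side_caller()` is a hypothetical ACPI-side wrapper, `srat_bad` stands in for the effect of `bad_srat()`, and the cast from PXM to node stands in for `setup_node()`. The capacity check is folded into the neutral function for brevity; the visible part of the patch does not show where it ends up.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t paddr_t;
typedef uint8_t nodeid_t;

#define NR_NODE_MEMBLKS 4            /* arbitrary limit for the sketch */

struct memblk {
    paddr_t start, end;
    nodeid_t node;
    bool hotplug;
};

static struct memblk memblks[NR_NODE_MEMBLKS];
static int num_memblks;
static bool srat_bad;                /* stand-in for the effect of bad_srat() */

/* Firmware-neutral part: no ACPI types, reports failure via return value. */
static int update_node_memblks(nodeid_t node, paddr_t start, paddr_t size,
                               bool hotplug)
{
    paddr_t end = start + size;
    int i;

    if (num_memblks >= NR_NODE_MEMBLKS)
        return -1;

    for (i = 0; i < num_memblks; i++) {
        /* Half-open ranges [start, end) overlap if each starts before the other ends. */
        if (start < memblks[i].end && memblks[i].start < end) {
            /* An overlap is tolerated only within one node with matching flags. */
            if (memblks[i].node != node || memblks[i].hotplug != hotplug)
                return -1;
        }
    }

    memblks[num_memblks++] = (struct memblk){ start, end, node, hotplug };
    return 0;
}

/* ACPI-specific part: maps PXM to node and decides what a failure means. */
static void acpi_side_caller(unsigned int pxm, paddr_t base, paddr_t length,
                             bool hotplug)
{
    nodeid_t node = (nodeid_t)pxm;   /* stand-in for setup_node(pxm) */

    if (update_node_memblks(node, base, length, hotplug))
        srat_bad = true;             /* the real caller would call bad_srat() */
}
```

Under these assumptions a device tree parser could call the same neutral function and react to `-1` in its own way, without touching any SRAT state.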