Re: [PATCH V7 1/3] powerpc/nodes: Ensure enough nodes avail for operations

2017-11-22 Thread Michael Ellerman
Nathan Fontenot  writes:
> On 11/16/2017 11:24 AM, Michael Bringmann wrote:
...
>> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
>> index eb604b3..334a1ff 100644
>> --- a/arch/powerpc/mm/numa.c
>> +++ b/arch/powerpc/mm/numa.c
>> @@ -892,6 +892,37 @@ static void __init setup_node_data(int nid, u64 
>> start_pfn, u64 end_pfn)
>>  NODE_DATA(nid)->node_spanned_pages = spanned_pages;
>>  }
>> 
>> +static void __init find_possible_nodes(void)
>> +{
>> +struct device_node *rtas;
>> +u32 numnodes, i;
>> +
>> +if (min_common_depth <= 0)
>> +return;
>> +
>> +rtas = of_find_node_by_path("/rtas");
>> +if (!rtas)
>> +return;
>> +
>> +if (of_property_read_u32_index(rtas,
>> +"ibm,max-associativity-domains",
>> +min_common_depth, &numnodes))
>> +goto out;
>> +
>> +pr_info("numa: Nodes = %d (mcd = %d)\n", numnodes,
>> +min_common_depth);
>
> numa.c already has a pr_fmt define, no need to pre-pend "numa:" to the
> information message.

And in fact no need to print that out here at all, it's covered
elsewhere. So just drop that pr_info() entirely.

cheers


Re: [PATCH V7 1/3] powerpc/nodes: Ensure enough nodes avail for operations

2017-11-27 Thread Michael Bringmann
See below.

On 11/20/2017 10:33 AM, Nathan Fontenot wrote:
> 
> 
> On 11/16/2017 11:24 AM, Michael Bringmann wrote:
>> On powerpc systems which allow 'hot-add' of CPU or memory resources,
>> it may occur that the new resources are to be inserted into nodes
>> that were not used for these resources at bootup.  In the kernel,
>> any node that is used must be defined and initialized.  These empty
>> nodes may occur when,
>>
>> * Dedicated vs. shared resources.  Shared resources require
>>   information such as the VPHN hcall for CPU assignment to nodes.
>>   Associativity decisions made based on dedicated resource rules,
>>   such as associativity properties in the device tree, may vary
>>   from decisions made using the values returned by the VPHN hcall.
>> * memoryless nodes at boot.  Nodes need to be defined as 'possible'
>>   at boot for operation with other code modules.  Previously, the
>>   powerpc code would limit the set of possible nodes to those which
>>   have memory assigned at boot, and were thus online.  Subsequent
>>   add/remove of CPUs or memory would only work with this subset of
>>   possible nodes.
>> * memoryless nodes with CPUs at boot.  Due to the previous restriction
>>   on nodes, nodes that had CPUs but no memory were being collapsed
>>   into other nodes that did have memory at boot.  In practice this
>>   meant that the node assignment presented by the runtime kernel
>>   differed from the affinity and associativity attributes presented
>>   by the device tree or VPHN hcalls.  Nodes that might be known to
>>   the pHyp were not 'possible' in the runtime kernel because they did
>>   not have memory at boot.
>>
>> This patch ensures that sufficient nodes are defined to support
>> configuration requirements after boot, as well as at boot.  This
>> patch set fixes a couple of problems.
>>
>> * Nodes known to powerpc to be memoryless at boot, but to have
>>   CPUs in them are allowed to be 'possible' and 'online'.  Memory
>>   allocations for those nodes are taken from another node that does
>>   have memory until and if memory is hot-added to the node.
>> * Nodes which have no resources assigned at boot, but which may still
>>   be referenced subsequently by affinity or associativity attributes,
>>   are kept in the list of 'possible' nodes for powerpc.  Hot-add of
>>   memory or CPUs to the system can reference these nodes and bring
>>   them online instead of redirecting to one of the set of nodes that
>>   were known to have memory at boot.
>>
>> This patch extracts the value of the lowest domain level (number of
>> allocable resources) from the device tree property
>> "ibm,max-associativity-domains" to use as the maximum number of nodes
>> to setup as possibly available in the system.  This new setting will
>> override the instruction,
>>
>> nodes_and(node_possible_map, node_possible_map, node_online_map);
>>
>> presently seen in the function arch/powerpc/mm/numa.c:initmem_init().
>>
>> If the "ibm,max-associativity-domains" property is not present at boot,
>> no operation will be performed to define or enable additional nodes, or
>> enable the above 'nodes_and()'.
>>
>> Signed-off-by: Michael Bringmann 
>> ---
>> Changes in V6:
>>   -- Remove some node initialization/allocation from boot setup
>>  to later in runtime to try to limit memory needs early on
>>   -- Augment descriptive documentation for patch
>> ---
>>  arch/powerpc/mm/numa.c |   40 +---
>>  1 file changed, 37 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
>> index eb604b3..334a1ff 100644
>> --- a/arch/powerpc/mm/numa.c
>> +++ b/arch/powerpc/mm/numa.c
>> @@ -892,6 +892,37 @@ static void __init setup_node_data(int nid, u64 
>> start_pfn, u64 end_pfn)
>>  NODE_DATA(nid)->node_spanned_pages = spanned_pages;
>>  }
>>
>> +static void __init find_possible_nodes(void)
>> +{
>> +struct device_node *rtas;
>> +u32 numnodes, i;
>> +
>> +if (min_common_depth <= 0)
>> +return;
>> +
>> +rtas = of_find_node_by_path("/rtas");
>> +if (!rtas)
>> +return;
>> +
>> +if (of_property_read_u32_index(rtas,
>> +"ibm,max-associativity-domains",
>> +min_common_depth, &numnodes))
>> +goto out;
>> +
>> +pr_info("numa: Nodes = %d (mcd = %d)\n", numnodes,
>> +min_common_depth);
> 
> numa.c already has a pr_fmt define, no need to pre-pend "numa:" to the
> information message.
> 
> -Nathan

Okay.

> 
>> +
>> +for (i = 0; i < numnodes; i++) {
>> +if (!node_possible(i)) {
>> +setup_node_data(i, 0, 0);
>> +node_set(i, node_possible_map);
>> +}
>> +}
>> +
>> +out:
>> +of_node_put(rtas);
>> +}
>> +
>>  void __init initmem_init(void)
>>  {
>>  int nid, cpu;
>> @@ -905,12 +936,15 @@ void __init initmem_init(void)
>>  memblock_dump_all();
>>
>>  

Re: [PATCH V7 1/3] powerpc/nodes: Ensure enough nodes avail for operations

2017-11-27 Thread Michael Bringmann
See below.

On 11/22/2017 05:17 AM, Michael Ellerman wrote:
> Nathan Fontenot  writes:
>> On 11/16/2017 11:24 AM, Michael Bringmann wrote:
> ...
>>> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
>>> index eb604b3..334a1ff 100644
>>> --- a/arch/powerpc/mm/numa.c
>>> +++ b/arch/powerpc/mm/numa.c
>>> @@ -892,6 +892,37 @@ static void __init setup_node_data(int nid, u64 
>>> start_pfn, u64 end_pfn)
>>> NODE_DATA(nid)->node_spanned_pages = spanned_pages;
>>>  }
>>>
>>> +static void __init find_possible_nodes(void)
>>> +{
>>> +   struct device_node *rtas;
>>> +   u32 numnodes, i;
>>> +
>>> +   if (min_common_depth <= 0)
>>> +   return;
>>> +
>>> +   rtas = of_find_node_by_path("/rtas");
>>> +   if (!rtas)
>>> +   return;
>>> +
>>> +   if (of_property_read_u32_index(rtas,
>>> +   "ibm,max-associativity-domains",
>>> +   min_common_depth, &numnodes))
>>> +   goto out;
>>> +
>>> +   pr_info("numa: Nodes = %d (mcd = %d)\n", numnodes,
>>> +   min_common_depth);
>>
>> numa.c already has a pr_fmt define, no need to pre-pend "numa:" to the
>> information message.
> 
> And in fact no need to print that out here at all, it's covered
> elsewhere. So just drop that pr_info() entirely.
> 
> cheers
> 
> 

Okay.  pr_info() removed.

-- 
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line  363-5196
External: (512) 286-5196
Cell:   (512) 466-0650
m...@linux.vnet.ibm.com



Re: [PATCH V7 1/3] powerpc/nodes: Ensure enough nodes avail for operations

2017-11-20 Thread Nathan Fontenot


On 11/16/2017 11:24 AM, Michael Bringmann wrote:
> On powerpc systems which allow 'hot-add' of CPU or memory resources,
> it may occur that the new resources are to be inserted into nodes
> that were not used for these resources at bootup.  In the kernel,
> any node that is used must be defined and initialized.  These empty
> nodes may occur when,
> 
> * Dedicated vs. shared resources.  Shared resources require
>   information such as the VPHN hcall for CPU assignment to nodes.
>   Associativity decisions made based on dedicated resource rules,
>   such as associativity properties in the device tree, may vary
>   from decisions made using the values returned by the VPHN hcall.
> * memoryless nodes at boot.  Nodes need to be defined as 'possible'
>   at boot for operation with other code modules.  Previously, the
>   powerpc code would limit the set of possible nodes to those which
>   have memory assigned at boot, and were thus online.  Subsequent
>   add/remove of CPUs or memory would only work with this subset of
>   possible nodes.
> * memoryless nodes with CPUs at boot.  Due to the previous restriction
>   on nodes, nodes that had CPUs but no memory were being collapsed
>   into other nodes that did have memory at boot.  In practice this
>   meant that the node assignment presented by the runtime kernel
>   differed from the affinity and associativity attributes presented
>   by the device tree or VPHN hcalls.  Nodes that might be known to
>   the pHyp were not 'possible' in the runtime kernel because they did
>   not have memory at boot.
> 
> This patch ensures that sufficient nodes are defined to support
> configuration requirements after boot, as well as at boot.  This
> patch set fixes a couple of problems.
> 
> * Nodes known to powerpc to be memoryless at boot, but to have
>   CPUs in them are allowed to be 'possible' and 'online'.  Memory
>   allocations for those nodes are taken from another node that does
>   have memory until and if memory is hot-added to the node.
> * Nodes which have no resources assigned at boot, but which may still
>   be referenced subsequently by affinity or associativity attributes,
>   are kept in the list of 'possible' nodes for powerpc.  Hot-add of
>   memory or CPUs to the system can reference these nodes and bring
>   them online instead of redirecting to one of the set of nodes that
>   were known to have memory at boot.
> 
> This patch extracts the value of the lowest domain level (number of
> allocable resources) from the device tree property
> "ibm,max-associativity-domains" to use as the maximum number of nodes
> to setup as possibly available in the system.  This new setting will
> override the instruction,
> 
> nodes_and(node_possible_map, node_possible_map, node_online_map);
> 
> presently seen in the function arch/powerpc/mm/numa.c:initmem_init().
> 
> If the "ibm,max-associativity-domains" property is not present at boot,
> no operation will be performed to define or enable additional nodes, or
> enable the above 'nodes_and()'.
> 
> Signed-off-by: Michael Bringmann 
> ---
> Changes in V6:
>   -- Remove some node initialization/allocation from boot setup
>  to later in runtime to try to limit memory needs early on
>   -- Augment descriptive documentation for patch
> ---
>  arch/powerpc/mm/numa.c |   40 +---
>  1 file changed, 37 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index eb604b3..334a1ff 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -892,6 +892,37 @@ static void __init setup_node_data(int nid, u64 
> start_pfn, u64 end_pfn)
>   NODE_DATA(nid)->node_spanned_pages = spanned_pages;
>  }
> 
> +static void __init find_possible_nodes(void)
> +{
> + struct device_node *rtas;
> + u32 numnodes, i;
> +
> + if (min_common_depth <= 0)
> + return;
> +
> + rtas = of_find_node_by_path("/rtas");
> + if (!rtas)
> + return;
> +
> + if (of_property_read_u32_index(rtas,
> + "ibm,max-associativity-domains",
> + min_common_depth, &numnodes))
> + goto out;
> +
> + pr_info("numa: Nodes = %d (mcd = %d)\n", numnodes,
> + min_common_depth);

numa.c already has a pr_fmt define, no need to pre-pend "numa:" to the
information message.

-Nathan

> +
> + for (i = 0; i < numnodes; i++) {
> + if (!node_possible(i)) {
> + setup_node_data(i, 0, 0);
> + node_set(i, node_possible_map);
> + }
> + }
> +
> +out:
> + of_node_put(rtas);
> +}
> +
>  void __init initmem_init(void)
>  {
>   int nid, cpu;
> @@ -905,12 +936,15 @@ void __init initmem_init(void)
>   memblock_dump_all();
> 
>   /*
> -  * Reduce the possible NUMA nodes to the online NUMA nodes,
> -  * since we do not support node hotplug. This ensures that  we
> -  * lower the