On 08.07.20 11:15, Mike Rapoport wrote:
> On Wed, Jul 08, 2020 at 10:45:17AM +0200, David Hildenbrand wrote:
>> On 08.07.20 10:39, Mike Rapoport wrote:
>>> On Wed, Jul 08, 2020 at 10:26:41AM +0200, David Hildenbrand wrote:
>>>> On 08.07.20 09:50, Dan Williams wrote:
>>>>> On Wed, Jul 8, 2020 at 12:22 AM David Hildenbrand <da...@redhat.com> wrote:
>>>>>>
>>>>>>>>>>>> On Tue 07-07-20 13:59:15, Jia He wrote:
>>>>>>>>>>>>> This exports memory_add_physaddr_to_nid() for module driver to use.
>>>>>>>>>>>>>
>>>>>>>>>>>>> memory_add_physaddr_to_nid() is a fallback option to get the nid in case
>>>>>>>>>>>>> NUMA_NO_NID is detected.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Suggested-by: David Hildenbrand <da...@redhat.com>
>>>>>>>>>>>>> Signed-off-by: Jia He <justin...@arm.com>
>>>>>>>>>>>>> ---
>>>>>>>>>>>>>  arch/arm64/mm/numa.c | 5 +++--
>>>>>>>>>>>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
>>>>>>>>>>>>> index aafcee3e3f7e..7eeb31740248 100644
>>>>>>>>>>>>> --- a/arch/arm64/mm/numa.c
>>>>>>>>>>>>> +++ b/arch/arm64/mm/numa.c
>>>>>>>>>>>>> @@ -464,10 +464,11 @@ void __init arm64_numa_init(void)
>>>>>>>>>>>>>
>>>>>>>>>>>>>  /*
>>>>>>>>>>>>>   * We hope that we will be hotplugging memory on nodes we already know about,
>>>>>>>>>>>>> - * such that acpi_get_node() succeeds and we never fall back to this...
>>>>>>>>>>>>> + * such that acpi_get_node() succeeds. But when SRAT is not present, the node
>>>>>>>>>>>>> + * id may be probed as NUMA_NO_NODE by acpi, Here provide a fallback option.
>>>>>>>>>>>>>   */
>>>>>>>>>>>>>  int memory_add_physaddr_to_nid(u64 addr)
>>>>>>>>>>>>>  {
>>>>>>>>>>>>> -	pr_warn("Unknown node for memory at 0x%llx, assuming node 0\n", addr);
>>>>>>>>>>>>>  	return 0;
>>>>>>>>>>>>>  }
>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(memory_add_physaddr_to_nid);
>>>>>>>>>>>>
>>>>>>>>>>>> Does it make sense to export a noop function? Wouldn't make more sense
>>>>>>>>>>>> to simply make it static inline somewhere in a header? I haven't checked
>>>>>>>>>>>> whether there is an easy way to do that sanely but this just hit my eyes.
>>>
>>>> I'd be curious if what we are trying to optimize here is actually worth
>>>> optimizing. IOW, is there a well-known scenario where the dummy value on
>>>> arm64 would be problematic and is worth the effort?
>>>
>>> Well, it started with Michal's comment above that EXPORT_SYMBOL_GPL()
>>> for a stub might be an overkill.
>>>
>>> I think Jia's suggestion [1] with addition of a comment that explains
>>> why and when the stub will be used, can work for both
>>> memory_add_physaddr_to_nid() and phys_to_target_node().
>>
>> Agreed.
>>
>>>
>>> But on more theoretical/fundamental level, I think we lack a generic
>>> abstraction similar to e.g. x86 'struct numa_meminfo' that serves as
>>> translation of firmware supplied information into data that can be used
>>> by the generic mm without need to reimplement it for each and every
>>> arch.
>>
>> Right. As I expressed, I am not a friend of using memblock for that, and
>> the pgdat node span is tricky.
>>
>> Maybe abstracting that x86 concept is possible in some way (and we could
>> restrict the information to boot-time properties, so we don't have to
>> mess with memory hot(un)plug - just as done for numa_meminfo AFAIKS).
>
> I agree with pgdat part and disagree about memblock. It already has
> non-init physmap, why won't we add memblock.memory to the mix? ;-)

Can we generalize and tweak physmap to contain node info? That's all we
need, no? (The special mem= parameter handling, where "physmap" and
"memory" would differ, should not matter for our use case.)

>
> Now, seriously, memblock already has all the necessary information about
> the coldplug memory for several architectures. x86 being an exception
> because for some reason the reserved memory is not considered memory
> there. The infrastructure for querying and iterating memory regions is
> already there. We just need to leave out the irrelevant parts, like
> memblock.reserved and allocation functions.

I *really* don't want to mess with memblocks on memory hot(un)plug on
x86 and s390x (+ other architectures in the future). I also thought
about no longer creating memblocks for hotplugged memory on arm64, by
tweaking pfn_valid() to query memblocks only for early sections.

If "physmem" is not an option, can we at least introduce something like
ARCH_UPDATE_MEMBLOCK_ON_HOTPLUG to avoid doing that on x86 and s390x for
now (and later maybe for others)?

-- 
Thanks,

David / dhildenb
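
To make the "physmap with node info" idea a bit more concrete, here is a
minimal user-space sketch, not kernel code. It assumes a boot-time table
of physical ranges annotated with a node id, and a lookup that falls back
to node 0 the way the arm64 stub does. All names (struct phys_range,
boot_ranges, range_lookup_nid(), sketch_physaddr_to_nid(), SKETCH_NO_NODE)
and the address values are hypothetical and exist only for illustration;
none of them are existing memblock or NUMA interfaces.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's NUMA_NO_NODE; hypothetical name. */
#define SKETCH_NO_NODE (-1)

/* Hypothetical: one boot-time physical range annotated with its node id. */
struct phys_range {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive */
	int nid;
};

/* Hypothetical boot-time table; the addresses are made up. */
static const struct phys_range boot_ranges[] = {
	{ 0x000000000ULL, 0x080000000ULL, 0 },
	{ 0x100000000ULL, 0x180000000ULL, 1 },
};

/* Return the node id of the range containing addr, or SKETCH_NO_NODE. */
int range_lookup_nid(uint64_t addr)
{
	size_t i;

	for (i = 0; i < sizeof(boot_ranges) / sizeof(boot_ranges[0]); i++) {
		if (addr >= boot_ranges[i].start && addr < boot_ranges[i].end)
			return boot_ranges[i].nid;
	}
	return SKETCH_NO_NODE;
}

/* Same role as the arm64 stub: assume node 0 when nothing is known. */
int sketch_physaddr_to_nid(uint64_t addr)
{
	int nid = range_lookup_nid(addr);

	return nid == SKETCH_NO_NODE ? 0 : nid;
}
```

With this table, sketch_physaddr_to_nid(0x100001000) yields node 1, while
an address outside every range (for example 0x200000000) yields the
node 0 fallback. Because the table only describes boot-time ranges, it
would never need updating on memory hot(un)plug, which is the property
discussed above.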