On Wed, Jul 8, 2020 at 3:04 AM David Hildenbrand <da...@redhat.com> wrote:
>
> On 08.07.20 11:45, Mike Rapoport wrote:
> > On Wed, Jul 08, 2020 at 11:25:36AM +0200, David Hildenbrand wrote:
> >> On 08.07.20 11:15, Mike Rapoport wrote:
> >>>>>>>>
> >>>>> But on more theoretical/fundmanetal level, I think we lack a generic
> >>>>> abstraction similar to e.g. x86 'struct numa_meminfo' that serves as
> >>>>> translaton of firmware supplied information into data that can be used
> >>>>> by the generic mm without need to reimplement it for each and every
> >>>>> arch.
> >>>>
> >>>> Right. As I expressed, I am not a friend of using memblock for that, and
> >>>> the pgdat node span is tricky.
> >>>>
> >>>> Maybe abstracting that x86 concept is possible in some way (and we could
> >>>> restrict the information to boot-time properties, so we don't have to
> >>>> mess with memory hot(un)plug - just as done for numa_meminfo AFAIKS).
> >>>
> >>> I agree with pgdat part and disagree about memblock. It already has
> >>> non-init physmap, why won't we add memblock.memory to the mix? ;-)
> >>
> >> Can we generalize and tweak physmap to contain node info? That's all we
> >> need, no? (the special mem= parameter handling should not matter for our
> >> use case, where "physmap" and "memory" would differ)
> >
> > TBH, I have only random vague thoughts at the moment. This might be an
> > option. But then we need to enable physmap on !s390, right?
>
> Yes, looks like it.
>
> >
> >>> Now, seriously, memblock already has all the necessary information about
> >>> the coldplug memory for several architectures. x86 being an exception
> >>> because for some reason the reserved memory is not considered memory
> >>> there. The infrastructure for quiering and iterating memory regions is
> >>> already there. We just need to leave out the irrelevant parts, like
> >>> memblock.reserved and allocation funcions.
> >>
> >> I *really* don't want to mess with memblocks on memory hot(un)plug on
> >> x86 and s390x (+other architectures in the future). I also thought about
> >> stopping to create memblocks for hotplugged memory on arm64, by tweaking
> >> pfn_valid() to query memblocks only for early sections.
> >>
> >> If "physmem" is not an option, can we at least introduce something like
> >> ARCH_UPDTAE_MEMBLOCK_ON_HOTPLUG to avoid doing that on x86 and s390x for
> >> now (and later maybe for others)?
> >
> > I have to do more memory hotplug howework to answer that ;-)
> >
> > My general point is that we don't have to reinvent the wheel to have
> > coldplug memory representation, it's already there. We just need a way
> > to use it properly.
>
> Yes, I tend to agree. Details to be clarified :)

I'm not quite understanding the concern, or requirement about
"updating memblock" in the hotplug path. The routines
memory_add_physaddr_to_nid() and phys_to_target_node() are helpers to
interrogate platform-firmware numa info through a common abstraction.
They place no burden on the memory hotplug code they're just used to
see if a hot-added range lies within an existing node span when
platform-firmware otherwise fails to communicate a node. x86 can
continue to back those helpers with numa_meminfo, arm64 can use a
generic memblock implementation and other archs can follow the arm64
example if they want better numa answers for drivers.

Reply via email to