On Mon, 27 Sep 2021, 12:22 Wei Chen, <wei.c...@arm.com> wrote:

> Hi Julien,
>
> From: Julien Grall <julien.grall....@gmail.com>
> Sent: 2021年9月27日 15:36
> To: Wei Chen <wei.c...@arm.com>
> Cc: Stefano Stabellini <sstabell...@kernel.org>; xen-devel <
> xen-devel@lists.xenproject.org>; Bertrand Marquis <
> bertrand.marq...@arm.com>; Jan Beulich <jbeul...@suse.com>; Roger Pau
> Monné <roger....@citrix.com>; Andrew Cooper <andrew.coop...@citrix.com>
> Subject: Re: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default
> NR_NODE_MEMBLKS
>
>
> On Mon, 27 Sep 2021, 08:53 Wei Chen, <mailto:wei.c...@arm.com> wrote:
> Hi Julien,
>
> > -----Original Message-----
> > From: Xen-devel <mailto:xen-devel-boun...@lists.xenproject.org> On
> Behalf Of Wei
> > Chen
> > Sent: 2021年9月27日 14:46
> > To: Stefano Stabellini <mailto:sstabell...@kernel.org>
> > Cc: mailto:xen-devel@lists.xenproject.org; mailto:jul...@xen.org;
> Bertrand Marquis
> > <mailto:bertrand.marq...@arm.com>; mailto:jbeul...@suse.com; mailto:
> roger....@citrix.com;
> > mailto:andrew.coop...@citrix.com
> > Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override default
> > NR_NODE_MEMBLKS
> >
> > Hi Stefano, Julien,
> >
> > > -----Original Message-----
> > > From: Stefano Stabellini <mailto:sstabell...@kernel.org>
> > > Sent: 2021年9月27日 13:00
> > > To: Wei Chen <mailto:wei.c...@arm.com>
> > > Cc: Stefano Stabellini <mailto:sstabell...@kernel.org>; xen-
> > > mailto:de...@lists.xenproject.org; mailto:jul...@xen.org; Bertrand
> Marquis
> > > <mailto:bertrand.marq...@arm.com>; mailto:jbeul...@suse.com; mailto:
> roger....@citrix.com;
> > > mailto:andrew.coop...@citrix.com
> > > Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override
> default
> > > NR_NODE_MEMBLKS
> > >
> > > +x86 maintainers
> > >
> > > On Mon, 27 Sep 2021, Wei Chen wrote:
> > > > > -----Original Message-----
> > > > > From: Stefano Stabellini <mailto:sstabell...@kernel.org>
> > > > > Sent: 2021年9月27日 11:26
> > > > > To: Wei Chen <mailto:wei.c...@arm.com>
> > > > > Cc: Stefano Stabellini <mailto:sstabell...@kernel.org>; xen-
> > > > > mailto:de...@lists.xenproject.org; mailto:jul...@xen.org;
> Bertrand Marquis
> > > > > <mailto:bertrand.marq...@arm.com>
> > > > > Subject: RE: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to override
> > > default
> > > > > NR_NODE_MEMBLKS
> > > > >
> > > > > On Sun, 26 Sep 2021, Wei Chen wrote:
> > > > > > > -----Original Message-----
> > > > > > > From: Stefano Stabellini <mailto:sstabell...@kernel.org>
> > > > > > > Sent: 2021年9月24日 9:35
> > > > > > > To: Wei Chen <mailto:wei.c...@arm.com>
> > > > > > > Cc: mailto:xen-devel@lists.xenproject.org; mailto:
> sstabell...@kernel.org;
> > > > > mailto:jul...@xen.org;
> > > > > > > Bertrand Marquis <mailto:bertrand.marq...@arm.com>
> > > > > > > Subject: Re: [PATCH 22/37] xen/arm: use NR_MEM_BANKS to
> override
> > > > > default
> > > > > > > NR_NODE_MEMBLKS
> > > > > > >
> > > > > > > On Thu, 23 Sep 2021, Wei Chen wrote:
> > > > > > > > As a memory range described in device tree cannot be split
> > > across
> > > > > > > > multiple nodes. So we define NR_NODE_MEMBLKS as NR_MEM_BANKS
> > in
> > > > > > > > arch header.
> > > > > > >
> > > > > > > This statement is true but what is the goal of this patch? Is
> it
> > > to
> > > > > > > reduce code size and memory consumption?
> > > > > > >
> > > > > >
> > > > > > No, when Julien and I discussed this in last version[1], we
> hadn't
> > > > > thought
> > > > > > so deeply. We just thought a memory range described in DT cannot
> > be
> > > > > split
> > > > > > across multiple nodes. So NR_MEM_BANKS should be equal to
> > > NR_MEM_BANKS.
> > > > > >
> > > > > > https://lists.xenproject.org/archives/html/xen-devel/2021-
> > > > > 08/msg00974.html
> > > > > >
> > > > > > > I am asking because NR_MEM_BANKS is 128 and
> > > > > > > NR_NODE_MEMBLKS=2*MAX_NUMNODES which is 64 by default so again
> > > > > > > NR_NODE_MEMBLKS is 128 before this patch.
> > > > > > >
> > > > > > > In other words, this patch alone doesn't make any difference;
> at
> > > least
> > > > > > > doesn't make any difference unless CONFIG_NR_NUMA_NODES is
> > > increased.
> > > > > > >
> > > > > > > So, is the goal to reduce memory usage when
> CONFIG_NR_NUMA_NODES
> > > is
> > > > > > > higher than 64?
> > > > > > >
> > > > > >
> > > > > > I also thought about this problem when I was writing this patch.
> > > > > > CONFIG_NR_NUMA_NODES is increasing, but NR_MEM_BANKS is a fixed
> > > > > > value, then NR_MEM_BANKS can be smaller than CONFIG_NR_NUMA_NODES
> > > > > > at one point.
> > > > > >
> > > > > > But I agree with Julien's suggestion, NR_MEM_BANKS and
> > > NR_NODE_MEMBLKS
> > > > > > must be aware of each other. I had thought to add some ASSERT
> > check,
> > > > > > but I don't know how to do it better. So I post this patch for
> > more
> > > > > > suggestion.
> > > > >
> > > > > OK. In that case I'd say to get rid of the previous definition of
> > > > > NR_NODE_MEMBLKS as it is probably not necessary, see below.
> > > > >
> > > > >
> > > > >
> > > > > > >
> > > > > > > > And keep default NR_NODE_MEMBLKS in common header
> > > > > > > > for those architectures NUMA is disabled.
> > > > > > >
> > > > > > > This last sentence is not accurate: on x86 NUMA is enabled and
> > > > > > > NR_NODE_MEMBLKS is still defined in xen/include/xen/numa.h
> > (there
> > > is
> > > > > no
> > > > > > > x86 definition of it)
> > > > > > >
> > > > > >
> > > > > > Yes.
> > > > > >
> > > > > > >
> > > > > > > > Signed-off-by: Wei Chen <mailto:wei.c...@arm.com>
> > > > > > > > ---
> > > > > > > >  xen/include/asm-arm/numa.h | 8 +++++++-
> > > > > > > >  xen/include/xen/numa.h     | 2 ++
> > > > > > > >  2 files changed, 9 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-
> > > arm/numa.h
> > > > > > > > index 8f1c67e3eb..21569e634b 100644
> > > > > > > > --- a/xen/include/asm-arm/numa.h
> > > > > > > > +++ b/xen/include/asm-arm/numa.h
> > > > > > > > @@ -3,9 +3,15 @@
> > > > > > > >
> > > > > > > >  #include <xen/mm.h>
> > > > > > > >
> > > > > > > > +#include <asm/setup.h>
> > > > > > > > +
> > > > > > > >  typedef u8 nodeid_t;
> > > > > > > >
> > > > > > > > -#ifndef CONFIG_NUMA
> > > > > > > > +#ifdef CONFIG_NUMA
> > > > > > > > +
> > > > > > > > +#define NR_NODE_MEMBLKS NR_MEM_BANKS
> > > > > > > > +
> > > > > > > > +#else
> > > > > > > >
> > > > > > > >  /* Fake one node for now. See also node_online_map. */
> > > > > > > >  #define cpu_to_node(cpu) 0
> > > > > > > > diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> > > > > > > > index 1978e2be1b..1731e1cc6b 100644
> > > > > > > > --- a/xen/include/xen/numa.h
> > > > > > > > +++ b/xen/include/xen/numa.h
> > > > > > > > @@ -12,7 +12,9 @@
> > > > > > > >  #define MAX_NUMNODES    1
> > > > > > > >  #endif
> > > > > > > >
> > > > > > > > +#ifndef NR_NODE_MEMBLKS
> > > > > > > >  #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
> > > > > > > > +#endif
> > > > >
> > > > > This one we can remove it completely right?
> > > >
> > > > How about define NR_MEM_BANKS to:
> > > > #ifdef CONFIG_NR_NUMA_NODES
> > > > #define NR_MEM_BANKS (CONFIG_NR_NUMA_NODES * 2)
> > > > #else
> > > > #define NR_MEM_BANKS 128
> > > > #endif
> > > > for both x86 and Arm. For those architectures do not support or
> enable
> > > > NUMA, they can still use "NR_MEM_BANKS 128". And replace all
> > > NR_NODE_MEMBLKS
> > > > in NUMA code to NR_MEM_BANKS to remove NR_NODE_MEMBLKS completely.
> > > > In this case, NR_MEM_BANKS can be aware of the changes of
> > > CONFIG_NR_NUMA_NODES.
> > >
> > > x86 doesn't have NR_MEM_BANKS as far as I can tell. I guess you also
> > > meant to rename NR_NODE_MEMBLKS to NR_MEM_BANKS?
> > >
> >
> > Yes.
> >
> > > But NR_MEM_BANKS is not directly related to CONFIG_NR_NUMA_NODES
> because
> > > there can be many memory banks for each numa node, certainly more than
> > > 2. The existing definition on x86:
> > >
> > > #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
> > >
> > > Doesn't make a lot of sense to me. Was it just an arbitrary limit for
> > > the lack of a better way to set a maximum?
> > >
> >
> > At that time, this was probably the most cost-effective approach.
> > Enough and easy. But, if more nodes need to be supported in the
> > future, it may bring more memory blocks. And this maximum value
> > might not apply. The maximum may need to support dynamic extension.
> >
> > >
> > > On the other hand, NR_MEM_BANKS and NR_NODE_MEMBLKS seem to be related.
> > > In fact, what's the difference?
> > >
> > > NR_MEM_BANKS is the max number of memory banks (with or without
> > > numa-node-id).
> > >
> > > NR_NODE_MEMBLKS is the max number of memory banks with NUMA support
> > > (with numa-node-id)?
> > >
> > > They are basically the same thing. On ARM I would just do:
> > >
> >
> > Probably not, NR_MEM_BANKS will count those memory ranges without
> > numa-node-id in boot memory parsing stage (process_memory_node or
> > EFI parser). But NR_NODE_MEMBLKS will only count those memory ranges
> > with numa-node-id.
> >
> > > #define NR_NODE_MEMBLKS MAX(NR_MEM_BANKS, (CONFIG_NR_NUMA_NODES * 2))
> > >
> > >
>
> > Quote Julien's comment from HTML email to here:
> > " As you wrote above, the second part of the MAX is totally arbitrary.
> > In fact, it is very likely than if you have more than 64 nodes, you may
> > need a lot more than 2 regions per node.
> >
> > So, for Arm, I would just define NR_NODE_MEMBLKS as an alias to
> NR_MEM_BANKS
> > so it can be used by common code.
> > "
> >
> > > But here comes the problem:
> > > How can we set the NR_MEM_BANKS maximum value, 128 seems an arbitrary
> too?
> >
> > This is based on hardware we currently support (the last time we bumped
> the value was, IIRC, for Thunder-X). In the case of booting UEFI, we can
> get a lot of small ranges as we discover the RAM using the UEFI memory map.
> >
>
> Thanks for the background.
>
> >
> > > If #define NR_MEM_BANKS (CONFIG_NR_NUMA_NODES * N)? And what N should
> be.
> >
> > N would have to be the maximum number of ranges you can find in a NUMA
> node.
> >
> > We would also need to make sure this doesn't break existing platforms.
> So N would have to be quite large or we need a MAX as Stefano suggested.
> >
> > But I would prefer to keep the existing 128 and allow to configure it at
> build time (not necessarily in this series). This avoid to have different
> way to define the value based NUMA vs non-NUMA.
>
> In this case, can we use Stefano's
> "#define NR_NODE_MEMBLKS MAX(NR_MEM_BANKS, (CONFIG_NR_NUMA_NODES * 2))"
> in next version. If yes, should we change x86 part? Because NR_MEM_BANKS
> has not been defined in x86.


What I meant by configuring dynamically is allowing NR_MEM_BANKS to be set
by the user.

The second part of the MAX makes no sense to me (at least on Arm). So I
really prefer if this is not part of the initial version.

We can refine the value, or introduce the MAX in the future if we have a
justification for it.


> > > And maybe the definition could be common with x86 if we define
> > > NR_MEM_BANKS to 128 on x86 too.
> >
> > Julien had comment here, I will continue in that email.
>

Reply via email to