On Thu, Oct 29, 2020 at 04:31:33AM +0000, Honnappa Nagarahalli wrote:
> <snip>
> 
> > > >
> > > > On Wed, Oct 21, 2020 at 01:37:38PM +0200, Juraj Linkeš wrote:
> > > > > The build machine's number of cpus and numa nodes vary, resulting
> > > > > in mismatched counts of RTE_MAX_LCORE and RTE_MAX_NUMA_NODES for
> > > > > many builds. Automatically discover the host's numa and cpu counts
> > > > > to remove this mismatch for native builds. Use current defaults for
> > > > > default builds. Force the users to specify the counts for cross
> > > > > build in cross files or on the command line. Give users the option
> > > > > to override the discovery or values from cross files by specifying
> > > > > them on the command line with -Dmax_lcores and -Dmax_numa_nodes.
> > > > >
> > > > > Signed-off-by: Juraj Linkeš <juraj.lin...@pantheon.tech>
> > > > > ---
> > > > >  buildtools/get_cpu_count.py  |  7 ++++++
> > > > > buildtools/get_numa_count.py | 22 +++++++++++++++++++
> > > > >  buildtools/meson.build       |  2 ++
> > > > >  config/meson.build           | 42 ++++++++++++++++++++++++++++++++++--
> > > > >  meson_options.txt            |  8 +++----
> > > > >  5 files changed, 75 insertions(+), 6 deletions(-)
> > > > >  create mode 100644 buildtools/get_cpu_count.py
> > > > >  create mode 100644 buildtools/get_numa_count.py
> > > > >
> > > > > diff --git a/buildtools/get_cpu_count.py b/buildtools/get_cpu_count.py
> > > > > new file mode 100644
> > > > > index 000000000..386f85f8b
> > > > > --- /dev/null
> > > > > +++ b/buildtools/get_cpu_count.py
> > > > > @@ -0,0 +1,7 @@
> > > > > +#!/usr/bin/python3
> > > > > +# SPDX-License-Identifier: BSD-3-Clause
> > > > > +# Copyright (c) 2020 PANTHEON.tech s.r.o.
> > > > > +
> > > > > +import os
> > > > > +
> > > > > +print(os.cpu_count())
> > > > > diff --git a/buildtools/get_numa_count.py b/buildtools/get_numa_count.py
> > > > > new file mode 100644
> > > > > index 000000000..f0c49973a
> > > > > --- /dev/null
> > > > > +++ b/buildtools/get_numa_count.py
> > > > > @@ -0,0 +1,22 @@
> > > > > +#!/usr/bin/python3
> > > > > +# SPDX-License-Identifier: BSD-3-Clause
> > > > > +# Copyright (c) 2020 PANTHEON.tech s.r.o.
> > > > > +
> > > > > +import ctypes
> > > > > +import glob
> > > > > +import os
> > > > > +import subprocess
> > > > > +
> > > > > +if os.name == 'posix':
> > > > > +    if os.path.isdir('/sys/devices/system/node'):
> > > > > +        print(len(glob.glob('/sys/devices/system/node/node*')))
> > > > > +    else:
> > > > > +        print(subprocess.run(['sysctl', 'vm.ndomains'], capture_output=True).stdout)
> > > > > +
> > > > > +elif os.name == 'nt':
> > > > > +    libkernel32 = ctypes.windll.kernel32
> > > > > +
> > > > > +    count = ctypes.c_ulong()
> > > > > +
> > > > > +    libkernel32.GetNumaHighestNodeNumber(ctypes.pointer(count))
> > > > > +    print(count.value + 1)
> > > > > diff --git a/buildtools/meson.build b/buildtools/meson.build
> > > > > index 04808dabc..925e733b1 100644
> > > > > --- a/buildtools/meson.build
> > > > > +++ b/buildtools/meson.build
> > > > > @@ -17,3 +17,5 @@ else
> > > > >  endif
> > > > >  map_to_win_cmd = py3 + files('map_to_win.py')
> > > > >  sphinx_wrapper = py3 + files('call-sphinx-build.py')
> > > > > +get_cpu_count_cmd = py3 + files('get_cpu_count.py')
> > > > > +get_numa_count_cmd = py3 + files('get_numa_count.py')
> > > > > diff --git a/config/meson.build b/config/meson.build
> > > > > index a57c8ae9e..c4477f977 100644
> > > > > --- a/config/meson.build
> > > > > +++ b/config/meson.build
> > > > > @@ -74,7 +74,11 @@ endif
> > > > >  # still being able to support the CPU features required for DPDK.
> > > > >  # This can be bumped up by the DPDK project, but it can never be an
> > > > >  # invariant like 'native'
> > > > > +max_lcores = get_option('max_lcores')
> > > > > +max_numa_nodes = get_option('max_numa_nodes')
> > > > >  if machine == 'default'
> > > > > +     max_numa_nodes = 4
> > > > > +     max_lcores = 128
> > > >
> > > > This doesn't seem right, since you are overriding the user-specified
> > > > values with hard-coded ones.
> > > >
> > >
> > > I understand we're using the default/generic build to build portable DPDK
> > > distro packages, meaning the settings for those packages should always be
> > > the same, no? If not, what should the default/generic build be? And when
> > > would someone do a default/generic build with their own values? It wouldn't
> > > be a default/generic build anymore, right?
> > >
> > > > >       if host_machine.cpu_family().startswith('x86')
> > > > >               # matches the old pre-meson build systems default
> > > > >               machine = 'corei7'
> > > > > @@ -83,6 +87,22 @@ if machine == 'default'
> > > > >       elif host_machine.cpu_family().startswith('ppc')
> > > > >               machine = 'power8'
> > > > >       endif
> > > > > +elif not meson.is_cross_build()
> > > > > +     # find host core count and numa node count for native builds
> > > > > +     if max_lcores == 0
> > > > > +             max_lcores = run_command(get_cpu_count_cmd).stdout().to_int()
> > > > > +             min_lcores = 2
> > > > > +             if max_lcores < min_lcores
> > > > > +                     message('Found less than @0@ cores, building for @0@ cores'.format(min_lcores))
> > > > > +                     max_lcores = min_lcores
> > > > > +             else
> > > > > +                     message('Found @0@ cores'.format(max_lcores))
> > > > > +             endif
> > > > > +     endif
> > > > > +     if max_numa_nodes == 0
> > > > > +             max_numa_nodes = run_command(get_numa_count_cmd).stdout().to_int()
> > > > > +             message('Found @0@ numa nodes'.format(max_numa_nodes))
> > > > > +     endif
> > > > >  endif
> > > > >
> > > > >  dpdk_conf.set('RTE_MACHINE', machine)
> > > > > @@ -227,8 +247,10 @@ foreach arg: warning_flags
> > > > >  endforeach
> > > > >
> > > > >  # set other values pulled from the build options
> > > > > -dpdk_conf.set('RTE_MAX_LCORE', get_option('max_lcores'))
> > > > > -dpdk_conf.set('RTE_MAX_NUMA_NODES', get_option('max_numa_nodes'))
> > > > > +if not meson.is_cross_build()
> > > > > +     dpdk_conf.set('RTE_MAX_LCORE', max_lcores)
> > > > > +     dpdk_conf.set('RTE_MAX_NUMA_NODES', max_numa_nodes)
> > > > > +endif
> > > >
> > > > Rather than conditionally setting the value here, you should move
> > > > the checks below up above this to simplify things.
> > > >
> > >
> > > Do you mean the cross build checks? Those have to be after
> > subdir(arch_subdir) so that we can override the values from cross files (as
> > the commit message says).
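> > >
> > > To illustrate the ordering (a rough sketch only; the cross-file values
> > > below are made up, the mechanism is what matters):
> > >
> > >     # config/arm/meson.build (the arch_subdir), filled in from the cross file
> > >     dpdk_conf.set('RTE_MAX_LCORE', 16)
> > >     dpdk_conf.set('RTE_MAX_NUMA_NODES', 1)
> > >
> > >     # config/meson.build, after subdir(arch_subdir)
> > >     if meson.is_cross_build()
> > >             if max_lcores > 0
> > >                     # -Dmax_lcores on the cmdline overrides the arch_subdir value
> > >                     dpdk_conf.set('RTE_MAX_LCORE', max_lcores)
> > >             elif not dpdk_conf.has('RTE_MAX_LCORE')
> > >                     error('core count for the cross build not specified')
> > >             endif
> > >     endif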
> > >
> > > > >  dpdk_conf.set('RTE_MAX_ETHPORTS', get_option('max_ethports'))
> > > > >  dpdk_conf.set('RTE_LIBEAL_USE_HPET', get_option('use_hpet'))
> > > > >  dpdk_conf.set('RTE_ENABLE_TRACE_FP', get_option('enable_trace_fp'))
> > > > > @@ -247,6 +269,22 @@ compile_time_cpuflags = []
> > > > >  subdir(arch_subdir)
> > > > >  dpdk_conf.set('RTE_COMPILE_TIME_CPUFLAGS', ','.join(compile_time_cpuflags))
> > > > >
> > > > > +# check that cpu and numa count is set in cross builds
> > > > > +if meson.is_cross_build()
> > > > > +     if max_lcores > 0
> > > > > +             # specified on the cmdline
> > > > > +             dpdk_conf.set('RTE_MAX_LCORE', max_lcores)
> > > > > +     elif not dpdk_conf.has('RTE_MAX_LCORE')
> > > > > +             error('Number of cores for cross build not specified in @0@ subdir (e.g. in a cross-file) nor on the cmdline'.format(arch_subdir))
> > > > > +     endif
> > > > > +     if max_numa_nodes > 0
> > > > > +             # specified on the cmdline
> > > > > +             dpdk_conf.set('RTE_MAX_NUMA_NODES', max_numa_nodes)
> > > > > +     elif not dpdk_conf.has('RTE_MAX_NUMA_NODES')
> > > > > +             error('Number of numa nodes for cross build not specified in @0@ subdir (e.g. in a cross-file) nor on the cmdline'.format(arch_subdir))
> > > > > +     endif
> > > > > +endif
> > > > > +
> > > > >  # set the install path for the drivers
> > > > > dpdk_conf.set_quoted('RTE_EAL_PMD_PATH', eal_pmd_path)
> > > > >
> > > > > diff --git a/meson_options.txt b/meson_options.txt
> > > > > index 9bf18ab6b..01b0c45c3 100644
> > > > > --- a/meson_options.txt
> > > > > +++ b/meson_options.txt
> > > > > @@ -26,10 +26,10 @@ option('machine', type: 'string', value: 'native',
> > > > >       description: 'set the target machine type')
> > > > >  option('max_ethports', type: 'integer', value: 32,
> > > > >       description: 'maximum number of Ethernet devices')
> > > > > -option('max_lcores', type: 'integer', value: 128,
> > > > > -     description: 'maximum number of cores/threads supported by EAL')
> > > > > -option('max_numa_nodes', type: 'integer', value: 4,
> > > > > -     description: 'maximum number of NUMA nodes supported by EAL')
> > > > > +option('max_lcores', type: 'integer', value: 0,
> > > > > +     description: 'maximum number of cores/threads supported by EAL. Value 0 means the number of cpus on the host will be used. For cross build, set to non-zero to overwrite the cross-file value.')
> > > > > +option('max_numa_nodes', type: 'integer', value: 0,
> > > > > +     description: 'maximum number of NUMA nodes supported by EAL. Value 0 means the number of numa nodes on the host will be used. For cross build, set to non-zero to overwrite the cross-file value.')
> > > >
> > > > I don't like this change, because it very much assumes for
> > > > non-cross-compiles that people will be running DPDK on the system
> > > > they build it on. That's a very, very big assumption!
> > >
> > > I'll be using definitions from https://mesonbuild.com/Cross-
> > compilation.html.
> > > I understand cross compilation to be building for a different host machine
> > > than the build machine (which is aligned with pretty much every definition I
> > > found). I understand this to be true not only for builds between
> > > architectures, but also within an architecture (e.g. an x86_64 build machine
> > > building for an x86_64 host machine).
> > > So yes, when someone does a native build, it stands to reason they want to
> > use it on the build machine. If they wanted to use it elsewhere, they would
> > cross compile.
> > > Another thing is that the current build philosophy is to detect as much as
> > > possible (not having statically defined configuration, as you mentioned in
> > > the past). Detecting the number of cores and numa nodes fits this perfectly.
> > > And yet another thing is that the assumption seems to be already present
> > > in the build system - it already detects a lot of things, some of which may
> > > not be satisfied on machines other than the build machine. I may be wrong
> > > about this.
> > >
> > > > I'm ok with having zero as a "detect" option, and having the values
> > > > overridden from cross-files, but not with detection as the default
> > > > out-of-the-box option! Lots of users may pull builds from a CI
> > > > based on VMs with just a few cores, for instance.
> > >
> > > If not having the automatic detection is a concern because of users using
> > > CI builds, then we (if it's from our CI) can change what we're building in
> > > CI - the default/generic build seems like a good fit because it's supposed
> > > to work on a variety of systems. Expecting that a native build from random
> > > VMs would work anywhere doesn't seem very realistic - it's been built for
> > > that VM environment (because it's a native build).
> > >
> > > Here's my understanding on which the current version is based:
> > > 1. Since we want to get away from having statically defined configuration,
> > > numa and core count discovery is exactly what we should have in the build
> > > system. Since discovery is currently the default for lib/drivers, it stands
> > > to reason it should be the default for everything else, if possible.
> > > 2. Native build should produce binaries matching the build machine as well
> > > as possible.
> > > 3. Default/generic build should produce binaries executable on a range of
> > > systems (ideally all systems of a given architecture).
> > > 4. Other builds, that is non-native builds, are cross-compilation, since
> > > we're building for a host machine other than the build machine.
> > >
> > > What I haven't taken into account is users using CI builds. That could be
> > > remedied by modifying the CI a bit while being consistent with what
> > > native/default/generic/cross builds are (or should be). And in any case, if
> > > we're not interested in testing the exact CI environment (which we aren't,
> > > since we don't want to use 2 cores with 1 numa node), we really shouldn't
> > > be doing native builds there.
> > >
> > > I'm interested in hearing where my thinking deviates from yours.
> > >
> > 
> > There are a number of points in which we differ, I think.
> > 
> > Firstly, the use of "native" and "default/generic" for the "machine"
> > parameter refers only to the instruction-set level from the compiler, and
> > should not affect any other settings, since all settings are independent.
> > Therefore, setting "machine" to "native" does not mean that we should
> > detect cores and numa nodes, and similarly setting it to "default" does not
> > mean that we should ignore the settings for these values and pick our own
> > chosen default values.
> Apologies for going back to an older discussion.
> I am trying to understand the definitions/expectations for 'native' and
> 'generic' builds.
> As you say, the instruction-set level is definitely one parameter.

Part of the confusion arises from the fact that, originally, the ISA level
was the only thing this parameter set - and on x86 it still is. Perhaps this parameter
needs to be renamed to "isa-level" or "architecture-flag" or similar to
reflect its meaning. This would then allow a new "machine" setting, which
can be considered separately. The question then is how much that helps with
the main issue under discussion, that of cores and numa node values.
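
A rough sketch of what that rename could look like (the option names below
are only placeholders, nothing is decided):

    # meson_options.txt
    option('isa_level', type: 'string', value: 'native',
            description: 'instruction-set level to build for')
    option('machine', type: 'string', value: 'auto',
            description: 'machine profile, considered separately from the ISA level')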

> But I think other DPDK-specific parameters should also be considered.
> For example, RTE_MAX_LCORE should have a particular value for the 'generic'
> build in all the supported architectures. The value could be different for
> each architecture, but it is fixed for the 'generic' build for a given
> architecture. Otherwise, the 'generic' build might not run on all the
> machines of that architecture.
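>
> As a sketch of what I mean (the numbers below are only examples, not
> proposals):
>
>     # config/arm/meson.build
>     if machine == 'generic'   # or 'default', whichever name we settle on
>             dpdk_conf.set('RTE_MAX_LCORE', 256)
>             dpdk_conf.set('RTE_MAX_NUMA_NODES', 8)
>     endif
>     # config/x86/meson.build could fix different values, but they would
>     # still be the same for every 'generic' x86 build.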
> 
> Similarly, for the 'native' build, is there any reason not to include other
> DPDK parameters as part of the definition? IMO, 'native' should refer to the
> entire build machine, not just the ISA, i.e. a build on the target machine.
> 

While I understand the idea here, it is somewhat complicated by the fact
that the meson options given in "meson_options.txt" cannot be set from meson
code, which means that when the machine flag is set to "native" we can
only use or ignore the user-provided lcores and numa nodes settings - we
have no way to change them and reflect those changes back to the user. :-(
This leads to the situation discussed in this thread, where we start
needing all sorts of magic values to indicate the use of machine-type
defaults or detected defaults.
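
To make that constraint concrete, this is roughly the only shape available
to us (a sketch of the pattern, not the final code): the option keeps a
sentinel default and the meson code interprets it, since we cannot write a
detected value back into the option itself.

    # meson_options.txt - 0 acts as the "magic" value
    option('max_lcores', type: 'integer', value: 0,
            description: '0 means use a machine-type default or a detected count')

    # config/meson.build
    max_lcores = get_option('max_lcores')
    if max_lcores == 0
            max_lcores = run_command(get_cpu_count_cmd).stdout().to_int()
    endif
    dpdk_conf.set('RTE_MAX_LCORE', max_lcores)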

Regards,
/Bruce
