We just added about a dozen nodes with AMD EPYC 7281 processors to our
cluster. When a particular user's jobs land on one of these nodes, he
gets these error messages:
--------------------------------------------------------------------------
WARNING: a request was made to bind a process. While the system
supports binding the process itself, at least one node does NOT
support binding memory to the process location.
Node: dawson205
This usually is due to not having the required NUMA support installed
on the node. In some Linux distributions, the required support is
contained in the libnumactl and libnumactl-devel packages.
This is a warning only; your job will continue, though performance may
be degraded.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
A request was made to bind to that would result in binding more
processes than cpus on a resource:
Bind to: NONE
Node: dawson205
#processes: 2
#cpus: 1
You can override this protection by adding the "overload-allowed"
option to your binding directive.
--------------------------------------------------------------------------
The OS is CentOS 6, and numactl and numactl-devel are installed. Any
idea what the issue is and how to fix it? Is SMT enabled when it
shouldn't be, or something along those lines?
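
In case it helps narrow down the SMT question, here is a minimal sketch of how it could be checked on one of the new nodes. It assumes the standard Linux sysfs topology file for cpu0 is present, and the helper names (count_cpus, smt_enabled) are made up for illustration. It only reports whether cpu0 has more than one hardware thread; it says nothing about why the binding fails.

```python
from pathlib import Path

# Assumed location of the kernel's per-CPU topology info (standard Linux sysfs).
SIBLINGS = Path("/sys/devices/system/cpu/cpu0/topology/thread_siblings_list")


def count_cpus(cpu_list: str) -> int:
    """Count the CPUs named in a sysfs CPU list such as "0,16" or "0-1"."""
    total = 0
    for part in cpu_list.strip().split(","):
        if not part:
            continue
        if "-" in part:
            # A range like "0-3" covers both endpoints.
            low, high = part.split("-")
            total += int(high) - int(low) + 1
        else:
            total += 1
    return total


def smt_enabled(siblings: str) -> bool:
    """True if a core's sibling list names more than one hardware thread."""
    return count_cpus(siblings) > 1


if __name__ == "__main__":
    if SIBLINGS.exists():
        text = SIBLINGS.read_text()
        state = "on" if smt_enabled(text) else "off"
        print(f"cpu0 thread siblings: {text.strip()} -> SMT appears {state}")
    else:
        print("CPU topology information not found in sysfs")
```

Running it on dawson205 and on one of the older nodes would show whether the two differ in their SMT setting.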
--
Prentice