On 08/01/2020 at 21:51, Prentice Bisbal via users wrote:
>
> On 1/8/20 3:30 PM, Brice Goglin via users wrote:
>> On 08/01/2020 at 21:20, Prentice Bisbal via users wrote:
>>> We just added about a dozen nodes to our cluster, which have AMD EPYC
>>> 7281 processors. When a particular user's jobs fall on one of these
>>> nodes, he gets these error messages:
>>>
>>> --------------------------------------------------------------------------
>>>
>>> WARNING: a request was made to bind a process. While the system
>>> supports binding the process itself, at least one node does NOT
>>> support binding memory to the process location.
>>>
>>> Node: dawson205
>>
>> I wonder if the CentOS 6 kernel properly supports these recent
>> processors. Does lstopo show NUMA nodes as expected?
>>
>> Brice
>>
> lstopo shows different NUMA nodes, and it appears to be correct, but I
> don't use lstopo that much, so I'm not 100% confident that what it's
> showing is correct. I'm at about 98%.
>

Now, check memory binding in hwloc:

* Does something like "hwloc-bind node:1 -- echo foobar" fail?
* What do these lines return?
    hwloc-bind --membind node:1 -- hwloc-bind --get --membind --nodeset
      => should return something like 0x00000001 (bind)
    hwloc-bind --membind --get --nodeset
      => should return something like 0x000000ff (firsttouch)

By the way, which OMPI did you use? If you told OMPI not to use its
embedded hwloc, which hwloc do you use?

Brice
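
If it helps, the three checks above could be wrapped in a small script so the output can be pasted back to the list in one go. This is only a sketch: the run_check helper and the NODE variable are made up for illustration, and it assumes hwloc-bind is in PATH and that NUMA node 1 exists on the machine (set NODE to something else if not).

```shell
#!/bin/sh
# Sketch: run the hwloc-bind checks from the message above and label each result.
# NODE and run_check are illustrative names, not part of hwloc.
NODE="${NODE:-node:1}"

# Run a command, capture stdout+stderr, and report whether it succeeded.
run_check() {
    label=$1
    shift
    if out=$("$@" 2>&1); then
        echo "[ok] $label: $out"
    else
        echo "[FAIL] $label: $out"
    fi
}

if command -v hwloc-bind >/dev/null 2>&1; then
    # 1. Does binding to the node work at all?
    run_check "bind" hwloc-bind "$NODE" -- echo foobar
    # 2. Bind memory to the node, then ask what the memory binding is.
    run_check "membind set+get" \
        hwloc-bind --membind "$NODE" -- hwloc-bind --get --membind --nodeset
    # 3. Ask for the default memory binding of an unbound process.
    run_check "membind default" hwloc-bind --membind --get --nodeset
else
    echo "hwloc-bind not found in PATH; skipping checks"
fi
```

A "[FAIL]" line on the second check, with the first one passing, would match the Open MPI warning: process binding works but memory binding does not.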