On 08/01/2020 at 21:51, Prentice Bisbal via users wrote:
>
> On 1/8/20 3:30 PM, Brice Goglin via users wrote:
>> On 08/01/2020 at 21:20, Prentice Bisbal via users wrote:
>>> We just added about a dozen nodes to our cluster, which have AMD EPYC
>>> 7281 processors. When a particular user's jobs fall on one of these
>>> nodes, he gets these error messages:
>>>
>>> --------------------------------------------------------------------------
>>>
>>>
>>> WARNING: a request was made to bind a process. While the system
>>> supports binding the process itself, at least one node does NOT
>>> support binding memory to the process location.
>>>
>>>    Node:  dawson205
>>
>> I wonder if the CentOS 6 kernel properly supports these recent
>> processors. Does lstopo show NUMA nodes as expected?
>>
>> Brice
>>
> lstopo shows different numa nodes, and it appears to be correct, but I
> don't use lstopo that much, so I'm not 100% confident that what it's
> showing is correct. I'm at about 98%.
>

Now, check memory binding in hwloc:

* Does something like "hwloc-bind node:1 -- echo foobar" fail?

* What do these lines return?

hwloc-bind --membind node:1 -- hwloc-bind --get --membind --nodeset

=> should return something like 0x00000002 (bind)

hwloc-bind --membind --get --nodeset

=> should return something like 0x000000ff (firsttouch)
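
To run these checks in one go on a suspect node such as dawson205, something like the script below could be used. It is only a rough sketch: run_check is a hypothetical helper (not part of hwloc) that prints each command, its output, and whether it succeeded. The three hwloc-bind invocations are exactly the ones listed above, and the script assumes hwloc-bind is in PATH and that the node has a NUMA node with index 1.

```shell
#!/bin/sh
# run_check is a hypothetical helper, not part of hwloc.
# It prints the command, lets the command print its own output,
# then reports success or the non-zero exit status.
run_check() {
    label=$1
    shift
    echo "== $label: $*"
    if "$@"; then
        echo "-> OK"
    else
        echo "-> FAILED (exit status $?)"
    fi
}

# Only run the hwloc checks when hwloc-bind is actually installed.
if command -v hwloc-bind >/dev/null 2>&1; then
    # Does binding to node:1 work at all?
    run_check "bind to node:1" hwloc-bind node:1 -- echo foobar
    # Memory binding as seen from inside a process bound to node:1.
    run_check "membind applied" \
        hwloc-bind --membind node:1 -- hwloc-bind --get --membind --nodeset
    # Default memory binding of an unbound process.
    run_check "membind default" hwloc-bind --membind --get --nodeset
else
    echo "hwloc-bind not found in PATH" >&2
fi
```

Sending back the full output, including any "FAILED" lines, would be enough to tell whether the problem is in hwloc or in the kernel.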

By the way, which Open MPI version did you use? If you told Open MPI not to
use its embedded hwloc, which hwloc version are you using?

Brice

