This is on a Solaris 11 system with hwloc 1.6.1:

% lstopo-no-graphics
Machine (4095MB) + NUMANode L#0 (P#0 4095MB) + Socket L#0
  Core L#0 + PU L#0 (P#0)
  Core L#1 + PU L#1 (P#1)
  Core L#2 + PU L#2 (P#2)
  Core L#3 + PU L#3 (P#3)
% hwloc-bind socket:0.pu:1 hwloc-bind --get
0x0000000f

I assume that output is wrong.  I bind to a single PU (P#1), so I would expect 
0x00000002, but the returned mask shows binding to all four PUs.

To confirm that binding is indeed happening and that it's the reporting that's 
incorrect:

% hwloc-bind socket:0.pu:0 pbind -q
process id 1773: 0
% hwloc-bind socket:0.pu:1 pbind -q
process id 1774: 1
% hwloc-bind socket:0.pu:2 pbind -q
process id 1775: 2
% hwloc-bind socket:0.pu:3 pbind -q
process id 1776: 3

It seems to me the problem is in topology-solaris.c. In hwloc_solaris_set_sth_cpubind(), we can bind to a single PU with processor_bind(), which is what's happening in our case. Then, in hwloc_solaris_get_sth_cpubind(), we check for lgroup affinity but not for any processor_bind() binding. So we never see the processor_bind() binding, assume we're not bound, and report the full set.

How about adding a check upon entry to hwloc_solaris_get_sth_cpubind(): if processor_bind() shows binding, report this and be done. If not, then continue on with the lgroup logic that's already in that function. Yes?
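
Here is a rough sketch of the check I have in mind, not tested against hwloc itself. It queries the current binding with processor_bind(..., PBIND_QUERY, &bound) and treats PBIND_NONE as "not bound". The helper name query_processor_binding() and the plain unsigned long mask are made up for illustration (the real function would fill in hwloc's bitmap instead), and the #else branch is only a stand-in so the sketch compiles off Solaris.

```c
#include <assert.h>
#include <stddef.h>

#ifdef __sun
#include <sys/types.h>
#include <sys/processor.h>
#include <sys/procset.h>
#else
/* Stand-ins that mimic the Solaris declarations so this compiles elsewhere.
   They are NOT the real implementation, just enough to exercise the logic. */
typedef int processorid_t;
typedef int idtype_t;
typedef int id_t;
#define P_PID       0
#define P_MYID      (-1)
#define PBIND_NONE  (-1)
#define PBIND_QUERY (-2)
static processorid_t fake_binding = PBIND_NONE;
static int processor_bind(idtype_t idtype, id_t id,
                          processorid_t cpu, processorid_t *obind)
{
    (void)idtype; (void)id;
    if (obind)
        *obind = fake_binding;      /* report the previous binding */
    if (cpu != PBIND_QUERY)
        fake_binding = cpu;         /* anything but a query changes it */
    return 0;
}
#endif

/* Hypothetical helper for the top of hwloc_solaris_get_sth_cpubind().
 * Returns  1 and sets *mask if processor_bind() reports a binding,
 *          0 if there is none (caller continues with the lgroup logic),
 *         -1 on error or if the CPU id does not fit in the mask. */
static int query_processor_binding(idtype_t idtype, id_t id,
                                   unsigned long *mask)
{
    processorid_t bound;

    assert(mask != NULL);

    if (processor_bind(idtype, id, PBIND_QUERY, &bound) != 0)
        return -1;
    if (bound == PBIND_NONE)
        return 0;
    if (bound < 0 || (size_t)bound >= sizeof(*mask) * 8)
        return -1;

    *mask = 1UL << bound;           /* exactly one PU is set */
    return 1;
}
```

If this returns 1, hwloc_solaris_get_sth_cpubind() could report that single PU and return right away; on 0 it would fall through to the lgroup code that is already there.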
