Brice Goglin wrote:

Brock Palen wrote:
Has anyone done work with hwloc on ScaleMP systems? They provide
their own tool, numabind, but we are looking for a more generic
solution to process placement and control that works well inside our
MPI library (Open MPI in most cases).

Any input on this would be great!

Hello Brock,

From what I remember, ScaleMP runs a hypervisor on each node that
virtually merges all of them into one big shared-memory machine, and a
vanilla Linux kernel runs on top of it. So hwloc should just see
regular cores and NUMA node information, assuming the virtual "merged"
hardware reports all the necessary information to the OS.
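
For reference, here is a minimal sketch (mine, not from the thread, written
against the hwloc 1.1+ C API) of how a generic consumer such as Open MPI
would enumerate whatever NUMA nodes and cores the virtualized OS reports:

/* List the NUMA nodes hwloc discovers, as any generic tool would. */
#include <hwloc.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    hwloc_topology_t topology;
    int i, nnodes;

    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);   /* reads whatever the (virtualized) OS exposes */

    nnodes = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_NODE);
    for (i = 0; i < nnodes; i++) {
        hwloc_obj_t node = hwloc_get_obj_by_type(topology, HWLOC_OBJ_NODE, i);
        char *cpus;
        hwloc_bitmap_asprintf(&cpus, node->cpuset);
        printf("NUMA node %u: cpuset %s\n", node->os_index, cpus);
        free(cpus);
    }

    hwloc_topology_destroy(topology);
    return 0;
}

If the hypervisor forwards the hardware details correctly, this should list
the same eight 4-core NUMA nodes that lstopo shows below.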


Running lstopo 0.9.3, it appears that hwloc does see the extra layer of complexity:

[brockp@nyx0809 INTEL]$ lstopo -
System(79GB)
  Misc0
    Node#0(10GB) + Socket#1 + L3(8192KB)
      L2(256KB) + L1(32KB) + Core#0 + P#0
      L2(256KB) + L1(32KB) + Core#1 + P#1
      L2(256KB) + L1(32KB) + Core#2 + P#2
      L2(256KB) + L1(32KB) + Core#3 + P#3
    Node#1(10GB) + Socket#0 + L3(8192KB)
      L2(256KB) + L1(32KB) + Core#0 + P#4
      L2(256KB) + L1(32KB) + Core#1 + P#5
      L2(256KB) + L1(32KB) + Core#2 + P#6
      L2(256KB) + L1(32KB) + Core#3 + P#7
  Misc0
    Node#2(10GB) + Socket#3 + L3(8192KB)
      L2(256KB) + L1(32KB) + Core#0 + P#8
      L2(256KB) + L1(32KB) + Core#1 + P#9
      L2(256KB) + L1(32KB) + Core#2 + P#10
      L2(256KB) + L1(32KB) + Core#3 + P#11
    Node#3(10GB) + Socket#2 + L3(8192KB)
      L2(256KB) + L1(32KB) + Core#0 + P#12
      L2(256KB) + L1(32KB) + Core#1 + P#13
      L2(256KB) + L1(32KB) + Core#2 + P#14
      L2(256KB) + L1(32KB) + Core#3 + P#15
  Misc0
    Node#4(10GB) + Socket#5 + L3(8192KB)
      L2(256KB) + L1(32KB) + Core#0 + P#16
      L2(256KB) + L1(32KB) + Core#1 + P#17
      L2(256KB) + L1(32KB) + Core#2 + P#18
      L2(256KB) + L1(32KB) + Core#3 + P#19
    Node#5(10GB) + Socket#4 + L3(8192KB)
      L2(256KB) + L1(32KB) + Core#0 + P#20
      L2(256KB) + L1(32KB) + Core#1 + P#21
      L2(256KB) + L1(32KB) + Core#2 + P#22
      L2(256KB) + L1(32KB) + Core#3 + P#23
  Misc0
    Node#6(10GB) + Socket#7 + L3(8192KB)
      L2(256KB) + L1(32KB) + Core#0 + P#24
      L2(256KB) + L1(32KB) + Core#1 + P#25
      L2(256KB) + L1(32KB) + Core#2 + P#26
      L2(256KB) + L1(32KB) + Core#3 + P#27
    Node#7(10GB) + Socket#6 + L3(8192KB)
      L2(256KB) + L1(32KB) + Core#0 + P#28
      L2(256KB) + L1(32KB) + Core#1 + P#29
      L2(256KB) + L1(32KB) + Core#2 + P#30
      L2(256KB) + L1(32KB) + Core#3 + P#31

I don't know why they are all labeled Misc0, but hwloc does see the extra layer.
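
That Misc level is still usable for placement: each Misc object carries the
cpuset of one board, so a generic launcher can bind a process inside one of
them. A hedged sketch, assuming hwloc 1.x where these grouping objects are
HWLOC_OBJ_GROUP (0.9.x prints them as Misc):

/* Bind the current process inside the first board-level group. */
#include <hwloc.h>
#include <stdio.h>

int main(void)
{
    hwloc_topology_t topology;
    hwloc_obj_t group;

    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);

    group = hwloc_get_obj_by_type(topology, HWLOC_OBJ_GROUP, 0);
    if (group) {
        /* restrict this process to the cores of that group */
        if (hwloc_set_cpubind(topology, group->cpuset, HWLOC_CPUBIND_PROCESS))
            perror("hwloc_set_cpubind");
    } else {
        fprintf(stderr, "no group objects in this topology\n");
    }

    hwloc_topology_destroy(topology);
    return 0;
}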

If you want other information let me know.

There's a bit of ScaleMP code in the Linux kernel, but it does pretty
much nothing; it does not seem to add anything to /proc or /sys, for
instance. So I am not sure how hwloc could get any specialized knowledge
of ScaleMP machines. Maybe their custom numabind tool knows that ScaleMP
only runs on hardware with well-defined types/counts/numbering of
processors and NUMA nodes, and uses this information to group
sockets/NUMA nodes by their physical distance.

Brice
