Re: [hwloc-users] hwloc with ScaleMP

2010-04-07 Thread Samuel Thibault
Brock Palen, on Wed 07 Apr 2010 16:53:49 -0400, wrote:
> Sure:
> [root@nyx0809 ~]# cat /sys/devices/system/node/node*/distance
> 10 20 254 254 254 254 254 254
> 20 10 254 254 254 254 254 254
> 254 254 10 20 254 254 254 254
> 254 254 20 10 254 254 254 254
> 254 254 254 254 10 20 254 254
> 254 254 254 254 20 10 254 254
> 254 254 254 254 254 254 10 20
> 254 254 254 254 254 254 20 10

Cool, so they did it the proper way, and thus hwloc just saw it the
proper way :D

Samuel
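
To illustrate what the matrix above encodes, here is a minimal Python sketch that groups nodes whose mutual distance is at or below a chosen threshold. The `group_nodes` helper and the threshold value are assumptions made for this example; it is not hwloc's actual grouping code.

```python
# Distance matrix copied from the sysfs output quoted above (row i = node i).
DISTANCES = [
    [10, 20, 254, 254, 254, 254, 254, 254],
    [20, 10, 254, 254, 254, 254, 254, 254],
    [254, 254, 10, 20, 254, 254, 254, 254],
    [254, 254, 20, 10, 254, 254, 254, 254],
    [254, 254, 254, 254, 10, 20, 254, 254],
    [254, 254, 254, 254, 20, 10, 254, 254],
    [254, 254, 254, 254, 254, 254, 10, 20],
    [254, 254, 254, 254, 254, 254, 20, 10],
]


def group_nodes(distances, threshold):
    """Group node indexes connected by distances <= threshold.

    `threshold` is chosen by the caller (an assumption of this sketch);
    nodes are merged with a small union-find structure.
    """
    parent = list(range(len(distances)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, row in enumerate(distances):
        for j, d in enumerate(row):
            if i != j and d <= threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[max(ri, rj)] = min(ri, rj)

    groups = {}
    for i in range(len(distances)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


print(group_nodes(DISTANCES, threshold=20))
# prints [[0, 1], [2, 3], [4, 5], [6, 7]]
```

With a threshold of 20, the four pairs match the node pairs listed under each Misc0 entry in the lstopo output posted in this thread.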


Re: [hwloc-users] hwloc with ScaleMP

2010-04-07 Thread Brice Goglin
Brock Palen wrote:
> [brockp@nyx0809 INTEL]$ lstopo -
> System(79GB)
>   Misc0
> Node#0(10GB) + Socket#1 + L3(8192KB)
>   L2(256KB) + L1(32KB) + Core#0 + P#0
>   L2(256KB) + L1(32KB) + Core#1 + P#1
>   L2(256KB) + L1(32KB) + Core#2 + P#2
>   L2(256KB) + L1(32KB) + Core#3 + P#3
> Node#1(10GB) + Socket#0 + L3(8192KB)
>   L2(256KB) + L1(32KB) + Core#0 + P#4
>   L2(256KB) + L1(32KB) + Core#1 + P#5
>   L2(256KB) + L1(32KB) + Core#2 + P#6
>   L2(256KB) + L1(32KB) + Core#3 + P#7
>   Misc0
> Node#2(10GB) + Socket#3 + L3(8192KB)
>   L2(256KB) + L1(32KB) + Core#0 + P#8
>   L2(256KB) + L1(32KB) + Core#1 + P#9
>   L2(256KB) + L1(32KB) + Core#2 + P#10
>   L2(256KB) + L1(32KB) + Core#3 + P#11
> Node#3(10GB) + Socket#2 + L3(8192KB)
>   L2(256KB) + L1(32KB) + Core#0 + P#12
>   L2(256KB) + L1(32KB) + Core#1 + P#13
>   L2(256KB) + L1(32KB) + Core#2 + P#14
>   L2(256KB) + L1(32KB) + Core#3 + P#15
>   Misc0
> Node#4(10GB) + Socket#5 + L3(8192KB)
>   L2(256KB) + L1(32KB) + Core#0 + P#16
>   L2(256KB) + L1(32KB) + Core#1 + P#17
>   L2(256KB) + L1(32KB) + Core#2 + P#18
>   L2(256KB) + L1(32KB) + Core#3 + P#19
> Node#5(10GB) + Socket#4 + L3(8192KB)
>   L2(256KB) + L1(32KB) + Core#0 + P#20
>   L2(256KB) + L1(32KB) + Core#1 + P#21
>   L2(256KB) + L1(32KB) + Core#2 + P#22
>   L2(256KB) + L1(32KB) + Core#3 + P#23
>   Misc0
> Node#6(10GB) + Socket#7 + L3(8192KB)
>   L2(256KB) + L1(32KB) + Core#0 + P#24
>   L2(256KB) + L1(32KB) + Core#1 + P#25
>   L2(256KB) + L1(32KB) + Core#2 + P#26
>   L2(256KB) + L1(32KB) + Core#3 + P#27
> Node#7(10GB) + Socket#6 + L3(8192KB)
>   L2(256KB) + L1(32KB) + Core#0 + P#28
>   L2(256KB) + L1(32KB) + Core#1 + P#29
>   L2(256KB) + L1(32KB) + Core#2 + P#30
>   L2(256KB) + L1(32KB) + Core#3 + P#31
>
> I don't know why they are all labeled Misc0, but it does see the extra
> layer.
>
> If you want other information let me know.

Great, there is probably some distance information in sysfs.

Can you send the output of
cat /sys/devices/system/node/node*/distance

Brice
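
For scripted use, the same files can be read programmatically. The following is a rough Python sketch, not part of hwloc; the `read_node_distances` name and its `root` parameter are hypothetical and exist only for this example. It sorts nodes by their numeric index, because a shell glob such as `node*` orders names lexicographically and would list node10 before node2 on machines with more than ten nodes.

```python
import re
from pathlib import Path


def read_node_distances(root="/sys/devices/system/node"):
    """Return {node_index: [distances...]} read from <root>/node<N>/distance.

    `root` is a parameter only so the function can be pointed at a test
    directory; on Linux the default is the usual sysfs location.
    """
    result = {}
    for path in Path(root).glob("node*/distance"):
        match = re.fullmatch(r"node(\d+)", path.parent.name)
        if match is None:
            continue  # skip anything that is not a node<N> directory
        result[int(match.group(1))] = [int(x) for x in path.read_text().split()]
    # Sort numerically so that row order matches node numbering.
    return dict(sorted(result.items()))


if __name__ == "__main__":
    for node, row in read_node_distances().items():
        print(node, row)
```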



Re: [hwloc-users] hwloc with ScaleMP

2010-04-07 Thread Brock Palen

Brice Goglin wrote:
> Brock Palen wrote:
> > Has anyone done work with hwloc on ScaleMP systems?  They provide
> > their own tool, numabind, but we are looking for a more generic
> > solution to process placement and control that works well inside our
> > MPI library (Open MPI in most cases).
> >
> > Any input on this would be great!
>
> Hello Brock,
>
> From what I remember, ScaleMP uses a hypervisor on each node that
> virtually merges all of them into a fake big shared-memory machine. Then
> a vanilla Linux kernel runs on top of it. So hwloc should just see
> regular cores and NUMA node information, assuming the virtual "merged"
> hardware reports all necessary information to the OS.



Running lstopo 0.9.3, it appears that hwloc does see the extra layer
of complexity:


[brockp@nyx0809 INTEL]$ lstopo -
System(79GB)
  Misc0
Node#0(10GB) + Socket#1 + L3(8192KB)
  L2(256KB) + L1(32KB) + Core#0 + P#0
  L2(256KB) + L1(32KB) + Core#1 + P#1
  L2(256KB) + L1(32KB) + Core#2 + P#2
  L2(256KB) + L1(32KB) + Core#3 + P#3
Node#1(10GB) + Socket#0 + L3(8192KB)
  L2(256KB) + L1(32KB) + Core#0 + P#4
  L2(256KB) + L1(32KB) + Core#1 + P#5
  L2(256KB) + L1(32KB) + Core#2 + P#6
  L2(256KB) + L1(32KB) + Core#3 + P#7
  Misc0
Node#2(10GB) + Socket#3 + L3(8192KB)
  L2(256KB) + L1(32KB) + Core#0 + P#8
  L2(256KB) + L1(32KB) + Core#1 + P#9
  L2(256KB) + L1(32KB) + Core#2 + P#10
  L2(256KB) + L1(32KB) + Core#3 + P#11
Node#3(10GB) + Socket#2 + L3(8192KB)
  L2(256KB) + L1(32KB) + Core#0 + P#12
  L2(256KB) + L1(32KB) + Core#1 + P#13
  L2(256KB) + L1(32KB) + Core#2 + P#14
  L2(256KB) + L1(32KB) + Core#3 + P#15
  Misc0
Node#4(10GB) + Socket#5 + L3(8192KB)
  L2(256KB) + L1(32KB) + Core#0 + P#16
  L2(256KB) + L1(32KB) + Core#1 + P#17
  L2(256KB) + L1(32KB) + Core#2 + P#18
  L2(256KB) + L1(32KB) + Core#3 + P#19
Node#5(10GB) + Socket#4 + L3(8192KB)
  L2(256KB) + L1(32KB) + Core#0 + P#20
  L2(256KB) + L1(32KB) + Core#1 + P#21
  L2(256KB) + L1(32KB) + Core#2 + P#22
  L2(256KB) + L1(32KB) + Core#3 + P#23
  Misc0
Node#6(10GB) + Socket#7 + L3(8192KB)
  L2(256KB) + L1(32KB) + Core#0 + P#24
  L2(256KB) + L1(32KB) + Core#1 + P#25
  L2(256KB) + L1(32KB) + Core#2 + P#26
  L2(256KB) + L1(32KB) + Core#3 + P#27
Node#7(10GB) + Socket#6 + L3(8192KB)
  L2(256KB) + L1(32KB) + Core#0 + P#28
  L2(256KB) + L1(32KB) + Core#1 + P#29
  L2(256KB) + L1(32KB) + Core#2 + P#30
  L2(256KB) + L1(32KB) + Core#3 + P#31

I don't know why they are all labeled Misc0, but it does see the extra
layer.


If you want other information let me know.
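
For anyone who wants to reuse the listing above in a script, here is a small Python sketch that extracts the P# numbers under each Node# line. It assumes the console format shown above (as printed by lstopo 0.9.3 on this machine); other hwloc versions may print a different layout. The `pus_per_node` name is made up for this example, and `SAMPLE` is a shortened excerpt of the listing, not the full output.

```python
import re


def pus_per_node(lstopo_text):
    """Map each Node# index to the list of P# indexes printed beneath it."""
    nodes = {}
    current = None
    for line in lstopo_text.splitlines():
        node = re.search(r"Node#(\d+)", line)
        if node:
            current = int(node.group(1))
            nodes[current] = []
        # A line may also carry a processor entry such as "... + P#4".
        for pu in re.findall(r"\bP#(\d+)", line):
            if current is not None:
                nodes[current].append(int(pu))
    return nodes


# Shortened excerpt of the listing above (two cores per node only).
SAMPLE = """\
System(79GB)
  Misc0
Node#0(10GB) + Socket#1 + L3(8192KB)
  L2(256KB) + L1(32KB) + Core#0 + P#0
  L2(256KB) + L1(32KB) + Core#1 + P#1
Node#1(10GB) + Socket#0 + L3(8192KB)
  L2(256KB) + L1(32KB) + Core#0 + P#4
  L2(256KB) + L1(32KB) + Core#1 + P#5
"""

print(pus_per_node(SAMPLE))
# prints {0: [0, 1], 1: [4, 5]}
```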


> There's a bit of ScaleMP code in the Linux kernel, but it does pretty
> much nothing; it does not seem to add anything to /proc or /sys, for
> instance. So I am not sure hwloc could get any specialized knowledge of
> ScaleMP machines. Maybe their custom numabind tool knows that ScaleMP
> only works on machines with some well-defined
> types/counts/numbering of processors and NUMA nodes, and thus uses this
> information to group sockets/NUMA nodes depending on their physical
> distance.
>
> Brice

___
hwloc-users mailing list
hwloc-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users






Re: [hwloc-users] hwloc with ScaleMP

2010-04-07 Thread Samuel Thibault
Brock Palen, on Wed 07 Apr 2010 15:52:19 -0400, wrote:
> has anyone done work with hwloc on scalemp systems?

Not here, but I guess something could be done, yes.

Samuel


[hwloc-users] hwloc with ScaleMP

2010-04-07 Thread Brock Palen
Has anyone done work with hwloc on ScaleMP systems?  They provide
their own tool, numabind, but we are looking for a more generic
solution to process placement and control that works well inside our
MPI library (Open MPI in most cases).


Any input on this would be great!

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985
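
As a rough illustration of the kind of generic placement asked about here, the sketch below binds the current process to the CPUs of chosen NUMA nodes using only the Python standard library on Linux. It is neither hwloc nor what Open MPI does internally, and it handles CPU affinity only, not memory placement. The helper names are invented for this example, and `NODE_TO_PUS` is a hand-copied assumption based on the first two nodes of the lstopo listing posted in this thread.

```python
import os

# Node -> processor numbers, copied by hand from the lstopo listing
# in this thread (first two nodes only; an assumption for the example).
NODE_TO_PUS = {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]}


def cpus_for_nodes(node_to_pus, nodes):
    """Return the set of CPU numbers belonging to the given nodes."""
    cpus = set()
    for node in nodes:
        cpus.update(node_to_pus[node])
    return cpus


def bind_current_process(cpus):
    """Restrict the calling process to `cpus` (Linux only).

    Only CPUs the process is already allowed to use are kept, so the
    call does not fail on machines smaller than the one in this thread.
    """
    allowed = os.sched_getaffinity(0)
    target = set(cpus) & allowed
    if not target:
        raise ValueError("none of the requested CPUs are available to this process")
    os.sched_setaffinity(0, target)
    return target


print(sorted(cpus_for_nodes(NODE_TO_PUS, [0, 1])))
# prints [0, 1, 2, 3, 4, 5, 6, 7]
```

Calling `bind_current_process(cpus_for_nodes(NODE_TO_PUS, [0, 1]))` would then keep the process on the two nodes that the distance matrix reports as close to each other.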