Re: [hwloc-users] hwloc with ScaleMP
Brock Palen, on Wed 07 Apr 2010 16:53:49 -0400, wrote:
> Sure:
>
> [root@nyx0809 ~]# cat /sys/devices/system/node/node*/distance
> 10 20 254 254 254 254 254 254
> 20 10 254 254 254 254 254 254
> 254 254 10 20 254 254 254 254
> 254 254 20 10 254 254 254 254
> 254 254 254 254 10 20 254 254
> 254 254 254 254 20 10 254 254
> 254 254 254 254 254 254 10 20
> 254 254 254 254 254 254 20 10

Cool, so they did it the proper way, and thus hwloc just saw it the proper way :D

Samuel
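The matrix itself tells the story: 10 on the diagonal (a node to itself), 20 to the sibling node on the same board, and 254 across the ScaleMP fabric — exactly the asymmetry a placement tool needs to see. For anyone who would rather query this matrix programmatically than scrape sysfs, here is a minimal sketch using the distances API of the modern hwloc 2.x series; note that this API postdates the 0.9.3 release discussed in this thread, where sysfs was the only option.

    /* Print the NUMA distance matrix as reported by hwloc.
     * Assumes hwloc >= 2.0; compile with: cc distances.c -lhwloc */
    #include <stdio.h>
    #include <hwloc.h>

    int main(void)
    {
        hwloc_topology_t topo;
        struct hwloc_distances_s *dist;
        unsigned nr = 1, i, j;

        hwloc_topology_init(&topo);
        hwloc_topology_load(topo);

        /* Ask for one distances structure covering NUMA nodes. */
        if (hwloc_distances_get_by_type(topo, HWLOC_OBJ_NUMANODE,
                                        &nr, &dist, 0, 0) < 0 || nr == 0) {
            fprintf(stderr, "no distance matrix reported\n");
            hwloc_topology_destroy(topo);
            return 1;
        }

        /* values[] is a flat nbobjs x nbobjs matrix, row-major. */
        for (i = 0; i < dist->nbobjs; i++) {
            for (j = 0; j < dist->nbobjs; j++)
                printf(" %4llu",
                       (unsigned long long) dist->values[i * dist->nbobjs + j]);
            printf("\n");
        }

        hwloc_distances_release(topo, dist);
        hwloc_topology_destroy(topo);
        return 0;
    }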
Re: [hwloc-users] hwloc with ScaleMP
Brock Palen, on Wed 07 Apr 2010 16:46:53 -0400, wrote:
> I don't know why they are all labeled Misc0

Because hwloc doesn't know what these actually are, just that there is some distance involved :)

Samuel
Re: [hwloc-users] hwloc with ScaleMP
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985

On Apr 7, 2010, at 4:51 PM, Brice Goglin wrote:

> Brock Palen wrote:
>> [brockp@nyx0809 INTEL]$ lstopo -
>> System(79GB)
>> [...]
>>
>> I don't know why they are all labeled Misc0, but it does see the
>> extra layer.
>>
>> If you want other information let me know.
>
> Great, there is probably some distance information in sysfs. Can you
> send the output of
>
>     cat /sys/devices/system/node/node*/distance

Sure:

[root@nyx0809 ~]# cat /sys/devices/system/node/node*/distance
10 20 254 254 254 254 254 254
20 10 254 254 254 254 254 254
254 254 10 20 254 254 254 254
254 254 20 10 254 254 254 254
254 254 254 254 10 20 254 254
254 254 254 254 20 10 254 254
254 254 254 254 254 254 10 20
254 254 254 254 254 254 20 10

> Brice
Re: [hwloc-users] hwloc with ScaleMP
Brock Palen wrote:
> [brockp@nyx0809 INTEL]$ lstopo -
> System(79GB)
>   Misc0
>     Node#0(10GB) + Socket#1 + L3(8192KB)
>       L2(256KB) + L1(32KB) + Core#0 + P#0
> [...]
>
> I don't know why they are all labeled Misc0, but it does see the
> extra layer.
>
> If you want other information let me know.

Great, there is probably some distance information in sysfs. Can you send the output of

    cat /sys/devices/system/node/node*/distance

Brice
Re: [hwloc-users] hwloc with ScaleMP
Brice Goglin wrote:
> Brock Palen wrote:
>> Has anyone done work with hwloc on ScaleMP systems? They provide
>> their own tool, numabind, but we are looking for a more generic
>> solution to process placement and control that works well inside our
>> MPI library (Open MPI in most cases).
>>
>> Any input on this would be great!
>
> Hello Brock,
>
> From what I remember, ScaleMP uses a hypervisor on each node that
> virtually merges all of them into one big fake shared-memory machine.
> A vanilla Linux kernel then runs on top of it, so hwloc should just
> see regular cores and NUMA node information, assuming the virtual
> "merged" hardware reports all the necessary information to the OS.

Running lstopo 0.9.3, it appears that hwloc does see the extra layer of complexity:

[brockp@nyx0809 INTEL]$ lstopo -
System(79GB)
  Misc0
    Node#0(10GB) + Socket#1 + L3(8192KB)
      L2(256KB) + L1(32KB) + Core#0 + P#0
      L2(256KB) + L1(32KB) + Core#1 + P#1
      L2(256KB) + L1(32KB) + Core#2 + P#2
      L2(256KB) + L1(32KB) + Core#3 + P#3
    Node#1(10GB) + Socket#0 + L3(8192KB)
      L2(256KB) + L1(32KB) + Core#0 + P#4
      L2(256KB) + L1(32KB) + Core#1 + P#5
      L2(256KB) + L1(32KB) + Core#2 + P#6
      L2(256KB) + L1(32KB) + Core#3 + P#7
  Misc0
    Node#2(10GB) + Socket#3 + L3(8192KB)
      L2(256KB) + L1(32KB) + Core#0 + P#8
      L2(256KB) + L1(32KB) + Core#1 + P#9
      L2(256KB) + L1(32KB) + Core#2 + P#10
      L2(256KB) + L1(32KB) + Core#3 + P#11
    Node#3(10GB) + Socket#2 + L3(8192KB)
      L2(256KB) + L1(32KB) + Core#0 + P#12
      L2(256KB) + L1(32KB) + Core#1 + P#13
      L2(256KB) + L1(32KB) + Core#2 + P#14
      L2(256KB) + L1(32KB) + Core#3 + P#15
  Misc0
    Node#4(10GB) + Socket#5 + L3(8192KB)
      L2(256KB) + L1(32KB) + Core#0 + P#16
      L2(256KB) + L1(32KB) + Core#1 + P#17
      L2(256KB) + L1(32KB) + Core#2 + P#18
      L2(256KB) + L1(32KB) + Core#3 + P#19
    Node#5(10GB) + Socket#4 + L3(8192KB)
      L2(256KB) + L1(32KB) + Core#0 + P#20
      L2(256KB) + L1(32KB) + Core#1 + P#21
      L2(256KB) + L1(32KB) + Core#2 + P#22
      L2(256KB) + L1(32KB) + Core#3 + P#23
  Misc0
    Node#6(10GB) + Socket#7 + L3(8192KB)
      L2(256KB) + L1(32KB) + Core#0 + P#24
      L2(256KB) + L1(32KB) + Core#1 + P#25
      L2(256KB) + L1(32KB) + Core#2 + P#26
      L2(256KB) + L1(32KB) + Core#3 + P#27
    Node#7(10GB) + Socket#6 + L3(8192KB)
      L2(256KB) + L1(32KB) + Core#0 + P#28
      L2(256KB) + L1(32KB) + Core#1 + P#29
      L2(256KB) + L1(32KB) + Core#2 + P#30
      L2(256KB) + L1(32KB) + Core#3 + P#31

I don't know why they are all labeled Misc0, but it does see the extra layer.

If you want other information let me know.

> There's a bit of ScaleMP code in the Linux kernel, but it does pretty
> much nothing; it does not seem to add anything to /proc or /sys, for
> instance. So I am not sure hwloc could get any specialized knowledge
> of ScaleMP machines. Maybe their custom numabind tool knows that
> ScaleMP only runs on machines with well-defined types/counts/numbering
> of processors and NUMA nodes, and uses this information to group
> sockets/NUMA nodes according to their physical distance.
>
> Brice
Re: [hwloc-users] hwloc with ScaleMP
Brock Palen wrote:
> Has anyone done work with hwloc on ScaleMP systems? They provide
> their own tool, numabind, but we are looking for a more generic
> solution to process placement and control that works well inside our
> MPI library (Open MPI in most cases).
>
> Any input on this would be great!

Hello Brock,

From what I remember, ScaleMP uses a hypervisor on each node that virtually merges all of them into one big fake shared-memory machine. A vanilla Linux kernel then runs on top of it, so hwloc should just see regular cores and NUMA node information, assuming the virtual "merged" hardware reports all the necessary information to the OS.

There's a bit of ScaleMP code in the Linux kernel, but it does pretty much nothing; it does not seem to add anything to /proc or /sys, for instance. So I am not sure hwloc could get any specialized knowledge of ScaleMP machines. Maybe their custom numabind tool knows that ScaleMP only runs on machines with well-defined types/counts/numbering of processors and NUMA nodes, and uses this information to group sockets/NUMA nodes according to their physical distance.

Brice
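Brice's expectation — that the virtualized machine shows up as ordinary cores and NUMA nodes — is easy to check programmatically. A minimal sketch, assuming the modern hwloc 2.x C API (the 0.9 series used in this thread spelled some of these names differently, e.g. HWLOC_OBJ_NODE rather than HWLOC_OBJ_NUMANODE):

    /* List each NUMA node and the cores whose cpusets fall inside it.
     * Assumes hwloc >= 2.0; compile with: cc nodes.c -lhwloc */
    #include <stdio.h>
    #include <hwloc.h>

    int main(void)
    {
        hwloc_topology_t topo;
        int nnodes, ncores, i, j;

        hwloc_topology_init(&topo);
        hwloc_topology_load(topo);

        nnodes = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_NUMANODE);
        ncores = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE);

        for (i = 0; i < nnodes; i++) {
            hwloc_obj_t node = hwloc_get_obj_by_type(topo, HWLOC_OBJ_NUMANODE, i);
            printf("NUMA node %u:", node->os_index);
            for (j = 0; j < ncores; j++) {
                hwloc_obj_t core = hwloc_get_obj_by_type(topo, HWLOC_OBJ_CORE, j);
                /* A core is local to the node if its cpuset is
                 * contained in the node's cpuset. */
                if (hwloc_bitmap_isincluded(core->cpuset, node->cpuset))
                    printf(" core#%u", core->logical_index);
            }
            printf("\n");
        }

        hwloc_topology_destroy(topo);
        return 0;
    }

On the lstopo output shown in this thread, each of the eight NUMA nodes should list four cores.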
Re: [hwloc-users] hwloc with ScaleMP
Brock Palen, on Wed 07 Apr 2010 15:52:19 -0400, wrote:
> Has anyone done work with hwloc on ScaleMP systems?

Not here, but I guess something could be done, yes.

Samuel
[hwloc-users] hwloc with ScaleMP
Has anyone done work with hwloc on ScaleMP systems? They provide their own tool, numabind, but we are looking for a more generic solution to process placement and control that works well inside our MPI library (Open MPI in most cases).

Any input on this would be great!

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985
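The generic placement the question asks about boils down to hwloc's binding API. A minimal sketch, assuming the modern hwloc 2.x C API and a hypothetical rank variable standing in for the value MPI_Comm_rank() would return:

    /* Bind the calling process to one core, chosen round-robin by rank.
     * Assumes hwloc >= 2.0; "rank" is a placeholder for an MPI rank.
     * Compile with: cc bind.c -lhwloc */
    #include <stdio.h>
    #include <hwloc.h>

    int main(void)
    {
        int rank = 0;  /* placeholder: in real code, from MPI_Comm_rank() */
        hwloc_topology_t topo;
        hwloc_obj_t core;
        hwloc_cpuset_t set;
        int ncores;

        hwloc_topology_init(&topo);
        hwloc_topology_load(topo);

        ncores = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE);
        core = hwloc_get_obj_by_type(topo, HWLOC_OBJ_CORE, rank % ncores);

        /* Restrict the core's cpuset to a single PU so the OS cannot
         * migrate the process between hardware threads. */
        set = hwloc_bitmap_dup(core->cpuset);
        hwloc_bitmap_singlify(set);
        if (hwloc_set_cpubind(topo, set, HWLOC_CPUBIND_PROCESS))
            perror("hwloc_set_cpubind");

        hwloc_bitmap_free(set);
        hwloc_topology_destroy(topo);
        return 0;
    }

Combined with the distance matrix shown elsewhere in the thread, the same loop could prefer cores whose NUMA node is close to the process's memory, which is essentially what numabind automates.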