Re: [hwloc-users] GPU/NIC/CPU locality
On 30/11/11 03:24, Guy Streeter wrote:
> There is a request pending to have hwloc updated to 1.3 in
> RHEL6. I do not yet have a schedule for it.

I wouldn't hold your breath; I'm still waiting for a nasty kernel bug to be fixed in RHEL 5 (Ethernet packets delivered on the wrong interface of a dual-ported 10GigE NIC), reported about a year ago against 5.5. They're now arguing about whether to fix it in RHEL 5.8 or put it off for yet another release (even though it was already fixed upstream in the Mellanox drivers when I reported it).

cheers,
Chris
--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au  Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/
Re: [hwloc-users] GPU/NIC/CPU locality
On Nov 29, 2011, at 1:04 PM, Brice Goglin wrote:
> "XML output" should be "XML input/output" or "XML support".

Done:

-----
Hwloc optional build support status (more details can be found above):

Probe / display PCI devices: yes
Graphical output (Cairo): yes
XML input / output: full
Memory support: binding, set policy, migrate pages
-----

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [hwloc-users] GPU/NIC/CPU locality
> Hwloc optional build support status (more details can be found above):
>
> Probe / display PCI devices: yes
> Graphical output (Cairo): yes
> XML output: full

"XML output" should be "XML input/output" or "XML support".

> Memory support: binding, set policy, migrate pages

Looks ok otherwise.

Brice
Re: [hwloc-users] GPU/NIC/CPU locality
On Nov 29, 2011, at 12:01 PM, Brice Goglin wrote:
> Yes, always installed. There are some configure checks for verbs, but
> it's only used for enabling verbs-related helper testing.

Ok, how's this for output at the end of configure?

Linux:
-----
Hwloc optional build support status (more details can be found above):

Probe / display PCI devices: yes
Graphical output (Cairo): yes
XML output: full
Memory support: binding, set policy, migrate pages
-----

OS X:
-----
Hwloc optional build support status (more details can be found above):

Probe / display PCI devices: no
Graphical output (Cairo): yes
XML output: full
Memory support: none
-----

XML support will show "basic" if libxml2 is not found.

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [hwloc-users] GPU/NIC/CPU locality
Le 29/11/2011 17:58, Jeff Squyres a écrit :
> On Nov 29, 2011, at 11:53 AM, Brice Goglin wrote:
>>> What about MX, verbs, Cuda, ...?
>> MX and verbs are not used internally, we just have public helpers to
>> interoperate with them (and tests).
> I forget -- are the helpers installed/available even if the MX
> headers/libraries are not found at configure time? (ditto for verbs, cuda,
> etc.)

Yes, always installed. There are some configure checks for verbs, but they are only used for enabling verbs-related helper testing.

Brice
Re: [hwloc-users] GPU/NIC/CPU locality
On Nov 29, 2011, at 11:53 AM, Brice Goglin wrote:
>> What about MX, verbs, Cuda, ...?
>
> MX and verbs are not used internally, we just have public helpers to
> interoperate with them (and tests).

I forget -- are the helpers installed/available even if the MX headers/libraries are not found at configure time? (Ditto for verbs, CUDA, etc.)

> Same for cuda in trunk (until Samuel's cuda branch gets merged).
>
> Brice

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [hwloc-users] GPU/NIC/CPU locality
Le 29/11/2011 17:50, Jeff Squyres a écrit :
> On Nov 29, 2011, at 10:33 AM, Brice Goglin wrote:
>>> - Kerrighed
>>> - PCI device support
>>> - XML support
>> I would put XML, PCI, Cairo and libnuma
> What about MX, verbs, Cuda, ...?

MX and verbs are not used internally; we just have public helpers to interoperate with them (and tests). Same for CUDA in trunk (until Samuel's CUDA branch gets merged).

Brice
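[For archive readers: the interoperability helpers Brice mentions are header-only wrappers shipped in hwloc's include directory. A minimal sketch using the OpenFabrics verbs helper, assuming hwloc >= 1.1 and an installed OFED stack; link with -lhwloc -libverbs. The exact output depends on the machine's hardware.]

```c
/* Print the cpuset near each InfiniBand HCA, using hwloc's
 * public verbs interoperability helper. */
#include <stdio.h>
#include <stdlib.h>
#include <hwloc.h>
#include <hwloc/openfabrics-verbs.h>  /* hwloc_ibv_get_device_cpuset() */
#include <infiniband/verbs.h>

int main(void)
{
    hwloc_topology_t topo;
    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);

    int n = 0;
    struct ibv_device **devs = ibv_get_device_list(&n);
    for (int i = 0; i < n; i++) {
        hwloc_bitmap_t set = hwloc_bitmap_alloc();
        /* Fills 'set' with the PUs physically close to this HCA. */
        if (hwloc_ibv_get_device_cpuset(topo, devs[i], set) == 0) {
            char *str;
            hwloc_bitmap_asprintf(&str, set);
            printf("%s is near cpuset %s\n",
                   ibv_get_device_name(devs[i]), str);
            free(str);
        }
        hwloc_bitmap_free(set);
    }
    ibv_free_device_list(devs);
    hwloc_topology_destroy(topo);
    return 0;
}
```

The equivalent CUDA helper (hwloc/cuda.h) follows the same pattern, which is why the helpers can be installed unconditionally: they only compile against the third-party headers when the caller's own build provides them.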
Re: [hwloc-users] GPU/NIC/CPU locality
On Nov 29, 2011, at 10:33 AM, Brice Goglin wrote:
>> - Kerrighed
>> - PCI device support
>> - XML support
>
> I would put XML, PCI, Cairo and libnuma

What about MX, verbs, Cuda, ...?

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [hwloc-users] GPU/NIC/CPU locality
On 11/29/2011 02:57 AM, Stefan Eilemann wrote:
> Bonjour Brice,
>
> On 29. Nov 2011, at 9:45, Brice Goglin wrote:
>
>> hwloc 1.3 already has support for PCI device detection. These new
>> objects contain a "class" field that can help you know if it's a NIC/GPU/...
>
> Ok, time to upgrade my installation. The cluster has RHEL 6.1, which ships with
> an older version.

There is a request pending to have hwloc updated to 1.3 in RHEL 6. I do not yet have a schedule for it.

--Guy
Re: [hwloc-users] GPU/NIC/CPU locality
On 11/29/2011 06:25 AM, Stefan Eilemann wrote:
> On 29. Nov 2011, at 11:41, Samuel Thibault wrote:
>
>> You are probably missing the libpci-devel package.
>
> Thanks, that either doesn't exist or wasn't installed on Red Hat. It works now.
>
> I think messages about found/not-found optional modules could be more prominent
> at the end of the configure process.

The package is pciutils-devel on RHEL.

--Guy
Re: [hwloc-users] GPU/NIC/CPU locality
Le 29/11/2011 16:19, Jeff Squyres a écrit :
> If we had such a thing at the bottom of configure, what items should we show?
> I can think of the following obvious ones offhand:
>
> - Kerrighed
> - PCI device support
> - XML support

I would put XML, PCI, Cairo and libnuma.

Brice
Re: [hwloc-users] GPU/NIC/CPU locality
On Nov 29, 2011, at 10:16 AM, Stefan Eilemann wrote:
>> FWIW, I've traditionally been against such things for two reasons:
>
> Your call, really. The information is there and not too hard to find, but I
> missed it on the first run. Most software I know provides this in a very
> concise list at the end (Supported: A B C\n Unsupported: D E F).

Let me throw this back to Brice / Samuel...

If we had such a thing at the bottom of configure, what items should we show? I can think of the following obvious ones offhand:

- Kerrighed
- PCI device support
- XML support

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [hwloc-users] GPU/NIC/CPU locality
Hi Jeff,

On 29. Nov 2011, at 15:28, Jeff Squyres wrote:
>> I think messages of found/not found optional modules could be more prominent
>> at the end of the configure process.
>
> FWIW, I've traditionally been against such things for two reasons:

Your call, really. The information is there and not too hard to find, but I missed it on the first run. Most software I know provides this in a very concise list at the end (Supported: A B C\n Unsupported: D E F).

Cheers,
Stefan.
--
http://www.eyescale.ch
http://www.equalizergraphics.com
http://www.linkedin.com/in/eilemann
Re: [hwloc-users] GPU/NIC/CPU locality
On Nov 29, 2011, at 7:25 AM, Stefan Eilemann wrote:
>> You are probably missing the libpci-devel package.
>
> Thanks, that either doesn't exist or wasn't installed on Red Hat. It works now.
>
> I think messages about found/not-found optional modules could be more prominent
> at the end of the configure process.

FWIW, I've traditionally been against such things for two reasons:

1. The information *was* displayed above (i.e., that pci-devel wasn't found / wasn't usable / whatever). I realize that most people don't read the stdout of configure at all, but all the information you need is already there.

2. A list of what will/will not be built at the end tends to grow so lengthy that it dilutes the value of repeating the information.

That being said, I can *somewhat* see the value of displaying a user-friendly "PCI device support will not be built" vs. the output of a configure test, which might be somewhat obscure. However, in hwloc's case, the configure test output is pretty self-evident. Examples:

checking for PCI... no
checking pci/pci.h usability... no
checking pci/pci.h presence... no
checking for pci/pci.h... no
checking for LIBXML2... yes
checking for xmlNewDoc... yes
checking for final LIBXML2 support... yes

A simple string search for "pci" and "xml" will find these lines in the configure output. Presumably, if you're building from source, you have at least *some* experience, and it isn't unreasonable to ask you to look in the output of configure.

Don't get me wrong -- I'm not dead-set against a listing at the bottom. I just find it redundant and somewhat of a maintenance hassle.

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
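[For archive readers: the string search Jeff suggests amounts to something like the following; `configure.log` is an illustrative name for captured configure output, faked here with a heredoc so the snippet is self-contained.]

```shell
# Normally you would capture the real output:  ./configure 2>&1 | tee configure.log
# Here we fake a few of the lines Jeff quotes above:
cat > configure.log <<'EOF'
checking for PCI... no
checking for pci/pci.h... no
checking for LIBXML2... yes
checking for final LIBXML2 support... yes
EOF

# Case-insensitive search for the optional-feature checks:
grep -iE 'pci|xml' configure.log
```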
Re: [hwloc-users] GPU/NIC/CPU locality
On 29. Nov 2011, at 11:41, Samuel Thibault wrote:
> You are probably missing the libpci-devel package.

Thanks -- that either doesn't exist or wasn't installed on Red Hat. It works now.

I think messages about found/not-found optional modules could be more prominent at the end of the configure process.

Cheers,
Stefan.
--
http://www.eyescale.ch
http://www.equalizergraphics.com
http://www.linkedin.com/in/eilemann
Re: [hwloc-users] GPU/NIC/CPU locality
Stefan Eilemann, le Tue 29 Nov 2011 11:40:18 +0100, a écrit :
> Maybe I'm missing something, but I don't see any PCI-related output with
> lstopo.

You are probably missing the libpci-devel package.

Samuel
Re: [hwloc-users] GPU/NIC/CPU locality
Hi Brice,

On 29. Nov 2011, at 9:45, Brice Goglin wrote:
> hwloc 1.3 already has support for PCI device detection. These new
> objects contain a "class" field that can help you know if it's a NIC/GPU/...
>
> Just run lstopo on your machine to see what I am talking about.

Maybe I'm missing something, but I don't see any PCI-related output with lstopo. I just compiled 1.3 from scratch, and ran lstopo as user and hwloc-info as root:

$ sudo ./local/bin/hwloc-info -v
[sudo] password for eilemann:
Machine (24GB)
  NUMANode L#0 (P#0 12GB) + Socket L#0 + L3 L#0 (12MB)
    L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0 + PU L#0 (P#0)
    L2 L#1 (256KB) + L1 L#1 (32KB) + Core L#1 + PU L#1 (P#1)
    L2 L#2 (256KB) + L1 L#2 (32KB) + Core L#2 + PU L#2 (P#2)
    L2 L#3 (256KB) + L1 L#3 (32KB) + Core L#3 + PU L#3 (P#3)
    L2 L#4 (256KB) + L1 L#4 (32KB) + Core L#4 + PU L#4 (P#4)
    L2 L#5 (256KB) + L1 L#5 (32KB) + Core L#5 + PU L#5 (P#5)
  NUMANode L#1 (P#1 12GB) + Socket L#1 + L3 L#1 (12MB)
    L2 L#6 (256KB) + L1 L#6 (32KB) + Core L#6 + PU L#6 (P#6)
    L2 L#7 (256KB) + L1 L#7 (32KB) + Core L#7 + PU L#7 (P#7)
    L2 L#8 (256KB) + L1 L#8 (32KB) + Core L#8 + PU L#8 (P#8)
    L2 L#9 (256KB) + L1 L#9 (32KB) + Core L#9 + PU L#9 (P#9)
    L2 L#10 (256KB) + L1 L#10 (32KB) + Core L#10 + PU L#10 (P#10)
    L2 L#11 (256KB) + L1 L#11 (32KB) + Core L#11 + PU L#11 (P#11)
[eilemann@node01 ~]$

The lstopo graphical output contains the same information.

Cheers,
Stefan.
--
http://www.eyescale.ch
http://www.equalizergraphics.com
http://www.linkedin.com/in/eilemann
Re: [hwloc-users] GPU/NIC/CPU locality
Le 29/11/2011 09:57, Stefan Eilemann a écrit :
> I use them mostly with OpenGL ('XOpenDisplay(":0.")') and RDMA in
> Equalizer/Collage (see links in signature). Is there a straightforward way to
> associate the GPUs with the corresponding X screen? I guess at least the path
> through the Xorg PCI ID should work, but it would be nice to have that in
> hwloc.

I need to think about it; it doesn't look very easy to implement.

Brice
Re: [hwloc-users] GPU/NIC/CPU locality
Bonjour Brice,

On 29. Nov 2011, at 9:45, Brice Goglin wrote:
> hwloc 1.3 already has support for PCI device detection. These new
> objects contain a "class" field that can help you know if it's a NIC/GPU/...

Ok, time to upgrade my installation. The cluster has RHEL 6.1, which ships with an older version.

> How are you using GPUs and NICs in your software? Which libraries or
> ways do you use to access them?

I use them mostly with OpenGL ('XOpenDisplay(":0.")') and RDMA in Equalizer/Collage (see links in signature). Is there a straightforward way to associate the GPUs with the corresponding X screen? I guess at least the path through the Xorg PCI ID should work, but it would be nice to have that in hwloc.

We also use CUDA/Open MPI here, but I guess this will be easier to support. I'll look into the latest source of lstopo to see how it's done.

BTW, I recently created a library for ZeroConf GPU discovery[1]; this might be of interest to you.

Cheers,
Stefan.

[1] http://www.equalizergraphics.com/gpu-sd
--
http://www.eyescale.ch
http://www.equalizergraphics.com
http://www.linkedin.com/in/eilemann
Re: [hwloc-users] GPU/NIC/CPU locality
Hello Stefan,

hwloc 1.3 already has support for PCI device detection. These new objects contain a "class" field that can help you know if it's a NIC/GPU/...

However, it's hard to know which PCI device is eth0 or eth1, so we also try to add OS devices inside PCI devices. If you're using Linux, you will see which network device (eth0, ...), IB device (mlx4_0, ...), or disk (sda, ...) corresponds to each PCI device (if any).

Just run lstopo on your machine to see what I am talking about. Then you should read the I/O devices section in the doc.

There's also some work to insert CUDA device information inside those PCI devices. Additionally, we have some helpers to retrieve the locality of some custom libraries' objects (OFED, CUDA, ...). See the interoperability section in the doc.

How are you using GPUs and NICs in your software? Which libraries or ways do you use to access them?

Hope this helps.
Brice

Le 29/11/2011 09:32, Stefan Eilemann a écrit :
> All,
>
> We have the need to discover which GPUs and NICs are close to which CPUs[1],
> independent from CUDA. From the overview page there are hints that there is
> some kind of support planned, but it's unclear to me of how much of this is
> implemented.
>
> Is there support in hwloc, and in which version, for this? If yes, can you
> give me a hint/code snippet on how to do this? If no, what does it take to
> get this support in hwloc?
>
> Cheers,
>
> Stefan.
>
> [1] https://github.com/Eyescale/Equalizer/issues/57
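[For archive readers: a minimal sketch of the PCI-device iteration Brice describes, assuming hwloc 1.3 built with PCI support; I/O objects are only detected when the corresponding topology flag is set. Link with -lhwloc. The output is entirely machine-dependent.]

```c
/* List PCI devices in the topology and flag NICs and GPUs
 * via the PCI class field Brice mentions. */
#include <stdio.h>
#include <hwloc.h>

int main(void)
{
    hwloc_topology_t topo;
    hwloc_obj_t obj = NULL;

    hwloc_topology_init(&topo);
    /* Without this flag, PCI/OS device objects are not inserted. */
    hwloc_topology_set_flags(topo, HWLOC_TOPOLOGY_FLAG_IO_DEVICES);
    hwloc_topology_load(topo);

    while ((obj = hwloc_get_next_pcidev(topo, obj)) != NULL) {
        /* The upper byte of class_id is the PCI base class:
         * 0x02 = network controller, 0x03 = display controller. */
        unsigned base_class = obj->attr->pcidev.class_id >> 8;
        printf("%04x:%04x class=0x%04x%s%s\n",
               obj->attr->pcidev.vendor_id,
               obj->attr->pcidev.device_id,
               obj->attr->pcidev.class_id,
               base_class == 0x02 ? " (NIC)" : "",
               base_class == 0x03 ? " (GPU)" : "");
    }

    hwloc_topology_destroy(topo);
    return 0;
}
```

The OS devices Brice mentions (eth0, mlx4_0, sda, ...) appear as children of these PCI objects, so walking `obj->children` from a matching PCI device yields the kernel's name for it.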
[hwloc-users] GPU/NIC/CPU locality
All,

We have the need to discover which GPUs and NICs are close to which CPUs[1], independently of CUDA. From the overview page there are hints that some kind of support is planned, but it's unclear to me how much of this is implemented.

Is there support in hwloc, and in which version, for this? If yes, can you give me a hint/code snippet on how to do this? If no, what would it take to get this support into hwloc?

Cheers,
Stefan.

[1] https://github.com/Eyescale/Equalizer/issues/57
--
http://www.eyescale.ch
http://www.equalizergraphics.com
http://www.linkedin.com/in/eilemann