Re: [hwloc-users] Mapping a GPU to a pci local CPU on Windows

2013-01-08 Thread Ashley Reid
It appears that DEVPKEY_Numa_Proximity_Domain with SetupDiGetDeviceProperty should work. I found this buried deep in http://blogs.technet.com/b/winserverperformance/archive/2008/09/13/getting-system-topology-information-on-windows.aspx and am checking whether it works. -Original

[hwloc-users] hwloc on Blue Gene/Q?

2013-01-08 Thread Erik Schnetter
I am trying to use hwloc on a Blue Gene/Q. Building and installing worked fine, and it reports the system configuration correctly as well (i.e. it shows all PUs). However, when I try to query the thread/core bindings, hwloc crashes with an error in libc's free(). This happens with both 1.6 and 1.6.1rc1.

Re: [hwloc-users] hwloc on Blue Gene/Q?

2013-01-08 Thread Brice Goglin
Hello Erik, we need specific BGQ binding support because the binding API is different. Also, we don't properly detect the 16 4-way cores; we only see 64 identical PUs. I am supposed to get a BGQ account in the near future, so I hope I will have everything working in v1.7. Stay tuned. Brice Le

Re: [hwloc-users] Mapping a GPU to a pci local CPU on Windows

2013-01-08 Thread Brice Goglin
Is your machine NUMA? Maybe Windows returns an error when requesting NUMA info on a non-NUMA machine? Brice On 08/01/2013 18:44, Ashley Reid wrote: > OS says DEVPKEY_Numa_Proximity_Domain does not exist. Neither does DEVPKEY_Device_Numa_Node. For all devices. > > Lame :/ > > Thanks, > Ash >

Re: [hwloc-users] hwloc on Blue Gene/Q?

2013-01-08 Thread Erik Schnetter
Jeff, thanks, this is helpful. I am mostly interested in finding out which threads share the D1 cache. I guess that get_bgq_core returns this information. Is there a way to guarantee that this association doesn't change at run time? I guess I could just check periodically... -erik On Tue, Jan

Re: [hwloc-users] hwloc on Blue Gene/Q?

2013-01-08 Thread Jeff Hammond
These functions are returning the physical placement at the moment they are called. If a Pthread moves around, it will still return the correct, current value. You should not cache the output of these functions. They require ~105 cycles per call (I just measured this for 1M calls, with 315-318M