On 22/01/2013 10:27, Samuel Thibault wrote:
> Kenneth A. Lloyd, on Mon 21 Jan 2013 22:46:37 +0100, wrote:
>> Thanks for making this tutorial available. Using hwloc 1.7, how far down
>> into, say, NVIDIA cards can the architecture be reflected? Global memory
>> size? SMX cores? None of the above?
> None of the above for now. Both are available in the cuda svn branch,
> however.

Now the question to Kenneth is "what do YOU need?"

I haven't merged the GPU internals into the trunk yet because I'd like to see whether they match what we would do with OpenCL and other accelerators such as the Xeon Phi.

One thing to keep in mind is that most hwloc/GPU users will use hwloc to get locality information, but they will still use CUDA to drive the GPU. So they will be able to use CUDA to get in-depth GPU information anyway. The question then is how much CUDA info we want to duplicate in hwloc. For instance, hwloc could hold the basic/uniform GPU information and let users rely on CUDA for everything CUDA-specific. Right now, the basic/uniform part is almost empty (it just contains the GPU model name or so).

Also, the CUDA branch creates hwloc objects inside the GPU to describe the memory/cores/caches/... Would you use these objects in your application? Or would you rather just have a basic GPU attribute structure containing the number of SMX, the memory size, ...? One problem with the latter is that it may be hard to define a structure that works for all GPUs, or even for only the NVIDIA ones. We may need a union of structs...

I am talking about "your application" above because having lstopo draw very nice GPU internals doesn't mean the corresponding hwloc objects are useful to real applications.

Brice
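
To make the "union of structs" idea above a bit more concrete, here is a minimal sketch in C. Every name in it (gpu_attr, gpu_backend, gpu_multiprocessor_count, and all the fields) is hypothetical and made up for illustration; none of it exists in hwloc or in the cuda branch. It only shows one possible shape: a basic/uniform part shared by all GPUs, plus a tagged union for whatever is backend-specific.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch only: these types are NOT part of hwloc. */
enum gpu_backend {
    GPU_BACKEND_GENERIC, /* nothing known beyond the uniform part */
    GPU_BACKEND_CUDA,
    GPU_BACKEND_OPENCL
};

struct gpu_attr {
    /* Basic/uniform part, meaningful for any GPU. */
    const char *model_name;
    uint64_t global_mem_bytes;

    /* Tag telling which member of the union below is valid. */
    enum gpu_backend backend;

    /* Backend-specific part: a union of structs. */
    union {
        struct {
            unsigned multiprocessors; /* e.g. number of SMX */
            unsigned cores_per_mp;
        } cuda;
        struct {
            unsigned compute_units;
        } opencl;
    } u;
};

/* Return a backend-neutral multiprocessor count, or 0 when the
 * backend does not expose one. */
static unsigned gpu_multiprocessor_count(const struct gpu_attr *a)
{
    assert(a != NULL);
    switch (a->backend) {
    case GPU_BACKEND_CUDA:
        return a->u.cuda.multiprocessors;
    case GPU_BACKEND_OPENCL:
        return a->u.opencl.compute_units;
    default:
        return 0;
    }
}
```

An application would fill a struct gpu_attr once per device and then read the uniform fields directly, checking the backend tag only when it needs the specific part. The drawback mentioned above still applies: every new GPU family may need another union member.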