On 22/12/2017 at 11:42, Samuel Thibault wrote:
> Hello,
>
> Brice Goglin, on Tue 19 Dec 2017 11:48:39 +0100, wrote:
>> + Memory, I/O and Misc objects are now stored in dedicated children lists,
>>   not in the usual children list that is now only used for CPU-side
>>   objects.
>>   - hwloc_get_next_child() may still be used to iterate over these 4 lists
>>     of children at once.
> I hadn't realized this before: so the NUMA-related hierarchy level
> cannot be easily obtained with hwloc_get_type_depth and such, that's really
> a concern. For instance in slurm-llnl one can find
>
> if (hwloc_get_type_depth(topology, HWLOC_OBJ_NODE) >
>     hwloc_get_type_depth(topology, HWLOC_OBJ_SOCKET)) {
>
> and probably others are doing this too, e.g. looking up from a CPU to
> find the NUMA level becomes very different from looking up from a CPU to
> find the L3 level etc.
>
> Instead of moving these objects to another place which is very
> different to find, can't we rather create another type of object, e.g.
> HWLOC_OBJ_MEMORY, to represent the different kinds of memories that can
> be found in a given NUMA level, and keep HWLOC_OBJ_NODE as it is?
>

That won't work. You can have memory attached at different levels of the
hierarchy (things like HBM inside a die, normal memory attached to a
package, and slow memory attached to the memory interconnect). The
notion of NUMA node and proximity domain is changing. It's not a set of
CPU+memory anymore. Things are moving towards the separation of "memory
initiator" (CPUs) and "memory target" (memory banks, possibly behind
memory-side caches). And those targets can be attached to different
things.

I agree that finding local NUMA nodes is harder now. I thought about
having an explicit type saying "I have memory children, others don't"
(you propose NUMA with MEMORY children, I rather thought about MEMORY
with NUMA children because people are used to NUMA node numbers, and
memory-bind to NUMA nodes). But again, there's no guarantee that they
will be at the same depth in the hierarchy since they might be attached
to different kinds of resources. Things like comparing their depth with
socket depth won't work either. So we'd end up with multiple levels just
like Groups.

I will add helpers to simplify the lookup (give me my local NUMA node if
there's a single one, give me the number of "normal" NUMA nodes so I can
split the machine into parts, ...) but it's too early to add these; we
need more feedback first.

About Slurm-llnl, their code is obsolete anyway. NUMA is inside Socket
in all modern architectures. So they expose a "Socket" resource that is
actually a NUMA node. They used an easy way to detect whether there are
multiple NUMA nodes per socket or the opposite. We can still detect that
in v2.0, even if the code is different. Once we understand what they
*really* want to do, we'll help them update that code to v2.0.

Brice

_______________________________________________
hwloc-devel mailing list
hwloc-devel@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-devel
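
To illustrate the lookup discussed above (walking up from a CPU-side
object until something with attached memory is found), here is a minimal
sketch in plain C. It is a toy model, not hwloc code: `struct toy_obj`
and all the `toy_*` names are hypothetical and only mimic the idea of a
dedicated memory children list next to the normal parent hierarchy.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a topology object. This is NOT the hwloc API. */
enum toy_kind { TOY_MACHINE, TOY_PACKAGE, TOY_DIE, TOY_CORE, TOY_MEMORY };

struct toy_obj {
    enum toy_kind kind;
    struct toy_obj *parent;       /* CPU-side parent, NULL at the root */
    struct toy_obj *first_memory; /* memory targets attached here, or NULL */
    struct toy_obj *next_memory;  /* next memory target under the same parent */
};

/* Attach a memory target to a CPU-side object (prepends to its list). */
static void toy_attach_memory(struct toy_obj *parent, struct toy_obj *mem)
{
    assert(parent != NULL && mem != NULL);
    mem->kind = TOY_MEMORY;
    mem->parent = parent;
    mem->next_memory = parent->first_memory;
    parent->first_memory = mem;
}

/* Walk up from obj until an object with attached memory is found.
 * Returns obj itself if it has memory, or NULL if nothing on the
 * path to the root has any. */
static struct toy_obj *toy_local_memory_parent(struct toy_obj *obj)
{
    while (obj != NULL && obj->first_memory == NULL)
        obj = obj->parent;
    return obj;
}

/* Number of memory targets attached directly to obj. */
static unsigned toy_memory_count(const struct toy_obj *obj)
{
    unsigned n = 0;
    const struct toy_obj *m;
    for (m = obj->first_memory; m != NULL; m = m->next_memory)
        n++;
    return n;
}
```

With HBM attached to a die, normal memory attached to the package and
slow memory attached to the machine, `toy_local_memory_parent()` called
on a core inside that die returns the die, while a core sitting directly
under the package gets the package. The memory-bearing objects end up at
different depths, which is why a single depth comparison is not enough.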