Wonderful!!! We've been waiting for such functionality for a while. I do have some questions/remarks related to this patch.
What is my_node_rank in the orte_proc_info_t structure? Is there any difference between using the my_node_rank field and the vpid part of my_daemon? What is the correct way to determine that two processes are at the same remote location: comparing their daemon vpid or their node_rank? How does the node_rank change with respect to dynamic process management, when new daemons join?

If I understand your last email correctly, the OPAL_PROC_ON_L*CACHE flags are only set for local processes? (I have appended a rough usage sketch below, after your quoted mail, to make sure I read the intent correctly.) I also guess proc_flags in proc.h should be of type opal_paffinity_locality_t, to match the flags at the ORTE level?

A more high-level remark: the fact that the locality information is automatically packed and exchanged during the grpcomm modex call seems a little odd (does the upper level have a say in it?). I would not have thought that grpcomm (which, based on the grpcomm.h header file, is a framework providing communication services that span entire jobs or collections of processes) is the place to put it.

Thanks,
  george.

On Oct 19, 2011, at 16:28, Ralph Castain wrote:

> Hi folks
>
> For those of you who don't follow the commits...
>
> I just committed (r25323) an extension of the orte_ess.proc_get_locality
> function that allows a process to get its relative resource usage with any
> other proc in the job. In other words, you can provide a process name to the
> function, and the returned bitmask tells you if you share a node, numa,
> socket, caches (by level), core, and hyperthread with that process.
>
> If you are on the same node and unbound, of course, you share all of those.
> However, if you are bound, then this can help tell you if you are on a common
> numa node, sharing an L1 cache, etc. Might be handy.
>
> I implemented the underlying functionality so that we can further extend it
> to tell you the relative resource location of two procs on a remote node. If
> that someday becomes of interest, it would be relatively easy to do - but
> would require passing more info around. Hence, I've allowed for it, but not
> implemented it until there is some identified need.
>
> Locality info is available anytime after the modex is completed during
> MPI_Init, and is supported regardless of launch environment (minus cnos, for
> now), launch by mpirun, or direct-launch - in other words, pretty much always.
>
> Hope it proves of help in your work
> Ralph
>
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
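
P.S. For what it's worth, here is roughly how I imagine an upper layer consuming the new call, based purely on your description above. The module entry point orte_ess.proc_get_locality, the process-name argument and the returned bitmask come from your mail; the include paths, the opal_paffinity_locality_t return type and the exact flag names (OPAL_PROC_ON_NODE, OPAL_PROC_ON_SOCKET, OPAL_PROC_ON_L1CACHE) are my guesses, not taken from the r25323 code:

    /* Hypothetical consumer of the new call -- a sketch only.  Header
     * paths, return type and flag names below are assumptions on my
     * part; only the call itself and its semantics are from your mail. */
    #include "orte/mca/ess/ess.h"
    #include "opal/mca/paffinity/paffinity.h"

    static void inspect_peer(orte_process_name_t *peer)
    {
        /* bitmask describing which resources we share with 'peer' */
        opal_paffinity_locality_t loc = orte_ess.proc_get_locality(peer);

        if (loc & OPAL_PROC_ON_NODE) {
            /* same node: the finer-grained bits are only meaningful
             * if both processes are bound */
            if (loc & OPAL_PROC_ON_SOCKET) {
                /* we share a socket */
            }
            if (loc & OPAL_PROC_ON_L1CACHE) {
                /* we share an L1 cache */
            }
        } else {
            /* remote peer: per your mail, only on/off-node information
             * is available for now */
        }
    }

If that is roughly the intended usage, then my co-location question above boils down to whether this bitmask (rather than node_rank or the daemon vpid) is the sanctioned way to test that two processes share a node.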