For those wishing to use the new locality functionality, here is some 
(hopefully clearer) info on how to do it. A few clarifications first:

1. the locality is defined by the precise cpu set to which a process is 
bound. If the process is not bound, that set is all the available cpus on the 
node where the process resides. 

2. the locality value we return to you is a bitmask where each bit represents a 
specific hardware level shared between you (the proc in which the call to 
orte_ess.proc_get_locality is made) and the given process. For example, if 
the "socket" bit is set, it means you and the process you specified are both 
bound to the same socket.

Important note: it does -not- mean that the other process is currently 
executing on the same socket as you are executing upon at this instant in time. 
It only means that the OS is allowing that process to use the same socket that 
you are allowed to use. As the process swaps in/out and moves around, it may or 
may not be co-located on the socket with you at any given instant.

We do not currently provide a way for a process to get the relative locality of 
two other remote processes. However, the infrastructure supports this, so we 
can add it if/when someone shows a use-case for it.

3. every process has locality info for all of its peers AND for any proc that 
connected to it via MPI connect/accept or comm_spawn (the info is included in 
the modex during the connect/accept procedure). This is true regardless of 
launch method, with the exception of cnos (which doesn't have a modex).
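
To make the bitmask idea in item 2 concrete, here is a minimal sketch of how such a value can be tested. The type name, the DEMO_* constants, and their bit positions are all made up for illustration; the real definitions live in the Open MPI headers and may well differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical locality type and bit assignments, for illustration only.
 * These are NOT the values used by Open MPI. */
typedef uint16_t demo_locality_t;

#define DEMO_ON_CLUSTER 0x0001u
#define DEMO_ON_NODE    0x0002u
#define DEMO_ON_NUMA    0x0004u
#define DEMO_ON_SOCKET  0x0008u
#define DEMO_ON_CORE    0x0010u

/* Return true when the bit for the given level is set in the mask,
 * i.e. both procs are bound within that same hardware element. */
static bool demo_shares(demo_locality_t loc, demo_locality_t level)
{
    return (loc & level) != 0;
}
```

With these made-up values, a mask of DEMO_ON_NODE | DEMO_ON_SOCKET says the two procs are allowed to run on the same node and the same socket. As noted above, it says nothing about where either one is executing at this instant.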


With that in mind, let's start with determining if a proc is on the same node. 
The only way to determine if two procs other than yourself are on the same node 
is to compare their daemon vpids:

if (orte_ess.proc_get_daemon(A) == orte_ess.proc_get_daemon(B)), then A and B 
are on the same node.
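
As a rough sketch of that comparison, the snippet below stands in a small lookup table for orte_ess.proc_get_daemon so it can be compiled on its own. The table contents, the demo_* names, and the "invalid" sentinel are assumptions for the example, not part of the ORTE API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t demo_vpid_t;
#define DEMO_VPID_INVALID UINT32_MAX

/* Hypothetical mapping: index = proc rank, value = vpid of the daemon
 * on the node hosting that proc. */
static const demo_vpid_t demo_daemon_of[] = { 0, 0, 1, 1, 2 };
static const size_t demo_nprocs =
    sizeof(demo_daemon_of) / sizeof(demo_daemon_of[0]);

/* Stand-in for the daemon lookup; returns the sentinel for unknown ranks. */
static demo_vpid_t demo_get_daemon(size_t rank)
{
    return rank < demo_nprocs ? demo_daemon_of[rank] : DEMO_VPID_INVALID;
}

/* Two procs are on the same node when they share a daemon. */
static bool demo_same_node(size_t a, size_t b)
{
    demo_vpid_t da = demo_get_daemon(a);
    demo_vpid_t db = demo_get_daemon(b);
    if (da == DEMO_VPID_INVALID || db == DEMO_VPID_INVALID) {
        return false;   /* unknown proc: make no claim */
    }
    return da == db;
}
```

In this toy table, ranks 0 and 1 share daemon 0 and so are on the same node, while ranks 1 and 2 are not.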


However, there are two ways to determine if another proc is on the same node as 
you. First, you can of course use the above method to determine if you share 
the same daemon:

if (orte_ess.proc_get_daemon(A) == ORTE_PROC_MY_DAEMON->vpid), then A is on 
the same node as you.

Alternatively, you can use the proc locality since it contains a "node" bit:

if (OPAL_PROC_ON_LOCAL_NODE(orte_ess.proc_get_locality(A))), then the proc is 
on the same node as us.


Similarly, we can determine if another process shares a socket, NUMA node, or 
other hardware element with us by applying the corresponding OPAL_PROC_ON_xxx 
macro to the locality returned by calling orte_ess.proc_get_locality for that 
process.
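
One possible use of such checks is to report the tightest hardware level you share with a peer. The sketch below does this with made-up DEMO_* bits rather than the real OPAL_PROC_ON_xxx macros. The bit values and the order in which levels are tested are assumptions for illustration; on real hardware the nesting of sockets and NUMA nodes can vary.

```c
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only.
 * These are NOT the values used by Open MPI. */
#define DEMO_ON_CLUSTER 0x0001u
#define DEMO_ON_NODE    0x0002u
#define DEMO_ON_NUMA    0x0004u
#define DEMO_ON_SOCKET  0x0008u
#define DEMO_ON_CORE    0x0010u

/* Name the tightest level set in the mask, testing from the assumed
 * innermost level (core) outward to the cluster. */
static const char *demo_closest_shared(uint16_t loc)
{
    if (loc & DEMO_ON_CORE)    return "core";
    if (loc & DEMO_ON_SOCKET)  return "socket";
    if (loc & DEMO_ON_NUMA)    return "numa";
    if (loc & DEMO_ON_NODE)    return "node";
    if (loc & DEMO_ON_CLUSTER) return "cluster";
    return "none";
}
```

In real code the argument would be the value returned by orte_ess.proc_get_locality for the peer, and each test would use the corresponding OPAL_PROC_ON_xxx macro instead of a raw bit test.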

HTH
Ralph

