BTW, just to be clear: you don't have to write any code to compute these values, or to reset the job structures prior to restarting a process. This has already been done.
Recomputing local and node ranks is done in orte/mca/rmaps/base/rmaps_base_support_fns.c in a function called orte_rmaps_base_update_local_ranks.

Resetting the job and proc structures for restarting a process is done in orte/mca/plm/base/plm_base_rsh_support.c in a function called orte_plm_base_reset_job.

The restart logic was in the orte/mca/errmgr/orcm module, but I moved that out of the devel trunk recently as we needed to do some orcm-specific things in it. However, I can (and probably should) restore it under a different name if that would help.

Ralph

On Apr 7, 2010, at 10:15 PM, Ralph Castain wrote:

> The local rank of a process is computed by looking at all processes on a node
> from that job. The lowest MPI rank process on that node from that job is
> given local-rank=0. All processes on the node are given local-ranks in
> ascending order according to their MPI rank.
>
> The node rank is computed the same way, except that we look at all processes
> on the node, spanning all MPI jobs.
>
> Consider this example. Suppose we have an MPI application that launches 3
> processes on each of two nodes, with ranks assigned on a bynode round-robin
> basis. Thus, the MPI rank mapping looks like this:
>
> node0: rank 0, 2, 4
> node1: rank 1, 3, 5
>
> The local ranks would look like this:
>
> Node     MPI Rank    Local Rank
> node0    0           0
> node0    2           1
> node0    4           2
>
> node1    1           0
> node1    3           1
> node1    5           2
>
> Since we only have one job, the node rank of each process would be identical
> to its local rank. Now suppose that application does a comm_spawn that
> launches two processes on node0. The local ranks of the new processes would
> be 0,1 reflecting their relative position within that job. However, their
> node ranks would be 3,4 because of the processes already on the node.
>
> We use these values when assigning static ports and processor affinity. Other
> than that, they have no meaning.
>
> HTH
> Ralph
>
> On Apr 7, 2010, at 7:16 PM, luyang dong wrote:
>
>> Dear teachers:
>> In orte_globals.h, there is a data structure:
>>
>> typedef struct {
>>     /* index to node */
>>     int32_t node;
>>     /* local rank */
>>     orte_local_rank_t local_rank;
>>     /* node rank */
>>     orte_node_rank_t node_rank;
>> } orte_pmap_t;
>>
>> I do not understand exactly what local_rank and node_rank mean. Is
>> local_rank similar to the rank in the MPI specification? Can you help me? My
>> motivation is to achieve process migration in Open MPI, so I urgently want to
>> understand the procedure for launching a process.
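
For illustration only, the ranking rules described in the quoted message can be sketched in a few lines of C. This is not the code in orte_rmaps_base_update_local_ranks. The proc_t record, the precedes and assign_ranks helpers, and the field types are all hypothetical. The sketch also assumes that a job spawned later carries a larger job id than its parent, so that its processes are ranked on the node after the ones already there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-process record; NOT the real ORTE proc or pmap type. */
typedef struct {
    int32_t  node;        /* index of the node hosting the process */
    uint32_t jobid;       /* job the process belongs to (assumed to grow with spawn order) */
    uint32_t mpi_rank;    /* rank of the process within its job */
    uint16_t local_rank;  /* output: position among same-job processes on the node */
    uint16_t node_rank;   /* output: position among all processes on the node */
} proc_t;

/* Returns 1 if a is ordered before b: earlier job first, then lower MPI rank. */
static int precedes(const proc_t *a, const proc_t *b)
{
    if (a->jobid != b->jobid) {
        return a->jobid < b->jobid;
    }
    return a->mpi_rank < b->mpi_rank;
}

/*
 * For each process, count the processes on the same node that precede it.
 * Counting only same-job processes gives the local rank; counting across
 * all jobs gives the node rank. Quadratic, but fine for a sketch.
 */
void assign_ranks(proc_t *procs, size_t n)
{
    assert(procs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        uint16_t local = 0;
        uint16_t node = 0;

        for (size_t j = 0; j < n; j++) {
            if (j == i || procs[j].node != procs[i].node) {
                continue;               /* skip self and other nodes */
            }
            if (!precedes(&procs[j], &procs[i])) {
                continue;               /* only count earlier processes */
            }
            node++;
            if (procs[j].jobid == procs[i].jobid) {
                local++;
            }
        }
        procs[i].local_rank = local;
        procs[i].node_rank = node;
    }
}
```

Feeding this the example from the quoted message (ranks 0, 2, 4 on node0 and 1, 3, 5 on node1 in one job, plus a second job with two processes on node0) gives local ranks 0, 1, 2 per node for the first job, and local ranks 0, 1 with node ranks 3, 4 for the spawned processes.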