Re: [OMPI devel] inquiring about data structure in openmpi

2010-04-08 Thread Ralph Castain
The local rank of a process is computed by looking at all processes on a node 
that belong to the same job. The process with the lowest MPI rank in that job 
on that node is given local-rank=0, and the job's remaining processes on the 
node are given local ranks in ascending order of MPI rank.

The node rank is computed the same way, except that we look at all processes on 
the node, spanning all MPI jobs.

Consider this example. Suppose we have an MPI application that launches 3 
processes on each of two nodes, with ranks assigned on a bynode round-robin 
basis. Thus, the MPI rank mapping looks like this:

node0: rank 0, 2, 4
node1: rank 1, 3, 5

The local ranks would look like this:

Node    MPI Rank   Local Rank
node0   0          0
node0   2          1
node0   4          2

node1   1          0
node1   3          1
node1   5          2

Since we only have one job, the node rank of each process would be identical to 
its local rank. Now suppose that application calls MPI_Comm_spawn to launch two 
processes on node0. The local ranks of the new processes would be 0 and 1, 
reflecting their relative position within the new job. However, their node ranks 
would be 3 and 4 because of the three processes already on the node.
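
The rule above can be sketched in a few lines of C. This is only an 
illustration, not the ORTE implementation: proc_info_t and assign_ranks are 
hypothetical names, and the sketch assumes the process array is ordered by 
launch (earlier jobs first, ascending MPI rank within each job).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for illustration; this is NOT an ORTE structure. */
typedef struct {
    int job;        /* job identifier */
    int mpi_rank;   /* MPI rank within the job */
    int node;       /* index of the node the process runs on */
    int local_rank; /* output: rank among same-job processes on the node */
    int node_rank;  /* output: rank among all processes on the node */
} proc_info_t;

/*
 * Assign local and node ranks.
 * Assumption: procs[] is in launch order, i.e. earlier jobs come first
 * and processes within a job appear in ascending MPI rank.
 */
void assign_ranks(proc_info_t *procs, size_t nprocs)
{
    assert(procs != NULL || nprocs == 0);

    for (size_t i = 0; i < nprocs; i++) {
        int local = 0;
        int node = 0;

        /* Count the earlier processes that share this process's node. */
        for (size_t j = 0; j < i; j++) {
            if (procs[j].node != procs[i].node) {
                continue;
            }
            node++;                      /* any job counts toward node rank */
            if (procs[j].job == procs[i].job) {
                local++;                 /* only the same job counts here */
            }
        }
        procs[i].local_rank = local;
        procs[i].node_rank = node;
    }
}
```

Fed the six processes of the original job followed by the two spawned 
processes on node0, the sketch gives the spawned pair local ranks 0 and 1 and 
node ranks 3 and 4, matching the example above.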

We use these values when assigning static ports and processor affinity. Other 
than that, they have no meaning.
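
To illustrate why a per-node value is useful for static ports: two processes 
on the same node can share a local rank if they belong to different jobs, but 
never a node rank, so indexing a port list by node rank gives each process on 
the node a distinct port. The sketch below assumes a configured list of ports; 
pick_static_port and its parameters are hypothetical and do not reflect the 
actual ORTE code.

```c
#include <stddef.h>

/*
 * Hypothetical helper: pick the port for a process from a per-node list of
 * static ports, indexed by the process's node rank.
 * Returns -1 if the list is missing or has no entry for that node rank.
 */
int pick_static_port(const int *ports, size_t nports, int node_rank)
{
    if (ports == NULL || node_rank < 0 || (size_t)node_rank >= nports) {
        return -1;
    }
    return ports[node_rank];
}
```

With the spawn example above, the two new processes on node0 (node ranks 3 and 
4) would take the fourth and fifth entries of the list, leaving the first three 
to the processes already running there.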

HTH
Ralph



On Apr 7, 2010, at 7:16 PM, luyang dong wrote:

> Dear teachers:
> In orte_globals.h, there is a data structure:
>
> typedef struct {
>     /* index to node */
>     int32_t node;
>     /* local rank */
>     orte_local_rank_t local_rank;
>     /* node rank */
>     orte_node_rank_t node_rank;
> } orte_pmap_t;
>
> I do not understand exactly what local_rank and node_rank mean. Is 
> local_rank similar to the rank defined in the MPI specification? Can you 
> help me? My motivation is to achieve process migration in openmpi, and I 
> urgently want to understand the procedure for launching processes.
> 



Re: [OMPI devel] inquiring about data structure in openmpi

2010-04-08 Thread Ralph Castain
BTW, just to be clear: you don't have to write any code to compute these 
values, or to reset the job structures prior to restarting a process. This has 
already been done.

Recomputing local and node ranks is done in 
orte/mca/rmaps/base/rmaps_base_support_fns.c in a function called 
orte_rmaps_base_update_local_ranks.

Resetting the job and proc structures for restarting a process is done in 
orte/mca/plm/base/plm_base_rsh_support.c in a function called 
orte_plm_base_reset_job.

The restart logic was in the orte/mca/errmgr/orcm module, but I moved that out 
of the devel trunk recently as we needed to do some orcm-specific things in it. 
However, I can (and probably should) restore it under a different name if that 
would help.

Ralph

