OK, that’s what I was suspecting.  It’s a bug, right?  I asked for 4 processes 
and I supplied a host file with 4 lines in it, and mpirun didn’t launch the 
processes where I told it to launch them.

Do you know when or if this changed?  I can’t recall seeing this behavior 
in 1.6.5 or 1.4 or 1.2, and I know I’ve run cases across workstation clusters, 
so I think I would have noticed it.

Can I throw another one at you, most likely related?  On a system where node01, 
node02, node03, and node04 already had a full load of work (i.e. other 
applications were running a number of processes equal to the number of cores on 
each node), I had a hosts file like this:  node01, node01, node02, node02.   I 
asked for 4 processes.  mpirun launched them as I would expect: ranks 0 and 1 
on node01, and ranks 2 and 3 on node02.  Then I tried node01, node01, node02, 
node03.  In this case, all 4 processes were launched on node01.  Is there a 
logical explanation for this behavior as well?

Thanks again,

Ed


From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
Sent: Friday, November 07, 2014 11:51 AM
To: Open MPI Users
Subject: EXTERNAL: Re: [OMPI users] Question on mapping processes to hosts file

Ah, yes - so here is what is happening. When no slot info is provided, we use 
the number of detected cores on each node as the #slots. So if you want to 
load-balance across the nodes, you need to set --map-by node.

Or add slots=1 to each line of your host file to override the default behavior.
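
For reference, the two options above could look roughly like this. This is only a sketch: the file name hosts_slots1.dat and the executable ./my_app are placeholders I made up, and the mpirun lines are left as comments so the snippet can be run on a machine without Open MPI installed.

```shell
# Write a hostfile that declares one slot per node (placeholder file name).
cat > hosts_slots1.dat <<'EOF'
node01 slots=1
node02 slots=1
node03 slots=1
node04 slots=1
EOF

# Option 1: keep the original hosts.dat and ask mpirun to map by node:
#   mpirun --map-by node --machinefile hosts.dat -np 4 ./my_app
#
# Option 2: use the slots=1 hostfile written above:
#   mpirun --machinefile hosts_slots1.dat -np 4 ./my_app
```

Either form should put one rank on each of the four nodes for -np 4; which one is more convenient depends on whether you would rather change the command line or the host file.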

On Nov 7, 2014, at 8:52 AM, Blosch, Edwin L <edwin.l.blo...@lmco.com> wrote:

Here’s my command:

<path_to_OpenMPI_1.8.3>/bin/mpirun <unrelated MCA options> --machinefile 
hosts.dat -np 4 <executable>

Here’s my hosts.dat file:

% cat hosts.dat
node01
node02
node03
node04

All 4 ranks are launched on node01.  I don’t believe I’ve ever seen this 
before.  I had to do a sanity check, so I tried MVAPICH2-2.1a and got what I 
expected: 1 process runs on each of the 4 nodes.  The mpirun man page says 
‘round-robin’, which I take to mean that one process would be launched per line 
in the hosts file, so this really seems like incorrect behavior.

What could be the possibilities here?

Thanks for the help!



