Ralph,

the behaviour is slightly different between v1.10 and {v2.x,master}. here are the full details, from my CentOS 7 vm with 4 cores.

if i simply run
mpirun ./hw
then 4 tasks are created with all three ompi versions

if i run
mpirun --host localhost ./hw
then 1 task is created with all three ompi versions

now if i run
mpirun --host localhost -np 2 ./hw
v1.10 fails :
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 2 slots
that were requested by the application:
  ./hw

Either request fewer slots for your application, or make more slots available
for use.
--------------------------------------------------------------------------
but v2.x and master succeed and 2 tasks are created

more surprisingly, if i run
mpirun --host localhost -np 5 ./hw
/* 4 cores available but 5 tasks requested */
v1.10 fails with the same error message as above
*but*
v2.x and master succeed without any complaint, even though i did not use the --oversubscribe flag

last but not least, with v2.x and master, i can do
mpirun --host localhost:4 ./hw
and 4 tasks are created
*but*
v1.10 fails with the following error message

$ mpirun --host localhost:4 ./hw
ssh: Could not resolve hostname localhost:4: Name or service not known
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.
This usually is caused by:
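
the runs above can be condensed into a small python model. this is only a sketch fitted to the outcomes reported in this mail on a 4-core vm; it is not how orte actually computes slots, and `launched_tasks` and its parameters are names made up for illustration. `None` stands for "mpirun fails", and combinations that were not tried here raise NotImplementedError rather than guess.

```python
def launched_tasks(version, cores, host=None, np=None):
    """Number of tasks created, or None if mpirun fails.

    Hypothetical model of the runs reported in this thread only.
    version: "v1.10", "v2.x" or "master"
    host:    value passed to --host (None if the option is absent)
    np:      value passed to -np   (None if the option is absent)
    """
    legacy = version == "v1.10"

    if host is None:
        # no --host: one slot per core was observed on all versions
        slots = cores
    else:
        _name, sep, count = host.partition(":")
        if sep:
            if legacy:
                # v1.10 handed "localhost:4" to ssh as a hostname
                return None
            slots = int(count)
        else:
            # bare --host <hostname>: one slot on all versions
            slots = 1

    if np is None:
        return slots
    if np <= slots:
        return np
    if legacy:
        # "There are not enough slots available in the system ..."
        return None
    if host is not None and ":" not in host:
        # v2.x and master accepted this without --oversubscribe
        return np
    raise NotImplementedError("combination not tested in this thread")


for version in ("v1.10", "v2.x", "master"):
    print(version,
          launched_tasks(version, 4),
          launched_tasks(version, 4, host="localhost"),
          launched_tasks(version, 4, host="localhost", np=2),
          launched_tasks(version, 4, host="localhost", np=5),
          launched_tasks(version, 4, host="localhost:4"))
```

the only rows where the versions disagree are the ones that combine --host with either -np or a :<slots> suffix.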



as far as i am concerned, i'd rather have
mpirun --host localhost ./hw
and
mpirun ./hw
behave the same (e.g. 4 tasks are created)
but i guess this is just a matter of taste ...

that being said, since
mpirun --host localhost ./hw
does create only one task, i think it is very inconvenient that v1.10 does not support setting the number of slots on the command line, e.g.
mpirun --host localhost:4 ./hw


shall i make a pr so v1.10 supports --host <hostname>:<slots> ?
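
to make the proposal concrete, here is a minimal python sketch of the parsing such a change would need. it is not the orte code (which is C) and not taken from v2.x or master; `parse_host_spec` and `total_slots` are hypothetical names, the default of one slot mirrors what v1.10 does today for a bare hostname, and `node1`/`node2` are made-up hosts. splitting on the last colon is an assumption that would need more care for ipv6 literals.

```python
def parse_host_spec(spec, default_slots=1):
    """Split one --host entry of the form <hostname>[:<slots>].

    Hypothetical helper: returns (hostname, slots).
    """
    name, sep, count = spec.rpartition(":")
    if not sep:
        # no ":<slots>" suffix, keep the one-slot-per-host default
        return spec, default_slots
    if not name or not count.isdigit() or int(count) < 1:
        raise ValueError("bad host specification: %r" % spec)
    return name, int(count)


def total_slots(host_arg):
    """Sum the slots over a comma-separated --host argument."""
    return sum(parse_host_spec(s)[1] for s in host_arg.split(","))


print(parse_host_spec("localhost"))      # bare hostname, default slot count
print(parse_host_spec("localhost:4"))    # hostname with explicit slot count
print(total_slots("node1:2,node2"))      # mixed list
```

with something like this in place, only the hostname part would be handed to ssh, which is exactly what goes wrong in the v1.10 run above.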

Cheers,

Gilles

On 12/22/2015 11:26 PM, Ralph Castain wrote:
That is the behavior folks asked for, yes. I personally don’t care either way, 
but you’ll find that the master and 2.x branch all work that way too.


On Dec 22, 2015, at 12:49 AM, Gilles Gouaillardet <[email protected]> wrote:

Ralph,

i (re)discovered an old and odd behaviour in v1.10, which was discussed in 
https://github.com/open-mpi/ompi-release/pull/664

when running
mpirun --host xxx ...
mpirun v1.10 assumes one slot per host.

consequently, on my vm with 4 cores
mpirun -np 2 ./helloworld_mpi
works fine
but
mpirun -np 2 --host localhost ./helloworld_mpi
fails with the following error message
"There are not enough slots available ..."

if i understand correctly, and though this is a different behaviour from v1.8, 
this is compliant with the definition of the --host option.
it seems you were open to some change.

did you have time to think about it ?

Cheers,

Gilles
_______________________________________________
devel mailing list
[email protected]
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
Link to this post: 
http://www.open-mpi.org/community/lists/devel/2015/12/18450.php
Link to this post: 
http://www.open-mpi.org/community/lists/devel/2015/12/18451.php
