Thanks all for your answers. I've added some details about the tests I have
run; see below.
Ralph Castain wrote:
Not precisely correct. It depends on the environment.
If there is a resource manager allocating nodes, or you provide a hostfile
that specifies the number of slots on the nodes, or you use -host, then we
default to no-oversubscribe.
I'm using a batch scheduler (OAR).
# cat /dev/cpuset/oar/begou_7955553/cpuset.cpus
4-7
So 4 cores are allowed. The nodes have two eight-core CPUs.
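For anyone who wants to check the allowed core count from a script, here is a minimal sketch that expands the cpuset list format shown above ("4-7") into individual CPU ids. The function name `parse_cpu_list` is only illustrative, and the sketch assumes the usual comma-separated list of ids and ranges; it is not part of OAR or Open MPI.

```python
def parse_cpu_list(text):
    """Expand a cpuset list such as "4-7" or "0-3,8,10-11" into CPU ids."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate an empty file or a trailing comma
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))  # ranges are inclusive
        else:
            cpus.append(int(part))
    return cpus


# Content of cpuset.cpus as printed by the cat command above.
allowed = parse_cpu_list("4-7\n")
print(allowed, len(allowed))  # [4, 5, 6, 7] 4
```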
Node file contains:
# cat $OAR_NODEFILE
frog53
frog53
frog53
frog53
# mpirun --hostfile $OAR_NODEFILE -bind-to core location.exe
is okay (my test code shows one process on each core):
(process 3) thread is now running on PU logical index 1 (OS/physical index 5) on system frog53
(process 0) thread is now running on PU logical index 3 (OS/physical index 7) on system frog53
(process 1) thread is now running on PU logical index 0 (OS/physical index 4) on system frog53
(process 2) thread is now running on PU logical index 2 (OS/physical index 6) on system frog53
# mpirun -np 5 --hostfile $OAR_NODEFILE -bind-to core location.exe
oversubscribes with:
(process 0) thread is now running on PU logical index 3 (OS/physical index 7) on system frog53
(process 1) thread is now running on PU logical index 1 (OS/physical index 5) on system frog53
(*process 3*) thread is now running on PU logical index *2 (OS/physical index 6)* on system frog53
(process 4) thread is now running on PU logical index 0 (OS/physical index 4) on system frog53
(*process 2*) thread is now running on PU logical index *2 (OS/physical index 6)* on system frog53
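Spotting the doubled-up core by eye gets tedious with more ranks, so here is a small sketch that scans this kind of report and lists every PU hosting more than one process. It assumes the line format printed by my test code above, with wrapped lines rejoined and the emphasis markers removed; `overloaded_pus` and `LINE_RE` are illustrative names, not part of any tool.

```python
import re
from collections import defaultdict

# One report line: rank, logical PU index, OS/physical PU index, host name.
LINE_RE = re.compile(
    r"\(process (\d+)\) thread is now running on PU logical index (\d+) "
    r"\(OS/physical index (\d+)\) on system (\S+)"
)


def overloaded_pus(report):
    """Map (host, physical PU) to its ranks, keeping only PUs with 2+ ranks."""
    by_pu = defaultdict(list)
    for rank, _logical, physical, host in LINE_RE.findall(report):
        by_pu[(host, int(physical))].append(int(rank))
    return {pu: sorted(ranks) for pu, ranks in by_pu.items() if len(ranks) > 1}


# The -np 5 run from above.
SAMPLE = """\
(process 0) thread is now running on PU logical index 3 (OS/physical index 7) on system frog53
(process 1) thread is now running on PU logical index 1 (OS/physical index 5) on system frog53
(process 3) thread is now running on PU logical index 2 (OS/physical index 6) on system frog53
(process 4) thread is now running on PU logical index 0 (OS/physical index 4) on system frog53
(process 2) thread is now running on PU logical index 2 (OS/physical index 6) on system frog53
"""

print(overloaded_pus(SAMPLE))  # {('frog53', 6): [2, 3]}
```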
This is not allowed with OpenMPI 1.7.3.
I can increase the process count up to the number of cores of this first processor (8 cores):
# mpirun -np 8 --hostfile $OAR_NODEFILE -bind-to core location.exe | grep 'thread is now running on PU'
(process 5) thread is now running on PU logical index 1 (OS/physical index 5) on system frog53
(process 7) thread is now running on PU logical index 3 (OS/physical index 7) on system frog53
(process 4) thread is now running on PU logical index 0 (OS/physical index 4) on system frog53
(process 6) thread is now running on PU logical index 2 (OS/physical index 6) on system frog53
(process 2) thread is now running on PU logical index 1 (OS/physical index 5) on system frog53
(process 0) thread is now running on PU logical index 2 (OS/physical index 6) on system frog53
(process 1) thread is now running on PU logical index 0 (OS/physical index 4) on system frog53
(process 3) thread is now running on PU logical index 0 (OS/physical index 4) on system frog53
But I cannot overload beyond 8 cores (the core count of one CPU).
# mpirun -np 9 --hostfile $OAR_NODEFILE -bind-to core location.exe
A request was made to bind to that would result in binding more
processes than cpus on a resource:
Bind to: CORE
Node: frog53
#processes: 2
#cpus: 1
You can override this protection by adding the "overload-allowed"
option to your binding directive.
Now if I add *--nooversubscribe* the problem no longer occurs (no more than
4 processes, one on each core). So it looks as if the default behavior is a
no-oversubscribe limit based on the number of cores of the socket rather than
of the cpuset?
Again, with 1.7.3 this problem doesn't occur at all.
Patrick
If you provide a hostfile that doesn't specify slots, then we use the number
of cores we find on each node, and we allow oversubscription.
What is being described sounds like more of a bug than an intended feature.
I'd need to know more about it, though, to be sure. Can you tell me how you
are specifying this cpuset?
On Sep 15, 2015, at 4:44 PM, Matt Thompson <fort...@gmail.com> wrote:
Looking at the Open MPI 1.10.0 man page:
https://www.open-mpi.org/doc/v1.10/man1/mpirun.1.php
it looks like perhaps -oversubscribe (which was an option) is now the default
behavior. Instead we have:
*-nooversubscribe, --nooversubscribe*
Do not oversubscribe any nodes; error (without starting any processes) if
the requested number of processes would cause oversubscription. This
option implicitly sets "max_slots" equal to the "slots" value for each node.
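To make the man page wording concrete, here is a toy model of that check in Python. It is only a sketch of the rule as described, not Open MPI's implementation: it assumes each repeated hostname in a node file counts as one slot, and the names `slots_per_node` and `check_no_oversubscribe` are made up for illustration.

```python
from collections import Counter


def slots_per_node(nodefile_text):
    """Count slots per host, assuming one slot per line of the node file."""
    return Counter(
        line.strip() for line in nodefile_text.splitlines() if line.strip()
    )


def check_no_oversubscribe(nodefile_text, nprocs):
    """Raise before 'launching' if nprocs exceeds the total slot count."""
    total = sum(slots_per_node(nodefile_text).values())
    if nprocs > total:
        raise RuntimeError(
            f"{nprocs} processes requested but only {total} slots available"
        )
    return total


# A node file listing one host four times, i.e. four slots on that host.
NODEFILE = "frog53\nfrog53\nfrog53\nfrog53\n"

print(check_no_oversubscribe(NODEFILE, 4))  # 4
```

With this model, asking for 5 processes on the same node file raises the error instead of starting anything, which is the behavior the option describes.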
It also looks like -map-by has a way to implement it as well (see man page).
Thanks for letting me/us know about this. On a system of mine I sort of
depend on the -nooversubscribe behavior!
Matt
On Tue, Sep 15, 2015 at 11:17 AM, Patrick Begou <patrick.be...@legi.grenoble-inp.fr> wrote:
Hi,
I'm running OpenMPI 1.10.0 built with Intel 2015 compilers on a Bullx system.
I have some trouble with the bind-to core option when using a cpuset.
If the cpuset contains fewer cores than the CPU has (e.g. 4 cores allowed on
an 8-core CPU), OpenMPI 1.10.0 allows these cores to be overloaded up to the
total number of cores of the CPU.
With this configuration, and because the cpuset only allows 4 cores, I can reach
2 processes/core if I use:
mpirun -np 8 --bind-to core my_application
OpenMPI 1.7.3 doesn't show the problem in the same situation:
mpirun -np 8 --bind-to-core my_application
returns:
A request was made to bind to that would result in binding more
processes than cpus on a resource
and that's okay of course.
Is there a way to avoid this overloading with OpenMPI 1.10.0?
Thanks
Patrick
--
===================================================================
| Equipe M.O.S.T. | |
| Patrick BEGOU |mailto:patrick.be...@grenoble-inp.fr |
| LEGI | |
| BP 53 X | Tel 04 76 82 51 35 |
| 38041 GRENOBLE CEDEX | Fax 04 76 82 52 71 |
===================================================================
_______________________________________________
users mailing list
us...@open-mpi.org <mailto:us...@open-mpi.org>
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post:
http://www.open-mpi.org/community/lists/users/2015/09/27575.php
--
Matt Thompson
Man Among Men
Fulcrum of History