> On Nov 3, 2014, at 4:54 AM, Mark Dixon <m.c.di...@leeds.ac.uk> wrote:
> 
> Hi there,
> 
> We've started looking at moving to the openmpi 1.8 branch from 1.6 on our 
> CentOS6/Son of Grid Engine cluster and noticed an unexpected difference when 
> binding multiple cores to each rank.
> 
> Has openmpi's definition 'slot' changed between 1.6 and 1.8? It used to mean 
> ranks, but now it appears to mean processing elements (see Details, below).

It actually didn’t change - there were some errors in prior versions in how we 
were handling things, and we have corrected them. A “slot” was never equated to 
an MPI rank; it is an allocation from the scheduler - it means you have been 
allocated one resource on the given node. So the number of “slots” on a node 
equates to the number of resources on that node that were allocated for your 
use.

Note also that a “slot” doesn’t automatically correspond to a core - someone 
may well decide to define a “slot” as the equivalent of a “container” 
comprising several cores. It is indeed an abstraction used by the scheduler 
when assigning resources.
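
Purely for illustration (the hostnames, queue name, and trailing field below are 
made-up placeholders, and the exact format depends on your scheduler): under Grid 
Engine the allocation arrives via $PE_HOSTFILE, one line per host giving the 
number of slots granted there, something like

  node001  4  parallel.q@node001  UNDEFINED
  node002  4  parallel.q@node002  UNDEFINED

The “4” is just the scheduler saying “you may use 4 resources here” - whether 
each of those resources is a core, a hwthread, or some bigger container is a 
site policy decision, not something mpirun gets to choose.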

Because of the confusion we’ve encountered both internally and externally over 
the meaning of the term “cpu”, we adopted the term “processing element” (PE). 
So if you are individually assigning hwthreads, your PE is at the hwthread 
level. If you individually assign cores, then a PE equates to a core.
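
As a rough sketch of what that means in practice (option names from memory of 
the 1.8 series, so treat them as illustrative):

  mpirun --map-by core --bind-to core <program>
  mpirun --use-hwthread-cpus --map-by hwthread --bind-to hwthread <program>

In the first case whole cores are the PEs being mapped and bound; in the 
second, the individual hwthreads are.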

We know the mpirun man page in 1.8.3 was woefully out-of-date, and that has 
been fixed for the soon-to-be-released 1.8.4. Some of the options that were 
supposed to be deprecated (a) were accidentally turned completely off, and (b) 
have been restored (and “un-deprecated”) per user request. So --bysocket will 
indeed return in 1.8.4.
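
As a sketch (not a guarantee of exact equivalence), the 1.6-style

  mpirun --bind-to-core --bysocket --cpus-per-proc <n> <program>

corresponds in the 1.8 series to something like

  mpirun --map-by socket:PE=<n> --bind-to core <program>

with --bysocket itself returning in 1.8.4 as noted above.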

If you only have one allocated PE on a node, then mpirun will correctly tell 
you that it can’t launch with PE>1 as there aren’t enough resources to meet 
your request. IIRC, we may have been ignoring this under SGE and running as 
many procs as we wanted on an allocated node - the SGE folks provided a patch 
to fix that hole.

I’ll check the case you describe below - if you don’t specify the number of 
procs to run, we should correctly resolve the number of ranks to start.
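
To put illustrative numbers on that: say the scheduler grants you 4 slots on a 
16-core node and <n> = 4. Under 1.6.5 you got one rank per slot - 4 ranks, each 
bound to 4 cores. Under 1.8.3 each rank appears to be charged <n> of the 
allocated slots, so only 4/4 = 1 rank starts - the “factor of <n> fewer” you 
describe below.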

> 
> Thanks,
> 
> Mark
> 
> PS Also, the man page for 1.8.3 reports that '--bysocket' is deprecated, but 
> it doesn't seem to exist when we try to use it:
> 
>  mpirun: Error: unknown option "-bysocket"
>  Type 'mpirun --help' for usage.
> 
> ====== Details ======
> 
> On 1.6.5, we launch with the following core binding options:
> 
>  mpirun --bind-to-core --cpus-per-proc <n> <program>
>  mpirun --bind-to-core --bysocket --cpus-per-proc <n> <program>
> 
>  where <n> is calculated to maximise the number of cores available to
>  use - I guess effectively
>  max(1, int(number of cores per node / slots per node requested)).
> 
>  openmpi reads the file $PE_HOSTFILE and launches a rank for each slot
>  defined in it, binding <n> cores per rank.
> 
> On 1.8.3, we've tried launching with the following core binding options 
> (which we hoped were equivalent):
> 
>  mpirun -map-by node:PE=<n> <program>
>  mpirun -map-by socket:PE=<n> <program>
> 
>  openmpi reads the file $PE_HOSTFILE and launches a factor of <n> fewer
>  ranks than under 1.6.5. We also notice that, where we wanted a single
>  rank on the box and <n> is the number of cores available, openmpi
>  refuses to launch and we get the message:
> 
>  "There are not enough slots available in the system to satisfy the 1
>  slots that were requested by the application"
> 
>  I think that error message needs a little work :)
