Hi Zbigniew,

Besides the OpenMPI processor affinity capability that Jeff mentioned:

If your Curie cluster has a resource manager [Torque, SGE, etc.],
your job submission script to the resource manager/queue system
should explicitly request a single node for the test that you have in mind.

For instance, on Torque/PBS, this would be done by adding this directive to
the top of the job script:

#PBS -l nodes=1:ppn=8
...
mpiexec -np 8 ...

meaning that you want the 8 processors [i.e., cores] to be on a single node.

On top of this, you need to add the appropriate process binding
keywords to the mpiexec command line, as Jeff suggested.
'man mpiexec' will tell you a lot about the OpenMPI process binding capability, especially in the 1.6 and 1.4 series.
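
For example, with the 1.6 series something along these lines should pin
each of the 8 ranks to its own core [the program name is just a
placeholder], and --report-bindings lets you double-check the placement:

mpiexec -np 8 --bind-to-core -bycore --report-bindings ./your_mpi_program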

In the best of worlds, your resource manager also has the ability to assign
a group of cores exclusively to each of the jobs that may be sharing the node.
Say, job1 requests 4 cores and gets cores 0-3 and cannot use any other cores;
job2 requests 8 cores and gets cores 4-11 and cannot use any other cores;
and so on.

However, not all resource managers/queue systems are built this way
[particularly the older versions], and they may let the various job
processes drift across all cores in the node.

If the resource manager is old and doesn't have that hardware locality
capability, and if you don't want your performance test to risk being
polluted by other jobs running on the same node, which may share the same
cores as your job, then you can request all 32 cores in the node for your
job, but use only 8 of them to run your MPI program.
It is wasteful, but it may be the only way to go.
For instance, on Torque:

#PBS -l nodes=1:ppn=32
...
mpiexec -np 8 ...

Again, add the OpenMPI process binding keywords to the mpiexec command line,
to ensure the use of a fixed group of 8 cores.
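
Putting it all together, a minimal Torque script for this "whole node,
8 ranks" approach could look something like the sketch below [the program
name, walltime, etc. are just placeholders for your actual setup]:

#!/bin/bash
#PBS -l nodes=1:ppn=32
#PBS -l walltime=01:00:00
cd $PBS_O_WORKDIR
mpiexec -np 8 --bind-to-core -bycore --report-bindings ./your_mpi_program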

With SGE and Slurm the syntax is different from the above,
but I would guess that there is an equivalent setup.
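
For instance, on Slurm I would expect something along these lines to do
the same single-node request [an untested guess, values are placeholders]:

#SBATCH --nodes=1
#SBATCH --ntasks=8
...
mpiexec -np 8 ...

and "#SBATCH --exclusive" should give you the whole node to yourself,
much like the ppn=32 trick above.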

I hope this helps,
Gus Correa

On 08/30/2012 08:07 AM, Jeff Squyres wrote:
In the OMPI v1.6 series, you can use the processor affinity options.  And you 
can use --report-bindings to show exactly where processes were bound.  For 
example:

-----
% mpirun -np 4 --bind-to-core --report-bindings -bycore uptime
[svbu-mpi056:18904] MCW rank 0 bound to socket 0[core 0]: [B . . .][. . . .]
[svbu-mpi056:18904] MCW rank 1 bound to socket 0[core 1]: [. B . .][. . . .]
[svbu-mpi056:18904] MCW rank 2 bound to socket 0[core 2]: [. . B .][. . . .]
[svbu-mpi056:18904] MCW rank 3 bound to socket 0[core 3]: [. . . B][. . . .]
  05:06:13 up 7 days,  6:57,  1 user,  load average: 0.29, 0.10, 0.03
  05:06:13 up 7 days,  6:57,  1 user,  load average: 0.29, 0.10, 0.03
  05:06:13 up 7 days,  6:57,  1 user,  load average: 0.29, 0.10, 0.03
  05:06:13 up 7 days,  6:57,  1 user,  load average: 0.29, 0.10, 0.03
%
-----

I bound each process to a single core, and mapped them on a round-robin basis 
by core.  Hence, all 4 processes ended up on their own cores on a single 
processor socket.

The --report-bindings output shows that this particular machine has 2 sockets, 
each with 4 cores.



On Aug 30, 2012, at 5:37 AM, Zbigniew Koza wrote:

Hi,

consider this specification:

"Curie fat consists in 360 nodes which contains 4 eight cores CPU Nehalem-EX clocked 
at 2.27 GHz, let 32 cores / node and 11520 cores for the full fat configuration"

Suppose I would like to run some performance tests just on a single processor 
rather than 4 of them.
Is there a way to do this?
I'm afraid specifying that I need 1 cluster node with 8 MPI processes
will result in the OS distributing these 8 processes among the 4
processors forming the node, and this is not what I'm after.

Z Koza