Hi Iftikhar

Iftikhar Rathore wrote:
> Hi
> We are using OpenMPI version 1.2.8 (packaged with OFED-1.4). I am trying
> to run HPL-2.0 (Linpack). We have two Intel quad-core CPUs in each of our
> servers (8 cores total), and all hosts in the hostfile have lines that
> look like "10.100.0.227 slots=8max_slots=8".

Is this a typo in your email or in your hostfile?

> look like "10.100.0.227 slots=8max_slots=8".

There should be a blank space between the number of slots and max_slots:

10.100.0.227 slots=8 max_slots=8
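
For instance, a hostfile listing two of your nodes would look like this
(the second IP address below is just a placeholder for another of your hosts):

10.100.0.227 slots=8 max_slots=8
10.100.0.228 slots=8 max_slots=8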

Another possibility is that you may be inadvertently using another
mpirun on the system.

A third possibility:
Does your HPL.dat file require 896 processors?
The product P x Q of each (P,Q) pair should match 896.
If it is less, HPL will run on fewer processors, i.e., on P x Q of them only.
(If it is more, HPL will issue an error message and stop.)
Is this what is happening?
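
For instance, to use all 896 processes the process grid section of
HPL.dat could look like this (28 x 32 = 896; other factorizations of
896 work as well):

1            # of process grids (P x Q)
28           Ps
32           Qs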

A fourth one ...:
Are you sure processor affinity is not working correctly?
Do the processes drift across the cores?
Typing 1 in top is not enough to tell.
To see the process-to-core map in top,
type "f" (for fields),
then "j" (to display the CPU/core number),
and watch for several minutes to see whether the processor/core
(header "P") of each process ID (header "PID")
drifts or not.
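
Alternatively, you can check from the command line, for example
(xhpl being your HPL binary, and <PID> one of its process IDs):

ps -eo pid,psr,comm | grep xhpl    # psr = core the process last ran on
taskset -cp <PID>                  # current CPU affinity list of that process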

Even when I launch fewer processes than the available/requested cores,
"--mca mpi_paffinity_alone 1" works correctly here,
as I just checked, with P=4 and Q=1 in HPL.dat
and with -np 8 on mpiexec.
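
In other words, something along these lines (with xhpl in the current
directory; adjust the path to your own build):

mpiexec --mca mpi_paffinity_alone 1 -np 8 ./xhpl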

**

I recently ran a bunch of HPL tests with --mca mpi_paffinity_alone 1
and OpenMPI 1.3.2, built from source, and the processor affinity seems
to work (i.e., the processes stick to the cores).
Building from source is quite simple, and would give you the latest OpenMPI.
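
The usual sequence is roughly this (the install prefix below is just an
example; pick whatever location suits your site):

tar xzf openmpi-1.3.2.tar.gz
cd openmpi-1.3.2
./configure --prefix=/usr/local/openmpi-1.3.2
make all install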

I don't know if 1.2.8 (which you are using)
has a problem with mpi_paffinity_alone,
but the OpenMPI developers may clarify this.


I hope this helps,
Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------


> Now when I use mpirun (even with --mca mpi_paffinity_alone 1) it does
> not keep the affinity; the processes seem to gravitate towards the first
> four cores (using top and hitting 1). I know I do have MCA paffinity
> available.
>
> [root@devi DLR_WB_88]# ompi_info | grep paffinity
> [devi.cisco.com:26178] mca: base: component_find: unable to open btl openib:
> file not found (ignored)
>            MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.2.8)
>
> The command line I am using is:
>
> # mpirun -nolocal -np 896 -v  --mca mpi_paffinity_alone 1 -hostfile
> /mnt/apps/hosts/896_8slots /mnt/apps/bin/xhpl
>
> Am I doing something wrong, and is there a way to confirm CPU affinity
> besides hitting "1" in top?
>
>
> [root@devi DLR_WB_88]# mpirun -V
> mpirun (Open MPI) 1.2.8
>
> Report bugs to http://www.open-mpi.org/community/help/
