1. Ralph, if I run my experiment under SGE, then I get the same results with 1.4.2 as with 1.4.1.

$ qrsh -cwd -V -pe ompi* 16 -l h_rt=10:00:00,h_vmem=2G bash

graphics03 $ cat hosts
graphics01 slots=1
graphics02 slots=1
graphics03 slots=1
graphics04 slots=1

graphics03 $ ~/openmpi/gnu129/bin/mpirun -hostfile hosts hostname
graphics03
graphics01
graphics04
graphics02

graphics03 $ ~/openmpi/gnu141/bin/mpirun -hostfile hosts hostname
[graphics03:08200] ras:gridengine: JOB_ID: 89217
graphics03

graphics03 $ ~/openmpi/gnu142/bin/mpirun -hostfile hosts hostname
graphics03
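
(For what it's worth, the same diagnostic flags used in item 2 below could be
added to this failing case, to show how the hostfile is being combined with the
SGE allocation inside the qrsh session; only the command is sketched here, I
have not pasted any output:)

graphics03 $ ~/openmpi/gnu142/bin/mpirun --display-allocation --display-map -hostfile hosts hostname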


2. However, if I try to reproduce your experiment, then I get exactly the same result as you do.

$ ~/openmpi/gnu141/bin/mpirun --do-not-resolve --do-not-launch --display-map --display-allocation -default-hostfile defserge -hostfile serge hostname

======================   ALLOCATED NODES   ======================

 Data for node: Name: clhead    Num slots: 0    Max slots: 0
 Data for node: Name: graphics01        Num slots: 4    Max slots: 0
 Data for node: Name: graphics02        Num slots: 4    Max slots: 0
 Data for node: Name: graphics03        Num slots: 4    Max slots: 0
 Data for node: Name: graphics04        Num slots: 4    Max slots: 0

=================================================================

 ========================   JOB MAP   ========================

 Data for node: Name: graphics01        Num procs: 1
        Process OMPI jobid: [26782,1] Process rank: 0

 Data for node: Name: graphics02        Num procs: 1
        Process OMPI jobid: [26782,1] Process rank: 1

 Data for node: Name: graphics03        Num procs: 1
        Process OMPI jobid: [26782,1] Process rank: 2

 Data for node: Name: graphics04        Num procs: 1
        Process OMPI jobid: [26782,1] Process rank: 3

 =============================================================

$ cat defserge
graphics01 slots=4
graphics02 slots=4
graphics03 slots=4
graphics04 slots=4

$ cat serge
graphics01 slots=1
graphics02 slots=1
graphics03 slots=1
graphics04 slots=1

$ ~/openmpi/gnu141/bin/mpirun -default-hostfile defserge -hostfile serge hostname
graphics04
graphics01
graphics02
graphics03


3. So, at this point I am wondering whether using a default hostfile in place of the SGE allocation actually matters.
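
(One check that might answer this, assuming the usual MCA component-exclusion
syntax also covers the gridengine allocator in 1.4.x, would be to disable that
allocator inside the qrsh session and see whether the hostfile is then honored
the same way it is in a plain interactive shell; just a sketch:)

graphics03 $ ~/openmpi/gnu141/bin/mpirun -mca ras ^gridengine -hostfile hosts hostname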

Also, as I mentioned, my colleague reports the same problem with OMPI 1.3.4 and SGE 6.2u4 on a different cluster, and each of us installed the software completely independently.

Thank you for the help.

= Serge


Ralph Castain wrote:
Hmmm...not sure what to say. I don't have a copy of 1.4.1 handy, but I did try 
this on the current 1.4 branch (soon to be released as 1.4.2) and it worked 
perfectly (had to use a default hostfile in place of your SGE allocation, but 
that doesn't matter to the code):

Ralph:v1.4 rhc$ mpirun --do-not-resolve --do-not-launch --display-map 
--display-allocation -default-hostfile defserge -hostfile serge foo

======================   ALLOCATED NODES   ======================

 Data for node: Name: Ralph     Num slots: 0    Max slots: 0
 Data for node: Name: node01    Num slots: 4    Max slots: 0
 Data for node: Name: node02    Num slots: 4    Max slots: 0
 Data for node: Name: node03    Num slots: 4    Max slots: 0
 Data for node: Name: node04    Num slots: 4    Max slots: 0

=================================================================

 ========================   JOB MAP   ========================

 Data for node: Name: node01    Num procs: 1
        Process OMPI jobid: [47823,1] Process rank: 0

 Data for node: Name: node02    Num procs: 1
        Process OMPI jobid: [47823,1] Process rank: 1

 Data for node: Name: node03    Num procs: 1
        Process OMPI jobid: [47823,1] Process rank: 2

 Data for node: Name: node04    Num procs: 1
        Process OMPI jobid: [47823,1] Process rank: 3

 =============================================================
Ralph:v1.4 rhc$ cat defserge
node01 slots=4
node02 slots=4
node03 slots=4
node04 slots=4

Ralph:v1.4 rhc$ cat serge
node01 slots=1
node02 slots=1
node03 slots=1
node04 slots=1


Can you try it on the 1.4.2 nightly tarball and see if it works okay for you?

http://www.open-mpi.org/nightly/v1.4/
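
In case it saves you time, this is roughly how such a build could be done next
to your existing installs (the install prefix below just mirrors your earlier
naming, and, as far as I recall, in the 1.3/1.4 series the SGE support is only
built when --with-sge is given to configure):

# after unpacking whichever nightly tarball is current at the URL above
$ ./configure --prefix=$HOME/openmpi/gnu142 --with-sge
$ make -j4 all install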



On Apr 7, 2010, at 10:02 AM, Serge wrote:

Thank you, Ralph.

I have read the wiki and the man pages, but I am still not sure I understand what 
is going on in my example. I cannot filter the slots allocated by SGE. I also think 
there is a deviation from the behavior described on the wiki (specifically, the 
fifth example from the top in the section "NOW RUNNING FROM THE INTERACTIVE SHELL").

So, below I am pasting my session; could you please follow my line of thought 
and correct me where I am mistaken?

Here I request an interactive session with 16 slots on 4 four-core nodes like 
so:

  $ qrsh -cwd -V -pe ompi* 16 -l h_rt=10:00:00,h_vmem=2G bash

Now, I show that all 16 slots are available and everything is working as 
expected with both OMPI 1.2.9 and OMPI 1.4.1:

  graphics01 $ ~/openmpi/gnu141/bin/mpirun hostname
  [graphics01:24837] ras:gridengine: JOB_ID: 89052
  graphics01
  graphics01
  graphics01
  graphics01
  graphics04
  graphics04
  graphics02
  graphics02
  graphics04
  graphics02
  graphics04
  graphics02
  graphics03
  graphics03
  graphics03
  graphics03

  graphics01 $ ~/openmpi/gnu129/bin/mpirun hostname
  [graphics01:24849] ras:gridengine: JOB_ID: 89052
  graphics01
  graphics04
  graphics02
  graphics03
  graphics01
  graphics04
  graphics02
  graphics03
  graphics01
  graphics03
  graphics01
  graphics04
  graphics03
  graphics04
  graphics02
  graphics02

Now I want to filter the list of 16 slots by using a host file, so that I run 
one process per node.

  graphics01 $ cat hosts
  graphics01 slots=1
  graphics02 slots=1
  graphics03 slots=1
  graphics04 slots=1

And I try to use it with OMPI 1.2.9 and 1.4.1:

  graphics01 $ ~/openmpi/gnu129/bin/mpirun -hostfile hosts hostname
  graphics04
  graphics01
  graphics03
  graphics02

  graphics01 $ ~/openmpi/gnu141/bin/mpirun -hostfile hosts hostname
  [graphics01:24903] ras:gridengine: JOB_ID: 89052
  graphics01

So, as you can see, OMPI 1.4.1 did not recognize any hosts except the current 
shepherd host.

Moreover, similarly to the example further down on 
https://svn.open-mpi.org/trac/ompi/wiki/HostFilePlan, I create two other host 
files:

  graphics01 $ cat hosts1
  graphics02
  graphics02

  graphics01 $ cat hosts2
  graphics02 slots=2

And then I try to use them with both versions of Open MPI.

It works properly with OMPI 1.2.9 (the same way as shown on the wiki!), but 
does NOT with 1.4.1:

  graphics01 $ ~/openmpi/gnu129/bin/mpirun -hostfile hosts1 hostname
  graphics02
  graphics02

  graphics01 $ ~/openmpi/gnu129/bin/mpirun -hostfile hosts2 hostname
  graphics02
  graphics02

  graphics01 $ ~/openmpi/gnu141/bin/mpirun -hostfile hosts1 hostname
  [graphics01:25756] ras:gridengine: JOB_ID: 89055
--------------------------------------------------------------------------
  There are no allocated resources for the application
    hostname
  that match the requested mapping:
    hosts1

  Verify that you have mapped the allocated resources properly using the
  --host or --hostfile specification.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
  A daemon (pid unknown) died unexpectedly on signal 1  while attempting to
  launch so we are aborting.
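
(If more detail would help, I can also rerun this failing case with the launcher
verbosity turned up - I am assuming here that the plm_base_verbose MCA parameter
applies to 1.4.1:)

  graphics01 $ ~/openmpi/gnu141/bin/mpirun -mca plm_base_verbose 5 -hostfile hosts1 hostname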

= Serge


Ralph Castain wrote:
I should have read your original note more closely and I would have spotted the 
issue. How a hostfile is used changed between OMPI 1.2 and the 1.3 (and above) 
releases per user requests. It was actually the SGE side of the community that 
led the change :-)

You can get a full description of how OMPI uses hostfiles in two ways:

* from the man pages: man orte_hosts
* from the wiki: https://svn.open-mpi.org/trac/ompi/wiki/HostFilePlan

As far as I can tell, OMPI 1.4.x is behaving per that specification. You get 
four slots on your submission script because that is what SGE allocated to you. 
The hostfile filters that when launching, using the provided info to tell it 
how many slots on each node within the allocation to use for that application.

I suggest reading the above documentation to see how OMPI uses hostfiles, and 
then let us know if you have any questions, concerns, or see a deviation from 
the described behavior.
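
To make the filtering concrete (hypothetical file name, reusing the node names
from your submission-script run): given an SGE allocation of four slots on each
of node01-node04, a hostfile that names only two of those nodes with one slot
each is meant to restrict the launch to exactly those two slots - it does not
add nodes or slots beyond the allocation:

$ cat filter
node01 slots=1
node02 slots=1

$ mpirun -hostfile filter ./program    # intended result: one proc on node01, one on node02
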
HTH
Ralph
On Apr 7, 2010, at 5:36 AM, Serge wrote:
If you run your cmd with the hostfile option and add
--display-allocation, what does it say?

Thank you, Ralph.

This is the command I used inside my submission script:

mpirun --display-allocation -np 4 -hostfile hosts ./program

And this is the output I got.

Data for node: Name: node03  Num slots: 4    Max slots: 0
Data for node: Name: node02  Num slots: 4    Max slots: 0
Data for node: Name: node04  Num slots: 4    Max slots: 0
Data for node: Name: node01  Num slots: 4    Max slots: 0

If I run the same mpirun command on the cluster head node "clhead" then this is 
what I get:

Data for node: Name: clhead  Num slots: 0    Max slots: 0
Data for node: Name: node01  Num slots: 1    Max slots: 0
Data for node: Name: node02  Num slots: 1    Max slots: 0
Data for node: Name: node03  Num slots: 1    Max slots: 0
Data for node: Name: node04  Num slots: 1    Max slots: 0

The content of the 'hosts' file:

node01 slots=1
node02 slots=1
node03 slots=1
node04 slots=1

= Serge


On Apr 6, 2010, at 12:18 PM, Serge wrote:

Hi,
Open MPI integrates with Sun Grid Engine really well, and one does not need to 
specify any parameters for the mpirun command to launch the processes on the 
compute nodes; having "mpirun ./program" in the submission script is enough, and 
there is no need for "-np XX" or "-hostfile file_name".

However, there are cases when being able to specify the hostfile is important 
(hybrid jobs, users with MPICH jobs, etc.). For example, with Grid Engine I can 
request four 4-core nodes, that is, a total of 16 slots. But I also want to 
specify how to distribute processes across the nodes, so I create the file 'hosts':

  node01 slots=1
  node02 slots=1
  node03 slots=1
  node04 slots=1

and modify the line in the submission script to:

  mpirun -hostfile hosts ./program

With Open MPI 1.2.x everything worked properly, meaning that Open MPI could 
count the number of slots specified in the 'hosts' file - 4 (i.e. effectively 
supplying the mpirun command with the -np parameter), as well as properly 
distribute processes on the compute nodes (one process per host).

It is different with Open MPI 1.4.1: it cannot process the 'hosts' file properly 
at all, and all the processes get launched on just one node -- the shepherd host.

The format of the 'hosts' file does not matter. It can be, say,

  node01
  node01
  node02
  node02

meaning 2 slots on each node. Open MPI 1.2.x would handle that with no problem; 
Open MPI 1.4.x, however, would not.

The problem appears with OMPI 1.4.1 and SGE 6.1u6. It was also observed with 
OMPI 1.3.4 and SGE 6.2u4.

It is important to note that if the mpirun command is run interactively, not 
from inside the Grid Engine script, then it interprets the content of the host 
file just fine.

I am wondering what changed from OMPI 1.2.x to OMPI 1.4.x that prevents the 
expected behavior, and whether it is possible to get it back in OMPI 1.4.x by, 
say, tuning some parameters.

= Serge

