>> However, there are cases when being able to specify the hostfile is
>> important (hybrid jobs, users with MPICH jobs, etc.).

>[I don't understand what MPICH has to do with it.]

This was just an example of how the different behavior of Open MPI 1.4 may cause problems; the MPICH library itself is not the subject of discussion. MPICH requires a hostfile, which SGE generates, and including it in the submission script for an Open MPI 1.2.x job has the expected effect. This is different with Open MPI 1.4.x, which does not appear to interpret the hostfile properly.

>> For example,
>> with Grid Engine I can request four 4-core nodes, that is total of 16
>> slots. But I also want to specify how to distribute processes on the
>> nodes, so I create the file 'hosts'
>>
>> node01 slots=1
>> node02 slots=1
>> node03 slots=1
>> node04 slots=1
>>
>> and modify the line in the submission script to:
>> mpirun -hostfile hosts ./program

> Regardless of any open-mpi bug, I'd have thought it was easier just to
> use -npernode in that case. What's the problem with that? It seems to
> me best generally to control the distribution of processes with mpirun
> on the SGE-allocated nodes than to concoct host files as we used to do
> here, e.g. to get -byslot v. -bynode behaviour (or vice versa).

This is exactly what I am doing -- controlling the distribution of processes with mpirun on the SGE-allocated nodes, by supplying the hostfile. Grid Engine allocates the nodes and generates a hostfile, which I can then modify however I want before running the mpirun command. Moreover, this gives more control, by allowing me to create specific SGE parallel environments where the process distribution is predetermined -- one less thing to worry about for users playing with mpirun options.
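As a sketch of the workflow described above: in a real SGE job, $PE_HOSTFILE points at the machine file Grid Engine generated (one line per host: hostname, slot count, queue, processor range), and the job script can rewrite it before calling mpirun. The sample file contents and the awk one-liner below are illustrative assumptions, not from any particular site's configuration.

```shell
#!/bin/sh
# In a real SGE job, $PE_HOSTFILE is set by Grid Engine.
# Outside SGE, fall back to a sample file so the script is self-contained.
PE_HOSTFILE=${PE_HOSTFILE:-pe_hostfile.example}

# Hypothetical $PE_HOSTFILE contents (SGE format:
# "hostname slots queue processors") for a 4-node, 16-slot allocation.
cat > pe_hostfile.example <<'EOF'
node01 4 all.q@node01 UNDEFINED
node02 4 all.q@node02 UNDEFINED
node03 4 all.q@node03 UNDEFINED
node04 4 all.q@node04 UNDEFINED
EOF

# Keep each allocated hostname, but force one slot per node,
# producing the 'hosts' file from the example in the thread.
awk '{ print $1, "slots=1" }' "$PE_HOSTFILE" > hosts
cat hosts

# mpirun -hostfile hosts ./program   # one process per node
```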

The example in my initial email was deliberately simplified to demonstrate the problem.

= Serge
