On 19 June 2013 16:01, Ralph Castain <r...@open-mpi.org> wrote:
> How is OMPI picking up this hostfile? It isn't being specified on the cmd 
> line - are you running under some resource manager?

Via the environment variable `OMPI_MCA_orte_default_hostfile`.

We're running under SGE, but we disable the OMPI/SGE integration (our
SGE version is rather old and does not coordinate well with Open MPI);
here's the relevant snippet from our startup script:

    # the OMPI/SGE integration does not seem to work with
    # our SGE version; so use the `mpi` PE and direct OMPI
    # to look for a "plain old" machine file
    unset PE_HOSTFILE
    if [ -r "${TMPDIR}/machines" ]; then
        OMPI_MCA_orte_default_hostfile="${TMPDIR}/machines"
        export OMPI_MCA_orte_default_hostfile
    fi
    GMSCOMMAND="$openmpi_root/bin/mpiexec -n $NCPUS --nooversubscribe $gamess $INPUT -scr $(pwd)"
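
For comparison: Open MPI treats an environment variable named
`OMPI_MCA_<param>` as the MCA parameter `<param>`, so the export above
should have roughly the same effect as passing
`--mca orte_default_hostfile <file>` on the `mpiexec` command line. A
minimal sketch that only prints the resulting command instead of running
it; `build_cmd` is a hypothetical helper, and the fallback values are
placeholders, not our real settings:

```shell
#!/bin/sh
# Hypothetical helper (not part of our startup script): print the mpiexec
# command line that would carry the hostfile explicitly, instead of via
# the OMPI_MCA_orte_default_hostfile environment variable.
build_cmd() {
    machinefile=$1
    ncpus=$2
    if [ -r "$machinefile" ]; then
        echo "mpiexec --mca orte_default_hostfile $machinefile -n $ncpus --nooversubscribe"
    else
        # No readable machine file: leave host selection to Open MPI.
        echo "mpiexec -n $ncpus --nooversubscribe"
    fi
}

# Placeholder defaults so the sketch also runs outside an SGE job.
build_cmd "${TMPDIR:-/tmp}/machines" "${NCPUS:-1}"
```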

The `$TMPDIR/machines` hostfile is created from SGE's `$PE_HOSTFILE` by
extracting the short host names (domain part stripped) and repeating
each one once per allocated slot (unmodified code that comes with SGE):

    PeHostfile2MachineFile()
    {
       cat $1 | while read line; do
          # echo $line
          host=`echo $line|cut -f1 -d" "|cut -f1 -d"."`
          nslots=`echo $line|cut -f2 -d" "`
          i=1
          while [ $i -le $nslots ]; do
             echo $host
             i=`expr $i + 1`
          done
       done
    }
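
To illustrate what that conversion produces, here is a self-contained
sketch run against a made-up `$PE_HOSTFILE`. The host names, slot counts
and queue names are invented for the example, and `pe_to_machines` is a
hypothetical rewrite of the same logic, not the script shipped with SGE:

```shell
#!/bin/sh
# Hypothetical re-implementation of the conversion, for illustration only.
# Reads PE_HOSTFILE-style lines ("host slots queue processor-range") and
# prints the short host name once per slot.
pe_to_machines() {
    while read -r host nslots _rest; do
        host=${host%%.*}          # strip the domain part, keep the short name
        i=1
        while [ "$i" -le "$nslots" ]; do
            echo "$host"
            i=$((i + 1))
        done
    done < "$1"
}

# Invented sample input: two hosts with 2 and 1 slots respectively.
sample=$(mktemp)
printf '%s\n' \
    'node01.example.org 2 all.q@node01.example.org UNDEFINED' \
    'node02.example.org 1 all.q@node02.example.org UNDEFINED' > "$sample"

pe_to_machines "$sample"
rm -f "$sample"
```

With this sample input the sketch prints `node01` twice and `node02`
once, one name per line, which is the "plain old" machine file layout
that `orte_default_hostfile` then points at.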

Thanks,
Riccardo
