On Sep 25, 2009, at 7:55 AM, Blosch, Edwin L wrote:
I’m having a problem running OpenMPI under Torque. It complains
like there is a command syntax problem, but the three variations
below are all correct, best I can tell using mpirun --help. The
environment in which the command executes, i.e. PATH and
LD_LIBRARY_PATH, is correct. Torque is 2.3.x. OpenMPI is 1.2.8.
OFED is 1.4.
Is your mpirun a script, perchance? It's almost like the arguments
that end up being passed are getting munged / re-ordered, and Bad
Things happen such that the real mpirun under the covers gets confused.
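One way to check both points from inside the Torque job script is sketched below. It assumes a POSIX-style shell and a head that supports -c. The helper names kind_of and show_args are made up for illustration and are not part of Open MPI or Torque.

```shell
#!/bin/sh
# Hypothetical helpers; nothing here is provided by Open MPI or Torque.

# Print "script" if the file begins with "#!", otherwise "binary".
kind_of() {
    if [ "$(head -c 2 "$1" 2>/dev/null)" = "#!" ]; then
        echo script
    else
        echo binary
    fi
}

# Print each argument on its own numbered line, in brackets, so that
# munged or re-ordered tokens are easy to spot.
show_args() {
    i=0
    for arg in "$@"; do
        i=$((i + 1))
        printf '%d: [%s]\n' "$i" "$arg"
    done
}

# Example use (path and arguments taken from the post):
#   kind_of /usr/mpi/intel/openmpi-1.2.8/bin/mpirun
#   show_args -np 28 /tmp/43.fwnaeglingio/falconv4_ibm_openmpi -cycles 100
```

Temporarily replacing mpirun with show_args in the job script shows the tokens exactly as the shell hands them over, before any wrapper gets a chance to touch them.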
/usr/mpi/intel/openmpi-1.2.8/bin/mpirun -np 28
/tmp/43.fwnaeglingio/falconv4_ibm_openmpi -cycles 100 -ri restart.0
-ro /tmp/43.fwnaeglingio/restart.0
--------------------------------------------------------------------------
Failed to find the following executable:
Host: n8n26
Executable: -p
I don't even see -p in that argument list. Where is it coming from?
A little background: OMPI's mpirun analyzes the command line tokens
that are passed to it. The first one that it doesn't recognize, it
assumes is the executable to invoke. In this case, OMPI's mpirun
found a "-p" on the command line (not sure how that happened; perhaps
/usr/mpi/intel/openmpi-1.2.8/bin/mpirun is not actually OMPI's mpirun,
as I mentioned above...?) and tried to execute it. Since there was
no executable named "-p" to be found in the filesystem, OMPI
printed the error.
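The "first unrecognized token is the executable" rule can be illustrated with a toy model. This is only a sketch: the option table is a small assumed subset, the function name first_unrecognized is made up, and none of it reproduces Open MPI's real parser.

```shell
#!/bin/sh
# Toy model only. The option list below is an assumed subset chosen to
# match the commands in this thread; it is not Open MPI's parser.

# Print the first token that is not a known option (or a known option's
# value). Return 1 if every token was consumed as an option.
first_unrecognized() {
    while [ "$#" -gt 0 ]; do
        case "$1" in
            -np|--machinefile|--prefix|-x) n=2 ;;  # option plus one value
            --mca)                         n=3 ;;  # option plus two values
            *)  printf '%s\n' "$1"                 # treated as the executable
                return 0 ;;
        esac
        [ "$#" -ge "$n" ] || return 1              # option is missing a value
        shift "$n"
    done
    return 1
}

# Example use:
#   first_unrecognized -np 28 /tmp/app -cycles 100   prints /tmp/app
#   first_unrecognized -p 28 /tmp/app                prints -p
```

In this toy model, an option and its value arriving as one single token (for example "--prefix /usr/mpi/intel/openmpi-1.2.8" quoted as one word) is also not recognized and gets reported whole as the executable.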
mpirun --prefix /usr/mpi/intel/openmpi-1.2.8
--machinefile /var/spool/torque/aux/45.fwnaeglingio -np 28
--mca btl ^tcp --mca mpi_leave_pinned 1
--mca mpool_base_use_mem_hooks 1 -x LD_LIBRARY_PATH -x MPI_ENVIRONMENT
/tmp/45.fwnaeglingio/falconv4_ibm_openmpi -cycles 100 -ri restart.0
-ro /tmp/45.fwnaeglingio/restart.0
--------------------------------------------------------------------------
Failed to find or execute the following executable:
Host: n8n27
Executable: --prefix /usr/mpi/intel/openmpi-1.2.8
Ditto on this one. --prefix is a valid mpirun command line argument,
so it should not have complained.
But then again, I confess to not remembering all the 1.2.x command
line options; I don't remember if --prefix was introduced in the 1.2
or 1.3 series...
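A quick way to settle that on the installed version is to search the help text for the option. The sketch below assumes the help output mentions every supported option by name. help_mentions is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the command's --help output (stdout
# and stderr combined) contains the given option string literally.
help_mentions() {
    "$1" --help 2>&1 | grep -qF -e "$2"
}

# Example use (path taken from the post):
#   if help_mentions /usr/mpi/intel/openmpi-1.2.8/bin/mpirun --prefix; then
#       echo "--prefix is listed"
#   else
#       echo "--prefix is not listed"
#   fi
```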
/usr/mpi/intel/openmpi-1.2.8/bin/mpirun -x LD_LIBRARY_PATH -x
MPI_ENVIRONMENT=1 /tmp/47.fwnaeglingio/falconv4_ibm_openmpi -cycles
100 -ri restart.0 -ro /tmp/47.fwnaeglingio/restart.0
--------------------------------------------------------------------------
Failed to find the following executable:
Host: n8n27
Executable: -
Ditto to #1.
--
Jeff Squyres
jsquy...@cisco.com