On Sep 25, 2009, at 7:55 AM, Blosch, Edwin L wrote:

I’m having a problem running OpenMPI under Torque. It complains like there is a command syntax problem, but the three variations below are all correct, best I can tell using mpirun --help. The environment in which the command executes, i.e. PATH and LD_LIBRARY_PATH, is correct. Torque is 2.3.x. OpenMPI is 1.2.8. OFED is 1.4.

Is your mpirun a script, perchance? It's almost like the arguments that end up being passed are getting munged / re-ordered, and Bad Things happen such that the real mpirun under the covers gets confused.
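
One quick way to answer that question is to look at the first two bytes of whatever "mpirun" resolves to on PATH. The Python sketch below is only an illustration: the helper names is_shebang_script and describe_launcher are made up for this example, and a file that does not start with "#!" is merely assumed to be a compiled binary.

```python
import shutil


def is_shebang_script(path):
    """Return True if the file at `path` starts with '#!' (an interpreter script)."""
    with open(path, "rb") as f:
        return f.read(2) == b"#!"


def describe_launcher(name="mpirun"):
    """Report where `name` resolves on PATH and whether it looks like a script.

    Hypothetical helper for illustration; it only inspects the first two bytes.
    """
    path = shutil.which(name)
    if path is None:
        return f"{name}: not found on PATH"
    if is_shebang_script(path):
        return f"{path}: wrapper script"
    # Assumption: anything without a '#!' line is treated as a compiled binary.
    return f"{path}: probably a compiled binary"


print(describe_launcher())
```

Run it inside the Torque job script itself, so that it sees the same PATH the failing mpirun command sees.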

/usr/mpi/intel/openmpi-1.2.8/bin/mpirun -np 28 /tmp/43.fwnaeglingio/falconv4_ibm_openmpi -cycles 100 -ri restart.0 -ro /tmp/43.fwnaeglingio/restart.0
--------------------------------------------------------------------------
Failed to find the following executable:

Host:       n8n26
Executable: -p

I don't even see -p in that argument list.  Where is it coming from?

A little background: OMPI's mpirun analyzes the command line tokens that are passed to it. The first one that it doesn't recognize, it assumes is the executable to invoke. In this case, OMPI's mpirun found a "-p" on the command line (not sure how that happened; perhaps /usr/mpi/intel/openmpi-1.2.8/bin/mpirun is not actually OMPI's mpirun, as I mentioned above...?) and tried to execute it. Since no executable named "-p" could be found in the filesystem, OMPI printed the error.
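
To make that rule concrete, here is a minimal Python sketch of "the first unrecognized token is the executable". It is a simplified model and not Open MPI's actual parser: the KNOWN_OPTIONS table is a small illustrative subset, find_executable is a made-up name, and "./app" is a placeholder program.

```python
# Simplified model of the rule described above; NOT Open MPI's real parser.
# Maps each recognized option to the number of arguments it consumes.
# Illustrative subset only.
KNOWN_OPTIONS = {
    "-np": 1,
    "--prefix": 1,
    "--machinefile": 1,
    "--mca": 2,
    "-x": 1,
}


def find_executable(tokens):
    """Return the first token that is neither a known option nor an option's argument."""
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token in KNOWN_OPTIONS:
            # Skip the option itself plus the arguments it consumes.
            i += 1 + KNOWN_OPTIONS[token]
        else:
            return token
    return None


# Each option and value is its own token: the program is found as expected.
well_formed = ["--prefix", "/usr/mpi/intel/openmpi-1.2.8", "-np", "28", "./app", "-cycles", "100"]
print(find_executable(well_formed))

# Option and value joined into ONE token (e.g. by over-quoting in a wrapper):
# the joined string is not a known option, so it is taken as the executable.
joined = ["--prefix /usr/mpi/intel/openmpi-1.2.8", "-np", "28", "./app"]
print(find_executable(joined))
```

In this model, any token that gets altered or joined on its way to mpirun is reported as the missing executable, which is why it is worth checking exactly what arguments the real mpirun receives.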

mpirun --prefix /usr/mpi/intel/openmpi-1.2.8 --machinefile /var/spool/torque/aux/45.fwnaeglingio -np 28 --mca btl ^tcp --mca mpi_leave_pinned 1 --mca mpool_base_use_mem_hooks 1 -x LD_LIBRARY_PATH -x MPI_ENVIRONMENT /tmp/45.fwnaeglingio/falconv4_ibm_openmpi -cycles 100 -ri restart.0 -ro /tmp/45.fwnaeglingio/restart.0
--------------------------------------------------------------------------
Failed to find or execute the following executable:

Host:       n8n27
Executable: --prefix /usr/mpi/intel/openmpi-1.2.8

Ditto on this one. --prefix is a valid mpirun command line argument, so it should not have complained.

But then again, I confess to not remembering all the 1.2.x command line options; I don't remember if --prefix was introduced in the 1.2 or 1.3 series...

/usr/mpi/intel/openmpi-1.2.8/bin/mpirun -x LD_LIBRARY_PATH -x MPI_ENVIRONMENT=1 /tmp/47.fwnaeglingio/falconv4_ibm_openmpi -cycles 100 -ri restart.0 -ro /tmp/47.fwnaeglingio/restart.0
--------------------------------------------------------------------------
Failed to find the following executable:

Host:       n8n27
Executable: -


Ditto to #1.

--
Jeff Squyres
jsquy...@cisco.com

