Rolf Vandevaart wrote:
> Ray Muno wrote:
>> Ray Muno wrote:
>>
>>> We are running a cluster using Rocks 5.0 and OpenMPI 1.2 (primarily).
>>> Scheduling is done through SGE. MPI communication is over InfiniBand.
>>>
>>
>> We also have OpenMPI 1.3 installed and receive similar errors.
>>
> This does sound like a problem with SGE. By default, we use qrsh to
> start the jobs on all the remote nodes. I believe that is the command
> that is failing. There are two things you can try to get more info
> depending on the version of Open MPI. With version 1.2, you can try
> this to get more information.
>
> --mca pls_gridengine_verbose 1
>

This did not look like it gave me any more info.

> With Open MPI 1.3.2 and later the verbose flag will not help. But
> instead, you can disable the use of qrsh and instead use rsh/ssh to
> start the remote jobs.
>
> --mca plm_rsh_disable_qrsh 1
>

That gives me:

PMGR_COLLECTIVE ERROR: unitialized MPI task: Missing required environment variable: MPIRUN_RANK
PMGR_COLLECTIVE ERROR: PMGR_COLLECTIVE ERROR: unitialized MPI task: Missing required environment variable: MPIRUN_RANK

--
Ray Muno
University of Minnesota