Re: [OMPI users] OpenMPI and SGE

2009-06-25 Thread Ray Muno
As a follow-up: the problem was with host name resolution. The error was introduced by a change to the Rocks environment that broke reverse lookups for host names. -- Ray Muno
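Since the fix amounted to restoring working reverse DNS, a quick way to confirm the failure mode is to check that forward and reverse lookups agree for every node. A minimal sketch in Python, assuming hypothetical Rocks-style node names (substitute your own):

import socket

nodes = ["compute-0-0", "compute-0-1"]  # hypothetical node names

for name in nodes:
    try:
        addr = socket.gethostbyname(name)            # forward lookup
        rev_name, _, _ = socket.gethostbyaddr(addr)  # reverse lookup
        ok = rev_name.split(".")[0] == name
        print(f"{name} -> {addr} -> {rev_name}: {'OK' if ok else 'MISMATCH'}")
    except socket.herror as e:
        # A failure here reproduces the broken-reverse-lookup condition
        # described in this thread.
        print(f"{name} -> {addr}: reverse lookup FAILED ({e})")
    except socket.gaierror as e:
        print(f"{name}: forward lookup FAILED ({e})")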

Re: [OMPI users] OpenMPI and SGE

2009-06-23 Thread Ray Muno
Rolf Vandevaart wrote:
>> PMGR_COLLECTIVE ERROR: unitialized MPI task: Missing required
>> environment variable: MPIRUN_RANK
>> PMGR_COLLECTIVE ERROR: PMGR_COLLECTIVE ERROR: unitialized MPI task:
>> Missing required environment variable: MPIRUN_RANK
> I do not recognize these errors as pa…
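For what it's worth, the PMGR_COLLECTIVE / MPIRUN_RANK pair is characteristic of pmgr_collective-based launchers (MVAPICH lineage), not Open MPI, which usually means the processes were started by a different mpirun than the one the binary was built against. A small sketch each task could run to see which launcher's environment it actually received; the variable names are assumptions based on common launchers (OMPI_COMM_WORLD_RANK appears in Open MPI 1.3-era releases, MPIRUN_RANK in pmgr-based ones):

import os

launchers = {
    "Open MPI":        ["OMPI_COMM_WORLD_RANK", "OMPI_COMM_WORLD_SIZE"],
    "pmgr (MVAPICH)":  ["MPIRUN_RANK", "MPIRUN_NPROCS"],
    "SGE job context": ["JOB_ID", "PE_HOSTFILE"],
}

for name, var_names in launchers.items():
    # Report which of each launcher's variables are set for this task.
    found = {v: os.environ[v] for v in var_names if v in os.environ}
    print(f"{name}: {found if found else 'none set'}")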

Re: [OMPI users] OpenMPI and SGE

2009-06-23 Thread Rolf Vandevaart
Ray Muno wrote:
> Rolf Vandevaart wrote:
>> Ray Muno wrote:
>>> Ray Muno wrote:
>>>> We are running a cluster using Rocks 5.0 and OpenMPI 1.2 (primarily).
>>>> Scheduling is done through SGE. MPI communication is over InfiniBand.
>>> We also have OpenMPI 1.3 installed and receive s…

Re: [OMPI users] OpenMPI and SGE

2009-06-23 Thread Ray Muno
Ray Muno wrote:
> Tha give me

How about "That gives me".

> PMGR_COLLECTIVE ERROR: unitialized MPI task: Missing required
> environment variable: MPIRUN_RANK
> PMGR_COLLECTIVE ERROR: PMGR_COLLECTIVE ERROR: unitialized MPI task:
> Missing required environment variable: MPIRUN_RANK
> --…

Re: [OMPI users] OpenMPI and SGE

2009-06-23 Thread Ray Muno
Rolf Vandevaart wrote:
> Ray Muno wrote:
>> Ray Muno wrote:
>>> We are running a cluster using Rocks 5.0 and OpenMPI 1.2 (primarily).
>>> Scheduling is done through SGE. MPI communication is over InfiniBand.
>> We also have OpenMPI 1.3 installed and receive similar errors.
>…

Re: [OMPI users] OpenMPI and SGE

2009-06-23 Thread Rolf Vandevaart
Ray Muno wrote:
> Ray Muno wrote:
>> We are running a cluster using Rocks 5.0 and OpenMPI 1.2 (primarily).
>> Scheduling is done through SGE. MPI communication is over InfiniBand.
> We also have OpenMPI 1.3 installed and receive similar errors.

This does sound like a problem with SGE. By…
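When SGE is the suspect, the first thing to inspect inside a job is the allocation SGE hands to Open MPI: under tight integration, the gridengine support reads the hostfile exposed through $PE_HOSTFILE. A minimal sketch of that read, assuming the standard four-column SGE hostfile format (hostname, slot count, queue, processor range):

import os

pe_hostfile = os.environ.get("PE_HOSTFILE")
if pe_hostfile is None:
    print("Not running inside an SGE parallel environment job.")
else:
    with open(pe_hostfile) as f:
        for line in f:
            if not line.strip():
                continue
            # Each line: hostname  slots  queue  processor_range
            host, slots = line.split()[:2]
            print(f"{host}: {slots} slot(s)")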

Re: [OMPI users] OpenMPI and SGE

2009-06-23 Thread Ray Muno
Ray Muno wrote:
> We are running a cluster using Rocks 5.0 and OpenMPI 1.2 (primarily).
> Scheduling is done through SGE. MPI communication is over InfiniBand.

We also have OpenMPI 1.3 installed and receive similar errors.

--
Ray Muno
University of Minnesota

[OMPI users] OpenMPI and SGE

2009-06-23 Thread Ray Muno
We are running a cluster using Rocks 5.0 and OpenMPI 1.2 (primarily). Scheduling is done through SGE. MPI communication is over InfiniBand. We have been running with this setup for over 9 months. Last week, all user jobs stopped executing (cluster load dropped to zero). Users can schedule jobs b…