Re: [OMPI users] MPI_TYPE_MAX

2010-04-08 Thread Jeff Squyres
I don't think we have such an environment variable -- are you sure you're using Open MPI? The only reference to MPI_TYPE_MAX I see in the OMPI source tree is in the ROMIO README file: - * When using ROMIO with SGI MPI, you may sometimes get an error message from SGI MPI: ``MPI has run out o

[OMPI users] MPI_TYPE_MAX

2010-04-08 Thread Cole, Derek E
Hi All, I keep getting an error about running out of MPI_TYPE_MAX and needing to set the environment variable higher. What is this, and why is it happening? All of the types, groups, etc. that I create during my program's run are freed at the appropriate times. Making this number 10x bigger ge

Re: [OMPI users] Problem running mpirun with ssh on remote nodes -Daemon did not report back when launched problem

2010-04-08 Thread rohan nigam
Hi Jeff, You were right. One of the other admins of the server I am working on had a script that started the firewall every time I logged in. So even when I turned it off manually, the firewall started again the next time I logged in, hence the error. Thanks. - Rohan --- On Tue, 4/6/10, Jeff S

[OMPI users] Using a rankfile for ompi-restart

2010-04-08 Thread Fernando Lemos
Hello, I've noticed that ompi-restart doesn't support the --rankfile option. It only supports --hostfile/--machinefile. Is there any reason --rankfile isn't supported? Suppose you have a cluster without a shared file system. When one node fails, you transfer its checkpoint to a spare node and in

Re: [OMPI users] orted: error while loading shared libraries

2010-04-08 Thread Fernando Lemos
On Thu, Apr 8, 2010 at 10:31 AM, Jeff Squyres wrote: > Yes.  There is usually a difference between interactive logins and > non-interactive logins on which paths, etc. get set.  Look in your shell > startup and see if there is somewhere that it exits early (or otherwise > doesn't process) for n

Re: [OMPI users] orted: error while loading shared libraries

2010-04-08 Thread Jeff Squyres
Yes. There is usually a difference between interactive logins and non-interactive logins on which paths, etc. get set. Look in your shell startup and see if there is somewhere that it exits early (or otherwise doesn't process) for non-interactive logins. In short: you need to ensure that your

[OMPI users] orted: error while loading shared libraries

2010-04-08 Thread SLIM H.A.
Dear OpenMPI users, We built OpenMPI 1.4.1 on a new cluster and get the following error message when starting a test job from the master node: ham4#mpirun -np 4 --host cn001 /path/hello orted: error while loading shared libraries: libimf.so: cannot open shared object file: No such file or dire

Re: [OMPI users] OMPI 1.4.x ignores hostfile and launches all the processes on just one node in Grid Engine

2010-04-08 Thread Serge
1. Ralph, if I try my experiment with SGE, I get the same results with 1.4.2 as with 1.4.1. $ qrsh -cwd -V -pe ompi* 16 -l h_rt=10:00:00,h_vmem=2G bash graphics03 $ cat hosts graphics01 slots=1 graphics02 slots=1 graphics03 slots=1 graphics04 slots=1 graphics03 $ ~/openmpi/gnu129/bin

Re: [OMPI users] OMPI 1.4.x ignores hostfile and launches all the processes on just one node in Grid Engine

2010-04-08 Thread Dave Love
Serge writes: > This is exactly what I am doing -- controlling distribution of > processes with mpirun on the SGE-allocated nodes, by supplying the > hostfile. Grid Engine allocates nodes and generates a hostfile, which > I then can modify however I want to, before running the mpirun > command. M