Re: [OMPI users] Error with multiple MPI runs inside one Slurm allocation (with QLogic PSM)

2012-04-02 Thread Ralph Castain
I'm afraid the 1.5 series doesn't offer any help in this regard. The required changes only exist in the developers trunk, which will be released as the 1.7 series in the not-too-distant future. On Mon, Apr 2, 2012 at 9:42 AM, Reuti wrote: > Am 02.04.2012 um 17:40 schrieb Ralph Castain: > > > I'

Re: [OMPI users] Error with multiple MPI runs inside one Slurm allocation (with QLogic PSM)

2012-04-02 Thread Ralph Castain
I'm afraid not, even with the changes in the developer trunk. What happens is that the local and node ranks for each mpirun start over at 0 because the instances of mpirun don't know about each other. PSM uses the local rank as an index for determining endpoint. So running multiple mpiruns on the s
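
A quick way to see the collision described here is to print the local rank that Open MPI exports to each process: with two concurrent mpiruns sharing a node, both number their local ranks from 0, so PSM is handed the same endpoint index twice. A minimal sketch (it assumes the OMPI_COMM_WORLD_LOCAL_RANK variable, which recent Open MPI releases set; the host name is a placeholder):

    # Two runs in the same allocation landing on the same node: each mpirun
    # numbers its own local ranks from 0, so the endpoint indices overlap.
    mpirun -np 4 -host node01 sh -c 'echo "run A, local rank $OMPI_COMM_WORLD_LOCAL_RANK"' &
    mpirun -np 4 -host node01 sh -c 'echo "run B, local rank $OMPI_COMM_WORLD_LOCAL_RANK"' &
    wait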

Re: [OMPI users] Error while loading shared libraries

2012-04-02 Thread Rohan Deshpande
Thanks guys. Using absolute path of mpirun fixes my problem. Cheers On Mon, Apr 2, 2012 at 6:24 PM, Reuti wrote: > Am 02.04.2012 um 09:56 schrieb Rohan Deshpande: > > > Yes, I am trying to run the program using multiple hosts. > > > > The program executes fine but does not use any slaves when
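
For anyone hitting the same symptom: the usual check is to confirm which mpirun the shell (and each remote host) actually resolves, and then to call the build that matches your libraries by its full path. A hedged sketch, with /usr/local/openmpi standing in for whatever prefix you installed under:

    # See which mpirun is picked up from PATH and which release it belongs to
    which mpirun
    mpirun --version

    # Invoke the intended installation explicitly by absolute path
    /usr/local/openmpi/bin/mpirun -np 8 --hostfile slaves ./hello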

Re: [OMPI users] openmpi 1.5.5. build issue with cuda 4.1

2012-04-02 Thread Srinath Vadlamani
The offending file, openmpi/contrib/vt/vt/vtlib/vt_cudartwrap.c, is easily fixed by placing a const in front of the void *ptr in the cudaPointerGetAttributes wrapper code segments. Then the openmpi 1.5.5 release compiles with Cuda 4.1. -- Srinath Vadlamani

Re: [OMPI users] configuration of openmpi-1.5.4 with visual studio

2012-04-02 Thread toufik hadjazi
Hi Shiqing, I haven't found a solution yet, and for the record, I have installed openmpi from an executable on Windows 7 (I don't know if I mentioned that before). At first, I had an error message while compiling the hello world application: unresolved link or something like that, then I added

[OMPI users] openmpi 1.5.5. build issue with cuda 4.1

2012-04-02 Thread Srinath Vadlamani
I have a build error with the newest release openmpi 1.5.5 building against cuda 4.1. Making all in vtlib make[5]: Entering directory `/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_science_openmpi/openmpi/work/build/ompi/contrib/vt/vt/vtlib' CC

Re: [OMPI users] Error with multiple MPI runs inside one Slurm allocation (with QLogic PSM)

2012-04-02 Thread Reuti
Am 02.04.2012 um 17:40 schrieb Ralph Castain: > I'm not sure the 1.4 series can support that behavior. Each mpirun only knows > about itself - it has no idea something else is going on. > > If you attempted to bind, all procs of same rank from each run would bind on > the same CPU. > > All you

Re: [OMPI users] Error with multiple MPI runs inside one Slurm allocation (with QLogic PSM)

2012-04-02 Thread Gutierrez, Samuel K
Sorry to hijack the thread, but I have a question regarding the failed PSM initialization. Some of our users oversubscribe a node with multiple mpiruns in order to run their regression tests. Recently, a user reported the same "Could not detect network connectivity" error. My question: is th

Re: [OMPI users] Error with multiple MPI runs inside one Slurm allocation (with QLogic PSM)

2012-04-02 Thread Ralph Castain
I'm not sure the 1.4 series can support that behavior. Each mpirun only knows about itself - it has no idea something else is going on. If you attempted to bind, all procs of same rank from each run would bind on the same CPU. All you can really do is use -host to tell the fourth run not to use
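
In practice the workaround sketched here amounts to giving each concurrent mpirun a disjoint set of hosts, so no two runs create PSM endpoints on the same node. An illustrative sketch for four runs inside one allocation of four 12-core nodes (host and program names are placeholders):

    # Each run is pinned to its own node with -host, so local ranks
    # from different mpiruns never share a PSM context.
    mpirun -np 12 -host node01 ./app case1 &
    mpirun -np 12 -host node02 ./app case2 &
    mpirun -np 12 -host node03 ./app case3 &
    mpirun -np 12 -host node04 ./app case4 &
    wait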

Re: [OMPI users] redirecting output

2012-04-02 Thread Prentice Bisbal
On 03/30/2012 11:12 AM, Tim Prince wrote: > On 03/30/2012 10:41 AM, tyler.bal...@huskers.unl.edu wrote: >> I am using the command mpirun -np nprocs -machinefile machines.arch Pcrystal and my output strolls across my terminal. I would like to send this output to a file and I cannot figur
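
For later readers, the two usual answers to this thread's question are plain shell redirection of mpirun's combined output, or mpirun's --output-filename option, which writes one file per rank (the exact naming varies by release). A minimal sketch reusing the command quoted above, with nprocs set to 16 and file names chosen arbitrarily:

    # Send stdout and stderr of the whole run to a single file
    mpirun -np 16 -machinefile machines.arch Pcrystal > pcrystal.log 2>&1

    # Or let Open MPI write one output file per process
    mpirun -np 16 -machinefile machines.arch --output-filename pcrystal.out Pcrystal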

[OMPI users] Error with multiple MPI runs inside one Slurm allocation (with QLogic PSM)

2012-04-02 Thread Rémi Palancher
Hi there, I'm encountering a problem when trying to run multiple mpirun instances in parallel inside one SLURM allocation on multiple nodes using a QLogic interconnect network with PSM. I'm using Open MPI version 1.4.5 compiled with GCC 4.4.5 on Debian Lenny. My cluster is composed of 12-core nodes

Re: [OMPI users] Help with multicore AMD machine performance

2012-04-02 Thread Nico Mittenzwey
Hi, I'm benchmarking our (well tested) parallel code on an AMD-based system featuring 2x AMD Opteron(TM) Processor 6276, with 16 cores each for a total of 32 cores. The system is running Scientific Linux 6.1 and OpenMPI 1.4.5. When I run a single-core job the performance is as expected. How
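
A common first check on these 32-core Opteron 6276 nodes is whether the ranks are actually bound to cores; unbound processes migrating between modules can explain a slowdown that only shows up with many ranks. A hedged sketch using binding options that, to my recollection, the 1.4 series provides (the executable name is a placeholder):

    # Bind each rank to a core and print the resulting bindings
    mpirun -np 32 --bind-to-core --report-bindings ./parallel_code

    # Older-style equivalent through the MCA parameter interface
    mpirun -np 32 --mca mpi_paffinity_alone 1 ./parallel_code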

Re: [OMPI users] Error while loading shared libraries

2012-04-02 Thread Reuti
Am 02.04.2012 um 09:56 schrieb Rohan Deshpande: > Yes, I am trying to run the program using multiple hosts. > > The program executes fine but does not use any slaves when I run > > mpirun -np 8 hello --hostfile slaves > > The program throws error saying shared libraries not found when I run

Re: [OMPI users] Error while loading shared libraries

2012-04-02 Thread Rohan Deshpande
Yes, I am trying to run the program using multiple hosts. The program executes fine but *does not use any slaves* when I run *mpirun -np 8 hello --hostfile slaves* The program throws error saying *shared libraries not found* when I run * mpirun --hostfile slaves -np 8* On Mon, Apr 2, 2012

Re: [OMPI users] Error while loading shared libraries

2012-04-02 Thread Rayson Ho
On Sun, Apr 1, 2012 at 11:27 PM, Rohan Deshpande wrote: > error while loading shared libraries: libmpi.so.0: cannot open shared object file: No such file or directory. Were you trying to run the MPI program on a remote machine? If you are, then make sure that each machine
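
If a missing library path is indeed the cause, the usual fix is to put the directory containing libmpi.so.0 on LD_LIBRARY_PATH on every host, or to forward the variable from the launching shell with mpirun's -x option. A minimal sketch, with /usr/local/openmpi as a hypothetical install prefix:

    # Make the Open MPI libraries visible in the launching shell...
    export LD_LIBRARY_PATH=/usr/local/openmpi/lib:$LD_LIBRARY_PATH

    # ...and export the same variable to the remote ranks
    mpirun -x LD_LIBRARY_PATH --hostfile slaves -np 8 ./hello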