Re: [OMPI users] After OS Update MPI_Init fails on one host

2013-07-26 Thread Dave Love
"Kevin H. Hobbs" writes: > The program links to fedora's copies of the libraries of interest : > > mpirun -n 1 ldd mpi_simple | grep hwloc > libhwloc.so.5 => /lib64/libhwloc.so.5 (0x003c5760) [I'm surprised it's in /lib64.] > mpirun -n 1 ldd mpi_simple | grep mpi > libmpi.so.1 => /u

Re: [OMPI users] MPI_Bcast hanging after some amount of data transferred on Infiniband network

2013-07-26 Thread Jeff Squyres (jsquyres)
1.4.3 is fairly ancient. Can you upgrade to 1.6.5?

On Jul 26, 2013, at 3:15 AM, Dusan Zoric wrote:

> I am running an application that performs some transformations of large
> matrices on a 7-node cluster. Nodes are connected via QDR 40 Gbit
> InfiniBand. Open MPI 1.4.3 is installed on the syste

Re: [OMPI users] Locked memory limits error

2013-07-26 Thread Jeff Squyres (jsquyres)
See this FAQ entry: http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages

On Jul 26, 2013, at 4:26 AM, thomas.fo...@ulstein.com wrote:

> hi guys
>
> im having a strange problem when starting some jobs that i don't understand.
>
> it's just 1 node that has an issue and i find it

[OMPI users] Locked memory limits error

2013-07-26 Thread thomas . forde
hi guys

im having a strange problem when starting some jobs that i don't understand. it's just 1 node that has an issue and i find it odd.

The OpenFabrics (openib) BTL failed to initialize while trying to allocate some locked memory. This typically can indicate that the memlock limits are set too
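The error above is the one covered by the Open MPI locked-pages FAQ linked in the reply. A minimal check-and-fix sketch; the limits.conf values are illustrative, not taken from the thread:

```shell
# Show the locked-memory (memlock) limit in the shell that launches jobs;
# the openib BTL needs this to be large, ideally "unlimited".
ulimit -l

# If it prints a small number (64 is a common default), raise it on the
# affected node in /etc/security/limits.conf, for example:
#   *  soft  memlock  unlimited
#   *  hard  memlock  unlimited
# and log in again.  Daemons started at boot (e.g. the resource manager)
# keep the limit they inherited, so they may need a restart as well.
```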

[OMPI users] MPI_Bcast hanging after some amount of data transferred on Infiniband network

2013-07-26 Thread Dusan Zoric
I am running an application that performs some transformations of large matrices on a 7-node cluster. Nodes are connected via QDR 40 Gbit InfiniBand. Open MPI 1.4.3 is installed on the system. The given matrix transformation requires large data exchange between nodes in such a way that at each algorithm st
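The thread does not show a resolution, but a common workaround when a single very large broadcast stalls is to split it into bounded chunks. A sketch of the chunk arithmetic only; the 2 GiB total and 64 MiB chunk size are illustrative, and each loop iteration stands for one MPI_Bcast of len bytes at offset off:

```shell
# Split one huge broadcast into fixed-size pieces; each iteration stands
# for MPI_Bcast(buf + off, len, MPI_BYTE, root, comm) in the real code.
total=$((2 * 1024 * 1024 * 1024))   # illustrative: 2 GiB of matrix data
chunk=$((64 * 1024 * 1024))         # illustrative: 64 MiB per call
off=0
calls=0
while [ "$off" -lt "$total" ]; do
  rem=$((total - off))
  if [ "$rem" -lt "$chunk" ]; then len=$rem; else len=$chunk; fi
  calls=$((calls + 1))
  off=$((off + len))
done
echo "$calls broadcasts cover $off bytes"
```

Bounding the per-call message size keeps each collective within limits the interconnect and the (old) 1.4.3 collective algorithms handle reliably, at the cost of a few extra calls.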