Re: [OMPI users] malloc related crash inside openmpi

2016-11-28 Thread Jeff Squyres (jsquyres)
> On Nov 25, 2016, at 11:20 AM, Noam Bernstein wrote: > > Looks like this openmpi 2 crash was a matter of not using the correctly > linked executable on all nodes. Now that it’s straightened out, I think it’s > all working, and apparently even fixed my malloc related crash, so perhaps > the

Re: [OMPI users] malloc related crash inside openmpi

2016-11-25 Thread Noam Bernstein
> On Nov 24, 2016, at 10:52 AM, r...@open-mpi.org wrote: > > Just to be clear: are you saying that mpirun exits with that message? Or is > your application process exiting with it? > > There is no reason for mpirun to be looking for that library. > > The library in question is in the /lib/openm

Re: [OMPI users] malloc related crash inside openmpi

2016-11-24 Thread r...@open-mpi.org
Just to be clear: are you saying that mpirun exits with that message? Or is your application process exiting with it? There is no reason for mpirun to be looking for that library. The library in question is in the /lib/openmpi directory, and is named mca_ess_pmi.[la,so] > On Nov 23, 2016, at

Re: [OMPI users] malloc related crash inside openmpi

2016-11-23 Thread Noam Bernstein
> On Nov 23, 2016, at 5:26 PM, r...@open-mpi.org wrote: > > It looks like the library may not have been fully installed on that node - > can you see if the prefix location is present, and that the LD_LIBRARY_PATH > on that node is correctly set? The referenced component did not exist prior > t

Re: [OMPI users] malloc related crash inside openmpi

2016-11-23 Thread r...@open-mpi.org
It looks like the library may not have been fully installed on that node - can you see if the prefix location is present, and that the LD_LIBRARY_PATH on that node is correctly set? The referenced component did not exist prior to the 2.0 series, so I’m betting that your LD_LIBRARY_PATH isn’t cor
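A minimal sketch of the check suggested above, to be run on the node in question. It is not part of Open MPI: the helper name `find_component` is hypothetical, and the component file name (`mca_ess_pmi.so`) and the `openmpi` subdirectory are assumptions taken from elsewhere in this thread. It only reports which LD_LIBRARY_PATH entries are missing and where the component file is found.

```python
import os


def find_component(ld_library_path, component="mca_ess_pmi.so", subdir="openmpi"):
    """Scan an LD_LIBRARY_PATH-style string.

    Returns (missing, hits): entries that are not existing directories,
    and full paths where <entry>/<subdir>/<component> exists.
    The default names are assumptions based on this thread.
    """
    missing, hits = [], []
    for entry in ld_library_path.split(os.pathsep):
        if not entry:
            continue  # skip empty entries such as a trailing ':'
        if not os.path.isdir(entry):
            missing.append(entry)
            continue
        candidate = os.path.join(entry, subdir, component)
        if os.path.isfile(candidate):
            hits.append(candidate)
    return missing, hits


if __name__ == "__main__":
    missing, hits = find_component(os.environ.get("LD_LIBRARY_PATH", ""))
    for path in missing:
        print("not a directory:", path)
    for path in hits:
        print("component found:", path)
    if not hits:
        print("component not found in any LD_LIBRARY_PATH entry")
```

If no entry contains the component, or the only hit lies under an older installation's prefix, that would be consistent with the LD_LIBRARY_PATH on that node pointing at the wrong install.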

Re: [OMPI users] malloc related crash inside openmpi

2016-11-23 Thread Noam Bernstein
> On Nov 23, 2016, at 3:45 PM, George Bosilca wrote: > > Thousands reasons ;) Still trying to check if 2.0.1 fixes the problem, and discovered that earlier runs weren’t actually using the version I intended. When I do use 2.0.1, I get the following errors: ---

Re: [OMPI users] malloc related crash inside openmpi

2016-11-23 Thread George Bosilca
Thousands reasons ;) https://raw.githubusercontent.com/open-mpi/ompi/v2.x/NEWS George. On Wed, Nov 23, 2016 at 1:08 PM, Noam Bernstein wrote: > On Nov 23, 2016, at 3:02 PM, George Bosilca wrote: > > Noam, > > I do not recall exactly which version of Open MPI was affected, but we had > som

Re: [OMPI users] malloc related crash inside openmpi

2016-11-23 Thread Noam Bernstein
> On Nov 23, 2016, at 3:08 PM, Noam Bernstein wrote: > >> On Nov 23, 2016, at 3:02 PM, George Bosilca wrote: >> >> Noam, >> >> I do not recall exactly which version of Open MPI was affected, but we had >> some issues with the non-reentrancy of our memory allo

Re: [OMPI users] malloc related crash inside openmpi

2016-11-23 Thread Noam Bernstein
> On Nov 23, 2016, at 3:02 PM, George Bosilca wrote: > > Noam, > > I do not recall exactly which version of Open MPI was affected, but we had > some issues with the non-reentrancy of our memory allocator. More recent > versions (1.10 and 2.0) will not have this issue. Can you update to a newer

Re: [OMPI users] malloc related crash inside openmpi

2016-11-23 Thread George Bosilca
Noam, I do not recall exactly which version of Open MPI was affected, but we had some issues with the non-reentrancy of our memory allocator. More recent versions (1.10 and 2.0) will not have this issue. Can you update to a newer version of Open MPI (1.10 or maybe 2.0) and see if you can reproduce

Re: [OMPI users] malloc related crash inside openmpi

2016-11-23 Thread Noam Bernstein
> On Nov 17, 2016, at 3:22 PM, Noam Bernstein wrote: > > Hi - we’ve started seeing over the last few days crashes and hangs in > openmpi, in a code that hasn’t been touched in months, and an openmpi > installation (v. 1.8.5) that also hasn’t been touched in months. The > symptoms are eithe

[OMPI users] malloc related crash inside openmpi

2016-11-17 Thread Noam Bernstein
Hi - we’ve started seeing over the last few days crashes and hangs in openmpi, in a code that hasn’t been touched in months, and an openmpi installation (v. 1.8.5) that also hasn’t been touched in months. The symptoms are either a hang, with a stack trace (from attaching to the one running proc