Re: [OMPI devel] SM init failures

2009-03-31 Thread Eugene Loh
Jeff Squyres wrote: On Mar 31, 2009, at 3:06 PM, Eugene Loh wrote: The thing I was wondering about was memory barriers. E.g., you initialize stuff and then post the FIFO pointer. The other guy sees the FIFO pointer before the initialized memory. We do do memory barriers during that SM s

Re: [OMPI devel] SM init failures

2009-03-31 Thread Jeff Squyres
On Mar 31, 2009, at 3:06 PM, Eugene Loh wrote: The thing I was wondering about was memory barriers. E.g., you initialize stuff and then post the FIFO pointer. The other guy sees the FIFO pointer before the initialized memory. We do do memory barriers during that SM startup sequence. I
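
The ordering concern raised above (initialize, then post the FIFO pointer, with the reader possibly seeing the pointer before the initialized memory) can be illustrated with a minimal sketch. It is not the Open MPI sm BTL code. It assumes C11 atomics and uses two threads in place of two processes sharing a segment. The names demo_fifo_t, posted_fifo and run_publish_demo are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a shared-memory FIFO; not Open MPI's real type. */
typedef struct {
    int head;
    int tail;
    int payload;
} demo_fifo_t;

static demo_fifo_t fifo_storage;

/* The "posted" pointer. NULL means the FIFO has not been published yet. */
static _Atomic(demo_fifo_t *) posted_fifo = NULL;

static void *producer(void *arg)
{
    (void)arg;
    /* Step 1: initialize the structure with ordinary stores. */
    fifo_storage.head = 0;
    fifo_storage.tail = 0;
    fifo_storage.payload = 42;
    /* Step 2: publish the pointer with release semantics, so the stores
     * above cannot become visible after the pointer itself. */
    atomic_store_explicit(&posted_fifo, &fifo_storage, memory_order_release);
    return NULL;
}

static void *consumer(void *arg)
{
    int *out = arg;
    demo_fifo_t *f;
    /* The acquire load pairs with the release store: once the pointer is
     * seen as non-NULL, the initialized fields are visible too. */
    while ((f = atomic_load_explicit(&posted_fifo,
                                     memory_order_acquire)) == NULL) {
        /* spin until the producer posts the pointer */
    }
    *out = f->payload;
    return NULL;
}

/* Runs one producer and one consumer; returns the payload the consumer
 * read, or -1 if a thread could not be created. */
int run_publish_demo(void)
{
    pthread_t prod, cons;
    int seen = -1;

    atomic_store(&posted_fifo, NULL);
    if (pthread_create(&prod, NULL, producer, NULL) != 0) {
        return -1;
    }
    if (pthread_create(&cons, NULL, consumer, &seen) != 0) {
        pthread_join(prod, NULL);
        return -1;
    }
    pthread_join(prod, NULL);
    pthread_join(cons, NULL);
    return seen;
}
```

Calling run_publish_demo() from a main() should return the initialized payload. With relaxed ordering on both sides instead, the language would no longer guarantee that the consumer sees the initialized fields, which is the failure mode described in the message.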

[OMPI devel] mallopt fixes

2009-03-31 Thread Jeff Squyres
Ok, I've done a bunch of development and testing on the hg branch with all the mallopt fixes, etc., and I'm fairly confident that it's working properly. I plan to put this stuff back into the trunk tomorrow by noonish US Eastern if no one finds any problems with it: http://www.open-mpi

Re: [OMPI devel] SM init failures

2009-03-31 Thread Eugene Loh
Jeff Squyres wrote: On Mar 31, 2009, at 1:46 AM, Eugene Loh wrote: FWIW, George found what looks like a race condition in the sm init code today -- it looks like we don't call maffinity anywhere in the sm btl startup, so we're not actually guaranteed that the memory is local to any p

Re: [OMPI devel] custom btl

2009-03-31 Thread Jeff Squyres
On Mar 31, 2009, at 11:15 AM, Roberto Ammendola wrote: Hi all, I am developing a btl module for a custom interconnect board (we call it apelink; it's an academic project), and I am porting the module from 1.2 (where it used to work) to the 1.3 branch. Two issues: 1) the use of pls_rsh_agent

[OMPI devel] custom btl

2009-03-31 Thread Roberto Ammendola
Hi all, I am developing a btl module for a custom interconnect board (we call it apelink; it's an academic project), and I am porting the module from 1.2 (where it used to work) to the 1.3 branch. Two issues: 1) the use of pls_rsh_agent is said to be deprecated. How do I spawn the jobs using rs

Re: [OMPI devel] SM init failures

2009-03-31 Thread Jeff Squyres
On Mar 31, 2009, at 3:45 AM, Sylvain Jeaugey wrote: Sorry to continue off-topic, but going to System V shm would be, for me, like going back in time. System V shared memory used to be the main way to do shared memory on MPICH and, from my (little) experience, this was truly painful: - Cleanu

Re: [OMPI devel] SM init failures

2009-03-31 Thread Jeff Squyres
On Mar 31, 2009, at 1:46 AM, Eugene Loh wrote: FWIW, George found what looks like a race condition in the sm init code today -- it looks like we don't call maffinity anywhere in the sm btl startup, so we're not actually guaranteed that the memory is local to any particular process(or)

Re: [OMPI devel] SM init failures

2009-03-31 Thread Sylvain Jeaugey
Sorry to continue off-topic, but going to System V shm would be, for me, like going back in time. System V shared memory used to be the main way to do shared memory on MPICH and, from my (little) experience, this was truly painful: - Cleanup issues: does shmctl(IPC_RMID) solve _all_ cases?

Re: [OMPI devel] SM init failures

2009-03-31 Thread Eugene Loh
Jeff Squyres wrote: FWIW, George found what looks like a race condition in the sm init code today -- it looks like we don't call maffinity anywhere in the sm btl startup, so we're not actually guaranteed that the memory is local to any particular process(or) (!). This race shouldn't cause