Re: [OMPI devel] SM init failures

2009-04-01 Thread Jeff Squyres
On Apr 1, 2009, at 6:58 PM, Ralph Castain wrote: IIRC, we certainly used to unlink the file after init. Are you sure somebody changed that? It looks like we unlink() it during btl sm component close (effectively during MPI_FINALIZE), not before. -- Jeff Squyres Cisco Systems

Re: [OMPI devel] SM init failures

2009-04-01 Thread Ralph Castain
IIRC, we certainly used to unlink the file after init. Are you sure somebody changed that? On Apr 1, 2009, at 4:29 PM, Jeff Squyres wrote: So everyone hates SYSV. Ok. :-) Given that part of the problems we've been having with mmap have been due to filesystem issues, should we just unlin

Re: [OMPI devel] SM init failures

2009-04-01 Thread Jeff Squyres
So everyone hates SYSV. Ok. :-) Given that part of the problems we've been having with mmap have been due to filesystem issues, should we just unlink() the file once all processes have mapped it? I believe we didn't do that originally for two reasons: - leave it around for debugging pu
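The unlink-after-mapping idea can be sketched in a single process as below. This is only an illustration, not the sm BTL code: map_then_unlink and demo_map_then_unlink are hypothetical names, and a real job would need some barrier (not shown) so that the unlink() happens only after every peer has mapped the file.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose mkstemp() under strict -std= modes */
#endif
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Map an existing backing file, then remove its name. The mapping keeps
 * the file's storage alive until munmap() or process exit.
 * Returns the mapping, or NULL on failure. */
void *map_then_unlink(const char *path, size_t len)
{
    void *base;
    int fd = open(path, O_RDWR);

    if (fd < 0) {
        return NULL;
    }
    base = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping no longer needs the fd */
    if (base == MAP_FAILED) {
        return NULL;
    }
    /* Assumption: in a multi-process job this runs only after all
     * peers have mapped the file. */
    unlink(path);
    return base;
}

/* Hypothetical demo: returns 0 if the mapping is still usable after the
 * file name has disappeared, -1 otherwise. */
int demo_map_then_unlink(void)
{
    char path[] = "/tmp/sm_unlink_XXXXXX";
    const size_t len = 4096;
    struct stat st;
    char *mem;
    int ok;
    int fd = mkstemp(path);

    if (fd < 0) {
        return -1;
    }
    if (ftruncate(fd, (off_t) len) != 0) {
        close(fd);
        unlink(path);
        return -1;
    }
    close(fd);

    mem = map_then_unlink(path, len);
    if (mem == NULL) {
        unlink(path);
        return -1;
    }
    mem[0] = 'x';                                   /* write through the mapping */
    ok = (stat(path, &st) != 0) && (mem[0] == 'x'); /* name gone, data intact */
    munmap(mem, len);
    return ok ? 0 : -1;
}
```

The cost of this approach is that a process that has not yet mapped the file can no longer find it by name.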

Re: [OMPI devel] SM init failures

2009-04-01 Thread Ashley Pittman
On Tue, 2009-03-31 at 11:00 -0400, Jeff Squyres wrote: > On Mar 31, 2009, at 3:45 AM, Sylvain Jeaugey wrote: > > System V shared memory used to be the main way to do shared memory on > > MPICH and from my (little) experience, this was truly painful : > > - Cleanup issues : does shmctl(IPC_RMID) s

Re: [OMPI devel] SM init failures

2009-04-01 Thread Iain Bason
On Mar 31, 2009, at 11:00 AM, Jeff Squyres wrote: On Mar 31, 2009, at 3:45 AM, Sylvain Jeaugey wrote: Sorry to continue off-topic but going to System V shm would be for me like going back in the past. System V shared memory used to be the main way to do shared memory on MPICH and from my (li

Re: [OMPI devel] SM init failures

2009-03-31 Thread Eugene Loh
Jeff Squyres wrote: On Mar 31, 2009, at 3:06 PM, Eugene Loh wrote: The thing I was wondering about was memory barriers. E.g., you initialize stuff and then post the FIFO pointer. The other guy sees the FIFO pointer before the initialized memory. We do do memory barriers during that SM s

Re: [OMPI devel] SM init failures

2009-03-31 Thread Jeff Squyres
On Mar 31, 2009, at 3:06 PM, Eugene Loh wrote: The thing I was wondering about was memory barriers. E.g., you initialize stuff and then post the FIFO pointer. The other guy sees the FIFO pointer before the initialized memory. We do do memory barriers during that SM startup sequence. I
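The ordering concern quoted above (a peer seeing the FIFO pointer before the initialized memory) can be illustrated with C11 release/acquire atomics. This is a minimal in-process sketch, not the barriers Open MPI actually uses: struct fifo, posted_fifo, fifo_publish and fifo_lookup are hypothetical names, and real shared-memory code would normally post an offset rather than a raw pointer.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical FIFO header, for illustration only. */
struct fifo {
    int head;
    int tail;
    int size;
};

/* The "posted" pointer that other threads or processes poll. */
_Atomic(struct fifo *) posted_fifo = NULL;

/* Writer: initialize the FIFO, then publish the pointer. The release
 * store guarantees the initializing writes become visible no later
 * than the pointer itself. */
void fifo_publish(struct fifo *f, int size)
{
    f->head = 0;
    f->tail = 0;
    f->size = size;
    atomic_store_explicit(&posted_fifo, f, memory_order_release);
}

/* Reader: the acquire load pairs with the release store, so a non-NULL
 * result implies the fields written before the store are visible. */
struct fifo *fifo_lookup(void)
{
    return atomic_load_explicit(&posted_fifo, memory_order_acquire);
}

/* Hypothetical demo: publish a FIFO and read its size back. */
int demo_fifo_publish(void)
{
    static struct fifo f;
    struct fifo *seen;

    fifo_publish(&f, 128);
    seen = fifo_lookup();
    return seen != NULL ? seen->size : -1;
}
```

Without the release/acquire pairing (or an equivalent explicit write barrier before the post and read barrier after the load), the reader could observe the pointer while the fields still hold stale values.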

Re: [OMPI devel] SM init failures

2009-03-31 Thread Eugene Loh
Jeff Squyres wrote: On Mar 31, 2009, at 1:46 AM, Eugene Loh wrote: > FWIW, George found what looks like a race condition in the sm init > code today -- it looks like we don't call maffinity anywhere in the > sm btl startup, so we're not actually guaranteed that the memory is > local to any p

Re: [OMPI devel] SM init failures

2009-03-31 Thread Jeff Squyres
On Mar 31, 2009, at 3:45 AM, Sylvain Jeaugey wrote: Sorry to continue off-topic but going to System V shm would be for me like going back in the past. System V shared memory used to be the main way to do shared memory on MPICH and from my (little) experience, this was truly painful : - Cleanu

Re: [OMPI devel] SM init failures

2009-03-31 Thread Jeff Squyres
On Mar 31, 2009, at 1:46 AM, Eugene Loh wrote: > FWIW, George found what looks like a race condition in the sm init > code today -- it looks like we don't call maffinity anywhere in the > sm btl startup, so we're not actually guaranteed that the memory is > local to any particular process(or)

Re: [OMPI devel] SM init failures

2009-03-31 Thread Sylvain Jeaugey
Sorry to continue off-topic but going to System V shm would be for me like going back in the past. System V shared memory used to be the main way to do shared memory on MPICH and from my (little) experience, this was truly painful : - Cleanup issues : does shmctl(IPC_RMID) solve _all_ cases ?

Re: [OMPI devel] SM init failures

2009-03-31 Thread Eugene Loh
Jeff Squyres wrote: FWIW, George found what looks like a race condition in the sm init code today -- it looks like we don't call maffinity anywhere in the sm btl startup, so we're not actually guaranteed that the memory is local to any particular process(or) (!). This race shouldn't cause

Re: [OMPI devel] SM init failures

2009-03-30 Thread Jeff Squyres
FWIW, George found what looks like a race condition in the sm init code today -- it looks like we don't call maffinity anywhere in the sm btl startup, so we're not actually guaranteed that the memory is local to any particular process(or) (!). This race shouldn't cause segvs, though; it sh

Re: [OMPI devel] SM init failures

2009-03-30 Thread Eugene Loh
Jeff Squyres wrote: On Mar 30, 2009, at 1:40 PM, Patrick Geoffray wrote: > we will have to find a > pretty smart way to do this or we will completely break the memory > affinity stuff. I didn't look at the code, but I sure hope that the SM init code does touch each page to force allocation,

Re: [OMPI devel] SM init failures

2009-03-30 Thread Eugene Loh
Tim Mattox wrote: I think I remember setting up the MTT tests on Sif so that tests are run both with and without the coll_hierarch component selected. The coll_hierarch component stresses code paths and potential race conditions in its own way. So, if the problems are showing up more frequently

Re: [OMPI devel] SM init failures

2009-03-30 Thread Eugene Loh
Patrick Geoffray wrote: Jeff Squyres wrote: Why not? The "owning" process can do the touch; then it'll be affinity'ed properly. Right? Yes, that's what I meant by forcing allocation. From the thread, it looked like nobody touched the pages of the mapped file. If it's already done, no nee

Re: [OMPI devel] SM init failures

2009-03-30 Thread Eugene Loh
Jeff Squyres wrote: It's half done, actually. But it was still going to be an option, not necessarily the only way to do it: http://www.open-mpi.org/hg/hgwebdir.cgi/jsquyres/shm-sysv/ On Mar 30, 2009, at 1:40 PM, Tim Mattox wrote: I've been lurking on this conversation, and I am again

Re: [OMPI devel] SM init failures

2009-03-30 Thread Patrick Geoffray
Jeff Squyres wrote: Why not? The "owning" process can do the touch; then it'll be affinity'ed properly. Right? Yes, that's what I meant by forcing allocation. From the thread, it looked like nobody touched the pages of the mapped file. If it's already done, no need to write in the whole fil

Re: [OMPI devel] SM init failures

2009-03-30 Thread Eugene Loh
George Bosilca wrote: Then it looks like the safest solution is to use either ftruncate or the lseek method and then touch the first byte of all memory pages. Unfortunately, I see two problems with this. First, there is a clear performance hit on the startup time. And second, we will have to

Re: [OMPI devel] SM init failures

2009-03-30 Thread Jeff Squyres
It's half done, actually. But it was still going to be an option, not necessarily the only way to do it: http://www.open-mpi.org/hg/hgwebdir.cgi/jsquyres/shm-sysv/ On Mar 30, 2009, at 1:40 PM, Tim Mattox wrote: I've been lurking on this conversation, and I am again left with the impre

Re: [OMPI devel] SM init failures

2009-03-30 Thread Jeff Squyres
On Mar 30, 2009, at 1:40 PM, Patrick Geoffray wrote: > performance hit on the startup time. And second, we will have to find a > pretty smart way to do this or we will completely break the memory > affinity stuff. I didn't look at the code, but I sure hope that the SM init code does touch eac

Re: [OMPI devel] SM init failures

2009-03-30 Thread Patrick Geoffray
George Bosilca wrote: performance hit on the startup time. And second, we will have to find a pretty smart way to do this or we will completely break the memory affinity stuff. I didn't look at the code, but I sure hope that the SM init code does touch each page to force allocation, otherwise

Re: [OMPI devel] SM init failures

2009-03-30 Thread Tim Mattox
I've been lurking on this conversation, and I am again left with the impression that the underlying shared memory configuration based on sharing a file is flawed. Why not use a System V shared memory segment without a backing file as I described in ticket #1320? On Mon, Mar 30, 2009 at 1:34 PM, G
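For readers unfamiliar with the alternative being proposed, a System V segment with no backing file can be created roughly as follows. This is a generic sketch and is not taken from ticket #1320: sysv_segment_create and demo_sysv_segment are hypothetical names. Marking the segment with IPC_RMID right after attaching means the kernel reclaims it once the last process detaches, but whether other processes can still attach after that point differs between systems, which is part of the cleanup concern raised elsewhere in this thread.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose the XSI shm*() interfaces */
#endif
#include <stddef.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>

/* Create and attach a private System V segment of `size` bytes.
 * Returns the attached address and stores the id in *shmid_out,
 * or returns NULL on failure. */
void *sysv_segment_create(size_t size, int *shmid_out)
{
    void *addr;
    int shmid = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);

    if (shmid < 0) {
        return NULL;
    }
    addr = shmat(shmid, NULL, 0);
    if (addr == (void *) -1) {
        shmctl(shmid, IPC_RMID, NULL);   /* do not leak the segment */
        return NULL;
    }
    *shmid_out = shmid;
    return addr;
}

/* Hypothetical demo: returns 0 if the segment stays usable after being
 * marked for removal, -1 otherwise. */
int demo_sysv_segment(void)
{
    int shmid;
    int ok;
    char *mem = sysv_segment_create(4096, &shmid);

    if (mem == NULL) {
        return -1;
    }
    /* Mark for removal now so the segment is reclaimed after the last
     * detach, even if the process later crashes. */
    if (shmctl(shmid, IPC_RMID, NULL) != 0) {
        shmdt(mem);
        return -1;
    }
    memcpy(mem, "hello", 6);
    ok = (strcmp(mem, "hello") == 0);
    shmdt(mem);
    return ok ? 0 : -1;
}
```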

Re: [OMPI devel] SM init failures

2009-03-30 Thread Jeff Squyres
On Mar 30, 2009, at 1:24 PM, Iain Bason wrote: > But don't we need the whole area to be zero filled? It will be zero-filled on demand using the lseek/touch method. Ok. However, the OS may not reserve space for the skipped pages or disk blocks. Thus one could still get out of memory or fil

Re: [OMPI devel] SM init failures

2009-03-30 Thread George Bosilca
Then it looks like the safest solution is to use either ftruncate or the lseek method and then touch the first byte of all memory pages. Unfortunately, I see two problems with this. First, there is a clear performance hit on the startup time. And second, we will have to find a pretty smart
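Touching the first byte of every page, as suggested above, might look like the sketch below. It is an illustration under assumptions, not the sm BTL code: touch_pages and demo_touch_pages are hypothetical names, and the region is assumed to be freshly created, so storing a zero does not destroy data. Which process does the touching matters for memory affinity on systems that place a page near whoever first touches it.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose mkstemp() under strict -std= modes */
#endif
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Write one byte at the start of every page in [base, base + len) so
 * the OS must actually allocate it. Returns the number of pages
 * touched. Assumes the region is fresh, so storing 0 loses nothing. */
size_t touch_pages(volatile char *base, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t touched = 0;

    if (page <= 0) {
        return 0;
    }
    for (size_t off = 0; off < len; off += (size_t) page) {
        base[off] = 0;
        touched++;
    }
    return touched;
}

/* Hypothetical demo: size a temp file to `npages` pages, map it, touch
 * every page, and return how many pages were touched (-1 on error). */
long demo_touch_pages(long npages)
{
    char path[] = "/tmp/sm_touch_XXXXXX";
    long page = sysconf(_SC_PAGESIZE);
    long result = -1;
    size_t len;
    void *base;
    int fd;

    if (page <= 0 || npages <= 0) {
        return -1;
    }
    len = (size_t) page * (size_t) npages;
    fd = mkstemp(path);
    if (fd < 0) {
        return -1;
    }
    if (ftruncate(fd, (off_t) len) == 0) {
        base = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (base != MAP_FAILED) {
            result = (long) touch_pages(base, len);
            munmap(base, len);
        }
    }
    close(fd);
    unlink(path);
    return result;
}
```

In a multi-process job, each process would presumably touch only the pages it owns, along the lines of the "owning process does the touch" suggestion made elsewhere in this thread.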

Re: [OMPI devel] SM init failures

2009-03-30 Thread Iain Bason
On Mar 30, 2009, at 12:05 PM, Jeff Squyres wrote: But don't we need the whole area to be zero filled? It will be zero-filled on demand using the lseek/touch method. However, the OS may not reserve space for the skipped pages or disk blocks. Thus one could still get out of memory or file

Re: [OMPI devel] SM init failures

2009-03-30 Thread Jeff Squyres
But don't we need the whole area to be zero filled? On Mar 28, 2009, at 5:02 PM, George Bosilca wrote: It is way too expensive to write the whole file. That's why I proposed to only write the last byte. This will force the OS to really map the file on the systems less POSIX compliant. geor

Re: [OMPI devel] SM init failures

2009-03-30 Thread Christian Siebert
Hi, as you all have noticed already, ftruncate() does NOT extend the size of a file on all systems. Instead, the preferred way to set a file to a specific size is to call lseek() and then write() one byte (see e.g. [1]). Best regards, Christian [1] Richard Stevens: Advanced Programmi
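The lseek()/write() idiom described above can be sketched as follows. The function and demo names are hypothetical, and the sketch assumes a newly created, empty file. Everything before the final byte is left as a hole that reads back as zeros.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose mkstemp() under strict -std= modes */
#endif
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Grow a freshly created, empty file to `size` bytes by seeking to the
 * last byte and writing a single zero there.
 * Returns 0 on success, -1 on error. */
int extend_file_lseek(int fd, off_t size)
{
    const char zero = 0;

    if (size <= 0) {
        return -1;
    }
    if (lseek(fd, size - 1, SEEK_SET) == (off_t) -1) {
        return -1;
    }
    if (write(fd, &zero, 1) != 1) {
        return -1;
    }
    return 0;
}

/* Hypothetical demo: extend a temp file and return the size that
 * fstat() reports afterwards, or -1 on error. */
long demo_extend_file(long size)
{
    char path[] = "/tmp/sm_lseek_XXXXXX";
    struct stat st;
    long result = -1;
    int fd = mkstemp(path);

    if (fd < 0) {
        return -1;
    }
    if (extend_file_lseek(fd, (off_t) size) == 0 && fstat(fd, &st) == 0) {
        result = (long) st.st_size;
    }
    close(fd);
    unlink(path);
    return result;
}
```

Note that this sets the file's length but, as pointed out elsewhere in this thread, the OS may still not reserve space for the skipped blocks.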

Re: [OMPI devel] SM init failures

2009-03-28 Thread George Bosilca
It is way too expensive to write the whole file. That's why I proposed to only write the last byte. This will force the OS to really map the file on the systems less POSIX compliant. george. On Mar 28, 2009, at 13:50 , Jeff Squyres wrote: How about just write()ing a bunch of 0's instead o

Re: [OMPI devel] SM init failures

2009-03-28 Thread Jeff Squyres
How about just write()ing a bunch of 0's instead of using ftruncate? On Mar 27, 2009, at 11:09 PM, Eugene Loh wrote: Paul H. Hargrove wrote: > Quoting from a different manpage for ftruncate: >[T]he POSIX standard allows two behaviours for ftruncate >when length exceeds the file

Re: [OMPI devel] SM init failures

2009-03-27 Thread Eugene Loh
Paul H. Hargrove wrote: Quoting from a different manpage for ftruncate: [T]he POSIX standard allows two behaviours for ftruncate when length exceeds the file length [...]: either returning an error, or extending the file. So, if that is to be trusted, it is not legal by PO

Re: [OMPI devel] SM init failures

2009-03-27 Thread Paul H. Hargrove
Quoting from a different manpage for ftruncate: [T]he POSIX standard allows two behaviours for ftruncate when length exceeds the file length [...]: either returning an error, or extending the file. So, if that is to be trusted, it is not legal by POSIX to *silently* not extend
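Given the two behaviours quoted above, a caller that wants to be defensive can check both the return value and the resulting size. The sketch below is only an illustration; extend_file_ftruncate and demo_ftruncate_checked are hypothetical names, not functions from the Open MPI tree.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose mkstemp() under strict -std= modes */
#endif
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Try to set the file to `size` bytes with ftruncate(), then verify
 * with fstat() that the size really changed.
 * Returns 0 if the file now has exactly `size` bytes, -1 otherwise. */
int extend_file_ftruncate(int fd, off_t size)
{
    struct stat st;

    if (ftruncate(fd, size) != 0) {
        return -1;              /* the system reported an error */
    }
    if (fstat(fd, &st) != 0 || st.st_size != size) {
        return -1;              /* call returned 0 but size is wrong */
    }
    return 0;
}

/* Hypothetical demo: returns 0 if a temp file could be extended to
 * `size` bytes and verified, -1 otherwise. */
int demo_ftruncate_checked(long size)
{
    char path[] = "/tmp/sm_ftrunc_XXXXXX";
    int rc;
    int fd = mkstemp(path);

    if (fd < 0) {
        return -1;
    }
    rc = extend_file_ftruncate(fd, (off_t) size);
    close(fd);
    unlink(path);
    return rc;
}
```

A caller could fall back to the lseek()/write() method when this returns -1.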

Re: [OMPI devel] SM init failures

2009-03-27 Thread George Bosilca
Talking with Aurelien here @ UT, we think we came up with a possible way to get such an error. Before explaining this, let me lay out the basics. There are 2 critical functions used in setting up the shared memory file. One is ftruncate, the other is mmap. Here are two snippets from these function

Re: [OMPI devel] SM init failures

2009-03-27 Thread Tim Mattox
Eugene, I think I remember setting up the MTT tests on Sif so that tests are run both with and without the coll_hierarch component selected. The coll_hierarch component stresses code paths and potential race conditions in its own way. So, if the problems are showing up more frequently for the test

Re: [OMPI devel] SM init failures

2009-03-27 Thread Eugene Loh
Josh Hursey wrote: Sif is also running the coll_hierarch component on some of those tests which has caused some additional problems. I don't know if that is related or not. Indeed. Many of the MTT stack traces (for both 1.3.1 and 1.3.2 and that have seg faults and call out mca_btl_sm.so)

Re: [OMPI devel] SM init failures

2009-03-27 Thread Jeff Squyres
FWIW, when I was looking into this before, the problem was definitely during MPI_INIT. I ran out of time before being able to track it down further, but it was definitely something during the sm startup -- during add_procs, IIRC. It *looked* like there was some kind of bogus value in the b

Re: [OMPI devel] SM init failures

2009-03-27 Thread Ralph Castain
Hmmm...Eugene, you need to be a tad less sensitive. Nobody was attempting to indict you or in any way attack you or your code. What I was attempting to point out is that there are a number of sm failures during sm init. I didn't single you out. I posted it to the community because (a) it is

Re: [OMPI devel] SM init failures

2009-03-27 Thread Josh Hursey
On Mar 26, 2009, at 6:41 PM, Ralph Castain wrote: I suspect Josh or someone at IU could tell you the compiler. I would be very surprised if it wasn't gcc, but I don't know what version. All the MTT runs on Sif are using gcc 4.1.2: -bash-3.2$ gcc --version gcc (GCC) 4.1.2 20080704 (Red Hat

Re: [OMPI devel] SM init failures

2009-03-27 Thread Eugene Loh
Ralph Castain wrote: You are correct - the Sun errors are in a version prior to the insertion of the SM changes. We didn't relabel the version to 1.3.2 until -after- those changes went in, so you have to look for anything with an r number >= 20839. The sif errors are all in that group - I

Re: [OMPI devel] SM init failures

2009-03-26 Thread Ralph Castain
You are correct - the Sun errors are in a version prior to the insertion of the SM changes. We didn't relabel the version to 1.3.2 until -after- those changes went in, so you have to look for anything with an r number >= 20839. The sif errors are all in that group - I would suggest starting

Re: [OMPI devel] SM init failures

2009-03-26 Thread Eugene Loh
Ralph Castain wrote: It looks like the SM revisions we inserted into 1.3.2 are a great detector for shared memory init failures - it segfaulted 143 times last night on IU's sif computer, 34 times on Sun/Linux, and 3 times on Sun/SunOS...almost every single time due to "Address not mapped"

Re: [OMPI devel] SM init failures

2009-03-26 Thread Eugene Loh
Ralph Castain wrote: Hi folks Er, perhaps pronounced "Eugene". :^( It looks like the SM revisions we inserted into 1.3.2 are a great detector for shared memory init failures How delicately put! I appreciate the gentleness. - it segfaulted 143 times last night on IU's sif computer, 34

[OMPI devel] SM init failures

2009-03-26 Thread Ralph Castain
Hi folks It looks like the SM revisions we inserted into 1.3.2 are a great detector for shared memory init failures - it segfaulted 143 times last night on IU's sif computer, 34 times on Sun/Linux, and 3 times on Sun/SunOS...almost every single time due to "Address not mapped" errors in t