Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-06-09 Thread Samuel K. Gutierrez
On Jun 9, 2010, at 11:57 AM, Jeff Squyres wrote: Interesting. This, of course, begs the question of whether we should use sysv shmem or not. It seems like the order of preference should be: - sysv - mmap in a tmpfs - mmap in a "regular" (but not networked) fs The big downer, of course…

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-06-09 Thread Eugene Loh
If anyone is up for it, another interesting performance comparison could be start-up time. That is, consider a fat node with many on-node processes and a large shared-memory area. How long does it take for all that shared memory to be set up? Arguably, start-up time is a "second-order effect"…
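
If someone does take this up, one rough way to measure it (a minimal sketch, not the benchmark being proposed here; the 64 MiB size and the /tmp path are arbitrary choices, and error handling is kept minimal) is to time the create/attach/first-touch path for each mechanism separately, e.g. in C:

    /* Rough sketch only: time how long it takes to create, attach, and
       first-touch a shared-memory region with SysV shm versus an mmap'd
       backing file.  Size and path are arbitrary. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>
    #include <sys/mman.h>

    #define SEG_SIZE (64UL * 1024 * 1024)

    static double now(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
    }

    int main(void)
    {
        /* SysV: shmget + shmat + first touch of every page. */
        double t0 = now();
        int id = shmget(IPC_PRIVATE, SEG_SIZE, IPC_CREAT | 0600);
        if (id < 0) { perror("shmget"); return 1; }
        void *p = shmat(id, NULL, 0);
        if (p == (void *)-1) { perror("shmat"); return 1; }
        memset(p, 0, SEG_SIZE);
        printf("sysv setup + first touch: %.4f s\n", now() - t0);
        shmdt(p);
        shmctl(id, IPC_RMID, NULL);

        /* mmap on a backing file; point the path at /dev/shm to get the
           tmpfs case instead of /tmp. */
        t0 = now();
        int fd = open("/tmp/sm_probe", O_CREAT | O_RDWR, 0600);
        if (fd < 0) { perror("open"); return 1; }
        if (ftruncate(fd, SEG_SIZE) != 0) { perror("ftruncate"); return 1; }
        p = mmap(NULL, SEG_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }
        memset(p, 0, SEG_SIZE);
        printf("mmap setup + first touch: %.4f s\n", now() - t0);
        munmap(p, SEG_SIZE);
        close(fd);
        unlink("/tmp/sm_probe");
        return 0;
    }

On older glibc, link with -lrt for clock_gettime().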

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-06-09 Thread Jeff Squyres
Interesting. This, of course, begs the question of whether we should use sysv shmem or not. It seems like the order of preference should be: - sysv - mmap in a tmpfs - mmap in a "regular" (but not networked) fs The big downer, of course, is the whole "what happens if the job crashes?" is…
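
A side note on the tmpfs distinction: on Linux, one way to check where a prospective mmap backing file would land is statfs(). This is only an illustrative, Linux-only sketch, not the logic Open MPI uses to choose a path:

    /* Linux-only sketch: report whether a directory lives on tmpfs, so an
       mmap backing file placed there would be memory-backed rather than
       disk-backed.  Purely illustrative. */
    #include <stdio.h>
    #include <sys/vfs.h>
    #include <linux/magic.h>   /* TMPFS_MAGIC */

    static int is_tmpfs(const char *path)
    {
        struct statfs fs;
        if (statfs(path, &fs) != 0) {
            return -1;                      /* could not stat the filesystem */
        }
        return fs.f_type == TMPFS_MAGIC;    /* 1 = tmpfs, 0 = something else */
    }

    int main(void)
    {
        printf("/dev/shm on tmpfs: %d\n", is_tmpfs("/dev/shm"));
        printf("/tmp     on tmpfs: %d\n", is_tmpfs("/tmp"));
        return 0;
    }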

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-06-09 Thread Samuel K. Gutierrez
Thanks, Sylvain! -- Samuel K. Gutierrez Los Alamos National Laboratory On Jun 9, 2010, at 9:58 AM, Sylvain Jeaugey wrote: As stated at the conf call, I did some performance testing on a 32-core node. So, here is a graph showing 500 timings of an allreduce operation (repeated 15,000 times for good timing)…

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-06-09 Thread Sylvain Jeaugey
As stated at the conf call, I did some performance testing on a 32-core node. So, here is a graph showing 500 timings of an allreduce operation (repeated 15,000 times for good timing) with sysv, mmap on /dev/shm, and mmap on /tmp. What it shows: - sysv has the best performance; - having…
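
For anyone wanting to reproduce something similar, the general shape of such a measurement is roughly the following (a generic sketch, not Sylvain's actual benchmark; the buffer size and the sample/repetition counts are placeholders):

    /* Generic allreduce timing loop: collect many samples, each averaging
       over many repetitions.  Buffer size and counts are arbitrary. */
    #include <mpi.h>
    #include <stdio.h>

    #define NSAMPLES 500
    #define NREPS    15000
    #define COUNT    16            /* doubles per allreduce */

    int main(int argc, char **argv)
    {
        double in[COUNT], out[COUNT], t0;
        int rank, s, r, i;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        for (i = 0; i < COUNT; i++) in[i] = (double)i;

        for (s = 0; s < NSAMPLES; s++) {
            MPI_Barrier(MPI_COMM_WORLD);
            t0 = MPI_Wtime();
            for (r = 0; r < NREPS; r++) {
                MPI_Allreduce(in, out, COUNT, MPI_DOUBLE, MPI_SUM,
                              MPI_COMM_WORLD);
            }
            if (rank == 0) {
                printf("%d %g\n", s, (MPI_Wtime() - t0) / NREPS);
            }
        }
        MPI_Finalize();
        return 0;
    }

Running this once per shared-memory backend (sysv, mmap on /dev/shm, mmap on /tmp) and plotting the per-sample averages gives the kind of comparison described above.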

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-03 Thread Samuel K. Gutierrez
Hi all, Does anyone know of a relatively portable solution for querying a given system for the shmctl behavior that I am relying on, or is this going to be a nightmare? Because, if I am reading this thread correctly, the presence of shmget and Linux is not sufficient for determining an a…
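
One candidate for such a check, offered only as a sketch of the kind of probe being asked about (not necessarily what was adopted), is to create a throwaway segment, mark it with IPC_RMID while still attached, and test whether a subsequent shmat() succeeds:

    /* Runtime probe: does this system still allow shmat() on a segment
       already marked for removal with IPC_RMID?  Linux documents that it
       does; systems that remove the segment immediately (e.g. Solaris)
       will fail the second attach.  Exit status 0 means "available". */
    #include <stdio.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    int main(void)
    {
        int ok = 0;
        int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
        if (id < 0) return 2;

        void *a = shmat(id, NULL, 0);
        if (a == (void *)-1) { shmctl(id, IPC_RMID, NULL); return 2; }

        /* Mark for destruction while still attached... */
        shmctl(id, IPC_RMID, NULL);

        /* ...then see whether another attach is still possible. */
        void *b = shmat(id, NULL, 0);
        if (b != (void *)-1) { ok = 1; shmdt(b); }
        shmdt(a);                    /* last detach frees the segment */

        printf("shmat after IPC_RMID: %s\n", ok ? "works" : "not supported");
        return ok ? 0 : 1;
    }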

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-02 Thread N.M. Maclaren
On May 2 2010, Ashley Pittman wrote: On 2 May 2010, at 04:03, Samuel K. Gutierrez wrote: As to performance, there should be no difference in use between sys-V shared memory and file-backed shared memory; the instructions issued and the MMU flags for the page should both be the same, so the performance…

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-02 Thread Christopher Samuel
On 02/05/10 06:49, Ashley Pittman wrote: > I think you should look into this a little deeper; it certainly used to be the case on Linux that setting IPC_RMID would also prevent any further processes from attaching to the segment. That certainly appears to be the case in the current master of…

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-02 Thread Christopher Samuel
On 01/05/10 23:03, Samuel K. Gutierrez wrote: > I call shmctl IPC_RMID immediately after one process has attached to the segment because, at least on Linux, this only marks the segment for destruction. That's correct; looking at the kernel code (at least in the current git master), the function…
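
Reduced to its essentials, the pattern under discussion looks like the sketch below (a standalone illustration, not the component's actual code); fork() stands in for the other on-node processes, and the late attach relies on the Linux behavior that a segment marked for destruction can still be attached:

    /* Illustration of the create / attach / IPC_RMID-early pattern:
       once the segment is marked, the kernel reclaims it as soon as the
       last attachment disappears, even if the job crashes. */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/wait.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    int main(void)
    {
        int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
        int *p = (id < 0) ? (int *)-1 : (int *)shmat(id, NULL, 0);
        if (p == (int *)-1) return 1;
        *p = 42;

        /* Mark for destruction immediately after the first attach. */
        shmctl(id, IPC_RMID, NULL);

        if (fork() == 0) {
            /* "Late" process: attach by id although RMID is already set.
               This works on Linux; other systems may refuse it. */
            int *q = (int *)shmat(id, NULL, 0);
            if (q != (int *)-1) {
                printf("child sees %d\n", *q);
                shmdt(q);
            } else {
                printf("child could not attach\n");
            }
            _exit(0);
        }
        wait(NULL);
        shmdt(p);    /* last detach: the segment is gone, nothing to clean up */
        return 0;
    }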

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-02 Thread Ashley Pittman
On 2 May 2010, at 04:03, Samuel K. Gutierrez wrote: > As far as I can tell, calling shmctl IPC_RMID immediately destroys the shared memory segment even though there is at least one process attached to it. This is interesting and confusing because Solaris 10's behavior description of shmctl…

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-01 Thread Samuel K. Gutierrez
Hi Ethan, Sorry about the lag. As far as I can tell, calling shmctl IPC_RMID immediately destroys the shared memory segment even though there is at least one process attached to it. This is interesting and confusing because Solaris 10's behavior description of shmctl IPC_RMID is similar to…

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-04-30 Thread Ethan Mallove
On Thu, Apr/29/2010 02:52:24PM, Samuel K. Gutierrez wrote: > Hi Ethan, > Bummer. What does the following command show? > sysctl -a | grep shm In this case, I think the Solaris equivalent to sysctl is prctl, e.g., $ prctl -i project group.staff project: 10: group.staff NAME PRIV…

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-04-29 Thread Samuel K. Gutierrez
Hi Ethan, Bummer. What does the following command show? sysctl -a | grep shm Thanks! -- Samuel K. Gutierrez Los Alamos National Laboratory On Apr 29, 2010, at 1:32 PM, Ethan Mallove wrote: Hi Samuel, I'm trying to run off your HG clone, but I'm seeing issues with c_hello, e.g., $ mpirun…

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-04-29 Thread Ethan Mallove
Hi Samuel, I'm trying to run off your HG clone, but I'm seeing issues with c_hello, e.g., $ mpirun -mca mpi_common_sm sysv --mca btl self,sm,tcp --host burl-ct-v440-2,burl-ct-v440-2 -np 2 ./c_hello -- A system call failed…

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-04-28 Thread Samuel K. Gutierrez
Hi, Faster component initialization/finalization times are one of the main motivating factors for this work. The general idea is to get away from creating a rather large backing file. With respect to module bandwidth and latency, mmap and sysv seem to be comparable - at least that is what…

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-04-28 Thread Bogdan Costescu
On Tue, Apr 27, 2010 at 7:55 PM, Samuel K. Gutierrez wrote: > With Jeff and Ralph's help, I have completed a System V shared memory component for Open MPI. What is the motivation for this work? Are there situations where the mmap-based SM component doesn't work or is slow(er)? Kind regards, …

[OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-04-27 Thread Samuel K. Gutierrez
Hi, With Jeff and Ralph's help, I have completed a System V shared memory component for Open MPI. I have conducted some preliminary tests on our systems, but would like to get test results from a broader audience. As it stands, mmap is the default, but System V shared memory can be activated…