Re: [OMPI devel] shmem error msg

2011-07-25 Thread Samuel K. Gutierrez
Hi Ralph, On Jul 25, 2011, at 11:05 AM, Ralph Castain wrote: On Jul 25, 2011, at 10:16 AM, Samuel K. Gutierrez wrote: Hi Ralph, It seems as if this issue is related to a missing shm_unlink wrapper within Valgrind. I'm going to disable posix by default and commit later today.

Re: [OMPI devel] shmem error msg

2011-07-25 Thread Samuel K. Gutierrez
Hi Ralph, It seems as if this issue is related to a missing shm_unlink wrapper within Valgrind. I'm going to disable posix by default and commit later today. Thanks, -- Samuel K. Gutierrez Los Alamos National Laboratory On Jul 23, 2011, at 8:54 PM, Samuel K. Gutierrez wrote: Hi

Re: [OMPI devel] shmem error msg

2011-07-23 Thread Samuel K. Gutierrez
Hi Ralph, That's mine - I'll take a look. Thanks, Sam > Whenever I run valgrind on orterun (or any OMPI tool), I get the following > error msg: > > -- > A system call failed during shared memory initialization that should >

Re: [OMPI devel] RFC: Bring in Shared Memory Backing Facility Framework (shmem)

2011-06-21 Thread Samuel K. Gutierrez
In r24795. Thanks, -- Samuel K. Gutierrez Los Alamos National Laboratory On Jun 15, 2011, at 10:01 AM, Samuel K. Gutierrez wrote: WHAT: Bring in new shared memory backing facility framework (shmem) and its components. shmem is simply a framework for the manipulation of shared memory

[OMPI devel] RFC: Bring in Shared Memory Backing Facility Framework (shmem)

2011-06-15 Thread Samuel K. Gutierrez
/coll_sm_module.c M orte/mca/odls/base/odls_base_default_fns.c M orte/tools/orte-info/orte-info.c M orte/tools/orte-info/components.c WHEN: Before 1.7. TIMEOUT: Teleconference, Tues 21 June 2011 Thanks! -- Samuel K. Gutierrez Los Alamos National Laboratory

Re: [OMPI devel] 1.4.4rc2 is up

2011-05-18 Thread Samuel K. Gutierrez
Here is the 'pgCC -V' output from versions that I have access to. $ pgCC -V pgCC 7.1-6 64-bit target on x86-64 Linux -tp gh-64 Copyright 1989-2000, The Portland Group, Inc. All Rights Reserved. Copyright 2000-2007, STMicroelectronics, Inc. All Rights Reserved. $ pgCC -V pgCC 9.0-3 64-bit ta

Re: [OMPI devel] Too many open files (24)

2011-03-30 Thread Samuel K. Gutierrez
Hi Tim, Great news! Happy calculating :-). -- Samuel K. Gutierrez Los Alamos National Laboratory > Dear Samuel, > > Just as you replied I was trying that on the compute nodes. Surprise, > surprise...the value returned as the hard and soft limits is 1024. > > Thanks for confir

Re: [OMPI devel] Too many open files (24)

2011-03-30 Thread Samuel K. Gutierrez
Hi, It sounds like Open MPI is hitting your system's open file descriptor limit. If that's the case, one potential workaround is to have your system administrator raise file descriptor limits. On a compute node, what does "ulimit -a" show (using bash)? Hope that
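
The limit that "ulimit -a" reports as "open files" can also be read from inside a program. The following is a minimal sketch using the POSIX getrlimit() call with RLIMIT_NOFILE; the helper names query_nofile_limit and print_nofile_limit are made up for illustration and are not part of Open MPI.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: fetch the soft and hard limits on open file
 * descriptors for the calling process.
 * Returns 0 on success, -1 if getrlimit() fails (errno is set). */
int query_nofile_limit(rlim_t *soft, rlim_t *hard)
{
    struct rlimit rl;

    assert(soft != NULL && hard != NULL);
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        return -1;
    }
    *soft = rl.rlim_cur; /* the limit actually enforced right now */
    *hard = rl.rlim_max; /* ceiling up to which the soft limit may be raised */
    return 0;
}

/* Hypothetical helper: print both limits, spelling out "unlimited". */
void print_nofile_limit(void)
{
    rlim_t soft, hard;

    if (query_nofile_limit(&soft, &hard) != 0) {
        perror("getrlimit");
        return;
    }
    if (soft == RLIM_INFINITY) {
        printf("soft: unlimited\n");
    } else {
        printf("soft: %llu\n", (unsigned long long)soft);
    }
    if (hard == RLIM_INFINITY) {
        printf("hard: unlimited\n");
    } else {
        printf("hard: %llu\n", (unsigned long long)hard);
    }
}
```

Calling print_nofile_limit() from a process started on a compute node shows the values that process inherited, which may differ from what an interactive login shell reports.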

Re: [OMPI devel] Threading

2010-10-12 Thread Samuel K. Gutierrez
Same here. -- Samuel K. Gutierrez Los Alamos National Laboratory > On Oct 11, 2010, at 11:41 PM, Ralph Castain wrote: > >> Does anyone know of a reason why mpirun can -not- be threaded, assuming >> that all threads block and do not continuously chew cpu? Is there an >>

Re: [OMPI devel] Question regarding recently common shared-memory component

2010-09-21 Thread Samuel K. Gutierrez
Hi, Just to be clear - do you see similar checkpoint performance differences in 1.5rc6 and 1.4.2 with and without shared memory enabled? Thanks, -- Samuel K. Gutierrez Los Alamos National Laboratory On Sep 21, 2010, at 9:35 AM, > wrote: Hello Samuel This problem seems to be resol

Re: [OMPI devel] Question regarding recently common shared-memory component

2010-09-20 Thread Samuel K. Gutierrez
Hi Ananda, This issue should be resolved in r23781. Please let me know if it is not. Thanks! -- Samuel K. Gutierrez Los Alamos National Laboratory On Sep 20, 2010, at 11:26 AM, > wrote: I have used following options to build: ./configure CC=/usr/bin/gcc CXX=/usr/bin/c++ F77=/usr/

Re: [OMPI devel] Question regarding recently common shared-memory component

2010-09-20 Thread Samuel K. Gutierrez
Let me take a look at it. How did you configure your build? Thanks, -- Samuel K. Gutierrez Los Alamos National Laboratory On Sep 20, 2010, at 10:14 AM, > wrote: Hi I believe the new common shared memory component was committed to the trunk sometime towards the later part of August

Re: [OMPI devel] common_sm_mmap.c: wrong args to orte_show_help() (1.5rc5 and 1.4.3rc1)

2010-08-26 Thread Samuel K. Gutierrez
Will do. Sam On Aug 26, 2010, at 2:08 PM, Jeff Squyres wrote: I think Sam already submitted CMR's for 1.5: https://svn.open-mpi.org/trac/ompi/ticket/2545 Sam -- can you construct an equivalent for v1.4 and CC Paul so that he knows not to follow up on it? Thanks! On Aug 26, 2010, at 3:

Re: [OMPI devel] Trunk Commit Heads-up: New Common Shared Memory Component

2010-08-23 Thread Samuel K. Gutierrez
Code is in (see r23633). Note: mmap is still the default. -- Samuel K. Gutierrez Los Alamos National Laboratory On Aug 12, 2010, at 11:37 AM, Samuel K. Gutierrez wrote: Sorry, I should have included the link containing the discussion of the plot. http://www.open-mpi.org/community/lists

Re: [OMPI devel] Trunk Commit Heads-up: New Common Shared Memory Component

2010-08-12 Thread Samuel K. Gutierrez
Sorry, I should have included the link containing the discussion of the plot. http://www.open-mpi.org/community/lists/devel/2010/06/8078.php -- Samuel K. Gutierrez Los Alamos National Laboratory On Aug 12, 2010, at 11:20 AM, Terry Dontje wrote: Sorry Rich, I didn't realize there was a

Re: [OMPI devel] Trunk Commit Heads-up: New Common Shared Memory Component

2010-08-11 Thread Samuel K. Gutierrez
Hi Terry, One more thing... Before testing on Solaris 10, could you please update (I just committed a Solaris 10 fix). Thanks, -- Samuel K. Gutierrez Los Alamos National Laboratory On Aug 11, 2010, at 1:15 PM, Samuel K. Gutierrez wrote: Hi Terry, On Aug 11, 2010, at 12:34 PM

Re: [OMPI devel] Trunk Commit Heads-up: New Common Shared Memory Component

2010-08-11 Thread Samuel K. Gutierrez
rry about where the OMPI session directory is rooted. -- Samuel K. Gutierrez Los Alamos National Laboratory I'm still working on testing the code on Solaris but I don't imagine I will see anything that will change my mind. --td Samuel K. Gutierrez wrote: Hi Rich, It's a mod

Re: [OMPI devel] Trunk Commit Heads-up: New Common Shared Memory Component

2010-08-11 Thread Samuel K. Gutierrez
onent ? Rich On 8/10/10 10:52 AM, "Samuel K. Gutierrez" wrote: Hi, I wanted to give everyone a heads-up about a new POSIX shared memory component that has been in the works for a while now and is ready to be pushed into the trunk. http://bitbucket.org/samuelkgutierrez/ompi_pos

[OMPI devel] Trunk Commit Heads-up: New Common Shared Memory Component

2010-08-10 Thread Samuel K. Gutierrez
Addressed some of George's code reuse concerns. If there are no major objections by August 17th, I'll commit the code after the Tuesday morning conference call. Thanks! -- Samuel K. Gutierrez Los Alamos National Laboratory

[OMPI devel] Los Alamos National Lab MPI Position

2010-07-27 Thread Samuel K. Gutierrez
All, LANL is currently looking for an individual to participate in support and development efforts in parallel code benchmarking, performance analysis, tuning, and tools integration. Duties will include parallel environment optimization for programming models targeted on deployed archite

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-06-10 Thread Samuel K. Gutierrez
On Jun 10, 2010, at 1:47 AM, Sylvain Jeaugey wrote: On Wed, 9 Jun 2010, Jeff Squyres wrote: On Jun 9, 2010, at 3:26 PM, Samuel K. Gutierrez wrote: System V shared memory cleanup is a concern only if a process dies in between shmat and shmctl IPC_RMID. Shared memory segment cleanup should

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-06-09 Thread Samuel K. Gutierrez
segments. System V shared memory cleanup is a concern only if a process dies in between shmat and shmctl IPC_RMID. Shared memory segment cleanup should happen automagically in most cases, including abnormal process termination. -- Samuel K. Gutierrez Los Alamos National Laboratory Righ
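
The window between shmat and shmctl IPC_RMID can be sketched in a single process as follows. This is only an illustration built on the standard System V calls, not the code from the component under discussion; the function name sysv_segment_create and the use of IPC_PRIVATE are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper: create a System V segment, attach it, and mark it
 * for removal right away. Returns the attached address, or NULL on failure.
 * On success the segment id is stored in *shmid_out. */
void *sysv_segment_create(size_t size, int *shmid_out)
{
    int shmid;
    void *addr;

    assert(shmid_out != NULL);

    shmid = shmget(IPC_PRIVATE, size, IPC_CREAT | IPC_EXCL | 0600);
    if (shmid == -1) {
        return NULL;
    }

    addr = shmat(shmid, NULL, 0);
    if (addr == (void *)-1) {
        shmctl(shmid, IPC_RMID, NULL); /* nothing attached: remove now */
        return NULL;
    }

    /* If the process dies at this point, before IPC_RMID is issued,
     * the segment outlives it and has to be removed by hand. */

    if (shmctl(shmid, IPC_RMID, NULL) == -1) {
        shmdt(addr);
        return NULL;
    }

    /* From here on the kernel destroys the segment once the last
     * attached process detaches or exits, even on abnormal termination. */
    *shmid_out = shmid;
    return addr;
}
```

In a multi-process job the IPC_RMID call would have to wait until every peer has attached, because whether a segment already marked for removal can still be attached is platform-dependent (Linux permits it; other systems may not).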

Re: [OMPI devel] RFC: System V Shared Memory for Open MPI

2010-06-09 Thread Samuel K. Gutierrez
Now in the trunk (see r23260). Thanks everyone! -- Samuel K. Gutierrez Los Alamos National Laboratory On Jun 1, 2010, at 11:08 AM, Samuel K. Gutierrez wrote: WHAT: New System V shared memory component. WHY: https://svn.open-mpi.org/trac/ompi/ticket/1320 WHERE: M ompi/mca/btl/sm

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-06-09 Thread Samuel K. Gutierrez
Thanks Sylvain! -- Samuel K. Gutierrez Los Alamos National Laboratory On Jun 9, 2010, at 9:58 AM, Sylvain Jeaugey wrote: As stated at the conf call, I did some performance testing on a 32 cores node. So, here is graph showing 500 timings of an allreduce operation (repeated 15,000 times

Re: [OMPI devel] RFC: System V Shared Memory for Open MPI

2010-06-03 Thread Samuel K. Gutierrez
On Jun 2, 2010, at 11:58 AM, Samuel K. Gutierrez wrote: Good point - I forgot about that. -- Samuel K. Gutierrez Los Alamos National Laboratory On Jun 2, 2010, at 11:40 AM, Jeff Squyres wrote: Don't forget that the RML is also used to broadcast the success/ failure of the creation o

Re: [OMPI devel] RFC: System V Shared Memory for Open MPI

2010-06-02 Thread Samuel K. Gutierrez
Good point - I forgot about that. -- Samuel K. Gutierrez Los Alamos National Laboratory On Jun 2, 2010, at 11:40 AM, Jeff Squyres wrote: Don't forget that the RML is also used to broadcast the success/ failure of the creation of the shared memory segment. If the RML goes away, be sure

Re: [OMPI devel] RFC: System V Shared Memory for Open MPI

2010-06-02 Thread Samuel K. Gutierrez
Hi George, That may work - I'll try it. Thanks! -- Samuel K. Gutierrez Los Alamos National Laboratory On Jun 2, 2010, at 10:59 AM, George Bosilca wrote: How about ftok ? The init function takes a file_name as argument, and this file name is unique per instance of the shared memory r
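
The ftok suggestion could look roughly like the sketch below: each process derives the same System V key from the shared file name instead of receiving a segment ID over the RML. The function name key_from_backing_file and the project id 'O' are invented for the example, and ftok() does not guarantee that distinct files yield distinct keys, so treat this as a starting point only.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/types.h>

/* Hypothetical helper: derive a System V IPC key from an existing file.
 * Every process that passes the same path and project id gets the same key.
 * Returns (key_t)-1 if the file cannot be examined (errno is set). */
key_t key_from_backing_file(const char *path, int proj_id)
{
    assert(path != NULL);
    /* Only the low 8 bits of proj_id are used, and they must be nonzero. */
    return ftok(path, proj_id);
}
```

A creator would then pass the key to shmget() with IPC_CREAT | IPC_EXCL, and the other processes would call shmget() with the same key and no creation flags, for example after key_from_backing_file(file_name, 'O').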

Re: [OMPI devel] RFC: System V Shared Memory for Open MPI

2010-06-02 Thread Samuel K. Gutierrez
memory ID (generated by exactly one process). I'm not sure how we would go about passing along the shared memory ID without RML, but any ideas are greatly appreciated. Thanks, -- Samuel K. Gutierrez Los Alamos National Laboratory -- Jeff Squyres jsquy...@cisco.com For corporate

Re: [OMPI devel] RFC: System V Shared Memory for Open MPI

2010-06-02 Thread Samuel K. Gutierrez
inalize). Sam -- are the common parts really common? I.e., could they be factored out? Or are they "just different enough" that factoring them out would be a PITA? I'm sure some refactoring could be done - let me take a look. -- Samuel K. Gutierrez Los Alamos National Labor

Re: [OMPI devel] RFC: System V Shared Memory for Open MPI

2010-06-01 Thread Samuel K. Gutierrez
Hi all, Configure option added: --enable-sysv (default: disabled). For sysv testing purposes, please enable. Thanks! -- Samuel K. Gutierrez Los Alamos National Laboratory On Jun 1, 2010, at 11:11 AM, Samuel K. Gutierrez wrote: Doh! bitbucket repository: http://bitbucket.org

Re: [OMPI devel] RFC: System V Shared Memory for Open MPI

2010-06-01 Thread Samuel K. Gutierrez
Hi Rich, I'll add a configure-time option. This addition does not negatively impact the performance of the current sm component. Thanks, -- Samuel K. Gutierrez Los Alamos National Laboratory On Jun 1, 2010, at 11:35 AM, Graham, Richard L. wrote: Can you be a bit more explicit, p

Re: [OMPI devel] RFC: System V Shared Memory for Open MPI

2010-06-01 Thread Samuel K. Gutierrez
Doh! bitbucket repository: http://bitbucket.org/samuelkgutierrez/ompi_sysv_sm Thanks, -- Samuel K. Gutierrez Los Alamos National Laboratory On Jun 1, 2010, at 11:08 AM, Samuel K. Gutierrez wrote: WHAT: New System V shared memory component. WHY: https://svn.open-mpi.org/trac/ompi/ticket

[OMPI devel] RFC: System V Shared Memory for Open MPI

2010-06-01 Thread Samuel K. Gutierrez
map - or a comma delimited combination of them (order dependent). The first component that is successfully selected is used. Thanks! -- Samuel K. Gutierrez Los Alamos National Laboratory
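
The "first component that is successfully selected is used" rule for an order-dependent, comma-delimited list can be sketched as below. This is not the Open MPI selection code; select_first_usable, the callback type, and the pretend_only_mmap_usable predicate are all hypothetical stand-ins for whatever check decides that a component can be used.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical predicate type: nonzero if the named component can be used. */
typedef int (*component_usable_fn)(const char *name);

/* Hypothetical stand-in predicate: pretend only "mmap" works on this host. */
int pretend_only_mmap_usable(const char *name)
{
    return strcmp(name, "mmap") == 0;
}

/* Walk a comma-delimited list from left to right and copy the first usable
 * name into out. Returns 0 if one was found, -1 otherwise (out is emptied).
 * Empty entries and names that do not fit in out are skipped. */
int select_first_usable(const char *list, component_usable_fn usable,
                        char *out, size_t out_len)
{
    const char *p = list;

    assert(list != NULL && usable != NULL && out != NULL);

    while (*p != '\0') {
        size_t len = strcspn(p, ","); /* length of the current entry */

        if (len > 0 && len < out_len) {
            memcpy(out, p, len);
            out[len] = '\0';
            if (usable(out)) {
                return 0; /* earlier entries take priority */
            }
        }
        p += len;
        if (*p == ',') {
            p++; /* step over the delimiter */
        }
    }
    if (out_len > 0) {
        out[0] = '\0';
    }
    return -1;
}
```

With the stand-in predicate, the list "sysv,mmap" falls through to mmap, while a list containing only "sysv" selects nothing.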

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-10 Thread Samuel K. Gutierrez
iated! http://bitbucket.org/samuelkgutierrez/ompi_sysv_sm Thanks, -- Samuel K. Gutierrez Los Alamos National Laboratory On May 5, 2010, at 7:53 AM, Samuel K. Gutierrez wrote: On May 5, 2010, at 6:10 AM, Jeff Squyres wrote: On May 4, 2010, at 9:53 AM, Ashley Pittman wrote: Point noted. But act

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-05 Thread Samuel K. Gutierrez
just need to communicate via RML to sync up, I suppose. I need to think about it a little more, but I like this solution. Thanks, -- Samuel K. Gutierrez Los Alamos National Laboratory I should of course have said fork where I mentioned spawn above to avoid any confusion, spawn has a specific mea

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-03 Thread Samuel K. Gutierrez
Hi All, New configure-time test added - thanks for the suggestion, Jeff. Update and give it a whirl. Ethan - could you please try again? This time, I'm hoping sysv support will be disabled ;-). Thanks! -- Samuel K. Gutierrez Los Alamos National Laboratory On May 3, 2010, at 9:

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-03 Thread Samuel K. Gutierrez
Hi Jeff, Sounds like a plan :-). Thanks! -- Samuel K. Gutierrez Los Alamos National Laboratory On May 3, 2010, at 9:12 AM, Jeff Squyres wrote: It might well be that you need a configure test to determine whether this behavior occurs or not. Heck, it may even need to be a run- time test

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-03 Thread Samuel K. Gutierrez
adequate level of sysv support. Thanks! -- Samuel K. Gutierrez Los Alamos National Laboratory On May 2, 2010, at 7:48 AM, N.M. Maclaren wrote: On May 2 2010, Ashley Pittman wrote: On 2 May 2010, at 04:03, Samuel K. Gutierrez wrote: As to performance there should be no difference in use

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-01 Thread Samuel K. Gutierrez
nks for testing! -- Samuel K. Gutierrez Los Alamos National Laboratory > On Thu, Apr/29/2010 02:52:24PM, Samuel K. Gutierrez wrote: >> Hi Ethan, >> Bummer. What does the following command show? >> sysctl -a | grep shm > > In this case, I think the Solaris equivalent

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-04-29 Thread Samuel K. Gutierrez
Hi Ethan, Bummer. What does the following command show? sysctl -a | grep shm Thanks! -- Samuel K. Gutierrez Los Alamos National Laboratory On Apr 29, 2010, at 1:32 PM, Ethan Mallove wrote: Hi Samuel, I'm trying to run off your HG clone, but I'm seeing issues with c_

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-04-28 Thread Samuel K. Gutierrez
what my preliminary tests have shown. As it stands, I have not come across a situation where the mmap SM component doesn't work or is slower. Hope that helps, -- Samuel K. Gutierrez Los Alamos National Laboratory On Apr 28, 2010, at 5:35 AM, Bogdan Costescu wrote: On Tue, Apr 27, 20

[OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-04-27 Thread Samuel K. Gutierrez
an be activated using: -mca mpi_common_sm sysv Repository: http://bitbucket.org/samuelkgutierrez/ompi_sysv_sm Input is greatly appreciated! -- Samuel K. Gutierrez Los Alamos National Laboratory

Re: [OMPI devel] kernel 2.6.23 vs 2.6.24 - communication/wait times

2010-04-22 Thread Samuel K. Gutierrez
On Apr 22, 2010, at 10:08 AM, Rainer Keller wrote: Hello Oliver, thanks for the update. Just my $0.02: the upcoming Open MPI v1.5 will warn users, if their session directory is on NFS (or Lustre). ... or panfs :-) Samuel K. Gutierrez Best regards, Rainer On Thursday 22 April 2010

Re: [OMPI devel] Open MPI v1.3.4rc4 is out

2009-11-05 Thread Samuel K. Gutierrez
That's interesting... Works great now that carto is built. Why is carto now required? -- Samuel K. Gutierrez Los Alamos National Laboratory On Nov 5, 2009, at 4:11 PM, David Gunter wrote: Oh, good catch. I'm not sure who updates the platform files or who would have added

Re: [OMPI devel] Open MPI v1.3.4rc4 is out

2009-11-05 Thread Samuel K. Gutierrez
/lib64 I'll send the build log shortly. Thanks! -- Samuel K. Gutierrez Los Alamos National Laboratory On Nov 5, 2009, at 3:07 PM, Jeff Squyres wrote: How did you build? I see one carto component named "auto_detect" in the 1.3.4 source tree, but I don't see it in your omp

Re: [OMPI devel] Open MPI v1.3.4rc4 is out

2009-11-04 Thread Samuel K. Gutierrez
line 77 [rra011a.rr.lanl.gov:31601] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file orterun.c at line 541 This may be an issue on our end regarding a runtime parameter that isn't set correctly. See attached. Please let me know if you need any more info. Thanks! -- Samuel K. Gutierrez Los Al

Re: [OMPI devel] MPIR_Breakpoint visibility

2009-09-21 Thread Samuel K. Gutierrez
Hi Jeff, Sorry about the ambiguity. I just had another conversation with our TotalView person and the problem -seems- to be unrelated to OMPI. Guess I jumped the gun... Thanks, Samuel K. Gutierrez On Sep 21, 2009, at 8:58 AM, Jeff Squyres wrote: Can you more precisely define "

Re: [OMPI devel] MPIR_Breakpoint visibility

2009-09-21 Thread Samuel K. Gutierrez
Hi, According to our TotalView person, PGI and Intel versions of OMPI 1.3.3 are not working properly. She noted that 1.2.8 and 1.3.2 work fine. Thanks, Samuel K. Gutierrez On Sep 21, 2009, at 7:19 AM, Terry Dontje wrote: Ralph Castain wrote: I see it declared "extern" in

Re: [OMPI devel] RFC: convert send to ssend

2009-08-24 Thread Samuel K. Gutierrez
Hi Ashley, My understanding is that this behavior would not be enabled by default in the standard debug build. The "always convert to synchronous sends" mode would be an additional configure-time option. Samuel K. Gutierrez Ashley Pittman wrote: On Mon, 2009-08-24 at 13:27 -

Re: [OMPI devel] RFC: convert send to ssend

2009-08-24 Thread Samuel K. Gutierrez
Hi Jeff, Sounds good to me. Samuel K. Gutierrez Jeff Squyres wrote: The debug builds already have quite a bit of performance overhead. It might be desirable to change this RFC to have a similar tri-state as the MPI parameter checking: - compiled out - compiled in, always check - compiled

Re: [OMPI devel] RFC: convert send to ssend

2009-08-23 Thread Samuel K. Gutierrez
Hi all, How about exposing this functionality as a run-time parameter that is only available in debug builds? This will make debugging easier and won't impact the performance of optimized builds. Just an idea... Samuel K. Gutierrez > > - "Jeff Squyres" wrote: >

Re: [OMPI devel] Enabling debugging and profiling in openMPI (make "CFLAGS=-pg -g")

2009-06-12 Thread Samuel K. Gutierrez
Hi, Let me begin by stating that I'm at most an Open MPI novice - but you may want to try the addition of the --enable-debug configure option. That is, for example: ./configure --enable-debug; make Hope this helps. Samuel K. Gutierrez On Jun 12, 2009, at 3:27 AM, Leo P. wrote:

Re: [OMPI devel] OMPI 1.3 - PERUSE peruse_comm_spec_t peer Negative Value

2009-03-25 Thread Samuel K. Gutierrez
modifications were needed on my end. That being said, my peruse.h does not include ompi_config.h, only mpi.h. Thanks again, Samuel K. Gutierrez > > Regards, > Kiril > > > On Mon, 2009-03-23 at 16:34 -0400, George Bosilca wrote: >> You are absolutely right, the peer shou

Re: [OMPI devel] OMPI 1.3 - PERUSE peruse_comm_spec_t peer Negative Value

2009-03-23 Thread Samuel K. Gutierrez
Hi Kiril, Appreciate the quick response. > Hi Samuel, > > On Sat, 21 Mar 2009 18:18:54 -0600 (MDT) > "Samuel K. Gutierrez" wrote: >> Hi All, >> >> I'm writing a simple profiling library which utilizes >>PERUSE. My callback > > So am I

[OMPI devel] OMPI 1.3 - PERUSE peruse_comm_spec_t peer Negative Value

2009-03-21 Thread Samuel K. Gutierrez
if (spec->peer == rank) { return MPI_SUCCESS; } rrCounts[spec->peer]++; return MPI_SUCCESS; } Any insight is greatly appreciated. Thanks, Samuel K. Gutierrez