Re: [OMPI devel] shmem error msg
Hi Ralph,

On Jul 25, 2011, at 11:05 AM, Ralph Castain wrote:

> On Jul 25, 2011, at 10:16 AM, Samuel K. Gutierrez wrote:
>
>> Hi Ralph,
>>
>> It seems as if this issue is related to a missing shm_unlink wrapper within Valgrind. I'm going to disable posix by default and commit later today.
>
> Is that the right solution? No, not really. If the problem is something in Valgrind, then let's not disable something just for their problem. Is there a way we can wrap it ourselves so the error doesn't cause the message?

I think so. They outline the procedure in README_MISSING_SYSCALL_OR_IOCTL, so I'll take a look.

Stay tuned,

Sam

> Like I said, everything worked just fine - the message just implied the proc would die, and it doesn't.

Thanks,

--
Samuel K. Gutierrez
Los Alamos National Laboratory

On Jul 23, 2011, at 8:54 PM, Samuel K. Gutierrez wrote:

> Hi Ralph,
>
> That's mine - I'll take a look.
>
> Thanks,
> Sam
>
>> Whenever I run valgrind on orterun (or any OMPI tool), I get the following error message:
>>
>> --
>> A system call failed during shared memory initialization that should
>> not have. It is likely that your MPI job will now either abort or
>> experience performance degradation.
>>
>>   Local host:  Ralph
>>   System call: shm_unlink(2)
>>   Error:       Function not implemented (errno 78)
>> --
>>
>> It's coming out of open-rte/help-opal-shmem-posix.txt.
>>
>> Everything continues, so I'm not sure what this is all about. Anyone recognize this?
>>
>> It's on the trunk, running on a Mac, vanilla configure.
>>
>> Ralph
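[Editor's note: until a Valgrind wrapper exists, one way to keep the help message from firing is to treat an unimplemented shm_unlink() as non-fatal. A minimal sketch, assuming the component can safely skip the unlink; the function name and error policy are illustrative, not the actual OMPI code:

#include <errno.h>
#include <sys/mman.h>

/* Illustrative sketch only: tolerate an unimplemented shm_unlink(),
 * e.g. when a tool such as Valgrind lacks the syscall wrapper, so the
 * segment keeps working and no alarming help message is emitted. */
static int posix_shmem_unlink(const char *seg_name)
{
    if (0 != shm_unlink(seg_name)) {
        if (ENOSYS == errno) {
            return 0;   /* missing wrapper/syscall: harmless, stay quiet */
        }
        return -1;      /* a real failure: let the caller show help text */
    }
    return 0;
}

End of note.]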
Re: [OMPI devel] shmem error msg
Hi Ralph,

That's mine - I'll take a look.

Thanks,
Sam

> Whenever I run valgrind on orterun (or any OMPI tool), I get the following error msg:
>
> --
> A system call failed during shared memory initialization that should
> not have. It is likely that your MPI job will now either abort or
> experience performance degradation.
>
>   Local host:  Ralph
>   System call: shm_unlink(2)
>   Error:       Function not implemented (errno 78)
> --
>
> It's coming out of open-rte/help-opal-shmem-posix.txt.
>
> Everything continues, so I'm not sure what this is all about. Anyone recognize this???
>
> It's on the trunk, running on a Mac, vanilla configure.
> Ralph
Re: [OMPI devel] RFC: Bring in Shared Memory Backing Facility Framework (shmem)
In r24795.

Thanks,

--
Samuel K. Gutierrez
Los Alamos National Laboratory

On Jun 15, 2011, at 10:01 AM, Samuel K. Gutierrez wrote:

[...]
[OMPI devel] RFC: Bring in Shared Memory Backing Facility Framework (shmem)
WHAT: Bring in new shared memory backing facility framework (shmem) and its components. shmem is simply a framework for the manipulation of shared memory segments (create, attach, detach, unlink, etc.).

WHY: The use of shared memory is probably going to start poking up in other parts of Open MPI, so this simply provides the needed infrastructure to facilitate that work.

WHERE: See: https://bitbucket.org/samuelkgutierrez/orte_shmem

Additions:
  opal/mca/shmem

Other Modifications:
  M opal/runtime/opal_init.c
  M opal/runtime/opal_params.c
  M opal/runtime/opal_finalize.c
  M ompi/tools/ompi_info/ompi_info.c
  M ompi/tools/ompi_info/components.c
  M ompi/mca/btl/sm/btl_sm_component.c
  M ompi/mca/mpool/sm/mpool_sm_module.c
  ! ompi/mca/common/sm/common_sm_mmap.c
  M ompi/mca/common/sm/common_sm_rml.c
  ! ompi/mca/common/sm/common_sm_windows.c
  ! ompi/mca/common/sm/common_sm_mmap.h
  M ompi/mca/common/sm/common_sm_rml.h
  ! ompi/mca/common/sm/common_sm_windows.h
  ! ompi/mca/common/sm/common_sm_posix.c
  ! ompi/mca/common/sm/common_sm_sysv.c
  M ompi/mca/common/sm/help-mpi-common-sm.txt
  ! ompi/mca/common/sm/common_sm_posix.h
  M ompi/mca/common/sm/configure.m4
  ! ompi/mca/common/sm/common_sm_sysv.h
  M ompi/mca/common/sm/common_sm.c
  M ompi/mca/common/sm/Makefile.am
  M ompi/mca/common/sm/common_sm.h
  M ompi/mca/coll/sm/coll_sm_component.c
  M ompi/mca/coll/sm/coll_sm_module.c
  M orte/mca/odls/base/odls_base_default_fns.c
  M orte/tools/orte-info/orte-info.c
  M orte/tools/orte-info/components.c

WHEN: Before 1.7.

TIMEOUT: Teleconference, Tues 21 June 2011

Thanks!

--
Samuel K. Gutierrez
Los Alamos National Laboratory
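[Editor's note: for readers skimming the RFC, a backing-facility framework like this generally reduces to a module with create/attach/detach/unlink entry points. A rough sketch of what such an interface could look like - names are illustrative only; see the bitbucket repo above for the real opal/mca/shmem API:

#include <stddef.h>

/* Illustrative sketch of a shmem backing-facility module interface;
 * the real opal/mca/shmem structs and names may differ. */
typedef struct shmem_seg {
    char   name[256];  /* backing name (file path, POSIX name, sysv key) */
    size_t size;       /* size of the segment in bytes */
    void  *base;       /* local mapping, valid after attach */
} shmem_seg_t;

typedef struct shmem_module {
    int   (*create)(shmem_seg_t *seg, const char *name, size_t size);
    void *(*attach)(shmem_seg_t *seg);
    int   (*detach)(shmem_seg_t *seg);
    int   (*unlink)(shmem_seg_t *seg);
} shmem_module_t;

Each facility (mmap, posix, sysv, windows) would then supply one such module, and the framework picks among them at run time. End of note.]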
Re: [OMPI devel] 1.4.4rc2 is up
Here is the 'pgCC -V' output from the versions that I have access to.

$ pgCC -V

pgCC 7.1-6 64-bit target on x86-64 Linux -tp gh-64
Copyright 1989-2000, The Portland Group, Inc.  All Rights Reserved.
Copyright 2000-2007, STMicroelectronics, Inc.  All Rights Reserved.

$ pgCC -V

pgCC 9.0-3 64-bit target on x86-64 Linux -tp gh-64
Copyright 1989-2000, The Portland Group, Inc.  All Rights Reserved.
Copyright 2000-2009, STMicroelectronics, Inc.  All Rights Reserved.

$ pgCC -V

pgCC 10.3-0 64-bit target on x86-64 Linux -tp istanbul-64
Copyright 1989-2000, The Portland Group, Inc.  All Rights Reserved.
Copyright 2000-2010, STMicroelectronics, Inc.  All Rights Reserved.

--
Samuel Gutierrez
Los Alamos National Laboratory

On May 18, 2011, at 12:34 PM, Paul H. Hargrove wrote:

> Below is a sampling of "pgCC -V" outputs in response to Jeff's question. The complete output looks like:
>
> $ pgCC -V
>
> pgCC 11.1-0 64-bit target on x86-64 Linux -tp nehalem
> Copyright 1989-2000, The Portland Group, Inc.  All Rights Reserved.
> Copyright 2000-2011, STMicroelectronics, Inc.  All Rights Reserved.
>
> including the initial blank line.
>
> Here is the "important" line for a range of versions I can currently access:
>
> pgCC 7.2-5 64-bit target on x86-64 Linux -tp gh-64
> pgCC 8.0-6 64-bit target on x86-64 Linux -tp gh-64
> pgCC 9.0-3 64-bit target on x86-64 Linux -tp nehalem-64
> pgCC 10.8-0 64-bit target on x86-64 Linux -tp nehalem-64
> pgCC 11.1-0 64-bit target on x86-64 Linux -tp nehalem
>
> I am afraid my system w/ 5.x and 6.x versions was retired last month (not joking). However, I found the following output for the C (not C++) compiler in my bug database:
>
> pgcc 6.0-8 32-bit target on x86-64 Linux
>
> And for their MacOSX port, there is a wrinkle. As anybody who has dealt w/ mpicc vs mpiCC knows, Apple's filesystem is case PRESERVING but case-insensitive. So, there, PGI's C++ compiler is "pgcpp" and the -V output (also from my bug database) looks like:
>
> pgcpp 7.1-5 64-bit target on Apple OS/X
>
> -Paul
>
> On 5/18/2011 5:50 AM, Jeff Squyres wrote:
>> (adding libtool-patc...@gnu.org)
>>
>> Is this guaranteed to work for all versions of the PGI compiler? I.e., does "pgCC -V" always return something in the form of (digit)+\. ?
>
> --
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> HPC Research Department            Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory  Fax: +1-510-486-6900
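[Editor's note: the assumption being probed - that the second whitespace-separated token of the version line is always (digit)+\.(digit)+ - can be checked with a trivial parser. Libtool itself does this with sed/m4; this hypothetical C sketch just exercises the same pattern against the sampled outputs above:

#include <stdio.h>

int main(void)
{
    const char *line = "pgCC 11.1-0 64-bit target on x86-64 Linux -tp nehalem";
    int major, minor;

    /* skip the compiler name token, then read "major.minor" */
    if (2 == sscanf(line, "%*s %d.%d", &major, &minor)) {
        printf("PGI version %d.%d\n", major, minor);  /* prints 11.1 */
        return 0;
    }
    fprintf(stderr, "unexpected -V format\n");
    return 1;
}

The same parse works for the "pgcc" and "pgcpp" variants sampled above. End of note.]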
Re: [OMPI devel] Too many open files (24)
Hi Tim,

Great news! Happy calculating :-).

--
Samuel K. Gutierrez
Los Alamos National Laboratory

> Dear Samuel,
>
> Just as you replied I was trying that on the compute nodes. Surprise, surprise... the value returned as the hard and soft limits is 1024.
>
> Thanks for confirming my suspicions...
>
> Regards,
>
> Tim.
>
> On Mar 30, 2011, at 7:41 PM, Samuel K. Gutierrez wrote:
>
> [...]
>
> Tim Stitt PhD (User Support Manager).
> Center for Research Computing | University of Notre Dame |
> P.O. Box 539, Notre Dame, IN 46556 | Phone: 574-631-5287 | Email: tst...@nd.edu
Re: [OMPI devel] Too many open files (24)
Hi,

It sounds like Open MPI is hitting your system's open file descriptor limit. If that's the case, one potential workaround is to have your system administrator raise the file descriptor limits.

On a compute node, what does "ulimit -a" show (using bash)?

Hope that helps,

--
Samuel K. Gutierrez
Los Alamos National Laboratory

On Mar 30, 2011, at 5:22 PM, Timothy Stitt wrote:

Dear Open MPI developers,

One of our users was running a benchmark on a 1032-core simulation. He had a successful run at 900 cores, but when he stepped up to 1032 cores the job just stalled, and his logs contained many occurrences of the following line:

[d6copt368.crc.nd.edu][[25621,1],0][btl_tcp_component.c:885:mca_btl_tcp_component_accept_handler] accept() failed: Too many open files (24)

The simulation has a single master task that communicates with all the other tasks to write out some I/O via the master. We are assuming the message is related to this bottleneck. Is there a 1024 limit on the number of open files/connections, for instance?

Can anyone confirm the meaning of this error and, secondly, provide a resolution that hopefully doesn't involve a code rewrite?

Thanks in advance,

Tim.

Tim Stitt PhD (User Support Manager).
Center for Research Computing | University of Notre Dame |
P.O. Box 539, Notre Dame, IN 46556 | Phone: 574-631-5287 | Email: tst...@nd.edu
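[Editor's note: the limit in question can also be inspected - and raised up to the hard cap - from inside a process. A minimal sketch using getrlimit/setrlimit:

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;

    /* query the open-file-descriptor limit the accept() failure hits */
    if (0 != getrlimit(RLIMIT_NOFILE, &rl)) {
        perror("getrlimit");
        return 1;
    }
    printf("soft: %llu  hard: %llu\n",
           (unsigned long long)rl.rlim_cur,
           (unsigned long long)rl.rlim_max);

    /* raise the soft limit to the hard cap; going past the hard cap
     * requires administrator intervention (e.g., limits.conf) */
    rl.rlim_cur = rl.rlim_max;
    if (0 != setrlimit(RLIMIT_NOFILE, &rl)) {
        perror("setrlimit");
        return 1;
    }
    return 0;
}

End of note.]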
Re: [OMPI devel] Threading
Same here.

--
Samuel K. Gutierrez
Los Alamos National Laboratory

> On Oct 11, 2010, at 11:41 PM, Ralph Castain wrote:
>
>> Does anyone know of a reason why mpirun can -not- be threaded, assuming that all threads block and do not continuously chew CPU? Is there an environment where this would cause a problem?
>
> We don't have any machines at Sandia where I could see this being a problem.
>
> Brian
>
> --
> Brian W. Barrett
> Dept. 1423: Scalable System Software
> Sandia National Laboratories
Re: [OMPI devel] Question regarding recently common shared-memory component
Hi,

Just to be clear - do you see similar checkpoint performance differences in 1.5rc6 and 1.4.2 with and without shared memory enabled?

Thanks,

--
Samuel K. Gutierrez
Los Alamos National Laboratory

On Sep 21, 2010, at 9:35 AM, <ananda.mu...@wipro.com> wrote:

Hello Samuel,

This problem seems to be resolved after I moved to r23781. However, I see another discrepancy in checkpoint image creation time when I disable shared memory (--mca btl self,tcp,openib) vs. using it. I mean, the time to create a checkpoint image for this simple program is about 0.4 seconds if I disable shared memory, while it is close to 6.5 seconds when I use the shared memory component. I have not seen this behavior earlier. Do I have to tune any other parameter to reduce the time?

Thanks,
Ananda

Hi Ananda,

This issue should be resolved in r23781. Please let me know if it is not.

Thanks!

--
Samuel K. Gutierrez
Los Alamos National Laboratory

On Sep 20, 2010, at 11:26 AM, <ananda.mudar_at_[hidden]> wrote:

[...]
Re: [OMPI devel] Question regarding recently common shared-memory component
Hi Ananda,

This issue should be resolved in r23781. Please let me know if it is not.

Thanks!

--
Samuel K. Gutierrez
Los Alamos National Laboratory

On Sep 20, 2010, at 11:26 AM, <ananda.mu...@wipro.com> wrote:

I have used the following options to build:

./configure CC=/usr/bin/gcc CXX=/usr/bin/c++ F77=/usr/bin/gfortran FC=/usr/bin/gfortran --prefix /users/amudar/openmpi-1.7 --with-tm=/usr/local/pbs --with-openib --with-threads=posix --enable-mpi-thread-multiple --enable-ft-thread --enable-debug --with-ft=cr --with-blcr=/usr/blcr --with-blcr-libdir=/usr/blcr/lib

Also please note that this is with an r23756 build.

Let me know if you need any other information.

Thanks
Ananda

Let me take a look at it. How did you configure your build?

Thanks,

--
Samuel K. Gutierrez
Los Alamos National Laboratory

On Sep 20, 2010, at 10:14 AM, <ananda.mudar_at_[hidden]> wrote:

[...]
Re: [OMPI devel] Question regarding recently common shared-memory component
Let me take a look at it. How did you configure your build?

Thanks,

--
Samuel K. Gutierrez
Los Alamos National Laboratory

On Sep 20, 2010, at 10:14 AM, <ananda.mu...@wipro.com> wrote:

Hi,

I believe the new common shared memory component was committed to the trunk sometime towards the later part of August. I had not tried this trunk version until last week, and I have seen some discrepancy with this component, specifically related to checkpoint functionality. I am not able to checkpoint any program with the latest trunk version. Am I missing something here? Should I be using any other options to enable checkpoint functionality for the shared memory component?

However, if I disable the shared memory component and use only self, tcp, and openib (--mca btl self,tcp,openib), I can checkpoint successfully!!

Following are the options I have used with mpirun:

mpirun -am ft-enable-cr --mca opal_cr_enable_timer 1 --mca sstore_stage_global_is_shared 1 --mca sstore_base_global_snapshot_dir /scratch/hpl005/UIT_test/amudar/FWI --mca mpi_paffinity_alone 1 -np 32 -hostfile hostfile-32 ../hellompi

Please note that hellompi is a very simple program without any collective calls. When I issue a checkpoint, this program fails with the following messages:

[hplcnlj158:13937] Signal: Segmentation fault (11)
[hplcnlj158:13937] Signal code: Address not mapped (1)
[hplcnlj158:13937] Failing at address: 0x2aaa0001
[hplcnlj158:13937] [ 0] /lib64/libpthread.so.0 [0x2b4019a064c0]
[hplcnlj158:13937] [ 1] /users/amudar/openmpi-1.7/lib/libmca_common_sm.so.0(mca_common_sm_param_register+0x262) [0x2d96628a]
[hplcnlj158:13937] [ 2] /users/amudar/openmpi-1.7/lib/openmpi/mca_btl_sm.so [0x2f0a55e8]
[hplcnlj158:13937] [ 3] /users/amudar/openmpi-1.7/lib/libmpi.so.0 [0x2b4018c3c11b]
[hplcnlj158:13937] [ 4] /users/amudar/openmpi-1.7/lib/libmpi.so.0(mca_base_components_open+0x3ef) [0x2b4018c3b70b]
[hplcnlj158:13937] [ 5] /users/amudar/openmpi-1.7/lib/libmpi.so.0(mca_btl_base_open+0xfd) [0x2b4018b620fe]
[hplcnlj158:13937] [ 6] /users/amudar/openmpi-1.7/lib/openmpi/mca_bml_r2.so [0x2dd9e4fb]
[hplcnlj158:13937] [ 7] /users/amudar/openmpi-1.7/lib/openmpi/mca_pml_ob1.so [0x2e5fa429]
[hplcnlj158:13937] [ 8] /users/amudar/openmpi-1.7/lib/openmpi/mca_pml_crcpw.so [0x2dfadce6]
[hplcnlj158:13937] [ 9] /users/amudar/openmpi-1.7/lib/libmpi.so.0 [0x2b4018b01a0d]
[hplcnlj158:13937] [10] /users/amudar/openmpi-1.7/lib/libmpi.so.0(ompi_cr_coord+0xc0) [0x2b4018b017ba]
[hplcnlj158:13937] [11] /users/amudar/openmpi-1.7/lib/libmpi.so.0(opal_cr_inc_core_recover+0xed) [0x2b4018c0efab]
[hplcnlj158:13937] [12] /users/amudar/openmpi-1.7/lib/openmpi/mca_snapc_full.so [0x2bd280fc]
[hplcnlj158:13937] [13] /users/amudar/openmpi-1.7/lib/libmpi.so.0(opal_cr_test_if_checkpoint_ready+0x11b) [0x2b4018c0ecd3]
[hplcnlj158:13937] [14] /users/amudar/openmpi-1.7/lib/libmpi.so.0 [0x2b4018c0f6e7]
[hplcnlj158:13937] [15] /lib64/libpthread.so.0 [0x2b40199fe367]
[hplcnlj158:13937] [16] /lib64/libc.so.6(clone+0x6d) [0x2b4019ce5f7d]
[hplcnlj158:13937] *** End of error message ***
[hplcnlj161:00637] *** Process received signal ***
[hplcnlj161:00637] Signal: Segmentation fault (11)
[hplcnlj161:00637] Signal code: Address not mapped (1)
[hplcnlj161:00637] Failing at address: 0x2aaa0001
[hplcnlj161:00649] *** Process received signal ***
[hplcnlj161:00649] Signal: Segmentation fault (11)
[hplcnlj161:00649] Signal code: Address not mapped (1)
[hplcnlj161:00649] Failing at address: 0x2aaa0001
/users/amudar/Fix_for_pidinuse/cr_restart: line 5: 14012 Segmentation fault /usr/blcr/bin/cr_restart --no-restore-pid "$@"
[hplcnlj161:00643] *** Process received signal ***
[hplcnlj161:00643] Signal: Segmentation fault (11)
[hplcnlj161:00643] Signal code: Address not mapped (1)
[hplcnlj161:00643] Failing at address: 0x2aaa0001
[hplcnlj161:00640] *** Process received signal ***
[hplcnlj161:00640] Signal: Segmentation fault (11)
[hplcnlj161:00640] Signal code: Address not mapped (1)
[hplcnlj161:00640] Failing at address: 0x2aaa0001
[hplcnlj161:00636] *** Process received signal ***
[hplcnlj161:00652] *** Process received signal ***
[hplcnlj161:00652] Signal: Segmentation fault (11)
[hplcnlj161:00652] Signal code: Address not mapped (1)
[hplcnlj161:00652] Failing at address: 0x2aaa0001
[hplcnlj161:00636] Signal: Segmentation fault (11)
[hplcnlj161:00636] Signal code: Address not mapped (1)
[hplcnlj161:00636] Failing at address: 0x2aaa0001
[hplcnlj161:00637] [ 0] /lib64/libpthread.so.0 [0x2b86c74694c0]
[hplcnlj161:00637] [ 1] /users/amudar/openmpi-1.7/lib/libmca_common_sm.so.0(mca_common_sm_param_register+0x262) [0x2d96628a]
[hplcnlj161:00637] [ 2] /users/amudar/openmpi-1.7/lib/openmpi/mca_btl_sm.so [0x2f0a55e8]
[hplcnlj161:00637] [ 3] /users/amudar/openmpi-1.7/lib/libmpi.so.0 [0x2b86c669f11b]
[hplcnlj161:00637] [ 4] /users/
Re: [OMPI devel] common_sm_mmap.c: wrong args to orte_show_help() (1.5rc5 and 1.4.3rc1)
Will do.

Sam

On Aug 26, 2010, at 2:08 PM, Jeff Squyres wrote:

I think Sam already submitted CMRs for 1.5: https://svn.open-mpi.org/trac/ompi/ticket/2545

Sam -- can you construct an equivalent for v1.4 and CC Paul so that he knows not to follow up on it?

Thanks!

On Aug 26, 2010, at 3:56 PM, Paul H. Hargrove wrote:

The warnings below have appeared in some of my other testing results. However, I now know what they correspond to. In both 1.5rc5 and 1.4.3rc1 there are two calls to orte_show_help() that are passing orte_process_info.nodename as the third argument, where a _Bool is expected. It looks to me as if the third argument is actually just missing from these two calls.

-Paul

For 1.4.3rc1:

"../../../../../ompi/mca/common/sm/common_sm_mmap.c", line 111.41: 1506-280 (W) Function argument assignment between types "_Bool" and "char*" is not allowed.
"../../../../../ompi/mca/common/sm/common_sm_mmap.c", line 136.45: 1506-280 (W) Function argument assignment between types "_Bool" and "char*" is not allowed.

For 1.5rc5:

"../../../../../ompi/mca/common/sm/common_sm_mmap.c", line 110.41: 1506-280 (W) Function argument assignment between types "_Bool" and "char*" is not allowed.
"../../../../../ompi/mca/common/sm/common_sm_mmap.c", line 135.45: 1506-280 (W) Function argument assignment between types "_Bool" and "char*" is not allowed.

--
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
HPC Research Department            Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory  Fax: +1-510-486-6900
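[Editor's note: for context, orte_show_help() follows the opal_show_help() signature shape - (filename, topic, bool want_error_header, ...) - so the warnings say the bool was omitted and the nodename slid into its slot. A hedged, self-contained mock illustrating the shape of the bug and fix; the topic string and helper are placeholders, not the real OMPI symbols:

#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* mock with the same signature shape as orte_show_help() */
static int show_help(const char *file, const char *topic,
                     bool want_error_header, ...)
{
    va_list ap;
    va_start(ap, want_error_header);
    printf("%s:%s%s: %s\n", file, topic,
           want_error_header ? " [error]" : "",
           va_arg(ap, const char *));
    va_end(ap);
    return 0;
}

int main(void)
{
    const char *nodename = "node001";

    /* broken shape: nodename would be coerced into the bool slot
     * show_help("help-mpi-common-sm.txt", "sys call fail", nodename); */

    /* fixed shape: supply the missing want_error_header flag */
    show_help("help-mpi-common-sm.txt", "sys call fail", true, nodename);
    return 0;
}

End of note.]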
Re: [OMPI devel] Trunk Commit Heads-up: New Common Shared Memory Component
Sorry, I should have included the link containing the discussion of the plot.

http://www.open-mpi.org/community/lists/devel/2010/06/8078.php

--
Samuel K. Gutierrez
Los Alamos National Laboratory

On Aug 12, 2010, at 11:20 AM, Terry Dontje wrote:

Sorry Rich, I didn't realize there was a graph attached at the end of the message. In other words, my comments are not applicable because I really didn't know you were asking about the graph. I agree it would be nice to know what the graph is plotting.

--td

Terry Dontje wrote:

Graham, Richard L. wrote:

Stupid question: what is being plotted, and what are the units?

Rich

MB of resident and shared memory as gotten from top (on Linux). The values for each of the process counts run seem to be the same between sysv, posix and mmap.

--td

On 8/11/10 3:15 PM, "Samuel K. Gutierrez" <sam...@lanl.gov> wrote:

[...]
Re: [OMPI devel] Trunk Commit Heads-up: New Common Shared Memory Component
Hi Terry,

One more thing... Before testing on Solaris 10, could you please update? (I just committed a Solaris 10 fix.)

Thanks,

--
Samuel K. Gutierrez
Los Alamos National Laboratory

On Aug 11, 2010, at 1:15 PM, Samuel K. Gutierrez wrote:

[...]
Re: [OMPI devel] Trunk Commit Heads-up: New Common Shared Memory Component
Hi Terry,

On Aug 11, 2010, at 12:34 PM, Terry Dontje wrote:

I've done some minor testing on Linux looking at resident and shared memory sizes for np=4, 8 and 16 jobs. I could not see any appreciable differences in process sizes between sysv, posix or mmap usage in the SM BTL. So I am still somewhat non-plussed about making this the default. It seems like the biggest gain out of using posix might be that one doesn't need to worry about the location of the backing file. This seems kind of frivolous to me since there is a warning that happens if the backing file is not memory based.

If I'm not mistaken, the warning is only issued if the backing file is stored on one of the following file systems: Lustre, NFS, Panasas, or GPFS (see opal_path_nfs in opal/util/path.c). Based on the performance numbers that Sylvain provided on June 9th of this year (see attached), there was a performance difference between mmap on /tmp and mmap on a tmpfs-like file system (/dev/shm in that particular case). Using the new POSIX component provides us with a portable way to get similar shared memory performance gains without having to worry about where the OMPI session directory is rooted.

--
Samuel K. Gutierrez
Los Alamos National Laboratory

I'm still working on testing the code on Solaris, but I don't imagine I will see anything that will change my mind.

--td

Samuel K. Gutierrez wrote:

[...]

--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.650.633.7054
Oracle - Performance Technologies
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com
Re: [OMPI devel] Trunk Commit Heads-up: New Common Shared Memory Component
Hi Rich,

It's a modification to the existing common sm component. The modifications do include the addition of a new POSIX shared memory facility, however.

Sam

On Aug 11, 2010, at 10:05 AM, Graham, Richard L. wrote:

Is this a modification of the existing component, or a new component?

Rich

On 8/10/10 10:52 AM, "Samuel K. Gutierrez" <sam...@lanl.gov> wrote:

[...]
[OMPI devel] Trunk Commit Heads-up: New Common Shared Memory Component
Hi,

I wanted to give everyone a heads-up about a new POSIX shared memory component that has been in the works for a while now and is ready to be pushed into the trunk.

http://bitbucket.org/samuelkgutierrez/ompi_posix_sm_new

Some highlights:

o New posix component is now the default.
o May address some of the shared memory performance issues users encounter when OMPI's session directories are inadvertently placed on a non-local filesystem.
o Silent component failover. In the default case, if the posix component fails initialization, mmap will be selected.
o The sysv component will only be queried for selection if it is placed before the mmap component (for example, -mca mpi_common_sm sysv,posix,mmap). In the default case, sysv will never be queried/selected.
o Per some on-list discussion, now unlinking the mmaped file in both the mmap and posix components (see the "System V Shared Memory for Open MPI: Request for Community Input and Testing" thread).
o Assuming local process homogeneity with respect to all utilized shared memory facilities. That is, if one local process deems a particular shared memory facility acceptable, then ALL local processes should be able to utilize that facility. As it stands, this is an important point because one process dictates to all other local processes which common sm component will be selected, based on its own local run-time test.
o Addressed some of George's code reuse concerns.

If there are no major objections by August 17th, I'll commit the code after the Tuesday morning conference call.

Thanks!

--
Samuel K. Gutierrez
Los Alamos National Laboratory
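[Editor's note: for background, the POSIX facility allocates segments in a kernel-managed namespace (typically tmpfs-backed on Linux) rather than under the session directory, which is why placement stops mattering. A minimal create/map/unlink sketch - the segment name is illustrative, and older Linux needs -lrt:

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    const char *name = "/ompi_example_seg";   /* illustrative name only */
    const size_t size = 4096;

    int fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0) { perror("shm_open"); return 1; }
    if (0 != ftruncate(fd, size)) { perror("ftruncate"); return 1; }

    void *base = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (MAP_FAILED == base) { perror("mmap"); return 1; }
    close(fd);            /* the mapping keeps the segment alive */

    shm_unlink(name);     /* per the highlight above: unlink early so the
                             name is reclaimed even on abnormal exit */
    /* ... use base ... */
    munmap(base, size);
    return 0;
}

End of note.]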
Re: [OMPI devel] RFC: System V Shared Memory for Open MPI
On Jun 2, 2010, at 11:58 AM, Samuel K. Gutierrez wrote:

Good point - I forgot about that.

--
Samuel K. Gutierrez
Los Alamos National Laboratory

On Jun 2, 2010, at 11:40 AM, Jeff Squyres wrote:

Don't forget that the RML is also used to broadcast the success/failure of the creation of the shared memory segment. If the RML goes away, be sure that you can still determine that without hanging. Personally, I still don't see the problem with using the RML stuff...

On Jun 2, 2010, at 1:07 PM, Samuel K. Gutierrez wrote:

Hi George,

That may work - I'll try it. Thanks!

--
Samuel K. Gutierrez
Los Alamos National Laboratory

On Jun 2, 2010, at 10:59 AM, George Bosilca wrote:

How about ftok? The init function takes a file_name as argument, and this file name is unique per instance of the shared memory region we want to create. We can use this file name with ftok to create a unique key_t that can be used by shmget to retrieve the shared memory identifier.

george.

Hi George,

I think ftok brings us back to the atomic file creation problem. In particular, ftok requires that pathname be an existing file. As it stands, this file is created by the common sm module.

--
Samuel K. Gutierrez
Los Alamos National Laboratory

On Jun 2, 2010, at 11:53, Samuel K. Gutierrez wrote:

[...]
Re: [OMPI devel] RFC: System V Shared Memory for Open MPI
Good point - I forgot about that.

--
Samuel K. Gutierrez
Los Alamos National Laboratory

On Jun 2, 2010, at 11:40 AM, Jeff Squyres wrote:

Don't forget that the RML is also used to broadcast the success/failure of the creation of the shared memory segment. If the RML goes away, be sure that you can still determine that without hanging. Personally, I still don't see the problem with using the RML stuff...

On Jun 2, 2010, at 1:07 PM, Samuel K. Gutierrez wrote:

[...]
Re: [OMPI devel] RFC: System V Shared Memory for Open MPI
Hi George,

That may work - I'll try it.

Thanks!

--
Samuel K. Gutierrez
Los Alamos National Laboratory

On Jun 2, 2010, at 10:59 AM, George Bosilca wrote:

How about ftok? The init function takes a file_name as argument, and this file name is unique per instance of the shared memory region we want to create. We can use this file name with ftok to create a unique key_t that can be used by shmget to retrieve the shared memory identifier.

george.

On Jun 2, 2010, at 11:53, Samuel K. Gutierrez wrote:

On Jun 2, 2010, at 8:49 AM, Jeff Squyres wrote:

On Jun 2, 2010, at 10:44 AM, George Bosilca wrote:

Not sure what you mean here. common/sm may create new shmem segments at any time (e.g., during coll sm). The RML message exchange is to ensure that only 1 process creates and initializes the segment and then all the others just attach to it.

Absolutely not! The RML messaging is not about initializing the shared memory segment. As stated in my original text, it has only one purpose: to ensure the file used by mmap is created atomically. The code for Windows does not exchange any RML messages, as the function to allocate the shared memory region provided by the OS is atomic (exactly as the sysv one).

I thought that Sam said that it was important that only 1 process shmctl/IPC_RMID...?

Hi George,

We are using RML messaging in the sysv code to exchange the shared memory ID (generated by exactly one process). I'm not sure how we would go about passing along the shared memory ID without RML, but any ideas are greatly appreciated.

Thanks,

--
Samuel K. Gutierrez
Los Alamos National Laboratory
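[Editor's note: a hedged sketch of George's ftok idea - every local process derives the same key_t from the shared backing-file name, so the shmid need not be exchanged over RML. The path is illustrative, and note that ftok requires the file to already exist, which is the atomic-creation wrinkle Sam raises above:

#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(void)
{
    const char *backing = "/tmp/ompi_session_example";  /* must exist */

    key_t key = ftok(backing, 1 /* arbitrary project id */);
    if ((key_t)-1 == key) { perror("ftok"); return 1; }

    /* every local process computes the same key, so they all reach
     * the same segment without exchanging the shmid */
    int shmid = shmget(key, 4096, IPC_CREAT | 0600);
    if (shmid < 0) { perror("shmget"); return 1; }

    void *base = shmat(shmid, NULL, 0);
    if ((void *)-1 == base) { perror("shmat"); return 1; }

    printf("attached segment %d at %p\n", shmid, base);
    shmdt(base);
    return 0;
}

End of note.]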
Re: [OMPI devel] RFC: System V Shared Memory for Open MPI
On Jun 2, 2010, at 7:28 AM, Jeff Squyres wrote:

On Jun 2, 2010, at 5:38 AM, George Bosilca wrote:

I think adding support for sysv shared memory is a good thing. However, I have some strong objections over the implementation in the hg tree. Here are 2 of the major ones:

1) The sysv shared memory creation is __atomic__ based on the flags used. Therefore, all the RML message exchange is totally useless.

Not sure what you mean here. common/sm may create new shmem segments at any time (e.g., during coll sm). The RML message exchange is to ensure that only 1 process creates and initializes the segment and then all the others just attach to it.

The initializing of the segment after it is created/attached could be pipelined a little more. E.g., since the init has an atomically-set flag indicating when it's done, the root could create the seg, signal the others that they can attach, and then do the init -- the non-root procs can wait for the flag to change atomically to know when the seg has been initialized. Is that what you're referring to?

2) The whole code is replicated in the 3 files (mmap, sysv and windows), even the common parts. However, in the sysv case most of the comments have been modified to remove all capital letters. I'm in favor of extracting all the common parts and moving them into a special file. What should be kept in the particular files should only be the really different parts (a small part of the init and finalize).

Sam -- are the common parts really common? I.e., could they be factored out? Or are they "just different enough" that factoring them out would be a PITA?

I'm sure some refactoring could be done - let me take a look.

--
Samuel K. Gutierrez
Los Alamos National Laboratory
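[Editor's note: George's point 1 refers to creating the segment with IPC_CREAT | IPC_EXCL, which makes creation a single atomic kernel operation - exactly one process wins and everyone else gets EEXIST, so no message exchange is needed to pick the creator. A minimal sketch of the create-or-attach pattern; the helper name is hypothetical:

#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* atomically create the segment, or attach if someone beat us to it */
static int get_or_create(key_t key, size_t size, int *created)
{
    int shmid = shmget(key, size, IPC_CREAT | IPC_EXCL | 0600);
    if (shmid >= 0) {
        *created = 1;            /* this process won the race */
        return shmid;
    }
    if (EEXIST == errno) {
        *created = 0;            /* someone else created it; just use it */
        return shmget(key, size, 0600);
    }
    return -1;                   /* real failure */
}

int main(void)
{
    int created = 0;
    int shmid = get_or_create((key_t)0x4f4d5049, 4096, &created);
    if (shmid < 0) { perror("shmget"); return 1; }
    printf("segment %d (%s)\n", shmid, created ? "created" : "attached");
    return 0;
}

End of note.]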
Re: [OMPI devel] RFC: System V Shared Memory for Open MPI
Doh! Bitbucket repository: http://bitbucket.org/samuelkgutierrez/ompi_sysv_sm

Thanks,

--
Samuel K. Gutierrez
Los Alamos National Laboratory

On Jun 1, 2010, at 11:08 AM, Samuel K. Gutierrez wrote:

WHAT: New System V shared memory component.

WHY: https://svn.open-mpi.org/trac/ompi/ticket/1320

WHERE:

  M ompi/mca/btl/sm/btl_sm.c
  M ompi/mca/btl/sm/btl_sm_component.c
  M ompi/mca/btl/sm/btl_sm.h
  M ompi/mca/mpool/sm/mpool_sm_component.c
  M ompi/mca/mpool/sm/mpool_sm.h
  M ompi/mca/mpool/sm/mpool_sm_module.c
  A ompi/mca/common/sm/configure.m4
  A ompi/mca/common/sm/common_sm_sysv.h
  A ompi/mca/common/sm/common_sm_windows.c
  A ompi/mca/common/sm/common_sm_windows.h
  A ompi/mca/common/sm/common_sm.c
  A ompi/mca/common/sm/common_sm_sysv.c
  A ompi/mca/common/sm/common_sm.h
  M ompi/mca/common/sm/common_sm_mmap.c
  M ompi/mca/common/sm/common_sm_mmap.h
  M ompi/mca/common/sm/Makefile.am
  M ompi/mca/common/sm/help-mpi-common-sm.txt
  M ompi/mca/coll/sm/coll_sm_module.c
  M ompi/mca/coll/sm/coll_sm.h

WHEN: Upon acceptance.

TIMEOUT: Tuesday, June 8, 2010 (after devel concall).

HOW:

  MCA mpi: parameter "mpi_common_sm" (current value: , data source: default value)
  Which shared memory support will be used. Valid values: sysv,mmap - or a comma-delimited combination of them (order dependent). The first component that is successfully selected is used.

Thanks!

--
Samuel K. Gutierrez
Los Alamos National Laboratory
Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing
On May 5, 2010, at 6:10 AM, Jeff Squyres wrote: On May 4, 2010, at 9:53 AM, Ashley Pittman wrote: Point noted. But actually -- can you give specific reasons as to why a user should care? Keep in mind that this would be a short-lived fork'ed process -- not "spawn" in the MPI sense of the word. You might be running the job under Valgrind or another debugger, BLCR has some issues with fork as I remember, and traditionally there have been IB mapping issues here as well. I'm sure you could make a case against any of those points if you wanted to, but I think the argument stands: doing this kind of run-time check shouldn't be needed. Mmm; good points (especially Valgrind). BLCR and OpenFabrics verbs shouldn't be much of an issue here, but I can see that there might be unexpectedness if you're running under Valgrind or some other debugger. It might be possible to construct the code, however, so that if it failed to initialise it just wasn't used rather than aborting the job, which would have much the same effect as a run-time test but without having to fork new processes and create short-lived shared memory regions. That's how most of the network transports are in OMPI today -- if they fail to init, they are just skipped. The problem here is that you really need 2 processes to do this test. I suppose it could be done with local ranks 0 and 1 instead of forking a new process -- they would just need to communicate via RML to sync up, I suppose. I need to think about it a little more, but I like this solution. Thanks, -- Samuel K. Gutierrez Los Alamos National Laboratory I should of course have said fork where I mentioned spawn above to avoid any confusion; spawn has a specific meaning in the context of MPI. I still think a better understanding of the issue is required before any decision here is made though; I'm surprised by Samuel's description of the problem because it's not how I remember it, and from what Chris says it doesn't reflect what is in the Linux Git code either. I'd like to see why there is an apparent difference in behaviour before a decision is made to only support one. There's no intent to only support sysv or mmap. Samuel's work was to extend OMPI to support sysv in the case where it would be advantageous (e.g., guaranteed cleanup of the shmem segment). The mmap stuff is definitely not going to be removed. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing
Hi All, New configure-time test added - thanks for the suggestion, Jeff. Update and give it a whirl. Ethan - could you please try again? This time, I'm hoping sysv support will be disabled ;-). Thanks! -- Samuel K. Gutierrez Los Alamos National Laboratory On May 3, 2010, at 9:18 AM, Samuel K. Gutierrez wrote: Hi Jeff, Sounds like a plan :-). Thanks! -- Samuel K. Gutierrez Los Alamos National Laboratory On May 3, 2010, at 9:12 AM, Jeff Squyres wrote: It might well be that you need a configure test to determine whether this behavior occurs or not. Heck, it may even need to be a run-time test! Hrm. Write a small C program that does something like the following (this is off the top of my head):

fork a child
child goes to sleep immediately
sysv alloc a segment
attach to it
ipc rm it
parent wakes up child
child tries to attach to segment

If that succeeds, then all is good. If not, then don't use this stuff. On May 3, 2010, at 10:55 AM, Samuel K. Gutierrez wrote: Hi all, Does anyone know of a relatively portable solution for querying a given system for the shmctl behavior that I am relying on, or is this going to be a nightmare? Because, if I am reading this thread correctly, the presence of shmget and Linux is not sufficient for determining an adequate level of sysv support. Thanks! -- Samuel K. Gutierrez Los Alamos National Laboratory On May 2, 2010, at 7:48 AM, N.M. Maclaren wrote: On May 2 2010, Ashley Pittman wrote: On 2 May 2010, at 04:03, Samuel K. Gutierrez wrote: As to performance, there should be no difference in use between sys-V shared memory and file-backed shared memory; the instructions issued and the MMU flags for the page should both be the same, so the performance should be identical. Not necessarily, and possibly not so even for far-future Linuces. On at least one system I used, the poxious kernel wrote the complete file to disk before returning - all right, it did that for System V shared memory, too, just to a 'hidden' file! But, if I recall, on another it did that only for file-backed shared memory - however, it's a decade ago now and I may be misremembering. Of course, that's a serious issue mainly for large segments. I was using multi-GB ones. I don't know how big the ones you need are. The one area you do need to keep an eye on for performance is on NUMA machines, where it's important which process on a node touches each page first; you can end up using different areas (pages, not regions) for communicating in different directions between the same pair of processes. I don't believe this is any different to mmap-backed shared memory though. On some systems it may be, but in bizarre, inconsistent, undocumented and unpredictable ways :-( Also, there are usually several system (and sometimes user) configuration options that change the behaviour, so you have to allow for that. My experience of trying to use those is that different uses have incompatible requirements, and most of the critical configuration parameters apply to ALL uses! In my view, the configuration variability is the number one nightmare for trying to write portable code that uses any form of shared memory. ARMCI seems to agree. Because of this, sysv support may be limited to Linux systems - that is, until we can get a better sense of which systems provide the shmctl IPC_RMID behavior that I am relying on. And, I suggest, whether they have an evil gotcha on one of the areas that Ashley Pittman noted. Regards, Nick Maclaren.
-- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel
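Jeff's probe above, fleshed out into a compilable sketch. The child's sleep/wake is done with a pipe rather than a signal, and the names and exit-code convention are illustrative, not Open MPI's:

#include <unistd.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>

/* exits 0 iff a process can still shmat a segment that was
 * IPC_RMID'd while another process remained attached */
int main(void)
{
    int fds[2], shmid, status;
    void *addr;
    char go = 'g';

    if (0 != pipe(fds)) return 1;
    pid_t pid = fork();
    if (0 == pid) {                        /* child */
        close(fds[1]);
        read(fds[0], &go, 1);              /* "sleep" until parent says go */
        read(fds[0], &shmid, sizeof(shmid));
        addr = shmat(shmid, NULL, 0);      /* try to attach post-RMID */
        _exit(((void *)-1 == addr) ? 1 : 0);
    }
    close(fds[0]);                         /* parent */
    shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (-1 == shmid) return 1;
    addr = shmat(shmid, NULL, 0);
    if ((void *)-1 == addr) return 1;
    shmctl(shmid, IPC_RMID, NULL);         /* mark for destruction */
    write(fds[1], &go, 1);                 /* wake the child */
    write(fds[1], &shmid, sizeof(shmid));
    waitpid(pid, &status, 0);
    shmdt(addr);
    return (WIFEXITED(status) && 0 == WEXITSTATUS(status)) ? 0 : 1;
}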
Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing
Hi Jeff, Sounds like a plan :-). Thanks! -- Samuel K. Gutierrez Los Alamos National Laboratory On May 3, 2010, at 9:12 AM, Jeff Squyres wrote: It might well be that you need a configure test to determine whether this behavior occurs or not. Heck, it may even need to be a run-time test! Hrm. Write a small C program that does something like the following (this is off the top of my head):

fork a child
child goes to sleep immediately
sysv alloc a segment
attach to it
ipc rm it
parent wakes up child
child tries to attach to segment

If that succeeds, then all is good. If not, then don't use this stuff. On May 3, 2010, at 10:55 AM, Samuel K. Gutierrez wrote: Hi all, Does anyone know of a relatively portable solution for querying a given system for the shmctl behavior that I am relying on, or is this going to be a nightmare? Because, if I am reading this thread correctly, the presence of shmget and Linux is not sufficient for determining an adequate level of sysv support. Thanks! -- Samuel K. Gutierrez Los Alamos National Laboratory On May 2, 2010, at 7:48 AM, N.M. Maclaren wrote: On May 2 2010, Ashley Pittman wrote: On 2 May 2010, at 04:03, Samuel K. Gutierrez wrote: As to performance, there should be no difference in use between sys-V shared memory and file-backed shared memory; the instructions issued and the MMU flags for the page should both be the same, so the performance should be identical. Not necessarily, and possibly not so even for far-future Linuces. On at least one system I used, the poxious kernel wrote the complete file to disk before returning - all right, it did that for System V shared memory, too, just to a 'hidden' file! But, if I recall, on another it did that only for file-backed shared memory - however, it's a decade ago now and I may be misremembering. Of course, that's a serious issue mainly for large segments. I was using multi-GB ones. I don't know how big the ones you need are. The one area you do need to keep an eye on for performance is on NUMA machines, where it's important which process on a node touches each page first; you can end up using different areas (pages, not regions) for communicating in different directions between the same pair of processes. I don't believe this is any different to mmap-backed shared memory though. On some systems it may be, but in bizarre, inconsistent, undocumented and unpredictable ways :-( Also, there are usually several system (and sometimes user) configuration options that change the behaviour, so you have to allow for that. My experience of trying to use those is that different uses have incompatible requirements, and most of the critical configuration parameters apply to ALL uses! In my view, the configuration variability is the number one nightmare for trying to write portable code that uses any form of shared memory. ARMCI seems to agree. Because of this, sysv support may be limited to Linux systems - that is, until we can get a better sense of which systems provide the shmctl IPC_RMID behavior that I am relying on. And, I suggest, whether they have an evil gotcha on one of the areas that Ashley Pittman noted. Regards, Nick Maclaren.
-- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing
Hi all, Does anyone know of a relatively portable solution for querying a given system for the shmctl behavior that I am relying on, or is this going to be a nightmare? Because, if I am reading this thread correctly, the presence of shmget and Linux is not sufficient for determining an adequate level of sysv support. Thanks! -- Samuel K. Gutierrez Los Alamos National Laboratory On May 2, 2010, at 7:48 AM, N.M. Maclaren wrote: On May 2 2010, Ashley Pittman wrote: On 2 May 2010, at 04:03, Samuel K. Gutierrez wrote: As to performance, there should be no difference in use between sys-V shared memory and file-backed shared memory; the instructions issued and the MMU flags for the page should both be the same, so the performance should be identical. Not necessarily, and possibly not so even for far-future Linuces. On at least one system I used, the poxious kernel wrote the complete file to disk before returning - all right, it did that for System V shared memory, too, just to a 'hidden' file! But, if I recall, on another it did that only for file-backed shared memory - however, it's a decade ago now and I may be misremembering. Of course, that's a serious issue mainly for large segments. I was using multi-GB ones. I don't know how big the ones you need are. The one area you do need to keep an eye on for performance is on NUMA machines, where it's important which process on a node touches each page first; you can end up using different areas (pages, not regions) for communicating in different directions between the same pair of processes. I don't believe this is any different to mmap-backed shared memory though. On some systems it may be, but in bizarre, inconsistent, undocumented and unpredictable ways :-( Also, there are usually several system (and sometimes user) configuration options that change the behaviour, so you have to allow for that. My experience of trying to use those is that different uses have incompatible requirements, and most of the critical configuration parameters apply to ALL uses! In my view, the configuration variability is the number one nightmare for trying to write portable code that uses any form of shared memory. ARMCI seems to agree. Because of this, sysv support may be limited to Linux systems - that is, until we can get a better sense of which systems provide the shmctl IPC_RMID behavior that I am relying on. And, I suggest, whether they have an evil gotcha on one of the areas that Ashley Pittman noted. Regards, Nick Maclaren. ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing
Hi Ethan, Sorry about the lag. As far as I can tell, calling shmctl IPC_RMID is immediately destroying the shared memory segment even though there is at least one process attached to it. This is interesting and confusing, because Solaris 10's behavior description of shmctl IPC_RMID is similar to Linux's. I call shmctl IPC_RMID immediately after one process has attached to the segment because, at least on Linux, this only marks the segment for destruction. The segment is only actually destroyed after all attached processes have terminated. I'm relying on this behavior for resource cleanup upon application termination (normal/abnormal). Because of this, sysv support may be limited to Linux systems - that is, until we can get a better sense of which systems provide the shmctl IPC_RMID behavior that I am relying on. Any other ideas are greatly appreciated. Thanks for testing! -- Samuel K. Gutierrez Los Alamos National Laboratory > On Thu, Apr/29/2010 02:52:24PM, Samuel K. Gutierrez wrote: >> Hi Ethan, >> Bummer. What does the following command show? >> sysctl -a | grep shm > In this case, I think the Solaris equivalent to sysctl is prctl, e.g.,

> $ prctl -i project group.staff
> project: 10: group.staff
> NAME                      PRIVILEGE     VALUE    FLAG   ACTION   RECIPIENT
> ...
> project.max-shm-memory
>         privileged        3.92GB    -     deny    -
>         system            16.0EB    max   deny    -
> project.max-shm-ids
>         privileged        128       -     deny    -
>         system            16.8M     max   deny    -
> ...

> Is that the info you need? > -Ethan >> Thanks! >> -- >> Samuel K. Gutierrez >> Los Alamos National Laboratory >> On Apr 29, 2010, at 1:32 PM, Ethan Mallove wrote: >> > Hi Samuel, >> > I'm trying to run off your HG clone, but I'm seeing issues with c_hello, e.g., >> > $ mpirun -mca mpi_common_sm sysv --mca btl self,sm,tcp --host burl-ct-v440-2,burl-ct-v440-2 -np 2 ./c_hello >> > -- A system call failed during shared memory initialization that should not have. It is likely that your MPI job will now either abort or experience performance degradation. >> > Local host: burl-ct-v440-2 >> > System call: shmat(2) >> > Process: [[43408,1],1] >> > Error: Invalid argument (errno 22) >> > -- ^Cmpirun: killing job... >> > $ uname -a >> > SunOS burl-ct-v440-2 5.10 Generic_118833-33 sun4u sparc SUNW,Sun-Fire-V440 >> > The same test works okay if I s/sysv/mmap/. >> > Regards, >> > Ethan >> > On Wed, Apr/28/2010 07:16:12AM, Samuel K. Gutierrez wrote: >> >> Hi, >> >> Faster component initialization/finalization times are one of the main motivating factors of this work. The general idea is to get away from creating a rather large backing file. With respect to module bandwidth and latency, mmap and sysv seem to be comparable - at least that is what my preliminary tests have shown. As it stands, I have not come across a situation where the mmap SM component doesn't work or is slower. >> >> Hope that helps, >> >> -- >> >> Samuel K. Gutierrez >> >> Los Alamos National Laboratory >> >> On Apr 28, 2010, at 5:35 AM, Bogdan Costescu wrote: >> >>> On Tue, Apr 27, 2010 at 7:55 PM, Samuel K. Gutierrez <sam...@lanl.gov> wrote: >> >>>> With Jeff and Ralph's help, I have completed a System V shared memory component for Open MPI. >> >>> What is the motivation for this work? Are there situations where the mmap-based SM component doesn't work or is slow(er)?
>> >>> Kind regards, >> >>> Bogdan ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel
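For readers following along, the cleanup idiom under discussion is small. A sketch of the creator side follows (simplified; the real component also exchanges the id over RML before other local processes attach):

#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* create, attach, and immediately mark for destruction: on Linux the
 * segment persists until the last attached process detaches or dies,
 * so cleanup is guaranteed even on abnormal termination */
void *create_with_auto_cleanup(size_t size, int *shmid_out)
{
    int shmid = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (-1 == shmid) return NULL;
    void *addr = shmat(shmid, NULL, 0);
    if ((void *)-1 == addr) return NULL;
    /* on stricter SysV implementations (e.g., the Solaris behavior
     * Ethan hit) this can destroy the segment right away, before the
     * other local procs have attached - the portability problem above */
    (void)shmctl(shmid, IPC_RMID, NULL);
    *shmid_out = shmid;
    return addr;
}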
Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing
Hi Ethan, Bummer. What does the following command show? sysctl -a | grep shm Thanks! -- Samuel K. Gutierrez Los Alamos National Laboratory On Apr 29, 2010, at 1:32 PM, Ethan Mallove wrote: Hi Samuel, I'm trying to run off your HG clone, but I'm seeing issues with c_hello, e.g., $ mpirun -mca mpi_common_sm sysv --mca btl self,sm,tcp --host burl-ct-v440-2,burl-ct-v440-2 -np 2 ./c_hello -- A system call failed during shared memory initialization that should not have. It is likely that your MPI job will now either abort or experience performance degradation. Local host: burl-ct-v440-2 System call: shmat(2) Process: [[43408,1],1] Error: Invalid argument (errno 22) -- ^Cmpirun: killing job... $ uname -a SunOS burl-ct-v440-2 5.10 Generic_118833-33 sun4u sparc SUNW,Sun-Fire-V440 The same test works okay if I s/sysv/mmap/. Regards, Ethan On Wed, Apr/28/2010 07:16:12AM, Samuel K. Gutierrez wrote: Hi, Faster component initialization/finalization times are one of the main motivating factors of this work. The general idea is to get away from creating a rather large backing file. With respect to module bandwidth and latency, mmap and sysv seem to be comparable - at least that is what my preliminary tests have shown. As it stands, I have not come across a situation where the mmap SM component doesn't work or is slower. Hope that helps, -- Samuel K. Gutierrez Los Alamos National Laboratory On Apr 28, 2010, at 5:35 AM, Bogdan Costescu wrote: On Tue, Apr 27, 2010 at 7:55 PM, Samuel K. Gutierrez <sam...@lanl.gov> wrote: With Jeff and Ralph's help, I have completed a System V shared memory component for Open MPI. What is the motivation for this work? Are there situations where the mmap-based SM component doesn't work or is slow(er)? Kind regards, Bogdan ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing
Hi, Faster component initialization/finalization times are one of the main motivating factors of this work. The general idea is to get away from creating a rather large backing file. With respect to module bandwidth and latency, mmap and sysv seem to be comparable - at least that is what my preliminary tests have shown. As it stands, I have not come across a situation where the mmap SM component doesn't work or is slower. Hope that helps, -- Samuel K. Gutierrez Los Alamos National Laboratory On Apr 28, 2010, at 5:35 AM, Bogdan Costescu wrote: On Tue, Apr 27, 2010 at 7:55 PM, Samuel K. Gutierrez <sam...@lanl.gov> wrote: With Jeff and Ralph's help, I have completed a System V shared memory component for Open MPI. What is the motivation for this work? Are there situations where the mmap-based SM component doesn't work or is slow(er)? Kind regards, Bogdan ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel
[OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing
Hi, With Jeff and Ralph's help, I have completed a System V shared memory component for Open MPI. I have conducted some preliminary tests on our systems, but would like to get test results from a broader audience. As it stands, mmap is the default, but System V shared memory can be activated using: -mca mpi_common_sm sysv Repository: http://bitbucket.org/samuelkgutierrez/ompi_sysv_sm Input is greatly appreciated! -- Samuel K. Gutierrez Los Alamos National Laboratory
Re: [OMPI devel] kernel 2.6.23 vs 2.6.24 - communication/wait times
On Apr 22, 2010, at 10:08 AM, Rainer Keller wrote: Hello Oliver, thanks for the update. Just my $0.02: the upcoming Open MPI v1.5 will warn users if their session directory is on NFS (or Lustre). ... or panfs :-) Samuel K. Gutierrez Best regards, Rainer On Thursday 22 April 2010 11:37:48 am Oliver Geisler wrote: To sum up and give an update: The extended communication times while using shared memory communication of openmpi processes are caused by the openmpi session directory residing on the network via NFS. The problem is resolved by establishing a ramdisk on each diskless node or mounting a tmpfs. By setting the MCA parameter orte_tmpdir_base to point to the corresponding mountpoint, shared memory communication and its files are kept local, decreasing the communication times by orders of magnitude. The relation of the problem to the kernel version is not really resolved, but is maybe not "the problem" in this respect. My benchmark is now running fine on a single node with 4 CPUs, kernel 2.6.33.1 and openmpi 1.4.1. Running on multiple nodes, I still experience higher (TCP) communication times than I would expect. But that requires some deeper research on my part (e.g., collisions on the network) and should probably be posted to a new thread. Thank you guys for your help. oli -- Rainer Keller, PhD Tel: +1 (865) 241-6293 Oak Ridge National Lab Fax: +1 (865) 241-4811 PO Box 2008 MS 6164 Email: kel...@ornl.gov Oak Ridge, TN 37831-2008 AIM/Skype: rusraink ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel
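In concrete terms, the workaround amounts to mounting a tmpfs on each diskless node and pointing the session directory at it, e.g., mpirun --mca orte_tmpdir_base /mnt/ramdisk ... (the mountpoint here is illustrative); the shared memory backing files then never touch NFS.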
Re: [OMPI devel] Open MPI v1.3.4rc4 is out
That's interesting... Works great now that carto is built. Why is carto now required? -- Samuel K. Gutierrez Los Alamos National Laboratory On Nov 5, 2009, at 4:11 PM, David Gunter wrote: Oh, good catch. I'm not sure who updates the platform files or who would have added the "carto" option to the no_build. It's the only difference between the 1.3.4 platform files and the previous ones, save for some compiler flags. -david -- David Gunter HPC-3: Infrastructure Team Los Alamos National Laboratory On Nov 5, 2009, at 3:55 PM, Jeff Squyres wrote: I see: enable_mca_no_build=carto,crs,routed-direct,routed-linear,snapc,pml-dr,pml-crcp2,pml-crcpw,pml-v,pml-example,crcp,pml-cm,filem Which means that you're directing all carto components not to build at all. It looks like carto is now required...? On Nov 5, 2009, at 5:38 PM, Samuel K. Gutierrez wrote: Hi Jeff, This is how I configured my build. ./configure --with-platform=./contrib/platform/lanl/rr-class/optimized-panasas --prefix=/usr/projects/hpctools/samuel/local/rr-dev/apps/openmpi/gcc/ompi-1.3.4rc4 --libdir=/usr/projects/hpctools/samuel/local/rr-dev/apps/openmpi/gcc/ompi-1.3.4rc4/lib64 I'll send the build log shortly. Thanks! -- Samuel K. Gutierrez Los Alamos National Laboratory On Nov 5, 2009, at 3:07 PM, Jeff Squyres wrote: > How did you build? > > I see one carto component named "auto_detect" in the 1.3.4 source > tree, but I don't see it in your ompi_info output. > > Did that component not build? > > > On Nov 4, 2009, at 7:20 PM, Samuel K. Gutierrez wrote: > >> Hi All, >> >> I just built OMPI 1.3.4rc4 on one of our Roadrunner machines. When I >> try to launch a simple MPI job, I get the following: >> >> [rra011a.rr.lanl.gov:31601] mca: base: components_open: Looking for >> carto components >> [rra011a.rr.lanl.gov:31601] mca: base: components_open: opening carto >> components >> [rra011a.rr.lanl.gov:31601] mca:base:select: Auto-selecting carto >> components >> [rra011a.rr.lanl.gov:31601] mca:base:select:(carto) No component >> selected! >> -- >> It looks like opal_init failed for some reason; your parallel >> process is >> likely to abort. There are many reasons that a parallel process can >> fail during opal_init; some of which are due to configuration or >> environment problems. This failure appears to be an internal >> failure; >> here's some additional information (which may only be relevant to an >> Open MPI developer): >> >> opal_carto_base_select failed >> --> Returned value -13 instead of OPAL_SUCCESS >> -- >> [rra011a.rr.lanl.gov:31601] [[INVALID],INVALID] ORTE_ERROR_LOG: Not >> found in file runtime/orte_init.c at line 77 >> [rra011a.rr.lanl.gov:31601] [[INVALID],INVALID] ORTE_ERROR_LOG: Not >> found in file orterun.c at line 541 >> >> This may be an issue on our end regarding a runtime parameter that >> isn't set correctly. See attached. Please let me know if you need >> any more info. >> >> Thanks! >> -- >> Samuel K. Gutierrez >> Los Alamos National Laboratory > > -- > Jeff Squyres > jsquy...@cisco.com > > ___ > devel mailing list > de...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/devel -- Jeff Squyres jsquy...@cisco.com ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel
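Assuming the platform file really is the culprit, the fix is presumably just dropping carto from the quoted line, i.e.: enable_mca_no_build=crs,routed-direct,routed-linear,snapc,pml-dr,pml-crcp2,pml-crcpw,pml-v,pml-example,crcp,pml-cm,filem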
Re: [OMPI devel] Open MPI v1.3.4rc4 is out
Hi Jeff, This is how I configured my build. ./configure --with-platform=./contrib/platform/lanl/rr-class/optimized-panasas --prefix=/usr/projects/hpctools/samuel/local/rr-dev/apps/openmpi/gcc/ompi-1.3.4rc4 --libdir=/usr/projects/hpctools/samuel/local/rr-dev/apps/openmpi/gcc/ompi-1.3.4rc4/lib64 I'll send the build log shortly. Thanks! -- Samuel K. Gutierrez Los Alamos National Laboratory On Nov 5, 2009, at 3:07 PM, Jeff Squyres wrote: How did you build? I see one carto component named "auto_detect" in the 1.3.4 source tree, but I don't see it in your ompi_info output. Did that component not build? On Nov 4, 2009, at 7:20 PM, Samuel K. Gutierrez wrote: Hi All, I just built OMPI 1.3.4rc4 on one of our Roadrunner machines. When I try to launch a simple MPI job, I get the following: [rra011a.rr.lanl.gov:31601] mca: base: components_open: Looking for carto components [rra011a.rr.lanl.gov:31601] mca: base: components_open: opening carto components [rra011a.rr.lanl.gov:31601] mca:base:select: Auto-selecting carto components [rra011a.rr.lanl.gov:31601] mca:base:select:(carto) No component selected! -- It looks like opal_init failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during opal_init; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer): opal_carto_base_select failed --> Returned value -13 instead of OPAL_SUCCESS -- [rra011a.rr.lanl.gov:31601] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file runtime/orte_init.c at line 77 [rra011a.rr.lanl.gov:31601] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file orterun.c at line 541 This may be an issue on our end regarding a runtime parameter that isn't set correctly. See attached. Please let me know if you need any more info. Thanks! -- Samuel K. Gutierrez Los Alamos National Laboratory -- Jeff Squyres jsquy...@cisco.com ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] Open MPI v1.3.4rc4 is out
Hi All, I just built OMPI 1.3.4rc4 on one of our Roadrunner machines. When I try to launch a simple MPI job, I get the following: [rra011a.rr.lanl.gov:31601] mca: base: components_open: Looking for carto components [rra011a.rr.lanl.gov:31601] mca: base: components_open: opening carto components [rra011a.rr.lanl.gov:31601] mca:base:select: Auto-selecting carto components [rra011a.rr.lanl.gov:31601] mca:base:select:(carto) No component selected! -- It looks like opal_init failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during opal_init; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer): opal_carto_base_select failed --> Returned value -13 instead of OPAL_SUCCESS -- [rra011a.rr.lanl.gov:31601] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file runtime/orte_init.c at line 77 [rra011a.rr.lanl.gov:31601] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file orterun.c at line 541 This may be an issue on our end regarding a runtime parameter that isn't set correctly. See attached. Please let me know if you need any more info. Thanks! -- Samuel K. Gutierrez Los Alamos National Laboratory On Nov 4, 2009, at 3:00 PM, Jeff Squyres wrote: The latest-n-greatest is available here: http://www.open-mpi.org/software/ompi/v1.3/ Please beat it up and look for problems! -- Jeff Squyres jsquy...@cisco.com ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] MPIR_Breakpoint visibility
Hi Jeff, Sorry about the ambiguity. I just had another conversation with our TotalView person and the problem -seems- to be unrelated to OMPI. Guess I jumped the gun... Thanks, Samuel K. Gutierrez On Sep 21, 2009, at 8:58 AM, Jeff Squyres wrote: Can you more precisely define "not working properly"? On Sep 21, 2009, at 10:26 AM, Samuel K. Gutierrez wrote: Hi, According to our TotalView person, PGI and Intel versions of OMPI 1.3.3 are not working properly. She noted that 1.2.8 and 1.3.2 work fine. Thanks, Samuel K. Gutierrez On Sep 21, 2009, at 7:19 AM, Terry Dontje wrote: > Ralph Castain wrote: >> I see it declared "extern" in orte/tools/orterun/debuggers.h, but >> not DECLSPEC'd >> >> FWIW: LANL uses intel compilers + totalview on a regular basis, and >> I have yet to hear of an issue. >> > It actually will work if you attach to the job or if you are not > relying on the MPIR_Breakpoint to actually stop execution. > > --td > >> On Sep 21, 2009, at 7:03 AM, Terry Dontje wrote: >> >>> I was kind of amazed no one else managed to run into this, but it >>> was brought to my attention that, when compiling OMPI with Intel >>> compilers and visibility on, the MPIR_Breakpoint symbol was >>> not being exposed. I am assuming this is due to MPIR_Breakpoint >>> not being ORTE or OMPI_DECLSPEC'd. >>> Do others agree or am I missing something obvious here? >>> >>> Interestingly enough, it doesn't look like gcc, pgi, pathscale or >>> sun compilers are hiding the MPIR_Breakpoint symbol. >>> --td -- Jeff Squyres jsquy...@cisco.com ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] MPIR_Breakpoint visibility
Hi, According to our TotalView person, PGI and Intel versions of OMPI 1.3.3 are not working properly. She noted that 1.2.8 and 1.3.2 work fine. Thanks, Samuel K. Gutierrez On Sep 21, 2009, at 7:19 AM, Terry Dontje wrote: Ralph Castain wrote: I see it declared "extern" in orte/tools/orterun/debuggers.h, but not DECLSPEC'd FWIW: LANL uses intel compilers + totalview on a regular basis, and I have yet to hear of an issue. It actually will work if you attach to the job or if you are not relying on MPIR_Breakpoint to actually stop execution. --td On Sep 21, 2009, at 7:03 AM, Terry Dontje wrote: I was kind of amazed no one else managed to run into this, but it was brought to my attention that, when compiling OMPI with Intel compilers and visibility on, the MPIR_Breakpoint symbol was not being exposed. I am assuming this is due to MPIR_Breakpoint not being ORTE or OMPI_DECLSPEC'd. Do others agree or am I missing something obvious here? Interestingly enough, it doesn't look like gcc, pgi, pathscale or sun compilers are hiding the MPIR_Breakpoint symbol. --td ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel
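For context, the kind of change being suggested is a one-liner in the declaration. A sketch follows; the exact macro expansion and the symbol's return type in the tree may differ, and the fallback definition here is only for standalone compilation:

/* ORTE_DECLSPEC normally comes from OMPI's visibility support;
 * on gcc/icc it boils down to the default-visibility attribute */
#ifndef ORTE_DECLSPEC
#define ORTE_DECLSPEC __attribute__((visibility("default")))
#endif

/* debuggers.h today declares the symbol plain extern, which
 * -fvisibility=hidden will hide; exporting it would look like: */
ORTE_DECLSPEC extern void *MPIR_Breakpoint(void);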
Re: [OMPI devel] RFC: convert send to ssend
Hi Ashley, My understanding is that this behavior would not be enabled by default in the standard debug build. The "always convert to synchronous sends" mode would be an additional configure-time option. Samuel K. Gutierrez Ashley Pittman wrote: On Mon, 2009-08-24 at 13:27 -0400, Jeff Squyres wrote: It's the difference between: a. if (0) { ... convert ... } Modern compilers will remove this code as part of dead-code removal. b. if (1) { ... convert ... } Modern compilers will remove the "if (1)" and always execute the code. c. if (some_variable) { ... convert ...} An MCA parameter can load some_variable with 0 or 1. The point of b is for sysadmins (or individual developers) who want to force there to *always* be correct MPI applications. But couldn't the sysadmin equally well write a config file to achieve the same effect should they want to? Having it enabled (and on) in the standard "debug" build is going to change the behaviour of applications using a debug library, may well render bugs un-reproducible in debug mode or, worse, you may end up with end-user applications that only run in debug mode and not with a normal build. I'm all for having as much error checking enabled in debug builds as possible, but changing the behaviour risks masking problems elsewhere IMHO. Ashley,
Re: [OMPI devel] RFC: convert send to ssend
Hi Jeff, Sounds good to me. Samuel K. Gutierrez Jeff Squyres wrote: The debug builds already have quite a bit of performance overhead. It might be desirable to change this RFC to have a tri-state similar to the MPI parameter checking:

- compiled out
- compiled in, always check
- compiled in, use MCA parameter to determine whether to check

Adapting that to this RFC, perhaps something like this:

- compiled out
- compiled in, always convert standard send to sync send
- compiled in, use MCA parameter to determine whether to convert standard -> sync

And we can leave the default as "compiled out". Howzat? On Aug 23, 2009, at 9:07 PM, Samuel K. Gutierrez wrote: Hi all, How about exposing this functionality as a run-time parameter that is only available in debug builds? This will make debugging easier and won't impact the performance of optimized builds. Just an idea... Samuel K. Gutierrez > > - "Jeff Squyres" <jsquy...@cisco.com> wrote: > >> Does anyone have any suggestions? Or are we stuck >> with compile-time checking? > > I didn't see this until now, but I'd be happy with > just a compile time option so we could produce an > install just for debugging purposes and have our > users explicitly select it with modules. > > I have to say that this is of interest to us as we're > trying to help a researcher at one of our member uni's > to track down a bug where a message appears to go missing. > > cheers! > Chris > -- > Christopher Samuel - (03) 9925 4751 - Systems Manager > The Victorian Partnership for Advanced Computing > P.O. Box 201, Carlton South, VIC 3053, Australia > VPAC is a not-for-profit Registered Research Agency ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel
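To make the tri-state concrete, here is what it might look like at a send wrapper. This is a PMPI-layer sketch with invented names, not the proposed OMPI internals:

#include <mpi.h>

#ifndef OMPI_WANT_DEBUG_SYNC_SEND
#define OMPI_WANT_DEBUG_SYNC_SEND 1      /* stand-in for a configure result; 0 compiles it out */
#endif

#if OMPI_WANT_DEBUG_SYNC_SEND
static int debug_sync_send = 1;          /* would be loaded from an MCA parameter */
#endif

int MPI_Send(void *buf, int count, MPI_Datatype type,
             int dest, int tag, MPI_Comm comm)
{
#if OMPI_WANT_DEBUG_SYNC_SEND
    if (debug_sync_send) {
        /* synchronous semantics: completion implies the matching
         * receive started, so codes that lean on eager buffering
         * deadlock visibly instead of silently "working" */
        return PMPI_Ssend(buf, count, type, dest, tag, comm);
    }
#endif
    return PMPI_Send(buf, count, type, dest, tag, comm);
}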
Re: [OMPI devel] RFC: convert send to ssend
Hi all, How about exposing this functionality as a run-time parameter that is only available in debug builds? This will make debugging easier and won't impact the performance of optimized builds. Just an idea... Samuel K. Gutierrez > > - "Jeff Squyres" <jsquy...@cisco.com> wrote: > >> Does anyone have any suggestions? Or are we stuck >> with compile-time checking? > > I didn't see this until now, but I'd be happy with > just a compile time option so we could produce an > install just for debugging purposes and have our > users explicitly select it with modules. > > I have to say that this is of interest to us as we're > trying to help a researcher at one of our member uni's > to track down a bug where a message appears to go missing. > > cheers! > Chris > -- > Christopher Samuel - (03) 9925 4751 - Systems Manager > The Victorian Partnership for Advanced Computing > P.O. Box 201, Carlton South, VIC 3053, Australia > VPAC is a not-for-profit Registered Research Agency > ___ > devel mailing list > de...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/devel >
Re: [OMPI devel] OMPI 1.3 - PERUSE peruse_comm_spec_t peer Negative Value
Hi All, George - I really appreciate the quick response. > Hi, > > at least for the specific test program I used, the negative values for the peer attribute disappeared after George's modifications in 20844. Same here for my profiling library - tested with openmpi-1.3.2a1r20855. > > One remark: after installation, I had to remove the '#include "ompi_config.h"' line in the "include/peruse.h" header to get PERUSE applications to compile. Otherwise I got a missing header error message for ompi_config.h. I did not experience this - no modifications were needed on my end. That being said, my peruse.h does not include ompi_config.h, only mpi.h. Thanks again, Samuel K. Gutierrez > > Regards, > Kiril > > > On Mon, 2009-03-23 at 16:34 -0400, George Bosilca wrote: >> You are absolutely right, the peer should never be set to -1 on any of the PERUSE callbacks. I checked the code this morning and figured out what the problem was. We report the peer and the tag attached to a request before setting the right values (some code moved around). I submitted a patch and created a "move request" to have this correction as soon as possible on one of our stable releases. The move request can be followed using our TRAC system and the following link (https://svn.open-mpi.org/trac/ompi/ticket/1845). If you want to play with this change, please update your Open MPI installation to a nightly build or a fresh checkout from the SVN with at least revision 20844 (a nightly including this change will be posted on our website tomorrow morning). >> Thanks, >> george. >> On Mar 23, 2009, at 13:23, Samuel K. Gutierrez wrote: >> > Hi Kiril, >> > >> > Appreciate the quick response. >> > >> >> Hi Samuel, >> >> >> >> On Sat, 21 Mar 2009 18:18:54 -0600 (MDT) >> >> "Samuel K. Gutierrez" <sam...@lanl.gov> wrote: >> >>> Hi All, >> >>> >> >>> I'm writing a simple profiling library which utilizes >> >>> PERUSE. My callback >> >> >> >> So am I :) >> >> >> >>> function counts communication events (see example code >> >>> below). I noticed >> >>> that in OMPI v1.3 spec->peer is sometimes a negative >> >>> value (OMPI v1.2.6 >> >>> did not exhibit this behavior). I added some boundary >> >>> checks, but it >> >>> seems as if this is a bug? I hope I'm not missing >> >>> something... >> >> >> >> It took me quite some time to reproduce the error - I also >> > >> > Sorry about that - I should have provided more information. >> > >> >> got peer value "-1" for the Peruse peruse_comm_spec_t >> >> struct. I only managed to reproduce this with >> >> communication of a process with itself, which is an >> >> unusual scenario. Anyway, for all the tests I did, the >> >> error happened only when: >> >> >> >> -a process communicates with itself >> >> -the MPI receive call is made >> >> -the Peruse event "PERUSE_COMM_MSG_REMOVE_FROM_UNEX_Q" is >> >> triggered >> > >> > That's interesting... Nice work! >> > >> >> >> >> >> >> The file ompi/mca/pml/ob1/pml_ob1_recvreq.c seems to be >> >> the place where the above event is called with a wrong >> >> value of the peer attribute. >> >> >> >> I will let you know if I find something. >> > >> > I will also take a look.
>> > >> >> >> >> >> >> Best regards, >> >> Kiril >> >> >> >>> >> >>> The peruse test provided in the OMPI v1.3 source >> >>> exhibits similar behavior: >> >>> mpirun -np 2 ./mpi_peruse | grep peer:-1 >> >>> >> >>> int callback(peruse_event_h event_h, MPI_Aint unique_id, >> >>> peruse_comm_spec_t *spec, void *param) { >> >>> if (spec->peer == rank) { >> >>> return MPI_SUCCESS; >> >>> } >> >>> rrCounts[spec->peer]++; >> >>> return MPI_SUCCESS; >> >>> } >> >>> >> >>> >> >>> Any insight is greatly appreciated. >> >>> >> >>> Thanks, >> >>> >> >>> Samuel K. Gutierrez >> >>> ___ >> >>> devel mailing list >> >>> de...@open-mpi.org >> >>> http://www.open-mpi.org/mailman/listinfo.cgi/devel >> >> >> >> >> > >> > Appreciate the help, >> > >> > Samuel K. Gutierrez >> > ___ >> > devel mailing list >> > de...@open-mpi.org >> > http://www.open-mpi.org/mailman/listinfo.cgi/devel > > >
Re: [OMPI devel] OMPI 1.3 - PERUSE peruse_comm_spec_t peer Negative Value
Hi Kiril, Appreciate the quick response. > Hi Samuel, > > On Sat, 21 Mar 2009 18:18:54 -0600 (MDT) > "Samuel K. Gutierrez" <sam...@lanl.gov> wrote: >> Hi All, >> >> I'm writing a simple profiling library which utilizes >> PERUSE. My callback > > So am I :) > >> function counts communication events (see example code >> below). I noticed >> that in OMPI v1.3 spec->peer is sometimes a negative >> value (OMPI v1.2.6 >> did not exhibit this behavior). I added some boundary >> checks, but it >> seems as if this is a bug? I hope I'm not missing >> something... > > It took me quite some time to reproduce the error - I also Sorry about that - I should have provided more information. > got peer value "-1" for the Peruse peruse_comm_spec_t > struct. I only managed to reproduce this with > communication of a process with itself, which is an > unusual scenario. Anyway, for all the tests I did, the > error happened only when: > > -a process communicates with itself > -the MPI receive call is made > -the Peruse event "PERUSE_COMM_MSG_REMOVE_FROM_UNEX_Q" is > triggered That's interesting... Nice work! > > > The file ompi/mca/pml/ob1/pml_ob1_recvreq.c seems to be > the place where the above event is called with a wrong > value of the peer attribute. > > I will let you know if I find something. I will also take a look. > > > Best regards, > Kiril > >> >> The peruse test provided in the OMPI v1.3 source >> exhibits similar behavior: >> mpirun -np 2 ./mpi_peruse | grep peer:-1 >> >> int callback(peruse_event_h event_h, MPI_Aint unique_id, >> peruse_comm_spec_t *spec, void *param) { >> if (spec->peer == rank) { >> return MPI_SUCCESS; >> } >> rrCounts[spec->peer]++; >> return MPI_SUCCESS; >> } >> >> >> Any insight is greatly appreciated. >> >> Thanks, >> >> Samuel K. Gutierrez >> ___ >> devel mailing list >> de...@open-mpi.org >> http://www.open-mpi.org/mailman/listinfo.cgi/devel > > Appreciate the help, Samuel K. Gutierrez
[OMPI devel] OMPI 1.3 - PERUSE peruse_comm_spec_t peer Negative Value
Hi All, I'm writing a simple profiling library which utilizes PERUSE. My callback function counts communication events (see example code below). I noticed that in OMPI v1.3 spec->peer is sometimes a negative value (OMPI v1.2.6 did not exhibit this behavior). I added some boundary checks, but it seems as if this is a bug? I hope I'm not missing something... The peruse test provided in the OMPI v1.3 source exhibits similar behavior: mpirun -np 2 ./mpi_peruse | grep peer:-1

int callback(peruse_event_h event_h, MPI_Aint unique_id,
             peruse_comm_spec_t *spec, void *param)
{
    if (spec->peer == rank) {
        return MPI_SUCCESS;
    }
    rrCounts[spec->peer]++;
    return MPI_SUCCESS;
}

Any insight is greatly appreciated. Thanks, Samuel K. Gutierrez
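For reference, the boundary check mentioned above is just a guard at the top of the callback. A sketch follows; rank, nprocs, and rrCounts are assumed to have been set up at init time, e.g., via MPI_Comm_rank/MPI_Comm_size:

#include <mpi.h>
#include <peruse.h>

/* assumed to be initialized elsewhere in the profiling library */
extern int rank, nprocs;
extern unsigned long *rrCounts;

int callback(peruse_event_h event_h, MPI_Aint unique_id,
             peruse_comm_spec_t *spec, void *param)
{
    /* guard against the bogus peer values seen on the 1.3 trunk */
    if (spec->peer < 0 || spec->peer >= nprocs) {
        return MPI_SUCCESS;
    }
    if (spec->peer == rank) {
        return MPI_SUCCESS;
    }
    rrCounts[spec->peer]++;
    return MPI_SUCCESS;
}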