I will double-check this (AFK right now). Are you running on a RHEL6-like distro with gcc?
IIRC, crash vs MPI error is controlled by --with-param-check or something like this...

Cheers,

Gilles

Ralph Castain <r...@open-mpi.org> wrote:
>I tried it with both the fortran and c versions - got the same result.
>
>This was indeed with a debug build. I wouldn’t expect a segfault even with an
>optimized build, though - I would expect an MPI error, yes?
>
>
>On Nov 26, 2014, at 4:26 PM, Gilles Gouaillardet
><gilles.gouaillar...@gmail.com> wrote:
>
>I will have a look
>
>Btw, i was running the fortran version, not the c one.
>Did you configure with --enable-debug ?
>The program sends to a rank *not* in the communicator, so this behavior could
>make some sense on an optimized build.
>
>Cheers,
>
>Gilles
>
>Ralph Castain <r...@open-mpi.org> wrote:
>Ick - I’m getting a segfault when trying to run that test:
>
>MPITEST info (0): Starting MPI_Errhandler_fatal test
>MPITEST info (0): This test will abort after printing the results message
>MPITEST info (0): If it does not, then a f.a.i.l.u.r.e will be noted
>[bend001:07714] *** Process received signal ***
>[bend001:07714] Signal: Segmentation fault (11)
>[bend001:07714] Signal code: Address not mapped (1)
>[bend001:07714] Failing at address: 0x50
>[bend001:07715] *** Process received signal ***
>[bend001:07715] Signal: Segmentation fault (11)
>[bend001:07715] Signal code: Address not mapped (1)
>[bend001:07715] Failing at address: 0x50
>[bend001:07714] ompi_comm_peer_lookup: invalid peer index (3)
>[bend001:07713] ompi_comm_peer_lookup: invalid peer index (3)
>[bend001:07715] ompi_comm_peer_lookup: invalid peer index (3)
>[bend001:07713] *** Process received signal ***
>[bend001:07713] Signal: Segmentation fault (11)
>[bend001:07713] Signal code: Address not mapped (1)
>[bend001:07713] Failing at address: 0x50
>[bend001:07713] [ 0] /usr/lib64/libpthread.so.0(+0xf130)[0x7f4485ecb130]
>[bend001:07713] [ 1]
>/home/common/openmpi/build/ompi-release/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_send+0x5d)[0x7f4480f74ca6]
>[bend001:07713] [ 2] [bend001:07714] [ 0]
>/usr/lib64/libpthread.so.0(+0xf130)[0x7ff457885130]
>[bend001:07714] [ 1]
>/home/common/openmpi/build/ompi-release/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_send+0x5d)[0x7ff44e8dbca6]
>[bend001:07714] [ 2] [bend001:07715] [ 0]
>/usr/lib64/libpthread.so.0(+0xf130)[0x7ffa97ff6130]
>[bend001:07715] [ 1]
>/home/common/openmpi/build/ompi-release/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_send+0x5d)[0x7ffa8eeeeca6]
>[bend001:07715] [ 2] MPITEST_results: MPI_Errhandler_fatal all tests PASSED (3)
>
>
>This is with the head of the 1.8 branch. Any suggestions?
>
>Ralph
>
>
>On Nov 26, 2014, at 8:46 AM, Ralph Castain <r...@open-mpi.org> wrote:
>
>Hmmm….yeah, I know we saw this and resolved it in the trunk, but it looks like
>the fix indeed failed to come over to 1.8. I’ll take a gander (pretty sure I
>remember how I fixed it) - thanks!
>
>On Nov 26, 2014, at 12:03 AM, Gilles Gouaillardet
><gilles.gouaillar...@iferc.org> wrote:
>
>Ralph,
>
>i noted several hangs in mtt with the v1.8 branch.
>
>a simple way to reproduce it is to use the MPI_Errhandler_fatal_f test
>from the intel_tests suite,
>invoke mpirun on one node and run the tasks on another node :
>
>node0$ mpirun -np 3 -host node1 --mca btl tcp,self ./MPI_Errhandler_fatal_f
>
>/* since this is a race condition, you might need to run this in a loop
>in order to hit the bug */
>
>the attached tarball contains a patch (add debug + temporary hack) and
>some log files obtained with
>--mca errmgr_base_verbose 100 --mca odls_base_verbose 100
>
>without the hack, i can reproduce the bug with -np 3 (log.ko.txt) , with
>the hack, i can still reproduce the hang (though it might
>be a different one) with -np 16 (log.ko.2.txt)
>
>i remember some similar hangs were fixed on the trunk/master a few
>months ago.
>i tried to backport some commits but it did not help :-(
>
>could you please have a look at this ?
>
>Cheers,
>
>Gilles
><abort_hang.tar.gz>_______________________________________________
>devel mailing list
>de...@open-mpi.org
>Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>Link to this post:
>http://www.open-mpi.org/community/lists/devel/2014/11/16357.php
>
>
>_______________________________________________
>devel mailing list
>de...@open-mpi.org
>Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>Link to this post:
>http://www.open-mpi.org/community/lists/devel/2014/11/16364.php
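
A note on the "run this in a loop" hint in the original report: because the hang is a race condition, a small driver that reruns the command with a timeout can help catch it. The following is only a minimal sketch in Python and is not part of the original thread; the helper name `run_until_hang`, the attempt count and the timeout are illustrative assumptions. A timeout is treated as a suspected hang, and a non-zero exit status is ignored because the test is expected to abort.

```python
import subprocess


def run_until_hang(cmd, attempts=100, timeout=60.0):
    """Run cmd repeatedly; return the 1-based attempt that timed out, or None.

    Hypothetical helper: a run that exceeds `timeout` seconds is counted as
    a suspected hang. Exit codes are deliberately not checked, since the
    test under study aborts on purpose.
    """
    for attempt in range(1, attempts + 1):
        try:
            # Output is discarded to keep the loop quiet; redirect it to a
            # file instead if the logs are needed for debugging.
            subprocess.run(
                cmd,
                timeout=timeout,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the direct child before raising.
            return attempt
    return None
```

With the command line from the report, this could be called as `run_until_hang(["mpirun", "-np", "3", "-host", "node1", "--mca", "btl", "tcp,self", "./MPI_Errhandler_fatal_f"])`. Only the local mpirun process is killed on timeout, so tasks left behind on the remote node may still need to be cleaned up by hand.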