Hi again,
when I call MPI_Init_thread in the same program the error is:

spawning ...
opal_mutex_lock(): Resource deadlock avoided
[localhost:07566] *** Process received signal ***
[localhost:07566] Signal: Aborted (6)
[localhost:07566] Signal code: (-6)
[localhost:07566] [ 0] /lib/libpthread.so.0 [0x2abe5630ded0]
[localhost:07566] [ 1] /lib/libc.so.6(gsignal+0x35) [0x2abe5654c3c5]
[localhost:07566] [ 2] /lib/libc.so.6(abort+0x10e) [0x2abe5654d73e]
[localhost:07566] [ 3] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2abe5528063b]
[localhost:07566] [ 4] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2abe55280559]
[localhost:07566] [ 5] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2abe552805e8]
[localhost:07566] [ 6] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2abe55280fff]
[localhost:07566] [ 7] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2abe55280f3d]
[localhost:07566] [ 8] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2abe55281f59]
[localhost:07566] [ 9] /usr/local/mpi/ompi-svn/lib/libmpi.so.0(ompi_proc_unpack+0x204) [0x2abe552823cd]
[localhost:07566] [10] /usr/local/mpi/ompi-svn/lib/openmpi/mca_dpm_orte.so [0x2abe58efb5f7]
[localhost:07566] [11] /usr/local/mpi/ompi-svn/lib/libmpi.so.0(MPI_Comm_spawn+0x465) [0x2abe552b55cd]
[localhost:07566] [12] ./spawn1(main+0x9d) [0x400b05]
[localhost:07566] [13] /lib/libc.so.6(__libc_start_main+0xf4) [0x2abe56539b74]
[localhost:07566] [14] ./spawn1 [0x4009d9]
[localhost:07566] *** End of error message ***
opal_mutex_lock(): Resource deadlock avoided
[localhost:07567] *** Process received signal ***
[localhost:07567] Signal: Aborted (6)
[localhost:07567] Signal code: (-6)
[localhost:07567] [ 0] /lib/libpthread.so.0 [0x2b48610f9ed0]
[localhost:07567] [ 1] /lib/libc.so.6(gsignal+0x35) [0x2b48613383c5]
[localhost:07567] [ 2] /lib/libc.so.6(abort+0x10e) [0x2b486133973e]
[localhost:07567] [ 3] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b486006c63b]
[localhost:07567] [ 4] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b486006c559]
[localhost:07567] [ 5] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b486006c5e8]
[localhost:07567] [ 6] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b486006cfff]
[localhost:07567] [ 7] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b486006cf3d]
[localhost:07567] [ 8] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b486006df59]
[localhost:07567] [ 9] /usr/local/mpi/ompi-svn/lib/libmpi.so.0(ompi_proc_unpack+0x204) [0x2b486006e3cd]
[localhost:07567] [10] /usr/local/mpi/ompi-svn/lib/openmpi/mca_dpm_orte.so [0x2b4863ce75f7]
[localhost:07567] [11] /usr/local/mpi/ompi-svn/lib/openmpi/mca_dpm_orte.so [0x2b4863ce9c2b]
[localhost:07567] [12] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b48600720d7]
[localhost:07567] [13] /usr/local/mpi/ompi-svn/lib/libmpi.so.0(PMPI_Init_thread+0x166) [0x2b48600ae4f2]
[localhost:07567] [14] ./spawn1(main+0x2c) [0x400a94]
[localhost:07567] [15] /lib/libc.so.6(__libc_start_main+0xf4) [0x2b4861325b74]
[localhost:07567] [16] ./spawn1 [0x4009d9]
[localhost:07567] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 7566 on node localhost exited
on signal 6 (Aborted).
--------------------------------------------------------------------------
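In case it helps, here is a minimal sketch of the kind of program I mean
(the real source is in the attachment from my earlier mail; the spawn
count and the requested thread level here are illustrative, not
necessarily the exact values):

    /* Minimal sketch reconstructed from this thread -- not the exact
     * attached source. Parent calls MPI_Init_thread and then spawns
     * children in a loop; each child just reports and exits. */
    #include <mpi.h>
    #include <stdio.h>

    #define NSPAWNS 2  /* illustrative count */

    int main (int argc, char **argv)
    {
        MPI_Comm parent, intercomm[NSPAWNS];
        int i, provided;

        /* Requesting a thread level here is what triggers the abort above;
         * MPI_THREAD_MULTIPLE is an assumed value. */
        MPI_Init_thread (&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

        MPI_Comm_get_parent (&parent);
        if (parent == MPI_COMM_NULL) {
            printf ("spawning ...\n");
            for (i = 0; i < NSPAWNS; i++)
                MPI_Comm_spawn ("./spawn1", MPI_ARGV_NULL, 1, MPI_INFO_NULL,
                                0, MPI_COMM_SELF, &intercomm[i],
                                MPI_ERRCODES_IGNORE);
        } else {
            printf ("child!\n");
        }

        MPI_Finalize ();
        return 0;
    }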
thanks for checking,
Joao.

On Mon, Mar 31, 2008 at 11:49 AM, Joao Vicente Lima
<joao.lima.m...@gmail.com> wrote:
> Really, MPI_Finalize is crashing, and calling MPI_Comm_{free,disconnect} works!
> I don't know if the free/disconnect must appear before MPI_Finalize
> in this case (spawned processes) ... any suggestions?
>
> I use loops of spawns:
> - first, for testing :)
> - and second, because certain MPI applications don't know in advance
>   the number of children needed to complete their work.
>
> The spawn works great ... I will make other tests.
>
> thanks,
> Joao
>
> On Mon, Mar 31, 2008 at 3:03 AM, Matt Hughes
> <matt.c.hughes+o...@gmail.com> wrote:
> > On 30/03/2008, Joao Vicente Lima <joao.lima.m...@gmail.com> wrote:
> > > Hi,
> > > sorry to bring this up again ... but I hope to use spawn in ompi someday :-D
> >
> > I believe it's crashing in MPI_Finalize because you have not closed
> > all communication paths between the parent and the child processes.
> > For the parent process, try calling MPI_Comm_free or
> > MPI_Comm_disconnect on each intercomm in your intercomm array before
> > calling finalize. On the child, call free or disconnect on the parent
> > intercomm before calling finalize.
> >
> > Out of curiosity, why a loop of spawns? Why not increase the value of
> > the maxprocs argument? Or, if you need to spawn different executables
> > or use different arguments for each instance, why not use
> > MPI_Comm_spawn_multiple?
> >
> > mch
> >
> > > The execution of spawn in this way works fine:
> > >
> > >   MPI_Comm_spawn ("./spawn1", MPI_ARGV_NULL, 2, MPI_INFO_NULL, 0,
> > >       MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);
> > >
> > > but if this code goes into a for loop I get a problem:
> > >
> > >   for (i = 0; i < 2; i++)
> > >   {
> > >     MPI_Comm_spawn ("./spawn1", MPI_ARGV_NULL, 1,
> > >         MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm[i], MPI_ERRCODES_IGNORE);
> > >   }
> > >
> > > and the error is:
> > > spawning ...
> > > child!
> > > child!
> > > [localhost:03892] *** Process received signal ***
> > > [localhost:03892] Signal: Segmentation fault (11)
> > > [localhost:03892] Signal code: Address not mapped (1)
> > > [localhost:03892] Failing at address: 0xc8
> > > [localhost:03892] [ 0] /lib/libpthread.so.0 [0x2ac71ca8bed0]
> > > [localhost:03892] [ 1] /usr/local/mpi/ompi-svn/lib/libmpi.so.0(ompi_dpm_base_dyn_finalize+0xa3) [0x2ac71ba7448c]
> > > [localhost:03892] [ 2] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2ac71b9decdf]
> > > [localhost:03892] [ 3] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2ac71ba04765]
> > > [localhost:03892] [ 4] /usr/local/mpi/ompi-svn/lib/libmpi.so.0(PMPI_Finalize+0x71) [0x2ac71ba365c9]
> > > [localhost:03892] [ 5] ./spawn1(main+0xaa) [0x400ac2]
> > > [localhost:03892] [ 6] /lib/libc.so.6(__libc_start_main+0xf4) [0x2ac71ccb7b74]
> > > [localhost:03892] [ 7] ./spawn1 [0x400989]
> > > [localhost:03892] *** End of error message ***
> > > --------------------------------------------------------------------------
> > > mpirun noticed that process rank 0 with PID 3892 on node localhost
> > > exited on signal 11 (Segmentation fault).
> > > --------------------------------------------------------------------------
> > >
> > > the attachments contain the ompi_info, config.log and program.
> > >
> > > thanks for checking,
> > >
> > > Joao.
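For anyone hitting this later, here is a sketch of the cleanup Matt
describes above, applied to the loop-of-spawns case. It assumes the
parent keeps one intercomm per spawn in an array; NSPAWNS and the
executable name "./spawn1" are illustrative:

    /* Sketch with the suggested fix applied: every intercomm is
     * disconnected before MPI_Finalize, on both parent and child sides. */
    #include <mpi.h>
    #include <stdio.h>

    #define NSPAWNS 2

    int main (int argc, char **argv)
    {
        MPI_Comm parent, intercomm[NSPAWNS];
        int i;

        MPI_Init (&argc, &argv);
        MPI_Comm_get_parent (&parent);

        if (parent == MPI_COMM_NULL) {
            /* Parent: spawn the children, then close each communication
             * path before finalize. */
            for (i = 0; i < NSPAWNS; i++)
                MPI_Comm_spawn ("./spawn1", MPI_ARGV_NULL, 1, MPI_INFO_NULL,
                                0, MPI_COMM_SELF, &intercomm[i],
                                MPI_ERRCODES_IGNORE);
            for (i = 0; i < NSPAWNS; i++)
                MPI_Comm_disconnect (&intercomm[i]);  /* or MPI_Comm_free */
        } else {
            /* Child: release the intercomm to the parent before finalize. */
            printf ("child!\n");
            MPI_Comm_disconnect (&parent);
        }

        MPI_Finalize ();
        return 0;
    }

MPI_Comm_disconnect waits for pending communication to complete before
tearing the connection down, whereas MPI_Comm_free only marks the
communicator for deallocation; either way, no intercomm is left open when
MPI_Finalize runs.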
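And for completeness, a sketch of the MPI_Comm_spawn_multiple alternative
Matt mentions: one call launches several commands (or the same command
with different arguments) and yields a single intercomm covering all the
children. The commands and counts here are illustrative:

    /* Sketch of MPI_Comm_spawn_multiple as an alternative to a loop of
     * spawns: one call, one intercomm for all children. */
    #include <mpi.h>

    int main (int argc, char **argv)
    {
        MPI_Comm intercomm;
        char *commands[2] = { "./spawn1", "./spawn1" };
        int maxprocs[2]   = { 1, 1 };
        MPI_Info infos[2] = { MPI_INFO_NULL, MPI_INFO_NULL };

        MPI_Init (&argc, &argv);

        /* Note: the "no arguments" constant for spawn_multiple is
         * MPI_ARGVS_NULL, not MPI_ARGV_NULL. */
        MPI_Comm_spawn_multiple (2, commands, MPI_ARGVS_NULL, maxprocs,
                                 infos, 0, MPI_COMM_SELF, &intercomm,
                                 MPI_ERRCODES_IGNORE);

        /* Single disconnect releases the one intercomm to all children. */
        MPI_Comm_disconnect (&intercomm);
        MPI_Finalize ();
        return 0;
    }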