You need to disconnect the parent and child from each other (MPI_Comm_disconnect on the intercommunicator, called on both sides) prior to finalizing; see the attached example.
simple_spawn.c
On Aug 31, 2014, at 9:44 PM, Roy <open...@jsp.selfip.org> wrote:

> Hi all,
>
> I'm using MPI_Comm_spawn to start a new child process.
> I found that if the parent process executes MPI_Finalize before the child
> process does, the child process dumps core in MPI_Finalize.
>
> I couldn't find the correct way to have a clean shutdown of all processes
> (parent and child).
> What I found is that a sleep(2) in the parent process, just before
> calling MPI_Finalize, gives the child process the CPU cycles to finish
> its own MPI_Finalize, and only then is there no core dump.
>
> Although this resolves the issue, I can't accept it as a solution.
>
> I guess I'm doing something wrong (implementation or design), which is
> why I'm sending this email to the group (and yes, I did check the FAQ
> and searched the distribution list archive).
>
> Here is the entire code to reproduce the issue:
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> #include <stdio.h>
> #include <string.h>
> #include <unistd.h>
> #include <mpi.h>
> #include <stdlib.h>
>
> int main(int argc, char* argv[]){
>     int my_rank;            /* rank of process */
>     int p;                  /* number of processes */
>     int source;             /* rank of sender */
>     int dest;               /* rank of receiver */
>     int tag=0;              /* tag for messages */
>     char message[100];      /* storage for message */
>     MPI_Status status ;     /* return status for receive */
>
>     /* start up MPI */
>
>     MPI_Init(&argc, &argv);
>
>     /* find out process rank */
>     MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
>     fprintf(stderr,"My rank is : %d\n",my_rank);
>     /* find out number of processes */
>     MPI_Comm_size(MPI_COMM_WORLD, &p);
>
>     MPI_Comm parent;
>     MPI_Comm_get_parent(&parent);
>
>     if ( parent != MPI_COMM_NULL){
>         /* create message */
>         dest = 0;
>         /* use strlen+1 so that '\0' get transmitted */
>
>         MPI_Recv(message, 100, MPI_CHAR, 0, tag,parent, &status);
>         fprintf(stderr,"Got [%s] from root\n",message);
>         /* shut down MPI */
>         MPI_Finalize();
>
>     }
>     else{
>         printf("Hello MPI World From process 0: Num processes: %d\n",p);
>         MPI_Comm everyone;
>         MPI_Comm_spawn("mpitest",MPI_ARGV_NULL,1,MPI_INFO_NULL,0,
>                        MPI_COMM_SELF,&everyone,
>                        MPI_ERRCODES_IGNORE);
>         /* find out number of processes */
>         MPI_Comm_size(everyone, &p);
>         fprintf(stderr,"New world size:%d\n",p);
>         for (source = 0; source < p; source++) {
>             sprintf(message, "Hello MPI World from root to process %d!", source);
>             MPI_Send(message, strlen(message)+1, MPI_CHAR,source,
>                      tag, everyone);
>         }
>
>         /**
>          * Why does this sleep resolve my core dump issues?
>          * Is there a timing dependency between the parent and the child process?
>          * Based on the documentation and the FAQ, I could not find an answer for this issue.
>          *
>          * If you comment out the sleep(2), the child process will crash in MPI_Finalize with
>          * signal 11, Segmentation fault.
>          */
>         //sleep(2); //un-comment this line to have the sleep, and avoid the core dumps.
>
>         /* shut down MPI */
>         MPI_Finalize();
>
>     }
>     return 0;
> }
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> Anyone to the rescue?
>
>
> Thank you,
> Roy Avidor
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Searchable archives:
> http://www.open-mpi.org/community/lists/users/2014/09/index.php