On May 28, 2014, at 4:45 AM, Gilles Gouaillardet
<[email protected]> wrote:
> Jeff,
>
> On Wed, May 28, 2014 at 8:31 PM, Jeff Squyres (jsquyres)
> > To be totally clear: MPI says it is erroneous for only some (not all)
> > processes in a communicator to call MPI_COMM_FREE. So if that's the real
> > problem, then the discussion about why the parent(s) is(are) trying to
> > contact the children is moot -- the test is erroneous, and erroneous
> > application behavior is undefined.
>
> This is definitely what happens: only some tasks call MPI_Comm_free()
Really? I don't see how that can happen in loop_spawn - every process is
clearly calling comm_free. Or are you referring to the intercomm_create test?
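
For reference, the rule quoted above can be modeled in a few lines of plain C. This is only a sketch and does not use MPI: the names classify_free_calls and called[] are hypothetical, and the array simply records which ranks of one communicator reached MPI_Comm_free. Only the FREE_PARTIAL case corresponds to the erroneous "some but not all" situation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of who called MPI_Comm_free on one communicator. */
enum free_calls {
    FREE_NONE,     /* no rank called it: nothing freed, but not the error discussed here */
    FREE_ALL,      /* every rank called it: the valid collective case */
    FREE_PARTIAL   /* only some ranks called it: erroneous, behavior undefined */
};

/* called[i] is nonzero if rank i reached MPI_Comm_free (illustrative input only). */
enum free_calls classify_free_calls(const int called[], size_t nprocs)
{
    size_t count = 0;

    assert(called != NULL || nprocs == 0);
    for (size_t i = 0; i < nprocs; ++i) {
        if (called[i]) {
            ++count;
        }
    }
    if (count == 0) {
        return FREE_NONE;
    }
    return (count == nprocs) ? FREE_ALL : FREE_PARTIAL;
}
```

If the ranks are the parent and one child, {1, 1} is the loop_spawn pattern as I read it, and {0, 1} is the pattern being described for the failing case.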
> I will commit my changes, and the initially reported issue is solved :-)
>
>
>
> about the "bonus points" :
>
> v1.8 does not have this issue
>
> I dug into it and, bottom line, the parent (which, unlike the children,
> did not call MPI_Comm_free)
I see the parent doing it in every loop:
    MPI_Init(&argc, &argv);
    for (iter = 0; iter < 1000; ++iter) {
        MPI_Comm_spawn(EXE_TEST, NULL, 1, MPI_INFO_NULL,
                       0, MPI_COMM_WORLD, &comm, &err);
        printf("parent: MPI_Comm_spawn #%d return : %d\n", iter, err);
        MPI_Intercomm_merge(comm, 0, &merged);
        MPI_Comm_rank(merged, &rank);
        MPI_Comm_size(merged, &size);
        printf("parent: MPI_Comm_spawn #%d rank %d, size %d\n",
               iter, rank, size);
        MPI_Comm_free(&merged);
    }
    MPI_Finalize();
I suspect that you are talking about intercomm_create, hence my confusion.
> calls ompi_dpm_base_dyn_finalize, which tries to isend to the already-exited
> tasks.
>
>
> Bottom line, in pml_ob1_sendreq.h line 450:
>
> with v1.8
> mca_bml_base_btl_array_get_size(&endpoint->btl_eager) = 0
> nothing is sent, but the isend is reported as successful
>
> with trunk
> mca_bml_base_btl_array_get_size(&endpoint->btl_eager) = 1
> and then it tries to send the message => BOOM
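
To make the difference concrete, here is a minimal plain-C model of the two behaviors described above. It is a sketch, not Open MPI code: model_endpoint, model_isend, and the peer_alive flag are hypothetical names invented for illustration. The only thing taken from the report is that an empty eager BTL array means nothing is actually sent while success is still reported, whereas a non-empty one leads to a real send attempt.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an endpoint; NOT the real Open MPI structure. */
struct model_endpoint {
    size_t eager_btl_count; /* models mca_bml_base_btl_array_get_size(&endpoint->btl_eager) */
    int    peer_alive;      /* assumption: nonzero while the peer process still exists */
};

enum model_result {
    MODEL_SENT,     /* message handed to a live peer */
    MODEL_SKIPPED,  /* no eager BTL: nothing sent, success reported anyway */
    MODEL_FAILED    /* send attempted to a peer that has already exited */
};

/* Models what happens when the parent tries to isend during finalize. */
enum model_result model_isend(const struct model_endpoint *ep)
{
    assert(ep != NULL);
    if (ep->eager_btl_count == 0) {
        return MODEL_SKIPPED;   /* the behavior reported for v1.8 */
    }
    /* With at least one eager BTL the send is really attempted. */
    return ep->peer_alive ? MODEL_SENT : MODEL_FAILED;
}
```

In this model the v1.8 case is { 0, 0 } and the trunk case is { 1, 0 }: same dead peer, but only the second one reaches the send path. Whether the skipped send in v1.8 is intended or just masks the problem is not something the model can answer.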
>
> I found various things that seem counterintuitive to me and will summarize
> all this tomorrow.
>
> Cheers,
>
> Gilles