FWIW: I have tracked this problem down. The fix is a little more complicated 
than I'd like, so I'm going to have to ping some other folks to ensure we 
concur on the approach before doing something.

On Oct 25, 2011, at 8:20 AM, Ralph Castain wrote:

> I still see it failing the test George provided on the trunk. I'm unaware of 
> anyone looking further into it, though, as the prior discussion seemed to 
> just end.
> 
> On Oct 25, 2011, at 7:01 AM, orel wrote:
> 
>> Dear all,
>> 
>> I have been trying for several days to use advanced MPI-2 features in the
>> following scenario:
>> 
>> 1) a master code A (of size NPA) spawns (MPI_Comm_spawn()) two slave
>>    codes B (of size NPB) and C (of size NPC), providing intercomms A-B and
>>    A-C;
>> 2) I create intracomms AB and AC by merging the intercomms;
>> 3) then I create intercomm AB-C by calling MPI_Intercomm_create(), using
>>    AC as the bridge...
>> 
>>   MPI_Comm intercommABC;
>>   A: MPI_Intercomm_create(intracommAB, 0, intracommAC, NPA, TAG, &intercommABC);
>>   B: MPI_Intercomm_create(intracommAB, 0, MPI_COMM_NULL, 0, TAG, &intercommABC);
>>   C: MPI_Intercomm_create(intracommC, 0, intracommAC, 0, TAG, &intercommABC);
>> 
>>   In these calls, A0 and C0 act as the local leaders for AB and C
>>   respectively, and C0 and A0 as the corresponding remote leaders in the
>>   bridge intracomm AC.
>> 
>> 4)  MPI_Barrier(intercommABC);
>> 5)  I merge intercomm AB-C into intracomm ABC;
>> 6)  MPI_Barrier(intracommABC);
>> 
>> My bug: these calls succeed, but when I try to use intracommABC for a
>> collective communication like MPI_Barrier(), I get the following error:
>> 
>> *** An error occurred in MPI_Barrier
>> *** on communicator
>> *** MPI_ERR_INTERN: internal error
>> *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
>> 
>> 
>> I tried with Open MPI trunk, 1.5.3, 1.5.4 and MPICH2-1.4.1p1.
>> 
>> My code works perfectly if intracomms A, B and C are obtained by
>> MPI_Comm_split() instead of MPI_Comm_spawn()!
>> 
>> 
>> I found the same problem in a previous thread on the OMPI Users mailing list:
>> 
>> => http://www.open-mpi.org/community/lists/users/2011/06/16711.php
>> 
>> Is this bug/problem currently under investigation? :-)
>> 
>> I can provide detailed code, but the code posted by George Bosilca in that
>> previous thread produces the same error...
>> 
>> Thank you for your help...
>> 
>> -- 
>> Aurélien Esnard
>> University Bordeaux 1 / LaBRI / INRIA (France)
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
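
A note on the leader arguments in the quoted MPI_Intercomm_create() calls: the
remote-leader ranks NPA (in A's call) and 0 (in C's call) follow from how ranks
are laid out in the merged bridge intracomm AC. The sketch below is plain C with
no MPI calls. It assumes that A was the "low" group in MPI_Intercomm_merge() and
that each group keeps its internal rank order after the merge, which is what the
quoted calls appear to presume. The helper name merged_rank is hypothetical and
is not part of any MPI library.

```c
#include <assert.h>

/* Hypothetical helper (not an MPI function): rank of a process inside the
 * intracomm produced by merging an intercomm, ASSUMING the low group is
 * ordered first and each group keeps its internal rank order.
 *
 *   local_rank      rank of the process within its own group
 *   in_high_group   nonzero if the process passed high = true to the merge
 *   low_group_size  number of processes in the low group
 */
int merged_rank(int local_rank, int in_high_group, int low_group_size)
{
    assert(local_rank >= 0);
    assert(low_group_size >= 0);

    /* Low-group ranks are unchanged; high-group ranks are shifted past them. */
    return in_high_group ? low_group_size + local_rank : local_rank;
}
```

Under those assumptions, merged_rank(0, 0, NPA) gives 0 for A0 and
merged_rank(0, 1, NPA) gives NPA for C0, matching the remote-leader arguments
used by C and A respectively.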

