On Jun 7, 2011, at 11:00 , Edgar Gabriel wrote:

> George,
> 
> I did not look over all the details of your test, but it looks to me
> like you are violating one of the requirements of intercomm_create,
> namely the requirement that the two groups be disjoint. In your case
> the parent process(es) are part of both local intra-communicators, aren't they?

The two groups of the two local communicators are disjoint. One contains A and B,
while the other contains only C. The bridge communicator contains A and C.
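
The requirement can be sketched in plain C, without any MPI calls. The function names and the use of single characters as process labels are illustrative assumptions, not part of any MPI API; the sketch only checks the preconditions discussed in this thread (disjoint groups, each leader in its own group, both leaders in the bridge communicator).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not an MPI call): true if label p is in group g. */
bool is_member(const char *g, size_t n, char p)
{
    assert(g != NULL);
    for (size_t i = 0; i < n; i++) {
        if (g[i] == p) {
            return true;
        }
    }
    return false;
}

/* True if the two groups have no process in common. */
bool groups_disjoint(const char *g1, size_t n1, const char *g2, size_t n2)
{
    for (size_t i = 0; i < n1; i++) {
        if (is_member(g2, n2, g1[i])) {
            return false;
        }
    }
    return true;
}

/* Illustrative precondition check: the groups are disjoint, each leader
 * belongs to its own group, and both leaders belong to the bridge. */
bool intercomm_inputs_ok(const char *local, size_t nl, char local_leader,
                         const char *remote, size_t nr, char remote_leader,
                         const char *bridge, size_t nb)
{
    return groups_disjoint(local, nl, remote, nr)
        && is_member(local, nl, local_leader)
        && is_member(remote, nr, remote_leader)
        && is_member(bridge, nb, local_leader)
        && is_member(bridge, nb, remote_leader);
}
```

With local = {A, B}, remote = {C}, bridge = {A, C} and leaders A and C, intercomm_inputs_ok returns true. With the groups {T0, T1, T2} and {T0, T3, T4} from the original question, groups_disjoint returns false because T0 is in both.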

I'm confident my example is supposed to work. At least for Open MPI the error 
is under the hood, as the resulting inter-communicator is valid but contains 
NULL endpoints for the remote process.

Regarding whether the two leaders should be separate processes, you will 
not find any wording about this in the current version of the standard. In 
MPI 1.1 there were two contradictory sentences on the subject: one stating that 
the two groups must be disjoint, the other claiming that the two leaders can be 
the same process. After discussion, the agreement was that the two groups have 
to be disjoint, and the standard was amended to match the agreement.

  george.


> 
> I only have MPI-1.1 at hand right now, but here is what it says:
> ----
> 
> Overlap of local and remote groups that are bound into an
> inter-communicator is prohibited. If there is overlap, then the program
> is erroneous and is likely to deadlock.
> 
> ----
> So the bottom line is that the two local intra-communicators being
> used have to be disjoint, and the bridgecomm needs to be a communicator
> in which at least one process from each of the two disjoint groups is
> able to talk to the other. Interestingly, I did not find a sentence
> saying whether it is allowed to be the same process, or whether the two
> local leaders need to be separate processes...
> 
> 
> Thanks
> Edgar
> 
> 
> On 6/7/2011 12:57 AM, George Bosilca wrote:
>> Frederic,
>> 
>> Attached you will find an example that is supposed to work. The main 
>> difference from your code is on T3 and T4, where you have swapped the local 
>> and remote comm. As depicted in the picture attached below, during the third 
>> step you create the intercomm between ab and c (no overlap) using ac as the 
>> bridge communicator (here the two roots, a and c, can exchange messages).
>> 
>> Based on the MPI 2.2 standard, especially the paragraph quoted in the PS, the 
>> attached code should work. Unfortunately, I could not run it 
>> successfully with either Open MPI trunk or MPICH2 1.4rc1.
>> 
>> george.
>> 
>> PS: Here is what the MPI standard states about MPI_Intercomm_create:
>>> The function MPI_INTERCOMM_CREATE can be used to create an 
>>> inter-communicator from two existing intra-communicators, in the following 
>>> situation: At least one selected member from each group (the “group 
>>> leader”) has the ability to communicate with the selected member from the 
>>> other group; that is, a “peer” communicator exists to which both leaders 
>>> belong, and each leader knows the rank of the other leader in this peer 
>>> communicator. Furthermore, members of each group know the rank of their 
>>> leader.
>> 
>> On Jun 1, 2011, at 05:00 , Frédéric Feyel wrote:
>> 
>>> Hello,
>>> 
>>> I have a problem using MPI_Intercomm_create.
>>> 
>>> I have 5 tasks, let's say T0, T1, T2, T3, T4, resulting from two spawn
>>> operations by T0.
>>> 
>>> So I have two intra-communicators:
>>> 
>>> intra0 contains: T0, T1, T2
>>> intra1 contains: T0, T3, T4
>>> 
>>> My goal is to make a collective loop to build a single intra-communicator
>>> containing T0, T1, T2, T3, T4.
>>> 
>>> I tried to do it using MPI_Intercomm_create and MPI_Intercomm_merge calls,
>>> but without success (I always get MPI internal errors).
>>> 
>>> What I am doing :
>>> 
>>> on T0 :
>>> *******
>>> 
>>> MPI_Intercomm_create(intra0,0,intra1,0,1,&new_com)
>>> 
>>> on T1 and T2 :
>>> **************
>>> 
>>> MPI_Intercomm_create(intra0,0,MPI_COMM_WORLD,0,1,&new_com)
>>> 
>>> on T3 and T4 :
>>> **************
>>> 
>>> MPI_Intercomm_create(intra1,0,MPI_COMM_WORLD,0,1,&new_com)
>>> 
>>> 
>>> I'm certainly missing something. Could anybody help me solve this
>>> problem?
>>> 
>>> Best regards,
>>> 
>>> Frédéric.
>>> 
>>> PS: of course I did an extensive web search without finding anything
>>> useful for my problem.
>>> 
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> -- 
> Edgar Gabriel
> Assistant Professor
> Parallel Software Technologies Lab      http://pstl.cs.uh.edu
> Department of Computer Science          University of Houston
> Philip G. Hoffman Hall, Room 524        Houston, TX-77204, USA
> Tel: +1 (713) 743-3857                  Fax: +1 (713) 743-3335