CC stands for any Collective Communication operation. Every CC occurs on 
some communicator.

Every CC is issued (that is, the calling thread enters the call) at some 
point in time. If two threads issue CC calls on the same communicator, the 
issue order can become ambiguous, so making CC calls from different 
threads on the same communicator is generally unsafe. There is debate 
about whether it can be made safe by forcing some kind of thread 
serialization, but since the MPI standard does not discuss thread 
serialization, the best advice is to use a different communicator for 
each thread and make sure you control the issue order.
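One common way to follow that advice is to duplicate the communicator once 
per thread before the threads start issuing collectives. A minimal sketch, 
assuming MPI_THREAD_MULTIPLE is available (the thread count and the 
omitted per-thread work are placeholders; error handling is left out):

#include <mpi.h>

#define NTHREADS 4  /* placeholder thread count */

int main(int argc, char **argv)
{
    int provided;
    MPI_Comm thread_comm[NTHREADS];

    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

    /* MPI_Comm_dup is itself collective, so issue the duplications from
     * one thread, in the same order on every rank. */
    for (int t = 0; t < NTHREADS; t++)
        MPI_Comm_dup(MPI_COMM_WORLD, &thread_comm[t]);

    /* ... thread t issues its collectives only on thread_comm[t], so no
     * two threads ever race on the same communicator ... */

    for (int t = 0; t < NTHREADS; t++)
        MPI_Comm_free(&thread_comm[t]);
    MPI_Finalize();
    return 0;
}

Because each thread owns its communicator outright, issue order within 
each communicator is trivially well defined.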

When CC calls appear in a fixed order in a block of code that has no 
branches, the issue order is easy to recognize. An example like the 
following can cause problems unless you are sure every process evaluates 
the condition the same way:

if (condition) {
  MPI_Ibcast(...);
  MPI_Ireduce(...);
} else {
  MPI_Ireduce(...);
  MPI_Ibcast(...);
}

If some ranks take the if branch and others take the else branch, there is 
an "issue order" problem. (I have no idea why someone would do this.)
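One safe rewrite of that branch is to issue both collectives in the same 
order on every rank and keep only the local work inside the branch. This 
is a hypothetical sketch, not code from the thread; the buffers, counts, 
and roots are placeholders:

#include <mpi.h>

void safe_pattern(MPI_Comm comm, int condition,
                  int *bbuf, double *sendv, double *recvv)
{
    MPI_Request reqs[2];

    /* Issue order is identical on every rank: Ibcast first, Ireduce
     * second, regardless of the condition. */
    MPI_Ibcast(bbuf, 1, MPI_INT, 0, comm, &reqs[0]);
    MPI_Ireduce(sendv, recvv, 1, MPI_DOUBLE, MPI_SUM, 0, comm, &reqs[1]);

    if (condition) {
        /* branch-specific local computation only */
    } else {
        /* other local computation only */
    }

    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
}

Only purely local work may depend on a condition that can differ across 
ranks; the collective calls themselves must not.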

              Dick 

Dick Treumann  -  MPI Team 
IBM Systems & Technology Group
Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846         Fax (845) 433-8363




From:
Gabriele Fatigati <g.fatig...@cineca.it>
To:
Open MPI Users <us...@open-mpi.org>
Date:
09/23/2010 01:02 PM
Subject:
Re: [OMPI users] Question about Asynchronous collectives
Sent by:
users-boun...@open-mpi.org



Sorry Richard,

what does "CC issue order on the communicator" mean? In particular, what 
does "CC" stand for?

2010/9/23 Richard Treumann <treum...@us.ibm.com>

request_1 and request_2 are just local variable names. 

The only thing that determines matching order is CC issue order on the 
communicator.  At each process, some CC is issued first and some CC is 
issued second.  The first issued CC at each process will try to match the 
first issued CC at the other processes.  By this rule, 
rank 0: 
MPI_Ibcast; MPI_Ibcast 
rank 1: 
MPI_Ibcast; MPI_Ibcast 
is well defined, and 

rank 0: 
MPI_Ibcast; MPI_Ireduce 
rank 1: 
MPI_Ireduce; MPI_Ibcast 
is incorrect. 
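Spelled out with actual arguments (the buffers and root are placeholders, 
not from the thread), the well-defined case looks like this on every rank:

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Request r1, r2;
    int a = 0, b = 0;

    MPI_Init(&argc, &argv);

    /* Every rank issues the two broadcasts in the same order, so the
     * first-issued Ibcast matches the first-issued Ibcast on every other
     * rank. The request variable names play no role in matching. */
    MPI_Ibcast(&a, 1, MPI_INT, 0, MPI_COMM_WORLD, &r1);  /* issued first */
    MPI_Ibcast(&b, 1, MPI_INT, 0, MPI_COMM_WORLD, &r2);  /* issued second */

    MPI_Wait(&r1, MPI_STATUS_IGNORE);
    MPI_Wait(&r2, MPI_STATUS_IGNORE);

    MPI_Finalize();
    return 0;
}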

I do not agree with Jeff on this below.   The Proc 1 case where the 
MPI_Waits are reversed simply requires the MPI implementation to make 
progress on both MPI_Ibcast operations in the first MPI_Wait. The second 
MPI_Wait call will simply find that the first MPI_Ibcast is already done. 
 The second MPI_Wait call becomes, effectively, a query function. 

proc 0:
MPI_IBcast(MPI_COMM_WORLD, request_1) // first Bcast
MPI_IBcast(MPI_COMM_WORLD, request_2) // second Bcast
MPI_Wait(&request_1, ...);
MPI_Wait(&request_2, ...);

proc 1:
MPI_IBcast(MPI_COMM_WORLD, request_2) // first Bcast
MPI_IBcast(MPI_COMM_WORLD, request_1) // second Bcast
MPI_Wait(&request_1, ...);
MPI_Wait(&request_2, ...);

That may/will deadlock. 








From: 
Jeff Squyres <jsquy...@cisco.com> 
To: 
Open MPI Users <us...@open-mpi.org> 
Date: 
09/23/2010 10:13 AM 
Subject: 
Re: [OMPI users] Question about Asynchronous collectives 
Sent by: 
users-boun...@open-mpi.org





On Sep 23, 2010, at 10:00 AM, Gabriele Fatigati wrote:

> to be sure, if i have one processor who does:
> 
> MPI_IBcast(MPI_COMM_WORLD, request_1) // first Bcast
> MPI_IBcast(MPI_COMM_WORLD, request_2) // second Bcast
> 
> it means that i can't have another process who does the follow:
> 
> MPI_IBcast(MPI_COMM_WORLD, request_2) // first Bcast for another process
> MPI_IBcast(MPI_COMM_WORLD, request_1) // second Bcast for another process
> 
> Because the first Bcast of the second process matches the first Bcast of
> the first process, and it's wrong.

If you did a "waitall" on both requests, it would probably work because 
MPI would just "figure it out".  But if you did something like:

proc 0:
MPI_IBcast(MPI_COMM_WORLD, request_1) // first Bcast
MPI_IBcast(MPI_COMM_WORLD, request_2) // second Bcast
MPI_Wait(&request_1, ...);
MPI_Wait(&request_2, ...);

proc 1:
MPI_IBcast(MPI_COMM_WORLD, request_2) // first Bcast
MPI_IBcast(MPI_COMM_WORLD, request_1) // second Bcast
MPI_Wait(&request_1, ...);
MPI_Wait(&request_2, ...);

That may/will deadlock.
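The "waitall" variant mentioned above would look something like the 
following sketch (buffers and root are placeholders). Waiting on both 
requests at once lets the implementation progress both broadcasts 
together, even if the two ranks stored their requests in opposite order:

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Request reqs[2];
    int a = 0, b = 0;

    MPI_Init(&argc, &argv);

    /* Matching is still by issue order on the communicator; the array
     * slot each request lands in does not matter to MPI. */
    MPI_Ibcast(&a, 1, MPI_INT, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Ibcast(&b, 1, MPI_INT, 0, MPI_COMM_WORLD, &reqs[1]);

    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);

    MPI_Finalize();
    return 0;
}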

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/


_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users






-- 
Ing. Gabriele Fatigati

Parallel programmer

CINECA Systems & Technologies Department

Supercomputing Group

Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy

www.cineca.it                    Tel:   +39 051 6171722

g.fatigati [AT] cineca.it           
