Jeff Squyres wrote:
On May 13, 2011, at 8:31 AM, francoise.r...@obs.ujf-grenoble.fr wrote:
Here is the portion of the MUMPS code (in the zmumps_part1.F file) where the
slaves call MPI_COMM_DUP; id%PAR and MASTER are both initialized to 0 beforehand:
CALL MPI_COMM_SIZE(id%COMM, id%NPROCS, IERR )
I re-indented so that I could read it better:
      CALL MPI_COMM_SIZE(id%COMM, id%NPROCS, IERR )
      IF ( id%PAR .eq. 0 ) THEN
         IF ( id%MYID .eq. MASTER ) THEN
            color = MPI_UNDEFINED
         ELSE
            color = 0
         END IF
         CALL MPI_COMM_SPLIT( id%COMM, color, 0,
     &        id%COMM_NODES, IERR )
         id%NSLAVES = id%NPROCS - 1
      ELSE
         CALL MPI_COMM_DUP( id%COMM, id%COMM_NODES, IERR )
         id%NSLAVES = id%NPROCS
      END IF
      IF (id%PAR .ne. 0 .or. id%MYID .NE. MASTER) THEN
         CALL MPI_COMM_DUP( id%COMM_NODES, id%COMM_LOAD, IERR )
      ENDIF
That doesn't look right -- both MPI_COMM_SPLIT and MPI_COMM_DUP are collective,
meaning that all processes in the communicator must call them. In the first
case, only some processes are calling MPI_COMM_SPLIT. Is there some other
logic that forces the rest of the processes to call MPI_COMM_SPLIT, too?
Actually, let us look at the first case, that is, id%PAR = 0. There, the
MPI_COMM_SPLIT routine is called by all the processes and creates a new
communicator named "id%COMM_NODES". This communicator contains all the
slaves, but not the master. The first MPI_COMM_DUP is not executed; the
second one is executed on all the slave nodes (id%MYID .NE. MASTER),
because it is called on the communicator "id%COMM_NODES" and so involves
all the processes of this communicator.
So it seems correct to me, but perhaps I am making a mistake, because the
MPI_COMM_DUP call freezes.
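
To make the participation argument above easy to check, here is a minimal sketch in plain Python (not MPI code, and not part of MUMPS). It replays the branch logic of the quoted Fortran for every rank and records which ranks would reach each collective call. The function name trace_collectives, the dictionary keys, and the use of None as a stand-in for MPI_UNDEFINED are all assumptions made for illustration.

```python
# Plain-Python model of the quoted branch logic; no MPI library is used.
MPI_UNDEFINED = None  # illustrative stand-in for MPI's MPI_UNDEFINED color


def trace_collectives(nprocs, par, master=0):
    """Return (callers, comm_nodes) for a communicator of `nprocs` ranks.

    callers maps each collective call to the list of ranks that reach it;
    comm_nodes lists the ranks that end up as members of id%COMM_NODES.
    """
    callers = {
        "MPI_COMM_SPLIT(COMM)": [],
        "MPI_COMM_DUP(COMM)": [],
        "MPI_COMM_DUP(COMM_NODES)": [],
    }
    comm_nodes = []
    for myid in range(nprocs):
        if par == 0:
            # Every rank reaches the split; the master only opts out of
            # the resulting communicator by passing MPI_UNDEFINED.
            color = MPI_UNDEFINED if myid == master else 0
            callers["MPI_COMM_SPLIT(COMM)"].append(myid)
            if color is not MPI_UNDEFINED:
                comm_nodes.append(myid)
        else:
            callers["MPI_COMM_DUP(COMM)"].append(myid)
            comm_nodes.append(myid)
        if par != 0 or myid != master:
            callers["MPI_COMM_DUP(COMM_NODES)"].append(myid)
    return callers, comm_nodes


if __name__ == "__main__":
    callers, comm_nodes = trace_collectives(nprocs=4, par=0)
    for name, ranks in callers.items():
        print(name, "called by ranks", ranks)
    print("members of COMM_NODES:", comm_nodes)
```

In this model, with id%PAR = 0 the ranks calling the second MPI_COMM_DUP are exactly the members of id%COMM_NODES. The sketch only checks who participates in each call; it says nothing about why the real MPI_COMM_DUP hangs.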
Françoise