Re: [OMPI users] MPI_COMM_DUP freeze with OpenMPI 1.4.1

francoise.r...@obs.ujf-grenoble.fr Tue, 24 May 2011 04:44:43 -0400

Jeff Squyres wrote:

On May 13, 2011, at 8:31 AM, francoise.r...@obs.ujf-grenoble.fr wrote:

Here is the portion of the MUMPS code (in the zmumps_part1.F file) where the
slaves call MPI_COMM_DUP; id%PAR and MASTER are both initialized to 0 beforehand:

CALL MPI_COMM_SIZE(id%COMM, id%NPROCS, IERR )


I re-indented so that I could read it better:

      CALL MPI_COMM_SIZE(id%COMM, id%NPROCS, IERR )
      IF ( id%PAR .eq. 0 ) THEN
         IF ( id%MYID .eq. MASTER ) THEN
            color = MPI_UNDEFINED
         ELSE
            color = 0
         END IF
         CALL MPI_COMM_SPLIT( id%COMM, color, 0,
         & id%COMM_NODES, IERR )
         id%NSLAVES = id%NPROCS - 1
      ELSE
         CALL MPI_COMM_DUP( id%COMM, id%COMM_NODES, IERR )
         id%NSLAVES = id%NPROCS
      END IF

      IF (id%PAR .ne. 0 .or. id%MYID .NE. MASTER) THEN
         CALL MPI_COMM_DUP( id%COMM_NODES, id%COMM_LOAD, IERR )
      ENDIF

That doesn't look right -- both MPI_COMM_SPLIT and MPI_COMM_DUP are collective, 
meaning that all processes in the communicator must call them. In the first 
case, only some processes are calling MPI_COMM_SPLIT.  Is there some other 
logic that forces the rest of the processes to call MPI_COMM_SPLIT, too?

Actually, we are looking at the first case, that is, id%PAR = 0. But the
MPI_COMM_SPLIT routine is called by all the processes and creates a new
communicator named "id%COMM_NODES". This communicator contains all the
slaves, but not the master. The first MPI_COMM_DUP is not executed; the
second one is executed on all the slave nodes (id%MYID .NE. MASTER), because
its communicator is "id%COMM_NODES" and so it involves all the processes of
this communicator. So it seems correct to me, but perhaps I am making a
mistake, because the MPI_COMM_DUP freezes.
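
To make the argument above concrete, here is a small sketch in plain Python (no MPI library is used, so it only models the bookkeeping and not the real communication). It assumes 4 processes with rank 0 as MASTER; the helper name comm_split and the stand-in value for MPI_UNDEFINED are hypothetical and chosen only for illustration. It checks that, when id%PAR = 0, the ranks that reach the second MPI_COMM_DUP are exactly the members of id%COMM_NODES.

```python
# Plain-Python model of the id%PAR = 0 branch. This is NOT real MPI code.
MPI_UNDEFINED = -1   # stand-in value; the real constant is implementation-defined
MASTER = 0           # assumption: the master is rank 0, as in the post


def comm_split(ranks, color_of):
    """Model MPI_COMM_SPLIT with the same key on every rank.

    Returns a dict mapping each calling rank to the member list of its new
    communicator, or None (modelling MPI_COMM_NULL) for MPI_UNDEFINED.
    """
    groups = {}
    for r in ranks:
        c = color_of(r)
        if c != MPI_UNDEFINED:
            groups.setdefault(c, []).append(r)
    return {
        r: (groups[color_of(r)] if color_of(r) != MPI_UNDEFINED else None)
        for r in ranks
    }


nprocs = 4                       # assumption: a 4-process run
comm = list(range(nprocs))       # models id%COMM

# Every rank calls the split; only the master passes MPI_UNDEFINED.
comm_nodes = comm_split(comm, lambda r: MPI_UNDEFINED if r == MASTER else 0)

# Ranks that satisfy (id%PAR .ne. 0 .or. id%MYID .NE. MASTER) with id%PAR = 0.
dup_callers = [r for r in comm if r != MASTER]

# The collective requirement holds if the callers are exactly the members.
slaves = comm_nodes[dup_callers[0]]
all_members_call = sorted(dup_callers) == sorted(slaves)
```

In this model the master ends up with no communicator and every member of the slave communicator takes part in the duplicate call, so the calling pattern itself looks consistent with the collective rule.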


Françoise
