Thanks all for your answers. Yes, I understand that this is a non-contiguous memory access problem, since MPI_BCAST expects a pointer to a contiguous, valid memory zone. But I'm surprised that, with the mpi module, Fortran does not hide this discontinuity behind a contiguous temporary copy of the array. I've spent some time building OpenMPI with g++/gcc/ifort (to create the right mpi module) and ran some additional tests:

Default OpenMPI is openmpi-1.2.8-17.4.x86_64

# module load openmpi
# mpif90 ess.F90 && mpirun -np 4 ./a.out
           0           1           2           3           0           1
           2           3           0           1           2           3
           0           1           2           3
# module unload openmpi
The result is OK, but it sometimes hangs (when I request a lot of processes).

With OpenMPI 1.4.4 and gfortran from gcc-fortran-4.5-19.1.x86_64

# module load openmpi-1.4.4-gcc-gfortran
# mpif90 ess.F90 && mpirun -np 4 ./a.out
           0          -1          -1          -1           0          -1
          -1          -1           0          -1          -1          -1
           0          -1          -1          -1
# module unload openmpi-1.4.4-gcc-gfortran
Node 0 only updates the global array with its own subarray (I only print node 0's result).


With OpenMPI 1.4.4 and ifort 10.1.018 (yes, it's quite old; I have the latest one but it isn't installed!)

# module load openmpi-1.4.4-gcc-intel
# mpif90 ess.F90 && mpirun -np 4 ./a.out
ess.F90(15): (col. 5) remark: LOOP WAS VECTORIZED.
           0          -1          -1          -1           0          -1
          -1          -1           0          -1          -1          -1
           0          -1          -1          -1

# mpif90 -check arg_temp_created ess.F90 && mpirun -np 4 ./a.out
which gives a lot of messages like:
forrtl: warning (402): fort: (1): In call to MPI_BCAST1DI4, an array temporary was created for argument #1

So a temporary array is created for each call. Where, then, is the problem?
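For clarity, the copy-in/copy-out I expected the compiler (or the mpi module wrappers) to perform is equivalent to doing it by hand; a minimal sketch, replacing the broadcast loop of the test code below (tmp is just an illustrative name, not something from my real code):

    INTEGER, ALLOCATABLE :: tmp(:)

    ALLOCATE (tmp(4))
    DO i=0,nbcpus-1
       tmp(:) = tab(i,:)      ! copy-in: gather the strided row into a contiguous buffer
       CALL MPI_BCAST(tmp,4,MPI_INTEGER,i,MPI_COMM_WORLD,ierr)
       tab(i,:) = tmp(:)      ! copy-out: write the received values back into the row
    ENDDO
    DEALLOCATE (tmp)

Since MPI_BCAST is a blocking call, such a copy-in/copy-out should be sufficient, which is why I don't understand where the received data gets lost.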

Regarding the Fortran compiler: I use the same pattern (non-contiguous subarrays) in MPI_SENDRECV calls and everything works fine; I ran some intensive tests with 1 to 128 processes on my quad-core workstation. This Fortran approach was easier than creating user-defined datatypes.
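
For completeness, the user-defined datatype alternative I was referring to would look roughly like this for the broadcast loop (a sketch only, I have not run this exact code; rowtype is my own name, and the stride is nbcpus because of the first dimension of tab):

    INTEGER :: rowtype

    ! Describe one row tab(i,:): 4 blocks of 1 INTEGER, separated by a stride of nbcpus elements
    CALL MPI_TYPE_VECTOR(4,1,nbcpus,MPI_INTEGER,rowtype,ierr)
    CALL MPI_TYPE_COMMIT(rowtype,ierr)
    DO i=0,nbcpus-1
       CALL MPI_BCAST(tab(i,1),1,rowtype,i,MPI_COMM_WORLD,ierr)
    ENDDO
    CALL MPI_TYPE_FREE(rowtype,ierr)

Here the buffer argument is just the first element of the row, so no array temporary is needed at all.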

Can you reproduce this behavior with the test case? What are your OpenMPI and gfortran/ifort versions?

Thanks again

Patrick

The test code:

PROGRAM bide
   USE mpi
   IMPLICIT NONE
   INTEGER :: nbcpus
   INTEGER :: my_rank
   INTEGER :: ierr, i
   INTEGER, ALLOCATABLE :: tab(:,:)

   CALL MPI_INIT(ierr)
   CALL MPI_COMM_RANK(MPI_COMM_WORLD,my_rank,ierr)
   CALL MPI_COMM_SIZE(MPI_COMM_WORLD,nbcpus,ierr)

   ALLOCATE (tab(0:nbcpus-1,4))

   ! Each rank fills its own row, then every row is broadcast by its owner.
   tab(:,:)=-1
   tab(my_rank,:)=my_rank
   DO i=0,nbcpus-1
      ! tab(i,:) is a non-contiguous array section (elements strided by nbcpus)
      CALL MPI_BCAST(tab(i,:),4,MPI_INTEGER,i,MPI_COMM_WORLD,ierr)
   ENDDO
   IF (my_rank .EQ. 0) print*,tab
   CALL MPI_FINALIZE(ierr)

END PROGRAM bide

--
===============================================================
| Equipe M.O.S.T.        | http://most.hmg.inpg.fr            |
| Patrick BEGOU          |            ------------            |
| LEGI                   | mailto:patrick.be...@hmg.inpg.fr   |
| BP 53 X                | Tel 04 76 82 51 35                 |
| 38041 GRENOBLE CEDEX   | Fax 04 76 82 52 71                 |
===============================================================
