I'm running into problems trying to spawn MPI processes across multiple nodes on a cluster using recent versions of OpenMPI. Specifically, the attached Fortran code, compiled using OpenMPI 3.1.2 with:

mpif90 test.F90 -o test.exe

and run via a PBS scheduler using the attached test1.pbs, fails, as can be seen in the attached testFAIL.err file.

If I do the same using OpenMPI v1.10.3, it works, giving me the output in the attached testSUCCESS.err file. From testing a few different versions of OpenMPI, it seems that the behavior changed between v1.10.7 and v2.0.4.

Is there some change in options needed to make this work with newer versions of OpenMPI?

Output from ompi_info --all is attached. config.log can be found here: http://users.obs.carnegiescience.edu/abenson/config.log.bz2

Thanks for any help you can offer!

-Andrew
ompi_info.log.bz2
Description: application/bzip
program test
  use MPI
  implicit none
  integer :: status             , spawnStatus(16)     , &
       &     childCommunicator  , rank                , &
       &     parentCommunicator , mpiSize             , &
       &     processorNameLength, mpiThreadingProvided
  character(len=MPI_Max_Processor_Name), dimension(1) :: processorName

  call MPI_Init_Thread       (MPI_Thread_Multiple,mpiThreadingProvided,status)
  call MPI_Comm_Rank         (MPI_Comm_World     ,rank                ,status)
  call MPI_Comm_Size         (MPI_Comm_World     ,mpiSize             ,status)
  call MPI_Comm_Get_Parent   (parentCommunicator                      ,status)
  call MPI_Get_Processor_Name(processorName(1)   ,processorNameLength ,status)
  if (parentCommunicator == MPI_Comm_Null) then
     write (0,*) "parent process: rank, size, processor name = ",rank,mpiSize,trim(processorName(1))
     call MPI_Comm_Spawn('test.exe',[''],16,MPI_INFO_NULL,0,MPI_Comm_World,childCommunicator,spawnStatus,status)
     call MPI_Barrier  (childCommunicator,status)
     write (0,*) "parent passed interbarrier: rank = ",rank
     call MPI_Comm_Free(childCommunicator,status)
  else
     write (0,*) " child process: rank, size, processor name = ",rank,mpiSize,trim(processorName(1))
     call MPI_Barrier(MPI_Comm_World,status)
     write (0,*) " child passed intrabarrier: rank = ",rank
     call MPI_Barrier(parentCommunicator,status)
     write (0,*) " child passed interbarrier: rank = ",rank
  end if
  call MPI_Finalize(status)
end program
test1.pbs
Description: application/shellscript
testFAIL.err.bz2
Description: application/bzip
testSUCCESS.err.bz2
Description: application/bzip
_______________________________________________ users mailing list users@lists.open-mpi.org https://lists.open-mpi.org/mailman/listinfo/users