Hi,

I am trying to use the MPI_Comm_spawn function in my code, and I am having trouble with openmpi 2.0.x + sbatch (Slurm batch system). My test program is located here: http://user.it.uu.se/~anakr367/files/MPI_test/
When I run my code, I get the following error:

OPAL ERROR: Timeout in file ../../../../openmpi-2.0.1/opal/mca/pmix/base/pmix_base_fns.c at line 193
*** An error occurred in MPI_Init_thread
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_dpm_dyn_init() failed
  --> Returned "Timeout" (-15) instead of "Success" (0)
--------------------------------------------------------------------------

Interestingly, there is no error when I first allocate nodes with salloc and then run my program. So the program works fine with openmpi 1.x + sbatch/salloc and with openmpi 2.0.x + salloc, but not with openmpi 2.0.x + sbatch.

The error was reproduced on three different computer clusters.

Best regards,
Anastasia