Hi,
This email follows up on the thread "Problem with MPI_Comm_spawn using openmpi 2.0.x + sbatch":
https://mail-archive.com/users@lists.open-mpi.org/msg30650.html

We have installed the latest version, v2.0.2, on the cluster that Anastasia Kruchinina (https://mail-archive.com/users@lists.open-mpi.org/msg30654.html) was running on. It seems to me that the issue is still not fixed in v2.0.2.

The job script and sample code can be found at
https://www.pdc.kth.se/~gongjing/files/test_spawn/

The messages we got:

$ cat error_file.e
Currently Loaded Modulefiles:
[t03n06.pdc.kth.se:39767] OPAL ERROR: Timeout in file base/pmix_base_fns.c at line 193
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)

$ cat output_file.o
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_dpm_dyn_init() failed
  --> Returned "Timeout" (-15) instead of "Success" (0)
--------------------------------------------------------------------------

Please let me know if you need additional information.

Thanks a lot for your help.

Regards,
Jing Gong