Hi, I am running it like this: mpirun -np 1 ./manager

Should I do it differently? I also thought that all sbatch does is create an allocation and then run my script in it, but judging by these results that does not seem to be the case. I would like to upgrade to Open MPI 2.0.2, but no clusters near me have it yet :( So I cannot even check whether it works with Open MPI 2.0.2.

On 15 February 2017 at 16:04, Howard Pritchard <hpprit...@gmail.com> wrote:

> Hi Anastasia,
>
> Definitely check the mpirun when in batch environment, but you may also
> want to upgrade to Open MPI 2.0.2.
>
> Howard
>
> r...@open-mpi.org <r...@open-mpi.org> wrote on Wed, 15 Feb 2017 at 07:49:
>
>> Nothing immediate comes to mind - all sbatch does is create an allocation
>> and then run your script in it. Perhaps your script is using a different
>> "mpirun" command than when you type it interactively?
>>
>> On Feb 14, 2017, at 5:11 AM, Anastasia Kruchinina <nastja.kruchin...@gmail.com> wrote:
>>
>> Hi,
>>
>> I am trying to use the MPI_Comm_spawn function in my code. I am having
>> trouble with Open MPI 2.0.x + sbatch (batch system Slurm).
>> My test program is located here: http://user.it.uu.se/~anakr367/files/MPI_test/
>>
>> When I run my code I get this error:
>>
>> OPAL ERROR: Timeout in file
>> ../../../../openmpi-2.0.1/opal/mca/pmix/base/pmix_base_fns.c at line 193
>> *** An error occurred in MPI_Init_thread
>> *** on a NULL communicator
>> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>> *** and potentially your MPI job)
>> --------------------------------------------------------------------------
>> It looks like MPI_INIT failed for some reason; your parallel process is
>> likely to abort. There are many reasons that a parallel process can
>> fail during MPI_INIT; some of which are due to configuration or environment
>> problems.
>> This failure appears to be an internal failure; here's some
>> additional information (which may only be relevant to an Open MPI
>> developer):
>>
>> ompi_dpm_dyn_init() failed
>> --> Returned "Timeout" (-15) instead of "Success" (0)
>> --------------------------------------------------------------------------
>>
>> The interesting thing is that there is no error when I first allocate
>> nodes with salloc and then run my program. So I noticed that the
>> program works fine using Open MPI 1.x + sbatch/salloc or Open MPI
>> 2.0.x + salloc, but not Open MPI 2.0.x + sbatch.
>>
>> The error was reproduced on three different computer clusters.
>>
>> Best regards,
>> Anastasia
>> _______________________________________________
>> users mailing list
>> users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
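[For readers following along: since MPI_Comm_spawn needs free slots in the Slurm allocation beyond the one used by the manager, a submission script along these lines is one way to set that up. This is a hedged sketch, not the poster's actual script; the node/task counts, time limit, and the `manager` executable name are assumptions for illustration.]

```shell
#!/bin/bash
#SBATCH -N 1              # one node (assumed for this sketch)
#SBATCH -n 5              # 5 slots: 1 for the manager + 4 for spawned workers
#SBATCH -t 00:10:00

# Launch only the manager process; MPI_Comm_spawn will dynamically
# start the workers inside the remaining slots of this allocation.
mpirun -np 1 ./manager
```

Requesting more tasks (`-n`) than you launch with `mpirun -np` is what leaves room for the dynamically spawned processes.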
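[For context, the manager/worker pattern under discussion can be sketched as below. This is a minimal illustration of the MPI_Comm_spawn call, not the poster's test program; the `./worker` executable name and the worker count of 4 are assumptions, and building/running it requires an MPI installation (mpicc, mpirun).]

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    MPI_Comm intercomm;
    int errcodes[4];

    MPI_Init(&argc, &argv);

    /* Spawn 4 copies of ./worker. The spawned processes call MPI_Init
       themselves and share the new intercommunicator with the manager.
       Under Slurm, the allocation must contain enough free slots for
       these workers in addition to the manager process. */
    MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 4, MPI_INFO_NULL,
                   0, MPI_COMM_SELF, &intercomm, errcodes);

    /* ... exchange data with the workers over intercomm ... */

    MPI_Comm_disconnect(&intercomm);
    MPI_Finalize();
    return 0;
}
```

The "OPAL ERROR: Timeout" in the thread is reported from the spawned side's MPI_Init_thread, i.e. the workers fail to wire up with the manager when the job was submitted via sbatch.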