Hi,

I am running it like this:
mpirun -np 1 ./manager

Should I do it differently?
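For reference, my batch script is roughly the following sketch (the module name and resource numbers are placeholders for my cluster's setup, not the exact script):

```shell
#!/bin/bash
#SBATCH -N 2          # nodes for the spawned workers (placeholder values)
#SBATCH -n 8          # total task slots
#SBATCH -t 00:10:00

# Module name is a placeholder for whatever the cluster provides
module load openmpi/2.0.1

# Print which mpirun the batch environment actually picks up,
# since it may differ from the one in my interactive shell
which mpirun

# The manager starts as a single rank and spawns the workers itself
mpirun -np 1 ./manager
```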

I also thought that all sbatch does is create an allocation and then run my
script in it, but it seems that is not the case, since I am getting these results...
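To be concrete, the two launch paths I am comparing look like this (the resource counts are placeholders):

```shell
# Interactive allocation: this path works for me with both 1.x and 2.0.x
salloc -N 2 -n 8          # get an allocation, then inside it:
mpirun -np 1 ./manager

# Batch submission: this path fails for me with 2.0.x at MPI_Init_thread
sbatch job.sh             # job.sh runs the same "mpirun -np 1 ./manager"
```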

I would like to upgrade to Open MPI 2.0.2, but no clusters near me have it
yet :( So I cannot even check whether it works.

On 15 February 2017 at 16:04, Howard Pritchard <hpprit...@gmail.com> wrote:

> Hi Anastasia,
>
> Definitely check which mpirun is being used in the batch environment, but
> you may also want to upgrade to Open MPI 2.0.2.
>
> Howard
>
> r...@open-mpi.org <r...@open-mpi.org> wrote on Wed., Feb. 15, 2017 at 07:49:
>
>> Nothing immediate comes to mind - all sbatch does is create an allocation
>> and then run your script in it. Perhaps your script is using a different
>> “mpirun” command than when you type it interactively?
>>
>> On Feb 14, 2017, at 5:11 AM, Anastasia Kruchinina <
>> nastja.kruchin...@gmail.com> wrote:
>>
>> Hi,
>>
>> I am trying to use the MPI_Comm_spawn function in my code, and I am having
>> trouble with Open MPI 2.0.x + sbatch (the Slurm batch system).
>> My test program is located here:
>> http://user.it.uu.se/~anakr367/files/MPI_test/
>>
>> When I am running my code I am getting an error:
>>
>> OPAL ERROR: Timeout in file
>> ../../../../openmpi-2.0.1/opal/mca/pmix/base/pmix_base_fns.c at line 193
>> *** An error occurred in MPI_Init_thread
>> *** on a NULL communicator
>> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>> ***    and potentially your MPI job)
>> --------------------------------------------------------------------------
>>
>> It looks like MPI_INIT failed for some reason; your parallel process is
>> likely to abort.  There are many reasons that a parallel process can
>> fail during MPI_INIT; some of which are due to configuration or
>> environment
>> problems.  This failure appears to be an internal failure; here's some
>> additional information (which may only be relevant to an Open MPI
>> developer):
>>
>>    ompi_dpm_dyn_init() failed
>>    --> Returned "Timeout" (-15) instead of "Success" (0)
>> --------------------------------------------------------------------------
>>
>>
>> The interesting thing is that there is no error when I first allocate
>> nodes with salloc and then run my program. So, the program works fine
>> with Open MPI 1.x + sbatch/salloc, or with Open MPI 2.0.x + salloc, but
>> not with Open MPI 2.0.x + sbatch.
>>
>> The error was reproduced on three different computer clusters.
>>
>> Best regards,
>> Anastasia
>> _______________________________________________
>> users mailing list
>> users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>>
>
