Andrea,

On top of what Ralph just wrote, you might want to upgrade Open MPI to the latest stable version (1.10.3).

1.6.5 is quite old and is no longer maintained.


The message indicates that one process died, and many things can cause a process to crash.

Since the crash occurs only with N > 25, the root cause could be an out-of-memory condition (just run dmesg and grep -i for oom), a division by zero, your application calling exit(...) instead of MPI_Finalize()/MPI_Abort(...), or some other bug in your application.
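
As a rough sketch of the dmesg check mentioned above: the helper below filters kernel-log text for lines that usually accompany the Linux OOM killer. The name find_oom_lines and the exact search patterns are my own assumptions, not anything Open MPI provides, and the wording of kernel messages varies between kernel versions.

```shell
# Sketch: filter kernel-log text for signs that the OOM killer ended a process.
# find_oom_lines is a hypothetical helper name; it reads log text on stdin.
find_oom_lines() {
    # -i: case-insensitive match; -E: extended regex so "|" means alternation.
    grep -i -E 'out of memory|oom-killer|oom_kill|killed process'
}

# Typical use on each node that ran the job
# (may need root on systems that restrict dmesg):
#   dmesg | find_oom_lines
```

If this prints lines naming your executable, memory exhaustion is a likely cause; no output does not rule out the other causes listed above.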


Cheers,


Gilles


On 7/8/2016 7:12 AM, Ralph Castain wrote:
Try running one of the OMPI example codes and verify that things run correctly if N > 25. I suspect you have an error in your code that causes it to fail if its rank is > 25.
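
One way to carry out this check is sketched below: launch a program with increasing process counts and report the first count that fails. The function name first_failing_n and the LAUNCH variable are hypothetical, introduced only for illustration; the sketch assumes that mpirun returns a non-zero exit status when a rank dies.

```shell
# Sketch: find the smallest process count in [LO, HI] for which a launch fails.
# first_failing_n and the LAUNCH variable are hypothetical names for illustration.
# Usage: first_failing_n LO HI PROGRAM [ARGS...]
first_failing_n() {
    lo=$1
    hi=$2
    shift 2
    n=$lo
    while [ "$n" -le "$hi" ]; do
        # LAUNCH defaults to mpirun; output is discarded, only the exit status matters.
        if ! "${LAUNCH:-mpirun}" -n "$n" "$@" >/dev/null 2>&1; then
            echo "$n"
            return 0
        fi
        n=$((n + 1))
    done
    return 1   # every count in the range succeeded
}

# Example, assuming hello_c was built with mpicc from examples/hello_c.c
# in the Open MPI source tree:
#   first_failing_n 24 28 ./hello_c
#   first_failing_n 24 28 ./code.exe
```

If the example program passes for every count while code.exe reports 26, the problem is more likely in the application than in the Open MPI installation.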


On Jul 7, 2016, at 2:49 PM, Alberti, Andrea <alber...@illinois.edu> wrote:

Hi,

My name is Andrea and I am a new Open MPI user.

I have a code compiled with:
intel/16.0.3
openmpi/1.6.5

--> When I try to run my code with: mpirun -n N ./code.exe
    a) the code correctly runs and gives results if N<=25
    b) the code gives the following error if N>25:
        mpirun has exited due to process rank X with PID ...

--> This seems to be a fairly common problem when not all processes are initialized or finalized.
    However, I do initialize and finalize all processes.
    Moreover, I do not understand why the problem does not appear when N<=25.

Could someone please help me with this, or point me to pages where the same problem is discussed or solved?
Thank you very much in advance for the help.

Andrea

_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: http://www.open-mpi.org/community/lists/users/2016/07/29596.php



