Victor, you might want to take a look at the Open MPI version available from http://fault-tolerance.org/. It provides additional features to gracefully handle node failures.
George.

On May 30, 2013, at 17:55, Victor Vysotskiy <victor.vysots...@teokem.lu.se> wrote:

> Hi Ralph,
>
>> -mca orte_abort_non_zero_exit 0
>
> Thank you for the hint. That is exactly what I need! BTW, does it help if
> one of the working nodes occasionally dies during the MPMD run?
>
> With best regards,
> Victor.