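Adam,

Your MPI program is incorrect. On the process that detects the error, replace the call to MPI_Finalize() with a call to MPI_Abort(). MPI_Finalize() is for a clean shutdown in which every rank takes part, so the ranks already waiting in MPI_Gather() have no reason to stop waiting. MPI_Abort() asks the runtime to terminate the whole job.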
On Nov 16, 2017 10:38, "Adam Sylvester" <op8...@gmail.com> wrote:

> I'm using Open MPI 2.1.0 for this but I'm not sure if this is more of an
> Open MPI-specific implementation question or what the MPI standard
> guarantees.
>
> I have an application which runs across multiple ranks, eventually
> reaching an MPI_Gather() call. Along the way, if one of the ranks
> encounters an error, it will report the error to a log, call
> MPI_Finalize(), and exit with a non-zero return code. If this happens
> prior to the other ranks making it to the gather, it seems like mpirun
> notices this and the process ends on all ranks. This is what I want to
> happen - it's a legitimate error, so all processes should be freed up so
> the next job can run. It seems like if the other ranks make it into the
> MPI_Gather() before the one rank reports an error, the other ranks wait in
> the MPI_Gather() forever.
>
> Is there something simple I can do to guarantee that if any process calls
> MPI_Finalize(), all my ranks terminate?
>
> Thanks.
> -Adam
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users