Andrew,

the 2-second timeout is very likely a bug that has since been fixed, so I strongly suggest you try the latest release, 2.0.2, which came out earlier this week.
Ralph is referring to another timeout, which is hard-coded in MPI_Comm_{accept,connect}. FWIW, the MPI standard says nothing about timeouts, so we hard-coded one to prevent jobs from hanging forever. It is 600 seconds in master but, IIRC, still 60 seconds in the v2.0.x branch. I do not know whether it is somehow involved in MPI_Comm_spawn.

Cheers,

Gilles

On Saturday, February 4, 2017, r...@open-mpi.org <r...@open-mpi.org> wrote:

> We know v2.0.1 has problems with comm_spawn, and so you may be
> encountering one of those. Regardless, there is indeed a timeout mechanism
> in there. It was added because people would execute a comm_spawn, and then
> would hang and eat up their entire allocation time for nothing.
>
> In v2.0.2, I see it is still hardwired at 60 seconds. I believe we
> eventually realized we needed to make that a variable, but it didn’t get
> into the 2.0.2 release.
>
> > On Feb 1, 2017, at 1:00 AM, elistrato...@info.sgu.ru wrote:
> >
> > I am using Open MPI version 2.0.1.
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users