Re: [OMPI users] Running on crashing nodes

2010-09-27 Thread Randolph Pullen
r...@open-mpi.org>
Subject: Re: [OMPI users] Running on crashing nodes
To: "Open MPI Users" <us...@open-mpi.org>
Received: Friday, 24 September, 2010, 10:18 PM

As one of the Open MPI developers actively working on the MPI layer stabilization/recovery feature set, I don't think we can gi

Re: [OMPI users] Running on crashing nodes

2010-09-24 Thread Joshua Hursey
As one of the Open MPI developers actively working on the MPI layer stabilization/recovery feature set, I don't think we can give you a specific timeframe for availability, especially availability in a stable release. Once the initial functionality is finished, we will open it up for user

Re: [OMPI users] Running on crashing nodes

2010-09-24 Thread Andrei Fokau
Ralph, could you tell us when this functionality will be available in the stable version? A rough estimate will be fine.

On Fri, Sep 24, 2010 at 01:24, Ralph Castain wrote:
> In a word, no. If a node crashes, OMPI will abort the currently-running job
> if it had processes on

Re: [OMPI users] Running on crashing nodes

2010-09-23 Thread Ralph Castain
In a word, no. If a node crashes, OMPI will abort the currently-running job if it had processes on that node. There is currently no ability to "ride-thru" such an event. That said, there is work being done to support "ride-thru". Most of that is in the current developer code trunk, and more is