r...@open-mpi.org>
Subject: Re: [OMPI users] Running on crashing nodes
To: "Open MPI Users" <us...@open-mpi.org>
Received: Friday, 24 September, 2010, 10:18 PM
As one of the Open MPI developers actively working on the MPI layer
stabilization/recovery feature set, I don't think we can give you a specific
timeframe for availability, especially availability in a stable release. Once
the initial functionality is finished, we will open it up for user
Ralph, could you tell us when this functionality will be available in the
stable version? A rough estimate will be fine.
On Fri, Sep 24, 2010 at 01:24, Ralph Castain wrote:
In a word, no. If a node crashes, OMPI will abort the currently-running job
if it had processes on that node. There is no current ability to "ride-thru"
such an event.
That said, there is work being done to support "ride-thru". Most of that is
in the current developer's code trunk, and more is