On Jun 27, 2011, at 6:57 AM, Ken Lloyd wrote:
> One point I've been trying to put forward in my domain is, currently, high
> performance computing != high reliability computing. Not by a long shot.
> Seems that they are orthogonally coupled.
I think that has been true in the past - an emerging
One point I've been trying to put forward in my domain is, currently,
high performance computing != high reliability computing. Not by a long
shot. Seems that they are orthogonally coupled.
There are many pieces to this problem-puzzle. Some of these pieces are
inter-related. Some of my work has de
It has been on my to-do list for a while to start a FAQ listing of the
various resilience/FT related activities in and around Open MPI. This would
provide a starting point where users and new developers could go for an
overview of each feature and how to activate and use it.
I'll
I think we're some ways away from declaring a "resilient ORTE". Josh and I have
been committing pieces of it over the last two years, and Wes just committed
another piece the other day that might have been titled "fault tolerant OOB" as
it primarily addressed maintaining comm routing during node
Josh and Wesley,
Will you be presenting Resilient ORTE at Resilience 2011 in Bordeaux?
http://xcr.cenit.latech.edu/resilience2011/
Kenneth A. Lloyd
CEO - Director of Systems Science
Watt Systems Technologies Inc.
www.wattsys.com
kenneth.ll...@wattsys.com