Hi, I have three questions:


1. How does Open MPI detect a faulty node?



2. If one node dies, can the program continue running? What if two or
more nodes die?



3. How can a faulty (dead) node be recovered? Is there any way to
recover without checkpointing, since checkpointing is time-consuming and
degrades performance?



Thanks! 



Rui Wang



ICT, P.R. China
