Open MPI should detect the node failure and abort the job. Which version are you using?

On Jun 20, 2013, at 2:02 PM, Claire Williams <clairewilliams1...@yahoo.com> wrote:

> Hi all,
>
> I was wondering if Open-MPI had any way to detect that a node has crashed,
> rebooted, etc. I am currently trying to integrate my MPI application with
> Amazon EC2 spot instances, and since spot instances can be terminated at any
> time, I would like to try to make it so that my application can detect this
> node failure, maybe remove the node from the machine file, and restart the
> application automatically. Right now, when one of the worker nodes is
> rebooted or terminated, the master that is waiting on the results of that
> node will just hang, waiting for results that will never come.
>
> Thanks,
>
> Claire
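
For the "remove the node from the machine file and restart" part, here is a rough sketch of a wrapper script, not something Open MPI ships. It assumes that mpirun exits with a non-zero status once it notices the lost node, that the hostfile has one host per line (optionally followed by "slots=N"), and that a single ping is a good enough liveness check on your network. The function names (prune_hosts, host_is_alive, run_until_success) are made up for illustration.

```python
import subprocess


def prune_hosts(hostfile_lines, dead_hosts):
    """Return the hostfile lines whose host name is not in dead_hosts."""
    kept = []
    for line in hostfile_lines:
        stripped = line.strip()
        # Keep blank lines and comments untouched.
        if not stripped or stripped.startswith("#"):
            kept.append(line)
            continue
        host = stripped.split()[0]  # first token, e.g. "node1" in "node1 slots=4"
        if host not in dead_hosts:
            kept.append(line)
    return kept


def host_is_alive(host, timeout_s=5):
    """Hypothetical liveness probe: one ICMP ping (Linux ping flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def run_until_success(hostfile_path, app_cmd, max_restarts=3,
                      probe=host_is_alive, launch=subprocess.call):
    """Run the job; on failure, drop unreachable hosts and try again.

    Returns the number of restarts that were needed.
    Assumption: a non-zero exit status from mpirun means the job aborted.
    """
    for attempt in range(max_restarts + 1):
        rc = launch(["mpirun", "--hostfile", hostfile_path] + list(app_cmd))
        if rc == 0:
            return attempt

        with open(hostfile_path) as f:
            lines = f.read().splitlines()

        hosts = [l.split()[0] for l in lines
                 if l.strip() and not l.strip().startswith("#")]
        dead = {h for h in hosts if not probe(h)}

        # Rewrite the hostfile without the hosts that failed the probe.
        with open(hostfile_path, "w") as f:
            f.write("\n".join(prune_hosts(lines, dead)) + "\n")

    raise RuntimeError("job still failing after %d restarts" % max_restarts)
```

You would call it as something like run_until_success("hosts.txt", ["./my_app"]). Note that this restarts the whole job from scratch; it does not recover the work of the surviving ranks, so the application would need its own checkpointing if that matters.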