When the link fails, mpirun loses contact with the orted on that node. This
causes the OOB to call back into the routed framework to see if this is a critical
link. Since a link to a daemon -is- considered critical, a call is made to the
errmgr framework indicating that a proc (in this case, a daem
On today's call we mentioned a potentially aggressive schedule to get v1.5
out the door:
- re-branch from SVN trunk this Friday, 13 Jan, 2010
- target release for Tuesday, 16 Feb, 2010
Yes, this means releasing in about 5 weeks. It's an aggressive schedule, but
given that the trunk is pretty
I forgot to include the patch itself -- here's a mercurial commit showing the
change:
http://bitbucket.org/jsquyres/ummunot/changeset/d0dd138df4e5/
If no one objects (and I don't think that anyone will), I'll commit later today.
On Jan 7, 2010, at 3:03 PM, Jeff Squyres wrote:
> WHAT: Mak
Hi,
I want to use OpenMPI in a context where
link failures are highly likely.
My intention is both...I also want to get an
in-depth understanding of the code
to know what happens behind the scenes.
Does anybody have suggestions or methodologies to follow