[OMPI devel] bug in openib btl_remove_procs

2008-03-07 Thread Jeff Squyres
I noticed that when btl_remove_procs is invoked on the openib BTL (e.g., when you run "mpirun --mca btl self,openib ...", an openib endpoint will be removed because self's exclusivity will edge it out), the openib remove_procs() function will not remove the corresponding endpoint on

Re: [OMPI devel] Fault tolerance

2008-03-07 Thread Aurélien Bouteiller
We now use the errmgr. Aurelien On March 6, 2008, at 13:38, Aurélien Bouteiller wrote: Aside from what Josh said, we are working right now at UTK on orted/MPI recovery (without killing/respawning all). So far we have had no use for the errmgr, but I'm quite sure this would be the smartest place to

Re: [OMPI devel] t_win failures if openib btl is not loaded

2008-03-07 Thread Jeff Squyres
I filed this as https://svn.open-mpi.org/trac/ompi/ticket/1233 so that it would not be forgotten. On Feb 18, 2008, at 10:53 AM, Tim Prins wrote: Hi all, This is a bit strange, so I thought I'd ping the group before digging any further. The onesided test 't_win' is failing for us