Kewl. FWIW: we already have the ability to migrate processes in the ORTE code. You can tell the system to try to restart the process in its existing location N times before requesting relocation. Of course, if a node fails, then we automatically relocate the procs to other nodes.
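As a rough illustration of the restart-then-relocate policy just described, here is a minimal Python sketch. It is not taken from the ORTE code: the names `ProcState`, `recovery_action`, and `max_local_restarts` are hypothetical and exist only for this example, and the sketch assumes the only inputs are a per-process restart counter and whether the current node is still alive.

```python
from dataclasses import dataclass


@dataclass
class ProcState:
    """Hypothetical bookkeeping for one process (not an ORTE data structure)."""
    node: str
    restarts: int = 0  # local restarts already attempted on `node`


def recovery_action(proc: ProcState, node_alive: bool, max_local_restarts: int) -> str:
    """Decide how to recover a failed process.

    Returns "restart" to retry on the current node, or "relocate" to
    request placement on a different node.
    """
    if not node_alive:
        # A failed node cannot host a restart, so relocate immediately.
        return "relocate"
    if proc.restarts < max_local_restarts:
        # Still within the allowed N local attempts: count this one and retry.
        proc.restarts += 1
        return "restart"
    # N local restarts have been used up: ask for relocation.
    return "relocate"


if __name__ == "__main__":
    proc = ProcState(node="node01")
    # With N = 2, the third failure on a healthy node triggers relocation.
    print([recovery_action(proc, True, 2) for _ in range(3)])
```

Where the relocated process ends up is a separate decision and is deliberately left out of this sketch.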
The relocation algorithm (i.e., where to put the relocating process) is in the "resilient" mapper component (see orte/mca/rmaps/resilient). It tries to ensure that we don't relocate the proc to an inappropriate place. I can provide more details if you like.

Ralph

On Sep 30, 2014, at 3:20 AM, Manuel Rodríguez Pascual <manuel.rodriguez.pasc...@gmail.com> wrote:

> Hi all,
>
> I kind of broke something with my mail configuration, so I haven't
> been able to properly answer this earlier, sorry.
>
> @Jsquyres We are planning to work on fault tolerance and improved
> scheduling capabilities for HPC. To do so, we are first focusing on
> serial tasks, and in a next step we will work with parallel jobs. In
> particular, I will be working on job migration, so tasks composing an
> MPI job can be re-allocated inside a cluster. Anyway, this is
> anticipating too much; we are now in the first steps of the
> project. Also, thanks for the videos and the environment
> recommendations, they have been really helpful.
>
> @Ralph Castain: Of course :) Our objective is to create open software
> adopting the existing Open-MPI license, and make it available to the
> community. I am not in charge of the "paperwork", but I will make sure
> that someone relevant in my organization looks at this contributor
> agreement.
>
>
> Thanks again for your recommendations and warm welcome. Best regards,
>
>
> Manuel
>
>>
>> Message: 9
>> Date: Fri, 29 Aug 2014 14:40:08 +0000
>> From: "Jeff Squyres (jsquyres)" <jsquy...@cisco.com>
>> To: Open MPI Developers List <de...@open-mpi.org>
>> Subject: Re: [OMPI devel] Fwd: recomended software stack for
>> development?
>> Message-ID: <632d2995-ea78-4aa2-ba94-bc77f05ae...@cisco.com>
>> Content-Type: text/plain; charset="iso-8859-1"
>>
>> On Aug 29, 2014, at 5:36 AM, Manuel Rodríguez Pascual <superma...@gmail.com>
>> wrote:
>>
>>> We are a small development team that will soon start working in open-mpi.
>>
>> Welcome!
>>
>>> Being total newbies in the area (both in open-mpi and in this kind of
>>> large project), we are seeking advice on which tools to use for
>>> development. Any suggestion on IDE, compiler, regression testing software
>>> and everything else is more than welcome. Of course this is highly personal,
>>> but it would be great to know what you folks are using to help us decide and
>>> start working.
>>
>> I think you'll find us all over the map on IDE. I personally use
>> emacs+terminal. I know others who use vim+terminal. Many of us use ctags
>> and the like, but it's not quite as helpful as usual because of OMPI's heavy
>> use of pointers. I don't think many developers use a full-blown IDE.
>>
>> For compiler, I'm guessing most of us develop with gcc most of the time,
>> although a few may have non-gcc as the default. We test across a wide
>> variety of compilers, so portability is important.
>>
>> For regression testing, we use the MPI Testing Tool
>> (https://svn.open-mpi.org/trac/mtt/ and http://mtt.open-mpi.org/). Many of
>> us have it configured to do builds of the nightly tarballs; some of us push
>> our results to the public database at mtt.open-mpi.org.
>>
>>> Thanks for your help. We are really looking to cooperate with the project,
>>> so we'll hopefully see you around here for a while!
>>
>> Just curious: what do you anticipate working on?
>>
>> It might be a good idea to see our "intro to the OMPI code base" videos:
>> http://www.open-mpi.org/video/?category=internals
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>>
>>
>> ------------------------------
>>
>> Message: 11
>> Date: Fri, 29 Aug 2014 07:53:46 -0700
>> From: Ralph Castain <r...@open-mpi.org>
>> To: Open MPI Developers <de...@open-mpi.org>
>> Subject: Re: [OMPI devel] Fwd: recomended software stack for
>> development?
>> Message-ID: <ff0760ff-1a7a-49f4-a8a0-5358c8a19...@open-mpi.org>
>> Content-Type: text/plain; charset=iso-8859-1
>>
>> Indeed, welcome!
>>
>> Just to make things smoother: are you planning to contribute your work back
>> to the community? If so, we'll need a signed contributor agreement - see
>> here:
>>
>> http://www.open-mpi.org/community/contribute/corporate.php
>>
>
>
>
> --
> Dr. Manuel Rodríguez-Pascual
> skype: manuel.rodriguez.pascual
> phone: (+34) 913466173 // (+34) 679925108
>
> CIEMAT-Moncloa
> Edificio 22, desp. 1.25
> Avenida Complutense, 40
> 28040- MADRID
> SPAIN
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/09/15943.php