2013/4/16 Ralph Castain <[email protected]>

> +1
>
> And you probably have to start the job using OMPI's mpirun cmd - I don't
> believe it will work when running the processes directly via srun as I
> believe it requires that the OMPI daemons be present to support the
> operations.
>
>
> On Apr 16, 2013, at 10:43 AM, Paul Hargrove <[email protected]> wrote:
>
> Yann,
>
> I don't know what might be the specific cause of your error, but I do know
> that to checkpoint and restart Open MPI jobs with BLCR one should be using
> ompi-checkpoint and ompi-restart. You can find some more information at
> http://osl.iu.edu/research/ft/ompi-cr/
>
> -Paul
>
>

Thanks to both of you, I will try with ompi specific tools.
cheers
