Re: [OMPI devel] orte question

2011-07-27 Thread Ralph Castain
Hmmm... I'm not seeing that behavior. I get a 0 exit code every time. You'll get a 243 if there are stale session directories lying around, as it indicates that the mpiruns in those directories are not reachable. Perhaps that is what's happening? On Jul 27, 2011, at 3:14 PM, Greg Watson wrote: > Ral

Re: [OMPI devel] orte question

2011-07-27 Thread Ralph Castain
Hmmm... no, can't imagine why. I'll fix - thanks! On Jul 27, 2011, at 3:14 PM, Greg Watson wrote: > Ralph, > > Looking good so far. I did notice that ompi-ps always seems to have an exit > code of 243. Is that on purpose? > > Greg > > On Jul 25, 2011, at 4:44 PM, Ralph Castain wrote: > >> r2

Re: [OMPI devel] orte question

2011-07-27 Thread Greg Watson
Ralph, Looking good so far. I did notice that ompi-ps always seems to have an exit code of 243. Is that on purpose? Greg On Jul 25, 2011, at 4:44 PM, Ralph Castain wrote: > r24944 - let me know how it works! > > > On Jul 25, 2011, at 1:01 PM, Greg Watson wrote: > >> That would probably be m

[OMPI devel] Using BLCR tools to checkpoint Open MPI applications

2011-07-27 Thread Eric Roman
Dear Open MPI Developers, We've been working on using Torque's checkpoint/restart support, along with BLCR and Open MPI's C/R support, to perform C/R on parallel jobs running under Torque. The main issue here is that Open MPI requires the use of the ompi-checkpoint and ompi-restart commands to check