Dear Group,
I wanted to do a synchronization check with 'MPI_Barrier(MPI_COMM_WORLD)' in
the 'opal_crs_self_user_checkpoint(char **restart_cmd)' call. Although every
process enters this call, the barrier fails to synchronize. Is there any reason
why we cannot use a barrier here? Thanks in advance.
Kind regards
nk.
> What version of Open MPI are you using?
>
> -- Josh
>
> On Tue, Oct 18, 2011 at 11:14 AM, Faisal Shahzad wrote:
> > Hi,
> > Thank you for your reply.
> > I actually do not see the option flag '--mpirun_opts' with 'ompi-restart
> > --hel
sl.iu.edu/research/ft/ompi-cr/tools.php#ompi-restart
>
> So something like:
> shell$ ompi-restart --mpirun_opts "-npernode 2" ompi-global-snapshot-1234
>
> -- Josh
>
> On Tue, Oct 18, 2011 at 7:45 AM, Faisal Shahzad wrote:
> > Dear Group,
> > I am u
Dear Group,
I am using openmpi/1.5.3 and using ompi-checkpoint to checkpoint my
application. I use some mpirun option flags (-npernode, -npersocket, binding
options, etc.) with mpirun, and it works fine. My question is whether it is
possible to specify these mpirun options (-npernode, -npersocket, bind
code, the problem is not too severe, so I used 48 or even 96 processes
and many checkpoints to make the problem appear. But in my actual code, perhaps
due to more MPI calls, the problem sometimes occurs even within one node with
only a few (2-5) processes.
Hope to hear from you.
Kind regards,
Faisal Shahza
Dear Group,
I have an MPI program in which every process communicates with its
neighbors. When SELF-checkpointing, every process writes to a separate
file. The problem is that sometimes, after making a checkpoint, the program does
not continue. Having a larger number of processes makes this problem
mething to try is to run 'nm' on the compiled C++ program and make
> sure that the 'self' checkpointing functions are present in the
> output.
>
> If the above does not help and you can post a small repeater program,
> then I can file a ticket and see if someone can ta
Dear Group,
My question is: does SELF checkpointing work only with C programs, or also with
C++ programs? I have a simple program written in C. It performs a self-checkpoint
(runs the callback functions) when I compile it with mpicc and checkpoint it
during a run. But when I convert the same program to .cpp,
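
For the C versus C++ question above, one possible cause (an assumption, not something confirmed in this thread) is C++ name mangling: a C++ compiler encodes argument types into function symbol names, so callbacks that are looked up by their plain names may no longer be found. A minimal sketch of how the callbacks could be given C linkage in a .cpp file follows. The checkpoint callback name and signature are taken from the message above; the continue and restart names, their signatures, and the use of 0 as the success return value are assumptions here and should be checked against the Open MPI documentation for the installed version.

```cpp
// Sketch only: SELF checkpoint callbacks given C linkage in a .cpp file.
// Assumptions (not confirmed in this thread): the continue/restart callback
// names and signatures, and that returning 0 signals success.
#include <cstdio>

extern "C" {

// Called when a checkpoint is requested; the application would save its
// own state here (for example, each process writing to a separate file).
int opal_crs_self_user_checkpoint(char **restart_cmd) {
    (void)restart_cmd;  // left untouched in this sketch
    std::puts("checkpoint callback");
    return 0;
}

// Assumed name: called in the running process after the checkpoint is taken.
int opal_crs_self_user_continue(void) {
    std::puts("continue callback");
    return 0;
}

// Assumed name: called in the restarted process to reload the saved state.
int opal_crs_self_user_restart(void) {
    std::puts("restart callback");
    return 0;
}

}  // extern "C"
```

If name mangling is indeed the cause, running 'nm' on the compiled program, as suggested in the reply above, should then list these names in their plain, unmangled form.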