Hello,

I am trying to use DMTCP to checkpoint an MPI program but got stuck at the 
hellompi example. I was following the instructions at 
https://github.com/dmtcp/dmtcp/blob/master/QUICK-START.md. The sequential 
demo works well: I can checkpoint the counting example. However, the hellompi 
example fails to checkpoint when I run it with 
"dmtcp_launch -i 5 mpirun -np 2 ./hellompi". I have tried this on my school's 
cluster in interactive mode and on an AWS instance, and checkpointing failed 
in both cases. Should I use a Slurm or Torque script to submit the jobs? 
Also, what is the latest environment (versions of DMTCP, MPI, etc.) that you 
have tested on?


--
Lihao Zhang


_______________________________________________
Dmtcp-forum mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dmtcp-forum
