Dear DMTCP team,
Greetings! We are trying to use DMTCP for our work on coordinated
checkpointing. I have a few questions about its use with distributed
applications (where multiple processes running on separate nodes
communicate with each other, and the failure of any node means that all
nodes must be restarted from the last checkpoint):
1. Do DMTCP checkpoints ensure a consistent global state, i.e., one with
no orphan messages?
2. If yes, what is the underlying algorithm for coordinated checkpointing
in DMTCP? A link to the corresponding research paper would help.
3. If not, how can I integrate my own “coordinated checkpointing”
algorithm with DMTCP, i.e., use DMTCP mainly to store the checkpoints
while my algorithm ensures consistency?
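To make question 1 concrete, here is a minimal Python sketch of the
consistency condition I have in mind. It is not DMTCP code: the names
Message, orphan_messages, and the "cut" representation (per-process count
of local events captured in the checkpoint) are hypothetical and only for
illustration. A message is an orphan if its receive is recorded in the
receiver's checkpoint but its send is not recorded in the sender's.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    sender: str
    receiver: str
    send_event: int  # index of the send in the sender's local event history
    recv_event: int  # index of the receive in the receiver's local history


def orphan_messages(cut, messages):
    """Return the messages that are orphans with respect to `cut`.

    `cut` maps each process name to the number of its local events that
    are included in its checkpoint (events 0 .. cut[p]-1 are included).
    """
    return [
        m for m in messages
        if m.recv_event < cut[m.receiver]    # receive is in the checkpoint
        and m.send_event >= cut[m.sender]    # but the send is not
    ]


# Hypothetical two-process example.
cut = {"A": 2, "B": 3}
messages = [
    Message("A", "B", send_event=0, recv_event=1),  # sent and received before the cut
    Message("A", "B", send_event=2, recv_event=2),  # received before, sent after: orphan
    Message("B", "A", send_event=1, recv_event=3),  # sent before, received after: in transit
]
orphans = orphan_messages(cut, messages)
```

A set of checkpoints is consistent in this sense only when the returned
list is empty; in-transit messages, such as the third one above, are a
separate concern and are not counted as orphans here.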
Thanks.
Best,
Pushpendra
http://www.iiitd.edu.in/~pushpendra/
_______________________________________________
Dmtcp-forum mailing list
https://lists.sourceforge.net/lists/listinfo/dmtcp-forum