Hi Joshua,
    Our protocol is indeed more like the latter, two-step approach that you
describe.  In fact, there is a series of barriers in the protocol between the
coordinator and the clients.  Our original IPDPS-2009 paper is still largely
correct "in spirit", if you want to give it a look:
  http://www.ccs.neu.edu/home/gene/papers/ipdps09.pdf
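
For illustration only, here is a minimal sketch of the two-step approach you
describe. It is not DMTCP's actual code or message format: the names Client,
prepare, commit, abort, and coordinate_checkpoint are all hypothetical, and
the "messages" are plain method calls within one process. The point it shows
is that the coordinator collects every vote (which acts as a barrier) before
it sends the second message.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Client:
    """Hypothetical stand-in for one process connected to the coordinator."""
    name: str
    at_safe_point: bool = True   # assumption: the process knows if it is "safe"
    suspended: bool = False
    log: List[str] = field(default_factory=list)

    def prepare(self) -> bool:
        # First message: stall the process, then report whether it can checkpoint.
        self.suspended = True
        self.log.append("prepare")
        return self.at_safe_point

    def commit(self) -> None:
        # Second message, success case: really checkpoint, then resume.
        self.log.append("checkpoint")
        self.suspended = False

    def abort(self) -> None:
        # Second message, failure case: "just kidding, continue on".
        self.log.append("resume")
        self.suspended = False


def coordinate_checkpoint(clients: List[Client]) -> bool:
    """Run one two-step checkpoint round; return True if a checkpoint was taken."""
    # Step 1: ask everyone. Building the full list first means every client is
    # stalled before any decision is made, which plays the role of a barrier.
    votes = [client.prepare() for client in clients]
    can_checkpoint = all(votes)

    # Step 2: send the same decision to every client.
    for client in clients:
        if can_checkpoint:
            client.commit()
        else:
            client.abort()
    return can_checkpoint
```

If any single client reports that it is not at a safe location, every client
receives the "continue on" message and no checkpoint is taken in that round.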
                                                                        - Gene

On Thu, Mar 07, 2013 at 05:13:00PM +0000, Louie, Joshua D wrote:
> For my use case, we won't hit this, because we only allow checkpointing 
> internally at "safe" locations. I actually don't know how your communicator 
> tells the processes connected to it to checkpoint. My guess is that it sends 
> a single message asking them to checkpoint. When they get it, they suspend 
> themselves and then checkpoint. Maybe it should be a two-step approach: a 
> first message asks all the processes whether they can checkpoint. This should 
> still stall all the processes, and each returns whether or not it is possible. 
> If all the processes return that they can, the coordinator sends them a 
> second message saying to really checkpoint. If any of them cannot, the second 
> message is a "just kidding, and continue on" message.
> 
> Joshua Louie
