Hi Joshua,
Our protocol is indeed more like the latter, two-step approach that you
describe. In fact, there is a series of barriers in the protocol between the
coordinator and the clients. Our original IPDPS 2009 paper is still largely
correct "in spirit", if you want to take a look:
http://www.ccs.neu.edu/home/gene/papers/ipdps09.pdf
- Gene
On Thu, Mar 07, 2013 at 05:13:00PM +0000, Louie, Joshua D wrote:
> For my use case, we won't hit this, because we only allow checkpointing
> internally at "safe" locations. I actually don't know how your communicator
> tells the processes connected to it to checkpoint. My guess is that it sends
> a single message asking them to checkpoint. When they get it, they suspend
> themselves and then checkpoint. Maybe it should be a two-step approach: a
> first message asks all the processes whether they can checkpoint. This should
> still stall all the processes, and each returns whether or not it is possible.
> If all the processes return that they can, the coordinator sends them a
> second message saying to really checkpoint. If any of them cannot, the second
> message is a "just kidding, and continue on" message.
>
> Joshua Louie
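
For readers of the archive, the two-step approach proposed in the quoted
message can be sketched roughly as below. This is only an illustration of the
idea as Joshua describes it, not DMTCP's actual protocol or API: the Client
class, its prepare/commit/abort methods, and request_checkpoint are all
hypothetical names, and the "processes" are plain in-memory objects rather
than real processes talking over sockets.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Client:
    """Hypothetical stand-in for a process connected to the coordinator."""
    name: str
    at_safe_point: bool  # whether this process can checkpoint right now
    log: List[str] = field(default_factory=list)

    def prepare(self) -> bool:
        # Step 1: stall, then report whether a checkpoint is possible.
        self.log.append("suspended")
        return self.at_safe_point

    def commit(self) -> None:
        # Step 2a: every process voted yes, so really checkpoint.
        self.log.append("checkpointed")
        self.log.append("resumed")

    def abort(self) -> None:
        # Step 2b: the "just kidding, and continue on" message.
        self.log.append("resumed")


def request_checkpoint(clients: List[Client]) -> bool:
    """Run one two-step round; return True if a checkpoint was taken."""
    # Collect every vote in a list (rather than short-circuiting) so that
    # all processes are stalled and polled before any decision is made.
    votes = [c.prepare() for c in clients]
    if all(votes):
        for c in clients:
            c.commit()
        return True
    for c in clients:
        c.abort()
    return False
```

Under these assumptions, a single process that is not at a "safe" location
causes the whole round to be abandoned, and every process simply resumes.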