Hi,
I am currently using DMTCP to checkpoint a set of programs running on different
machines in a virtual environment (VirtualBox).
I want to create multiple snapshots of the whole system at different stages,
store the checkpoint files so that they do not get overwritten, and later
restart the system from a stage of my choosing.


Here is my problem:
As far as I can tell, dmtcp_command is not blocking (which is fine), so I
cannot, for example, run dmtcp_command -k to kill all nodes and then reconnect
them with dmtcp_restart, because the nodes will try to reconnect before
dmtcp_command -k has finished.
Since performance (time) is an issue, I cannot wait a fixed amount of time for
the command to finish.
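
One alternative to a fixed sleep would be to poll the coordinator until it
reports that no peers are connected. The following is only a minimal, untested
sketch of that idea. It assumes that `dmtcp_command -s` prints a status line of
the form NUM_PEERS=<n>; the helper names query_status and wait_for_no_peers are
made up for illustration and are not part of DMTCP.

```python
import re
import subprocess
import time


def query_status():
    """Return the text printed by `dmtcp_command -s` (coordinator status)."""
    return subprocess.check_output(["dmtcp_command", "-s"]).decode()


def wait_for_no_peers(get_status=query_status, timeout=10.0, poll=0.05):
    """Poll until the status text reports NUM_PEERS=0; return False on timeout."""
    deadline = time.time() + timeout
    while True:
        # Assumption: the status output contains a line such as "NUM_PEERS=3".
        match = re.search(r"NUM_PEERS=(\d+)", get_status())
        if match and int(match.group(1)) == 0:
            return True
        if time.time() >= deadline:
            return False
        time.sleep(poll)
```

The intended use would be to call dmtcp_command -k, then wait_for_no_peers(),
and to run dmtcp_restart only if it returns True. The status source is passed
in as a function so that the polling logic can be tried without a running
coordinator.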
The Python API is supposed to have a blocking checkpoint function. However,
when I try to use the API, the coordinator does not seem to react when I call
dmtcp.checkpoint().
When I list the sessions, I get "No checkpoint sessions found", so I assume the
API is working in principle (I modified PYTHONPATH to include contrib/python),
but it does not create any snapshots. There are no errors either.
I am running dmtcp-2.1 compiled from source. I also tried dmtcp-2.2, but the
problem is the same.
What am I missing?
I want to be able to run a Python script like
(...)
import dmtcp
dmtcp.checkpoint()
(...)
and have the coordinator react.

Thanks in advance,
Ernst
