Hi,

I am currently using DMTCP to checkpoint a set of programs running on different machines in a virtual environment (VirtualBox). I want to create multiple snapshots of the whole system at different stages, store the checkpoint files so that they do not get overwritten, and later restart the system from a stage of my choosing.
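To make the "store the checkpoint files so they do not get overwritten" part concrete, here is a minimal sketch of one way to archive the images after each checkpoint. It is not part of DMTCP. The function name `archive_checkpoint`, the per-stage directory layout, and the stage labels are hypothetical, and it assumes the checkpoint images in the checkpoint directory match the pattern `ckpt_*.dmtcp`. Since the programs run on different machines, something like this would have to run on each node.

```python
import shutil
from pathlib import Path


def archive_checkpoint(ckpt_dir, archive_root, stage):
    """Copy the current checkpoint images into a per-stage directory.

    Hypothetical helper: the layout <archive_root>/<stage>/ is an
    assumption made for illustration, not a DMTCP convention.
    """
    src = Path(ckpt_dir)
    dest = Path(archive_root) / stage

    # Assumption: checkpoint images are named ckpt_*.dmtcp.
    images = sorted(src.glob("ckpt_*.dmtcp"))
    if not images:
        raise FileNotFoundError(f"no checkpoint images found in {src}")

    # exist_ok=False makes this fail rather than overwrite an
    # earlier snapshot that was stored under the same stage name.
    dest.mkdir(parents=True, exist_ok=False)

    copied = []
    for image in images:
        target = dest / image.name
        shutil.copy2(image, target)  # copy2 also preserves timestamps
        copied.append(target)
    return copied
```

Calling `archive_checkpoint("/path/to/ckpt", "/path/to/archive", "stage1")` after a checkpoint has completed would leave the images under `/path/to/archive/stage1/`, where later checkpoints cannot overwrite them. The copy is only safe once the checkpoint has actually finished writing, which is exactly the synchronization question described below.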
Here is my problem: as far as I can tell, dmtcp_command is not blocking (which is fine in itself), so I cannot, for example, run dmtcp_command -k to kill all nodes and then reconnect them with dmtcp_restart, because the nodes will try to reconnect before dmtcp_command -k has finished. Since performance (time) is an issue, I cannot simply wait a fixed amount of time for the command to finish.

I have read that the Python API has a checkpoint function that is blocking. However, when I try to use the API, the coordinator does not seem to react when I call dmtcp.checkpoint(). When I try to list the sessions, it tells me "No checkpoint sessions found", so I assume the API is working in principle (I modified PYTHONPATH to include contrib/python), but it does not create any snapshots. There are no errors either.

I am running dmtcp-2.1 compiled from source. I also tried dmtcp-2.2, but the problem is the same.

What am I missing? I want to be able to run a Python script like

    (...)
    import dmtcp
    dmtcp.checkpoint()
    (...)

and have the coordinator react.

Thanks in advance,
Ernst

_______________________________________________
Dmtcp-forum mailing list
https://lists.sourceforge.net/lists/listinfo/dmtcp-forum
