Hi,

I am so glad that dmtcp is around because my blcr jobs no longer seem to be 
working.

I have finally finished integrating dmtcp into my batch job manager and I have 
some hopefully useful feedback.  Some of my code relies on customizations to 
our copy of dmtcp.  I'm not sure if they've gotten incorporated into the 
distribution, but if they have, then you may ignore this:

1. Ability to put the PID into a file (supporting restarts as well when not 
restoring the PID).

2. Ability to take STDOUT and STDERR files as parameters so that the STDOUT and 
STDERR of the dmtcp software can be kept separate from the outputs of the 
process.

3. For my own tracking purposes, I supply a new checkpoint directory upon 
restart. However, in order to supply the correct .dmtcp files, I need to 
parse them out of the script stored in the previous checkpoint directory.  It 
would be nice if, instead, the .dmtcp checkpoint files were kept in the 
directory I provide upon restart.

If 1 is implemented, this will allow us to monitor the progress of the job and 
end early if it is determined to have finished before the checkpoint time.  
Also, in combination with lsof, it allows us to automatically track the growth 
of output files.
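
To make the use case for 1 concrete, here is a rough sketch of the kind of monitoring I have in mind. It assumes the PID file holds a single integer and that lsof is on the PATH; the function names are my own and not part of dmtcp.

```python
import os
import subprocess


def read_pid(pid_file):
    """Return the PID stored in pid_file (assumed: one integer plus whitespace)."""
    with open(pid_file) as fh:
        return int(fh.read().strip())


def is_running(pid):
    """Return True if a process with this PID currently exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def open_file_sizes(pid):
    """Map each regular file the process has open to its current size in bytes.

    Relies on `lsof -p PID -Fn`, which prints one field per line; lines
    starting with 'n' carry the file name.
    """
    result = subprocess.run(
        ["lsof", "-p", str(pid), "-Fn"],
        capture_output=True, text=True, check=False,
    )
    sizes = {}
    for line in result.stdout.splitlines():
        if line.startswith("n"):
            path = line[1:]
            if os.path.isfile(path):  # skip sockets, pipes, deleted files
                sizes[path] = os.path.getsize(path)
    return sizes
```

The batch manager can poll is_running(read_pid(...)) to skip a pending checkpoint once the job has exited, and compare successive open_file_sizes() results to see which output files are growing.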

If 2 is implemented, I imagine this would mean that you would no longer have to 
store separate output files and merge them after completion.  Restarts would no 
longer need to supply the output file(s) for the running process.  I can see 
why some people would prefer separate output files, though.

3 is not as important to me.  But it would also allow me to delete previous 
checkpoint directories if I decide I don't need them anymore.  I also wouldn't 
have to worry about keeping my software up to date if the format of the .sh 
script changed - breaking my parser.  I suppose an alternative to this would be 
to supply the .dmtcp file locations associated with each run in another output 
file akin to the port and PID files.
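
For reference, the parsing workaround for 3 looks roughly like the sketch below. It assumes the checkpoint images show up in the restart script as whitespace-separated paths ending in .dmtcp, with no spaces inside the paths; that is an assumption about the script format rather than something dmtcp guarantees, which is exactly why the parser is fragile. The function names are hypothetical.

```python
import os
import re

# Any run of characters (no whitespace, quotes or '=') ending in ".dmtcp".
_DMTCP_PATH = re.compile(r"""[^\s'"=]+\.dmtcp(?![\w.])""")


def find_checkpoint_images(script_text):
    """Return the unique .dmtcp paths in script_text, in order of appearance."""
    seen = []
    for match in _DMTCP_PATH.finditer(script_text):
        path = match.group(0)
        if path not in seen:
            seen.append(path)
    return seen


def relocate(paths, new_dir):
    """Map each checkpoint image path to the same file name under new_dir."""
    return [os.path.join(new_dir, os.path.basename(p)) for p in paths]
```

Given the text of the previous run's restart script, find_checkpoint_images() yields the image paths to hand to the restart, and relocate() shows where they would live if they were kept in the directory supplied at restart.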

Rob
_______________________________________________
Dmtcp-forum mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dmtcp-forum