Dear DMTCP team -- First of all, it is *awesome* to have this available and
to use it with Grid Engine to allow timed-out jobs to be continued!

I want to report two issues I have seen with Java jobs of this form:

gunzip -c foo.gz | java blah blah 2> blah.err | gzip > bar.gz

1) Restart typically fails if the job is restarted on a host different from
    the one used for the first part of the run.  The complaint is about Unix
    shared memory in the Java process.

    Workaround: Restart only on the original host.

2) In the 2> redirection, if I give an absolute pathname, it works, but if
    I give a relative pathname (e.g., a name in the current directory), it
    fails in the Java process during restart.  I saw an Open MPI bug that was
    fixed concerning stderr -- I wonder if this is the same or related?

    Workaround: give a full absolute pathname.
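
    A minimal sketch of that workaround, assuming a POSIX shell in which
    $PWD holds the current directory.  The names err_file and run_job are
    only illustrative, and foo.gz, bar.gz and "blah blah" are the same
    placeholders as in the command above:

```shell
# Absolute path for the stderr file; $PWD is the shell's current directory.
err_file="$PWD/blah.err"

# Same pipeline as in the report, with only the 2> target changed
# from a relative name to the absolute path built above.
run_job() {
    gunzip -c foo.gz | java blah blah 2> "$err_file" | gzip > bar.gz
}
```

    Calling run_job then runs the pipeline with stderr going to the
    absolute path, however the job itself is launched.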

I am using version 2.3.1, built from the sources.

Regards -- Eliot Moss

_______________________________________________
Dmtcp-forum mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dmtcp-forum
