Cross-posting to the CRIU and Open MPI devel mailing lists.
To get fault tolerance back into Open MPI, I added code to use CRIU as
a checkpoint/restart tool. I can checkpoint a process successfully,
but I have trouble restarting it. CRIU currently fails to restore
the process, which is probably related to stdout/stderr handling:
(00.026198) 15852: Error (tty.c:541): tty: Can't dup SELF_STDIN_OFF: Bad file descriptor
What does Open MPI do with the file descriptors for stdout/stderr?
Would it make sense to close stdout/stderr of each checkpointed process
before checkpointing it?
Is there something concerning stdout/stderr which I forgot to handle?
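
To make the closing idea concrete, here is a minimal sketch of what
could run in each process right before the checkpoint. It is only an
assumption about what might help, not something taken from Open MPI or
CRIU: redirect_stdio_to_devnull() is a hypothetical helper name, and
whether it actually avoids the tty.c error above is untested. It points
descriptors 0-2 at /dev/null instead of closing them outright, because
plainly closed descriptors would be reused by the next open() call.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper, not part of Open MPI or CRIU: point stdin, stdout
 * and stderr at /dev/null so the process no longer holds terminal
 * descriptors when it is checkpointed.
 * Returns 0 on success, -1 on failure. */
static int redirect_stdio_to_devnull(void)
{
    int fd;
    int target;

    /* Flush buffered stdio output so nothing is lost by the redirect. */
    fflush(NULL);

    fd = open("/dev/null", O_RDWR);
    if (fd < 0)
        return -1;

    /* Replace descriptors 0, 1 and 2 with the /dev/null descriptor. */
    for (target = STDIN_FILENO; target <= STDERR_FILENO; target++) {
        if (dup2(fd, target) < 0) {
            if (fd > STDERR_FILENO)
                close(fd);
            return -1;
        }
    }

    /* Drop the temporary descriptor unless it already is one of 0-2. */
    if (fd > STDERR_FILENO)
        close(fd);
    return 0;
}
```

The obvious drawback is that any output written after the restart is
lost unless the descriptors are reconnected to something useful.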
Adrian