On Mar 19, 2014, at 9:13 AM, Adrian Reber wrote:
> What does Open MPI do with the file descriptors for stdout/stderr?
We admittedly do funny things with stdin, stdout, and stderr... The short
version is that OMPI intercepts the stdout and stderr from each MPI process
and relays them back up to mpirun through our IOF subsystem (IOF = I/O
forwarding). stdin goes the other way: from mpirun down to the process.
Consider: users launch N processes (potentially on multiple different servers)
via
    mpirun --hostfile hosts -np N my_mpi_executable
They also expect to be able to use standard shell redirection via the mpirun
command. For example:
    mpirun --hostfile hosts -np N my_mpi_executable |& tee out.txt
To explain what happens, we have to explain a little of how OMPI launches
processes. Let's take the ssh case, for simplicity (there are other mechanisms
it can use to launch on remote servers, but for the purposes of this
discussion, they're basically variants of what happens with ssh).
1. mpirun parses the hosts hostfile and extracts the list of servers on which
to launch.
2. mpirun fork/execs an ssh command to each remote node; each ssh launches the
Open MPI helper daemon "orted" on its node.
3. The orted starts on the remote server, does some housekeeping, and
eventually receives the launch command from mpirun.
4. The launch command contains the executable and argv to fork/exec, and how
many copies to launch.
5. For example: mpirun --hostfile hosts -np 4 my_mpi_executable. If the
"hosts" file contains serverA and serverB, then mpirun would launch 2 ssh's --
one each to serverA and serverB. After some startup negotiation, mpirun would
send a launch command telling the orted on each of serverA and serverB to
launch 2 copies of my_mpi_executable.
6. For each child that the orted will create, it:
- creates (up to) 3 pipes, for: stdin, stdout, stderr
- forks
- closes stdin, stdout, stderr
- dups the pipes into 0, 1, 2
- (by default, we actually close stdin on all processes except the first one)
- execs my_mpi_executable
7. In this way, the orted can intercept the stdout/stderr from the process and
send it back to mpirun, which can then write it on its own stdout/stderr. And
therefore shell redirection from mpirun works as expected.
8. Similarly, the stdin from mpirun can be sent to any process where we kept
stdin open (as mentioned above, by default, this is only the first process).
In short: the orted acts as a proxy for the stdout and stderr (and potentially
stdin) for all launched processes.
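
To make step 6 concrete, here is a minimal sketch of the pipe / fork / dup / exec pattern, plus a relay loop like the one in step 7. It is not the actual orted code (which is C and considerably more involved); it is an illustration in Python, and the names spawn_with_pipes and run_and_capture are made up for this example. It assumes a POSIX system and only small amounts of stdin data.

```python
import os
import select
import sys


def spawn_with_pipes(argv, keep_stdin=False):
    """Fork/exec argv with its stdout/stderr (optionally stdin) on pipes.

    Hypothetical helper, not an Open MPI API.
    Returns (pid, stdin_write_fd_or_None, stdout_read_fd, stderr_read_fd).
    """
    out_r, out_w = os.pipe()
    err_r, err_w = os.pipe()
    in_r, in_w = os.pipe() if keep_stdin else (None, None)

    pid = os.fork()
    if pid == 0:
        # Child: put the pipe ends on fds 0, 1, 2, then exec.
        try:
            if keep_stdin:
                os.dup2(in_r, 0)
            else:
                try:
                    os.close(0)  # default: this child gets no stdin
                except OSError:
                    pass  # stdin was already closed
            os.dup2(out_w, 1)  # dup2 closes the old fd 1 first
            os.dup2(err_w, 2)  # dup2 closes the old fd 2 first
            for fd in (in_r, in_w, out_r, out_w, err_r, err_w):
                if fd is not None and fd > 2:
                    os.close(fd)
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if something above failed

    # Parent (the proxy): keep only its own ends of the pipes.
    os.close(out_w)
    os.close(err_w)
    if keep_stdin:
        os.close(in_r)
    return pid, in_w, out_r, err_r


def run_and_capture(argv, stdin_data=None):
    """Launch argv and collect everything it writes to stdout/stderr."""
    pid, in_w, out_r, err_r = spawn_with_pipes(
        argv, keep_stdin=stdin_data is not None)
    if in_w is not None:
        os.write(in_w, stdin_data)  # fine for small inputs only
        os.close(in_w)              # child sees EOF on stdin

    captured = {out_r: bytearray(), err_r: bytearray()}
    open_fds = {out_r, err_r}
    while open_fds:
        readable, _, _ = select.select(list(open_fds), [], [])
        for fd in readable:
            chunk = os.read(fd, 4096)
            if chunk:
                captured[fd] += chunk  # a real proxy would forward this
            else:
                os.close(fd)           # EOF: child closed its end
                open_fds.discard(fd)

    _, status = os.waitpid(pid, 0)
    exit_code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
    return exit_code, bytes(captured[out_r]), bytes(captured[err_r])


if __name__ == "__main__":
    code, out, err = run_and_capture(
        ["/bin/sh", "-c", "echo hello; echo oops 1>&2"])
    sys.stdout.buffer.write(out)  # what mpirun would put on its stdout
    sys.stderr.buffer.write(err)  # what mpirun would put on its stderr
```

The point of the sketch is only that the child never sees the terminal (or whatever mpirun's stdout is): its fds 1 and 2 are pipe ends, and the parent decides where the bytes go. In the sketch the parent just buffers them; in the real system the orted sends them over the network to mpirun.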
> Would it make sense to close stdout/stderr of each checkpointed process
> before checkpointing it?
Maybe...?
But my gut reaction is that you don't want to, because of the "continue" case
(where the process keeps running after the checkpoint is taken). I.e., having
the orted go through all the IOF setup again could be a bit tricky... We
didn't need to do this for other checkpointing systems.
--
Jeff Squyres
jsquy...@cisco.com