Let me follow up on this...

IOF is but one of the frameworks / plugins involved in launching and monitoring 
processes.

It might actually be easier to get on a webex and give you an overview (Ralph 
would be the best person for this; he's the one who does most of the work in 
the ORTE layer); I'm not sure we have good documentation for it online.

Part of the problem is that in our current design, MPI processes are really not 
designed to be in the same process as the orted.  It *might* be possible to 
make that happen, but I think we have a lot of assumptions built into the code 
that the process(es) the orted launches will actually be separate / different 
OS processes.

That being said, it might be an easier solution to just not have the orted.  
That is, ORTE is capable of "orted-less" launches when the underlying runtime 
provides enough support for OMPI to not have to use the orteds.  This would 
allow you to launch the MPI process directly in your container without any 
dlopen/orted-process-merging tomfoolery.  This might avoid running afoul of 
many of the assumptions we have baked into the system.
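
(With SLURM, for example, that ends up being "srun ./a.out" instead of 
"mpirun ./a.out"; Gilles describes this "direct launch" mode in his mail 
below.)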

Ralph will need to give the details of how to support orted-less launching.  
But the first question is: does OSv have some kind of programmatic mechanism to 
launch a process in your containers?  I.e., can mpirun programmatically launch 
MPI processes in OSv containers?
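
For context, my mental model of what you're doing today -- based on your 
earlier mail -- is roughly the untested sketch below (the helper names are 
made up; mpi_hello.so and its exported main() are from your description).  
That "app runs as a thread inside the same OS process" model is exactly 
where the assumptions I described above start to bite.

/* rough, untested sketch: load the MPI app as a shared object and run
   its exported main() in a new pthread */
#include <dlfcn.h>
#include <pthread.h>
#include <stdio.h>

typedef int (*main_fn_t)(int, char **);

static void *run_app(void *arg)
{
    main_fn_t app_main = (main_fn_t) arg;
    char *argv[] = { "mpi_hello", NULL };
    app_main(1, argv);
    return NULL;
}

int main(void)
{
    void *handle = dlopen("mpi_hello.so", RTLD_NOW);
    if (NULL == handle) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return 1;
    }
    main_fn_t app_main = (main_fn_t) dlsym(handle, "main");
    if (NULL == app_main) {
        fprintf(stderr, "dlsym: %s\n", dlerror());
        return 1;
    }
    pthread_t tid;
    pthread_create(&tid, NULL, run_app, (void *) app_main);
    pthread_join(tid, NULL);
    return 0;
}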



> On Oct 16, 2015, at 6:48 AM, Justin Cinkelj <justin.cink...@xlab.si> wrote:
> 
> Thank you. At least it's clear now that for the immediate problem I have
> to look at the IOF code.
> 
> 
> On 16. 10. 2015 03:32, Gilles Gouaillardet wrote:
>> Justin,
>> 
>> IOF stands for Input/Output (aka I/O) Forwarding
>> 
>> here is a very high-level overview of a quite simple case.
>> on host A, without any batch manager and using the TCP interconnect, you run
>> mpirun -host B,C -np 2 a.out
>> 
>> first, mpirun will fork&exec
>> ssh B orted ...
>> ssh C orted ...
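>>
>> conceptually (illustration only -- this is not the actual mpirun code,
>> and the real orted arguments are elided), that fork&exec step looks like:
>>
>> #include <stdio.h>
>> #include <sys/types.h>
>> #include <sys/wait.h>
>> #include <unistd.h>
>>
>> int main(void)
>> {
>>     pid_t pid = fork();
>>     if (0 == pid) {
>>         /* child: becomes "ssh B orted <args elided>" */
>>         execlp("ssh", "ssh", "B", "orted", (char *) NULL);
>>         perror("execlp");
>>         _exit(1);
>>     }
>>     /* parent: the real mpirun keeps servicing connections; here we
>>        simply wait for the child to finish */
>>     int status;
>>     waitpid(pid, &status, 0);
>>     return 0;
>> }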
>> 
>> the orted daemons will first connect back to mpirun, using TCP and
>> ip/port passed on the orted command line.
>> 
>> then the orted daemons will fork&exec a.out.
>> a.out will contact its parent orted (iirc, TCP on v1.10 and a Unix
>> socket from v2.x) via the ip/port it reads from the environment.
>> when the a.out processes want to communicate, they will connect to the
>> remote a.out via TCP using the ip/port obtained from orted.
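>>
>> as a rough illustration (not the actual ORTE code; the function and
>> environment variable names below are placeholders), that "connect back
>> using the ip/port we were handed" step looks like:
>>
>> #include <arpa/inet.h>
>> #include <netinet/in.h>
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <string.h>
>> #include <sys/socket.h>
>> #include <unistd.h>
>>
>> /* returns a socket connected to the parent, or -1 on error */
>> int connect_back(void)
>> {
>>     /* placeholder env var, e.g. "10.0.0.1:12345" */
>>     const char *uri = getenv("PARENT_URI");
>>     char host[64];
>>     int port;
>>
>>     if (NULL == uri || 2 != sscanf(uri, "%63[^:]:%d", host, &port)) {
>>         return -1;
>>     }
>>
>>     int fd = socket(AF_INET, SOCK_STREAM, 0);
>>     if (fd < 0) {
>>         return -1;
>>     }
>>
>>     struct sockaddr_in sa;
>>     memset(&sa, 0, sizeof(sa));
>>     sa.sin_family = AF_INET;
>>     sa.sin_port = htons((unsigned short) port);
>>     if (1 != inet_pton(AF_INET, host, &sa.sin_addr) ||
>>         connect(fd, (struct sockaddr *) &sa, sizeof(sa)) < 0) {
>>         close(fd);
>>         return -1;
>>     }
>>     /* the caller then exchanges its contact info over fd */
>>     return fd;
>> }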
>> 
>> from a.out's point of view:
>> - stdin is either a pipe to orted or /dev/null
>> - stdout is a pty with orted on the other side
>> - stderr is a pipe to orted
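>>
>> again as an illustration only (this is the kind of plumbing IOF does on
>> the orted side, and it is exactly the part that needs fork()/openpty(),
>> which OSv does not have; the function name is a placeholder):
>>
>> #include <pty.h>        /* openpty(); link with -lutil on Linux */
>> #include <sys/types.h>
>> #include <unistd.h>
>>
>> /* spawn a child with stdin/stderr on pipes and stdout on a pty;
>>    returns the child's pid, or -1 (error handling mostly omitted) */
>> pid_t spawn_with_forwarded_io(char *const argv[])
>> {
>>     int out_master, out_slave, in_pipe[2], err_pipe[2];
>>
>>     if (openpty(&out_master, &out_slave, NULL, NULL, NULL) < 0 ||
>>         pipe(in_pipe) < 0 || pipe(err_pipe) < 0) {
>>         return -1;
>>     }
>>
>>     pid_t pid = fork();
>>     if (0 == pid) {
>>         /* child: stdin from a pipe, stdout on the pty slave,
>>            stderr on a pipe */
>>         dup2(in_pipe[0], STDIN_FILENO);
>>         dup2(out_slave, STDOUT_FILENO);
>>         dup2(err_pipe[1], STDERR_FILENO);
>>         execvp(argv[0], argv);
>>         _exit(1);
>>     }
>>     /* parent (the daemon) reads out_master and err_pipe[0] and forwards
>>        the data up to mpirun; forwarded stdin is written to in_pipe[1] */
>>     return pid;
>> }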
>> 
>> this is basically what happens in a quite simple case.
>> back to your question: mpi_hello.so does not contact mpirun.
>> orted.so contacts mpirun, mpi_hello.so contacts orted.so,
>> and then each mpi_hello.so contacts the other mpi_hello.so.
>> 
>> 
>> note it is also possible to use direct launch (SLURM or cray alps can
>> do that)
>> instead of running
>> mpirun a.out
>> you simply do
>> srun a.out (or aprun a.out)
>> in the case of slurm (i am not sure about alps) there are no orted
>> daemons involved.
>> instead of contacting its orted, a.out contacts the slurm daemon
>> (slurmd) so it can exchange information with the remote a.out processes
>> and figure out how to contact them.
>> direct launch does not support dynamic process creation
>> (MPI_Comm_spawn and friends)
>> 
>> 
>> you can run
>> ompi_info --all
>> to list all the parameters.
>> and then you can do
>> mpirun --mca <name> <value> ...
>> to modify a parameter (such as timeout)
>> 
>> that being said, i do not think that should be needed ... just make
>> sure there is no firewall running on your system, and you should be fine.
>> if some hosts have several interfaces, you can restrict Open MPI to the
>> one that should work (e.g. eth0) with
>> mpirun --mca oob_tcp_if_include eth0 --mca btl_tcp_if_include eth0 ...
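>>
>> if you want to see what those connection attempts are doing, you can
>> also bump the oob verbosity (the exact output will vary), e.g.
>> mpirun --mca oob_base_verbose 10 --mca oob_tcp_if_include eth0 ...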
>> 
>> 
>> i hope this helps
>> 
>> Gilles
>> 
>> 
>> On 10/16/2015 2:59 AM, Justin Cinkelj wrote:
>>> I'm trying to run Open MPI in an OSv container
>>> (https://github.com/cloudius-systems/osv). It's a single-process,
>>> single-address-space VM, without the fork, exec, and openpty functions.
>>> With some butchering of OSv and Open MPI I was able to compile orted.so
>>> and run it inside OSv via mpirun (mpirun is on a remote machine). The
>>> orted.so loads mpi_hello.so and executes its main() in a new pthread.
>>> 
>>> Which then aborts due to a communication failure/timeout - as reported by
>>> mpirun. I assume that mpi_hello.so should connect back to mpirun
>>> and report 'something' about itself. What could that be?
>>> Plus, where could I extend that timeout period? Once mpirun closes,
>>> output from opal_output is not shown any more.
>>> 
>>> Is there some high-level overview of Open MPI - how the modules are
>>> connected, what the 'startup' sequence is, etc.?
>>> ompi_info lists the compiled modules, but I still don't know how they
>>> are connected.
>>> 
>>> So basically - I lack knowledge of Open MPI internals, and would highly
>>> appreciate links for "rookie" developers.
>>> Say, https://github.com/open-mpi/ompi/wiki/IOFDesign tells me what IOF
>>> is, and a bit about how it works. So, if someone has a list of such
>>> links - could it be shared?
>>> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
