Re: [OMPI users] How to specify hosts for MPI_Comm_spawn

Jeff Squyres Wed, 30 Jul 2008 11:50:05 -0400

On Jul 30, 2008, at 11:12 AM, Mark Borgerding wrote:

I appreciate the suggestion about running a daemon on each of the remote nodes, but wouldn't I kind of be reinventing the wheel there? Process management is one of the things I'd like to be able to count on ORTE for.

Keep in mind that the daemons here are not for process management -- they're for name service.

Would the following work to give the parent process an intercomm with each child?
parent (i.e., my non-mpirun-started process) calls MPI_Init, then MPI_Open_port
parent spawns the mpirun command via system/exec to create the remote children. The name from MPI_Open_port is placed in the environment.
parent calls MPI_Comm_accept (once for each child?)
all children call MPI_Comm_connect to the name

It may be problematic to call system/exec in some environments (e.g., if using OpenFabrics networks). Bad Things can happen.

I think this would give one intercommunicator back to the parent for each remote process (not ideal, but I can worry about broadcast data later). The remote processes can communicate with each other through MPI_COMM_WORLD.
Actually, when I think through the details, much of this is pretty similar to the daemon MPI_Publish_name + MPI_Lookup_name approach. The main difference is which processes come first.

Instead of having the framework call MPI_Init in your plugin, can your plugin system/exec "mpirun -np 1 my_parent_app"? And perhaps use a pipe (or socket or some other IPC) to communicate between the framework process and my_parent_app? I realize it's a kludgey workaround, but it looks like we clearly have a bug in the 1.2 series with singletons in this area...
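A minimal sketch of that workaround, assuming a POSIX system: the plugin starts the command with popen() and reads one line of its standard output through the resulting pipe. The helper name run_and_read_line is hypothetical, and the one-line reply is only a placeholder for whatever protocol the framework and my_parent_app would really use. Since the framework process never calls MPI_Init here, the fork happens outside of any MPI process.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Run cmd through the shell and copy the first line of its stdout into out
 * (without the trailing newline). Returns 0 on success, -1 if the command
 * could not be started, printed nothing, or exited with a nonzero status. */
int run_and_read_line(const char *cmd, char *out, size_t outlen)
{
    assert(cmd != NULL && out != NULL && outlen > 0);
    out[0] = '\0';

    FILE *p = popen(cmd, "r");          /* read end of a pipe to the command */
    if (p == NULL)
        return -1;

    int got_line = (fgets(out, (int)outlen, p) != NULL);
    int status = pclose(p);             /* waits for the command to finish */
    if (!got_line || status != 0)
        return -1;

    out[strcspn(out, "\n")] = '\0';     /* strip the trailing newline */
    return 0;
}
```

The plugin would then call something like run_and_read_line("mpirun -np 1 my_parent_app", buf, sizeof buf). For two-way traffic, a socket or a pair of pipes would be needed instead of a single popen() stream.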


--
Jeff Squyres
Cisco Systems
