Hello all

Summary: This note provides a brief overview of how various tools will be able to interface with OMPI applications once the next version of ORTE is integrated into the trunk. It includes a request for input regarding any needs (e.g., additional commands to be supported in the interface) that have not been adequately addressed.
As many of you know, I have been working on a tmp branch to complete the revamp of ORTE that has been in progress for quite some time. Among other things, this revamp is intended to simplify the system, enhance scalability, and improve reliability.

As part of that effort, I have extensively revised the support for external tools. In the past, tools such as the Eclipse PTP could only interact with Open MPI-based applications via the ORTE APIs, thus exposing the tool to any changes in those APIs. Most tools, however, do not require the level of control provided by the APIs and can benefit from a simplified interface. Accordingly, the revamped ORTE now offers alternative methods of interaction. The primary change has been the creation of a communications library with a simple serial protocol for interacting with OMPI jobs.

Thus, tools now have three choices for interacting with OMPI jobs:

1. Link against the new communications library I have created. It does not include all of the ORTE or OMPI libraries, so it has a very small memory footprint. Besides the usual calls to initialize and finalize, the library contains utilities for:

   - finding all of the OMPI jobs running on a given HNP (i.e., all OMPI jobs whose mpirun was executed from that host);
   - querying the status of a job (provides the job map plus all proc states);
   - querying the status of nodes (provides node names, status, and the list of procs on each node, including their state);
   - querying the status of a specific process;
   - spawning a new job; and
   - terminating a job.

   In addition, you can attach to the output streams of any process, specifying stdout, stderr, or both. This "tees" the specified streams, so it won't interfere with the job's normal output flow. I could also create a utility to allow attachment to the input stream of a process; however, I'm a little concerned about possible conflicts with whatever is already flowing across that stream.
   I would appreciate any suggestions as to whether or not to provide that capability.

   Note: we removed the concept of the ORTE "universe", so a tool can now talk to any mpirun without complications. Thus, tools can simultaneously "connect" to and monitor multiple mpiruns, if desired.

2. Link against all of OMPI or ORTE and execute as a standalone program. In this mode, your tool would act as a surrogate for mpirun by directly spawning the user's application. This provides some flexibility, but it does mean that both the tool and the job -must- end together, and that the tool may need to be revised whenever the OMPI/ORTE APIs are updated.

3. Link against all of OMPI or ORTE and execute as a distributed set of processes. In this mode, you would launch your tool via "mpirun -pernode ./my_tool" (or whatever command is appropriate - this example would launch one tool process on every node in the allocation). If the tool processes need to communicate with each other, they can call MPI_Init or orte_init, depending upon the level of communication desired. Note that the tool job will be completely separate from the application job and must be terminated separately.

In all of these cases, it is possible for tool processes to connect (via MPI and/or ORTE-RML) to a job's processes, provided that the application supports it.

I can provide more details, of course, to anyone wishing them. What I would appreciate, though, is any feedback about desired commands, modes of operation, etc. that I might have missed or that people would prefer be changed. This code is all in a private repository for my tmp branch, but I expect it to merge with the trunk fairly soon. I have provided a couple of example tools in that code to illustrate the above modes of operation.

Thanks
Ralph
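P.S. To make option 1 a bit more concrete, here is a rough sketch of what a minimal monitoring tool might look like. Please note that every type and function name below (tool_init, tool_get_jobs, tool_attach_output, etc.) is a hypothetical illustration of the capabilities described above, not the actual library API, and the stub bodies return canned data so the sketch stands on its own:

```c
#include <stdio.h>

/* --- Hypothetical stand-ins for the tool communications library. ---
 * These names and signatures are illustrative only; the stubs below
 * return mock data so the sketch compiles and runs by itself.       */

typedef struct {
    int jobid;      /* identifier assigned by the HNP */
    int num_procs;  /* number of procs in the job     */
} tool_job_info_t;

static int tool_init(void)     { return 0; }   /* connect to the HNP  */
static int tool_finalize(void) { return 0; }   /* drop the connection */

/* Find all OMPI jobs whose mpirun was executed from this host. */
static int tool_get_jobs(tool_job_info_t *jobs, int max, int *njobs)
{
    (void)max;
    jobs[0].jobid = 1;      /* mock: pretend one job is running */
    jobs[0].num_procs = 4;
    *njobs = 1;
    return 0;
}

/* "Tee" a job's stdout/stderr to the given callback without
 * disturbing the job's normal output flow.                   */
typedef void (*output_cb_t)(int jobid, const char *line);
static int tool_attach_output(int jobid, output_cb_t cb)
{
    cb(jobid, "hello from rank 0");  /* mock output line */
    return 0;
}

static void print_line(int jobid, const char *line)
{
    printf("[job %d] %s\n", jobid, line);
}

/* Query the HNP for its jobs, report each one, and tee its output.
 * Returns the number of jobs found, or -1 on error.               */
static int run_monitor(void)
{
    tool_job_info_t jobs[16];
    int njobs = 0;

    if (tool_init() != 0)
        return -1;
    if (tool_get_jobs(jobs, 16, &njobs) != 0)
        return -1;
    for (int i = 0; i < njobs; i++) {
        printf("job %d: %d procs\n", jobs[i].jobid, jobs[i].num_procs);
        tool_attach_output(jobs[i].jobid, print_line);
    }
    tool_finalize();
    return njobs;
}
```

In the real library the query calls would of course talk to the HNP over the serial protocol rather than return canned data, and a tool could repeat the query loop against several mpiruns at once, per the note above.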