I. Support for non-MPI jobs

Considerable complexity currently exists in ORTE because our first requirements document stipulated that users be able to mpirun non-MPI jobs, i.e., that we support such calls as "mpirun -n 100 hostname". This creates a situation, however, where the RTE cannot know whether the application will call MPI_Init (or at least orte_init), which has significant implications for the RTE's architecture. For example, while launching the application's processes, the RTE cannot enter any form of blocking receive to wait for the procs to report a successful startup, because no such report will ever arrive from something like "hostname".
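
To make the blocking-receive problem concrete, here is a minimal sketch in Python. It is a toy model only, not ORTE code: run_proc, launch, and the "started"/"exited" event names are all hypothetical, and threads stand in for launched processes. It assumes a process that calls MPI_Init (or orte_init) sends a startup report, while something like "hostname" just runs and exits.

```python
import queue
import threading


def run_proc(rank, calls_init, events):
    """Hypothetical stand-in for one launched process."""
    if calls_init:
        # Models the startup report a process would send from MPI_Init / orte_init.
        events.put(("started", rank))
    # Every process eventually exits, whether or not it ever reported in.
    events.put(("exited", rank))


def launch(calls_init_flags):
    """Track a job by reacting to events instead of blocking on startup reports."""
    events = queue.Queue()
    threads = [
        threading.Thread(target=run_proc, args=(rank, flag, events))
        for rank, flag in enumerate(calls_init_flags)
    ]
    for t in threads:
        t.start()

    started, exited = set(), set()
    # Loop until every process has exited. Waiting for len(threads) "started"
    # events instead would never finish when some process skips init.
    while len(exited) < len(threads):
        kind, rank = events.get()
        (started if kind == "started" else exited).add(rank)

    for t in threads:
        t.join()
    return sorted(started), sorted(exited)


if __name__ == "__main__":
    # Ranks 0 and 2 behave like MPI processes; rank 1 behaves like "hostname".
    print(launch([True, False, True]))
```

In this model the launcher can only rely on exit events, since those are the only ones guaranteed for every process. That is roughly the constraint described above: the launch path has to be event-driven rather than a simple "wait for N reports" loop.
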
Jeff has noted that support for non-MPI jobs is not something most (all?) MPIs currently provide, nor something that users are likely to exploit, since they can more easily just "qsub hostname" (or the equivalent for their environment). While nice for debugging purposes, therefore, it isn't clear that supporting non-MPI jobs is worth the increased code complexity and fragility. In addition, the fact that we do not know whether a job will call Init limits our ability to do collective communications within the RTE, and hence our scalability; see the note on that specific subject for further discussion.

Dropping this support would be a "regression" in behavior, though, so the questions for the community are:

(a) Do we want to retain the ability to run non-MPI jobs with mpirun as-is (and accept the tradeoffs, including the one described below in II)?

(b) Do we provide a flag to mpirun to indicate "this is NOT an MPI job" so we can act accordingly (perhaps adding the distinction that "orterun" must be used for non-MPI jobs)?

(c) Do we simply eliminate support for non-MPI jobs?

(d) Other suggestions?

Ralph