Thanks for the response, Jeff.
Jeff Squyres wrote:
> Greetings Josh.
> No, we don't have an easy way to show which plugins were loaded and
> may/will be used during the run. The modules you found below in
> --display-map are only a few of the plugins (all dealing with the
> run-time environment, and only used on the back-end nodes, so it may not
> be what you're looking for -- e.g., it doesn't show the plugins used by
> mpirun).
> What do you need to know?
Well, basically I want to know which MCA components are being used to
start up a job. I'm confused about the difference between "used by mpirun"
and "used on the back-end nodes". Doesn't --display-map show which MCA
modules will be used to start the back-end processes?
The overarching issue is that I'm just beginning to test my build, and
when I attempt to start up a job, it just hangs:
[ats@nt147 ~]$ mpirun --mca pls rsh -np 1 ./cpi
[nt147.penguincomputing.com:04640] [0,0,0] ORTE_ERROR_LOG: Not available in file ras_bjs.c at line 247
The same thing happens if I disable the bjs RAS component, since bjs
isn't really used with Scyld anymore:
[ats@nt147 ~]$ mpirun --mca ras ^bjs --mca pls rsh -np 1 ./cpi
<hang>
The interesting thing here is that orted starts up, but I'm not sure
what is supposed to happen next:
[root@nt147 ~]# ps -auxwww | grep orte
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.3/FAQ
ats 4647 0.0 0.0 48204 2136 ? Ss 12:45 0:00 orted --bootproxy 1 --name 0.0.1 --num_procs 2 --vpid_start 0 --nodename nt147.penguincomputing.com --universe a...@nt147.penguincomputing.com:default-universe-4645 --nsreplica "0.0.0;tcp://192.168.5.211:59110;tcp://10.10.10.1:59110;tcp://10.11.10.1:59110" --gprreplica "0.0.0;tcp://192.168.5.211:59110;tcp://10.10.10.1:59110;tcp://10.11.10.1:59110" --set-sid
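To make the orted arguments above easier to read, here is a rough Python sketch that splits the --nsreplica value into its process name and the list of contact addresses. The function name parse_replica is made up for illustration, and the "name;uri;uri" layout is only my reading of the ps output, not a documented format.

```python
def parse_replica(value):
    """Split a replica string into (name, [(scheme, host, port), ...]).

    Assumes the 'name;scheme://host:port;...' layout seen in the ps
    output above; this is an inference, not a documented format.
    """
    name, *uris = value.strip('"').split(";")
    endpoints = []
    for uri in uris:
        # Each entry looks like tcp://192.168.5.211:59110
        scheme, _, hostport = uri.partition("://")
        host, _, port = hostport.rpartition(":")
        endpoints.append((scheme, host, int(port)))
    return name, endpoints


# Value copied from the --nsreplica argument in the ps output.
replica = ('"0.0.0;tcp://192.168.5.211:59110;'
           'tcp://10.10.10.1:59110;tcp://10.11.10.1:59110"')

name, endpoints = parse_replica(replica)
for scheme, host, port in endpoints:
    print(f"{name} advertised via {scheme} at {host}:{port}")
```

It shows that process 0.0.0 is advertised on three different interfaces, all on port 59110.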
Finally, I should note that the upcoming release of Scyld will include
Open MPI, which is how all of this got started.
-Joshua Bernstein
Software Engineer
Penguin Computing