Thanks for the response, Jeff,

Jeff Squyres wrote:
Greetings Josh.

No, we don't have an easy way to show which plugins were loaded and may/will be used during the run. The modules you found below in --display-map are only a few of the plugins (all dealing with the run-time environment, and only used on the back-end nodes, so it may not be what you're looking for -- e.g., it doesn't show the plugins used by mpirun).
What do you need to know?

Well, basically I want to know which MCA components are being used to start up a job. I'm confused about the difference between "used by mpirun" and used on the back-end nodes. Doesn't --display-map show which MCA modules will be used to start the back-end processes?

The overarching issue is that I'm just beginning to test my build, and when I attempt to start up a job, it hangs:

[ats@nt147 ~]$ mpirun --mca pls rsh -np 1 ./cpi
[nt147.penguincomputing.com:04640] [0,0,0] ORTE_ERROR_LOG: Not available in file ras_bjs.c at line 247

The same thing happens if I disable the bjs RAS MCA component, since bjs really isn't used with Scyld anymore:

[ats@nt147 ~]$ mpirun --mca ras ^bjs --mca pls rsh -np 1 ./cpi
<hang>

The interesting thing here is that orted starts up, but I'm not sure what is supposed to happen next:

[root@nt147 ~]# ps -auxwww | grep orte
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.3/FAQ
ats 4647 0.0 0.0 48204 2136 ? Ss 12:45 0:00 orted --bootproxy 1 --name 0.0.1 --num_procs 2 --vpid_start 0 --nodename nt147.penguincomputing.com --universe a...@nt147.penguincomputing.com:default-universe-4645 --nsreplica "0.0.0;tcp://192.168.5.211:59110;tcp://10.10.10.1:59110;tcp://10.11.10.1:59110" --gprreplica "0.0.0;tcp://192.168.5.211:59110;tcp://10.10.10.1:59110;tcp://10.11.10.1:59110" --set-sid

Finally, it should be noted that the upcoming release of Scyld will include Open MPI, which is how all of this got started.

-Joshua Bernstein
Software Engineer
Penguin Computing
