On 04.02.2012 at 00:15, Tom Bryan wrote:

A more detailed answer will follow later, as it's late here. But one short note:
-pe orte 5   => give me exactly 5 slots
-pe orte 5-5 => the same
-pe orte 5-  => give me at least 5 slots, up to the maximum you can get
                right now in the cluster

The output of `qstat -g t` (MASTER/SLAVE) only tells you what was granted,
not what is necessarily used by you right now. It's up to the application
to use the granted slots.

==

Requesting exactly 5 will show you either "one master and four slaves" or
"one master and five slaves". This depends on the setting of
"job_is_first_task" in the definition of the PE. The rationale behind this
is that it adjusts the number of `qrsh -inherit` calls which are allowed
(just imagine single-core machines to understand the idea behind it).

In a plain MPI application "job_is_first_task" is usually set to "yes", as
the executable started on the machine where the `mpiexec` is issued in the
job script is also doing some work (usually rank 0). This results in 4
`qrsh -inherit` calls being allowed, for a total of 5.

If your rank 0 is for any reason only collecting results and not doing any
work (i.e. a master/slave application like in PVM), you would say
"job_is_first_task no". This has the effect that one additional
`qrsh -inherit` is allowed - in detail: a local one plus 4 to other nodes
to start 5 slaves.

Nowadays, where you have many cores per node and may even use only one
`qrsh -inherit` per slave machine and then fork or start threads for the
additional processes, this setting is less meaningful and would need some
new options in the PE: https://arc.liv.ac.uk/trac/SGE/ticket/197

(I appended two small sketches - a job script and a PE definition - below
your quoted mail.)

-- Reuti

> 1. I'm still surprised that the SGE behavior is so different when I
> configure my SGE queue differently. See test "a" in the .tgz. When I just
> run mpitest in mpi.sh and ask for exactly 5 slots (-pe orte 5-5), it works
> if the queue is configured to use a single host. I see 1 MASTER and 4
> SLAVES in qstat -g t, and I get the correct output. If the queue is set to
> use multiple hosts, the jobs hang in spawn/init, and I get errors
>
> [grid-03.cisco.com][[19159,2],0][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect] connect() to 192.168.122.1 failed: Connection refused (111)
> [grid-10.cisco.com:05327] [[19159,0],3] routed:binomial: Connection to lifeline [[19159,0],0] lost
> [grid-16.cisco.com:25196] [[19159,0],1] routed:binomial: Connection to lifeline [[19159,0],0] lost
> [grid-11.cisco.com:63890] [[19159,0],2] routed:binomial: Connection to lifeline [[19159,0],0] lost
>
> So, I'll just assume that mpiexec does some magic that is needed in the
> multi-machine scenario but not in the single machine scenario.
>
> 2. I guess I'm not sure how SGE is supposed to behave. Experiments "a" and
> "b" were identical except that I changed -pe orte 5-5 to -pe orte 5-. The
> single machine case works like before, and the multiple exec host case
> fails as before. The difference is that qstat -g t shows additional SLAVEs
> that don't seem to correspond to any jobs on the exec hosts. Are these
> SLAVEs just slots that are reserved for my job but that I'm not using? If
> my job will only use 5 slots, then I should set the SGE qsub job to ask
> for exactly 5 with "-pe orte 5-5", right?
>
> 3. Experiment "d" was similar to "b", but mpi.sh uses "mpiexec -np 1
> mpitest" instead of running mpitest directly. Now both the single machine
> queue and the multiple machine queue work. So, mpiexec seems to make my
> multi-machine configuration happier. In this case, I'm still using
> "-pe orte 5-", and I'm still seeing the extra SLAVE slots granted in
> qstat -g t.
> 4. Based on "d", I thought that I could follow the approach in "a". That
> is, for experiment "e", I used mpiexec -np 1, but I also used -pe orte 5-5.
> I thought that this would make the multi-machine queue reserve only the 5
> slots that I needed. The single machine queue works correctly, but now the
> multi-machine case hangs with no errors. The output from qstat and pstree
> are what I'd expect, but it seems to hang in Spawn_multiple and
> Init_thread. I really expected this to work.
>
> I'm really confused by experiment "e" with multiple machines in the queue.
> Based on "a" and "d", I thought that the combination of mpiexec -np 1
> would permit the multi-machine scheduling to work with MPI while the
> "-pe orte 5-5" would limit the slots to exactly the number needed to run.
>
> ---Tom
>
> <mpiExperiments.tgz>
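
PS: For illustration, a minimal job script sketch for the plain MPI case
("job_is_first_task yes"); the script and executable names are only
placeholders, and I assume the PE is called "orte" as in your tests:

#!/bin/sh
#$ -S /bin/sh
#$ -N mpitest
#$ -cwd
# request exactly 5 slots; "-pe orte 5-" would instead grant anything
# from 5 up to what is free in the cluster right now
#$ -pe orte 5-5

# NSLOTS is set by SGE to the number of granted slots
echo "Granted slots: $NSLOTS"
mpiexec -np $NSLOTS ./mpitest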
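
PPS: And a sketch of the matching PE definition as `qconf -sp orte` would
show it - the slot count and allocation_rule here are just an example, only
control_slaves and job_is_first_task matter for the point above:

pe_name            orte
slots              999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $round_robin
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary FALSE

With "job_is_first_task FALSE" and a request of 5 slots, 5 `qrsh -inherit`
calls are allowed and `qstat -g t` shows one MASTER plus five SLAVE
entries; with TRUE only 4 are allowed and you see one MASTER plus four
SLAVEs. control_slaves must be TRUE in any case for the tight integration,
so that the `qrsh -inherit` calls are permitted at all.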