On 04.02.2012, at 00:15, Tom Bryan wrote:

A more detailed answer later, as it's late here. But one short note:

-pe orte 5 => give me exactly 5 slots

-pe orte 5-5 => the same

-pe orte 5- => give me at least 5 slots, up to the maximum you can get right 
now in the cluster
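
For example (a quick sketch; "orte" is the PE name used here and mpi.sh is just a 
placeholder jobscript):

    qsub -pe orte 5   mpi.sh    # exactly 5 slots
    qsub -pe orte 5-5 mpi.sh    # exactly 5 slots, same as above
    qsub -pe orte 5-  mpi.sh    # at least 5 slots, more if free right now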

The MASTER/SLAVE output in `qstat -g t` only tells you what is granted, not what 
is necessarily used by you right now. It's up to the application to use the 
granted slots.
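
As an illustration (made-up job ID and hostnames, columns of `qstat -g t` 
abbreviated): a job granted 5 slots across three nodes shows up like this, no 
matter how many of the slots actually run a process at the moment:

    job-ID  name    user  state  queue          master
    4711    mpi.sh  tom   r      all.q@node01   MASTER
                                 all.q@node01   SLAVE
                                 all.q@node02   SLAVE
                                 all.q@node02   SLAVE
                                 all.q@node03   SLAVE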

==

Requesting exactly 5 will show you either "one master and four slaves" or "one 
master and five slaves". This depends on the setting of "job_is_first_task" in 
the definition of the PE.

The rationale behind this is that it adjusts the number of `qrsh -inherit` calls 
which are allowed (just imagine single-core machines to understand the idea 
behind it). In a plain MPI application "job_is_first_task" is usually set to 
"yes", because the executable started on the machine where `mpiexec` is issued 
in the jobscript is also doing some work (usually rank 0). This results in 4 
`qrsh -inherit` calls being allowed, for a total of 5 processes.
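
A minimal jobscript for this plain MPI case might look like the sketch below 
(assuming Open MPI was built with SGE support so that `mpiexec` itself issues 
the `qrsh -inherit` calls; the names are illustrative):

    #!/bin/sh
    #$ -S /bin/sh
    #$ -N mpitest
    #$ -cwd
    #$ -pe orte 5
    # $NSLOTS is set by SGE to the number of granted slots (5 here).
    # Rank 0 runs on this node and does real work, hence
    # "job_is_first_task yes"; the remaining ranks are started on the
    # other granted slots via `qrsh -inherit`.
    mpiexec -np $NSLOTS ./mpitest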

If your rank 0 for any reason only collects results and does not do any work 
(i.e. a master/slave application like in PVM), you would want to set 
"job_is_first_task no". This has the effect that one additional `qrsh -inherit` 
is allowed - in detail: a local one, plus 4 to other nodes, to start 5 slaves.
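
For reference, the setting lives in the PE definition (`qconf -sp orte` shows 
it, `qconf -mp orte` edits it). A sketch with typical values for a tight 
integration; everything besides "job_is_first_task" is just a common default 
here:

    pe_name            orte
    slots              9999
    start_proc_args    /bin/true
    stop_proc_args     /bin/true
    allocation_rule    $fill_up
    control_slaves     TRUE
    job_is_first_task  FALSE
    urgency_slots      min

With "job_is_first_task FALSE" you get the extra `qrsh -inherit` described 
above; switch it to TRUE for the plain MPI case where rank 0 does real work.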

Nowadays, when you have many cores per node and may use only one `qrsh 
-inherit` per slave machine and then fork or spawn threads for the additional 
processes, this setting is less meaningful and would need some new options in 
the PE:

https://arc.liv.ac.uk/trac/SGE/ticket/197

-- Reuti


> 1. I'm still surprised that the SGE behavior is so different when I
> configure my SGE queue differently.  See test "a" in the .tgz.  When I just
> run mpitest in mpi.sh and ask for exactly 5 slots (-pe orte 5-5), it works
> if the queue is configured to use a single host.  I see 1 MASTER and 4
> SLAVES in qstat -g t, and I get the correct output.  If the queue is set to
> use multiple hosts, the jobs hang in spawn/init, and I get errors
> [grid-03.cisco.com][[19159,2],0][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint
> _complete_connect] connect() to 192.168.122.1 failed: Connection refused
> (111)
> [grid-10.cisco.com:05327] [[19159,0],3] routed:binomial: Connection to
> lifeline [[19159,0],0] lost
> [grid-16.cisco.com:25196] [[19159,0],1] routed:binomial: Connection to
> lifeline [[19159,0],0] lost
> [grid-11.cisco.com:63890] [[19159,0],2] routed:binomial: Connection to
> lifeline [[19159,0],0] lost
> So, I'll just assume that mpiexec does some magic that is needed in the
> multi-machine scenario but not in the single machine scenario.
> 
> 2. I guess I'm not sure how SGE is supposed to behave.  Experiment "a" and
> "b" were identical except that I changed -pe orte 5-5 to -pe orte 5-.  The
> single case works like before, and the multiple exec host case fails as
> before.  The difference is that qstat -g t shows additional SLAVEs that
> don't seem to correspond to any jobs on the exec hosts.  Are these SLAVEs
> just slots that are reserved for my job but that I'm not using?  If my job
> will only use 5 slots, then I should set the SGE qsub job to ask for exactly
> 5 with "-pe orte 5-5", right?
> 
> 3. Experiment "d" was similar to "b", but mpi.sh uses "mpiexec -np 1
> mpitest" instead of running mpitest directly.  Now both the single machine
> queue and multiple machine queue work.  So, mpiexec seems to make my
> multi-machine configuration happier.  In this case, I'm still using "-pe
> orte 5-", and I'm still seeing the extra SLAVE slots granted in qstat -g t.
> 
> 4. Based on "d", I thought that I could follow the approach in "a".  That
> is, for experiment "e", I used mpiexec -np 1, but I also used -pe orte 5-5.
> I thought that this would make the multi-machine queue reserve only the 5
> slots that I needed.  The single machine queue works correctly, but now the
> multi-machine case hangs with no errors.  The output from qstat and pstree
> are what I'd expect, but it seems to hang in Spawn_multiple and Init_thread.
> I really expected this to work.
> 
> I'm really confused by experiment "e" with multiple machines in the queue.
> Based on "a" and "d", I thought that a combination of mpiexec -np 1 would
> permit the multi-machine scheduling to work with MPI while the "-pe orte
> 5-5" would limit the slots to exactly the number that it needed to run.
> 
> ---Tom
> 
> <mpiExperiments.tgz>

