On Tue, Nov 16, 2010 at 12:23 PM, Terry Dontje <terry.don...@oracle.com> wrote:

>  On 11/16/2010 01:31 PM, Reuti wrote:
>
> Hi Ralph,
>
> On 16.11.2010, at 15:40, Ralph Castain wrote:
>
>
>  2. have SGE bind procs it launches to -all- of those cores. I believe SGE 
> does this automatically to constrain the procs to running on only those cores.
>
>  This is another "bug/feature" in SGE: it's a matter of discussion whether 
> the shepherd should get exactly one core for each call (in case you use 
> more than one `qrsh` per node) or *all* assigned cores (which we need right 
> now, as the Open MPI processes will be forks of the orted daemon). I filed 
> an issue about exactly this situation a long time ago; a 
> "limit_to_one_qrsh_per_host yes/no" switch in the PE definition would do 
> (this setting should then also change the core allocation of the master 
> process):
> http://gridengine.sunsource.net/issues/show_bug.cgi?id=1254
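
For context, a tight-integration PE for Open MPI typically looks roughly
like the sketch below (field values are illustrative). The last line shows
where the switch proposed in issue 1254 would live; it is hypothetical, not
an existing SGE keyword:

    pe_name            orte
    slots              999
    user_lists         NONE
    xuser_lists        NONE
    start_proc_args    /bin/true
    stop_proc_args     /bin/true
    allocation_rule    $fill_up
    control_slaves     TRUE
    job_is_first_task  FALSE
    urgency_slots      min
    limit_to_one_qrsh_per_host  NO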
>
> I believe this is indeed the crux of the issue.
>
>  Fantastic that we share the same view.
>
>
>  FWIW, I think I agree too.
>
>   3. tell OMPI to --bind-to-core.
>
> In other words, tell SGE to allocate a certain number of cores on each node, 
> but to bind each proc to all of them (i.e., don't bind a proc to a specific 
> core). I'm pretty sure that is a standard SGE option today (at least, I know 
> it used to be). I don't believe any patch or devel work is required (to 
> either SGE or OMPI).
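
As a concrete (hedged) sketch of that approach; the PE name, slot count,
and script names are assumptions, and the -binding switch is the syntax
introduced in SGE 6.2u5 (its default "set" strategy binds the job's
processes on startup):

    $ qsub -pe orte 8 -binding linear:2 myjob.sh

    # inside myjob.sh, with 2 cores/node allocated:
    mpirun --bind-to-core ./my_mpi_app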
>
>  When you use a fixed allocation_rule and a matching -binding request, it 
> will work today. But in any other case the processes won't be distributed 
> in the correct way.
>
> Is it possible to not include the -binding request? If SGE is told to use a 
> fixed allocation_rule, and to allocate (for example) 2 cores/node, then 
> won't the orted see itself bound to two specific cores on each node?
>
>  When you leave out the -binding, all jobs are allowed to run on any core.
>
>
>
>  We would then be okay, as the spawned children of orted would inherit its 
> binding. Just don't tell mpirun to bind the processes, and the threads of 
> those MPI procs will be able to operate across the provided cores.
>
> Or does SGE only allocate 2 cores/node in that case (i.e., allocate, but no 
> -binding given), without binding the orted to any two specific cores? If so, 
> that would be a problem, as the orted would think itself unconstrained. 
> If I understand the thread correctly, you're saying that this is what 
> happens today - true?
>
>  Exactly. It won't apply any binding at all, and the orted would think 
> itself unlimited, i.e. constrained only by the number of slots it should 
> use there.
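
An easy way to see which of these cases applies from inside a job
(hypothetical job-script snippet, Linux-only; taskset is from util-linux):

    # in the job script, before mpirun:
    taskset -cp $$                            # this shell's allowed cores
    grep Cpus_allowed_list /proc/self/status

If SGE applied a binding, the list shows only the granted cores; with no
-binding at all, it shows every core in the node.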
>
>
>  So I guess I have a question for Ralph. I thought (and this might be
> mixing some of the ideas Jeff and I have been talking about) that when an
> RM executes the orted with a bound set of resources (i.e., cores), the
> orted would bind the individual processes to subsets of those bound
> resources. Is this not actually the case for the 1.4.x branch? I believe
> it is the case for the trunk, based on Jeff's refactoring.
>

You are absolutely correct, Terry, and the 1.4 release series does include
the proper code. The point here, though, is that SGE binds the orted to a
single core, even though other cores are also allocated. So the orted
detects an external binding of one core, and binds all its children to that
same core.

What I had suggested to Reuti was to not include the -binding flag to SGE in
the hopes that SGE would then bind the orted to all the allocated cores.
However, as I feared, SGE in that case doesn't bind the orted at all - and
so we assume the entire node is available for our use.

This is an SGE issue. We need them to bind the orted to -all- the allocated
cores (and only those cores) in order for us to operate correctly.
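
The mechanism is easy to observe directly, since a process can print the
affinity mask it inherited from its parent. A minimal standalone Linux
sketch (glibc's sched_getaffinity; this is not Open MPI code):

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        cpu_set_t mask;
        CPU_ZERO(&mask);
        if (sched_getaffinity(0, sizeof(mask), &mask) != 0) {
            perror("sched_getaffinity");
            return 1;
        }
        /* If SGE bound the shepherd to one core, every process it forks
         * (the orted, and the orted's children in turn) inherits this
         * same one-core mask. */
        printf("bound to %d core(s):", CPU_COUNT(&mask));
        for (int c = 0; c < CPU_SETSIZE; c++)
            if (CPU_ISSET(c, &mask))
                printf(" %d", c);
        printf("\n");
        return 0;
    }

Run under a job submitted with -binding, it reports only the granted
core(s), i.e. the mask the orted currently detects; run with no -binding,
it reports every core in the machine, which is why the orted then assumes
the whole node is available.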



>
>
> --
> Terry D. Dontje | Principal Software Engineer
> Developer Tools Engineering | +1.781.442.2631
>  Oracle - Performance Technologies
>  95 Network Drive, Burlington, MA 01803
> Email terry.don...@oracle.com
>
>
>
>
