Ralph Castain <r...@open-mpi.org> writes:

> On Apr 25, 2013, at 5:33 PM, Vladimir Yamshchikov <yaxi...@gmail.com> wrote:
>
>> $NSLOTS is what is requested by -pe openmpi <ARG> in the script; my
>> understanding is that by default it counts threads.

Is there something in the documentation
<http://arc.liv.ac.uk/SGE/htmlman/manuals.html> that suggests that?  [It
currently says processes, incorrectly, rather than slots in at least one
place, which I'll fix.]

> What you want to do is:
>
> 1. request a number of slots = the number of application processes * the 
> number of threads each process will run

[If really necessary, maybe use a job submission verifier to fiddle what
the user supplies.]

> 2. execute mpirun with the --cpus-per-proc N option, where N = the number of 
> threads each process will run.
>
> This will ensure you have one core for each thread. Note, however,
> that we don't actually bind a thread to the core - so having more
> threads than there are cores on a socket can cause a thread to bounce
> across sockets and (therefore) potentially across NUMA regions.
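
A minimal sketch of the arithmetic in the two quoted steps, assuming the
job was granted NSLOTS slots and each MPI process runs a fixed number of
threads.  The helper name mpirun_command and the program name
./hybrid_app are made up for illustration; only -np and --cpus-per-proc
are real mpirun options, and whether --cpus-per-proc is available
depends on the Open MPI version.

```python
def mpirun_command(nslots, threads_per_proc, program):
    """Build an mpirun argument list for a hybrid MPI/threaded job.

    nslots           -- slots granted by the scheduler (e.g. $NSLOTS)
    threads_per_proc -- threads each MPI process will run
    program          -- executable to launch (hypothetical name below)
    """
    if threads_per_proc < 1:
        raise ValueError("threads_per_proc must be at least 1")
    if nslots % threads_per_proc != 0:
        raise ValueError("slots must be a multiple of threads per process")

    # slots = processes * threads per process, so invert that here
    nprocs = nslots // threads_per_proc
    return [
        "mpirun",
        "-np", str(nprocs),
        "--cpus-per-proc", str(threads_per_proc),
        program,
    ]


# Example: 8 slots requested with -pe openmpi 8, 4 threads per process
print(" ".join(mpirun_command(8, 4, "./hybrid_app")))
```

In a real job script the first argument would come from the NSLOTS
environment variable rather than a literal.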

Does that mean that binding is suppressed in that case, as opposed to
binding N cores per process, which is what I thought it did?  (I can't
immediately test it.)

I don't understand what causes the over-subscription in this specific
case.  However, if the program's runtime needs to be told how many
threads to use, you can set OMP_NUM_THREADS with an SGE JSV; see the
archives of the gridengine list.  (The SGE_BINDING variable that recent
SGE provides to the job can be converted to GOMP_CPU_AFFINITY etc., but
that's probably only useful for single-process jobs.)
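
As a rough sketch of the SGE_BINDING conversion for a single-process
job: the code below assumes SGE_BINDING holds a whitespace-separated
list of OS processor numbers, and that anything else (unset, empty, or
non-numeric) means no binding was granted.  Check that assumption
against your SGE version.  The function name is made up for
illustration.

```python
import os


def openmp_env_from_sge_binding(sge_binding):
    """Derive OpenMP settings from an SGE_BINDING value.

    Assumes sge_binding is a whitespace-separated list of processor
    numbers, e.g. "0 1 2 3".  Returns an empty dict when the value does
    not look like that, so the caller leaves the environment alone.
    """
    cores = (sge_binding or "").split()
    if not cores or not all(c.isdigit() for c in cores):
        return {}
    return {
        # libgomp accepts a space-separated list of CPU numbers
        "GOMP_CPU_AFFINITY": " ".join(cores),
        # one thread per bound core
        "OMP_NUM_THREADS": str(len(cores)),
    }


# In a job wrapper: update the environment before starting the program
os.environ.update(openmp_env_from_sge_binding(os.environ.get("SGE_BINDING")))
```

This gives every thread of the one process its own bound core; with
several MPI processes per host each would need its own subset, which is
the per-process problem mentioned below.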

There may be a case for OMPI to support this sort of thing for DRMs like
SGE which don't start the MPI processes themselves; you potentially need
to export the binding information per-process.

-- 
Community Grid Engine:  http://arc.liv.ac.uk/SGE/
