On Apr 30, 2013, at 7:52 AM, Dave Love <d.l...@liverpool.ac.uk> wrote:

> Ralph Castain <r...@open-mpi.org> writes:
> 
>> On Apr 25, 2013, at 5:33 PM, Vladimir Yamshchikov <yaxi...@gmail.com> wrote:
>> 
>>> $NSLOTS is what is requested by -pe openmpi <ARG> in the script; my 
>>> understanding is that by default it counts threads.
> 
> Is there something in the documentation
> <http://arc.liv.ac.uk/SGE/htmlman/manuals.html> that suggests that?  [It
> currently incorrectly says processes, rather than slots, in at least one
> place, which I'll fix.]
> 
>> What you want to do is:
>> 
>> 1. request a number of slots = the number of application processes * the 
>> number of threads each process will run
> 
> [If really necessary, maybe use a job submission verifier to fiddle what
> the user supplies.]

?? We have no way of knowing how many threads a process will start, so the user 
has to take that responsibility.

> 
>> 2. execute mpirun with the --cpus-per-proc N option, where N = the number of 
>> threads each process will run.
>> 
>> This will ensure you have one core for each thread. Note, however,
>> that we don't actually bind a thread to the core - so having more
>> threads than there are cores on a socket can cause a thread to bounce
>> across sockets and (therefore) potentially across NUMA regions.
> 
> Does that mean that binding is suppressed in that case, as opposed to
> binding N cores per process, which is what I thought it did?  (I can't
> immediately test it.)

No, we do what the user requests. We will bind the process to the N cores - if 
those cores span sockets, that is the responsibility of the user. We try to 
keep it all together, but if you ask for too many...
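
As a rough illustration of steps 1 and 2, here is a minimal Python sketch that works out the slot count to request and assembles the matching mpirun command line. The helper names (slots_needed, build_mpirun_command) and the program path are hypothetical; only -np and the --cpus-per-proc option discussed in this thread are assumed to be available in your Open MPI version.

```python
def slots_needed(num_procs, threads_per_proc):
    """Step 1: slots to request = processes * threads per process."""
    if num_procs < 1 or threads_per_proc < 1:
        raise ValueError("process and thread counts must be positive")
    return num_procs * threads_per_proc


def build_mpirun_command(nslots, threads_per_proc, program):
    """Step 2: derive the process count from the granted slots and
    pass the per-process thread count as --cpus-per-proc."""
    if threads_per_proc < 1:
        raise ValueError("threads_per_proc must be positive")
    if nslots % threads_per_proc != 0:
        raise ValueError("slots must be a multiple of threads per process")
    num_procs = nslots // threads_per_proc
    # 'program' is a list: executable followed by its arguments (hypothetical).
    return ["mpirun", "-np", str(num_procs),
            "--cpus-per-proc", str(threads_per_proc)] + list(program)
```

In a job script, nslots would come from $NSLOTS, and the thread count has to be supplied by the user, since nothing else knows how many threads each process will start.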

> 
> I don't understand the problem in this specific case which causes
> over-subscription.  However, if the program's runtime needs instruction,
> you can do things like setting OMP_NUM_THREADS with an SGE JSV; see
> archives of the gridengine list.  (The SGE_BINDING variable that recent
> SGE provides to the job can be converted to GOMP_CPU_AFFINITY etc., but
> that's probably only useful for single-process jobs.)
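
A minimal sketch of the SGE_BINDING conversion mentioned above, for the single-process case only. It assumes SGE_BINDING holds a whitespace-separated list of processor numbers; check that against your SGE version. The function names are hypothetical. GOMP_CPU_AFFINITY and OMP_NUM_THREADS are the usual GNU OpenMP runtime variables.

```python
def sge_binding_to_gomp_affinity(sge_binding):
    """Turn an SGE_BINDING value (assumed format: '0 1 2 3') into a
    GOMP_CPU_AFFINITY value, which accepts a space-separated CPU list."""
    cpus = sge_binding.split()
    if not cpus:
        raise ValueError("SGE_BINDING is empty")
    for cpu in cpus:
        if not cpu.isdigit():
            raise ValueError("unexpected SGE_BINDING entry: %r" % cpu)
    return " ".join(cpus)


def openmp_env_from_binding(sge_binding, base_env):
    """Return a copy of base_env with one OpenMP thread per bound
    processor and the threads pinned to those processors."""
    env = dict(base_env)
    affinity = sge_binding_to_gomp_affinity(sge_binding)
    env["GOMP_CPU_AFFINITY"] = affinity
    env["OMP_NUM_THREADS"] = str(len(affinity.split()))
    return env
```

The resulting dictionary could be passed as the env argument of subprocess.run when launching the program from a wrapper.
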
> 
> There may be a case for OMPI to support this sort of thing for DRMs like
> SGE which don't start the MPI processes themselves; you potentially need
> to export the binding information per-process.

I'm unaware of any OS that currently binds at the process thread level. Can you 
refer us to something?

> 
> -- 
> Community Grid Engine:  http://arc.liv.ac.uk/SGE/
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

