On 11.11.2014 at 16:13, Ralph Castain wrote:

> This clearly displays the problem - if you look at the reported “allocated 
> nodes”, you see that we only got one node (cn6050). This is why we mapped all 
> your procs onto that node.
> So the real question is - why? Can you show us the content of PE_HOSTFILE?
>> On Nov 11, 2014, at 4:51 AM, SLIM H.A. <> wrote:
>> Dear Reuti and Ralph
>> Below is the output of the run for openmpi 1.8.3 with this line
>> mpirun -np $NSLOTS --display-map --display-allocation --cpus-per-proc 1 $exe
>> master=cn6050
>> PE=orte
>> JOB_ID=2482923
>> Got 32 slots.
>> slots:
>> cn6050 16 par6.q@cn6050 <NULL>
>> cn6045 16 par6.q@cn6045 <NULL>

The above looks like the PE_HOSTFILE. So it should be 16 slots per node.
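
To double-check what SGE actually granted, the slots per host can be totalled straight from the file. This is only a minimal sketch: the function name `pe_slots_per_host` is made up for illustration, and it assumes every line has the four-column layout shown above (host, slots, queue, processor range). It is written for a POSIX shell, so it would need adapting for a csh job script.

```shell
# Sum the slot counts per host from an SGE PE hostfile.
# Assumed line layout: <host> <slots> <queue> <processor range>
pe_slots_per_host() {
    awk '{ slots[$1] += $2 } END { for (h in slots) print h, slots[h] }' "$1" | sort
}

# Hypothetical usage inside the job script:
# pe_slots_per_host "$PE_HOSTFILE"
```

If the totals here match the request while mpirun's "ALLOCATED NODES" section lists fewer nodes, the allocation is being lost between SGE and Open MPI rather than inside SGE.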

I wonder whether one of the environment variables that Open MPI normally uses 
to discover that it's running inside SGE was reset.

I.e. SGE_ROOT, JOB_ID, ARC and PE_HOSTFILE are untouched before the job starts?
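
One way to verify this is to print these four variables in the job script right before the mpirun line. The following is a rough sketch only: `check_sge_env` is a hypothetical helper, not part of SGE or Open MPI, and it assumes a POSIX shell (a csh script using setenv would need the equivalent `printenv` calls instead).

```shell
# Report whether the variables named above are set in the job environment.
# Returns 0 if all four are set and non-empty, 1 otherwise.
check_sge_env() {
    missing=0
    for var in SGE_ROOT JOB_ID ARC PE_HOSTFILE; do
        # Indirect lookup of the variable whose name is stored in $var.
        eval "val=\${$var-}"
        if [ -z "$val" ]; then
            echo "$var is unset or empty"
            missing=1
        else
            echo "$var=$val"
        fi
    done
    return $missing
}

# Hypothetical usage, just before mpirun:
# check_sge_env
```

Any variable reported as unset or empty at that point would be a candidate for why only the local node shows up in the allocation.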

Supplying "-np $NSLOTS" shouldn't be necessary though.

-- Reuti

>> Tue Nov 11 12:37:37 GMT 2014
>> ======================   ALLOCATED NODES   ======================
>>         cn6050: slots=16 max_slots=0 slots_inuse=0 state=UP
>> =================================================================
>> Data for JOB [57374,1] offset 0
>> ========================   JOB MAP   ========================
>> Data for node: cn6050  Num slots: 16   Max slots: 0    Num procs: 32
>>         Process OMPI jobid: [57374,1] App: 0 Process rank: 0
>>         Process OMPI jobid: [57374,1] App: 0 Process rank: 1
>> …
>>         Process OMPI jobid: [57374,1] App: 0 Process rank: 31
>> Also
>> ompi_info | grep grid
>> gives                 MCA ras: gridengine (MCA v2.0, API v2.0, Component 
>> v1.8.3)
>> and
>> ompi_info | grep psm
>> gives                 MCA mtl: psm (MCA v2.0, API v2.0, Component v1.8.3)
>> because the interconnect is TrueScale/QLogic
>> and
>> setenv OMPI_MCA_mtl "psm"
>> is set in the script. This is the PE
>> pe_name           orte
>> slots             4000
>> user_lists        NONE
>> xuser_lists       NONE
>> start_proc_args   /bin/true
>> stop_proc_args    /bin/true
>> allocation_rule   $fill_up
>> control_slaves    TRUE
>> job_is_first_task FALSE
>> urgency_slots     min
>> Many thanks
>> Henk
>> From: users [] On Behalf Of Ralph Castain
>> Sent: 10 November 2014 05:07
>> To: Open MPI Users
>> Subject: Re: [OMPI users] oversubscription of slots with GridEngine
>> You might also add the --display-allocation flag to mpirun so we can see what 
>> it thinks the allocation looks like. If there are only 16 slots on the node, 
>> it seems odd that OMPI would assign 32 procs to it unless it thinks there is 
>> only 1 node in the job, and oversubscription is allowed (which it won’t be 
>> by default if it read the GE allocation)
>> On Nov 9, 2014, at 9:56 AM, Reuti <> wrote:
>> Hi,
>> On 09.11.2014 at 18:20, SLIM H.A. <> wrote:
>> We switched on hyper threading on our cluster with two eight core sockets 
>> per node (32 threads per node).
>> We configured  gridengine with 16 slots per node to allow the 16 extra 
>> threads for kernel process use but this apparently does not work. Printout 
>> of the gridengine hostfile shows that for a 32 slots job, 16 slots are 
>> placed on each of two nodes as expected. Including the openmpi --display-map 
>> option shows that all 32 processes are incorrectly  placed on the head node.
>> You mean the master node of the parallel job I assume.
>> Here is part of the output
>> master=cn6083
>> PE=orte
>> What allocation rule was defined for this PE - "control_slaves yes" is set?
>> JOB_ID=2481793
>> Got 32 slots.
>> slots:
>> cn6083 16 par6.q@cn6083 <NULL>
>> cn6085 16 par6.q@cn6085 <NULL>
>> Sun Nov  9 16:50:59 GMT 2014
>> Data for JOB [44767,1] offset 0
>> ========================   JOB MAP   ========================
>> Data for node: cn6083  Num slots: 16   Max slots: 0    Num procs: 32
>>       Process OMPI jobid: [44767,1] App: 0 Process rank: 0
>>       Process OMPI jobid: [44767,1] App: 0 Process rank: 1
>> ...
>>       Process OMPI jobid: [44767,1] App: 0 Process rank: 31
>> =============================================================
>> I found some related mailings about a new warning in 1.8.2 about 
>> oversubscription and  I tried a few options to avoid the use of the extra 
>> threads for MPI tasks by openmpi without success, e.g. variants of
>> --cpus-per-proc 1 
>> --bind-to-core 
>> and some others. Gridengine treats hw threads as cores==slots (?) but the 
>> content of $PE_HOSTFILE suggests it distributes the slots sensibly  so it 
>> seems there is an option for openmpi required to get 16 cores per node?
>> Was Open MPI configured with --with-sge?
>> -- Reuti
>> I tried 1.8.2, 1.8.3 and also 1.6.5.
>> Thanks for some clarification that anyone can give.
>> Henk
>> _______________________________________________
>> users mailing list
>> Subscription:
>> Link to this post: 
