Hi,

> On 22.07.2016 at 15:10, Hans-Werner Paulsen <h...@mpa-garching.mpg.de> wrote:
> 
> Dear all,
> this is with SoGE 8.1.8 and OpenMPI 1.10.2. Our cluster has 24 nodes with 20 
> cores each (40 with HT).
> I am looking for information on how to set up a PE for hybrid jobs (OpenMP 
> jobs running on more than one node). I created a PE with "allocation_rule 20", 
> and this seems to work fine with OMP_NUM_THREADS=20. Then a user wanted to run 
> a job with OMP_NUM_THREADS=5 and used "mpirun -n $((NSLOTS/OMP_NUM_THREADS))",

I assume Open MPI simply fills the slots as listed in the $PE_HOSTFILE it was 
given.

It reads $PE_HOSTFILE in orte/mca/ras/gridengine/ras_gridengine_module.c and 
acts accordingly.
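
To illustrate the effect, here is a small sketch that assumes nodes are filled up to their slot count one after another. The function name fill_by_slot and the figure of 200 requested slots are made up for the example; only the 20 slots per node and OMP_NUM_THREADS=5 come from your description.

```shell
#!/bin/sh
# Illustration only: place N processes on nodes by filling each node's
# slots before moving on to the next node (the behaviour assumed above).
fill_by_slot() {
    # $1 = number of MPI processes, $2 = slots per node, $3 = number of nodes
    awk -v n="$1" -v s="$2" -v k="$3" 'BEGIN {
        for (i = 1; i <= k; i++) {
            p = (n > s) ? s : n      # processes that land on node i
            n -= p
            printf "node%02d %d\n", i, p
        }
    }'
}

# Hypothetical job: 200 slots on 10 nodes, OMP_NUM_THREADS=5 -> 40 processes
fill_by_slot $((200 / 5)) 20 10
```

Under that assumption the first two nodes get 20 processes each (20*5=100 threads) and the remaining eight get none, which matches what you observe.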


> but this does not work correctly: some nodes run 20*5=100 threads, while 
> others run none at all. Creating a new PE with "allocation_rule 5" has the 
> disadvantage that one cannot use more than 24*5=120 slots.

Unfortunately this is not implemented by default. What you can do:

- have PEs with a fixed allocation rule, e.g. 20
- request the overall number of slots your job needs
- in the job script, copy the $PE_HOSTFILE to $TMPDIR
- lower the slot count of each node in that copy to the number of MPI 
processes you want on that node* (with 20 slots per node and 
OMP_NUM_THREADS=5, that is 20/5 = 4)
- export PE_HOSTFILE so that it points to the altered copy
(if Open MPI was compiled --with-sge, it should automatically use only the 
hosts given in the file you provide here)
- export OMP_NUM_THREADS=5

* The format is explained in `man sge_pe`, under start_proc_args, in the 
description of "$pe_hostfile".
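
The steps above could look roughly like this in a job script. This is an untested sketch: the function name shrink_pe_hostfile and the binary ./my_hybrid_app are placeholders, and it assumes that the second column of each hostfile line is the slot count, as described in the man page.

```shell
#!/bin/sh
# Sketch only: helper name and application binary are hypothetical.

# Write a copy of a PE hostfile in which the slot count (column 2) of every
# host is divided by the number of threads per MPI process.
shrink_pe_hostfile() {
    # $1 = original hostfile, $2 = threads per process, $3 = output file
    awk -v t="$2" '{ $2 = int($2 / t); print }' "$1" > "$3"
}

# Only do the real work when started by Grid Engine (PE_HOSTFILE is set).
if [ -n "${PE_HOSTFILE:-}" ] && [ -n "${TMPDIR:-}" ]; then
    export OMP_NUM_THREADS=5

    # Alter a copy in $TMPDIR, never the original file
    shrink_pe_hostfile "$PE_HOSTFILE" "$OMP_NUM_THREADS" "$TMPDIR/pe_hostfile"
    export PE_HOSTFILE="$TMPDIR/pe_hostfile"

    # NSLOTS still holds the total number of granted slots
    mpirun -n $((NSLOTS / OMP_NUM_THREADS)) ./my_hybrid_app
fi
```

With "allocation_rule 20" every line of the original file carries 20 slots, so the copy ends up with 4 per host, and each of the 4 processes per node can start 5 threads.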

-- Reuti

PS: A prolog might also be able to alter the provided 
$SGE_JOB_SPOOL_DIR/pe_hostfile directly. I have never tested that; I have only 
ever changed a copy.

_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
