Hi,

> On 22.07.2016 at 15:10, Hans-Werner Paulsen <h...@mpa-garching.mpg.de> wrote:
>
> Dear all,
> this is with SoGE 8.1.8 and OpenMPI 1.10.2. Our cluster has 24 nodes with 20
> cores each (40 with HT).
> I am looking for information how to setup a PE for hybrid jobs (OpenMP jobs
> running on more than one node). I created a PE with "allocation_rule 20", and
> this seems to work fine with OMP_NUM_THREADS=20. Then a user wanted to run a
> job with OMP_NUM_THREADS=5 and used "mpirun -n $((NSLOTS/OMP_NUM_THREADS))",

I assume Open MPI simply fills the slots according to the PE_HOSTFILE it got. It reads $PE_HOSTFILE in orte/mca/ras/gridengine/ras_gridengine_module.c and acts accordingly.

> but this does not work correctly, on some nodes there are running 20*5=100
> threads, on some nodes there are no running threads. Creating a new PE with
> "allocation_rule 5" has the disadvantage, that one cannot use more than
> 24*5=120 slots.

Unfortunately this is not implemented by default. What you can do:

- have PEs with a fixed allocation rule, i.e. 20
- request the overall number of slots your job needs
- in the job script, copy the $PE_HOSTFILE to $TMPDIR
- lower the slot entries therein (i.e. set them to the number of MPI processes you want) on each node*
- export PE_HOSTFILE so that it points to the altered copy (in case Open MPI was compiled --with-sge, it should automatically use only the hosts given in the file you provide here)
- export OMP_NUM_THREADS=5

* The format is explained in `man sge_pe`, under start_proc_args, "$pe_hostfile".

-- Reuti

PS: Maybe a prolog could also alter the provided $SGE_JOB_SPOOL_DIR/pe_hostfile directly. I never tested altering the original; I have only ever changed a copy.
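
For illustration, the steps listed above could be sketched roughly as follows. This is untested on a real cluster and rests on a few assumptions: the hostfile has the four-column layout described in `man sge_pe` (hostname, slots, queue, processor range), the granted slots per host are divisible by the thread count, and Open MPI was built --with-sge. The function name `shrink_pe_hostfile` and the binary `./my_hybrid_app` are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper (not part of SGE or Open MPI): write a copy of a
# PE hostfile in which the slot count of every line (column 2) is divided
# by the number of OpenMP threads per MPI process.
# Assumed line format, per `man sge_pe`: hostname slots queue processor_range
shrink_pe_hostfile() {
    src=$1      # original hostfile, e.g. "$PE_HOSTFILE"
    dst=$2      # altered copy, e.g. "$TMPDIR/pe_hostfile"
    threads=$3  # OpenMP threads per MPI process
    # Assigning to $2 makes awk rebuild the line with single-space separators.
    awk -v t="$threads" '{ $2 = int($2 / t); print }' "$src" > "$dst"
}

# Possible use inside the job script (left commented out, since it needs
# a running SGE job and an Open MPI installation):
#
#   export OMP_NUM_THREADS=5
#   shrink_pe_hostfile "$PE_HOSTFILE" "$TMPDIR/pe_hostfile" "$OMP_NUM_THREADS"
#   export PE_HOSTFILE="$TMPDIR/pe_hostfile"
#   mpirun -n $((NSLOTS / OMP_NUM_THREADS)) ./my_hybrid_app
```

With "allocation_rule 20" and OMP_NUM_THREADS=5, each line of the copy would then list 4 slots instead of 20, so each node should get 4 MPI processes with 5 threads each.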