On 18.09.2013 at 01:19, Brendan Moloney wrote:

> Hi Reuti,
>
>> Yes. But this queue can have the total slot count of the machine. Or are you
>> assigning right now 4 cores to a short queue, 8 to a medium one and the
>> remaining 4 cores of a 16-core machine to a long queue?
>
> I limit the total number of slots for each queue using an RQS. The shortest
> queue (30 minute time limit) is unlimited, and each other queue can use up to
> 10% of the total slots (120 total). For example, the RQS for one of the queues:
>
> {
>    name         longlimit
>    description  NONE
>    enabled      TRUE
>    limit        queues long.q hosts * to slots=12
NB: As it's a rule across all machines in total, the "hosts *" can also be left out. -- Reuti

> }
>
>
>> The $PE_HOSTFILE will contain an entry for each granted queue. The MPI
>> library will need to sum up the ones residing on one and the same host and
>> use forks across the overall amount. This was a bug in Open MPI but it was
>> fixed some time ago; in MPICH2 1.4.1p1 it's still there, even 3.0.4 and
>> 3.1b1.
>> I'll bring it up on the MPICH list (as a result several `qrsh -inherit ...`
>> will be made to one and the same machine and you can face the error you got).
>
> Thank you so much for this! I saw your message to the MPICH list and will
> definitely keep an eye on it. I may also try out OpenMPI soon.
>
> Thanks again,
> Brendan

_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
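
For illustration, the per-host summing of $PE_HOSTFILE entries described above could look roughly like the sketch below. It is not taken from Open MPI or MPICH. It assumes each hostfile line has the usual four columns (host, slots, queue instance, processor range). The function name slots_per_host and the host and queue names in the sample are made up for the example.

```python
def slots_per_host(hostfile_text):
    """Sum the slot counts of all hostfile entries that share a host.

    Assumes each line looks like: <host> <slots> <queue instance> <processor range>
    """
    totals = {}
    for line in hostfile_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        host, slots = fields[0], int(fields[1])
        totals[host] = totals.get(host, 0) + slots
    return totals


# Hypothetical hostfile: node01 was granted slots from two different queues,
# so it appears twice and should be started only once with 6 processes.
sample = """\
node01 4 short.q@node01 UNDEFINED
node01 2 long.q@node01 UNDEFINED
node02 6 short.q@node02 UNDEFINED
"""

for host, slots in slots_per_host(sample).items():
    print(host, slots)
```

With the sample above, node01 ends up with a single entry of 6 slots, so a launcher working from the summed table would contact each host once instead of issuing one `qrsh -inherit ...` per queue entry.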