On 18.09.2013 at 01:19, Brendan Moloney wrote:

> Hi Reuti,
> 
>> Yes. But this queue can have the total slot count of the machine. Or are you 
>> currently assigning 4 cores to a short queue, 8 to a medium one, and the 
>> remaining 4 cores of a 16-core machine to a long queue?
> 
> I limit the total number of slots for each queue using an RQS.  The shortest 
> queue (30 minute time limit) is unlimited, and each other queue can use up to 
> 10% of the total slots (120 total). For example the RQS for one of the queues:
> 
> {
>   name         longlimit
>   description  NONE
>   enabled      TRUE
>   limit        queues long.q hosts * to slots=12

NB: Since this rule applies to the total across all machines, the "hosts *" 
can also be left out.

-- Reuti


> }
> 
> 
>> The $PE_HOSTFILE will contain one entry for each granted queue. The MPI 
>> library needs to sum up the entries residing on one and the same host and 
>> use forks for the overall amount. This was a bug in Open MPI, but it was 
>> fixed some time ago; in MPICH2 1.4.1p1 it's still there, and also in 3.0.4 
>> and 3.1b1. 
>> I'll bring it up on the MPICH list (as a result of the bug, several 
>> `qrsh -inherit ...` calls will be made to one and the same machine, and you 
>> can face the error you got).
> 
> Thank you so much for this!  I saw your message to the MPICH list and will 
> definitely keep an eye on it.  I may also try out OpenMPI soon.
> 
> Thanks again,
> Brendan
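
For illustration, the per-host summing of $PE_HOSTFILE entries mentioned above 
could look roughly like the sketch below. It assumes the usual four-column 
line layout (host, slots, queue instance, processor range); the function name 
slots_per_host and the sample host and queue names are made up for this 
example and are not taken from any MPI library.

```python
def slots_per_host(hostfile_text):
    """Sum the granted slots per host across all queue entries.

    Assumes each non-empty line starts with: <host> <slots> ...
    """
    totals = {}
    for line in hostfile_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        host, slots = fields[0], int(fields[1])
        # One host can appear once per granted queue; add the slots up.
        totals[host] = totals.get(host, 0) + slots
    return totals


# Hypothetical hostfile content: node01 got slots from two queues.
sample = """\
node01 4 short.q@node01 UNDEFINED
node01 2 long.q@node01 UNDEFINED
node02 6 short.q@node02 UNDEFINED
"""

for host, slots in slots_per_host(sample).items():
    print(host, slots)
```

With the totals combined this way, a launcher would only need one remote 
startup per host instead of one per queue entry.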


_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
