Re: [gridengine users] slots equals cores

2020-02-03 Thread Jerome
On 31/01/2020 at 11:26, Reuti wrote:

> 
> Exactly. Doing it on the command line within a loop is not so laborious and 
> it's a fixed feature of a node which will never change during its lifetime.
> 
> -- Reuti
> 
> 
>> Thanks
>>
>> -- 
>> -- Jérôme
>> When a tree falls, you hear it; when the forest grows, not a sound.
>>  (African proverb)
> 

Dear Reuti,

That seems to work like a charm. I had to use "-attr" because I had
already defined an h_vmem value.

Regards

-- 
-- Jérôme
I like work: it fascinates me.
I can sit and look at it for hours.
(Jerome K. Jerome)
___
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users


Re: [gridengine users] slots equals cores

2020-02-03 Thread Hay, William
On Fri, Jan 31, 2020 at 06:26:19PM +0100, Reuti wrote:
> 
> 
> > On 31.01.2020 at 18:23, Jerome IBt wrote:
> > 
> > On 31/01/2020 at 10:19, Reuti wrote:
> >> Hi Jérôme,
> >> 
> >> Personally I would prefer to keep the output of `qquota` short and use it 
> >> only for users' limits, i.e. define the slot limit on an exechost basis 
> >> instead. This can also be done in a loop containing a command line like:
> >> 
> >> $ qconf -mattr exechost complex_values slots=16 node29
> >> 
> >> My experience is that RQS are sometimes screwed up, especially if used in 
> >> combination with some load values (although $num_proc is of course fixed 
> >> in your case).
> >> 
> >> -- Reuti
> > 
> > Dear Reuti,
> > 
> > If I understand correctly, you recommend that I disable the RQS for the
> > core limit and add a complex_values entry for slots on all of the compute
> > nodes?
> 
> Exactly. Doing it on the command line within a loop is not so laborious and 
> it's a fixed feature of a node which will never change during its lifetime.
>
You don't even really need the loop as such: qconf -mattr will take multiple 
hostnames. I tend to feed it arguments with xargs, e.g.:

qhost -l status=POLICYOFF | xargs -r qconf -mattr exechost complex_values status=OK
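
A hedged sketch of the same xargs idea as a dry run. It assumes that qhost prints two header lines followed by a "global" pseudo-host, so the hypothetical helper `host_names` reduces that output to bare hostnames before xargs sees it. The sample text and its columns are made up for illustration, and `status` appears to be a site-defined complex in William's setup. `xargs -r` (skip the command on empty input) is a GNU extension.

```shell
#!/bin/sh
# Sketch: reduce qhost-style output to bare hostnames.
# Assumes two header lines (column titles and a dashed rule) followed by a
# "global" pseudo-host entry; adjust if your qhost prints something different.
host_names() {
    awk 'NR > 2 && $1 != "global" { print $1 }'
}

# Dry run on canned text; "echo" makes xargs print the command instead of
# running it. Real usage would start from: qhost -l status=POLICYOFF
sample='HOSTNAME ARCH NCPU LOAD
-------------------------------
global - - -
node28 lx-amd64 16 0.01
node29 lx-amd64 16 0.02'

printf '%s\n' "$sample" | host_names |
    xargs -r echo qconf -mattr exechost complex_values status=OK
```

Dropping the `echo` would run a single qconf call with every matching hostname appended, which is the point of avoiding the loop.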

William 




Re: [gridengine users] slots equals cores

2020-01-31 Thread Reuti


> On 31.01.2020 at 18:23, Jerome IBt wrote:
> 
> On 31/01/2020 at 10:19, Reuti wrote:
>> Hi Jérôme,
>> 
>> Personally I would prefer to keep the output of `qquota` short and use it 
>> only for users' limits, i.e. define the slot limit on an exechost basis 
>> instead. This can also be done in a loop containing a command line like:
>> 
>> $ qconf -mattr exechost complex_values slots=16 node29
>> 
>> My experience is that RQS are sometimes screwed up, especially if used in 
>> combination with some load values (although $num_proc is of course fixed in 
>> your case).
>> 
>> -- Reuti
> 
> Dear Reuti,
> 
> If I understand correctly, you recommend that I disable the RQS for the
> core limit and add a complex_values entry for slots on all of the compute
> nodes?

Exactly. Doing it on the command line within a loop is not so laborious and 
it's a fixed feature of a node which will never change during its lifetime.

-- Reuti


> Thanks
> 
> -- 
> -- Jérôme
> When a tree falls, you hear it; when the forest grows, not a sound.
>   (African proverb)




Re: [gridengine users] slots equals cores

2020-01-31 Thread Jerome IBt
On 31/01/2020 at 10:19, Reuti wrote:
> Hi Jérôme,
> 
> Personally I would prefer to keep the output of `qquota` short and use it 
> only for users' limits, i.e. define the slot limit on an exechost basis 
> instead. This can also be done in a loop containing a command line like:
> 
> $ qconf -mattr exechost complex_values slots=16 node29
> 
> My experience is that RQS are sometimes screwed up, especially if used in 
> combination with some load values (although $num_proc is of course fixed in 
> your case).
> 
> -- Reuti

Dear Reuti,

If I understand correctly, you recommend that I disable the RQS for the
core limit and add a complex_values entry for slots on all of the compute
nodes?

Thanks

-- 
-- Jérôme
When a tree falls, you hear it; when the forest grows, not a sound.
(African proverb)


Re: [gridengine users] slots equals cores

2020-01-31 Thread Reuti
Hi Jérôme,

Personally I would prefer to keep the output of `qquota` short and use it only 
for users' limits, i.e. define the slot limit on an exechost basis instead. 
This can also be done in a loop containing a command line like:

$ qconf -mattr exechost complex_values slots=16 node29
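
A minimal sketch of such a loop, written as a dry run. It assumes the host names and core counts are already available as "hostname cores" lines (a hypothetical input format; they could, for instance, be extracted from qhost output). The helper name `emit_slot_cmds` is made up for illustration, and the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch: print one "qconf -mattr" command per host, with slots = core count.
# Input: lines of "<hostname> <cores>" on stdin (hypothetical format).
# The commands are only printed; pipe the output to "sh" to apply them.
emit_slot_cmds() {
    while read -r host cores; do
        # Skip blank lines and anything without a purely numeric core count.
        case "$cores" in
            ''|*[!0-9]*) continue ;;
        esac
        printf 'qconf -mattr exechost complex_values slots=%s %s\n' "$cores" "$host"
    done
}

# Example: node29 with 16 slots is the case from the command line above;
# compute-2-0 with 64 cores is the node described in the original question.
printf 'node29 16\ncompute-2-0 64\n' | emit_slot_cmds
```

Printing first makes it easy to review the generated commands before they touch the exechost configuration.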

My experience is that RQS are sometimes screwed up, especially if used in 
combination with some load values (although $num_proc is of course fixed in 
your case).

-- Reuti


> On 31.01.2020 at 17:00, Jerome wrote:
> 
> Dear all,
> 
> I'm facing a new problem on my cluster with SGE. I haven't seen this
> before, or maybe I never noticed it.
> I have some nodes with 2 queues: one (named "all.q") to run jobs of no more
> than 24 h, and another (named "lenta.q") to run jobs that need
> more than 24 h.
> I defined a resource quota, as I read some time ago on this mailing list,
> as follows:
> 
> {
>    name         slots_equals_cores
>    description  Prevent core over-subscription across queues
>    enabled      TRUE
>    limit        hosts {*} to slots=$num_proc
> }
> 
> 
> For now, I have a node with 64 cores: 40 cores for the normal queue
> and 24 for the long queue.
> 
> 
> all.q@compute-2-0.local        BP    0/16/40        15.93    lx-amd64
> lenta.q@compute-2-0.local      BP    0/0/24         15.93    lx-amd64
> 
> Some 2-core jobs are not scheduled on this node in the long queue,
> although there is no shortage of memory or cores. qstat tells me
> this:
> 
> cannot run because it exceeds limit "compute-2-0/" in rule "slots_equals_cores/1"
> cannot run because it exceeds limit "compute-2-0/" in rule "slots_equals_cores/1"
> cannot run because it exceeds limit "compute-0-4/" in rule "slots_equals_cores/1"
> cannot run in PE "thread" because it only offers 0 slots
> 
> I really don't understand why the job is not running on this node; in
> my opinion the node is free for it.
> 
> Can someone help me with this?
> 
> Regards.
> 
> -- 
> -- Jérôme
> A kiss is the surest way of keeping silent while saying everything.
>   (Guy de Maupassant)
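
As a sanity check on the numbers in the original question, the used and total slots of all queue instances on a host can be added up and compared with the core count that `$num_proc` would report. The sketch below assumes `qstat -f` style lines of the form "queue@host qtype resv/used/tot. load arch" (the two lines from the question, with column spacing restored); the helper name `slots_per_host` is hypothetical.

```shell
#!/bin/sh
# Sketch: sum used and total slots per host over all queue instances.
# Input: qstat -f style lines "queue@host qtype resv/used/tot. load arch".
slots_per_host() {
    awk '
        $1 ~ /@/ && $3 ~ /^[0-9]+\/[0-9]+\/[0-9]+$/ {
            split($1, q, "@")      # q[2] = hostname
            split($3, s, "/")      # s[2] = used, s[3] = total
            used[q[2]]  += s[2]
            total[q[2]] += s[3]
        }
        END {
            for (h in total)
                printf "%s used=%d total=%d\n", h, used[h], total[h]
        }
    '
}

# The two queue instances quoted in the question.
printf '%s\n' \
    'all.q@compute-2-0.local BP 0/16/40 15.93 lx-amd64' \
    'lenta.q@compute-2-0.local BP 0/0/24 15.93 lx-amd64' |
    slots_per_host
```

For these two lines the sums are 16 used out of 64 configured slots, which matches the 64 cores mentioned in the question and is consistent with the impression that the node still has room.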