On 23.01.2015 at 18:08, Ilya M <[email protected]> wrote:
> 
> Removed the quota limits. To no avail: same problems.

OK, but it was worth a try.

-- Reuti


> -------- Original Message --------
> Subject: Re: [gridengine users] Cannot request resource if it is a load value 
> of memory type: SGE reports it as unknown resource
> From: Reuti <[email protected]>
> To: Ilya M <[email protected]>
> Date: 1/23/15, 2:33 AM
>> Can you remove them temporarily? I have seen cases where the "unknown 
>> resource" error suddenly popped up and just as suddenly vanished again; my 
>> conclusion was that it was somehow connected to RQS.
>> 
>> -- Reuti
>> 
>> 
>>> On 23.01.2015 at 00:16, Ilya M <[email protected]> wrote:
>>> 
>>> There are two RQS, one is disabled:
>>> 
>>> {
>>>   name         limit_for_interns
>>>   description  "limit to max 5 GPU jobs per intern."
>>>   enabled      TRUE
>>>   limit        users {int1,int2} hosts @gpu to slots=5
>>> }
>>> {
>>>   name         limit_slots
>>>   description  NONE
>>>   enabled      FALSE
>>>   limit        hosts {@gpu} to slots=2
>>> }
>>> 
>>> 
>>> -------- Original Message --------
>>> Subject: Re: [gridengine users] Cannot request resource if it is a load 
>>> value of memory type: SGE reports it as unknown resource
>>> From: Reuti <[email protected]>
>>> To: Ilya <[email protected]>
>>> Date: 1/21/15, 16:12
>>>> Hi,
>>>> 
>>>> On 22.01.2015 at 00:52, Ilya wrote:
>>>> 
>>>>> Something happened to our SGE (6.2u5) installation, which had been running 
>>>>> fine for many months: users can no longer submit resource requests for 
>>>>> load values of memory type, e.g.
>>>>> 
>>>>> qsub -l mem_free=5G -w v .... produces the following output:
>>>>> 
>>>>> cannot run in queue "gpu.q@gpu038" because job requests unknown resource (mem_free)
>>>>> 
>>>>> The resource is available, though, when querying for it:
>>>>> qhost -F mem_free -h gpu038
>>>>> HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
>>>>> -------------------------------------------------------------------------------
>>>>> global                  -               -     -       -       -       -       -
>>>>> gpu038                  lx24-amd64     16  2.11  126.1G   15.7G    4.0G     0.0
>>>>>    Host Resource(s):      hl:mem_free=110.416G
>>>>> 
>>>>> 
>>>>> This was first reported by a user when he tried to request a custom "hl" 
>>>>> resource. However, it now appears that all "hl" resources of type 
>>>>> "memory" show this behavior. Integer "hl" resources are OK.
>>>> Do you have any RQS in place?
>>>> 
>>>> -- Reuti
>>>> 
>>>> 
>>>>> I failed qmaster over between the master and the shadow master a couple of 
>>>>> times, but that did not resolve the problem.
>>>>> 
>>>>> Additionally, when I added MONITOR=1 to the scheduler's configuration, the 
>>>>> file $SGE_ROOT/$SGE_CELL/common/schedule contained only colons:
>>>>> ::::::::
>>>>> ::::::::
>>>>> ::::::::
>>>>> 
>>>>> Any ideas?
>>>>> 
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> [email protected]
>>>>> https://gridengine.org/mailman/listinfo/users
> 

