Dear all,

Yes, I set a default value of 1G for h_vmem in the global complex. The queue has INFINITY and each node has an h_vmem of 120G. Nothing is set in sge_request.

Here is the error I get

04/09/2015 12:41:54| main|cpu-1-4|W|job 4 exceeds job hard limit "h_vmem" of queue "[email protected]" (7002386432.00000 > limit:1073741824.00000) - sending SIGKILL
04/09/2015 12:41:55| main|cpu-1-4|W|job 4 exceeds job hard limit "h_vmem" of queue "[email protected]" (6889209856.00000 > limit:1073741824.00000) - sending SIGKILL
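
The log reports both numbers in bytes, which makes them hard to compare with the 1G and 10G figures in the request. A minimal sketch of the conversion follows; to_mib is a hypothetical helper name, not part of Grid Engine, and the values are copied from the two log lines with the ".00000" suffix dropped.

```shell
#!/bin/sh
# Convert a byte count (as printed in the log lines above) to whole MiB.
# to_mib is a hypothetical helper, not a Grid Engine command.
to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

# Values copied from the two log lines.
limit=1073741824
first_kill=7002386432
second_kill=6889209856

echo "limit:       $(to_mib "$limit") MiB"
echo "first kill:  $(to_mib "$first_kill") MiB"
echo "second kill: $(to_mib "$second_kill") MiB"
```

The limit comes out as exactly 1024 MiB, i.e. the 1G default from the complex, while the job was at roughly 6.5G of virtual memory each time it was killed.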


For this job I requested:
#$ -pe openmpi 10
#$ -l h_vmem=1G

Checking ulimits via the script gives the expected 10G.

So for some reason there is a limit of 1G there. Checking qacct shows that the total maxvmem used is just over 3G, so asking for 10G should be plenty.
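
To spell out the arithmetic behind the expected 10G: the request is per slot, so it is multiplied by the slot count. The sketch below uses only the numbers from this post (10 slots, 1G per slot, and the limit printed in the log); the variable names are made up for illustration.

```shell
#!/bin/sh
# Numbers taken from the job request and the log in this post.
slots=10
per_slot_bytes=$(( 1 * 1024 * 1024 * 1024 ))   # -l h_vmem=1G
enforced_bytes=1073741824                      # "limit:" value in the log

# Total the job should be allowed if the per-slot request is multiplied up.
expected_total=$(( slots * per_slot_bytes ))

echo "expected total: $expected_total bytes"
echo "enforced limit: $enforced_bytes bytes"

# The enforced limit matches a single slot's worth, not the total.
if [ "$enforced_bytes" -eq "$per_slot_bytes" ]; then
    echo "enforced limit equals the per-slot value"
fi
```

So the limit in the log corresponds to one slot's 1G rather than the 10G total that ulimit reports.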

[root@queue ~]# qconf -sc
#name    shortcut  type    relop  requestable  consumable  default  urgency
#--------------------------------------------------------------------------
h_vmem   h_vmem    MEMORY  <=     YES          YES         1G       0

[root@queue ~]# qconf -sq all.q
h_vmem                INFINITY

[root@queue ~]# qconf -se cpu-1-4.local
complex_values        h_vmem=120G
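
For anyone who wants to compare the same three settings on their own cluster, here is a rough sketch that pulls just the relevant fields out of the qconf output. The helper names complex_default and attr_value are hypothetical; the qconf commands are the ones shown above. The demo at the end feeds in the lines quoted in this post so it runs without a cluster.

```shell
#!/bin/sh
# Print the "default" column (7th field) for one complex,
# reading `qconf -sc`-style output on stdin.
complex_default() {
    awk -v name="$1" '$1 == name { print $7 }'
}

# Print the second field of an attribute line such as "h_vmem INFINITY",
# reading `qconf -sq` or `qconf -se` style output on stdin.
attr_value() {
    awk -v key="$1" '$1 == key { print $2 }'
}

# Intended use on the cluster (commands as shown in this post):
#   qconf -sc | complex_default h_vmem
#   qconf -sq all.q | attr_value h_vmem
#   qconf -se cpu-1-4.local | attr_value complex_values

# Self-contained demo using the lines quoted in this post.
printf '%s\n' 'h_vmem h_vmem MEMORY <= YES YES 1G 0' | complex_default h_vmem
printf '%s\n' 'h_vmem                INFINITY' | attr_value h_vmem
printf '%s\n' 'complex_values        h_vmem=120G' | attr_value complex_values
```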

Marlies

On 04/10/2015 07:43 PM, Reuti wrote:
On 10.04.2015 at 05:59, Marlies Hankel <[email protected]> wrote:

Dear all,

I ran into some trouble with the default value of h_vmem. I set it to be
consumable=yes and also set a default value of 1G. When I submitted a job
asking, for example, for 10 slots with 1G per slot, the job crashed with an
error in the queue logs saying that the h_vmem needed by the job (around 3G)
was over the hard limit of the queue (local host instance) of 1G. I would have
thought that the request of 1G per slot, so 10G in total, would override this
and give enough memory for the job.

Setting the default value to 6G resolved the problem,
Are you referring to the setting at the queue level? That is the limit per
process. There is also a column for the default value in the complex definition
for each consumable complex. This can be set to 1G, and users can override it
as long as they stay below the limit at the queue (or exechost) level.

-- Reuti


but as we might be dealing with larger memory jobs in future I would like to 
find a proper fix for this. I am running SGE as installed by ROCKS 6.1.1 
(OGS/Grid Engine 2011.11) and the only thing I changed was to set h_vmem to 
consumable=yes and set the relevant h_vmem values for each host.

I want users to request memory and jobs to be killed if they exceed the 
requested amount, so h_vmem seemed to be the way to go. But how do I set a 
small default value that users can change if they need more? Or should I set it 
to forced without a default and force users to request it?

Thanks in advance

Marlies

--

------------------

Dr. Marlies Hankel
Research Fellow, Theory and Computation Group
Australian Institute for Bioengineering and Nanotechnology (Bldg 75)
eResearch Analyst, Research Computing Centre and Queensland Cyber 
Infrastructure Foundation
The University of Queensland
Qld 4072, Brisbane, Australia
Tel: +61 7 334 63996 | Fax: +61 7 334 63992 | mobile:0404262445
Email: [email protected] | www.theory-computation.uq.edu.au


Notice: If you receive this e-mail by mistake, please notify me,
and do not make any use of its contents. I do not waive any
privilege, confidentiality or copyright associated with it. Unless
stated otherwise, this e-mail represents only the views of the
Sender and not the views of The University of Queensland.


_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users

