We typically take 4 GB off the top in our slurm.conf for the system
and other processes. This seems to work well.
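A minimal sketch of that rule of thumb (the node names and the 256 GB
node size are hypothetical; MemSpecLimit is an alternative mechanism
worth checking in your Slurm version's slurm.conf man page):

```shell
# Hypothetical node with 256 GB (262144 MB) of physical RAM.
# Take 4 GB (4096 MB) off the top for the kernel and system daemons,
# and advertise only the remainder to Slurm via RealMemory.
PHYS_MB=262144
RESERVE_MB=4096
REAL_MB=$((PHYS_MB - RESERVE_MB))
echo "NodeName=node[01-64] RealMemory=${REAL_MB}"
# Alternative: keep RealMemory at the full size and let slurmd itself
# reserve the system share (assumes cgroup-based task plugins):
# NodeName=node[01-64] RealMemory=262144 MemSpecLimit=4096
```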
-Paul Edmon-
On 01/17/2018 01:44 AM, Marcin Stolarek wrote:
I think it depends on your kernel and the way the cluster is
booted (for instance, the initrd size). You can check the memory used
by the kernel in the dmesg output - search for the line starting with
"Memory:". This amount is fixed.
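For example (the line below is a made-up sample; on a real node run
`dmesg | grep -m1 'Memory:'` - the exact format varies by kernel
version):

```shell
# Sample "Memory:" line as printed by the kernel at boot:
LINE="Memory: 131577860K/134084384K available (7348K kernel code, 2506524K reserved, 0K cma-reserved)"
# The total/available pair lets you compute how much the kernel had
# already claimed at boot, in KiB:
TOTAL_K=$(echo "$LINE" | sed -E 's|.*Memory: [0-9]+K/([0-9]+)K.*|\1|')
AVAIL_K=$(echo "$LINE" | sed -E 's|.*Memory: ([0-9]+)K/.*|\1|')
echo "kernel overhead at boot: $(( (TOTAL_K - AVAIL_K) / 1024 )) MiB"
```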
It may also be a good idea to "reserve" some space for cache and
buffers - check htop or /proc/meminfo (Slab). This may depend on your
OS (filesystem, hardware modules) and, if you run a limited set of
applications, on your workload. The size of this part of memory may
scale with "node size"; the number of cores should be a good measure.
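To pull those figures out of /proc/meminfo, something like the
following works (the excerpt here is a made-up snapshot; on a live
node read the real file instead):

```shell
# Hypothetical /proc/meminfo excerpt; on a live node use:
#   grep -E '^(Slab|Buffers|Cached):' /proc/meminfo
MEMINFO='Buffers:          314572 kB
Cached:          8388608 kB
Slab:            1048576 kB'
# Slab is kernel-internal memory that jobs can never use, so it is a
# candidate for the "overhead" you subtract in slurm.conf:
SLAB_KB=$(echo "$MEMINFO" | awk '/^Slab:/ {print $2}')
echo "Slab: $(( SLAB_KB / 1024 )) MiB"
```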
cheers,
Marcin
2018-01-17 6:03 GMT+01:00 Greg Wickham <greg.wick...@kaust.edu.sa>:
We’re using cgroups to limit the memory of jobs, but in our slurm.conf
the total node memory capacity is currently specified.
Doing this, there could be times when physical memory is
oversubscribed (physical allocation per job plus kernel memory
requirements) and swapping will occur.
Is there a recommended “kernel overhead” memory (either a percentage
or an absolute value) that we should deduct from the total physical memory?
thanks,
-greg
--
Dr. Greg Wickham
Advanced Computing Infrastructure Team Lead
Advanced Computing Core Laboratory
King Abdullah University of Science and Technology
Building #1, Office #0124
greg.wick...@kaust.edu.sa
+966 544 700 330