We typically take 4 GB off the top of the memory in our slurm.conf for the system and other processes.  This seems to work pretty well.
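As a rough sketch (the node name, core count and exact values below are made up; RealMemory is in MB), that looks something like:

  # ~128 GB (131072 MB) node; advertise ~4 GB less so jobs can't
  # claim the memory the kernel and system daemons need.
  NodeName=node[001-064] CPUs=32 RealMemory=126976 State=UNKNOWN

Depending on your Slurm version, the MemSpecLimit node parameter may also be an option for reserving memory for system use instead of shrinking RealMemory.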

-Paul Edmon-


On 01/17/2018 01:44 AM, Marcin Stolarek wrote:
I think it depends on your kernel and the way the cluster is booted (for instance, the initrd size). You can check the memory used by the kernel in the dmesg output - look for the line starting with "Memory:"; that part is fixed. It may also be a good idea to "reserve" some space for cache and buffers - check htop or /proc/meminfo (Slab). How much depends on your OS (filesystems, hardware modules) and, if you have a limited set of applications, on your workload. The size of this part of memory may also scale with "node size"; the number of cores should be a good measure.
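Concretely (plain shell, nothing Slurm-specific), something like this shows the numbers I mean:

  # kernel's own accounting of memory at boot (the fixed part):
  dmesg | grep 'Memory:'

  # current slab / buffer / page-cache usage:
  grep -E 'Slab|Buffers|^Cached' /proc/meminfo
  free -h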

cheers,
Marcin


2018-01-17 6:03 GMT+01:00 Greg Wickham <greg.wick...@kaust.edu.sa>:


    We're using cgroups to limit the memory of jobs, but in our
    slurm.conf each node is currently configured with its total
    physical memory capacity.

    Doing this, there could be times when physical memory is
    oversubscribed (the per-job physical allocations plus the kernel's
    memory requirements) and swapping will occur.

    Is there a recommended “kernel overhead” memory (either % or
    absolute value) that we should deduct from the total physical memory?
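    Roughly, the relevant pieces of the configuration look like this
    (node names and values are only illustrative):

        # cgroup.conf - confine each job to its requested memory
        ConstrainRAMSpace=yes

        # slurm.conf - nodes advertise their full physical memory (MB)
        TaskPlugin=task/cgroup
        NodeName=cn[001-100] CPUs=40 RealMemory=192000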

    thanks,

       -greg

    --
    Dr. Greg Wickham
    Advanced Computing Infrastructure Team Lead
    Advanced Computing Core Laboratory
    King Abdullah University of Science and Technology
    Building #1, Office #0124
    greg.wick...@kaust.edu.sa +966 544 700 330
    --




