Note that your job's output was printed successfully before the slurmstepd message appeared. At job/step exit time, the Slurm code simply reads the memory.failcnt and memory.memsw.failcnt files in the relevant cgroup (explained in https://www.kernel.org/doc/Documentation/cgroups/memory.txt).

Your job's cgroup has memory.failcnt > 0, meaning the job hit its RAM limit and some of it was swapped out, but nothing was killed. The message is different when memory.memsw.failcnt > 0, because that means the combined RAM+swap limit was exceeded and a process was killed.
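
If you want to look at those counters yourself, something like the following works (a minimal sketch, not the actual slurmstepd code; pass it the step's cgroup directory, e.g. slurm/uid_500/job_1503/step_1 under wherever the memory cgroup is mounted, typically /sys/fs/cgroup/memory):

### failcnt.c ####

#include <stdio.h>

/* Read one counter from a cgroup v1 control file; returns -1 on failure. */
static long long read_counter(const char *dir, const char *file) {
    char path[512];
    long long v = -1;
    FILE *f;

    snprintf(path, sizeof(path), "%s/%s", dir, file);
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (fscanf(f, "%lld", &v) != 1)
        v = -1;
    fclose(f);
    return v;
}

int main(int argc, char **argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s <cgroup dir>\n", argv[0]);
        return 1;
    }
    if (read_counter(argv[1], "memory.failcnt") > 0)
        puts("memory.failcnt > 0: RAM limit hit; pages may have been swapped out");
    if (read_counter(argv[1], "memory.memsw.failcnt") > 0)
        puts("memory.memsw.failcnt > 0: RAM+swap limit hit; a process was killed");
    return 0;
}

####################

Note the cgroup directory is removed shortly after the step exits, so you would have to read it while the step is still running.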

Ryan

On 04/21/2014 01:48 PM, Guglielmi Matteo wrote:
Installed memory per node:

RAM  32 GB
SWAP 10 GB

#### slurm.conf ####

ProctrackType=proctrack/cgroup
TaskPlugin=task/cgroup
SelectTypeParameters=CR_Core_Memory

NodeName=... RealMemory=29000

####################

### cgroup.conf ####

AllowedRAMSpace=100
AllowedSwapSpace=30.0
ConstrainRAMSpace=YES
ConstrainSwapSpace=YES
MaxRAMPercent=100
MaxSwapPercent=100
MinRAMSpace=30

####################
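
If I read the cgroup.conf man page correctly, these percentages turn the allocation into the limits that show up in slurmd.log below: mem.limit is alloc scaled by AllowedRAMSpace, and memsw.limit adds alloc scaled by AllowedSwapSpace on top. A quick sanity check of that arithmetic (my reading of the documentation, not Slurm source):

### limits.c ####

#include <stdio.h>

int main(void) {
    double alloc_mb  = 9000.0;   /* --mem-per-cpu=9000 */
    double ram_pct   = 100.0;    /* AllowedRAMSpace=100 */
    double swap_pct  = 30.0;     /* AllowedSwapSpace=30.0 */

    double mem_limit   = alloc_mb * ram_pct / 100.0;
    double memsw_limit = mem_limit + alloc_mb * swap_pct / 100.0;

    /* Prints mem.limit=9000MB memsw.limit=11700MB, matching the log. */
    printf("mem.limit=%.0fMB memsw.limit=%.0fMB\n", mem_limit, memsw_limit);
    return 0;
}

####################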

This program just eats up the requested amount of memory:

### memoryHog.c ####

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define PAGE_SZ (1<<12)

int main(int argc, char **argv) {
    unsigned long i, pages;
    int gb;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <GB to allocate>\n", argv[0]);
        return 1;
    }
    gb = atoi(argv[1]);          /* memory to consume, in GB */
    pages = ((unsigned long)gb << 30) / PAGE_SZ;

    for (i = 0; i < pages; ++i) {
        void *m = malloc(PAGE_SZ);
        if (!m)                  /* stop when allocation fails */
            break;
        memset(m, 0, 1);         /* touch the memory so it is really committed */
    }
    printf("allocated %lu MB\n", (i * PAGE_SZ) >> 20);
    sleep(10);                   /* hold the memory briefly before exiting */
    return 0;
}

####################

### TESTING ###

$> salloc --mem-per-cpu=9000
salloc: Granted job allocation 1503

$> srun memoryHog.x 8
allocated 8192 MB

$> srun memoryHog.x 9
allocated 9050 MB
slurmstepd: Exceeded step memory limit at some point. Step may have been partially swapped out to disk.

### LOGS: /var/log/slurm/slurmd.log ###

[2014-04-15T18:58:26.212] [1503.0] task/cgroup: /slurm/uid_500/job_1503: alloc=9000MB mem.limit=9000MB memsw.limit=11700MB
[2014-04-15T18:58:26.212] [1503.0] task/cgroup: /slurm/uid_500/job_1503/step_0: alloc=9000MB mem.limit=9000MB memsw.limit=11700MB
[2014-04-15T18:58:39.961] [1503.0] done with job
..
..
[2014-04-15T18:58:45.916] [1503.1] task/cgroup: /slurm/uid_500/job_1503: alloc=9000MB mem.limit=9000MB memsw.limit=11700MB
[2014-04-15T18:58:45.916] [1503.1] task/cgroup: /slurm/uid_500/job_1503/step_1: alloc=9000MB mem.limit=9000MB memsw.limit=11700MB
[2014-04-15T18:59:01.087] [1503.1] Exceeded step memory limit at some point. Step may have been partially swapped out to disk.
[2014-04-15T18:59:01.120] [1503.1] done with job

####################

Since Slurm sets "memsw.limit=11700MB" (9000 MB plus 30%,
per AllowedSwapSpace=30), I was expecting the cgroup to
swap out the excess ~50 MB. That would easily fit in the
swap area, and the job should not be killed...

What am I missing here?

Should the code itself be aware of the given "mem.limit=9000MB"?
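
(For what it's worth, I suppose a program could read its own limit from the cgroup filesystem rather than hard-coding it. A rough sketch, assuming cgroup v1 with the memory controller mounted at /sys/fs/cgroup/memory:)

### mylimit.c ####

#include <stdio.h>
#include <string.h>

int main(void) {
    char line[512], cgpath[256] = "", path[640];
    unsigned long long limit;
    FILE *f = fopen("/proc/self/cgroup", "r");

    if (!f)
        return 1;
    /* Find the memory controller entry, e.g.
     * "4:memory:/slurm/uid_500/job_1503/step_0". */
    while (fgets(line, sizeof(line), f)) {
        char *p = strstr(line, ":memory:");
        if (p) {
            strncpy(cgpath, p + strlen(":memory:"), sizeof(cgpath) - 1);
            cgpath[strcspn(cgpath, "\n")] = '\0';
            break;
        }
    }
    fclose(f);

    snprintf(path, sizeof(path),
             "/sys/fs/cgroup/memory%s/memory.limit_in_bytes", cgpath);
    f = fopen(path, "r");
    if (!f)
        return 1;
    if (fscanf(f, "%llu", &limit) == 1)
        printf("my cgroup memory limit: %llu MB\n", limit >> 20);
    fclose(f);
    return 0;
}

####################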


Thanks for any explanation.

MG

--
Ryan Cox
Operations Director
Fulton Supercomputing Lab
Brigham Young University
