I've been trying to work out why accounting doesn't properly reflect memory 
usage when using jobacct_gather/cgroup.  I've tracked it down to what looks 
like one definite bug, plus some conflicting behavior with task/cgroup.  I 
observe the problem when running a batch script with sbatch and no srun; it 
may be wider than that, but I'm not sure yet.

First, the bug.  Without srun, the step ID is set to SLURM_BATCH_SCRIPT 
(-2 as a signed 32-bit value, or 4294967294 unsigned).  
plugins/jobacct_gather/cgroup/jobacct_gather_cgroup_cpuacct.c contains a 
special case for this that sets the cgroup step path to "step_batch".  That 
special case is missing from jobacct_gather_cgroup_memory.c, so there the 
step path ends up as "step_4294967294".  I believe this is a bug, since that 
directory does not exist.  Changing the memory code to mirror what cpuacct 
does gets us a little further, but then the conflicting behavior shows up.
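
To make the fix concrete, here is a minimal sketch of the special case I
mean.  This is not the actual Slurm source: build_step_path() is a
hypothetical helper, and the SLURM_BATCH_SCRIPT value is redefined locally
only so the snippet stands alone.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: redefined here so the sketch is self-contained; real code
 * would take this constant from the Slurm headers. */
#define SLURM_BATCH_SCRIPT 0xfffffffeU

/* Hypothetical helper: build the step cgroup path under job_path.
 * Returns 0 on success, -1 if the result did not fit in buf. */
static int build_step_path(char *buf, size_t len,
                           const char *job_path, uint32_t stepid)
{
    int n;

    assert(buf != NULL && len > 0 && job_path != NULL);

    if (stepid == SLURM_BATCH_SCRIPT) {
        /* Batch script step: use the name that actually exists on disk. */
        n = snprintf(buf, len, "%s/step_batch", job_path);
    } else {
        /* Regular step: numeric suffix. */
        n = snprintf(buf, len, "%s/step_%u", job_path,
                     (unsigned int)stepid);
    }

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Without the first branch, a batch step falls into the numeric case and
produces "step_4294967294", which is the path I am seeing from the memory
code.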

The following discussion is in regard to the 'memory' cgroup subsystem...

With the paths fixed, the jobacct_gather/cgroup plugin writes slurmstepd's PID 
to step_batch/task_0/cgroup.procs.  Almost immediately afterwards, the 
task/cgroup plugin writes the same PID to step_batch/cgroup.procs.  That moves 
the PID out of task_0, which leaves task_0 empty, and the cgroup release agent 
then removes task_0.

With task_0 gone, the periodic calls into the jobacct_gather/cgroup plugin 
fail to collect memory data:
"unable to open 
'/cgroup/memory/slurm/uid_7260/job_1079/step_batch/task_0/memory.stat' for 
reading : No such file or directory"

There appears to be a race between task/cgroup and jobacct_gather/cgroup.  If I 
introduce enough delay that jobacct_gather/cgroup runs last and the PID stays 
in task_0, everything seems to work properly.  If task/cgroup runs last and 
task_0 gets removed, the accounting info is lost.
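
The ordering dependence can be illustrated with a toy model.  This is only a
sketch of my understanding, not Slurm or kernel code: the names (cg_model,
attach_pid, memory_stat_readable) are made up, and it assumes a PID lives in
exactly one cgroup per hierarchy and that an emptied task_0 is always removed
by the release agent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: one PID, and the two cgroups the plugins write it to. */
enum cg { CG_NONE, CG_STEP_BATCH, CG_TASK_0 };

struct cg_model {
    enum cg pid_location;  /* cgroup currently holding slurmstepd's PID */
    bool task0_exists;     /* does step_batch/task_0 still exist? */
};

/* Model of writing the PID to <target>/cgroup.procs. */
static void attach_pid(struct cg_model *m, enum cg target)
{
    enum cg old;

    assert(m != NULL);
    if (target == CG_TASK_0 && !m->task0_exists)
        return;  /* the directory is gone, so the write cannot succeed */

    old = m->pid_location;
    m->pid_location = target;  /* the PID moves; it is not duplicated */

    /* Assumption: once task_0 loses its only PID, the release agent
     * removes it. */
    if (old == CG_TASK_0 && target != CG_TASK_0)
        m->task0_exists = false;
}

/* Accounting reads task_0/memory.stat, so it needs task_0 to exist. */
static bool memory_stat_readable(const struct cg_model *m)
{
    assert(m != NULL);
    return m->task0_exists;
}
```

Starting from { CG_NONE, true }, attaching to CG_TASK_0 and then
CG_STEP_BATCH leaves memory_stat_readable() false, which matches the failure
I see.  Attaching in the opposite order leaves it true.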

Before I spend any more time debugging this, can someone please tell me 
what the intended behavior is?  It seems to me that the memory should be 
associated with task_0, and not step_batch, but I'm not sure.

All testing here was done with 14.11.9.

Thanks,
Kevin

--
Kevin Hildebrand
University of Maryland, College Park
