On Friday, 1 February 2019 6:04:45 AM AEDT Andy Riebs wrote:
> Any thoughts on what might be happening, or what I might try next?
Anything in dmesg on the nodes or syslog at that time?
I'm wondering if you're seeing the OOM killer step in and take processes out.
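A minimal sketch of that check, assuming a Linux node where the kernel log is readable (this may need root, depending on kernel.dmesg_restrict). The helper name find_oom_events is hypothetical, and the patterns are only the usual kernel OOM messages, so adjust them for your kernel version:

```shell
# Hypothetical helper: read kernel log text on stdin and keep only the
# lines that look like OOM-killer activity.
find_oom_events() {
    grep -i -E 'out of memory|oom-killer|oom_reaper|killed process'
}

# Typical use on a compute node (left commented out; adjust to taste):
#   dmesg -T | find_oom_events
#   journalctl -k | find_oom_events
```

If this prints anything around the time the job died, the "Killed process" line normally names the PID and command that was taken out.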
What does your slurm.conf look l
Nico, yep, that’s a very annoying bug; we do the same here with job
efficiency. It was patched in 18.08.05. However, the db still needs to be
cleaned up. We are working on a script to fix this. When we are done, we'll
offer it up to the list.
Best,
Chris
—
Christopher Coffey
High-Performance C
Hi
While doing some statistics on efficient CPU usage, I realized that sacct is
reporting inexplicably (at least to me) high values for TotalCPU, UserCPU and
SystemCPU. Here is a simple example (each job step is an infinite while loop):
sacct -j 64338003
--format=jobid,elapsed,ncpus,cputime,
Given the extreme amount of output that will be generated for potentially a
couple hundred job runs, I was hoping that someone would say “Seen it, here’s
how to fix it.” Guess I’ll have to go with the “high output” route.
Thanks Doug!
Andy
From: slurm-users [mailto:slurm-users-boun...@lists.sc