What are your Slurm settings? Specifically, what are the values of ProctrackType, JobAcctGatherType, and JobAcctGatherParams,
and what are the contents of cgroup.conf?

Also, what version of Slurm are you using?

Sean

--
Sean Crosby | Senior DevOpsHPC Engineer and HPC Team Lead
Research Computing Services | Business Services
The University of Melbourne, Victoria 3010 Australia

On Tue, 16 Mar 2021 at 04:52, Chin,David <dw...@drexel.edu> wrote:

> Hi, all:
>
> I'm trying to understand why a job exited with an error condition. I think
> it was actually terminated by Slurm: the job was a Matlab script, and its
> output was incomplete.
>
> Here's the sacct output:
>
> JobID        JobName    User  Partition NodeList Elapsed  State      ExitCode ReqMem MaxRSS   MaxVMSize AllocTRES                AllocGRE
> ------------ ---------- ----- --------- -------- -------- ---------- -------- ------ -------- --------- ------------------------ --------
> 83387        ProdEmisI+ foob  def       node001  03:34:26 OUT_OF_ME+ 0:125    128Gn                     billing=16,cpu=16,node=1
> 83387.batch  batch                      node001  03:34:26 OUT_OF_ME+ 0:125    128Gn  1617705K 7880672K  cpu=16,mem=0,node=1
> 83387.extern extern                     node001  03:34:26 COMPLETED  0:0      128Gn  460K     153196K   billing=16,cpu=16,node=1
>
> Thanks in advance,
> Dave
>
> --
> David Chin, PhD (he/him) Sr. SysAdmin, URCF, Drexel
> dw...@drexel.edu 215.571.4335 (o)
> For URCF support: urcf-supp...@drexel.edu
> https://proteusmaster.urcf.drexel.edu/urcfwiki
> github:prehensilecode
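
For reference, here is a rough sketch of how the memory figures in the quoted sacct output compare. It is only an illustration: `parse_size` is a hypothetical helper (not part of Slurm), and it assumes sacct's K/M/G/T suffixes are binary multiples and that the trailing `n` on ReqMem means "per node".

```python
import re

# Assumption: sacct size suffixes are binary multiples (K = 1024 bytes, etc.).
_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(field: str) -> int:
    """Convert a sacct size field such as '1617705K' or '128Gn' to bytes.

    A trailing 'n' (per node) or 'c' (per CPU) on ReqMem is accepted and ignored.
    """
    m = re.fullmatch(r"(\d+(?:\.\d+)?)([KMGT]?)[nc]?", field.strip())
    if m is None:
        raise ValueError(f"unrecognized size field: {field!r}")
    number, unit = m.groups()
    return int(float(number) * _UNITS[unit])


# Values copied from the 83387.batch row of the quoted sacct output.
req_mem = parse_size("128Gn")
max_rss = parse_size("1617705K")
fraction = max_rss / req_mem

print(f"ReqMem = {req_mem} bytes")
print(f"MaxRSS = {max_rss} bytes")
print(f"MaxRSS is {fraction:.1%} of ReqMem")
```

Under those assumptions the recorded MaxRSS is only about 1.2% of the requested memory, even though the job state is OUT_OF_ME+. MaxRSS is gathered by periodic sampling, so it may miss a short-lived spike, which is one reason the accounting and cgroup settings asked about above matter.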