Matthew BETTINGER <matthew.bettin...@external.total.com> writes:

> Just curious if this option or oom setting (which we use) can leave
> the nodes in CG "completing" state.
I don't think so. As far as I know, jobs go into completing state when Slurm is cancelling them or when they exit on their own, and they stay in that state until any epilogs have run.

In my experience, the most typical reasons for jobs hanging in CG are disk system failures or other failures that leave either the job processes or the epilog processes hanging in "disk wait".

--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
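For anyone wanting to check a node for the "disk wait" case described above, here is a minimal sketch, not a definitive tool. It assumes a Linux node with procps `ps`, where a process state beginning with `D` means uninterruptible sleep (usually I/O). The function names `parse_disk_wait` and `find_disk_wait` are made up for this illustration.

```python
import subprocess


def parse_disk_wait(ps_output: str) -> list[tuple[int, str, str]]:
    """Return (pid, stat, command) for every process whose state starts with 'D'.

    Expects the output of `ps -eo pid,stat,args`, including its header line.
    """
    hung = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 2)  # pid, stat, rest of line = command
        if len(fields) < 3:
            continue
        pid, stat, cmd = fields
        if stat.startswith("D"):  # D = uninterruptible sleep ("disk wait")
            hung.append((int(pid), stat, cmd))
    return hung


def find_disk_wait() -> list[tuple[int, str, str]]:
    """Run ps on the local node and return the processes in disk wait."""
    result = subprocess.run(
        ["ps", "-eo", "pid,stat,args"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_disk_wait(result.stdout)


if __name__ == "__main__":
    for pid, stat, cmd in find_disk_wait():
        print(pid, stat, cmd)
```

Run it on a node that `squeue --states=CG` lists as still completing. If job or epilog processes show up in the output, the storage they are waiting on is the likely place to look. Processes that are only briefly in `D` state are normal, so it is worth running it a few times before drawing conclusions.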