On 19/9/22 05:46, Paul Raines wrote:
> In slurm.conf I had InactiveLimit=60 which I guess is what is happening
> but my reading of the docs on this setting was it only affects the
> starting of a job with srun/salloc and not a job that has been running
> for days. Is it InactiveLimit that leads to the "inactivity time limit
> reached" message?
I believe so, but remember that it governs timeouts in the
communication between slurmctld and the srun/salloc commands, not
things like shell inactivity timeouts, which are quite different.
See:
https://slurm.schedmd.com/faq.html#purge
# A job is considered inactive if it has no active job steps or
# if the srun command creating the job is not responding.
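If you want to check what a given slurm.conf would set, here is a
rough Python sketch. The helper name inactive_limit and the sample
text are made up for illustration, and treating an unset value as 0
(no limit) is my reading of the slurm.conf man page, so please verify
that against your Slurm version:

```python
import re

def inactive_limit(conf_text):
    """Return InactiveLimit in seconds from slurm.conf text, or 0 if unset."""
    value = 0  # assumption: unset behaves like 0, i.e. no inactivity limit
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0]  # ignore comments
        match = re.search(r"\bInactiveLimit\s*=\s*(\d+)", line, re.IGNORECASE)
        if match:
            value = int(match.group(1))  # a later setting overrides an earlier one
    return value

# Hypothetical excerpt, not a real cluster configuration.
sample = """
SlurmctldTimeout=120
InactiveLimit=60
"""

limit = inactive_limit(sample)
if limit == 0:
    print("InactiveLimit is disabled")
else:
    print(f"srun/salloc may be unresponsive for up to {limit}s")
```

On a live system, "scontrol show config" reports the value slurmctld
is actually using, which is the more reliable check.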
Hope this helps!
All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA