Hi,

One of our users was carrying out some tests and running some very short jobs with a TimeLimit of 60s. However, because one of the nodes had to be booted, which takes a couple of minutes, the jobs were terminated with TIMEOUT as the state.
I am aware that we can set BatchStartTimeout to a larger value, but wouldn't it make more sense if the run time for the job only started to accumulate once the slurmd on the node became available?

Cheers,

Loris

--
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin
Email [email protected]
