This started happening after upgrading slurm from 20.02 to the latest 20.11.
It seems like something exits too early, before slurm, or whatever else is writing the output file, has a chance to flush the final output buffer to disk.

For example, take this very simple batch script, which gets submitted via sbatch:

#!/bin/bash
#SBATCH --job-name=test
#SBATCH --ntasks=1
#SBATCH --exclusive
set -e

echo A
echo B
sleep 5
echo C

The resulting slurm-$jobid.out file contains only

> A
> B

The final echo never gets written to the output file.

A lot of users print a final result status at the end, which then never hits the logs. So this is a major issue for them.

The scripts run to completion just fine; only the end of the log is missing. For example, touching some file after the "echo C" will touch that file as expected.
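
A minimal sketch of that check: the same batch script with a marker file touched after the final echo. The file name job_finished.marker is just a hypothetical name picked for illustration. If the marker exists after the job ends while the log still stops at "B", the script itself ran to the end and only the log is short.

```shell
#!/bin/bash
#SBATCH --job-name=test
#SBATCH --ntasks=1
#SBATCH --exclusive
set -e

echo A
echo B
sleep 5
echo C
# Hypothetical marker file: its presence shows the script ran past "echo C"
touch job_finished.marker
```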

The behaviour is also not at all consistent. Sometimes the output log is written as expected, with no recognizable pattern. This seems to be the exception, though; the majority of the time it's truncated.
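
To put a rough number on how often this happens, something like the following could be run over the logs of repeated submissions. It is only a sketch under the assumption that every job ran the test script above, so a complete log ends with the line "C". The function name count_truncated and its arguments are made up for this example.

```shell
#!/bin/bash
# Count slurm-*.out logs in a directory whose last line is not the expected one.
# Assumption: a complete log from the test script ends with the line "C".
count_truncated() {
    local dir="$1" expected="$2" total=0 truncated=0 f
    for f in "$dir"/slurm-*.out; do
        # Skip the unexpanded glob pattern when no log files match
        [ -e "$f" ] || continue
        total=$((total + 1))
        if [ "$(tail -n 1 "$f")" != "$expected" ]; then
            truncated=$((truncated + 1))
        fi
    done
    echo "$truncated of $total logs truncated"
}

# Defaults: current directory, expected final line "C"
count_truncated "${1:-.}" "${2:-C}"
```

Called without arguments, it checks the slurm-*.out files in the current directory.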

This was never an issue before the recent slurm update.
