Hello,

Could someone in the Slurm community please advise me on outputting data (job stats) at the end of a job? I'm currently using the epilog (slurm.epilog.clean) to print out a report at the end of each Slurm job. This works, but it is far from ideal: slurm.epilog.clean can be executed multiple times per job, so I test whether I'm on the "BatchHost" and only then write out the job stats. That is...
    hostnode=`hostname`
    batchhost=`/local/software/slurm/default/bin/scontrol show job ${SLURM_JOB_ID} | grep BatchHost | cut -f2 -d'='`
    if [ "$hostnode" = "$batchhost" ] ; then
        printf "Submit time : `/local/software/slurm/default/bin/scontrol show job ${SLURM_JOB_ID} | grep SubmitTime | awk '{print $1}' | cut -f2 -d'='`\n" >> "$stdout"
        # etc.
    fi

This does work, but it strikes me that using EpilogSlurmctld might be better. The issue, of course, is that EpilogSlurmctld is executed by the slurm user, so how can that script be made to write to a job's stdout file? I am, by the way, grep'ing the output of scontrol (for submit time, etc.) and sacct (for memory usage, etc.) to generate my report.

Does this approach make sense, or are there better alternatives?

Here's an example of the data printed out by my epilog script:

    Submit time  : 2016-10-12T09:47:03
    Start time   : 2016-10-12T09:47:03
    End time     : 2016-10-12T09:47:17
    Elapsed time : 00:00:14 (Timelimit=02:00:00)

        JobName     MaxRSS    Elapsed
     ---------- ---------- ----------
      slurm.mpi              00:00:14
          batch      1244K   00:00:14
            cpi     28224K   00:00:01

Best regards,
David
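For reference, the EpilogSlurmctld idea raised above could be sketched roughly as below. This is only a sketch under stated assumptions: the `job_field` helper name is mine (not a Slurm command), the report fields are the ones shown in the sample output, and it presumes the slurm user can actually append to the job's StdOut path, which is exactly the open permission question in the post. EpilogSlurmctld runs once per job on the slurmctld host, so no BatchHost test is needed.

```shell
#!/bin/sh
# Sketch of an EpilogSlurmctld report writer (assumptions noted above).

# job_field KEY: extract one "Key=Value" field from `scontrol show job`
# output previously captured in $jobinfo. Helper name is hypothetical.
job_field() {
    printf '%s\n' "$jobinfo" | tr ' ' '\n' | grep "^$1=" | cut -d= -f2
}

# Only attempt the report when Slurm has set SLURM_JOB_ID for this script.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    jobinfo=$(scontrol show job "$SLURM_JOB_ID")
    stdout=$(job_field StdOut)          # the job's own stdout file
    {
        printf 'Submit time  : %s\n' "$(job_field SubmitTime)"
        printf 'Start time   : %s\n' "$(job_field StartTime)"
        printf 'End time     : %s\n'  "$(job_field EndTime)"
        printf 'Elapsed time : %s (Timelimit=%s)\n' \
               "$(job_field RunTime)" "$(job_field TimeLimit)"
        # Per-step memory usage, as in the sample report.
        sacct -j "$SLURM_JOB_ID" --format=JobName,MaxRSS,Elapsed
    } >> "$stdout"
fi
```

Parsing `scontrol show job` once into a variable avoids the repeated scontrol calls of the original script; whether the append to "$stdout" succeeds as the slurm user (e.g. on root-squashed NFS home directories) still has to be verified on the target cluster.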