With Slurm 14.11.3 on a RHEL 6.5 x86_64 system, scancel can no longer cancel job steps.

For example, in one window:

$ cat script.sh
srun sleep 60
srun sleep 60
srun sleep 60
$ salloc -N2 ./script.sh
salloc: Granted job allocation 18727

Then, in another window, I expect to be able to cancel one of the job steps:

$ squeue --steps
         STEPID     NAME PARTITION     USER      TIME NODELIST
        18727.0    sleep       all    riebs      0:37 beehive[09-10]
$ scancel 18727.0
scancel: error: slurm_kill_job2() failed Invalid job id specified
$ squeue --steps
         STEPID     NAME PARTITION     USER      TIME NODELIST
        18727.0    sleep       all    riebs      0:58 beehive[09-10]
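To check whether the step survived without eyeballing the output, something like the following sketch could be used. The helper names step_ids and step_is_listed are hypothetical (they are not part of Slurm), and the sketch assumes the squeue --steps layout shown above, with one header line and STEPID in the first column.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): print the STEPID column of
# `squeue --steps` output read from stdin, skipping the header line.
# Assumes the layout shown above: one header line, STEPID in column 1.
step_ids() {
    awk 'NR > 1 && NF > 0 { print $1 }'
}

# Hypothetical helper: succeed if step "$1" is listed in the squeue
# output supplied on stdin (exact, whole-line match on the step id).
step_is_listed() {
    step_ids | grep -Fqx -- "$1"
}
```

For example, after the failed scancel:

$ squeue --steps | step_is_listed 18727.0 && echo "step still running"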

Andy

--
Andy Riebs
Hewlett-Packard Company
High Performance Computing
+1 404 648 9024
My opinions are not necessarily those of HP
