Yesterday I upgraded the slurmdbd and slurmctld nodes from RHEL7 / Slurm v. 20.11.8
to RHEL8.5 / Slurm v. 21.08.6 on our production cluster.
I also updated Slurm on the RHEL7 login nodes to 21.08.6.
sbatch jobs run fine.

srun, however, fails from the updated login node with "Invalid job credential"
errors. srun from nodes that have not been updated runs fine.
I am hoping this looks familiar to you.


$ srun --slurmd-debug=verbose -n 1 -t 8:00:00 --mem=3g -p interact -w c0801 --pty /bin/bash
srun: job 45281066 queued and waiting for resources
srun: job 45281066 has been allocated resources
srun: error: Task launch for StepId=45281066.0 failed on node c0801: Invalid job credential
srun: error: Application launch failed: Invalid job credential
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
srun: error: Timed out waiting for job step to complete

