Double-check the account info on that node (c0801).

It could be that the node does not recognize the uid assigned to the user/job.
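
One way to check this is to compare what the login node and c0801 each report for the same user. The following is only a rough sketch, assuming passwordless ssh to the compute node; the helper names `identity_of` and `same_identity` are made up for illustration and are not part of Slurm.

```shell
# Hypothetical helper: print "uid:gid" for a user, locally or on a remote host.
# Usage: identity_of USER [HOST]
# Assumes ssh access to HOST; returns non-zero if the user cannot be resolved.
identity_of() {
    user=$1
    host=${2:-}
    if [ -n "$host" ]; then
        uid=$(ssh "$host" id -u "$user") || return 1
        gid=$(ssh "$host" id -g "$user") || return 1
    else
        uid=$(id -u "$user") || return 1
        gid=$(id -g "$user") || return 1
    fi
    printf '%s:%s\n' "$uid" "$gid"
}

# Succeeds only when both "uid:gid" strings are non-empty and identical.
same_identity() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Example, run on the updated login node (c0801 is the node from the error):
#   login=$(identity_of "$USER")
#   node=$(identity_of "$USER" c0801)
#   same_identity "$login" "$node" || echo "uid/gid mismatch: $login vs $node"
```

If the two values differ, or the lookup fails on c0801, that would point at the account information on the node rather than at srun itself.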

Brian Andrus

On 5/13/2022 2:31 PM, Williams, Jenny Avis wrote:

Yesterday I upgraded slurmdbd and slurmctld nodes from RHEL7 / Slurm v. 20.11.8 to RHEL8.5 / Slurm v. 21.08.6 on our production cluster.

I also updated Slurm on the RHEL7 login nodes to 21.08.6.

Sbatch jobs run fine.

Srun, however, fails from the updated login node with invalid job credential errors. Sruns from nodes that are not updated run fine.

I am hoping this looks familiar to you.

$  srun --slurmd-debug=verbose -n 1 -t 8:00:00 --mem=3g -p interact -w c0801 --pty /bin/bash

srun: job 45281066 queued and waiting for resources

srun: job 45281066 has been allocated resources

srun: error: Task launch for StepId=45281066.0 failed on node c0801: Invalid job credential

srun: error: Application launch failed: Invalid job credential

srun: Job step aborted: Waiting up to 32 seconds for job step to finish.

srun: error: Timed out waiting for job step to complete
