Two trivial things to check:

1. Permissions on /etc/munge and /etc/munge/munge.key

2. Is munged running on the problem node?
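
A minimal sketch of the two checks above, assuming GNU stat is available. The helper name check_mode is hypothetical, and the modes in the usage comments (700 for the directory, 600 or 400 for the key, both owned by the munge user) are commonly recommended values rather than something stated in this thread, so compare against a node that works.

```shell
# check_mode PATH MODE: compare the octal permission bits of PATH with MODE.
# Returns 0 on a match, 1 on a mismatch, 2 if PATH cannot be inspected.
check_mode() {
    path=$1
    want=$2
    got=$(stat -c '%a' "$path") || return 2
    if [ "$got" = "$want" ]; then
        echo "OK       $path mode $got"
    else
        echo "MISMATCH $path mode $got (expected $want)"
        return 1
    fi
}

# Example usage on the problem node (run as root; modes are assumptions):
#   check_mode /etc/munge 700
#   check_mode /etc/munge/munge.key 600
#   stat -c '%U:%G' /etc/munge/munge.key   # owner and group, expected munge:munge
#   pgrep -x munged                        # prints a PID if munged is running
```

Running the same commands on a working node gives a baseline to compare against.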

Andy

From: slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] On Behalf Of 
Dean Schulze
Sent: Wednesday, April 15, 2020 1:57 PM
To: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: [slurm-users] Munge decode failing on new node

I've added two new nodes to my Slurm cluster.  One node works, but the 
other one complains about an invalid munge credential.  I've verified that 
munge.key is the same as on all the other nodes with

sudo cksum /etc/munge/munge.key

I recopied munge.key from a node that works.  I've verified that the munge 
UID and GID are the same on all nodes.  The clocks are in sync on all nodes.
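
Beyond comparing checksums, a credential round trip exercises the key and the running daemons together. The sketch below wraps the munge and unmunge command-line tools; the function name check_decode and the host name in the usage comments are placeholders, and it assumes passwordless ssh between nodes.

```shell
# check_decode [HOST]: encode a credential on this node and decode it
# locally (no argument) or on HOST over ssh. The return status is that
# of unmunge, which is expected to be non-zero when decoding fails.
check_decode() {
    host=$1
    if [ -z "$host" ]; then
        munge -n | unmunge
    else
        munge -n | ssh "$host" unmunge
    fi
}

# Example usage from the problem node ("controller" is a placeholder):
#   check_decode              # local encode and decode
#   check_decode controller   # encode here, decode on the controller
```

If the local round trip succeeds but the remote one fails, the STATUS line printed by unmunge usually narrows down whether the key, the clock, or the daemon is at fault.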

Here is what is in the slurmd.log:

 error: Unable to register: Unable to contact slurm controller (connect failure)
 error: Munge decode failed: Invalid credential
 ENCODED: Wed Dec 31 17:00:00 1969
 DECODED: Wed Dec 31 17:00:00 1969
 error: authentication: Invalid authentication credential
 error: slurm_receive_msg_and_forward: Protocol authentication error
 error: service_connection: slurm_receive_msg: Protocol authentication error
 error: Unable to register: Unable to contact slurm controller (connect failure)
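
A note on the ENCODED and DECODED lines above: Wed Dec 31 17:00:00 1969 is Unix epoch 0 rendered in a timezone seven hours behind UTC, which suggests the timestamp fields were never filled in because decoding failed, rather than pointing to clock skew. The sketch below reproduces the string with GNU date; the MST7 timezone and the helper name epoch_zero_local are assumptions, since the thread does not say which timezone the node uses.

```shell
# epoch_zero_local TZ: print Unix time 0 in timezone TZ, using the
# same layout as the slurmd.log lines. Requires GNU date (-d @SECONDS).
epoch_zero_local() {
    LC_ALL=C TZ="$1" date -d @0 '+%a %b %e %H:%M:%S %Y'
}

# MST7 is a POSIX TZ string for UTC-7 (an assumed timezone).
epoch_zero_local MST7
```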

I've checked munged.log, and all it says is

Invalid credential

Thanks for your help
