On Monday, 10 February 2020 12:11:30 PM PST Dean Schulze wrote:
> With this configuration I get this message every second in my slurmctld.log
> file:
>
> error: _slurm_rpc_node_registration node=slurmnode1: Invalid argument
What other errors are in the logs?
Could you check that you've got
In the gres.conf on one of my nodes I have just the line
Autodetect=nvml
as in the last example in https://slurm.schedmd.com/gres.conf.html.
In the slurm.conf on all nodes I have this line for the node with Autodetect=nvml:
NodeName=slurmnode1 CPUs=16 Boards=1 SocketsPerBoard=1 CoresPerS
That usually means you updated the slurm.conf but have not run "scontrol
reconfigure" yet.
Brian Andrus
On 2/10/2020 8:55 AM, Robert Kudyba wrote:
We are using Bright Cluster 8.1 and just upgraded to slurm-17.11.12.
We're getting the below errors when I restart the slurmctld service. The
file appears to be the same on the head node and compute nodes:
[root@node001 ~]# ls -l /cm/shared/apps/slurm/var/etc/slurm.conf
-rw-r--r-- 1 root roo
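An "ls -l" listing shows only size, owner and timestamp, so two copies of slurm.conf can look alike and still differ in content. A minimal sketch of a content comparison follows, assuming both copies are readable from the machine running it (for example over the shared /cm/shared mount, or after copying one across). The function names are made up for illustration and are not part of Slurm.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of the file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def same_config(path_a, path_b):
    """True if the two files have byte-identical content."""
    return file_digest(path_a) == file_digest(path_b)
```

Running file_digest() on the head node and on a compute node and comparing the two strings by eye works as well. Any difference, even trailing whitespace, changes the digest.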
Hi Dean,
Blocking ports with the Linux firewall and/or your network firewall
(wired/Wi-Fi) would have the same effect: Slurm won't work unless you
open ports as specified in
https://wiki.fysik.dtu.dk/niflheim/Slurm_configuration#configure-firewall-for-slurm-daemons
/Ole
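One quick way to test for a firewall problem is to try a TCP connection to the daemon ports from the other side. The sketch below assumes the Slurm default ports (SlurmctldPort=6817, SlurmdPort=6818); if your slurm.conf overrides them, use those values instead. The helper name port_open is made up for illustration.

```python
import socket

# Slurm defaults; an assumption here, check SlurmctldPort/SlurmdPort in slurm.conf.
DEFAULT_PORTS = {"slurmctld": 6817, "slurmd": 6818}


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # Run this on the controller to probe slurmd on a compute node.
    host = "slurmnode1"
    print(host, "slurmd reachable:", port_open(host, DEFAULT_PORTS["slurmd"]))
```

A False result does not prove that a firewall is the cause, since the daemon may simply not be running, but a True result from both directions rules out a blocked port.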
On 2/8/20 1:26 AM,