Hello there,

We recently deployed SLURM for a bioinformatics cluster at KEMRI-Wellcome 
Trust, Kilifi, Kenya. After following the setup guide and using the online 
configurator (to build the configuration file), we ran into the following problems:


1.       None of the slurmd daemons will start on any of the four nodes.

2.       slurmdbd appears to start correctly and allowed us to register 
the cluster.

Here's the debug information available at the moment:

1.       An excerpt from the logs:

less /var/log/slurm/slurmd.log | tail
[2015-11-04T22:33:01.629] fatal: You are running slurmd as something other than 
user slurm(564).  If you want to run as this user add SlurmdUser=root to the 
slurm.conf file.
[2015-11-04T22:36:22.663] Node configuration differs from hardware: 
CPUs=64:64(hw) Boards=1:1(hw) SocketsPerBoard=64:4(hw) CoresPerSocket=1:8(hw) 
ThreadsPerCore=1:2(hw)
[2015-11-04T22:36:22.663] Message aggregation disabled
[2015-11-04T22:36:22.664] Resource spec: Reserved system memory limit not 
configured for this node
[2015-11-04T23:00:17.659] Slurmd shutdown completing
[2015-11-04T23:05:38.092] Node configuration differs from hardware: 
CPUs=64:64(hw) Boards=1:1(hw) SocketsPerBoard=64:4(hw) CoresPerSocket=1:8(hw) 
ThreadsPerCore=1:2(hw)
[2015-11-04T23:05:38.098] Message aggregation disabled
[2015-11-04T23:05:38.111] error: _cpu_freq_cpu_avail: Could not open 
/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies
[2015-11-04T23:05:38.113] Resource spec: Reserved system memory limit not 
configured for this node
[2015-11-04T23:05:38.127] fatal: You are running slurmd as something other than 
user slurm(564).  If you want to run as this user add SlurmdUser=root to the 
slurm.conf file.

The same message appears on the other three nodes as well.
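
Taken literally, the fatal message and the "differs from hardware" line point at slurm.conf entries along the following lines. This is only a sketch built from the values in the log excerpt above. The attached slurm.conf is not reproduced here, the four nodes are assumed to have identical hardware, and State=UNKNOWN is an assumed placeholder.

```
# Sketch only: values are taken from the slurmd.log excerpt above.
# slurmd is being started as root, so SlurmdUser has to match
# (the fatal message itself suggests this setting).
SlurmdUser=root

# Topology as detected by slurmd: 4 sockets x 8 cores x 2 threads = 64 CPUs.
# Assumes all four nodes are identical; State=UNKNOWN is a placeholder.
NodeName=kenbo-cen[05-08] CPUs=64 Sockets=4 CoresPerSocket=8 ThreadsPerCore=2 State=UNKNOWN
```

Running "slurmd -C" on each node should print the detected hardware in slurm.conf format, which can be compared against the NodeName line before restarting the daemons.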

scontrol ping returns:

Slurmctld(primary/backup) at kenbo-cen05/(NULL) are UP/DOWN

sinfo returns:

PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
debug*       up       5:00      1  down* kenbo-cen05
highmem      up   infinite      4  down* kenbo-cen[05-08]
batch        up   infinite      4  down* kenbo-cen[05-08]
longrun      up   infinite      4  down* kenbo-cen[05-08]

My configuration file and the init.d scripts for both slurm and slurmdbd are 
attached below for your perusal.

Your assistance will be highly appreciated.

Regards,

Dennis Mungai.



Attachment: slurm.conf
Description: slurm.conf

Attachment: slurm.init.file
Description: slurm.init.file

Attachment: slurmdbd.init.file
Description: slurmdbd.init.file
