Hi, I've upgraded Slurm from 15.08.3 to 17.02.6 (both built with rpmbuild -tb <tarball>) on CentOS 7 x86_64.
Since the upgrade, slurmd refuses to start on the ControlMachine and on the BackupController (it starts fine on the compute nodes). The error is:

    slurmd: fatal: Unable to determine this slurmd's NodeName

If I specify the node name explicitly, it fails with a different error message:

    [root@slurm_master]# slurmd -D -N $(hostname -s)
    slurmd: Node configuration differs from hardware: CPUs=0:32(hw) Boards=0:1(hw) SocketsPerBoard=0:2(hw) CoresPerSocket=0:8(hw) ThreadsPerCore=0:2(hw)
    slurmd: Message aggregation disabled
    slurmd: error: find_node_record: lookup failure for slurm_master
    slurmd: fatal: ROUTE -- slurm_master not found in node_record_table

    [root@slurm_master]# hostname -s
    slurm_master

Some debugging suggests that the hostname is not in the node hash table. slurmdbd and slurmctld both start fine.

I've googled around, but I only find problems related to compute nodes, not to the controller or the backup. Any ideas?

-- 
Olivier LAHAYE
CEA DRT/LIST/DIR
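PS: One thing I notice while re-reading my config: slurm_master does not appear in any NodeName line of the slurm.conf below, which presumably explains the find_node_record lookup failure. To compare what slurmd detects against what is configured, slurmd can print the detected hardware layout itself (the output below is illustrative, reconstructed from the "differs from hardware" counts in the error above):

    [root@slurm_master]# slurmd -C
    NodeName=slurm_master CPUs=32 Boards=1 SocketsPerBoard=2 CoresPerSocket=8 ThreadsPerCore=2 ...

My full slurm.conf follows, in case something else is wrong.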
ClusterName="OSCAR Cluster" ControlMachine=slurm_master ControlAddr=oscar-server BackupController=slurm_slave BackupAddr=oscar-slave SlurmUser=slurm SlurmctldPort=6817 SlurmdPort=6818 AuthType=auth/munge CacheGroups=1 #CheckpointType=checkpoint/blcr CryptoType=crypto/munge #EnforcePartLimits=YES PrivateData=jobs ProctrackType=proctrack/pgid ReturnToService=1 StateSaveLocation=/var/spool/slurm SlurmdSpoolDir=/var/spool/slurmd SlurmctldPidFile=/var/run/slurmctld.pid SlurmdPidFile=/var/run/slurmd.pid RebootProgram=/usr/sbin/reboot ################### Begin by-hand parameter customisation #################### # https://computing.llnl.gov/linux/slurm/configurator.html # SlurmctldTimeout: How many seconds the backup controller waits before becoming # the master controller SlurmctldTimeout=120 # SlurmdTimeout: How many seconds the SLURM controller waits for the slurmd to # respond to a request before considering the node DOWN SlurmdTimeout=300 # InactiveLimit: How many seconds the SLURM controller waits for srun commands # to respond before considering the job or job step inactive and # terminating it. A value of zero indicates unlimited wait InactiveLimit=0 # MinJobAge: How many seconds the SLURM controller waits after a job terminates # before purging its record. A record of the job will persist in job # completion and/or accounting records indefinitely, but will no # longer be visible with the squeue command after puring MinJobAge=300 # KillWait: How many seconds a job is given to gracefully terminate after # reaching its time limit and being sent SIGTERM before sending a # SIGKILLL KillWait=30 # WaitTime: How many seconds after a job step's first task terminates before # terminating all remaining tasks. A value of zero indicates unlimited # wait WaitTime=0 # EnforcePartLimits: If set to "YES" then jobs which exceed a partition's size # and/or time limits will be rejected at submission time. EnforcePartLimits=YES # SchedulerType: Identifies the type of scheduler to be used. # sched/backfill = For a backfill scheduling module to augment the default FIFO # scheduling. Backfill scheduling will initiate lower-priority # jobs if doing so does not delay the expected initiation time of # any higher priority job. Effectiveness of backfill scheduling # is dependent upon users specifying job time limits, otherwise # all jobs will have the same time limit and backfilling is # impossible. SchedulerType=sched/backfill # SelectType: Identifies the type of resource selection algorithm to be used. # select/cons_res = The resources within a node are individually allocated as # consumable resources. SelectType=select/cons_res # SelectTypeParameters: # CR_CPU = CPUs are consumable resources. # CR_Pack_Nodes = If a job allocation contains more resources than will be used # for launching tasks (e.g. if whole nodes are allocated to a # job), then rather than evenly distributing a job's tasks evenly # across it's allocated nodes, pack them as tightly as possible on # these nodes. SelectTypeParameters=CR_CPU,CR_Pack_Nodes # max_switch_wait= Maximum number of seconds that a job can delay # execution waiting for the specified desired switch # count. -> 302400 ~ 3.5day # max_sched_time = How long that the main scheduling loop will execute for # before exiting. # bf_busy_nodes = When selecting resources for pending jobs to reserve # for future execution (i.e. the job can not be # started immediately), then preferentially select # nodes that are in use. 
# bf_window = The number of minutes into the future to look when considering
#             jobs to schedule (~3.5 days).
# bf_resolution = The number of seconds in the resolution of data maintained
#                 about when jobs begin and end.
SchedulerParameters=max_switch_wait=302400,max_sched_time=30,bf_busy_nodes,bf_window=5040,bf_resolution=210

# PriorityType:
# priority/multifactor = Jobs are prioritized based upon size, age, fair-share
#                        of allocation, etc.
PriorityType=priority/multifactor

# PriorityDecayHalfLife: Controls how long prior resource use is considered in
#                        determining how over- or under-serviced an
#                        association (user, bank account and cluster) is when
#                        computing job priority. The record of usage decays
#                        over time, with half of the original value cleared at
#                        the age given by this value.
PriorityDecayHalfLife=7-0  # 7 days

# PriorityMaxAge: Specifies the job age which will be given the maximum age
#                 factor in computing priority.
PriorityMaxAge=2160  # ~ 14 days

# PriorityWeightAge: An integer value that sets the degree to which the queue
#                    wait time component contributes to the job's priority.
PriorityWeightAge=1024  # ~ 2^10

# PriorityWeightFairshare: An integer value that sets the degree to which the
#                          fair-share component contributes to the job's
#                          priority.
PriorityWeightFairshare=2147483648  # ~ 2^31

# FairShareDampeningFactor: Dampen the effect of exceeding a user or group's
#                           fair share of allocated resources.
FairShareDampeningFactor=3

# PriorityWeightJobSize: An integer value that sets the degree to which the
#                        job size component contributes to the job's priority.
PriorityWeightJobSize=1048576  # ~ 2^20

# AccountingStorageEnforce: Controls what level of association-based
#                           enforcement to impose on job submissions.
AccountingStorageEnforce=limits

# AccountingStorageHost: The name of the machine hosting the accounting
#                        storage database.
AccountingStorageHost=slurm_master

# AccountingStoragePort: The listening port of the accounting storage
#                        database server.
AccountingStoragePort=6819

# AccountingStorageType: The "accounting_storage/slurmdbd" value indicates
#                        that accounting records will be written to the Slurm
#                        DBD, which manages an underlying MySQL database.
AccountingStorageType=accounting_storage/slurmdbd

# JobAcctGatherFrequency: The job accounting and profiling sampling intervals.
JobAcctGatherFrequency=30

# JobAcctGatherType: The job accounting mechanism type.
JobAcctGatherType=jobacct_gather/linux

# TopologyPlugin: Identifies the plugin to be used for determining the network
#                 topology and optimizing job allocations to minimize network
#                 contention.
# topology/tree = Used for a hierarchical network as described in a
#                 topology.conf file.
TopologyPlugin=topology/tree

# CompleteWait: The time, in seconds, given for a job to remain in COMPLETING
#               state before any additional jobs are scheduled.
CompleteWait=32

#SlurmSchedLogFile=/var/log/slurmSched.log
##################### End by-hand parameter customisation ####################
#
# Node Configurations
#
# Weight: All things being equal, jobs will be allocated the nodes with the
#         lowest weight which satisfies their requirements.
NodeName=oscarnode[1-6] CPUs=12 State=UNKNOWN Weight=1 NodeAddr=ibnode[1-6]
NodeName=oscarnode[43-49] CPUs=12 State=UNKNOWN Weight=1 NodeAddr=ibnode[43-49] Feature=gpu
NodeName=oscarnode[8-42] CPUs=24 State=UNKNOWN Weight=1 NodeAddr=ibnode[8-42]
NodeName=oscarnode[50-62,64-69] CPUs=8 State=UNKNOWN Weight=1
NodeName=oscarnode70 CPUs=48 State=UNKNOWN Weight=100000 NodeAddr=ibnode70 Feature=2To
#
# Partition Configurations
#
# DefaultTime=320 ~ 5h20
# MaxTime=14-0 ~ 14 days
PartitionName=workq Nodes=oscarnode[1-6,8-62,64-70] Default=YES MaxTime=14-0 DefaultTime=360 State=UP Shared=NO
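PPS: In case the answer is simply that 17.02 wants the controllers defined as nodes whenever slurmd runs on them, here is a minimal sketch of what I suppose I would add (CPUs/Boards/Sockets/Cores/Threads taken from the hardware counts in the error message above; NodeAddr assumed to follow the same convention as ControlAddr):

    # Hypothetical entry so slurmd on the controller can find itself:
    NodeName=slurm_master NodeAddr=oscar-server CPUs=32 Boards=1 SocketsPerBoard=2 CoresPerSocket=8 ThreadsPerCore=2 State=UNKNOWN

I have not added this yet, since under 15.08 slurmd started fine on the controllers without such an entry.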