The same error shows up on the compute node, as follows:

[root@c103008 ~]# systemctl enable slurmd.service
[root@c103008 ~]# systemctl start slurmd.service
[root@c103008 ~]# systemctl status slurmd.service
● slurmd.service - Slurm node daemon
   Loaded: loaded (/etc/systemd/system/slurmd.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Mon 2022-01-31 00:22:42 EST; 2s ago
  Process: 11505 ExecStart=/usr/local/sbin/slurmd -D -s $SLURMD_OPTIONS (code=exited, status=203/EXEC)
 Main PID: 11505 (code=exited, status=203/EXEC)

Jan 31 00:22:42 c103008 systemd[1]: Started Slurm node daemon.
Jan 31 00:22:42 c103008 systemd[1]: slurmd.service: main process exited, code=exited, status=203/EXEC
Jan 31 00:22:42 c103008 systemd[1]: Unit slurmd.service entered failed state.
Jan 31 00:22:42 c103008 systemd[1]: slurmd.service failed.

Best Regards,
Nousheen Parvaiz
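Status 203/EXEC from systemd means the ExecStart command itself could not be executed, most often because the binary does not exist at that path or is not executable. A minimal check on the compute node, assuming slurmd was installed under the default /usr/local prefix that the unit file above expects:

# compare the path in the unit file with where slurmd actually landed
grep ExecStart /etc/systemd/system/slurmd.service
ls -l /usr/local/sbin/slurmd      # should exist and be executable
command -v slurmd                 # shows the path the shell finds, if any
# if the paths differ, correct ExecStart in slurmd.service, then:
systemctl daemon-reload
systemctl restart slurmd.service

If slurmd was installed somewhere else (or not installed on this node at all), adjusting ExecStart or installing the binary, followed by systemctl daemon-reload, is usually enough.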
On Mon, Jan 31, 2022 at 10:08 AM Nousheen <nousheenparv...@gmail.com> wrote:

> Dear Jeffrey,
>
> Thank you for your response. I have followed the steps as instructed.
> After copying the files to their respective locations, the
> "systemctl status slurmctld.service" command gives me the following error:
>
> (base) [nousheen@exxact system]$ systemctl daemon-reload
> (base) [nousheen@exxact system]$ systemctl enable slurmctld.service
> (base) [nousheen@exxact system]$ systemctl start slurmctld.service
> (base) [nousheen@exxact system]$ systemctl status slurmctld.service
> ● slurmctld.service - Slurm controller daemon
>    Loaded: loaded (/etc/systemd/system/slurmctld.service; enabled; vendor preset: disabled)
>    Active: failed (Result: exit-code) since Mon 2022-01-31 10:04:31 PKT; 3s ago
>   Process: 18114 ExecStart=/usr/local/sbin/slurmctld -D -s $SLURMCTLD_OPTIONS (code=exited, status=1/FAILURE)
>  Main PID: 18114 (code=exited, status=1/FAILURE)
>
> Jan 31 10:04:31 exxact systemd[1]: Started Slurm controller daemon.
> Jan 31 10:04:31 exxact systemd[1]: slurmctld.service: main process exited, code=exited, status=1/FAILURE
> Jan 31 10:04:31 exxact systemd[1]: Unit slurmctld.service entered failed state.
> Jan 31 10:04:31 exxact systemd[1]: slurmctld.service failed.
>
> Kindly guide me. Thank you so much for your time.
>
> Best Regards,
> Nousheen Parvaiz
>
> On Thu, Jan 27, 2022 at 8:25 PM Jeffrey R. Lang <jrl...@uwyo.edu> wrote:
>
>> The missing file error has nothing to do with Slurm. The systemctl
>> command is part of systemd's service management.
>>
>> The error message indicates that you haven't copied the slurmd.service
>> file on your compute node to /etc/systemd/system or
>> /usr/lib/systemd/system. /etc/systemd/system is usually used when a user
>> adds a new service to a machine.
>>
>> Depending on your version of Linux, you may also need to do a
>> "systemctl daemon-reload" to activate slurmd.service within systemd.
>>
>> Once slurmd.service is copied over, the systemctl command should work
>> just fine.
>>
>> Remember:
>>
>> slurmd.service – only on compute nodes
>> slurmctld.service – only on your cluster management node
>> slurmdbd.service – only on your cluster management node
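For reference, a minimal sketch of the copy-and-reload steps described above, assuming Slurm was built from the slurm-21.08.5 source tree shown in the original post, where configure generates the unit files under its etc/ subdirectory (the build path below is only a placeholder):

# on the compute node
cp /path/to/slurm-21.08.5/etc/slurmd.service /etc/systemd/system/
systemctl daemon-reload
systemctl enable --now slurmd.service

# on the cluster management node
cp /path/to/slurm-21.08.5/etc/slurmctld.service /etc/systemd/system/
systemctl daemon-reload
systemctl enable --now slurmctld.service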
>> *From:* slurm-users <slurm-users-boun...@lists.schedmd.com> *On Behalf Of* Nousheen
>> *Sent:* Thursday, January 27, 2022 3:54 AM
>> *To:* Slurm User Community List <slurm-users@lists.schedmd.com>
>> *Subject:* [slurm-users] systemctl enable slurmd.service Failed to execute operation: No such file or directory
>>
>> Hello everyone,
>>
>> I am installing Slurm on CentOS 7 following this tutorial:
>> https://www.slothparadise.com/how-to-install-slurm-on-centos-7-cluster/
>>
>> I am at the step where we start Slurm, but it gives me the following error:
>>
>> [root@exxact slurm-21.08.5]# systemctl enable slurmd.service
>> Failed to execute operation: No such file or directory
>>
>> I have run the command to check whether Slurm is configured properly:
>>
>> [root@exxact slurm-21.08.5]# slurmd -C
>> NodeName=exxact CPUs=12 Boards=1 SocketsPerBoard=1 CoresPerSocket=6 ThreadsPerCore=2 RealMemory=31889
>> UpTime=19-16:06:00
>>
>> I am new to this and unable to understand the problem. Kindly help me resolve this.
>>
>> My slurm.conf file is as follows:
>>
>> # slurm.conf file generated by configurator.html.
>> # Put this file on all nodes of your cluster.
>> # See the slurm.conf man page for more information.
>> #
>> ClusterName=cluster194
>> SlurmctldHost=192.168.60.194
>> #SlurmctldHost=
>> #
>> #DisableRootJobs=NO
>> #EnforcePartLimits=NO
>> #Epilog=
>> #EpilogSlurmctld=
>> #FirstJobId=1
>> #MaxJobId=67043328
>> #GresTypes=
>> #GroupUpdateForce=0
>> #GroupUpdateTime=600
>> #JobFileAppend=0
>> #JobRequeue=1
>> #JobSubmitPlugins=lua
>> #KillOnBadExit=0
>> #LaunchType=launch/slurm
>> #Licenses=foo*4,bar
>> #MailProg=/bin/mail
>> #MaxJobCount=10000
>> #MaxStepCount=40000
>> #MaxTasksPerNode=512
>> MpiDefault=none
>> #MpiParams=ports=#-#
>> #PluginDir=
>> #PlugStackConfig=
>> #PrivateData=jobs
>> ProctrackType=proctrack/cgroup
>> #Prolog=
>> #PrologFlags=
>> #PrologSlurmctld=
>> #PropagatePrioProcess=0
>> #PropagateResourceLimits=
>> #PropagateResourceLimitsExcept=
>> #RebootProgram=
>> ReturnToService=1
>> SlurmctldPidFile=/var/run/slurmctld.pid
>> SlurmctldPort=6817
>> SlurmdPidFile=/var/run/slurmd.pid
>> SlurmdPort=6818
>> SlurmdSpoolDir=/var/spool/slurmd
>> SlurmUser=nousheen
>> #SlurmdUser=root
>> #SrunEpilog=
>> #SrunProlog=
>> StateSaveLocation=/home/nousheen/Documents/SILICS/slurm-21.08.5/slurmctld
>> SwitchType=switch/none
>> #TaskEpilog=
>> TaskPlugin=task/affinity
>> #TaskProlog=
>> #TopologyPlugin=topology/tree
>> #TmpFS=/tmp
>> #TrackWCKey=no
>> #TreeWidth=
>> #UnkillableStepProgram=
>> #UsePAM=0
>> #
>> #
>> # TIMERS
>> #BatchStartTimeout=10
>> #CompleteWait=0
>> #EpilogMsgTime=2000
>> #GetEnvTimeout=2
>> #HealthCheckInterval=0
>> #HealthCheckProgram=
>> InactiveLimit=0
>> KillWait=30
>> #MessageTimeout=10
>> #ResvOverRun=0
>> MinJobAge=300
>> #OverTimeLimit=0
>> SlurmctldTimeout=120
>> SlurmdTimeout=300
>> #UnkillableStepTimeout=60
>> #VSizeFactor=0
>> Waittime=0
>> #
>> #
>> # SCHEDULING
>> #DefMemPerCPU=0
>> #MaxMemPerCPU=0
>> #SchedulerTimeSlice=30
>> SchedulerType=sched/backfill
>> SelectType=select/cons_tres
>> SelectTypeParameters=CR_Core
>> #
>> #
>> # JOB PRIORITY
>> #PriorityFlags=
>> #PriorityType=priority/basic
>> #PriorityDecayHalfLife=
>> #PriorityCalcPeriod=
>> #PriorityFavorSmall=
>> #PriorityMaxAge=
>> #PriorityUsageResetPeriod=
>> #PriorityWeightAge=
>> #PriorityWeightFairshare=
>> #PriorityWeightJobSize=
>> #PriorityWeightPartition=
>> #PriorityWeightQOS=
>> #
>> #
>> # LOGGING AND ACCOUNTING
>> #AccountingStorageEnforce=0
>> #AccountingStorageHost=
>> #AccountingStoragePass=
>> #AccountingStoragePort=
>> AccountingStorageType=accounting_storage/none
>> #AccountingStorageUser=
>> #AccountingStoreFlags=
>> #JobCompHost=
>> #JobCompLoc=
>> #JobCompPass=
>> #JobCompPort=
>> JobCompType=jobcomp/none
>> #JobCompUser=
>> #JobContainerType=job_container/none
>> JobAcctGatherFrequency=30
>> JobAcctGatherType=jobacct_gather/none
>> SlurmctldDebug=info
>> SlurmctldLogFile=/var/log/slurmctld.log
>> SlurmdDebug=info
>> SlurmdLogFile=/var/log/slurmd.log
>> #SlurmSchedLogFile=
>> #SlurmSchedLogLevel=
>> #DebugFlags=
>> #
>> #
>> # POWER SAVE SUPPORT FOR IDLE NODES (optional)
>> #SuspendProgram=
>> #ResumeProgram=
>> #SuspendTimeout=
>> #ResumeTimeout=
>> #ResumeRate=
>> #SuspendExcNodes=
>> #SuspendExcParts=
>> #SuspendRate=
>> #SuspendTime=
>> #
>> #
>> # COMPUTE NODES
>> NodeName=linux[1-32] CPUs=11 State=UNKNOWN
>>
>> PartitionName=debug Nodes=ALL Default=YES MaxTime=INFINITE State=UP
>>
>> Best Regards,
>> Nousheen Parvaiz
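One thing worth noting in the configuration above: slurmd -C on the controller reports NodeName=exxact with 12 CPUs, while slurm.conf declares NodeName=linux[1-32] CPUs=11. The NodeName entries generally have to match the hostnames that slurmd -C reports on each node running slurmd. Purely as an illustrative sketch based on the output quoted above (real compute nodes such as c103008 would need their own matching entries):

# COMPUTE NODES
NodeName=exxact CPUs=12 Boards=1 SocketsPerBoard=1 CoresPerSocket=6 ThreadsPerCore=2 RealMemory=31889 State=UNKNOWN

PartitionName=debug Nodes=ALL Default=YES MaxTime=INFINITE State=UP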