Hi Nousheen,

I again recommend that you follow the steps for installing Slurm on a CentOS 7 cluster:
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation

You may need to start the installation from scratch, but the steps are known to work when followed correctly.
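
Also, a hint about the errors below: status=203/EXEC means that systemd
could not execute the ExecStart binary at all, which usually means that
/usr/local/sbin/slurmd (the path from your unit file) is missing or not
executable on the compute node. A quick check:

  ls -l /usr/local/sbin/slurmd   # does the binary exist and is it executable?
  which slurmd                   # where (or whether) slurmd was actually installed

The slurmctld failure (status=1/FAILURE) is different: that daemon started
but then exited with an error. Running it in the foreground with extra
verbosity usually shows the reason:

  /usr/local/sbin/slurmctld -D -vvv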

IHTH,
Ole

On 1/31/22 06:23, Nousheen wrote:
The same error shows up on the compute node, as follows:

[root@c103008 ~]# systemctl enable slurmd.service
[root@c103008 ~]# systemctl start slurmd.service
[root@c103008 ~]# systemctl status slurmd.service
● slurmd.service - Slurm node daemon
   Loaded: loaded (/etc/systemd/system/slurmd.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Mon 2022-01-31 00:22:42 EST; 2s ago
  Process: 11505 ExecStart=/usr/local/sbin/slurmd -D -s $SLURMD_OPTIONS (code=exited, status=203/EXEC)
  Main PID: 11505 (code=exited, status=203/EXEC)

Jan 31 00:22:42 c103008 systemd[1]: Started Slurm node daemon.
Jan 31 00:22:42 c103008 systemd[1]: slurmd.service: main process exited, code=exited, status=203/EXEC
Jan 31 00:22:42 c103008 systemd[1]: Unit slurmd.service entered failed state.
Jan 31 00:22:42 c103008 systemd[1]: slurmd.service failed.


Best Regards,
Nousheen Parvaiz


On Mon, Jan 31, 2022 at 10:08 AM Nousheen <nousheenparv...@gmail.com> wrote:

    Dear Jeffrey,

    Thank you for your response. I have followed the steps as instructed.
    After copying the files to their respective locations, the "systemctl
    status slurmctld.service" command gives me the following error:

    (base) [nousheen@exxact system]$ systemctl daemon-reload
    (base) [nousheen@exxact system]$ systemctl enable slurmctld.service
    (base) [nousheen@exxact system]$ systemctl start slurmctld.service
    (base) [nousheen@exxact system]$ systemctl status slurmctld.service
    ● slurmctld.service - Slurm controller daemon
       Loaded: loaded (/etc/systemd/system/slurmctld.service; enabled; vendor preset: disabled)
       Active: failed (Result: exit-code) since Mon 2022-01-31 10:04:31 PKT; 3s ago
      Process: 18114 ExecStart=/usr/local/sbin/slurmctld -D -s $SLURMCTLD_OPTIONS (code=exited, status=1/FAILURE)
     Main PID: 18114 (code=exited, status=1/FAILURE)

    Jan 31 10:04:31 exxact systemd[1]: Started Slurm controller daemon.
    Jan 31 10:04:31 exxact systemd[1]: slurmctld.service: main process exited, code=exited, status=1/FAILURE
    Jan 31 10:04:31 exxact systemd[1]: Unit slurmctld.service entered failed state.
    Jan 31 10:04:31 exxact systemd[1]: slurmctld.service failed.

    Kindly guide me. Thank you so much for your time.

    Best Regards,
    Nousheen Parvaiz

    On Thu, Jan 27, 2022 at 8:25 PM Jeffrey R. Lang <jrl...@uwyo.edu> wrote:

        The missing file error has nothing to do with Slurm. The
        systemctl command is part of systemd's service management.

        The error message indicates that you haven't copied the
        slurmd.service file on your compute node to /etc/systemd/system or
        /usr/lib/systemd/system. /etc/systemd/system is usually used when
        a user adds a new service to a machine.

        Depending on your version of Linux, you may also need to run
        "systemctl daemon-reload" to activate slurmd.service within
        systemd.

        Once slurmd.service is copied over, the systemctl command should
        work just fine.
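
        For example, assuming you built Slurm 21.08.5 from source and the
        unit files were generated in the etc/ directory of the source tree
        (paths may differ on your system), something like this on the
        compute node should do it:

            cp etc/slurmd.service /etc/systemd/system/
            systemctl daemon-reload
            systemctl enable slurmd.service
            systemctl start slurmd.service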

        Remember:

            slurmd.service    - only on compute nodes
            slurmctld.service - only on your cluster management node
            slurmdbd.service  - only on your cluster management node
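
        A quick way to double-check that the right daemon is enabled on
        the right machine:

            systemctl is-enabled slurmd.service      # on each compute node
            systemctl is-enabled slurmctld.service   # on the management node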


        *From:* slurm-users <slurm-users-boun...@lists.schedmd.com> *On
        Behalf Of* Nousheen
        *Sent:* Thursday, January 27, 2022 3:54 AM
        *To:* Slurm User Community List <slurm-users@lists.schedmd.com>
        *Subject:* [slurm-users] systemctl enable slurmd.service Failed to
        execute operation: No such file or directory


        Hello everyone,

        I am installing Slurm on CentOS 7 following this tutorial:
        https://www.slothparadise.com/how-to-install-slurm-on-centos-7-cluster/

        I am at the step where we start Slurm, but it gives me the
        following error:

        [root@exxact slurm-21.08.5]# systemctl enable slurmd.service
        Failed to execute operation: No such file or directory

        I have run the following command to check whether Slurm is
        configured properly:

        [root@exxact slurm-21.08.5]# slurmd -C
        NodeName=exxact CPUs=12 Boards=1 SocketsPerBoard=1 CoresPerSocket=6 ThreadsPerCore=2 RealMemory=31889 UpTime=19-16:06:00

        I am new to this and unable to understand the problem. Kindly help
        me resolve this.

        My slurm.conf file is as follows:

        # slurm.conf file generated by configurator.html.
        # Put this file on all nodes of your cluster.
        # See the slurm.conf man page for more information.
        #
        ClusterName=cluster194
        SlurmctldHost=192.168.60.194
        #SlurmctldHost=
        #
        #DisableRootJobs=NO
        #EnforcePartLimits=NO
        #Epilog=
        #EpilogSlurmctld=
        #FirstJobId=1
        #MaxJobId=67043328
        #GresTypes=
        #GroupUpdateForce=0
        #GroupUpdateTime=600
        #JobFileAppend=0
        #JobRequeue=1
        #JobSubmitPlugins=lua
        #KillOnBadExit=0
        #LaunchType=launch/slurm
        #Licenses=foo*4,bar
        #MailProg=/bin/mail
        #MaxJobCount=10000
        #MaxStepCount=40000
        #MaxTasksPerNode=512
        MpiDefault=none
        #MpiParams=ports=#-#
        #PluginDir=
        #PlugStackConfig=
        #PrivateData=jobs
        ProctrackType=proctrack/cgroup
        #Prolog=
        #PrologFlags=
        #PrologSlurmctld=
        #PropagatePrioProcess=0
        #PropagateResourceLimits=
        #PropagateResourceLimitsExcept=
        #RebootProgram=
        ReturnToService=1
        SlurmctldPidFile=/var/run/slurmctld.pid
        SlurmctldPort=6817
        SlurmdPidFile=/var/run/slurmd.pid
        SlurmdPort=6818
        SlurmdSpoolDir=/var/spool/slurmd
        SlurmUser=nousheen
        #SlurmdUser=root
        #SrunEpilog=
        #SrunProlog=
        StateSaveLocation=/home/nousheen/Documents/SILICS/slurm-21.08.5/slurmctld
        SwitchType=switch/none
        #TaskEpilog=
        TaskPlugin=task/affinity
        #TaskProlog=
        #TopologyPlugin=topology/tree
        #TmpFS=/tmp
        #TrackWCKey=no
        #TreeWidth=
        #UnkillableStepProgram=
        #UsePAM=0
        #
        #
        # TIMERS
        #BatchStartTimeout=10
        #CompleteWait=0
        #EpilogMsgTime=2000
        #GetEnvTimeout=2
        #HealthCheckInterval=0
        #HealthCheckProgram=
        InactiveLimit=0
        KillWait=30
        #MessageTimeout=10
        #ResvOverRun=0
        MinJobAge=300
        #OverTimeLimit=0
        SlurmctldTimeout=120
        SlurmdTimeout=300
        #UnkillableStepTimeout=60
        #VSizeFactor=0
        Waittime=0
        #
        #
        # SCHEDULING
        #DefMemPerCPU=0
        #MaxMemPerCPU=0
        #SchedulerTimeSlice=30
        SchedulerType=sched/backfill
        SelectType=select/cons_tres
        SelectTypeParameters=CR_Core
        #
        #
        # JOB PRIORITY
        #PriorityFlags=
        #PriorityType=priority/basic
        #PriorityDecayHalfLife=
        #PriorityCalcPeriod=
        #PriorityFavorSmall=
        #PriorityMaxAge=
        #PriorityUsageResetPeriod=
        #PriorityWeightAge=
        #PriorityWeightFairshare=
        #PriorityWeightJobSize=
        #PriorityWeightPartition=
        #PriorityWeightQOS=
        #
        #
        # LOGGING AND ACCOUNTING
        #AccountingStorageEnforce=0
        #AccountingStorageHost=
        #AccountingStoragePass=
        #AccountingStoragePort=
        AccountingStorageType=accounting_storage/none
        #AccountingStorageUser=
        #AccountingStoreFlags=
        #JobCompHost=
        #JobCompLoc=
        #JobCompPass=
        #JobCompPort=
        JobCompType=jobcomp/none
        #JobCompUser=
        #JobContainerType=job_container/none
        JobAcctGatherFrequency=30
        JobAcctGatherType=jobacct_gather/none
        SlurmctldDebug=info
        SlurmctldLogFile=/var/log/slurmctld.log
        SlurmdDebug=info
        SlurmdLogFile=/var/log/slurmd.log
        #SlurmSchedLogFile=
        #SlurmSchedLogLevel=
        #DebugFlags=
        #
        #
        # POWER SAVE SUPPORT FOR IDLE NODES (optional)
        #SuspendProgram=
        #ResumeProgram=
        #SuspendTimeout=
        #ResumeTimeout=
        #ResumeRate=
        #SuspendExcNodes=
        #SuspendExcParts=
        #SuspendRate=
        #SuspendTime=
        #
        #
        # COMPUTE NODES
        NodeName=linux[1-32] CPUs=11 State=UNKNOWN
        PartitionName=debug Nodes=ALL Default=YES MaxTime=INFINITE State=UP

        Best Regards,

        Nousheen Parvaiz
