-----Original Message-----
From: slurm-users <slurm-users-boun...@lists.schedmd.com> On Behalf Of Ole Holm Nielsen
Sent: Wednesday, 13 February 2019 15:10
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] Slurmd not starting

Hi Nathalie,

Which Slurm version and which OS version are you using?

> I use version 17.11 and Ubuntu 18.04. I installed it with
> sudo apt install munge slurm-wlm.

FYI: My Slurm Wiki contains all the details of setting up Slurm on CentOS 7:
https://wiki.fysik.dtu.dk/niflheim/SLURM

Best regards,
Ole

On 2/13/19 2:58 PM, Nathalie Gocht wrote:
> Hey,
>
> I am building up a one-node cluster. Master and node are on the same
> machine. My slurm.conf:
>
> ControlMachine=bayes
> #
> MpiDefault=none
> ProctrackType=proctrack/pgid
> ReturnToService=1
> SlurmctldPidFile=/var/run/slurm-llnl/slurmctld.pid
> SlurmctldPort=6817
> SlurmdPidFile=/var/run/slurm-llnl/slurmd.pid
> SlurmdPort=6818
> SlurmdSpoolDir=/var/spool/slurmd
> SlurmUser=slurm
> StateSaveLocation=/var/spool/slurmctld
> SwitchType=switch/none
> TaskPlugin=task/none
> #
> #
> # TIMERS
> InactiveLimit=0
> KillWait=30
> MinJobAge=300
> SlurmctldTimeout=120
> SlurmdTimeout=300
> Waittime=0
> #
> #
> # SCHEDULING
> FastSchedule=1
> SchedulerType=sched/builtin
> SelectType=select/linear
> #
> #
> # LOGGING AND ACCOUNTING
> AccountingStorageLoc=/var/log/slurm-llnl/job_accounting
> AccountingStorageType=accounting_storage/filetxt
> AccountingStoreJobComment=YES
> ClusterName=bayes
> JobCompLoc=/var/log/slurm-llnl/job_completion
> JobCompType=jobcomp/filetxt
> JobAcctGatherFrequency=60
> JobAcctGatherType=jobacct_gather/linux
> SlurmctldDebug=info
> SlurmctldLogFile=/var/log/slurm-llnl/slurmctld.log
> SlurmdDebug=info
> SlurmdLogFile=/var/log/slurm-llnl/slurmd.log
> # COMPUTE NODES
> GresTypes=gpu
> NodeName=bayes Gres=gpu:tesla:1 CPUs=48 Sockets=2 CoresPerSocket=12 ThreadsPerCore=2 State=UNKNOWN
> PartitionName=long Nodes=bayes Default=YES MaxTime=INFINITE State=UP
>
> I started the control daemon, but it reports a failure:
>
> $ systemctl status slurmctld.service
> ● slurmctld.service - Slurm controller daemon
>    Loaded: loaded (/lib/systemd/system/slurmctld.service; enabled; vendor preset: enabled)
>    Active: failed (Result: exit-code) since Wed 2019-02-13 14:43:02 CET; 7min ago
>      Docs: man:slurmctld(8)
>   Process: 40552 ExecStart=/usr/sbin/slurmctld $SLURMCTLD_OPTIONS (code=exited, status=0/SUCCE
>  Main PID: 40560 (code=exited, status=1/FAILURE)
>
> $ sinfo
> PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
> long*        up   infinite      1   idle bayes
>
> I tried to start the Slurm daemon, but the start times out.
> slurmd -Dvvv gives:
>
> slurmd: error: chmod(/var/spool/slurmd, 0755): Operation not permitted
> slurmd: error: Unable to initialize slurmd spooldir
> slurmd: error: slurmd initialization failed
>
> Does someone know what's going on?
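
A possible first check for the chmod error quoted above, offered only as a sketch and not as a confirmed diagnosis: chmod(2) returns "Operation not permitted" when the calling process is neither the owner of the file nor root, so it may help to compare the owner of the SlurmdSpoolDir with the user that ran slurmd -Dvvv. The helper name can_chmod is hypothetical, and stat -c assumes GNU coreutils (as shipped on Ubuntu).

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the current user is root or owns the
# given path, which are the ordinary conditions under which chmod works.
can_chmod() {
    dir="$1"
    owner_uid=$(stat -c %u "$dir") || return 2   # GNU stat: numeric owner UID
    my_uid=$(id -u)
    [ "$my_uid" -eq 0 ] || [ "$my_uid" -eq "$owner_uid" ]
}

# SlurmdSpoolDir as given in the slurm.conf quoted in this thread.
spool=/var/spool/slurmd

if [ -d "$spool" ]; then
    ls -ld "$spool"
    if can_chmod "$spool"; then
        echo "uid $(id -u) may chmod $spool"
    else
        echo "uid $(id -u) may not chmod $spool (not owner, not root)"
    fi
fi
```

If the second message is printed, rerunning the same slurmd -Dvvv command as root would show whether the failure is only a matter of privileges; whether that is the cause here is an assumption.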