Also, are the nodes up and running as far as SLURM is concerned? What is
the output of:

sinfo -N
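
On a healthy five-node cluster you'd expect every node to show as idle (or
mix/alloc). As a rough sketch (column widths vary; NODELIST follows your
NodeName entries):

NODELIST   NODES PARTITION STATE
sgo1           1    party* idle
sgo2           1    party* idle
sgo3           1    party* idle
sgo4           1    party* idle
sgo5           1    party* idle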

(fwiw, I really like the alias sn='sinfo -Nle -o "%.20n %.15C %.8O %.7t" | uniq'
- the outer single quotes keep the inner double quotes intact)

cheers
L.

------
"The antidote to apocalypticism is *apocalyptic civics*. Apocalyptic civics
is the insistence that we cannot ignore the truth, nor should we panic
about it. It is a shared consciousness that our institutions have failed
and our ecosystem is collapsing, yet we are still here — and we are
creative agents who can shape our destinies. Apocalyptic civics is the
conviction that the only way out is through, and the only way through is
together. "

*Greg Bloom* @greggish
https://twitter.com/greggish/status/873177525903609857

On 28 July 2017 at 10:47, Lachlan Musicman <data...@gmail.com> wrote:

> I think it's because hostname is so undemanding.
>
> How many CPUs does each host have?
>
> You may need to request ((number of CPUs per host) + 1) tasks to see
> activity on another node, as in the sketch below.
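>
> As a sketch, assuming (hypothetically) 4 CPUs per host, five tasks can't
> fit on one node:
>
> $ srun -n 5 hostname
>
> or you can force one task per node explicitly:
>
> $ srun -N5 --ntasks-per-node=1 hostname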
>
> You could try using stress-ng to generate higher loads:
>
> https://www.cyberciti.biz/faq/stress-test-linux-unix-server-with-stress-ng/
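>
> For example (a sketch; adjust --cpu to your actual per-node core count):
>
> $ srun -N5 stress-ng --cpu 4 --timeout 60s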
>
> cheers
> L.
>
> On 28 July 2017 at 10:28, 허웅 <hoewoongg...@naver.com> wrote:
>
>> I have 5 nodes, including the control node.
>>
>> My nodes look like this:
>>
>> Control Node : GO1
>> Compute Nodes : GO[1-5]
>>
>> When I try to allocate a job to multiple nodes, only one node does the
>> work.
>>
>> Example:
>>
>> $ srun -N5 hostname
>> GO1
>> GO1
>> GO1
>> GO1
>> GO1
>>
>> even though I expected something like this:
>>
>> $ srun -N5 hostname
>> GO1
>> GO2
>> GO3
>> GO4
>> GO5
>>
>> What should I do?
>>
>> Here are my configuration details:
>>
>> $ scontrol show frontend
>> FrontendName=GO1 State=IDLE Version=17.02 Reason=(null)
>> BootTime=2017-06-02T20:14:39 SlurmdStartTime=2017-07-27T16:29:46
>>
>> FrontendName=GO2 State=IDLE Version=17.02 Reason=(null)
>> BootTime=2017-07-05T17:54:13 SlurmdStartTime=2017-07-27T16:30:07
>>
>> FrontendName=GO3 State=IDLE Version=17.02 Reason=(null)
>> BootTime=2017-07-05T17:22:58 SlurmdStartTime=2017-07-27T16:30:08
>>
>> FrontendName=GO4 State=IDLE Version=17.02 Reason=(null)
>> BootTime=2017-07-05T17:21:40 SlurmdStartTime=2017-07-27T16:30:08
>>
>> FrontendName=GO5 State=IDLE Version=17.02 Reason=(null)
>> BootTime=2017-07-05T17:21:39 SlurmdStartTime=2017-07-27T16:30:09
>>
>> $ scontrol ping
>> Slurmctld(primary/backup) at GO1/(NULL) are UP/DOWN
>>
>> [slurm.conf]
>> # slurm.conf
>> #
>> # See the slurm.conf man page for more information.
>> #
>> ClusterName=linux
>> ControlMachine=GO1
>> ControlAddr=192.168.30.74
>> #
>> SlurmUser=slurm
>> SlurmctldPort=6817
>> SlurmdPort=6818
>> AuthType=auth/munge
>> StateSaveLocation=/var/lib/slurmd
>> SlurmdSpoolDir=/var/spool/slurmd
>> SwitchType=switch/none
>> MpiDefault=none
>> SlurmctldPidFile=/var/run/slurmd/slurmctld.pid
>> SlurmdPidFile=/var/run/slurmd/slurmd.pid
>> ProctrackType=proctrack/pgid
>> ReturnToService=0
>> TreeWidth=50
>> #
>> # TIMERS
>> SlurmctldTimeout=300
>> SlurmdTimeout=300
>> InactiveLimit=0
>> MinJobAge=300
>> KillWait=30
>> Waittime=0
>> #
>> # SCHEDULING
>> SchedulerType=sched/backfill
>> FastSchedule=1
>> #
>> # LOGGING
>> SlurmctldDebug=7
>> SlurmctldLogFile=/var/log/slurmctld.log
>> SlurmdDebug=7
>> SlurmdLogFile=/var/log/slurmd.log
>> JobCompType=jobcomp/none
>> #
>> # COMPUTE NODES
>> NodeName=sgo[1-5] NodeHostName=GO[1-5] #NodeAddr=192.168.30.[74,141,68,70,72]
>>
>> #
>> # PARTITIONS
>> PartitionName=party Default=yes Nodes=ALL
>>
>
>
