[slurm-users] update node config while jobs are running

2020-03-09 Thread Rundall, Jacob D
I need to update the configuration for the nodes in a cluster, and I’d like to
let jobs keep running while I do so. Specifically, I need to add
RealMemory= to the node definitions (the NodeName= lines). Is it safe to do
this for nodes where jobs are currently running? Or do I need to make sure
nodes are drained while updating their config? We are using
SelectType=select/linear on this cluster, and users would only be allocating
complete nodes.

Additionally, do I need to restart the Slurm daemons (slurmctld and slurmd) to
make this change? I understand that if I were adding completely new nodes I
would need to do so (and that it’s advised to stop slurmctld, update the config
files, restart slurmd on all compute nodes, and then start slurmctld). But is
restarting the Slurm daemons also required when updating node config as I would
like to do, or would ‘scontrol reconfigure’ suffice?
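
For concreteness, the change would look something like the sketch below; the
node names and memory value are placeholders, not our real config:

  # slurm.conf, before:
  NodeName=node[01-16] CPUs=32 State=UNKNOWN
  # slurm.conf, after:
  NodeName=node[01-16] CPUs=32 RealMemory=191000 State=UNKNOWN

  # then, if a daemon restart turns out not to be required:
  scontrol reconfigure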


Re: [slurm-users] slurmd -C showing incorrect core count

2020-03-09 Thread Chris Samuel

On 9/3/20 7:44 am, mike tie wrote:

Specifically, how is slurmd -C getting that info?  Maybe this is a 
kernel issue, but other than lscpu and /proc/cpuinfo, I don't know where 
to look.  Maybe I should be looking at the slurmd source?


It would be worth looking at what something like "lstopo" from the hwloc 
package says about your VM.
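
For example, a text-only summary (both commands come with hwloc; the exact
output will of course depend on the VM):

  $ lstopo-no-graphics --no-io
  $ hwloc-ls --no-io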


All the best,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA



Re: [slurm-users] srun --reboot option is not working

2020-03-09 Thread MrBr @ GMail
> Ah. Looks like the --reboot option is telling slurmctld to put them in
> the CF state and wait for them to come back up. Slurmctld then waits for
> them to 'disconnect' and come back. Since they never reboot (therefore
> never disconnect), slurmctld keeps them in the CF state until the timeout
> occurs.
Hmm, that seems logical. Is there a way for me to confirm this? The slurmctld
log says nothing.
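
For what it's worth, the only checks I can think of are along these lines (the
node name and job id are placeholders):

  $ scontrol show node node01 | grep -Ei 'state|reason'
  $ squeue -j 12345 -o '%T %R'    # job state plus reason/nodelist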

> Do you have RebootProgram defined?
Yes, and it successfully works with "scontrol reboot "

> so normal users cannot use "--reboot"

1. As far as I understand, this is only true since ver. 20.02. I have 19.05.

2. If I'm using the wrong user, should it be reflected in some log?

3. I think that I've configured my user as admin, but I'm not 100% sure.
Please see the output below:

$ sacctmgr list user michael
      User   Def Acct     Admin
---------- ---------- ---------
   michael    sw_user Administ+
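
If it matters, I believe checking and (if needed) raising the admin level would
be something like the following, though I'm not certain of the exact syntax:

  $ sacctmgr show user michael format=User,AdminLevel
  $ sacctmgr modify user where name=michael set AdminLevel=Admin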


On Mon, Mar 9, 2020 at 8:43 PM Brian Andrus  wrote:

> Ah. Looks like the --reboot option is telling slurmctld to put them in the
> CF state and wait for them to come back up. Slurmctld then waits for them
> to 'disconnect' and come back. Since they never reboot (therefore never
> disconnect), slurmctld keeps them in the CF state until the timeout occurs.
>
> Do you have RebootProgram defined?
>
> Note, the manual states:
>
>   Force the allocated nodes to reboot before starting the
> job.  This is only supported with some system configurations and will
> otherwise be silently ignored. *Only root, SlurmUser or admins can reboot
> nodes.*
>
> so normal users cannot use "--reboot"
>
> Brian Andrus
> On 3/9/2020 10:14 AM, MrBr @ GMail wrote:
>
> Hi Brian
> The nodes work with Slurm without any issues until I try the "--reboot"
> option.
> I can successfully allocate the nodes and perform any other Slurm-related
> operation.
>
> > You may want to double check that the node is actually rebooting and
> > that slurmd is set to start on boot.
> That's the problem: they are not being rebooted. I'm monitoring the nodes.
>
> sinfo from the nodes works without issue before and after using "--reboot"
> slurmd is up
>
>
> On Mon, Mar 9, 2020 at 5:59 PM Brian Andrus  wrote:
>
>> You may want to double check that the node is actually rebooting and
>> that slurmd is set to start on boot.
>>
>> ResumeTimeoutReached, in a nutshell, means slurmd isn't talking to
>> slurmctld.
>> Are you able to log onto the node itself and see that it has rebooted?
>> If so, try doing something like 'sinfo' from the node and verify it is
>> able to talk to slurmctld from the node and verify slurmd started
>> successfully.
>>
>> Brian Andrus
>>
>> On 3/9/2020 4:38 AM, MrBr @ GMail wrote:
>> > Hi all
>> >
>> > I'm trying to use the --reboot option of srun to reboot the nodes
>> > before allocation.
>> > However, the nodes are not being rebooted.
>> >
>> > The node gets stuck in the allocated# state as shown by sinfo, or CF as
>> > shown by squeue.
>> > The logs of slurmctld and slurmd show no relevant information,
>> > with debug levels at "debug5".
>> > Eventually the nodes go to "down" due to "ResumeTimeout reached".
>> >
>> > The strangest thing is that "scontrol reboot " works without
>> > any issues.
>> > AFAIK both commands rely on the same RebootProgram.
>> >
>> > The srun documentation contains the following statement: "This is only
>> > supported with some system configurations and will otherwise be
>> > silently ignored". Maybe I have such a "non-supported" configuration?
>> >
>> > Does anyone have a suggestion regarding the root cause of this behavior
>> > or a possible investigation path?
>> >
>> > Tech data:
>> > Slurm 19.05
>> > The user that executes srun is an admin, although that's not
>> > required in 19.05.
>>
>>


Re: [slurm-users] srun --reboot option is not working

2020-03-09 Thread Brian Andrus
Ah. Looks like the --reboot option is telling slurmctld to put them in 
the CF state and wait for them to come back up. Slurmctld then waits for 
them to 'disconnect' and come back. Since they never reboot (therefore 
never disconnect), slurmctld keeps them in the CF state until the 
timeout occurs.


Do you have RebootProgram defined?
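
A quick way to check what the controller currently has configured, for instance:

    scontrol show config | grep -Ei 'rebootprogram|resumetimeout'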

Note, the manual states:

  Force the allocated nodes to reboot before starting the 
job.  This is only supported with some system configurations and will 
otherwise be silently ignored. *Only root, SlurmUser or admins can 
reboot nodes.*


so normal users cannot use "--reboot"

Brian Andrus

On 3/9/2020 10:14 AM, MrBr @ GMail wrote:

Hi Brian
The nodes work with Slurm without any issues until I try the "--reboot"
option.

I can successfully allocate the nodes and perform any other Slurm-related
operation.

> You may want to double check that the node is actually rebooting and
> that slurmd is set to start on boot.
That's the problem: they are not being rebooted. I'm monitoring the nodes.

sinfo from the nodes works without issue before and after using "--reboot"
slurmd is up


On Mon, Mar 9, 2020 at 5:59 PM Brian Andrus > wrote:


You may want to double check that the node is actually rebooting and
that slurmd is set to start on boot.

ResumeTimeoutReached, in a nutshell, means slurmd isn't talking to
slurmctld.
Are you able to log onto the node itself and see that it has rebooted?
If so, try doing something like 'sinfo' from the node and verify it is
able to talk to slurmctld from the node and verify slurmd started
successfully.

Brian Andrus

On 3/9/2020 4:38 AM, MrBr @ GMail wrote:
> Hi all
>
> I'm trying to use the --reboot option of srun to reboot the nodes
> before allocation.
> However, the nodes are not being rebooted.
>
> The node gets stuck in the allocated# state as shown by sinfo, or CF as
> shown by squeue.
> The logs of slurmctld and slurmd show no relevant information,
> with debug levels at "debug5".
> Eventually the nodes go to "down" due to "ResumeTimeout reached".
>
> The strangest thing is that "scontrol reboot " works without
> any issues.
> AFAIK both commands rely on the same RebootProgram.
>
> The srun documentation contains the following statement: "This is only
> supported with some system configurations and will otherwise be
> silently ignored". Maybe I have such a "non-supported" configuration?
>
> Does anyone have a suggestion regarding the root cause of this behavior
> or a possible investigation path?
>
> Tech data:
> Slurm 19.05
> The user that executes srun is an admin, although that's not
> required in 19.05.



Re: [slurm-users] Preemption within same QOS

2020-03-09 Thread Relu Patrascu
We received no replies, so we solved the problem in-house by writing a
simple plugin based on the QOS priority plugin.
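
For context, the cluster-level preemption settings that go with the QOS setup
quoted below were roughly of the following shape (a sketch, not our exact
slurm.conf):

  PreemptType=preempt/qos
  PreemptMode=REQUEUE
  PreemptExemptTime=01:00:00
  SchedulerType=sched/builtin
  SchedulerParameters=preempt_strict_order,preempt_reorder_count=5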

On Wed, Jan 22, 2020 at 2:50 PM Relu Patrascu 
wrote:

> We're having a bit of a problem setting up slurm to achieve this:
>
> 1. Two QOSs, 'high' and 'normal'.
> 2. Preemption type: requeue.
> 3. Any job has a guarantee of running 60 minutes before being preempted.
> 4. Any job submitted with --qos=high can preempt jobs with --qos=normal if
> no resources available and all jobs with --qos=normal have been running for
> at least 60 minutes.
> 5. Job A can preempt job B in the same QOS if all the resources are
> allocated and if job B has run for at least 60 minutes, and if job B has
> lower priority than job A. If job A preempts Job B then B is requeued.
>
> We already set this up, but Slurm does not allow loops in QOS preemption.
> That is, QOS 'normal' cannot preempt QOS 'normal'. Is there a way to
> achieve what we want without having to change the source code? We actually
> did try modifying the source code to allow preemption within the same QOS,
> but what we're observing is that a job with a lower priority can preempt a
> job with a higher priority, after the 60-minute grace time that we have set.
> We use the builtin scheduler:
>
> SchedulerType   = sched/builtin
> SchedulerParameters = preempt_strict_order,preempt_reorder_count=5
>
> sacctmgr show qos format=Name,Priority,Preempt%20,PreemptExemptTime,PreemptMode
>       Name   Priority              Preempt   PreemptExemptTime PreemptMode
> ---------- ---------- -------------------- ------------------- -----------
>     normal          1               normal            01:00:00     requeue
>       high       1000          high,normal            01:00:00     requeue
>
> Any ideas how we could go about to achieve what we want?
>
>
>

-- 

+1-647-680-7564


Re: [slurm-users] srun --reboot option is not working

2020-03-09 Thread MrBr @ GMail
Hi Brian
The nodes work with Slurm without any issues until I try the "--reboot"
option.
I can successfully allocate the nodes and perform any other Slurm-related
operation.

> You may want to double check that the node is actually rebooting and
> that slurmd is set to start on boot.
That's the problem: they are not being rebooted. I'm monitoring the nodes.

sinfo from the nodes works without issue before and after using "--reboot".
slurmd is up.


On Mon, Mar 9, 2020 at 5:59 PM Brian Andrus  wrote:

> You may want to double check that the node is actually rebooting and
> that slurmd is set to start on boot.
>
> ResumeTimeoutReached, in a nutshell, means slurmd isn't talking to
> slurmctld.
> Are you able to log onto the node itself and see that it has rebooted?
> If so, try doing something like 'sinfo' from the node and verify it is
> able to talk to slurmctld from the node and verify slurmd started
> successfully.
>
> Brian Andrus
>
> On 3/9/2020 4:38 AM, MrBr @ GMail wrote:
> > Hi all
> >
> > I'm trying to use the --reboot option of srun to reboot the nodes
> > before allocation.
> > However, the nodes are not being rebooted.
> >
> > The node gets stuck in the allocated# state as shown by sinfo, or CF as
> > shown by squeue.
> > The logs of slurmctld and slurmd show no relevant information,
> > with debug levels at "debug5".
> > Eventually the nodes go to "down" due to "ResumeTimeout reached".
> >
> > The strangest thing is that "scontrol reboot " works without
> > any issues.
> > AFAIK both commands rely on the same RebootProgram.
> >
> > The srun documentation contains the following statement: "This is only
> > supported with some system configurations and will otherwise be
> > silently ignored". Maybe I have such a "non-supported" configuration?
> >
> > Does anyone have a suggestion regarding the root cause of this behavior
> > or a possible investigation path?
> >
> > Tech data:
> > Slurm 19.05
> > The user that executes srun is an admin, although that's not
> > required in 19.05.
>
>


Re: [slurm-users] srun --reboot option is not working

2020-03-09 Thread Brian Andrus
You may want to double check that the node is actually rebooting and 
that slurmd is set to start on boot.


ResumeTimeoutReached, in a nutshell, means slurmd isn't talking to 
slurmctld.

Are you able to log onto the node itself and see that it has rebooted?
If so, try doing something like 'sinfo' from the node and verify it is 
able to talk to slurmctld from the node and verify slurmd started 
successfully.
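
For instance, from the node itself (assuming slurmd runs under systemd):

    scontrol ping
    sinfo -N -n $(hostname -s)
    systemctl status slurmd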


Brian Andrus

On 3/9/2020 4:38 AM, MrBr @ GMail wrote:

Hi all

I'm trying to use the --reboot option of srun to reboot the nodes
before allocation.
However, the nodes are not being rebooted.

The node gets stuck in the allocated# state as shown by sinfo, or CF as
shown by squeue.
The logs of slurmctld and slurmd show no relevant information,
with debug levels at "debug5".

Eventually the nodes go to "down" due to "ResumeTimeout reached".

The strangest thing is that "scontrol reboot " works without
any issues.

AFAIK both commands rely on the same RebootProgram.

The srun documentation contains the following statement: "This is only
supported with some system configurations and will otherwise be
silently ignored". Maybe I have such a "non-supported" configuration?


Does anyone have a suggestion regarding the root cause of this behavior or
a possible investigation path?


Tech data:
Slurm 19.05
The user that executes srun is an admin, although that's not
required in 19.05.




Re: [slurm-users] slurmd -C showing incorrect core count

2020-03-09 Thread mike tie
Interesting. I'm still confused about where slurmd -C is getting the
data. When I think of where the kernel stores info about the processor, I
normally think of /proc/cpuinfo. (By the way, I am running CentOS 7 in the
VM; the VM hypervisor is VMware.) /proc/cpuinfo does show 16 cores.

I understand your concern over the processor speed. So I tried a different
VM, where I see the following specs:

vendor_id  : GenuineIntel
cpu family : 6
model      : 85
model name : Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz

When I increase the core count on that VM, reboot, and run slurmd -C, it too
continues to show the lower, original core count.

Specifically, how is slurmd -C getting that info?  Maybe this is a kernel
issue, but other than lscpu and /proc/cpuinfo, I don't know where to look.
Maybe I should be looking at the slurmd source?
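
For reference, the places I know to compare look like this (lstopo-no-graphics
comes with the hwloc package and may not be installed; nproc and lscpu are the
usual suspects):

  $ slurmd -C
  $ nproc
  $ lscpu | grep -E 'CPU\(s\)|Socket|Core|Thread'
  $ lstopo-no-graphics --no-io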

-Mike



Michael Tie
Technical Director
Mathematics, Statistics, and Computer Science

 One North College Street   phn: 507-222-4067
 Northfield, MN 55057       cel: 952-212-8933
 m...@carleton.edu          fax: 507-222-4312


On Sun, Mar 8, 2020 at 7:32 PM Kirill 'kkm' Katsnelson 
wrote:

> To answer your direct question, the ground truth of 'slurmd -C' is what
> the kernel thinks the hardware is (what you see in lscpu, except it
> probably employs some tricks for VMs with an odd topology). And it got
> severely confused by what the kernel reported to it. I know from experience
> that certain odd cloud VM shapes throw it off balance.
>
> I do not really like the output of lscpu. I have never seen such a strange
> shape of a VM. CPU family 15 is in the Pentium 4 line <
> https://software.intel.com/en-us/articles/intel-architecture-and-processor-identification-with-cpuid-model-and-family-numbers>,
> and model 6 was the last breath of this unsuccessful NetBurst
> architecture--such a rarity that the Linux kernel does not even have it in
> its database: "Common KVM processor" is a slug for "everything else that
> one of these soul-sapping KVMs may return". Flags show that the processor
> supports SSE2 and 3, but not 4.1, 4.2 or AVX, which is consistent with a
> Pentium 4, but 16M of L3 cache is about the average total RAM of a desktop
> at the time the P4 was a thing. And the CPU is NUMA (no real Pentium 4 had
> NUMA, only SMP)¹.
>
> Any advice?
>>
>
> My best advice would be to either use a different hypervisor or tune
> correctly the one you have. Sometimes a hypervisor is tuned for live VM
> migration, when it is frozen on one hardware type and thawed on another,
> and may tweak the CPUID in advance to hide features from the guest OS so
> that it would be able to continue if migrated to less capable hardware; but
> still, using the P4 as the least common denominator is way too extreme.
> Something is seriously wrong on the KVM host.
>
> The VM itself is braindead. Even if you do get it up and running,
> the absence of SSE4.1 and 4.2, AVX, AVX2, and AVX512² would make it about
> as efficient a computing node as a brick. Unless the host CPU is really a
> Presler Pentium 4, in which case you are way too long overdue for a
> hardware upgrade :)))
>
>  -kkm
>   
> ¹ It's not impossible that lscpu shows an SMP machine as if containing a
> single NUMA node, but I have a recollection that this is not the case. I
> haven't seen a non-NUMA CPU in quite a while.
> ² Intel had gone besides-itself-creative this time. It was even bigger a
> naming leap than switching from Roman to decimal between Pentium III to
> Pentium *drum roll* 4 *cymbal crash*.
>
>
> On Sun, Mar 8, 2020 at 1:20 PM mike tie  wrote:
>
>>
>> I am running a Slurm client on a virtual machine.  The virtual machine
>> originally had a core count of 10, but I have now increased the cores to
>> 16; "slurmd -C" continues to show 10.  I have increased the core count
>> in the slurm.conf file, and that is being seen correctly.  The state of the
>> node is stuck in a Drain state because of this conflict.  How do I get
>> slurmd -C to see the new number of cores?
>>
>> I'm running slurm 18.08.  I have tried running "scontrol reconfigure" on
>> the head node.  I have restarted slurmd on all the client nodes, and I have
>> restarted slurmctld on the master node.
>>
>> Where is the data about compute node CPUs stored?  I can't seem to find a
>> config or setting file on the compute node.
>>
>> The compute node that I am working on is "liverpool"
>>
>> *mtie@liverpool** ~ $* slurmd -C
>>
>> NodeName=liverpool CPUs=10 Boards=1 SocketsPerBoard=10 CoresPerSocket=1
>> ThreadsPerCore=1 RealMemory=64263
>>
>> UpTime=1-21:55:36
>>
>>
>> *mtie@liverpool** ~ $* lscpu
>>
>> Architecture:  x86_64
>>
>> CPU op-mode(s):32-bit, 64-bit
>>
>> Byte Order:Little Endian
>>
>> CPU(s):16
>>
>> On-line CPU(s) list:   0-15
>>
>> Thread(s) per core:1
>>
>> Core(s) per socket:4
>>
>> Socket(s): 4
>>
>> NUMA node(s):  1
>>
>> 

[slurm-users] srun --reboot option is not working

2020-03-09 Thread MrBr @ GMail
Hi all

I'm trying to use the --reboot option of srun to reboot the nodes before
allocation.
However, the nodes are not being rebooted.

The node gets stuck in the allocated# state as shown by sinfo, or CF as shown
by squeue.
The logs of slurmctld and slurmd show no relevant information, with debug
levels at "debug5".
Eventually the nodes go to "down" due to "ResumeTimeout reached".

The strangest thing is that "scontrol reboot " works without any
issues.
AFAIK both commands rely on the same RebootProgram.
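
For reference, the two invocations I'm comparing look like this (the node name
and the test command are placeholders):

  $ srun --reboot -w node01 -N1 hostname
  $ scontrol reboot node01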

The srun documentation contains the following statement: "This is only
supported with some system configurations and will otherwise be silently
ignored". Maybe I have such a "non-supported" configuration?

Does anyone have a suggestion regarding the root cause of this behavior or a
possible investigation path?

Tech data:
Slurm 19.05
The user that executes srun is an admin, although that's not required in
19.05.