Re: [slurm-users] Simple free for all cluster

2020-10-17 Thread John H
Thanks Chris, will likely need it :)

John

On Sat, Oct 10, 2020 at 04:19:06PM -0700, Chris Samuel wrote:
> On Tuesday, 6 October 2020 7:53:02 AM PDT Jason Simms wrote:
> 
> > I currently don't have a MaxTime defined, because how do I know how long a
> > job will take? Most jobs on my cluster require no more than 3-4 days, but
> > in some cases at other campuses, I know that jobs can run for weeks. I
> > suppose even setting a time limit such as 4 weeks would be overkill, but at
> > least it's not infinite. I'm curious what others use as that value, and how
> > you arrived at it
> 
> My journey over the last 16 years in HPC has been one of decreasing time
> limits. Back in 2003, with VPAC's first Linux cluster, we had no time limits;
> we then introduced a 90-day limit so we could plan quarterly maintenance (and
> yes, we had users whose jobs legitimately ran longer than that, so they had to
> learn to checkpoint). At VLSCI we had 30-day limits (life sciences, so many
> long-running, poorly scaling jobs); when I was at Swinburne it was a 7-day
> limit; and now here at NERSC we've got 2-day limits.
>
> It really is down to what your use cases are and how much influence you have
> over your users. It's often the HPC sysadmin's responsibility to try and find
> that balance between good utilisation, effective use of the system and reaching
> the desired science/research/development outcomes.
> 
> Best of luck!
> Chris
> -- 
>   Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA

-- 
j...@sdf.org
SDF Public Access UNIX System - http://sdf.org



Re: [slurm-users] Simple free for all cluster

2020-10-10 Thread Chris Samuel
On Tuesday, 6 October 2020 7:53:02 AM PDT Jason Simms wrote:

> I currently don't have a MaxTime defined, because how do I know how long a
> job will take? Most jobs on my cluster require no more than 3-4 days, but
> in some cases at other campuses, I know that jobs can run for weeks. I
> suppose even setting a time limit such as 4 weeks would be overkill, but at
> least it's not infinite. I'm curious what others use as that value, and how
> you arrived at it

My journey over the last 16 years in HPC has been one of decreasing time
limits. Back in 2003, with VPAC's first Linux cluster, we had no time limits;
we then introduced a 90-day limit so we could plan quarterly maintenance (and
yes, we had users whose jobs legitimately ran longer than that, so they had to
learn to checkpoint). At VLSCI we had 30-day limits (life sciences, so many
long-running, poorly scaling jobs); when I was at Swinburne it was a 7-day
limit; and now here at NERSC we've got 2-day limits.

It really is down to what your use cases are and how much influence you have
over your users. It's often the HPC sysadmin's responsibility to try and find
that balance between good utilisation, effective use of the system and reaching
the desired science/research/development outcomes.

Best of luck!
Chris
-- 
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA






Re: [slurm-users] Simple free for all cluster

2020-10-07 Thread Marcus Wagner

Hi Jason,

we aim for a maximum wallclock time of 5 days. We chose this so that we can
carry out maintenance at short notice without disturbing or killing users'
jobs. Yet we see that some users and/or codes need a longer runtime, which is
why we set MaxTime for the partitions to 30 days. Our users must write a
proposal if they need a larger amount of core hours, and there they have to
justify why they need a runtime longer than 5 days. The actual limit is then
enforced through the association created for the triple (account, user,
partition). When we do need to run maintenance at short notice, we kill such
long-running jobs; our users know that.
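
A minimal sketch of how such a per-association limit can be set with sacctmgr
(the account, user and partition names are made up for illustration; it assumes
slurmdbd accounting is in place):

# create the association for the (account, user, partition) triple
sacctmgr add user alice account=long_project partition=long
# grant that association a 30-day wallclock limit
sacctmgr modify user where name=alice account=long_project partition=long set MaxWall=30-00:00:00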

Our default time is set to 15 minutes.

Best
Marcus

On 06.10.2020 at 16:53, Jason Simms wrote:

FWIW, I define the DefaultTime as 5 minutes, which effectively means for any 
"real" job that users must actually define a time. It helps users get into that 
habit, because in the absence of a DefaultTime, most will not even bother to think 
critically and carefully about what time limit is actually reasonable, which is important 
for, e.g., effective job backfill and scheduling estimations.

I currently don't have a MaxTime defined, because how do I know how long a job 
will take? Most jobs on my cluster require no more than 3-4 days, but in some 
cases at other campuses, I know that jobs can run for weeks. I suppose even 
setting a time limit such as 4 weeks would be overkill, but at least it's not 
infinite. I'm curious what others use as that value, and how you arrived at it.

Warmest regards,
Jason

On Tue, Oct 6, 2020 at 5:55 AM John H <j...@sdf.org> wrote:

Yes, I hadn't considered that! Thanks for the tip, Michael; I shall do that.

John

On Fri, Oct 02, 2020 at 01:49:44PM +, Renfro, Michael wrote:
> Depending on the users who will be on this cluster, I'd probably adjust the
> partition to have a defined, non-infinite MaxTime, and maybe a lower
> DefaultTime. Otherwise, it would be very easy for someone to start a job that
> reserves all cores until the nodes get rebooted, since all they have to do is
> submit a job with no explicit time limit (which would then use DefaultTime,
> which itself has a default value of MaxTime).
>



--
*Jason L. Simms, Ph.D., M.P.H.*
Manager of Research and High-Performance Computing
XSEDE Campus Champion
Lafayette College
Information Technology Services
710 Sullivan Rd | Easton, PA 18042
Office: 112 Skillman Library
p: (610) 330-5632


--
Dipl.-Inf. Marcus Wagner

IT Center
Gruppe: Systemgruppe Linux
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wag...@itc.rwth-aachen.de
www.itc.rwth-aachen.de

Social media channels of the IT Center:
https://blog.rwth-aachen.de/itc/
https://www.facebook.com/itcenterrwth
https://www.linkedin.com/company/itcenterrwth
https://twitter.com/ITCenterRWTH
https://www.youtube.com/channel/UCKKDJJukeRwO0LP-ac8x8rQ





Re: [slurm-users] Simple free for all cluster

2020-10-07 Thread Diego Zuccato
On 06/10/20 16:53, Jason Simms wrote:
> FWIW, I define the DefaultTime as 5 minutes, which effectively means for
> any "real" job that users must actually define a time. It helps users
> get into that habit, because in the absence of a DefaultTime, most will
> not even bother to think critically and carefully about what time limit
> is actually reasonable, which is important for, e.g., effective job
> backfill and scheduling estimations.
+1

> I currently don't have a MaxTime defined, because how do I know how long
> a job will take? Most jobs on my cluster require no more than 3-4 days,
> but in some cases at other campuses, I know that jobs can run for weeks.
> I suppose even setting a time limit such as 4 weeks would be overkill,
> but at least it's not infinite. I'm curious what others use as that
> value, and how you arrived at it.
We're currently using 24h, and will go up to 72h. It's a compromise between
users requesting more time for their jobs and how long they're willing to wait
for their jobs to start.

Checkpointing is always needed anyway.
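
As a minimal sketch, a checkpoint-and-requeue pattern in a batch script could
look like the following (the checkpoint command and solver names are
placeholders; only --time, --requeue, --signal and scontrol requeue are
standard Slurm):

#!/bin/bash
#SBATCH --time=24:00:00
#SBATCH --requeue
# deliver SIGUSR1 to the batch shell 10 minutes before the time limit
#SBATCH --signal=B:USR1@600

# on the warning signal: write a checkpoint, requeue the job, and exit cleanly
trap 'my_checkpoint; scontrol requeue "$SLURM_JOB_ID"; exit 0' USR1

# run the real work in the background so the trap can fire during 'wait'
srun ./my_solver --resume checkpoint.dat &
wait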

-- 
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786



Re: [slurm-users] Simple free for all cluster

2020-10-06 Thread Sebastian T Smith
Our MaxTime and DefaultTime are 14 days. Setting a high DefaultTime was a
convenience to our users (and the support team) but has turned out to be a
mistake because it hurts backfill. Under high load we'll see small backfill
jobs take over, because the estimated start and end times of "DefaultTime" jobs
are wildly incorrect -- the backfill algorithm is less likely to predict a
delay to the larger, highest-priority jobs and so keeps backfilling the smaller
ones. I've tuned many of the backfill SchedulerParameters, but there's no
replacement for an accurate time estimate.
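
For reference, a sketch of the kind of backfill tuning meant here, with
illustrative values rather than our production settings (bf_window needs to
cover the longest time limit, here 14 days = 20160 minutes, for the plan to
reach that far ahead):

# slurm.conf: widen the backfill planning window and let the pass resume after releasing locks
SchedulerParameters=bf_window=20160,bf_resolution=600,bf_continue,bf_max_job_user=50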

Default values also become difficult to change once hundreds of submit scripts
rely on them implicitly. Jason, I think setting a small DefaultTime limit is a
good approach. We've considered resetting our default to 1 min to force jobs to
specify a time, but will (likely) target an average-ish value now that we have
stats from a couple of million jobs.

- Sebastian

--

University of Nevada, Reno - http://www.unr.edu/
Sebastian Smith
High-Performance Computing Engineer
Office of Information Technology
1664 North Virginia Street
MS 0291

work-phone: 775-682-5050
email: stsm...@unr.edu
website: http://rc.unr.edu


From: slurm-users on behalf of Jason Simms
Sent: Tuesday, October 6, 2020 7:53 AM
To: Slurm User Community List 
Subject: Re: [slurm-users] Simple free for all cluster

FWIW, I define the DefaultTime as 5 minutes, which effectively means for any 
"real" job that users must actually define a time. It helps users get into that 
habit, because in the absence of a DefaultTime, most will not even bother to 
think critically and carefully about what time limit is actually reasonable, 
which is important for, e.g., effective job backfill and scheduling estimations.

I currently don't have a MaxTime defined, because how do I know how long a job 
will take? Most jobs on my cluster require no more than 3-4 days, but in some 
cases at other campuses, I know that jobs can run for weeks. I suppose even 
setting a time limit such as 4 weeks would be overkill, but at least it's not 
infinite. I'm curious what others use as that value, and how you arrived at it.

Warmest regards,
Jason

On Tue, Oct 6, 2020 at 5:55 AM John H <j...@sdf.org> wrote:
Yes, I hadn't considered that! Thanks for the tip, Michael; I shall do that.

John

On Fri, Oct 02, 2020 at 01:49:44PM +, Renfro, Michael wrote:
> Depending on the users who will be on this cluster, I'd probably adjust the 
> partition to have a defined, non-infinite MaxTime, and maybe a lower 
> DefaultTime. Otherwise, it would be very easy for someone to start a job that 
> reserves all cores until the nodes get rebooted, since all they have to do is 
> submit a job with no explicit time limit (which would then use DefaultTime, 
> which itself has a default value of MaxTime).
>



--
Jason L. Simms, Ph.D., M.P.H.
Manager of Research and High-Performance Computing
XSEDE Campus Champion
Lafayette College
Information Technology Services
710 Sullivan Rd | Easton, PA 18042
Office: 112 Skillman Library
p: (610) 330-5632


Re: [slurm-users] Simple free for all cluster

2020-10-06 Thread Jason Simms
FWIW, I define the DefaultTime as 5 minutes, which effectively means for
any "real" job that users must actually define a time. It helps users get
into that habit, because in the absence of a DefaultTime, most will not
even bother to think critically and carefully about what time limit is
actually reasonable, which is important for, e.g., effective job backfill
and scheduling estimations.
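
For example, with a small DefaultTime in place, each "real" job simply carries
its own limit, either as an #SBATCH --time directive in the script or on the
command line (job script name and values below are illustrative):

sbatch --time=04:00:00 my_job.sh     # hh:mm:ss, a 4-hour limit
sbatch --time=3-00:00:00 my_job.sh   # days-hh:mm:ss, a 3-day limit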

I currently don't have a MaxTime defined, because how do I know how long a
job will take? Most jobs on my cluster require no more than 3-4 days, but
in some cases at other campuses, I know that jobs can run for weeks. I
suppose even setting a time limit such as 4 weeks would be overkill, but at
least it's not infinite. I'm curious what others use as that value, and how
you arrived at it.

Warmest regards,
Jason

On Tue, Oct 6, 2020 at 5:55 AM John H  wrote:

> Yes, I hadn't considered that! Thanks for the tip, Michael; I shall do that.
>
> John
>
> On Fri, Oct 02, 2020 at 01:49:44PM +, Renfro, Michael wrote:
> > Depending on the users who will be on this cluster, I'd probably adjust
> the partition to have a defined, non-infinite MaxTime, and maybe a lower
> DefaultTime. Otherwise, it would be very easy for someone to start a job
> that reserves all cores until the nodes get rebooted, since all they have
> to do is submit a job with no explicit time limit (which would then use
> DefaultTime, which itself has a default value of MaxTime).
> >
>
>

-- 
*Jason L. Simms, Ph.D., M.P.H.*
Manager of Research and High-Performance Computing
XSEDE Campus Champion
Lafayette College
Information Technology Services
710 Sullivan Rd | Easton, PA 18042
Office: 112 Skillman Library
p: (610) 330-5632


Re: [slurm-users] Simple free for all cluster

2020-10-06 Thread John H
Yes, I hadn't considered that! Thanks for the tip, Michael; I shall do that.

John

On Fri, Oct 02, 2020 at 01:49:44PM +, Renfro, Michael wrote:
> Depending on the users who will be on this cluster, I'd probably adjust the 
> partition to have a defined, non-infinite MaxTime, and maybe a lower 
> DefaultTime. Otherwise, it would be very easy for someone to start a job that 
> reserves all cores until the nodes get rebooted, since all they have to do is 
> submit a job with no explicit time limit (which would then use DefaultTime, 
> which itself has a default value of MaxTime). 
> 



[slurm-users] Simple free for all cluster

2020-10-02 Thread John H
Hi All

Hope you are all keeping well in these difficult times.

I have set up a small Slurm cluster of 8 compute nodes (4 x 1-core CPUs, 16GB
RAM) without scheduling or accounting, as it isn't really needed.

I'm just looking for confirmation that it's configured correctly to allow the
controller to 'see' all resources and allocate incoming jobs to the most readily
available node in the cluster. I can see jobs are being delivered to different
nodes, but I want to ensure I haven't inadvertently done anything to render it
suboptimal (even in such a simple use case!).

Thanks very much for any assistance, here is my cfg:

#
# SLURM.CONF
ControlMachine=slnode1
BackupController=slnode2
MpiDefault=none
ProctrackType=proctrack/pgid
ReturnToService=1
SlurmctldPidFile=/var/run/slurm-llnl/slurmctld.pid
SlurmctldPort=6817
SlurmdPidFile=/var/run/slurm-llnl/slurmd.pid
SlurmdPort=6818
SlurmdSpoolDir=/var/spool/slurmd
SlurmUser=slurm
StateSaveLocation=/var/spool/slurm-llnl
SwitchType=switch/none
TaskPlugin=task/none
#
# TIMERS
MinJobAge=86400
#
# SCHEDULING
FastSchedule=1
SchedulerType=sched/backfill
SelectType=select/cons_res
SelectTypeParameters=CR_CPU_MEMORY
#
# LOGGING AND ACCOUNTING
AccountingStorageType=accounting_storage/none
ClusterName=cluster
JobAcctGatherType=jobacct_gather/none
SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm-llnl/slurmctld.log
SlurmdDebug=3
SlurmdLogFile=/var/log/slurm-llnl/slurmd.log
#
# COMPUTE NODES
NodeName=slnode[1-8] CPUs=4 Boards=1 SocketsPerBoard=4 CoresPerSocket=1 
ThreadsPerCore=1 RealMemory=16017
PartitionName=sl Nodes=slnode[1-8] Default=YES MaxTime=INFINITE State=UP
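
In case it helps, a few commands that show what the controller thinks it has
to work with (slnode3 below is just one node from the slnode[1-8] range):

sinfo -N -o "%N %c %m %t"       # per-node CPU count, memory and state as slurmctld sees them
scontrol show node slnode3      # full detail for a single node
squeue -o "%i %u %T %M %l %R"   # jobs with state, elapsed time, time limit and allocated nodes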

John


-- 
j...@sdf.org
SDF Public Access UNIX System - http://sdf.org



Re: [slurm-users] Simple free for all cluster

2020-10-02 Thread Renfro, Michael
Depending on the users who will be on this cluster, I'd probably adjust the 
partition to have a defined, non-infinite MaxTime, and maybe a lower 
DefaultTime. Otherwise, it would be very easy for someone to start a job that 
reserves all cores until the nodes get rebooted, since all they have to do is 
submit a job with no explicit time limit (which would then use DefaultTime, 
which itself has a default value of MaxTime). 
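
As a sketch, the partition line from the original post could become something
like the following; the 4-day MaxTime and 15-minute DefaultTime are only
placeholder values to adapt to your users:

PartitionName=sl Nodes=slnode[1-8] Default=YES DefaultTime=00:15:00 MaxTime=4-00:00:00 State=UP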

On 10/2/20, 7:37 AM, "slurm-users on behalf of John H" 
 wrote:

Hi All

Hope you are all keeping well in these difficult times.

I have set up a small Slurm cluster of 8 compute nodes (4 x 1-core CPUs, 16GB
RAM) without scheduling or accounting, as it isn't really needed.

I'm just looking for confirmation that it's configured correctly to allow the
controller to 'see' all resources and allocate incoming jobs to the most readily
available node in the cluster. I can see jobs are being delivered to different
nodes, but I want to ensure I haven't inadvertently done anything to render it
suboptimal (even in such a simple use case!).

Thanks very much for any assistance, here is my cfg:

#
# SLURM.CONF
ControlMachine=slnode1
BackupController=slnode2
MpiDefault=none
ProctrackType=proctrack/pgid
ReturnToService=1
SlurmctldPidFile=/var/run/slurm-llnl/slurmctld.pid
SlurmctldPort=6817
SlurmdPidFile=/var/run/slurm-llnl/slurmd.pid
SlurmdPort=6818
SlurmdSpoolDir=/var/spool/slurmd
SlurmUser=slurm
StateSaveLocation=/var/spool/slurm-llnl
SwitchType=switch/none
TaskPlugin=task/none
#
# TIMERS
MinJobAge=86400
#
# SCHEDULING
FastSchedule=1
SchedulerType=sched/backfill
SelectType=select/cons_res
SelectTypeParameters=CR_CPU_MEMORY
#
# LOGGING AND ACCOUNTING
AccountingStorageType=accounting_storage/none
ClusterName=cluster
JobAcctGatherType=jobacct_gather/none
SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm-llnl/slurmctld.log
SlurmdDebug=3
SlurmdLogFile=/var/log/slurm-llnl/slurmd.log
#
# COMPUTE NODES
NodeName=slnode[1-8] CPUs=4 Boards=1 SocketsPerBoard=4 CoresPerSocket=1 
ThreadsPerCore=1 RealMemory=16017
PartitionName=sl Nodes=slnode[1-8] Default=YES MaxTime=INFINITE State=UP

John