Thanks Chris, will likely need it :)


On Sat, Oct 10, 2020 at 04:19:06PM -0700, Chris Samuel wrote:
> On Tuesday, 6 October 2020 7:53:02 AM PDT Jason Simms wrote:
> > I currently don't have a MaxTime defined, because how do I know how long a
> > job will take? Most jobs on my cluster require no more than 3-4 days, but
> > in some cases at other campuses, I know that jobs can run for weeks. I
> > suppose even setting a time limit such as 4 weeks would be overkill, but at
> > least it's not infinite. I'm curious what others use as that value, and how
> > you arrived at it.
>
> My journey over the last 16 years in HPC has been one of decreasing time
> limits. Back in 2003, with VPAC's first Linux cluster, we had no time limits;
> we then introduced a 90 day limit so we could plan quarterly maintenances (and
> yes, we had users who had jobs which legitimately ran longer than that, so
> they had to learn to checkpoint).  At VLSCI we had 30 day limits (life
> sciences, so many long running poorly scaling jobs), then when I was at
> Swinburne it was a 7 day limit, and now here at NERSC we've got 2 day limits.
>
> It really is down to what your use cases are and how much influence you have
> over your users.  It's often the HPC sysadmin's responsibility to try and find
> that balance between good utilisation, effective use of the system and
> reaching the desired science/research/development outcomes.
> Best of luck!
> Chris
> -- 
>   Chris Samuel  :  :  Berkeley, CA, USA

SDF Public Access UNIX System -
