Re: [slurm-users] Slurm configuration, Weight Parameter

2019-11-23 Thread Chris Samuel

On 23/11/19 9:14 am, Chris Samuel wrote:

My gut instinct (and I've never tried this) is to make the 3GB nodes be 
in a separate partition that is guarded by AllowQos=3GB and have a QOS 
called "3GB" that uses MinTRESPerJob to require jobs to ask for more 
than 2GB of RAM to be allowed into the QOS.


Of course there's nothing to stop a user requesting more memory than 
they need to get access to these nodes, but that's a social issue not a 
technical one. :-)


--
 Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA



Re: [slurm-users] Slurm configuration, Weight Parameter

2019-11-23 Thread Chris Samuel

On 21/11/19 7:25 am, Sistemas NLHPC wrote:

Currently we have two types of nodes, one with 3GB and another with 2GB
of RAM. On the 3GB nodes we need to disallow jobs that request less than
2GB, to avoid underutilization of those resources.


My gut instinct (and I've never tried this) is to make the 3GB nodes be 
in a separate partition that is guarded by AllowQos=3GB and have a QOS 
called "3GB" that uses MinTRESPerJob to require jobs to ask for more 
than 2GB of RAM to be allowed into the QOS.
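
A sketch of that setup, for what it's worth (the QOS name, node list and
partition name are illustrative, and the exact threshold would need tuning
to your nodes):

```shell
# Create a QOS that requires jobs to ask for more than 2GB of memory.
# The 2049M threshold is an assumption -- adjust to taste.
sacctmgr add qos 3GB
sacctmgr modify qos 3GB set MinTRESPerJob=mem=2049M

# In slurm.conf, put only the 3GB nodes in a partition that is
# reachable solely through that QOS (node names hypothetical):
#   PartitionName=bigmem Nodes=node[01-10] AllowQos=3GB

# A job would then opt in with something like:
#   sbatch --partition=bigmem --qos=3GB --mem=3G job.sh
```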


All the best,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA



Re: [slurm-users] Force a user job to a node with state=drain/maint

2019-11-23 Thread Chris Samuel

On 23/11/19 8:54 am, René Neumaier wrote:


In general, is it possible to move a pending job (i.e. forcing it as root)
to a specific node which is marked as DRAIN, for troubleshooting?


I don't believe so.  First put a reservation on the node for this user 
only, then add the reservation to the job and resume the node.
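
That sequence might look something like this (node, job ID, user and
reservation name are all hypothetical):

```shell
# Reserve the drained node for the one user; IGNORE_JOBS lets the
# reservation be created even though the node is not idle:
scontrol create reservation reservationname=debug_res \
    starttime=now duration=infinite user=alice \
    nodes=node042 flags=ignore_jobs

# Point the pending job at the reservation:
scontrol update jobid=12345 reservationname=debug_res

# Resume the node; the reservation keeps everyone else's jobs off it:
scontrol update nodename=node042 state=resume
```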


All the best,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA



[slurm-users] Force a user job to a node with state=drain/maint

2019-11-23 Thread René Neumaier
Hello everyone!

In general, is it possible to move a pending job (i.e. forcing it as root)
to a specific node which is marked as DRAIN, for troubleshooting?

I know that's not what "DRAIN" normally means. Maybe a reservation is
the way to go. But how can I force/forward a specific user's pending job
in the same partition onto that node?


Best regards and thanks a lot,
René


-- 
_
René Neumaier
Systemadministration

LMU München
Department für Geo und
Umweltwissenschaften,
Paläontologie & Geobiologie

Richard-Wagner-Str. 10
D-80333 München
Tel.: +4989-2180-6625
Fax.: +4989-2180-6601
rene.neuma...@lmu.de
r.neuma...@lrz.uni-muenchen.de

GPG-Fingerprint:
EC0E B6F6 B3FF 6324 B0C8 9452 EF6B 4E3C 2E59 F5AA



Re: [slurm-users] Environment modules

2019-11-23 Thread William Brown
Agreed. I have just been setting up Lmod on a national compute cluster
where I am a non-privileged user, and on an internal cluster where I have
full rights.  It works very well, and Lmod can also read the Tcl module
files.  The most recent version has some extra features specifically for
Slurm.  And I use EasyBuild, which saves hundreds of hours of effort.   I do
quite often have to hand-create simple module files for software with no
EasyConfig, but I can just copy the structure from module files created by
EasyBuild, so it has never been a great problem.
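
For anyone new to this, a hand-written module file can be very small. A
minimal sketch in Lmod's Lua format (software name, version and install
path are all illustrative):

```lua
-- mytool/1.2.3.lua -- hypothetical package; adjust name and paths
help([[MyTool 1.2.3 -- example hand-written Lmod module file]])
whatis("Name: mytool")
whatis("Version: 1.2.3")

-- Only one version of mytool can be loaded at a time:
conflict("mytool")

local root = "/opt/sw/mytool/1.2.3"
prepend_path("PATH", pathJoin(root, "bin"))
prepend_path("LD_LIBRARY_PATH", pathJoin(root, "lib"))
```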

The best bit of modules is being able to offer multiple conflicting
versions of software like Java, Perl, R etc.

William

On Sat, 23 Nov 2019 at 03:57, Chris Samuel  wrote:

> On 22/11/19 9:37 am, Mariano.Maluf wrote:
>
> > The cluster is operational but I need to install and configure
> > environment modules.
>
> If you use Easybuild to install your HPC software then it can take care
> of the modules too for you.  I'd also echo the recommendation from
> others to use Lmod.
>
> Website: https://easybuilders.github.io/easybuild/
> Documentation: https://easybuild.readthedocs.io/
>
> All the best,
> Chris
> --
>   Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
>
>