Re: [slurm-users] Environment modules

2019-11-22 Thread Chris Samuel

On 22/11/19 9:37 am, Mariano.Maluf wrote:

The cluster is operational but I need to install and configure 
environment modules.


If you use Easybuild to install your HPC software then it can take care 
of the modules too for you.  I'd also echo the recommendation from 
others to use Lmod.


Website: https://easybuilders.github.io/easybuild/
Documentation: https://easybuild.readthedocs.io/

All the best,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA



Re: [slurm-users] Array jobs vs. many jobs

2019-11-22 Thread Christopher Samuel

Hi Ryan,

On 11/22/19 12:18 PM, Ryan Novosielski wrote:


Quick question that I'm not sure how to find the answer to otherwise: do array 
jobs have less impact on the scheduler than a whole long list of jobs run the 
more traditional way? Less startup overhead, anything like that?


Slurm will represent the whole job array as a single entity until it 
needs to create elements for scheduling purposes (ageing if you limit 
the number of jobs that can accrue time, or just starting them up).


So if you have a 10,000 element job array it uses the same amount of 
memory as 1 job until things start to happen to it.


It's a big win if you've got a workload that can take advantage of it.
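As a concrete illustration, a minimal job-array script might look like the sketch below. The script name, worklist contents, and `#SBATCH` values are hypothetical; the point is just that one submission covers all elements, and each element uses SLURM_ARRAY_TASK_ID to pick its own work item.

```shell
#!/bin/bash
#SBATCH --job-name=array-demo   # hypothetical name
#SBATCH --array=1-3             # three elements; adjust the range as needed

# Outside Slurm (e.g. testing locally) default the index to 1.
: "${SLURM_ARRAY_TASK_ID:=1}"

# A work list, one item per array element (file names are examples).
printf 'sample1.dat\nsample2.dat\nsample3.dat\n' > worklist.txt

# Each element picks the line of the work list matching its own index.
INPUT=$(sed -n "${SLURM_ARRAY_TASK_ID}p" worklist.txt)
echo "task ${SLURM_ARRAY_TASK_ID} -> ${INPUT}"
```

Submitted once with sbatch, this creates a single array record rather than thousands of independent job records.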

All the best,
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA



Re: [slurm-users] Array jobs vs. many jobs

2019-11-22 Thread Ree, Jan-Albert van


Jan-Albert van Ree  | Linux System Administrator | Digital Services
MARIN | T +31 317 49 35 48 | mailto:j.a.v@marin.nl | http://www.marin.nl

It helps a lot indeed; we run arrays of up to 100k elements and more. If you 
submit 100k separate jobs, the scheduler will definitely grind to a halt.

Regards,
--
Jan-Albert




From: slurm-users  on behalf of Ryan 
Novosielski 
Sent: Friday, November 22, 2019 21:18
To: slurm-users@lists.schedmd.com
Subject: [slurm-users] Array jobs vs. many jobs

Hi there,

Quick question that I'm not sure how to find the answer to otherwise: do array 
jobs have less impact on the scheduler than a whole long list of jobs run the 
more traditional way? Less startup overhead, anything like that?

Thanks!

(we run 17.11 on CentOS 7, but I'm not sure it makes any difference here)


[slurm-users] Array jobs vs. many jobs

2019-11-22 Thread Ryan Novosielski
Hi there,

Quick question that I'm not sure how to find the answer to otherwise: do array 
jobs have less impact on the scheduler than a whole long list of jobs run the 
more traditional way? Less startup overhead, anything like that?

Thanks!

(we run 17.11 on CentOS 7, but I'm not sure it makes any difference here)

--

|| \\UTGERS,|---*O*---
||_// the State  | Ryan Novosielski - novos...@rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\of NJ  | Office of Advanced Research Computing - MSB C630, Newark
 `'



Re: [slurm-users] Environment modules

2019-11-22 Thread Ree, Jan-Albert van


Jan-Albert van Ree  | Linux System Administrator | Digital Services
MARIN | T +31 317 49 35 48 | mailto:j.a.v@marin.nl | http://www.marin.nl

Just install the default CentOS RPM package environment-modules and play with 
it. If you're at home in bash you'll pick it up in minutes.

All default modules will be put in /usr/share/Modules/modulefiles or 
/etc/modulefiles for CentOS, but you can add new locations (in a cluster 
you'd put them on the shared filesystem, so all nodes have immediate access 
after installing there).

For the correct syntax of environment modules, just check out some default 
modulefiles; install the CentOS openmpi package and look at the file 
/etc/modulefiles/mpi/openmpi-x86_64 for some of the possibilities with 
modulefiles, although a lot more is possible, such as automatic loading of 
dependent modules.
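To make this concrete, here is a sketch of creating a minimal modulefile in an extra location. The directory /tmp/shared/modulefiles stands in for a real shared path such as /shared/modulefiles, and "mytool" with its install prefix is entirely hypothetical:

```shell
# Create a modulefile directory for a hypothetical tool (use a real
# shared-filesystem path like /shared/modulefiles on a cluster).
mkdir -p /tmp/shared/modulefiles/mytool

# Minimal Tcl modulefile: add the tool's bin dir to PATH, set one variable.
cat > /tmp/shared/modulefiles/mytool/1.0 <<'EOF'
#%Module1.0
## Minimal example modulefile (paths are hypothetical).
prepend-path PATH /shared/apps/mytool/1.0/bin
setenv MYTOOL_HOME /shared/apps/mytool/1.0
EOF

# Then, with environment-modules installed, register the location and load:
#   module use /tmp/shared/modulefiles
#   module load mytool/1.0
```

The `module use`/`module load` lines are commented out since they require the environment-modules package to be active in the shell.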

Hope this helps
--
Jan-Albert



From: slurm-users  on behalf of 
Mariano.Maluf 
Sent: Friday, November 22, 2019 18:37
To: slurm-users@lists.schedmd.com
Subject: [slurm-users] Environment modules

Hi all

I am setting up for the first time a cluster with Slurm in Centos7 with
1 headnode and 12 nodes.

The cluster is operational but I need to install and configure
environment modules.

Could you advise me some documentation about it?

Thanks in advance.

Regards,
Mariano.

--
Lic. Mariano Maluf
Universidad Nacional de San Martín
2033-1400 int. 6046




Re: [slurm-users] Environment modules

2019-11-22 Thread Nguyen Dai Quy
On Fri, Nov 22, 2019 at 6:37 PM Mariano.Maluf 
wrote:

> Hi all
>
> I am setting up for the first time a cluster with Slurm in Centos7 with
> 1 headnode and 12 nodes.
>
> The cluster is operational but I need to install and configure
> environment modules.
>
> Could you advise me some documentation about it?
>
>
This has nothing to do with Slurm :-)
But it's not so hard to set up. Just define your path to the modulefiles, the
same on the head node and the compute nodes, and put your modulefiles in a
shared folder visible to all nodes. That's all :-)
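A minimal sketch of that setup, assuming a hypothetical shared mount at /shared: drop one profile snippet on every node (head node included) so MODULEPATH is identical cluster-wide.

```shell
# Sketch: point MODULEPATH at the shared modulefile folder, identically
# on every node, e.g. via a snippet like /etc/profile.d/modulepath.sh.
# (/shared/modulefiles is a hypothetical path.)
export MODULEPATH=/shared/modulefiles:${MODULEPATH:-}
echo "$MODULEPATH"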





> Thanks in advance.
>
> Regards,
> Mariano.
>
> --
> Lic. Mariano Maluf
> Universidad Nacional de San Martín
> 2033-1400 int. 6046
>
>
>


Re: [slurm-users] Environment modules

2019-11-22 Thread Wiegand, Paul
We use TACC's Lmod system.  It is pretty straightforward to set up and 
reasonably well documented:

https://www.tacc.utexas.edu/research-development/tacc-projects/lmod

Paul.


> On Nov 22, 2019, at 12:37 PM, Mariano.Maluf  wrote:
> 
> Hi all
> 
> I am setting up for the first time a cluster with Slurm in Centos7 with 1 
> headnode and 12 nodes.
> 
> The cluster is operational but I need to install and configure environment 
> modules.
> 
> Could you advise me some documentation about it?
> 
> Thanks in advance.
> 
> Regards,
> Mariano.
> 
> -- 
> Lic. Mariano Maluf
> Universidad Nacional de San Martín
> 2033-1400 int. 6046
> 
> 



[slurm-users] Environment modules

2019-11-22 Thread Mariano.Maluf

Hi all

I am setting up for the first time a cluster with Slurm in Centos7 with 
1 headnode and 12 nodes.


The cluster is operational but I need to install and configure 
environment modules.


Could you advise me some documentation about it?

Thanks in advance.

Regards,
Mariano.

--
Lic. Mariano Maluf
Universidad Nacional de San Martín
2033-1400 int. 6046




Re: [slurm-users] Slurm configuration, Weight Parameter

2019-11-22 Thread Goetz, Patrick G
Can't you just set the usage priority to be higher for the 2GB machines? 
That way, if the requested memory is less than 2GB those machines will 
be used first, and larger jobs skip to the higher-memory machines.
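For reference, Slurm allocates lower-Weight nodes first, so expressing that preference looks roughly like the slurm.conf fragment below (node names and counts are hypothetical; it makes small nodes preferred but, as noted elsewhere in this thread, does not forbid small jobs from landing on big nodes when small ones are full):

```
# slurm.conf sketch: lower Weight is allocated first, so jobs that fit
# in 2GB land on the small nodes while those are available.
NodeName=small[001-004] RealMemory=2000 Weight=1   State=UNKNOWN
NodeName=big[001-004]   RealMemory=3007 Weight=100 State=UNKNOWN
```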


On 11/21/19 9:44 AM, Jim Prewett wrote:
> 
> Hi Sistemas,
> 
> I could be mistaken, but I don't think there is a way to require jobs on 
> the 3GB nodes to request more than 2GB!
> 
> https://slurm.schedmd.com/slurm.conf.html states this: "Note that if a 
> job allocation request can not be satisfied using the nodes with the 
> lowest weight, the set of nodes with the next lowest weight is added to 
> the set of nodes under consideration for use (repeat as needed for 
> higher weight values)."
> 
> I read that to mean "if there are only 3GB nodes available, jobs will be 
> run there regardless of the memory needed."  We had a similar request 
> but were unable to find a solution (and, ultimately the particular user 
> is happier to not have idle machines when there's work to be done!).
> 
> If I'm misunderstanding, I'd love to know!
> 
> HTH,
> Jim
> 
> On Thu, 21 Nov 2019, Sistemas NLHPC wrote:
> 
>> Hi all,
>>
>> Currently we have two types of nodes, one with 3GB and another with 
>> 2GB of
>> RAM, it is required that in nodes of 3 GB it is not allowed to execute
>> tasks with less than 2GB, to avoid underutilization of resources.
>>
>> This, because we have nodes that can fulfill the condition of executing
>> tasks with 2GB or less.
>>
>> I tried the "Weight" option in the node configuration. I submitted
>> multiple jobs, but Slurm did not assign them by "Weight"; the order in
>> which jobs are placed appears arbitrary. Some configuration and logs:
>>
>> slurm.conf
>>
>> NodeName=DEFAULT RealMemory=3007 Features=3007MB Weight=500 State=idle
>> Sockets=2 CoresPerSocket=1
>> NodeName=devcn050
>>
>> NodeName=DEFAULT RealMemory=3007 Features=3007MB Weight=100 State=idle
>> Sockets=2 CoresPerSocket=1
>> NodeName=devcn002
>>
>> NodeName=DEFAULT RealMemory=2000 Features=2000MB Weight=1 State=idle
>> Sockets=2 CoresPerSocket=1
>> NodeName=devcn001
>>
>> As extra information, I can see that Slurm assigns the Weight to each node:
>>
>> # sinfo -N -l
>>
>> NODELIST  NODES PARTITION STATE CPUS S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON
>> devcn001  1     slims*    idle  2    2:1:1 2000   0        1      2000MB   none
>> devcn002  1     slims*    idle  2    2:1:1 3007   0        100    3007MB   none
>> devcn050  1     slims*    idle  2    2:1:1 3007   0        500    3007MB   none
>>
>> I tested other settings, such as the TRESWeights parameter, with no
>> results, for example:
>>
>> NodeName=devcn001 TRESWeights="CPU=2.0,Mem=2000MB"
>>
>> I also tried with the PriorityType=priority/multifactor plugin both
>> activated and deactivated, but in none of these cases does it work.
>>
>> Thanks in advance.
>>
>> Regards.
>>
> 
> James E. Prewett    j...@prewett.org downl...@hpc.unm.edu
> Systems Team Leader   LoGS: http://www.hpc.unm.edu/~download/LoGS/
> Designated Security Officer OpenPGP key: pub 1024D/31816D93
> HPC Systems Engineer III   UNM HPC  505.277.8210
> 
> 


[slurm-users] nss_slurm not passing groups

2019-11-22 Thread Brian Andrus
Ok, so I wanted to test nss_slurm more after hitting the BoF yesterday.

I have it running, but it does not seem to pass groups.

From a simple interactive bash session:

[andrubr@gen-b2-03 ~]$ getent -s slurm passwd
andrubr:x:43871:11513:Andrus, Brian:/home/andrubr:/bin/bash
[andrubr@gen-b2-03 ~]$ scontrol getent gen-b2-03
JobId=236243.Extern:
User:
andrubr:x:43871:11513:Andrus, Brian:/home/andrubr:/bin/bash
Groups:

JobId=236243.0:
User:
andrubr:x:43871:11513:Andrus, Brian:/home/andrubr:/bin/bash
Groups:
---
but when I do a standard 'id', I get back 41 groups I am in.

Bug?

Brian Andrus