Re: [slurm-users] 20.11.1 on Cray: job_submit.lua: SO loaded on CtlD restart: script skipped when job submitted

2020-12-16 Thread Chris Samuel

On 16/12/20 6:21 pm, Kevin Buckley wrote:


The skip is occurring, in src/lua/slurm_lua.c, because of this trap


That looks right to me; that's Doug's code, which checks whether the 
file has been updated since slurmctld last read it in.  If it has, then 
it'll reload it, but if it hasn't, it'll skip it (and if you've got 
debugging turned up high, you'll see that message).


So if you see that message then the Lua script has been read in to slurmctld 
and should get called.  You might want to check the log for when it last 
read it in, just in case some error was detected at that point.


You can also use luac to run a check over the script you've got like this:

luac -p /etc/opt/slurm/job_submit.lua
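
For reference, a minimal job_submit.lua that "luac -p" should accept without 
complaint looks something like the bare skeleton below (just a sketch to test 
against, not your actual site script): both callbacks need to exist and 
return a Slurm return code.

-- Bare-bones job_submit.lua skeleton (illustration only).
-- Returning slurm.SUCCESS lets the job through unchanged.
function slurm_job_submit(job_desc, part_list, submit_uid)
    slurm.log_info("job_submit.lua: submission from uid " .. tostring(submit_uid))
    return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    return slurm.SUCCESS
end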

All the best,
Chris
--
Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA



[slurm-users] 20.11.1 on Cray: job_submit.lua: SO loaded on CtlD restart: script skipped when job submitted

2020-12-16 Thread Kevin Buckley

Probably not specific to 20.11.1, nor to a Cray, but has anyone out there seen 
anything like this?

As the slurmctld restarts, after upping the debug level, it all looks hunky-dory:

[2020-12-17T09:23:46.204] debug3: Trying to load plugin 
/opt/slurm/20.11.1/lib64/slurm/job_submit_cray_aries.so
[2020-12-17T09:23:46.205] debug3: Success.
[2020-12-17T09:23:46.206] debug3: Trying to load plugin 
/opt/slurm/20.11.1/lib64/slurm/job_submit_lua.so
[2020-12-17T09:23:46.207] debug3: slurm_lua_loadscript: job_submit/lua: loading 
Lua script: /etc/opt/slurm/job_submit.lua
[2020-12-17T09:23:46.208] debug3: Success.
[2020-12-17T09:23:46.209] debug3: Trying to load plugin 
/opt/slurm/20.11.1/lib64/slurm/prep_script.so
[2020-12-17T09:23:46.210] debug3: Success.

but, at the point where a submitted job should pass through the job_submit 
script:

[2020-12-17T09:26:06.806] debug3: job_submit/lua: slurm_lua_loadscript: 
skipping loading Lua script: /etc/opt/slurm/job_submit.lua
[2020-12-17T09:26:06.807] debug3: assoc_mgr_fill_in_user: found correct user: 
someuser(12345)
[2020-12-17T09:26:06.808] debug5: assoc_mgr_fill_in_assoc: looking for assoc of 
user=someuser(12345), acct=accnts0001, cluster=clust, partition=acceptance
[2020-12-17T09:26:06.809] debug3: assoc_mgr_fill_in_assoc: found correct 
association of user=someuser(12345), acct=accnts0001, cluster=clust, 
partition=acceptance to assoc=67 acct=accnts0001


The reason I went looking is that the job_submit.lua should be telling
me, the job submitter, to "sling my hook", as I have deliberately
left something out.

FWIW, the debug level here goes all the way to 5, so I was hoping
for a little more info as to why it is skipping it.

The skip is occurring, in src/lua/slurm_lua.c, because of this trap

if (st.st_mtime <= *load_time) {
        debug3("%s: %s: skipping loading Lua script: %s", plugin,
               __func__, script_path);
        return SLURM_SUCCESS;
}
debug3("%s: %s: loading Lua script: %s", __func__, plugin, script_path);

where "st" is a stat struct, but I am currently none the wiser as why
such a condition would be (maybe even, would need to be) triggered?

The job submit script is certainly "younger" than the time of the slurmctld
restart, and of the job submission; but then, why wouldn't it be?

Kevin
--
Supercomputing Systems Administrator
Pawsey Supercomputing Centre



Re: [slurm-users] using resources effectively?

2020-12-16 Thread Weijun Gao

Thank you, Michael!

I've tried the following example:

    NodeName=gpunode01 Gres=gpu:1 Sockets=2 CoresPerSocket=28 
ThreadsPerCore=2 State=UNKNOWN RealMemory=38
    PartitionName=gpu MaxCPUsPerNode=56 MaxMemPerNode=19 
Nodes=gpunode01 Default=NO MaxTime=1-0 State=UP
    PartitionName=cpu MaxCPUsPerNode=56 MaxMemPerNode=19 
Nodes=gpunode01 Default=YES MaxTime=1-0 State=UP


1) So when the system is idling, the following "gpu" job will start 
immediately ("gpu" partition, 1 GPU, 20 CPUs):


    srun -p gpu --gpus=1 -c 20 --pty bash -i

2) If I run the same command again, it will be queued ... this is normal 
("gpu" partition, 1 GPU, 20 CPUs):


    srun -p gpu --gpus=1 -c 20 --pty bash -i

3) Then the following "cpu" job will be queued too ("cpu" partition, 20 
x CPUs):


    srun -p cpu --gpus=0 -c 20 --pty bash -i

Is there a way to let the "cpu" job run instead of waiting?

Any suggestions?

Thanks again,

Weijun

On 12/16/2020 2:54 PM, Renfro, Michael wrote:

We have overlapping partitions for GPU work and some kinds of non-GPU 
work (both large memory and regular memory jobs).


For 28-core nodes with 2 GPUs, we have:

PartitionName=gpu MaxCPUsPerNode=16 … Nodes=gpunode[001-004]

PartitionName=any-interactive MaxCPUsPerNode=12 … 
Nodes=node[001-040],gpunode[001-004]


PartitionName=bigmem MaxCPUsPerNode=12 … Nodes=gpunode[001-003]

PartitionName=hugemem MaxCPUsPerNode=12 … Nodes=gpunode004

Worst case, non-GPU jobs could reserve up to 24 of the 28 cores on a 
GPU node, but only for a limited time (our any-interactive partition 
has a 2 hour time limit). In practice, it's let us use a lot of 
otherwise idle CPU capacity in the GPU nodes for short test runs.


*From: *slurm-users 
*Date: *Wednesday, December 16, 2020 at 1:04 PM
*To: *Slurm User Community List 
*Subject: *[slurm-users] using resources effectively?





Hi,

Say if I have a Slurm node with 1 x GPU and 112 x CPU cores, and:

 1) there is a job running on the node using the GPU and 20 x CPU 
cores


 2) there is a job waiting in the queue asking for 1 x GPU and 20 x
CPU cores

Is it possible to a) let a new job asking for 0 x GPU and 20 x CPU cores
(safe for the queued GPU job) start immediately; and b) let a new job
asking for 0 x GPU and 100 x CPU cores (not safe for the queued GPU job)
wait in the queue? Or c) is it doable to put the node into two Slurm
partitions, 56 CPU cores to a "cpu" partition, and 56 CPU cores to a
"gpu" partition, for example?

Thank you in advance for any suggestions / tips.

Best,

Weijun

===
Weijun Gao
Computational Research Support Specialist
Department of Psychology, University of Toronto Scarborough
1265 Military Trail, Room SW416
Toronto, ON M1C 1M2
E-mail: weijun@utoronto.ca



[slurm-users] Questions about sacctmgr load filename

2020-12-16 Thread Richard Lefebvre
Hi,

I would like to do the equivalent of:

sacctmgr -i add user namef account=grpa
sacctmgr -i add user nameg account=grpa
...
sacctmgr -i add user namez account=grpa

but with an "sacct -i load filename" in which filename contains the grpa
with the list of user. The documentation mentions the "load" is to read a
formerly "dump" created file. But is doesn't mention if a partial file can
be used. Or what the format of a partial file would look like? Does a load
erase/replace existing info? Can a load be used to remove entries
too(sacctmgr -i delete userb account=grpa)? I'm missing a piece of
documentation I can't seem to find.

The reason I'm looking to use "load" instead of multiple "add"s is that
after doing 4-5 adds, the slurmdbd/slurmctld system becomes very
slow/unresponsive for a few minutes.

Richard


Re: [slurm-users] using resources effectively?

2020-12-16 Thread Renfro, Michael
We have overlapping partitions for GPU work and some kinds of non-GPU work (both 
large memory and regular memory jobs).

For 28-core nodes with 2 GPUs, we have:

PartitionName=gpu MaxCPUsPerNode=16 … Nodes=gpunode[001-004]
PartitionName=any-interactive MaxCPUsPerNode=12 … 
Nodes=node[001-040],gpunode[001-004]
PartitionName=bigmem MaxCPUsPerNode=12 … Nodes=gpunode[001-003]
PartitionName=hugemem MaxCPUsPerNode=12 … Nodes=gpunode004

Worst case, non-GPU jobs could reserve up to 24 of the 28 cores on a GPU node, 
but only for a limited time (our any-interactive partition has a 2 hour time 
limit). In practice, it's let us use a lot of otherwise idle CPU capacity in 
the GPU nodes for short test runs.

From: slurm-users 
Date: Wednesday, December 16, 2020 at 1:04 PM
To: Slurm User Community List 
Subject: [slurm-users] using resources effectively?



Hi,

Say if I have a Slurm node with 1 x GPU and 112 x CPU cores, and:

 1) there is a job running on the node using the GPU and 20 x CPU cores

 2) there is a job waiting in the queue asking for 1 x GPU and 20 x
CPU cores

Is it possible to a) let a new job asking for 0 x GPU and 20 x CPU cores
(safe for the queued GPU job) start immediately; and b) let a new job
asking for 0 x GPU and 100 x CPU cores (not safe for the queued GPU job)
wait in the queue? Or c) is it doable to put the node into two Slurm
partitions, 56 CPU cores to a "cpu" partition, and 56 CPU cores to a
"gpu" partition, for example?

Thank you in advance for any suggestions / tips.

Best,

Weijun

===
Weijun Gao
Computational Research Support Specialist
Department of Psychology, University of Toronto Scarborough
1265 Military Trail, Room SW416
Toronto, ON M1C 1M2
E-mail: weijun@utoronto.ca



[slurm-users] using resources effectively?

2020-12-16 Thread Weijun Gao

Hi,

Say if I have a Slurm node with 1 x GPU and 112 x CPU cores, and:

    1) there is a job running on the node using the GPU and 20 x CPU cores

    2) there is a job waiting in the queue asking for 1 x GPU and 20 x 
CPU cores


Is it possible to a) let a new job asking for 0 x GPU and 20 x CPU cores 
(safe for the queued GPU job) start immediately; and b) let a new job 
asking for 0 x GPU and 100 x CPU cores (not safe for the queued GPU job) 
wait in the queue? Or c) is it doable to put the node into two Slurm 
partitions, 56 CPU cores to a "cpu" partition, and 56 CPU cores to a 
"gpu" partition, for example?


Thank you in advance for any suggestions / tips.

Best,

Weijun

===
Weijun Gao
Computational Research Support Specialist
Department of Psychology, University of Toronto Scarborough
1265 Military Trail, Room SW416
Toronto, ON M1C 1M2
E-mail: weijun@utoronto.ca




[slurm-users] Constraint multiple counts not working

2020-12-16 Thread Jeffrey T Frey
On a cluster running Slurm 17.11.8 (cons_res) I can submit a job that requests 
e.g. 2 nodes with unique features on each:


$ sbatch --nodes=2 --ntasks-per-node=1 --constraint="[256GB*1&192GB*1]" …


The job is submitted and runs as expected:  on 1 node with feature "256GB" and 
1 node with feature "192GB."  A similar job on a cluster running 20.11.1 
(cons_res OR cons_tres, tested with both) fails to submit:


sbatch: error: Batch job submission failed: Requested node configuration is not 
available


I enabled debug5 output with NodeFeatures:


[2020-12-16T08:53:19.024] debug:  JobId=118 feature list: [512GB*1&768GB*1]
[2020-12-16T08:53:19.025] NODE_FEATURES: _log_feature_nodes: FEAT:512GB COUNT:1 
PAREN:0 OP:XAND ACTIVE:r1n[00-47] AVAIL:r1n[00-47]
[2020-12-16T08:53:19.025] NODE_FEATURES: _log_feature_nodes: FEAT:768GB COUNT:1 
PAREN:0 OP:END ACTIVE:r2l[00-31] AVAIL:r2l[00-31]
[2020-12-16T08:53:19.025] NODE_FEATURES: valid_feature_counts: feature:512GB 
feature_bitmap:r1n[00-47],r2l[00-31],r2x[00-10] 
work_bitmap:r1n[00-47],r2l[00-31],r2x[00-10] tmp_bitmap:r1n[00-47] count:1
[2020-12-16T08:53:19.025] NODE_FEATURES: valid_feature_counts: feature:768GB 
feature_bitmap:r1n[00-47],r2l[00-31],r2x[00-10] 
work_bitmap:r1n[00-47],r2l[00-31],r2x[00-10] tmp_bitmap:r2l[00-31] count:1
[2020-12-16T08:53:19.025] NODE_FEATURES: valid_feature_counts: 
NODES:r1n[00-47],r2l[00-31],r2x[00-10] HAS_XOR:T status:No error
[2020-12-16T08:53:19.025] select/cons_tres: _job_test: SELECT_TYPE: test 0 
pass: test_only
[2020-12-16T08:53:19.026] debug2: job_allocate: setting JobId=118_* to 
"BadConstraints" due to a flaw in the job request (Requested node configuration 
is not available)
[2020-12-16T08:53:19.026] _slurm_rpc_submit_batch_job: Requested node 
configuration is not available


My syntax agrees with the 20.11.1 documentation (online and man pages) so it 
seems correct — and it works fine in 17.11.8.  Any ideas?



::
 Jeffrey T. Frey, Ph.D.
 Systems Programmer V / Cluster Management
 Network & Systems Services / College of Engineering
 University of Delaware, Newark DE  19716
 Office: (302) 831-6034  Mobile: (302) 419-4976
::



Re: [slurm-users] getting fairshare

2020-12-16 Thread Paul Edmon
You can use the -o option to select which field you want it to print.  
The last column is the FairShare score.  The equation is part of the 
slurm documentation: https://slurm.schedmd.com/priority_multifactor.html



If you are using the Classic Fairshare you can look at our 
documentation: https://docs.rc.fas.harvard.edu/kb/fairshare/
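
As far as I understand the docs, under the classic (non-Fair-Tree) algorithm 
the score in that last column is roughly 2^(-EffectvUsage / NormShares / d), 
where d is FairShareDampeningFactor (1 by default).  A quick Lua sketch, 
purely illustrative, assuming classic fairshare is in use (Fair Tree walks 
the account tree with a different algorithm, so it won't reproduce those 
numbers exactly):

-- Sketch: classic fairshare factor from sshare's NormShares and
-- EffectvUsage columns.  Assumes FairShareDampeningFactor = 1 and the
-- classic (non-Fair-Tree) algorithm.
local function classic_fairshare(norm_shares, effectv_usage, damp)
    damp = damp or 1
    if norm_shares == 0 then
        return 0
    end
    return 2 ^ (-(effectv_usage / norm_shares) / damp)
end

-- e.g. NormShares 0.25 and EffectvUsage 0.00 gives a score of 1.0
print(classic_fairshare(0.25, 0.00))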



-Paul Edmon-


On 12/16/2020 12:30 PM, Erik Bryer wrote:

$ sshare -a
             Account       User  RawShares  NormShares    RawUsage  EffectvUsage  FairShare
-------------------- ---------- ---------- ----------- ----------- ------------- ----------
root                                           0.00            158          1.00
 root                      root          1     0.25              0          0.00   1.00
 borrowed                                1     0.25            157      0.994905
  borrowed               ebryer          6     0.020979        157          1.00   0.08
  borrowed            napierski          7     0.024476          0          0.00   0.33
  borrowed           sagatest01        259     0.905594          0          0.00   0.33
  borrowed           sagatest02         14     0.048951          0          0.00   0.33
 gaia                                    1     0.25              0      0.005095
  gaia                   ebryer          3     0.272727          0          1.00   0.416667
  gaia                 napiersk          2     0.181818          0          0.00   0.67
  gaia               sagatest01          1     0.090909          0          0.00   0.67
  gaia               sagatest02          5     0.454545          0          0.00   0.67
 saral                                   1     0.25              0          0.00
  saral                  ebryer         20     0.869565          0          0.00   1.00
  saral               napierski          1     0.043478          0          0.00   1.00
  saral              sagatest01          2     0.086957          0          0.00   1.00


Is there a way to take output from sshare and get FairShare? I'm 
looking for a simple equation or some indication why that's not 
possible. I've read everything I can find on this topic.


Thanks,
Erik


[slurm-users] getting fairshare

2020-12-16 Thread Erik Bryer
$ sshare -a
             Account       User  RawShares  NormShares    RawUsage  EffectvUsage  FairShare
-------------------- ---------- ---------- ----------- ----------- ------------- ----------
root                                           0.00            158          1.00
 root                      root          1     0.25              0          0.00   1.00
 borrowed                                1     0.25            157      0.994905
  borrowed               ebryer          6     0.020979        157          1.00   0.08
  borrowed            napierski          7     0.024476          0          0.00   0.33
  borrowed           sagatest01        259     0.905594          0          0.00   0.33
  borrowed           sagatest02         14     0.048951          0          0.00   0.33
 gaia                                    1     0.25              0      0.005095
  gaia                   ebryer          3     0.272727          0          1.00   0.416667
  gaia                 napiersk          2     0.181818          0          0.00   0.67
  gaia               sagatest01          1     0.090909          0          0.00   0.67
  gaia               sagatest02          5     0.454545          0          0.00   0.67
 saral                                   1     0.25              0          0.00
  saral                  ebryer         20     0.869565          0          0.00   1.00
  saral               napierski          1     0.043478          0          0.00   1.00
  saral              sagatest01          2     0.086957          0          0.00   1.00

Is there a way to take output from sshare and get FairShare? I'm looking for a 
simple equation or some indication why that's not possible. I've read 
everything I can find on this topic.

Thanks,
Erik


Re: [slurm-users] Query for minimum memory required in partition

2020-12-16 Thread Paul Edmon

We do this here using the job_submit.lua script.   Here is an example:

    if part == "bigmem" then
    if (job_desc.pn_min_memory ~= 0) then
    if (job_desc.pn_min_memory < 19 or 
job_desc.pn_min_memory > 2147483646) then
    slurm.log_user("You must request 
more than 190GB for jobs in bigmem partition")

    return 2052
    end
    end
    end
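
Adapting that pattern to the largemem case in the original question might 
look roughly like the sketch below.  It is an illustration only: it assumes 
job_desc.pn_min_memory is expressed in MB, and it ignores the per-CPU memory 
flag that Slurm sets for --mem-per-cpu requests, so treat it as a starting 
point rather than a drop-in script.

-- Sketch only: reject largemem jobs asking for less than 193 GB per node.
local MIN_LARGEMEM_MB = 193 * 1024

function slurm_job_submit(job_desc, part_list, submit_uid)
    if job_desc.partition == "largemem" then
        if job_desc.pn_min_memory ~= 0 and
           job_desc.pn_min_memory < MIN_LARGEMEM_MB then
            slurm.log_user("largemem jobs must request at least 193 GB")
            return slurm.ERROR
        end
    end
    return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    return slurm.SUCCESS
end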

-Paul Edmon-

On 12/16/2020 11:06 AM, Sistemas NLHPC wrote:

Hello

Good afternoon, I have a query. Currently in our cluster we have different 
partitions:

1 partition called slims with 48 GB of RAM
1 partition called general with 192 GB of RAM
1 partition called largemem with 768 GB of RAM.

Is it possible to restrict access to the largemem partition so that jobs are 
only accepted if they reserve at least 193 GB, either via slurm.conf or by 
another method? This is because we have users who use the largemem partition 
while reserving less than 192 GB.


Thanks for the help.
--

Mirko Pizarro Pizarro <mpiza...@nlhpc.cl>
Ingeniero de Sistemas
National Laboratory for High Performance Computing (NLHPC)
www.nlhpc.cl 

CMM - Centro de Modelamiento Matemático
Facultad de Ciencias Físicas y Matemáticas (FCFM)
Universidad de Chile

Beauchef 851
Edificio Norte - Piso 6, of. 601
Santiago – Chile
tel +56 2 2978 4603


Re: [slurm-users] gres names

2020-12-16 Thread Erik Bryer
I just found an error in my attempt. I ran on saga-test02 while I'd made the 
change to saga-test01. Things are working better now.
Thanks,
Erik

From: Erik Bryer 
Sent: Wednesday, December 16, 2020 8:51 AM
To: Slurm User Community List 
Subject: Re: [slurm-users] gres names

Hi Loris,

That actually makes some sense. There is one thing that troubles me though. If, 
on a VM with no GPUs, I define...

NodeName=saga-test01 CPUS=2 SocketsPerBoard=1 CoresPerSocket=2 ThreadsPerCore=1 
RealMemory=1800 State=UNKNOWN Gres=gpu:gtx1080ti:4

...and try to run the following I get an error...

$ sbatch -w saga-test02 --gpus=gtx1080ti:1 --partition scavenge --wrap "ls -l" --qos scavenge
sbatch: error: Batch job submission failed: Requested node configuration is not available

This also fouls the whole cluster. Directly after issuing the sbatch, this 
occurs:

Dec 16 07:39:03 saga-test03 slurmctld[1169]: error: Setting node saga-test01 
state to DRAIN

During past tests I've been unable to get both nodes back online without 
removing the spurious gres from the node definition. All this still makes me 
wonder whether there is a direct link between the hardware and gres names. I 
think so. Someone mentioned the gres names get spit out by NVML (but you can 
also make up your own (?)), but I can't find a record of ours. Any thoughts?

Thanks,
Erik

From: slurm-users  on behalf of Loris 
Bennett 
Sent: Wednesday, December 16, 2020 12:07 AM
To: Slurm User Community List 
Subject: Re: [slurm-users] gres names

Hi Erik,

Erik Bryer  writes:

> Thanks for your reply. I can't find NVML in the logs going back to
> 11/22. dmesg goes back to the last boot, but has no mention of
> NVML. Regarding make one up on my own, how does Slurm know string
> "xyzzy" corresponds to a tesla gpu, e.g.?

As I understand it, Slurm doesn't need to know the correspondence, since
all it is doing is counting.  If you define a GRES, say,

  magic:wand

you can configure your nodes to have, say, 2 of these.  Then if a job
requests

 --gres=magic:wand:1

and starts, a subsequent job which requests

 --gres=magic:wand:2

will have to wait until the first magic wand becomes free again.
However, Slurm doesn't need to know whether your nodes really do have
magic wands, but your users do need to request them, if their jobs
require them.  To prevent them using a magic wand without requesting
one, you have to check the job parameters on submission, which you can
do via the job submit plugin.
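
A very rough sketch of such a check (illustrative only; depending on the
Slurm version the requested GRES string shows up in job_desc.gres or
job_desc.tres_per_node, so check which field your version exposes):

-- Sketch: refuse jobs that do not request any GRES at all.
function slurm_job_submit(job_desc, part_list, submit_uid)
    local gres = job_desc.gres or job_desc.tres_per_node
    if gres == nil or gres == "" then
        slurm.log_user("please request the resources your job needs, e.g. --gres=magic:wand:1")
        return slurm.ERROR
    end
    return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    return slurm.SUCCESS
end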

Regards

Loris

> Thanks,
> Erik
> 
> From: slurm-users  on behalf of 
> Michael Di Domenico 
> Sent: Tuesday, December 15, 2020 1:24 PM
> To: Slurm User Community List 
> Subject: Re: [slurm-users] gres names
>
> you can either make them up on your own or they get spit out by NVML
> in the slurmd.log file
>
> On Tue, Dec 15, 2020 at 12:55 PM Erik Bryer  wrote:
>>
>> Hi,
>>
>> Where do I get the gres names, e.g. "rtx2080ti", to use for my gpus in my 
>> node definitions in slurm.conf?
>>
>> Thanks,
>> Erik
>
--
Dr. Loris Bennett (Hr./Mr.)
ZEDAT, Freie Universität Berlin Email loris.benn...@fu-berlin.de



[slurm-users] Query for minimum memory required in partition

2020-12-16 Thread Sistemas NLHPC
 Hello

Good afternoon, I have a query. Currently in our cluster we have different
partitions:

1 partition called slims with 48 GB of RAM
1 partition called general with 192 GB of RAM
1 partition called largemem with 768 GB of RAM.

Is it possible to restrict access to the largemem partition so that jobs are
only accepted if they reserve at least 193 GB, either via slurm.conf or by
another method? This is because we have users who use the largemem partition
while reserving less than 192 GB.

Thanks for the help.
-- 

Mirko Pizarro  Pizarro 

Re: [slurm-users] gres names

2020-12-16 Thread Erik Bryer
Hi Loris,

That actually makes some sense. There is one thing that troubles me though. If, 
on a VM with no GPUs, I define...

NodeName=saga-test01 CPUS=2 SocketsPerBoard=1 CoresPerSocket=2 ThreadsPerCore=1 
RealMemory=1800 State=UNKNOWN Gres=gpu:gtx1080ti:4

...and try to run the following I get an error...

$ sbatch -w saga-test02 --gpus=gtx1080ti:1 --partition scavenge --wrap "ls -l" --qos scavenge
sbatch: error: Batch job submission failed: Requested node configuration is not available

This also fouls the whole cluster. Directly after issuing the sbatch, this 
occurs:

Dec 16 07:39:03 saga-test03 slurmctld[1169]: error: Setting node saga-test01 
state to DRAIN

During past tests I've been unable to get both nodes back online without 
removing the spurious gres from the node definition. All this still makes me 
wonder whether there is a direct link between the hardware and gres names. I 
think so. Someone mentioned the gres names get spit out by NVML (but you can 
also make up your own (?)), but I can't find a record of ours. Any thoughts?

Thanks,
Erik

From: slurm-users  on behalf of Loris 
Bennett 
Sent: Wednesday, December 16, 2020 12:07 AM
To: Slurm User Community List 
Subject: Re: [slurm-users] gres names

Hi Erik,

Erik Bryer  writes:

> Thanks for your reply. I can't find NVML in the logs going back to
> 11/22. dmesg goes back to the last boot, but has no mention of
> NVML. Regarding make one up on my own, how does Slurm know string
> "xyzzy" corresponds to a tesla gpu, e.g.?

As I understand it, Slurm doesn't need to know the correspondence, since
all it is doing is counting.  If you define a GRES, say,

  magic:wand

you can configure your nodes to have, say, 2 of these.  Then if a job
requests

 --gres=magic:wand:1

and starts, a subsequent job which requests

 --gres=magic:wand:2

will have to wait until the first magic wand becomes free again.
However, Slurm doesn't need to know whether your nodes really do have
magic wands, but your users do need to request them, if their jobs
require them.  To prevent them using a magic wand without requesting
one, you have to check the job parameters on submission, which you can
do via the job submit plugin.

Regards

Loris

> Thanks,
> Erik
> 
> From: slurm-users  on behalf of 
> Michael Di Domenico 
> Sent: Tuesday, December 15, 2020 1:24 PM
> To: Slurm User Community List 
> Subject: Re: [slurm-users] gres names
>
> you can either make them up on your own or they get spit out by NVML
> in the slurmd.log file
>
> On Tue, Dec 15, 2020 at 12:55 PM Erik Bryer  wrote:
>>
>> Hi,
>>
>> Where do I get the gres names, e.g. "rtx2080ti", to use for my gpus in my 
>> node definitions in slurm.conf?
>>
>> Thanks,
>> Erik
>
--
Dr. Loris Bennett (Hr./Mr.)
ZEDAT, Freie Universität Berlin Email loris.benn...@fu-berlin.de



Re: [slurm-users] slurm/munge problem: invalid credentials

2020-12-16 Thread Ole Holm Nielsen

Hi Olaf,

Since you are testing Slurm, perhaps my Slurm Wiki page may be of interest 
to you:

https://wiki.fysik.dtu.dk/niflheim/Slurm_installation

There is a discussion about the setup of Munge.

Best regards,
Ole

On 12/15/20 5:48 PM, Olaf Gellert wrote:

Hi all,

we are setting up a new test cluster to test some features for our
next HPC system. On one of the compute nodes we get these messages
in the log:

[2020-12-15T10:00:21.753] error: Munge decode failed: Invalid credential
[2020-12-15T10:00:21.753] auth/munge: _print_cred: ENCODED: Thu Jan 01 
01:00:00 1970
[2020-12-15T10:00:21.753] auth/munge: _print_cred: DECODED: Thu Jan 01 
01:00:00 1970
[2020-12-15T10:00:21.753] error: slurm_receive_msg_and_forward: 
g_slurm_auth_verify: REQUEST_NODE_REGISTRATION_STATUS has authentication 
error: Invalid authentication credential
[2020-12-15T10:00:21.753] error: slurm_receive_msg_and_forward: Protocol 
authentication error
[2020-12-15T10:00:21.763] error: service_connection: slurm_receive_msg: 
Protocol authentication error


I checked munge authentication in the usual way, so:
- time between nodes is synchronised
- munge is using same UID/GID on both sides
- "munge -c0 -z0 -n | unmunge" works on compute nodes and on slurmctld
   node
- ssh slurmcontrolnode "munge -c0 -z0 -n" | unmunge on a compute node
   works
- ssh computenode "munge -c0 -z0 -n" | unmunge on the slurmctld node
   works

So munge seems to work as far as I can tell. What else does
Slurm use munge for? Are hostnames part of the authentication?
Do I have to worry about the time "Thu Jan 01 01:00:00 1970"
(in the logs above)?

All machines are CentOS8, slurm is self-built 20.11.0,
munge is from CentOS8 rpm:

munge-0.5.13-1.el8.x86_64
munge-libs-0.5.13-1.el8.x86_64

Cheers, Olaf






Re: [slurm-users] [EXT] slurm/munge problem: invalid credentials

2020-12-16 Thread Sean Crosby
Hi Olaf,

Check the firewalls between your compute node and the Slurm controller to
make sure that they can contact each other. Slurmctld needs to contact the
SlurmdPort (default 6818), and slurmd needs to contact the SlurmctldPort
(default 6817). Also the other compute nodes need to be able to contact the
new compute node on SlurmdPort.

Sean

--
Sean Crosby | Senior DevOpsHPC Engineer and HPC Team Lead
Research Computing Services | Business Services
The University of Melbourne, Victoria 3010 Australia



On Wed, 16 Dec 2020 at 03:48, Olaf Gellert  wrote:

> Hi all,
>
> we are setting up a new test cluster to test some features for our
> next HPC system. On one of the compute nodes we get these messages
> in the log:
>
> [2020-12-15T10:00:21.753] error: Munge decode failed: Invalid credential
> [2020-12-15T10:00:21.753] auth/munge: _print_cred: ENCODED: Thu Jan 01
> 01:00:00 1970
> [2020-12-15T10:00:21.753] auth/munge: _print_cred: DECODED: Thu Jan 01
> 01:00:00 1970
> [2020-12-15T10:00:21.753] error: slurm_receive_msg_and_forward:
> g_slurm_auth_verify: REQUEST_NODE_REGISTRATION_STATUS has authentication
> error: Invalid authentication credential
> [2020-12-15T10:00:21.753] error: slurm_receive_msg_and_forward: Protocol
> authentication error
> [2020-12-15T10:00:21.763] error: service_connection: slurm_receive_msg:
> Protocol authentication error
>
> I checked munge authentication in the usual way, so:
> - time between nodes is synchronised
> - munge is using same UID/GID on both sides
> - "munge -c0 -z0 -n | unmunge" works on compute nodes and on slurmctld
>node
> - ssh slurmcontrolnode "munge -c0 -z0 -n" | unmunge on a compute node
>works
> - ssh computenode "munge -c0 -z0 -n" | unmunge on the slurmctld node
>works
>
> So munge seems to work as far as I can tell. What else does
> Slurm use munge for? Are hostnames part of the authentication?
> Do I have to worry about the time "Thu Jan 01 01:00:00 1970"
> (in the logs above)?
>
> All machines are CentOS8, slurm is self-built 20.11.0,
> munge is from CentOS8 rpm:
>
> munge-0.5.13-1.el8.x86_64
> munge-libs-0.5.13-1.el8.x86_64
>
> Cheers, Olaf
>
> --
> Dipl. Inform. Olaf Gellert           email  gell...@dkrz.de
> Deutsches Klimarechenzentrum GmbH     phone  +49 (0)40 460094 214
> Bundesstrasse 45a                     fax    +49 (0)40 460094 270
> D-20146 Hamburg, Germany              www    http://www.dkrz.de
>
> Sitz der Gesellschaft: Hamburg
> Geschäftsführer: Prof. Dr. Thomas Ludwig
> Registergericht: Amtsgericht Hamburg, HRB 39784
>
>


[slurm-users] Tuto in building a slurm minimal in a single server

2020-12-16 Thread Richard Randriatoamanana
Hi,

After days of surfing the net and looking for talks/tutorials on the SchedMD 
website, I didn't really find a tutorial (that works in a systemd environment) 
on how to install, configure and deploy Slurm on a single compute server with 
many cores and a lot of memory. The administration explanations and tutorials 
I have tested and read so far are mainly for clusters of 2-3 servers or more 
(front end, master and workers), not one.

I know it is not the best practice to do so but I am validating a PoC with this 
specific minimal infra.

Thanks in advance for your feedback. I will of course share the tutorial that 
works for my environment.

Regards,
Richard
—
Sent from my rotary and brandy phone !
Apologies for typos and any spelling errors

Re: [slurm-users] slurm/munge problem: invalid credentials

2020-12-16 Thread Ward Poelmans

On 15/12/2020 17:48, Olaf Gellert wrote:

So munge seems to work as far as I can tell. What else does
Slurm use munge for? Are hostnames part of the authentication?
Do I have to worry about the time "Thu Jan 01 01:00:00 1970"


I'm not an expert, but I know that hostnames are part of munge 
authentication. From version 0.5.14 you can choose which IP/hostname it 
should use; in your version it uses whatever gethostname() returns.



Ward