Re: [slurm-users] Job step does not take the whole allocation

2023-06-30 Thread Bjørn-Helge Mevik
Hi, Ole! :)

Ole Holm Nielsen  writes:

> Can anyone shed light on the relationship between Tommi's
> slurm_cli_pre_submit function and the ones defined in the
> cli_filter_plugins page?

I think the *_p_* functions are the ones you need to implement if you
write a cli_filter plugin in C.  When you write the plugin script in Lua,
you instead write Lua functions called slurm_cli_setup_defaults,
slurm_cli_pre_submit, etc. in the Lua code, and the C code of the Lua
plugin itself implements the *_p_* functions (I believe).

That said, I too found it hard to find any documentation of the Lua
plugin.  Eventually, I found an example script in the Slurm source code
(etc/cli_filter.lua.example), which I've taken as a starting point for
my cli filter plugin scripts.
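
For anyone without the source tree handy, a stripped-down skeleton of such a
script might look roughly like this.  Take it as a sketch from memory rather
than a reference: the function names are the ones mentioned above, but the
exact argument lists should be checked against etc/cli_filter.lua.example.

function slurm_cli_setup_defaults(options, early_pass)
  -- Set site-wide defaults before the user's command-line options apply,
  -- e.g. options['cpus-per-task'] = "1"
  return slurm.SUCCESS
end

function slurm_cli_pre_submit(options, pack_offset)
  -- Inspect or rewrite the parsed options just before submission.
  return slurm.SUCCESS
end

function slurm_cli_post_submit(offset, job_id, step_id)
  -- Runs after submission, once the job id is known (e.g. for logging).
  return slurm.SUCCESS
end

As far as I understand, returning anything other than slurm.SUCCESS from
slurm_cli_pre_submit makes the command refuse to submit the job.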

-- 
B/H





Re: [slurm-users] Job step does not take the whole allocation

2023-06-30 Thread Ole Holm Nielsen

On 6/30/23 08:41, Tommi Tervo wrote:

> This was an annoying change:
>
> 22.05.x RELEASE_NOTES:
>   -- srun will no longer read in SLURM_CPUS_PER_TASK. This means you will
>      implicitly have to specify --cpus-per-task on your srun calls, or set the
>      new SRUN_CPUS_PER_TASK env var to accomplish the same thing.
>
> Here one can find relevant discussion:
>
> https://bugs.schedmd.com/show_bug.cgi?id=15632
>
> I'll attach our cli-filter pre_submit function which works for us.


The discussion in bug 15632 concludes that this bug will only be fixed in 
23.11.  Your workaround looks nice; however, I have not been able to find 
any documentation of slurmctld calling any Lua functions named 
slurm_cli_pre_submit or slurm_cli_post_submit.


Some very similar functions are documented at 
https://slurm.schedmd.com/cli_filter_plugins.html: 
cli_filter_p_setup_defaults, cli_filter_p_pre_submit, and 
cli_filter_p_post_submit.


Can anyone shed light on the relationship between Tommi's 
slurm_cli_pre_submit function and the ones defined in the 
cli_filter_plugins page?


Thanks,
Ole



Re: [slurm-users] Job step does not take the whole allocation

2023-06-30 Thread Danny Marc Rotscher
Hi,

thank you very much for your help!

Best wishes,
Danny

> On 30.06.2023 at 08:41, Tommi Tervo wrote:
> 
> 





Re: [slurm-users] Job step does not take the whole allocation

2023-06-29 Thread Tommi Tervo
> I searched the slurm.conf documentation, the mailing list, and also the
> changelog, but found no reference to a matching parameter.
> Do any of you know this behavior and how to change it?

Hi,

This was an annoying change:

22.05.x RELEASE_NOTES:
 -- srun will no longer read in SLURM_CPUS_PER_TASK. This means you will
implicitly have to specify --cpus-per-task on your srun calls, or set the
new SRUN_CPUS_PER_TASK env var to accomplish the same thing.

Here one can find relevant discussion:

https://bugs.schedmd.com/show_bug.cgi?id=15632

I'll attach our cli-filter pre_submit function which works for us.

BR,
Tommi Tervo
CSC


function slurm_cli_pre_submit(options, pack_offset)
  --slurm.log_info("Function: %s", "CSC pre_submit")
  local p = require("posix")
  -- cpus_per_task = p.getenv("SLURM_CPUS_PER_TASK")
  local cpus_per_task = options['cpus-per-task']
  if (cpus_per_task ~= nil and tonumber(cpus_per_task) > 0) then
    -- Set the environment variable that srun reads:
    p.setenv("SRUN_CPUS_PER_TASK", cpus_per_task)
    cpus_per_task = p.getenv("SRUN_CPUS_PER_TASK")
    --slurm.log_info("SRUN_CPUS_PER_TASK=%u", cpus_per_task)

    -- Or set the command-line option instead:
    --options['cpus-per-task'] = cpus_per_task
    --slurm.log_info("SRUN_CPUS_PER_TASK=%u", options['cpus-per-task'])
  end
  return slurm.SUCCESS
end
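
In case it helps: if I remember correctly, the script goes into a file named
cli_filter.lua in the same directory as slurm.conf (it runs client-side, so it
must be readable on the login/submit nodes), and the plugin is enabled with
something like the following in slurm.conf (please verify against the
slurm.conf man page; this is from memory):

CliFilterPlugins=lua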



[slurm-users] Job step does not take the whole allocation

2023-06-29 Thread Danny Marc Rotscher
Dear all,

we currently see a change in the default behavior of job steps.
On our old cluster (Slurm 20.11.9), a job step takes all the resources of my 
allocation.
rotscher@tauruslogin5:~> salloc --partition=interactive --nodes=1 --ntasks=1 
--cpus-per-task=24 --hint=nomultithread
salloc: Pending job allocation 37851810
salloc: job 37851810 queued and waiting for resources
salloc: job 37851810 has been allocated resources
salloc: Granted job allocation 37851810
salloc: Waiting for resource configuration
salloc: Nodes taurusi6605 are ready for job
bash-4.2$ srun numactl -show
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
cpubind: 0 1
nodebind: 0 1
membind: 0 1

If I run the same command on our new cluster, the job step takes only 1 core 
instead of all of them, without any further parameter.
[rotscher@login1 ~]$ salloc --nodes=1 --ntasks=1 --cpus-per-task=24 
--hint=nomultithread
salloc: Pending job allocation 9197
salloc: job 9197 queued and waiting for resources
salloc: job 9197 has been allocated resources
salloc: Granted job allocation 9197
salloc: Waiting for resource configuration
salloc: Nodes n1601 are ready for job
[rotscher@login1 ~]$ srun numactl -show
policy: default
preferred node: current
physcpubind: 0
cpubind: 0
nodebind: 0
membind: 0 1 2 3 4 5 6 7

If I add the parameter "-c 24" to the job step, it also takes the whole 
resources, but the step should do that by default.
[rotscher@login1 ~]$ srun -c 24 numactl -show
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
cpubind: 0 1
nodebind: 0 1
membind: 0 1 2 3 4 5 6 7

I searched the slurm.conf documentation, the mailing list, and also the 
changelog, but found no reference to a matching parameter.
Do any of you know this behavior and how to change it?

Best wishes,
Danny
