Re: [slurm-users] Job step does not take the whole allocation
Hei, Ole! :)

Ole Holm Nielsen writes:

> Can anyone shed light on the relationship between Tommi's
> slurm_cli_pre_submit function and the ones defined in the
> cli_filter_plugins page?

I think the *_p_* functions are the functions you need to implement if you
write a cli_filter plugin in C.  When you write a cli_filter plugin script in
Lua, you write Lua functions called slurm_cli_setup_defaults,
slurm_cli_pre_submit, etc. in the Lua code, and the C code of the Lua plugin
itself then implements the *_p_* functions (I believe).

That said, I too found it hard to find any documentation of the Lua plugin.
Eventually, I found an example script in the Slurm source code
(etc/cli_filter.lua.example), which I've taken as a starting point for my
cli_filter plugin scripts.

-- 
B/H
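For reference, a minimal skeleton of such a Lua cli_filter script, using the
function names described above, might look roughly like the sketch below. The
function bodies are illustrative placeholders only, and the argument names
follow Tommi's script and my reading of the example file; check
etc/cli_filter.lua.example in your Slurm tree for the exact signatures.

    -- cli_filter.lua sketch (placeholder bodies, not the shipped example)

    -- Set default values for command-line options.
    function slurm_cli_setup_defaults(options, early_pass)
        return slurm.SUCCESS
    end

    -- Runs after the command-line options have been processed, just before
    -- the job is submitted; 'options' holds the parsed options and
    -- 'pack_offset' is the component index for heterogeneous jobs.
    function slurm_cli_pre_submit(options, pack_offset)
        -- e.g. inspect or adjust options['cpus-per-task'] here
        return slurm.SUCCESS
    end

    -- Runs after the job has been submitted; 'job_id' is the new job's id.
    function slurm_cli_post_submit(offset, job_id, step_id)
        return slurm.SUCCESS
    end

Returning slurm.SUCCESS from each function lets the command proceed normally.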
Re: [slurm-users] Job step does not take the whole allocation
On 6/30/23 08:41, Tommi Tervo wrote:
> This was an annoying change; from the 22.05.x RELEASE_NOTES:
>
>  -- srun will no longer read in SLURM_CPUS_PER_TASK. This means you will
>     implicitly have to specify --cpus-per-task on your srun calls, or set
>     the new SRUN_CPUS_PER_TASK env var to accomplish the same thing.
>
> Here one can find relevant discussion:
> https://bugs.schedmd.com/show_bug.cgi?id=15632
>
> I'll attach our cli-filter pre_submit function, which works for us.

The discussion in bug 15632 concludes that this bug will only be fixed in
23.11.

Your workaround looks nice; however, I have not been able to find any
documentation of slurmctld calling any Lua functions named
slurm_cli_pre_submit or slurm_cli_post_submit.  Some very similar functions
are documented in https://slurm.schedmd.com/cli_filter_plugins.html, namely
cli_filter_p_setup_defaults, cli_filter_p_pre_submit, and
cli_filter_p_post_submit.

Can anyone shed light on the relationship between Tommi's
slurm_cli_pre_submit function and the ones defined in the cli_filter_plugins
page?

Thanks,
Ole
Re: [slurm-users] Job step does not take the whole allocation
Hi,

thank you very much for your help!

Best wishes,
Danny

> On 30.06.2023 at 08:41, Tommi Tervo wrote:
Re: [slurm-users] Job step does not take the whole allocation
> I searched the slurm.conf documentation, the mailing list and also the
> changelog, but found no reference to a matching parameter.
> Does any of you know about this behavior and how to change it?

Hi,

This was an annoying change; from the 22.05.x RELEASE_NOTES:

 -- srun will no longer read in SLURM_CPUS_PER_TASK. This means you will
    implicitly have to specify --cpus-per-task on your srun calls, or set
    the new SRUN_CPUS_PER_TASK env var to accomplish the same thing.

Here one can find relevant discussion:
https://bugs.schedmd.com/show_bug.cgi?id=15632

I'll attach our cli-filter pre_submit function, which works for us.

BR,
Tommi Tervo
CSC

function slurm_cli_pre_submit(options, pack_offset)
    --slurm.log_info("Function: %s", "CSC pre_submit")
    p = require("posix")
    -- cpus_per_task = p.getenv("SLURM_CPUS_PER_TASK")
    cpus_per_task = options['cpus-per-task']
    if (cpus_per_task ~= nil and tonumber(cpus_per_task) > 0) then
        -- Set the environment variable:
        p.setenv("SRUN_CPUS_PER_TASK", cpus_per_task)
        cpus_per_task = p.getenv("SRUN_CPUS_PER_TASK")
        --slurm.log_info("SRUN_CPUS_PER_TASK=%u", cpus_per_task)
        -- Or set the command-line option:
        --options['cpus-per-task'] = cpus_per_task
        --slurm.log_info("SRUN_CPUS_PER_TASK=%u", options['cpus-per-task'])
    end
    return slurm.SUCCESS
end
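A side note on deploying such a script (my understanding, not part of Tommi's
mail): the Lua cli_filter plugin is enabled in slurm.conf and reads its script
from the slurm.conf directory, roughly:

    # slurm.conf (sketch -- check the cli_filter documentation for exact syntax)
    CliFilterPlugins=lua

with the Lua code saved as cli_filter.lua next to slurm.conf.  Since
cli_filter runs inside the client commands (salloc, sbatch, srun), the script
has to be present and readable on the submit/login nodes.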
[slurm-users] Job step does not take the whole allocation
Dear all,

we are currently seeing a change in the default behavior of a job step.
On our old cluster (Slurm 20.11.9) a job step takes all the resources of my
allocation:

rotscher@tauruslogin5:~> salloc --partition=interactive --nodes=1 --ntasks=1 --cpus-per-task=24 --hint=nomultithread
salloc: Pending job allocation 37851810
salloc: job 37851810 queued and waiting for resources
salloc: job 37851810 has been allocated resources
salloc: Granted job allocation 37851810
salloc: Waiting for resource configuration
salloc: Nodes taurusi6605 are ready for job
bash-4.2$ srun numactl -show
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
cpubind: 0 1
nodebind: 0 1
membind: 0 1

If I run the same command on our new cluster, the job step takes only 1 core
instead of all of them, without any further parameter:

[rotscher@login1 ~]$ salloc --nodes=1 --ntasks=1 --cpus-per-task=24 --hint=nomultithread
salloc: Pending job allocation 9197
salloc: job 9197 queued and waiting for resources
salloc: job 9197 has been allocated resources
salloc: Granted job allocation 9197
salloc: Waiting for resource configuration
salloc: Nodes n1601 are ready for job
[rotscher@login1 ~]$ srun numactl -show
policy: default
preferred node: current
physcpubind: 0
cpubind: 0
nodebind: 0
membind: 0 1 2 3 4 5 6 7

If I add the parameter "-c 24" to the job step, it also takes the whole
allocation, but the step should take it by default:

[rotscher@login1 ~]$ srun -c 24 numactl -show
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
cpubind: 0 1
nodebind: 0 1
membind: 0 1 2 3 4 5 6 7

I searched the slurm.conf documentation, the mailing list and also the
changelog, but found no reference to a matching parameter.
Does any of you know about this behavior and how to change it?

Best wishes,
Danny
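Following the RELEASE_NOTES excerpt quoted earlier in the thread, the two
direct workarounds in a batch script would look roughly like this sketch
(SLURM_CPUS_PER_TASK is set in the job environment when --cpus-per-task is
requested; adjust the resource requests to your site):

    #!/bin/bash
    #SBATCH --nodes=1 --ntasks=1 --cpus-per-task=24

    # Option 1: pass the value explicitly on each srun call
    srun --cpus-per-task="$SLURM_CPUS_PER_TASK" numactl -show

    # Option 2: export the new SRUN_CPUS_PER_TASK variable once
    export SRUN_CPUS_PER_TASK="$SLURM_CPUS_PER_TASK"
    srun numactl -show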