[slurm-users] Slurm version 20.02.0 is now available
After 9 months of development and testing we are pleased to announce the availability of Slurm version 20.02.0!

Downloads are available from https://www.schedmd.com/downloads.php.

Highlights of the 20.02 release include:

- A "configless" method of deploying Slurm within the cluster, in which the slurmd and user commands can use DNS SRV records to locate the slurmctld host and automatically download the relevant configuration files.
- A new "auth/jwt" authentication mechanism using JWT, which can help integrate untrusted external systems into the cluster.
- A new "slurmrestd" command/daemon which translates a new Slurm REST API into the underlying libslurm calls.
- Packaging fixes for RHEL8 distributions.
- Significant performance improvements to the backfill scheduler, as well as to string construction and processing.

Thank you to all customers, partners, and community members who contributed to this release.

As with past releases, the documentation available at https://slurm.schedmd.com has been updated to the 20.02 release. Past versions are available in the archive. This release also marks the end of support for the 18.08 release. The 19.05 release will remain supported up until the 20.11 release in November, but will not see as frequent updates, and bug-fixes will be targeted for the 20.02 maintenance releases going forward.

--
Tim Wickberg
Chief Technology Officer, SchedMD
Commercial Slurm Development and Support
Re: [slurm-users] Slurm version 20.02.0 is now available
Hi Tim,

I'm very interested in the "configless" setup for Slurm. Is the setup for configless documented somewhere?

Dean Schulze

On Tue, Feb 25, 2020 at 11:57 AM Tim Wickberg wrote:
> - A "configless" method of deploying Slurm within the cluster, in which
> the slurmd and user commands can use DNS SRV records to locate the
> slurmctld host and automatically download the relevant configuration files.
Re: [slurm-users] Slurm version 20.02.0 is now available
On 2/25/20 11:41 AM, Dean Schulze wrote:
> I'm very interested in the "configless" setup for slurm. Is the setup for configless documented somewhere?

Looks like the website has already been updated for the 20.02 documentation, and it looks like it's here: https://slurm.schedmd.com/configless_slurm.html

All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA
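For readers who want a feel for the pieces before opening that page, here is a rough sketch of a configless setup. It shows three separate fragments together (a slurm.conf line, DNS zone entries, and a slurmd invocation). The host names slurmctl-primary and slurmctl-backup are placeholders, 6817 is assumed to be the slurmctld port, and the linked documentation remains the authority on the details.

```
# 1. slurm.conf on the slurmctld host: allow nodes to fetch their configuration
SlurmctldParameters=enable_configless

# 2. DNS zone entries for the cluster domain (placeholder hosts);
#    the record with the lower priority value is tried first
_slurmctld._tcp 3600 IN SRV 0 0 6817 slurmctl-primary
_slurmctld._tcp 3600 IN SRV 10 0 6817 slurmctl-backup

# 3. Alternative to the SRV records: name the controller when starting slurmd
slurmd --conf-server slurmctl-primary:6817
```

With the SRV records in place, slurmd and the user commands should not need a local slurm.conf or any extra option; the explicit --conf-server form is mainly useful where DNS cannot be changed.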
Re: [slurm-users] Slurm version 20.02.0 is now available
I suppose I can ask Bright Computing but does anyone know what version of Bright is needed? I would guess 8.2 or 9.0. Definitely want to dive into this.
Re: [slurm-users] Slurm version 20.02.0 is now available
Bright is not needed... for much of anything...

On 2/25/2020 12:48 PM, Robert Kudyba wrote:
> I suppose I can ask Bright Computing but does anyone know what version of Bright is needed? I would guess 8.2 or 9.0. Definitely want to dive into this.
Re: [slurm-users] Slurm version 20.02.0 is now available
There was a major refactoring between the 19.05 and 20.02 code. Most of the callbacks for select plugins were moved to cons_common. I have a plugin for 19.05 that depends on two of those callbacks: select_p_job_begin() and select_p_job_fini(). My plugin is a copy of the select/cons_res plugin, but when I implement those functions in my plugin I get this error because those functions already exist in cons_common:

/home/dean/src/slurm.versions/slurm-20.02/slurm-20.02.0/src/plugins/select/cons_common/cons_common.c:1134: multiple definition of `select_p_job_begin'; .libs/select_liqid_cons_res.o:/home/dean/src/slurm.versions/slurm-20.02/slurm-20.02.0/src/plugins/select/liqid_cons_res/select_liqid_cons_res.c:559: first defined here
/usr/bin/ld: ../cons_common/.libs/libcons_common.a(cons_common.o): in function `select_p_job_fini':
/home/dean/src/slurm.versions/slurm-20.02/slurm-20.02.0/src/plugins/select/cons_common/cons_common.c:1561: multiple definition of `select_p_job_fini'; .libs/select_liqid_cons_res.o:/home/dean/src/slurm.versions/slurm-20.02/slurm-20.02.0/src/plugins/select/liqid_cons_res/select_liqid_cons_res.c:607: first defined here
collect2: error: ld returned 1 exit status

Since only one select plugin can be used at a time (determined in slurm.conf), I could put my code in the cons_common implementation of those functions, but if I ever switch plugins then my plugin code will get executed when it shouldn't be.

How can I "override" those callbacks in my own plugin? This isn't Java (but it sure looks like the slurm code tries to do Java in C).
Re: [slurm-users] Slurm version 20.02.0 is now available
Did you reuse the 20.02 select/cons_res/Makefile.{in,am} in your plugin's source? If you need to override those two functions, you will probably have to remodel your plugin after the select/cray_aries plugin, which also defines its own select_p_job_begin() and doesn't link against libcons_common.la.

Naturally, omitting libcons_common.a from your plugin doesn't help if you use other functions defined in select/common.

> On Feb 26, 2020, at 00:48, Dean Schulze wrote:
>
> How can I "override" those callbacks in my own plugin? This isn't Java (but
> it sure looks like the slurm code tries to do Java in C).
Re: [slurm-users] Slurm version 20.02.0 is now available
So it sounds like the simplest approach would be to remove libcons_common from the makefile, copy cons_common.[ch] into my project, and provide my own implementations of the appropriate functions in cons_common.c.

On Wed, Feb 26, 2020 at 6:12 AM Jeffrey T Frey wrote:
> Did you reuse the 20.02 select/cons_res/Makefile.{in,am} in your plugin's
> source? You probably will have to re-model your plugin after the
> select/cray_aries plugin if you need to override those two functions (it
> also defines its own select_p_job_begin() and doesn't link against
> libcons_common.la). Naturally, omitting libcons_common.a from your
> plugin doesn't help if you use other functions defined in select/common.
Re: [slurm-users] Slurm version 20.02.0 is now available
Hi all,

It looks like --conf-server accepts only one config server, if I'm not mistaken. Specifying --conf-server more than once causes slurmd to consider only the last one (a quick glance at the source seems to agree).

Are there any plans to accept a second server via command-line options?

Thanks & regards,
Angelos

On 2/26/20 4:44 AM, Christopher Samuel wrote:
> Looks like the website has already been updated for the 20.02
> documentation, and it looks like it's here:
> https://slurm.schedmd.com/configless_slurm.html